POD (page 3) - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » POD (page 3)

December 28, 2012

Posted by Andrej Mitrovic
in reply to Walter Bright

Andrej Mitrovic

Posted in reply to Walter Bright

On 12/28/12, Walter Bright <newshound2@digitalmars.com> wrote:
> There needs to be an easy way to query a type to see if it is POD, and currently there isn't one.

I suggest we implement a compiler trait via __traits(isPOD, Type). C++11 has is_pod, D should have a Phobos template that wraps around the new trait.

C++03/11 also define their meaning of PODs to allow interfacing with
C. C++03 rules:
http://stackoverflow.com/questions/4178175/what-are-aggregates-and-pods-and-how-why-are-they-special/4178176#4178176

and C++11 rules:

http://stackoverflow.com/questions/4178175/what-are-aggregates-and-pods-and-how-why-are-they-special/7189821#7189821

According to that function you gave access specifiers in D do not change the meaning of PODs (unlike in C++), but other than that the rules are very similar.

What about field initalizers though? Can their presence make the default-generated struct constructor non-trivial in D, or does it not matter?

E.g. in C++11:

struct NotAggregate {
    int x = 5; // OK
    std::vector<int> s{1,2,3}; // makes it a non-aggregate, therefore non-POD
};

I'm not sure if D has these sort of issues.

December 28, 2012

Posted by Andrej Mitrovic
in reply to Walter Bright

Andrej Mitrovic

Posted in reply to Walter Bright

On 12/28/12, Walter Bright <newshound2@digitalmars.com> wrote:
> There's more than that. POD structs can exist solely in registers. Having ctor/dtor/postblit means that structs cannot be enregistered, because those operations require a pointer to the instance.
>
> There are subtleties to it. It affects all kinds of things like ABI call/return issues, varargs, etc.

These are good points and reminds us that we need to update the documentation, there's a section that reads:

"In C++ parlance, a D struct is a POD (Plain Old Data) type, with trivial constructors and destructors."

It needs further explanation. Just a few weeks ago someone tried to add a non-trivial constructor in a D struct which interfaces with C and his code stopped working, these are likely the reasons why.

December 29, 2012

Posted by Andrej Mitrovic
in reply to Walter Bright

Andrej Mitrovic

Posted in reply to Walter Bright

On 12/28/12, Walter Bright <newshound2@digitalmars.com> wrote:
> D's definition of POD:
>
> /***************************************
>   * Return true if struct is POD (Plain Old Data).
>   * This is defined as:
>   *      not nested
>   *      no postblits, constructors, destructors, or assignment operators
>   *      no fields with with any of those
>   * The idea being these are compatible with C structs.
>   *
>   * Note that D struct constructors can mean POD, since there is always
> default
>   * construction with no ctor, but that interferes with OPstrpar which wants
> it
>   * on the stack in memory, not in registers.
>   */
> bool StructDeclaration::isPOD()
> {
>      if (isnested || cpctor || postblit || ctor || dtor)
>          return false;
>
>      /* Recursively check any fields have a constructor.
>       * We should cache the results of this.
>       */
>      for (size_t i = 0; i < fields.dim; i++)
>      {
>          Dsymbol *s = fields[i];
>          VarDeclaration *v = s->isVarDeclaration();
>          assert(v && v->storage_class & STCfield);
>          if (v->storage_class & STCref)
>              continue;
>          Type *tv = v->type->toBasetype();
>          while (tv->ty == Tsarray)
>          {   TypeSArray *ta = (TypeSArray *)tv;
>              tv = tv->nextOf()->toBasetype();
>          }
>          if (tv->ty == Tstruct)
>          {   TypeStruct *ts = (TypeStruct *)tv;
>              StructDeclaration *sd = ts->sym;
>              if (!sd->isPOD())
>                  return false;
>          }
>      }
>      return true;
> }
>

I think there's a bug here, it never checks for the presence of opAssign. Maybe change the first if check to:

 if (isnested || cpctor || postblit || ctor || dtor ||
search_function(this, Id::assign))

I can make a pull with this fix (and add the new isPOD trait). Whaddya think?

December 29, 2012

Posted by Walter Bright
in reply to Andrej Mitrovic

Walter Bright

Posted in reply to Andrej Mitrovic

On 12/28/2012 4:23 PM, Andrej Mitrovic wrote:
> I think there's a bug here, it never checks for the presence of
> opAssign. Maybe change the first if check to:
>
>   if (isnested || cpctor || postblit || ctor || dtor ||
> search_function(this, Id::assign))
>
> I can make a pull with this fix (and add the new isPOD trait). Whaddya think?
>

I don't think a dependency on opAssign is necessary. I think it's great you're doing a pull request for it.

The definition of POD may change slightly in the future, and it may differ on different platforms, which is why I think it is important to have a __traits for it that gets the compiler's view of what's a POD.

I see you already saw:

http://d.puremagic.com/issues/show_bug.cgi?id=9237

December 29, 2012

Posted by David Nadlinger
in reply to Walter Bright

David Nadlinger

Posted in reply to Walter Bright

On Friday, 28 December 2012 at 22:40:13 UTC, Walter Bright wrote:
> D's definition of POD: […]

The definition of POD as used in the x86-64 System V ABI – and its implementation is really the only place where isPOD matters right now – is as follows (ver. 0.99.6):

"If a C++ object has either a non-trivial copy constructor or a non-trivial
destructor, it is passed by invisible reference […]."

and as a footnote:

"A de/constructor is trivial if it is an implicitly-declared default de/constructor and if:
 - its class has no virtual functions and no virtual base classes, and
 - all the direct base classes of its class have trivial de/constructors, and
 - for all the nonstatic data members of its class that are of class type (or array thereof), each such class has a trivial de/constructor."

Note that the absence of a constructor is *not* required for a C++ class/struct to be considered POD under the rules of the x86-64 ABI. I would want to be *very* sure about my reasons before changing this behavior as it will definitely lead to confusion, especially considering that ABI details are black magic to many programmers anyway.

In fact, I had been planning to ping you about this since I started to work on the LDC ABI implementation a few months ago (a place where this makes a difference are e.g. some types from core.time), but I then shifted my focus more towards resolving the immediate release blocker issues and apparently in turn also forgot to raise this issue on the mailing list.

Anyway, I would ask you to reconsider your decision because:

 - It breaks the principle of least surprise regarding the C<->D mapping: While you can add adornments like constructors to the C++ definition of a given C type, you can't in D. Also, as mentioned by Andrej, as a programmer you just don't expect a constructor to make your struct somehow behave differently than a plain C struct.

 - More or less another way of stating the last point, it is not at all obvious from the x86-64 ABI specification: I re-implemented the x86-64 ABI for D from scratch for LDC recently, and this was just about the only difference between your implementation and mine (the other, as far as I remember, being that DMD mistakenly never enregisters 3-byte structs). Sure, we need to properly document the D behavior anyway, but if the general rule is "D follows the ABI of the host C compiler", I think we should avoid diverting from the obvious path as much as possible.

 - The new C++ standard defines POD structs as being "a non-union class that is both a trivial class and a standard-layout class, and has no
non-static data members of type non-POD struct, non-POD union (or array of such types)" ([class]p10), where "trivial" means no copy/move constructors or destructors, and standard-layout essentially excludes virtuals, different access control modifiers and base classes with non-static fields. In particular, POD structs are allowed to have constructors, and going a different route with D will likely be a source of confusion.

 - There are also possible performance implications: Structs that just wrap a single field in order to enrich it with type information, such as Duration or a Quantity struct from a units of measurement library, will likely benefit from being passed in registers (if available), even if they will probably have constructors. In fact, having such one-field structs behave like a naked value might be vitally important to be able to decorate a C API with e.g. units information (even if such code would strictly speaking be architecture dependent).

As far as I can see – and I'm merely guessing based on the isPOD comment here – the only reason for not just applying the C/C++ definition in the obvious way might be a quirk in the DMD backend implementation?

David

December 29, 2012

Posted by bearophile
in reply to David Nadlinger

bearophile

Posted in reply to David Nadlinger

David Nadlinger:

>  - More or less another way of stating the last point, it is not at all obvious from the x86-64 ABI specification: I re-implemented the x86-64 ABI for D from scratch for LDC recently, and this was just about the only difference between your implementation and mine (the other, as far as I remember, being that DMD mistakenly never enregisters 3-byte structs).

I have done a small experiment. This is C99 code:


#include "stdint.h"
typedef uint8_t ubyte;
typedef struct { ubyte r, g, b; } RGB;
RGB bar(RGB c1, RGB c2) {
    return (RGB){ (ubyte)(c1.r + c2.r),
                  (ubyte)(c1.g + c2.g),
                  (ubyte)(c1.g + c2.b) };
}



Compiled with Clang 3.0 gives:


bar:
    pushl   %esi
    movl    8(%esp), %ecx
    movl    12(%esp), %eax
    leal    (%eax,%ecx), %edx
    movzbl  %dl, %esi
    andl    $16776960, %ecx # imm = 0xFFFF00
    shrl    $8, %ecx
    andl    $16777215, %eax # imm = 0xFFFFFF
    movl    %eax, %edx
    shrl    $16, %edx
    addl    %ecx, %edx
    shll    $16, %edx
    orl %esi, %edx
    shrl    $8, %eax
    addl    %ecx, %eax
    shll    $8, %eax
    movzwl  %ax, %eax
    orl %edx, %eax
    popl    %esi
    ret



This is D code:


struct RGB { ubyte r, g, b; }
RGB bar(RGB c1, RGB c2) {
    return RGB(cast(ubyte)(c1.r + c2.r),
               cast(ubyte)(c1.g + c2.g),
               cast(ubyte)(c1.g + c2.b));
}



DMD 2.061alpha gives (-O -release -inline):

_D4test3barFS4test3RGBS4test3RGBZS4test3RGB:
        push    EAX
        push    EBX
        mov EBX,EAX
        push    ESI
        push    EDI
        movzx   ECX,byte ptr 018h[ESP]
        movzx   EDX,byte ptr 014h[ESP]
        add CL,DL
        mov [EBX],CL
        movzx   ESI,byte ptr 019h[ESP]
        mov ECX,ESI
        movzx   EDI,byte ptr 015h[ESP]
        mov EDX,EDI
        mov 0Ch[ESP],ECX
        add CL,DL
        mov 1[EBX],CL
        mov ECX,0Ch[ESP]
        movzx   ESI,byte ptr 016h[ESP]
        mov EDX,ESI
        add CL,DL
        mov 2[EBX],CL
        pop EDI
        pop ESI
        pop EBX
        pop ECX
        ret 8


Bye,
bearophile

December 29, 2012

Posted by bearophile
in reply to bearophile

bearophile

Posted in reply to bearophile

>                   (ubyte)(c1.g + c2.b) };

Sorry, that should be (and the same in the D code):

>                   (ubyte)(c1.b + c2.b) };


But it's not important, also because both versions have the same little bug.

Bye,
bearophile

December 29, 2012

Posted by Walter Bright
in reply to David Nadlinger

Walter Bright

Posted in reply to David Nadlinger

On 12/28/2012 5:53 PM, David Nadlinger wrote:
> As far as I can see – and I'm merely guessing based on the isPOD comment here –
> the only reason for not just applying the C/C++ definition in the obvious way
> might be a quirk in the DMD backend implementation?

The idea with POD is not necessarily to be compatible with C++ POD, but to be compatible what the technical point of POD is. As I wrote on github,

"Remember, a POD type can be enregistered, if it also fits in registers and the code gen supports it. A non-POD can never be enregistered. A POD type can also be bit copied around at will without regard to needing to call any functions for any part of that. (This means POD can be spilled into registers, cached in registers, stored half in registers and half in memory, etc.)"

POD data must also exactly match the C version of it, otherwise, obviously, binary compatibility with C will be lost. D has no requirement to be binary compatible with C++ non-POD structs.

So, as long as we satisfy those requirements, we can invent our own definition of POD in a way that makes the most sense for D.

For example, C++ disallows mixed public/private data in a POD. I've never understood the reason for that restriction other than "C doesn't have private fields". D doesn't propagate that aspect.

Also, you made some good points about changing the D definition of POD. That is worthy of more discussion, and serves to underscore that we need an isPOD trait that is set by the compiler, not a reinvention in the library, because when isPOD changes then user code will automatically change in step.

December 29, 2012

Posted by David Nadlinger
in reply to bearophile

David Nadlinger

Posted in reply to bearophile

On Saturday, 29 December 2012 at 02:41:22 UTC, bearophile wrote:
> I have done a small experiment. This is C99 code: […]

Eww, AT&T syntax… ;)

I finally added a bugzilla entry for the issue, which includes a more minimal example: http://d.puremagic.com/issues/show_bug.cgi?id=9239

David

December 29, 2012

Posted by Andrei Alexandrescu
in reply to Walter Bright

Andrei Alexandrescu

Posted in reply to Walter Bright

On 12/28/12 10:12 PM, Walter Bright wrote:
> For example, C++ disallows mixed public/private data in a POD. I've
> never understood the reason for that restriction other than "C doesn't
> have private fields". D doesn't propagate that aspect.

The idea was to leave C++ compilers to change the order of fields in classes with private data for optimal layout. I don't know of any compilers that do, but that was the intent. PODs were not susceptible to such layout optimization because they had to be compatbile with C's layout.

Andrei

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation