November 04, 2019
On Monday, 4 November 2019 at 16:48:21 UTC, Atila Neves wrote:
> On Monday, 4 November 2019 at 09:17:57 UTC, Jacob Carlborg wrote:
>> On Sunday, 3 November 2019 at 16:37:36 UTC, Cristian Becerescu wrote:
>>
>>> When encountering anonymous structs or unions, dpp gives them a name. And this forces dpp to declare a member of that 'now-named-anon' record type and also provide accessors for the 'now-named-anon' record (because unnamed records also implicitly declare a member of their type).
>>
>> But the question is why is DPP generating named unions/structs when D supports anonymous ones?
>
> Probably because I either didn't know that at the time or knew and forgot. I don't remember what I was trying to do or why just doing the obvious failed.

I don't know if this is related, but dpp ignored anonymous enums, so it may have been the same issue.
November 05, 2019
On 05/11/2019 8:33 AM, Jacob Carlborg wrote:
> On 2019-11-04 17:54, rikki cattermole wrote:
> 
>> There is a variant of struct/union in C which has a named instance but no type name.
> 
> You mean like this?
> 
> struct Foo
> {
>      struct
>      {
>          int a;
>      } b;
> }
> 
> That still needs to be translated to a named struct in D.

Yes it does.

This is probably why I'm guessing that Atila generated a name. Just did so too liberally ;)
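
To make the two cases concrete, here is a minimal C sketch. `struct Bar` and the two helper functions are made up for illustration; only `struct Foo` comes from the example above. The first struct has a named instance of an unnamed type, so a translator has to invent a type name for it. The second is a true C11 anonymous struct, which could in principle map to a D anonymous struct directly.

```c
#include <assert.h>

/* Named instance of an unnamed struct type: `a` is reached through `b`. */
struct Foo {
    struct {
        int a;
    } b;
};

/* C11 anonymous struct (hypothetical `Bar`): no instance name, `a` is reached directly. */
struct Bar {
    struct {
        int a;
    };
};

/* Hypothetical helpers that only show the two access paths. */
static int foo_a(const struct Foo *f) { return f->b.a; }
static int bar_a(const struct Bar *g) { return g->a; }
```

Only the `Foo` case seems to need a generated name on the D side, since D has no way to declare a member of an unnamed struct type.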

November 05, 2019
On Tuesday, 5 November 2019 at 01:25:40 UTC, rikki cattermole wrote:
> On 05/11/2019 8:33 AM, Jacob Carlborg wrote:
>> On 2019-11-04 17:54, rikki cattermole wrote:
>> 
>>> There is a variant of struct/union in C which has a named instance but no type name.
>> 
>> You mean like this?
>> 
>> struct Foo
>> {
>>      struct
>>      {
>>          int a;
>>      } b;
>> }
>> 
>> That still needs to be translated to a named struct in D.
>
> Yes it does.
>
> This is probably why I'm guessing that Atila generated a name. Just did so too liberally ;)

Sounds about right.
November 16, 2019
Update for week 4 of Milestone 2 (ended on November 10th, sorry for the delay)

Continued testing dpp with virtio.h and found multiple bugs, especially regarding renaming:

---Bug #1---
Accessors for members of anonymous records are not renamed when the members are keywords

So in this case

struct A {
    union {
        unsigned int version;
        char module;
    };
};

the members themselves would be renamed (to version_ and module_), but the accessor function names would not (they would be auto version(...) and auto module(...) etc.).

(Solved during week 5 of Milestone 2)
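
As a rough illustration of the idea (not dpp's actual code), the renaming can go through a single helper that is used for both the field and its accessor, so the two can't drift apart. The keyword list here is deliberately incomplete, and `is_d_keyword` and `d_safe_name` are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* A few D keywords that are ordinary identifiers in C (deliberately incomplete). */
static const char *const d_keywords[] = {
    "version", "module", "function", "alias", "in", "out"
};

static int is_d_keyword(const char *name)
{
    for (size_t i = 0; i < sizeof d_keywords / sizeof d_keywords[0]; ++i)
        if (strcmp(name, d_keywords[i]) == 0)
            return 1;
    return 0;
}

/* Writes the D-safe spelling of `name` into `out` (keyword + '_') and returns `out`. */
static const char *d_safe_name(const char *name, char *out, size_t out_size)
{
    snprintf(out, out_size, "%s%s", name, is_d_keyword(name) ? "_" : "");
    return out;
}
```

With this, both the member and the accessor for `version` would come out as `version_`, while a name like `count` is left alone.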

---Bug #2---
The fixFields() method doesn't work when multiple structs have a field (of type pointer to struct) which needs renaming

So in this case

struct A;
struct B {
    struct A *A;
};

struct C {
    struct A* A;
};

dpp would only rename one of the fields (either the one in B or in C), because the _fieldDeclarations associative array overwrites the already existing (if any) line number with a new one. So it would rename only the field which is contained in the last processed struct.

This also affects C11 anon records if, for example, we add the struct below to the ones above:

struct D {
    union {
        struct A* A;
        int d;
    };
};

(A solution for both of those problems was proposed during week 5 of Milestone 2)
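
The overwrite problem can be modelled with a toy table. This is only a sketch of the failure mode, assuming a name-to-line mapping; it is neither dpp's real data structure nor the fix from the PR. All names below (`decl_table`, `put_overwrite`, `put_append`, `lines_to_fix`) are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define MAX_DECLS 16

struct field_decl { const char *name; int line; };

struct decl_table {
    struct field_decl entries[MAX_DECLS];
    int count;
};

/* Map-like insert: one slot per name, so a second `A` overwrites the first. */
static void put_overwrite(struct decl_table *t, const char *name, int line)
{
    for (int i = 0; i < t->count; ++i) {
        if (strcmp(t->entries[i].name, name) == 0) {
            t->entries[i].line = line;
            return;
        }
    }
    if (t->count < MAX_DECLS)
        t->entries[t->count++] = (struct field_decl){ name, line };
}

/* Multimap-like insert: every declaration keeps its own entry. */
static void put_append(struct decl_table *t, const char *name, int line)
{
    if (t->count < MAX_DECLS)
        t->entries[t->count++] = (struct field_decl){ name, line };
}

/* Counts how many recorded lines would be renamed for `name`. */
static int lines_to_fix(const struct decl_table *t, const char *name)
{
    int n = 0;
    for (int i = 0; i < t->count; ++i)
        if (strcmp(t->entries[i].name, name) == 0)
            ++n;
    return n;
}
```

With `put_overwrite`, recording `A` for struct B and then for struct C leaves a single line to fix (the last one). With `put_append`, both lines survive.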

---Bug #3---
In some cases, dpp writes a clang warning into the generated D file. For example (for virtio.h):

foo.d.tmp:81822:141: warning: __VA_ARGS__ can only appear in the expansion of a C99 variadic macro #define __PVOP_VCALL ( op , pre , post, ... ) ____PVOP_VCALL ( op , CLBR_ANY , PVOP_VCALL_CLOBBERS , VEXTRA_CLOBBERS , pre, post , ## __VA_ARGS__ ).

This is written exactly as is in the D file. I still have to figure out why this happens.

---Bug #3---
Some code is translated to enum BLA = 6LLU;
LLU is not valid in D (LL needs to be changed to L).
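
A minimal sketch of the suffix rewrite, assuming the literal arrives as a string. `d_int_literal` is a hypothetical helper, not what the PR does. It treats the trailing run of u/U/l/L as the suffix and collapses `LL`/`ll` into one uppercase `L`; other C/D literal differences, such as octal, are not handled.

```c
#include <assert.h>
#include <string.h>

/*
 * Copies the C integer literal `c_lit` into `out`, collapsing an `LL`/`ll`
 * suffix into the single uppercase `L` that D accepts. `U`/`u` is kept as is.
 */
static const char *d_int_literal(const char *c_lit, char *out, size_t out_size)
{
    size_t len = strlen(c_lit);
    size_t digits = len;   /* index where the suffix starts */
    size_t n = 0;
    int have_l = 0;

    if (out_size == 0)
        return out;

    /* The suffix is the trailing run of u/U/l/L characters. */
    while (digits > 0 && strchr("uUlL", c_lit[digits - 1]) != NULL)
        --digits;

    for (size_t i = 0; i < len && n + 1 < out_size; ++i) {
        char c = c_lit[i];
        if (i >= digits && (c == 'l' || c == 'L')) {
            if (have_l)
                continue;   /* second L of LL: drop it */
            have_l = 1;
            c = 'L';        /* use uppercase L, the D spelling */
        }
        out[n++] = c;
    }
    out[n] = '\0';
    return out;
}
```

So `6LLU` would become `6LU`, and a literal without an `LL` suffix passes through unchanged.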

---Bug #4---
An enum is initialized with a value of 68719476704, which produces this:
Error: cannot implicitly convert expression 68719476704L of type long to int

I will probably need to check if some values of the enum are > int.max, and, if so, declare "enum : long" instead of "enum".
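
The check I have in mind would look roughly like this. It is a sketch only: `d_enum_base_type` is a hypothetical name, and it assumes the member values are already available as signed 64-bit integers, so values above long.max are not covered.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Returns the D base type a generated enum would need for these member values. */
static const char *d_enum_base_type(const long long *values, size_t count)
{
    for (size_t i = 0; i < count; ++i)
        if (values[i] > INT_MAX || values[i] < INT_MIN)
            return "long";   /* at least one member does not fit in int */
    return "int";
}
```

If the result is "long", the generated declaration would start with "enum : long" regardless of which member comes first.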

---Bug #5---
Again regarding C11 anon records: in its output, dpp writes "const(_Anonymous_55) version_;", and this conflicts with the accessor functions' name for a struct field. I will try to figure out the reason why dpp adds that declaration in the first place and fix this (most probably another renaming issue).

---(Maybe) Bug #6---
Error: undefined identifier xattr_handler

The above happens in a struct declaration (a field of that type is declared) and also in a function declaration (a parameter of that type). The struct declaration for xattr_handler is not in the generated D file. Will look more into this.

---Bug #7---
usr/bin/ld: Warning: size of symbol `_D3foo10local_apic12__reserved_1MUNaNbNdNiNfZk' changed from 15 in foo.o to 18 in foo.o

After manually trying to solve the bugs above in the generated foo.d D file, the only problem remaining is the linker one above.
November 16, 2019
On Saturday, 16 November 2019 at 14:59:51 UTC, Cristian Becerescu wrote:
>
> ---Bug #4---
> An enum is initialized with a value of 68719476704, which produces this:
> Error: cannot implicitly convert expression 68719476704L of type long to int
>
> I will probably need to check if some values of the enum are > int.max, and, if so, declare "enum : long" instead of "enum".

Shouldn't that be done automatically by the D compiler? It's the normal promotion rule; it should be reflected in the base type of the enum. Afaict it is done that way in C (at least as a gcc extension).


November 16, 2019
On Saturday, 16 November 2019 at 18:39:51 UTC, Patrick Schluter wrote:
> On Saturday, 16 November 2019 at 14:59:51 UTC, Cristian Becerescu wrote:
>>
>> ---Bug #4---
>> An enum is initialized with a value of 68719476704, which produces this:
>> Error: cannot implicitly convert expression 68719476704L of type long to int
>>
>> I will probably need to check if some values of the enum are > int.max, and, if so, declare "enum : long" instead of "enum".
>
> Shouldn't that be done automatically by the D compiler? It's the normal promotion rule; it should be reflected in the base type of the enum. Afaict it is done that way in C (at least as a gcc extension).

I tested some cases.

enum e { A = 68719476704 }; // Compiles fine
enum e { A = 68719476704, B = 1 }; // Still compiles fine
enum e { B = 1, A = 68719476704 }; // Error: cannot implicitly convert expression 68719476704L of type long to int

I looked through the docs [1] and found this:
"If the EnumBaseType is not explicitly set, and the first EnumMember has an AssignExpression, it is set to the type of that AssignExpression. Otherwise, it defaults to type int."

[1] https://dlang.org/spec/enum.html
November 17, 2019
On Saturday, 16 November 2019 at 23:55:45 UTC, Cristian Becerescu wrote:
> On Saturday, 16 November 2019 at 18:39:51 UTC, Patrick Schluter wrote:
>> On Saturday, 16 November 2019 at 14:59:51 UTC, Cristian Becerescu wrote:
>>>
>>> ---Bug #4---
>>> An enum is initialized with a value of 68719476704, which produces this:
>>> Error: cannot implicitly convert expression 68719476704L of type long to int
>>>
>>> I will probably need to check if some values of the enum are > int.max, and, if so, declare "enum : long" instead of "enum".
>>
>> Shouldn't that be done automatically by the D compiler? It's the normal promotion rule; it should be reflected in the base type of the enum. Afaict it is done that way in C (at least as a gcc extension).
>
> I tested some cases.
>
> enum e { A = 68719476704 }; // Compiles fine
> enum e { A = 68719476704, B = 1 }; // Still compiles fine
> enum e { B = 1, A = 68719476704 }; // Error: cannot implicitly convert expression 68719476704L of type long to int
>
> I looked through the docs [1] and found this:
> "If the EnumBaseType is not explicitly set, and the first EnumMember has an AssignExpression, it is set to the type of that AssignExpression. Otherwise, it defaults to type int."
>
> [1] https://dlang.org/spec/enum.html

I wanted to interject that this violates the C standard (D has as one of its goals that constructs that are syntactically identical to C behave the same as in C), but after checking the C99 standard, it is in fact implementation-defined, with the added remark "An implementation may delay the choice of which integer type until all enumeration constants have been seen".
So, even in C the compiler may implement it as it wishes.
November 18, 2019
Update for week 5 of Milestone 2

I've implemented and submitted solutions for the following bugs:

> ---Bug #1---
> Accessors for members of anonymous records are not renamed when the members are keywords

PR: https://github.com/atilaneves/dpp/pull/213
Status: Merged.


> ---Bug #2---
> The fixFields() method doesn't work when multiple structs have a field (of type pointer to struct) which needs renaming

PR: https://github.com/atilaneves/dpp/pull/214
Status: Merged.

> ---Bug #3---
> Some code is translated to enum BLA = 6LLU;
> LLU is not valid in D (LL needs to be changed to L).

PR: https://github.com/atilaneves/dpp/pull/215
Status: Need to add unit test.

> ---Bug #4---
> An enum is initialized with a value of 68719476704, which produces this:
> Error: cannot implicitly convert expression 68719476704L of type long to int

PR: https://github.com/atilaneves/dpp/pull/217
Status: AppVeyor build fails (on Windows 32-bit), will look into it.

> ---Bug #5---
> Again regarding C11 anon records: in its output, dpp writes "const(_Anonymous_55) version_;", and this conflicts with the accessor functions' name for a struct field. I will try to figure out the reason why dpp adds that declaration in the first place and fix this (most probably another renaming issue).

PR: https://github.com/atilaneves/dpp/pull/216
Status: Just need to add a comment and after that probably done.

As for the rest of the issues, I'm still debugging and trying to figure out how to reproduce some of them on smaller examples.
November 25, 2019
Update for week 1 of Milestone 3

(1) I've identified the actual cause for a previous issue and also implemented a solution for it:

> ---(Maybe) Bug #6---
> Error: undefined identifier xattr_handler
>
> The above happens in a struct declaration (a field of that type is declared) and also in a function declaration (a parameter of that type). The struct declaration for xattr_handler is not in the generated D file. Will look more into this.

The problem there was that we would have something like this:

void f(struct A**); // C code; A is undeclared

and, because A is undeclared, DPP should generate a plain 'struct A;'. It turns out that previously, DPP was only looking at the cursor type for the first pointee of 'struct A**', which was 'struct A*' (which is not a Record type). So DPP would not generate the corresponding plain struct.

The solution is simply to keep following the pointee type until the cursor type is no longer a Pointer type (so eventually we get to the Record type in the case above).

PR: https://github.com/atilaneves/dpp/pull/218
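
In isolation, the pointee walk looks like the sketch below. It uses a toy type model rather than libclang, so `type_kind`, `struct type` and `strip_pointers` are all hypothetical; in libclang the corresponding step would be repeated calls to clang_getPointeeType.

```c
#include <assert.h>
#include <stddef.h>

enum type_kind { KIND_RECORD, KIND_POINTER, KIND_OTHER };

struct type {
    enum type_kind kind;
    const struct type *pointee;   /* only meaningful for KIND_POINTER */
};

/* Follows pointee links until the type is no longer a pointer. */
static const struct type *strip_pointers(const struct type *t)
{
    while (t != NULL && t->kind == KIND_POINTER)
        t = t->pointee;
    return t;
}
```

For 'struct A**', looking only at the first pointee gives a Pointer type, while the loop ends on the Record type that needs the opaque declaration.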


(2) Found another bug, this time about bitfields. A succinct explanation and a small example can be found here: https://gist.github.com/cbecerescu/29188e4c0f0bb83e0e85e4e0dccc8c30


(3) I've tried using DPP with other kernel headers as well (specifically netdevice.h), and usually I get an error telling me the resources have been exhausted.

My guess is that the issue is the 'lines' array, which contains all of the lines to be written to the generated D file. As far as I know, we don't flush at any point during this process, so I assume this array gets very large at some point (I've seen a maximum of ~100K lines in generated D files so far, although other factors could also have an impact, maybe the AST).

I thought of two approaches here: either try to write to the file when 'lines' gets too big, or translate each C header into different D modules. I'm still open to suggestions.

(4) I still have not got to the cause of this (I assume it's a sort of redirection of stderr, but I can't reproduce it with any warnings I tried):

> ---Bug #3---
> In some cases, dpp writes a clang warning into the generated D file. For example (for virtio.h):
>
> foo.d.tmp:81822:141: warning: __VA_ARGS__ can only appear in the expansion of a C99 variadic macro #define __PVOP_VCALL ( op , pre , post, ... ) ____PVOP_VCALL ( op , CLBR_ANY , PVOP_VCALL_CLOBBERS , VEXTRA_CLOBBERS , pre, post , ## __VA_ARGS__ ).
>
> This is written exactly as is in the D file. I still have to figure out why this happens.
December 02, 2019
Update for week 2 of Milestone 3

- Fixed the bitfields issue (https://github.com/atilaneves/dpp/pull/219)
- Tested DPP with all the kernel headers included by Alex's driver, plus 70 other random kernel headers
- Discovered (what I hope to be the last) 5 issues

1. In some obscure parts of the kernel there is something like 'extern __visible const void __nosave_begin;', which is a variable of type void (but declared extern, so the compiler doesn't complain). I'm still thinking about how DPP should translate this to valid D code.

2. Unknown enums in function return/param types don't get opaque definitions generated for them (as is the case for structs).

e.g.: enum A f(enum B); // A and B would be unknown identifiers
      struct C *f(struct D *); // C and D would be declared by default by DPP

3. Unreachable struct

e.g.:
// C
struct A {
    struct B {
        int a;
        int b;
    };
};

void f(struct B *); // valid in C

// D
void f(B *); // not valid in D, this is how DPP translates the function declaration
void f(A.B *); // OK, this should probably be the actual output

4. 'struct sockaddr size not known', and some functions return this struct type by value (not a pointer to it). The definition of this struct should probably be there. Will check to see if I missed any flags in the Makefile.
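
For issue 3 (the unreachable struct), a small self-contained C sketch of the scoping rule involved. The member name `inner` and the helper `sum_b` are additions for this sketch only, so that it stays valid ISO C. In C a struct tag declared inside another struct is still visible at file scope, which is why 'struct B *' works there but has to become 'A.B *' in D.

```c
#include <assert.h>

struct A {
    struct B {
        int a;
        int b;
    } inner;   /* member name added for this sketch */
};

/* In C the tag `B` is visible at file scope even though it was declared inside `A`. */
static int sum_b(const struct B *p) { return p->a + p->b; }
```

A `struct B` can also be declared on its own at file scope in C, without going through `struct A` at all.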

> (3) I've tried using DPP with other kernel headers as well (specifically netdevice.h), and usually I get an error telling me the resources have been exhausted.

For this issue I previously proposed 2 approaches, which have their own limitations (thanks Atila for pointing them out).
The memory exhaustion seems to happen while cpp/clang is still running, so the issue is not the lines array (which would be no bigger than a few MB anyway, while the RAM usage is more than 6GB). I will, therefore, do some profiling to narrow down the causes and decide if there's something we could do about it.