Thread overview
vk.xml
Feb 21, 2016
Nicholas Wilson
Feb 21, 2016
ZombineDev
Feb 22, 2016
Nicholas Wilson
Feb 23, 2016
Nicholas Wilson
Feb 24, 2016
Nicholas Wilson
Feb 24, 2016
Mike Parker
February 21, 2016
So I was going through the Vulkan spec to try to create better D bindings for it (turning pointer/len pairs into arrays, adhering to D naming conventions, and prettying up the *Create functions, e.g. turning VkResult *Create(arg, &arg, retptr) into a fancy throw-on-misuse struct with constructors, and that kind of stuff).

The structure of the registry is as follows.


struct KHR_registry
{

    string comment; // not an XML comment, but a <comment> ... </comment> element

    struct vendorid
    {
        @Tag string name;
        @Tag string id;
        @Tag string comment;
    }
    vendorid[] vendorids;

    struct tag
    {
        @Tag string name;
        @Tag string author;
        @Tag string contact;
    }
    tag[] tags;

    struct validity_type    // does not appear at this level but is common to commands and types
    {
        struct usage_type
        {
            string content;
        }
        usage_type[] usage;
    }
    struct type
    {
        string name;            // inconsistently in the tag or in the content
        string requires;
        string parent;
        string content;
        enum category_type
        {
            include,
            define,
            basetype,
            bitmask,
            handle,
            enum_,
            funcpointer,
            struct_,
        }
        category_type category;
        bool returnedonly;

        validity_type validity;
        struct member
        {
            string typename;        // name of the member's type (variable tag type)
            string name;            // name of the member variable
            string len;             // companion length member for array pointers
            bool optional;          // should generate a default value
            bool noautovalidity;
        }
        enums* enum_;               // pointer to the corresponding enum definition
        struct params
        {
            type* paramType;        // for some reason any pointer qualifications come after
                                    // the closing tag, e.g. <type>void</type>*; see the comment below
            string name;
            size_t indirections;    // number of pointer indirections before reaching the
                                    // type pointed to by paramType
            string content;
        }
        params[] parameters;        // functions taking no args will have parameters == []
    }
    type[] types;

    struct enums
    {
        string name,comment, type, expand;
          // expand is the C enum's Prefix e.g. VK_QUERY_TYPE
        struct enum_
        {
            string value, name, comment;
        }
    }

    struct command
    {
        string successcodes;
        string errorcodes;
        string content;
        struct proto_type
        {
            string content;
            type* returnType;
            string name;


        }
        proto_type proto;
        struct param
        {
            string content;
            bool optional;
            string externsync;
            string len;
            size_t indirections;
            immutable invalidLen = "null-terminated";
            // if len is this then it is not an array,
            // although this is inconsistent: sometimes it appears as a len,
            // other times in the validity
        }
        validity_type validity;
        string queues, renderpass, cmdbufferlevel;
    }

    command[] commands;
    struct feature
    {
        string api, name, number;

        /*
         <require comment="API version">
         <type name="VK_API_VERSION"/>
         <enum name="VK_VERSION_MAJOR"/>
         <command name="VK_VERSION_MINOR"/>
         <type name="VK_VERSION_PATCH"/>
         </require>*/
        struct require

        {
            string comment;
            string metaType; // "type", "enum" or "command": the tag name varies, see above
            string name;
        }
        require[] requires;
    }
    struct extension
    {
        string name, number, supported;
        struct require
        {
            struct elem
            {
                string name, content;
                string metaType; //variable tag type
                string offset, extends;
            }
        }
    }
    extension[] extensions;
}

Here "string content" is the string between the opening and closing tags (does this have a name?), and things marked with @Tag are always in the tag itself (although this is also inconsistent). I tried looking at the docs for std.xml but they were not very helpful.

To further complicate things, not all of the fields are present in all cases. I also need to keep track of the number of '*' (see the fields named indirections) that follow a closed tag, for pointer type declarations in function signatures and structs.

Also, is there a way to do sane cross-referencing without having internal pointers everywhere?

I'm sure there should be a very elegant way to do this by recursing down through the sub-structs.

Many thanks
Nic
February 21, 2016
On Sunday, 21 February 2016 at 12:52:33 UTC, Nicholas Wilson wrote:
> So I was going through the Vulkan spec to try to create better D bindings for it (turning pointer/len pairs into arrays, adhering to D naming conventions, and prettying up the *Create functions, e.g. turning VkResult *Create(arg, &arg, retptr) into a fancy throw-on-misuse struct with constructors, and that kind of stuff).
>
> The structure of the registry is as follows.
>
>
> struct KHR_registry
> {
> <snip>
> }
>

I'm glad to see more people looking to create a D binding from vk.xml!
I was also working on this (http://forum.dlang.org/post/ygylvtuwwiwyqtcnlejh@forum.dlang.org). Unfortunately, I won't be able to continue my work until early March, so I hope you'll do a good job in the meantime ;)

Unfortunately, the spec is written in a very C-specific way, so we need a bunch of special cases for parsing the C code that's interleaved with the XML tags.

I haven't used std.xml yet, but I hope I may be able to help you.

> Here "string content" is the string between the opening and closing tags (does this have a name?), and things marked with @Tag are always in the tag itself (although this is also inconsistent). I tried looking at the docs for std.xml but they were not very helpful.

Maybe you can use
http://dlang.org/phobos/std_xml#.Element.text in combination with
http://dlang.org/phobos/std_xml#.Element.elements

> To further complicate things, not all of the fields are present in all cases. I also need to keep track of the number of '*' (see the fields named indirections) that follow a closed tag, for pointer type declarations in function signatures and structs.

You mean things like:
<param><type>void</type>** <name>ppData</name></param>
(from the vkMapMemory command) ?

Yeah this is extremely ugly.
One way to do it is to go to the "param" XML element
and check whether the size of its child elements matches the size of its text. In the case above, everything between <param> and </param> is 39 characters and the two child elements are
17 and 19 characters respectively (36 in total), which leaves 3 characters for the indirections. If we make a pass to remove all whitespace between tags (between ">" and "<") we should get exactly 2 characters, which is the number of indirections we're looking for.

Another way is to just strip the child tags and leave only their inner text. In the above example the result should be:
<param>void** ppData</param>
whose inner text is valid D code, so we can leave it that way for now.
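
A minimal sketch of the tag-stripping idea, assuming the fragment is available as a plain string and using std.regex rather than an XML library. The helper names stripTags and countIndirections are made up for illustration and are not part of any binding generator:

```d
import std.algorithm : count;
import std.regex : regex, replaceAll;

/// Removes every <...> tag from an XML fragment, keeping only the text between tags.
string stripTags(string xml)
{
    return replaceAll(xml, regex(`<[^>]*>`), "");
}

/// Counts the '*' characters that remain once the tags are gone.
size_t countIndirections(string xml)
{
    return stripTags(xml).count('*');
}

unittest
{
    enum param = `<param><type>void</type>** <name>ppData</name></param>`;
    assert(stripTags(param) == "void** ppData");
    assert(countIndirections(param) == 2);
}
```

Counting on the stripped text sidesteps the character arithmetic entirely, at the cost of assuming that '*' never appears inside an attribute value of the fragment.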

> Also, is there a way to do sane cross-referencing without having internal pointers everywhere?

I am no XML expert (so there may be an easier way), but I think it would be easier if we use associative arrays, instead of plain arrays. That way we can check for example:

if ("VkBuffer" in types && "VkDevice" in types)
    types["VkBuffer"].parent = types["VkDevice"];

> I'm sure there should be a very elegant way to do this by recursing down through the sub-structs.

The simplest solution, IMO, is to insert all the types, commands, enums, etc. in associative arrays first and then do cross-referencing.

E.g.

import std.variant : Algebraic;

Type[string] types;
struct Type
{
   // ... other members

   // Set to string first, and later change
   // to the proper Type, after the whole `types` table has been read.
   Algebraic!(string, Type*) parent;

   // ... more members
}
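
The two-pass idea can be sketched as follows. This is a simplification, not the code from either binding project: instead of Algebraic it keeps the raw name and the resolved pointer in two separate fields, and the names parentName and resolveParents are assumptions made for the example.

```d
struct Type
{
    string name;
    string parentName; // raw name as read from the XML; empty if there is none
    Type* parent;      // filled in by resolveParents, null if unresolved
}

/// Second pass: once every type is in the table, turn parent names into pointers.
void resolveParents(Type[string] types)
{
    foreach (ref t; types)
    {
        // `in` yields a pointer to the stored value, or null if the key is absent.
        if (auto p = t.parentName in types)
            t.parent = p;
    }
}

unittest
{
    Type[string] types;
    // First pass: insertion order does not matter, VkBuffer may come before VkDevice.
    types["VkBuffer"] = Type("VkBuffer", "VkDevice");
    types["VkDevice"] = Type("VkDevice");
    resolveParents(types);
    assert(types["VkBuffer"].parent is &types["VkDevice"]);
    assert(types["VkDevice"].parent is null);
}
```

Because the lookup only happens after the whole table has been filled, forward references need no special handling.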

Otherwise the only sane way to do forward reference resolution is to use D Fibers. I have used fibers in some of my projects, but the extra complications are not worth it for this task as the xml is only 5100 lines and the performance benefits probably would not be very large.

> Many thanks
> Nic

BTW if you run into some other issues with std.xml, you can also check
http://arsdnet.net/web.d/dom.html and
https://github.com/burner/std.xml2

Also don't forget to use unittests extensively!
February 22, 2016
On Sunday, 21 February 2016 at 15:18:44 UTC, ZombineDev wrote:
> [...]

Thanks for the tips. I used AAs and just got it to compile! :) :| :( But it fails to link:
Undefined symbols for architecture x86_64:
  "_D28TypeInfo_xC4arsd3dom7Element6__initZ", referenced from:
      _D29TypeInfo_AxC4arsd3dom7Element6__initZ in app.o
  "_D4arsd3dom12__ModuleInfoZ", referenced from:
      _D3app12__ModuleInfoZ in app.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
--- errorlevel 1
Any ideas?

The directory structure is:
app.d
app.o
arsd/
        dom.d
        dom.o

-------------------
CODE
-------------------
struct KHR_registry
{

    string comment; // not an XML comment, but a <comment> ... </comment> element

    struct vendorid
    {
        string  name;
        string  id;
        string  comment;
    }
    vendorid[] vendorids;

    struct tag
    {
        string  name;
        string  author;
        string  contact;
    }
    tag[] tags;

    struct validity_type    // does not appear at this level but is common to commands and types
    {
        string[] str;

    }
    struct type
    {
        string name;            // inconsistently in the tag or in the content
        string requires;
        string parent;          // these are different things
        string parentName;      // ditto
        string content;         // <type ... > content </type>
        string category;
        bool returnedonly;

        validity_type validity;
        struct member
        {
            string typename;                    //name of the member's type (variable tag type)
            string name;                        //name of the member variable
            string len;                         //companion member length variable for array pointers
            bool optional;                      //should generate as a default value
            bool noautovalidity;
            string enum_;

        }
        member[string] members;
    }
    type[string] types;

    struct enums
    {
        string name,comment, type, expand;      // expand is the C enum's Prefix e.g. VK_QUERY_TYPE
        struct enum_
        {
            string value, name, comment;
        }
        enum_[string] values;
    }
    enums[string] enumerations;
    struct command
    {
        string successcodes;
        string errorcodes;
        string returnType;
        string name;
        string content;

        struct param
        {
            string content;
            bool optional;
            string externsync;
            string len;
            size_t indirections;
            immutable invalidLen = "null-terminated";   // if len is this then it is not an array,
                                                        // although this is inconsistent: sometimes it
                                                        // appears as a len, other times in the validity
        }
        param[] params;
        validity_type validity;
        string queues, renderpass, cmdbufferlevel;
    }

    command[string] commands;
    struct feature
    {
        string api, name, number;

        /*
         <require comment="API version">
         <type name="VK_API_VERSION"/>
         <enum name="VK_VERSION_MAJOR"/>
         <command name="VK_VERSION_MINOR"/>
         <type name="VK_VERSION_PATCH"/>
         </require>*/
        struct require

        {
            string comment;
            string metaType; // type enum command see above variable tag type

            string name;


        }
        require requires;
    }
    struct extension
    {
        string name, number, supported;
        struct require
        {
            struct elem
            {
                string name, content;
                string metaType;
                string offset, extends;
            }
        }
    }
    extension[] extensions;
}

import std.algorithm : map;
import std.array : array;
import arsd.dom;

KHR_registry regFromArsdDocument(Document doc)
{
    KHR_registry reg;
    reg.comment   = doc.getElementsByTagName("comment")[0].innerText; // only one comment
    reg.vendorids = doc.getElementsByTagName("vendorids")[0]
                            .getElementsByTagName("vendorid")
                            .map!((elem)    // a function literal, not `elem => { ... }`, which
                            {               // would return a delegate instead of the value
                                KHR_registry.vendorid vid;
                                vid.name    = elem.getAttribute("name");
                                vid.id      = elem.getAttribute("id");
                                vid.comment = elem.getAttribute("comment");
                                return vid;
                            })
                            .array;
    reg.tags      = doc.getElementsByTagName("tags")[0]
                            .getElementsByTagName("tag")
                            .map!((elem)
                            {
                                KHR_registry.tag t;
                                t.name      = elem.getAttribute("name");
                                t.author    = elem.getAttribute("author");
                                t.contact   = elem.getAttribute("contact");
                                return t;
                            })
                            .array;

    // Associative array assignment: map is lazy, so a map whose result is never
    // consumed does nothing. Use foreach for the side effects instead.
    foreach (elem; doc.getElementsByTagName("types")[0].getElementsByTagName("type"))
    {
        KHR_registry.type t;
        t.name = elem.getAttribute("name"); // for root types only
        if (t.name.length == 0)
        {
            auto names = elem.getElementsByTagName("name");
            if (names.length == 0)
                continue; // no name at all, e.g. a nested <type> reference: not a definition
            t.name = names[0].innerText;
            auto nested = elem.getElementsByTagName("type");
            if (nested.length)
                t.parentName = nested[0].innerText;
        }
        t.requires = elem.getAttribute("requires");
        t.parent   = elem.getAttribute("parent");
        t.content  = elem.innerText;
        t.category = elem.getAttribute("category");
        t.returnedonly = elem.getAttribute("returnedonly") != null; // if it is present it is true
        auto validity = elem.getElementsByTagName("validity");
        if (validity.length)
        {
            t.validity.str = validity[0]
                                .getElementsByTagName("usage")
                                .map!(elem2 => elem2.innerText).array;
        }

        if (t.category == "funcpointer")
        {
            t.parentName = null; // first <type> ... </type> is not actually the parent type
            // we don't use the types in the signature because we use the inner text.
        }
        if (t.category == "struct")
        {   // TODO: preserve the XML comments
            foreach (elem2; elem.getElementsByTagName("member"))
            {
                KHR_registry.type.member m;
                m.name      = elem2.getElementsByTagName("name")[0].innerText;
                m.typename  = elem2.getElementsByTagName("type")[0].innerText;
                m.len       = elem2.getAttribute("len");
                m.enum_     = elem2.getAttribute("enum");   // for C-style array decls; don't store the
                                                            // text, it's the wrong way round for D,
                                                            // fix it later
                m.optional  = elem2.getAttribute("optional") != null;   // if it is present it is true
                m.noautovalidity  = elem2.getAttribute("noautovalidity") != null;

                t.members[m.name] = m;
            }
        }

        reg.types[t.name] = t;
    }

    foreach (elem; doc.getElementsByTagName("enums")) // there are multiple <enums> elements
    {
        KHR_registry.enums e;
        e.name      = elem.getAttribute("name");
        e.comment   = elem.getAttribute("comment");
        e.expand    = elem.getAttribute("expand");

        foreach (elem2; elem.getElementsByTagName("enum"))
        {
            KHR_registry.enums.enum_ e2;
            e2.name     = elem2.getAttribute("name");   // read from the <enum>, not the parent <enums>
            e2.comment  = elem2.getAttribute("comment");
            e2.value    = elem2.getAttribute("value");
            e.values[e2.name] = e2;
        }
        reg.enumerations[e.name] = e;
    }

    foreach (elem; doc.getElementsByTagName("commands")[0].getElementsByTagName("command"))
    {
        KHR_registry.command c;
        c.successcodes      = elem.getAttribute("successcodes");
        c.errorcodes        = elem.getAttribute("errorcodes");
        c.queues            = elem.getAttribute("queues");
        c.renderpass        = elem.getAttribute("renderpass");
        c.cmdbufferlevel    = elem.getAttribute("cmdbufferlevel");
        c.content           = elem.innerText;
        auto proto          = elem.getElementsByTagName("proto")[0];
        c.name              = proto.getElementsByTagName("name")[0].innerText;
        c.returnType        = proto.getElementsByTagName("type")[0].innerText;
        auto validity = elem.getElementsByTagName("validity");
        if (validity.length)
        {
            c.validity.str = validity[0]
                                .getElementsByTagName("usage")
                                .map!(elem2 => elem2.innerText).array;
        }
        foreach (elem2; elem.getElementsByTagName("param"))
        {
            KHR_registry.command.param p;
            p.content = elem2.innerText;
            p.optional = elem2.getAttribute("optional") != null; // present = true
            p.externsync = elem2.getAttribute("externsync");
            p.len = elem2.getAttribute("len");

            c.params ~= p;
        }
        reg.commands[c.name] = c; // store the command, otherwise the table stays empty
    }

    return reg;
}


February 23, 2016
On Monday, 22 February 2016 at 07:00:42 UTC, Nicholas Wilson wrote:
> On Sunday, 21 February 2016 at 15:18:44 UTC, ZombineDev wrote:
>> [...]
>
> Thanks for the tips. I used AA and just got it to compile! :) :| :( but fails to link.
> Undefined symbols for architecture x86_64:
>   "_D28TypeInfo_xC4arsd3dom7Element6__initZ", referenced from:
>       _D29TypeInfo_AxC4arsd3dom7Element6__initZ in app.o
>   "_D4arsd3dom12__ModuleInfoZ", referenced from:
>       _D3app12__ModuleInfoZ in app.o
> ld: symbol(s) not found for architecture x86_64
> clang: error: linker command failed with exit code 1 (use -v to see invocation)
> --- errorlevel 1
> any Ideas?
>
> [...]

It was because I was using dmd directly as opposed to dub, so arsd/dom.d was presumably never handed to the linker. Derp!


February 24, 2016
On Sunday, 21 February 2016 at 15:18:44 UTC, ZombineDev wrote:
> On Sunday, 21 February 2016 at 12:52:33 UTC, Nicholas Wilson wrote:
>>[...]
>
> I'm glad to see more people looking to create a D binding from vk.xml!
> I was also working on this (http://forum.dlang.org/post/ygylvtuwwiwyqtcnlejh@forum.dlang.org), unfortunately I won't be able to continue my work until early March, so I hope you'll do a good job in the mean time ;)
>
> [...]

AAs are nice in theory but the non-deterministic nature of their order of iteration is painful...
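
One possible workaround, sketched under the assumption that alphabetical order is good enough for the generated output: iterate over a sorted copy of the keys rather than over the AA itself. sortedKeys is a hypothetical helper name.

```d
import std.algorithm : sort;

/// Returns the keys of an associative array in sorted order, so that
/// iteration is reproducible from run to run.
string[] sortedKeys(T)(T[string] aa)
{
    auto keys = aa.keys; // freshly allocated array, safe to sort in place
    sort(keys);
    return keys;
}

unittest
{
    int[string] commands = ["vkMapMemory": 1, "vkCreateDevice": 2, "vkAllocateMemory": 3];
    assert(sortedKeys(commands) == ["vkAllocateMemory", "vkCreateDevice", "vkMapMemory"]);
}
```

This does not recover the order in which entries appear in vk.xml; for that, an array of names would have to be kept alongside the AA.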
February 24, 2016
On Wednesday, 24 February 2016 at 00:50:40 UTC, Nicholas Wilson wrote:

> AAs are nice in theory but the non-deterministic nature of their order of iteration is painful...

An ordered map as the default AA implementation would be worse. Most use cases for a hash map don't need ordering. Perhaps we'll have one in std.container at some point, but as it stands I'm unaware of any implementations out there. Neither the EMSI containers [1] nor dcollections [2] has one that I can see.

[1] https://github.com/economicmodeling/containers/tree/master/src/containers
[2] https://github.com/schveiguy/dcollections/tree/master/dcollections
February 24, 2016
On 2/23/16 10:42 PM, Mike Parker wrote:
> On Wednesday, 24 February 2016 at 00:50:40 UTC, Nicholas Wilson wrote:
>
>> AAs are nice in theory but the non-deterministic nature of their
>> order of iteration is painful...
>
> An ordered map as the default AA implementation would be worse. Most use
> cases for a hash map don't need ordering. Perhaps we'll have one in
> std.container at some point, but as it stands I'm unaware of any
> implementations out there. Neither the EMSI containers [1] nor
> dcollections [2] has one that I can see.
>
> [1]
> https://github.com/economicmodeling/containers/tree/master/src/containers
> [2] https://github.com/schveiguy/dcollections/tree/master/dcollections

I've contemplated adding one. I think it's a nice feature of PHP.

But note that red-black trees are ordered, if you need some sort of ordering independent of insertion order.
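
For instance, a small sketch with Phobos' std.container.rbtree (the command names are only sample data):

```d
import std.algorithm : equal;
import std.container.rbtree : redBlackTree;

unittest
{
    // Elements are kept sorted by "<", whatever order they are inserted in.
    auto names = redBlackTree("vkMapMemory", "vkCreateDevice");
    names.insert("vkAllocateMemory");
    assert(equal(names[], ["vkAllocateMemory", "vkCreateDevice", "vkMapMemory"]));
}
```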

-Steve