Thread overview
Covert a complex C header to D
Apr 02, 2017
biocyberman
Apr 02, 2017
Nicholas Wilson
Apr 03, 2017
Nicholas Wilson
Apr 03, 2017
biocyberman
Apr 03, 2017
Nicholas Wilson
Apr 03, 2017
Stefan Koch
Apr 05, 2017
biocyberman
Apr 05, 2017
Nicholas Wilson
OT: It is convert, not covert
Apr 04, 2017
Ali Çehreli
Apr 04, 2017
biocyberman
Apr 04, 2017
Nicholas Wilson
Apr 04, 2017
Ali Çehreli
April 02, 2017
khash.h (http://attractivechaos.github.io/klib/#Khash%3A%20generic%20hash%20table) is part of the klib library in C. I want to covert it to D as a way of learning D in more depth.

First I tried Dstep (https://github.com/jacob-carlborg/dstep) and read the C to D article (https://dlang.org/ctod.html). I managed to covert the basic statements to D, but all multiline '#define' macros are stripped out. So I am trying to recreate them the D way. For example:


#define __KHASH_TYPE(name, khkey_t, khval_t) \
	typedef struct kh_##name##_s { \
		khint_t n_buckets, size, n_occupied, upper_bound; \
		khint32_t *flags; \
		khkey_t *keys; \
		khval_t *vals; \
	} kh_##name##_t;


I changed to:

template __KHASH_TYPE(string name){
  "struct  kh_" ~ name ~"_t { " ~
                "khint_t n_buckets, size, n_occupied, upper_bound; " ~
                "khint32_t *flags; " ~
                "khkey_t *keys; " ~
                "khval_t *vals; " ~
        "}"

}

// NEXT: use mixin with this template.

I am currently a bit intimidated looking at the KHASH_INIT2 macro in khash.h. How do I convert this to equivalent, idiomatic D?



April 02, 2017
You are on the right track, converting #define's that declare symbols to template strings to be mixed in. But you also need to parameterise the key type and the value type as they are also arguments to the macro.

So you'd go

mixin(__KHASH_TYPE!("mytype", string, int));

However, it is generally considered better to use templates where possible, as they are generally easier to reason about (and look nicer). Since this is a relatively simple case we could just go:

struct kh_hashtable_t(string name, K, V) {
    // kh_hashtable_t is a struct parameterised on the types K and V
    khint_t n_buckets, size, n_occupied, upper_bound;
    khint32_t* flags;
    K* keys;
    V* vals;
}

and not worry about "name": the compiler will generate an internal name for us. It doesn't matter what it is, but it is guaranteed to be unique, which is the main property we want. We probably don't even need the name parameter at all.
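
As a rough sketch of that idea, here is the struct with the name parameter dropped entirely. The two khint aliases are an assumption standing in for the typedefs in khash.h (taken here to be unsigned 32-bit integers), and the alias names StrToInt and IntToDouble are made up for illustration.

```d
// Assumed aliases mirroring the typedefs in khash.h (unsigned 32-bit).
alias khint32_t = uint;
alias khint_t = khint32_t;

// One struct template stands in for every kh_##name##_t the macro would emit.
struct kh_hashtable_t(K, V)
{
    khint_t n_buckets, size, n_occupied, upper_bound;
    khint32_t* flags;
    K* keys;
    V* vals;
}

// Hypothetical instantiations: each distinct (K, V) pair is its own type,
// so no user-supplied name is needed to keep them apart.
alias StrToInt = kh_hashtable_t!(string, int);
alias IntToDouble = kh_hashtable_t!(int, double);

static assert(!is(StrToInt == IntToDouble));
static assert(is(typeof(StrToInt.keys) == string*));
static assert(is(typeof(IntToDouble.vals) == double*));

unittest
{
    StrToInt table; // default-initialised: counters are 0, pointers are null
    assert(table.n_buckets == 0);
    assert(table.keys is null);
}
```

The static asserts are checked at compile time; the unittest block runs when the file is built with dmd -unittest -main.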

(There is also the built-in hash table, declared V[K], e.g. int[string], i.e. a hash table of ints indexed by strings.)
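
For comparison, a minimal example of the built-in associative array mentioned above. The variable name and the keys are arbitrary.

```d
unittest
{
    int[string] counts;      // hash table of ints indexed by strings
    counts["a"] = 1;         // insertion creates the entry
    counts["b"] = 2;

    assert(counts["a"] == 1);
    assert(counts.length == 2);
    assert("c" !in counts);  // membership test without inserting
}
```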

So for KHASH_INIT2:

the arguments to the macro are
    name: a string
    scope: a protection modifier (in C they use static inline; in D this would be pragma(inline, true) private). But I would ignore this parameter.
    khkey_t: the key type
    khval_t: the value type
    kh_is_map: a bool (not sure of its purpose).
    __hash_func: the function used to generate a hash from the key
    __hash_equal: the function used to test whether two keys are equal

so you'd want something like

template KHASH_INIT(string name, K, V, bool kh_is_map, alias keyhash, alias equal = (K a, K b) => a == b)
{
    //...
}

where K and V are types, "alias keyhash" is a function that transforms a key into a hash, and "alias equal" is a function that determines whether two keys are equal.

You'd call it like

KHASH_INIT!("some_name", string, int, true, (string a) => myFancyHash(a) /* leave equal as the default */);
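
A small compilable sketch of such a template follows, with the name parameter left out. Everything inside the template body (the hash and same wrappers), the alias names StrInt and ByLength, and the body of myFancyHash are assumptions made up for illustration; none of them come from khash.h. kh_is_map is accepted but unused here.

```d
alias khint_t = uint; // assumed to match the 32-bit unsigned khint_t of khash.h

// Hypothetical hash function, only here so the example compiles.
khint_t myFancyHash(string s)
{
    khint_t h = 0;
    foreach (char c; s)
        h = h * 31 + c; // unsigned arithmetic wraps on overflow
    return h;
}

template KHASH_INIT(K, V, bool kh_is_map, alias keyhash,
                    alias equal = (K a, K b) => a == b)
{
    // Thin wrappers showing how the alias parameters are used.
    khint_t hash(K key) { return keyhash(key); }
    bool same(K a, K b) { return equal(a, b); }
}

// Default equality, hash supplied as a lambda.
alias StrInt = KHASH_INIT!(string, int, true, (string a) => myFancyHash(a));

// Custom equality: keys count as equal when their lengths match.
alias ByLength = KHASH_INIT!(string, int, true, myFancyHash,
                             (string a, string b) => a.length == b.length);

unittest
{
    assert(StrInt.hash("ab") == 97 * 31 + 98);
    assert(StrInt.same("ab", "ab"));
    assert(!StrInt.same("ab", "ba"));
    assert(ByLength.same("ab", "cd"));
}
```

Because keyhash and equal are alias parameters, the compiler sees the actual functions at each instantiation, much as the C macro pastes them in textually.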

Let me know if you get stuck.
Nic
April 03, 2017
On Sunday, 2 April 2017 at 21:43:52 UTC, biocyberman wrote:
> template __KHASH_TYPE(string name){
>   "struct  kh_" ~ name ~"_t { " ~
>                 "khint_t n_buckets, size, n_occupied, upper_bound; " ~
>                 "khint32_t *flags; " ~
>                 "khkey_t *keys; " ~
>                 "khval_t *vals; " ~
>         "}"
>
> }

Not that you'll get bitten by it in this case but in D the pointer declarator * is left associative.

i.e. in C

 int *pInt, Int; // "Int" is int not an int*
 int *pInt, Int[3]; // Int is a static array of 3 ints.
but in D

misleading:
 int *pInt, Int; // Int is an int*!!

wrong:
 int *pInt, three_Ints[3]; // Error cannot mix declared types

not misleading
int* pInt, pInt2; // BOTH int*

int*pInt; //pointer to int
int[3] three_Ints; // static array of 3 ints.
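
The D half of this can be checked by the compiler. A short sketch with arbitrary variable names:

```d
int* p, q;        // the * belongs to the type, so BOTH are int*
int[3] threeInts; // static array of 3 ints
int*[2] ptrs;     // static array of 2 pointers to int

static assert(is(typeof(p) == int*));
static assert(is(typeof(q) == int*));
static assert(is(typeof(threeInts) == int[3]));
static assert(is(typeof(ptrs) == int*[2]));
```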
April 03, 2017
On Monday, 3 April 2017 at 00:00:04 UTC, Nicholas Wilson wrote:
> On Sunday, 2 April 2017 at 21:43:52 UTC, biocyberman wrote:
>> template __KHASH_TYPE(string name){
>>   "struct  kh_" ~ name ~"_t { " ~
>>                 "khint_t n_buckets, size, n_occupied, upper_bound; " ~
>>                 "khint32_t *flags; " ~
>>                 "khkey_t *keys; " ~
>>                 "khval_t *vals; " ~
>>         "}"
>>
>> }
>
> Not that you'll get bitten by it in this case but in D the pointer declarator * is left associative.
>
> i.e. in C
>
>  int *pInt, Int; // "Int" is int not an int*
>  int *pInt, Int[3]; // Int is a static array of 3 ints.
> but in D
>
> misleading:
>  int *pInt, Int; // Int is an int*!!
>
> wrong:
>  int *pInt, three_Ints[3]; // Error cannot mix declared types
>
> not misleading
> int* pInt, pInt2; // BOTH int*
>
> int*pInt; //pointer to int
> int[3] three_Ints; // static array of 3 ints.

Thank you for some excellent tips, Nicholas Wilson. I made this repo: https://github.com/biocyberman/klibD. You are more than welcome to make direct contributions with PRs there. The next milestone I want to reach is to complete the conversion of khash.d and have test code to go with it.

April 03, 2017
On Monday, 3 April 2017 at 10:04:53 UTC, biocyberman wrote:
> Thank you for some excellent tips, Nicholas Wilson. I made this repo: https://github.com/biocyberman/klibD. You are more than welcome to make direct contributions with PRs there. The next milestone I want to reach is to complete the conversion of khash.d and have test code to go with it.

I'm very busy atm but I will give some general tips:
    prefer template over string mixins where possible. This will make the code much more readable.
    try to remove C-isms. Separate declaration and definition is the most glaring example. But also the functions that deal with the kh_hashtable should be member functions.
    all of the "name" parameters in the macros should not be needed, as D has overloading and mangling to handle that.

Other than that, good luck and learn lots!

April 03, 2017
On Monday, 3 April 2017 at 11:18:21 UTC, Nicholas Wilson wrote:
>    prefer template over string mixins where possible. This will make the code much more readable.

My advice would be the opposite.
Templates put much more pressure on the compiler than string mixins do.
Also, the code that templates expand to is hard to get at,
whereas the code that string mixins expand to can always be printed one way or another.
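
To illustrate the printing point, here is a sketch of the string-mixin route applied to the __KHASH_TYPE example from the first post. The function name khashTypeDecl and the table name "str2int" are hypothetical, and the khint aliases are assumed to match khash.h. pragma(msg, ...) prints the generated declaration during compilation.

```d
// Assumed aliases mirroring the typedefs in khash.h (unsigned 32-bit).
alias khint32_t = uint;
alias khint_t = khint32_t;

// Hypothetical helper: builds the struct declaration as a string.
// It is an ordinary function that the compiler can evaluate at compile time.
string khashTypeDecl(string name, string keyType, string valType)
{
    return "struct kh_" ~ name ~ "_t {\n"
         ~ "    khint_t n_buckets, size, n_occupied, upper_bound;\n"
         ~ "    khint32_t* flags;\n"
         ~ "    " ~ keyType ~ "* keys;\n"
         ~ "    " ~ valType ~ "* vals;\n"
         ~ "}";
}

// Print the generated code while compiling, then mix it in as a declaration.
pragma(msg, khashTypeDecl("str2int", "string", "int"));
mixin(khashTypeDecl("str2int", "string", "int"));

static assert(is(typeof(kh_str2int_t.keys) == string*));
static assert(is(typeof(kh_str2int_t.vals) == int*));
```

Here the key and value types are passed as strings, which is the price of the string approach: the compiler only checks them once the text is mixed in.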
April 03, 2017
Covert has a very different meaning. :)

Ali

April 04, 2017
On Tuesday, 4 April 2017 at 05:29:42 UTC, Ali Çehreli wrote:
> Covert has a very different meaning. :)
>
> Ali

Thanks Ali. My fingers argued they are the same :) And I can't find a way to edit my post after posting.

I would love to have your input. I have revisited your book several times to read the relevant sections. But these complex macros are still holding me back.
April 04, 2017
On Tuesday, 4 April 2017 at 09:37:12 UTC, biocyberman wrote:
> On Tuesday, 4 April 2017 at 05:29:42 UTC, Ali Çehreli wrote:
>> Covert has a very different meaning. :)
>>
>> Ali
>
> Thanks Ali. My fingers argued they are the same :) And I can't find a way to edit my post after posting.
>
> I would love to have your input. I have revisited your book several times to read the relevant sections. But these complex macros are still holding me back.

Most of those macros are not needed and can just be part of the struct definition:

i.e. you want something like

struct kh_hashtable(K, V, bool _is_map, alias hash_func, alias hash_eq = (K a, K b) => a == b) {
    khint_t n_buckets, size, n_occupied, upper_bound;
    khint32_t* flags;
    K* keys;
    V* vals;
    // No need for __KHASH_PROTOTYPES / __KHASH_IMPL: just declare the functions as methods of the struct.
    // In place of kh_init_##name: a D struct cannot declare a no-argument this(),
    // so rely on the default (.init) state, which zeroes the counters and nulls the pointers.
    ~this() { ... } // for destroy
    auto resize(khint_t new_size) { ... } // kh_resize_##name
    // and so on for each method in __KHASH_IMPL
}
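
A heavily simplified, compilable sketch of that shape is below. It is not a port of khash: the allocate method is a hypothetical helper that only sets up zeroed storage for an empty table (no rehashing, no flag encoding), the hash and equality parameters are omitted, and the khint aliases are assumed to match khash.h. It only shows the default state replacing an init function and the destructor replacing a destroy function.

```d
import core.stdc.stdlib : calloc, free;

// Assumed aliases mirroring the typedefs in khash.h (unsigned 32-bit).
alias khint32_t = uint;
alias khint_t = khint32_t;

struct kh_hashtable(K, V)
{
    khint_t n_buckets, size, n_occupied, upper_bound;
    khint32_t* flags;
    K* keys;
    V* vals;

    @disable this(this); // no copies, so each buffer is freed exactly once

    // Plays the role of a destroy function; free(null) is a no-op.
    ~this()
    {
        free(flags);
        free(keys);
        free(vals);
    }

    // Hypothetical helper, not a khash function: allocate zeroed storage
    // for n buckets in an empty table. Returns false if allocation fails.
    bool allocate(khint_t n)
    {
        assert(n_buckets == 0, "sketch only handles an empty table");
        auto f = cast(khint32_t*) calloc(n, khint32_t.sizeof);
        auto k = cast(K*) calloc(n, K.sizeof);
        auto v = cast(V*) calloc(n, V.sizeof);
        if (f is null || k is null || v is null)
        {
            free(f);
            free(k);
            free(v);
            return false;
        }
        flags = f;
        keys = k;
        vals = v;
        n_buckets = n;
        return true;
    }
}

unittest
{
    kh_hashtable!(int, double) h; // the .init state stands in for an init call
    assert(h.n_buckets == 0 && h.keys is null);
    assert(h.allocate(16));
    assert(h.n_buckets == 16);
    assert(h.keys[15] == 0);
} // the destructor runs here
```

Memory from calloc is not scanned by the garbage collector, so this sketch sticks to key and value types without GC-managed pointers (int, double).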
April 04, 2017
On 04/02/2017 02:43 PM, biocyberman wrote:
> khash.h
> (http://attractivechaos.github.io/klib/#Khash%3A%20generic%20hash%20table)
> is a part of klib library in C. I want to covert it to D in the process
> of learning deeper about D.

These are macros used by the library developer to generate library facilities without repetition. Not uncommon for C libraries... As Nicholas Wilson says, just ignore most of these macros, because in the end what you want are the types and functions that make up the library's public interface (or that its public documentation describes).

In this case, looking at the preprocessor output to see what is generated may help. For example, use the -E compiler switch of gcc. If you're not familiar with this switch, you may be intimidated at first, as the output includes everything from the headers that your header itself includes. Just search for the library types and functions in question to see what they ended up looking like after preprocessing.
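
As an illustration of working from the expanded form, the comment below is a hand expansion of the __KHASH_TYPE macro quoted at the top of the thread for the made-up arguments (demo, int, double), which is roughly what the preprocessor output should contain for such a call. The D struct under it is a direct translation; the khint aliases are assumed to match khash.h.

```d
/* Hand expansion of __KHASH_TYPE(demo, int, double):

   typedef struct kh_demo_s {
       khint_t n_buckets, size, n_occupied, upper_bound;
       khint32_t *flags;
       int *keys;
       double *vals;
   } kh_demo_t;
*/

// Assumed aliases mirroring the typedefs in khash.h (unsigned 32-bit).
alias khint32_t = uint;
alias khint_t = khint32_t;

// Direct D translation of the expanded C struct.
struct kh_demo_t
{
    khint_t n_buckets, size, n_occupied, upper_bound;
    khint32_t* flags;
    int* keys;
    double* vals;
}

// The four counters come first, followed by the flags pointer.
static assert(kh_demo_t.n_buckets.offsetof == 0);
static assert(kh_demo_t.flags.offsetof == 4 * khint_t.sizeof);
static assert(is(typeof(kh_demo_t.vals) == double*));
```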

Ali
