Thread overview
about pointer syntax
Oct 14, 2010
spir
Oct 14, 2010
bearophile
Oct 14, 2010
spir
Oct 14, 2010
bearophile
Oct 14, 2010
bearophile
Oct 14, 2010
Andrej Mitrovic
October 14, 2010
Hello,

As a way to start learning D by practicing, I'm trying to implement a symbol table as linked list: see prototype code of the structs below. (Hints about good D coding welcome :-)

2 little questions about pointers:
1. To please the compiler, I had to wrap dereferencing in parens, like in "(*symbol).name". Is this just normal (for disambiguation)?
2. Just noticed the compiler accepts "symbol.name", while symbol is a pointer: is there implicit dereferencing of pointers to structs? If yes, does this also apply to pointers to arrays (for information, this is what Oberon does).

And a note: had forgotten to add an empty main(), just for compiling: the linker produced a rather big amount of error text without any hint to the actual issue. Maybe dmd could cope with that before calling gcc?

Finally: the reference is rather harsh for me, especially since there are very few examples (I have little practice with C, and none with C++); the bits of tutorials I found are too light or just started (the TOC points to empty pages). Is there a D programming guide anywhere online, even unpolished or possibly unfinished (I know about the book; maybe I'll order it, but in the meantime...).

Thank you,
Denis

================== code ======================
struct Symbol {
    string name  = "" ;
    int element  = 0 ;
    Symbol* next = null ;
}
struct List {
    Symbol* first = null ;
    int element(string name) {
        Symbol* symbol = this.first ;
        while (symbol) {
            if ((*symbol).name ==  name) {return (*symbol).element ;} ;
            symbol = (*symbol).next ;
        }
        return 0 ;
    }
}
==============================================

-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com

October 14, 2010
spir:

> As a way to start learning D by practicing, I'm trying to implement a symbol table as linked list: see prototype code of the structs below. (Hints about good D coding welcome :-)

I have modified your code a little:

struct Symbol {
    string name;
    int element;
    Symbol* next;

    this(string n, int e, Symbol* s=null) {
        name = n;
        element = e;
        next = s;
    }
}

struct List {
    Symbol* first;

    int element(string name) {
        for (auto symbol = this.first; symbol != null; symbol = symbol.next)
            if (symbol.name == name)
                return symbol.element;
        return 0;
    }
}

void main() {
    auto l = List(new Symbol("baz", 3, new Symbol("bar", 2, new Symbol("foo", 1))));
    assert(l.element("bar") == 2);
}

- There's no need to initialize int to 0, string to "", pointers to null, etc.; the compiler does it for you by default. Each type has a default init value (for floating point values it uses a NaN).
- Generally don't put a space before the ending semicolon.
- Four spaces indent and no space before the "*" of the pointer, as you have done, is OK.
- Using a while loop as you have done is OK, I have used a for loop just to show a shorter alternative.
- It's generally better to put one or two blank lines before structs, functions, etc
- (*symbol).name is written symbol.name in D.
- After an { generally a \n is used, there's no need to use the ; after the }
- using 0 as error return value is sometimes OK, but also keep in mind that exceptions are present.
- while(symbol) is acceptable, but some people prefer to put an explicit comparison to make the code more readable.
- sometimes using "auto" helps.
- In D linked lists are doable, but much less used, think about using a dynamic array, or even in this case an associative array:

void main() {
    auto aa = ["foo": 1, "bar": 2, "baz": 3];
    assert(aa["bar"] == 2);
}
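
A rough sketch of how a missing key could be handled with an associative array, as an alternative to returning 0 from a hand-written list search. The helper name `lookup` and its `fallback` parameter are made up for illustration; the `in` operator and the built-in `get` property are standard. Note that indexing with `aa["missing"]` on an absent key raises a range error instead of returning a default.

```d
// Hypothetical helper: returns the value stored under `name`,
// or `fallback` when the key is absent.
int lookup(const int[string] table, string name, int fallback = 0) {
    if (auto p = name in table)   // `in` yields a pointer to the value, or null
        return *p;
    return fallback;
}

void main() {
    auto aa = ["foo": 1, "bar": 2, "baz": 3];
    assert(lookup(aa, "bar") == 2);
    assert(lookup(aa, "nope", -1) == -1);
    assert(aa.get("nope", -1) == -1);   // built-in equivalent of the helper
}
```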


> 1. To please the compiler, I had to wrap dereferencing in parens, like in "(*symbol).name". Is this just normal (for disambiguation)?
> 2. Just noticed the compiler accepts "symbol.name", while symbol is a pointer: is there implicit dereferencing of pointers to structs? If yes, does this also apply to pointers to arrays (for information, this is what Oberon does).

Generally, accessing a field through a pointer doesn't require "(*symbol).name"; in D you write "symbol.name". Pointers to arrays are possible, but quite uncommon in D.
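
A small sketch covering both questions. The parentheses in "(*symbol).name" are needed because "." binds tighter than unary "*", so "*symbol.name" would be read as "*(symbol.name)". For a pointer to an array, the sketch only shows the explicit forms, since indexing a pointer is ordinary pointer indexing and yields the array itself, not one of its elements. The helper names `elementOf` and `second` are invented for this example.

```d
struct Symbol {
    string name;
    int element;
}

// Hypothetical helpers, for illustration only.
int elementOf(Symbol* s) { return s.element; }   // same as (*s).element
int second(int[]* p)     { return (*p)[1]; }     // explicit dereference, then index

void main() {
    auto sym = Symbol("foo", 1);
    Symbol* ptr = &sym;
    assert(ptr.name == "foo");        // implicit dereference through '.'
    assert((*ptr).name == "foo");     // explicit form, same meaning
    assert(elementOf(ptr) == 1);

    int[] arr = [10, 20, 30];
    int[]* p = &arr;                  // pointer to a dynamic array
    assert(second(p) == 20);
    assert((*p).length == 3);
    // p[0] is pointer indexing: it names the array itself, not arr[0].
    static assert(is(typeof(p[0]) == int[]));
}
```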


> And a note: had forgotten to add an empty main(), just for compiling: the linker produced a rather big amount of error text without any hint to the actual issue. Maybe dmd could cope with that before calling gcc?

I agree, I have bug report 4680 open on something similar: http://d.puremagic.com/issues/show_bug.cgi?id=4680


> Finally: the reference is rather harsh for me, esp there are very few example (I have few practice of C, and none of C++); the bits of tutorials I found are too light or just started (toc points to empty pages). Is there anywhere online a kind of D programming guide, even not polished or possibly unfinished (I know about the book; maybe I'll order it, but in meantime...).

The D2 documentation isn't abundant yet :-)

Bye,
bearophile
October 14, 2010
On Thu, 14 Oct 2010 17:16:23 -0400
bearophile <bearophileHUGS@lycos.com> wrote:

> spir:
> 
> > As a way to start learning D by practicing, I'm trying to implement a symbol table as linked list: see prototype code of the structs below. (Hints about good D coding welcome :-)
> 
> I have modified your code a little:
> 
> struct Symbol {
>     string name;
>     int element;
>     Symbol* next;
> 
>     this(string n, int e, Symbol* s=null) {
>         name = n;
>         element = e;
>         next = s;
>     }
> }
> 
> struct List {
>     Symbol* first;
> 
>     int element(string name) {
>         for (auto symbol = this.first; symbol != null; symbol = symbol.next)
>             if (symbol.name == name)
>                 return symbol.element;
>         return 0;
>     }
> }
> 
> void main() {
>     auto l = List(new Symbol("baz", 3, new Symbol("bar", 2, new Symbol("foo", 1))));
>     assert(l.element("bar") == 2);
> }

Thank you very much! That's exactly what I need to learn good practice, even more so with the comments below. (Hope you don't mind if I send other bits of code to be improved that way?)

> - There's no need to initialize int to 0, string to "", pointers to null, etc, the compiler does it for you on default. Each type has a default init value (for floating point values it uses a NaN).

Right. I'm thinking of keeping explicit defaults like "int element = 0" for documentation. It means "that's precisely what I want as default value", as opposed to just the language's "init". (So, for Symbol fields, I would only write the element & next defaults explicitly, since name has no meaningful default: it must be provided.)

> - Generally don't put a space before the ending semicolon.
> - Four spaces indent and no space before the "*" of the pointer, as you have done, is OK.
> - It's generally better to put one or two blank lines before structs, functions, etc

(About ';', I like to separate actual code from grammatical separators, which are just noise to my eyes. If I ever write a lib for public use, I'll follow such styling guidelines ;-)

> - Using a while loop as you have done is OK, I have used a for loop just to show a shorter alternative.

Thanks, as I said I'm not used to PLs of the C family. I would never have thought that the "stepping" statement of a for loop could be something other than an increment.

> - (*symbol).name is written symbol.name in D.

Very good.

> - After an { generally a \n is used, there's no need to use the ; after the }

Right.

> - using 0 as error return value is sometimes OK, but also keep in mind that exceptions are present.

;-) Was just a placeholder before I decide how to cope with "finding failure".

> - while(symbol) is acceptable, but some people prefer to put an explicit comparison to make the code more readable.

Right, guess you mean while(symbol != null)?

> - sometimes using "auto" helps.

I need to explore this further (what's the actual purpose of "auto").

> - In D linked lists are doable, but much less used, think about using a dynamic array, or even in this case an associative array:
>
> void main() {
>     auto aa = ["foo": 1, "bar": 2, "baz": 3];
>     assert(aa["bar"] == 2);
> }

Yes, it was just an exercise. Out of curiosity, I intend to benchmark lists vs sequential arrays vs associative arrays, for various element counts.
(I already did that in another language (Free Pascal), to learn which data structures suit symbol tables representing the content of record-like objects, whose number of entries is typically small since they are hand-written by the programmer: sophisticated structures like associative arrays and tries only started to perform better than plain sequences for counts >> 100.)

> > 1. To please the compiler, I had to wrap dereferencing in parens, like in "(*symbol).name". Is this just normal (for disambiguation)?
> > 2. Just noticed the compiler accepts "symbol.name", while symbol is a pointer: is there implicit dereferencing of pointers to structs? If yes, does this also apply to pointers to arrays (for information, this is what Oberon does).
> 
> Generally the field dereferencing doesn't require the "(*symbol).name", in D you use "symbol.name". Pointers to arrays are possible, but quite uncommon in D.

Sure, there are builtin dynamic arrays :-)
I'll check whether manually pointed-to arrays are also silently dereferenced (on element access, on attribute access).

> > And a note: had forgotten to add an empty main(), just for compiling: the linker produced a rather big amount of error text without any hint to the actual issue. Maybe dmd could cope with that before calling gcc?
> 
> I agree, I have bug report 4680 open on something similar: http://d.puremagic.com/issues/show_bug.cgi?id=4680

Good.

> > Finally: the reference is rather harsh for me, esp there are very few example (I have few practice of C, and none of C++); the bits of tutorials I found are too light or just started (toc points to empty pages). Is there anywhere online a kind of D programming guide, even not polished or possibly unfinished (I know about the book; maybe I'll order it, but in meantime...).
> 
> The D2 documentation isn't abundant yet :-)

I'll do what I can on one of the already started wikibooks about D if I find energy for that...

> Bye,
> bearophile

Thanks again,
Denis

October 14, 2010
On 10/14/10, bearophile <bearophileHUGS@lycos.com> wrote:

> Generally the field dereferencing doesn't require the "(*symbol).name", in D
> you use "symbol.name".
> Pointers to arrays are possible, but quite uncommon in D.

I'm not sure, but I think function pointers (the C syntax ones, not variables declared with function/delegate) still need parentheses for dereferencing? IIRC I was getting some weird errors a few days ago when I was trying it out. I might be wrong though...
October 14, 2010
spir:

> (Hope you don't mind if I send other bits of code to be improved that way?)

Feel free to show them, if I am busy other people may give you an answer. The worst that may happen is that no one answers you.


> Right. I think at keeping explicit defaults like "int element = 0" for documentation.

Not putting a value, unless it's different from the standard initializer, is a very common idiom in D. It's easy to remember the inits: zero, empty string, NaN, null, invalid Unicode chars.
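
The default initializers listed above can be checked directly. This is only a sketch: it reuses the Symbol layout from the thread without a constructor and relies on `isNaN` from `std.math`.

```d
import std.math : isNaN;

struct Symbol {
    string name;     // defaults to an empty (null) string
    int element;     // defaults to 0
    Symbol* next;    // defaults to null
}

void main() {
    Symbol s;                         // no explicit initializers anywhere
    assert(s.name == "" && s.name is null);
    assert(s.element == 0);
    assert(s.next is null);

    double d;
    char c;
    assert(isNaN(d));                 // floating point defaults to NaN
    assert(c == 0xFF);                // char defaults to an invalid UTF-8 code unit
}
```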


> If I ever write a lib for public use, I'll follow such styling guidelines ;-)

The D style:
http://www.digitalmars.com/d/2.0/dstyle.html


> Wouldn't have ever thought that the "stepping" statement of a for loop can be something else as incrementation.

This is a kind of incrementation:
symbol = symbol.next;

In C-derived languages this line of code:
symbol = symbol + 5;
May be written:
symbol += 5;

In the same way you may think of this (this is not D syntax):
symbol .= next;
As a compact version of:
symbol = symbol.next;


> Right, guess you mean while(symbol != null)?

Right. But some people don't like that.


> I need to explore this further (what's the actual purpose of "auto").

It just asks the compiler to infer the variable's type from its initializer; it performs a bit of local type inference.
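
A minimal sketch of what that means in practice: each variable declared with "auto" gets the static type of its initializer, which the static asserts below verify at compile time. The variable names are arbitrary.

```d
void main() {
    auto n = 42;          // int
    auto x = 1.5;         // double
    auto s = "foo";       // string
    auto p = &n;          // int*

    // The types are fixed at compile time, exactly as if written out.
    static assert(is(typeof(n) == int));
    static assert(is(typeof(x) == double));
    static assert(is(typeof(s) == string));
    static assert(is(typeof(p) == int*));

    assert(*p == 42);
}
```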


> For curiosity, I intend to benchmark lists vs sequential arrays vs associative arrays, for various element counts.

Good, it's a way to understand the language better. For the list vs dynamic array comparison, I suggest you also take a look at the produced (dis)assembly.


> to know what kind of data structures were suited for symbol tables representing the content of record-like objects, which number of entries is typically small since hand-written by the programmer: sophisticated structures like associative arrays and tries started to perform better than plain sequences for counts >> 100.)

D dynamic arrays are not bad, but they are not very efficient, so a sequential scan in a small array of integer numbers is faster than a hash search.

You may also try a "perfect hash", for your purpose. Around you may find C code (that's easy to translate to D) to create a perfect hash out of a sequence of strings.

A binary search in an array of strings-int pairs is an easy option too.
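
One possible shape for that binary search, as a sketch only. It assumes the array is already sorted by name; the function name `findSorted` and this string-based Pair layout are invented for the example.

```d
struct Pair {
    string name;
    int element;
}

// Hypothetical helper: binary search in an array sorted by `name`.
// Returns a pointer to the matching pair, or null if there is none.
const(Pair)* findSorted(const(Pair)[] data, string name) {
    size_t lo = 0, hi = data.length;
    while (lo < hi) {
        immutable mid = lo + (hi - lo) / 2;
        if (data[mid].name < name)    // strings compare lexicographically
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo < data.length && data[lo].name == name)
        return &data[lo];
    return null;
}

void main() {
    // Must be kept sorted by name for the search to be valid.
    Pair[] table = [Pair("bar", 2), Pair("baz", 3), Pair("foo", 1)];
    auto hit = findSorted(table, "baz");
    assert(hit !is null && hit.element == 3);
    assert(findSorted(table, "zzz") is null);
}
```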

If your symbol names are short you may also put your string+int pairs in a single flat data structure (even keeping a fixed length for each one of them), to increase CPU cache locality a bit, something like:

struct Pair { char[10] name; int element; }
Pair[20] data;

In C/C++/D languages usually there are *many* complicated ways to speed up code :-) You generally want to use them only in the spots where the profiler tells you you need performance. The profiler is used with the DMD -profile switch.

If you are learning D, it's also good to start using unittests and design by contract from the beginning. They help a lot to avoid bugs. And the more hairy (pointers, etc.) your code is, the more useful they are.
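
As a sketch of how that could look for the list lookup discussed in this thread: an "in" contract on the argument, an exception instead of a 0 return value, and a unittest block. It assumes a recent compiler, where the function body after a contract is introduced with "do" (compilers of this thread's era used "body"). Built with something like "dmd -unittest -run file.d", the test runs before main.

```d
struct Symbol {
    string name;
    int element;
    Symbol* next;
}

// Returns the element stored under `name`; throws if it is absent.
int element(Symbol* first, string name)
in { assert(name.length > 0, "name must not be empty"); }
do {
    for (auto s = first; s !is null; s = s.next)
        if (s.name == name)
            return s.element;
    throw new Exception("unknown symbol: " ~ name);
}

unittest {
    auto foo = Symbol("foo", 1);
    auto bar = Symbol("bar", 2, &foo);   // bar -> foo
    assert(element(&bar, "foo") == 1);
    assert(element(&bar, "bar") == 2);

    bool threw = false;
    try { element(&bar, "nope"); }
    catch (Exception e) { threw = true; }
    assert(threw);
}

void main() {}   // an empty main so the program links
```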

Bye,
bearophile
October 14, 2010
> D dynamic arrays are not bad, but they are not very efficient,

I meant associative array, sorry.