Reading about D: few questions

Hi guys!

I'm mostly familiar with C (and a bit of PHP). I've stumbled upon the D language, and I must say I really like it.
Now I'm reading the "The D Programming Language" book, and I have a couple of questions:


1. Uninitialized Arrays and GC.

http://dlang.org/memory.html#uninitializedarrays
It's said here ^ that:
"The uninitialized data that is on the stack will get scanned by the garbage collector looking for any references to allocated memory."
With the given example of: byte[1024] buffer = void;

So does the GC really scan this byte array? Or (sounds more logical to me) does it scan only reference types?
If the latter is true, I think the example should use some kind of a pointer array. Also, in this case, I can't see why "Uninitialized data can be a source of bugs and trouble, even when used correctly."?
If the former is true, then, well, I'll ask more questions.


2. Setting Dynamic Array Length.

http://dlang.org/arrays.html#resize
"A more practical approach would be to minimize the number of resizes"

The solution works but is not as clean as just using array ~= c;
Is there any way (language, runtime, or phobos) to declare an array that would reallocate memory by chunks, which are multiple of x?


3. const and immutable.

Is there any use for const when defining variables?
As I see it, there is no use for using e.g. const int x;, as it can't be modified anyway;
So with immutable, const is only good for reference variables that are initialized to refer to another variable (like a function const ref parameter).
Am I right?


4. if (&lhs != &rhs)?

std.algorithm has this in its swap function.
Is it different than if (lhs !is rhs)?
Just wondering.


5. Align attribute.

http://dlang.org/attribute.html#align

struct S {
  align(4) byte a; // placed at offset 0
  align(4) byte b; // placed at offset 1
}

Explain this please.


6. Array slices manipulation.

a[] += 1; works but a[]++ doesn't.
Not so important but just wondering: why, and is it intended?


7. Anonymous structs.
In C you can write:

struct { int a; } s = {10};
printf("%d\n", s.a);

In D you must declare the struct first:

struct S { int a; };
S s = {10};
writeln(s.a);

Why doesn't D allow anonymous structs?


Best regards.

December 23, 2011

Re: Reading about D: few questions

Posted by Mafi
in reply to Mr. Anonymous


On 23.12.2011 16:25, Mr. Anonymous wrote:
> Hi guys!
>
> I'm mostly familiar with C (and a bit of PHP). I've stumbled upon the D
> language, and I must say I really like it.
> Now I'm reading the "The D Programming Language" book, and I have a
> couple of questions:
>
[....]
>
> 3. const and immutable.
>
> Is there any use for const when defining variables?
> As I see it, there is no use for using e.g. const int x;, as it can't be
> modified anyway;
> So with immutable, const is only good for reference variables that are
> initialized to refer to another variable (like a function const ref
> parameter).
> Am I right?
Right. There's no point in a const int but of course there is a big difference between const(int)* and immutable(int)*.
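
A minimal sketch of that difference (the variable names are made up for illustration): a const(int)* may point at mutable or immutable data and only promises not to modify it through that pointer, while an immutable(int)* guarantees that the data never changes at all.

```d
void main()
{
    int m = 1;
    immutable int i = 2;

    const(int)* pc = &m;       // const view of mutable data
    assert(*pc == 1);
    m = 10;                    // the owner may still change it...
    assert(*pc == 10);         // ...and the const view observes the change

    pc = &i;                   // a const pointer can also refer to immutable data
    immutable(int)* pi = &i;   // immutable: nobody can ever change the target
    assert(*pc == 2 && *pi == 2);

    // Neither of these compiles:
    static assert(!__traits(compiles, pi = &m)); // mutable data is not immutable
    static assert(!__traits(compiles, *pc = 5)); // no writes through a const pointer
}
```

So const is the type to use when the code should accept both kinds of data, which is why it shows up mostly on parameters and on references to somebody else's variable.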

>
>
> 4. if (&lhs != &rhs)?
>
> std.algorithm has this in its swap function.
> Is it different than if (lhs !is rhs)?
> Just wondering.
>
They're not the same at all. "is" checks if the two operands have binary equality.
To understand, you have to keep in mind that lhs and rhs are references and could refer to one and the same variable as in:
int a = 0; swap(a, a);
Now, if you want to know if two refs refer to the same variable, you use &lhs == &rhs. If you want to know if two class instances are the same, you use is. If you want to know if two things (instances or anything else) are equal, you use ==.
&lhs == &rhs (makes only sense with refs)
rhs is lhs (always true, if the above is true)
rhs == lhs (always true, if the above is true)

import std.stdio;
void f(ref int[] a, ref int[] b) {
  writefln("%s %s %s", &a == &b, a is b, a == b);
}

void main() {
  auto u = [1, 2, 3];
  auto u2 = u;
  auto v = [1, 2, 3];
  auto w = [4, 5, 6];
  f(u, u); // true true true
  f(u, u2);// false true true
  f(u, v); // false false true
  f(u, w); // false false false
}	

> [...]

Mafi

December 23, 2011

Re: Reading about D: few questions

Posted by Ali Çehreli
in reply to Mr. Anonymous


On 12/23/2011 07:25 AM, Mr. Anonymous wrote:

> I have a couple of questions:

I prefer separate threads for each. :)

> 1. Uninitialized Arrays and GC.
>
> http://dlang.org/memory.html#uninitializedarrays
> It's said here ^ that:
> "The uninitialized data that is on the stack will get scanned by the
> garbage collector looking for any references to allocated memory."
> With the given example of: byte[1024] buffer = void;
>
> So does the GC really scan this byte array? Or (sounds more logical to
> me) does it scan only reference types?

I am not an expert on garbage collectors but I've never heard about differentiating the bits of data. The GC would need to keep metadata about every part of the allocated space, and that would not be practical.

> If the latter is true, I think the example should use some kind of a
> pointer array. Also, in this case, I can't see why "Uninitialized data
> can be a source of bugs and trouble, even when used correctly."?

I don't think that the last part is any different than the "initialize all of your variables" advice. The uninitialized data has come from memory that has been used earlier in the program and may have valid data (and references) to existing or already-destroyed data. Hard to debug.

> 2. Setting Dynamic Array Length.
>
> http://dlang.org/arrays.html#resize
> "A more practical approach would be to minimize the number of resizes"
>
> The solution works but is not as clean as just using array ~= c;
> Is there any way (language, runtime, or phobos) to declare an array that
> would reallocate memory by chunks, which are multiple of x?

Array expansion is already more efficient than it looks at first. This article is a good read:

  http://www.dsource.org/projects/dcollections/wiki/ArrayArticle

> 3. const and immutable.
>
> Is there any use for const when defining variables?
> As I see it, there is no use for using e.g. const int x;, as it can't be
> modified anyway;
> So with immutable, const is only good for reference variables that are
> initialized to refer to another variable (like a function const ref
> parameter).
> Am I right?

Right. I have two observations myself:

- To be more useful, function parameters should not insist on immutable data, yet we type string all over the place.

- To be more useful, functions should not insist on the mutability of the data that they return.

The following function makes a new string:

char[] endWithDot(const(char)[] s)
{
    return s ~ '.';
}

    char[] s;
    s ~= "hello";
    auto a = endWithDot(s);

It is good that the parameter is const(char) so that I could pass the mutable s to it.

But the orthogonal problem of the type of the return is troubling. The result is clearly mutable yet it can't be returned as such:

Error: cannot implicitly convert expression (s ~ '.') of type const(char)[] to char[]

We've talked about this before. There is nothing in the language that makes me say "the returned object is unique; you can cast it to mutable or immutable freely."

> 5. Align attribute.
>
> http://dlang.org/attribute.html#align
>
> struct S {
> align(4) byte a; // placed at offset 0
> align(4) byte b; // placed at offset 1
> }
>
> Explain this please.

I don't know more than what the documentation says but I remember reading bugs about align().

> 6. Array slices manipulation.
>
> a[] += 1; works but a[]++ doesn't.
> Not so important but just wondering: why, and is it intended?

Again, I remember discussion and limitations about this feature. Fixed-length arrays have better support and the regular increment works:

    double[3] a = [ 10, 20, 30 ];
    ++a[];

> 7. Anonymous structs.
> In C you can write:
>
> struct { int a; } s = {10};
> printf("%d\n", s.a);
>
> In D you must declare the struct first:
>
> struct S { int a; };
> S s = {10};
> writeln(s.a);
>
> Why doesn't D allow anonymous structs?

It may be related to parsing. D does not require the semicolon at the end of the struct definition, so it wouldn't know what 's' is:

struct { int a; }   // definition (of unmentionable type :) )
s = {10};           // unknown s

There could be special casing but I don't think that it would be worth it.

Ali

December 23, 2011

Re: Reading about D: few questions

Posted by Mr. Anonymous
in reply to Ali Çehreli


On 23.12.2011 19:47, Ali Çehreli wrote:
> On 12/23/2011 07:25 AM, Mr. Anonymous wrote:
>
>  > I have a couple of questions:
>
> I prefer separate threads for each. :)

Should I resend the questions as separate messages?

>
>  > 1. Uninitialized Arrays and GC.
>  >
>  > http://dlang.org/memory.html#uninitializedarrays
>  > It's said here ^ that:
>  > "The uninitialized data that is on the stack will get scanned by the
>  > garbage collector looking for any references to allocated memory."
>  > With the given example of: byte[1024] buffer = void;
>  >
>  > So does the GC really scan this byte array? Or (sounds more logical to
>  > me) does it scan only reference types?
>
> I am not an expert on garbage collectors but I've never heard about
> differentiating the bits of data. The GC would have to need to keep meta
> data about every part of the allocated space as such and it would not be
> practical.

Maybe not every part, but only reference parts.

>
>  > If the latter is true, I think the example should use some kind of a
>  > pointer array. Also, in this case, I can't see why "Uninitialized data
>  > can be a source of bugs and trouble, even when used correctly."?
>
> I don't think that the last part is any different than the "initialize
> all of your variables" advice. The uninitialized data has come from
> memory that has been used earlier in the program and may have valid data
> (and references) to existing or already-destroyed data. Hard to debug.
>
>  > 2. Setting Dynamic Array Length.
>  >
>  > http://dlang.org/arrays.html#resize
>  > "A more practical approach would be to minimize the number of resizes"
>  >
>  > The solution works but is not as clean as just using array ~= c;
>  > Is there any way (language, runtime, or phobos) to declare an array that
>  > would reallocate memory by chunks, which are multiple of x?
>
> Array expansion is already more efficient than they look at first. This
> article is a good read:
>
> http://www.dsource.org/projects/dcollections/wiki/ArrayArticle

Thanks, I'll take a look.

>
>  > 3. const and immutable.
>  >
>  > Is there any use for const when defining variables?
>  > As I see it, there is no use for using e.g. const int x;, as it can't be
>  > modified anyway;
>  > So with immutable, const is only good for reference variables that are
>  > initialized to refer to another variable (like a function const ref
>  > parameter).
>  > Am I right?
>
> Right. I have two observations myself:
>
> - To be more useful, function parameters should not insist on immutable
> data, yet we type string all over the place.
>
> - To be more useful, functions should not insist on the mutability of
> the data that they return.
>
> The following function makes a new string:
>
> char[] endWithDot(const(char)[] s)
> {
> return s ~ '.';
> }
>
> char[] s;
> s ~= "hello";
> auto a = endWithDot(s);
>
> It is good that the parameter is const(char) so that I could pass the
> mutable s to it.
>
> But the orthogonal problem of the type of the return is troubling. The
> result is clearly mutable yet it can't be returned as such:
>
> Error: cannot implicitly convert expression (s ~ '.') of type
> const(char)[] to char[]
>
> We've talked about this before. There is nothing in the language that
> makes me say "the returned object is unique; you can cast it to mutable
> or immutable freely."

I saw that std.string functions use assumeUnique from std.exception.
As for your example, it probably should be:

char[] endWithDot(const(char)[] s)
{
    return s.dup ~ '.';
}
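
For reference, a sketch of how assumeUnique could be used for this. The change of the return type to string is my own assumption for the sake of the illustration; it is not part of the original example.

```d
import std.exception : assumeUnique;
import std.stdio : writeln;

// Hypothetical variant of endWithDot that hands back an immutable string.
string endWithDot(const(char)[] s)
{
    char[] result = s.dup;  // fresh mutable copy, referenced by nobody else
    result ~= '.';
    // assumeUnique converts to immutable without copying. The programmer
    // (not the compiler) guarantees that no mutable reference survives.
    return assumeUnique(result);
}

void main()
{
    char[] s = "hello".dup;
    string a = endWithDot(s);
    writeln(a);
    assert(a == "hello.");
    assert(s == "hello");   // the argument itself is untouched
}
```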

>
>  > 5. Align attribute.
>  >
>  > http://dlang.org/attribute.html#align
>  >
>  > struct S {
>  > align(4) byte a; // placed at offset 0
>  > align(4) byte b; // placed at offset 1
>  > }
>  >
>  > Explain this please.
>
> I don't know more than what the documentation says but I remember
> reading bugs about align().
>
>  > 6. Array slices manipulation.
>  >
>  > a[] += 1; works but a[]++ doesn't.
>  > Not so important but just wondering: why, and is it intended?
>
> Again, I remember discussion and limitations about this feature.
> Fixed-length arrays have better support and the regular increment works:
>
> double[3] a = [ 10, 20, 30 ];
> ++a[];

++a[] works, but a[]++ doesn't.

>
>  > 7. Anonymous structs.
>  > In C you can write:
>  >
>  > struct { int a; } s = {10};
>  > printf("%d\n", s.a);
>  >
>  > In D you must declare the struct first:
>  >
>  > struct S { int a; };
>  > S s = {10};
>  > writeln(s.a);
>  >
>  > Why doesn't D allow anonymous structs?
>
> It may be related to parsing. D does not require the semicolon at the
> end of the struct definition, so it wouldn't know what 's' is:
>
> struct { int a; } // definition (of unmentionable type :) )
> s = {10}; // unknown s
>
> There could be special casing but I don't think that it would be worth it.

Sounds reasonable.

>
> Ali
>

December 23, 2011

Re: Reading about D: few questions

Posted by bearophile
in reply to Mr. Anonymous


Mr. Anonymous:

> With the given example of: byte[1024] buffer = void;
> 
> So does the GC really scan this byte array?

The current D GC is not precise, so I think the current DMD+GC scans this array. Future better compilers/runtimes probably will be able to avoid it (with a shadow stack that gives precise typing information at runtime, used by a precise GC).


> The solution works but is not as clean as just using array ~= c;
> Is there any way (language, runtime, or phobos) to declare an array that
> would reallocate memory by chunks, which are multiple of x?

Appending to built-in D arrays is several times slower than doing the same thing to a C++ vector, but in many situations the performance is enough. When it's not enough, there is the "capacity" function in the object module. For even better performance there is the appender in std.array, which performs only a little worse than the C++ vector push_back.
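
A rough sketch of both options. The helper names fillBuiltin and fillAppender are made up, and the use of reserve (which sits next to capacity in the object module) is my addition; treat it as an illustration rather than a benchmark.

```d
import std.array : appender;
import std.stdio : writeln;

// Built-in array: ask the runtime for room up front, then keep using ~=.
int[] fillBuiltin(size_t n)
{
    int[] arr;
    arr.reserve(n);             // room for at least n elements
    assert(arr.capacity >= n);  // capacity reports how far ~= can go without reallocating
    foreach (i; 0 .. n)
        arr ~= cast(int) i;
    return arr;
}

// Appender from std.array: keeps its own bookkeeping, so each put is cheap.
int[] fillAppender(size_t n)
{
    auto app = appender!(int[])();
    app.reserve(n);
    foreach (i; 0 .. n)
        app.put(cast(int) i);
    return app.data;            // the accumulated array
}

void main()
{
    writeln(fillBuiltin(1000).length, " ", fillAppender(1000).length);
}
```

Either way the calling code stays as clean as array ~= c; only the setup line differs.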


> Is there any use for const when defining variables?
> As I see it, there is no use for using e.g. const int x;, as it can't be
> modified anyway;

const int x = 5 + foo(y) * bax(z);

It's better to use immutable or const everywhere this is possible and doesn't give you too many problems. In my D2 code about 70-90% of variables are now const or better immutable. This avoids some bugs and will help future compilers optimize code better.



> 5. Align attribute.
> 
> http://dlang.org/attribute.html#align
> 
> struct S {
>    align(4) byte a; // placed at offset 0
>    align(4) byte b; // placed at offset 1
> }
> 
> Explain this please.

I don't know. Keep in mind that DMD has many bugs; roughly 50-100 get fixed every month.


> 6. Array slices manipulation.
> 
> a[] += 1; works but a[]++ doesn't.
> Not so important but just wondering: why, and is it intended?

It's a compiler bug. I think it's already in Bugzilla (but take a look in Bugzilla if you want to be sure).


> 7. Anonymous structs.
> In C you can write:
> 
> struct { int a; } s = {10};
> printf("%d\n", s.a);
> 
> In D you must declare the struct first:
> 
> struct S { int a; };
> S s = {10};
> writeln(s.a);
> 
> Why doesn't D allow anonymous structs?

D doesn't allow top-level anonymous structs.
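
Nested inside another aggregate they are allowed, though. A small sketch (the Color type is made up for illustration):

```d
struct Color
{
    union
    {
        uint packed;                  // all four channels as one 32-bit value
        struct { ubyte r, g, b, a; }  // anonymous struct: members are reached directly through Color
    }
}

static assert(Color.sizeof == 4);     // the union members overlap

void main()
{
    Color c;
    c.r = 0x11; c.g = 0x22; c.b = 0x33; c.a = 0x44;
    assert(c.g == 0x22);
    assert(c.packed != 0);            // same bytes seen through the other union member
}
```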


> ++a[] works, but a[]++ doesn't.

Already known compiler bug.

----------------------------

Ali:

>There is nothing in the language that makes me say "the returned object is unique; you can cast it to mutable or immutable freely."<

The return value of strongly pure functions is implicitly castable to immutable.

And sometimes inout helps.

Bye,
bearophile

December 23, 2011

Re: Reading about D: few questions

Posted by Trass3r
in reply to Mr. Anonymous


> 5. Align attribute.
>
> http://dlang.org/attribute.html#align
>
> struct S {
>    align(4) byte a; // placed at offset 0
>    align(4) byte b; // placed at offset 1
> }
>
> Explain this please.

align is a huge mess imo.
"It matches the corresponding C compiler behavior"
So what's the point of align in the first place, if the compiler does what it wants anyway, see above?

The only thing that really works is

align(1) struct S {...}

for packed structs.
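
A small sketch of a packed layout that can be checked at compile time. Assumptions on my part: the type names are made up, and with current compilers it is, as far as I know, the align(1): inside the body that packs the fields, while align(1) in front of struct sets the alignment of the struct as a whole. Both are written out here to be on the safe side.

```d
struct Padded
{
    byte a;
    int  b;   // aligned to 4, so three padding bytes precede it
}

align(1) struct Packed
{
align(1):     // applies to every field that follows
    byte a;
    int  b;   // placed directly after a
}

static assert(Padded.b.offsetof == 4);
static assert(Padded.sizeof == 8);
static assert(Packed.b.offsetof == 1);
static assert(Packed.sizeof == 5);

void main() {}
```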

December 23, 2011

Uninitialized Arrays and GC

Posted by Mr. Anonymous
in reply to bearophile


On 23.12.2011 21:51, bearophile wrote:
> Mr. Anonymous:
>
>> http://dlang.org/memory.html#uninitializedarrays
>> It's said here ^ that:
>> "The uninitialized data that is on the stack will get scanned by the
>> garbage collector looking for any references to allocated memory."
>> With the given example of: byte[1024] buffer = void;
>>
>> So does the GC really scan this byte array? Or (sounds more logical to
>> me) does it scan only reference types?
>> If the latter is true, I think the example should use some kind of a
>> pointer array. Also, in this case, I can't see why "Uninitialized data
>> can be a source of bugs and trouble, even when used correctly."?
>> If the former is true, then, well, I'll ask more questions.
>
> The current D GC is not precise, so I think the current DMD+GC scans this array. Future better compilers/runtimes probably will be able to avoid it (with a shadow stack that gives precise typing information at runtime, used by a precise GC).

Well, if that's really so, then it's not 100% reliable.
E.g. you generate an array of random numbers, and one of them happens to look like the address of an allocated array. That array won't be freed even if it is not used anymore.

December 23, 2011

Re: Reading about D: few questions

Posted by Mr. Anonymous
in reply to Ali Çehreli


On 23.12.2011 19:47, Ali Çehreli wrote:
> On 12/23/2011 07:25 AM, Mr. Anonymous wrote:
>  > 2. Setting Dynamic Array Length.
>  >
>  > http://dlang.org/arrays.html#resize
>  > "A more practical approach would be to minimize the number of resizes"
>  >
>  > The solution works but is not as clean as just using array ~= c;
>  > Is there any way (language, runtime, or phobos) to declare an array that
>  > would reallocate memory by chunks, which are multiple of x?
>
> Array expansion is already more efficient than they look at first. This
> article is a good read:
>
> http://www.dsource.org/projects/dcollections/wiki/ArrayArticle

std.array.Appender is what I was talking about :)

December 23, 2011

Re: Uninitialized Arrays and GC

Posted by Ali Çehreli
in reply to Mr. Anonymous


On 12/23/2011 12:36 PM, Mr. Anonymous wrote:
> On 23.12.2011 21:51, bearophile wrote:
>> The current D GC is not precise, so I think the current DMD+GC scan
>> this array. Future better compilers/runtimes probably will be able to
>> avoid it (with a shadow stack that gives precise typing information at
>> runtime, used by a precise GC).
>
> Well, if that's really so, then it's not 100% reliable.
> e.g. you generate an array of random numbers, and one of them appears to
> be an address of an allocated array. This array won't free even if not
> used anymore.

I think it's the other way around: whatever this array seems to be referring to will not be freed until this array is freed. These are good examples of why uninitialized arrays are not for every application.

There is also the option of allocating the memory directly from the GC:

    int * a = cast(int*)GC.calloc(100, GC.BlkAttr.NO_SCAN);

(Same with GC.malloc().)

Now we can fill that array with any value without the fear of having been mistaken for references.
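
A slightly fuller sketch of the same idea. Note the assumptions: the size argument of GC.calloc is in bytes, so the count is multiplied by int.sizeof here to get room for 100 ints, and the function name allocNoScan is made up.

```d
import core.memory : GC;

// Returns a zero-initialized int slice in a block the GC will not scan for pointers.
int[] allocNoScan(size_t count)
{
    // GC.calloc takes a size in bytes, hence the multiplication.
    auto p = cast(int*) GC.calloc(count * int.sizeof, GC.BlkAttr.NO_SCAN);
    return p[0 .. count];   // slice the raw pointer to get bounds checking back
}

void main()
{
    int[] a = allocNoScan(100);
    assert(a.length == 100 && a[0] == 0);

    // Any bit pattern is fine here; the GC never treats it as a reference.
    foreach (i, ref x; a)
        x = cast(int) i * 7;
    assert(a[99] == 693);
}
```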

Ali

December 23, 2011

Re: Reading about D: few questions

Posted by Ali Çehreli
in reply to bearophile


On 12/23/2011 11:51 AM, bearophile wrote:

> Ali:
>
>> There is nothing in the language that makes me say "the returned object is unique; you can cast it to mutable or immutable freely."<
>
> The return value of strongly pure functions is implicitly castable to immutable.

Is that working yet? The commented-out lines below don't compile with 2.057:

void main()
{
    char[] s = "hello".dup;

    char[]            am  = endWithDot(s);
    const(char)[]     ac  = endWithDot(s);
    const(char[])     acc = endWithDot(s);
    // immutable(char)[] ai  = endWithDot(s);
    // immutable(char[]) aii = endWithDot(s);
}

pure char[] endWithDot(const(char)[] s)
{
    char[] result = s.dup;
    result ~= '.';
    return result;
}

Also note that I could not use the better line below in endWithDot():

    return s ~ '.';

as the type of the result is const(char)[]. I insist that it too should be castable to any mutable or immutable type.

> And sometimes inout helps.

Yes but it is only when the types of the parameters and the result should be related.
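
A sketch of the case where it does fit (firstWord is a made-up function): the result is a slice of the parameter, so one inout definition serves mutable, const, and immutable callers alike.

```d
// The qualifier of the argument is carried over to the result.
inout(char)[] firstWord(inout(char)[] s)
{
    foreach (i, c; s)
        if (c == ' ')
            return s[0 .. i];
    return s;
}

void main()
{
    char[] m = "hello world".dup;
    const(char)[] cs = m;

    char[]        a = firstWord(m);              // mutable in, mutable out
    const(char)[] b = firstWord(cs);             // const in, const out
    string        c = firstWord("hello world");  // immutable in, immutable out

    assert(a == "hello" && b == "hello" && c == "hello");
}
```

endWithDot is different because its result is a brand-new array that has nothing to do with the qualifier of the argument.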

> Bye,
> bearophile

Ali
