January 12, 2007
Documentation of D arrays
Hello!

I'm trying to understand array handling in D. Unfortunately the official
documentation[1] is not very helpful..

[1] http://www.digitalmars.com/d/arrays.html

By trial and error I found out that arrays are passed by some COW magic
(where is this documented?). So, if I want to change the content of an
array visible for the caller, I have to pass it with an inout-statement
(This works, but is it the canonical way?).

Next question: How can I initialize an array?
It seems like COW works only for parameters. Eg.

void foo(inout char[] s)
{
       s = "blub";
}
void bar()
{
	char[] s;
	foo(s);
	s[1] = 'a'; // will crash
}

So, how can I copy the string "blub" into s? s[] = "blub" doesn't work
because the .length won't be adjusted.
Oh, while writing this I noticed "blub".dup does work. Is this the
preferred way or should I manually alter the .length?

So what exactly is T[]? According to the documentation it's a tuple
(pointer, length). So, if I pass a T[] to a function, pointer and length
are passed by value (unless I specify an (in)out statement)? Is this
some array magic or can I use this for my own types?

I also found out that I can write
void foo(inout int[] a)
{
	a ~= 1;
}
So "~=" does not only support T[] as RHS but also T. Where is the
documentation for this?

Sorry, if these are obvious questions, but I can't figure this out by
the official documentation (or I'm blind).

Regards,
Sebastian
January 12, 2007
Re: Documentation of D arrays
Sebastian Biallas wrote:
> Hello!
> 
> I'm trying to understand array handling in D. Unfortunately the official
> documentation[1] is not very helpful..
> 
> [1] http://www.digitalmars.com/d/arrays.html
> 
> By trial and error I found out that arrays are passed by some COW magic
> (where is this documented?). So, if I want to change the content of an
> array visible for the caller, I have to pass it with an inout-statement
> (This works, but is it the canonical way?).

Almost.  Dynamic arrays are declared internally like so in D:

struct Array
{
    size_t len;
    byte*  ptr;
}

So passing a dynamic array by value is essentially the same as passing 
around a pointer.  The only effect adding 'inout' to your function will 
have is that the length of the array can be altered and those changes 
will persist when the function completes.

There is a brief mention of this in:

http://www.digitalmars.com/d/function.html

"For dynamic array and object parameters, which are passed by reference, 
in/out/inout apply only to the reference and not the contents."
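
A minimal sketch of the behaviour described above, assuming a current D2
compiler, where the parameter class written 'inout' in this thread is
spelled 'ref'. The function names setFirst and rebind are made up for
illustration. Writes to elements go through the shared pointer and are seen
by the caller; rebinding the array is only seen when it is passed by
reference.

```d
import std.stdio;

// By value: the callee gets its own copy of the (length, pointer) pair.
void setFirst(int[] a)
{
    a[0] = 99;   // writes through the shared pointer: the caller sees this
    a = [7, 8];  // rebinds only the local copy: the caller does not see this
}

// By reference: the callee works on the caller's (length, pointer) pair.
void rebind(ref int[] a)
{
    a = [7, 8];  // the caller's array now refers to the new data
}

void main()
{
    int[] xs = [1, 2, 3];

    setFirst(xs);
    assert(xs == [99, 2, 3]);

    rebind(xs);
    assert(xs == [7, 8]);

    writeln(xs);
}
```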

> Next question: How can I initialize an array?
> It seems like COW works only for parameters. Eg.
> 
> void foo(inout char[] s)
> {
>         s = "blub";
> }
> void bar()
> {
> 	char[] s;
> 	foo(s);
> 	s[1] = 'a'; // will crash
> }

Doing:

    s = "blurb";

allocates no memory, but rather just changes Array.ptr to point to 
"blurb" and sets Array.len appropriately.  The above code will actually 
work in Windows because the data segment where string constants are 
stored is not read-only.

> So, how can I copy the string "blub" into s? s[] = "blub" doesn't work
> because the .length won't be adjusted.

    s = "blurb".dup;

> Oh, while writing this I noticed "blub".dup does work. Is this the
> preferred way or should I manually alter the .length?

Yes :-)
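
As a rough illustration of the two options mentioned above, here is a
sketch assuming a current D2 compiler (where string literals are immutable,
so a writable char[] needs a copy anyway). It shows .dup next to the
"set .length, then copy" route. Both end up with a private, writable copy.

```d
import std.stdio;

void main()
{
    // .dup allocates a new array of the right length and copies the
    // literal into it, so writing to it is safe.
    char[] s = "blub".dup;
    s[1] = 'a';
    assert(s == "baub");

    // Slice assignment copies elements only; the destination must
    // already have the matching length.
    char[] t;
    t.length = 4;
    t[] = "blub";
    t[1] = 'a';
    assert(t == "baub");

    writeln(s, " ", t);
}
```

The .dup form is shorter and cannot get the length wrong, which is one
reason to prefer it.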

> So what exactly is T[]? According to the documentation it's a tuple
> (pointer, length). So, if I pass a T[] to a function, pointer and length
> are passed by value (unless I specify an (in)out statement)? Is this
> some array magic or can I use this for my own types?

See above.  You could duplicate this in your own code by creating a 
struct containing pointers.  Also, I don't think it's a good idea to 
call T[] a Tuple in D because the term has a fairly specific 
connotation.  See the section entitled "Tuple Parameters" at 
http://www.digitalmars.com/d/template.html and also 
http://www.digitalmars.com/d/phobos/std_typetuple.html

> I also found out that I can write
> void foo(inout int[] a)
> {
> 	a ~= 1;
> }
> So "~=" does not only support T[] as RHS but also T. Where is the
> documentation for this?

http://www.digitalmars.com/d/arrays.html I suppose, though the 
description isn't explicit.  Rather, it's implied by "the ~= operator 
means append."
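
A short sketch of what "append" accepts in practice, assuming a current D2
compiler: the right-hand side of ~= can be a single element or another
array, and the binary ~ operator behaves the same way.

```d
void main()
{
    int[] a;
    a ~= 1;        // append a single element (T)
    a ~= [2, 3];   // append another array (T[])
    assert(a == [1, 2, 3]);

    int[] b = a ~ 4;   // binary ~ also accepts a single element
    assert(b == [1, 2, 3, 4]);
    assert(a.length == 3);   // a itself is untouched by ~
}
```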

> Sorry, if these are obvious questions, but I can't figure this out by
> the official documentation (or I'm blind).

Not at all.  I've been using D for a few years now, and I still have 
trouble finding things in the spec.  It's pretty much all there, but not 
always in the most obvious location.


Sean
January 12, 2007
Re: Documentation of D arrays
Reply to Sebastian,

> Hello!
> 
> I'm trying to understand array handling in D. Unfortunately the
> official documentation[1] is not very helpful..
> 
> [1] http://www.digitalmars.com/d/arrays.html
> 
> By trial and error I found out that arrays are passed by some COW
> magic (where is this documented?). So, if I want to change the
> content of an array visible for the caller, I have to pass it with an
> inout-statement (This works, but is it the canonical way?).

Arrays are reference types. If you pass an array to a function, the function 
gets a copy of the pointer/length pair that the caller uses. The function 
can change the contents of the memory it references but can't change the 
caller's reference to that data (unless you use out or inout). As to it seeming 
to be COW, if you change the length of an array sometimes the GC can't extend 
it in place and moves the whole thing to a bigger chunk of RAM (this doesn't 
always happen). When the ~ and ~= operators are used, the GC always makes 
a copy.

[...]
> the official documentation (or I'm blind).

I often feel that way myself. I have had so much trouble finding things 
that /were/ put in a good place that I have a CGI script on my box that gives 
me a grep of the whole D spec converted into a webpage with links and everything.

> 
> Regards,
> Sebastian
January 12, 2007
Re: Documentation of D arrays
Sean Kelly wrote:
> Sebastian Biallas wrote:
>> Hello!
>>
>> I'm trying to understand array handling in D. Unfortunately the official
>> documentation[1] is not very helpful..
>>
>> [1] http://www.digitalmars.com/d/arrays.html
>>
>> By trial and error I found out that arrays are passed by some COW magic
>> (where is this documented?). So, if I want to change the content of an
>> array visible for the caller, I have to pass it with an inout-statement
>> (This works, but is it the canonical way?).
> 
> Almost.  Dynamic arrays are declared internally like so in D:
> 
> struct Array
> {
>     size_t len;
>     byte*  ptr;
> }
> 
> So passing a dynamic array by value is essentially the same as passing
> around a pointer.  

But not (Array *) but (len, byte *), I guess?

> The only effect adding 'inout' to your function will
> have is that the length of the array can be altered and those changes
> will persist when the function completes.

Hmm, I'm quite sure I can alter the ptr, too (Implicitly, when I append
to the array and there is not enough room).

> There is a brief mention of this in:
> 
> http://www.digitalmars.com/d/function.html
> 
> "For dynamic array and object parameters, which are passed by reference,
> in/out/inout apply only to the reference and not the contents."

Well, the word "reference" is far too overloaded. Here you don't
pass the Array (which you mentioned above) by reference, but the content
(the object ptr points to).

>> Next question: How can I initialize an array?
>> It seems like COW works only for parameters. Eg.
>>
>> void foo(inout char[] s)
>> {
>>         s = "blub";
>> }
>> void bar()
>> {
>>     char[] s;
>>     foo(s);
>>     s[1] = 'a'; // will crash
>> }
> 
> Doing:
> 
>     s = "blurb";
> 
> allocates no memory, but rather just changes Array.ptr to point to
> "blurb" and sets Array.len appropriately. 

Yeah, I guess I understood this already.

Is there something similar to the "const" keyword of C/C++ in D? It
looks a little bit fishy to me that you can write illegal code in D so
easily. In C/C++ you can return a constant array in a way that
a) the caller knows it's constant
b) errors are detected at compile time.

> The above code will actually
> work in Windows because the data segment where string constants are
> stored is not read-only.

For some values of "work" :)

>> So what exactly is T[]? According to the documentation it's a tuple
>> (pointer, length). So, if I pass a T[] to a function, pointer and length
>> are passed by value (unless I specify an (in)out statement)? Is this
>> some array magic or can I use this for my own types?
> 
> See above.  You could duplicate this in your own code by creating a
> struct containing pointers.  

But without the COW part?

> Also, I don't think it's a good idea to
> call T[] a Tuple in D because the term has a fairly specific
> connotation.  See the section entitled "Tuple Parameters" at
> http://www.digitalmars.com/d/template.html and also
> http://www.digitalmars.com/d/phobos/std_typetuple.html

Yes, you're right.

>> I also found out that I can write
>> void foo(inout int[] a)
>> {
>>     a ~= 1;
>> }
>> So "~=" does not only support T[] as RHS but also T. Where is the
>> documentation for this?
> 
> http://www.digitalmars.com/d/arrays.html I suppose, though the
> description isn't explicit.  Rather, it's implied by "the ~= operator
> means append."

Hmm, that's not the answer I hoped I'd get :)
It's nice to have a language without surprises, but I could only figure
out the above part by trying it.

>> Sorry, if these are obvious questions, but I can't figure this out by
>> the official documentation (or I'm blind).
> 
> Not at all.  I've been using D for a few years now, and I still have
> trouble finding things in the spec.  It's pretty much all there, but not
> always in the most obvious location.

That's sad. At first glance the documentation looks really good, but
then it is mostly about syntax, not about semantics.
January 12, 2007
Re: Documentation of D arrays
"Sebastian Biallas" <groups.5.sepp@spamgourmet.com> wrote in message 
news:eo6ofq$1q2b$2@digitaldaemon.com...
>
> But not (Array *) but (len, byte *), I guess?

Yeah, it's more like (in fact, _exactly_ like) passing around a two-element 
struct by value.  If you pass a struct by value into a function and modify 
its members, those changes won't be reflected in the calling function unless 
you use 'inout'.  The same thing applies for arrays since this is really 
what's going on behind the scenes.
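
To make the struct analogy concrete, here is a sketch with a hand-rolled
two-element struct. IntSlice, byValue and byRef are hypothetical names
chosen for this example, and it assumes a current D2 compiler, where
'inout' as used in this thread is spelled 'ref'. The struct behaves the
same way a dynamic array does when passed to a function.

```d
// Hypothetical stand-in for the (length, pointer) pair behind int[].
struct IntSlice
{
    size_t len;
    int*   ptr;
}

void byValue(IntSlice s)
{
    s.ptr[0] = 42;  // the pointed-to data is shared: the caller sees this
    s.len = 0;      // only the local copy of the struct changes
}

void byRef(ref IntSlice s)
{
    s.len = 1;      // changes the caller's struct
}

void main()
{
    int[] storage = [1, 2, 3];
    auto s = IntSlice(storage.length, storage.ptr);

    byValue(s);
    assert(storage[0] == 42);  // element write was visible
    assert(s.len == 3);        // length change was not

    byRef(s);
    assert(s.len == 1);        // with ref, the length change persists
}
```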
>
> Hmm, I'm quite sure I can alter the ptr, too (Implicitly, when I append
> to the array and there is not enough room).

Yes, that's right.

> Well, the word "reference" is far too overloaded. Here you don't
> pass the Array (which you mentioned above) by reference, but the content
> (the object ptr points to).

Yes, and this got me a few times too.  Though most of the time I don't need 
a function to modify an array that's passed into it, just one that's a class 
member, or maybe modify it and then return it.

> Is there something similar to the "const" keyword of C/C++ in D? It
> looks a little bit fishy to me that you can write illegal code in D so
> easily. In C/C++ you can return a constant array in a way that
> a) the caller knows it's constant
> b) errors are detected at compile time.

No, and this issue has been beaten absolutely to death.  I really don't care 
what happens with this issue.  I've never actually run into any bugs that 
would be solved by having const, but your mileage may vary, I guess. 
PLEASE, I don't want to start another topic about this :)

> For some values of "work" :)

Hehe

> But without the COW part?

COW is not part of the language.  It's just a convention you can follow when 
writing array-processing functions.  These functions also typically return 
the array, so the function should be called as "s = foo(s)" instead of 
"foo(s)".  The "COW" behavior that you were talking about before -- how 
resizing/reallocating the array in the function had no effect in the 
caller -- was really just an effect of what I mentioned at the beginning of 
this post.  The local array "structure" members were changed in the array 
processing function when you resized the array, and those changes aren't 
reflected in the calling function.
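
A minimal sketch of the copy-on-write convention described above, assuming
a current D2 compiler. The function upperFirst is a made-up example, not a
library routine. It never writes to its argument; it copies before the
first write and returns the result, so callers write "s = foo(s)".

```d
// Copy-on-write by convention: leave the argument alone, copy before
// the first modification, and return whichever array holds the result.
char[] upperFirst(char[] s)
{
    if (s.length == 0 || s[0] < 'a' || s[0] > 'z')
        return s;              // nothing to change, so no copy is made

    char[] r = s.dup;          // copy, then write to the copy only
    r[0] = cast(char)(r[0] - 'a' + 'A');
    return r;
}

void main()
{
    char[] s = "blub".dup;
    char[] original = s;       // second view of the same data

    s = upperFirst(s);         // called as "s = foo(s)"

    assert(s == "Blub");
    assert(original == "blub");  // the original data was not modified
}
```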

> Hmm, that's not the answer I hoped I'd get :)
> It's nice to have a language without surprises, but I could only figure
> out the above part by trying it.

At least it's a nice surprise :)
January 12, 2007
Re: Documentation of D arrays
BCS wrote:
> Reply to Sebastian,
> 
>> Hello!
>>
>> I'm trying to understand array handling in D. Unfortunately the
>> official documentation[1] is not very helpful..
>>
>> [1] http://www.digitalmars.com/d/arrays.html
>>
>> By trial and error I found out that arrays are passed by some COW
>> magic (where is this documented?). So, if I want to change the
>> content of an array visible for the caller, I have to pass it with an
>> inout-statement (This works, but is it the canonical way?).
> 
> Arrays are reference types. If you pass an array to a function, the
> function gets a copy of the pointer/length pair that the caller uses.
> The function can change the contents of the memory it references but
> can't change the caller's reference to that data (unless you use out or
> inout). 

Ah, you're right. I guess I have a better picture now.

> As to it seeming to be COW, if you change the length of an array
> sometimes the GC can't extend it in place and moves the whole thing to a
> bigger chunk of RAM (this doesn't always happen). When the ~ and ~=
> operators are used, the GC always makes a copy.

Yeah, that's the trick. You can change it in-place (without
inout-statement), and the COW-part happens once you alter the length
(implicitly or explicitly).

I guess what I called COW isn't even the right term.

So, new question: How do I pass T[] to foo(), so that foo() isn't
allowed to change the content of T[]?
January 12, 2007
Re: Documentation of D arrays
BCS wrote:
> bigger chunk of RAM (this doesn't always happen). When the ~ and ~= 
> operators are used, the GC always makes a copy.

~ always makes a copy, but ~= only does so when necessary.
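
A small sketch of that difference, assuming a current D2 compiler. The
result of ~ never shares memory with its operands, while after ~= the array
may or may not still point at the old memory, so code should not depend on
either outcome.

```d
void main()
{
    int[] a = [1, 2, 3];
    int[] b = [4];

    // Binary ~ always builds a fresh array.
    int[] c = a ~ b;
    assert(c.ptr !is a.ptr);
    c[0] = 100;
    assert(a[0] == 1);        // a is unaffected by writes to c

    // ~= extends in place when the runtime can, and reallocates otherwise.
    int[] d = a;
    d ~= 4;
    assert(d == [1, 2, 3, 4]);
    assert(a == [1, 2, 3]);   // a keeps its own length either way
    // Whether d.ptr is a.ptr at this point is up to the runtime.
}
```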
January 12, 2007
Re: Documentation of D arrays
Jarrett Billingsley wrote:
> "Sebastian Biallas" <groups.5.sepp@spamgourmet.com> wrote in message 
>> Is there something similar to the "const" keyword of C/C++ in D? It
>> looks a little bit fishy to me that you can write illegal code in D so
>> easily. In C/C++ you can return a constant array in a way that
>> a) the caller knows it's constant
>> b) errors are detected at compile time.
> 
> No, and this issue has been beaten absolutely to death.  I really don't care 
> what happens with this issue.  I've never actually run into any bugs that 
> would be solved by having const, but your mileage may vary, I guess. 
> PLEASE, I don't want to start another topic about this :)

Sorry, I'm new to this newsgroup, will google :)

I'm from the C/C++/Java/Ruby world (not to mention the functional
languages) and these languages have pretty easy constraints:

C: you pass everything by value
C++: you pass by value or by reference (and a reference is -- more or
less -- just a pointer)
Java: you pass either PODs or references(pointers) by value
Ruby: you pass everything by reference

D doesn't fit into these categories that well. That arrays are passed by
reference means something different, because an array in D isn't a first
class object (or whatever I should call this).

Well, I guess there are some D idioms which avoid the const array problem.
January 12, 2007
Re: Documentation of D arrays
Frits van Bommel wrote:
> BCS wrote:
>> bigger chunk of RAM (this doesn't always happen). When the ~ and ~=
>> operators are used, the GC always makes a copy.
> 
> ~ always makes a copy, but ~= only does so when necessary.

The first one is documented on the array page, but where is the
documentation for ~=? Common knowledge by using D?

BTW: What exactly happens on:

a = b ~ c and a ~= b

? Is this some built-in opCat? What are the semantics?
January 12, 2007
Re: Documentation of D arrays
Sebastian Biallas wrote:
> Frits van Bommel wrote:
>> BCS wrote:
>>> bigger chunk of RAM (this doesn't always happen). When the ~ and ~=
>>> operators are used, the GC always makes a copy.
>> ~ always makes a copy, but ~= only does so when necessary.
> 
> The first one is documented on the array page, but where is the
> documentation for ~=? Common knowledge by using D?

Not sure, but it should be in the spec somewhere...

> BTW: What exactly happens on:
> 
> a = b ~ c and a ~= b
> 
> ? Is this some built-in opCat? What are the semantics?

You can see it as a built-in opCat if you like.
What happens behind the scenes is that a function in the runtime is called.
The source to these functions is in dmd/src/phobos/internal/gc/gc.d if 
you really want to know exactly what they do... (_d_arraycat for ~, 
_d_arrayappend for ~= with array, _d_arrayappendc for ~= with single 
element)