September 21, 2009
On Mon, 21 Sep 2009 14:28:09 -0400, Jeremie Pelletier <jeremiep@gmail.com> wrote:

> Steven Schveighoffer wrote:
>> On Mon, 21 Sep 2009 12:23:46 -0400, Jeremie Pelletier <jeremiep@gmail.com> wrote:
>>>
>>> 'in' means 'const scope' in D2, which is using pass-by-value semantics.
>>  Yes, you are right.  I can't see the benefit to it, so I guess my recommendation is not to ever use in (use const instead).
>
> I disagree, I have different uses for both. I use 'in' when the reference will not leave the function's scope and const when it does. Both are immutable views on the data but with different usage semantics. The different semantics aren't yet implemented in D2 but they are most useful to determine whether I can, for example, decide whether to send a slice (to in parameters) or a copy (to const parameters).

Yes, that would be useful if it was enforced.  But it's not, so it's not useful.  Unless you trust the compiler in your head :)

>>> I myself stay out of 'ref' and 'out' params since they do not yet optimize and add quite a lot of overhead making temporary "safe" copies of the data.
>>  I understand the problem behind not optimizing  (inline), but I'm not sure what you mean by making temporary "safe" copies.
>
> Right now the compiler makes a temporary copy of referenced parameters on the stack, calls the function with a pointer to the stack copy, and once the function returns copies the modified temporary back to its original location. This is quite considerable overhead.

Are you sure this is true?  I don't have a d2 compiler right now, but that sounds like a *huge* step in the wrong direction.  D1 does not do this (tested dmd 1.046).

>
>>> Also 'scope' params have a meaning, when a delegate parameter is declared as scope, it allows a closure to use stack storage instead of the usual heap storage.
>>  yes, but in the context of an 'in' parameter, most of the time you are not passing a delegate using in, so scope doesn't mean much there.
>
> The implied 'scope' in 'in' has no effect yet due to a current compiler bug. You have to explicitly use 'scope' when declaring delegate parameters. Just like the 'in' vs 'const' have different semantics, 'in' vs plain 'scope' also have different semantics, which should get fixed soon.

Yes, I understand the reasoning for scope delegates.  My point was that
1. scope for non-delegate parameters is currently a noop
2. const does not make any sense for a delegate.

So arguing that the 'scope' part of 'in' is useful for delegates is currently a moot point.  I guess you could use it as a quicker way to type scope or const, but it currently doesn't have any different use than just typing 'scope' (for delegates) or 'const' (for other types).

>
> For example, consider the following:
>
> void Foo(scope delegate() bar) { bar(); }
> void Foo2(delegate() bar) { bar(); }
>
> // This method uses stack storage, the implied scope in 'in' should also work here, but is bugged right now so explicit 'scope' is needed
> void Test() {
>      void Bar() {}
>      Foo(&Bar);
> }
>
> // This method uses heap storage allocated on the GC through _d_allocmemory
> // Notice how the only difference is the 'scope' qualifier of Foo2()
> void Test2() {
>      void Bar() {}
>      Foo2(&Bar);
> }

It's still incomplete.  For example, this still allocates a closure on the heap:

void Bar() {}
auto x = &Bar;
Foo2(x);

I think scope will have to become a type-modifier before it is a complete solution.

-Steve
September 21, 2009
Steven Schveighoffer wrote:
> On Mon, 21 Sep 2009 14:28:09 -0400, Jeremie Pelletier <jeremiep@gmail.com> wrote:
> 
>> Steven Schveighoffer wrote:
>>> On Mon, 21 Sep 2009 12:23:46 -0400, Jeremie Pelletier <jeremiep@gmail.com> wrote:
>>>>
>>>> 'in' means 'const scope' in D2, which is using pass-by-value semantics.
>>>  Yes, you are right.  I can't see the benefit to it, so I guess my recommendation is not to ever use in (use const instead).
>>
>> I disagree, I have different uses for both. I use 'in' when the reference will not leave the function's scope and const when it does. Both are immutable views on the data but with different usage semantics. The different semantics aren't yet implemented in D2 but they are most useful to determine whether I can, for example, decide whether to send a slice (to in parameters) or a copy (to const parameters).
> 
> Yes, that would be useful if it was enforced.  But it's not, so it's not useful.  Unless you trust the compiler in your head :)

It's useful when I look back at the prototype to get an idea of how I used the parameter without looking at the entire function body. Besides, if it ever gets enforced my code will be ready; it's just good practice in my book.

>>>> I myself stay out of 'ref' and 'out' params since they do not yet optimize and add quite a lot of overhead making temporary "safe" copies of the data.
>>>  I understand the problem behind not optimizing  (inline), but I'm not sure what you mean by making temporary "safe" copies.
>>
>> Right now the compiler makes a temporary copy of referenced parameters on the stack, calls the function with a pointer to the stack copy, and once the function returns copies the modified temporary back to its original location. This is quite considerable overhead.
> 
> Are you sure this is true?  I don't have a d2 compiler right now, but that sounds like a *huge* step in the wrong direction.  D1 does not do this (tested dmd 1.046).

Yeah, I started a thread about that a few months ago in digitalmars.D; it's something that's on the bugzilla, I believe.

>>
>>>> Also 'scope' params have a meaning, when a delegate parameter is declared as scope, it allows a closure to use stack storage instead of the usual heap storage.
>>>  yes, but in the context of an 'in' parameter, most of the time you are not passing a delegate using in, so scope doesn't mean much there.
>>
>> The implied 'scope' in 'in' has no effect yet due to a current compiler bug. You have to explicitly use 'scope' when declaring delegate parameters. Just like the 'in' vs 'const' have different semantics, 'in' vs plain 'scope' also have different semantics, which should get fixed soon.
> 
> Yes, I understand the reasoning for scope delegates.  My point was that
> 1. scope for non-delegate parameters is currently a noop
> 2. const does not make any sense for a delegate.
> 
> So arguing that the 'scope' part of 'in' is useful for delegates is currently a moot point.  I guess you could use it as a quicker way to type scope or const, but it currently doesn't have any different use than just typing 'scope' (for delegates) or 'const' (for other types).

There again, it's just good practice to know what the keywords imply when using them, even if they are noops; you never know when that noop turns into enforcement. Although I agree that const delegates and scope values don't mean anything. I myself only use 'in' on references; values get 'const'.

>>
>> For example, consider the following:
>>
>> void Foo(scope delegate() bar) { bar(); }
>> void Foo2(delegate() bar) { bar(); }
>>
>> // This method uses stack storage, the implied scope in 'in' should also work here, but is bugged right now so explicit 'scope' is needed
>> void Test() {
>>      void Bar() {}
>>      Foo(&Bar);
>> }
>>
>> // This method uses heap storage allocated on the GC through _d_allocmemory
>> // Notice how the only difference is the 'scope' qualifier of Foo2()
>> void Test2() {
>>      void Bar() {}
>>      Foo2(&Bar);
>> }
> 
> It's still incomplete.  For example, this still allocates a closure on the heap:
> 
> void Bar() {}
> auto x = &Bar;
> Foo2(x);
> 
> I think scope will have to become a type-modifier before it is a complete solution.

Yeah, that is also a current bug in bugzilla, local pointers to local closures should not trigger heap allocations.
September 21, 2009
On Mon, Sep 21, 2009 at 12:53 PM, Jeremie Pelletier <jeremiep@gmail.com> wrote:
>>> Right now the compiler makes a temporary copy of referenced parameters on the stack, calls the function with a pointer to the stack copy, and once the function returns copies the modified temporary back to its original location. This is quite considerable overhead.
>>
>> Are you sure this is true?  I don't have a d2 compiler right now, but that sounds like a *huge* step in the wrong direction.  D1 does not do this (tested dmd 1.046).
>
> Yeah, I started a thread about that a few months ago in digitalmars.D; it's something that's on the bugzilla, I believe.

I think this is the only bugzilla bug about speed of reference parameters: http://d.puremagic.com/issues/show_bug.cgi?id=2008

It does not mention the copying issue in D2 you talk about, only lack of inlining in D1 and D2.  But I think the asm code posted there may be doing that copying.  Not a big ASM guru though.

To anyone who thinks that poor optimization of ref args is an important issue: please vote for the bug!

--bb
September 21, 2009
On Mon, 21 Sep 2009 16:29:11 -0400, Bill Baxter <wbaxter@gmail.com> wrote:

> On Mon, Sep 21, 2009 at 12:53 PM, Jeremie Pelletier <jeremiep@gmail.com> wrote:
>>>> Right now the compiler makes a temporary copy of referenced parameters on
>>>> the stack, calls the function with a pointer to the stack copy, and once the
>>>> function returns copies the modified temporary back to its original
>>>> location. This is quite considerable overhead.
>>>
>>> Are you sure this is true?  I don't have a d2 compiler right now, but that
>>> sounds like a *huge* step in the wrong direction.  D1 does not do this
>>> (tested dmd 1.046).
>>
>> Yeah, I started a thread about that a few months ago in digitalmars.D; it's
>> something that's on the bugzilla, I believe.
>
> I think this is the only bugzilla bug about speed of reference parameters:
> http://d.puremagic.com/issues/show_bug.cgi?id=2008
>
> It does not mention the copying issue in D2 you talk about, only lack
> of inlining in D1 and D2.  But I think the asm code posted there may
> be doing that copying.  Not a big ASM guru though.

I don't think it's doing that.  I think it's pushing the pointers onto the stack (as expected).

I'm also not an asm guru :)

Note that some of the assembler instructions at the beginning of the function are to initialize the variables a and b to 0.

Here is my test code and the relevant asm output (dmd 1.046, no inline or release). Note the void initialization, which prevents the structure from being initialized before it is passed:


struct S
{
    int x;
    int y;
    int z;
}

void foo(ref S s)
{
    s.x = 5;
    s.y = 6;
    s.z = 7;
}

void main()
{
    S s = void;
    foo(s);
}

_D6testme3fooFKS6testme1SZv:
		push	EBP
		mov	EBP,ESP
		mov	dword ptr [EAX],5
		mov	dword ptr 4[EAX],6
		mov	dword ptr 8[EAX],7
		pop	EBP
		ret
		nop
		nop
		nop
.text._D6testme3fooFKS6testme1SZv	ends

...

_Dmain:
		push	EBP
		mov	EBP,ESP
		sub	ESP,0Ch
		lea	EAX,-0Ch[EBP] ; I think this puts the pointer into the EAX register for the call
		call	near ptr _D6testme3fooFKS6testme1SZv@PC32
		xor	EAX,EAX
		leave
		ret
.text._Dmain	ends


-Steve
September 22, 2009
Jeremie Pelletier wrote:

> You cannot have a static array be ref or out. You must use int[] arr instead and let the caller specify the length.
>
> Arrays in D are already references to their data: an 8-byte (or 16-byte on x64) value containing a pointer and a length.

Ok, this I got correct :)

> So the following prototype: void func(<none>/in/ref/out int[] arr);
>
> would have the following semantics:
>
> "<none>" copies the array reference, the referenced data is mutable. Modifying the local reference does not change the caller's reference.

Modifying the local referenced data does not change the caller's referenced data.
So only if the referenced data is mutated, a copy will be made?
My test seems to back this up.

> "in" copies the array reference, the local reference AND the referenced data are immutable in the method's scope.

What, there is a difference between <none> and "in"??
Where can I read about this?
In the example "int foo(int x, ..." on the functions page, x is "in".
Are we maybe talking about a different D? D1?
I really, really wish to understand this stuff.
<info> I know how a function stack is created in C.

> "ref" passes a reference to the caller's array reference, the referenced data is mutable. Modifying the local reference also changes the caller's reference.

Same as <none> except that it doesn't create a copy on mutate?

> "out" passes a reference to the caller's array reference. The referenced array reference is zeroed and can be modified by the local reference.

Compared to returning an array created locally (inside the function), here no allocation might be needed if the passed array is big enough.

>
> If you want a mutable reference to an immutable view on the referenced data, use const(int)[], which can also be ref or out.

out const(int)[] will create an array of zeros which you cannot change?

Why isn't there a nice table about this?
columns: local reference/data with/without mutation, caller reference/data with/without mutation
rows: <none>, in, out, ref
cells: array.ptr/length/data
Or, where should I put it after creating it?

> void  func(in/out/ref int i)
>
> "<none>" would be a mutable copy.
> "in" would be immutable copy.
I can't get over the idea that in and none aren't the same
Really, D1? :D
In my test program I can change local i using "in"

> "out" is a reference to the caller's int initialized to 0.

> "ref" is a reference to the caller's int.

> "in" and "const" are only really useful with types which are already references, such as pointers, arrays and objects.

Hope I got it right..


September 22, 2009
Saaa wrote:
> Jeremie Pelletier wrote:
> 
>> You cannot have a static array be ref or out. You must use int[] arr instead and let the caller specify the length.
>>
>> Arrays in D are already references to their data: an 8-byte (or 16-byte on x64) value containing a pointer and a length.
> 
> Ok, this I got correct :)
> 
>> So the following prototype: void func(<none>/in/ref/out int[] arr);
>>
>> would have the following semantics:
>>
>> "<none>" copies the array reference, the referenced data is mutable. Modifying the local reference does not change the caller's reference.
> 
> Modifying the local referenced data does not change the caller's referenced data.
> So only if the referenced data is mutated, a copy will be made?
> Test seems to back this up.

What I meant is that modifying the local reference does not change the caller's reference, but both references point to the same data.
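
A minimal sketch of that distinction, assuming a parameter with no storage class; the names mutate and demo are made up for illustration:

```d
// Hypothetical helper: takes the slice by value (no storage class).
void mutate(int[] arr)
{
    arr[0] = 99;        // writes through to the data shared with the caller
    arr = new int[3];   // rebinds only the local copy of the reference
    arr[0] = -1;        // touches the new array; the caller never sees this
}

// Returns the caller's array after the call so the effect can be inspected.
int[] demo()
{
    int[] data = [1, 2, 3, 4];
    mutate(data);
    return data;
}

void main()
{
    int[] data = demo();
    assert(data.length == 4); // the caller's reference is unchanged
    assert(data[0] == 99);    // but the shared data was modified
}
```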

>> "in" copies the array reference, the local reference AND the referenced data are immutable in the method's scope.
> 
> What, there is a difference between <none> and "in"??
> Where can I read about this?
> In the example "int foo(int x, ..." on the functions page, x is "in".
> Are we maybe talking about a different D? D1?
> I really, really wish to understand this stuff.
> <info> I know how a function stack is created in C.

I don't think there's a difference in D1; in D2 there is:

void Foo(in int[] bar);
here typeof(bar).stringof would say const(int[]).

http://digitalmars.com/d/2.0/function.html

"If no storage class is specified, the parameter becomes a mutable copy of its argument."
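
As a rough illustration, assuming a D2 compiler, the checks below should hold at compile time; the function name sum is only an example:

```d
// Hypothetical example: an 'in' parameter is seen as const inside the function.
int sum(in int[] bar)
{
    static assert(is(typeof(bar) == const(int[])));    // const view of the slice
    static assert(!__traits(compiles, bar[0] = 1));    // cannot modify the data
    static assert(!__traits(compiles, bar = null));    // cannot rebind the reference

    int total = 0;
    foreach (v; bar)   // reading is still allowed
        total += v;
    return total;
}

void main()
{
    assert(sum([1, 2, 3]) == 6);
}
```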

>> "ref" passes a reference to the caller's array reference, the referenced data is mutable. Modifying the local reference also changes the caller's reference.
> 
> Same as <none> except that it doesn't create a copy on mutate?

Not the same. With no storage class, the callee gets a copy of the reference, which points to the same data as the caller's reference. "ref" makes a reference to the caller's reference, so you can modify it *and* the data it points to.

Here's an example:

void main() {
    int[] arr = new int[64];
    arr[0] = 10;

    foo(arr);

    assert(arr.length == 10); // True because foo changed our reference with its own
}

void foo(ref int[] arr) {
    assert(arr[0] == 10); // True because arr points to the same data as the caller's

    arr = new int[10];
}

This is the difference between being able to modify the array data and the array reference.

>> "out" passes a reference to the caller's array reference. The referenced array reference is zeroed and can be modified by the local reference.
> 
> Compared to returning an array created locally (inside the function), here
> no allocation might be needed if the passed array is big enough.

"out" does not create the array data, only a reference to a null array reference in the caller's frame. You still need to allocate data for the array. Take the above example and change ref to out, the assert in foo would fail because arr is set to a null reference, but the assert in main would succeed because foo modified it.
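
A small sketch of the "out" case, with made-up names fill and demoOut; the caller's old array is simply dropped from its reference when the call begins:

```d
// Hypothetical helper: 'out' resets the caller's reference before the body runs.
void fill(out int[] arr)
{
    assert(arr is null);      // the caller's 64-element array is not visible here
    assert(arr.length == 0);
    arr = new int[10];        // the callee must allocate its own data
}

// Returns the caller's array after the call.
int[] demoOut()
{
    int[] data = new int[64];
    fill(data);
    return data;
}

void main()
{
    assert(demoOut().length == 10); // the caller now sees the callee's array
}
```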

>> If you want a mutable reference to an immutable view on the referenced data, use const(int)[], which can also be ref or out.
> 
> out const(int)[] will create an array of zeros which you cannot change?

No, it will nullify the reference it is given, allowing the method to assign a new reference to it for the caller to use, but it won't be able to modify the data through that reference after assigning it.
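
Something along these lines, as a sketch only; view and demoView are hypothetical names:

```d
// Hypothetical helper: the reference is mutable, the data seen through it is not.
void view(out const(int)[] arr)
{
    assert(arr is null);           // 'out' nullified the caller's reference
    int[] fresh = [1, 2, 3];
    arr = fresh;                   // int[] converts implicitly to const(int)[]
    static assert(!__traits(compiles, arr[0] = 5)); // no writes through arr
}

// Returns what the caller ends up holding.
const(int)[] demoView()
{
    const(int)[] v;
    view(v);
    return v;
}

void main()
{
    auto v = demoView();
    assert(v.length == 3);
    assert(v[0] == 1);
}
```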

> Why isn't there a nice table about this?
> columns: local reference/data with/without mutation, caller reference/data with/without mutation
> rows: <none>, in, out, ref
> cells: array.ptr/length/data
> Or, where should I put it after creating it?

I had to learn it through trial and error myself :) If you do make such a table you could submit it as an improvement in bugzilla.

>> void  func(in/out/ref int i)
>>
>> "<none>" would be a mutable copy.
>> "in" would be immutable copy.
> I can't get over the idea that in and none aren't the same
> Really, D1? :D
> In my test program I can change local i using "in"
> 
>> "out" is a reference to the caller's int initialized to 0.
> 
>> "ref" is a reference to the caller's int.
> 
>> "in" and "const" are only really useful with types which are already references, such as pointers, arrays and objects.
> 
> Hope I got it right.. 

Oops, I was talking in the context of D2, where 'in' is an alias for 'const scope'.
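
For plain values, a sketch of the four cases in D2 could look like this (all function names here are invented for the example):

```d
// Hypothetical helpers, one per storage class, all taking an int.
void byValue(int i)  { i = 1; }       // mutable copy; the caller is unaffected
void byIn(in int i)                   // const copy; assignment does not compile
{
    static assert(!__traits(compiles, i = 1));
}
void byOut(out int i) { assert(i == 0); i = 2; } // caller's int, reset to 0 first
void byRef(ref int i) { i += 1; }                // caller's int, as is

// Runs the calls in sequence and returns the final value of the caller's int.
int demoScalars()
{
    int a = 5;
    byValue(a);
    assert(a == 5);
    byIn(a);
    assert(a == 5);
    byOut(a);
    assert(a == 2);
    byRef(a);
    return a;
}

void main()
{
    assert(demoScalars() == 3);
}
```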
September 22, 2009
> 'ref' and 'inout' are identical. 'ref' was introduced after D1 was finalized for future expansion - 'inout' return values don't make much sense.

Thanks. I now have to get rid of inout.

Is there a deprecation mechanism in D2 to get rid of keywords like "bit", "inout" or "immutable"? It would break compatibility, but no more than default thread-local storage.


September 22, 2009
#ponce wrote:
>> 'ref' and 'inout' are identical. 'ref' was introduced after D1 was
>> finalized for future expansion - 'inout' return values don't make much
>> sense.
> 
> Thanks. I now have to get rid of inout.
> 
> Is there a deprecation mechanism in D2 to get rid of keywords like "bit", "inout" or "immutable"? It would break compatibility, but no more than default thread-local storage.

immutable is not deprecated; in fact it's going to replace invariant.

I don't think we should deprecate keywords like inout, since it's already widely used in IDL, so it only makes sense to keep it in D. bit is declared in object.d so that one can be deprecated; I myself removed it from my runtime.
September 22, 2009
Thanks Jeremie Pelletier!

Good to see I'm not going crazy (<none> /  in)  :)
just a few small questions:

What exactly is a storage class?
Is it this:
C has a concept of 'Storage classes' which are used to define the scope (visibility) and lifetime of variables and/or functions.

Dynamic arrays are always on the heap, right?

I never use "new (array)" nor change references in other ways except changing the length.
I suspect the old array gets garbage collected if no other references point to it :)
I also suspect "new (array)" never allocates in place, even if the length is equal or smaller and no other pointers point to the old array, because the gc needs to run first to check for this.

Instead of a table I made a nice diagram:
http://bayimg.com/kAEkNAaCA
Is it correct?


September 22, 2009
Saaa wrote:
> Thanks Jeremie Pelletier!
> 
> Good to see I'm not going crazy (<none> /  in)  :)
> just a few small questions:
> 
> What exactly is a storage class?
> Is it this:
> C has a concept of 'Storage classes' which are used to define the scope (visibility) and lifetime of variables and/or functions.

That would be it. The storage class is mostly useful to the compiler to determine the semantics of the variable, which will determine where it is allocated, its lifetime and its mutability.

> Dynamic arrays are always on the heap, right?

That is correct.

> I never use "new (array)"  nor change references in other ways except changing
> the length.
> I suspect, the old array gets garbage collected if no other references point to it :)
> I also suspect "new (array)"  to never allocate in place even if the length
> is equal or smaller and no other pointers point to the old array, because the
> gc needs to run first to check for this.

I myself use a mix of both new and .length:

new is faster for allocating brand new arrays since there are fewer internal function calls. new, just like malloc in C and as the name implies, always allocates new memory.

.length is faster to resize arrays since there are good chances the resize can be done in place, for example if it fits in the same memory block or if there are enough free contiguous pages following the allocation, just like C's realloc.
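
A quick sketch of the difference; the helper names are made up, and whether a growing resize happens in place depends on the runtime, so only the cases I'm sure about are asserted:

```d
// Hypothetical check: shrinking through .length keeps the same memory block.
bool shrinkKeepsPointer()
{
    int[] a = new int[4];
    int* original = a.ptr;
    a.length = 2;               // the slice gets shorter, the data does not move
    return a.ptr is original;
}

// Hypothetical check: 'new' hands back fresh memory, the old array stays intact.
bool newAllocatesFresh()
{
    int[] a = new int[4];
    int[] old = a;              // second reference keeps the first array alive
    a = new int[4];
    return a.ptr !is old.ptr && old.length == 4;
}

void main()
{
    assert(shrinkKeepsPointer());
    assert(newAllocatesFresh());

    int[] b = new int[4];
    b.length = 1024;            // may or may not be resized in place
    assert(b.length == 1024);
    assert(b[1023] == 0);       // new elements are default-initialized
}
```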

I also change references all the time, depending on the context. For example, when working with file paths, you can send a path to a method accepting an immutable(char)[] (or its alias, string) which saves the reference somewhere without making a copy, then modify your local reference and call another function with the new array. The original array won't be garbage collected because it is still referenced, and the parameter of type immutable(char)[] instead of const(char)[] specifies the data will not change at all; const(char)[] means the callee's scope should not modify the data, but other references may be mutable.
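
Roughly what I mean, as a sketch; remember, measure, saved and demoPaths are hypothetical names, and the paths are just sample data:

```d
string saved;   // module-level reference kept by the callee

// Keeps the reference without copying; safe because the data is immutable.
void remember(string path) { saved = path; }

// Only promises not to modify the data itself; accepts mutable and immutable.
size_t measure(const(char)[] text) { return text.length; }

// Returns the caller's local reference after it has been rebound.
string demoPaths()
{
    string path = "/usr/local";
    remember(path);
    path = path[0 .. 4];        // rebinds the local reference only
    return path;
}

void main()
{
    assert(demoPaths() == "/usr");
    assert(saved == "/usr/local");   // the saved reference is unaffected

    char[] buf = "tmp".dup;          // mutable copy
    assert(measure(buf) == 3);
    assert(measure(saved) == 10);
    // A mutable char[] does not convert implicitly to immutable(char)[].
    static assert(!__traits(compiles, remember(buf)));
}
```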

It can be a bit confusing at first, especially since immutable and const have very similar semantics. But you'll get the hang of it after playing with them for some time :)

> Instead of a table I made a nice diagram:
> http://bayimg.com/kAEkNAaCA
> Is it correct? 

That looks good! I would use the terms caller and callee instead of call and func, however; it would be clearer that way. You might want to add headers to the different columns (storage class, scope, reference and data respectively); it took me a few seconds to figure out the meaning of the columns there :) Other than that I believe it reflects the semantics used in D1.

Jeremie