Thread overview
Initialization of std.typecons.RefCounted objects
Jul 18, 2012
Matthias Walter
Jul 18, 2012
Christophe Travert
Jul 18, 2012
Christophe Travert
Jul 19, 2012
monarch_dodra
Jul 19, 2012
Christophe Travert
Jul 19, 2012
Matthias Walter
Jul 19, 2012
monarch_dodra
Jul 19, 2012
Christophe Travert
Jul 19, 2012
Matthias Walter
Jul 19, 2012
Matthias Walter
July 18, 2012
Hi,

I looked at Bug #6153 (Array!(Array!int) failure) and found that the
reason is the following behavior:

Given some type T you wrap as RefCounted!T object, then proper use of the auto-initialization feature or manual initialization on demand ensures that no null-pointer dereference will happen. But the other problem is that as long as the wrapped object is uninitialized, the object behaves like a value type instead of a reference type.

This exactly is what makes the following code fail:

Array!(Array!int) array2d;
array2d.length = 1;
array2d[0].insert(1);

The inner array "array2d[0]" was not initialized and hence the reference pointer is null. Since Array.opIndex returns by value, the 'insert' method is called on a temporary object and does not affect the inner array (still being empty) which is stored in the outer array.

What do you think about this?

Must the user ensure that the Array container is always initialized explicitly? If yes, how shall this happen since the only constructor takes a (non-empty) tuple of new elements. Or shall opIndex return by reference?

The problem arises in the same way if Array was a class - but then the user typically knows that after enlarging the outer array, the inner array is a null pointer which must be created by new.

Best regards,

Matthias Walter
July 18, 2012
Matthias Walter , dans le message (digitalmars.D:172673), a écrit :
> I looked at Bug #6153 (Array!(Array!int) failure) and found that the
>
> This exactly is what makes the following code fail:
> 
> Array!(Array!int) array2d;
> array2d.length = 1;
> array2d[0].insert(1);
> 
> The inner array "array2d[0]" was not initialized and hence the reference pointer is null. Since Array.opIndex returns by value, the 'insert' method is called on a temporary object and does not affect the inner array (still being empty) which is stored in the outer array.
> 
> What do you think about this?
> 
> Must the user ensure that the Array container is always initialized explicitly? If yes, how shall this happen since the only constructor takes a (non-empty) tuple of new elements. Or shall opIndex return by reference?

I think opIndex should return by reference. opIndexAssign is of no help when the user want to use a function that takes a reference (here Array.insert). It is normal that Array uses default construction when someone increases the array's length.

Besides that point, I don't see why default-constructed Array have an uninitialised Payload. This makes uninitialised Array behaves unexpectedly, because making a copy and using the copy will not affect the original, which is not the intended reference value behavior.

Correcting default-initialization of Array would correct your bug, but would not correct the wrong behavior of Array when the value returned by opIndex is used with a non-const method with other objects (dynamic arrays?). So both have to be changed in my opinion.

-- 
Christophe

July 18, 2012
I see you found the appropriate entry to discuss this bug:

http://d.puremagic.com/issues/show_bug.cgi?id=6153

July 19, 2012
On Wednesday, 18 July 2012 at 13:32:39 UTC, travert@phare.normalesup.org (Christophe Travert) wrote:
> I think opIndex should return by reference. opIndexAssign is of no help
> when the user want to use a function that takes a reference (here
> Array.insert).

Having already brought this up before, the answer to that is that the containers inside "container" are meant to be "closed" and not give access to the addresses of their internals, so the return by value is on purpose:

Here is Andrei's post about this:
http://forum.dlang.org/thread/ceftaiklanejfhodbpix@forum.dlang.org?page=2#post-jthdko:24231u:241:40digitalmars.com

That said, I had brought up the exact same issue here:
http://forum.dlang.org/thread/bkozswmsgeibarowfwvq@forum.dlang.org

> It is normal that Array uses default construction when
> someone increases the array's length.
> Besides that point, I don't see why default-constructed Array have an
> uninitialised Payload. This makes uninitialised Array behaves
> unexpectedly, because making a copy and using the copy will not affect
> the original, which is not the intended reference value behavior.

I think it would be better to "initialize on copy", rather than default initialize. There are too many cases an empty array is created, then initialized on the next line, or passed to something else that does the initialization proper.

You'd get the correct behavior, and everything else (except dupe) works fine anyways.
July 19, 2012
"monarch_dodra" , dans le message (digitalmars.D:172700), a écrit :
> I think it would be better to "initialize on copy", rather than default initialize. There are too many cases an empty array is created, then initialized on the next line, or passed to something else that does the initialization proper.

Not default-initializing Array has a cost for every legitimate use of an Array. I think people use Array more often than they create uninitialized ones that are not going to be used before an other Array instance is assigned to them, so Array would be more efficient if it was default initialized and never check it is initialized again. But that's just speculation.

> You'd get the correct behavior, and everything else (except dupe) works fine anyways.

Keeping the adress of the content secret may be a valuable intention, but as long as properties and opIndex does not allow to correctly forward methods, this is completely broken. Is there even a begining of a plan to implement this? I don't see how properties or opIndex could safely forward methods that uses references and that we do not control without escaping the reference to them. That's not possible until D has a complete control of escaping references, which is not planned in the near or distant future. Not to mention that having a complete control of escaping reference break lots of code anyway, and might considerably decrease the need for ref counted utilities like... Array.

Please, at least give me hope that there is light at the end of the tunnel.

-- 
Christophe
July 19, 2012
On 07/18/2012 03:32 PM, Christophe Travert wrote:
> Matthias Walter , dans le message (digitalmars.D:172673), a écrit :
>> I looked at Bug #6153 (Array!(Array!int) failure) and found that the
>>
>> This exactly is what makes the following code fail:
>>
>> Array!(Array!int) array2d;
>> array2d.length = 1;
>> array2d[0].insert(1);
>>
>> The inner array "array2d[0]" was not initialized and hence the reference pointer is null. Since Array.opIndex returns by value, the 'insert' method is called on a temporary object and does not affect the inner array (still being empty) which is stored in the outer array.
>>
>> What do you think about this?
>>
>> Must the user ensure that the Array container is always initialized explicitly? If yes, how shall this happen since the only constructor takes a (non-empty) tuple of new elements. Or shall opIndex return by reference?
> 
> I think opIndex should return by reference. opIndexAssign is of no help when the user want to use a function that takes a reference (here Array.insert). It is normal that Array uses default construction when someone increases the array's length.

Okay, I fully agree here.

> Besides that point, I don't see why default-constructed Array have an uninitialised Payload. This makes uninitialised Array behaves unexpectedly, because making a copy and using the copy will not affect the original, which is not the intended reference value behavior.

Well the reason is that no user-defined function is called when a struct is default-constructed and then copied into another variable (the first one is this(this) which only known the new variable). Hence no user code can be written that would initialize the payload. Do we need a paradigm to perform explicit initialization of structs?

Best regards,

Matthias
July 19, 2012
On 07/19/2012 10:14 AM, Christophe Travert wrote:
> "monarch_dodra" , dans le message (digitalmars.D:172700), a écrit :
>> I think it would be better to "initialize on copy", rather than default initialize. There are too many cases an empty array is created, then initialized on the next line, or passed to something else that does the initialization proper.
> 
> Not default-initializing Array has a cost for every legitimate use of an Array. I think people use Array more often than they create uninitialized ones that are not going to be used before an other Array instance is assigned to them, so Array would be more efficient if it was default initialized and never check it is initialized again. But that's just speculation.

I agree here. Additionally my question: Is it possible (at the moment) to really do "initialize on copy"? As far as I see it, the only way to interact here is to implement 'this(this)' which is called after bit-copying and hence cannot access the source of the copy process.



July 19, 2012
On Thursday, 19 July 2012 at 08:14:25 UTC, travert@phare.normalesup.org (Christophe Travert) wrote:
> "monarch_dodra" , dans le message (digitalmars.D:172700), a écrit :
>> I think it would be better to "initialize on copy", rather than default initialize. There are too many cases an empty array is created, then initialized on the next line, or passed to something else that does the initialization proper.
>
> Not default-initializing Array has a cost for every legitimate use of an Array.

I'm back pedaling and agreeing 100% actually. Plus it keeps the implementation simpler. +1 to you.

> Keeping the adress of the content secret may be a valuable intention,
> but as long as properties and opIndex does not allow to correctly
> forward methods, this is completely broken. Is there even a begining of
> a plan to implement this? I don't see how properties or opIndex could
> safely forward methods that uses references and that we do not control
> without escaping the reference to them. That's not possible until D has
> a complete control of escaping references, which is not planned in the
> near or distant future. Not to mention that having a complete control of
> escaping reference break lots of code anyway, and might considerably
> decrease the need for ref counted utilities like... Array.
>
> Please, at least give me hope that there is light at the end of the
> tunnel.

One of the reason the implementation doesn't let you escape a reference is that that reference may become (_unverifiably_) invalid. Ranges to Arrays, on the other hand, keep a reference to the Array itself, and always* remain valid. (always as long as the Array doesn't shrink past the range's boundaries, but in this case, at least, the range throws an exception, rather than crash the program).

A reference to a dynamic array, on the other hand, will ALWAYS be valid 100% of the time, because the original data is never actually destroyed (until unreferenced). So it is perfectly safe to expose references in an array.

So escaping references from an Array is something _very_ dangerous, especially in a language that kind of guarantees it won't core dump if you don't manipulate pointers (which could if arrays escaped references).

While I get your point, it really feels like a "lose lose" situation here: You get stiffed either way...with Array...
...That said, I see no reason for the other containers (SList, I'm looking at you), not to expose references.

The way I see it, Array *could* escape references, but then the whole class would have to be qualified @unsafe (if such a thing existed) or something along these lines.

The current work around? Copy-Extract, manipulate, re-insert. Sucks.

On Thursday, 19 July 2012 at 10:39:56 UTC, Matthias Walter wrote:
> On 07/19/2012 10:14 AM, Christophe Travert wrote:
> I agree here. Additionally my question: Is it possible (at the moment)
> to really do "initialize on copy"? As far as I see it, the only way to
> interact here is to implement 'this(this)' which is called after
> bit-copying and hence cannot access the source of the copy process.

Good point. The answer is no though. A RefCounted (also known as SharedPointer in C++) needs to be initialized first if you want it aliased.

Allowing such a behavior is possible, but prohibitively expensive.
July 19, 2012
"monarch_dodra" , dans le message (digitalmars.D:172710), a écrit :
> One of the reason the implementation doesn't let you escape a reference is that that reference may become (_unverifiably_) invalid.

The same applies to a dynamic array: it is undistinguishable from a sliced static array. More generally, as long as you allow variables on the stack with no escaped reference tracking, you can't ensure references remain valid. Even in safe code.

If I want my references to remain valid, I use dynamic array and garbage collection. If I use Array, I accept that my references may die. Array that protects the validity of their references are awesome. But, IMHO, not at that cost.

> ...That said, I see no reason for the other containers (SList, I'm looking at you), not to expose references.

I'm against not exposing reference, but all containers will be implemented with custom allocator someday.

> The current work around? Copy-Extract, manipulate, re-insert. Sucks.

IMO, what sucks even more is that arr[0].insert(foo) compiles while it has no effect. arr[0] is a R-value, but applying method to R-value is allowed. I don't know the state of debates about forbiding to call non-const methods on R-values. I think this would break too much code.

July 19, 2012
On 07/19/2012 02:16 PM, Christophe Travert wrote:
> "monarch_dodra" , dans le message (digitalmars.D:172710), a écrit :
>> One of the reason the implementation doesn't let you escape a reference is that that reference may become (_unverifiably_) invalid.
> 
> The same applies to a dynamic array: it is undistinguishable from a sliced static array. More generally, as long as you allow variables on the stack with no escaped reference tracking, you can't ensure references remain valid. Even in safe code.
> 
> If I want my references to remain valid, I use dynamic array and garbage collection. If I use Array, I accept that my references may die. Array that protects the validity of their references are awesome. But, IMHO, not at that cost.
> 
>> ...That said, I see no reason for the other containers (SList, I'm looking at you), not to expose references.
> 
> I'm against not exposing reference, but all containers will be implemented with custom allocator someday.
> 
>> The current work around? Copy-Extract, manipulate, re-insert. Sucks.
> 
> IMO, what sucks even more is that arr[0].insert(foo) compiles while it has no effect. arr[0] is a R-value, but applying method to R-value is allowed. I don't know the state of debates about forbiding to call non-const methods on R-values. I think this would break too much code.

As it seems the issue really should be resolved by making opIndex return by reference and press thumbs hard that something like a 'scope ref' will be implemented?

Furthermore, since RefCounted objects do not behave like reference types
until initialized, they *must* be initialized before anything else
happens and hence I propose to change std.container.Array like
Christophe said: Replace 'isInitialized()' checks by assertions and add
a method with which the user explicitly initialized the reference counter.

Or is there a reasonable alternative?

Best regards,

Matthias Walter