View mode: basic / threaded / horizontal-split · Log in · Help
July 18, 2012
Initialization of std.typecons.RefCounted objects
Hi,

I looked at Bug #6153 (Array!(Array!int) failure) and found that the
reason is the following behavior:

Given some type T you wrap as RefCounted!T object, then proper use of
the auto-initialization feature or manual initialization on demand
ensures that no null-pointer dereference will happen. But the other
problem is that as long as the wrapped object is uninitialized, the
object behaves like a value type instead of a reference type.

This exactly is what makes the following code fail:

Array!(Array!int) array2d;
array2d.length = 1;
array2d[0].insert(1);

The inner array "array2d[0]" was not initialized and hence the reference
pointer is null. Since Array.opIndex returns by value, the 'insert'
method is called on a temporary object and does not affect the inner
array (still being empty) which is stored in the outer array.

What do you think about this?

Must the user ensure that the Array container is always initialized
explicitly? If yes, how shall this happen since the only constructor
takes a (non-empty) tuple of new elements. Or shall opIndex return by
reference?

The problem arises in the same way if Array was a class - but then the
user typically knows that after enlarging the outer array, the inner
array is a null pointer which must be created by new.

Best regards,

Matthias Walter
July 18, 2012
Re: Initialization of std.typecons.RefCounted objects
Matthias Walter , dans le message (digitalmars.D:172673), a écrit :
> I looked at Bug #6153 (Array!(Array!int) failure) and found that the
>
> This exactly is what makes the following code fail:
> 
> Array!(Array!int) array2d;
> array2d.length = 1;
> array2d[0].insert(1);
> 
> The inner array "array2d[0]" was not initialized and hence the reference
> pointer is null. Since Array.opIndex returns by value, the 'insert'
> method is called on a temporary object and does not affect the inner
> array (still being empty) which is stored in the outer array.
> 
> What do you think about this?
> 
> Must the user ensure that the Array container is always initialized
> explicitly? If yes, how shall this happen since the only constructor
> takes a (non-empty) tuple of new elements. Or shall opIndex return by
> reference?

I think opIndex should return by reference. opIndexAssign is of no help 
when the user want to use a function that takes a reference (here 
Array.insert). It is normal that Array uses default construction when 
someone increases the array's length.

Besides that point, I don't see why default-constructed Array have an 
uninitialised Payload. This makes uninitialised Array behaves 
unexpectedly, because making a copy and using the copy will not affect 
the original, which is not the intended reference value behavior.

Correcting default-initialization of Array would correct your bug, but 
would not correct the wrong behavior of Array when the value returned by 
opIndex is used with a non-const method with other objects (dynamic 
arrays?). So both have to be changed in my opinion.

-- 
Christophe
July 18, 2012
Re: Initialization of std.typecons.RefCounted objects
I see you found the appropriate entry to discuss this bug:

http://d.puremagic.com/issues/show_bug.cgi?id=6153
July 19, 2012
Re: Initialization of std.typecons.RefCounted objects
On Wednesday, 18 July 2012 at 13:32:39 UTC, 
travert@phare.normalesup.org (Christophe Travert) wrote:
> I think opIndex should return by reference. opIndexAssign is of 
> no help
> when the user want to use a function that takes a reference 
> (here
> Array.insert).

Having already brought this up before, the answer to that is that 
the containers inside "container" are meant to be "closed" and 
not give access to the addresses of their internals, so the 
return by value is on purpose:

Here is Andrei's post about this:
http://forum.dlang.org/thread/ceftaiklanejfhodbpix@forum.dlang.org?page=2#post-jthdko:24231u:241:40digitalmars.com

That said, I had brought up the exact same issue here:
http://forum.dlang.org/thread/bkozswmsgeibarowfwvq@forum.dlang.org

> It is normal that Array uses default construction when
> someone increases the array's length.
> Besides that point, I don't see why default-constructed Array 
> have an
> uninitialised Payload. This makes uninitialised Array behaves
> unexpectedly, because making a copy and using the copy will not 
> affect
> the original, which is not the intended reference value 
> behavior.

I think it would be better to "initialize on copy", rather than 
default initialize. There are too many cases an empty array is 
created, then initialized on the next line, or passed to 
something else that does the initialization proper.

You'd get the correct behavior, and everything else (except dupe) 
works fine anyways.
July 19, 2012
Re: Initialization of std.typecons.RefCounted objects
"monarch_dodra" , dans le message (digitalmars.D:172700), a écrit :
> I think it would be better to "initialize on copy", rather than 
> default initialize. There are too many cases an empty array is 
> created, then initialized on the next line, or passed to 
> something else that does the initialization proper.

Not default-initializing Array has a cost for every legitimate use of an 
Array. I think people use Array more often than they create 
uninitialized ones that are not going to be used before an other Array 
instance is assigned to them, so Array would be more efficient if it was 
default initialized and never check it is initialized again. But that's 
just speculation.

> You'd get the correct behavior, and everything else (except dupe) 
> works fine anyways.

Keeping the adress of the content secret may be a valuable intention, 
but as long as properties and opIndex does not allow to correctly 
forward methods, this is completely broken. Is there even a begining of 
a plan to implement this? I don't see how properties or opIndex could 
safely forward methods that uses references and that we do not control 
without escaping the reference to them. That's not possible until D has 
a complete control of escaping references, which is not planned in the 
near or distant future. Not to mention that having a complete control of 
escaping reference break lots of code anyway, and might considerably 
decrease the need for ref counted utilities like... Array.

Please, at least give me hope that there is light at the end of the 
tunnel.

-- 
Christophe
July 19, 2012
Re: Initialization of std.typecons.RefCounted objects
On 07/18/2012 03:32 PM, Christophe Travert wrote:
> Matthias Walter , dans le message (digitalmars.D:172673), a écrit :
>> I looked at Bug #6153 (Array!(Array!int) failure) and found that the
>>
>> This exactly is what makes the following code fail:
>>
>> Array!(Array!int) array2d;
>> array2d.length = 1;
>> array2d[0].insert(1);
>>
>> The inner array "array2d[0]" was not initialized and hence the reference
>> pointer is null. Since Array.opIndex returns by value, the 'insert'
>> method is called on a temporary object and does not affect the inner
>> array (still being empty) which is stored in the outer array.
>>
>> What do you think about this?
>>
>> Must the user ensure that the Array container is always initialized
>> explicitly? If yes, how shall this happen since the only constructor
>> takes a (non-empty) tuple of new elements. Or shall opIndex return by
>> reference?
> 
> I think opIndex should return by reference. opIndexAssign is of no help 
> when the user want to use a function that takes a reference (here 
> Array.insert). It is normal that Array uses default construction when 
> someone increases the array's length.

Okay, I fully agree here.

> Besides that point, I don't see why default-constructed Array have an 
> uninitialised Payload. This makes uninitialised Array behaves 
> unexpectedly, because making a copy and using the copy will not affect 
> the original, which is not the intended reference value behavior.

Well the reason is that no user-defined function is called when a struct
is default-constructed and then copied into another variable (the first
one is this(this) which only known the new variable). Hence no user code
can be written that would initialize the payload. Do we need a paradigm
to perform explicit initialization of structs?

Best regards,

Matthias
July 19, 2012
Re: Initialization of std.typecons.RefCounted objects
On 07/19/2012 10:14 AM, Christophe Travert wrote:
> "monarch_dodra" , dans le message (digitalmars.D:172700), a écrit :
>> I think it would be better to "initialize on copy", rather than 
>> default initialize. There are too many cases an empty array is 
>> created, then initialized on the next line, or passed to 
>> something else that does the initialization proper.
> 
> Not default-initializing Array has a cost for every legitimate use of an 
> Array. I think people use Array more often than they create 
> uninitialized ones that are not going to be used before an other Array 
> instance is assigned to them, so Array would be more efficient if it was 
> default initialized and never check it is initialized again. But that's 
> just speculation.

I agree here. Additionally my question: Is it possible (at the moment)
to really do "initialize on copy"? As far as I see it, the only way to
interact here is to implement 'this(this)' which is called after
bit-copying and hence cannot access the source of the copy process.
July 19, 2012
Re: Initialization of std.typecons.RefCounted objects
On Thursday, 19 July 2012 at 08:14:25 UTC, 
travert@phare.normalesup.org (Christophe Travert) wrote:
> "monarch_dodra" , dans le message (digitalmars.D:172700), a 
> écrit :
>> I think it would be better to "initialize on copy", rather 
>> than default initialize. There are too many cases an empty 
>> array is created, then initialized on the next line, or passed 
>> to something else that does the initialization proper.
>
> Not default-initializing Array has a cost for every legitimate 
> use of an Array.

I'm back pedaling and agreeing 100% actually. Plus it keeps the 
implementation simpler. +1 to you.

> Keeping the adress of the content secret may be a valuable 
> intention,
> but as long as properties and opIndex does not allow to 
> correctly
> forward methods, this is completely broken. Is there even a 
> begining of
> a plan to implement this? I don't see how properties or opIndex 
> could
> safely forward methods that uses references and that we do not 
> control
> without escaping the reference to them. That's not possible 
> until D has
> a complete control of escaping references, which is not planned 
> in the
> near or distant future. Not to mention that having a complete 
> control of
> escaping reference break lots of code anyway, and might 
> considerably
> decrease the need for ref counted utilities like... Array.
>
> Please, at least give me hope that there is light at the end of 
> the
> tunnel.

One of the reason the implementation doesn't let you escape a 
reference is that that reference may become (_unverifiably_) 
invalid. Ranges to Arrays, on the other hand, keep a reference to 
the Array itself, and always* remain valid. (always as long as 
the Array doesn't shrink past the range's boundaries, but in this 
case, at least, the range throws an exception, rather than crash 
the program).

A reference to a dynamic array, on the other hand, will ALWAYS be 
valid 100% of the time, because the original data is never 
actually destroyed (until unreferenced). So it is perfectly safe 
to expose references in an array.

So escaping references from an Array is something _very_ 
dangerous, especially in a language that kind of guarantees it 
won't core dump if you don't manipulate pointers (which could if 
arrays escaped references).

While I get your point, it really feels like a "lose lose" 
situation here: You get stiffed either way...with Array...
...That said, I see no reason for the other containers (SList, 
I'm looking at you), not to expose references.

The way I see it, Array *could* escape references, but then the 
whole class would have to be qualified @unsafe (if such a thing 
existed) or something along these lines.

The current work around? Copy-Extract, manipulate, re-insert. 
Sucks.

On Thursday, 19 July 2012 at 10:39:56 UTC, Matthias Walter wrote:
> On 07/19/2012 10:14 AM, Christophe Travert wrote:
> I agree here. Additionally my question: Is it possible (at the 
> moment)
> to really do "initialize on copy"? As far as I see it, the only 
> way to
> interact here is to implement 'this(this)' which is called after
> bit-copying and hence cannot access the source of the copy 
> process.

Good point. The answer is no though. A RefCounted (also known as 
SharedPointer in C++) needs to be initialized first if you want 
it aliased.

Allowing such a behavior is possible, but prohibitively expensive.
July 19, 2012
Re: Initialization of std.typecons.RefCounted objects
"monarch_dodra" , dans le message (digitalmars.D:172710), a écrit :
> One of the reason the implementation doesn't let you escape a 
> reference is that that reference may become (_unverifiably_) 
> invalid.

The same applies to a dynamic array: it is undistinguishable from a 
sliced static array. More generally, as long as you allow variables on 
the stack with no escaped reference tracking, you can't ensure 
references remain valid. Even in safe code.

If I want my references to remain valid, I use dynamic array and garbage 
collection. If I use Array, I accept that my references may die. Array 
that protects the validity of their references are awesome. But, IMHO, 
not at that cost.

> ...That said, I see no reason for the other containers (SList, 
> I'm looking at you), not to expose references.

I'm against not exposing reference, but all containers will be 
implemented with custom allocator someday.

> The current work around? Copy-Extract, manipulate, re-insert. 
> Sucks.

IMO, what sucks even more is that arr[0].insert(foo) compiles while it 
has no effect. arr[0] is a R-value, but applying method to R-value is 
allowed. I don't know the state of debates about forbiding to call 
non-const methods on R-values. I think this would break too much code.
July 19, 2012
Re: Initialization of std.typecons.RefCounted objects
On 07/19/2012 02:16 PM, Christophe Travert wrote:
> "monarch_dodra" , dans le message (digitalmars.D:172710), a écrit :
>> One of the reason the implementation doesn't let you escape a 
>> reference is that that reference may become (_unverifiably_) 
>> invalid.
> 
> The same applies to a dynamic array: it is undistinguishable from a 
> sliced static array. More generally, as long as you allow variables on 
> the stack with no escaped reference tracking, you can't ensure 
> references remain valid. Even in safe code.
> 
> If I want my references to remain valid, I use dynamic array and garbage 
> collection. If I use Array, I accept that my references may die. Array 
> that protects the validity of their references are awesome. But, IMHO, 
> not at that cost.
> 
>> ...That said, I see no reason for the other containers (SList, 
>> I'm looking at you), not to expose references.
> 
> I'm against not exposing reference, but all containers will be 
> implemented with custom allocator someday.
> 
>> The current work around? Copy-Extract, manipulate, re-insert. 
>> Sucks.
> 
> IMO, what sucks even more is that arr[0].insert(foo) compiles while it 
> has no effect. arr[0] is a R-value, but applying method to R-value is 
> allowed. I don't know the state of debates about forbiding to call 
> non-const methods on R-values. I think this would break too much code.

As it seems the issue really should be resolved by making opIndex return
by reference and press thumbs hard that something like a 'scope ref'
will be implemented?

Furthermore, since RefCounted objects do not behave like reference types
until initialized, they *must* be initialized before anything else
happens and hence I propose to change std.container.Array like
Christophe said: Replace 'isInitialized()' checks by assertions and add
a method with which the user explicitly initialized the reference counter.

Or is there a reasonable alternative?

Best regards,

Matthias Walter
Top | Discussion index | About this forum | D home