Copy constructors for lazy initialization
May 29, 2010
Walter has had a great idea last night: allow classes to define

this(ref S src);

where S is the name of the struct being defined, as an alternative for

this(this);

The result would be a function similar to a C++ copy constructor.

Such an abstraction allows us to perform lazy initialization in a way that avoids the kind of problems associated with non-shared hash tables:

void foo(int[int] x)
{
   x[5] = 5;
}

void main()
{
   int[int] x;
   foo(x);
   assert(x[5] == 5); // fails
}

If you replace the first line of main() with

int[int] x = [ 42:42 ];

the assert passes.

The idea of the copy constructor is to lazy initialize the source and the target if the source has null state. That would take care of this problem and the similar problems for shared state.
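A minimal sketch of that idea, assuming a compiler that accepts the proposed this(ref S src) form (later D releases do). The names Table and Payload are hypothetical stand-ins for a hash-table-like struct, not anything from the runtime:

```d
// Hypothetical shared state that is allocated lazily.
struct Payload
{
    int[int] data;
}

// Hypothetical container with reference semantics and lazy allocation.
struct Table
{
    private Payload* p; // null until first use or first copy

    // Copy constructor: if the source still has null state, allocate the
    // shared payload in the source first, then share it with the copy.
    this(ref Table src)
    {
        if (src.p is null)
            src.p = new Payload;
        p = src.p;
    }

    void opIndexAssign(int value, int key)
    {
        if (p is null)
            p = new Payload; // lazy initialization on first write
        p.data[key] = value;
    }

    int opIndex(int key)
    {
        assert(p !is null, "lookup on a Table with null state");
        return p.data[key];
    }
}

void foo(Table x)
{
    x[5] = 5;
}

void main()
{
    Table x;           // null state, nothing allocated yet
    foo(x);            // the copy allocates and shares the payload
    assert(x[5] == 5); // the caller sees the insertion
}
```

Note that the copy constructor writes to its source, which is why the parameter is a mutable ref rather than const.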

There is still a possibility to call a method against an object with null state. I think that's acceptable, particularly because lazy initialization saves some state allocation.

What do you think?


Andrei
May 29, 2010
Andrei Alexandrescu Wrote:

> this(ref S src);
> this(this);

> What do you think?

At the moment I am too sleepy to understand the semantics of what you're saying.

But I can say something about syntax: that this(this) syntax is bad, it's cryptic, I prefer something that uses/contains some English word/name that I read and reminds me of what it does.

The this(ref S src) syntax makes things even worse in this regard. Please don't turn D into a puzzle language (note that I am not saying your feature is bad, far from it, I am just saying that the syntax you have proposed is very far from being easy to understand from the way it is written).

Regardless of what Don has said, here I'd probably like something like a readable @attribute to replace this(this) :-)

Bye,
bearophile
May 29, 2010
Andrei Alexandrescu wrote:

> What do you think?
> 
> 
> Andrei

Certainly, in the case provided, it's a definite win. I'm not sure what the overall implications would be though. Part of the problem stems from the fact that the array is initialized to null, and yet you can still add stuff to it. My first reaction (certainly without having messed around with it in D) would be that x[5] = 5 would fail because the array was null. However, instead of blowing up, D just makes the null array into an empty one and does the assignment. If D didn't allow the assignment without having first truly created an empty array rather than a null one, then we wouldn't have the problem.

Now, there may be very good reasons for the current behavior, and this suggestion would fix the problem as it stands. But it would still require the programmer to be aware of the issue and use this(ref S src) instead of this(this) if they were writing the constructor or be aware of which it was if they didn't write the constructor.

Not knowing what other implications there are, I'm fine with the change, but the fact that D creates the array when you try to insert into it (or append to it in the case of normal arrays IIRC) rather than blowing up on null seems like a source of subtle bugs and that perhaps it's not the best design choice. But maybe there was a good reason for that that I'm not aware of, so it could be that it really should stay as-is. It's just that it seems danger-prone and that the situation that you're trying to fix wouldn't be an issue if the array stayed null until the programmer made it otherwise.

- Jonathan M Davis
May 29, 2010
bearophile wrote:

> Andrei Alexandrescu Wrote:
> 
>> this(ref S src);
>> this(this);
> 
>> What do you think?
> 
> At the moment I am too sleepy to understand the semantics of what you're saying.
> 
> But I can say something about syntax: that this(this) syntax is bad, it's cryptic, I prefer something that uses/contains some English word/name that I read and reminds me of what it does.
> 
> The this(ref S src) syntax makes things even worse in this regard. Please don't turn D into a puzzle language (note that I am not saying your feature is bad, far from it, I am just saying that the syntax you have proposed is very far from being easy to understand from the way it is written).
> 
> Regardless of what Don has said, here I'd probably like something like a
> readable @attribute to replace this(this) :-)
> 
> Bye,
> bearophile

Well, as long as S is the name of the struct, it's essentially what's done in C++ all the time. So, we get

S(ref S src)

instead of

S(const S& src)


The weird thing here is that you're actually altering the parameter that you passed in, which is normally a major no-no with copy constructors.

- Jonathan M Davis
May 29, 2010
Nice. This could also be used to implement unique_ptr(T), with move
semantics.

L.

On 29-5-2010 9:26, Andrei Alexandrescu wrote:
> Walter has had a great idea last night: allow classes to define
> 
> this(ref S src);
> 
> where S is the name of the struct being defined, as an alternative for
> 
> this(this);
> 
> The result would be a function similar to a C++ copy constructor.
> 
> Such an abstraction allows us to perform lazy initialization in a way that avoids the kind of problems associated with non-shared hash tables:
> 
> void foo(int[int] x)
> {
>    x[5] = 5;
> }
> 
> void main()
> {
>    int[int] x;
>    foo(x);
>    assert(x[5] == 5); // fails
> }
> 
> If you replace the first line of main() with
> 
> int[int] x = [ 42:42 ];
> 
> the assert passes.
> 
> The idea of the copy constructor is to lazy initialize the source and the target if the source has null state. That would take care of this problem and the similar problems for shared state.
> 
> There is still a possibility to call a method against an object with null state. I think that's acceptable, particularly because lazy initialization saves some state allocation.
> 
> What do you think?
> 
> 
> Andrei

May 29, 2010
Jonathan M Davis wrote:
> The weird thing here is that you're actually altering the parameter that you passed in, which is normally a major no-no with copy constructors.

Yup.

One subtle but important distinction from C++ is that D can omit copy construction completely if the compiler can determine there are no further uses of the source object, and substitute a simple bit copy. This should result in a fundamental performance improvement.
May 29, 2010
On 05/28/2010 09:18 PM, Lionello Lunesu wrote:
> Nice. This could also be used to implement unique_ptr(T), with move
> semantics.

Yah, a number of interesting idioms spring to life.

Andrei

May 29, 2010
On Fri, 28 May 2010 21:26:50 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> Walter has had a great idea last night: allow classes to define

I'm almost positive you meant "allow *structs* to define"

>
> this(ref S src);
>
> where S is the name of the struct being defined, as an alternative for
>
> this(this);
>

[snip]

> The idea of the copy constructor is to lazy initialize the source and the target if the source has null state. That would take care of this problem and the similar problems for shared state.
>
> There is still a possibility to call a method against an object with null state. I think that's acceptable, particularly because lazy initialization saves some state allocation.
>
> What do you think?

It is a good effort to solve the problem.  The problem I see with such constructs is inherently the lazy construction.  Because you must lazily construct such a container, every method meant to be called on the struct must first check and initialize the container if not done already.  This results in an undue burden on the struct author to make sure he covers every method.  The first method he forgets to lazily initialize the struct results in an obscure bug.
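To make that burden concrete, here is a rough sketch with a hypothetical Counter struct and a hypothetical ensureInit() helper; every public method has to remember to call the helper before touching the state:

```d
// Hypothetical lazily initialized struct; the names are illustrative only.
struct Counter
{
    private int* count; // null until first use

    // Every method must call this before dereferencing count.
    private void ensureInit()
    {
        if (count is null)
            count = new int; // allocated and zero-initialized
    }

    void increment()
    {
        ensureInit();
        ++*count;
    }

    int value()
    {
        ensureInit();
        return *count;
    }

    void reset()
    {
        // Omitting this call would dereference null on a fresh Counter.
        ensureInit();
        *count = 0;
    }
}

void main()
{
    Counter c;
    c.increment();
    c.increment();
    assert(c.value() == 2);

    Counter fresh;
    fresh.reset(); // safe only because reset() also checks
    assert(fresh.value() == 0);
}
```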

I wonder, would it be possible to go even further?  For example, something like this:

struct S
{
  lazy this()
  {
     // create state
  }
}

which would be called if the struct still has not been initialized?  lazy this() should be prepared to accept an already initialized struct, which should be no problem because most lazily initialized structs always differ from the .init value.

The compiler could optimize this call out when it can statically determine that a struct has already been initialized.

This, of course, does not cover copy construction, but it would be called before the copy constructor.

BTW, I'm glad you guys are looking at this problem.

-Steve
May 29, 2010
Actually, I have to ask what the purpose behind this delayed initialization is in the first place. The following works just fine as things are:

    int[] a;

    assert(!a);

    a ~= 42;

    assert(a);


If a were an object, this wouldn't work at all - even if it implemented the concatenation assignment operator. It would be null until you actually assigned it an object. Arrays - both normal and associative - don't seem to operate this way at all. This has the advantage that it's a bit hard to get the program to blow up on a null array, but it's not like that would generally be hard to find and fix. It makes it a bit hard to have an actual null array which stays that way. It would be easy to make an array null and accidentally add something to it, resulting in a bug which would have been found if the array didn't create itself upon concatenation. Also, you get the problem which started this thread - that of it getting created in a function that it's passed to and not ending up in the function that did the passing.
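For normal arrays, that last problem can be sketched as follows. The function names fill and fillRef are made up for illustration; the ref version is only there to show the contrast:

```d
// Appending to a null slice passed by value: the callee's copy gets
// fresh memory, but the caller's slice is left untouched.
void fill(int[] a)
{
    a ~= 42;
    assert(a.length == 1);
}

// Passing by ref makes the append visible to the caller.
void fillRef(ref int[] a)
{
    a ~= 42;
}

void main()
{
    int[] a;               // null slice
    assert(a is null);

    fill(a);
    assert(a is null);     // still null in the caller
    assert(a.length == 0);

    fillRef(a);
    assert(a.length == 1); // now the caller sees the element
    assert(a[0] == 42);
}
```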

Other than having to update existing code, it doesn't seem that onerous to me to require

int[] a = new int[](0);

to have an empty array if you want one (though it is a little weird to create an array of length 0).

Are there bugs that I'm not thinking of which the current behavior of creating the array for you avoids? Or am I just missing something here? It really seems to me like you're creating a workaround for a problem in the language. And while that workaround may be great for other stuff too, just making arrays stay null until the programmer assigns them another value fixes the bug that you're trying to fix - at least as far as I can tell. I don't understand why the current behavior was chosen. It does simplify array creation somewhat, but it seems to me that it's more likely to cause bugs than avoid them.

- Jonathan M Davis
May 29, 2010
On 2010-05-28 21:26:50 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> said:

> Walter has had a great idea last night: allow classes to define
> 
> this(ref S src);
> 
> where S is the name of the struct being defined, as an alternative for
> 
> this(this);
> 
> The result would be a function similar to a C++ copy constructor.
> 
> Such an abstraction allows us to perform lazy initialization in a way that avoids the kind of problems associated with non-shared hash tables:

At this point I'll put the lazy initialization into question. If as soon as you make a copy of the struct you must allocate, it means that the container will be initialized as soon as you pass it to some function (unless the argument is passed by 'ref', but you want reference semantics precisely to avoid that, am I right?).

If the container is to be initialized as soon as you make a copy, the lazy initialization becomes of limited utility; it'll only be useful when you have a container you don't pass to another function *and* you never put anything in it. This makes the tradeoff of lazy initialization less worth it, as the extra logic checking every time if the container is already initialized will rarely serve a purpose.


-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/
