October 11, 2012
On 12-Oct-12 01:16, monarch_dodra wrote:
>>> Except that you could write the invariant to be inclusive of the .init
>>> state.
>>
>> Which would completely defeat the purpose of the invariant in many cases. The
>> point is that it is invalid to use the init value. You can pass it around and
>> set stuff to it and whatnot, but actually calling functions on it would be
>> invalid, because its init state isn't valid. SysTime is a prime example of
>> this, because it requires a valid TimeZone object, but its init value can't
>> have one, because TimeZone is a class. So ideally, it would have an invariant
>> which asserts that its TimeZone is non-null, but it can't have that, because
>> opAssign unfortunately checks the invariant before it's called (which makes no
>> sense to me - why would the state of the object prior to assignment matter?
>> you're replacing it), so assigning a valid value to a default-initialized
>> SysTime would fail the invariant.
>>
>> - Jonathan M Davis
>
> This sounds more like a limitation of invariants, rather than a problem
> with .init. You make (imo) a valid point.
>
> Would it be complicated for opAssign to first check memcmp(this,
> T.init), and only do entry invariant check if the comparison fails?
>
> Potentially ditto on exit.
With your rule T.init is a valid state. AFAICT in Jonathan's example it isn't.

-- 
Dmitry Olshansky
October 11, 2012
On Friday, October 12, 2012 01:20:44 Dmitry Olshansky wrote:
> > 
> > This sounds more like a limitation of invariants, rather than a problem with .init. You make (imo) a valid point.
> > 
> > Would it be complicated for opAssign to first check memcmp(this, T.init), and only do entry invariant check if the comparison fails?
> > 
> > Potentially ditto on exit.
> 
> With your rule T.init is a valid state. AFAICT in Jonathan's example it isn't.

Yeah. All that's required is that you outright skip the call to the invariant before calling opAssign. It _does_ mean special casing opAssign, but I don't see that as a problem. I don't understand why it matters whether the object is valid before it's assigned to. Presumably, you're completely replacing its state, and regardless of what you actually do in the function, it would still need to be valid afterwards. So, it seems perfectly fine to me to just skip calling the invariant before calling opAssign, but Walter was against it. His comment on the bug ( http://d.puremagic.com/issues/show_bug.cgi?id=5058 ) indicated that he thought that init should always be in a valid state, but since NaN and null are invalid states, I see no reason that a struct's init value can't be an invalid state. It can be copied and passed around just fine. It just wouldn't pass its invariant if you tried to call a function on it before assigning it a valid value.
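
To make the situation concrete, here is a rough sketch of the kind of type I mean. Zone and Stamp are made-up stand-ins for TimeZone and SysTime, not the real Phobos types. Whether the entry check actually fires depends on the compiler version and on contracts being enabled (i.e. not building with -release), so the program only reports what it observed.

```d
import core.exception : AssertError;
import std.stdio : writeln;

class Zone {}            // hypothetical stand-in for TimeZone

struct Stamp             // hypothetical stand-in for SysTime
{
    Zone zone;

    this(Zone z) { zone = z; }

    invariant()
    {
        // Stamp.init has a null zone, so it cannot satisfy this.
        assert(zone !is null, "Stamp needs a Zone");
    }

    // A public member function: contracts run on entry and on exit.
    ref Stamp opAssign(Stamp rhs)
    {
        zone = rhs.zone;
        return this;
    }
}

void main()
{
    Stamp s;             // default-initialized, zone is null
    bool entryCheckFired = false;

    // Catching AssertError is for demonstration only.
    try
        s = Stamp(new Zone);
    catch (AssertError e)
        entryCheckFired = true;

    writeln("invariant fired on entry to opAssign: ", entryCheckFired);
}
```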

- Jonathan M Davis
October 12, 2012
First of all thank you for the detailed responses.

I wrote a response yesterday but somehow the website seems to have swallowed it.

On Thursday, 11 October 2012 at 12:43:31 UTC, Andrei Alexandrescu wrote:
> We could (after all, C++ does it). There are a few disadvantages to doing so, however.
>
> 1. Defining static data is more difficult. Currently, all static data is statically-initialized. With default constructors, we'd need to define the pre-construction state of such objects anyway, and then change the compiler to call constructors prior to main(). I find the current design simpler and easier to use.

This is a good reason. I like the idea of "no code gets run before main". Running code before main only led to problems in C++. However, those problems were always merely inconvenient, never big issues. I think it's not good to allow people to run code before main, but I think it's a bigger problem to have no default constructors.

> 2. Creating a temporary object cannot be anymore assumed to be a O(1), no-resources-allocated deal. Instead, generic code must conservatively assume that objects are always arbitrarily expensive to create. That makes some generic functions more difficult to implement.

I think that is OK. The generic algorithm should assume that your object is cheap to create. In C++ all algorithms assume this and there are few issues. Sure, every now and then you pass an expensive-to-create object to an algorithm which creates instances, but that bug is very easy to debug.

> 3. Two-phase object destruction (releasing state and then deallocating memory), which is useful, is made more difficult by default constructors. Essentially the .init "pre-default-constructor" state intervenes in all such cases and makes it more difficult for language users to define and understand object states.

I'm not sure that I understand this. My two ways of interpreting this are:
A) You mean that the compiler currently assumes that it doesn't have to call the destructor for objects that are at init. But after introducing a default constructor it would always have to call the destructor. I think that's OK. That's an optimization that's unlikely to give you much gain for types that need a destructor.
B) You mean that if we introduce a default constructor, there would still be situations where an object is at init and its destructor gets called. For example, if people throw exceptions. And users might be confused by this when their destructor gets run and their object is at init, instead of the state that they expect. I think this is OK. It is the same situation that we currently have with static opCall(). Yes, with the static opCall() hack people kinda expect that their object isn't always initialized, so their destructors probably react better to the state being at init, but I think if the documentation states clearly "there are situations where the destructor will be called on an object whose default constructor was not called" then people can handle that situation just fine.

> 4. Same as above applies to an object post a move operation. What state is the object left after move? C++'s approach to this, forced by the existence of default constructors and other historical artifacts, has a conservative approach that I consider inferior to D's: the state of moved-from object is decided by the library, there's often unnecessary copying, and is essentially unspecified except that "it's valid" so the moved-from object can continue to be used. This is in effect a back-door introduction of a "no-resources-allocated" state for objects, which is what default constructors so hard tried to avoid in the first place.

For moving you'd just have to define a state that the source object is in after moving. Since it's a destructive move I would expect the object to be at init after moving, as if the destructor had been called. If that is well defined, then I think users will be fine with it. This is actually the situation that we currently have, and users seem to be fine with it. This can be achieved with a tiny change to the current implementation of std.algorithm.move: Make it memcpy from init instead of a statically allocated value.
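
As a small illustration of the "source is at init after the move" behaviour I have in mind, here is a sketch using std.algorithm.move. Owner is a made-up type for this example. As far as I understand the documented behaviour, move resets the source to its .init value when the type has a destructor, which is what the asserts check.

```d
import std.algorithm : move;

struct Owner                 // hypothetical type owning a pointer
{
    int* payload;

    // Having a destructor is what makes move reset the source.
    ~this() { payload = null; }
}

void main()
{
    auto p = new int;
    *p = 42;

    auto a = Owner(p);
    Owner b;                 // starts out at Owner.init

    move(a, b);              // destructive move from a into b

    assert(a.payload is null);   // a is back at Owner.init
    assert(b.payload is p);      // b now owns the payload
    assert(*b.payload == 42);
}
```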


I'd also like it if we could write all structs so that init is a valid state of the struct, as Walter suggests. However this is going to make certain things impossible in the language. Simple things like having shared data between multiple instances of a struct. Or counting how often objects of a certain type were allocated. Or iterating over all instances of a type.
In fact there are parts of the standard library that don't work because they'd need a default constructor. One of the linked posts mentions this example from std.typecons:

    import std.typecons;
    {
        RefCounted!(int, RefCountedAutoInitialize.yes) a;
        assert(a == 0); // works
        RefCounted!(int, RefCountedAutoInitialize.no) b = a;
        assert(b == a); // works
        a = 5;
        assert(b == a); // works
    }
    {
        RefCounted!(int, RefCountedAutoInitialize.yes) a;
        //assert(a == 0);
        RefCounted!(int, RefCountedAutoInitialize.no) b = a;
        assert(b == a); // works
        a = 5;
        assert(b == a); // doesn't work
    }

In this case it just means that that struct needs to be rewritten, because the whole "RefCountedAutoInitialize" thing is impossible in D. std.typecons.RefCounted should never be auto initialized. People have to always initialize it manually. Of course that also means that any type which uses this has to be initialized manually. And any type which uses that.
So you can't really have arrays of RefCounted things. Or arrays of structs which use RefCounted. Basically you will come across situations where RefCounted doesn't do what you expect and then you'll have to hack around it. All because structs can't really have shared data between multiple instances.


But really the bigger problem is not that this makes a small number of features impossible. The bigger problem is that this means that you have to use classes for a large number of things for which structs are better suited. Just because I have things that need to happen at creation time doesn't mean that I want it to be a class.
Also, as has been brought up several times: It means you have to fight against the language if you want to use invariants. You have to always say "if it is in an invalid state, then everything is OK. Otherwise check if the state makes sense."
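
Roughly, the pattern I mean looks like the sketch below. Zone and Stamp are made-up names for illustration, and treating "zone is null and ticks is 0" as the init state is an assumption of this example. The invariant has to special-case the default state before it can check anything useful.

```d
class Zone {}                // hypothetical stand-in for a class like TimeZone

struct Stamp                 // hypothetical stand-in for a type like SysTime
{
    long ticks;
    Zone zone;

    this(long t, Zone z) { ticks = t; zone = z; }

    invariant()
    {
        // Escape hatch: the all-defaults state is tolerated as "not set yet".
        immutable atInit = zone is null && ticks == 0;
        if (!atInit)
        {
            assert(zone !is null, "a set Stamp needs a Zone");
            assert(ticks >= 0, "ticks must not be negative");
        }
    }

    bool isSet() const { return zone !is null; }
}

void main()
{
    Stamp s;                     // Stamp.init
    assert(!s.isSet());          // only passes the invariant via the escape hatch

    s = Stamp(5, new Zone);      // no user-defined opAssign, so a plain copy
    assert(s.isSet());
}
```

The cost is that the invariant can no longer catch accidental use of a Stamp that was never set.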

You are all making good points, but I think the current state is not going to work long term in the real world. Heck, it already introduces problems in the standard library and hacks like static opCall are accepted wisdom. Sure, introducing a default constructor would create problems, but those are either minor or solvable.
October 12, 2012
On 2012-05-12 00:10, Jonathan M Davis <jmdavisProg@gmx.com> wrote:

> On Friday, October 12, 2012 01:20:44 Dmitry Olshansky wrote:
>> >
>> > This sounds more like a limitation of invariants, rather than a problem
>> > with .init. You make (imo) a valid point.
>> >
>> > Would it be complicated for opAssign to first check memcmp(this,
>> > T.init), and only do entry invariant check if the comparison fails?
>> >
>> > Potentially ditto on exit.
>>
>> With your rule T.init is a valid state. AFAICT in Jonathan's example it
>> isn't.
>
> Yeah. All that's required is that you outright skip the call to the invariant
> before calling opAssign. It _does_ mean special casing opAssign, but I don't
> see that as a problem. I don't understand why it matters whether the object is
> valid before it's assigned to. Presumably, you're completely replacing its
> state, and regardless of what you actually do in the function, it would still
> need to be valid afterwards. So, it seems perfectly fine to me to just skip
> calling the invariant before calling opAssign, but Walter was against it. His
> comment on the bug ( http://d.puremagic.com/issues/show_bug.cgi?id=5058 )
> indicated that he thought that init should always be in a valid state, but
> since NaN and null are invalid states, I see no reason that a struct's init
> value can't be an invalid state. It can be copied and passed around just fine.
> It just wouldn't pass its invariant if you tried to call a function on it
> before assigning it a valid value.

The opAssign can presumably be more complex, and e.g. require deallocation of
non-GC memory, releasing handles and whatnot.

Anyways, is there a reason you cannot use @disable this() for SysTime? That way,
you have rather explicitly marked .init as invalid.

-- 
Simen
October 12, 2012
On Thursday, 11 October 2012 at 22:05:34 UTC, Jonathan M Davis
wrote:
> On Friday, October 12, 2012 01:20:44 Dmitry Olshansky wrote:
>> > 
>> > This sounds more like a limitation of invariants, rather than a problem
>> > with .init. You make (imo) a valid point.
>> > 
>> > Would it be complicated for opAssign to first check memcmp(this,
>> > T.init), and only do entry invariant check if the comparison fails?
>> > 
>> > Potentially ditto on exit.
>> 
>> With your rule T.init is a valid state. AFAICT in Jonathan's example it
>> isn't.
>
> Yeah. All that's required is that you outright skip the call to the invariant
> before calling opAssign. It _does_ mean special casing opAssign, but I don't
> see that as a problem. I don't understand why it matters whether the object is
> valid before it's assigned to. Presumably, you're completely replacing its
> state, and regardless of what you actually do in the function, it would still
> need to be valid afterwards. So, it seems perfectly fine to me to just skip
> calling the invariant before calling opAssign, but Walter was against it. His
> comment on the bug ( http://d.puremagic.com/issues/show_bug.cgi?id=5058 )
> indicated that he thought that init should always be in a valid state, but
> since NaN and null are invalid states, I see no reason that a struct's init
> value can't be an invalid state. It can be copied and passed around just fine.
> It just wouldn't pass its invariant if you tried to call a function on it
> before assigning it a valid value.
>
> - Jonathan M Davis

Yes, as answered, opAssign may do things to this, such as
deallocate a payload, reduce a ref counter, or who knows what.

As a matter of fact, there was a bug in emplace about that I
recently fixed.

My rationale for skipping the test *ONLY* if "this == .init" is
that .init is supposed to mean not yet fully initialized. This
would allow code such as:

T t = t.init; //Not yet ready for use.
                //Attempt to use will assert the invariant
...
t = T(5); //Initialize it later. Don't check invariant on entry
...
//Use t and check invariants every time.
...
t = t.init; //De-initialization, Don't check invariant on exit.
              //Future attempt for use will assert the invariant

I'd also do the same for destructor entry: The language already
states that .init SHOULD be a valid destructible state.
October 12, 2012
On Friday, October 12, 2012 06:42:23 Simen Kjaeraas wrote:
> Anyways, is there a reason you cannot use @disable this() for SysTime?
> That way,
> you have rather explicitly marked .init as invalid.

Disabling init does a lot to make a type unusable such that it really doesn't make sense to use it unless you need to. It's not that you can't have init. It's that you need to make sure that you actually set it to a valid value before calling any functions on it. If init were disabled, then you couldn't do stuff like

SysTime[12] times;

which should be perfectly fine as long as you make sure to set them all before actually calling any functions on them, e.g.

foreach(i, ref t; times)
    t  = calcTime(i);

Think of it like NaN or null. You can have variables set to them, and it's not a problem. It's only once you try to use functions or operators on them that they blow up.
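
Here is a small sketch of the difference, using two made-up structs (NoDefault and WithDefault are names invented for this example, not anything from Phobos). It relies on __traits(compiles, ...) to show that the static array declaration is rejected once default construction is disabled.

```d
struct NoDefault             // hypothetical type with init disabled
{
    int value;
    @disable this();
    this(int v) { value = v; }
}

struct WithDefault           // hypothetical type that keeps its init value
{
    int value;
}

void main()
{
    // A default-initialized static array needs default construction.
    static assert(!__traits(compiles, { NoDefault[12] xs; }));
    static assert( __traits(compiles, { WithDefault[12] xs; }));

    // With init available, declare first and fill in valid values later.
    WithDefault[12] times;
    foreach (i, ref t; times)
        t = WithDefault(cast(int) i);

    assert(times[0].value == 0);
    assert(times[11].value == 11);
}
```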

- Jonathan M Davis
October 12, 2012
On Friday, October 12, 2012 07:53:09 monarch_dodra wrote:
> Yes, as answered, opAssign may do things to this, such as deallocate a payload, reduce a ref counter, or who knows what.

A valid point, but it would be easy to explicitly call the invariant at the beginning of opAssign if you wanted to ensure that the object's state was valid at the beginning of opAssign. You can't _not_ call the invariant though if the compiler already does. And there's another problem which your suggested check against init wouldn't fix. It's initializing to void:

S s = void;

If you ever want to do that, you can't have an invariant, or it'll blow up when you try and assign to it. And since it certainly wouldn't be the same as the init property, checking against the init property wouldn't help you any.

- Jonathan M Davis
October 12, 2012
On Friday, 12 October 2012 at 07:36:37 UTC, Jonathan M Davis wrote:
> On Friday, October 12, 2012 07:53:09 monarch_dodra wrote:
>> Yes, as answered, opAssign may do things to this, such as
>> deallocate a payload, reduce a ref counter, or who knows what.
>
> A valid point, but it would be easy to explicitly call the invariant at the
> beginning of opAssign if you wanted to ensure that the object's state was valid at
> the beginning of opAssign. You can't _not_ call the invariant though if the
> compiler already does. And there's another problem which your suggested check
> against init wouldn't fix. It's initializing to void:
>
> S s = void;
>
> If you ever want to do that, you can't have an invariant, or it'll blow up
> when you try and assign to it. And since it certainly wouldn't be the same as
> the init property, checking against the init property wouldn't help you any.
>
> - Jonathan M Davis

Well, again, you'd need your s to be in a valid state to assign to it, so I'd *hope* the invariant blew up in my face.

When you declare s void, you are supposed to memcpy .init over it manually, or call emplace (which does the same thing + more).

And you CAN have an invariant:

//----
import std.conv;          // emplace
import core.stdc.string;  // memcpy

struct S
{
    invariant()
    {
        assert(0);
    }
}

void main()
{
    S s = void;

    static auto foo = S.init;
    memcpy(&s, &foo, S.sizeof);
    //or
    emplace(&s);
}
//----

TA-DA!!!
October 12, 2012
On Friday, October 12, 2012 10:09:22 monarch_dodra wrote:
> When you declare s void, you are supposed to memcpy .init over it manually, or call emplace (which does the same thing + more).

If that's what you're "supposed" to do, it's only because opAssign is annoying enough to check its invariant. Without the invariant, that's not something that would normally make sense to do. And it's _not_ what you do with a built-in type.

int i = void;
i = 5;

is perfectly legal. I see no reason why

S s = void;
s = S(17);

shouldn't be legal as well. And it _is_ legal, except when you're unlucky enough to need to define an opAssign for S, because then that'll blow up in your face if you have an invariant.

And because the _only_ reason that using memcpy or emplace would be necessary (or really make any sense at all) is to work around the fact that the invariant is called before opAssign, that means that you have to contort your code specifically to make it work in non-release mode, making it _less_ efficient in release mode. And since usually the point of initializing to void is to gain extra efficiency, this is entirely counterproductive.

I'd argue that _any_ code which is specifically doing emplace or memcpy because it initialized to void is probably a bad idea. If that's what it needs to do, it shouldn't have been initialized to void in the first place. And if it's because of the invariant, then it makes way more sense to ditch the invariant than use emplace like that.

Really, I think that it's a bad design decision to require that the invariant be called before opAssign. It does _not_ play nice with some of D's other features, and the result is likely to be that invariants get used less, meaning that code is more likely to be buggy.

- Jonathan M Davis
October 12, 2012
On Friday, 12 October 2012 at 08:20:42 UTC, Jonathan M Davis
wrote:
> On Friday, October 12, 2012 10:09:22 monarch_dodra wrote:
>
> If that's what you're "supposed" to do, it's only because opAssign is annoying
> enough to check its invariant. Without the invariant, that's not something
> that would normally make sense to do. And it's _not_ what you do with a
> built-in type.
>
> int i = void;
> i = 5;
>
> is perfectly legal. I see no reason why
>
> S s = void;
> s = S(17);
>
> [SNIP]
>
> - Jonathan M Davis

The issue with initializing with void actually has nothing to do
with invariants.

Try that code defining S as RefCounted!int and see what happens.