May 20, 2013
On Sunday, 19 May 2013 at 23:07:00 UTC, deadalnix wrote:
> On Sunday, 19 May 2013 at 22:32:58 UTC, Andrei Alexandrescu wrote:
>> How was there a bug if everything was properly synchronized? You either describe the matter with sufficient detail, or acknowledge the destruction of your anecdote. This is going nowhere.
>>
>
> I explained over and over. A field is initialized to null, while the object lock is owned, and later to its value, while it is locked. In the meantime, another thread accesses the object, owning the lock, assuming the field is always initialized.
>
> The exact same problem arises quite often in the single threaded world: a reference is set to null, the dev tries to be clever when initializing it, in a rare case it isn't initialized, and everything blows up when that occurs.
>
> It is simply easier to reproduce when things are single threaded, and such bugs are often quickly debugged in that case.
>
> As explained, the multithreaded environment makes it super hard to debug; it is not the primary cause of the issue. The fix simply consisted of moving the initialization to where the field was set to null in the first place.
>
> It is an instance of the very classic bug: something may be null, and code uses it assuming it is never null.

So this is not a problem of nullability - rather this is a problem of mutability.
May 20, 2013
On Sunday, 19 May 2013 at 20:45:39 UTC, Walter Bright wrote:
> On 5/19/2013 1:03 PM, Maxim Fomin wrote:
> I think there is a difference between catching an exception and saving
> data which you have typed for some period, and letting hardware
> "check" the exception for you, meanwhile losing your work.
>
> You can catch seg faults. It's easier on Windows, but it's doable on Linux.

What's the rationale for not doing this by default in D? Wouldn't a MemoryAccessError or similar be better than crashing out with SIGSEGV? I have no idea about the consequences of this (other than tempting people to catch a segfault when they shouldn't, which is pretty much always).
May 20, 2013
On Sunday, 19 May 2013 at 23:29:53 UTC, Walter Bright wrote:
> On 5/19/2013 4:06 PM, deadalnix wrote:
>> On Sunday, 19 May 2013 at 22:32:58 UTC, Andrei Alexandrescu wrote:
>>> How was there a bug if everything was properly synchronized? You either
>>> describe the matter with sufficient detail, or acknowledge the destruction of
>>> your anecdote. This is going nowhere.
>>>
>>
>> I explained over and over. A field is initialized to null, while the object lock
>> is owned, and later to its value, while it is locked. In the meantime, another
>> thread accesses the object, owning the lock, assuming the field is always
>> initialized.
>
> so, you have:
> ==========================
> Thread A     Thread B
>
> lock
>    p = null

Here p = null is implicit; this is part of the fun. The initialisation is still properly synchronized.

> unlock
>              lock
>                 *p = ...

It was in Java, so more something like p.foo(); but yes.

>              unlock
> lock
>    p = new ...
> unlock
> ==========================
> Although you are using locks, it still isn't properly synchronized. Changing the p=null to p=someothernonnullvalue will not fix it.

No race condition exists in that program. The error lies in the improper initialization of p in the first place, which should never have been null. The example looks dumb like this; you have to imagine the pattern hidden in thousands of LOC.

The code is buggy, but you'll find no undefined threading effect. What happens is perfectly defined here, and no thread accesses shared data without owning the lock.
May 20, 2013
On Monday, 20 May 2013 at 00:23:59 UTC, John Colvin wrote:
> On Sunday, 19 May 2013 at 20:45:39 UTC, Walter Bright wrote:
>> On 5/19/2013 1:03 PM, Maxim Fomin wrote:
>>> I think there is a difference between catching an exception and saving
>>> data which you have typed for some period, and letting hardware
>>> "check" the exception for you, meanwhile losing your work.
>>
>> You can catch seg faults. It's easier on Windows, but it's doable on Linux.
>
> What's the rationale for not doing this by default in D? Wouldn't a MemoryAccessError or similar be better than crashing out with SIGSEGV? I have no idea about the consequences of this (other than tempting people to catch a segfault when they shouldn't, which is pretty much always).

https://github.com/D-Programming-Language/druntime/blob/master/src/etc/linux/memoryerror.d

Still, you can have weird effects when the code above throws in the middle of a C routine or something. As C doesn't know about D exceptions, it is definitely something to be aware of.
May 20, 2013
On Monday, 20 May 2013 at 00:09:23 UTC, Walter Bright wrote:
> On 5/19/2013 3:04 PM, deadalnix wrote:
>> Same argument Walter likes to make about very rare failure cases applies here.
>
>
> 1. rare as in programmers rarely create such a bug
>
> 2. rare as in being rare for an existing bug to show itself
>
> I was referring to (1), while you are referring to (2).

When you talk about UNIX utilities not properly handling a full filesystem, for instance, you are referring to 1.
May 20, 2013
On Sun, 19 May 2013 16:03:24 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> On 5/19/13 3:36 PM, deadalnix wrote:
>> I described a very usual null bug: something is set to null, then to a
>> specific value. It is assumed not to be null. In a specific case it is
>> null and everything explodes.
>>
>> The concurrent context here made it especially hard to debug, but isn't
>> the cause of the bug.
>>
>> Additionally, if you don't have enough information to understand what
>> I'm saying, you are perfectly allowed to ask for additional details. This
>> isn't a shame.
>
> Your argument has been destroyed so no need to ask details about it. Replace "null" with "invalid state" and it's the same race in any system. Let's move on.

I just wanted to chime in with this understanding of the bug that I am reading from deadalnix's descriptions:

SomeObj obj;
shareTheObj(&obj); // goes to other threads
obj = new SomeObj; // initialize obj

This is likely simpler than the actual problem, but I think this is the gist of it.

A Non-Nullable solution WOULD solve the race:

SomeObj obj; // Compiler: nope, can't do that, must initialize it.

Now, with an "invalid state" not available like null, is the developer more likely to go through the trouble of building an invalid state for obj, in order to keep the racy behavior?  No.  He's just going to move the initialization:

SomeObj obj = new SomeObj;
shareTheObj(&obj);

In essence the point of the anecdote is that a Non-Nullable reference would have PROMOTED avoiding the race condition -- it would have been harder to keep the racy behavior.

I'm not saying that I think we need NN references as a compiler-supported type, or that it needs to be the default, or that NN references ALWAYS solve race conditions.  I'm just pointing out what I see is an obvious misinterpretation of the underlying story.

-Steve
May 20, 2013
On 05/20/2013 01:39 AM, Andrei Alexandrescu wrote:
> On 5/19/13 7:06 PM, deadalnix wrote:
>> On Sunday, 19 May 2013 at 22:32:58 UTC, Andrei Alexandrescu wrote:
>>> How was there a bug if everything was properly synchronized? You
>>> either describe the matter with sufficient detail, or acknowledge the
>>> destruction of your anecdote. This is going nowhere.
>>>
>>
>> I explained over and over. A field is initialized to null, while the
>> object lock is owned, and later to its value, while it is locked. In the
>> meantime, another thread accesses the object, owning the lock, assuming
>> the field is always initialized.
>
> How does another thread access the object "owning the lock"
> when the assignment occurs under lock?
>

lock{ initialize to null. }
lock{ in the meantime assume correctly initialized. }
lock{ initialize correctly. }

This is nothing new. I think he has been pretty clear about what the issue is from the beginning.

> How would non-null fix this? Would the object have type Maybe?
>

This is one possibility. In this case, the type system would have prevented the null dereference.
In the other case, the type system would have caught the invalid initialization.
May 20, 2013
On 05/20/2013 02:33 AM, deadalnix wrote:
> On Monday, 20 May 2013 at 00:09:23 UTC, Walter Bright wrote:
>> On 5/19/2013 3:04 PM, deadalnix wrote:
>>> Same argument Walter likes to make about very rare failure cases applies
>>> here.
>>
>>
>> 1. rare as in programmers rarely create such a bug
>>
>> 2. rare as in being rare for an existing bug to show itself
>>
>> I was referring to (1), while you are referring to (2).
>
> When you talk about UNIX utilities not properly handling a full
> filesystem, for instance, you are referring to 1.

You mean 2.
May 20, 2013
Unfortunately this is currently not a bug.
T.init provides a "default initialized" object image, and it *does not*
provide a "default constructed" object. The difference is important.

That is already documented in the language reference: http://dlang.org/property#init

> Note: .init produces a default initialized object, not default constructed. That means using .init is sometimes incorrect.
> 1. If T is a nested struct, the context pointer in T.init is null.
> 2. If T is a struct which has @disable this();, T.init might return a logically incorrect object.

Kenji Hara

2013/5/20 Maxim Fomin <maxim@maxim-fomin.ru>

> On Saturday, 18 May 2013 at 20:39:29 UTC, Walter Bright wrote:
>
>> On 5/18/2013 1:22 PM, deadalnix wrote:
>>
>>> Many are, but I think that isn't the point we are discussing here.
>>>
>>> Removing all holes in @disable this will require the same sacrifices in
>>> the end as a default constructor would. For instance, what should happen
>>> in this case:
>>>
>>> S[] ss;
>>> ss.length = 42;
>>>
>>> if S has @disable this ?
>>>
>>
>> Already reported:
>>
>> http://d.puremagic.com/issues/show_bug.cgi?id=10115
>>
>
> New case, will report it:
>
>
> struct S
> {
>    int a;
>    @disable this();
>    this(int) { a = 1; }
>    ~this() { assert(a !is 0); }
>    alias a this;
>    int opCall() { return a; }
> }
>
> void main()
> {
>    switch (S.init())
>    {
>       case 0:
>          assert(0); //oops
>       default:
>    }
> }
>
> By the way, here is another bug.
>
> I think there is disagreement about @disable reliability and usefulness,
> and similar issues (@safe reliability too), due to different attitudes to
> the problem:
> - As a language designer, I care about whether some feature is claimed to
> solve some problem - and that's all; I put it on a slide as a language
> advantage.
> - As a programmer who writes medium importance code, I care whether the
> feature stops me from making bugs unintentionally. If it does, then I
> consider that the feature works.
> - As a programmer who writes critical code, I care whether the feature
> prevents the problem, even one made deliberately, and if it doesn't, then
> the feature isn't reliable. That doesn't mean it is totally useless, but it
> does mean that its reliability commitments are cheap.
>
> Since in a system language there are plenty of ways to deliberately pass invalid data to a place where validity assumptions were made, @disable is a broken feature.
>


May 20, 2013
On 5/19/2013 5:28 PM, deadalnix wrote:
> The error lies in the improper
> initialization of p in the first place, which should never have been null. The
> example looks dumb like this; you have to imagine the pattern hidden in
> thousands of LOC.

I would consider a design that declares a variable in one place, then initializes it in another, while releasing the lock in between, to be a bad design pattern to begin with. What other default initialized types could be there? What about an int default initialized to 0, yet code in another thread expects it to be some other value? I suspect there'd be a lot more bugs in it than just null pointer initializations.

It might be time to engineer a new pattern so you don't have to inspect thousands of LOC to manually verify correctness.