May 17, 2013
On Thursday, 16 May 2013 at 21:10:52 UTC, Dmitry Olshansky wrote:
> 16-May-2013 02:21, Idan Arye пишет:
>> But
>> I *want* to (almost)cause a race condition here
>
> You can't "cause" a race condition - it's there by definition of code.
> What you can cause is a data corruption that happened with one bad interleave of reads/writes because it was possible.

That's why I added the "(almost)". A race condition happens when one thread reads a variable and writes it back based on the old value, and between that read and write another thread writes to that variable. By adding a `sleep()` between that read and write, I can conjure my own race condition. Ofcourse, the race condition will never happen, because the singleton's implementation uses synchronization to prevent it - but that's the whole point of my unit test, to check that the implementation is doing the synchronization correctly.

> I'm not seeing this test do anything useful anyway - you know there is a race condition and then you try to test to see if it works as just as if there is none.
>
> It's more like unscrewing a bunch of bolts and seeing if the car manages it to the other town. It might or not depending on how the driver rides it and on a ton of other chances - truth be told it's blind luck almost. Even if you unscrew the same bolts exactly the same way the 2nd car will not ride exactly as long a distance as 1st one.

But if I have a system in my car that allows it to keep traveling even after unscrewing some bolts, and I want to test it, the way to do it is to unscrew those bolts and try to drive it to the other town. If I can't get to the other town, that means that system failed.

That's what I'm doing here - I have a system that prevents a race condition in the singleton's instantiation, and in order to test that system I try to force a race condition in the singleton's instantiation.

> And if your PC was compressing video or serving some HTTP you'll have even less no matter how you try to run them I guess...

OK, I just tested it by playing for video files at once, and the accuracy dropped from 100% to around 96%.

Still, the point of unit tests is to make sure that code changes do not break old code. Most of them are super trivial, because many of the bugs they prevent are also super trivial - the kind of bugs that make you want to hit yourself for being so stupid you didn't notice them before. If the library or system have a good set of unit tests, a programmer that runs them can catch those bugs happening in trivial examples right after they are introduced.

So, if a Phobos\druntime\dmd developer somehow manages to change something that breaks my singleton implementation, the next time they run the unit tests there is over 90% chance that unit test will catch the bug even if they put a heavy burden on their machine - and if they don't put such a heavy burden while developing, those chances climb very near to 100%.

So yea, my unit test is not 100% accurate in worst case scenarios - but it still does it's job.

> But if you like it I think Thread.yield will work just as well, it will cause threads to do the context switch.

I tested both. `Thread.yield()` gives around 92% accuracy, while `Thread.sleep(dur!"msecs"(0))` gives 100% accuracy(when I don't play 4 video files at once). I have no idea why this happens, but it happens.

> Sleep doesn't guarantee it. It causes context switch though and that might be what you want. Maybe creating a bunch of threads as suspended and then waking them all up could get close to that.

That's what I did in the second version - I suspended all the threads using `core.sync.barrier`, and the barrier took care to resume them all at once. This allowed me to use a 0ms sleep - but the call to `sleep()` is still required.

> Another problem is about expecting something definite out of race condition. Yes, here you are getting away with single atomic read/write of pointer simply because of x86 architecture.

How is it an atomic read/write if I call `sleep()` *between* the read and the write?

> Technically what you'll can see with a race condition is undefined (not speaking of this simple example on x86 that IMO is pointless anyway).
> Thus trying to catch it in more complex situation will require more then just putting a sleep(xyz) before a function call.
>
> Imagine trying to test a lock-free collection ? You still need to lauch many threads and somehow try to schedule them funky. Even then I don't see how 1 single-shot can be reliable there.

True - my testing method is not possible with lock-free collections, since I can't put a `sleep()` call inside an atomic operation. But I *can* put a sleep call inside a `synchronized` block(or more precisely, inside a constructor that is being invoked inside a synchronized block), so my testing method does work for my case.
May 17, 2013
17-May-2013 04:57, Idan Arye пишет:
> On Thursday, 16 May 2013 at 21:10:52 UTC, Dmitry Olshansky wrote:
>> 16-May-2013 02:21, Idan Arye пишет:
>>> But
>>> I *want* to (almost)cause a race condition here
>>
>> You can't "cause" a race condition - it's there by definition of code.
>> What you can cause is a data corruption that happened with one bad
>> interleave of reads/writes because it was possible.
>
> That's why I added the "(almost)". A race condition happens when one
> thread reads a variable and writes it back based on the old value, and
> between that read and write another thread writes to that variable. By
> adding a `sleep()
` between that read and write, I can conjure my own
> race condition.

A terminological moment - a race condition is not that fact "bad interleaving of reads-writes did happen".

Race condition is totally a static property of a code (or a program), whereby at least 2 threads may simultaneously access memory location and there is no ordering of operations insured (or happens before relations established). It's then said that the program has race condition, sometimes calling it potential if there is no proof that 2 threads can access it simultaneously.

>> I'm not seeing this test do anything useful anyway - you know there is
>> a race condition and then you try to test to see if it works as just
>> as if there is none.
>>
>> It's more like unscrewing a bunch of bolts and seeing if the car
>> manages it to the other town. It might or not depending on how the
>> driver rides it and on a ton of other chances - truth be told it's
>> blind luck almost. Even if you unscrew the same bolts exactly the same
>> way the 2nd car will not ride exactly as long a distance as 1st one.
>
> But if I have a system in my car that allows it to keep traveling even
> after unscrewing some bolts,

The trick here is that race condition is not guaranteed to fail in a specific way.

> OK, I just tested it by playing for video files at once, and the
> accuracy dropped from 100% to around 96%.
>
> Still, the point of unit tests is to make sure that code changes do not
> break old code. Most of them are super trivial, because many of the bugs
> they prevent are also super trivial - the kind of bugs that make you
> want to hit yourself for being so stupid you didn't notice them before.
> If the library or system have a good set of unit tests, a programmer
> that runs them can catch those bugs happening in trivial examples right
> after they are introduced.
>
> So, if a Phobos\druntime\dmd developer somehow manages to change
> something that breaks my singleton implementation, the next time they
> run the unit tests there is over 90% chance that unit test will catch
> the bug even if they put a heavy burden on their machine - and if they
> don't put such a heavy burden while developing, those chances climb very
> near to 100%.
>
> So yea, my unit test is not 100% accurate in worst case scenarios - but
> it still does it's job.

Chance == smoke test, so I think we finally in agreement :)
Then by definition running it multiple times you'll have (1 - p)^^n chance for bugs escaping.
In general since you can't know probability you don't know how long should you run it. Hence my reference to the running it long enough.

>
>> But if you like it I think Thread.yield will work just as well, it
>> will cause threads to do the context switch.
>
> I tested both. `Thread.yield()` gives around 92% accuracy, while
> `Thread.sleep(dur!"msecs"(0))` gives 100% accuracy(when I don't play 4
> video files at once). I have no idea why this happens, but it happens.
>
>> Sleep doesn't guarantee it. It causes context switch though and that
>> might be what you want. Maybe creating a bunch of threads as suspended
>> and then waking them all up could get close to that.
>
> That's what I did in the second version - I suspended all the threads
> using `core.sync.barrier`, and the barrier took care to resume them all
> at once. This allowed me to use a 0ms sleep - but the call to `sleep()`
> is still required.

To ensure context switch. My guess is that yield didn't work because then it could be the case that there is no extra work in the system. In which case the OS can stay on the same thread (and avoid switching) after yield.

>> Another problem is about expecting something definite out of race
>> condition. Yes, here you are getting away with single atomic
>> read/write of pointer simply because of x86 architecture.
>
> How is it an atomic read/write

Each operations is atomic.

> if I call `sleep()` *between* the read
> and the write?

I'm not seeing it. I think I've misread the code anyway.

If you test TLS low-lock singleton

vs

if(instance_ != null) //this is critical point, how do you test it?
	synchronize(mut){
		if(instance_ != null)
			instance_ = ...;
	}
return instance_;

Or the one that isn't locked at all?

Can you show the code actually - bogus singleton and correct one?

>> Technically what you'll can see with a race condition is undefined
>> (not speaking of this simple example on x86 that IMO is pointless
>> anyway).
>> Thus trying to catch it in more complex situation will require more
>> then just putting a sleep(xyz) before a function call.
>>
>> Imagine trying to test a lock-free collection ? You still need to
>> launch many threads and somehow try to schedule them funky. Even then I
>> don't see how 1 single-shot can be reliable there.
>
> True - my testing method is not possible with lock-free collections,
> since I can't put a `sleep()` call inside an atomic operation. But I
> *can* put a sleep call inside a `synchronized` block(or more precisely,
> inside a constructor that is being invoked inside a synchronized block),
> so my testing method does work for my case.

Which is not what I thought you are trying to test. In this trivial case I'd avoid such tests completely YMMV.

-- 
Dmitry Olshansky
May 17, 2013
On Friday, 17 May 2013 at 10:16:21 UTC, Dmitry Olshansky wrote:
> Can you show the code actually - bogus singleton and correct one?

Here: https://gist.github.com/someboddy/5601276
I thinned down the implementation to a minimum working example.

`enum BREAK_IMPLEMENTATION` at the top of the file decides between using the broken implementation and the correct one. I put the unit test under `main()` so you can compile it without the -unittest flag.

The reason that `Foo` is declared at global scope is a bug in dmd that was fixed for the next release(can't find it at Bugzilla, but it's fixed in dmd master). If I declare it at the unittest's or in main's scope, the constructor won't be invoked and `sleep()` won't be called.
May 17, 2013
I think we all understand it's not going to catch every race condition, or even a given race condition every time, ie. it can't prove that the code is correct.

What it can do however is prove that code is incorrect. If the unit test fails there is definitely something wrong with the code, therefore it's a useful test to have.

The most obvious case is if there is a change to the singleton code which accidentally creates a race condition. Any chance of detecting that in a unit test is better than no chance of detecting it.
May 17, 2013
On Friday, 17 May 2013 at 21:06:30 UTC, Diggory wrote:
> I think we all understand it's not going to catch every race condition, or even a given race condition every time, ie. it can't prove that the code is correct.
>
> What it can do however is prove that code is incorrect. If the unit test fails there is definitely something wrong with the code, therefore it's a useful test to have.
>
> The most obvious case is if there is a change to the singleton code which accidentally creates a race condition. Any chance of detecting that in a unit test is better than no chance of detecting it.

Exactly!

Anyways, I've updated the gist with an even better solution, that gave me 100% accuracy even when playing 4 video files.

I added a static shared counter that counts how many threads have passed the barrier - every thread has to increment it immediately after it passes the barrier. The constructor, instead of a single call to `sleep()`, calls `sleep()` until the counter reaches the number of threads. So, this is what happens:

1) 10 threads are started, each stops at the barrier waiting for the others.
2) Once all the threads have reached the barrier, threads are starting to get released(from the barrier, not from memory).
3) Each thread released from the barrier increments the counter.
4) At least one thread passes the singleton checks, enters the constructor and goes into a loop where it calls `sleep()` all of the threads got a chance to run again.
6) Once all the threads got that chance, all the threads stuck in the constructor get released from the loop.

Now, theoretically this unit test is not 100% accurate. If all the threads except one get switched out between step 3 and step 4 - that is, after they increment the counter but before they pass the singleton checks - then by the time those threads will run again, the one other thread gets to initiate the singleton and write the __gshared instance field - which means the other threads won't pass the singleton checks and won't create more instances, and the unit test won't fail even if the implementation is broken.

Now, while theoretically possible, this scenario is far fetched in reality. The window where a thread needs to be switched is very short, and placed immediately after it gets switched in. The amount of work the thread does in that window after it got switched in is smaller(or at least not much larger) than the work needed for the context switch, so no sane operation system should switch it out in that window - let alone switch out nine such threads.
May 18, 2013
On Wednesday, 15 May 2013 at 18:28:53 UTC, Idan Arye wrote:
> Your opinion?

1/ You'll get nowhere with sleep.
2/ Both singleton and race condition are notoriously difficult to test. For the first one, it is considered good practice to avoid it altogether, for the second one, formal proof is prefered.
1 2
Next ›   Last »