August 20, 2004
In article <cg4lgq$6ac$1@digitaldaemon.com>, Matthew says...
>
>I suspect the process that you've built contains somewhere within it - maybe your code, maybe the D libs - a race condition that relates to an integer operation - say increment of a 16 or 32-bit quantity (assuming a 32-bit architecture). On machines with only one processor such operations will *always* work atomically, because of the way Intel (and most other architectures) works.
>
>However, on a machine with two processors, explicit instructions are required to lock the bus and prevent the separate threads' actions from interleaving. Without them, the following can happen:
>
>    Thread A reads the value from memory (to register)
>    Thread B reads the value from memory (to register)
>    Thread B increments the value (in register)
>    Thread B writes the value to memory (from register)
>    Thread A increments the value (in register)
>    Thread A writes the value to memory (from register)
>
>When Thread A writes the value it overwrites the change made by Thread B. Rather than the variable being incremented twice, as expected, the net result is only one increment.
>
>This is a classic race condition, and such things easily bring down processes (or even machines!).
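
To make the interleaving above concrete, here is a single-threaded C++ replay of those six steps. It is only a sketch: `memory`, `reg_a`, `reg_b`, and `replay_lost_update` are made-up names standing in for the shared variable and each thread's register, and no real threads are involved, so the outcome is deterministic.

```cpp
#include <cassert>

// Replays the six steps from the quoted post in a fixed order.
// `memory` plays the shared variable; reg_a / reg_b play each thread's register.
int replay_lost_update() {
    int memory = 0;

    int reg_a = memory;   // Thread A reads the value from memory (to register)
    int reg_b = memory;   // Thread B reads the value from memory (to register)
    reg_b += 1;           // Thread B increments the value (in register)
    memory = reg_b;       // Thread B writes the value to memory (from register)
    reg_a += 1;           // Thread A increments the value (in register)
    memory = reg_a;       // Thread A writes the value to memory (from register)

    // Both threads started from the same stale value, so both computed the same result.
    assert(reg_a == reg_b);
    return memory;        // one increment has been lost
}
```

Two increments were issued, but `replay_lost_update()` returns 1, because Thread A's write is based on the value it read before Thread B's update.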

Not to mention the possibility of CPU cache synchronization.  Win32 provides the handy Interlocked calls to take care of this type of thing, but beyond that you need to use the "lock" prefix on asm code.  Speaking of which, it would be nice to have native D versions of the Win32 Interlocked calls--at least increment, decrement, and compare-exchange.
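
A rough sketch of the three operations mentioned above, written in C++ with `std::atomic` rather than in D or against the Win32 API. The function names are hypothetical and only loosely modelled on the Interlocked calls; this is an illustration of the semantics, not a proposal for how D should spell them.

```cpp
#include <atomic>
#include <cassert>

// Atomically adds 1 and returns the new value.
inline int atomic_increment(std::atomic<int>& v) {
    return v.fetch_add(1) + 1;
}

// Atomically subtracts 1 and returns the new value.
inline int atomic_decrement(std::atomic<int>& v) {
    return v.fetch_sub(1) - 1;
}

// Stores `desired` only if the current value equals `expected`.
// Returns the value that was seen before the operation, either way.
inline int atomic_compare_exchange(std::atomic<int>& v, int desired, int expected) {
    // On failure compare_exchange_strong writes the observed value into `expected`;
    // on success `expected` already equals the old value.
    v.compare_exchange_strong(expected, desired);
    return expected;
}

// Small usage example (hypothetical helper).
inline void interlocked_demo() {
    std::atomic<int> v{5};
    assert(atomic_increment(v) == 6);
    assert(atomic_decrement(v) == 5);
    assert(atomic_compare_exchange(v, 9, 5) == 5 && v.load() == 9);  // matched: swapped
    assert(atomic_compare_exchange(v, 1, 5) == 9 && v.load() == 9);  // no match: unchanged
}
```

On x86 a compiler will typically lower these to instructions carrying the "lock" prefix, which is the same thing one would otherwise write by hand in asm.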


Sean


August 20, 2004
In article <cg5ii9$l8e$1@digitaldaemon.com>, Sean Kelly says...
>
>Speaking of which, it would be nice to have native D versions of the Win32 Interlocked calls--at least increment, decrement, and compare-exchange.
>
>
>Sean

Hear, hear.

Actually, since they imply such a fine-grain level of control, wouldn't these be better off as D intrinsic functions instead?

Just a thought,
- Pragma


August 20, 2004
try running it in valgrind?

Helmut Leitner wrote:
> 
> "Martin (very worried)" wrote:
> 
>>See: "to Walter: important BUG (I think)"
>>
>>An error at such a low level of the D language is a very big problem, and I think it
>>must be handled very quickly.
>>
>>
>>>This looks like it could be a problem with stack overflows.
>>
>>Yes, and it looks like it is an error in the compiler. It is not a server error,
>>because everything else works fine. It is not a cgi error, because I tested it
>>also directly through ssh.  And I don't think there is an error in my program,
>>because it is very simple and there is nothing there that could cause a
>>"Segmentation fault".
>>
>>I tested it with 0.99, nothing changed.
>>
>>SO DO YOU WANT THE CORE DUMP FILES (I think they include the memory image of the
>>program at the moment of the segmentation fault)?
>>If you do, where do I send them?
>>if you don't, tell me, I stop offering then.
> 
> 
> Martin, you are surely not ignored and there are at least a dozen people
> here that take this error seriously. 
> 
> I just found the time to compile and test your code under Suse Linux 9.1
> and gcc 3.3.3, dmd 0.98. I did 1225551 runs total without a single fault.
> 
> You seem to be sure it is a compiler fault.  I'm not so sure about the cause of this error. Maybe it's your system or gcc.
> 
August 20, 2004
In article <cg5j1m$ljm$1@digitaldaemon.com>, pragma <EricAnderton at yahoo dot com> says...
>
>In article <cg5ii9$l8e$1@digitaldaemon.com>, Sean Kelly says...
>>
>>Speaking of which, it would be nice to have native D versions of the Win32 Interlocked calls--at least increment, decrement, and compare-exchange.
>
>Hear, hear.
>
>Actually, since they imply such a fine-grain level of control, wouldn't these be better off as D intrinsic functions instead?

Be fine with me :)  It would be nice to have access to this sort of thing without pulling in any library code.


Sean


August 20, 2004
While we're on that topic, what about adding rol() and ror() intrinsics
also?


"Sean Kelly" <sean@f4.ca> wrote in message news:cg5mf8$nhj$1@digitaldaemon.com...
> In article <cg5j1m$ljm$1@digitaldaemon.com>, pragma <EricAnderton at yahoo dot com> says...
> >
> >In article <cg5ii9$l8e$1@digitaldaemon.com>, Sean Kelly says...
> >>
> >>Speaking of which, it would be nice to have native D versions of the Win32 Interlocked calls--at least increment, decrement, and compare-exchange.
> >
> >Hear, hear.
> >
> >Actually, since they imply such a fine-grain level of control, wouldn't these be better off as D intrinsic functions instead?
>
> Be fine with me :)  It would be nice to have access to this sort of thing without pulling in any library code.
>
>
> Sean
>
>


August 20, 2004
Good idea! I'll try it.

In article <cg5jtj$lmm$2@digitaldaemon.com>, Daniel Horn says...
>
>try running it in valgrind?
>
>Helmut Leitner wrote:
>> 
>> "Martin (very worried)" wrote:
>> 
>>>See: "to Walter: important BUG (I think)"
>>>
>>>An error at such a low level of the D language is a very big problem, and I think it must be handled very quickly.
>>>
>>>
>>>>This looks like it could be a problem with stack overflows.
>>>
>>>Yes, and it looks like it is an error in the compiler. It is not a server error, because everything else works fine. It is not a cgi error, because I tested it also directly through ssh.  And I don't think there is an error in my program, because it is very simple and there is nothing there that could cause a "Segmentation fault".
>>>
>>>I tested it with 0.99, nothing changed.
>>>
>>>SO DO YOU WANT THE CORE DUMP FILES (I think they include the memory image of the
>>>program at the moment of the segmentation fault)?
>>>If you do, where do I send them?
>>>if you don't, tell me, I stop offering then.
>> 
>> 
>> Martin, you are surely not ignored and there are at least a dozen people here that take this error seriously.
>> 
>> I just found the time to compile and test your code under Suse Linux 9.1 and gcc 3.3.3, dmd 0.98. I did 1225551 runs total without a single fault.
>> 
>> You seem to be sure it is a compiler fault.  I'm not so sure about the cause of this error. Maybe it's your system or gcc.
>> 


August 20, 2004
antiAlias wrote:
> While we're on that topic, what about adding rol() and ror() intrinsics
> also?

I don't know about DMD, but if memory serves, MSVC can optimize something like this into a single instruction:

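    inline void rol(unsigned int& x, int count) {
        // Rotate left by count bits (0 < count < bit width). sizeof() yields
        // bytes, so scale by 8; unsigned keeps the right shift from sign-extending.
        x = (x << count) | (x >> (sizeof(x) * 8 - count));
    }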
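
Since the question was about both rol() and ror(), here is a slightly more defensive pair as a sketch. The names `rol32` and `ror32` and the fixed 32-bit width are assumptions for illustration; masking the count keeps a rotate by 0 or by 32 well defined. Whether a given compiler turns these into a single rotate instruction is not guaranteed.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 32-bit value left by `count` bits (count is taken modulo 32).
inline std::uint32_t rol32(std::uint32_t x, unsigned count) {
    count &= 31;
    // The second mask avoids an undefined shift by 32 when count is 0.
    return (x << count) | (x >> ((32 - count) & 31));
}

// Rotate a 32-bit value right by `count` bits (count is taken modulo 32).
inline std::uint32_t ror32(std::uint32_t x, unsigned count) {
    count &= 31;
    return (x >> count) | (x << ((32 - count) & 31));
}

// Small usage example (hypothetical helper): the two rotates undo each other.
inline void rotate_demo() {
    std::uint32_t v = 0x12345678u;
    assert(ror32(rol32(v, 8), 8) == v);
}
```

The operands are unsigned on purpose: with a signed int the right shift may sign-extend and corrupt the rotated-in bits.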

 -- andy
August 20, 2004
This is a most peculiar bug. I was able to duplicate it on my linux box, about once out of 30 runs or so. But there's *nothing wrong* with the code generated by the compiler. It'll also do it under gdb. It appears to happen when the ENTER instruction is reached.

Now, I know that stack is allocated on a page-by-page basis, and if the stack overflows there's some logic in the kernel somewhere to catch an overflow and allocate more pages to the stack, then restart the instruction. My experiments with gcc show it never generates an ENTER instruction to set up the stack frame, but dmd does.

Furthermore, if you compile with -O, dmd will not generate an ENTER instruction, and I cannot get the program to generate a fault.

Therefore, I have a sneaking suspicion there is a bug in the linux kernel where it cannot restart an ENTER instruction after a stack overflow.

What I will do is fix dmd to never generate the ENTER under linux. In the meantime, try compiling your program with -O and see if it works or not.


August 20, 2004
It'll happen again in 0.999 <g>

Matthew wrote:

> I can't wait for version 0.101, so that this can be dropped for good. <sigh>
> 
> "Bent Rasmussen" <exo@bent-rasmussen.info> wrote in message news:cg4epi$3fo$1@digitaldaemon.com...
>> > I think this error is a very big problem, because the 1.00 version is very near.
>>
>> Oh boy, don't go down that road. :-) I believe it is possible and probable that 0.100 will follow.
>>
>>

August 20, 2004
"Sean Kelly" <sean@f4.ca> wrote in message news:cg5ii9$l8e$1@digitaldaemon.com...
> In article <cg4lgq$6ac$1@digitaldaemon.com>, Matthew says...
> >
> >I suspect the process that you've built contains somewhere within it - maybe your code, maybe the D libs - a race condition that relates to an integer operation - say increment of a 16 or 32-bit quantity (assuming a 32-bit architecture). On machines with only one processor such operations will *always* work atomically, because of the way Intel (and most other architectures) works.
> >
> >However, on a machine with two processors, explicit instructions are required to lock the bus and prevent the separate threads' actions from interleaving. Without them, the following can happen:
> >
> >    Thread A reads the value from memory (to register)
> >    Thread B reads the value from memory (to register)
> >    Thread B increments the value (in register)
> >    Thread B writes the value to memory (from register)
> >    Thread A increments the value (in register)
> >    Thread A writes the value to memory (from register)
> >
> >When Thread A writes the value it overwrites the change made by Thread B. Rather than the variable being incremented twice, as expected, the net result is only one increment.
> >
> >This is a classic race condition, and such things easily bring down processes (or even machines!).
>
> Not to mention the possibility of CPU cache synchronization.  Win32 provides the handy Interlocked calls to take care of this type of thing, but beyond that you need to use the "lock" prefix on asm code.  Speaking of which, it would be nice to have native D versions of the Win32 Interlocked calls--at least increment, decrement, and compare-exchange.

Ah, yes, my funky brother. At last there are 2.

I tried a lot last year to get Walter interested in D providing some of these kinds of things as part of the language, and I've just started another campaign. I hope you can add weight (and better brains than mine) to the issue. :)

[FYI: I'm normally a big fan of the C++-way, i.e. functionality should be provided by libraries rather than language extensions. But I think many threading constructs would be better suited in the language for D. We still have a blank sheet, and many architectures behave very similarly these days with such things as atomic integer ops, so I reckon this is an exceptional case.]