June 06, 2017
On Tuesday, 6 June 2017 at 04:11:33 UTC, Stefan Koch wrote:
> On Tuesday, 6 June 2017 at 02:03:46 UTC, jmh530 wrote:
>> On Tuesday, 6 June 2017 at 00:46:00 UTC, Stefan Koch wrote:
>>>
>>> Time to find this: roughly 2 weeks.
>>>
>>
>> Damn. That's some commitment.
>
> There is no other way, really.
> These things need to be fixed.

Great work. Keep it up.
June 06, 2017
On Tue, Jun 06, 2017 at 02:03:46AM +0000, jmh530 via Digitalmars-d wrote:
> On Tuesday, 6 June 2017 at 00:46:00 UTC, Stefan Koch wrote:
> > 
> > Time to find this: roughly 2 weeks.
> > 
> 
> Damn. That's some commitment.

2 weeks is not bad for subtle bugs in complex code like this one. In my day job I've seen bugs that took 2 *months* to figure out. One of them involved a rare race condition that can only be reproduced under very specific circumstances, and it took a long time and a lot of guesswork before a coworker and myself discovered the exact combination that triggered the bug, thereby leading to the subtle problem in a piece of code that looked perfectly innocuous before then.


T

-- 
A mathematician is a device for turning coffee into theorems. -- P. Erdos
June 06, 2017
On Tuesday, 6 June 2017 at 16:39:07 UTC, H. S. Teoh wrote:
> On Tue, Jun 06, 2017 at 02:03:46AM +0000, jmh530 via Digitalmars-d wrote:
>> On Tuesday, 6 June 2017 at 00:46:00 UTC, Stefan Koch wrote:
>> > 
>> > Time to find this: roughly 2 weeks.
>> > 
>> 
>> Damn. That's some commitment.
>
> 2 weeks is not bad for subtle bugs in complex code like this one. In my day job I've seen bugs that took 2 *months* to figure out. One of them involved a rare race condition that can only be reproduced under very specific circumstances, and it took a long time and a lot of guesswork before a coworker and myself discovered the exact combination that triggered the bug, thereby leading to the subtle problem in a piece of code that looked perfectly innocuous before then.
>
>
> T

Wow, 2 months.
And I always feel slow when a bug takes more than a week.
Luckily my architecture is designed to be completely deterministic and reproducible.
So things like race conditions cannot hit me.
... Thank god for that.
June 06, 2017
On 6/6/17 12:39 PM, H. S. Teoh via Digitalmars-d wrote:
> On Tue, Jun 06, 2017 at 02:03:46AM +0000, jmh530 via Digitalmars-d wrote:
>> On Tuesday, 6 June 2017 at 00:46:00 UTC, Stefan Koch wrote:
>>>
>>> Time to find this: roughly 2 weeks.
>>>
>>
>> Damn. That's some commitment.
>
> 2 weeks is not bad for subtle bugs in complex code like this one. In my
> day job I've seen bugs that took 2 *months* to figure out. One of them
> involved a rare race condition that can only be reproduced under very
> specific circumstances, and it took a long time and a lot of guesswork
> before a coworker and myself discovered the exact combination that
> triggered the bug, thereby leading to the subtle problem in a piece of
> code that looked perfectly innocuous before then.

Oh, I've had those before. I had a race condition that reproduced *randomly* and usually took about 2 weeks to happen, and that's by pounding it non-stop. The result was deadlock. Any debugging after the fact resulted in no clues.

Only way I solved it was to print out state as it was going, so I could see what happened when the state went bad. I think it took at least 2 cycles to find it.

This kind of stuff makes you appreciate how important avoiding race conditions and memory corruption is.

-Steve
June 06, 2017
On Tue, Jun 06, 2017 at 02:23:59PM -0400, Steven Schveighoffer via Digitalmars-d wrote:
> On 6/6/17 12:39 PM, H. S. Teoh via Digitalmars-d wrote:
> > On Tue, Jun 06, 2017 at 02:03:46AM +0000, jmh530 via Digitalmars-d wrote:
> > > On Tuesday, 6 June 2017 at 00:46:00 UTC, Stefan Koch wrote:
> > > > 
> > > > Time to find this: roughly 2 weeks.
> > > > 
> > > 
> > > Damn. That's some commitment.
> > 
> > 2 weeks is not bad for subtle bugs in complex code like this one. In my day job I've seen bugs that took 2 *months* to figure out. One of them involved a rare race condition that can only be reproduced under very specific circumstances, and it took a long time and a lot of guesswork before a coworker and myself discovered the exact combination that triggered the bug, thereby leading to the subtle problem in a piece of code that looked perfectly innocuous before then.
> 
> Oh, I've had those before. I had a race condition that reproduced *randomly* and usually took about 2 weeks to happen, and that's by pounding it non-stop. The result was deadlock. Any debugging after the fact resulted in no clues.
> 
> Only way I solved it was to print out state as it was going, so I could see what happened when the state went bad. I think it took at least 2 cycles to find it.
> 
> This kind of stuff makes you appreciate how important avoiding race conditions and memory corruption is.
[...]

Yeah, race conditions and memory corruption / pointer bugs are the worst to track down.  Since the codebase I deal with is in C, there are plenty of opportunities for slip-ups that lead to pointer bugs.  And the worst of them are dangling pointers... you write to them, and there's no SEGV because they point to valid memory, but that memory has been allocated to something else now.  By the time the corruption manifests itself, you're already long, long past the original buggy code, usually in some completely-innocent code that you can stare at for weeks or months and not find a single flaw.  Meanwhile the original bug randomly corrupts different things depending on who gets the memory pointed to by the bad pointer, making it almost impossible to reproduce. Even after you reproduce it, you've no idea how to trace it to the original cause, because you're long past where it happened.  And it's almost impossible to narrow it down, because reducing the test case may make the bad pointer corrupt something else that you don't see, so you don't know if the bug is still happening or not.

Things like this make you *really* appreciate D features like bounds checking and the oft-maligned but life-saving GC.


T

-- 
Error: Keyboard not attached. Press F1 to continue. -- Yoon Ha Lee, CONLANG
June 06, 2017
On Tuesday, 6 June 2017 at 18:51:58 UTC, H. S. Teoh wrote:
>
> Things like this make you *really* appreciate D features <snip> and the oft-maligned but life-saving GC.

+1000! GC is an opportunity, not a burden! :-P

/Paolo

June 06, 2017
On 6/6/17 2:51 PM, H. S. Teoh via Digitalmars-d wrote:
> On Tue, Jun 06, 2017 at 02:23:59PM -0400, Steven Schveighoffer via Digitalmars-d wrote:
>> On 6/6/17 12:39 PM, H. S. Teoh via Digitalmars-d wrote:
>>> On Tue, Jun 06, 2017 at 02:03:46AM +0000, jmh530 via Digitalmars-d wrote:
>>>> On Tuesday, 6 June 2017 at 00:46:00 UTC, Stefan Koch wrote:
>>>>>
>>>>> Time to find this: roughly 2 weeks.
>>>>>
>>>>
>>>> Damn. That's some commitment.
>>>
>>> 2 weeks is not bad for subtle bugs in complex code like this one. In
>>> my day job I've seen bugs that took 2 *months* to figure out. One of
>>> them involved a rare race condition that can only be reproduced
>>> under very specific circumstances, and it took a long time and a lot
>>> of guesswork before a coworker and myself discovered the exact
>>> combination that triggered the bug, thereby leading to the subtle
>>> problem in a piece of code that looked perfectly innocuous before
>>> then.
>>
>> Oh, I've had those before. I had a race condition that reproduced
>> *randomly* and usually took about 2 weeks to happen, and that's by
>> pounding it non-stop. The result was deadlock. Any debugging after the
>> fact resulted in no clues.
>>
>> Only way I solved it was to print out state as it was going, so I
>> could see what happened when the state went bad. I think it took at
>> least 2 cycles to find it.
>>
>> This kind of stuff makes you appreciate how important avoiding race
>> conditions and memory corruption is.
> [...]
>
> Yeah, race conditions and memory corruption / pointer bugs are the worst
> to track down.  Since the codebase I deal with is in C, there are plenty
> of opportunities for slip-ups that lead to pointer bugs.  And the worst
> of them are dangling pointers... you write to them, and there's no SEGV
> because they point to valid memory, but that memory has been allocated
> to something else now.  By the time the corruption manifests itself,
> you're already long, long past the original buggy code, usually in some
> completely-innocent code that you can stare at for weeks or months and
> not find a single flaw.  Meanwhile the original bug randomly corrupts
> different things depending on who gets the memory pointed to by the bad
> pointer, making it almost impossible to reproduce. Even after you
> reproduce it, you've no idea how to trace it to the original cause,
> because you're long past where it happened.  And it's almost impossible
> to narrow it down, because reducing the test case may make the bad
> pointer corrupt something else that you don't see, so you don't know if
> the bug is still happening or not.
>
> Things like this make you *really* appreciate D features like bounds
> checking and the oft-maligned but life-saving GC.

Yep, there were memory errors too. We used a proprietary tool called Purify, which was like valgrind (this was before valgrind existed), to find those. I think most of them were either double-freeing/deleting something (usually in a destructor that was called more than once -- always set the members you deleted to null), or freeing new'd memory/deleting malloc'd memory. I thought I had everything, and then came the hang. At first we thought it was a memory issue not caught by Purify, but we eventually found it as I described.
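
A minimal C++ sketch of the "null out what you deleted" habit, in case it helps. The `Buffer` type and its `releases` counter are made up for illustration and are not from any real codebase; the point is only that a second cleanup call becomes a no-op instead of a double free.

```cpp
#include <cstddef>

// Hypothetical resource holder; all names here are illustrative only.
struct Buffer {
    int* data = nullptr;
    int releases = 0;  // counts how many times memory was actually freed

    explicit Buffer(std::size_t n) : data(new int[n]()) {}

    // Safe to call more than once: a repeated call sees nullptr and does nothing.
    void release() {
        if (data != nullptr) {
            delete[] data;
            data = nullptr;  // null the member so a repeat cannot double-free
            ++releases;
        }
    }

    ~Buffer() { release(); }

    // Copying would let two objects own the same pointer, so forbid it.
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;
};
```

Calling `release()` twice on the same `Buffer`, or calling it once and then letting the destructor run, frees the memory exactly once.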

I can characterize memory corruption bugs as errors that occur randomly and can manifest in any kind of behavior. Even more nasty is sometimes they happen in code that you didn't even touch, because it was *always* happening, but just didn't cause a bug until you changed memory organization around slightly. Race conditions are also generally random but usually manifest the same way. Both are nasty to find and debug.

I don't miss those days :)

-Steve
June 06, 2017
On Tue, Jun 06, 2017 at 03:35:11PM -0400, Steven Schveighoffer via Digitalmars-d wrote: [...]
> I can characterize memory corruption bugs as errors that occur randomly and can manifest in any kind of behavior. Even more nasty is sometimes they happen in code that you didn't even touch, because it was *always* happening, but just didn't cause a bug until you changed memory organization around slightly.
[...]

Yep, that's exactly the kind of bug I'm talking about.  The kind that can appear/disappear depending on the order you link your object files, or declare dummy variables in an unrelated function, or whether you compile with debugging symbols (because that changes the memory layout of your program enough to make the symptoms disappear). The latter is the worst variant of its kind, because it means you're up Bug Creek without any debugging symbol paddles to help you. (And yes, I've actually had to deal with that before. It was not pretty.)


T

-- 
"If you're arguing, you're losing." -- Mike Thomas
June 09, 2017
On Thursday, 16 February 2017 at 21:05:51 UTC, Stefan Koch wrote:
> [ ... ]
Hi there,

I just pulled another all-nighter.
I found a bug in the code that was supposed to adjust the values of || and &&.
As well as a mix-up in the error messages for overlapping slice-assignment.
Both are fixed.

I am aware that the overlapping slice assignment check is not yet good enough.
This is on my short-term todo list.
But in the even shorter term, there are a couple of hours of sleep waiting to be claimed.
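
For readers wondering what an overlap check looks like in general, here is a rough C++ sketch. It is not the check used in the CTFE engine discussed here; `slicesOverlap` is a hypothetical helper that assumes both slices have the same length `n` in bytes and a flat address space, and it treats two ranges as overlapping when their start addresses are less than `n` apart.

```cpp
#include <cstddef>
#include <cstdint>

// Returns true if [dst, dst + n) and [src, src + n) share at least one byte.
// Addresses are compared as integers, which sidesteps relational comparison
// of pointers into unrelated objects. Hypothetical helper for illustration.
inline bool slicesOverlap(const void* dst, const void* src, std::size_t n) {
    if (n == 0) return false;  // empty slices never overlap
    const auto d = reinterpret_cast<std::uintptr_t>(dst);
    const auto s = reinterpret_cast<std::uintptr_t>(src);
    const std::uintptr_t distance = (d >= s) ? (d - s) : (s - d);
    return distance < n;  // starts closer than n bytes means shared bytes
}
```

Using the distance between the two start addresses avoids computing `src + n`, so the check cannot wrap around at the top of the address space.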

Night, guys,
Stefan
June 14, 2017
On Thursday, 16 February 2017 at 21:05:51 UTC, Stefan Koch wrote:
> [ ... ]

Slice Assignment bugs fixed!
With that we are green again.

I am going to improve concat a little so that it computes the buffer lengths if it can,
and allocates the needed amount upfront.

The alternative is lazy concat, which is much more work and has a lot of potential for bugs.
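
As a rough illustration of the "compute the length first, allocate once" idea, here is a small C++ sketch. It is only an analogy using `std::string`, not the CTFE implementation; `concatAll` is a hypothetical name.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Concatenates all parts, reserving the full result size before copying.
// Hypothetical helper for illustration.
inline std::string concatAll(const std::vector<std::string>& parts) {
    std::size_t total = 0;
    for (const auto& p : parts) total += p.size();  // pass 1: sum the lengths

    std::string out;
    out.reserve(total);  // reserve once so the appends below do not reallocate
    for (const auto& p : parts) out += p;  // pass 2: copy each part in order
    return out;
}
```

The trade-off is a second pass over the operands in exchange for avoiding repeated regrowth of the result buffer, which is only possible when all lengths are known before the copy starts.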

|| and && still have me puzzled :)