December 18, 2011
Since I started the pull tester, there's essentially always a build running.  That's not a problem.  What is a problem is the tests that fail non-deterministically, for reasons that have nothing to do with the changes.  The most common example that I've seen so far is this one:

[2011-Dec-18 18:07:48] [2011-Dec-18 18:07:53] [2011-Dec-18 18:07:52.9008638] [-4 secs, -900 ms, -863 μs, and -8 hnsecs]
[99 ms, 136 μs, and 2 hnsecs]
core.exception.AssertError at std.file(778): unittest failure

It's shown up 22 times in the last 3 days.  So far, it's been exclusive to the freebsd/64 run.  An example:
  http://d.puremagic.com/test-results/pull.ghtml?runid=2654

Has anyone seen this and looked into it at all?

Thanks,
Brad
December 19, 2011
On Sunday, December 18, 2011 20:59:11 Brad Roberts wrote:
> Since I started the pull tester, there's essentially always a build running.
> That's not a problem.  What is a problem is the tests that fail
> non-deterministically, for reasons that have nothing to do with the changes.
> The most common example that I've seen so far is this one:
> 
> [2011-Dec-18 18:07:48] [2011-Dec-18 18:07:53] [2011-Dec-18 18:07:52.9008638] [-4 secs, -900 ms, -863 μs, and -8 hnsecs] [99 ms, 136 μs, and 2 hnsecs] core.exception.AssertError at std.file(778): unittest failure
> 
> It's shown up 22 times in the last 3 days.  So far, it's been exclusive to the freebsd/64 run.  An example: http://d.puremagic.com/test-results/pull.ghtml?runid=2654
> 
> Has anyone seen this and looked into it at all?

*Sigh* It's trying to verify that the functions which get the various types of file time are correct, and unfortunately, there's no way to test the times _exactly_, so the tests have some leeway in them. I thought that we'd managed to make the leeway big enough, but it looks like it's still not enough in some cases. Maybe the process ends up sleeping too long, or there's something about the machine that makes it vary more. I don't know, but I guess that I'll have to increase the leeway again.

Still, it's a bit weird that the access time and the modification time aren't the same - assuming that the access time isn't actually being set when the file is accessed (and that's usually disabled). And if they were different, I'd have expected the _access_ time to be newer, not the file modification time. So, maybe something else is going on. I'll poke at it a bit tomorrow evening, though I suspect that I'll just end up increasing the leeway.
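
For reference, the check is conceptually along these lines - a minimal sketch, not the actual std.file unittest; the file name and margin are illustrative:

    import std.datetime;
    import std.file;

    unittest
    {
        // Capture the clock before and after creating the file, then check
        // that the reported file times fall inside that window, padded by
        // some leeway to absorb scheduling and filesystem imprecision.
        auto before = Clock.currTime();
        write("deleteme.txt", "x");
        scope(exit) remove("deleteme.txt");

        SysTime accessTime, modificationTime;
        getTimes("deleteme.txt", accessTime, modificationTime);
        auto after = Clock.currTime();

        auto leeway = dur!"seconds"(5); // illustrative margin, not the real value
        assert(accessTime >= before - leeway && accessTime <= after + leeway);
        assert(modificationTime >= before - leeway &&
               modificationTime <= after + leeway);
    }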

- Jonathan M Davis
December 19, 2011
On 12/19/2011 3:22 AM, Jonathan M Davis wrote:
> On Sunday, December 18, 2011 20:59:11 Brad Roberts wrote:
>> Since I started the pull tester, there's essentially always a build running.
>> That's not a problem.  What is a problem is the tests that fail
>> non-deterministically, for reasons that have nothing to do with the changes.
>> The most common example that I've seen so far is this one:
>>
>> [2011-Dec-18 18:07:48] [2011-Dec-18 18:07:53] [2011-Dec-18 18:07:52.9008638] [-4 secs, -900 ms, -863 μs, and -8 hnsecs] [99 ms, 136 μs, and 2 hnsecs] core.exception.AssertError at std.file(778): unittest failure
>>
>> It's shown up 22 times in the last 3 days.  So far, it's been exclusive to the freebsd/64 run.  An example: http://d.puremagic.com/test-results/pull.ghtml?runid=2654
>>
>> Has anyone seen this and looked into it at all?
> 
> *Sigh* It's trying to verify that the functions which get the various types of file time are correct, and unfortunately, there's no way to test the times _exactly_, so the tests have some leeway in them. I thought that we'd managed to make the leeway big enough, but it looks like it's still not enough in some cases. Maybe the process ends up sleeping too long, or there's something about the machine that makes it vary more. I don't know, but I guess that I'll have to increase the leeway again.
> 
> Still, it's a bit weird that the access time and the modification time aren't the same - assuming that the access time isn't actually being set when the file is accessed (and that's usually disabled). And if they were different, I'd have expected the _access_ time to be newer, not the file modification time. So, maybe something else is going on. I'll poke at it a bit tomorrow evening, though I suspect that I'll just end up increasing the leeway.
> 
> - Jonathan M Davis

I still believe you're testing more than is useful.  What kinds of bugs in this D code can these tests find?  So far, they're far more pain than benefit.  You're not going to find issues with time not moving forward, and even if you do, what would you do about it?  You might find bugs in the underlying OS's handling of file time metadata.  Assuming you do, what would you do about it?  Add to that my loathing for tests that require sleeps.  Just get rid of them.  At best, disable them but keep the code on the extremely remote chance someone reports a bug in the area and you can point them at the unittest to help debug.

Sigh,
Brad
December 19, 2011
On Monday, December 19, 2011 10:00:21 Brad Roberts wrote:
> I still believe you're testing more than is useful. What kinds of bugs in this D code can these tests find? So far, they're far more pain than benefit. You're not going to find issues with time not moving forward, and even if you do, what would you do about it? You might find bugs in the underlying OS's handling of file time metadata. Assuming you do, what would you do about it? Add to that my loathing for tests that require sleeps. Just get rid of them. At best, disable them but keep the code on the extremely remote chance someone reports a bug in the area and you can point them at the unittest to help debug.

I think that the tests _are_ useful in that they verify that the functions in question return roughly the correct values rather than being completely off or returning the wrong thing entirely. But it's also not the kind of thing that's likely to break once it's been written and tested, so it wouldn't necessarily cost us much to comment them out or put them in a version block intended for more extensive tests - especially when we end up with periodic failures that we can't really control (that, and having to have the sleeps is certainly annoying).
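
In D, versioning them off would look something like this (a sketch - the version identifier here is made up for illustration):

    version (StdExtendedFileTests) unittest
    {
        import core.thread : Thread;
        import core.time : dur;

        // The sleep-dependent file-time checks would live here and would
        // only be compiled in with -version=StdExtendedFileTests.
        Thread.sleep(dur!"msecs"(500));
        // ... timestamp assertions ...
    }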

- Jonathan M Davis
December 19, 2011
On Monday, December 19, 2011 10:00:21 Brad Roberts wrote:
> I still believe you're testing more than is useful.  What kinds of bugs in this D code can these tests find?  So far, they're far more pain than benefit.  You're not going to find issues with time not moving forward, and even if you do, what would you do about it?  You might find bugs in the underlying OS's handling of file time metadata.  Assuming you do, what would you do about it?  Add to that my loathing for tests that require sleeps.  Just get rid of them.  At best, disable them but keep the code on the extremely remote chance someone reports a bug in the area and you can point them at the unittest to help debug.

Okay. For the moment, I've increased the leeway and versioned off some of the tests, so hopefully that's enough to fix the autotester for now.

I think what I need to do in the long run is refactor and combine the tests so that they're more condensed, don't require as many sleeps, and don't need to do as much writing to and reading from disk (and generally reduce code duplication in the tests). Also, it looks like it might be possible to make the file times more precise, which would allow the tests to be more precise and allow the sleeps to be further reduced. So, obviously I have something more for my todo list.
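
For instance, if the times can be set explicitly, a test can avoid sleeping altogether by writing known timestamps and reading them back. Roughly (a sketch; exact behavior still depends on the filesystem's timestamp precision):

    import std.datetime;
    import std.file;

    unittest
    {
        write("deleteme.txt", "x");
        scope(exit) remove("deleteme.txt");

        // Set known access/modification times instead of sleeping and
        // then comparing against the wall clock.
        auto stamp = SysTime(DateTime(2011, 12, 19, 12, 0, 0), UTC());
        setTimes("deleteme.txt", stamp, stamp);

        SysTime access, modification;
        getTimes("deleteme.txt", access, modification);

        // Filesystems store times at varying precision, so allow a margin.
        assert(stamp - dur!"seconds"(1) < modification &&
               modification < stamp + dur!"seconds"(1));
    }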

- Jonathan M Davis