April 27, 2011 [phobos] [D-Programming-Language/phobos] 3cf671: Add std.parallelism. | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Simcha | On 4/27/2011 7:53 AM, David Simcha wrote:
>
>
> On Wed, Apr 27, 2011 at 12:10 AM, Brad Roberts <braddr at puremagic.com <mailto:braddr at puremagic.com>> wrote:
>
> It's a dual core amd:
>
> vendor_id : AuthenticAMD
> cpu family : 15
> model : 75
> model name : AMD Athlon(tm) 64 X2 Dual Core Processor 3800+
> stepping : 2
> cpu MHz : 1000.000
> cache size : 512 KB
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht
> syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good extd_apicid pni cx16 lahf_lm cmp_legacy svm extapic
> cr8_legacy
>
> Actually, showing up on one box but not another is strong evidence of a concurrency bug in my experience. Different
> cpu's at different speeds, with different speed memory and other side components change timings enough to expose bugs
> that otherwise haven't occurred elsewhere.
>
>
> Also, is there a way I can trigger the auto tester to run a few more times w/o committing anything? Everything passed on the latest run except Windows, which is broken for unrelated reasons, but I want to see whether I've solved the problem or the issue is non-deterministic and I just got lucky. (On FreeBSD everything started working after a change that shouldn't have mattered, but my gut feeling is that the failure here was due to a codegen bug.)
It's the easiest way for anyone other than me right now. A commit to any of the three key packages triggers a rebuild of all of them on every platform.
I'll set up an account for you on the box tonight so you can login to it and test/debug directly.
Later,
Brad
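On a box where the failure does show up, one way to tell a real fix from a lucky pass is to rerun the test binary many times and count failures. The sketch below is only an illustration: `count_failures` and `runs_needed` are hypothetical helper names, the command is a placeholder for whatever binary is under test, and the arithmetic assumes each run fails independently with a fixed probability, which a timing-dependent bug may not satisfy.

```python
import math
import subprocess
import sys


def count_failures(cmd, runs):
    """Run cmd `runs` times and return how many runs exited non-zero."""
    failures = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        if result.returncode != 0:
            failures += 1
    return failures


def runs_needed(failure_rate, confidence=0.95):
    """Smallest n for which n passes in a row would happen with probability
    at most (1 - confidence), assuming every run fails independently with
    probability failure_rate (an assumption, not a property of real races)."""
    # P(n passes in a row) = (1 - failure_rate) ** n <= 1 - confidence
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


if __name__ == "__main__":
    # Placeholder command: a child interpreter that always exits 0.
    # Substitute the real unittest binary here.
    cmd = [sys.executable, "-c", "pass"]
    print(count_failures(cmd, runs=5), "failures in 5 runs")
    print(runs_needed(0.10), "clean runs needed for a 10% failure rate")
```

Under that independence assumption, a bug that fails one run in ten needs 29 consecutive passes before a clean streak is unlikely (below 5%) to be luck, so a single green run says very little.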
|
April 27, 2011 [phobos] [D-Programming-Language/phobos] 3cf671: Add std.parallelism. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Brad Roberts | On Wed, Apr 27, 2011 at 2:06 PM, Brad Roberts <braddr at puremagic.com> wrote:
> I'll set up an account for you on the box tonight so you can login to it and test/debug directly.
Sincerely appreciated especially since I'm having problems getting a FreeBSD VM set up on my machine. Can I please have one on the Linux64 box, too since it did segfault there once?
|
April 27, 2011 [phobos] [D-Programming-Language/phobos] 3cf671: Add std.parallelism. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Brad Roberts | On 27 April 2011 20:06, Brad Roberts <braddr at puremagic.com> wrote:
> On 4/27/2011 7:53 AM, David Simcha wrote:
>> Also, is there a way I can trigger the auto tester to run a few more times w/o committing anything? Everything passed on the latest run except Windows, which is broken for unrelated reasons, but I want to see whether I've solved the problem or the issue is non-deterministic and I just got lucky. (On FreeBSD everything started working after a change that shouldn't have mattered, but my gut feeling is that the failure here was due to a codegen bug.)
>
> It's the easiest way for anyone other than me right now. A commit to any of the three key packages triggers a rebuild of all of them on every platform.
BTW: It seems to not do that in all cases. I made a change to Phobos' win32.mak almost immediately after the Windows test started running. It should have run the tests again as soon as it finished, but it didn't. It waited until the next commit.
|
April 27, 2011 [phobos] [D-Programming-Language/phobos] 3cf671: Add std.parallelism. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Don Clugston | On Wed, 27 Apr 2011, Don Clugston wrote:
> On 27 April 2011 20:06, Brad Roberts <braddr at puremagic.com> wrote:
> > On 4/27/2011 7:53 AM, David Simcha wrote:
> >> Also, is there a way I can trigger the auto tester to run a few more times w/o committing anything? Everything passed on the latest run except Windows, which is broken for unrelated reasons, but I want to see whether I've solved the problem or the issue is non-deterministic and I just got lucky. (On FreeBSD everything started working after a change that shouldn't have mattered, but my gut feeling is that the failure here was due to a codegen bug.)
> >
> > It's the easiest way for anyone other than me right now. A commit to any of the three key packages triggers a rebuild of all of them on every platform.
>
> BTW: It seems to not do that in all cases. I made a change to Phobos' win32.mak almost immediately after the Windows test started running. It should have run the tests again as soon as it finished, but it didn't. It waited until the next commit.
I'll check to see why it didn't (was it today?). There's a couple things that can go wrong. What I've seen in the past one or two instances I've noticed is that github fails to send the notification for some reason. It's annoying, but non-fatal since each run syncs to current code.
|
April 27, 2011 [phobos] [D-Programming-Language/phobos] 3cf671: Add std.parallelism. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Brad Roberts | I believe some AMD 64 CPUs allow independent loads to be reordered, which is contrary to the current Intel 64 spec. Not sure whether this has anything to do with the failure though.
On Apr 26, 2011, at 9:10 PM, Brad Roberts wrote:
> It's a dual core amd:
>
> vendor_id : AuthenticAMD
> cpu family : 15
> model : 75
> model name : AMD Athlon(tm) 64 X2 Dual Core Processor 3800+
> stepping : 2
> cpu MHz : 1000.000
> cache size : 512 KB
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht
> syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good extd_apicid pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy
>
> Actually, showing up on one box but not another is strong evidence of a concurrency bug in my experience. Different cpu's at different speeds, with different speed memory and other side components change timings enough to expose bugs that otherwise haven't occurred elsewhere.
>
> On 4/26/2011 8:44 PM, David Simcha wrote:
>> What about the Linux 64 machine? I'm getting some weird segfaults in std.parallelism that I can't reproduce on my Linux 64 VM. I'm currently running the std.parallelism unittests in a loop and they're passing every time. Furthermore, I've been using std.parallelism for real work on Linux 64 since DMD was capable of producing 64-bit binaries and it's worked fine. It also passed a Jinx test a few weeks back. (Jinx is software that's designed to make buggy multithreaded programs fail by fiddling with timings. Bartosz Milewski tested std.parallelism with it after seeing my blog post. The point is that this is evidence against the issue being a latent concurrency bug.) Therefore, I'm thinking something very weird is going on with auto tester.
>>
>> Also, I noticed it's segfaulting on FreeBSD, too. I'm installing a FreeBSD VM right now to look into this.
>>
>> On 4/26/2011 9:38 PM, Brad Roberts wrote:
>>> Good point.. I'll make a note to myself to add system definitions for those boxes somewhere obvious since that's important for some things. The freebsd box is an ancient pos, but it works:
>>>
>>> CPU: Intel(R) Pentium(R) 4 CPU 3.00GHz (2992.52-MHz 686-class CPU)
>>> Origin = "GenuineIntel" Id = 0xf41 Family = f Model = 4 Stepping = 1
>>>
>>> Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
>>>
>>> Features2=0x441d<SSE3,DTES64,MON,DS_CPL,CNXT-ID,xTPR>
>>> TSC: P-state invariant
>>> real memory = 2147483648 (2048 MB)
>>> avail memory = 2091065344 (1994 MB)
>>> ACPI APIC Table:<A M I OEMAPIC>
>>> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
>>> FreeBSD/SMP: 1 package(s) x 1 core(s) x 2 HTT threads
>>> cpu0 (BSP): APIC ID: 0
>>> cpu1 (AP/HT): APIC ID: 1
>>>
>>>
>>>
>>> On Tue, 26 Apr 2011, David Simcha wrote:
>>>
>>>> I actually realized that FreeBSD matters by looking at the auto tester to make sure the checkin wasn't breaking anything on OSX, which I also don't have access to. I have a static assert that will probably fail (fixing it right now since it's trivial). The other issue is that FreeBSD appears broken for unrelated reasons right now. I just pushed a fix based on reading some documentation about FreeBSD, but I have no idea whether it's right or not because I don't have a FreeBSD box to test it on. Also, the functionality is hard to automatically test because you need to know how many cores the box you're running on actually has.
>>>>
>>>> On 4/26/2011 8:45 PM, Brad Roberts wrote:
>>>>> Assuming you have a good unit test for it, just check the auto tester... Yes, freebsd/32 is supported for d2.
>>>>>
>>>>> On Tue, 26 Apr 2011, David Simcha wrote:
>>>>>
>>>>>> Just realized, are we supporting D2 on FreeBSD yet? If so, I need to fix std.parallelism to detect the number of CPUs on it. I think this is the same as on OSX, though. (I used sysctlbyname on OSX. Someone please confirm this is right on BSD.)
>>>>>>
>>>>>> On 4/26/2011 8:06 PM, noreply at github.com wrote:
>>>>>>> Branch: refs/heads/master
>>>>>>> Home: https://github.com/D-Programming-Language/phobos
>>>>>>>
>>>>>>> Commit: 3cf67160b82d87d8de0bc7564aa1474149a92349
>>>>>>> https://github.com/D-Programming-Language/phobos/commit/3cf67160b82d87d8de0bc7564aa1474149a92349
>>>>>>> Author: dsimcha<dsimcha at gmail.com>
>>>>>>> Date: 2011-04-26 (Tue, 26 Apr 2011)
>>>>>>>
>>>>>>> Changed paths:
>>>>>>> M changelog.dd
>>>>>>> M posix.mak
>>>>>>> A std/parallelism.d
>>>>>>> M unittest.d
>>>>>>> M win32.mak
>>>>>>>
>>>>>>> Log Message:
>>>>>>> -----------
>>>>>>> Add std.parallelism.
>>>>>>>
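The CPU-count question in the quoted thread (whether `sysctlbyname` works the same on FreeBSD as on OSX) can be sketched as follows. `sysctlbyname` is the call named in the thread; the key `hw.ncpu`, the helper names, and the fallback to `os.cpu_count()` on other platforms are assumptions made for illustration, not a description of what std.parallelism does.

```python
import ctypes
import ctypes.util
import os
import sys


def cpu_count_via_sysctl(name=b"hw.ncpu"):
    """Read an integer sysctl by name through libc (macOS and the BSDs).

    The key "hw.ncpu" is an assumption; the thread does not say which
    key the original code queried.
    """
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    # int sysctlbyname(const char *name, void *oldp, size_t *oldlenp,
    #                  const void *newp, size_t newlen);
    libc.sysctlbyname.argtypes = [
        ctypes.c_char_p,
        ctypes.c_void_p,
        ctypes.POINTER(ctypes.c_size_t),
        ctypes.c_void_p,
        ctypes.c_size_t,
    ]
    libc.sysctlbyname.restype = ctypes.c_int

    value = ctypes.c_int(0)
    size = ctypes.c_size_t(ctypes.sizeof(value))
    rc = libc.sysctlbyname(name, ctypes.byref(value), ctypes.byref(size), None, 0)
    if rc != 0:
        raise OSError(ctypes.get_errno(), "sysctlbyname failed")
    return value.value


def detect_cpu_count():
    """Use sysctlbyname where it exists, otherwise ask the Python runtime."""
    if sys.platform == "darwin" or sys.platform.startswith("freebsd"):
        return cpu_count_via_sysctl()
    return os.cpu_count() or 1


if __name__ == "__main__":
    print("logical CPUs:", detect_cpu_count())
```

As the thread notes, the result is hard to unit-test automatically because the expected value depends on the machine; about all a portable test can assert is that the count is at least 1.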
|
April 27, 2011 [phobos] [D-Programming-Language/phobos] 3cf671: Add std.parallelism. | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Simcha | On Apr 27, 2011, at 5:58 AM, David Simcha wrote:
>
> Also, I found one potential explanation. I changed at the last minute from using my own ASM blocks for atomic loads to core.atomic, now that atomicLoad is exposed. I wrote a wrapper function around it that casts stuff to shared. The way I was doing the cast is probably invalid code (because cast(shared) someValue is not an lvalue; look at the latest changeset for details) but was being accepted by the compiler and doing God only knows what, possibly a non-atomic load at some point.
I'm currently working on the prototypes for core.atomic so the stuff that should compile does, and the stuff that shouldn't doesn't. Please let me know if you encounter any problems once I've committed my changes.
|
April 27, 2011 [phobos] [D-Programming-Language/phobos] 3cf671: Add std.parallelism. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Brad Roberts | >> > It's the easiest way for anyone other than me right now. A commit to any of the three key packages triggers a rebuild of all of them on every platform.
>>
>> BTW: It seems to not do that in all cases. I made a change to Phobos' win32.mak almost immediately after the Windows test started running. It should have run the tests again as soon as it finished, but it didn't. It waited until the next commit.
>
> I'll check to see why it didn't (was it today?). There's a couple things that can go wrong. What I've seen in the past one or two instances I've noticed is that github fails to send the notification for some reason. It's annoying, but non-fatal since each run syncs to current code.
It happened about a week ago. (I may have been mistaken, it might have been a change to the DMD test suite).
I think it was on the 16th or 17th of April. The only anomaly I can see is:
A Windows test ran at 2011-04-16 00:25:59 (00:36:52), and there was a checkin at 00:24:16 which wasn't included.
So it might just be a clock difference of two minutes, or checkout took more than two minutes.
But I might be looking at the wrong day. In any case it was a checkin at almost exactly midnight.
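The gap between those two timestamps is easy to check. The sketch below assumes that 00:25:59 is the time the test run started, that the checkin happened on the same date, and that both clocks are being compared as if they agreed; `plausibly_missed` is a hypothetical helper, not part of the auto tester.

```python
from datetime import datetime, timedelta


def plausibly_missed(checkin, test_start, slack):
    """True if the checkin precedes the test start by no more than `slack`,
    so clock skew or a slow checkout of that size could explain why the
    run did not include it."""
    gap = test_start - checkin
    return timedelta(0) <= gap <= slack


# Timestamps taken from the message above; the shared date is an assumption.
checkin = datetime(2011, 4, 16, 0, 24, 16)
test_start = datetime(2011, 4, 16, 0, 25, 59)
gap = test_start - checkin

if __name__ == "__main__":
    print("gap in seconds:", int(gap.total_seconds()))
    print("explained by 2 minutes of slack:",
          plausibly_missed(checkin, test_start, timedelta(minutes=2)))
```

The checkin precedes the run by 103 seconds, so a skew or checkout delay of just under two minutes is enough to account for the miss.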
|
April 27, 2011 [phobos] [D-Programming-Language/phobos] 3cf671: Add std.parallelism. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Sean Kelly | On Wed, Apr 27, 2011 at 5:09 PM, Sean Kelly <sean at invisibleduck.org> wrote:
> I believe some AMD 64 CPUs allow independent loads to be reordered, which is contrary to the current Intel 64 spec. Not sure whether this has anything to do with the failure though.
Probably not. My gut feeling (no hard proof and of course I'm biased) is that it's a newly introduced compiler bug. I base this on having run the tests a ton of times on my dual core AMD box (Windows, Linux 32 and Linux 64) and used the module for real work on various hardware on 64-bit Linux since DMD supported 64-bit hardware, all with no issues.
|