May 02, 2012 [Issue 7413] Vector literals don't work | ||||
---|---|---|---|---|
| ||||
Posted in reply to Manu | http://d.puremagic.com/issues/show_bug.cgi?id=7413

--- Comment #10 from github-bugzilla@puremagic.com 2012-05-02 02:21:17 PDT ---
Commit pushed to master at https://github.com/D-Programming-Language/dmd

https://github.com/D-Programming-Language/dmd/commit/b4ab1b0982a68284dcb8780e7d1e5f701aeefaa5
fix Issue 7413 - Vector literals don't work

Posted in reply to Manu | http://d.puremagic.com/issues/show_bug.cgi?id=7413

Walter Bright <bugzilla@digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #11 from Walter Bright <bugzilla@digitalmars.com> 2012-05-02 02:22:42 PDT ---
Haven't done the special case optimizations for constant loading.

Posted in reply to Manu | http://d.puremagic.com/issues/show_bug.cgi?id=7413

--- Comment #12 from Manu <turkeyman@gmail.com> 2012-05-02 06:55:58 PDT ---
(In reply to comment #11)
> Haven't done the special case optimizations for constant loading.

No problem, I'm using GDC anyway which might detect those in the back end.

An efficient implementation would certainly use at least an xor for 0 initialisation, and the other tricks will get different mileage depending on the length of the pipeline surrounding. Not accessing memory is always better if there are pipeline cycles to soak up the latency.

Posted in reply to Manu | http://d.puremagic.com/issues/show_bug.cgi?id=7413

Don <clugdbug@yahoo.com.au> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |clugdbug@yahoo.com.au

--- Comment #13 from Don <clugdbug@yahoo.com.au> 2012-05-02 11:26:26 PDT ---
(In reply to comment #12)
> No problem, I'm using GDC anyway which might detect those in the back end.
>
> An efficient implementation would certainly use at least an xor for 0 initialisation, and the other tricks will get different mileage depending on the length of the pipeline surrounding. Not accessing memory is always better if there are pipeline cycles to soak up the latency.

The -1 trick is always worth doing, I think. Agner Fog has a nice list in his optimisation manuals, but the only ones _always_ worth doing are the 0 and -1 integer cases, and the 0.0 floating point case (also using xor).

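The three cases Don calls always worthwhile can be illustrated at the value level. What follows is a minimal sketch in C using the GCC/Clang vector extensions, not code taken from dmd or GDC; the names `int4`, `float4`, `make_zero`, `make_all_ones`, `make_float_zero` and `self_check` are hypothetical. It shows why no memory load is needed: xor of a register with itself is 0 in every lane, comparing a register with itself for equality sets every lane to -1, and 0.0f has an all-zero bit pattern, so the same xor covers it. On x86 these values are what PXOR, PCMPEQD and XORPS produce when source and destination are the same register; whether a given compiler emits exactly those instructions for this source is an assumption, not a guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* Four 32-bit lanes in one 128-bit vector (GCC/Clang vector extension). */
typedef int32_t int4 __attribute__((vector_size(16)));
typedef float float4 __attribute__((vector_size(16)));

/* All-zero vector: x ^ x is 0 in every lane, whatever x holds.
   This is the value PXOR xmm, xmm leaves in the register. */
static int4 make_zero(int4 scratch) {
    return scratch ^ scratch;
}

/* All-ones vector (-1 per lane): x == x is true in every lane, and a true
   lane of a vector comparison has all bits set.
   This is the value PCMPEQD xmm, xmm leaves in the register. */
static int4 make_all_ones(int4 scratch) {
    return (int4)(scratch == scratch);
}

/* 0.0f per lane: the bit pattern of 0.0f is all zeros, so the integer
   zero vector reinterpreted as floats is already the right constant. */
static float4 make_float_zero(int4 scratch) {
    return (float4)make_zero(scratch);
}

/* Checks all three constants, starting from arbitrary register contents. */
static void self_check(void) {
    int4 junk = {1, -7, 42, 123456};
    int4 z = make_zero(junk);
    int4 m = make_all_ones(junk);
    float4 f = make_float_zero(junk);
    for (int i = 0; i < 4; ++i) {
        assert(z[i] == 0);
        assert(m[i] == -1);
        assert(f[i] == 0.0f);
    }
}
```

The `scratch` argument stands in for whatever happens to be in the register; the result does not depend on it, which is the property that makes these sequences free of any memory access.
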
Posted in reply to Manu | http://d.puremagic.com/issues/show_bug.cgi?id=7413

--- Comment #14 from Manu <turkeyman@gmail.com> 2012-05-02 13:06:19 PDT ---
(In reply to comment #13)
> The -1 trick is always worth doing, I think. Agner Fog has a nice list in his optimisation manuals, but the only ones _always_ worth doing are the 0 and -1 integer cases, and the 0.0 floating point case (also using xor).

If the compiler knows anything about the pipeline around the code, it should be able to make the best choice about the others.

Posted in reply to Manu | http://d.puremagic.com/issues/show_bug.cgi?id=7413

--- Comment #15 from Don <clugdbug@yahoo.com.au> 2012-05-02 15:12:18 PDT ---
(In reply to comment #14)
> If the compiler knows anything about the pipeline around the code, it should be able to make the best choice about the others.

My guess is that it's pretty rare that the alternative sequences are favoured just on the basis of the pipeline, since MOVDQA only uses a load port, and nothing else. Especially on Sandy Bridge or AMD, where there are two load ports. So I doubt there's much benefit to be had.

By contrast, if there's _any_ chance of a cache miss, they'd be a huge win, but unfortunately that's far beyond the compiler's capabilities.

Posted in reply to Manu | http://d.puremagic.com/issues/show_bug.cgi?id=7413

--- Comment #16 from Manu <turkeyman@gmail.com> 2012-05-03 03:25:43 PDT ---
(In reply to comment #15)
> My guess is that it's pretty rare that the alternative sequences are favoured
> just on the basis of the pipeline, since MOVDQA only uses a load port, and
> nothing else. Especially on Sandy Bridge or AMD, where there are two load
> ports.
> So I doubt there's much benefit to be had.
>
> By contrast, if there's _any_ chance of a cache miss, they'd be a huge win, but unfortunately that's far beyond the compiler's capabilities.

And that's precisely my reasoning. If the compiler knows the state of the pipeline around the load, and there aren't conflicts, i.e. it can slip the instructions in for free between other pipeline stalls, then generating an immediate is always better than touching memory. Schedulers usually do have this information while performing code generation, so it may be possible.

These sorts of considerations are obviously much more critical for non-x86 based architectures though, as with basically all optimisations ;)
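
For constants other than 0 and -1, the register-only alternatives debated above typically start from all-ones and shift. The sketch below, again in C with GCC/Clang vector extensions and hypothetical names (`uint4`, `float4`, `all_ones`, `splat_one`, `splat_sign_mask`, `splat_float_one`, `self_check_derived`), works through three such values. The specific sequences are illustrative choices and are not named in the thread. On x86 each would be a PCMPEQD followed by one or two immediate-count shifts (PSRLD/PSLLD), so two or three ALU instructions stand in for a single MOVDQA load; that is the trade-off being weighed against load-port pressure and cache misses.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t uint4 __attribute__((vector_size(16)));
typedef float float4 __attribute__((vector_size(16)));

/* Uniform per-lane shift counts. On x86 the count would be an immediate
   operand of PSRLD/PSLLD, not a vector held in memory. */
static const uint4 BY_23 = {23, 23, 23, 23};
static const uint4 BY_25 = {25, 25, 25, 25};
static const uint4 BY_31 = {31, 31, 31, 31};

/* 0xFFFFFFFF in every lane, independent of the input (compare with self). */
static uint4 all_ones(uint4 scratch) {
    return (uint4)(scratch == scratch);
}

/* 1 in every lane: 0xFFFFFFFF >> 31 (logical shift, lanes are unsigned). */
static uint4 splat_one(uint4 scratch) {
    return all_ones(scratch) >> BY_31;
}

/* 0x80000000 in every lane (the float sign bit): 0xFFFFFFFF << 31. */
static uint4 splat_sign_mask(uint4 scratch) {
    return all_ones(scratch) << BY_31;
}

/* 1.0f in every lane: (0xFFFFFFFF >> 25) << 23 == 0x3F800000,
   which is the IEEE 754 single-precision encoding of 1.0. */
static float4 splat_float_one(uint4 scratch) {
    return (float4)((all_ones(scratch) >> BY_25) << BY_23);
}

/* Checks each derived constant, starting from arbitrary register contents. */
static void self_check_derived(void) {
    uint4 junk = {9u, 0u, 77u, 0xDEADBEEFu};
    uint4 one = splat_one(junk);
    uint4 sign = splat_sign_mask(junk);
    float4 f1 = splat_float_one(junk);
    for (int i = 0; i < 4; ++i) {
        assert(one[i] == 1u);
        assert(sign[i] == 0x80000000u);
        assert(f1[i] == 1.0f);
    }
}
```

Unlike the 0 and -1 cases, these cost more than one instruction, which is why their benefit depends on free pipeline slots or on avoiding a cache miss rather than being unconditional.
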