April 27, 2009 Re: Yet another strike against the current AA implementation | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Michel Fortin | Michel Fortin wrote:
> On 2009-04-26 11:46:51 -0400, dsimcha <dsimcha@yahoo.com> said:
>
>> == Quote from Michel Fortin (michel.fortin@michelf.com)'s article
>>> Hum, are you saying opApply superior when it comes to iterating in a
>>> tree? It seems that with opApply you could use the stack by recursively
>>> calling opApply with the given delegate on each one of the branches.
>>
>> Exactly. On the other hand, you lose a lot of flexibility with opApply. If you
>> tried to port most of std.range to operate on the opApply interface instead fo the
>> forward range interface, I doubt you'd get very far.
>>
>> IMHO, though, opApply should *not* be deprecated. opApply and ranges attempt to
>> solve similar problems, but not the same problem. Ranges attempt to solve the
>> problem of iterating over some object with maximum flexibility from the point of
>> view of the user of the object. opApply attempts to solve the problem of
>> iterating with maximum flexibility from the point of view of the implementer of
>> the object. In some cases, like the one you just described, the latter can be
>> better.
>
> Indeed. I certainly agree that both ranges and opApply have their place.
>
> So what the implementer can do with opApply is write an optimized iteration algorithm for use with foreach. Which may mean that when both opApply and ranges are available for generating foreach's code, the compiler should favor opApply. Currently, I believe it's the reverse.
I think, all other things being equal, that opApply tends to be slower than iterators.
Andrei
| |||
April 27, 2009 Re: Yet another strike against the current AA implementation | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | Andrei Alexandrescu wrote:
> 2. I haven't measured, but the cost of the indirect call is large enough to make me suspect that opApply is not as efficient as it's cracked to be, even when compared with an iterator.
When you know the type beforehand or can use templates, that is, rather than wrapping your range struct in a wrapper class. If you can't use a template for whatever reason, ranges are going to suck -- three virtual calls rather than one.
I don't usually care sufficiently about performance to worry about whether a call is virtual or not, but you brought that issue up before. And I imagine that, most of the time, you will know the range type in advance.
| |||
April 27, 2009 Re: Yet another strike against the current AA implementation | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Christopher Wright | Christopher Wright wrote:
> Andrei Alexandrescu wrote:
>> 2. I haven't measured, but the cost of the indirect call is large enough to make me suspect that opApply is not as efficient as it's cracked to be, even when compared with an iterator.
>
> When you know the type beforehand or can use templates, that is, rather than wrapping your range struct in a wrapper class. If you can't use a template for whatever reason, ranges are going to suck -- three virtual calls rather than one.
>
> I don't usually care sufficiently about performance to worry about whether a call is virtual or not, but you brought that issue up before. And I imagine that, most of the time, you will know the range type in advance.
Yah, that's a good motivation to change how hashes are currently handled. But above I was referring to the cost of opApply's callback, which adds to the costs you mention.
Andrei
| |||
April 27, 2009 Re: Yet another strike against the current AA implementation | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Christopher Wright | == Quote from Christopher Wright (dhasenan@gmail.com)'s article
> Andrei Alexandrescu wrote:
> > 2. I haven't measured, but the cost of the indirect call is large enough to make me suspect that opApply is not as efficient as it's cracked to be, even when compared with an iterator.
> When you know the type beforehand or can use templates, that is, rather
> than wrapping your range struct in a wrapper class. If you can't use a
> template for whatever reason, ranges are going to suck -- three virtual
> calls rather than one.
> I don't usually care sufficiently about performance to worry about
> whether a call is virtual or not, but you brought that issue up before.
> And I imagine that, most of the time, you will know the range type in
> advance.
Rule number 1 of all performance discussions: Always measure if possible.
import std.stdio, std.perf;
struct Direct {
uint n;
uint front() {
return n;
}
void popFront() {
n++;
}
bool empty() {
return n >= 1_000_000_000;
}
}
struct Apply {
int opApply(int delegate(ref uint) dg) {
int res = 0;
foreach(i; 0U..1_000_000_000) {
res = dg(i);
if(res) break;
}
return res;
}
}
class Virtual {
uint n;
uint front() {
return n;
}
void popFront() {
n++;
}
bool empty() {
return n >= 1_000_000_000;
}
}
void main() {
scope pc = new PerformanceCounter;
pc.start;
foreach(elem; Direct.init) {}
pc.stop;
writeln("Direct: ", pc.milliseconds);
pc.start;
auto v = new Virtual;
foreach(elem; v) {}
pc.stop;
writeln("Virtual: ", pc.milliseconds);
pc.start;
foreach(elem; Apply.init) {}
pc.stop;
writeln("opApply: ", pc.milliseconds);
}
Output:
Direct: 2343
Virtual: 5695
opApply: 3014
Bottom line is that these pretty much come in the order you would expect them to, but there are no particularly drastic differences between any of them. To put these timings in perspective, 5700 ms for 1 billion iterations is roughly (on a 2.7 GHz machine) 15 clock cycles per iteration. How often does anyone really have code that is performance critical *and* where the contents of the loop don't take long enough to dwarf the 15 clock cycles per iteration loop overhead *and* you need the iteration to be polymorphic?
| |||
April 27, 2009 Re: Yet another strike against the current AA implementation | ||||
|---|---|---|---|---|
| ||||
Posted in reply to dsimcha | dsimcha wrote:
> == Quote from Christopher Wright (dhasenan@gmail.com)'s article
>> Andrei Alexandrescu wrote:
>>> 2. I haven't measured, but the cost of the indirect call is large enough
>>> to make me suspect that opApply is not as efficient as it's cracked to
>>> be, even when compared with an iterator.
>> When you know the type beforehand or can use templates, that is, rather
>> than wrapping your range struct in a wrapper class. If you can't use a
>> template for whatever reason, ranges are going to suck -- three virtual
>> calls rather than one.
>> I don't usually care sufficiently about performance to worry about
>> whether a call is virtual or not, but you brought that issue up before.
>> And I imagine that, most of the time, you will know the range type in
>> advance.
>
> Rule number 1 of all performance discussions: Always measure if possible.
>
> import std.stdio, std.perf;
>
> struct Direct {
> uint n;
>
> uint front() {
> return n;
> }
>
> void popFront() {
> n++;
> }
>
> bool empty() {
> return n >= 1_000_000_000;
> }
> }
>
> struct Apply {
>
> int opApply(int delegate(ref uint) dg) {
> int res = 0;
> foreach(i; 0U..1_000_000_000) {
> res = dg(i);
> if(res) break;
> }
> return res;
> }
> }
>
> class Virtual {
> uint n;
>
> uint front() {
> return n;
> }
>
> void popFront() {
> n++;
> }
>
> bool empty() {
> return n >= 1_000_000_000;
> }
> }
>
> void main() {
> scope pc = new PerformanceCounter;
> pc.start;
> foreach(elem; Direct.init) {}
> pc.stop;
> writeln("Direct: ", pc.milliseconds);
> pc.start;
> auto v = new Virtual;
> foreach(elem; v) {}
> pc.stop;
> writeln("Virtual: ", pc.milliseconds);
> pc.start;
> foreach(elem; Apply.init) {}
> pc.stop;
> writeln("opApply: ", pc.milliseconds);
> }
>
> Output:
> Direct: 2343
> Virtual: 5695
> opApply: 3014
>
> Bottom line is that these pretty much come in the order you would expect them to,
> but there are no particularly drastic differences between any of them. To put
> these timings in perspective, 5700 ms for 1 billion iterations is roughly (on a
> 2.7 GHz machine) 15 clock cycles per iteration. How often does anyone really have
> code that is performance critical *and* where the contents of the loop don't take
> long enough to dwarf the 15 clock cycles per iteration loop overhead *and* you
> need the iteration to be polymorphic?
Did you compile that with optimization/inlining? If not, it's not really a fair test... and if so why isn't the compiler getting rid of those pointless loops?
| |||
April 27, 2009 Re: Yet another strike against the current AA implementation | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Steve Teale | Steve Teale wrote: > Ranges, ranges! That's all I hear these days, and it looks to me like > the continuing advance of D toward being a complete meta-language. > > Where do I see ranges described in terms that an old hand can > understand? > > I'm constantly having to roll my own in many areas when I see how the > meta stuff is implemented - like x ~= c to add a character to the end > of an array - reallocation every time? > > I thought D was supposed to be a systems programming language, not > something that was guaranteed to win the universe obfuscated code > competition. > > I've been trying to keep my projects up to date and compilable with > D2.0xx, but I think I'm going to give up on that and rewrite them for > whatever the current version of D1 is. > > I seriously think that the crew who are driving the development > should realize that not all computer programmers are going to have an > IQ of 140, and that any practical language should be comprehensible > to a majority of its users. It will, be patient. See below. > Maybe the problem I'm complaining about is just a lack of > documentation. Generating said from comments really does not hack it > - comments are always skimped, and usually lie. > > Before something like Ranges are implemented, there should be some > sort of RFC process where they are properly described rather than a > reliance on D users to have read every thread of the newsgroup, and > remembered it all. There's an illusion. And that illusion comes from the D newsgroups having "wrong" names. The D2 newsgroup should have a name like "D2 discussion -- for D language development folks, enter at your own risk". And a *D1* newsgroup should then be for anybody who actually uses the language for something. Currently, actually, the D.learn newsgroup has sort-of assumed this functionality. Being as it is (a de facto D2 development newsgroup), d.D will contain legthy discussions that meander and roam, possibly for months, before something crystallises, and the issue is settled (at least for the time). A lack of proper documentation belongs to this setup. (But then, one really has to congratulate Andrei, Phobos docs have *never* been as good as they are today!!!) About ranges: I suspect that once D2 is solidified (and we start working on D3 :-) ), ranges will be easy to use, easy to understand, and practical. Incidentally, that same complaint could have been said about templates. It took more than a year to get the discussion settled down a bit, and today templates seem not impossible at all to begin using (unless you want to begin right with the hairy stuff). And earlier, it took two months of intense discussion here before Unicode, UTF and their relation to char, wchar and dchar got generally understood. Today you can program in D without much thinking about UTF, and things "just work". Even abroad. But to sum it all, any project whose purpose is anything else than learning D2, should be written in D1. | |||
April 27, 2009 Re: Yet another strike against the current AA implementation | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Georg Wrede | On Mon, 27 Apr 2009 11:32:30 +0300, Georg Wrede wrote: > Phobos docs have *never* been as good as they are today!!! Really?! I still find them very confusing. Maybe its just the formatting but I often find it so hard to understand or find what I'm after that I refer to the source code to work it out. > About ranges: I suspect that once D2 is solidified (and we start working on D3 :-) ), ranges will be easy to use, easy to understand, and practical. Incidentally, that same complaint could have been said about templates. It took more than a year to get the discussion settled down a bit, and today templates seem not impossible at all to begin using (unless you want to begin right with the hairy stuff). It must be me, but I still find D2 templates as clear as mud. The syntax is butt-ugly, counter intuitive, and reader-hostile, IMHO. I'm sure that C++ is worse, but that is not the issue. I'll have to wait until somebody writes the "D Templates for Dummies" book. -- Derek Parnell Melbourne, Australia skype: derek.j.parnell | |||
April 27, 2009 Re: Yet another strike against the current AA implementation | ||||
|---|---|---|---|---|
| ||||
Posted in reply to dsimcha | On 2009-04-27 00:50:23 -0400, dsimcha <dsimcha@yahoo.com> said: > Output: > Direct: 2343 > Virtual: 5695 > opApply: 3014 Nice. Isn't there room for optimisation on the compiler side though? I mean, the compiler could inline opApply, and while doing so it could notice that the delegate is constant and inline it too. I suspect that adding this optimisation would put opApply at the same performance level than ranges. -- Michel Fortin michel.fortin@michelf.com http://michelf.com/ | |||
April 27, 2009 Re: Yet another strike against the current AA implementation | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Michel Fortin |
Michel Fortin wrote:
> On 2009-04-27 00:50:23 -0400, dsimcha <dsimcha@yahoo.com> said:
>
>> Output:
>> Direct: 2343
>> Virtual: 5695
>> opApply: 3014
>
> Nice.
>
> Isn't there room for optimisation on the compiler side though? I mean, the compiler could inline opApply, and while doing so it could notice that the delegate is constant and inline it too. I suspect that adding this optimisation would put opApply at the same performance level than ranges.
Inlining does not automatically make things faster. Case in point: downs' raytracer stacy actually slows down when you inline certain parts. The penalty of not fitting in cache was overwhelming the penalty from using virtual methods.
-- Daniel
| |||
April 27, 2009 Re: Yet another strike against the current AA implementation | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Derek Parnell | Derek Parnell wrote:
> On Mon, 27 Apr 2009 11:32:30 +0300, Georg Wrede wrote:
>
>> Phobos docs have *never* been as good as they are today!!!
>
> Really?! I still find them very confusing. Maybe its just the formatting
> but I often find it so hard to understand or find what I'm after that I
> refer to the source code to work it out.
Suggestions are always welcome.
Andrei
| |||
Copyright © 1999-2021 by the D Language Foundation
Permalink
Reply