View mode: basic / threaded / horizontal-split · Log in · Help
October 02, 2012
Functional vs simple code
Without optimization the range and algorithm method takes about 
10 times as long as the simple code below it, with no array 
bounds checking and optimization it takes six times as long. Why 
is the difference so huge? I'd expect a moderate overhead but 
that's a big difference.

module main;
import std.stdio, std.algorithm, std.range, std.datetime;

enum MAX = 10_000_000UL;

void main() {
	StopWatch sw;
	sw.start;

	auto sum1 = MAX.iota.map!(x => x * x).reduce!"a + b";

	sw.peek.msecs.writeln("msecs");
	sum1.writeln;
	sw.start;

	ulong sum2 = 0;
	foreach(i;0..MAX)
		sum2 += i * i;

	sw.peek.msecs.writeln("msecs");

	sum2.writeln;
}
October 02, 2012
Re: Functional vs simple code
On Tuesday, 2 October 2012 at 20:48:31 UTC, ixid wrote:
> Without optimization the range and algorithm method takes about 
> 10 times as long as the simple code below it, with no array 
> bounds checking and optimization it takes six times as long. 
> Why is the difference so huge? I'd expect a moderate overhead 
> but that's a big difference.
>
> module main;
> import std.stdio, std.algorithm, std.range, std.datetime;
>
> enum MAX = 10_000_000UL;
>
> void main() {
> 	StopWatch sw;
> 	sw.start;
>
> 	auto sum1 = MAX.iota.map!(x => x * x).reduce!"a + b";
>
> 	sw.peek.msecs.writeln("msecs");
> 	sum1.writeln;
> 	sw.start;
>
> 	ulong sum2 = 0;
> 	foreach(i;0..MAX)
> 		sum2 += i * i;
>
> 	sw.peek.msecs.writeln("msecs");
>
> 	sum2.writeln;
> }

I realised I was making the functional version more complicated 
than necessary. This version is faster but still four times 
slower than the simple version:

MAX.iota.reduce!"a + b * b".writeln;
October 02, 2012
Re: Functional vs simple code
On 10/02/2012 10:48 PM, ixid wrote:
> Without optimization the range and algorithm method takes about 10 times
> as long as the simple code below it, with no array bounds checking and
> optimization it takes six times as long. Why is the difference so huge?
> I'd expect a moderate overhead but that's a big difference.
>
> module main;
> import std.stdio, std.algorithm, std.range, std.datetime;
>
> enum MAX = 10_000_000UL;
>
> void main() {
>      StopWatch sw;
>      sw.start;
>
>      auto sum1 = MAX.iota.map!(x => x * x).reduce!"a + b";
>
>      sw.peek.msecs.writeln("msecs");
>      sum1.writeln;
>      sw.start;
>
>      ulong sum2 = 0;
>      foreach(i;0..MAX)
>          sum2 += i * i;
>
>      sw.peek.msecs.writeln("msecs");
>
>      sum2.writeln;
> }


$ cat ixidbench.d
module main;
import std.stdio, std.algorithm, std.range, std.datetime;

enum MAX = 10_000_000_000UL;

void main() {
	StopWatch sw;
	sw.start;

	auto sum1 = MAX.iota.map!(x => x * x).reduce!"a + b";

	sw.stop;
	sw.peek.msecs.writeln("msecs");
	sum1.writeln;
	sw.reset;
	sw.start;

	ulong sum2 = 0;
	foreach(i;0..MAX)
		sum2 += i * i;

	sw.stop;
	sw.peek.msecs.writeln("msecs");

	sum2.writeln;
}
$ ldmd2 -O -release -inline ixidbench.d -ofixidbench
$ ./ixidbench
6528msecs
7032546979563742720
7518msecs
7032546979563742720
October 02, 2012
Re: Functional vs simple code
On 10/03/2012 12:11 AM, Timon Gehr wrote:
> ...
>
> $ cat ixidbench.d
> module main;
> import std.stdio, std.algorithm, std.range, std.datetime;
>
> enum MAX = 10_000_000_000UL;
>
> void main() {
>      StopWatch sw;
>      sw.start;
>
>      auto sum1 = MAX.iota.map!(x => x * x).reduce!"a + b";
>
>      sw.stop;
>      sw.peek.msecs.writeln("msecs");
>      sum1.writeln;
>      sw.reset;
>      sw.start;
>
>      ulong sum2 = 0;
>      foreach(i;0..MAX)
>          sum2 += i * i;
>
>      sw.stop;
>      sw.peek.msecs.writeln("msecs");
>
>      sum2.writeln;
> }
> $ ldmd2 -O -release -inline ixidbench.d -ofixidbench
> $ ./ixidbench
> 6528msecs
> 7032546979563742720
> 7518msecs
> 7032546979563742720

$ gdmd -O -release -inline ixidbench.d -ofixidbench
$ ./ixidbench
11250msecs
7032546979563742720
11233msecs
7032546979563742720
October 02, 2012
Re: Functional vs simple code
On Tuesday, 2 October 2012 at 22:13:10 UTC, Timon Gehr wrote:
> On 10/03/2012 12:11 AM, Timon Gehr wrote:
>> ...
>>
>> $ cat ixidbench.d
>> module main;
>> import std.stdio, std.algorithm, std.range, std.datetime;
>>
>> enum MAX = 10_000_000_000UL;
>>
>> void main() {
>>     StopWatch sw;
>>     sw.start;
>>
>>     auto sum1 = MAX.iota.map!(x => x * x).reduce!"a + b";
>>
>>     sw.stop;
>>     sw.peek.msecs.writeln("msecs");
>>     sum1.writeln;
>>     sw.reset;
>>     sw.start;
>>
>>     ulong sum2 = 0;
>>     foreach(i;0..MAX)
>>         sum2 += i * i;
>>
>>     sw.stop;
>>     sw.peek.msecs.writeln("msecs");
>>
>>     sum2.writeln;
>> }
>> $ ldmd2 -O -release -inline ixidbench.d -ofixidbench
>> $ ./ixidbench
>> 6528msecs
>> 7032546979563742720
>> 7518msecs
>> 7032546979563742720
>
> $ gdmd -O -release -inline ixidbench.d -ofixidbench
> $ ./ixidbench
> 11250msecs
> 7032546979563742720
> 11233msecs
> 7032546979563742720

Yes, I think it was just the compiler and compiler options rather 
than anything inherent in the method, DMD produced much slower 
code than GDC for the functional version. It's interesting that 
it's significantly faster in the functional style with LDC.
October 03, 2012
Re: Functional vs simple code
On Tuesday, 2 October 2012 at 22:13:10 UTC, Timon Gehr wrote:
> On 10/03/2012 12:11 AM, Timon Gehr wrote:
>> ...
>>
>> $ cat ixidbench.d
>> module main;
>> import std.stdio, std.algorithm, std.range, std.datetime;
>>
>> enum MAX = 10_000_000_000UL;
>>
>> void main() {
>>     StopWatch sw;
>>     sw.start;
>>
>>     auto sum1 = MAX.iota.map!(x => x * x).reduce!"a + b";
>>
>>     sw.stop;
>>     sw.peek.msecs.writeln("msecs");
>>     sum1.writeln;
>>     sw.reset;
>>     sw.start;
>>
>>     ulong sum2 = 0;
>>     foreach(i;0..MAX)
>>         sum2 += i * i;
>>
>>     sw.stop;
>>     sw.peek.msecs.writeln("msecs");
>>
>>     sum2.writeln;
>> }
>> $ ldmd2 -O -release -inline ixidbench.d -ofixidbench
>> $ ./ixidbench
>> 6528msecs
>> 7032546979563742720
>> 7518msecs
>> 7032546979563742720
>
> $ gdmd -O -release -inline ixidbench.d -ofixidbench
> $ ./ixidbench
> 11250msecs
> 7032546979563742720
> 11233msecs
> 7032546979563742720

While fiddling with this I came across something that seems odd 
in the behaviour of reduce and wondered if it's intended. It 
rather limits the usefulness of reduce with UFCS to "a + b" and 
"a - b".

Reduce works correctly when provided with a seed argument:

reduce!"a + b + 2"(0, [1,1,1]).writeln; // == 9 as expected

With UFCS I see no elegant way to reduce a chain with a more 
sophisticated reduce argument than the two simple ones listed 
above because the operation is not carried out on the first array 
element, used as the seed. This means:

[1,1,1].reduce!"a + b + 2".writeln; // == 7

which is useless and could easily trip people up because the 
intuitive expectation is that it will act like the seed example. 
If I am wrong in that expectation what is a use case for the 
seedless behaviour? Is there a way of writing the operation to 
give the same answer as the seed answer?

The element in position 0 should surely have +2 added to itself, 
it's only being the seed out of convenience isn't it? Instead we 
get 1 + (1 + 2) + (1 + 2). The order of reduce's arguments, 
(seed, range) prevent it being used with UFCS properly. If it 
were (range, seed) then there would be no problem:

[1,1,1].reduce!"a + b + 2"(0).writeln; // == 9
October 03, 2012
Re: Functional vs simple code
On Wednesday, 3 October 2012 at 01:21:38 UTC, ixid wrote:
> If it were (range, seed) then there would be no problem:
>
> [1,1,1].reduce!"a + b + 2"(0).writeln; // == 9

My thoughts exactly.
October 03, 2012
Re: Functional vs simple code
> While fiddling with this I came across something that seems odd 
> in the behaviour of reduce and wondered if it's intended. It 
> rather limits the usefulness of reduce with UFCS to "a + b" and 
> "a - b".
>
> Reduce works correctly when provided with a seed argument:
>
> reduce!"a + b + 2"(0, [1,1,1]).writeln; // == 9 as expected
>
> With UFCS I see no elegant way to reduce a chain with a more 
> sophisticated reduce argument than the two simple ones listed 
> above because the operation is not carried out on the first 
> array element, used as the seed. This means:
>
> [1,1,1].reduce!"a + b + 2".writeln; // == 7
>
> which is useless and could easily trip people up because the 
> intuitive expectation is that it will act like the seed 
> example. If I am wrong in that expectation what is a use case 
> for the seedless behaviour?

The fact that it does something different than the seeded version 
does not make the seedless version useless - in fact, that's what 
makes it useful. A typical use case is to find the maximum of a 
range (there is an example of this in the documentation at 
http://dlang.org/phobos/std_algorithm.html#reduce). If you don't 
know the highest possible value of elements, then you couldn't 
come up with an appropriate seed for this case.

I agree that the difference between the two versions of reduce 
can be a bit confusing. Maybe it would be better if they were 
named differently.

> Is there a way of writing the operation to give the same answer 
> as the seed answer?

In general, no (apart from the silly reduce!f(chain([seed], 
range))). If you want the behavior of the seeded version, use the 
seeded version.

> The element in position 0 should surely have +2 added to 
> itself, it's only being the seed out of convenience isn't it?

No, reduce!f(r) is not merely a syntactic sugar for reduce!f(0, 
r) (you could say it's a syntactic sugar for reduce!f(r.front, 
r.drop(1), though). Making reduce!f(r) be the same as reduce!f(0, 
r) doesn't really make any sense - would you expect [1, 2, 
3].reduce!"a * b" to return 0?  I wouldn't.

> Instead we get 1 + (1 + 2) + (1 + 2). The order of reduce's 
> arguments, (seed, range) prevent it being used with UFCS 
> properly. If it were (range, seed) then there would be no 
> problem:
>
> [1,1,1].reduce!"a + b + 2"(0).writeln; // == 9

This would be better, yes. Reduce was written before there was 
UFCS, so they couldn't take this into account. I don't know if it 
is possible for this to be changed now, as it would definitely 
break some code.
October 03, 2012
Re: Functional vs simple code
>A typical use case is to find the maximum of a range (there is 
>an example of this in the documentation at 
>http://dlang.org/phobos/std_algorithm.html#reduce). If you don't 
>know the highest possible value of elements, then you couldn't 
>come up with an appropriate seed for this case.

Fixing unseeded reduce would not remove that use case though, it 
just happens to be a case that's not broken by the way unseeded 
reduce works, why couldn't unseeded reduce compare a (the first 
datum) to itself? That would discover a isn't greater than a so 
leave a in position 0 and compare with all of the other elements. 
It would also avoid breaking uses like "a + b + 2" and "a + b * 
2" as well as a raft of more complex uses that reduce should be 
able to do but requires map and then reduce at present.

Walter Bright has a subtle bug in his code (the combined sum and 
square reduce) in his latest Dr Dobbs due to the behaviour of 
unseeded reduce. I'd call that more broken than unintuitive.

int[] arr = [1,2,3,4,5];
auto r = arr.reduce!((a,b) => a + b, (a,b) => a + b * b);
writefln("sum = %s, sum of squares = %s", r[0], r[1]);

If you change the value of the first number to 2 the square sum 
is an incorrect 56 when it should be 58.
October 03, 2012
Re: Functional vs simple code
>> Stuff

Actually now I realise what the conflict is between the two, a + 
b * b would give the wrong answer when applied to the whole array 
in the manner I was thinking by doing a + a * a for the first 
value.
Top | Discussion index | About this forum | D home