October 03, 2012
On Tuesday, 2 October 2012 at 21:27:42 UTC, Andrei Alexandrescu wrote:
> http://www.reddit.com/r/programming/comments/10u6sk/component_programming_in_d/
>
> Andrei

The article contains a bug due to the pernicious behaviour of seedless reduce.

This section:

Just to show how flexible algorithms can be, reduce can also compute multiple values with one pass through the data (which is pretty useful for streaming data and would be expensive to save for a second pass through it). Multiple lambdas produce a tuple result, here the sum and sum of squares is computed:

    int[] arr = [1,2,3,4,5];
    auto r = arr.reduce!((a,b) => a + b, (a,b) => a + b * b);
    writefln("sum = %s, sum of squares = %s", r[0], r[1]);
Which prints: sum = 15, sum of squares = 55

That is the correct answer for the squares but only because 1*1 is 1. The first element of a seedless reduce does not have any operation carried out on it.

If we change the array to [2,2,2,2,2] we would expect the squares sum to be 20. It's 18 because the seed element at arr[0] has no operation carried out on it.
October 03, 2012
On Tuesday, 2 October 2012 at 21:27:42 UTC, Andrei Alexandrescu wrote:
> http://www.reddit.com/r/programming/comments/10u6sk/component_programming_in_d/
>
> Andrei

The article contains a bug due to the pernicious behaviour of seedless reduce.

This section:

"Just to show how flexible algorithms can be, reduce can also compute multiple values with one pass through the data (which is pretty useful for streaming data and would be expensive to save for a second pass through it). Multiple lambdas produce a tuple result, here the sum and sum of squares is computed:

    int[] arr = [1,2,3,4,5];
    auto r = arr.reduce!((a,b) => a + b, (a,b) => a + b * b);
    writefln("sum = %s, sum of squares = %s", r[0], r[1]);

Which prints: sum = 15, sum of squares = 55"

That is the correct answer for the squares sum but only because 1*1 is 1, what it's really doing here is 1 + (2 * 2) + (3 * 3) + (4 * 4) + (5 * 5) which happens to work in this case and for "a + b" and "a - b" but is otherwise broken. The first element of a seedless reduce does not have any operation carried out on it.

If we change the array to [2,2,2,2,2] we would expect the squares sum to be 20. It's 18 because the seed element at arr[0] has no operation carried out on it other than the addition of the other elements to it.
October 03, 2012
On Wed, 03 Oct 2012 03:05:08 +0200
"Jonathan M Davis" <jmdavisProg@gmx.com> wrote:

> On Tuesday, October 02, 2012 17:27:55 Andrei Alexandrescu wrote:
> > http://www.reddit.com/r/programming/comments/10u6sk/component_programming_in _d/
> 
> It's definitely the sort of article that we've needed to show what we're trying to do with ranges.
> 

Yes, and also why a lot of D's features, esp its metaprogramming features, are so significant. And why most other languages, including dynamic languages, don't even come close.

I think I'm starting to get a sense of the next big step, though. There are other misc improvements I'd like to see, but I think the biggest weakness our range approach faces now (even as far ahead as we are) is the effort and, arguably, boilerplate to actually create the ranges.

Especially input/forward ranges: It's pretty well known and accepted (esp. to those who have created input/forward ranges) that the easiest way to make a generator is with a straight imperative function. But ranges turn the whole logic inside-out. Walking a tree can get particularly convoluted.

So I think the next *big* step from here, in D3 or some other D-derived
language, would be easing the creation of ranges. For example, a
special low-boilerplate syntax for creating bidirectional and
random-access ranges. Or for input (or maybe even forward) ranges, a
C#-style compile-time transformation of a generator function into a
range (With input ranges, you can technically get around needing source
transformation right now in D2 using fibers, but that has too
much runtime overhead to be used as a general solution).

October 03, 2012
> So I think the next *big* step from here, in D3 or some other D-derived
> language, would be easing the creation of ranges. For example, a
> special low-boilerplate syntax for creating bidirectional and
> random-access ranges. Or for input (or maybe even forward) ranges, a
> C#-style compile-time transformation of a generator function into a
> range (With input ranges, you can technically get around needing source
> transformation right now in D2 using fibers, but that has too
> much runtime overhead to be used as a general solution).

I don't think we need language support in order to have an easy way to create random access ranges. We could have a function that would be used like this:

auto squares = randomAccessRange(len, (size_t i) => i * i);

This probably wouldn't have optimal performance, but wouldn't be horribly slow either. Another option would be to have a mixin that would generate all other range methods from a method that would provide random access to the initial (that is, before the first call to popFront or popBack) range and a property that would return the length of the initial range.
October 03, 2012
On Wednesday, 3 October 2012 at 06:15:32 UTC, jerro wrote:
>> So I think the next *big* step from here, in D3 or some other D-derived
>> language, would be easing the creation of ranges. For example, a
>> special low-boilerplate syntax for creating bidirectional and
>> random-access ranges. Or for input (or maybe even forward) ranges, a
>> C#-style compile-time transformation of a generator function into a
>> range (With input ranges, you can technically get around needing source
>> transformation right now in D2 using fibers, but that has too
>> much runtime overhead to be used as a general solution).
>
> I don't think we need language support in order to have an easy way to create random access ranges. We could have a function that would be used like this:
>
> auto squares = randomAccessRange(len, (size_t i) => i * i);
>
> This probably wouldn't have optimal performance, but wouldn't be horribly slow either. Another option would be to have a mixin that would generate all other range methods from a method that would provide random access to the initial (that is, before the first call to popFront or popBack) range and a property that would return the length of the initial range.

Never mind, I just realized that is what std.range.sequence already does.

October 03, 2012
ixid:

> The article contains a bug due to the pernicious behaviour of seedless reduce.

Hopefully Walter will be able to fix the article.

Bye,
bearophile
October 03, 2012
Andrei Alexandrescu wrote:
> http://www.reddit.com/r/programming/comments/10u6sk/component_programming_in_d/

Nice article!

There, I found an interesting way of writing ranges that are not explicitly initialized:

    private import core.stdc.stdio;

    struct StdinByChar {

	@property bool empty() {
	    if (hasChar)
		return false;
	    auto c = fgetc(stdin);
	    if (c == EOF)
		return true;
	    ch = cast(char)c;
	    hasChar = true;
	    return false;
	}

	@property char front() {
	    return ch;
        }

	void popFront() {
	    hasChar = false;
	}

      private:
	char ch;
	bool hasChar;
    }

This is unusual to me that popFront() code is shifted to empty(), but I guess that's necessary because structs doesn't have () ctors.
October 03, 2012
Piotr Szturmaj wrote:
> guess that's necessary because structs doesn't have () ctors.

should be "don't"
October 03, 2012
I have some mixed feeling about component programming: add in all the examples the requirement to give the context (line number for example) where something happened (either a match or an error) and suddendly component programming becomes much more "tricky"!!

So for me component programming looks good on paper, but not so much in the real world..

renoX


October 03, 2012
On Wednesday, 3 October 2012 at 01:40:05 UTC, ixid wrote:
> On Tuesday, 2 October 2012 at 21:27:42 UTC, Andrei Alexandrescu wrote:
>> http://www.reddit.com/r/programming/comments/10u6sk/component_programming_in_d/
>>
>> Andrei
>
> The article contains a bug due to the pernicious behaviour of seedless reduce.
>
> This section:
>
> "Just to show how flexible algorithms can be, reduce can also compute multiple values with one pass through the data (which is pretty useful for streaming data and would be expensive to save for a second pass through it). Multiple lambdas produce a tuple result, here the sum and sum of squares is computed:
>
>     int[] arr = [1,2,3,4,5];
>     auto r = arr.reduce!((a,b) => a + b, (a,b) => a + b * b);
>     writefln("sum = %s, sum of squares = %s", r[0], r[1]);
>
> Which prints: sum = 15, sum of squares = 55"
>
> That is the correct answer for the squares sum but only because 1*1 is 1, what it's really doing here is 1 + (2 * 2) + (3 * 3) + (4 * 4) + (5 * 5) which happens to work in this case and for "a + b" and "a - b" but is otherwise broken. The first element of a seedless reduce does not have any operation carried out on it.
>
> If we change the array to [2,2,2,2,2] we would expect the squares sum to be 20. It's 18 because the seed element at arr[0] has no operation carried out on it other than the addition of the other elements to it.

Pernicious? Bug? I've never used D, but it looks obvious that arr.reduce does what it's compile time parameter promises with 'a' as the "seed". It seems that using [0,1,2,3,4,5] or [0]~[1,2,3,4,5] or ([0]~arr).reduce! would have reflected a more general "sum of squares" idea simply enough.