May 28, 2013
Jacob Carlborg:

> Why not use proper lambdas instead of strings?

Mostly for personal reasons: quoted strings are sometimes a little shorter, and they force you to use arguments with default names (such as "a" and "b"). This extra standardization lets me read them a little faster than lambdas with varying argument names. If I see q{a + b} I don't need to go and look up what a and b are; to me it's a "standard" sum, like one to be used by a reduce().
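
For example, both of these compute the same sum with reduce(); the only difference is whether the argument names are fixed by convention or chosen by you:

    import std.algorithm : reduce;

    void main()
    {
        auto xs = [1, 2, 3, 4];

        // String lambda: the argument names are fixed by convention (a, b).
        auto s1 = reduce!q{a + b}(xs);

        // Regular lambda: the argument names are whatever you pick.
        auto s2 = reduce!((x, y) => x + y)(xs);

        assert(s1 == 10 && s2 == 10);
    }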

It's a bit like the "pointfree" style in Haskell (http://www.haskell.org/haskellwiki/Pointfree ), also called "tacit" in other languages, where you don't name input variables. Overused, it makes code unreadable, but used sparingly and wisely it lets you emphasize the transformations performed on the data rather than the (sometimes arbitrary) names you give to pieces of generic data.

Bye,
bearophile
May 28, 2013
On Tue, May 28, 2013 at 11:37:06AM +0200, bearophile wrote:
> Timothee Cour:
> 
> >python uses itertools.product, which is lexicographic_depth.  Like you say, no-one can agree on what the order should be, so let's leave it up to the user through a template. Sounds like a no-brainer to me.  There are use cases for each order I mentioned.
> 
> Different cases enjoy different orderings, so giving it a compile-time enum is OK. But I prefer the order of 9878 to be the default one, because it's the most commonly needed by me, and it's what I expect when I port Python code to D.
[...]

OK, I'm looking at the code to see how this could be implemented, and it seems to require some complications:

If both ranges are infinite, then the ordering needs to be Andrei's suggested ordering (incremental 2D square). It *can* have other orderings too, like diagonals, but that requires bidirectional ranges, not just forward ranges.
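
To be concrete, by "incremental 2D square" I mean an index enumeration along these lines (just a sketch of the idea, not proposed code); every index pair (i, j) is reached after finitely many steps even though both dimensions are unbounded:

    import std.stdio;

    void main()
    {
        // Enumerate index pairs shell by shell: first the 1x1 square,
        // then the extra cells of the 2x2 square, then the 3x3, ...
        enum shells = 3;  // keep the demo finite
        foreach (n; 0 .. shells)
        {
            foreach (i; 0 .. n) writefln("(%s, %s)", i, n); // new column cells
            foreach (j; 0 .. n) writefln("(%s, %s)", n, j); // new row cells
            writefln("(%s, %s)", n, n);                     // the corner
        }
        // prints: (0,0)  (0,1) (1,0) (1,1)  (0,2) (1,2) (2,0) (2,1) (2,2)
    }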

If only one range is infinite, then it is allowed to be an input range, but that constrains it to a specific ordering (namely, finite range first, then increment infinite range). If the infinite range is more than an input range, then of course other orderings are possible, but explicitly supporting those cases seems to be a lot of work for very little added value.
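
In other words, with a single infinite range that is only an input range, the traversal is forced into this shape (an illustrative sketch; recurrence here just stands in for some arbitrary infinite range):

    import std.range : recurrence, take;
    import std.stdio;

    void main()
    {
        auto inf = recurrence!"a[n-1] + 1"(0); // infinite: 0, 1, 2, ...
        auto fin = ["x", "y"];

        // For each element of the infinite range, walk the whole finite
        // range, then pop the infinite range and repeat.
        foreach (i; inf.take(3))          // take() only to end the demo
            foreach (s; fin)
                writeln(i, s);            // 0x 0y 1x 1y 2x 2y
    }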

If both ranges are finite, then many different orderings are possible, but again it depends on the nature of the ranges. If one of them is an input range, then the other must be a forward range, and the order is constrained.

The biggest problem here is, what should be the default ordering for the template? There is no single ordering that will work for every case.

Furthermore, how many different kinds of orderings should be supported? There are a lot of possibilities, even just in the finite case (if the ranges are both bidirectional, for example -- who's to say we should exclude a Peano-curve traversal of the cartesian product, or everything in between?). Even restricting ourselves to the simplest orderings, there's the question of how to efficiently implement something like Andrei's 2D traversal when you have two finite ranges of different lengths.

Currently, I'm inclined to say, the library gets to decide on the ordering based on its inputs. I'd rather the template require minimal functionality from its input ranges and automatically decide on which ordering is best. I'd just follow bearophile's suggestion for the finite case -- the ordering in issue 9878 is certainly the most useful, and most likely expected (most uses of cartesian products would expect an ordering analogous to binary enumeration, I think). While adding an enum to specify the exact ordering is certainly possible, I'm not sure if it's adding enough value to justify the complications in the implementation.
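
Concretely, the finite-case ordering I mean is just the nested-loops order (also what Python's itertools.product produces, which is what I understand issue 9878 to be asking for):

    import std.stdio;

    void main()
    {
        auto r1 = [1, 2];
        auto r2 = ["a", "b", "c"];

        // First range varies slowest, second varies fastest:
        // (1,a) (1,b) (1,c) (2,a) (2,b) (2,c)
        foreach (x; r1)
            foreach (y; r2)
                writeln(x, " ", y);
    }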


T

-- 
Gone Chopin. Bach in a minuet.
May 28, 2013
On Tuesday, 28 May 2013 at 09:54:35 UTC, bearophile wrote:
> The "rewrite rules" is a feature of the GHC compiler. It allows the library writers to define rules that in most (or many) cases are advantageous and lead to better optimization. As example one of such rules swap map and filter, putting the filter before the map, because this is more efficient. Haskell allows you to perform such swapping because (nearly) everything it does is pure and immutable.

Phobos does this a little bit in simple cases. For example, retro(retro(r)) returns r if I remember correctly.
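
Something like this shows it (assuming my memory of the behaviour is right, the printed type should be int[] rather than a doubly wrapped Retro):

    import std.range : retro;
    import std.stdio;

    void main()
    {
        int[] r = [1, 2, 3];
        auto rr = retro(retro(r));

        // If the special case kicks in, the double reversal collapses
        // back to the original array type instead of wrapping twice.
        writeln(typeof(rr).stringof);  // expected: int[]
        writeln(rr);                   // [1, 2, 3]
    }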

More complicated rewrites would be trickier to implement, and may be impossible across modules (I cannot provide a rewrite for retro(my.module.range) without modifying Phobos).

General rewrites, like the kind relational database query optimisers do, require statistics on the data structures. For example:

1) cartesianProduct(r1, r2).filter!(pred).sort!"a[1] < b[1]"
2) cartesianProduct(r1, r2.sort!"a[1] < b[1]").filter!(pred)

Which is faster depends on the sizes of r1 and r2, and the pass ratio of the filter predicate.
May 28, 2013
On Tuesday, May 28, 2013 19:55:43 Peter Alexander wrote:
> Phobos does this a little bit in simple cases. For example,
> retro(retro(r)) returns r if I remember correctly.

Yes. Phobos tries to optimize useless templates like that, and definitely does
it in this particular case, though I don't know if it does it everywhere that
it should.

> More complicated rewrites would be trickier to implement, and may
> be impossible cross-modules (I cannot provide a rewrite for
> retro(my.module.range) without modifying Phobos).

True, though you might be able to pull off something very close with alias this. The type wouldn't be the same, but it would at least convert to the same type.
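
Something along these lines, for example (the names here are hypothetical, just to show the alias this conversion):

    // Hypothetical example: DoubleRetro is not the same type as the
    // original range, but `alias this` lets it convert back implicitly.
    import std.stdio;

    struct DoubleRetro(R)
    {
        R source;
        alias source this;  // usable wherever an R is expected
    }

    DoubleRetro!R doubleRetro(R)(R r)
    {
        return DoubleRetro!R(r);
    }

    void main()
    {
        int[] r = [1, 2, 3];
        auto dr = doubleRetro(r);
        int[] back = dr;      // implicit conversion via alias this
        writeln(back);        // [1, 2, 3]
    }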

- Jonathan M Davis
May 28, 2013
On 5/28/2013 2:54 AM, bearophile wrote:
> The "rewrite rules" is a feature of the GHC compiler. It allows the library
> writers to define rules that in most (or many) cases are advantageous and lead
> to better optimization. As example one of such rules swap map and filter,
> putting the filter before the map, because this is more efficient. Haskell
> allows you to perform such swapping because (nearly) everything it does is pure
> and immutable.

Thank you.

May 29, 2013
On Tuesday, 28 May 2013 at 17:24:13 UTC, H. S. Teoh wrote:
> On Tue, May 28, 2013 at 11:37:06AM +0200, bearophile wrote:
> [...]
>
> OK, I'm looking at the code to see how this could be implemented, and it
> seems to require some complications:
> [...]
>
> The biggest problem here is, what should be the default ordering for the
> template? There is no single ordering that will work for every case.
> [...]
>
> Currently, I'm inclined to say, the library gets to decide on the
> ordering based on its inputs. I'd rather the template require minimal
> functionality from its input ranges and automatically decide on which
> ordering is best. [...]
>
>
> T

I don't think the ordering should change just because the input type changes; that would be very confusing, especially when code is just passing on ranges it has received from somewhere else.

The default should be the simplest ordering (across then down), and other orderings should have to be asked for explicitly. When an incompatible ordering is requested (including using an infinite range as the "across" range with the default ordering), it should produce a helpful error message suggesting a compatible ordering.

There could also be an ordering constant "dontCare" which simply picks any ordering compatible with the nature of the input ranges. I don't think this should be the default, because the common case is that you do care what the ordering is; you should have to say explicitly that you don't care.

It's also worth mentioning that the rules are more complicated than has been suggested: using an infinite range with the default ordering is fine as long as a finite range is used for the range that is iterated over more quickly.
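
To make that concrete, the kind of interface I have in mind looks roughly like this (the names are made up and the implementation is just an eager toy, not a proposal for Phobos):

    import std.range;
    import std.typecons : Tuple, tuple;
    import std.stdio;

    // Made-up names, not a real Phobos API.
    enum ProductOrder { acrossThenDown, dontCare }

    auto productOf(ProductOrder order = ProductOrder.acrossThenDown, R1, R2)
                  (R1 down, R2 across)
        if (isForwardRange!R1 && isForwardRange!R2)
    {
        static if (order == ProductOrder.acrossThenDown)
        {
            static assert(!isInfinite!R2,
                "acrossThenDown needs a finite 'across' range; "
                ~ "consider ProductOrder.dontCare or swap the arguments");
            // Simplest ordering: vary the 'across' range fastest.
            Tuple!(ElementType!R1, ElementType!R2)[] result;
            foreach (x; down)
                foreach (y; across.save)
                    result ~= tuple(x, y);
            return result;  // eager, just for illustration
        }
        else
        {
            // dontCare: pick whatever ordering suits the inputs; here we
            // simply reuse acrossThenDown, which is always legal for two
            // finite forward ranges.
            return productOf!(ProductOrder.acrossThenDown)(down, across);
        }
    }

    void main()
    {
        // Across-then-down order: (1,"a") (1,"b") (2,"a") (2,"b")
        writeln(productOf([1, 2], ["a", "b"]));
    }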
May 29, 2013
> Then maybe you have missed the last two times I have discussed such topics in D.learn.

Sorry, I meant the main D newsgroup.

Bye,
bearophile