July 04, 2013
On Thursday, July 04, 2013 11:37:15 H. S. Teoh wrote:
> if a function
> only needs to iterate the range once, then it should only require an
> input range, nothing more.

Agreed, but it's freuently the case that an input range isn't enough, and even if it is for a particular function, odds are than the result of that function is going to need to be fed to another function which _does_ require more than an input range. And far too often, it's the case that people think that an input range is enough when in fact a forward range is required, but the implicit save that many ranges do makes it so that the function works without calling save much of the time, even though it really does need to save.

Whenever I have to deal with pure input ranges, I find it to be extremely annoying. The simple fact that you can't even save its state is just way too limiting for a lot of algorithms to be able to be implemented in any kind of sane way, if at all, and too much of Phobos becomes unavailable either because what it's trying to do just can't be done with an input range or because it would consume the range when you can't afford for it to be consumed.

- Jonathan M Davis
July 04, 2013
On Thursday, 4 July 2013 at 19:26:43 UTC, Jonathan M Davis wrote:
> Sometimes, you're stuck, because the nature of the type that you're dealing
> with forces it to be a pure input range, but you just can't take advantage of
> many algorithms with pure input ranges, which means that you have to write a
> lot more code or use std.array.array in order to use the various algorithms,
> forcing you to allocate when it should be completely unnecessary.

That is exactly why yield is useful, because without yield, you have two options.

1. Write a lot of code to create an InputRange. (Cost in time taken to write it.)
2. Allocate and lose on performance. (Cost in runtime performance.)

Yield gives you option 3.

3. Write very little code while also saving on performance.
July 04, 2013
On Thursday, July 04, 2013 21:40:23 w0rp wrote:
> On Thursday, 4 July 2013 at 19:26:43 UTC, Jonathan M Davis wrote:
> > Sometimes, you're stuck, because the nature of the type that
> > you're dealing
> > with forces it to be a pure input range, but you just can't
> > take advantage of
> > many algorithms with pure input ranges, which means that you
> > have to write a
> > lot more code or use std.array.array in order to use the
> > various algorithms,
> > forcing you to allocate when it should be completely
> > unnecessary.
> 
> That is exactly why yield is useful, because without yield, you have two options.
> 
> 1. Write a lot of code to create an InputRange. (Cost in time
> taken to write it.)
> 2. Allocate and lose on performance. (Cost in runtime
> performance.)
> 
> Yield gives you option 3.
> 
> 3. Write very little code while also saving on performance.

Except that in most cases, you should have been writing a range type with more capabilites than an input range. IMHO, something like yield would pretty much only be of value in situations where you just need to iterate over something (so you're not taking advantage of much in the way of algorithms) and where you need a range type specific to your program which will never be reused anywhere. And that implies that you're not writing very reusable code. Sometimes, that's the way things go, but ranges and algorithms are such that most of the time, you should be able to write range-based functions which are reusable. And if you're doing that, you should be writing ranges with the full set of capabilites that they can have, not just input ranges.

- Jonathan M Davis
July 04, 2013
On 7/4/13 12:40 PM, w0rp wrote:
> On Thursday, 4 July 2013 at 19:26:43 UTC, Jonathan M Davis wrote:
>> Sometimes, you're stuck, because the nature of the type that you're
>> dealing
>> with forces it to be a pure input range, but you just can't take
>> advantage of
>> many algorithms with pure input ranges, which means that you have to
>> write a
>> lot more code or use std.array.array in order to use the various
>> algorithms,
>> forcing you to allocate when it should be completely unnecessary.
>
> That is exactly why yield is useful, because without yield, you have two
> options.
>
> 1. Write a lot of code to create an InputRange. (Cost in time taken to
> write it.)
> 2. Allocate and lose on performance. (Cost in runtime performance.)
>
> Yield gives you option 3.
>
> 3. Write very little code while also saving on performance.

That would be a win if input ranges are very frequent. One possibility to assess that would be to figure how many algorithms in std.algorithm work with input ranges, how many generators in std.range create input ranges, and then assess the potential code savings. A consideration is also the cost of adding yield to the language (somehow this often gets neglected in the excitement of adding stuff).


Andrei
July 04, 2013
Andrei Alexandrescu:

> That would be a win if input ranges are very frequent.

As I have explained in one of my recent answers in this thread in Python "input ranges" are used all the time.

So in your analysis you should also hypothesize how frequent input ranges are going to appear in D if you offer a handy and compact built-in syntax to create them.


> A consideration is also the cost of adding yield to the language (somehow this often gets neglected in the excitement of adding stuff).

Of course. C#, its history and coding probably is a starting point to assess how much complexity causes this addition.

Bye,
bearophile
July 05, 2013
On Thu, 2013-07-04 at 20:27 +0200, bearophile wrote:
[…]
> Yet in Python they are used everywhere. I define input ranges often in D and I think they are useful.
> 
> Having a built-in "yield" in a language is quite handy. Take a look at the Python and the second D solutions here: http://rosettacode.org/wiki/Same_Fringe

cf. Scala which has for comprehensions and a yield. It is used a lot.

Assuming yield means the same thing in all these emails!
-- 
Russel. ============================================================================= Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder@ekiga.net 41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel@winder.org.uk London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder


July 05, 2013
On 07/04/13 20:27, bearophile wrote:
> Having a built-in "yield" in a language is quite handy. Take a look at the Python and the second D solutions here: http://rosettacode.org/wiki/Same_Fringe

Would tuples really be the ideal tree representation when dealing with mutable data in /real/ code?

Anyway, let's see... Right now, D will let you do this:

   struct Fringe(T) {
      struct State { T* root; HeapRange!Fringe subtree; };
      mixin GeneratorT!(State, q{
            if (root.l)
               for (subtree = Fringe(root.l); !subtree.empty; subtree.popFront() )
                  yield subtree.front;
            if (root.isLeaf())
               yield root.v;
            if (root.r)
               for (subtree = Fringe(root.r); !subtree.empty; subtree.popFront() )
                  yield subtree.front;
         }, typeof(T.v));
   }
   Fringe!TREE fringe(TREE)(TREE* tree) { return Fringe!TREE(tree); }

   auto sameFringe(R1, R2)(R1 r1, R2 r2) {
      import std.algorithm : equal; return equal(fringe(r1), fringe(r2));
   }

   void fringetest() {
      auto t1 = tree(tree(tree(1), 2, tree(3)), 4, tree(tree(5), 6, tree(7)));
      auto t2 = tree(tree(tree(1), 8, tree(3)), 9, tree(tree(5), 0, tree(7)));
      auto t3 = tree(tree(tree(1), 2, tree(3)), 4, tree(tree(5), 6, tree(8)));
      auto t4 = tree(tree(tree(1), 2, tree(3)), 4, tree(tree(5), 6, tree(null, 8, tree(7))));

      assert(sameFringe(t1, t2));
      assert(!sameFringe(t1, t3));
      assert(sameFringe(t1, t4));
   }

It really can't be made much simpler, it's just a few extra characters of boilerplate. Modulo certain other language issues, like no IFTI on constructors, and an imperfect compiler (the explicit "typeof(T.v)" is only required to work around forward ref errors)

The problem with encouraging this style is that people will use this approach, and then not expose the other range properties. It will result in less efficient code, which is harder and more bug-prone to improve (because the internal state has to be handled too).

Having input ranges be as simple to define as:

   auto iota(T)(T start, T end) {
      struct State { T start, end, i; }
      return Generator!(State, q{
                           for (i=start; i<end; ++i)
                              yield state.i;
                        })
                        (start, end);
   }

   auto map(alias F, R)(R r) {
      struct State { R r; alias F MF; }
      return Generator!(State, q{
                           for(; !r.empty; r.popFront())
                              yield MF(r.front);
                        })
                        (r);
   }

is nice, but has downsides. A DSL like this one is probably enough, there's not that much to gain from adding "yield" to the core language. [1]

artur

[1] A bit compiler support would make it /safer/, though; right now it's
    too easy to lose part of the state. But having implicit allocations
    (closure-like) wouldn't be a good solution; better diagnostics might
    be enough.
1 2 3
Next ›   Last »