Recommendations on avoiding range pipeline type hell

May 15, 2021

Posted by Chris Piker

Hi D

Since the example of piping the output of one range to another looked pretty cool, I've tried my own hand at it for my current program, and the results have been... suboptimal.

Basically the issue is that if one attempts to make a range-based pipeline, e.g.:

auto mega_range = range1.range2!(lambda2).range3!(lambda3);

Then the type definition of mega_range is something in the order of:

  TYPE_range3!( TYPE_range2!( TYPE_range1, TYPE_lambda2 ), TYPE_lambda3 );

So the type tree builds to infinity and the type of range3 is very much determined by the lambda I gave to range2. To me this seems kinda crazy.

To cut through all the clutter, I need something more like a unix command line:

prog1 | prog2 some_args | prog3 some_args

Here prog2 doesn't care what prog1 is just what it produces.

So pipelines that are more like:

ET2 front2(ET1, FT)(ET1 element, FT lambda){ /* stuff */ }
ET3 front3(ET2, FT)(ET2 element, FT lambda){ /* stuff */ }

void main(){

  for(; !range1.empty; range1.popFront() )
  {
    ET3 el3 = front3( front2(range1.front, lambda2), lambda3 );
    writeln(el3);
  }
}

But, loops are bad. On the D blog I've seen knowledgeable people say all loops are bugs. But how do you get rid of them without descending into Type Hell(tm)? Is there any way to get some type erasure on the stack?

The only thing I can think of is to use interfaces and classes like Java, but we don't have the automagical JVM reordering the heap at runtime, so that means living life on a scattered heap, just like Python.

Is there some obvious trick or way of looking at the problem that I'm missing?

Thanks for your patience with a potentially dumb question. I've been working on the code for well over 12 hours so I'm probably not thinking straight at this point.

Cheers all,

May 15, 2021

Re: Recommendations on avoiding range pipeline type hell

Posted by Adam D. Ruppe
in reply to Chris Piker

On Saturday, 15 May 2021 at 11:25:10 UTC, Chris Piker wrote:

> Then the type definition of mega_range is something in the order of:

The idea is you aren't supposed to care what the type is, just what attributes it has, e.g., can be indexed, or can be assigned, etc.

You'd want to do it all in one big statement, with a consumer at the end (and pray there's no errors, cuz while you're supposed to hide from the type, it won't hide from you if there's a problem. As you know, the errors might be usable if they were formatted better and got to the point, but they're not, and sometimes the compiler withholds vital information from you! Error message quality is D's #1 practical problem. But I'll stop ranting.)

Anyway, you put it all in one big thing, and this is kinda important: avoid assigning it to anything. You'd ideally do all the work, from creation to conclusion, all in the big pipeline.

So say you want to write it

auto mega_range = range1.range2!(lambda2).range3!(lambda3);
writeln(mega_range);

that'd prolly work, writeln is itself flexible enough, but you'd prolly be better off doing like

range1
   .range2!lambda2
   .range3!lambda3
   .each!writeln; // tell it to write out each element

Or since writeln is itself a range consumer you could avoid that .each call. But it is really useful for getting the result out of a big mess for a normal function that isn't a full range consumer. (It is basically foreach written as a function call instead of as a loop.)

This way the concrete type never enters into things, it is all just a detail the compiler tracks to ensure the next consumer doesn't try to do things the previous step does not support.
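As a concrete version of the pipeline above, here is a minimal sketch using only Phobos pieces; the function name evenSquares is made up for illustration. The intermediate types are never spelled out, `auto` carries them:

```d
import std.algorithm : each, equal, filter, map;
import std.range : iota;
import std.stdio : writeln;

// Hypothetical helper: the return type is inferred and never named.
auto evenSquares()
{
    return iota(1, 11)                  // 1, 2, ..., 10
        .filter!(n => n % 2 == 0)       // keep the even numbers
        .map!(n => n * n);              // square each survivor
}

void main()
{
    // Consumer at the end of the pipeline: print one element per line.
    evenSquares().each!writeln;

    // The same lazy pipeline can be checked without naming its type.
    assert(evenSquares().equal([4, 16, 36, 64, 100]));
}
```

Nothing is computed until the consumer (each or equal here) starts pulling elements.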

> But, loops are bad. On the D blog I've seen knowledgeable people say all loops are bugs.

Meh, don't listen to that nonsense, just write what works for you. D's strength is that it adapts to different styles and meets you where you are. Listening to dogmatic sermons about idiomatic one true ways is throwing that strength away and likely to kill your personal productivity as you're fighting your instincts instead of making it work.

May 15, 2021

Re: Recommendations on avoiding range pipeline type hell

Posted by Mike Parker
in reply to Chris Piker

On Saturday, 15 May 2021 at 11:25:10 UTC, Chris Piker wrote:

> Is there some obvious trick or way of looking at the problem that I'm missing?

In addition to what Adam said, if you do need to store the result for use in a friendlier form, just import std.array and append .array to the end of the pipeline. This will eagerly allocate space for and copy the range elements to an array, i.e., convert the range to a container:

auto mega_range = range1.range2!(lambda2).range3!(lambda3).array;

Sometimes you may want to set up a range and save it for later consumption, but not necessarily as a container. In that case, just store the range itself as you already do, and pass it to a consumer when you're ready. That might be .array or it could be foreach or something else.

auto mega_range = range1.range2!(lambda2).range3!(lambda3);

// later
foreach(elem; mega_range) {
   doStuff(elem);
}
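A small sketch of the eager variant, assuming int elements and a made-up function name (oddTimesTen). Once .array has run, the result is an ordinary int[] with a type that is easy to write down:

```d
import std.algorithm : filter, map;
import std.array : array;
import std.range : iota;

// Hypothetical helper: the lazy pipeline is evaluated once by .array
// and its results are copied into a freshly allocated int[].
int[] oddTimesTen()
{
    return iota(0, 10)
        .filter!(n => n % 2 == 1)
        .map!(n => n * 10)
        .array;
}

void main()
{
    int[] stored = oddTimesTen();          // a plain, nameable type
    assert(stored == [10, 30, 50, 70, 90]);
    assert(stored[2] == 50);               // random access works on the array
}
```

The trade-off is memory: every element is held at once, which may not suit very large inputs.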

May 15, 2021

Re: Recommendations on avoiding range pipeline type hell

Posted by Chris Piker
in reply to Adam D. Ruppe

On Saturday, 15 May 2021 at 11:51:11 UTC, Adam D. Ruppe wrote:

> > On Saturday, 15 May 2021 at 11:25:10 UTC, Chris Piker wrote:
>
> The idea is you aren't supposed to care what the type is, just what attributes it has, e.g., can be indexed, or can be assigned, etc.

(Warning, new user rant ahead. Eye rolling warranted and encouraged)

I'm trying to do that, but range3 and range2 are written by me not a Phobos wizard, and there's a whole library of template functions a person needs to learn to make their own pipelines. For example:

// From std/range/package.d
CommonType!(staticMap!(ElementType, staticMap!(Unqual, Ranges)))

alias RvalueElementType = CommonType!(staticMap!(.ElementType, R));
// ... what's with the . before the ElementType statement?  Line 921 says
// .ElementType depends on RvalueElementType.  How can they depend on
// each other?  Is this a recursive template thing?

and all the other automagic stuff that Phobos pulls off to make ranges work. If that's what's needed to make a custom range type, then D ranges should come with the warning: don't try this at home. (Ali's book made it look so easy that I got suckered in.)

Every time I slightly change the inputs to range2, then a function that operates on range3 output types blows up with a helpful message similar to:

template das2.range.PrioritySelect!(PriorityRange!(DasRange!(Tuple!(int, int)[], int function(Tuple!(int, int)) pure nothrow @nogc @safe, int function(Tuple!(int, int)) pure nothrow @nogc @safe, Tuple!(int, int), int), int function() pure nothrow @nogc @safe), PriorityRange!(DasRange!(Tuple!(int, int)[], int function(Tuple!(int, int)) pure nothrow @nogc @safe, int function(Tuple!(int, int)) pure nothrow @nogc @safe, Tuple!(int, int), int), int function() pure nothrow @nogc @safe)).PrioritySelect.getReady.filter!((rng) => !rng.empty).filter cannot deduce function from argument types !()(PriorityRange!(DasRange!(Tuple!(int, int)[], int function(Tuple!(int, int)) pure nothrow @nogc @safe, int function(Tuple!(int, int)) pure nothrow @nogc @safe, Tuple!(int, int), int), int function() pure nothrow @nogc @safe), PriorityRange!(DasRange!(Tuple!(int, int)[], int function(Tuple!(int, int)) pure nothrow @nogc @safe, int function(Tuple!(int, int)) pure nothrow @nogc @safe, Tuple!(int, int), int), int function() pure nothrow @nogc @safe))

What the heck is that?

> Anyway, you put it all in one bit thing and this is kinda important: avoid assigning it to anything. You'd ideally do all the work, from creation to conclusion, all in the big pipeline.

I fell back to using assignments just to make sure range2 values were saved in a concrete variable so that range3 didn't break when I changed the lambda that was run by range2 to mutate its output elements.

What went in to getting the element to range3's doorstep is a detail that I shouldn't have to care about inside range3 code, but am forced to care about it, because changing range2's type changes range3's type and triggers really obscure error messages. (Using interfaces or, gasp, casts would break the TMI situation.)

> So say you want to write it
>
> auto mega_range = range1.range2!(lambda2).range3!(lambda3);
> writeln(mega_range);
>
> that'd prolly work, writeln is itself flexible enough, but you'd prolly be better off doing like

Sure it will work, because writeln isn't some function written by a new user, it's got all the meta magic.

> This way the concrete type never enters into things, it is all just a detail the compiler tracks to ensure the next consumer doesn't try to do things the previous step does not support.

It's all just a detail the compiler tracks, until you're not sending to writeln, but to your own data consumer. Then, you'd better know all of std.traits and std.meta cause you're going to need them to implement a range-of-ranges consumer. And by the way you have to use a range of ranges instead of an array of ranges because two ranges that look to be identical types actually are not identical types and so can't go into the same array.

Here's an actual (though formatted by me) error message I got stating that two things were different and thus couldn't share an array. Can you see the difference? I can't. Please point it out if you do.

das2/range.d(570,39): Error: incompatible types for (dr_fine) : (dr_coarse):

das2.range.PriorityRange!(
  DasRange!(
    Take!(
      ZipShortest!(
        cast(Flag)false, Result, Generator!(function () @safe => uniform(0, 128))
      )
    ),
    int function(Tuple!(int, int)) pure nothrow @nogc @safe,
    int function(Tuple!(int, int)) pure nothrow @nogc @safe,
    Tuple!(int, int),
    int
  ),
  int function() pure nothrow @nogc @safe
)

and

das2.range.PriorityRange!(
  DasRange!(
    Take!(
      ZipShortest!(
        cast(Flag)false, Result, Generator!(function () @safe => uniform(0, 128))
      )
    ),
    int function(Tuple!(int, int)) pure nothrow @nogc @safe,
    int function(Tuple!(int, int)) pure nothrow @nogc @safe,
    Tuple!(int, int),
    int
  ),
  int function() pure nothrow @nogc @safe
)

> > But, loops are bad. On the D blog I've seen knowledgeable people say all loops are bugs.

Insightful.

Anyway, if you made it this far, you're a saint. Thanks for your time :)

May 15, 2021

Re: Recommendations on avoiding range pipeline type hell

Posted by Mike Parker
in reply to Chris Piker

On Saturday, 15 May 2021 at 11:25:10 UTC, Chris Piker wrote:

> Thanks for your patience with a potentially dumb question. I've been working on the code for well over 12 hours so I'm probably not thinking straight at this point.

BTW, I can send you a couple of documents regarding ranges that you may or may not find useful. Please email me at aldacron@gmail.com if you're interested.

May 15, 2021

Re: Recommendations on avoiding range pipeline type hell

Posted by Paul Backus
in reply to Chris Piker

On Saturday, 15 May 2021 at 13:46:57 UTC, Chris Piker wrote:

> Every time I slightly change the inputs to range2, then a function that operates on range3 output types blows up with a helpful message similar to:
> [snip]

If you post your code (or at least a self-contained subset of it) someone can probably help you figure out where you're running into trouble. The error messages by themselves do not provide enough information--all I can say from them is, "you must be doing something wrong."

May 15, 2021

Re: Recommendations on avoiding range pipeline type hell

Posted by Chris Piker
in reply to Mike Parker

On Saturday, 15 May 2021 at 13:43:29 UTC, Mike Parker wrote:

On Saturday, 15 May 2021 at 11:25:10 UTC, Chris Piker wrote:

Thanks for the suggestion. Unfortunately the range is going to be 40+ years of Voyager magnetometer data processed in a pipeline.

I am trying to do everything in functional form, but the deep type dependencies (and my lack of knowledge) are crushing my productivity. I might have to stop trying to write idiomatic D and start writing Java-in-D just to move this project along. Fortunately, D supports that too.

May 15, 2021

Re: Recommendations on avoiding range pipeline type hell

Posted by Chris Piker
in reply to Paul Backus

On Saturday, 15 May 2021 at 14:05:34 UTC, Paul Backus wrote:

> If you post your code (or at least a self-contained subset of it) someone can probably help you figure out where you're running into trouble.

Smart idea. It's all on GitHub. I'll fix a few items and send a link as soon as I get a little shut-eye.

> all I can say from them is, "you must be doing something wrong."

I bet you're right :)

Take Care,

May 15, 2021

Re: Recommendations on avoiding range pipeline type hell

Posted by Ali Çehreli
in reply to Chris Piker

On 5/15/21 4:25 AM, Chris Piker wrote:

> But, loops are bad.

I agree with Adam here. Although most of my recent code gravitates towards long range expressions, I use 'foreach' (even 'for') when I think it makes code more readable.

> Is there some obvious trick or way of looking at the problem that I'm
> missing?

The following are idioms that I use:

* The range is part of the type:

struct MyType(R) {
  R myRange;
}

* If the type is too complicated as in your examples:

struct MyType(R) {
  R myRange;
}

auto makeMyType(X, Y)(/* ... */) {
  auto myArg = foo!X.bar!Y.etc;
  return MyType!(typeof(myArg))(myArg);
}

* If my type can't be templated:

struct MyType {
  alias MyRange = typeof(makeMyArg());
  MyRange myRange;
}

// For the alias to work above, all parameters of this
// function must have default values so that the typeof
// expression is as convenient as above.
auto makeMyArg(X, Y)(X x = X.init, Y y = Y.init) {
  // Then, you can put some condition checks here if
  // X.init and Y.init are invalid values for your
  // program.
  return foo!X.bar!Y.etc;
}
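A compilable sketch of the last idiom, under the assumption that the factory is an ordinary function whose parameters all have defaults; makeSquares and Holder are made-up names, not anything from the poster's code. typeof never runs the call, it only asks the compiler what type it would return:

```d
import std.algorithm : equal, map;
import std.range : iota;

// Hypothetical factory. The default argument lets typeof(makeSquares())
// be written without supplying any arguments.
auto makeSquares(int count = 0)
{
    return iota(count).map!(n => n * n);
}

struct Holder
{
    // Gives a short name to a type that cannot be spelled by hand.
    alias Squares = typeof(makeSquares());
    Squares squares;
}

void main()
{
    auto h = Holder(makeSquares(4));
    assert(h.squares.equal([0, 1, 4, 9]));
}
```

The stored range is still lazy; Holder only fixes its type, not its contents.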

I think that's all really.

And yes, sometimes there are confusing error messages but the compiler is always right. :)

Ali

May 15, 2021

Re: Recommendations on avoiding range pipeline type hell

Posted by Adam D. Ruppe
in reply to Chris Piker

On Saturday, 15 May 2021 at 13:46:57 UTC, Chris Piker wrote:

> I'm trying to do that, but range3 and range2 are written by me not a Phobos wizard, and there's a whole library of template functions a person needs to learn to make their own pipelines. For example:

Phobos has plenty of design flaws, you don't want to copy that.

Generally you should just accept the range with a simple foreach in your handler.

void processRange(R)(R r) {
   foreach(item; r) {
      // use item
   }
}

If you want it forwarded in a pipeline, make a predicate that works on an individual item and pass it to map instead of trying to forward everything.
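For instance, a per-item function handed to map might look like this; toCelsius is a made-up example and knows nothing about ranges:

```d
import std.algorithm : equal, map;

// Hypothetical per-item step: plain value in, plain value out.
double toCelsius(double fahrenheit)
{
    return (fahrenheit - 32.0) * 5.0 / 9.0;
}

void main()
{
    double[] readings = [32.0, 212.0];

    // map takes care of empty/front/popFront forwarding.
    auto converted = readings.map!toCelsius;

    assert(converted.equal([0.0, 100.0]));
}
```

Changing what feeds the pipeline upstream does not require touching toCelsius, since it only ever sees one double at a time.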

If you're creating a range, only worry about the basic three functions: empty, front, and popFront. That's the minimum, and with it your range works with most Phobos things too. That's where I balance ease of use with compatibility: those three basics let the Phobos algorithms iterate through your generated data. You can't jump around, but you can do a lot with just that.
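A minimal sketch of such a three-function range; Countdown is a made-up type for illustration. The static assert confirms that Phobos accepts it as an input range:

```d
import std.algorithm : equal, map;
import std.range : take;
import std.range.primitives : isInputRange;

// Hypothetical generator that counts down from `current` to 1.
struct Countdown
{
    int current;

    @property bool empty() const { return current <= 0; }
    @property int front() const { return current; }
    void popFront() { --current; }
}

// Only the three basics are implemented, and that is enough here.
static assert(isInputRange!Countdown);

void main()
{
    // Phobos algorithms can be chained onto the hand-written range.
    assert(Countdown(5).map!(n => n * 2).take(3).equal([10, 8, 6]));

    // So can a plain foreach.
    int sum = 0;
    foreach (n; Countdown(4))
        sum += n;
    assert(sum == 10);
}
```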

(personally btw I don't even use most of this stuff at all)

> // ... what's with the . before the ElementType statement?

Now that is good to know: that is a D language thing meaning "look this up at top level".

So like let's say you are writing a module with

void func();

class Foo {
    void func();

    void stuff() {
         func();
    }
}

The func inside stuff would normally refer to the local method; it is shorthand for this.func();.

But what if you want that func from outside the class? That's where the . comes in:

void func();

class Foo {
    void func();

    void stuff() {
         .func(); // now refers to top-level, no more `this`
    }
}

In fact, it might help to think of it as specifically NOT wanting this.func, so you leave the this out.
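Here is a runnable variant of the same idea, with bodies filled in so the difference is observable; the names are made up for illustration:

```d
// Module-level function.
string name() { return "module-level"; }

class Foo
{
    // Member function with the same name.
    string name() { return "member"; }

    string unqualified() { return name(); }   // resolves to this.name()
    string qualified()   { return .name(); }  // leading dot: module scope
}

void main()
{
    auto foo = new Foo;
    assert(foo.unqualified() == "member");
    assert(foo.qualified() == "module-level");
}
```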

> What the heck is that?

idk i can't read that either, the awful error messages are one reason why i don't even use this style myself (and the other is im just not on the functional bandwagon...)

Most of the time std.algorithm vomits, though, it is because some later function required a capability that got dropped in the middle.

For example:

some_array.filter.sort

would vomit because sort needs random access, but filter drops that. So like sort says "give me the second element" but filter doesn't know what the second element is until it actually processes the sequence - it might filter out ALL the elements and it has no way of knowing if anything is left until it actually performs the filter.

And since all these algorithms are lazy, it puts off actually performing anything until it has some idea what the end result is supposed to be.

The frequent advice here is to stick ".array" in the middle, which performs the operation up to that point and puts the result in a freshly-created array. This works, but it also kinda obscures why it is there and sacrifices the high performance the lazy pipeline is supposed to offer, making it process intermediate data it might just discard at the next step anyway.

Rearranging the pipeline so the relatively destructive items are last can sometimes give better results. (But on the other hand, sorting 100,000 items when you know 99,000 are going to be filtered out is itself wasted time... so there's no one right answer.)
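A small sketch of the filter-then-sort case, assuming an int[] input and a made-up function name (sortedEvens). The commented-out line is the form that is rejected; the version below it puts .array in between:

```d
import std.algorithm : filter, sort;
import std.array : array;

// Hypothetical helper showing where .array has to go.
int[] sortedEvens(int[] data)
{
    // data.filter!(n => n % 2 == 0).sort();  // rejected: filter's result has no random access

    return data
        .filter!(n => n % 2 == 0)   // lazy, can only be walked front to back
        .array                      // run the filter, copy the survivors into an int[]
        .sort()                     // arrays are random access, so sort accepts them
        .release();                 // SortedRange hands back the underlying int[]
}

void main()
{
    assert(sortedEvens([5, 4, 3, 2, 1, 6]) == [2, 4, 6]);
}
```

Here filtering first means only the surviving elements get sorted, which is the ordering trade-off described above.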

anyway idk what's going on in your case. it could even just be a compile error in a predicate, like a typo'd name. it won't tell you, it just vomits up so much spam it could fill a monty python sketch.

> messages. (Using interfaces or, gasp, casts would break the TMI situation.)

i <3 interfaces

it is a pity to me cuz D's java-style OOP is actually pretty excellent. a few little things I'd fix if I could, a few nice additions I could dream up, but I'm overall pretty happy with it and its error messages are much better.

but too many people in the core team are allergic to classes. and i get it, classes do cost you some theoretical performance, and a lot of people's class hierarchies are hideous af, but hey they work and give pretty helpful errors. Most the time.
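On the interface route: Phobos has a class-based wrapper for ranges in std.range.interfaces, which gives the kind of type erasure asked about earlier in the thread. A minimal sketch, assuming int elements (makeStages is a made-up name). Each wrapped range costs a heap allocation, and each element access goes through a virtual call:

```d
import std.algorithm : equal, filter, map;
import std.range : iota;
import std.range.interfaces : InputRange, inputRangeObject;

// Hypothetical helper: two pipelines with different concrete types
// are erased to the same interface so they can share one array.
InputRange!int[] makeStages()
{
    InputRange!int tens = inputRangeObject(iota(0, 3).map!(n => n * 10));
    InputRange!int high = inputRangeObject(iota(0, 10).filter!(n => n > 7));
    return [tens, high];
}

void main()
{
    auto stages = makeStages();
    assert(stages[0].equal([0, 10, 20]));
    assert(stages[1].equal([8, 9]));
}
```

A consumer written against InputRange!int no longer changes type when an upstream lambda changes, at the price of the indirection.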

> better know all of std.traits and std.meta cause you're going to need them to implement a range-of-ranges consumer.

Write your function like it is Python or JavaScript - use the properties you want on an unconstrained template function.

void foo(T)(T thing) {
    // use thing.whatever
    // or thing[whatever]
    // or whatever you need
}

Even if that's a range of ranges:

void foo(T)(T thing) {
    foreach(range; thing) {
        foreach(item; range) {
            // use item
        }
    }
}

It will work if you actually get a range of ranges and if not, you get an error anyway. It isn't like the constraint ones are readable, so just let this fail where it may. (In fact, I find the non-constraint messages to be a little better! I'd rather see like "cannot foreach over range" than "no match for ")
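A runnable sketch of such an unconstrained consumer, assuming the items are ints; sumAll is a made-up name. The same function body accepts an array of arrays and a lazy range of lazy ranges:

```d
import std.algorithm : map;
import std.range : iota;

// Unconstrained template: no std.traits checks, it simply uses
// whatever it is given with two nested foreach loops.
int sumAll(T)(T rangeOfRanges)
{
    int total = 0;
    foreach (inner; rangeOfRanges)
        foreach (item; inner)
            total += item;
    return total;
}

void main()
{
    // Array of arrays.
    assert(sumAll([[1, 2], [3, 4]]) == 10);

    // Lazy range of lazy ranges: [0], [0, 1], [0, 1, 2].
    assert(sumAll(iota(1, 4).map!(n => iota(0, n))) == 4);
}
```

Passing something that cannot be iterated twice fails at the foreach inside sumAll rather than at a template constraint.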

I don't even think phobos benefits from its traits signatures. If you do it wrong it won't compile the same as if you do all the explicit checks.

But again, if you're doing some intermediate processing... try to use map, filter, fold, and friends... since doing the forwarding they do is legitimately complicated and my little foo consumers here don't even touch it.

idk.... maybe with the full code i could guess and check my way to something but i too lazy rn tbh.
