Range Redesign: Empty Ranges (page 2)

March 06

Re: Range Redesign: Empty Ranges

Posted by Dom DiSc
in reply to Arafel

On Wednesday, 6 March 2024 at 08:56:02 UTC, Arafel wrote:
> On 05.03.24 17:41, Alexandru Ermicioi wrote:
>> On Tuesday, 5 March 2024 at 14:39:17 UTC, Dom DiSc wrote:
>>> Are there infinite ranges that are indeed not forward ranges?
>> 
>> /dev/random could be such a range.

If you have a source of real entropy, ok. But pseudo-random number generators have a seed, and with the same seed they always produce the same numbers.

> Or a range that returns captured data from a sensor.

Ok, that's a more convincing example. This is really hard to copy. It requires perfect knowledge in a deterministic universe :-)

March 06

Re: Range Redesign: Empty Ranges

Posted by Arafel
in reply to Dom DiSc

On 06.03.24 11:56, Dom DiSc wrote:
> 
> If you have a source of real entropy, ok. But pseudo-random number generators have a seed, and with the same seed they always produce the same numbers.

While technically true, you might not have access to the seed, so in practical terms it's a bit of a distinction without a difference.

For instance, in `/dev/[u]random` the seed is kept internally by the OS, and I don't think you have access to it... and even if you had, you wouldn't want to re-implement the kernel's PRNG, right?

March 06

Re: Range Redesign: Empty Ranges

Posted by Atila Neves
in reply to Jonathan M Davis

On Monday, 4 March 2024 at 21:29:40 UTC, Jonathan M Davis wrote:

> Okay. A few weeks ago, I made a post about potential design decisions when updating the range API in the next version of Phobos, and this is a continuation of that, albeit with a different set of problems. Specifically:
>
> [...]

I like T.init being empty. I thought I'd found a clever way to make this work for classes but apparently this crashes, much to my surprise:

class Class {
    bool empty() @safe @nogc pure nothrow scope const {
        return this is null;
    }
}

Class c;
assert(c.empty);

This, although supposedly equivalent, works fine:

class Class {}
bool empty(in Class c) @safe @nogc pure nothrow {
    return c is null;
}

Which might be a way out? I still think the first example should work.
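
For reference, a minimal sketch of how the free-function variant is used, assuming the function lives at module scope so that UFCS can find it. The `main` function and the second object `d` are only there for illustration.

```d
class Class {}

// Free function: it only compares the reference itself and never
// touches any member of the object, so a null argument is fine.
bool empty(in Class c) @safe @nogc pure nothrow
{
    return c is null;
}

void main()
{
    Class c;            // class references default to null
    assert(c.empty);    // UFCS rewrites this to empty(c)

    auto d = new Class;
    assert(!d.empty);
}
```

The call site looks the same as a member call, but no member lookup happens on the null reference.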

March 06

Re: Range Redesign: Empty Ranges

Posted by Ogi
in reply to Atila Neves

On Wednesday, 6 March 2024 at 12:20:40 UTC, Atila Neves wrote:
> I thought I'd found a clever way to make this work for classes but apparently this crashes, much to my surprise:
>
> class Class {
>     bool empty() @safe @nogc pure nothrow scope const {
>         return this is null;
>     }
> }
>
> Class c;
> assert(c.empty);

I’m surprised that the equivalent C++ code doesn’t crash (at least on my machine):

Class* c = nullptr;
assert(c->empty());

That’s still UB though.

March 06

Re: Range Redesign: Empty Ranges

Posted by Paul Backus
in reply to Atila Neves

On Wednesday, 6 March 2024 at 12:20:40 UTC, Atila Neves wrote:

> I like T.init being empty. I thought I'd found a clever way to make this work for classes but apparently this crashes, much to my surprise:
>
> class Class {
>     bool empty() @safe @nogc pure nothrow scope const {
>         return this is null;
>     }
> }
>
> Class c;
> assert(c.empty);

It crashes because it's attempting to access Class's vtable. If you make the method final, it works:

class Class {
    final bool empty() {
        return this is null;
    }
}

void main()
{
    Class c;
    assert(c.empty); // ok
}

March 06

Re: Range Redesign: Empty Ranges

Posted by Paul Backus
in reply to Jonathan M Davis

On Monday, 4 March 2024 at 21:29:40 UTC, Jonathan M Davis wrote:
> 2. The range API provides no way (other than fully iterating through a range) to get an empty range of the same type from a range unless the range is a random-access range.

Genuine question: what are the use-cases for this?

In general, the capabilities of ranges are designed to serve the needs of algorithms. Input ranges exist because single-pass iteration is all that's needed to implement algorithms like map, filter, and reduce. Random-access ranges exist because they're needed for sorting algorithms. And so on.

I'm not aware of any algorithm, or class of algorithm, that needs this specific capability you're describing. If such algorithms exist, we should be using them to guide our design here. If they don't...then maybe this isn't really a problem at all.

March 06

Re: Range Redesign: Empty Ranges

Posted by Steven Schveighoffer
in reply to Paul Backus

On Wednesday, 6 March 2024 at 14:18:50 UTC, Paul Backus wrote:

> On Monday, 4 March 2024 at 21:29:40 UTC, Jonathan M Davis wrote:
>> The range API provides no way (other than fully iterating through a range) to get an empty range of the same type from a range unless the range is a random-access range.
>
> Genuine question: what are the use-cases for this?
>
> In general, the capabilities of ranges are designed to serve the needs of algorithms. Input ranges exist because single-pass iteration is all that's needed to implement algorithms like map, filter, and reduce. Random-access ranges exist because they're needed for sorting algorithms. And so on.
>
> I'm not aware of any algorithm, or class of algorithm, that needs this specific capability you're describing. If such algorithms exist, we should be using them to guide our design here. If they don't...then maybe this isn't really a problem at all.

When you need to pass a specific range type, and you want to pass in an empty range of that type, how do you do it? There is no formal definition for "create an empty range T". It's not just algorithms, it's regular functions or types with specific parameter requirements.

A very logical mechanism is just to pass the init value, because 99% of the time, that is the same thing as an empty range. Except when it isn't. But if you aren't testing for that, then you don't realize it. Or if the code changes slightly so that it now involves a range that doesn't have that property, then all of a sudden code breaks.

We can add a new requirement that ranges that can be initialized as empty have some emptyRange static member or something like that. But I expect this is going to go down the same route as save, where really you should be calling save every time you copy a forward range, but in practice nobody does, since regular copying is 99% of the time the same thing. Likewise people will pass .init instead of .emptyRange because it's almost always the same thing. This is why I think it should just be formally stated that the init value of a non-infinite range should be empty, bringing into standard what people already do.
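
To make the `.init` versus `.emptyRange` distinction concrete, here is a small sketch. The types `Countdown` and `Once`, the helper `countElements`, and the exact shape of `emptyRange` (a static member function) are all hypothetical and only illustrate the idea; nothing like `emptyRange` exists in Phobos today.

```d
import std.range.primitives : isInputRange;

// A range whose .init happens to be empty.
struct Countdown
{
    int remaining;
    @property bool empty() const { return remaining <= 0; }
    @property int front() const { return remaining; }
    void popFront() { --remaining; }
}

// A range whose .init is NOT empty: the default state still has
// one element left to yield.
struct Once
{
    bool consumed; // false in .init
    @property bool empty() const { return consumed; }
    @property int front() const { return 42; }
    void popFront() { consumed = true; }

    // Hypothetical member along the lines of the emptyRange idea.
    static Once emptyRange() { return Once(true); }
}

static assert(isInputRange!Countdown && isInputRange!Once);

// Counts elements by iterating; works for any input range.
size_t countElements(R)(R r) if (isInputRange!R)
{
    size_t n;
    for (; !r.empty; r.popFront())
        ++n;
    return n;
}

void main()
{
    assert(countElements(Countdown.init) == 0);
    assert(countElements(Once.init) == 1);        // .init is not empty here
    assert(countElements(Once.emptyRange()) == 0);
}
```

Code that passes `R.init` as "the empty range" works for `Countdown` and silently misbehaves for `Once`, which is the failure mode described above.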

The only tricky aspect is ranges that are references (classes/pointers). Neither of those to me should be supported IMO, you can always wrap such a thing in a range harness.

But if people insist that we must have class ranges, I'd say an emptyRange property is in order.

-Steve

March 06

Re: Range Redesign: Empty Ranges

Posted by H. S. Teoh
in reply to Steven Schveighoffer

On Wed, Mar 06, 2024 at 04:47:02PM +0000, Steven Schveighoffer via Digitalmars-d wrote: [...]
> The only tricky aspect is ranges that are references (classes/pointers).  Neither of those to me should be supported IMO, you can always wrap such a thing in a range harness.
[...]

Every time this topic comes up, class-based ranges become the whipping boy of range design woes.  Actually, they serve an extremely important role: type erasure, which is critical when you have code like this:

	auto myRangeFunc(R,S)(R range1, S range2) {
		if (runtimeDecision()) {
			return range1;
		} else {
			return range2;
		}
	}

This generally will not compile because R and S are different types, even if both conform to a common underlying range API, like an input range. To remedy this situation, class-based range wrappers come to the rescue:

	auto myRangeFunc(R,S)(R range1, S range2) {
		if (runtimeDecision()) {
			return inputRangeObject(range1);
		} else {
			return inputRangeObject(range2);
		}
	}

Note that you can't use a struct wrapper here, because R and S have different ABIs; the only way to correctly forward range methods to R or S is to use overridden base class methods. IOW, the type erasure of R and S is unavoidable for this code to work.
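
A self-contained sketch of the wrapper approach follows. The name `pickRange`, the `bool` parameter standing in for `runtimeDecision()`, and the `int` element type are assumptions made for illustration. The sketch also spells out the return type as `InputRange!int` instead of `auto`, so that both branches share one static type.

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : filter, map;
import std.range : iota;
import std.range.interfaces : InputRange, inputRangeObject;

// Both branches return a class object that implements InputRange!int,
// so the concrete types R and S are erased from the return type.
InputRange!int pickRange(R, S)(bool useFirst, R range1, S range2)
{
    if (useFirst)
        return inputRangeObject(range1);
    else
        return inputRangeObject(range2);
}

void main()
{
    auto evens = iota(0, 10).filter!(n => n % 2 == 0);
    auto squares = iota(0, 5).map!(n => n * n);

    // The two source ranges really are different types.
    static assert(!is(typeof(evens) == typeof(squares)));

    assert(pickRange(true, evens, squares).equal([0, 2, 4, 6, 8]));
    assert(pickRange(false, evens, squares).equal([0, 1, 4, 9, 16]));
}
```

The cost is a heap allocation for the wrapper object and a virtual call per range primitive.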

//

Also, class-based ranges are sometimes necessary for practical reasons, like in this one program I had, that has a UFCS chain consisting of hundreds of components (not all written out explicitly in the same function, of course, but that's what the underlying instantiation looks like).  It generated ridiculously large symbols that crashed the compiler. Eventually the bug I filed prodded Rainer to implement symbol compression in DMD. But even then, the symbols were still ridiculously huge -- because the outermost template instantiation had to encode the types of every one of the hundreds of components.  As a result, symbol compression or not, compile times were still ridiculously slow because the compiler still had to copy those symbols around -- their length grew quadratically with every component added to the chain.

Inserting a call to .inputRangeObject in the middle of the chain dramatically cut down the size of the resulting symbols, because it effectively erased all the types preceding it, resulting in much saner codegen afterwards.


T

-- 
Leather is waterproof.  Ever see a cow with an umbrella?

March 06

Re: Range Redesign: Empty Ranges

Posted by Paul Backus
in reply to Steven Schveighoffer

On Wednesday, 6 March 2024 at 16:47:02 UTC, Steven Schveighoffer wrote:

> On Wednesday, 6 March 2024 at 14:18:50 UTC, Paul Backus wrote:
>> On Monday, 4 March 2024 at 21:29:40 UTC, Jonathan M Davis wrote:
>>> The range API provides no way (other than fully iterating through a range) to get an empty range of the same type from a range unless the range is a random-access range.
>>
>> Genuine question: what are the use-cases for this?
>
> When you need to pass a specific range type, and you want to pass in an empty range of that type, how do you do it?

By "specific range type", do you mean a specific category of range (input, forward, random-access, etc.), or a specific concrete type?

If the former, this is already easy to do with existing language and library features (for example, in many cases you can use an empty slice).

If the latter, then the question I'm asking is, why do you need to do that?

Personally, I can think of plenty of times that I've needed to do the first thing, but I can't think of a single time that I've needed to do the second thing.
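
As an illustration of the first case, here is a minimal sketch in which a function constrained only by range category receives an empty slice. The function name `total` and the restriction to `int` elements are assumptions chosen for the example.

```d
import std.algorithm.iteration : sum;
import std.range.primitives : ElementType, isInputRange;

// Accepts any input range of int; the caller is free to pick the type.
int total(R)(R values) if (isInputRange!R && is(ElementType!R == int))
{
    return values.sum;
}

void main()
{
    int[] none;                       // a null slice is an empty input range
    assert(total(none) == 0);
    assert(total((int[]).init) == 0); // same thing, spelled via .init
    assert(total([1, 2, 3]) == 6);
}
```

Because the parameter is a template, the caller never has to produce an empty value of some other concrete range type.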

> The only tricky aspect is ranges that are references (classes/pointers). Neither of those to me should be supported IMO, you can always wrap such a thing in a range harness.

The main thing you lose by dropping support for reference-type ranges is interfaces. In particular, the interface inheritance hierarchy in std.range.interfaces, where ForwardRange inherits from InputRange and so on, cannot really be replicated using structs (alias this only goes so far).

March 06

Re: Range Redesign: Empty Ranges

Posted by Paul Backus
in reply to H. S. Teoh

On Wednesday, 6 March 2024 at 17:32:17 UTC, H. S. Teoh wrote:
> Every time this topic comes up, class-based ranges become the whipping boy of range design woes.  Actually, they serve an extremely important role: type erasure, which is critical when you have code like this:
>
> 	auto myRangeFunc(R,S)(R range1, S range2) {
> 		if (runtimeDecision()) {
> 			return range1;
> 		} else {
> 			return range2;
> 		}
> 	}
>
> [...]
>
> Note that you can't use a struct wrapper here, because R and S have different ABIs; the only way to correctly forward range methods to R or S is to use overridden base class methods. IOW, the type erasure of R and S is unavoidable for this code to work.

This is what std.range.choose [1] is for. Internally, it's implemented as a tagged union of R and S.

[1] https://dlang.org/library/std/range/choose.html
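
A short sketch of `choose` applied to the same kind of situation. The function name `pick` and the two example ranges are made up for illustration; the `bool` parameter stands in for `runtimeDecision()`.

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : filter, map;
import std.range : choose, iota;

// Returns one of two differently typed ranges, selected at run time.
// The result is a single struct type that wraps both, so `auto` works.
auto pick(bool useFirst)
{
    auto evens = iota(0, 10).filter!(n => n % 2 == 0);
    auto squares = iota(0, 5).map!(n => n * n);
    return choose(useFirst, evens, squares);
}

void main()
{
    assert(pick(true).equal([0, 2, 4, 6, 8]));
    assert(pick(false).equal([0, 1, 4, 9, 16]));
}
```

Unlike the class-based wrapper, this needs no allocation, but the concrete types of both ranges remain part of the result type.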
