Jump to page: 1 2
Thread overview
InputRange for data structure without order?
Jun 03
monkyyy
Jun 04
Monkyyy
Jun 04
monkyyy
Jun 04
monkyyy
June 03

I have a Set data structure which has no concept of order; its members are stored and can be searched efficiently based on their hash.

I have a situation where chain()'ing them together would be convenient, but InputRange requires front() and popFront(). There really isn't a front, and creating a new Set with an element removed would be unpleasant.

What would be the best way to iterate across a list of such Set's? For now I'm just going to write the explicit loop.

Thanks!
Andy

June 03

On Tuesday, 3 June 2025 at 16:03:15 UTC, Andy Valencia wrote:

>

I have a Set data structure which has no concept of order; its members are stored and can be searched efficiently based on their hash.

I have a situation where chain()'ing them together would be convenient, but InputRange requires front() and popFront(). There really isn't a front, and creating a new Set with an element removed would be unpleasant.

What would be the best way to iterate across a list of such Set's? For now I'm just going to write the explicit loop.

Thanks!
Andy

import std;
struct myset{
	bool has3;
	bool has5;
	auto opBinary(string s:"+")(typeof(this) a){
		return myset(has3||a.has3,has5||a.has5);
	}
	alias chain=opBinary!"+";
	int front(){
		if(has3){ return 3;}
		if(has5){ return 5;}
		assert(0);
	}
	void popFront(){
		if(has3){has3=false; return;}
		has5=false;
	}
	bool empty()=>( ! has3) && ( ! has5);
}
myset toset(R)(R r){
	myset o;
	foreach(i;r){
		if(i==3){o.has3=true;}
		if(i==5){o.has5=true;}
	}
	return o;
}
unittest{
	myset a,b;
	a.has3=true;
	b.has5=true;
	(a+b).writeln;
	
	a.chain(a).chain(a).writeln;
	b.chain(b).chain(b).writeln;
	a.chain(b).writeln;
	
	auto c=a+b;
	c.map!(a=>a*2).writeln;
	iota(10).filter!(a=>a==3).toset.chain(
		iota(10).filter!(a=>a!=3).toset)
		.writeln;
}

You can overload the names inside the datastructure and eagerly lower it as needed
if you need it in something nested I have some more workarounds, but they start to suck.

June 03
On 6/3/25 9:03 AM, Andy Valencia wrote:
> I have a Set data structure which has no concept of order; its members
> are stored and can be searched efficiently based on their hash.
>
> I have a situation where chain()'ing them together would be convenient,
> but InputRange requires front() and popFront(). There really _isn't_ a
> front

front() does not imply an order. It will work as long as you can provide the elements in a sequence. The built-in associative array feature is an example:

import std.stdio;
import std.range;

void main() {
    int[int] squares;
    foreach (i; 0 .. 10) {
        squares[i] = i * i;
    }

    writeln("keys  : ", squares.byKey);
    writeln("values: ", squares.byValue);
}

The elements are not in any particular order, yet associative arrays provide InputRanges (byKey and byValue return InputRange objects.):

keys  : [0, 6, 7, 2, 3, 1, 8, 5, 4, 9]
values: [0, 36, 49, 4, 9, 1, 64, 25, 16, 81]

Ali

June 03
On Tuesday, 3 June 2025 at 20:51:05 UTC, Ali Çehreli wrote:
> front() does not imply an order. It will work as long as you can provide the elements in a sequence. The built-in associative array feature is an example:

Ok, but now the data structure is told to popFront().  I completely see how this would work for, say, slices.  But when hashed, arbitrarily large, and unordered?  Not so much.

(I know how to code this walker using generators, but that does not play with the Range oriented API's so far as I can tell.)

Thanks,
Andy

June 03
On Tue, Jun 03, 2025 at 11:54:20PM +0000, Andy Valencia via Digitalmars-d-learn wrote:
> On Tuesday, 3 June 2025 at 20:51:05 UTC, Ali Çehreli wrote:
> > front() does not imply an order. It will work as long as you can provide the elements in a sequence. The built-in associative array feature is an example:
> 
> Ok, but now the data structure is told to popFront().  I completely see how this would work for, say, slices.  But when hashed, arbitrarily large, and unordered?  Not so much.
> 
> (I know how to code this walker using generators, but that does not play with the Range oriented API's so far as I can tell.)
[...]

In general, it's a bad idea to conflate the container with the range over its elements.  Built-in arrays are a bad example in this case, because they are both, and it just so happened that they work OK as such. But in general, containers should NOT be conflated with ranges. That only leads to wrong design.

The correct way to do this is to create a method in the container that returns a range over its contents.  Popping the range should NOT mutate the container.  It should be regarded as something separate from the container.  Only then will you get sane semantics.  Trying to conflate an unordered container with a range will only lead to pain and bugs.


T

-- 
You have to expect the unexpected. -- RL
June 04

On Wednesday, 4 June 2025 at 02:11:18 UTC, H. S. Teoh wrote:

>

The correct way to do this is to create a method in the container that returns a range over its contents. Popping the range should NOT mutate the container. It should be regarded as something separate from the container. Only then will you get sane semantics. Trying to conflate an unordered container with a range will only lead to pain and bugs.

So, concretely, my container has an opApply method, and because of that foreach works to iterate its contents. Imagine that I now want to chain() across several of these, along the lines of:

    foreach(val; s1) {
        ...;
    }

    // But now:
    foreach(val; chain(s1, s2, s3)) {
        ...;
    }

What API do I add to my container so that chain will do its magic? I don't see the infrastructure under std.range.chain using it. Or is there a non-Range chain I haven't found yet?

Thank you again,
Andy

June 04
On Tuesday, June 3, 2025 9:39:35 PM Mountain Daylight Time Andy Valencia via Digitalmars-d-learn wrote:
> On Wednesday, 4 June 2025 at 02:11:18 UTC, H. S. Teoh wrote:
> > The correct way to do this is to create a method in the container that returns a range over its contents.  Popping the range should NOT mutate the container.  It should be regarded as something separate from the container.  Only then will you get sane semantics.  Trying to conflate an unordered container with a range will only lead to pain and bugs.
>
> So, concretely, my container has an opApply method, and because of that foreach works to iterate its contents.  Imagine that I now want to chain() across several of these, along the lines of:
>
> ```d
>      foreach(val; s1) {
>          ...;
>      }
>
>      // But now:
>      foreach(val; chain(s1, s2, s3)) {
>          ...;
>      }
> ```
>
> What API do I add to my container so that chain will do its magic?  I don't see the infrastructure under std.range.chain using it.  Or is there a non-Range chain I haven't found yet?

chain chains two ranges with the same element type. opApply doesn't really have anything to do with ranges. opApply was the way to give a type the ability to be used with foreach in D1, and it was left in for D2. You can still use it, but in general, the preferred approach in D2 is to use ranges.

Typically, a container will implement opSlice (or opIndex, since confusingly, both can be used in this case) and have it return a range over the container. So, then you can do

    auto range = myContainer[];

and then iterate over the range. You can then also do

    foreach(e; myContainer[])
    {
        ...
    }

but in addition to foreach understanding the range API, it also will slice what it's given if that type can be sliced with no arguments. So,

    foreach(e; myContainer)
    {
        ...
    }

will have the same semantics if myContainer implements opSlice with no arguments, and its opSlice returns a range.

For the range API to work, a type needs to implement

    bool empty();
    T front();
    void popFront();

and ideally it would also implement

    typeof(this) save();

so that you can get a copy of the range and thus save the current iteration state.

If you look at std.container, each container type follows this pattern. It implements opSlice which then returns a range over the container.

You can then use the range over the container with range-based functions such as chain.

- Jonathan M Davis




June 04
On Wednesday, 4 June 2025 at 02:11:18 UTC, H. S. Teoh wrote:
> . But in general, containers should NOT be conflated with ranges. That only leads to wrong design.
>
> range should NOT mutate the container.  It should be regarded as something separate from the container.

Not "general". It's a safety vs speed tradeoff. Immutable whatever's data structure do just make allot of copies or allot of pointer overhead and indirection.
June 04
On Wednesday, 4 June 2025 at 10:47:24 UTC, Jonathan M Davis wrote:
> Typically, a container will implement opSlice (or opIndex, since confusingly, both can be used in this case) and have it return a range over the container. So, then you can do
>
>     auto range = myContainer[];
>...

Brilliant, thank you.  Now I know what I'm looking for, and I see the relevance of all the bits and pieces I'll need.

Andy

June 04
On 6/4/25 4:10 AM, Monkyyy wrote:
> On Wednesday, 4 June 2025 at 02:11:18 UTC, H. S. Teoh wrote:
>> . But in general, containers should NOT be conflated with ranges. That
>> only leads to wrong design.
>>
>> range should NOT mutate the container.  It should be regarded as
>> something separate from the container.
>
> Not "general". It's a safety vs speed tradeoff. Immutable whatever's
> data structure do just make allot of copies or allot of pointer overhead
> and indirection.

I don't understand. Like H. S. Teoh said, a range over a container should not alter the container. I'm not aware of any data structure where that's not the case.

If a container has a way of accessing the elements (which it definitely has to because otherwise it's not really a container), then a light-weight range can be specified over the elements.

Ali

« First   ‹ Prev
1 2