Iterators and Ranges: Comparing C++ to D to Rust (page 3)

Settings

Help

Index » General » Iterators and Ranges: Comparing C++ to D to Rust (page 3)

June 14, 2021

Re: Iterators and Ranges: Comparing C++ to D to Rust

Posted by Ola Fosheim Grøstad
in reply to Steven Schveighoffer

Permalink

Ola Fosheim Grøstad

Posted in reply to Steven Schveighoffer

Permalink

On Monday, 14 June 2021 at 17:19:35 UTC, Steven Schveighoffer wrote:
> Oh for sure. Generalizing the relationships that nodes have would be pretty difficult. Trees need 2 pointers at least (possibly a parent pointer as well), whereas a linked list only needs one (if it's not bidirectional).

Yes, my idea is to keep the chunk-header within two cache lines (or one), but there should be auxillary space there for the container as well. So when you insert a chunk-list into a new container there is meta-info about the nature of the data (are all chunks filled up, data sorted etc), so the container can then decide whether to run over the chunks and "fix" the structure/invariants. So if you take a singled linked list and insert it into a double linked one with a hash index, then the double linked list will run over it and update backpointers and stuff in key-hash-values.

However, in my next attempt at this I will try to make it SIMD friendly, I think. So using a struct-of-arrays with maybe a bitmask that says whether a slot is occupied. So each chunk would have a 32, 64 or 256 element capacity or something like that.

Then maybe a table pointer would be a pointer to the chunk + a bitmask? But then it makes more sense to use a ranges approach. So you point to first chunk with bitmask and the last chunk with bitmask. So pointing to the end would be pointing to the last chunk with a zero-bitmask? Not sure… :-D

It is funny how implementation choices for containers changes how one thinks about table pointers/iterators/ranges.

> Unless you allocate nodes for the actual spaces, you have to deal with this kind of stuff. I felt like pointing at an element, and then storing whether that data is valid or not was a pretty reasonable way to have element pointers without safety problems.

Yes, but how does it affect efficiency?

It really depends on how they are used, for instance, if you provide all the algorithms yourself, then inefficent table-pointers/cursors can be turned into internal raw pointers and then it does not matter as much (as they only are begin/end markers that are mutated infrequently).

> My original use case for making dcollections was to have a fast-lookup map (using RBTree), and be able to remove items as I iterate it the tree. Using ranges for this just doesn't make sense in the general case, but using cursors works well.

Yes. I am concerned about how a struct-of-arrays datastructure approach affects these kind of considerations.

Of course, there is also the question of whether the LDC/GDC backend are able to compile to bitmasked SIMD instructions (bitmasks that says whether an element is present or not)?? I wonder if that is at all possible or if I will have to write explicit SIMD code?

June 14, 2021

Re: Iterators and Ranges: Comparing C++ to D to Rust

Posted by Ola Fosheim Grøstad
in reply to Ola Fosheim Grøstad

Permalink

Ola Fosheim Grøstad

Posted in reply to Ola Fosheim Grøstad

Permalink

On Monday, 14 June 2021 at 17:49:46 UTC, Ola Fosheim Grøstad wrote:
> Then maybe a table pointer would be a pointer to the chunk + a bitmask? But then it makes more sense to use a ranges approach. So you point to first chunk with bitmask and the last chunk with bitmask. So pointing to the end would be pointing to the last chunk with a zero-bitmask? Not sure… :-D

No. I think I cannot have empty ranges to signify the beginning or end. Since there might be only one chunk in the container. So then there would be an ambiguity as to whether I point to the empty space at the beginning or the end.

So probably better to have two types: bitmasked ranges, and explicitly indexed table pointers. Perhaps?

June 14, 2021

Re: Iterators and Ranges: Comparing C++ to D to Rust

Posted by IGotD-
in reply to Ola Fosheim Grøstad

Permalink

IGotD-

Posted in reply to Ola Fosheim Grøstad

Permalink

On Monday, 14 June 2021 at 13:45:14 UTC, Ola Fosheim Grøstad wrote:
>
> E.g. his take on StopIteration in Python is not really correct either, you can write a generator (coroutine) with no explicit StopIteration (although the semantics implies a raised exception, but that does not mean the implementation has to).

How do we classify the iterators of Nim?

https://nim-lang.org/docs/tut1.html#iterators

It looks similar to Python and Rust. Also notice how it can use yield like a coroutine.

June 14, 2021

Re: Iterators and Ranges: Comparing C++ to D to Rust

Posted by Ola Fosheim Grøstad
in reply to IGotD-

Permalink

Ola Fosheim Grøstad

Posted in reply to IGotD-

Permalink

On Monday, 14 June 2021 at 22:56:29 UTC, IGotD- wrote:
> How do we classify the iterators of Nim?
>
> https://nim-lang.org/docs/tut1.html#iterators
>
> It looks similar to Python and Rust. Also notice how it can use yield like a coroutine.

Yes, but the page says that they have to be inlined and used in for-loops only? But if they are made more general then they would be generator coroutines like Python and C++.

June 15, 2021

Re: Iterators and Ranges: Comparing C++ to D to Rust

Posted by Araq
in reply to Ola Fosheim Grøstad

Permalink

Araq

Posted in reply to Ola Fosheim Grøstad

Permalink

On Monday, 14 June 2021 at 23:08:40 UTC, Ola Fosheim Grøstad wrote:
> On Monday, 14 June 2021 at 22:56:29 UTC, IGotD- wrote:
>> How do we classify the iterators of Nim?
>>
>> https://nim-lang.org/docs/tut1.html#iterators
>>
>> It looks similar to Python and Rust. Also notice how it can use yield like a coroutine.
>
> Yes, but the page says that they have to be inlined and used in for-loops only? But if they are made more general then they would be generator coroutines like Python and C++.

There is also a variant called "closure" iterators which is Nim's foundation for async programming. Nim's inline iterators don't compose particularly well, but they get the job done and there are all sorts of workarounds and 3rd party packages. We couldn't copy C++ or D's designs because they are fundamentally unsafe (iterator invalidation is complex and expensive). Rust can do it easily thanks to its borrow checker. We're slowly catching up with our support for "view types".

June 15, 2021

Re: Iterators and Ranges: Comparing C++ to D to Rust

Posted by Ola Fosheim Grøstad
in reply to Araq

Permalink

Ola Fosheim Grøstad

Posted in reply to Araq

Permalink

On Tuesday, 15 June 2021 at 07:22:24 UTC, Araq wrote:
> We couldn't copy C++ or D's designs because they are fundamentally unsafe (iterator invalidation is complex and expensive).

Yes, but C++20 coroutines are completely different though.

My understanding is that the compiler analyse the coroutine function and breaks it up into many functions so that the stack is empty when it hits ```co_yield```, then the state is synthesised into an object (I guess you could call it a closure).

Is this what Nim closure-iterators do?

June 15, 2021

Re: Iterators and Ranges: Comparing C++ to D to Rust

Posted by Petar Kirov [ZombineDev]
in reply to Araq

Permalink

Petar Kirov [ZombineDev]

Posted in reply to Araq

Permalink

On Tuesday, 15 June 2021 at 07:22:24 UTC, Araq wrote:

On Monday, 14 June 2021 at 23:08:40 UTC, Ola Fosheim Grøstad wrote:

On Monday, 14 June 2021 at 22:56:29 UTC, IGotD- wrote:

How do we classify the iterators of Nim?

https://nim-lang.org/docs/tut1.html#iterators

It looks similar to Python and Rust. Also notice how it can use yield like a coroutine.

Yes, but the page says that they have to be inlined and used in for-loops only? But if they are made more general then they would be generator coroutines like Python and C++.

There is also a variant called "closure" iterators which is Nim's foundation for async programming. Nim's inline iterators don't compose particularly well, but they get the job done and there are all sorts of workarounds and 3rd party packages.

In addition to Nim's inline and closure iterators, which if I understand correctly, model input iteration (not sure about forward), is there an at least somewhat standardized interface for implementing iterators manually that does support backward iteration and random access? What do the most prominent Nim libraries do?

One of the things that I don't like about languages that I use that have built-in iterators like C# and JavaScript/TypeScript is that pretty much their whole ecosystems only cares about input iteration and things that in D are trivially O(1) like array.map!foo[$/2 .. $].retro.map!bar.length end up being unexpected performance pitfalls in practice in these languages.

We couldn't copy C++ or D's designs because they are fundamentally unsafe (iterator invalidation is complex and expensive). Rust can do it easily thanks to its borrow checker.

Heh, when I guess many people having read the the infamous "fearless concurrency" blog post believe that iterator invalidation is a big deal and that languages really ought to prevent it statically and now demand it from language maintainers (which is a good thing™). That said, no matter how easy is to show that innocent looking code can end up with an iterator invalidation bug, in practice I don't remember ever experiencing it in my D code. Resisting the urge to mutate a container while you're iterating is really not that hard :D Perhaps iterator invalidation problems are more common with imperative-style raw loops code, rather that FP-style, which I feel is much better supported by D's standard library, than the standard libraries of other languages like C++, C# and JS.

That is not to say that I wouldn't want my libraries to be fool-proof against that kind of problems, but at same time, at least in my experience with D it hasn't been a big deal, so I'd much rather have a powerful algorithmic API foundation than prevent very useful and beautiful designs, just because they can be misused.

We're slowly catching up with our support for "view types".

Interesting, can you elaborate more?

June 15, 2021

Re: Iterators and Ranges: Comparing C++ to D to Rust

Posted by Paulo Pinto
in reply to Ola Fosheim Grøstad

Permalink

Paulo Pinto

Posted in reply to Ola Fosheim Grøstad

Permalink

On Tuesday, 15 June 2021 at 08:04:13 UTC, Ola Fosheim Grøstad wrote:
> On Tuesday, 15 June 2021 at 07:22:24 UTC, Araq wrote:
>> We couldn't copy C++ or D's designs because they are fundamentally unsafe (iterator invalidation is complex and expensive).
>
> Yes, but C++20 coroutines are completely different though.
>
> My understanding is that the compiler analyse the coroutine function and breaks it up into many functions so that the stack is empty when it hits ```co_yield```, then the state is synthesised into an object (I guess you could call it a closure).
>

It is a bit more elaborated than that.

I suggest a carefull reading of all these posts to get a good overview of all use cases.

https://devblogs.microsoft.com/oldnewthing/20210504-01/?p=105178

June 15, 2021

Re: Iterators and Ranges: Comparing C++ to D to Rust

Posted by Ola Fosheim Grøstad
in reply to Paulo Pinto

Permalink

Ola Fosheim Grøstad

Posted in reply to Paulo Pinto

Permalink

On Tuesday, 15 June 2021 at 08:36:16 UTC, Paulo Pinto wrote:
> It is a bit more elaborated than that.
>
> I suggest a carefull reading of all these posts to get a good overview of all use cases.
>
> https://devblogs.microsoft.com/oldnewthing/20210504-01/?p=105178

That was longwinded... about async/await like behaviour?

A short summary of C++20 coroutines:

https://en.cppreference.com/w/cpp/language/coroutines

June 15, 2021

Re: Iterators and Ranges: Comparing C++ to D to Rust

Posted by Ola Fosheim Grøstad
in reply to Petar Kirov [ZombineDev]

Permalink

Ola Fosheim Grøstad

Posted in reply to Petar Kirov [ZombineDev]

Permalink

On Tuesday, 15 June 2021 at 08:22:07 UTC, Petar Kirov [ZombineDev] wrote:

That said, no matter how easy is to show that innocent looking code can end up with an iterator invalidation bug, in practice I don't remember ever experiencing it in my D code. Resisting the urge to mutate a container while you're iterating is really not that hard :D

I agree. If D wants to be a system level programming language it should take some of the disadvantages that comes with making it easy to write performant code.

It would be better to focus on making it easier to write static analysis linters for D, by making it possible to emit a high level IR after templates have been expanded.

There are tools that take a "generic programming language" as input and makes it possible to write various custom checks over that language. So what is needed is a way to translate template-expanded D code to such "verification languages".

Limited resources do suggest that one should leverage existing generic tools.

Lots of unused opportunities there, I think.

Top | Forum index | About this forum

Forums