Jump to page: 1 2
Thread overview
D Ranges in C#
Jun 01, 2013
David Piepgrass
Jun 01, 2013
bearophile
Jun 01, 2013
Peter Alexander
Jun 01, 2013
David Piepgrass
Jun 01, 2013
Mehrdad
Jun 01, 2013
Paulo Pinto
Jun 01, 2013
David Piepgrass
Jun 01, 2013
renoX
Jun 02, 2013
Marco Leise
Jun 01, 2013
Szymon Gatner
Jun 08, 2013
Ali Çehreli
June 01, 2013
I'm adding D-style ranges to my new C# collections library. In case anyone would like to comment, please see here:

http://loyc-etc.blogspot.ca/2013/06/d-style-ranges-in-c-net.html
June 01, 2013
David Piepgrass:

> http://loyc-etc.blogspot.ca/2013/06/d-style-ranges-in-c-net.html

From the article:

>Ranges are an improvement on the C++ concept of iterators. I don't exactly know how ranges were invented in D, but perhaps someone noticed that most of the C++ STL algorithms require pairs of iterators:<

Ranges are an improvement over iterators about as much as oxygen molecules are an "improvement" over oxygen atoms. You breath mostly O2, and it allows you to live, but they are made of atomic oxygen. Atomic oxygen is more fundamental, it is here to stay (as Alexander Stepanov says), and it has "capabilities" that molecules don't have. In your normal life you can think to use only molecular oxygen, but sometimes you can even desire some ozone :-) And in chemistry reactions that happen inside you all the time, like oxidation, the oxygen molecules usually break apart to react.


>In fact, most STL algorithms require exactly two iterators--a range--and none require only a single iterator<

I think there are some C++ data structures that store many single iterators. If you instead store ranges you double the data amount.

Bye,
bearophile
June 01, 2013
On Saturday, 1 June 2013 at 08:12:00 UTC, bearophile wrote:
> David Piepgrass:
>>In fact, most STL algorithms require exactly two iterators--a range--and none require only a single iterator<
>
> I think there are some C++ data structures that store many single iterators. If you instead store ranges you double the data amount.

Hashmaps would be the most common example. Usually implemented as a linked list of key-value pairs along with a vector of list iterators.

Iterators are also useful for constructing sub-ranges, which proves useful in the implementation of some algorithms. Writing std::next_permutation in D with ranges is quiet frustrating compared to C++.

https://github.com/D-Programming-Language/phobos/blob/master/std/algorithm.d#L10901
http://gcc.gnu.org/onlinedocs/gcc-4.6.2/libstdc++/api/a01045_source.html#l03619

Notice that in D you need to maintain a count of the elements popped off the back and then use takeExactly to retrieve that portion again. In C++ you just move the iterators around and create "ranges" from pairs of iterators as you need them.

When I implemented nextPermutation in D, I constantly felt as if I was fighting with ranges instead of working with them - I knew exactly what I wanted to do, but ranges only provide a roundabout way of doing it.
June 01, 2013
On 6/1/13 2:17 AM, David Piepgrass wrote:
> I'm adding D-style ranges to my new C# collections library. In case
> anyone would like to comment, please see here:
>
> http://loyc-etc.blogspot.ca/2013/06/d-style-ranges-in-c-net.html

http://www.reddit.com/r/programming/comments/1fgnyl/dstyle_ranges_in_c_net/

Andrei
June 01, 2013
On Saturday, 1 June 2013 at 06:17:30 UTC, David Piepgrass wrote:
> I'm adding D-style ranges to my new C# collections library. In case anyone would like to comment, please see here:
>
> http://loyc-etc.blogspot.ca/2013/06/d-style-ranges-in-c-net.html

Lol I am just preparing a blog post on D ranges in C++, a piece of code I did after Andrei's "Iterators must Go" presentation. It was not super usable before C++11 and auto but now is pretty cool thing to have.
June 01, 2013
>> David Piepgrass:
>>>In fact, most STL algorithms require exactly two iterators--a range--and none require only a single iterator<
>>
>> I think there are some C++ data structures that store many single iterators. If you instead store ranges you double the data amount.
>
> Hashmaps would be the most common example. Usually implemented as a linked list of key-value pairs along with a vector of list iterators.

In theory. But the .NET hashtables are implemented with an *array* of key-value pairs and an array of *indexes*. The former forms a virtual linked list that is more efficient than a traditional linked list, and the latter is more efficient than a vector of iterators (especially on x64, as the indexes can be 32-bit.)

> Iterators are also useful for constructing sub-ranges, which proves useful in the implementation of some algorithms. Writing std::next_permutation in D with ranges is quiet frustrating compared to C++.
>
> https://github.com/D-Programming-Language/phobos/blob/master/std/algorithm.d#L10901
> http://gcc.gnu.org/onlinedocs/gcc-4.6.2/libstdc++/api/a01045_source.html#l03619
>
> Notice that in D you need to maintain a count of the elements popped off the back and then use takeExactly to retrieve that portion again. In C++ you just move the iterators around and create "ranges" from pairs of iterators as you need them.
>
> When I implemented nextPermutation in D, I constantly felt as if I was fighting with ranges instead of working with them - I knew exactly what I wanted to do, but ranges only provide a roundabout way of doing it.

Hmm, very interesting. I suppose the problem with C++ iterators is that they are useless without external context: you can't increment or decrement one without also comparing it to begin() or end() in its container, which implies that the caller must manually keep track of which container it came from. Thus, an iterator is hardly an improvement over a plain-old integer index, the only advantages being

1. You can dereference it (*if* you can be sure it doesn't point to end())
2. Unlike an index, it's compatible with non-random-access data structures

But perhaps the iterator concept could be improved by being made self-aware: if the iterator "knew" and could tell you when it was at the beginning or end of its container. This would increase the storage requirement in some circumstances but not others. For example, an iterator to a doubly-linked list node can tell when it is at the beginning or end, but an iterator to a singly-linked list node can only tell when it is at the end. A pointer inside an array may or may not be able to tell if it is at the end depending on how arrays work, e.g. perhaps the way D heap arrays work would allow an array iterator, implemented as a simple pointer, to still know when it is safe to increment and decrement.

The simplest possible .NET array iterator is an array reference plus an index, and again the iterator can know when it is at the beginning or end of the array--except that if the iterator came from inside a range, it would be unaware of the range's boundaries, which may be smaller than the array's boundaries.
June 01, 2013
On Saturday, 1 June 2013 at 16:30:05 UTC, David Piepgrass wrote:
>>> David Piepgrass:
>>>>In fact, most STL algorithms require exactly two iterators--a range--and none require only a single iterator<
>>>
>>> I think there are some C++ data structures that store many single iterators. If you instead store ranges you double the data amount.
>>
>> Hashmaps would be the most common example. Usually implemented as a linked list of key-value pairs along with a vector of list iterators.
>
> In theory. But the .NET hashtables are implemented with an *array* of key-value pairs and an array of *indexes*. The former forms a virtual linked list that is more efficient than a traditional linked list, and the latter is more efficient than a vector of iterators (especially on x64, as the indexes can be 32-bit.)



Iterators are usually (but not always) faster, as they don't have an extra level of indirection involved -- the iterators tell you directly where to look in memory, whereas indices are useless without containers (or iterators) backing them.


You shouldn't be using 32-bit indices on x64, that defeats the whole point of x64.


> Hmm, very interesting. I suppose the problem with C++ iterators is that they are useless without external context: you can't increment or decrement one without also comparing it to begin() or end() in its container, which implies that the caller must manually keep track of which container it came from.


No, that's exactly the opposite of what happens.

The caller just accepts two iterators (like std::unique), so that it has the end iterator as well as the begin iterator.
June 01, 2013
Am 01.06.2013 20:23, schrieb Mehrdad:
> On Saturday, 1 June 2013 at 16:30:05 UTC, David Piepgrass wrote:
>>>> David Piepgrass:
>>>>> In fact, most STL algorithms require exactly two iterators--a
>>>>> range--and none require only a single iterator<
>>>>
>>>> I think there are some C++ data structures that store many single
>>>> iterators. If you instead store ranges you double the data amount.
>>>
>>> Hashmaps would be the most common example. Usually implemented as a
>>> linked list of key-value pairs along with a vector of list iterators.
>>
>> In theory. But the .NET hashtables are implemented with an *array* of
>> key-value pairs and an array of *indexes*. The former forms a virtual
>> linked list that is more efficient than a traditional linked list, and
>> the latter is more efficient than a vector of iterators (especially on
>> x64, as the indexes can be 32-bit.)
>
>
>
> Iterators are usually (but not always) faster, as they don't have an
> extra level of indirection involved -- the iterators tell you directly
> where to look in memory, whereas indices are useless without containers
> (or iterators) backing them.
>
>
> You shouldn't be using 32-bit indices on x64, that defeats the whole
> point of x64.

As of .NET 4.5, 64bit array indexes are supported as well.

http://msdn.microsoft.com/en-us/library/hh285054.aspx



June 01, 2013
>> You shouldn't be using 32-bit indices on x64, that defeats the whole
>> point of x64.
>
> As of .NET 4.5, 64bit array indexes are supported as well.
>
> http://msdn.microsoft.com/en-us/library/hh285054.aspx

Don't forget that we're talking about a *hashtable* here. If a .NET hashtable used 64-bit indexes (or pointers) it would require 8-12 bytes more memory per entry, specifically 32 bytes total, including overhead, if the key and value are 4 bytes each.

An in-memory hashtable that requires 64-bit indexes rather than 32 bits would have to contain over 4 billion entries which would take at least 128 GB of RAM, assuming 8 bytes for each key-value pair!!! In fact it's worse than that, as the dictionary grows by size-doubling and contains a certain amount of unused entries at the end.

No thanks, I'd rather save those 8 bytes and accept the 4 billion limit, if you don't mind.
June 01, 2013
On Saturday, 1 June 2013 at 18:23:31 UTC, Mehrdad wrote:
[cut]
>
> You shouldn't be using 32-bit indices on x64, that defeats the whole point of x64.

Some would disagree: have a look at x32: http://en.wikipedia.org/wiki/X32_ABI
Being able to use efficiently 64 bit doesn't mean that you *have* to use 64bit when 32bit is enough..
renoX
« First   ‹ Prev
1 2