Improving std.range.Zip

Apologies for the late answer (the message is from 2009!). This is about David Simcha's thoughts on improving std.range.Zip. In spite of its lateness, the response is rather timely for the ongoing discussion of lockstep().

Zip was indeed a kludge because it required algorithms to be modified to recognize it. Now I think we have a general method for expressing types that should be assignable and swappable, yet don't expose references.

The problem
===========

We want to define read and write methods for e.g. the front of a range. In addition we also want to swap a front of a range with another object of the same type without incurring unnecessary copies. This is because copying objects can be arbitrarily expensive due to this(this).

If all ranges exposed references, that all would be simple because access to objects is immediate. Yet some ranges don't want to expose references to their internals. I define below such ranges as "sealed ranges". Example of non-sealed range:

struct NonSealed {
     @property bool empty();
     @property ref Widget front();
     void popFront();
}

struct Sealed {
     @property bool empty();
     @property Widget front();
     void popFront();
}

Solution
========

Sealed ranges must define additional members as follows:

struct Sealed {
     @property bool empty();
     @property Widget front();
     @property void front(Widget rhs);
     @property Widget moveFront(Widget rhs);
     void popFront();
}

Similarly, if a range offers back() it should also offer back(Widget) and moveBack(). Finally, if a range offers opIndex(), it should also offer opIndeAssign(), opIndexOpAssign(), and Widget moveAt(size_t).

These three routines are all that's needed for implementing a host of efficient algorithms, starting of course with swap. I've already changed std.algorithm.swap and many other functions in std.algorithm to support this new concept. (I think most of these changes are not yet committed.) I haven't yet updated e.g. Zip.

I have followed a very similar pattern in std.container, which I'll check in soon.

Any comments, please chime in.

Andrei

On 11/13/2009 11:08 AM, David Simcha wrote:
> I was thinking about std.range.Zip and ways to improve it.  It would be really nice if we could give Zip assignable elements because right now, sorting of it works by a bunch of special-case ad-hockery. Furthermore, Andrei has mentioned the need for a stable O(N log N) sort in Phobos. The most obvious (and possibly the only) algorithm is a merge sort, but this can only be done (AFAIK) if you have assignable elements, not just swappable elements. Using some magic of postblits and opAssign, I think we can get assignable elements to work with Zip. First of all, there should be two kinds of Proxy structs: A value one and a reference one. This information would be stored in a flag in each instance. The compile time type would be the same. The values and the pointers would be stored together in a union. It would look something like this:
>
> struct Proxy {
> union {
> staticMap!(pointers, R) ptrs;
> R vals;
> }
>
> bool isReference; // Always true if this is stored inside a Zip.
> this(this) {
> if(isReference) {
> // Overwrite pointers with values.
> isReference = false;
> }
> }
>
> void opAssign(ref Proxy rhs) {
> if(isReference) {
> // Assign the values of rhs.at!(0)...
> // rhs.at!(rhs.length) to the deref
> // ptrs.
> } else {
> // Assign rhs's values to vals.
> }
> }
>
> auto at(int index)() {
> if(isReference) {
> return *(ptrs[index]);
> } else {
> return vals[index];
> }
> }
> }
>
> Then, the standard swap algorithm would work. Assuming random access:
>
> auto temp = myZip[index1]; // Postblit makes this a value proxy. myZip[index1] = myZip[index2]; // Writes to dereferences in myZip[index1]. myZip[index2] = temp; // Writes to dereferences in myZip[index2].
>
> Furthermore, you could store Proxy structs in an array as a buffer, etc. and it would (I think) just work. Yes, there would be some performance cost to all this branching and postblitting, but IMHO it's worth it to make Zip work the way that people who don't understand the implementation details would expect it to work.
>
> The only problem I see is uninitialized Proxy objects:
>
> Proxy proxy = void;
> proxy = myZip[5];

Forums