Improving std.range.Zip (page 2)

October 25, 2010
Re: Improving std.range.Zip
Posted by bearophile
in reply to Tomek Sowiński
Tomek S.:

> You're right. Still, modularity is a valid point.

This is a general and interesting design topic. The short answer is that a real language is designed taking into account different trade-offs.

Modularity is important, and Andrei surely likes it. Modularity means the ability to combine small parts to create many different complex systems. To do this the subsystems must interact clearly with each other, they must lack too many corner cases, and of course they need a common flexible interface. As I have recently read, in a modular system "the surprises will be pleasant ones".

On the other hand modularity has some disadvantages too. To be clean, the interfaces among subsystems must often be high-level, which means they may not be fully efficient (unless your compiler is very smart, like the heavily optimizing Stalin Scheme compiler). Another disadvantage is that you have to build common higher order systems over and over again, and sometimes you need some brain and time to build them well.

To avoid this last problem a well designed system usually offers you pre-built higher order systems, made with the elementary components, that you may just use. In some situations very common usage patterns deserve to be encapsulated into handy forms.

So, if you have one of the most used higher order functions (map), if it is often enough used with mapping functions that take their arguments from more than one range of items, and if the current way to perform this operation from elementary components produces a not so readable syntax, then practicality asks for a less modular but more handy design of map :-)
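
The two spellings can be compared in Python, which is used here only because this post already uses it for comparison; this is not Phobos code. Python's built-in map accepts several iterables directly, while the modular spelling composes zip with a function that unpacks each tuple. The names xs, ys and add are invented for the illustration.

```python
# Illustrative sketch only: xs, ys and add are hypothetical names.
xs = [1, 2, 3]
ys = [10, 20, 30]


def add(x, y):
    return x + y


# Modular spelling: zip the inputs, then unpack each tuple by hand.
modular = list(map(lambda pair: add(pair[0], pair[1]), zip(xs, ys)))

# Handy spelling: map itself takes more than one iterable.
handy = list(map(add, xs, ys))

print(modular)  # [11, 22, 33]
print(handy)    # [11, 22, 33]
```

Both lines compute the same thing; the second one just hides the zip and the tuple unpacking inside map.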

A few more examples of this trade-off below.

----------------

I have even (weakly) asked for amap()/afilter() functions, that are just array(map(...)) and array(filter(...)), because those are very common patterns in my code. I am still not sure adding amap/afilter to Phobos is a good idea: it just reduces the number of parentheses a little, so it decreases noise in the code.
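
A minimal sketch of the same idea in Python, where list(...) plays the role of array(...). The helper names amap and afilter are copied from the proposal above and are hypothetical here; they are not part of any standard library.

```python
# Hypothetical helpers mirroring the amap()/afilter() idea.
def amap(func, items):
    """Eager map: the same as list(map(func, items))."""
    return list(map(func, items))


def afilter(pred, items):
    """Eager filter: the same as list(filter(pred, items))."""
    return list(filter(pred, items))


squares = amap(lambda x: x * x, [1, 2, 3, 4])
evens = afilter(lambda x: x % 2 == 0, [1, 2, 3, 4])

print(squares)  # [1, 4, 9, 16]
print(evens)    # [2, 4]
```

Each helper saves only one pair of parentheses at the call site, which is why the gain is small.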

----------------

Recently in Bugzilla I have asked for sorted()/schwartzSorted(). In this case my desire is stronger, because they save a bit more than just a few chars: they are expressions, so they can be nested inside other expressions:
http://d.puremagic.com/issues/show_bug.cgi?id=5076
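
What an expression-style Schwartzian sort looks like can be sketched in Python. The function name schwartz_sorted is hypothetical; Python's built-in sorted(items, key=...) already computes each key once, so the hand-written version below only spells out the decorate/sort/undecorate steps.

```python
def schwartz_sorted(items, key):
    """Return a new sorted list, computing key(item) once per item."""
    # Decorate: pair each item with its key. The position breaks ties,
    # so the items themselves are never compared.
    decorated = [(key(item), i, item) for i, item in enumerate(items)]
    decorated.sort()
    # Undecorate: drop the keys and positions again.
    return [item for _, _, item in decorated]


words = ["pear", "fig", "banana"]
by_length = schwartz_sorted(words, len)

print(by_length)  # ['fig', 'pear', 'banana']
print(words)      # ['pear', 'fig', 'banana'] (the input is not modified)

# Because it returns its result, it can sit inside a larger expression.
longest = schwartz_sorted(words, len)[-1]
print(longest)    # banana
```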

----------------

A similar thread has recently come up in the D.learn newsgroup: http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.learn&article_id=22413

This person has asked for a way to remove the first item in an array given not the item index (found with indexOf), but the item itself.

Python 2 too has those two different ways to remove an item from a list (array), by index and by item:

>>> arr = ["a", "b", "c", "d"]
>>> arr.remove("x")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: list.remove(x): x not in list
>>> arr.index("c")
2
>>> arr.remove("c")
>>> arr
['a', 'b', 'd']
>>> del arr[1]
>>> arr
['a', 'd']


This composed operation (list.remove in Python) is useful in D both because it's a common need, and because it saves the testing code in the middle (between the index and the del, to be sure the item is present). Well, the test is done anyway, but it's done once, after the single combined operation, and this is a little better.

In Python list.remove raises an exception if the item is missing. In D this function may throw an exception or return a false boolean. Exceptions are safer but less efficient, which is another trade-off. In some situations you may even add both functions, one that throws exceptions and one that returns true/false.
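
The two flavours can be sketched side by side in Python. Both function names are hypothetical, chosen only for this illustration; the throwing one simply forwards to list.remove.

```python
def remove_item(items, value):
    """Remove the first occurrence of value; raise ValueError if absent."""
    items.remove(value)


def try_remove_item(items, value):
    """Remove the first occurrence of value; return False if absent."""
    try:
        items.remove(value)
    except ValueError:
        return False
    return True


arr = ["a", "b", "c", "d"]
print(try_remove_item(arr, "x"))  # False, arr is untouched
print(try_remove_item(arr, "c"))  # True
print(arr)                        # ['a', 'b', 'd']
```

With the boolean flavour the caller has to remember to look at the result; with the throwing one a missing item cannot pass unnoticed.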

In D a combined function like Python's list.remove() also avoids possible troubles caused by the fact that indexOf() returns a signed integer. The deleting function (remove) accepts signed integers too, but I think inside it compares them to an unsigned length, and this may cause troubles :-(

You may compile this wrong program in release mode, to see it:


import std.algorithm: remove, indexOf;
import std.stdio: writeln;
void main() {
    int[] a1 = [3, 5, 7, 8];
    auto pos = a1.indexOf(11); // 11 is not in a1
    writeln(pos); // -1
    // Wrong on purpose: pos is -1 here, not a valid index.
    int[] a2 = a1.remove(pos);
    writeln(a2);
}
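
The arithmetic behind the worry can be shown in Python. This says nothing about what remove really does inside; it only reinterprets -1 as an unsigned integer. The 64-bit width and the helper name as_unsigned are assumptions made for the illustration.

```python
SIZE_T_BITS = 64  # assumption: a 64-bit target; use 32 for a 32-bit build


def as_unsigned(n, bits=SIZE_T_BITS):
    """Reinterpret a signed integer as an unsigned one of the given width."""
    return n & ((1 << bits) - 1)


length = 4  # length of the array
pos = -1    # the "not found" result of a search

print(pos < length)               # True: as a signed number, -1 looks in range
print(as_unsigned(pos))           # 18446744073709551615
print(as_unsigned(pos) < length)  # False: as unsigned it is far past the end
```

So the same value passes or fails a range test depending on whether the comparison is signed or unsigned.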


A partially related problem. This is part of the docs of std.algorithm:

>The original array has remained of the same length because all functions in std.algorithm only change content, not topology. The value 8 is repeated because std.algorithm.move was invoked to move elements around and on integers move simply copies the source to the destination. To replace a with the effect of the removal, simply assign a = remove(a, 1). The slice will be rebound to the shorter array and the operation completes with maximal efficiency.<

This is bug-prone, but maybe there is no way to avoid this behaviour, because in D arrays aren't true reference types.
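
The behaviour described in the quoted docs can be modelled in Python. This is only a model of "change content, not topology", not the Phobos implementation; the function name remove_at is hypothetical, and returning the new logical length stands in for the shorter slice that D returns.

```python
def remove_at(buf, index):
    """Shift the tail left over buf[index] in place.

    The buffer keeps its length; the new logical length is returned.
    """
    for i in range(index, len(buf) - 1):
        buf[i] = buf[i + 1]
    return len(buf) - 1


a = [3, 5, 7, 8]
new_len = remove_at(a, 1)

print(a)            # [3, 7, 8, 8]: same length, the last value is repeated
print(a[:new_len])  # [3, 7, 8]: what the caller should keep using
```

Forgetting to use the returned length (or, in D, forgetting the a = remove(a, 1) assignment) leaves the stale last element visible, which is the bug-prone part.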

Bye,
bearophile