View mode: basic / threaded / horizontal-split · Log in · Help
March 09, 2006
std.array suggestion
Hello,

With the new IFTI support I have been looking at ways of upgrading the 
standard library with new and more generic functions. What follows is my 
suggestions for functions to be added to std.array. I have implemented 
all of them and the current (0.149) limited IFTI-support is enough to 
support them. That being said, I wish to hold off making a source code 
submission until a) I get review comments on the suggested function 
prototypes and b) it is more clear how far D is taking IFTI support 
(meaning possibly neater implementation).

All functions are designed to be used both as free function and as 
implicit array methods. Except for the inplace versions, no functions 
modifies the array.

The prototype notation is my own. a|b means two alternative types. T is 
the generic element type. T[] is the array.

-----------

T fold(T[] arr, T init, T delegate|function combiner(T,T));

Generic array recursion function. combiner is called recursively:
	return combiner(init,fold(arr[1..$],arr[0],combiner));
(The actual implementation will of course call the combiner iteratively)


T max(T[] arr)

Returns the maximum element in arr as defined by the > operator.


T min(T[] arr)

Returns the minimum element in arr as defined by the < operator.


T sum(T[] arr)

Returns the sum of the element in arr as defined by the + operator.


ptrdiff_t find(T[] arr, T|delegate|function d);

Returns the index of the first occurence of d or the first true 
predicate d applied to the elements in order. Returns -1 when no element 
is found.


size_t indexOf(T[] arr, T|delegate|function d);

Like find, but throws on missing element.


T[][] split(T[] arr, T|T[]|delegate|function d);

Split the array arr using a predicate/element/subarray d.
(obsoletes std.string.split that only works for char[])


T[] join(T[][] arr, T|T[]|delegate|function separator);

Join the elements array arr using separator.
(obsoletes std.string.join that only works for char[])


T[] join(T[][] arr);

Join the array T[][] without separator.


U[] map(T[] arr, U delegate|function f(T));

Map the elements of arr over function f, returning an array of the results.


T[] filter(T[] arr, delegate|function p(T));

Filters arr over the predicate p, returning array of elements of arr 
where the predicate is true.


Possible inplace version of map:

T[] doMap(T[], T delegate|function f(T));

---------

I would also prefer those over the language built in .sort .reverse:

T[] sort(T[]);
T[] stableSort(T[]);
T[] sort(T[], delegate|function wrongOrder(T,T));
T[] reverse(T[]);


With the corresponding inplace versions:

T[] doSort(T[]);
T[] doStableSort(T[]);
T[] doSort(T[], delegate|function wrongOrder(T,T));
T[] doReverse(T[]);

---------

Is there in general even any interest in adding generic functions to the 
standard library?

Regards,

Oskar
March 09, 2006
Re: std.array suggestion
"Oskar Linde" <oskar.lindeREM@OVEgmail.com> wrote in message 
news:dup4fr$2c0b$3@digitaldaemon.com...
> Hello,
>
> With the new IFTI support I have been looking at ways of upgrading the 
> standard library with new and more generic functions. What follows is my 
> suggestions for functions to be added to std.array. I have implemented all 
> of them and the current (0.149) limited IFTI-support is enough to support 
> them. That being said, I wish to hold off making a source code submission 
> until a) I get review comments on the suggested function prototypes and b) 
> it is more clear how far D is taking IFTI support (meaning possibly neater 
> implementation).
>
> All functions are designed to be used both as free function and as 
> implicit array methods. Except for the inplace versions, no functions 
> modifies the array.
>
> The prototype notation is my own. a|b means two alternative types. T is 
> the generic element type. T[] is the array.
>
> -----------
>
> T fold(T[] arr, T init, T delegate|function combiner(T,T));
>
> Generic array recursion function. combiner is called recursively:
> return combiner(init,fold(arr[1..$],arr[0],combiner));
> (The actual implementation will of course call the combiner iteratively)
>
>
> T max(T[] arr)
>
> Returns the maximum element in arr as defined by the > operator.
>
>
> T min(T[] arr)
>
> Returns the minimum element in arr as defined by the < operator.
>
>
> T sum(T[] arr)
>
> Returns the sum of the element in arr as defined by the + operator.
>
>
> ptrdiff_t find(T[] arr, T|delegate|function d);
>
> Returns the index of the first occurence of d or the first true predicate 
> d applied to the elements in order. Returns -1 when no element is found.
>
>
> size_t indexOf(T[] arr, T|delegate|function d);
>
> Like find, but throws on missing element.
>
>
> T[][] split(T[] arr, T|T[]|delegate|function d);
>
> Split the array arr using a predicate/element/subarray d.
> (obsoletes std.string.split that only works for char[])
>
>
> T[] join(T[][] arr, T|T[]|delegate|function separator);
>
> Join the elements array arr using separator.
> (obsoletes std.string.join that only works for char[])
>
>
> T[] join(T[][] arr);
>
> Join the array T[][] without separator.
>
>
> U[] map(T[] arr, U delegate|function f(T));
>
> Map the elements of arr over function f, returning an array of the 
> results.
>
>
> T[] filter(T[] arr, delegate|function p(T));
>
> Filters arr over the predicate p, returning array of elements of arr where 
> the predicate is true.
>
>
> Possible inplace version of map:
>
> T[] doMap(T[], T delegate|function f(T));
>
> ---------
>
> I would also prefer those over the language built in .sort .reverse:
>
> T[] sort(T[]);
> T[] stableSort(T[]);
> T[] sort(T[], delegate|function wrongOrder(T,T));
> T[] reverse(T[]);
>
>
> With the corresponding inplace versions:
>
> T[] doSort(T[]);
> T[] doStableSort(T[]);
> T[] doSort(T[], delegate|function wrongOrder(T,T));
> T[] doReverse(T[]);
>
> ---------
>
> Is there in general even any interest in adding generic functions to the 
> standard library?
>
> Regards,
>
> Oskar
I like it.  It would be especially cool if we could get rid of the necessary 
() after each call when using property syntax, thus making truely plugable 
properties.
March 09, 2006
Re: std.array suggestion
Oskar Linde wrote:
> Hello,
> 
> With the new IFTI support I have been looking at ways of upgrading the 
> standard library with new and more generic functions. 

[snip]
> Is there in general even any interest in adding generic functions to the
> standard library?

Some of these functions are the last thing remaining in std.math2 (but 
they definitely don't below there, none of them are truly mathematical). 
We definitely want to remove std.math2 prior to 1.0, std.array sounds 
good to me.

I'll just comment on one function:

> T sum(T[] arr)
>
> Returns the sum of the element in arr as defined by the + operator.

This one has an interesting twist. I'm not sure that the sum should 
necessarily be of type T. I've been playing around with the concept of 
what I've called the Archetype of a type, which is the largest type with 
the same semantics as T (possibly with the same calculation speed). (ie,
Archetype!(byte)= Archetype!(short)
Archetype!(int) = long
Archetype!(float) = Archetype!(double) = real,
Archetype!(cfloat)= creal, etc). Obviously it's a trivial template.

I think that at least,  sum(double[] ) should internally use a real 
while accumulating the sum, so that it can satisfy this test (at least 
on x86 platforms):

unittest {
   const double a = [ double.max, double.max, -double.max];
   assert(sum(a) == double.max);
}

After all, this is one of the reasons why reals exist. I'm still not 
sure if sum(double []) should return a double or a real, although I'm 
inclined to think that *any* function that returns a single floating 
point value should return a real (on x87, the 80-bit result is just left 
on the FPU stack anyway). But, I'm less confident about how a sum of 
ints should behave.

However, all the other functions seem to be free of mathematical 
subtleties. sum() is the only one which involves arithmetic operators, 
and therefore it might not belong with the rest.
March 09, 2006
Re: std.array suggestion
Ameer Armaly wrote:
> "Oskar Linde" <oskar.lindeREM@OVEgmail.com> wrote in message 
> news:dup4fr$2c0b$3@digitaldaemon.com...
>> Hello,
>>
>> With the new IFTI support I have been looking at ways of upgrading the 
>> standard library with new and more generic functions. What follows is my 
>> suggestions for functions to be added to std.array. I have implemented all 
>> of them and the current (0.149) limited IFTI-support is enough to support 
>> them. That being said, I wish to hold off making a source code submission 
>> until a) I get review comments on the suggested function prototypes and b) 
>> it is more clear how far D is taking IFTI support (meaning possibly neater 
>> implementation).
>>
>> All functions are designed to be used both as free function and as 
>> implicit array methods. Except for the inplace versions, no functions 
>> modifies the array.
>>
>> The prototype notation is my own. a|b means two alternative types. T is 
>> the generic element type. T[] is the array.
>>
>> -----------
>>
>> T fold(T[] arr, T init, T delegate|function combiner(T,T));
>>
>> Generic array recursion function. combiner is called recursively:
>> return combiner(init,fold(arr[1..$],arr[0],combiner));
>> (The actual implementation will of course call the combiner iteratively)
>>
>>
>> T max(T[] arr)
>>
>> Returns the maximum element in arr as defined by the > operator.
>>
>>
>> T min(T[] arr)
>>
>> Returns the minimum element in arr as defined by the < operator.
>>
>>
>> T sum(T[] arr)
>>
>> Returns the sum of the element in arr as defined by the + operator.
>>
>>
>> ptrdiff_t find(T[] arr, T|delegate|function d);
>>
>> Returns the index of the first occurence of d or the first true predicate 
>> d applied to the elements in order. Returns -1 when no element is found.
>>
>>
>> size_t indexOf(T[] arr, T|delegate|function d);
>>
>> Like find, but throws on missing element.
>>
>>
>> T[][] split(T[] arr, T|T[]|delegate|function d);
>>
>> Split the array arr using a predicate/element/subarray d.
>> (obsoletes std.string.split that only works for char[])
>>
>>
>> T[] join(T[][] arr, T|T[]|delegate|function separator);
>>
>> Join the elements array arr using separator.
>> (obsoletes std.string.join that only works for char[])
>>
>>
>> T[] join(T[][] arr);
>>
>> Join the array T[][] without separator.
>>
>>
>> U[] map(T[] arr, U delegate|function f(T));
>>
>> Map the elements of arr over function f, returning an array of the 
>> results.
>>
>>
>> T[] filter(T[] arr, delegate|function p(T));
>>
>> Filters arr over the predicate p, returning array of elements of arr where 
>> the predicate is true.
>>
>>
>> Possible inplace version of map:
>>
>> T[] doMap(T[], T delegate|function f(T));
>>
>> ---------
>>
>> I would also prefer those over the language built in .sort .reverse:
>>
>> T[] sort(T[]);
>> T[] stableSort(T[]);
>> T[] sort(T[], delegate|function wrongOrder(T,T));
>> T[] reverse(T[]);
>>
>>
>> With the corresponding inplace versions:
>>
>> T[] doSort(T[]);
>> T[] doStableSort(T[]);
>> T[] doSort(T[], delegate|function wrongOrder(T,T));
>> T[] doReverse(T[]);
>>
>> ---------
>>
>> Is there in general even any interest in adding generic functions to the 
>> standard library?
>>
>> Regards,
>>
>> Oskar

Upon rereading, I realized that the inplace versions should be void 
functions - not returning an array.

> I like it.  It would be especially cool if we could get rid of the necessary 
> () after each call when using property syntax, thus making truely plugable 
> properties. 

Yes, I agree. I would like to know if all pairs of empty parentheses 
after functions are supposed to be redundant or if calls without 
parentheses should be reserved to property like methods. Considering the 
current .sort and .reverse semantics, I guess the former is the case and 
DMD not allowing calls without parentheses for implicit array methods is 
an unintentional omission.

/Oskar
March 09, 2006
Re: std.array suggestion
Don Clugston wrote:
> Oskar Linde wrote:
> I'll just comment on one function:
> 
>  > T sum(T[] arr)
>  >
>  > Returns the sum of the element in arr as defined by the + operator.
> 
> This one has an interesting twist. I'm not sure that the sum should 
> necessarily be of type T. I've been playing around with the concept of 
> what I've called the Archetype of a type, which is the largest type with 
> the same semantics as T (possibly with the same calculation speed). (ie,
> Archetype!(byte)= Archetype!(short)
> Archetype!(int) = long
> Archetype!(float) = Archetype!(double) = real,
> Archetype!(cfloat)= creal, etc). Obviously it's a trivial template.

Interesting...

> I think that at least,  sum(double[] ) should internally use a real 
> while accumulating the sum, so that it can satisfy this test (at least 
> on x86 platforms):
> 
> unittest {
>    const double a = [ double.max, double.max, -double.max];
>    assert(sum(a) == double.max);
> }
> 
> After all, this is one of the reasons why reals exist. I'm still not 
> sure if sum(double []) should return a double or a real, although I'm 
> inclined to think that *any* function that returns a single floating 
> point value should return a real (on x87, the 80-bit result is just left 
> on the FPU stack anyway). But, I'm less confident about how a sum of 
> ints should behave.

Int overflows is well defined, making the sum of ints behave correctly 
in the case above. Int is also the promotion type for integral 
operations and would be the natural return type for sum(int[]). A sum of 
shorts should probably return an int though, following integer promotion 
rules. Should Archetype for integers follow the integer promotion rules 
or all be long?

> However, all the other functions seem to be free of mathematical 
> subtleties. sum() is the only one which involves arithmetic operators, 
> and therefore it might not belong with the rest.

You make a strong argument and I agree. sum() is used as join() in other 
languages that lack the distinction between addition and concatenation. 
D doesn't need a non-arithmetic sum function. With the proposed fold 
function, you could also easily implement sum as:

sum = fold(arr,0,int function(int a, int b) { return a+b; });

prod = fold(arr,1,int function(int a, int b) { return a*b; });

Or generic, using Archetype:

fold(arr,0,Archetype!(T) function(Archetype!(T) a, T b) {return a+b;});

/Oskar
March 09, 2006
Re: std.array suggestion
Oskar Linde wrote:
> sum = fold(arr,0,int function(int a, int b) { return a+b; });
> prod = fold(arr,1,int function(int a, int b) { return a*b; });
> fold(arr,0,Archetype!(T) function(Archetype!(T) a, T b) {return a+b;});
Oops, those should of course read:

sum = fold(arr,0, function int(int a, int b) { return a+b; });
prod = fold(arr,1, function int(int a, int b) { return a*b; });
fold(arr,0, function Archetype!(T)(Archetype!(T) a, T b) {return a+b;});

/Oskar
March 09, 2006
Re: std.array suggestion
Oskar Linde wrote:

...

Totally agree that something like this (and more) should be in the 
standard library.

> 
> Upon rereading, I realized that the inplace versions should be void 
> functions - not returning an array.

Not sure about this one. Returning an array allows chaining:

array.doMap(someDelegate).doSort();

...

> 
>> I like it.  It would be especially cool if we could get rid of the 
>> necessary () after each call when using property syntax, thus making 
>> truely plugable properties. 
> 
> 
> Yes, I agree. I would like to know if all pairs of empty parentheses 
> after functions are supposed to be redundant or if calls without 
> parentheses should be reserved to property like methods. Considering the 
> current .sort and .reverse semantics, I guess the former is the case and 
> DMD not allowing calls without parentheses for implicit array methods is 
> an unintentional omission.
> 
> /Oskar
March 09, 2006
Re: std.array suggestion
Ivan Senji wrote:
> Oskar Linde wrote:
> 
> ...
> 
> Totally agree that something like this (and more) should be in the 
> standard library.

Please elaborate on (and more).

>> Upon rereading, I realized that the inplace versions should be void 
>> functions - not returning an array.
> 
> Not sure about this one. Returning an array allows chaining:
> 
> array.doMap(someDelegate).doSort();

Yes, but void removes the chance of using inplace versions by mistake 
and makes it extra obvious that they are not ordinary functions. I think 
this is more important than the ability of chaining.

/Oskar
March 09, 2006
Re: std.array suggestion
"Oskar Linde" <oskar.lindeREM@OVEgmail.com> wrote in message 
news:dup4fr$2c0b$3@digitaldaemon.com...
> With the corresponding inplace versions:
>
> T[] doSort(T[]);
> T[] doStableSort(T[]);
> T[] doSort(T[], delegate|function wrongOrder(T,T));
> T[] doReverse(T[]);
>

My only reservation would be the decision to prefix these functions with 
"do". It's tautological and doesn't express anything about them being 
in-place versions. I suggest just calling them sortInPlace/inPlaceSort (or 
even sortInSitu or inSituSort).
March 09, 2006
Re: std.array suggestion
Oskar Linde wrote:
> Ivan Senji wrote:
> 
>> Oskar Linde wrote:
>>
>> ...
>>
>> Totally agree that something like this (and more) should be in the 
>> standard library.
> 
> 
> Please elaborate on (and more).
> 

:P I knew you were going to say that. I was just trying to say that the 
new features open up a lot of possibilites. (not trying to say you 
didn't to a great job, more that there is more that could be done in a 
standard library)

No ideas at the moment, but when I think of something I'll let you know.


>>> Upon rereading, I realized that the inplace versions should be void 
>>> functions - not returning an array.
>>
>>
>> Not sure about this one. Returning an array allows chaining:
>>
>> array.doMap(someDelegate).doSort();
> 
> 
> Yes, but void removes the chance of using inplace versions by mistake 
> and makes it extra obvious that they are not ordinary functions. I think 
> this is more important than the ability of chaining.
> 

What you say does make sense. But I don't think it would be souch a big 
problem. It is always safe to use ordinary array functions, and it would 
be nice to be able to just replace them with inplace ones for 
performance reasons if the original array is not needed any more.

So if I have:
array.map(someDelegate).sort();

and I figure out that it could be optimized because I don'n need the 
original array I could just change the functions used.

But if the general consensus would be for inplace functions to return 
void I would have to agree :)
« First   ‹ Prev
1 2 3
Top | Discussion index | About this forum | D home