std.array suggestion (page 3)

David Medlock wrote: > Oskar Linde wrote: > <snip> > I would also add: I think the following are redundant as functions as they all have an equivalent splice syntax: > T[] left( T[] array, int n ) identical to array[0..n]; > T[] right( T[] array, int n ) identical to array[$-n..$]; > T[] skip( T[] array, int n ) identical to array[n..$]; > In my own personal parsing stuff I regularly use the above plus: > > T[] countUntil( char[] str, bool delegate(char) pred ); > T[] countWhile( char[] str, bool delegate(char) pred ); I assume those should return a size_t and not a T[]? If I understand correctly, those functions would be similar to find, but return arr.length instead of -1 when no matching element is found? (The second one would also invert the predicate.) > So skipping whitespace becomes: > > text = skip( text.countWhile( delegate bool(char c){ c<=32; } ) ); /Oskar

March 09, 2006

Re: std.array suggestion

Posted by Oskar Linde
in reply to Sean Kelly

Permalink

Oskar Linde

Posted in reply to Sean Kelly

Permalink

Thank you for your comments.

Sean Kelly wrote:
> Oskar Linde wrote:

>> All functions are designed to be used both as free function and as implicit array methods. Except for the inplace versions, no functions modifies the array.
>> 
>> The prototype notation is my own. a|b means two alternative types. T is the generic element type. T[] is the array.
> 
> To those who are unfamiliar with ITI, it's worth noting that any template parameters that cannot be determined implicitly should be placed at the beginning of the template parameter list.  Doing so should allow this to be possible:
> 
> template func( Ret, Val )
> {
> Ret func( Val val ) {}
> }
> 
> int i;
> char c = func!(char)( i );
> 
> This isn't implemented yet, but I suspect it will be soon.

Yes, I see no obstacles in implementing this.

>> 
>> T min(T[] arr)
>> 
>> Returns the minimum element in arr as defined by the < operator.
> 
> min and max should probably allow a comparison predicate to be supplied as well.

That is a good idea. The meaning should be the same as for the sort predicate.

> Thus the declarations would be something like this:
> 
> T min(T[] arr, P pred = &less!(T));
> 
> I'm also hoping the above syntax will actually work, and that P will resolve to being a function pointer by default.

What you can do today is this:

template min(ArrTy, PredTy) {
        eltype!(ArrTy) min(ArrTy array, PredTy predicate) {
                ...
        }
}

template min(ArrTy) {
        eltype!(ArrTy) min(ArrTy array) {
                return array.min(&less!(eltype!(ArrTy)));
        }
}

>> T sum(T[] arr)
>> 
>> Returns the sum of the element in arr as defined by the + operator.
>>
>> ptrdiff_t find(T[] arr, T|delegate|function d);
>> 
>> Returns the index of the first occurence of d or the first true predicate d applied to the elements in order. Returns -1 when no element is found.
> 
> I suggest returning size_t instead and using size_t.max in place of -1.
>   The two are basically equivalent, but using size_t offers a bit more
> range in legal array sizes as the flag value occupies the least
> significant bit rather than the most significant.

Yes. This is probably better. size_t.max is still == -1 in comparisons.

> I also almost suggested adding a "substring" find function, except that seems more appropriate for a std.string module.  Have you thought about how the two will overlap?  I'm not entirely certain how template overloading will work for functions defined in different modules.  In fact, if this is indeed a limitation them it might be prudent to put all such algorithms in std.array instead of splitting them between std.array and std.string (I just thought of this--I had been considering multiple modules for overloads in Ares, and I suspect this won't work).

I'm not certain, but I believe the current dmd-ifti will have problems with splitting template functions with the same name and number of template arguments. It would probably be best to put a generic subarray find in std.array rather than extending the find function for just {,d,w}char[]. But while one may argue for a generic subarray find, it becomes less certain that there should be a generic subarray replace for instance.

>> T[] join(T[][] arr, T|T[]|delegate|function separator);
>> 
>> Join the elements array arr using separator.
>> (obsoletes std.string.join that only works for char[])
> 
> How do you envision the delegate version being applied?  Is it expected to return an element/array for a separator?

Oops, this is a copy-paste error. I have not implemented this as taking a delegate/function. I did not see any use for that. The signature should be:

T[] join(T[][] arr, T|T[] separator);

>> ---------
>> 
>> Is there in general even any interest in adding generic functions to the standard library?
> 
> Heck yes.  Many of the functions you've mentioned above were ones I was planning to implement.  I also like that you're using delegates as predicates, as it makes them far more generic than some of the hardcoded versions in Phobos' std.string.  I don't suppose you'd like to submit them to Ares as well?  Or please don't object if I end up implementing something quite similar :-)

I would definitely not want to exclude my implementation from Ares. Once the function specifications are defined, the implementation is really straight forward. I have no objections to you implementing similar functions either. My only concern is having this functionality in some form in a D standard library. :)

/Oskar

Oskar Linde wrote: > David Medlock wrote: > > >>Oskar Linde wrote: >><snip> > > >>I would also add: > > > I think the following are redundant as functions as they all have an > equivalent splice syntax: > > >>T[] left( T[] array, int n ) > > identical to array[0..n]; > > >>T[] right( T[] array, int n ) > > identical to array[$-n..$]; > > >>T[] skip( T[] array, int n ) > > identical to array[n..$]; > Yes but the last two avoid the hideous Perl-like $...hehe. These 3 are just more readable to me, doesnt matter though. > >>In my own personal parsing stuff I regularly use the above plus: >> >>T[] countUntil( char[] str, bool delegate(char) pred ); >>T[] countWhile( char[] str, bool delegate(char) pred ); > > > I assume those should return a size_t and not a T[]? > If I understand correctly, those functions would be similar to find, but > return arr.length instead of -1 when no matching element is found? (The > second one would also invert the predicate.) > Actually I copied the wrong utility functions. The countWhile/countUntil do return size_t/uint. The ones I meant to post are skipWhile/skipUntil which do return a slice of the T[]. > >>So skipping whitespace becomes: >> >>text = skip( text.countWhile( delegate bool(char c){ c<=32; } ) ); > should be: text = text.skip( text.countWhile( delegate bool(char c){ c<=32; } ) ); -DavidM

David Medlock wrote: > should be: > > text = text.skip( text.countWhile( delegate bool(char c){ c<=32; } ) ); > -DavidM Oops again. should be: text = text.skipWhile( delegate bool(char c){ c<=32; } ); arrgh. -DavidM

Stewart Gordon wrote: > Oskar Linde wrote: >> It is also in many cases simpler to define a ordering predicate than a 3-valued ordering. The name wrongOrder just helps as a reminder of what the predicate should return. > <snip> > > If that's so, I wonder why so many things have stuck with the three-valued ordering. A comparison operator is not the same thing as an ordering predicate. Consider sorting phone-book entries (Sorted by first city, then surname and finally first name): bool phoneBookOrder(Record a, Record b) { return a.city > b.city || a.surname > b.surname || a.name > b.name; } Forcing the user to write three-valued ordering functions for this is both unnecessary (the sorthing function will not use all 3 values), harder to get right (try it yourself) and probably less efficient. Instead of wrongOrder, I could name the predicate lessThan and swap its arguments. This is what C++ STL uses and may be more logical. /Oskar

Oskar Linde wrote: > Stewart Gordon wrote: >> Oskar Linde wrote: >>> It is also in many cases simpler to define a ordering predicate than a 3-valued ordering. The name wrongOrder just helps as a reminder of what the predicate should return. >> <snip> >> >> If that's so, I wonder why so many things have stuck with the three-valued ordering. > > A comparison operator is not the same thing as an ordering predicate. > Consider sorting phone-book entries (Sorted by first city, then surname and finally first name): > > bool phoneBookOrder(Record a, Record b) { > return a.city > b.city || > a.surname > b.surname || > a.name > b.name; > } > > Forcing the user to write three-valued ordering functions for this is both unnecessary (the sorthing function will not use all 3 values), harder to get right (try it yourself) and probably less efficient. > > Instead of wrongOrder, I could name the predicate lessThan and swap its arguments. This is what C++ STL uses and may be more logical. I like wrongOrder. It clearly states that it's a < rather than a <= comparison. Also, 'lessThan' sounds rather formal, whereas wrongOrder sounds more general.

In article <dus0hn$2tb0$1@digitaldaemon.com>, Oskar Linde says... > >Stewart Gordon wrote: >> Oskar Linde wrote: >>> It is also in many cases simpler to define a ordering predicate than a 3-valued ordering. The name wrongOrder just helps as a reminder of what the predicate should return. >> <snip> >> >> If that's so, I wonder why so many things have stuck with the three-valued ordering. > >A comparison operator is not the same thing as an ordering predicate. Consider sorting phone-book entries (Sorted by first city, then surname and finally first name): > >bool phoneBookOrder(Record a, Record b) { > return a.city > b.city || > a.surname > b.surname || > a.name > b.name; >} > >Forcing the user to write three-valued ordering functions for this is both unnecessary (the sorthing function will not use all 3 values), harder to get right (try it yourself) and probably less efficient. > >Instead of wrongOrder, I could name the predicate lessThan and swap its arguments. This is what C++ STL uses and may be more logical. > >/Oskar This contains a bug. It should be like this: >bool phoneBookOrder(Record a, Record b) { > if (a.city < b.city) return false; > if (a.city > b.city) return true; > > if (a.surname < b.surname) return false; > if (a.surname > b.surname) return true; > > return a.name > b.name; >} .. Otherwise, (a < b) && (b < a) is possible, i.e. when the cities are equal. Kevin

Kevin Bealer wrote: > In article <dus0hn$2tb0$1@digitaldaemon.com>, Oskar Linde says... >>Consider sorting phone-book entries (Sorted by first city, then surname >>and finally first name): >> >>bool phoneBookOrder(Record a, Record b) { >>return a.city > b.city || >> a.surname > b.surname || >> a.name > b.name; >>} >> > This contains a bug. It should be like this: > >>bool phoneBookOrder(Record a, Record b) { >> if (a.city < b.city) return false; >> if (a.city > b.city) return true; >> >> if (a.surname < b.surname) return false; >> if (a.surname > b.surname) return true; >> >> return a.name > b.name; >>} > > .. Otherwise, (a < b) && (b < a) is possible, i.e. when the cities are > equal. You are correct. How embarassing. I was sure I had read the above example somewhere, so I didn't care to think before typing it in... /Oskar

Forums