October 12, 2010 Re: improving the join function | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On 12/10/2010 11:33 AM, Andrei Alexandrescu wrote:
> I'm looking at http://d.puremagic.com/issues/show_bug.cgi?id=3313 and
> that got me looking at std.string.join, which currently has the sig:
>
> string join(in string[] words, string sep);
>
> A narrow fix:
>
> Char[] join(Char)(in Char[][] words, in Char[] sep)
> if (isSomeChar!Char);
>
> I think it's reasonable to assume that people would want to join things
> that aren't necessarily arrays of characters, so T could be pretty much
> any type. An obvious step towards generalization is:
>
> T[] join(T)(in T[][] items, T[] sep);
>
> But join doesn't really need random access for words - really, an input
> range should suffice. So a generally useful join, almost worth putting
> in std.algorithm, would be:
>
> ElementType!R1[] join(R1, R2)(R1 items, R2 sep)
> if (isInputRange!R1 && isForwardRange!R2
> && is(ElementType!R2 : ElementType!R1);
>
> Notice how the separator must be a forward range because it gets spanned
> multiple times, whereas the items need only be an input range as they
> are spanned once. This is at the same time a very general and very
> precise interface.
>
> One thing is still bothering me: the array output type. Why would the
> "default" output range be an array? What can be done to make join() at
> the same time a general function and also one that works for strings the
> way the old join did? For example, if I want to join things into an
> already-existing buffer, or if I want to write them straight to a file,
> there's no way to do so without having an array allocation in the loop.
> I have a couple of ideas but I wouldn't want to bias yours.
>
> I also have a question from people who dislike Phobos. Was there a point
> in the changes of signature above where you threw your hands thinking,
> "do the darn string version already and cut all that crap!"?
>
>
> Thanks,
>
> Andrei
Yes, "do the darn string version already and cut all that crap".
This is probably the thing to do to make for familiarity among
library users [of other languages].
However, if you have an urge to back-end the implementation
of the colloquial "join" by your ideas, do not give up your
dream. So long as it is implemented as your private dream
no one will notice and you will remain internally satisfied. :-)
- JJ
| |||
October 12, 2010 Re: improving the join function | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Justin Johansson | On 13/10/2010 1:28 AM, Justin Johansson wrote:
> Yes, "do the darn string version already and cut all that crap".
>
> This is probably the thing to do to make for familiarity among
> library users [of other languages].
>
> However, if you have an urge to back-end the implementation
> of the colloquial "join" by your ideas, do not give up your
> dream. So long as it is implemented as your private dream
> no one will notice and you will remain internally satisfied. :-)
>
> - JJ
>
I think I meant the "ubiquitous join" rather than the "colloquial
join".
| |||
October 12, 2010 Re: improving the join function | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Daniel Gibson | On 10/12/2010 03:09 AM, Daniel Gibson wrote:
> I don't like the name "join" - especially for general ranges.
> When I hear join I think of database like joins. These may not be
> horribly interesting for strings but certainly are for general ranges (*).
> union() or concat() would be better names for doing what std.string.join
> does.
I agree - what is currently offered by join() could simply be achieved by an optional argument to concat().
| |||
October 12, 2010 Re: improving the join function | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Justin Johansson | On 10/12/10 9:35 CDT, Justin Johansson wrote:
> On 13/10/2010 1:28 AM, Justin Johansson wrote:
>> Yes, "do the darn string version already and cut all that crap".
>>
>> This is probably the thing to do to make for familiarity among
>> library users [of other languages].
>>
>> However, if you have an urge to back-end the implementation
>> of the colloquial "join" by your ideas, do not give up your
>> dream. So long as it is implemented as your private dream
>> no one will notice and you will remain internally satisfied. :-)
>>
>> - JJ
>>
>
> I think I meant the "ubiquitous join" rather than the "colloquial
> join".
By both I understand "join as in Python". Right?
Question is, where to stop?
1. string only (i.e. leave as is)
2. const(char)[] only (to allow joining char[] values)
3. various width of char, i.e. why shouldn't you join an array of wstring?
From 3, the incremental effort to generalize to any type is virtually nonexistent, and the effort to generalize to ranges instead of arrays is minor. To me these are positives.
Andrei
| |||
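Option 3 in the list above corresponds to the "narrow fix" signature quoted at the start of the thread. A minimal sketch of how such a function might look follows; the name `joinChars` and the loop body are assumptions made for illustration, not the Phobos implementation.

```d
import std.traits : isSomeChar;

// Hypothetical name; mirrors the "narrow fix" signature from the thread.
// Works for char, wchar and dchar alike.
Char[] joinChars(Char)(in Char[][] words, in Char[] sep)
if (isSomeChar!Char)
{
    Char[] result;
    foreach (i, word; words)
    {
        if (i > 0)
            result ~= sep;  // separator goes between items only
        result ~= word;     // elements are copied into the new array
    }
    return result;
}

void main()
{
    // The character type is given explicitly to keep the example simple.
    assert(joinChars!char(["a", "b"], ", ") == "a, b");
    assert(joinChars!wchar(["a"w, "b"w], "-"w) == "a-b"w);
}
```

Nothing in the body depends on the element being a character, which is the sense in which the step from here to an arbitrary `T` costs very little.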
October 12, 2010 Re: improving the join function | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On 13/10/2010 2:02 AM, Andrei Alexandrescu wrote:
> On 10/12/10 9:35 CDT, Justin Johansson wrote:
>> On 13/10/2010 1:28 AM, Justin Johansson wrote:
>>> Yes, "do the darn string version already and cut all that crap".
>>>
>>> This is probably the thing to do to make for familiarity among
>>> library users [of other languages].
>>>
>>> However, if you have an urge to back-end the implementation
>>> of the colloquial "join" by your ideas, do not give up your
>>> dream. So long as it is implemented as your private dream
>>> no one will notice and you will remain internally satisfied. :-)
>>>
>>> - JJ
>>>
>>
>> I think I meant the "ubiquitous join" rather than the "colloquial
>> join".
>
> By both I understand "join as in Python". Right?
>
> Question is, where to stop?
>
> 1. string only (i.e. leave as is)
>
> 2. const(char)[] only (to allow joining char[] values)
>
> 3. various width of char, i.e. why shouldn't you join an array of wstring?
>
> From 3, the incremental effort to generalize to any type is virtually
> nonexistent, and the effort to generalize to ranges instead of arrays is
> minor. To me these are positives.
>
>
> Andrei
Yes, I agree from a range idiom point of view.
Now, while understanding that D people don't care much for the
XPath 2.0 type system, and not myself caring much for the back-end
implementation, my XPath-ish function signature for this join() function
to preserve the generality that you suggest would be
item() join( things as item()*, separator as item()* );
Of course I'm anticipating an understanding of the above XPath 2.0
function signature syntax, and even then, I suspect my proposed
signature to be too liberal.
Regards, Justin
| |||
October 13, 2010 Re: improving the join function | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Mon, 11 Oct 2010 20:33:27 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> I'm looking at http://d.puremagic.com/issues/show_bug.cgi?id=3313 and that got me looking at std.string.join, which currently has the sig:
>
> string join(in string[] words, string sep);
>
> A narrow fix:
>
> Char[] join(Char)(in Char[][] words, in Char[] sep)
> if (isSomeChar!Char);
>
> I think it's reasonable to assume that people would want to join things that aren't necessarily arrays of characters, so T could be pretty much any type. An obvious step towards generalization is:
>
> T[] join(T)(in T[][] items, T[] sep);
This doesn't quite work if T is not a value type (actually, I think it does, but only because there are bugs in the compiler).
>
> But join doesn't really need random access for words - really, an input range should suffice. So a generally useful join, almost worth putting in std.algorithm, would be:
>
> ElementType!R1[] join(R1, R2)(R1 items, R2 sep)
> if (isInputRange!R1 && isForwardRange!R2
> && is(ElementType!R2 : ElementType!R1);
>
> Notice how the separator must be a forward range because it gets spanned multiple times, whereas the items need only be an input range as they are spanned once. This is at the same time a very general and very precise interface.
I think this is fine. Note that this does not take into account the constancy of items, meaning it is legal for this function to mess with the original data in items. Not that I think it's a bad thing, but it does lose some guarantees as compared to the original join. inout can't be used here because it doesn't work as a template parameter.
> One thing is still bothering me: the array output type. Why would the "default" output range be an array? What can be done to make join() at the same time a general function and also one that works for strings the way the old join did? For example, if I want to join things into an already-existing buffer, or if I want to write them straight to a file, there's no way to do so without having an array allocation in the loop. I have a couple of ideas but I wouldn't want to bias yours.
Well, one could have a version of join that takes an output range. It would have to return the output range instead of the *result* of the output range. And in that case, the standard join which returns an array can be implemented:
ElementType!R1[] join(R1 items, R2 sep) ...
{
return join(R1, R2, Appender!(ElementType!R1)).data;
}
> I also have a question from people who dislike Phobos. Was there a point in the changes of signature above where you threw your hands thinking, "do the darn string version already and cut all that crap!"?
It's not a problem with phobos, it's a problem with documentation. There is a fundamental issue with documenting complex templates which makes function signatures very difficult to understand. The doc generator can and should simplify things, and I think at some point we should address this. In other words, it should be transformed into a form that's easy to see that it's the same as string[] join(string[][], string[]).
-Steve
| |||
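The output-range idea in the post above can be sketched roughly as follows. The names `joinInto` and `joinToArray` are hypothetical, chosen here to avoid clashing with the real `join`; the sketch assumes the items are themselves ranges and that the target accepts whole ranges through `put`. It illustrates the shape of the proposal, not the code that ended up in Phobos.

```d
import std.array : appender;
import std.range.primitives;

// Hypothetical name: writes every item to `target`, with `sep` between
// consecutive items, and returns the output range itself.
Out joinInto(R1, R2, Out)(R1 items, R2 sep, Out target)
if (isInputRange!R1 && isForwardRange!R2)
{
    bool first = true;
    foreach (item; items)
    {
        if (!first)
            put(target, sep.save); // sep is walked once per gap, hence forward range
        first = false;
        put(target, item);         // items are walked exactly once
    }
    return target;
}

// Hypothetical array-returning wrapper built on top of the output-range version.
auto joinToArray(R1, R2)(R1 items, R2 sep)
{
    alias E = ElementType!(ElementType!R1);
    return joinInto(items, sep, appender!(E[])()).data;
}

void main()
{
    assert(joinToArray([[1, 2], [3], [4, 5]], [0]) == [1, 2, 0, 3, 0, 4, 5]);
}
```

Returning the output range, rather than its result, is what lets the array version be a one-line wrapper: the caller decides whether to ask the appender for its data or to keep writing to the same target.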
October 13, 2010 Re: improving the join function | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | On 10/13/10 14:03 CDT, Steven Schveighoffer wrote:
> On Mon, 11 Oct 2010 20:33:27 -0400, Andrei Alexandrescu
>> T[] join(T)(in T[][] items, T[] sep);
>
> This doesn't quite work if T is not a value type (actually, I think it
> does, but only because there are bugs in the compiler).
My focus in this discussion is not the const aspect, but point taken.
> Well, one could have a version of join that takes an output range. It
> would have to return the output range instead of the *result* of the
> output range. And in that case, the standard join which returns an array
> can be implemented:
>
> ElementType!R1[] join(R1 items, R2 sep) ...
> {
> return join(R1, R2, Appender!(ElementType!R1)).data;
> }
Yah, I had a similar idea:
void join(In1, In2, Out)(In1 items, In2 sep, Out target);
as an overload.
>> I also have a question from people who dislike Phobos. Was there a
>> point in the changes of signature above where you threw your hands
>> thinking, "do the darn string version already and cut all that crap!"?
>
> It's not a problem with phobos, it's a problem with documentation. There
> is a fundamental issue with documenting complex templates which makes
> function signatures very difficult to understand. The doc generator can
> and should simplify things, and I think at some point we should address
> this. In other words, it should be transformed into a form that's easy
> to see that it's the same as string[] join(string[][], string[]).
Good point. On the other hand, an overly simplified documentation might hinder a good deal of legit uses for advanced users. I wonder how to please everyone.
Andrei
| |||
October 13, 2010 Re: improving the join function | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Wed, 13 Oct 2010 16:07:46 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>> It's not a problem with phobos, it's a problem with documentation. There
>> is a fundamental issue with documenting complex templates which makes
>> function signatures very difficult to understand. The doc generator can
>> and should simplify things, and I think at some point we should address
>> this. In other words, it should be transformed into a form that's easy
>> to see that it's the same as string[] join(string[][], string[]).
>
> Good point. On the other hand, an overly simplified documentation might hinder a good deal of legit uses for advanced users. I wonder how to please everyone.
Even though I consider myself a reasonable parser of function templates, sometimes in std.algorithm, I'll stare at a function signature for about 10 minutes trying to figure out whether I can do what I want, give up and finally just try to compile it.
I think what might help is spelling out the constraints somehow and especially explaining how the alias parameters work. They are some sort of black magic I don't always understand :)
-Steve
| |||
October 13, 2010 Re: improving the join function | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | On Wed, 13 Oct 2010 16:42:35 -0400, "Steven Schveighoffer" <schveiguy@yahoo.com> wrote:
> Even though I consider myself a reasonable parser of function templates,
> sometimes in std.algorithm, I'll stare at a function signature for about
> 10 minutes trying to figure out whether I can do what I want, give up and
> finally just try to compile it.
> I think what might help is spelling out the constraints somehow and
> especially explaining how the alias parameters work. They are some sort
> of black magic I don't always understand :)
Glad to see I'm not the only one :)
The asserts help a lot there; I understood that module better looking at them than with the signatures. Adding more (or just adding some where they're missing).
The template constraints is something that could definitely kill more trees in future editions of TDPL.
| |||
October 13, 2010 Re: improving the join function | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Juanjo Alvarez | On Thu, 14 Oct 2010 01:30:42 +0200, Juanjo Alvarez <fake@fakeemail.com> wrote:
> signatures. Adding more (or just adding some where they're missing).
Truncated sentence, I wanted to say that adding more asserts would not hurt.
| |||
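The point about asserts doubling as documentation can be illustrated with `join` itself. The snippet below assumes a current Phobos, where the generic function lives in `std.array`; the specific cases are picked here for illustration and are not taken from the library's own examples.

```d
import std.array : join;

void main()
{
    // Strings: the familiar case.
    assert(join(["a", "b", "c"], ", ") == "a, b, c");

    // Any element type works, not just characters.
    assert(join([[1, 2], [3]], [0]) == [1, 2, 0, 3]);

    // A single item gets no separator.
    assert(join(["x"], "-") == "x");

    // No items at all yields an empty result.
    string[] none;
    assert(join(none, ", ") == "");
}
```

Each assert states one concrete input and its result, which answers the "can I do what I want?" question faster than decoding the template constraints.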
Copyright © 1999-2021 by the D Language Foundation