Thread overview
cannot sort an array of char
Nov 05, 2014
Ivan Kazmenko
Nov 05, 2014
Marc Schütz
Nov 05, 2014
Marc Schütz
Nov 05, 2014
Ali Çehreli
Nov 05, 2014
Ali Çehreli
Nov 06, 2014
Ivan Kazmenko
Nov 06, 2014
Marc Schütz
Nov 11, 2014
Ivan Kazmenko
Nov 11, 2014
Ivan Kazmenko
Nov 11, 2014
Marc Schütz
November 05, 2014
Hi!

This gives an error (cannot deduce template function from argument types):

-----
import std.algorithm;
void main () {
	char [] c;
	sort (c);
}
-----

Why is "char []" so special that it can't be sorted?

For example, if I know the array contains only ASCII characters, sorting it sounds no different to sorting an "int []".

Ivan Kazmenko.
November 05, 2014
On Wednesday, 5 November 2014 at 12:54:03 UTC, Ivan Kazmenko wrote:
> Hi!
>
> This gives an error (cannot deduce template function from argument types):
>
> -----
> import std.algorithm;
> void main () {
> 	char [] c;
> 	sort (c);
> }
> -----
>
> Why is "char []" so special that it can't be sorted?
>
> For example, if I know the array contains only ASCII characters, sorting it sounds no different to sorting an "int []".

Hmm... this doesn't work either:

    import std.algorithm;
    import std.utf;
    void main () {
        char [] c;
        sort (c.byCodeUnit);
    }

But IMO it should.
November 05, 2014
On Wednesday, 5 November 2014 at 13:34:05 UTC, Marc Schütz wrote:
> On Wednesday, 5 November 2014 at 12:54:03 UTC, Ivan Kazmenko wrote:
>> Hi!
>>
>> This gives an error (cannot deduce template function from argument types):
>>
>> -----
>> import std.algorithm;
>> void main () {
>> 	char [] c;
>> 	sort (c);
>> }
>> -----
>>
>> Why is "char []" so special that it can't be sorted?
>>
>> For example, if I know the array contains only ASCII characters, sorting it sounds no different to sorting an "int []".
>
> Hmm... this doesn't work either:
>
>     import std.algorithm;
>     import std.utf;
>     void main () {
>         char [] c;
>         sort (c.byCodeUnit);
>     }
>
> But IMO it should.

https://issues.dlang.org/show_bug.cgi?id=13689
November 05, 2014
On 11/05/2014 05:44 AM, "Marc Schütz" <schuetzm@gmx.net> wrote:
> On Wednesday, 5 November 2014 at 13:34:05 UTC, Marc Schütz wrote:
>> On Wednesday, 5 November 2014 at 12:54:03 UTC, Ivan Kazmenko wrote:
>>> Hi!
>>>
>>> This gives an error (cannot deduce template function from argument
>>> types):
>>>
>>> -----
>>> import std.algorithm;
>>> void main () {
>>>     char [] c;
>>>     sort (c);
>>> }
>>> -----
>>>
>>> Why is "char []" so special that it can't be sorted?
>>>
>>> For example, if I know the array contains only ASCII characters,
>>> sorting it sounds no different to sorting an "int []".
>>
>> Hmm... this doesn't work either:
>>
>>     import std.algorithm;
>>     import std.utf;
>>     void main () {
>>         char [] c;
>>         sort (c.byCodeUnit);
>>     }
>>
>> But IMO it should.
>
> https://issues.dlang.org/show_bug.cgi?id=13689

It can't be a RandomAccessRange because it cannot satisfy random access at O(1) time.

Ali

November 05, 2014
On 11/05/2014 10:01 AM, Ali Çehreli wrote:

>>>         sort (c.byCodeUnit);
>>>     }
>>>
>>> But IMO it should.
>>
>> https://issues.dlang.org/show_bug.cgi?id=13689
>
> It can't be a RandomAccessRange because it cannot satisfy random access
> at O(1) time.

Sorry, I misunderstood (again): code unit is random-access, code point is not.

Ali

P.S. I would like to have a word with the Unicode people who settled on the terms "code unit" and "code point". Every time I come across one of those, I have to think for at least 5 seconds to fool myself into thinking that I understood correctly which one was meant. :p
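
To make the code unit / code point distinction concrete, here is a minimal sketch. It assumes Phobos auto-decoding, where a char[] iterated as a range yields dchar code points; the helper names codeUnits and codePoints are made up for this illustration and are not part of Phobos.

```d
import std.range : walkLength;

// Number of UTF-8 code units in s: this is just the array length, O(1).
size_t codeUnits(const(char)[] s)
{
    return s.length;
}

// Number of code points in s: the range interface decodes UTF-8,
// so every element has to be visited in turn, O(n).
size_t codePoints(const(char)[] s)
{
    return s.walkLength;
}

void main()
{
    // 'n' takes one code unit, U+00E9 (e with acute accent) takes two in UTF-8.
    string s = "n\u00E9";
    assert(codeUnits(s) == 3);
    assert(codePoints(s) == 2);
    // Indexing the array itself addresses code units, not code points.
    assert(s[0] == 'n');
}
```

The k-th code unit sits at a fixed offset, while the k-th code point can only be found by decoding everything before it, which is why only the code-unit view can offer O(1) random access.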

November 06, 2014
On Wednesday, 5 November 2014 at 13:34:05 UTC, Marc Schütz wrote:
> On Wednesday, 5 November 2014 at 12:54:03 UTC, Ivan Kazmenko wrote:
>> Hi!
>>
>> This gives an error (cannot deduce template function from argument types):
>>
>> -----
>> import std.algorithm;
>> void main () {
>> 	char [] c;
>> 	sort (c);
>> }
>> -----
>>
>> Why is "char []" so special that it can't be sorted?
>>
>> For example, if I know the array contains only ASCII characters, sorting it sounds no different to sorting an "int []".
>
> Hmm... this doesn't work either:
>
>     import std.algorithm;
>     import std.utf;
>     void main () {
>         char [] c;
>         sort (c.byCodeUnit);
>     }
>
> But IMO it should.

So, you imply that to use a char array as a RandomAccessRange, I have to use byCodeUnit? (and it should work, but doesn't?)

Fine, but how does one learn that except by asking here?  Googling did not produce meaningful results for me.

For example, isRandomAccessRange[0] states the problem:
-----
Although char[] and wchar[] (as well as their qualified versions including string and wstring) are arrays, isRandomAccessRange yields false for them because they use variable-length encodings (UTF-8 and UTF-16 respectively). These types are bidirectional ranges only.
-----
but does not offer a solution.  If (when) byCodeUnit does really provide a random-access range, it would be desirable to have it linked where the problem is stated.

[0] http://dlang.org/phobos/std_range.html#.isRandomAccessRange
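
The documented behaviour quoted above can be checked at compile time. This is only a sketch of such a check, assuming the range traits from std.range and representation from std.string; it does not show how sort itself is constrained.

```d
import std.range : isBidirectionalRange, isRandomAccessRange;
import std.string : representation;

void main()
{
    // Narrow strings are bidirectional ranges only, as the documentation says.
    static assert( isBidirectionalRange!(char[]));
    static assert(!isRandomAccessRange!(char[]));
    static assert(!isRandomAccessRange!(wchar[]));

    // Arrays of fixed-size elements are random-access ranges.
    static assert(isRandomAccessRange!(int[]));
    static assert(isRandomAccessRange!(ubyte[]));

    // representation reinterprets a char[] as ubyte[],
    // which passes the random-access check.
    char[] c;
    static assert(is(typeof(c.representation) == ubyte[]));
    static assert(isRandomAccessRange!(typeof(c.representation)));
}
```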
November 06, 2014
On Thursday, 6 November 2014 at 10:52:32 UTC, Ivan Kazmenko wrote:
> On Wednesday, 5 November 2014 at 13:34:05 UTC, Marc Schütz wrote:
>> On Wednesday, 5 November 2014 at 12:54:03 UTC, Ivan Kazmenko wrote:
>>> Hi!
>>>
>>> This gives an error (cannot deduce template function from argument types):
>>>
>>> -----
>>> import std.algorithm;
>>> void main () {
>>> 	char [] c;
>>> 	sort (c);
>>> }
>>> -----
>>>
>>> Why is "char []" so special that it can't be sorted?
>>>
>>> For example, if I know the array contains only ASCII characters, sorting it sounds no different to sorting an "int []".
>>
>> Hmm... this doesn't work either:
>>
>>    import std.algorithm;
>>    import std.utf;
>>    void main () {
>>        char [] c;
>>        sort (c.byCodeUnit);
>>    }
>>
>> But IMO it should.
>
> So, you imply that to use a char array as a RandomAccessRange, I have to use byCodeUnit? (and it should work, but doesn't?)
>

Yes. H.S. Teoh has already submitted a PR to fix it.

> Fine, but how does one learn that except by asking here?  Googling did not produce meaningful results for me.
>
> For example, isRandomAccessRange[0] states the problem:
> -----
> Although char[] and wchar[] (as well as their qualified versions including string and wstring) are arrays, isRandomAccessRange yields false for them because they use variable-length encodings (UTF-8 and UTF-16 respectively). These types are bidirectional ranges only.
> -----
> but does not offer a solution.  If (when) byCodeUnit does really provide a random-access range, it would be desirable to have it linked where the problem is stated.
>
> [0] http://dlang.org/phobos/std_range.html#.isRandomAccessRange

I agree. But how should it be implemented? We would have to modify algorithms that require an RA range to also accept char[], but then print an error message with the suggestion to use byCodeUnit. I think that's not practicable. Any better ideas?
November 06, 2014
On 11/5/14 7:54 AM, Ivan Kazmenko wrote:
> Hi!
>
> This gives an error (cannot deduce template function from argument types):
>
> -----
> import std.algorithm;
> void main () {
>      char [] c;
>      sort (c);
> }
> -----
>
> Why is "char []" so special that it can't be sorted?

Because sort works on ranges, and std.range has the view that char[] is a range of dchar without random access. Never mind what the compiler thinks :)

I believe you can get what you want with std.string.representation:

    import std.string;

    sort(c.representation);

-Steve
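
For reference, a complete program along the lines of the suggestion above might look like the sketch below. The wrapper name sortCodeUnits is hypothetical, and the approach assumes the array holds only ASCII: sorting individual code units would scramble any multi-byte UTF-8 sequence.

```d
import std.algorithm : sort;
import std.string : representation;

// Hypothetical helper: sorts the code units of c in place and
// returns the same slice. Intended for ASCII-only content.
char[] sortCodeUnits(char[] c)
{
    // representation yields a ubyte[] view of the same memory,
    // which sort accepts as a random-access range.
    sort(c.representation);
    return c;
}

void main()
{
    char[] c = "sorting".dup;
    assert(sortCodeUnits(c) == "ginorst");
}
```

Because the ubyte[] view shares memory with the original array, no copy back is needed after the sort.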
November 11, 2014
IK>> For example, isRandomAccessRange[0] states the problem:
IK>> -----
IK>> Although char[] and wchar[] (as well as their qualified
IK>> versions including string and wstring) are arrays,
IK>> isRandomAccessRange yields false for them because they use
IK>> variable-length encodings (UTF-8 and UTF-16 respectively).
IK>> These types are bidirectional ranges only.
IK>> -----
IK>> but does not offer a solution.  If (when) byCodeUnit does
IK>> really provide a random-access range, it would be desirable to
IK>> have it linked where the problem is stated.
IK>>
IK>> [0] http://dlang.org/phobos/std_range.html#.isRandomAccessRange

MS> I agree. But how should it be implemented? We would have to
MS> modify algorithms that require an RA range to also accept
MS> char[], but then print an error message with the suggestion to
MS> use byCodeUnit. I think that's not practicable. Any better
MS> ideas?

I meant just mentioning a workaround (byCodeUnit or representation) in the documentation, not in a compiler error.  But the latter option does make some sense, too.
November 11, 2014
IK>> Why is "char []" so special that it can't be sorted?

SS> Because sort works on ranges, and std.range has the view that
SS> char[] is a range of dchar without random access. Never mind
SS> what the compiler thinks :)
SS>
SS> I believe you can get what you want with
SS> std.string.representation:
SS>
SS> import std.string;
SS>
SS> sort(c.representation);

Thank you for showing a library way to do that.
I ended up using a cast, like "sort (cast (ubyte []) c)".
And this looks like a safe way to do the same.

Now, std.utf's byCodeUnit and std.string's representation seem like duplicate functionality, albeit with different input and output types (and bugs :) ).
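
A small sketch comparing the cast with representation, under the assumption that both simply reinterpret the existing array without copying. The helper name sameView is made up for this example. byCodeUnit is left out here because, per the bug report linked earlier in the thread, it did not yet work with sort at the time.

```d
import std.algorithm : sort;
import std.string : representation;

// Hypothetical helper: true if the cast view and the representation
// view of c cover exactly the same memory.
bool sameView(char[] c)
{
    ubyte[] viaCast = cast(ubyte[]) c;
    ubyte[] viaRepr = c.representation;
    return viaCast.ptr is viaRepr.ptr && viaCast.length == viaRepr.length;
}

void main()
{
    char[] c = "dcba".dup;
    assert(sameView(c));

    // Sorting through either view reorders the original array.
    sort(c.representation);
    assert(c == "abcd");
}
```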