May 03, 2018 Why does enumerate over range return dchar, when ranging without returns char?

I am puzzled why enumerating in a foreach returns a dchar (which forces me to cast), whereas without the enumerate the range returns a char as expected.

Example:

```
import std.stdio;
import std.range : enumerate;

void main()
{
    char[] s = ['a','b','c'];

    char[3] x;
    auto i = 0;
    foreach(c; s) {
        x[i] = c;
        i++;
    }

    writeln(x);
}
```
Above works without cast.

```
import std.stdio;
import std.range : enumerate;

void main()
{
    char[] s = ['a','b','c'];

    char[3] x;
    foreach(i, c; enumerate(s)) {
        x[i] = c;
        i++;
    }

    writeln(x);
}
```
Above fails without casting c to type char.

The function signature for enumerate shows "auto" return type, so that does not help me understand.

Kind regards
May 03, 2018 Re: Why does enumerate over range return dchar, when ranging without returns char?

Posted in reply to James Blachly

On 03/05/2018 5:44 PM, James Blachly wrote:
> I am puzzled why enumerating in a foreach returns a dchar (which forces me to cast), whereas without the enumerate the range returns a char as expected.
>
> Example:
>
> ```
> import std.stdio;
> import std.range : enumerate;
>
> void main()
> {
>     char[] s = ['a','b','c'];
>
>     char[3] x;
>     auto i = 0;
>     foreach(c; s) {
>         x[i] = c;
>         i++;
>     }
>
>     writeln(x);
> }
> ```
> Above works without cast.
>
> ```
> import std.stdio;
> import std.range : enumerate;
>
> void main()
> {
>     char[] s = ['a','b','c'];
>
>     char[3] x;
>     foreach(i, c; enumerate(s)) {
>         x[i] = c;
>         i++;
>     }
>
>     writeln(x);
> }
> ```
> Above fails without casting c to type char.
>
> The function signature for enumerate shows "auto" return type, so that does not help me understand.
>
> Kind regards

The first example uses auto-decoding (UTF-8 codepoints into a single UTF-32 one). This is considered a bad thing. But the compiler can disable it and leave it as UTF-8 code point upon request.

The second example returns a Voldemort type (means no-name) which happens to be an input range. Where it can't disable anything and has been told that it is returning a dchar. See [0] as to where this gets decoded.

Writing two small functions to replace it (and popFront), will override this behavior.

[0] https://dlang.org/phobos/std_range_primitives.html#.front
May 03, 2018 Re: Why does enumerate over range return dchar, when ranging without returns char?

Posted in reply to rikki cattermole

On 05/03/2018 07:56 AM, rikki cattermole wrote:
>> ```
>> import std.stdio;
>> import std.range : enumerate;
>>
>> void main()
>> {
>>     char[] s = ['a','b','c'];
>>
>>     char[3] x;
>>     auto i = 0;
>>     foreach(c; s) {
>>         x[i] = c;
>>         i++;
>>     }
>>
>>     writeln(x);
>> }
>> ```
>> Above works without cast.
>>
>> ```
>> import std.stdio;
>> import std.range : enumerate;
>>
>> void main()
>> {
>>     char[] s = ['a','b','c'];
>>
>>     char[3] x;
>>     foreach(i, c; enumerate(s)) {
>>         x[i] = c;
>>         i++;
>>     }
>>
>>     writeln(x);
>> }
>> ```
[...]
> The first example uses auto-decoding (UTF-8 codepoints into a single UTF-32 one). This is considered a bad thing. But the compiler can disable it and leave it as UTF-8 code point upon request.

The first example (foreach over a char[]) doesn't do any decoding. UTF-8 stays UTF-8.

Also, a `char` is a UTF-8 code *unit*, not a code *point*.

> The second example returns a Voldemort type (means no-name) which happens to be an input range. Where it can't disable anything and has been told that it is returning a dchar. See[0] as to where this gets decoded.

This is auto decoding.

> Writing two small functions to replace it (and popFront), will override this behavior.

This sounds like you can disable auto decoding by providing your own range primitives in your own module. That doesn't work, because Phobos would still use the ones from std.range.primitives.

> [0] https://dlang.org/phobos/std_range_primitives.html#.front
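
The distinction drawn in this reply can be checked at compile time. The following is a minimal sketch, not code from the thread, and it assumes a Phobos version in which auto-decoding of narrow strings is still enabled (the default when this thread was written). The variable `e` and the `\u00E9` sample character are illustrative choices.

```d
import std.range : ElementType, enumerate, walkLength;

void main()
{
    char[] s = ['a', 'b', 'c'];

    // Built-in foreach over an array visits the elements as stored:
    // char code units, with no decoding.
    foreach (c; s)
    {
        static assert(is(typeof(c) == char));
    }

    // The Phobos range primitives decode narrow strings, so when char[]
    // is treated as a range its element type is dchar
    // (assumption: auto-decoding is enabled).
    static assert(is(ElementType!(char[]) == dchar));

    // enumerate is built on those primitives, hence the dchar in the question.
    foreach (i, c; enumerate(s))
    {
        static assert(is(typeof(c) == dchar));
    }

    // Code unit vs. code point: U+00E9 is one code point
    // but two UTF-8 code units.
    string e = "\u00E9";
    assert(e.length == 2);     // length counts code units
    assert(e.walkLength == 1); // walkLength counts decoded code points
}
```

If every `static assert` passes, the program compiles and runs silently, which matches the explanation above: the language-level `foreach` keeps `char`, while anything that goes through the range primitives sees `dchar`.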
May 03, 2018 Re: Why does enumerate over range return dchar, when ranging without returns char?

Posted in reply to ag0aep6g

On 03/05/2018 9:50 PM, ag0aep6g wrote:
> On 05/03/2018 07:56 AM, rikki cattermole wrote:
>>> ```
>>> import std.stdio;
>>> import std.range : enumerate;
>>>
>>> void main()
>>> {
>>>     char[] s = ['a','b','c'];
>>>
>>>     char[3] x;
>>>     auto i = 0;
>>>     foreach(c; s) {
>>>         x[i] = c;
>>>         i++;
>>>     }
>>>
>>>     writeln(x);
>>> }
>>> ```
>>> Above works without cast.
>>>
>>> ```
>>> import std.stdio;
>>> import std.range : enumerate;
>>>
>>> void main()
>>> {
>>>     char[] s = ['a','b','c'];
>>>
>>>     char[3] x;
>>>     foreach(i, c; enumerate(s)) {
>>>         x[i] = c;
>>>         i++;
>>>     }
>>>
>>>     writeln(x);
>>> }
>>> ```
> [...]
>> The first example uses auto-decoding (UTF-8 codepoints into a single UTF-32 one). This is considered a bad thing. But the compiler can disable it and leave it as UTF-8 code point upon request.
>
> The first example (foreach over a char[]) doesn't do any decoding. UTF-8 stays UTF-8.
>
> Also, a `char` is a UTF-8 code *unit*, not a code *point*.
>
>> The second example returns a Voldemort type (means no-name) which happens to be an input range. Where it can't disable anything and has been told that it is returning a dchar. See[0] as to where this gets decoded.
>
> This is auto decoding.
>
>> Writing two small functions to replace it (and popFront), will override this behavior.
>
> This sounds like you can disable auto decoding by providing your own range primitives in your own module. That doesn't work, because Phobos would still use the ones from std.range.primitives.

Hmm, I swear this used to work.

Oh well, easy fix:

```
import std.algorithm;

// Wraps a char[] and supplies its own range primitives that yield char,
// so range algorithms use these members rather than the decoding ones.
struct Wrapper {
    char[] input;
    alias input this;

    @property char front() { return input[0]; }
    @property bool empty() { return input.length == 0; }
    void popFront() { input = input[1 .. $]; }
}

void main() {
    char[] text = ['1', '2', '3'];

    foreach(c; Wrapper(text).filter!(a => a != '\0')) {
        pragma(msg, typeof(c)); // prints the element type at compile time
    }
}
```
May 03, 2018 Re: Why does enumerate over range return dchar, when ranging without returns char?

Posted in reply to rikki cattermole

On Thursday, May 03, 2018 22:00:04 rikki cattermole via Digitalmars-d-learn wrote:
> On 03/05/2018 9:50 PM, ag0aep6g wrote:
> > On 05/03/2018 07:56 AM, rikki cattermole wrote:
> >>> ```
> >>> import std.stdio;
> >>> import std.range : enumerate;
> >>>
> >>> void main()
> >>> {
> >>>     char[] s = ['a','b','c'];
> >>>
> >>>     char[3] x;
> >>>     auto i = 0;
> >>>     foreach(c; s) {
> >>>         x[i] = c;
> >>>         i++;
> >>>     }
> >>>
> >>>     writeln(x);
> >>> }
> >>> ```
> >>> Above works without cast.
> >>>
> >>> ```
> >>> import std.stdio;
> >>> import std.range : enumerate;
> >>>
> >>> void main()
> >>> {
> >>>     char[] s = ['a','b','c'];
> >>>
> >>>     char[3] x;
> >>>     foreach(i, c; enumerate(s)) {
> >>>         x[i] = c;
> >>>         i++;
> >>>     }
> >>>
> >>>     writeln(x);
> >>> }
> >>> ```
> >
> > [...]
> >
> >> The first example uses auto-decoding (UTF-8 codepoints into a single UTF-32 one). This is considered a bad thing. But the compiler can disable it and leave it as UTF-8 code point upon request.
> >
> > The first example (foreach over a char[]) doesn't do any decoding. UTF-8
> > stays UTF-8.
> >
> > Also, a `char` is a UTF-8 code *unit*, not a code *point*.
> >
> >> The second example returns a Voldemort type (means no-name) which happens to be an input range. Where it can't disable anything and has been told that it is returning a dchar. See[0] as to where this gets decoded.
> >
> > This is auto decoding.
> >
> >> Writing two small functions to replace it (and popFront), will
> >> override this behavior.
> >
> > This sounds like you can disable auto decoding by providing your own range primitives in your own module. That doesn't work, because Phobos would still use the ones from std.range.primitives.
>
> Hmm, I swear this used to work.
>
> Oh well, easy fix:
>
> ```
> import std.algorithm;
>
> struct Wrapper {
>     char[] input;
>     alias input this;
>
>     @property char front() { return input[0]; }
>     @property bool empty() { return input.length == 0; }
>     void popFront() { input = input[1 .. $]; }
> }
>
> void main() {
>     char[] text = ['1', '2', '3'];
>
>     foreach(c; Wrapper(text).filter!(a => a != '\0')) {
>         pragma(msg, typeof(c));
>     }
> }
> ```

The standard way to get around auto-decoding is std.utf.byCodeUnit.

- Jonathan M Davis
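
As a hedged illustration of this suggestion, the second example from the original post could be rewritten roughly as follows. This is a sketch rather than code posted in the thread: it reuses the names `s` and `x` from the question, drops the redundant `i++` (the index already comes from `enumerate`), and adds assertions that are not in the original.

```d
import std.range : enumerate;
import std.stdio : writeln;
import std.utf : byCodeUnit;

void main()
{
    char[] s = ['a', 'b', 'c'];
    char[3] x;

    // byCodeUnit wraps the array in a range whose element type is char,
    // so enumerate yields (index, char) pairs and no cast is needed.
    foreach (i, c; s.byCodeUnit.enumerate)
    {
        static assert(is(typeof(c) == char));
        x[i] = c;
    }

    assert(x[] == "abc");
    writeln(x);
}
```

Unlike the `Wrapper` struct earlier in the thread, this needs no user-defined range type; the trade-off is that the loop now sees individual UTF-8 code units, so multi-byte characters would arrive as several `char` values.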