July 01, 2015 Same process to different results?
When I run the code (compiled on DMD 2.067.1):
------------------------------------------------------
import std.algorithm;
import std.stdio;
import std.range;

string A="AaA";
string B="BbBb";
string C="CcCcC";

void main(){
    int L=25;

    int seg1len=(L-B.length)/2;
    int seg2len=B.length;
    int seg3len=L-seg1len-seg2len;

    (A.cycle.take(seg1len).array
    ~B.cycle.take(seg2len).array
    ~C.cycle.take(seg3len).array).writeln;

    string q = cast(string)
    (A.cycle.take(seg1len).array
    ~B.cycle.take(seg2len).array
    ~C.cycle.take(seg3len).array);

    q.writeln;
}
-----------------------------------------------
I get a weird result of
AaAAaAAaAABbBbCcCcCCcCcCC
A a A A a A A a A A B b B b C c C c C C c C c C C
Any ideas why?
July 01, 2015 Re: Same process to different results?
Posted in reply to Taylor Hillegeist
I betcha it is because A, B, and C are modified by the first pass. A lot of the range functions consume their input.
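One way to test this guess, as a minimal sketch: `cycle` and `take` receive a copy of the string slice, so advancing the range should leave the module-level variable alone. The variable name `A` is taken from the original post; the length 10 is just an illustrative value.

```d
import std.array : array;
import std.range : cycle, take;

string A = "AaA";

void main()
{
    // cycle/take work on a copy of the slice; popFront advances that copy only.
    auto first = A.cycle.take(10).array;
    auto second = A.cycle.take(10).array;

    assert(A == "AaA");      // the original string is untouched
    assert(first == second); // so a second pass sees the same elements
}
```

If both assertions hold, consumption of the input does not explain the differing output.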
July 01, 2015 Re: Same process to different results?
Posted in reply to Adam D. Ruppe
On Wednesday, 1 July 2015 at 17:06:01 UTC, Adam D. Ruppe wrote:
> I betcha it is because A, B, and C are modified by the first pass. A lot of the range functions consume their input.
Running them one at a time produces the same result. For some reason:
(A.cycle.take(seg1len).array
~B.cycle.take(seg2len).array
~C.cycle.take(seg3len).array).writeln;
is different from:
string q = cast(string)
(A.cycle.take(seg1len).array
~B.cycle.take(seg2len).array
~C.cycle.take(seg3len).array);
q.writeln;
I was wondering if it might be the cast?
July 01, 2015 Re: Same process to different results?
Posted in reply to Taylor Hillegeist
On Wednesday, 1 July 2015 at 17:13:03 UTC, Taylor Hillegeist wrote:
> string q = cast(string)
> (A.cycle.take(seg1len).array
> ~B.cycle.take(seg2len).array
> ~C.cycle.take(seg3len).array);
> q.writeln;
>
> I was wondering if it might be the cast?
Yes, the cast is wrong. You're reinterpreting (not converting) an array of `dchar`s (UTF-32 code units) as an array of `char`s (UTF-8 code units).
If you print the numeric values of the string, e.g. via std.string.representation, you can see that every actual character has three null bytes following it:
----
import std.string: representation;
writeln(q.representation);
----
[65, 0, 0, 0, 97, 0, 0, 0, 65, 0, 0, 0, 65, 0, 0, 0, 97, 0, 0, 0, 65, 0, 0, 0, 65, 0, 0, 0, 97, 0, 0, 0, 65, 0, 0, 0, 65, 0, 0, 0, 66, 0, 0, 0, 98, 0, 0, 0, 66, 0, 0, 0, 98, 0, 0, 0, 67, 0, 0, 0, 99, 0, 0, 0, 67, 0, 0, 0, 99, 0, 0, 0, 67, 0, 0, 0, 67, 0, 0, 0, 99, 0, 0, 0, 67, 0, 0, 0, 99, 0, 0, 0, 67, 0, 0, 0, 67, 0, 0, 0]
----
Use std.conv.to for less surprising conversions. And don't use casts unless you know exactly what you're doing.
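A minimal sketch of that advice, assuming the segment lengths from the original post (10, 4 and 11). The helper name `repeatTo` is hypothetical and only serves the illustration; `std.conv.to` re-encodes the `dchar[]` as UTF-8 instead of reinterpreting its memory.

```d
import std.array : array;
import std.conv : to;
import std.range : cycle, take;
import std.stdio : writeln;

// Hypothetical helper: repeat `s` cyclically until `n` code points are produced.
string repeatTo(string s, size_t n)
{
    // .array yields dchar[] here; to!string transcodes it to UTF-8.
    return s.cycle.take(n).array.to!string;
}

void main()
{
    string q = repeatTo("AaA", 10) ~ repeatTo("BbBb", 4) ~ repeatTo("CcCcC", 11);
    writeln(q);
    assert(q == "AaAAaAAaAABbBbCcCcCCcCcCC");
    assert(q.length == 25); // one UTF-8 code unit per ASCII character
}
```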
July 01, 2015 Re: Same process to different results?
Posted in reply to Taylor Hillegeist
On Wednesday, 1 July 2015 at 17:00:51 UTC, Taylor Hillegeist wrote:
> When I run the code (compiled on DMD 2.067.1):
>
>
> ------------------------------------------------------
> import std.algorithm;
> import std.stdio;
> import std.range;
>
> string A="AaA";
> string B="BbBb";
> string C="CcCcC";
>
> void main(){
> int L=25;
>
> int seg1len=(L-B.length)/2;
> int seg2len=B.length;
> int seg3len=L-seg1len-seg2len;
>
> (A.cycle.take(seg1len).array
> ~B.cycle.take(seg2len).array
> ~C.cycle.take(seg3len).array).writeln;
>
> string q = cast(string)
> (A.cycle.take(seg1len).array
> ~B.cycle.take(seg2len).array
> ~C.cycle.take(seg3len).array);
>
> q.writeln;
>
> }
> -----------------------------------------------
>
> I get a weird result of
> AaAAaAAaAABbBbCcCcCCcCcCC
> A a A A a A A a A A B b B b C c C
> c C C c C c C C
>
> Any ideas why?
Some way or another the type was converted to a dchar[]
during this process:
A.cycle.take(seg1len).array
~B.cycle.take(seg2len).array
~C.cycle.take(seg3len).array
Why would it change the type so sneaky like?... Except for maybe it's the default behavior with string, due to 32 bits => (typically one grapheme)?
I bet cycle did this.
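To see where the `dchar` comes from, the element types can be checked at compile time. This is a sketch, not a statement about Phobos internals; the names `lazyPart` and `eager` are illustrative.

```d
import std.array : array;
import std.range : cycle, take, ElementType;

void main()
{
    // As a range, a string already has dchar elements before cycle is involved.
    static assert(is(ElementType!string == dchar));

    // cycle and take pass that element type along...
    auto lazyPart = "AaA".cycle.take(4);
    static assert(is(ElementType!(typeof(lazyPart)) == dchar));

    // ...and array materializes it as dchar[], not string.
    auto eager = lazyPart.array;
    static assert(is(typeof(eager) == dchar[]));
}
```

If this compiles, `cycle` is not changing the type by itself; it only propagates the element type that the range interface reports for `string`.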
July 01, 2015 Re: Same process to different results?
Posted in reply to Taylor Hillegeist
On 7/1/15 1:00 PM, Taylor Hillegeist wrote:
> When I run the code (compiled on DMD 2.067.1):
>
>
> ------------------------------------------------------
> import std.algorithm;
> import std.stdio;
> import std.range;
>
> string A="AaA";
> string B="BbBb";
> string C="CcCcC";
>
> void main(){
> int L=25;
>
> int seg1len=(L-B.length)/2;
> int seg2len=B.length;
> int seg3len=L-seg1len-seg2len;
>
> (A.cycle.take(seg1len).array
> ~B.cycle.take(seg2len).array
> ~C.cycle.take(seg3len).array).writeln;
>
> string q = cast(string)
> (A.cycle.take(seg1len).array
> ~B.cycle.take(seg2len).array
> ~C.cycle.take(seg3len).array);
>
> q.writeln;
>
> }
> -----------------------------------------------
>
> I get a weird result of
> AaAAaAAaAABbBbCcCcCCcCcCC
> A a A A a A A a A A B b B b C c C c
> C C c C c C C
>
> Any ideas why?
Schizophrenia of Phobos.
Phobos thinks a string is a range of dchar instead of a range of char. So what cycle, take, and array all output are dchar ranges and arrays.
When you cast the dchar[] result to a string, (which is a char[]), it then treats all the 0's in each dchar element as '\0', printing a blank apparently.
-Steve
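The effect of the cast can be sketched as follows. The helper `reinterpret` is hypothetical and mirrors what `cast(string)` did in the original program; the byte-order check assumes a little-endian machine and is guarded accordingly.

```d
import std.array : array;
import std.range : cycle, take;

// Hypothetical helper: view code points as raw char code units, without converting.
const(char)[] reinterpret(const(dchar)[] codePoints)
{
    return cast(const(char)[]) codePoints;
}

void main()
{
    dchar[] r = "AaA".cycle.take(4).array;
    auto units = reinterpret(r);

    // Same memory, four times as many elements.
    assert(units.length == 4 * r.length);

    version (LittleEndian)
    {
        // Each ASCII code point is followed by three zero bytes.
        assert(units[0] == 'A');
        assert(units[1] == '\0' && units[2] == '\0' && units[3] == '\0');
    }
}
```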
July 01, 2015 Re: Same process to different results?
Posted in reply to Steven Schveighoffer
On 7/1/15 1:44 PM, Steven Schveighoffer wrote:
> Schizophrenia of Phobos.
>
> Phobos thinks a string is a range of dchar instead of a range of char.
> So what cycle, take, and array all output are dchar ranges and arrays.
>
> When you cast the dchar[] result to a string, (which is a char[]), it
> then treats all the 0's in each dchar element as '\0', printing a blank
> apparently.
This has to be one of the most obvious cases I've ever seen that phobos treating string as a range of dchar was the wrong decision. That one can't use ranges to make a new string is ridiculous. Just the thought of "fixing" this by re-encoding...
-Steve
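One possible way to build a string from ranges without decoding, as a sketch that assumes a Phobos version providing `std.utf.byCodeUnit`. The helper name `repeatUnits` is hypothetical, and `n` counts UTF-8 code units here, not code points, so a cut in the middle of a multi-byte character would produce invalid UTF-8.

```d
import std.array : array;
import std.conv : to;
import std.range : cycle, take;
import std.utf : byCodeUnit;

// Hypothetical helper: repeat the code units of `s` cyclically, `n` units in total.
string repeatUnits(string s, size_t n)
{
    // byCodeUnit presents the string as a range of char, so nothing is decoded to dchar.
    return s.byCodeUnit.cycle.take(n).array.to!string;
}

void main()
{
    string q = repeatUnits("AaA", 10) ~ repeatUnits("BbBb", 4) ~ repeatUnits("CcCcC", 11);
    assert(q == "AaAAaAAaAABbBbCcCcCCcCcCC");
}
```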
July 01, 2015 Re: Same process to different results?
Posted in reply to Steven Schveighoffer
On Wed, Jul 01, 2015 at 02:14:49PM -0400, Steven Schveighoffer via Digitalmars-d-learn wrote:
> On 7/1/15 1:44 PM, Steven Schveighoffer wrote:
>
> >Schizophrenia of Phobos.
> >
> >Phobos thinks a string is a range of dchar instead of a range of char. So what cycle, take, and array all output are dchar ranges and arrays.
> >
> >When you cast the dchar[] result to a string, (which is a char[]), it then treats all the 0's in each dchar element as '\0', printing a blank apparently.
>
> This has to be one of the most obvious cases I've ever seen that phobos treating string as a range of dchar was the wrong decision. That one can't use ranges to make a new string is ridiculous. Just the thought of "fixing" this by re-encoding...
[...]
Yeah, although Andrei has vetoed all suggestions of getting rid of autodecoding, this is one of the glaring cases where it's obviously a bad idea. It almost makes me want to create my own custom string type that serves up char instead of dchar.
T
--
There are four kinds of lies: lies, damn lies, and statistics.