Thread overview
Same process to different results?
Jul 01, 2015
Taylor Hillegeist
Jul 01, 2015
Adam D. Ruppe
Jul 01, 2015
Taylor Hillegeist
Jul 01, 2015
anonymous
Jul 01, 2015
Taylor Hillegeist
Jul 01, 2015
H. S. Teoh
July 01, 2015
When I run the code (compiled on DMD 2.067.1):


------------------------------------------------------
import std.algorithm;
import std.stdio;
import std.range;

string A="AaA";
string B="BbBb";
string C="CcCcC";

void main(){
	int L=25;

  int seg1len=(L-B.length)/2;
  int seg2len=B.length;
  int seg3len=L-seg1len-seg2len;

  (A.cycle.take(seg1len).array
  ~B.cycle.take(seg2len).array
  ~C.cycle.take(seg3len).array).writeln;

  string q = cast(string)
  (A.cycle.take(seg1len).array
  ~B.cycle.take(seg2len).array
  ~C.cycle.take(seg3len).array);

  q.writeln;

}
-----------------------------------------------

I get a weird result of
AaAAaAAaAABbBbCcCcCCcCcCC
A   a   A   A   a   A   A   a   A   A   B   b   B   b   C   c   C   c   C   C   c   C   c   C   C

Any ideas why?


July 01, 2015
I betcha it is because A, B, and C are modified by the first pass. A lot of the range functions consume their input.
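
A quick way to test that guess is sketched below. It assumes nothing beyond Phobos; `repeated` is a hypothetical helper name, not something from the original post. Since a string is passed to `cycle` by value, only a copy of the slice should be advanced, so two passes over the same variable should agree.

```d
import std.array : array;
import std.range : cycle, take;

// Hypothetical helper: repeat `s` cyclically until `n` characters are produced.
dchar[] repeated(string s, size_t n)
{
    // `s` is a copy of the caller's slice, so the caller's variable is untouched.
    return s.cycle.take(n).array;
}

void main()
{
    string a = "AaA";
    auto first = repeated(a, 5);
    auto second = repeated(a, 5);

    assert(a == "AaA");        // the original string is unchanged
    assert(first == second);   // a second pass sees the same input
    assert(first == "AaAAa"d); // note the result is UTF-32, not UTF-8
}
```

If these assertions hold, consumption of A, B, and C is probably not the cause, and the `dchar[]` return type is the more interesting clue.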
July 01, 2015
On Wednesday, 1 July 2015 at 17:06:01 UTC, Adam D. Ruppe wrote:
> I betcha it is because A, B, and C are modified by the first pass. A lot of the range functions consume their input.

Running them one at a time produces the same result.

For some reason:

  (A.cycle.take(seg1len).array
  ~B.cycle.take(seg2len).array
  ~C.cycle.take(seg3len).array).writeln;

is different from:

  string q = cast(string)
  (A.cycle.take(seg1len).array
  ~B.cycle.take(seg2len).array
  ~C.cycle.take(seg3len).array);
  q.writeln;

I was wondering if it might be the cast?
July 01, 2015
On Wednesday, 1 July 2015 at 17:13:03 UTC, Taylor Hillegeist wrote:
>   string q = cast(string)
>   (A.cycle.take(seg1len).array
>   ~B.cycle.take(seg2len).array
>   ~C.cycle.take(seg3len).array);
>   q.writeln;
>
> I was wondering if it might be the cast?

Yes, the cast is wrong. You're reinterpreting (not converting) an array of `dchar`s (UTF-32 code units) as an array of `char`s (UTF-8 code units).

If you print the numeric values of the string, e.g. via std.string.representation, you can see that every actual character has three null bytes following it:
----
import std.string: representation;
writeln(q.representation);
----
[65, 0, 0, 0, 97, 0, 0, 0, 65, 0, 0, 0, 65, 0, 0, 0, 97, 0, 0, 0, 65, 0, 0, 0, 65, 0, 0, 0, 97, 0, 0, 0, 65, 0, 0, 0, 65, 0, 0, 0, 66, 0, 0, 0, 98, 0, 0, 0, 66, 0, 0, 0, 98, 0, 0, 0, 67, 0, 0, 0, 99, 0, 0, 0, 67, 0, 0, 0, 99, 0, 0, 0, 67, 0, 0, 0, 67, 0, 0, 0, 99, 0, 0, 0, 67, 0, 0, 0, 99, 0, 0, 0, 67, 0, 0, 0, 67, 0, 0, 0]
----

Use std.conv.to for less surprising conversions. And don't use casts unless you know exactly what you're doing.
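
As a minimal sketch of that advice (`cycled` is a hypothetical helper, and the segment lengths 10, 4, and 11 are the values the original code computes for L=25), `to!string` re-encodes the UTF-32 result as UTF-8 instead of reinterpreting its memory:

```d
import std.array : array;
import std.conv : to;
import std.range : cycle, take;

// Hypothetical helper: build a UTF-8 string of `n` characters by cycling `s`.
string cycled(string s, size_t n)
{
    dchar[] utf32 = s.cycle.take(n).array; // autodecoded into UTF-32
    return utf32.to!string;                // converted to UTF-8, not reinterpreted
}

void main()
{
    import std.stdio : writeln;

    string q = cycled("AaA", 10) ~ cycled("BbBb", 4) ~ cycled("CcCcC", 11);
    assert(q.length == 25);
    assert(q == "AaAAaAAaAABbBbCcCcCCcCcCC");
    writeln(q);
}
```

Using `size_t` for the lengths also avoids the implicit narrowing that `int seg2len=B.length;` relies on, which I would expect to be rejected on 64-bit targets.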
July 01, 2015
On Wednesday, 1 July 2015 at 17:00:51 UTC, Taylor Hillegeist wrote:
> When I run the code (compiled on DMD 2.067.1):
>
>
> ------------------------------------------------------
> import std.algorithm;
> import std.stdio;
> import std.range;
>
> string A="AaA";
> string B="BbBb";
> string C="CcCcC";
>
> void main(){
> 	int L=25;
>
>   int seg1len=(L-B.length)/2;
>   int seg2len=B.length;
>   int seg3len=L-seg1len-seg2len;
>
>   (A.cycle.take(seg1len).array
>   ~B.cycle.take(seg2len).array
>   ~C.cycle.take(seg3len).array).writeln;
>
>   string q = cast(string)
>   (A.cycle.take(seg1len).array
>   ~B.cycle.take(seg2len).array
>   ~C.cycle.take(seg3len).array);
>
>   q.writeln;
>
> }
> -----------------------------------------------
>
> I get a weird result of
> AaAAaAAaAABbBbCcCcCCcCcCC
> A   a   A   A   a   A   A   a   A   A   B   b   B   b   C   c   C
>   c   C   C   c   C   c   C   C
>
> Any ideas why?

Somehow the type was converted to dchar[] during this process:

 A.cycle.take(seg1len).array
~B.cycle.take(seg2len).array
~C.cycle.take(seg3len).array

Why would it change the type so sneakily? Unless maybe it's the default behavior for strings, since a 32-bit dchar holds a full code point (typically one grapheme)?
I bet cycle did this.
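
A compile-time check along these lines should show where the type changes. This is a sketch assuming the autodecoding behavior of the Phobos versions I know of, and `collectsDchars` is a hypothetical name. The element type of a string, viewed as a range, is already dchar, so any range function inherits it, not just cycle.

```d
import std.array : array;
import std.range : cycle, take, ElementType;

// Viewed as a range, a string yields decoded code points.
static assert(is(ElementType!string == dchar));

// Hypothetical flag: true if the cycle/take/array pipeline collects dchars.
enum bool collectsDchars = is(typeof("AaA".cycle.take(2).array) == dchar[]);
static assert(collectsDchars);

void main()
{
    // Plain indexing still works on UTF-8 code units.
    string s = "AaA";
    static assert(is(typeof(s[0]) == immutable(char)));
    assert(s.length == 3);
}
```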
July 01, 2015
On 7/1/15 1:00 PM, Taylor Hillegeist wrote:
> When I run the code (compiled on DMD 2.067.1):
>
>
> ------------------------------------------------------
> import std.algorithm;
> import std.stdio;
> import std.range;
>
> string A="AaA";
> string B="BbBb";
> string C="CcCcC";
>
> void main(){
>      int L=25;
>
>    int seg1len=(L-B.length)/2;
>    int seg2len=B.length;
>    int seg3len=L-seg1len-seg2len;
>
>    (A.cycle.take(seg1len).array
>    ~B.cycle.take(seg2len).array
>    ~C.cycle.take(seg3len).array).writeln;
>
>    string q = cast(string)
>    (A.cycle.take(seg1len).array
>    ~B.cycle.take(seg2len).array
>    ~C.cycle.take(seg3len).array);
>
>    q.writeln;
>
> }
> -----------------------------------------------
>
> I get a weird result of
> AaAAaAAaAABbBbCcCcCCcCcCC
> A   a   A   A   a   A   A   a   A   A   B   b   B   b   C   c   C   c
> C   C   c   C   c   C   C
>
> Any ideas why?

Schizophrenia of Phobos.

Phobos thinks a string is a range of dchar instead of a range of char. So what cycle, take, and array all output are dchar ranges and arrays.

When you cast the dchar[] result to a string (which is an immutable(char)[]), each of the 0 bytes in every dchar element is then treated as a '\0' character, which apparently prints as a blank.

-Steve
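
To make the reinterpretation concrete, here is a small sketch (`bytesAfterCast` is a hypothetical helper, and the expected byte order assumes a little-endian machine). The cast keeps the same memory and only changes how it is viewed, so the length quadruples and three zero bytes follow each ASCII character.

```d
import std.string : representation;

// Hypothetical helper: the raw bytes seen after casting UTF-32 data to string.
immutable(ubyte)[] bytesAfterCast(dchar[] utf32)
{
    // Reinterpreting cast: no conversion takes place, the same bytes are relabeled.
    string reinterpreted = cast(string) utf32;
    return reinterpreted.representation;
}

void main()
{
    dchar[] utf32 = "Ab"d.dup;
    auto bytes = bytesAfterCast(utf32);

    // Each 4-byte dchar becomes four 1-byte chars.
    assert(bytes.length == 4 * utf32.length);

    version (LittleEndian)
    {
        immutable(ubyte)[] expected = [65, 0, 0, 0, 98, 0, 0, 0];
        assert(bytes == expected);
    }
}
```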
July 01, 2015
On 7/1/15 1:44 PM, Steven Schveighoffer wrote:

> Schizophrenia of Phobos.
>
> Phobos thinks a string is a range of dchar instead of a range of char.
> So what cycle, take, and array all output are dchar ranges and arrays.
>
> When you cast the dchar[] result to a string, (which is a char[]), it
> then treats all the 0's in each dchar element as '\0', printing a blank
> apparently.

This has to be one of the most obvious cases I've ever seen that Phobos treating string as a range of dchar was the wrong decision. That one can't use ranges to make a new string is ridiculous. Just the thought of "fixing" this by re-encoding...

-Steve
July 01, 2015
On Wed, Jul 01, 2015 at 02:14:49PM -0400, Steven Schveighoffer via Digitalmars-d-learn wrote:
> On 7/1/15 1:44 PM, Steven Schveighoffer wrote:
> 
> >Schizophrenia of Phobos.
> >
> >Phobos thinks a string is a range of dchar instead of a range of char.  So what cycle, take, and array all output are dchar ranges and arrays.
> >
> >When you cast the dchar[] result to a string, (which is a char[]), it then treats all the 0's in each dchar element as '\0', printing a blank apparently.
> 
> This has to be one of the most obvious cases I've ever seen that phobos treating string as a range of dchar was the wrong decision. That one can't use ranges to make a new string is ridiculous. Just the thought of "fixing" this by re-encoding...
[...]

Yeah, although Andrei has vetoed all suggestions of getting rid of autodecoding, this is one of the glaring cases where it's obviously a bad idea.

It almost makes me want to create my own custom string type that serves up char instead of dchar.
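
Short of a custom string type, one possible workaround is `std.utf.byCodeUnit`, which presents a string as a range of code units without decoding. The sketch below assumes a Phobos release that ships `byCodeUnit` (I am not certain 2.067.1 does); `cycledUnits` is a hypothetical helper name.

```d
import std.array : array;
import std.range : cycle, take;
import std.utf : byCodeUnit;

// Hypothetical helper: cycle over UTF-8 code units with no decoding step.
auto cycledUnits(string s, size_t n)
{
    return s.byCodeUnit.cycle.take(n).array;
}

void main()
{
    auto q = cycledUnits("AaA", 10) ~ cycledUnits("BbBb", 4) ~ cycledUnits("CcCcC", 11);

    static assert(typeof(q[0]).sizeof == 1); // char-sized elements, not dchar
    assert(q == "AaAAaAAaAABbBbCcCcCCcCcCC");
}
```

The trade-off is that cycling or truncating by code unit can split a multi-byte character, so this is only safe as written for ASCII input like the strings in this thread.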


T

-- 
There are four kinds of lies: lies, damn lies, and statistics.