char[][] join ==> string
April 07, 2011
Given an array of strings std.string.join() returns a single string:

import std.string;
void main() {
    string[] a1 = ["hello", "red"];
    string j1 = join(a1, " "); // OK
}


But in a program I need an array of mutable arrays of chars. If I join the arrays I get a mutable array of chars. But I need a string:

import std.string;
void main() {
    char[][] a2 = ["hello".dup, "red".dup];
    string j2 = join(a2, " "); // error
}

Error: cannot implicitly convert expression (join(a," ")) of type char[] to string

.idup avoids the error:

string j3 = join(a2, " ").idup; // OK

Given the low efficiency of the D GC it's better to reduce memory allocations as much as possible.
Here join() creates a brand new array, so idup performs a useless copy. To avoid this extra copy do I have to write another joinString() function?

Bye,
bearophile
April 07, 2011
On 04/06/2011 05:13 PM, bearophile wrote:
> Given an array of strings std.string.join() returns a single string:
>
> import std.string;
> void main() {
>      string[] a1 = ["hello", "red"];
>      string j1 = join(a1, " "); // OK
> }
>
>
> But in a program I need an array of mutable arrays of chars. If I join the arrays I get a mutable array of chars. But I need a string:

Tangentially off-topic: This does not apply to your case, but I think we should think twice before deciding that we need a string. For example, function parameters should be const(char[]) (or const(char)[]) instead of string, as that type accepts both mutable and immutable strings.
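
To illustrate the point about parameter types, here is a minimal sketch. countSpaces is a hypothetical function made up for this example (it is not from Phobos). Because it takes const(char)[], it accepts both a string and a char[] without any cast or copy:

```d
import std.stdio : writeln;

// Hypothetical example function (not from Phobos): counts spaces in any
// character slice, whether its elements are mutable, const, or immutable.
size_t countSpaces(const(char)[] text) pure nothrow
{
    size_t n = 0;
    foreach (char c; text)
        if (c == ' ')
            ++n;
    return n;
}

void main()
{
    string fixed = "hello red";
    char[] editable = "hello red world".dup;

    // Both calls compile: string and char[] convert implicitly to const(char)[].
    writeln(countSpaces(fixed));
    writeln(countSpaces(editable));
}
```

Had the parameter been declared as string, the second call would need .idup or a cast.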

>
> import std.string;
> void main() {
>      char[][] a2 = ["hello".dup, "red".dup];
>      string j2 = join(a2, " "); // error

If possible, this might work:

    const(char[]) j2 = join(a2, " ");

There is also std.exception.assumeUnique, but it is too eager to be safe: it takes its argument by ref so that it can null it, and join's return value is not an lvalue, so this fails:

    string j2 = assumeUnique(join(a2, " ")); // error

Finally, casting ourselves works:

    string j2 = cast(string)join(a2, " ");

> }
>
> Error: cannot implicitly convert expression (join(a," ")) of type char[] to string
>
> .idup avoids the error:
>
> string j3 = join(a2, " ").idup; // OK
>
> Given the low efficiency of the D GC it's better to reduce memory allocations as much as possible.
> Here join() creates a brand new array, so idup performs a useless copy. To avoid this extra copy do I have to write another joinString() function?
>
> Bye,
> bearophile

Ali

April 07, 2011
On 04/07/2011 03:07 AM, Ali Çehreli wrote:
>>  Given an array of strings std.string.join() returns a single string:
>>
>>  import std.string;
>>  void main() {
>>       string[] a1 = ["hello", "red"];
>>       string j1 = join(a1, " "); // OK
>>  }
>>
>>
>>  But in a program I need an array of mutable arrays of chars. If I join the
> arrays I get a mutable array of chars.
> [...]
> Finally, casting ourselves works:
>
>      string j2 = cast(string)join(a2, " ");

Oh, that's very good news! Thanks Ali, I never thought of that solution. I'm often i/dup-ing from/to string to manipulate text because there is no automatic conversion.
cast() works in place, doesn't it? So this is supposed to avoid the copy.

PS: Checked: indeed, it works in-place. But watch the gotcha:

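unittest {
    import std.stdio : writeln;

    string s = "abc";
    char[] chars = cast(char[])s;   // casts away immutability, no copy
    chars ~= "de";                  // appending allocates a new buffer for chars
    s = cast(string) chars;         // in-place cast: s now aliases chars' buffer
    writeln(s, ' ', chars); // abcde abcde

    chars[1] = 'z';                 // visible through s too, since both share the buffer
    writeln(s, ' ', chars); // azcde azcde
}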

s's chars are mutable ;-) So, I guess there is /really/ no reason for implicit casts between char[] and string not to exist. (I assumed the reason was precisely to avoid such traps.)

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

April 07, 2011
On 04/07/2011 09:52 AM, spir wrote:
> On 04/07/2011 03:07 AM, Ali Çehreli wrote:
>>> Given an array of strings std.string.join() returns a single string:
>>>
>>> import std.string;
>>> void main() {
>>> string[] a1 = ["hello", "red"];
>>> string j1 = join(a1, " "); // OK
>>> }
>>>
>>>
>>> But in a program I need an array of mutable arrays of chars. If I join the
>> arrays I get a mutable array of chars.
>> [...]
>> Finally, casting ourselves works:
>>
>> string j2 = cast(string)join(a2, " ");
>
> Oh, that's very good news! Thanks Ali, I never thought of that solution. I'm
> often i/dup-ing from/to string to manipulate text due to the fact there is no
> automatic conversion.
> cast() works in place, doesn't it? So this is supposed to avoid the copy.
>
> PS: Checked: indeed, it works in-place. But watch the gotcha:
>
> unittest {
> string s = "abc";
> char[] chars = cast(char[])s;
> chars ~= "de";
> s = cast(string) chars;
> writeln(s, ' ', chars); // abcde abcde

Sorry: forgot this line:
    assert(s.ptr == chars.ptr); // pass

>
> chars[1] = 'z';
> writeln(s, ' ', chars); // azcde azcde
> }
>
> s's chars are mutable ;-) So, I guess there is /really/ no reason for implicit
> casts between char[] and string not to exist. (I assumed the reason was
> precisely to avoid such traps).
>
> Denis

-- 
_________________
vita es estrany
spir.wikidot.com

April 07, 2011
On 04/07/2011 09:52 AM, spir wrote:
> On 04/07/2011 03:07 AM, Ali Çehreli wrote:
>>> Given an array of strings std.string.join() returns a single string:
>>>
>>> import std.string;
>>> void main() {
>>> string[] a1 = ["hello", "red"];
>>> string j1 = join(a1, " "); // OK
>>> }
>>>
>>>
>>> But in a program I need an array of mutable arrays of chars. If I join the
>> arrays I get a mutable array of chars.
>> [...]
>> Finally, casting ourselves works:
>>
>> string j2 = cast(string)join(a2, " ");
>
> Oh, that's very good news! Thanks Ali, I never thought of that solution. I'm
> often i/dup-ing from/to string to manipulate text due to the fact there is no
> automatic conversion.
> cast() works in place, doesn't it? So this is supposed to avoid the copy.
>
> PS: Checked: indeed, it works in-place. But watch the gotcha:
>
> unittest {
> string s = "abc";
> char[] chars = cast(char[])s;
> chars ~= "de";
> s = cast(string) chars;
> writeln(s, ' ', chars); // abcde abcde
>
> chars[1] = 'z';
> writeln(s, ' ', chars); // azcde azcde
> }
>
> s's chars are mutable ;-) So, I guess there is /really/ no reason for implicit
> casts between char[] and string not to exist. (I assumed the reason was
> precisely to avoid such traps).

After some more thought, I guess it's better to leave things as they are. We have a way to cast without a copy, which perfectly solves one issue. The other issue, typing, is small enough to live with, since it also serves as a warning to the programmer about the above trap.
What should definitely be done is to teach this idiom in all relevant places of the reference, manuals, and tutorials: while this issue is often raised on the D lists, I had never read about it (nor thought of it myself).

Questions: did you know this idiom? if yes, have you found it yourself or read about it? if the latter, where?

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

April 07, 2011
On Thu, 07 Apr 2011 02:13:16 +0200, bearophile <bearophileHUGS@lycos.com> wrote:

> Given an array of strings std.string.join() returns a single string:
>
> import std.string;
> void main() {
>     string[] a1 = ["hello", "red"];
>     string j1 = join(a1, " "); // OK
> }
>
>
> But in a program I need an array of mutable arrays of chars. If I join the arrays I get a mutable array of chars. But I need a string:
>
> import std.string;
> void main() {
>     char[][] a2 = ["hello".dup, "red".dup];
>     string j2 = join(a2, " "); // error
> }
>
> Error: cannot implicitly convert expression (join(a," ")) of type char[] to string
>
> .idup avoids the error:
>
> string j3 = join(a2, " ").idup; // OK
>
> Given the low efficiency of the D GC it's better to reduce memory allocations as much as possible.
> Here join() creates a brand new array, so idup performs a useless copy. To avoid this extra copy do I have to write another joinString() function?
>
> Bye,
> bearophile

Isn't this a prime case for std.exception.assumeUnique?

-- 
Simen
April 07, 2011
On 04/07/2011 09:01 AM, Simen kjaeraas wrote:
> On Thu, 07 Apr 2011 02:13:16 +0200, bearophile
> <bearophileHUGS@lycos.com> wrote:
>
>> Given an array of strings std.string.join() returns a single string:
>>
>> import std.string;
>> void main() {
>> string[] a1 = ["hello", "red"];
>> string j1 = join(a1, " "); // OK
>> }
>>
>>
>> But in a program I need an array of mutable arrays of chars. If I join
>> the arrays I get a mutable array of chars. But I need a string:
>>
>> import std.string;
>> void main() {
>> char[][] a2 = ["hello".dup, "red".dup];
>> string j2 = join(a2, " "); // error
>> }
>>
>> Error: cannot implicitly convert expression (join(a," ")) of type
>> char[] to string
>>
>> .idup avoids the error:
>>
>> string j3 = join(a2, " ").idup; // OK
>>
>> Given the low efficiency of the D GC it's better to reduce memory
>> allocations as much as possible.
>> Here join() creates a brand new array, so idup performs a useless
>> copy. To avoid this extra copy do I have to write another joinString()
>> function?
>>
>> Bye,
>> bearophile
>
> Isn't this a prime case for std.exception.assumeUnique?

Almost. assumeUnique is too eager and tries to null its reference parameter. Copying from std/exception.d:

immutable(T)[] assumeUnique(T)(ref T[] array) pure nothrow
{
    auto result = cast(immutable(T)[]) array;
    array = null;
    return result;
}

And that fails as join's return type is not an lvalue.

We need a simplyAssumeUnique() that doesn't null the reference parameter :); and it would have no value over casting other than communicating the intent.
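
A minimal sketch of such a helper follows. The name simplyAssumeUnique is hypothetical and the function is not part of Phobos. It takes its argument by value, so it accepts join's rvalue result. It is nothing more than the cast with a name attached, and it is only safe when no mutable reference to the array survives the call:

```d
import std.array : join;

// Hypothetical helper (not in Phobos): same cast as assumeUnique, but the
// argument is taken by value, so rvalues are accepted and nothing is nulled.
// The caller must guarantee that no other mutable reference to `array` exists.
immutable(T)[] simplyAssumeUnique(T)(T[] array) pure nothrow
{
    return cast(immutable(T)[]) array;
}

void main()
{
    char[][] a2 = ["hello".dup, "red".dup];

    // join allocates a fresh char[]; no other reference to it exists,
    // so treating it as immutable here does not alias mutable data.
    string j2 = simplyAssumeUnique(join(a2, " "));
    assert(j2 == "hello red");
}
```

Unlike .idup, this does not allocate or copy a second time.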

Ali

April 07, 2011
On 04/07/2011 01:04 AM, spir wrote:
> On 04/07/2011 09:52 AM, spir wrote:
>> On 04/07/2011 03:07 AM, Ali Çehreli wrote:
>>>> Given an array of strings std.string.join() returns a single string:
>>>>
>>>> import std.string;
>>>> void main() {
>>>> string[] a1 = ["hello", "red"];
>>>> string j1 = join(a1, " "); // OK
>>>> }
>>>>
>>>>
>>>> But in a program I need an array of mutable arrays of chars. If I
>>>> join the
>>> arrays I get a mutable array of chars.
>>> [...]
>>> Finally, casting ourselves works:
>>>
>>> string j2 = cast(string)join(a2, " ");

> Questions: did you know this idiom? if yes, have you found it yourself
> or read about it? if the latter, where?

I had heard about assumeUnique() a couple of times in these forums. I remember looking at its implementation.

Ali

April 07, 2011
spir Wrote:

> > unittest {
> > string s = "abc";
> > char[] chars = cast(char[])s;
> > chars ~= "de";
> > s = cast(string) chars;
> > writeln(s, ' ', chars); // abcde abcde
> >
> > chars[1] = 'z';
> > writeln(s, ' ', chars); // azcde azcde
> > }
> >
> > s's chars are mutable ;-) So, I guess there is /really/ no reason for implicit casts between char[] and string not to exist. (I assumed the reason was precisely to avoid such traps).
> 
> After some more thought, I guess it's better to leave things as they are. We have a
> way to cast without copy --which is one issue perfectly solved. The other issue
> --typing-- is small enough to keep it, since it also serves as warning to the
> programmer about the above trap.
> What should definitely be done is teaching this idiom in all relevant places of
> the reference, manuals, tutorials: while this issue is often submitted on D
> lists, I had never read about it (nor thought about it myself).
> 
> Questions: did you know this idiom? if yes, have you found it yourself or read about it? if the latter, where?

Casting to and from string/char[] is very dangerous, even through assumeUnique. assumeUnique is intended to be used for returning mutable data as immutable from a function. Casting is often a no-op for the CPU and, as you discovered, removes any safety provided by the type system.

While modifying immutable data is undefined, if you can guarantee the data truly is mutable and the compiler won't be optimizing with the assumption of immutability, it is perfectly safe. It is just in the hands of the programmer now.
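
As a rough sketch of that intended use: the function below builds a buffer that only it can see, then hands it out as a string via std.exception.assumeUnique. makeRule is a made-up name for this example. Because buf is a local lvalue, assumeUnique can null it after the cast, so no mutable reference is left behind:

```d
import std.exception : assumeUnique;

// Hypothetical example function: fills a private buffer, then returns it
// as immutable. No other reference to buf exists, so the cast is sound.
string makeRule(size_t width) pure nothrow
{
    char[] buf = new char[width];
    buf[] = '-';               // still mutable while we build it
    return assumeUnique(buf);  // buf is set to null here; the result is immutable
}

void main()
{
    string rule = makeRule(5);
    assert(rule == "-----");
}
```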
April 07, 2011
Jesse Phillips:

> Casting to and from string/char[] is very dangerous, even through assumeUnique. assumeUnique is intended to be used for returning mutable data as immutable from a function. Casting is often a no-op for the CPU and, as you discovered, removes any safety provided by the type system.

A more expressive type system (uniqueness annotations, lending, linear types) allows writing that code in a safe way, but introduces some new complexities too.

The GHC Haskell compiler has some experimental extensions (mostly to its type system) that are disabled by default (you need to add an annotation in your programs to switch each extension on), to experiment with and debug/engineer ideas like that. I presume Walter is too busy to do this in D, but I'd like something similar.

On the other hand we'll probably have a Phobos package for "experimental" standard modules.

Bye,
bearophile