May 14, 2012
I think the zero-terminated literal shtick is pointless. Literals are rarely passed to C functions, so we gotta use std.utf.toUTFz anyway.

On Mon, May 14, 2012 at 5:03 PM, Christophe <travert@phare.normalesup.org> wrote:

> deadalnix, in message (digitalmars.D:167258), wrote:
> > A good solution would be to set the pointer to 0 when the length is set to 0.
>
> String literals are zero-terminated. "" cannot point to 0x0, unless we drop this rule. Maybe we should...
>



-- 
Bye,
Gor Gyolchanyan.


May 14, 2012
On Mon, 14 May 2012 06:08:17 -0400, Gor Gyolchanyan <gor.f.gyolchanyan@gmail.com> wrote:

> Hi! I have a small question:
> Is the test for a null array equivalent to a test for zero-length array?

== tests for length and content equivalence.

'is' tests for both pointer and length equivalence (and therefore, content equality is implied).

There is a lot of confusion about null arrays. A null array is simply an empty array that happens to be pointing to null. Other than that, it is equivalent to an empty array, and should be treated as such.
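
To make the distinction concrete, here is a minimal sketch of how == and 'is' treat null and empty arrays. The helper names isNullArray and isEmptyArray are made up for this example; they are not part of druntime or Phobos.

```d
// Hypothetical helpers, for illustration only.
bool isNullArray(const(int)[] a)  { return a is null; }      // looks at pointer and length
bool isEmptyArray(const(int)[] a) { return a.length == 0; }  // looks at length only

void main()
{
    int[] backing = [1, 2, 3];
    int[] nullArr = null;
    int[] emptyLit = [];                // the runtime returns a null array for []
    int[] emptySlice = backing[0 .. 0]; // length 0, but points into real memory

    // == compares length and contents, so all three are equal.
    assert(nullArr == emptyLit);
    assert(nullArr == emptySlice);

    // 'is' also compares the pointer, so the slice is not null.
    assert(isNullArray(nullArr));
    assert(isNullArray(emptyLit));
    assert(!isNullArray(emptySlice));

    // All three are empty.
    assert(isEmptyArray(nullArr) && isEmptyArray(emptyLit) && isEmptyArray(emptySlice));
}
```

Checking .length (or == with another empty array) is the test that treats all empty arrays alike.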

One can use the idea that "null arrays are special", but it is likely to lead to confusing semantics, where an empty array is different from a null array. if(arr) should IMO succeed iff length > 0; that it tests the pointer instead is one of the main reasons for the confusion.

Note that [] is a request to the runtime to build an empty array.  The runtime detects this, and rather than consuming a heap allocation to build nothing, it simply returns a null-pointed array.  This is 100% the right decision, and I don't think anyone would ever convince me (or Andrei or Walter) otherwise.

> This is particularly interesting for strings.
> For instance, I could return an empty string from a toString-like function
> and the empty string would be printed, but if I returned a null string,
> that would indicate that there is no string representation, and it would
> cause some default string to be printed.

These are the confusing semantics I was referring to ;)  I would recommend we try to avoid this kind of distinction wherever possible.

> So, the question is: is a null array any different from an empty array?

I would say it technically is different, but you should treat it as equivalent unless you have a really really good reason not to.  It's just another empty array which happens to be pointing at 0.

-Steve
May 14, 2012
On 14-05-2012 15:21, Gor Gyolchanyan wrote:
> I think the zero-terminated literal shtick is pointless. Literals are
> rarely passed to C functions, so we gotta use std.utf.toUTFz anyway.
>
> On Mon, May 14, 2012 at 5:03 PM, Christophe
> <travert@phare.normalesup.org> wrote:
>
>     deadalnix, in message (digitalmars.D:167258), wrote:
>      > A good solution would be to set the pointer to 0 when the length
>      > is set to 0.
>
>     String literals are zero-terminated. "" cannot point to 0x0,
>     unless we drop this rule. Maybe we should...
>
>
>
>
> --
> Bye,
> Gor Gyolchanyan.

This is very false. I invite you to read almost any module in druntime. You'll find that it makes heavy use of printf debugging.

That being said, dropping the null-termination rule when passing strings to non-const(char)* parameters/variables/etc would be sane enough (I think).

-- 
- Alex
May 14, 2012
On 14/05/2012 16:37, Steven Schveighoffer wrote:
> Note that [] is a request to the runtime to build an empty array. The
> runtime detects this, and rather than consuming a heap allocation to
> build nothing, it simply returns a null-pointed array. This is 100% the
> right decision, and I don't think anyone would ever convince me (or
> Andrei or Walter) otherwise.
>

Obviously this is the right thing to do!

The question is why an array of length 0 isn't nulled? It leads to confusing semantics here, and can keep alive memory that can't be accessed.
May 14, 2012
On Mon, 14 May 2012 15:30:25 -0400, deadalnix <deadalnix@gmail.com> wrote:

> On 14/05/2012 16:37, Steven Schveighoffer wrote:
>> Note that [] is a request to the runtime to build an empty array. The
>> runtime detects this, and rather than consuming a heap allocation to
>> build nothing, it simply returns a null-pointed array. This is 100% the
>> right decision, and I don't think anyone would ever convince me (or
>> Andrei or Walter) otherwise.
>>
>
> Obviously this is the right thing to do!
>
> The question is why an array of length 0 isn't nulled? It leads to confusing semantics here, and can keep alive memory that can't be accessed.

int[] arr;
arr.reserve(10000);
assert(arr.length == 0);
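
Spelled out a bit more, as a sketch (reservedButEmpty is a hypothetical helper name, not a library function): after reserve the array is empty but owns a block of memory, and nulling it would throw that block away.

```d
// Hypothetical helper: returns an empty array that already owns capacity.
int[] reservedButEmpty(size_t n)
{
    int[] arr;
    arr.reserve(n);
    return arr;
}

void main()
{
    auto arr = reservedButEmpty(10_000);
    assert(arr.length == 0);        // still empty
    assert(arr.capacity >= 10_000); // but the memory is there
    assert(arr !is null);           // so the pointer is not null

    // Appending up to the reserved capacity reuses the same block.
    auto before = arr.ptr;
    foreach (i; 0 .. 10_000)
        arr ~= i;
    assert(arr.ptr is before);
}
```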

-Steve
May 14, 2012
On 05/14/2012 01:51 PM, deadalnix wrote:
> On 14/05/2012 12:49, Gor Gyolchanyan wrote:
>> So, null arrays and empty arrays are always the same, except for an
>> empty string, which is a valid non-null array of characters with length
>> 0, right?
>>
>
> If that is the current behavior, it deserves a WAT!

I agree, but it is explained easily. Built-in string literals are always zero-terminated, therefore an empty string literal must point into accessible memory. I'd like to have [] !is null as well, so that null can reliably be used as a sentinel value.
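
A small sketch of that behavior, assuming the current rule that literals carry a trailing zero (emptyLiteralIsTerminated is a name invented for this example):

```d
// Hypothetical helper: checks that "" has length 0, a non-null pointer,
// and a zero byte sitting right where the pointer points.
bool emptyLiteralIsTerminated()
{
    string s = "";
    return s.length == 0 && s !is null && s.ptr[0] == '\0';
}

void main()
{
    assert(emptyLiteralIsTerminated());

    string n = null;
    assert("" == n);   // equal contents: both have length 0
    assert("" !is n);  // but not identical: the literal has a real pointer

    int[] e = [];
    assert(e is null); // unlike "", the [] literal is null
}
```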
May 15, 2012
On 14/05/2012 19:38, Alex Rønne Petersen wrote:
> On 14-05-2012 15:21, Gor Gyolchanyan wrote:
>> I think the zero-terminated literal shtick is pointless. Literals are
>> rarely passed to C functions, so we gotta use std.utf.toUTFz anyway.
>>
>> On Mon, May 14, 2012 at 5:03 PM, Christophe
>> <travert@phare.normalesup.org> wrote:
>>
>> deadalnix, in message (digitalmars.D:167258), wrote:
>> > A good solution would be to set the pointer to 0 when the length
>> > is set to 0.
>>
>> String literals are zero-terminated. "" cannot point to 0x0,
>> unless we drop this rule. Maybe we should...
>>
>>
>>
>>
>> --
>> Bye,
>> Gor Gyolchanyan.
>
> This is very false. I invite you to read almost any module in druntime.
> You'll find that it makes heavy use of printf debugging.
>
> That being said, dropping the null-termination rule when passing strings
> to non-const(char)* parameters/variables/etc would be sane enough (I
> think).
>

This looks to me like a bad practice. C strings and D strings are different beasts, and we have toStringz.

It is kind of dumb to create a WAT in the language because the druntime devs made mistakes. It has to be fixed.
May 15, 2012
On 14/05/2012 21:53, Steven Schveighoffer wrote:
> On Mon, 14 May 2012 15:30:25 -0400, deadalnix <deadalnix@gmail.com> wrote:
>
>> On 14/05/2012 16:37, Steven Schveighoffer wrote:
>>> Note that [] is a request to the runtime to build an empty array. The
>>> runtime detects this, and rather than consuming a heap allocation to
>>> build nothing, it simply returns a null-pointed array. This is 100% the
>>> right decision, and I don't think anyone would ever convince me (or
>>> Andrei or Walter) otherwise.
>>>
>>
>> Obviously this is the right thing to do!
>>
>> The question is why an array of length 0 isn't nulled? It leads to
>> confusing semantics here, and can keep alive memory that can't be
>> accessed.
>
> int[] arr;
> arr.reserve(10000);
> assert(arr.length == 0);
>
> -Steve

The length isn't set to 0 here. You obviously don't want that to be nulled.
May 15, 2012
deadalnix, in message (digitalmars.D:167404), wrote:
> This looks to me like a bad practice. C strings and D strings are different beasts, and we have toStringz.

C strings and D strings are different, but it's not a bad idea to have string *literals* that work as both C and D strings; otherwise using printf will lead to a bug each time the programmer forgets the trailing \0.
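
As a rough illustration of the difference (cLength is a hypothetical helper that just wraps strlen): a literal can go straight to a C function, while a slice needs toStringz because nothing terminates it at its own end.

```d
import core.stdc.stdio : printf;
import core.stdc.string : strlen;
import std.string : toStringz;

// Hypothetical helper: measures a string the way C code would.
size_t cLength(const(char)* p) { return strlen(p); }

void main()
{
    // A literal converts implicitly to const(char)* and is already terminated.
    assert(cLength("hello") == 5);
    printf("literal: %s\n", "hello".ptr);

    // A slice is not terminated at its own end: " world" follows "hello".
    string full = "hello world";
    string word = full[0 .. 5];
    assert(word.length == 5);
    assert(cLength(word.ptr) == 11);      // C reads on to the literal's terminator
    assert(cLength(word.toStringz) == 5); // toStringz supplies a terminator
    printf("slice: %s\n", word.toStringz);
}
```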

> It is kind of dumb to create a WAT in the language because the druntime devs made mistakes. It has to be fixed.

You can't rely on an empty string being null, since you must be able to reserve space at the end of the array, and the string could be the result of popping every element off a full string.
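
For the popping case, a quick sketch (drain is a name made up for this example): the result is empty, but its pointer sits just past the original data, so it is not null.

```d
import std.range : empty, popFront;

// Hypothetical helper: pops every character off the front of the string.
string drain(string s)
{
    while (!s.empty)
        s.popFront();
    return s;
}

void main()
{
    string rest = drain("abc");
    assert(rest.length == 0);
    assert(rest == "");   // equal to any other empty string
    assert(rest !is null); // but still points at the end of "abc"
}
```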
May 15, 2012
On Tue, May 15, 2012 at 7:51 PM, Christophe <travert@phare.normalesup.org> wrote:

> using printf will lead to a bug each time the programmer forgets the
> trailing \0.


First of all, printf shouldn't be used! There's writef, and it's superior to
printf in every way!
Second of all, if the zero-termination of literals is removed, literals
will no longer be accepted as a pointer to a character.
The resulting type mismatch error will force the user to use toUTFz to
get the zero-terminated UTF-8 version of the original string.
If it's a literal, one could use a compile-time version of toUTFz
to avoid run-time overhead.
This all doesn't sound like a bad idea to me. I don't see any security or
performance flaws in this scheme.
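
Roughly what I have in mind, as a sketch only. toC is a hypothetical wrapper name; std.utf.toUTFz is the existing Phobos function, which takes the target pointer type as a template argument.

```d
import core.stdc.string : strlen;
import std.utf : toUTFz;

// Hypothetical wrapper: explicit conversion of a D string to a C string.
const(char)* toC(string s) { return toUTFz!(const(char)*)(s); }

void main()
{
    // Works for any D string, including slices with no terminator of their own.
    string full = "hello world";
    const(char)* p = toC(full[0 .. 5]);
    assert(strlen(p) == 5);

    // The same function can also produce zero-terminated UTF-16.
    const(wchar)* w = toUTFz!(const(wchar)*)("hi");
    assert(w[0] == 'h' && w[1] == 'i' && w[2] == 0);
}
```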

-- 
Bye,
Gor Gyolchanyan.