August 03, 2011 "" gives an empty string, while "".idup gives null

void main() {
    assert(is(typeof("") == typeof("".idup))); // both is immutable(char)[]

    assert("" !is null);
    assert("".idup !is null); // fails - s is null. Why?
}

August 03, 2011 Re: "" gives an empty string, while "".idup gives null
Posted in reply to simendsjo | simendsjo:
> void main() {
> assert(is(typeof("") == typeof("".idup))); // both is immutable(char)[]
>
> assert("" !is null);
> assert("".idup !is null); // fails - s is null. Why?
> }
I think someone has even suggested to statically forbid "is null" on strings :-)
Bye,
bearophile

August 03, 2011 Re: "" gives an empty string, while "".idup gives null
Posted in reply to bearophile (post of 03.08.2011 15:49)
How should I test for null if not with "is null"? There is a difference between null and empty, and avoiding this is not necessarily easy or even wanted.
I couldn't find anything in the specification stating this difference.
So... Is it a bug?

August 03, 2011 Re: "" gives an empty string, while "".idup gives null
Posted in reply to simendsjo (post of 8/3/2011 11:23 PM)
This is apparently a bug. Somehow, the idup is clobbering the pointer. You can see it more clearly here:
void main()
{
    assert("".ptr);

    auto s = "".idup;
    assert(s.ptr); // boom!
}

August 03, 2011 Re: "" gives an empty string, while "".idup gives null
Posted in reply to Mike Parker (post of Thursday 04 August 2011 00:27:12)
I don't know if it's a bug or not. The string _was_ duped. assert(s == "") passes. So, as far as equality goes, they're equal, and they don't point to the same memory. Now, you'd think that the new string would be just empty rather than null, but whether it's a bug or not depends exactly on what dup and idup are supposed to do with regards to null. It's probably just a side effect of how dup and idup are implemented rather than it being planned one way or the other. I don't know if it matters or not though.

In general, I don't like the conflation of null and empty, but in this particular case, you _do_ get a string which is equal to the original and which doesn't point to the same memory. So, I don't know whether this should be considered a bug or not. It depends on what dup and idup are ultimately supposed to do.
- Jonathan M Davis

August 03, 2011 Re: "" gives an empty string, while "".idup gives null
Posted in reply to Jonathan M Davis (post of 03.08.2011 18:18)
I would think it's a bug, but strings don't quite behave as regular references anyway...
But why should dup/idup change the semantics of the array?

void main() {
    // A null string or empty string works as expected
    string s1;
    assert(s1 is null);
    assert(s1.ptr is null);
    assert(s1 == ""); // We can check for empty even if it's null, and it's equal to ""
    assert(s1.length == 0); // ...and length even if it's null
    s1 = "";
    assert(s1 !is null);
    assert(s1.ptr !is null);
    assert(s1.length == 0);
    assert(s1 == "");

    // the same applies to null mutable arrays
    char[] s2;
    assert(s2 is null);
    assert(s2.ptr is null);
    assert(s2 == "");
    assert(s2.length == 0);

    // but with .dup/.idup things are different!
    s2 = "".dup;
    //assert(s2 !is null); // fails
    //assert(s2.ptr !is null); // fails
    assert(s2.length == 0); // but... s2 is null..?
    assert(s2 == "");
    assert(s2 == s1);
}

August 03, 2011 Re: "" gives an empty string, while "".idup gives null
Posted in reply to simendsjo | On Wed, 03 Aug 2011 06:35:08 -0400, simendsjo <simendsjo@gmail.com> wrote:
> void main() {
> assert(is(typeof("") == typeof("".idup))); // both is immutable(char)[]
>
> assert("" !is null);
> assert("".idup !is null); // fails - s is null. Why?
> }
An empty string manifest constant (i.e. string literal) still must have a valid pointer, because it's mandated that the string have a zero byte appended to it. This is so you can pass it to C functions which expect null-terminated strings.
So essentially, there is a '\0' in memory, and "" points to that character with a length of 0.
idup, however, calls a runtime function which *purposely* asks to make a copy, and that function is *NOT* required to copy the 'zero after the string' part.
The implementation, knowing that a null array is equivalent to an empty array, is going to return null to avoid the performance penalty of allocating a block that won't be used. If you append, it will simply allocate a block as needed.
I see no reason the runtime should waste cycles or a perfectly good 16-byte buffer to give you an empty array.
Definitely functions as designed, not a bug. If you would like different behavior, you are going to have to have a really really good use case to get this changed.
-Steve
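
To make the point about the trailing zero byte concrete, here is a minimal sketch. It assumes the C runtime's strlen (via core.stdc.string) and Phobos' std.string.toStringz are available; neither is mentioned in the post itself. It shows that a literal's pointer can be handed to C directly, while a runtime copy should go through toStringz, which supplies the terminator.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

void main()
{
    // String literals are stored with a trailing '\0', so even "" has a
    // valid pointer that a C function can read from.
    assert("".ptr !is null);
    assert(strlen("".ptr) == 0);
    assert(strlen("abc".ptr) == 3);

    // A runtime copy carries no such guarantee; toStringz returns a
    // zero-terminated pointer for it (and for empty or null strings too).
    string copy = "abc".idup;
    assert(strlen(copy.toStringz) == 3);
    assert(strlen("".idup.toStringz) == 0);
}
```

Nothing in the sketch depends on whether "".idup is null or merely empty, which is rather the point: code that goes through toStringz does not need to care.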

August 03, 2011 Re: "" gives an empty string, while "".idup gives null
Posted in reply to simendsjo
If you look at the spec ( http://d-programming-language.org/arrays.html ), it says:

dup: Create a dynamic array of the same size and copy the contents of the array into it.

idup: Create a dynamic array of the same size and copy the contents of the array into it. The copy is typed as being immutable. D 2.0 only

This is _exactly_ what dup and idup are doing. You get a new array with the exact same size and contents. null doesn't factor into it at all. So, per the spec, there's no bug here at all. dup and idup promise _nothing_ with regards to null.

It may be that it would be better if dup and idup returned an array which was null if the original was null, and that would also be within the spec, but what dup and idup do at the moment _does_ follow the spec.

So, feel free to file a bug report on it. Maybe it'll get changed, but the current behavior follows the spec. And given how arrays don't generally treat empty and null as being different, I wouldn't really expect an array to stay null if you do _anything_ to it other than simply pass it around or check its value. In this case, you're creating a new array, and D just doesn't generally care about null vs empty when it comes to arrays. I wouldn't argue that that's a good thing (because I don't really think that it is), but because of that, you can't really expect much to treat null and empty as being different. And in this particular case, it's not only debatable as to whether it matters, but the current behavior is completely within the spec.
- Jonathan M Davis
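
As a small illustration of what the quoted spec wording does promise (same size, same contents, a separate array, and immutable typing for idup), here is a sketch using a non-empty array, where none of this is in dispute. The variable names are only for illustration.

```d
void main()
{
    int[] a = [1, 2, 3];

    // dup: same size and contents, but a separate block of memory.
    auto b = a.dup;
    assert(b.length == a.length);
    assert(b == a);
    assert(b.ptr !is a.ptr);

    // Writing to the copy leaves the original untouched.
    b[0] = 9;
    assert(a[0] == 1);

    // idup: same guarantees, but the copy is typed as immutable.
    auto c = a.idup;
    static assert(is(typeof(c) == immutable(int)[]));
    assert(c == a);
}
```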

August 03, 2011 Re: "" gives an empty string, while "".idup gives null
Posted in reply to Jonathan M Davis (post of 03.08.2011 19:15)
Schveighoffer also states it is as designed.
But it really doesn't behave as one (at least I) would expect.
So in essence (as bearophile says), "is null" should not be used on arrays.
I was bitten by a bug because of this, and used "" instead of "".idup to avoid it, but given that D doesn't distinguish between empty and null arrays, this doesn't feel very safe now.
In the code in question I have a lazily initialized string. The problem is that I need to see whether it has been initialized, but an empty string is also a valid value. Because I shouldn't check for null, I now have to add another field to the struct to record whether the array has been initialized. This feels like a really suboptimal solution.
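
One way to package that extra field, sketched here under the assumption that Phobos' std.typecons.Nullable is acceptable in the code in question: Nullable!string keeps its own "is set" flag, so an empty (or null) string can be stored as a legitimate value. The struct, its member names, and the counter are hypothetical and exist only to show that the value is computed once.

```d
import std.typecons : Nullable;

struct LazyName
{
    private Nullable!string cached; // isNull means "not computed yet"
    private int computeCount;       // hypothetical counter, for demonstration only

    string name()
    {
        if (cached.isNull)
        {
            ++computeCount;
            // An empty result is a valid value here, whether or not
            // the runtime hands back a null slice for it.
            cached = "".idup;
        }
        return cached.get;
    }
}

void main()
{
    LazyName n;
    assert(n.name.length == 0);
    assert(n.name.length == 0);  // second call reuses the cached value
    assert(n.computeCount == 1);
}
```

This is still a second field under the hood, but the initialized/uninitialized question no longer rests on the array's pointer.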

August 03, 2011 Re: "" gives an empty string, while "".idup gives null
Posted in reply to Steven Schveighoffer
Given that if you really wanted the duped string to be empty instead of null, it wouldn't be very hard to write a wrapper function for dup which did that, I'd be _very_ surprised if you could find a use case where dup should allocate for an empty string.
I don't generally like the fact that D tends to conflate null and empty, but you're creating a new array here. It's not at all surprising if it ends up null if it has no elements in it. In general though, you need to be fairly careful about where you rely on the difference between empty and null. If any kind of memory allocation happens for an array and its length is 0, it's pretty much fair game as to whether it's empty or null.
- Jonathan M Davis
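
A wrapper of the kind described above might look like the following sketch. The name dupNonNull is hypothetical, and allocating a one-element block just to slice it to zero length is only one possible way to get a non-null empty array.

```d
// Hypothetical helper: behaves like .dup, but never returns a null slice.
char[] dupNonNull(const(char)[] s)
{
    // Non-empty input: the ordinary copy already has a valid pointer.
    if (s.length != 0)
        return s.dup;

    // Empty input: allocate one element and slice it down to length 0,
    // so the result is empty but its .ptr is not null.
    return (new char[1])[0 .. 0];
}

void main()
{
    auto e = dupNonNull("");
    assert(e !is null);
    assert(e.ptr !is null);
    assert(e.length == 0);
    assert(e == "");

    auto c = dupNonNull("abc");
    assert(c == "abc");

    // Appending to the empty result still works as usual.
    e ~= 'x';
    assert(e == "x");
}
```

The cost is exactly the allocation that the runtime avoids by default, which is why this reads better as an opt-in helper than as a change to dup itself.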