September 22, 2009 Re: .init property for char[] type
Posted in reply to Justin Johansson
Justin Johansson wrote:
> Jeremie Pelletier Wrote:
>> Besides, if you initialize character arrays to "", what do you initialize other arrays to, and other reference types to? It just wouldn't be consistent.
>
> Consistency. Since when is that an argument?
>
> Just to be a PITA, pick the inconsistent row in the table below (from spec_D1.00.pdf).
> The row ordering of the table has been shuffled just to make it a bit more difficult to spot :-)
>
> short.init 0
> int.init 0
> bool.init false
> byte.init 0
> double.init double.nan
> long.init 0L
>
Obviously the NaN floating points, which have annoyed me quite a few times. Every other type in D inits to zeroed memory, with the exception of void initializers.

September 22, 2009 Re: .init property for char[] type
Posted in reply to Steven Schveighoffer
Steven Schveighoffer Wrote:
> A null string *is* an empty string, but an empty string may not be a null string.
>
> The subtle difference is that the pointer points to null versus some data.
>
> A non-null empty string:
>
> - May be pointing to heap data, therefore keeping the data from being
> collected.
> - May reallocate in place on appending (a null string always must
> allocate new data on append).
>
> It's a difficult concept to get, but an array is really a hybrid type between a reference and a value type. The array is actually a value type struct with a pointer reference and a length value. If the length is zero, then the pointer value technically isn't needed, but in subtle cases, it makes a difference. When you copy the array, the length behaves like a value type (changing the length of one array doesn't affect the other), but the array data is referenced (changing an element of the array *does* affect the other).
>
> I think plans are to make the array a full reference type, and leave slices as these structs (in D2). This probably will clear up a lot of confusion people have.
>
> I hope this helps...
>
> Oh, and BTW, you can pass string literals to C functions, but *not* char[] variables. Always pass them through toStringz. It generally does not take much time/resources to add the zero.
>
> -Steve
Good write-up Steve; thanks.
Being relatively new to D, but from a strong C++ and assembler background, I did the usual interrogation for interest:
writefln( "(char[]).sizeof=%d", (char[]).sizeof);
8 bytes.
So if you wanted to intern string data to conserve memory, and reference such data with a single 32-bit pointer, sounds like you would have to do this with either a char* or perhaps a pointer to a char[], rather than a full char[] field in your class or struct.
There's less reason to want to intern string data if you still need 8 bytes to reference said data.
Justin
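
A minimal sketch of the slice behaviour discussed above, assuming a D2 compiler. The helper `isNonNullEmpty` is a hypothetical name introduced only for illustration. The size check is written in terms of `size_t.sizeof`, so it holds on both 32-bit targets (8 bytes, as measured above) and 64-bit targets (16 bytes).

```d
import std.stdio;

/// Hypothetical helper: true when the slice has zero length
/// but its pointer still refers to some memory.
bool isNonNullEmpty(const(char)[] s)
{
    return s.length == 0 && s.ptr !is null;
}

void main()
{
    // A slice is a (length, pointer) pair: two machine words.
    static assert((char[]).sizeof == 2 * size_t.sizeof);
    // A plain pointer is one machine word.
    static assert((char*).sizeof == size_t.sizeof);

    char[] nullStr = null;                 // null pointer, length 0
    char[] emptyStr = "abc".dup[0 .. 0];   // length 0, pointer into heap data

    assert(nullStr is null);
    assert(emptyStr !is null);
    assert(nullStr == emptyStr);           // == compares contents: both are empty
    assert(!isNonNullEmpty(nullStr));
    assert(isNonNullEmpty(emptyStr));

    // Copying a slice copies the length and the pointer, not the elements.
    char[] a = "hello".dup;
    char[] b = a;
    b[0] = 'j';                            // element change is visible through a
    assert(a == "jello");
    b.length = 2;                          // length change affects only b
    assert(a.length == 5 && b.length == 2);

    writefln("(char[]).sizeof=%d", (char[]).sizeof);
}
```

The comparison with `is` checks pointer and length, while `==` checks contents, which is why a null string counts as an empty string but not the other way round.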

September 22, 2009 Re: .init property for char[] type
Posted in reply to Justin Johansson
Justin Johansson wrote:
> Jeremie Pelletier Wrote:
>> Besides, if you initialize character arrays to "", what do you initialize other arrays to, and other reference types to? It just wouldn't be consistent.
>
> Consistency. Since when is that an argument?
>
> Just to be a PITA, pick the inconsistent row in the table below (from spec_D1.00.pdf).
> The row ordering of the table has been shuffled just to make it a bit more difficult to spot :-)
>
> short.init 0
> int.init 0
> bool.init false
> byte.init 0
> double.init double.nan
> long.init 0L
>
You forgot
char.init 0xFF
wchar.init 0xFFFF
dchar.init 0xFFFFFFFF
Andrei

September 22, 2009 Re: .init property for char[] type
Posted in reply to Andrei Alexandrescu
Andrei Alexandrescu wrote:
> Justin Johansson wrote:
>> Jeremie Pelletier Wrote:
>>> Besides, if you initialize character arrays to "", what do you initialize other arrays to, and other reference types to? It just wouldn't be consistent.
>>
>> Consistency. Since when is that an argument?
>>
>> Just to be a PITA, pick the inconsistent row in the table below (from spec_D1.00.pdf).
>> The row ordering of the table has been shuffled just to make it a bit more difficult to spot :-)
>>
>> short.init 0
>> int.init 0
>> bool.init false
>> byte.init 0
>> double.init double.nan
>> long.init 0L
>>
>
> You forgot
>
> char.init 0xFF
> wchar.init 0xFFFF
> dchar.init 0xFFFFFFFF
>
>
> Andrei
Actually, dchar.init is "\U0000ffff".
Jeremie

September 22, 2009 Re: .init property for char[] type
Posted in reply to Andrei Alexandrescu
Andrei Alexandrescu Wrote:
> Justin Johansson wrote:
> > Jeremie Pelletier Wrote:
> >> Besides, if you initialize character arrays to "", what do you initialize other arrays to, and other reference types to? It just wouldn't be consistent.
> >
> > Consistency. Since when is that an argument?
> >
> > Just to be a PITA, pick the inconsistent row in the table below (from spec_D1.00.pdf).
> > The row ordering of the table has been shuffled just to make it a bit more difficult to spot :-)
> >
> > short.init 0
> > int.init 0
> > bool.init false
> > byte.init 0
> > double.init double.nan
> > long.init 0L
> >
>
> You forgot
>
> char.init 0xFF
> wchar.init 0xFFFF
> dchar.init 0xFFFFFFFF
>
>
> Andrei
Shhh; don't tell anybody; I left those out of the quiz to weigh in favour of zero bit pattern init values. (This trick, i.e. omitting information, is one I learned from the Ministries of Statistics and (un)Employment.)
Seriously though, I imagine the D design choices to be influenced by the desire to propagate NaN and invalid UTF in their respective cases so as to detect uninitialized data errors. Hmm, guess one could argue the init issue for eons.
-- Justin

September 23, 2009 Re: .init property for char[] type
Posted in reply to Justin Johansson
On 2009-09-22 18:08:24 -0400, Justin Johansson <procode@adam-dott-com.au> said:

>> You forgot
>>
>> char.init 0xFF
>> wchar.init 0xFFFF
>> dchar.init 0xFFFFFFFF
>>
>> Andrei
>
> Shhh; don't tell anybody; I left those out of the quiz to weigh in favour of zero bit pattern init values.
> (This trick, i.e. omitting information, is one I learned from the Ministries of Statistics and (un)Employment.)
>
> Seriously though, I imagine the D design choices to be influenced by the desire to propagate NaN and invalid UTF in their respective cases so as to detect uninitialized data errors. Hmm, guess one could argue the init issue for eons.

Well, I see this as a problem because I've often relied on default initialization being zero in my algorithms. I was bitten once when my algorithm worked perfectly with char but not with wchar. Turns out that char.init == 0 (contrary to what Andrei wrote) and wchar.init == 0xFFFF.

--
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/
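
For generic code of the kind described above, one way to avoid depending on the default initializer is to zero the data explicitly. This is only a sketch, assuming D2; `zeroedBuffer` is a hypothetical helper name, not a library function.

```d
import std.stdio;

/// Hypothetical helper: returns n elements of type T, all explicitly set to zero.
T[] zeroedBuffer(T)(size_t n)
{
    auto buf = new T[n];   // elements start out as T.init, which is not 0 for every T
    buf[] = cast(T) 0;     // state the zero explicitly instead of relying on T.init
    return buf;
}

void main()
{
    // Default-initialized character arrays are filled with the type's .init value.
    auto raw = new wchar[4];
    assert(raw[0] == 0xFFFF);

    // The explicit version behaves the same for every element type.
    auto w = zeroedBuffer!wchar(4);
    auto i = zeroedBuffer!int(4);
    auto d = zeroedBuffer!double(4);
    assert(w[0] == 0 && i[0] == 0 && d[0] == 0.0);

    writeln("all buffers zeroed");
}
```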
September 23, 2009 Re: .init property for char[] type
Posted in reply to Michel Fortin
Michel Fortin wrote:
> On 2009-09-22 18:08:24 -0400, Justin Johansson <procode@adam-dott-com.au> said:
>
>>> You forgot
>>>
>>> char.init 0xFF
>>> wchar.init 0xFFFF
>>> dchar.init 0xFFFFFFFF
>>>
>>> Andrei
>>
>> Shhh; don't tell anybody; I left those out of the quiz to weigh in favour of zero bit pattern init values.
>> (This trick, i.e. omitting information, is one I learned from the Ministries of Statistics and (un)Employment.)
>>
>> Seriously though, I imagine the D design choices to be influenced by the desire to propagate NaN and invalid UTF in their respective cases so as to detect uninitialized data errors. Hmm, guess one could argue the init issue for eons.
>
> Well, I see this as a problem because I've often relied on default initialization being zero in my algorithms. I was bitten once when my algorithm worked perfectly with char but not with wchar. Turns out that char.init == 0 (contrary to what Andrei wrote) and wchar.init == 0xFFFF.
>
pragma(msg, char.init.stringof);
outputs '\xff' in D2; wchar and dchar have the same initializer: '\U0000FFFF'.
If you rely on the char initializer being the null character, use char c = 0; otherwise your char gets initialized to an invalid character, just like floats get initialized to NaN. Other types either use null as their invalid value or have no invalid value and use 0.
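
The values mentioned in this thread can be checked directly. The following is a small sketch, assuming D2; `initIsZeroBits` is a hypothetical helper written for this example.

```d
import std.math : isNaN;
import std.stdio;

/// Hypothetical helper: true when T.init consists only of zero bytes.
bool initIsZeroBits(T)()
{
    T value = T.init;
    auto bytes = cast(const(ubyte)*) &value;
    foreach (i; 0 .. T.sizeof)
        if (bytes[i] != 0)
            return false;
    return true;
}

void main()
{
    // Character types default to an invalid code unit, not to '\0'.
    static assert(char.init == 0xFF);
    static assert(wchar.init == 0xFFFF);
    static assert(dchar.init == 0x0000FFFF);

    // Integral types and bool default to zero.
    static assert(int.init == 0 && long.init == 0L && bool.init == false);

    // Floating point defaults to NaN; arrays default to null.
    assert(isNaN(double.init));
    assert((char[]).init is null);

    // Ask for the null character explicitly when that is what the code needs.
    char c = 0;
    assert(c == '\0');

    writeln("int:    ", initIsZeroBits!int());
    writeln("char:   ", initIsZeroBits!char());
    writeln("double: ", initIsZeroBits!double());
}
```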

September 24, 2009 Re: .init property for char[] type
Posted in reply to Justin Johansson
Justin Johansson wrote:
> Seriously though, I imagine the D design choices to be influenced by
> the desire to propagate NaN and invalid UTF in their respective cases
> so as to detect uninitialized data errors.
That's exactly what drove the design choices.
If there were a NaN value for integers, D would use that. But there isn't, so 0 is the best we can do.
Andrei and I were talking last night about the purity of software design principles and the reality, and how the reality forces compromise on the purity if you want to get anything done.
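
To illustrate the trade-off, here is a sketch (assuming D2) of two functions that each forget to initialize an accumulator. Both function names are made up for this example. With double the mistake surfaces as NaN; with int the result is a plausible-looking but wrong 0.

```d
import std.math : isNaN;
import std.stdio;

/// Averages the readings; the accumulator is deliberately left uninitialized.
double buggyAverage(const double[] readings)
{
    double total;               // forgot "= 0": starts as double.init, i.e. NaN
    foreach (r; readings)
        total += r;             // NaN plus anything stays NaN
    return total / readings.length;
}

/// Multiplies the values; the accumulator is deliberately left uninitialized.
int buggyProduct(const int[] values)
{
    int product;                // forgot "= 1": starts as int.init, i.e. 0
    foreach (v; values)
        product *= v;           // 0 times anything stays 0
    return product;
}

void main()
{
    // The floating-point bug is detectable: the result is NaN.
    assert(isNaN(buggyAverage([1.0, 2.0, 3.0])));

    // The integer bug is silent: 0 looks like a legitimate result.
    assert(buggyProduct([2, 3, 4]) == 0);

    writeln("average: ", buggyAverage([1.0, 2.0, 3.0]));
    writeln("product: ", buggyProduct([2, 3, 4]));
}
```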
Copyright © 1999-2021 by the D Language Foundation