char[] initialization

Could somebody shed some light on this subject?

According to http://digitalmars.com/d/type.html

characters in D are initialized to the following values:

char -> 0xFF
wchar -> 0xFFFF
dchar -> 0x0000FFFF

What is the idea behind initializing strings with a valid character code instead of 0?

And that 0xFFFF: why was this special character (see the Basic
Multilingual Plane) selected?

To avoid the use of strcat & co. on D strings?

(Sorry if this was discussed before.)

Andrew Fedoniouk.
http://terrainformatica.com

July 29, 2006

Re: char[] initialization

Posted by kris
in reply to Andrew Fedoniouk

Andrew Fedoniouk wrote:
> Could somebody shed light on the subject:
> 
> According to http://digitalmars.com/d/type.html
> 
> characters in D are getting initialized by following values
> 
> char -> 0xFF
> wchar -> 0xFFFF
> dchar -> 0x0000FFFF
> 
> what is the idea to have string initialized by valid character code instead of 0?

Try google?

http://www.digitalmars.com/d/archives/digitalmars/D/3239.html

July 29, 2006

Re: char[] initialization

Posted by Hasan Aljudy
in reply to kris

kris wrote:
> Try google?
> 
> http://www.digitalmars.com/d/archives/digitalmars/D/3239.html

I don't understand why the compiler should initialize variables to illegal values!!

OK, is it because you have to initialize variables explicitly?
Just WHY?

As far as I know, the notion that uninitialized variables are bad is a side effect of C (and C++), where uninitialized variables hold garbage.

However, in D (and Java, and others), variables are always initialized.
So, if the compiler can initialize variables to good defaults, why should it still be considered a bad habit not to initialize them explicitly? That just makes no sense to me.

July 29, 2006

Re: char[] initialization

Posted by Derek
in reply to Hasan Aljudy

On Sat, 29 Jul 2006 06:29:21 -0600, Hasan Aljudy wrote:

> I don't understand why the compiler should initialize variables to illegal values!!
> 
> OK, is it because you have to initialize variables explicitly? Just WHY?
> 
> As far as I know, the notion that non-initialized variables are bad is a side-effect of the C (and C++) language, because non-inited variables are garbage.
> 
> However, in D (and Java .. and others), vars are always initialized. So, if the compiler can init variables to good defaults, why should it still be considered a bad habit not to init variables explicitly? That just makes no sense to me.

I believe that D's philosophy is that all datatypes are initialized to 'invalid' values if they possibly can be. The ones that can't are integers, bytes, and bools. References, floating point values, and characters are initialized to 'wrong' values.

-- 
Derek Parnell
Melbourne, Australia
"Down with mediocrity!"

July 29, 2006

Re: char[] initialization

Posted by Hasan Aljudy
in reply to Derek

Derek wrote:
> I believe that D's philosophy is that all datatypes are initialized to
> 'invalid' values if they possibly can be. The ones that can't are integers,
> bytes, and bools. References, floating point values, and characters are
> initialized to 'wrong' values.
> 

I know .. I was asking "but why?" :(

July 29, 2006

Re: char[] initialization

Posted by Robert Atkinson
in reply to Hasan Aljudy

Hasan Aljudy wrote:
> I know .. I was asking "but why?" :(

The intent, I believe, is to signal to the programmer as early as possible that they have missed something.  In C/C++ an uninitialised variable can easily survive thousands of debug runs until it 'initialises' to a completely wrong value, most often in a release build on an end user's system.

Take floats.  By starting at NaN, you'll know from the very start that you missed initialising it.  You'll catch the error earlier in your debug process.
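
A rough sketch of the NaN argument above, written in Python rather than D purely for convenience. The readings, the two default constants, and the total() helper are made up for illustration; the only thing borrowed from the thread is the idea that a NaN default makes a forgotten initialisation show up in the result, while a zero default blends in.

```python
import math

# Hypothetical defaults for a value the programmer forgot to set.
ZERO_DEFAULT = 0.0          # a "reasonable" default: looks like real data
NAN_DEFAULT = float("nan")  # a sentinel default: poisons every later result


def total(readings):
    """Sum a list of float readings."""
    result = 0.0
    for r in readings:
        result += r
    return result


# The third reading stands in for the forgotten variable.
with_zero = total([1.5, 2.5, ZERO_DEFAULT])
with_nan = total([1.5, 2.5, NAN_DEFAULT])

print(with_zero)             # a plausible number; the omission is hidden
print(math.isnan(with_nan))  # the omission is visible in the final result
```

With the zero default nothing looks wrong until someone checks the arithmetic by hand; with the NaN default the first printed total already gives the mistake away.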

July 29, 2006

Re: char[] initialization

Posted by Hasan Aljudy
in reply to Robert Atkinson

Robert Atkinson wrote:
> The intent I believe is to signal the programmer as soon as possible showing they have missed something.  In C/C++ an un-initialised variable can easily survive thousands of debug runs until it 'initialises' to a completely wrong value.  Most often on a release build and a end-users system.
> 
> Take floats.  By starting at NaN, from the very start you'll know you missed initialising it.  You'll catch the error earlier in your debug process.

You're still missing my point.
In C/C++ that's a problem because uninitialized variables carry garbage.
In D it's not; if you initialize them to a reasonable, valid default, the problem no longer exists.

If leaving a variable uninitialized is bad for its own sake, then the compiler should detect it and issue an error or warning; otherwise it should default to a reasonable, valid value, which in this case is zero for chars and floats.

July 29, 2006

To Walter, about char[] initialization by FF

Posted by Andrew Fedoniouk
in reply to kris

"kris" <foo@bar.com> wrote in message news:eaf9ei$2m7$1@digitaldaemon.com...
> Try google?
>
> http://www.digitalmars.com/d/archives/digitalmars/D/3239.html

Thanks, Kris.

To Walter:

The following assumption (http://www.digitalmars.com/d/archives/digitalmars/D/3239.html):

"codepoint U+FFFF is not a legitimate Unicode character, and, furthermore, it is guaranteed by the Unicode Consortium that 0xFFFF will NEVER be a legitimate Unicode character. This codepoint will remain forever unassigned, precisely so that it may be used for purposes such as this."

is just wrong.

1) 0xFFFF is a valid Unicode character: it is one of the "Specials" from the R-zone, {U+FFF0..U+FFFF}, a region that is already assigned.

2) For char[], the choice of 0xFF is wrong, and even worse.
For example, the character with code 0xFF in the Latin-1 encoding is "y diaeresis". In many European and Far East encodings 0xFF is a valid code point.
For example, in the KOI-8 encoding 0xFF is an officially assigned value.

What is the point of the current initialization?

If you are doing initialization already, and this initialization is part of the specification, why not use the official "Nul" values in this case?

You are doing the same for floats: you are using NaNs there (the null value for floats). Why not do the same for chars?

I think I understand your intention: 0xFF is something like the debug values in Visual C++:

0xCDCDCDCD
  - Allocated in heap, but not initialized.
0xDDDDDDDD
  - Released heap memory.
0xFDFDFDFD
  - "NoMansLand" fences automatically placed at the boundary of heap memory. Should never be overwritten. If you do overwrite one, you're probably walking off the end of an array.
0xCCCCCCCC
  - Allocated on stack, but not initialized.

but this is far from the concept of a null code point in character encodings.

Andrew Fedoniouk.
http://terrainformatica.com
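
For readers who want to check the encoding facts raised in this post, here is a small sketch in Python (not D), using only the standard library. It assumes "KOI-8" can be represented by Python's koi8_r codec, which is one member of that family. It looks up what byte 0xFF means in two legacy encodings, and how the Unicode character database bundled with Python classifies U+FFFF. The Unicode Standard lists U+FFFF as a noncharacter, so the database is expected to report it as not assigned even though it sits inside the Specials block.

```python
import unicodedata

byte_ff = b"\xff"

# In ISO 8859-1 (Latin-1), byte 0xFF maps to code point U+00FF.
latin1_char = byte_ff.decode("latin-1")
print(unicodedata.name(latin1_char))

# KOI8-R (assumed here as the representative of "KOI-8") also assigns
# a letter to byte 0xFF.
koi8_char = byte_ff.decode("koi8_r")
print(unicodedata.name(koi8_char))

# U+FFFF: general category and name according to the Unicode database.
u_ffff = "\uffff"
print(unicodedata.category(u_ffff))           # "Cn" means "not assigned"
print(unicodedata.name(u_ffff, "<no name>"))  # falls back to the default
```

Both legacy encodings do give 0xFF a meaning, which supports point 2 for text stored in those encodings; whether that matters for D depends on which encoding char[] is defined to hold.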

July 29, 2006

Re: char[] initialization

Posted by Carlos Santander
in reply to Hasan Aljudy

Hasan Aljudy wrote:
> Still missing my point.
> in C/C++ that's a problem because un-initialized variables carry garbage.
> in D, it's not; if you init them to a reasonable valid default, this problem won't exist anymore.
> 
> If un-initializing is bad just for its own sake .. then the compiler should detect it and issue an error/warning, otherwise it should default to a reasonable valid value; in this case, zero for chars and floats.

The issue here is that a "reasonable valid default" will change from one app to another, one function to the next, one variable to another, so the intention is to force the developer to be explicit about his/her intentions.

Walter has said in the past that if there were a NaN for int/long/etc., he'd use that instead of 0.

-- 
Carlos Santander Bernal
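
A minimal illustration of why no single default suits every variable, sketched in Python rather than D. The fold() helper and the sample data are hypothetical; the point is only that the correct starting value depends on what the variable is used for.

```python
import math


def fold(values, start, combine):
    """Combine values left to right, beginning from the accumulator `start`."""
    acc = start
    for v in values:
        acc = combine(acc, v)
    return acc


data = [2.0, 3.0, 4.0]

total = fold(data, 0.0, lambda a, b: a + b)    # a sum starts from 0
product = fold(data, 1.0, lambda a, b: a * b)  # a product starts from 1
smallest = fold(data, math.inf, min)           # a minimum starts from +infinity

# Starting every accumulator from 0 runs without complaint, but the
# product and the minimum are silently wrong.
bad_product = fold(data, 0.0, lambda a, b: a * b)
bad_smallest = fold(data, 0.0, min)

print(total, product, smallest)
print(bad_product, bad_smallest)
```

Zero happens to be right for the sum and wrong for the other two, and none of the three cases raises an error.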

July 29, 2006

Re: To Walter, about char[] initialization by FF

Posted by Carlos Santander
in reply to Andrew Fedoniouk

Andrew Fedoniouk wrote:
> 2) For char[] selection of 0xFF is wrong and even worse.
> For example character with code 0xFF in Latin-I encoding is
> "y diaeresis". In many European languages and Far East encodings 0xFF is a valid code point.
> For example in KOI-8 encoding 0xFF is officially assigned value.
> 

But D's chars are UTF-8, not Latin-1 or any other encoding, so I don't think this applies.

-- 
Carlos Santander Bernal
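
The UTF-8 point can be checked by brute force. The sketch below is in Python, not D, and the helper name utf8_uses_byte is made up for this example. It encodes every Unicode scalar value as UTF-8 and looks for a given byte value, then shows how Python's decoder treats a lone 0xFF byte.

```python
def utf8_uses_byte(value):
    """Return True if some Unicode scalar value encodes to UTF-8 containing `value`."""
    for cp in range(0x110000):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogates are not scalar values and cannot be encoded
        if value in chr(cp).encode("utf-8"):
            return True
    return False


uses_ff = utf8_uses_byte(0xFF)
uses_00 = utf8_uses_byte(0x00)
print(uses_ff)  # no scalar value produces the byte 0xFF
print(uses_00)  # U+0000 encodes as the single byte 0x00

try:
    b"\xff".decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError as err:
    decoded_ok = False
    print("rejected:", err.reason)
```

So in UTF-8, unlike Latin-1 or KOI8-R, 0xFF can never be part of well-formed text, while 0x00 is an ordinary encoded character.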
