August 01, 2005
Re: Walter - Should we use arrays as Null?
Hi Ben,

>Let me step through some choices that I was hoping you would do. Let's start by
>thinking about what an array with reference-based length would look like. It
>would either be a pointer to today's dynamic array (a ptr and a length) or it
>would be a pointer to one memory block with the length stored either at the
>front or end of the array data. How would slicing work for those two
>implementations? For the first slicing would have to allocate memory to store
>the new ptr and new length. For the second slicing would have to be a different
>type since it is impossible to store the length for the slice in the middle of
>the original source array. So that's why I suggested you think through your
>initial suggestion and work out the impact on slicing and arrays in general.

I don't think this change in the way arrays operate internally would be
necessary. What about simply using the current data pointer as it is to
implement reference semantics? A null pointer means the reference is null; and
vice-versa.

The problem I keep hearing about comes when trying to resize (specifically, enlarge)
an array by reference. So what it all comes down to, re: .length, is the
inability of realloc() to guarantee that the pointer it returns is the same one
it receives. Is this correct?
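
A minimal sketch of the resizing concern, written in present-day D syntax (an assumption; the helper name sameBlock is made up for illustration). It copies an array variable, enlarges the copy, and then checks whether the two still share memory. The result is deliberately not asserted, since the runtime may either extend the block in place or move it.

```d
import std.stdio;

// Hypothetical helper: true if both slices start at the same address.
bool sameBlock(const(int)[] a, const(int)[] b)
{
    return a.ptr is b.ptr;
}

void main()
{
    int[] a = [1, 2, 3];
    int[] b = a;              // copies the (length, ptr) pair, not the elements
    assert(sameBlock(a, b));

    b.length = 1000;          // the runtime may extend in place or move the data
    b[0] = 42;

    // Whether `a` sees the write depends on whether the block was moved.
    writeln("still shared: ", sameBlock(a, b), ", a[0] = ", a[0]);
}
```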

>But to be honest I would still prefer the current behavior where the length
>information is always available without having to check for null first - even
>if you could somehow make the rest of D remain the same as today.

I understand this concern, and it is a valid one. However, at this point D is
trying to have its cake and eat it too: it wants to have null arrays, but not
have to go through null checks. The result is a bit confusing, IMHO. Moreover, it
is buggy. Worst of all, it is not well documented.

This combination of factors leads me to think something should be done.

Frankly, from the docs I can't make out what the semantics of arrays are
supposed to be. That was why I asked the original question: should we or
shouldn't we treat arrays as null? I guess maybe not even Walter knows ;) ?

Cheers,
--AJG.
August 01, 2005
Re: Walter - Should we use arrays as Null?
Ben Hinkle wrote:
> I think you'll have a hard time getting lots of support for that. I much 
> prefer the current behavior and I bet there is lots of existing D code that 
> assumes one can test the length of an array at any time. Since an array is 
> not an object I see no problem with the "inconsistency" - an array is an 
> array. 

Indeed. I think the array semantics where you can't access a property of 
the array without the Fear of the NullPointerException is the most 
annoying thing in the world, or at least in the field of programming.

I will happily agree to this difference in semantics because the 
benefits far outweigh the slight inconsistency.

Besides, in a way there is no inconsistency. An array reference is a 
value type consisting of two 4-byte integers (in 32-bit environments). 
This is different from an object reference. The first integer is the 
length of the array and the second is a pointer to the first item of the 
array. Whenever an array reference is created a pointer to the data 
exists. The .length property is just a shortcut to access the length 
field of the array. The .sort property is a function called on the array 
reference. These always work even if the array reference points to an 
empty array. Trying to access the elements of an empty array will 
segfault in the usual way.
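
A small sketch of that layout in present-day D syntax (an assumption; safeLength is a made-up helper). It checks that an array variable is two machine words wide, that a null array still answers .length, and that a slice is just a new (length, ptr) pair pointing into the same data.

```d
// Two machine words: 8 bytes on 32-bit targets, 16 bytes on 64-bit ones.
static assert((int[]).sizeof == 2 * size_t.sizeof);

// Hypothetical helper: reads the length without any null check.
size_t safeLength(const(int)[] arr)
{
    return arr.length;
}

void main()
{
    int[] empty = null;
    assert(empty is null);
    assert(empty.ptr is null);
    assert(safeLength(empty) == 0);      // .length works on a null array

    int[] data = [10, 20, 30, 40];
    int[] slice = data[1 .. 3];          // no allocation, just a new pair
    assert(slice.length == 2);
    assert(slice.ptr == data.ptr + 1);
}
```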

Object references stored in an array have the usual semantics. IMO 
nothing forces a language to treat arrays as templated instances of a 
class Array with regular object semantics. D's way is just better.

-- 
Niko Korhonen
SW Developer
August 01, 2005
Re: Walter - Should we use arrays as Null?
On Mon, 01 Aug 2005 09:56:57 +0300, Niko Korhonen wrote:

> Ben Hinkle wrote:
>> I think you'll have a hard time getting lots of support for that. I much 
>> prefer the current behavior and I bet there is lots of existing D code that 
>> assumes one can test the length of an array at any time. Since an array is 
>> not an object I see no problem with the "inconsistency" - an array is an 
>> array. 
> 
> Indeed. I think the array semantics where you can't access a property of 
> the array without the Fear of the NullPointerException is the most 
> annoying thing in the world, or at least in the field of programming.
> 
> I will happily agree to this difference in semantics because the 
> benefits far outweigh the slight inconsistency.
> 
> Besides, in a way there is no inconsistency. An array reference is a 
> value type consisting of two 4-byte integers (in 32-bit environments). 
> This is different from an object reference.

Agreed. The way I look at it is that a D array variable *contains* a
reference to the array elements but is, in itself, not the reference.

When it comes to implementation, dynamic-length arrays always have an
8-byte structure allocated to themselves, and may have more RAM allocated
if there are any elements in the array. The address of the array variable
is not the address of the first element; the length property is fetched at
runtime from the array variable. 

However, fixed-length arrays always have a minimum of 8 bytes allocated
regardless of the number of elements declared, and the address of the array
variable is also the address of its first element; the length property is
'hard-coded' by the compiler in any expressions that use it. 
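
The address relationship can be checked with a short sketch in present-day D syntax (an assumption; storedInline is a hypothetical helper, and the sketch says nothing about minimum allocation sizes).

```d
// Hypothetical helper: true if the variable's own address is also the
// address of its first element.
bool storedInline(T)(ref T arr)
{
    return cast(const(void)*) &arr is cast(const(void)*) arr.ptr;
}

void main()
{
    int[4] fixed;
    int[] dynamic = new int[4];

    static assert(fixed.length == 4);   // length is known at compile time
    assert(storedInline(fixed));        // elements live in the variable itself
    assert(!storedInline(dynamic));     // variable holds (length, ptr) only
}
```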

-- 
Derek
Melbourne, Australia
1/08/2005 5:01:43 PM
August 01, 2005
Re: Walter - Should we use arrays as Null?
"Derek Parnell" <derek@psych.ward> wrote in message 
news:a118xxgyuee7.t1828b9vk5du$.dlg@40tude.net...
> On Sat, 30 Jul 2005 22:12:31 +0000 (UTC), AJG wrote:
>
>
> [snip]
>> What is bizarre is the current array semantics, be it due to "close to 
>> the
>> metal" requirements, or whatever. If you don't think arrays at the moment 
>> follow
>> at least _partial_ reference semantics, then why does:
>>
>> # char[] A = "123"; // Yes, it's static, bear with me.
>> # char[] B = A;
>> # B.reverse;
>>
>> Reverse _also_ the contents of A?
>
> There might have been an argument that .reverse and .sort should follow
> Walter's Copy-on-Write rules of engagement, but the current behavior is
> documented and relied upon in current code.

Besides those reasons writing "B.reverse" to me indicates you want to affect 
B hence no COW while "reverse(B)" says you want a reversed B hence COW. 
That's one reason why I don't really like the current syntax hack of being 
able to write B.tolower() to mean tolower(B).
August 01, 2005
Re: Walter - Should we use arrays as Null?
In article <dclba9$2pif$1@digitaldaemon.com>, Ben Hinkle says...
>
>
>"Derek Parnell" <derek@psych.ward> wrote in message 
>news:a118xxgyuee7.t1828b9vk5du$.dlg@40tude.net...
>> On Sat, 30 Jul 2005 22:12:31 +0000 (UTC), AJG wrote:
>>
>>
>> [snip]
>>> What is bizarre is the current array semantics, be it due to "close to 
>>> the
>>> metal" requirements, or whatever. If you don't think arrays at the moment 
>>> follow
>>> at least _partial_ reference semantics, then why does:
>>>
>>> # char[] A = "123"; // Yes, it's static, bear with me.
>>> # char[] B = A;
>>> # B.reverse;
>>>
>>> Reverse _also_ the contents of A?
>>
>> There might have been an argument that .reverse and .sort should follow
>> Walter's Copy-on-Write rules of engagement, but the current behavior is
>> documented and relied upon in current code.
>
>Besides those reasons writing "B.reverse" to me indicates you want to affect 
>B hence no COW while "reverse(B)" says you want a reversed B hence COW. 
>That's one reason why I don't really like the current syntax hack of being 
>able to write B.tolower() to mean tolower(B). 

Utterly confusing! reverse(b) and B.reverse have nothing in their names to imply
that either one copies the data. By default COW should not happen. Believe me,
look at .NET, where everything is COW: new memory allocations all over the
place. IMHO .dup is there for a reason, and nothing is preventing you from
doing:

foo.dup.reverse

If somebody else comes along, they will know you are copying the array. It's
only 4 more characters of typing. Plus there's no confusion as to what does COW
and what doesn't. I can copy the thing first with .dup if I want. This isn't C,
where it's 5 lines of code every time you need to copy an array!
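
A minimal sketch of the difference, in present-day D syntax (an assumption). reverseInPlace is a hypothetical stand-in for the built-in .reverse property discussed here, written out by hand so the example does not depend on any particular library version.

```d
// Hypothetical stand-in for .reverse: reverses in place, never copies.
void reverseInPlace(char[] s)
{
    if (s.length < 2) return;
    for (size_t i = 0, j = s.length - 1; i < j; ++i, --j)
    {
        immutable tmp = s[i];
        s[i] = s[j];
        s[j] = tmp;
    }
}

void main()
{
    char[] a = "123".dup;    // mutable copy of the literal
    char[] b = a;            // b aliases a's data
    reverseInPlace(b);
    assert(a == "321");      // a changed too

    char[] c = a.dup;        // explicit copy first
    reverseInPlace(c);
    assert(c == "123");
    assert(a == "321");      // a is untouched this time
}
```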

-Sha
August 01, 2005
Re: Walter - Should we use arrays as Null?
"Shammah Chancellor" <Shammah_member@pathlink.com> wrote in message 
news:dcleqr$2ti5$1@digitaldaemon.com...
> In article <dclba9$2pif$1@digitaldaemon.com>, Ben Hinkle says...
>>
>>
>>"Derek Parnell" <derek@psych.ward> wrote in message
>>news:a118xxgyuee7.t1828b9vk5du$.dlg@40tude.net...
>>> On Sat, 30 Jul 2005 22:12:31 +0000 (UTC), AJG wrote:
>>>
>>>
>>> [snip]
>>>> What is bizarre is the current array semantics, be it due to "close to
>>>> the
>>>> metal" requirements, or whatever. If you don't think arrays at the 
>>>> moment
>>>> follow
>>>> at least _partial_ reference semantics, then why does:
>>>>
>>>> # char[] A = "123"; // Yes, it's static, bear with me.
>>>> # char[] B = A;
>>>> # B.reverse;
>>>>
>>>> Reverse _also_ the contents of A?
>>>
>>> There might have been an argument that .reverse and .sort should 
>>> follow
>>> Walter's Copy-on-Write rules of engagement, but the current behavior is
>>> documented and relied upon in current code.
>>
>>Besides those reasons writing "B.reverse" to me indicates you want to 
>>affect
>>B hence no COW while "reverse(B)" says you want a reversed B hence COW.
>>That's one reason why I don't really like the current syntax hack of being
>>able to write B.tolower() to mean tolower(B).
>
> Utterly confusing!  reverse(b) and B.reverse have nothing in their name to 
> imply
> that either one copies the data.  By default COW should not happen. 
> Believe me,
> look at .NET where everything is COW.  New memory allocations all over the
> place.  IMHO .dup is there for a reason, and nothing is preventing you 
> from
> doing:
>
> foo.dup.reverse
>
> If somebody else comes along, they will know you are copying the array. 
> It's
> only 4 more characters of typing.  Plus no confusion as to what does cow 
> and
> what doesn't.  I can copy the thing first with .dup if I want.  This isn't 
> C
> where it's 5 lines of code every time you need to copy an array!
>
> -Sha

You've lost me. Are you proposing a change to any existing behavior or 
coding practice (ie COW)?
August 01, 2005
Re: Walter - Should we use arrays as Null?
In article <dclfvs$2usj$1@digitaldaemon.com>, Ben Hinkle says...
>
>
>"Shammah Chancellor" <Shammah_member@pathlink.com> wrote in message 
>news:dcleqr$2ti5$1@digitaldaemon.com...
>> In article <dclba9$2pif$1@digitaldaemon.com>, Ben Hinkle says...
>>>
>>>
>>>"Derek Parnell" <derek@psych.ward> wrote in message
>>>news:a118xxgyuee7.t1828b9vk5du$.dlg@40tude.net...
>>> [snip]
>>>Besides those reasons writing "B.reverse" to me indicates you want to 
>>>affect
>>>B hence no COW while "reverse(B)" says you want a reversed B hence COW.
>>>That's one reason why I don't really like the current syntax hack of being
>>>able to write B.tolower() to mean tolower(B).
>>
>> Utterly confusing!  reverse(b) and B.reverse have nothing in their name to 
>> imply
>> that either one copies the data.  By default COW should not happen. 
>> Believe me,
>> look at .NET where everything is COW.  New memory allocations all over the
>> place.  IMHO .dup is there for a reason, and nothing is preventing you 
>> from
>> doing:
>>
>> foo.dup.reverse
>>
>> If somebody else comes along, they will know you are copying the array. 
>> It's
>> only 4 more characters of typing.  Plus no confusion as to what does cow 
>> and
>> what doesn't.  I can copy the thing first with .dup if I want.  This isn't 
>> C
>> where it's 5 lines of code every time you need to copy an array!
>>
>> -Sha
>
>You've lost me. Are you proposing a change to any existing behavior or 
>coding practice (ie COW)? 

I wasn't proposing a change at all. I was disagreeing with Derek. I think COW
is a bad thing for API functions to be doing mysteriously. It leads to crap
like this:

foo = foo.Replace("Hello","");
dateFoo = dateFoo.AddDays(1);

If I want a duplicate something, in D, it's as easy as saying:
# foo2 = foo.dup.replace("Hello","");
(Not that replace is a valid property for char[]s, but you get my gist)

This leads to effective memory use, and no confusion about:

reverse(b), or b.reverse 

Which one does c-o-w? The name certainly doesn't say; maybe by somebody's
reasoning it might make sense that one does COW and one doesn't. But certainly
not by mine, from the information given.

Also, you might say: for consistency, always use COW. But COW is not always what
you want. Since there's no way to manually un-cowify it, it would make logical
sense to NEVER do COW, and let the programmer call dup first.

-Sha
August 01, 2005
Re: Walter - Should we use arrays as Null?
Hi,

>If I want a duplicate something, in D, it's as easy as saying:
># foo2 = foo.dup.replace("Hello","");
>(Not that replace is a valid property for char[]s, but you get my gist)

Exactly.

>This leads to effective memory use, and no confusion about:
>reverse(b), or b.reverse 
>
>Which one does c-o-w?  The name certainly doesn't say, maybe by somebody's
>reasoning it might make sense that one does cow and one doesn't.  But certainly
>not mine, from the information given.

IMHO, and for consistency, it should never do COW. If a user wants to do COW,
let the user do it. That's exactly what I mean by reference semantics, so it
seems we are in agreement here.

>Also, you might say for consistency, always use cow.  But cow is not always what
>you want. Since there's no way to manually un-cowify it, it would make logical
>sense to NEVER do cow, and let the programmer call dup first.

Interestingly enough (and one of my points), .length does COW about half of the
time, and there's no way to un-cowify it.

That's a great word, btw, un-cowify. It had me chuckling.

Cheers,
--AJG.
August 01, 2005
Re: Walter - Should we use arrays as Null?
"Shammah Chancellor" <Shammah_member@pathlink.com> wrote in message 
news:dclk4p$1o0$1@digitaldaemon.com...
> In article <dclfvs$2usj$1@digitaldaemon.com>, Ben Hinkle says...
>>
>>
>>"Shammah Chancellor" <Shammah_member@pathlink.com> wrote in message
>>news:dcleqr$2ti5$1@digitaldaemon.com...
>>> In article <dclba9$2pif$1@digitaldaemon.com>, Ben Hinkle says...
>>>>
>>>>
>>>>"Derek Parnell" <derek@psych.ward> wrote in message
>>>>news:a118xxgyuee7.t1828b9vk5du$.dlg@40tude.net...
>>>> [snip]
>>>>Besides those reasons writing "B.reverse" to me indicates you want to
>>>>affect
>>>>B hence no COW while "reverse(B)" says you want a reversed B hence COW.
>>>>That's one reason why I don't really like the current syntax hack of 
>>>>being
>>>>able to write B.tolower() to mean tolower(B).
>>>
>>> Utterly confusing!  reverse(b) and B.reverse have nothing in their name 
>>> to
>>> imply
>>> that either one copies the data.  By default COW should not happen.
>>> Believe me,
>>> look at .NET where everything is COW.  New memory allocations all over 
>>> the
>>> place.  IMHO .dup is there for a reason, and nothing is preventing you
>>> from
>>> doing:
>>>
>>> foo.dup.reverse
>>>
>>> If somebody else comes along, they will know you are copying the array.
>>> It's
>>> only 4 more characters of typing.  Plus no confusion as to what does cow
>>> and
>>> what doesn't.  I can copy the thing first with .dup if I want.  This 
>>> isn't
>>> C
>>> where it's 5 lines of code every time you need to copy an array!
>>>
>>> -Sha
>>
>>You've lost me. Are you proposing a change to any existing behavior or
>>coding practice (ie COW)?
>
> I wasn't proposing a change at all.  I was disagreeing with Derek.  I think 
> COW
> is a bad thing for API functions to be doing mysteriously.  It leads to 
> crap
> like this:
>
> foo = foo.Replace("Hello","");
> dateFoo = dateFoo.AddDays(1);

I didn't read Derek's post as proposing reverse use COW. He was pointing out 
that it doesn't. It's too bad you see COW as mysterious.

> If I want a duplicate something, in D, it's as easy as saying:
> # foo2 = foo.dup.replace("Hello","");
> (Not that replace is a valid property for char[]s, but you get my gist)
>
> This leads to effective memory use, and no confusion about:
>
> reverse(b), or b.reverse
>
> Which one does c-o-w?  The name certainly doesn't say, maybe by somebody's
> reasoning it might make sense that one does cow and one doesn't.  But 
> certainly
> not mine, from the information given.

The statement about effective memory use is true only when the operation is 
guaranteed to change the string. If foo in the example didn't contain any 
Hellos then the dup would be wasteful. Plus I'm surprised you don't see any 
difference between reverse(b) and b.reverse, since it's common in OOP to 
interpret b.foo as acting on b while foo(b) is just some function of b.

> Also, you might say for consistency, always use cow.  But cow is not 
> always what
> you want. Since there's no way to manually un-cowify it, it would make 
> logical
> sense to NEVER do cow, and let the programmer call dup first.

That would be a big change in D style since many times you do not know if a 
dup will be needed or not (eg most of the functions in std.string might just 
return the original string).
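
A rough sketch of that copy-on-write style, in present-day D syntax (an assumption). The function lowered is hypothetical and only illustrates the pattern; it is not the actual std.string implementation. It returns the caller's own data when no change is needed and duplicates only before the first write.

```d
// Hypothetical COW helper: ASCII-lowercases a string, copying only if
// at least one character actually has to change.
char[] lowered(char[] s)
{
    char[] result = s;
    bool copied = false;
    foreach (i, c; s)
    {
        if (c >= 'A' && c <= 'Z')
        {
            if (!copied)
            {
                result = s.dup;     // first write: copy now
                copied = true;
            }
            result[i] = cast(char)(c + ('a' - 'A'));
        }
    }
    return result;
}

void main()
{
    char[] plain = "hello".dup;
    assert(lowered(plain).ptr is plain.ptr);   // nothing to change, no copy

    char[] mixed = "Hello".dup;
    char[] lower = lowered(mixed);
    assert(lower == "hello");
    assert(lower.ptr !is mixed.ptr);           // copied before writing
    assert(mixed == "Hello");                  // caller's data untouched
}
```

With an unconditional foo.dup in front, the first case would allocate even though the result is identical to the input.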
August 01, 2005
Re: Walter - Should we use arrays as Null?
On Mon, 1 Aug 2005 16:54:49 +0000 (UTC), Shammah Chancellor wrote:


>>>>"Derek Parnell" <derek@psych.ward> wrote in message
>>>>news:a118xxgyuee7.t1828b9vk5du$.dlg@40tude.net...
>>>> [snip]
>>>>Besides those reasons writing "B.reverse" to me indicates you want to 
>>>>affect
>>>>B hence no COW while "reverse(B)" says you want a reversed B hence COW.
>>>>That's one reason why I don't really like the current syntax hack of being
>>>>able to write B.tolower() to mean tolower(B).

> I was disagreeing with Derek.  I think COW
> is a bad thing for API functions to be doing mysteriously.  It leads to crap
> like this:
> 
> foo = foo.Replace("Hello","");
> dateFoo = dateFoo.AddDays(1);

Hi Shammah,
I wasn't actually saying that .reverse must use CoW. I was saying that it
doesn't, and that fact seems to go counter to Walter's general principle (as I
understand it) about when to use CoW. I thought that one should use
CoW if the code is actually changing the data *and* the data might be
accessible to the calling routine. Thus, as .reverse will change the
data for lengths > 1, and the data is probably accessible to the code using
.reverse, one could have expected it to CoW.

Of course, I might be misunderstanding that 'general principle' ;-)

As the current behaviour is documented, we can cope with this seeming
exception.

-- 
Derek Parnell
Melbourne, Australia
2/08/2005 7:21:43 AM