July 30, 2005
Re: Walter - Should we use arrays as Null?
So wait, you basically want an array to be a pointer to data containing 
a length and a pointer? I have been following this thread somewhat, but I 
can hardly find the benefit here. It seems to me you want to take 
something very straightforward and close to the metal and turn it into a 
referenced object, for some bizarre reason regarding reference 
semantics. Why don't you just put your arrays in objects if you are 
having problems?
July 30, 2005
Re: Walter - Should we use arrays as Null?
Hi,

>So wait, you basically want an array to be a pointer to data containing 
>a length and a pointer? I have been following this thread somewhat, but I 
>can hardly find the benefit here.

No. I would like it to be that way, but I know there wouldn't be support for
this. What I'd like is for all array properties to follow reference semantics.

>It seems to me you want to take 
>something very straightforward and close to the metal and turn it into a 
>referenced object, for some bizarre reason regarding reference 
>semantics. 

What is bizarre is the current array semantics, be it due to "close to the
metal" requirements, or whatever. If you don't think arrays at the moment follow
at least _partial_ reference semantics, then why does:

# char[] A = "123"; // Yes, it's static, bear with me.
# char[] B = A;
# B.reverse;

Reverse _also_ the contents of A? Those are reference semantics. According to
Derek, the array reference itself is implemented on the stack in 8-byte chunks.
That's fine. I'm not talking about making the array itself a pointer.

Now, my point is that .length breaks reference semantics in special cases,
because:

# char[] A = "123";
# char[] B = A;
# B.length = 4;

A.length did not change. If it were consistent with .reverse and .sort, then A's
length too would have changed.
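
For readers following along with a current compiler, the two cases can be
sketched roughly as below. This assumes present-day D, where .reverse is no
longer an array property (std.algorithm's reverse stands in for it), and it uses
int[] rather than char[]; the function names are made up for illustration.

```d
import std.algorithm : reverse;

/// Reverses through an alias and returns what the original name sees.
int[] reverseViaAlias()
{
    int[] a = [1, 2, 3];
    int[] b = a;    // copies the (pointer, length) pair, not the elements
    reverse(b);     // reorders the shared elements in place
    return a;
}

/// Resizes through an alias and returns the original's length.
size_t resizeViaAlias()
{
    int[] a = [1, 2, 3];
    int[] b = a;
    b.length = 4;   // updates b's own length; the data may or may not be reallocated
    return a.length;
}

void main()
{
    assert(reverseViaAlias() == [3, 2, 1]);  // the reversal is visible through a
    assert(resizeViaAlias() == 3);           // the resize is not
}
```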

Cheers,
--AJG.







July 30, 2005
Re: Walter - Should we use arrays as Null?
On Sat, 30 Jul 2005 17:07:06 +0000 (UTC), AJG wrote:


[snip]

> 
> SomeObject A = new SomeObject;
> SomeObject B = A;
> B.SomeProperty; // Operates on A.
> 
> SomeStruct A;
> SomeStruct B = A;
> B.SomeProperty; // Operates on B.
> 
> int[] A = new int[5];
> int[] B = A;
> B.SomeProperty; // Operates on A; 
> // _Except_ if it's .length.
> 
> This behaviour seems much more in line with Objects than with Structs, to me.
> That's why I don't see how .length should break the current semantics.

You are wrong here because 'B.someProperty' operates on B not A. 
A simple proof is this ...

int[] A = new int[5];
int[] B = A;
A.length = 4;
writefln("%d", B.length);  // displays 5.

In your example, it *appears* to operate on A (the 8-byte array structure)
because B and A have the same values. That is A.ptr == B.ptr and A.length
== B.length. 

We just have to admit that arrays in D are not the classical array
definition and are really a different type of thing altogether. Then get to
learn the rules of D 'arrays'. If you want arrays to behave like objects,
then maybe you can write an array class.
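
A minimal sketch of what such an array class could look like, assuming
present-day D. IntArray and its members are hypothetical names, not an existing
library type.

```d
/// Hypothetical wrapper that gives an int[] full reference semantics.
final class IntArray
{
    private int[] data;

    this(size_t n) { data = new int[n]; }

    @property size_t length() const { return data.length; }
    @property void length(size_t n) { data.length = n; }

    ref int opIndex(size_t i) { return data[i]; }
}

void main()
{
    auto a = new IntArray(5);
    auto b = a;             // copies the class reference; both names denote one object

    b.length = 4;
    assert(a.length == 4);  // the resize is visible through a

    b[0] = 42;
    assert(a[0] == 42);     // and so are element writes
}
```

The price is one more allocation and one more level of indirection on every
access.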

-- 
Derek Parnell
Melbourne, Australia
31/07/2005 8:26:46 AM
July 30, 2005
Re: Walter - Should we use arrays as Null?
On Sat, 30 Jul 2005 22:12:31 +0000 (UTC), AJG wrote:


[snip]
> What is bizarre is the current array semantics, be it due to "close to the
> metal" requirements, or whatever. If you don't think arrays at the moment follow
> at least _partial_ reference semantics, then why does:
> 
> # char[] A = "123"; // Yes, it's static, bear with me.
> # char[] B = A;
> # B.reverse;
> 
> Reverse _also_ the contents of A?

There might have been an argument that .reverse and .sort should follow
Walter's Copy-on-Write rules of engagement, but the current behavior is
documented and relied upon in current code.

-- 
Derek Parnell
Melbourne, Australia
31/07/2005 8:53:41 AM
July 30, 2005
Re: Walter - Should we use arrays as Null?
Hi Derek,

>> int[] A = new int[5];
>> int[] B = A;
>> B.SomeProperty; // Operates on A; 
>> // _Except_ if it's .length.
>> 
>> This behaviour seems much more in line with Objects than with Structs, to me.
>> That's why I don't see how .length should break the current semantics.
>
>You are wrong here because 'B.someProperty' operates on B not A. 
>A simple proof is this ...
>
> int[] A = new int[5];
> int[] B = A;
> A.length = 4;
> writefln("%d", B.length);  // displays 5.
> 
>In your example, it *appears* to operate on A (the 8-byte array structure)
>because B and A have the same values. That is A.ptr == B.ptr and A.length
>== B.length.

Um... I said "except .length" for a reason. That's my very point. That .length
is the exception. All others operate on A.

>We just have to admit that arrays in D are not the classical array
>definition and are really a different type of thing altogether. Then get to
>learn the rules of D 'arrays'. If you want arrays to behave like objects,
>then maybe you can write an array class.

First of all, this would throw efficiency out the window. Second, let me quote
you a little of the D manifesto:

[Taken from "The D Programming Language" written by Walter Bright]
[Arrays Section]

"Arrays are enhanced from being little more than an alternative syntax for a
pointer into first class objects."

That's, ahem, "First Class Objects," for those that missed it.

Cheers,
--AJG.
July 31, 2005
Re: Walter - Should we use arrays as Null?
In article <dch28c$1nrj$1@digitaldaemon.com>, AJG says...
>
>Hi Derek,
>
>>> int[] A = new int[5];
>>> int[] B = A;
>>> B.SomeProperty; // Operates on A; 
>>> // _Except_ if it's .length.
>>> 
>>> This behaviour seems much more in line with Objects than with Structs, to me.
>>> That's why I don't see how .length should break the current semantics.
>>
>>You are wrong here because 'B.someProperty' operates on B not A. 
>>A simple proof is this ...
>>
>> int[] A = new int[5];
>> int[] B = A;
>> A.length = 4;
>> writefln("%d", B.length);  // displays 5.
>> 
>>In your example, it *appears* to operate on A (the 8-byte array structure)
>>because B and A have the same values. That is A.ptr == B.ptr and A.length
>>== B.length.
>
>Um... I said "except .length" for a reason. That's my very point. That .length
>is the exception. All others operate on A.

No, all others do _NOT_ operate on A.  They happen to operate on the same data
that A points to.  A is a struct with an int and a ptr; obviously changing B's
ptr or B's length does not affect A.  You're thinking about D arrays all wrong.
That's what Derek was getting at.  A and B are two separate objects which happen
to be able to have references to the same data.  For efficiency's sake both the
length and the ptr are assigned by value.  Think of it this way in C, if you
have this structure:

struct Array { int length; void* ptr; } a, b;

a.length = 100;
a.ptr = malloc(100);
b = a;

What does this do?  This is the semantics of D arrays. A and B are distinct
structures, and if you allocate more memory for b then it's not going to change
A.  As you can see this is not the same as reference semantics at all, otherwise
A's ptr would change as well.  If you want reference semantics you are free to
use an array handle.  But the way D arrays are handled is not mystical or
inconsistent.  They're perfectly consistent with themselves, and if you
understand how they operate (which is not hard) then you won't make mistakes.

As for your other issue, where array nullness and length == 0 are being
converged, I do not think this is an issue.  length == 0 is the definition of a
null set (arrays in CS seem to be more in line with sets, dunno why they're
named as they are). But if you want to be consistent with terminology,
technically a null array is an array with all elements set to null.  Can you
show me an example where it matters if length == 0 and arr.ptr == null does not
denote the same thing?

-Sha
July 31, 2005
Re: Walter - Should we use arrays as Null?
AJG wrote:
> 
> I'm not suggesting making .length read-only. I'm suggesting making it operate on
> the same data it has a pointer to. Just like .sort or .reverse would. The way I
> see it, if you explicitly want to make a copy of the data, that's why there is
> dup. Why should .length secretly call .dup sometimes, and sometimes not?
> 
> Cheers,
> --AJG.
> 
> 

First of all, I don't agree with AJG: I think D arrays are just fine the 
way they are now.

There's something, though, and correct me if I'm wrong, but I think 
array.length doesn't go hand in hand with COW.

char [] a;
a.length = 3;
foo(a);
void foo(char [] b)
{
	b[0] = 'f';    // 1
	b.length = 5;  // 2
}

COW says that to do 1 you have to dup first, because you don't own the 
array, but when you do 2, b is automatically dupped. So my point is 
that, to be consistent, maybe resizing should also require dupping.

Am I right? Does it make sense?
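
A small sketch of the two operations, assuming present-day D (the 2005 compiler
may differ in detail); fooRef is a hypothetical variant added only for
comparison.

```d
void foo(char[] b)
{
    b[0] = 'f';     // 1: writes into the data the caller's array also refers to
    b.length = 5;   // 2: changes only foo's local (pointer, length) pair
}

void fooRef(ref char[] b)
{
    b[0] = 'f';
    b.length = 5;   // with ref, the caller's array itself is resized
}

void main()
{
    char[] a;
    a.length = 3;

    foo(a);
    assert(a[0] == 'f');    // the element write is visible to the caller
    assert(a.length == 3);  // the resize is not

    fooRef(a);
    assert(a.length == 5);
}
```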

-- 
Carlos Santander Bernal
July 31, 2005
Re: Walter - Should we use arrays as Null?
Hi,

>>Um... I said "except .length" for a reason. That's my very point. That .length
>>is the exception. All others operate on A.
>
>No, All others do _NOT_ operate on A.  They happen to operate on the same data
>that A points to.

You are simply splitting hairs here. You are arguing language semantics. The
fact of the matter is that for all practical purposes, EXCEPT for .length,
arrays in D are by reference. This means that for all practical purposes, EXCEPT
for .length, B operates on A. It doesn't matter if it's because of the pointer
(an implementation, system-dependent, gory detail) or because of any other
reason.

If assigning an array _immediately_ copied the data, then what you said is true.
But it doesn't, because (a) that would be inefficient, and (b) that would remove
_all_ reference semantics.

Therefore, as it is, reference semantics are broken when it comes to .length.

<snip>
> RE: Arrays as structs.

This is where _you_ are wrong. Arrays are not structs. Arrays do not share the
semantics of structs. Arrays share _implementation details_ with structs, and
that's _it_.

Didn't you see the quote from the D language doc? It clearly says "First-Class
Objects." Not structs. Not primitives. Not pointers.

If you, however, equate that with structs, that's fine. But I certainly do not.

>They're 
>perfectly consistent with themselves,

This means absolutely nothing. A bug can be perfectly consistent with itself and
it is still a bug. To be meaningful, they would have to be consistent with the
rest of the language. Or perhaps, consistent with another part of the language,
like, say, Objects. 

>and if you understand how they operate
>(which is not hard) then 
>you won't make mistakes.

It's not about making mistakes. Sure, I can just as well avoid a function in a
library that is buggy, and I'll avoid a mistake. That's not the point. If
something is broken, then it needs to be fixed. If Walter could perhaps clarify
the semantics of arrays, then we would get somewhere.

>As for your other issue, where array nullness and length == 0 are being
>converged, I do not think this is an issue.  length == 0 is the definition of
>a null set

So? What I would like to express is _No Set_.

>(arrays in CS seem to be more in line with sets, dunno why they're named as
>they are). But if you want to be consistent with terminology, technically a
>null array is an array with all elements set to null. Can you show me an
>example where it matters if length == 0 and arr.ptr == null does not denote the
>same thing?

When you are returning fields from a database, for instance. If you've ever
dealt with a DB, you would know fields can be NULL, meaning no value. This is
different than "", which means explicitly the empty string. It is very difficult
to do this because of certain bugs which meld .length == 0 and .ptr == null.

They are not the same thing. Not semantically. Not technically, at the moment,
except for the "bugs." That's why I'm asking Walter whether he _plans_ on
merging the two into one. If that's his vision, which would be unfortunate, then
those things aren't "bugs" at all, but rather the intended design.


Cheers,
--AJG.
July 31, 2005
Re: Walter - Should we use arrays as Null?
In article <dchgkl$23v5$1@digitaldaemon.com>, AJG says...
>
>Hi,
>
>>>Um... I said "except .length" for a reason. That's my very point. That .length
>>>is the exception. All others operate on A.
>>
>>No, All others do _NOT_ operate on A.  They happen to operate on the same data
>>that A points to.
>
>You are simply splitting hairs here. You are arguing language semantics. The
>fact of the matter is that for all practical purposes, EXCEPT for .length,
>arrays in D are by reference. This means that for all practical purposes, EXCEPT
>for .length, B operates on A. It doesn't matter if it's because of the pointer
>(an implementation, system-dependent, gory detail) or because of any other
>reason.

I am not splitting hairs.  I gave you a very valid reason why a and b are not
references, not even theoretically. They happen to have a reference member that,
in some cases, will point to the same data.  YOU are in full control over when
that happens. If that's not what you intended, then you should be using
references to the ARRAY, rather than using multiple arrays which have references
to the same data.

I might ask you this:  What MAGIC would you like to happen with arrays?  What
you want is not possible without some kind of magic.  Try this example on for
size, from classic C:

int* a = malloc(100 * sizeof(int));
int* b = a;

b = realloc( b, 1000 * sizeof(int) );

Guess what, a is most likely now a bad reference.  Is this what you would like D
to do?  Probably not; you probably want 'a' to point to the new array of length
1000.  Do you want the compiler to magically handle this for you?

Would you like length to be read-only, forcing us to call b = new int[] and
then manually code up the data copy to resize the array?  Starting to sound like
C.... What a pain arrays were.  And 'a' still didn't change automatically to
where b is pointing now.

>If assigning an array _immediately_ copied the data, then what you said is true.
>But it doesn't, because (a) that would be inefficient, and (b) that would remove
>_all_ reference semantics.
>
>Therefore, as it is, reference semantics are broken when it comes to .length.

There are no reference semantics when it comes to arrays.  Maybe what you want
is for D to automagically do a Copy-on-Write.  Any time an array is set to
reference another array's data, a flag could get turned on, and when you use it
as an lvalue while the flag is on, it could dup the array.  But that's silly,
since

b = new int[100];

is perfectly legal in D, and would result in a double memory access if you ever
tried to assign to the array.  Wonder what kind of magic would have to be done
to fix this case.

IMHO, better to let the programmer specify when he wants a and b to point at the
same data.

>
><snip>
>> RE: Arrays as structs.
>
>This is where _you_ are wrong. Arrays are not structs. Arrays do not share the
>semantics of structs. Arrays share _implementation details_ with structs, and
>that's _it_.
>
>Didn't you see the quote from the D language doc? It clearly says "First-Class
>Objects." Not structs. Not primitives. Not pointers.
>
>If you, however, equate that with structs, that's fine. But I certainly do not.

You can't use a language to its full potential if you don't know implementation
details.  There will always be ambiguities about when things are passed by
value, by ref, or whatever else.  As the saying goes: the language is in the
details.

Here's a good example for you, from a VB.NET project i just inherited:

If arr.Length - arr.Replace(",", "").Length <> 17 Then
'error out

What's the big deal?  It's only one line of code, must be just as good as
counting the number of commas in the array....

>
>>They're 
>>perfectly consistent with themselves,
>
>This means absolutely nothing. A bug can be perfectly consistent with itself and
>it is still a bug. To be meaningful, they would have to be consistent with the
>rest of the language. Or perhaps, consistent with another part of the language,
>like, say, Objects. 
>
>>and if you understand how they operate
>>(which is not hard) then 
>>you won't make mistakes.
>
>It's not about making mistakes. Sure, I can just as well avoid a function in a
>library that is buggy, and I'll avoid a mistake. That's not the point. If
>something is broken, then it needs to be fixed. If Walter could perhaps clarify
>the semantics of arrays, then we would get somewhere.
>
>>As for your other issue, where array nullness and length == 0 are being
>>converged, I do not think this is an issue.  length == 0 is the definition of
>>a null set
>
>So? What I would like to express is _No Set_.

Not Set?

>
>>(arrays in CS seem to be more in line with sets, dunno why they're named as
>>they are). But if you want to be consistent with terminology, technically a
>>null array is an array with all elements set to null. Can you show me an
>>example where it matters if length == 0 and arr.ptr == null does not denote
>>the same thing?
>
>When you are returning fields from a database, for instance. If you've ever
>dealt with a DB, you would know fields can be NULL, meaning no value. This is
>different than "", which means explicitly the empty string. It is very difficult
>to do this because of certain bugs which meld .length == 0 and .ptr == null.

I see your point, but any kind of attempt to do that would be abusing the array.
There are laws against array abuse in most countries these days. </sarcasm>

Almost every database API in existence deals with that by having special
objects.

So you have this in your database module:

static char[0] DBNull;

then

char[] foo;
foo = dbCommand.executeScalar( );

// I'm not sure if the .ptr prop is needed here.  Last I heard, if you just use
// the array name it defaults to the ptr.
if( foo is DBNull )
    // oh noes, the field was null!
else
    // oh good ..



>
>They are not the same thing. Not semantically. Not technically, at the moment,
>except for the "bugs." That's why I'm asking Walter whether he _plans_ on
>merging the two into one.

They should never be the same thing.  But there's a gotcha: if .ptr is null,
then length should always be 0.  The other way around is not necessarily true:
just because length == 0, the ptr isn't necessarily null.  This should be the
case when the array was at one point allocated and then its length was reduced.
It should be that way for efficiency.
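
A minimal sketch of that distinction, assuming present-day D (the 2005 compiler
may behave differently, which is part of what this thread is about). isMissing
is a hypothetical helper, not a library function.

```d
/// Hypothetical helper: treats only a null array as "no value".
bool isMissing(const(char)[] field)
{
    return field is null;   // identity test on the (pointer, length) pair
}

void main()
{
    char[] none = null;                 // ptr is null, length is 0
    char[] backing = new char[3];
    char[] empty = backing[0 .. 0];     // length is 0, ptr still points at the block

    assert(none.length == 0 && empty.length == 0);
    assert(none.ptr is null);
    assert(empty.ptr !is null);

    assert(none == empty);              // == compares contents only
    assert(isMissing(none));
    assert(!isMissing(empty));
}
```

Note that == cannot tell the two apart; only `is` or an explicit .ptr check
can, so the distinction is easy to lose along the way.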

That, however, is not useful for your example of DBNulls.  It would be silly to
allocate some space and then just not use it, and say that means somebody
entered something and it was nothing.

> If that's his vision, which would be unfortunate, then
>those things aren't "bugs" at all, but rather the intended design.

What 'things'?  Are you talking about the .ptr value being the same for two
arrays?
July 31, 2005
Re: Walter - Should we use arrays as Null?
In article <dcgkt5$1b4i$1@digitaldaemon.com>, AJG says...
>
>Hi Ben,
>
>>>So then .length is related to slicing? How does the semantics of .length
>>>affect slicing? Or perhaps you meant other benefits?
>>
>>I recommend you pursue some of your ideas where length is manipulated by
>>reference and follow the dependencies to see how different dynamic arrays (and,
>>yes, slicing) would be. In particular I recommend you learn more about slicing.
>>I'm sorry if that sounds harsh but I've gotten the opinion now that you haven't
>>really gotten experience with D arrays as they exist now.
>
>Would an example do? I may not be an expert regarding slicing, but I could see a
>discrete problem if you point it out.

Let me step through some choices that I was hoping you would work through
yourself. Let's start by thinking about what an array with reference-based
length would look like. It would either be a pointer to today's dynamic array (a
ptr and a length) or it would be a pointer to one memory block with the length
stored either at the front or end of the array data. How would slicing work for
those two implementations? For the first, slicing would have to allocate memory
to store the new ptr and new length. For the second, a slice would have to be a
different type, since it is impossible to store the length for the slice in the
middle of the original source array. So that's why I suggested you think through
your initial suggestion and work out the impact on slicing and arrays in
general.

But to be honest I would still prefer the current behavior where the length
information is always available without having to check for null first - even if
you could somehow make the rest of D remain the same as today.
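
For comparison, here is a minimal sketch of how slicing behaves with today's
(ptr, length) representation, assuming present-day D. sharesStorage is a
hypothetical helper written only for this illustration.

```d
/// True if `part` lies entirely inside the memory that `whole` refers to.
bool sharesStorage(const(int)[] whole, const(int)[] part)
{
    return part.ptr >= whole.ptr
        && part.ptr + part.length <= whole.ptr + whole.length;
}

void main()
{
    int[] a = [10, 20, 30, 40, 50];
    int[] s = a[1 .. 3];    // a new (pointer, length) pair; no elements are copied

    assert(s.length == 2 && a.length == 5);
    assert(s.ptr == a.ptr + 1);         // s begins in the middle of a's data
    assert(sharesStorage(a, s));

    s[0] = 99;                          // writes through to a[1]
    assert(a[1] == 99);

    assert(!sharesStorage(a, a.dup));   // dup makes an independent copy
}
```

Because the slice carries its own length, taking it needs no allocation and no
separate slice type.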