June 27, 2004 empty arrays - no complaints? | ||||
---|---|---|---|---|
| ||||
Why are there (almost) no complaints about D's support for empty arrays?
Just to get ex-BASIC programmers in touch with this aspect of D arrays, here's a (not so) small D sample that shows how to create
a) null arrays (named: null1, null2, null3)
b) empty arrays (named: empty1, empty2, empty3)
and also shows how they differ.
[D arrays have sooooo obvious semantics that D programmers should feel free to skip to the end of this post and read the conclusion.]

--------------------- array sample code ---------------------

    void printTraits(char[] array, char[] name)
    {
        printf("\n%10.*s%-13.*s", name, ".length == 0");
        if (array.length == 0)
            printf("%10.*s","is true");
        else
            printf("%10.*s","is false");

        printf("%10.*s%-13.*s", name, " is null");
        if (array is null)
            printf("%10.*s","is true");
        else
            printf("%10.*s","is false");

        printf("\n%10.*s%-13.*s", name, " == null");
        if (array == null)
            printf("%10.*s","is true");
        else
            printf("%10.*s","is false");

        printf("%10.*s%-13.*s", name, " == \"\"");
        if (array == "")
            printf("%10.*s","is true");
        else
            printf("%10.*s","is false");
    }

    int main(char args[][])
    {
        char[] empty1=(new char[1])[0..0];
        char[] empty2="1"[1..1]; // empty2="1"[2..2] causes ArrayBoundsError
        char[] empty3="";

        char[] null1;
        char[] null2=new char[0];
        char[] null3=empty1;
        null3.length=0;

        printTraits(null1, "null1");
        printTraits(null2, "null2");
        printTraits(null3, "null3");
        printf("\n");
        printTraits(empty1, "empty1");
        printTraits(empty2, "empty2");
        printTraits(empty3, "empty3");
        printf("\n\n");

        if (null1 == null)
            printf("%20.*s","null1 == null ");
        if (empty1 == null1)
            printf("%20.*s","empty1 == null1 ");
        if (empty1 != null)
            printf("%20.*s","but empty1 != null");
        printf("\n");
        return 0;
    }

Built with DMD 0.93 (Windows), the output is:

         null1.length == 0    is true     null1 is null        is true
         null1 == null        is true     null1 == ""          is true
         null2.length == 0    is true     null2 is null        is true
         null2 == null        is true     null2 == ""          is true
         null3.length == 0    is true     null3 is null        is true
         null3 == null        is true     null3 == ""          is true

        empty1.length == 0   is false    empty1 is null       is false
        empty1 == null       is false    empty1 == ""          is true
        empty2.length == 0   is false    empty2 is null       is false
        empty2 == null       is false    empty2 == ""          is true
        empty3.length == 0   is false    empty3 is null       is false
        empty3 == null       is false    empty3 == ""          is true

          null1 == null     empty1 == null1   but empty1 != null

--------------------- end of array sample ---------------------

Conclusion: D does have empty-arrays and null-arrays but the language tries to blur them.
This is unfortunate as
1) A clear separation of empty-arrays vs. null-arrays is useful for functionally rich but simple API interfaces:
Imagine a function that returns the value of attributes of an XML element

    char[] getAttrValue(char[] name)

The attribute value could be non-existent (the attribute doesn't exist), be empty, or have a non-empty value.
If empty-arrays vs. null-arrays are blurred, the interface gets more bloated:

    // additional parameter
    char[] getAttrValue(char[] name, out bit isNull)

    // additional function, potentially wasting a slot in the VTable
    bit hasAttrValue(char[] name)

    // additional indirection
    Attribute getAttribute(char[] name)

2) Initialization bugs are not detected at runtime.
D has
- null references for objects
- null for pointers
- NaNs for FP types
- invalid characters for Unicode characters
- guaranteed initialization of structs (constructors are coming, soon!)
- and strong typedefs that empower the programmer to define application-specific 'not-initialized' values for integer types
to make a ubiquitous source of bugs easy to spot and fix. But if empty/null arrays are commonly treated as being the same thing, uninitialized arrays will cause subtle bugs here and there.
3) This aspect of array behaviour is not obvious!
OK, what's obvious is always a moot point. (If I knew what's obvious, I would write posts about bit vs. bool vs. strong bool types.) But I know that the array behaviour is definitely not obvious to all D/C/C++ programmers.
So, why doesn't anyone complain?
Farmer.
|
June 27, 2004 Re: empty arrays - no complaints? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Farmer | In article <Xns9515C8A3CA1ACitsFarmer@63.105.9.61>, Farmer says...
>
> Conclusion: D does have empty-arrays and null-arrays but the language tries to blur them.
Not really. I'd rather argue that D tries to make both usable and reduce odd errors resulting from uninitialized arrays.
> This is unfortunate as
>
> 1) a clear separation of empty-arrays vs. null-arrays is useful for functional rich but simple API interfaces:
>
> Imagine a function that returns the value of attributes of a XML-element char[] getAttrValue(char[] name)
>
> The attribute value could be non-existent (the attribute doesn't exist), be empty, or have a non-empty value.
I'd say this is an interface or documentation problem, not a language problem.
> 2) Initialization bugs are not detected at runtime.
This makes sense in this case. I don't like the idea of having to distinguish between an initialized array with no elements and an uninitialized array, as both are equivalent IMO. Further, setting the length property will cause a reallocation for both types of arrays.
> to make an ubiquitous source of bugs, easy to spot and fix. But if empty/null arrays are commonly treated as being the same thing, uninitialized arrays will cause subtle bugs here and there.
I believe the opposite would be true.
Sean
|
June 27, 2004 Re: empty arrays - no complaints? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Farmer | Farmer wrote:
> Why are there (almost) no complaints about D's support for empty arrays?
>
> Conclusion: D does have empty-arrays and null-arrays but the language tries to blur them.
>
> This is unfortunate ...
>
> So, why doesn't anyone complain?
I think the problem is that D arrays almost always behave like reference types, and therefore are almost always treated like reference types.
They aren't. null arrays *are* empty arrays.
Arrays are value types which consist of a length and a pointer to memory. Copying and slicing an array creates a brand new array whose data happens to (generally) be memory that is also pointed to by another array.
So! Rules of thumb:
1) think of arrays as though they are value types which can be cheaply copied.
2) use .dup if you need to mutate copies made in this way. (the Copy-on-Write principle)
-- andy
|
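Andy's two rules of thumb can be sketched in a few lines. This is only an illustration, assuming a current D2 compiler rather than the DMD 0.93 used elsewhere in the thread; the helper name `firstFour` is hypothetical. It shows that a slice is a new (pointer, length) pair over the same memory, while `.dup` gives an independent copy.

```d
import std.stdio : writeln;

/// Hypothetical helper: returns a slice of `data`, i.e. a new
/// (pointer, length) pair that refers to the SAME memory.
char[] firstFour(char[] data)
{
    return data[0 .. 4];
}

void main()
{
    char[] original = "hello".dup;      // mutable copy of the literal
    char[] view = firstFour(original);  // shares memory with original
    char[] copy = original.dup;         // independent copy of the data

    view[0] = 'H';                      // writes through to the shared memory

    writeln(original); // Hello
    writeln(view);     // Hell
    writeln(copy);     // hello
}
```

The array handle is copied by value on every assignment or call, which is why it is cheap; only `.dup` copies the elements.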
June 27, 2004 Re: empty arrays - no complaints? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Farmer | In article <Xns9515C8A3CA1ACitsFarmer@63.105.9.61>, Farmer says...
>
> Why are there (almost) no complaints about D's support for empty arrays?
Actually, I think that D has got it right here. At least mostly. I'm happy with the fact that null counts as an empty array. But I do have SOME gripes. These are:
(1) Given that a is an array of length n, the expression a[n..n] gives an array bounds exception, and I don't believe it should. I would prefer that it simply evaluated to an empty string. I've lost count of the number of times I've had to put a special test for this case in various bits of code. It's a fairly normal thing to do, to have a pointer (or index in this case) to the first element BEYOND the last one in which you're interested, and to slice against it. Currently you get the assert if n == a.length. I don't believe it should assert unless n > a.length.
(2) I think it is wrong that the test (a == null) will return true if and only if BOTH the length AND the address are zero. I think, if we're going to have a model in which the statement a = null; will create an empty array, then (a == null) should return true if a /is/ an empty array. That is, only the length should be tested, not the address. (If you want to test both parts, well there's always a === null).
Arcane Jill
|
June 27, 2004 Re: empty arrays - no complaints? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Arcane Jill | On Sun, 27 Jun 2004 18:58:50 +0000 (UTC), Arcane Jill <Arcane_member@pathlink.com> wrote:
> In article <Xns9515C8A3CA1ACitsFarmer@63.105.9.61>, Farmer says...
>>
>> Why are there (almost) no complaints about D's support for empty arrays?
>
> Actually, I think that D has got it right here. At least mostly. I'm happy with the fact that null counts as an empty array. But I do have SOME gripes. These are:
>
> (1) given that a is an array of length n, the expression a[n..n] gives an array bounds exception, and I don't believe it should. I would prefer that it simply evaluated to an empty string. I've lost count of the number of times I've had to put a special test for this case in various bits of code. It's a fairly normal thing to do, to have a pointer (or index in this case) to the first element BEYOND the last one in which you're interested, and to slice against it. Currently you get the assert if n == a.length. I don't believe it should assert unless n > a.length
This (now?) works.

    void main()
    {
        char[] a;
        a ~= "1";
        a ~= "2";
        a ~= "3";
        printf("%.*s\n",a[3..3]);
        printf("%.*s\n",a[2..3]);
        printf("%.*s\n",a[1..3]);
        printf("%.*s\n",a[0..3]);
    }

> (2) I think it is wrong that the test (a == null) will return true if and only if BOTH the length AND the address are zero.
I think this is correct.
> I think, if we're going to have a model in which the statement a = null; will create an empty array,
I think this is wrong. a = null should set the data to null and length to 0. It should *not* create an empty array.
> then (a == null) should return true if a /is/ an empty array. That is, only the length should be tested, not the address. (If you want to test both parts, well there's always a === null).
We *need* to have *both* null and empty arrays. The reason is pretty simple:
- null means does not exist
- empty means exists, but has no value (or empty value)
This is important in situations like the one the original poster mentioned, and in my experience, for example, when reading POST input from a web page. You get a string like so:

    Setting1=Regan+Heath&Setting2=&&

When requesting items you might have a function like:

    char[] getFormValue(char[] label);

The code to get the values for the above form might go:

    char[] s;
    s = getFormValue("Setting1"); // s is "Regan Heath"
    s = getFormValue("Setting2"); // s is ""
    s = getFormValue("Setting3"); // s is null

It is important that the above code can tell that Setting3 was not passed in the form, so it can decide not to overwrite whatever current value that setting has, whereas it can tell Setting2 was passed and will overwrite the current value with a new blank one.
I think the problem with arrays is that a null array should not compare equal to an empty array. In other words, the original post's tests

    null1 == ""
    null1 == empty1

should be false.
Regan.
-- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
|
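For what it's worth, the getFormValue idea above might be sketched like this. It is a rough illustration only, assuming a current D2 compiler with Phobos (not DMD 0.93); the extra `query` parameter and the function body are assumptions added for the example, and URL decoding (turning `+` into a space) is left out. A missing label yields null, while a label with nothing after `=` yields a zero-length slice whose pointer is non-null, so the caller can tell the two apart with `is null`.

```d
import std.algorithm.iteration : splitter;
import std.string : indexOf;

/// Hypothetical sketch: look up `label` in a raw form string such as
/// "Setting1=Regan+Heath&Setting2=&&".
/// Returns null if the label is absent, "" if it is present but empty.
string getFormValue(string query, string label)
{
    foreach (pair; query.splitter('&'))
    {
        auto eq = pair.indexOf('=');
        if (eq < 0)
            continue;                 // no '=' here (covers empty segments too)
        if (pair[0 .. eq] == label)
            return pair[eq + 1 .. $]; // may be zero-length, but points into query
    }
    return null;                      // label was not submitted at all
}

void main()
{
    string form = "Setting1=Regan+Heath&Setting2=&&";

    assert(getFormValue(form, "Setting1") == "Regan+Heath");

    string s2 = getFormValue(form, "Setting2");
    assert(s2.length == 0 && s2 !is null);   // present, but empty

    assert(getFormValue(form, "Setting3") is null); // not present
}
```

The caller has to test with `is null` rather than `== null` for this to work, since `==` on arrays compares lengths and contents only.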
June 27, 2004 Re: empty arrays - no complaints? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Regan Heath | On Mon, 28 Jun 2004 10:06:18 +1200, Regan Heath wrote:
[snip]
>
> We *need* to have *both* null and empty arrays. The reason is pretty simple:
> - null means does not exist
> - empty means exists, but has no value (or empty value)
>
Agreed. A non-existent array is not the same as an array with no elements.
-- Derek Melbourne, Australia
|
June 27, 2004 Re: empty arrays - no complaints? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Sean Kelly | Sean Kelly <sean@f4.ca> wrote in news:cbn29h$rpo$1@digitaldaemon.com:
> Not really. I'd rather argue that D tries to make both usable and reduce odd errors resulting from uninitialized arrays.
I think D tries to *hide* errors resulting from uninitialized arrays.
>
>> This is unfortunate as
>>
>> 1) a clear separation of empty-arrays vs. null-arrays is useful for functional rich but simple API interfaces:
>>
>> Imagine a function that returns the value of attributes of a XML-element char[] getAttrValue(char[] name)
>>
>> The attribute value could be non-existent (the attribute doesn't exist), be empty, or have a non-empty value.
>
> I'd say this is an interface or documentation problem, not a language problem.
You misunderstood me; I meant that the function interface is a good one. I could document the function like this:

    /* Function returns the value of the attribute with the given name.
       @param name  name of the attribute
       @return      null if the attribute doesn't exist,
                    the value of the attribute otherwise
    */
    char[] getAttrValue(char[] name)

But the other functions I mentioned would be a necessary workaround if you couldn't distinguish between null and empty arrays. And these functions are a waste of both CPU cycles and developer brain.
>> 2) Initialization bugs are not detected at runtime.
>
> This makes sense in this case. I don't like the idea of having to distinguish between an initialized array with no elements and an uninitialized array, as both are equivalent IMO. Further, setting the length property will cause a reallocation for both types of arrays.
Well, it's quite easy to distinguish between an empty and a null array: An uninitialized array (null array) is a bug in either the programmer's code or in the code of a library. An initialized array (empty array) is a perfectly legal thing. Why is the idea of distinguishing between a bug and correct program behaviour such an unpleasant thing?
Reallocation occurs if the length is greater than the allocated size. I'm fine with that; the length 'property' is such an oddity that whatever it does, I would call it consistent. Reallocation is guaranteed not to happen if the new length is less than or equal to the allocated size (Walter said so). Well, except when the new length happens to be 0. Talk about consistency.
|
June 27, 2004 Re: empty arrays - no complaints? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Arcane Jill | Arcane Jill <Arcane_member@pathlink.com> wrote in news:cbn5da$vu1$1@digitaldaemon.com:
> In article <Xns9515C8A3CA1ACitsFarmer@63.105.9.61>, Farmer says...
>>
>> Why are there (almost) no complaints about D's support for empty arrays?
>
> Actually, I think that D has got it right here. At least mostly. I'm happy with the fact that null counts as an empty array. But I do have SOME gripes. These are:
>
> (1) given that a is an array of length n, the expression a[n..n] gives an array bounds exception, and I don't believe it should. I would prefer that it simply evaluated to an empty string. I've lost count of the number of times I've had to put a special test for this case in various bits of code. It's a fairly normal thing to do, to have a pointer (or index in this case) to the first element BEYOND the last one in which you're interested, and to slice against it. Currently you get the assert if n == a.length. I don't believe it should assert unless n > a.length
I'm a bit confused, since in my sample the array 'empty2' is created from a slice that points just past the end of the array, and it didn't cause an array bounds exception. Or did you need empty slices that point at arbitrary memory locations?
> (2) I think it is wrong that the test (a == null) will return true if and only if BOTH the length AND the address are zero. I think, if we're going to have a model in which the statement a = null; will create an empty array, then (a == null) should return true if a /is/ an empty array. That is, only the length should be tested, not the address. (If you want to test both parts, well there's always a === null).
I guess the rule here is simple: For value types (as the array handle is one) ==/equals() is exactly the same as ===/is.
But why should we model arrays in a way that makes arrays less powerful and requires *additional* code to make the model work correctly?
Regards, Farmer.
|
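The two tests being argued about here, identity against null versus a zero length, can be written down side by side. This is a small sketch assuming a current D2 compiler, where `==` on arrays compares lengths and contents (which differs from the DMD 0.93 output in the original post); the helper names `isNullArray` and `isEmptyArray` are hypothetical.

```d
/// Hypothetical helper: true when the array handle itself is null
/// (null pointer and zero length).
bool isNullArray(const(char)[] a)
{
    return a is null;
}

/// Hypothetical helper: true when the array has no elements,
/// whatever its pointer happens to be.
bool isEmptyArray(const(char)[] a)
{
    return a.length == 0;
}

void main()
{
    char[] nullArr;                      // default-initialized: null
    char[] emptyArr = "1".dup[1 .. 1];   // zero-length slice at the end of live data

    assert(isNullArray(nullArr)   && isEmptyArray(nullArr));
    assert(!isNullArray(emptyArr) && isEmptyArray(emptyArr));

    // In D2, == looks only at length and contents, so the two compare equal.
    assert(nullArr == emptyArr);
}
```

So in this sketch the null/empty distinction survives only through `is`; code that relies on it has to avoid `==` for that purpose.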
June 27, 2004 Re: empty arrays - no complaints? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andy Friesen | Andy Friesen <andy@ikagames.com> wrote in news:cbn3js$tgq$1@digitaldaemon.com:
>
> I think the problem is that D arrays almost always behave like reference types, and therefore are almost always treated like reference types.
Yes, this is a problem. It is a necessary evil to achieve that outstanding performance. But it is not really related to the topic of null array vs. empty array, since empty arrays are possible with the D array layout.
> They aren't. null arrays *are* empty arrays.
No, null arrays are not empty arrays, as my sample proves.
> Arrays are value types which consist of a length and a pointer to memory. Copying and slicing an array creates a brand new array whose data happens to (generally) be memory that is also pointed to by another array.
I think there's a slip here: slices *always* point to the same memory as the array from which they were created.
Regards, Farmer.
|
June 27, 2004 Re: empty arrays - no complaints? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Farmer | Farmer wrote:
> Andy Friesen <andy@ikagames.com> wrote in news:cbn3js$tgq$1@digitaldaemon.com:
>
>> I think the problem is that D arrays almost always behave like reference types, and therefore are almost always treated like reference types.
>
> Yes, this is a problem. It is a necessary evil to achieve that outstanding performance. But it is not really related to the topic null array vs. empty array, since empty arrays are possible with the D array layout
Sure, in the same sense that D allows 'empty' integers. :)
>> They aren't. null arrays *are* empty arrays.
>
> No, null arrays are not empty arrays, as my sample proves.
Conceptually they are. If the length is zero, then the data pointer is meaningless. Testing the data pointer in such a case can be likened to using the result of a division by zero. Doing things like mathematically 'proving' that 3==5 or that empty!==null is easy when you go into the twilight zone. :)
As an example:

    import std.string;

    char[] permute(char[] c)
    {
        // mutate that to which the array refers
        c[0] = 'H';
        // mutate the array
        c.length = 4;
        return c;
    }

    int main()
    {
        char[] c = "hello world!";
        printf("%s\n", toStringz(c));
        char[] d = permute(c);
        printf("Post-permute\n");
        printf("%s\n", toStringz(c));
        printf("%s\n", toStringz(d));
        return 0;
    }

This program produces the output:

    hello world!
    Hello world!
    Hell

The array is a value type. The data it points to is not.
>> Arrays are value types which consist of a length and a pointer to memory. Copying and slicing an array creates a brand new array whose data happens to (generally) be memory that is also pointed to by another array.
>
> I think there's a lapsus, slices *always* point to the same memory as the array from which they were created.
In my experience, this is true, but I don't know if it *must* be, so I felt obligated to qualify my statement.
-- andy
|
Copyright © 1999-2021 by the D Language Foundation