April 25, 2009 Re: If T[new] is the container for T[], then what is the container for T[U]? | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Robert Jacques | Robert Jacques wrote:
> 1) In what way would it help to give the owner an identity?
For example, on an implementation that doesn't want gc, they'd need the owner for deterministic destruction.
Again, slices could be conflated with arrays mainly because the gc took care of the litter left after rebinding slices. If you want to offer the option to do without a gc, then arrays and slices (= ranges) must be separate entities. Then you hold on to the arrays and pass slices around. When you destroy the arrays, the slices originating from them become invalid (which can be checked or not).
With hashes it's even more interesting. We need two types: the hash and the hash range that crawls over that hash. Right now they are conflated into one V[K]. If we want to define a range crawling over a hash (and consequently do away with opApply) we'd need the range to store a little stack of breadcrumbs so the range knows how to iterate. So definitely there must be two entities. Traditionally we've been using V[K] both as a range and as a container, but the usage so far was closer to a range (I think without being sure). So one issue is to find a good type literal to express "type for which the corresponding range is V[K]".
Makes sense?
Andrei
| |||
April 25, 2009 Re: If T[new] is the container for T[], then what is the container for T[U]? | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | Op Sat, 25 Apr 2009 15:07:52 +0200 schreef Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org>: > It looks we can't make it with only T[]. We need a genuine container type, and T[new] was suggested. It would probably have value semantics. > > T[U] seems to have the same problem. If T[U] is the range, then how do you call the container? T[new U] ? T[new(U)] ? T[U new] ? you knew...? still, reusing new like that looks pretty akward to me. | |||
April 25, 2009 Re: If T[new] is the container for T[], then what is the container | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Robert Jacques | Robert Jacques Wrote:
> On Sat, 25 Apr 2009 14:21:49 -0400, Unknown W. Brackets <unknown@simplemachines.org> wrote:
> > I'm not talking about invariant(char)[], I'm talking about invariant(char[]). That cannot be extended in length. Really, what I want is something that is a "invariant(variant(char)[])", I suppose, but I don't even think that's necessary.
> >
> > When the default string type was made invariant, many people were non-plussed at best. That doesn't mean it was wrong.
>
> Okay I think we're on different pages, so I'll reiterate:
> > Robert Jacques wrote:
> >> No, immutability really applies to the element and not just the array
> >> length.
>
> 1) Immutability is a bad way to express 'this array doesn't change
> length'. Consider arrays of numbers instead of characters and you'll see
> what I mean.
> 2) A large portion of the time immutable arrays are concatenated (i.e.
> "hello" ~ "world")
>
> > How crazy would it be if, for example, std.string.split returned invariant(string[]) instead of string[]? I doubt it would hurt many people, in actuality, and you could always get a mutable copy.
> >
> > -[Unknown]
>
> 1) At first glance, I don't see any reason why split couldn't or shouldn't, besides string[] being shorter than immutable string[]. 2) I don't see how this is relevant to array capacities / builders.
invariant(char[]) is not rebindable. There are bugs in bugzilla that show holes in the usage of invariant(char)[] instead of a rebindable invariant(char[]).
| |||
April 25, 2009 Re: If T[new] is the container for T[], then what is the container for T[U]? | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On 2009-04-25 18:04:32 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> said: > Again, slices could be conflated with arrays mainly because the gc took care of the litter left after rebinding slices. If you want to offer the option to do without a gc, then arrays and slices (= ranges) must be separate entities. Then you hold on to the arrays and pass slices around. When you destroy the arrays, the slices originating from them become invalid (which can be checked or not). > > With hashes it's even more interesting. We need two types: the hash and the hash range that crawls over that hash. Right now they are conflated into one V[K]. If we want to define a range crawling over a hash (and consequently do away with opApply) we'd need the range to store a little stack of breadcrumbs so the range knows how to iterate. So definitely there must be two entities. Traditionally we've been using V[K] both as a range and as a container, but the usage so far was closer to a range (I think without being sure). So one issue is to find a good type literal to express "type for which the corresponding range is V[K]". You say there must be two entities. But the two entities already exist, only the container is stored by reference. T[] is a reference to a portion of a fixed-size array (static array or heap array). T[U] is a reference to an associative array. If you could call "new" and "delete" on these two types to delete the referenced container, or if the container were reference-counted, you wouldn't need a garbage collector. It's pretty much the same as for classes in fact. -- Michel Fortin michel.fortin@michelf.com http://michelf.com/ | |||
April 25, 2009 Re: If T[new] is the container for T[], then what is the container for T[U]? | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Sat, 25 Apr 2009 18:04:32 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote: > Robert Jacques wrote: >> 1) In what way would it help to give the owner an identity? > > For example, on an implementation that doesn't want gc, they'd need the owner for deterministic destruction. Why is a single owner needed for deterministic destruction? > Again, slices could be conflated with arrays mainly because the gc took care of the litter left after rebinding slices. If you want to offer the option to do without a gc, then arrays and slices (= ranges) must be separate entities. Then you hold on to the arrays and pass slices around. When you destroy the arrays, the slices originating from them become invalid (which can be checked or not). What is wrong with reference counting arrays just like other objects? (which is what I've been proposing) Also, I see a major logical problem with your proposal: Given A single owner is required for deterministic destruction. Then The array's owner is responsible for the deterministic destruction. Therefore When the array's owner goes out of scope, it must destruct the array. But Then How does a function return an array? Your original post mentioned value semantics, but that results in a lot of needless allocation and copying. > With hashes it's even more interesting. We need two types: the hash and the hash range that crawls over that hash. Right now they are conflated into one V[K]. If we want to define a range crawling over a hash (and consequently do away with opApply) we'd need the range to store a little stack of breadcrumbs so the range knows how to iterate. So definitely there must be two entities. Traditionally we've been using V[K] both as a range and as a container, but the usage so far was closer to a range (I think without being sure). So one issue is to find a good type literal to express "type for which the corresponding range is V[K]". While separating hashes and ranges is important, it seems orthogonal to the topic at hand. > Makes sense? Yes and no. What I wanted was some high-level reason why have T[new] and T[] was fundamentally better than only having T[]. What I got was a low level assertion that they are needed to do -nogc. However, since earlier in this thread I provided a counter example, i.e. reference counted arrays, your assertion stands as false. And a language change shouldn't be based on a faulty assumption. | |||
April 25, 2009 Re: If T[new] is the container for T[], then what is the container for T[U]? | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Michel Fortin | Michel Fortin wrote: > On 2009-04-25 18:04:32 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> said: > >> Again, slices could be conflated with arrays mainly because the gc took care of the litter left after rebinding slices. If you want to offer the option to do without a gc, then arrays and slices (= ranges) must be separate entities. Then you hold on to the arrays and pass slices around. When you destroy the arrays, the slices originating from them become invalid (which can be checked or not). >> >> With hashes it's even more interesting. We need two types: the hash and the hash range that crawls over that hash. Right now they are conflated into one V[K]. If we want to define a range crawling over a hash (and consequently do away with opApply) we'd need the range to store a little stack of breadcrumbs so the range knows how to iterate. So definitely there must be two entities. Traditionally we've been using V[K] both as a range and as a container, but the usage so far was closer to a range (I think without being sure). So one issue is to find a good type literal to express "type for which the corresponding range is V[K]". > > You say there must be two entities. But the two entities already exist, only the container is stored by reference. T[] is a reference to a portion of a fixed-size array (static array or heap array). Yah, and the question is where is the entire array. > T[U] is a reference to an associative array. Yah, and the question is where is the range that the associative array uses for iteration. > If you could call "new" and "delete" on these two types to delete the referenced container, or if the container were reference-counted, you wouldn't need a garbage collector. It's pretty much the same as for classes in fact. It might be wise to make those types values, not references. Andrei | |||
April 25, 2009 Re: If T[new] is the container for T[], then what is the container for T[U]? | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Robert Jacques | Robert Jacques wrote:
> On Sat, 25 Apr 2009 18:04:32 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> Also, I see a major logical problem with your proposal:
> Given A single owner is required for deterministic destruction.
> Then The array's owner is responsible for the deterministic destruction.
> Therefore When the array's owner goes out of scope, it must destruct the array.
> But Then How does a function return an array?
My understanding is that you don't know how C++ and D value semantics work. It would be impossible to answer your (albeit good) question without first writing a tutorial on the topic, which unfortunately I don't have the time for.
Andrei
| |||
April 25, 2009 Re: If T[new] is the container for T[], then what is the container for T[U]? | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Robert Jacques | Robert Jacques wrote:
> On Sat, 25 Apr 2009 18:04:32 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>
>> Robert Jacques wrote:
>>> 1) In what way would it help to give the owner an identity?
>>
>> For example, on an implementation that doesn't want gc, they'd need the owner for deterministic destruction.
>
> Why is a single owner needed for deterministic destruction?
>
>> Again, slices could be conflated with arrays mainly because the gc took care of the litter left after rebinding slices. If you want to offer the option to do without a gc, then arrays and slices (= ranges) must be separate entities. Then you hold on to the arrays and pass slices around. When you destroy the arrays, the slices originating from them become invalid (which can be checked or not).
>
> What is wrong with reference counting arrays just like other objects? (which is what I've been proposing)
>
> Also, I see a major logical problem with your proposal:
> Given A single owner is required for deterministic destruction.
> Then The array's owner is responsible for the deterministic destruction.
> Therefore When the array's owner goes out of scope, it must destruct the array.
> But Then How does a function return an array?
>
> Your original post mentioned value semantics, but that results in a lot of needless allocation and copying.
>
>> With hashes it's even more interesting. We need two types: the hash and the hash range that crawls over that hash. Right now they are conflated into one V[K]. If we want to define a range crawling over a hash (and consequently do away with opApply) we'd need the range to store a little stack of breadcrumbs so the range knows how to iterate. So definitely there must be two entities. Traditionally we've been using V[K] both as a range and as a container, but the usage so far was closer to a range (I think without being sure). So one issue is to find a good type literal to express "type for which the corresponding range is V[K]".
>
> While separating hashes and ranges is important, it seems orthogonal to the topic at hand.
>
>> Makes sense?
>
> Yes and no. What I wanted was some high-level reason why have T[new] and T[] was fundamentally better than only having T[]. What I got was a low level assertion that they are needed to do -nogc. However, since earlier in this thread I provided a counter example, i.e. reference counted arrays, your assertion stands as false. And a language change shouldn't be based on a faulty assumption.
Simple solution: put the array definition in object.d and try implementing arrays with reference counting or manual memory management.
I think stored slices break manual memory management, even with a dedicated slice type; but they should work just fine with refcounting. If you don't want to change the language, object.Array will have to implement the logic for slices and for allocated arrays. It's a bit ugly, and it makes the Array type larger. Also, Array's reference count would need to be accessed by reference.
| |||
April 25, 2009 Re: If T[new] is the container for T[], then what is the container for T[U]? | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Christopher Wright | Christopher Wright wrote:
> Simple solution: put the array definition in object.d and try implementing arrays with reference counting or manual memory management.
>
> I think stored slices break manual memory management, even with a dedicated slice type; but they should work just fine with refcounting. If you don't want to change the language, object.Array will have to implement the logic for slices and for allocated arrays. It's a bit ugly, and it makes the Array type larger. Also, Array's reference count would need to be accessed by reference.
There are three schemes that are good starting points:
1. As right now :o).
2. Using refcounting. Arrays will be 4 words long (begin, end, end-of-store, refcount*) and slices will be 3 words long (begin, end, and owner*)
3. Using manual management. Arrays will be 3 words long (begin, end, end-of-store) and slices will be 2 words long (begin, end). This is close to C++/STL. This case is less safe but very efficient.
If we manage to integrate them all... that's quite the holy grail. And I think it's entirely possible with only a few changes to the language.
Andrei
| |||
April 26, 2009 Re: If T[new] is the container for T[], then what is the container for T[U]? | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Robert Jacques | Let me say it a different way:
split() returns results that are, often, stored or compared. It is not often that these results are concatenated to or modified. Possibly, they may not be sliced (further) often either.
Such a case represents a perfect example of an array (or string) that need not have a greater capacity than its length. Indeed; no property of the array is likely to change - its length, contents, capacity, or even ptr. All are likely to remain constant until the object is deleted from memory.
So, it would seem to me, this return value (as a quick example) could easily be changed to one that fits the following:
1. Is an array of data.
2. Does not have mutable contents (data.)
3. Does not allow its length or capacity to be extended (excuse me for considering this mutation.)
4. Is a reference, rather than a static array.
My suggestion is that "char[]" should _be_ a StringBuilder, and "int[]" should _be_ an ArrayBuilder, and we should adopt a separate syntax for something that _isn't_ one of those.
That said, I did just browse some of my D source files - an XML 1.0 parser/tree and a custom rfc-compliant ftp server - and nearly the only concatenation I use is building exception strings. I'm a little surprised. Only a couple times do I do anything a builder would be useful for.
One time is on XML child nodes; since I'm building a list of children, it is a "builder" type thing. However, my XML tree is fully incremental - that means you could traverse it before all of the data has been read from the net. It would not be practical for me to use a builder that didn't have proper read access for this case.
I really think using a "StringBuilder" class is just like the solution of using a "String" class. D, in my opinion, proved the latter was not necessary; I hold that the former isn't either.
-[Unknown]
Robert Jacques wrote:
> On Sat, 25 Apr 2009 14:21:49 -0400, Unknown W. Brackets <unknown@simplemachines.org> wrote:
>> I'm not talking about invariant(char)[], I'm talking about invariant(char[]). That cannot be extended in length. Really, what I want is something that is a "invariant(variant(char)[])", I suppose, but I don't even think that's necessary.
>>
>> When the default string type was made invariant, many people were non-plussed at best. That doesn't mean it was wrong.
>
> Okay I think we're on different pages, so I'll reiterate:
>> Robert Jacques wrote:
>>> No, immutability really applies to the element and not just the array length.
>
> 1) Immutability is a bad way to express 'this array doesn't change length'. Consider arrays of numbers instead of characters and you'll see what I mean.
> 2) A large portion of the time immutable arrays are concatenated (i.e. "hello" ~ "world")
>
>> How crazy would it be if, for example, std.string.split returned invariant(string[]) instead of string[]? I doubt it would hurt many people, in actuality, and you could always get a mutable copy.
>>
>> -[Unknown]
>
> 1) At first glance, I don't see any reason why split couldn't or shouldn't, besides string[] being shorter than immutable string[].
> 2) I don't see how this is relevant to array capacities / builders.
| |||
Copyright © 1999-2021 by the D Language Foundation
Permalink
Reply