April 25, 2009
Robert Jacques, el 25 de abril a las 10:58 me escribiste:
> On Sat, 25 Apr 2009 09:16:39 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> 
> >Stewart Gordon wrote:
> >>Andrei Alexandrescu wrote:
> >>>It looks we can't make it with only T[]. We need a genuine container type, and T[new] was suggested.
> >><snip>
> >> What do you mean by a "genuine container type"?
> >
> >One with capacity and probably value semantics.
> >
> >Andrei
> 
> First, capacity is only static (i.e. can be included in the array as you suggest) for free-list based allocation. Which is _only_ used by malloc and mark-sweep GCs. And I'm really hoping Leandro gives us something better than mark-sweep.

I don't think I will, not at least for my thesis (I will be concentrating in make the current GC concurrent and making pauses as small as possible).

But maybe Keith will: http://www.dsource.org/projects/tango/forums/topic/743

8-)

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------
The diference is simple: hackers build things, crackers break them.
April 25, 2009
Well, a range of an associative array is a possibility (not a slice, but e.g. from std.algorithm.)

I think it'd be a mistake to discount that.  Being able to treat associative arrays like arrays, in some cases (e.g. with count(), etc.) is nice.  That said, i don't even know that std.algorithm currently supports this.

-[Unknown]


Steve Schveighoffer wrote:
> On Sat, 25 Apr 2009 08:07:52 -0500, Andrei Alexandrescu wrote:
> 
>> It looks we can't make it with only T[]. We need a genuine container
>> type, and T[new] was suggested. It would probably have value semantics.
>>
>> T[U] seems to have the same problem. If T[U] is the range, then how do
>> you call the container?
>>
>> If we follow through with a no-garbage-collected option for the core
>> types, then we also need to distinguish between a slice and its
>> container. The fact that we (almost) got away with T[] being at the same
>> time the container and the slice was that the garbage collector would
>> collect all unused slices.
>>
>>
>> Andrei
> 
> You are confusing the difference between T[] and T[U].  T[U] is a strange beast because it does not need to be new'd, but it acts completely like a reference type.  But T[U] *is* the container type.  What you should be asking is what is the *slice* type for T[U].  My answer would be that a slice type for a hashtable doesn't make sense.  We don't need AAs to be completely on par with normal arrays, so just leave them alone ;)
> 
> -Steve
April 25, 2009
The only downside to this is, it's a Bad Thing to have the size of an array be different between GC's, but I think it's probably not entirely avoidable.

But, I really think it's the only good way to do it, imho.  Maybe the exact implementation is different, but creating a new type just seems even messier... the beauty of slices in D is that they're Not Different (TM.)

-[Unknown]


Robert Jacques wrote:
> On Sat, 25 Apr 2009 09:07:52 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>> It looks we can't make it with only T[]. We need a genuine container type, and T[new] was suggested. It would probably have value semantics.
>>
>> T[U] seems to have the same problem. If T[U] is the range, then how do you call the container?
>>
>> If we follow through with a no-garbage-collected option for the core types, then we also need to distinguish between a slice and its container. The fact that we (almost) got away with T[] being at the same time the container and the slice was that the garbage collector would collect all unused slices.
>>
>>
>> Andrei
> 
> No, it's perfectly possible to have T[] be the same type for the container and the slice with reference counting instead of a full GC. All you need to do is store a pointer to the memory block's node. Then it's trivial to 1) get the capacity and 2) increase / decrease the reference count. You could even add an 'allocated length' parameter to the node, so you'd avoid some of the slicing bugs in the current implementation.
> 
> struct T[] {
>     T*     ptr;
>     size_t length;
>     void*  memory_block;
> }
April 25, 2009
What about simply using the const/invariant information?

After all, an "array builder" is a mutable array.  If you don't want to extend it, it should be invariant or const - e.g. invariant(string).

I've always thought the separation of strings and string builders, arrays and array builders, etc. was flawed.  It's just like strcmp; sure, it works, but is it really the right way to compare strings? Can't I just have my == and not worry about it?

-[Unknown]


Robert Jacques wrote:
> 3) Over-allocating all arrays is a wonderful waste of memory and performance (re chache lines, etc). A cleaner answer is an array builder class, which only pays for this overhead when it's actually needed.
April 25, 2009
Andrei Alexandrescu wrote:
> It looks we can't make it with only T[].

Citation needed. If this is a continuation of a previous discussion, a link to the previous discussion will suffice.

I grant that appending to an array is expensive. Make Array a user-defined type and we can see what's appropriate.
April 25, 2009
On Sat, 25 Apr 2009 13:30:22 -0400, Unknown W. Brackets <unknown@simplemachines.org> wrote:
> What about simply using the const/invariant information?
>
> After all, an "array builder" is a mutable array.  If you don't want to extend it, it should be invariant or const - e.g. invariant(string).

No, immutability really applies to the element and not just the array length. Besides, what about "hello" ~ "world"? Essentially, an "array builder" is an array with a cheaply, extend-able length and it's often used for immutable strings, etc. Also, while a concatenate strings all the time I never end up concatenating non-char arrays. And they almost always end up being mutable.

> I've always thought the separation of strings and string builders, arrays and array builders, etc. was flawed.  It's just like strcmp; sure, it works, but is it really the right way to compare strings? Can't I just have my == and not worry about it?

Well, you can just use ~= or ~ and not worry about it. Array builders are a performance enhancement for one special case at the cost of a general performance degradation. (There was a long discussion previously about these trade-offs)


April 25, 2009
I'm not talking about invariant(char)[], I'm talking about invariant(char[]).  That cannot be extended in length.  Really, what I want is something that is a "invariant(variant(char)[])", I suppose, but I don't even think that's necessary.

When the default string type was made invariant, many people were non-plussed at best.  That doesn't mean it was wrong.

How crazy would it be if, for example, std.string.split returned invariant(string[]) instead of string[]?  I doubt it would hurt many people, in actuality, and you could always get a mutable copy.

-[Unknown]


Robert Jacques wrote:
> On Sat, 25 Apr 2009 13:30:22 -0400, Unknown W. Brackets <unknown@simplemachines.org> wrote:
>> What about simply using the const/invariant information?
>>
>> After all, an "array builder" is a mutable array.  If you don't want to extend it, it should be invariant or const - e.g. invariant(string).
> 
> No, immutability really applies to the element and not just the array length. Besides, what about "hello" ~ "world"? Essentially, an "array builder" is an array with a cheaply, extend-able length and it's often used for immutable strings, etc. Also, while a concatenate strings all the time I never end up concatenating non-char arrays. And they almost always end up being mutable.
> 
>> I've always thought the separation of strings and string builders, arrays and array builders, etc. was flawed.  It's just like strcmp; sure, it works, but is it really the right way to compare strings? Can't I just have my == and not worry about it?
> 
> Well, you can just use ~= or ~ and not worry about it. Array builders are a performance enhancement for one special case at the cost of a general performance degradation. (There was a long discussion previously about these trade-offs)
> 
> 
April 25, 2009
On Sat, 25 Apr 2009 14:21:49 -0400, Unknown W. Brackets <unknown@simplemachines.org> wrote:
> I'm not talking about invariant(char)[], I'm talking about invariant(char[]).  That cannot be extended in length.  Really, what I want is something that is a "invariant(variant(char)[])", I suppose, but I don't even think that's necessary.
>
> When the default string type was made invariant, many people were non-plussed at best.  That doesn't mean it was wrong.

Okay I think we're on different pages, so I'll reiterate:
> Robert Jacques wrote:
>>  No, immutability really applies to the element and not just the array length.

1) Immutability is a bad way to express 'this array doesn't change length'. Consider arrays of numbers instead of characters and you'll see what I mean.
2) A large portion of the time immutable arrays are concatenated (i.e. "hello" ~ "world")

> How crazy would it be if, for example, std.string.split returned invariant(string[]) instead of string[]?  I doubt it would hurt many people, in actuality, and you could always get a mutable copy.
>
> -[Unknown]

1) At first glance, I don't see any reason why split couldn't or shouldn't, besides string[] being shorter than immutable string[].
2) I don't see how this is relevant to array capacities / builders.
April 25, 2009
Robert Jacques wrote:
> On Sat, 25 Apr 2009 09:07:52 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>> It looks we can't make it with only T[]. We need a genuine container type, and T[new] was suggested. It would probably have value semantics.
>>
>> T[U] seems to have the same problem. If T[U] is the range, then how do you call the container?
>>
>> If we follow through with a no-garbage-collected option for the core types, then we also need to distinguish between a slice and its container. The fact that we (almost) got away with T[] being at the same time the container and the slice was that the garbage collector would collect all unused slices.
>>
>>
>> Andrei
> 
> No, it's perfectly possible to have T[] be the same type for the container and the slice with reference counting instead of a full GC. All you need to do is store a pointer to the memory block's node. Then it's trivial to 1) get the capacity and 2) increase / decrease the reference count. You could even add an 'allocated length' parameter to the node, so you'd avoid some of the slicing bugs in the current implementation.
> 
> struct T[] {
>     T*     ptr;
>     size_t length;
>     void*  memory_block;
> }

Yes, but that way you take the owner (i.e. the memory block) completely out of the picture. It would help to give the owner an identity.

Andrei
April 25, 2009
On Sat, 25 Apr 2009 16:16:07 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> Robert Jacques wrote:
>> On Sat, 25 Apr 2009 09:07:52 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>>> It looks we can't make it with only T[]. We need a genuine container type, and T[new] was suggested. It would probably have value semantics.
>>>
>>> T[U] seems to have the same problem. If T[U] is the range, then how do you call the container?
>>>
>>> If we follow through with a no-garbage-collected option for the core types, then we also need to distinguish between a slice and its container. The fact that we (almost) got away with T[] being at the same time the container and the slice was that the garbage collector would collect all unused slices.
>>>
>>>
>>> Andrei
>>  No, it's perfectly possible to have T[] be the same type for the container and the slice with reference counting instead of a full GC. All you need to do is store a pointer to the memory block's node. Then it's trivial to 1) get the capacity and 2) increase / decrease the reference count. You could even add an 'allocated length' parameter to the node, so you'd avoid some of the slicing bugs in the current implementation.
>>  struct T[] {
>>     T*     ptr;
>>     size_t length;
>>     void*  memory_block;
>> }
>
> Yes, but that way you take the owner (i.e. the memory block) completely out of the picture. It would help to give the owner an identity.
>
> Andrei

1) In what way would it help to give the owner an identity?

2) How does reference counting arrays "take the owner completely out of the picture"? Isn't this simply a switch from single ownership to shared ownership?