| Thread overview | ||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
December 21, 2009 Enhanced array appending | ||||
|---|---|---|---|---|
| ||||
All, I created a new patched druntime that prevents array stomping and at the same time increases append performance for thread-local array appending. The patch is attached to bugzilla entry 3637. ( http://d.puremagic.com/issues/show_bug.cgi?id=3637 ) For those of you unfamiliar with array stomping, here is code that demonstrates the problem: int[] a = [1,2,3].dup; int[] b = a[0..1]; b ~= 4; assert(b == [1,4]); assert(a == [1,2,3]); // fails with old code, passes with mine. Note that my code still does not help with the problem of having a shared view of data when appending, as outlined in Bartosz's NG thread "D array expansion and non-deterministic re-allocation." I don't think this problem can be solved without sacrificing performance of array appending. For those of you unfamiliar with D's poor array append performance, please see http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=103563 Please give it a try and see if you can find any issues with it. I'm sorry that I don't have a binary distribution channel for this. Maybe someone can build some binaries for others to download. -Steve | ||||
December 23, 2009 Re: Enhanced array appending | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | Steven Schveighoffer wrote:
> All,
>
> I created a new patched druntime that prevents array stomping and at the same time increases append performance for thread-local array appending.
>
> The patch is attached to bugzilla entry 3637. ( http://d.puremagic.com/issues/show_bug.cgi?id=3637 )
Nobody cares about it? (Except someone commenting in bugzilla, but where's Andrei and Walter?)
Also, what again were the arguments against adding a "capacity" field in the slice struct to deal with the append problem?
I think Go does this.
| |||
December 23, 2009 Re: Enhanced array appending | ||||
|---|---|---|---|---|
| ||||
Posted in reply to grauzone | grauzone Wrote: > Steven Schveighoffer wrote: > > All, > > > > I created a new patched druntime that prevents array stomping and at the same time increases append performance for thread-local array appending. > > > > The patch is attached to bugzilla entry 3637. ( http://d.puremagic.com/issues/show_bug.cgi?id=3637 ) > > Nobody cares about it? (Except someone commenting in bugzilla, but where's Andrei and Walter?) I have already discussed this patch with Andrei and Walter (and the rest of the compiler/phobos crew), they are on board. I am a little disappointed with the response here, but I guess it could be attributed to the barrier to experimentation (you have to apply the patch and compile the runtime) and the likelihood that many are on vacation right now (as I will be as of tonight). > Also, what again were the arguments against adding a "capacity" field in the slice struct to deal with the append problem? The problem is that adding a capacity field only works if the object is a reference type. A value type will make either choose to make copies of the capacity field, resulting in the same stomping problem (and maybe worse), or will zero out one of the copy's capacity field, resulting in an artificial limitation for appending. The only true way to make this work right is to have a capacity field shared among all the aliases to the same data, and the most logical place for that is in the allocated block. -Steve | |||
December 23, 2009 Re: Enhanced array appending | ||||
|---|---|---|---|---|
| ||||
Posted in reply to grauzone | grauzone wrote: > Steven Schveighoffer wrote: >> All, >> >> I created a new patched druntime that prevents array stomping and at the same time increases append performance for thread-local array appending. >> >> The patch is attached to bugzilla entry 3637. ( http://d.puremagic.com/issues/show_bug.cgi?id=3637 ) > > Nobody cares about it? (Except someone commenting in bugzilla, but where's Andrei and Walter?) The entire Phobos team discussed this change very thoroughly. We are all very excited about it. > Also, what again were the arguments against adding a "capacity" field in the slice struct to deal with the append problem? > > I think Go does this. But Go is unfinished. When they'll finish it, they'll remove the capacity field :o). Stop being a curmudgeon and give us some love. Andrei | |||
December 23, 2009 Re: Enhanced array appending | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | Steven Schveighoffer wrote: > grauzone Wrote: > >> Steven Schveighoffer wrote: >>> All, >>> >>> I created a new patched druntime that prevents array stomping and at the same time increases append performance for thread-local array appending. >>> >>> The patch is attached to bugzilla entry 3637. ( http://d.puremagic.com/issues/show_bug.cgi?id=3637 ) >> Nobody cares about it? (Except someone commenting in bugzilla, but where's Andrei and Walter?) > > I have already discussed this patch with Andrei and Walter (and the rest of the compiler/phobos crew), they are on board. I am a little disappointed with the response here, but I guess it could be attributed to the barrier to experimentation (you have to apply the patch and compile the runtime) and the likelihood that many are on vacation right now (as I will be as of tonight). Now Andrei replied and said everyone was "excited". I'm expecting most people to wait until there's a dmd release with the patch included (I'll do that). >> Also, what again were the arguments against adding a "capacity" field in the slice struct to deal with the append problem? > > The problem is that adding a capacity field only works if the object is a reference type. A value type will make either choose to make copies of the capacity field, resulting in the same stomping problem (and maybe worse), or will zero out one of the copy's capacity field, resulting in an artificial limitation for appending. The only true way to make this work right is to have a capacity field shared among all the aliases to the same data, and the most logical place for that is in the allocated block. I see... Go makes slices value types (which references the array data), but I'm not sure how or if they handle this situation. Then, why use a cache to find the pointer to the capacity field inside the array memory block? You could just store a pointer to the capacity field in the slice. Something like this, I guess: struct slice { void* ptr; size_t length; array_info* array; } //can be located at the start of an array memory block struct array_info { size_t max_length; size_t capacity; } void slice_set_length(ref slice s, size_t newlen) { if (newlen < s.length) { s.length = newlen; return; } //was not allocated from the GC? if (!s.array) goto new_array; //danger of stomping? if (s.length != s.array.max_length) goto new_array; //memory block too small? if (newlen > s.array.capacity) goto new_array; s.length = newlen; s.array.max_length = max(s.array.max_length, s.length); return; new_array: ... allocate a memory block with array descriptor ... copy s[] into it ... set s to new array pointer, length, array descriptor } Would this work? (Except that I didn't include handling of array element default initialization.) Actually, I think that's the same idea (as Andrei's original post on the array append cache thing), just much shorter and no indeterminism due to a volatile cache. > -Steve | |||
December 23, 2009 Re: Enhanced array appending | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | Andrei Alexandrescu wrote: > grauzone wrote: >> Steven Schveighoffer wrote: >>> All, >>> >>> I created a new patched druntime that prevents array stomping and at the same time increases append performance for thread-local array appending. >>> >>> The patch is attached to bugzilla entry 3637. ( http://d.puremagic.com/issues/show_bug.cgi?id=3637 ) >> >> Nobody cares about it? (Except someone commenting in bugzilla, but where's Andrei and Walter?) > > The entire Phobos team discussed this change very thoroughly. We are all very excited about it. Did anyone try it and make initial tests? >> Also, what again were the arguments against adding a "capacity" field in the slice struct to deal with the append problem? >> >> I think Go does this. > > But Go is unfinished. When they'll finish it, they'll remove the capacity field :o). I believe not even you can predict the future. > Stop being a curmudgeon and give us some love. Sorry. dmd compiler bugs made me this way. I mean no harm. > > Andrei | |||
December 23, 2009 Re: Enhanced array appending | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | Steven Schveighoffer, el 23 de diciembre a las 08:44 me escribiste: > grauzone Wrote: > > > Steven Schveighoffer wrote: > > > All, > > > > > > I created a new patched druntime that prevents array stomping and at the same time increases append performance for thread-local array appending. > > > > > > The patch is attached to bugzilla entry 3637. ( http://d.puremagic.com/issues/show_bug.cgi?id=3637 ) > > > > Nobody cares about it? (Except someone commenting in bugzilla, but where's Andrei and Walter?) > > I have already discussed this patch with Andrei and Walter (and the rest of the compiler/phobos crew), they are on board. I am a little disappointed with the response here, but I guess it could be attributed to the barrier to experimentation (you have to apply the patch and compile the runtime) and the likelihood that many are on vacation right now (as I will be as of tonight). > > > Also, what again were the arguments against adding a "capacity" field in the slice struct to deal with the append problem? > > The problem is that adding a capacity field only works if the object is a reference type. A value type will make either choose to make copies of the capacity field, resulting in the same stomping problem (and maybe worse), or will zero out one of the copy's capacity field, resulting in an artificial limitation for appending. The only true way to make this work right is to have a capacity field shared among all the aliases to the same data, and the most logical place for that is in the allocated block. This is making the half-value, half-reference semantics of arrays even worse, since now the capacity is "passed by reference" (at the end of the array data) but the length and ptr as values. I don't comment much about the patch (and the design) because I already did so in the several threads about it. I think is a very bad idea, I think dynamic arrays should be a proper reference type (with ptr, length and capacity) and slices should be a value type without appending (only with ptr and length). -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- You can do better than me. You could throw a dart out the window and hit someone better than me. I'm no good! -- George Constanza | |||
December 24, 2009 Re: Enhanced array appending | ||||
|---|---|---|---|---|
| ||||
Posted in reply to grauzone | grauzone Wrote:
> Steven Schveighoffer wrote:
> > The problem is that adding a capacity field only works if the object is a reference type. A value type will make either choose to make copies of the capacity field, resulting in the same stomping problem (and maybe worse), or will zero out one of the copy's capacity field, resulting in an artificial limitation for appending. The only true way to make this work right is to have a capacity field shared among all the aliases to the same data, and the most logical place for that is in the allocated block.
>
> I see... Go makes slices value types (which references the array data), but I'm not sure how or if they handle this situation.
>
> Then, why use a cache to find the pointer to the capacity field inside the array memory block? You could just store a pointer to the capacity field in the slice.
>
> Something like this, I guess:
>
> struct slice {
> void* ptr;
> size_t length;
> array_info* array;
> }
>
> [snip]
>
> Would this work?
Yes, but the problem I see with it is you are passing around that array_info pointer with the slice for *all* slices, not just ones you intend to use appending on. There is plenty of code that uses arrays and slices without ever doing appending, so that extra pointer is wasted space, and also increases the cost of passing slices by value.
Your idea is essentially the idea behind my patch, except you can get the array info without having an additional pointer. The cost of lookup for the block info is mitigated by the cache, making lookups cost very little.
The patch is a compromise between performance for appending and performance for everything else. I think there will still be a need in specialized code for a fatter array type that allows appending without complicated GC lookups, but the patch provides safe appending for a reasonable performance hit without affecting other aspects of array usage. As with all compromises, there are always different ways of doing things, so time will tell if this patch is the best method.
-Steve
| |||
December 24, 2009 Re: Enhanced array appending | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Leandro Lucarella | Leandro Lucarella Wrote: > This is making the half-value, half-reference semantics of arrays even worse, since now the capacity is "passed by reference" (at the end of the array data) but the length and ptr as values. This is absolutely false. It does not make anything worse. It may not make it as good as you want, but it definitely is better than the current situation. The current situation is that the capacity is "invented" by the runtime to be exactly what you need if the slice is at the beginning of a block, or 0 if it's not. Such a scheme is totally unsafe. The proposed patch leaves all aspects of arrays and slices alone *except* for this one thing. > > I don't comment much about the patch (and the design) because I already > did so in the several threads about it. I think is a very bad idea, > I think dynamic arrays should be a proper reference type (with ptr, length > and capacity) and slices should be a value type without appending (only > with ptr and length). I don't agree. I like that slices and arrays are the same type, it allows for much cleaner code, and much less function duplication. I see the "append extends into a block" as an optimization, one that currently is invalid, and with the patch is valid. Disallowing appending to slices means you must generate arrays whenever you want to do appending (which copies the entire slice), incurring a higher cost for little gain. I'd actually say no gain. Do you have an example where such a scheme is better than arrays with my patch? Note that nobody is stopping you from making such array types, you are free to add them. -Steve | |||
December 24, 2009 Re: Enhanced array appending | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | Steven Schveighoffer, el 24 de diciembre a las 10:15 me escribiste: > Leandro Lucarella Wrote: > > > This is making the half-value, half-reference semantics of arrays even worse, since now the capacity is "passed by reference" (at the end of the array data) but the length and ptr as values. > > This is absolutely false. It does not make anything worse. It may not make it as good as you want, but it definitely is better than the current situation. I meant from the language semantics POV (you have more rules to remember about arrays), I agree it makes things better from a practical POV. > > I don't comment much about the patch (and the design) because I already > > did so in the several threads about it. I think is a very bad idea, > > I think dynamic arrays should be a proper reference type (with ptr, length > > and capacity) and slices should be a value type without appending (only > > with ptr and length). > > I don't agree. I like that slices and arrays are the same type, it > allows for much cleaner code, and much less function duplication. I see > the "append extends into a block" as an optimization, one that currently > is invalid, and with the patch is valid. Disallowing appending to > slices means you must generate arrays whenever you want to do appending > (which copies the entire slice), incurring a higher cost for little > gain. I'd actually say no gain. Do you have an example where such > a scheme is better than arrays with my patch? > > Note that nobody is stopping you from making such array types, you are free to add them. I said this several times but here it comes again: We agree that the current scheme is better in terms of performance than having a proper reference type dynamic array and a separated slice value type. The thing is, I think optimization is nice as long as it doesn't leak the abstractions to the user. The current scheme is clearly this scenario, to make an optimization (one that problably a lot of people wouldn't need), you are making things harder for the user and creating a monster half array, half slice that is harder to understand than 2 smaller and simpler types. You think this performance gain is more important than the added complexity, I think the other way. You think D should be *by default* as fast as possible, even when making the programmers life a little harder, and let the user create or use library types for easy use and I think it should be the exact opposite, make *default* things easier and not as fast, and let the user create/user library types if they need top performance/optimizations. -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- MATAN AL PERRO: DICEN QUE ESTABA POSEIDO POR EL DEMONIO... -- Crónica TV | |||
Copyright © 1999-2021 by the D Language Foundation
Permalink
Reply