April 02, 2007 Re: A Lock-Free Hash Table (google video) | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Lionello Lunesu | Yes. I've done benchmarks and power of two is better. I wonder, does Tango use a power of two algorithm for its AA's? -Craig "Lionello Lunesu" <lio@lunesu.remove.com> wrote in message news:euqekc$24nu$1@digitalmars.com... > Knud Soerensen wrote: >> I would like to share this interesting video http://video.google.com/videoplay?docid=2139967204534450862 > > Very interesting, thanks for that. Maybe now we can convince Walter to include the patch I made about a year ago to make D's AA power-of-2/AND instead of prime/MOD? > > I'm wondering if this would be useful for D. What about memory allocations? If one thread is allocating a key/value, will that allocation block other threads? Any difference between Phobos/Tango? > > L. | |||
April 02, 2007 Re: A Lock-Free Hash Table (google video) | ||||
|---|---|---|---|---|
| ||||
Posted in reply to David B. Held | "David B. Held" <dheld@codelogicconsulting.com> wrote in message news:eupvmj$1au5$1@digitalmars.com... > Craig Black wrote: >> Good stuff. BTW, does anyone know when is D going to start using lock >> free >> algorithms with their built-in/templated containers? > > When STM is implemented. ;) And I hope I'm not giving away too much by saying that some reasonably smart folks are working on it even as we speak... > > Dave This is very good. It seems that there are a number of things in the works for the D compiler. The const stuff, Walter has mentioned run-time reflection, and now this STM as you now mention. Any of these things would be very cool IMO. Although Walter sometimes mentions stuff currently in development, he does not say much about it. Do you have any insight on what is coming next? -Craig | |||
April 02, 2007 Re: A Lock-Free Hash Table (google video) | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Craig Black | Craig Black Wrote:
> > Very interesting, thanks for that. Maybe now we can convince Walter to include the patch I made about a year ago to make D's AA power-of-2/AND instead of prime/MOD?
oh my! I thought this was obvious; as was the powers of two, and using atomic asm operators to complete whole transactions or not at all. I assumed this was already well known and implemented.
Actually, another way to implement the atomic transaction is to use any SSE/SSE2 move instruction to move a key/value pair into place with a:
struct KeyVal {
char* key;
union {
long l_value;
void* ptr_value;
double d_value;
}
}
assert(KeyVal.sizeof == 16);
into a ^2 sized array. It's either there, or it's not; so it's concurrent if you follow the rest of the principles, and requires absolutely no preparation for a CAS or locking.
I'm going to hope that moving a 16-byte uses SSE2...
| |||
April 02, 2007 Re: A Lock-Free Hash Table (google video) | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Lionello Lunesu | Lionello Lunesu wrote:
> Knud Soerensen wrote:
>> I would like to share this interesting video
>> http://video.google.com/videoplay?docid=2139967204534450862
>
> Very interesting, thanks for that. Maybe now we can convince Walter to include the patch I made about a year ago to make D's AA power-of-2/AND instead of prime/MOD?
>
> I'm wondering if this would be useful for D. What about memory allocations? If one thread is allocating a key/value, will that allocation block other threads?
Yes.
> Any difference between Phobos/Tango?
No. I've considered creating a GC with per-thread memory pools and such, but haven't had the time.
Sean
| |||
April 02, 2007 Re: A Lock-Free Hash Table (google video) | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Craig Black | Craig Black wrote:
> Yes. I've done benchmarks and power of two is better. I wonder, does Tango use a power of two algorithm for its AA's?
It's on my "to do" list, actually. So far, however, I have been avoiding making making unnecessary changes to the compiler runtime in case Walter decides to take over management of it. But there are a few things I want to resolve before Tango goes 1.0 either way.
Sean
| |||
April 02, 2007 Re: A Lock-Free Hash Table (google video) | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Dan | Dan wrote: > Craig Black Wrote: > >>> Very interesting, thanks for that. Maybe now we can convince Walter to include the patch I made about a year ago to make D's AA power-of-2/AND instead of prime/MOD? > > oh my! I thought this was obvious; as was the powers of two, and using atomic asm operators to complete whole transactions or not at all. I assumed this was already well known and implemented. > > Actually, another way to implement the atomic transaction is to use any SSE/SSE2 move instruction to move a key/value pair into place with a: > > struct KeyVal { > char* key; > union { > long l_value; > void* ptr_value; > double d_value; > } > } > assert(KeyVal.sizeof == 16); > > into a ^2 sized array. It's either there, or it's not; so it's concurrent if you follow the rest of the principles, and requires absolutely no preparation for a CAS or locking. Yup. It's rehashing that gets a bit messy. > I'm going to hope that moving a 16-byte uses SSE2... Nope. It's all plain old D code at the moment IIRC. Sean | |||
April 02, 2007 Re: A Lock-Free Hash Table (google video) | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Craig Black | It seems to me that this is the only optimal form for hashtables - a massively scaleable one. Why? For cases where n < 8, a linear search is typically faster than any other search algorithm. The reason is less overhead. For cases where n < 1024, a level order binary search array is typically faster than any other search algorithm I've learned about, including a hashtable. The reason is again less overhead. For cases where n > 1024, I currently think that O(log(n)) in a level order binary search array is actually more expensive than that to manage a hashtable. For cases where n > 1048576, the data set is getting quite large. Assuming, through heuristics, that the number of gets and puts is also increasing, it becomes very practical to parallelize the hashtable. My understanding is that we tend to organize data either very small or very large, rarely between 1024 and 1048576. So implementing a hashtable, to me, suggests it ought to always be parallelizeable. Especially considering the low overhead involved if it's done right. | |||
April 02, 2007 Re: A Lock-Free Hash Table (google video) | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Craig Black | Craig Black wrote:
> [...]
> This is very good. It seems that there are a number of things in the works for the D compiler. The const stuff, Walter has mentioned run-time reflection, and now this STM as you now mention. Any of these things would be very cool IMO. Although Walter sometimes mentions stuff currently in development, he does not say much about it. Do you have any insight on what is coming next?
Nothing more exciting than bug fixes, but that's always good, right? ;)
Dave
| |||
April 02, 2007 Re: A Lock-Free Hash Table (google video) | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Sean Kelly | Sean Kelly Wrote:
> Yup. It's rehashing that gets a bit messy.
>
> > I'm going to hope that moving a 16-byte uses SSE2...
>
> Nope. It's all plain old D code at the moment IIRC.
Rehashing, you'd probe until you've found the right location *then* move the 16-byte struct. He was suggesting using stride-1 probing for better cache coherency - but it's aweful at cascading into linear searches for an empty slot.
Plain old D doesn't use SSE or SSE2??? Oh my... Hasn't SSE been out since 1997? I guess with 64-bit GPR's, SSE1 is practically irrelevant now, but SSE2 and SSE3 could definitely improve performance of certain things... like copying arrays of 16-byte Value structs in DMDScript. I assumed that was what Walter meant when he commented that using 16 byte Value structs was critical for performance.
I'll be sure to use little asm { } blocks to get certain things done then.
The wonderful thing about D is that Walter provides a default mechanism for prototyping that works well. If you want something that works superbly,you can cut under it without much hassle.
A superb language, that D.
| |||
April 02, 2007 Re: A Lock-Free Hash Table (google video) | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Dan | Dan wrote:
> Sean Kelly Wrote:
>> Yup. It's rehashing that gets a bit messy.
>>
>>> I'm going to hope that moving a 16-byte uses SSE2...
>> Nope. It's all plain old D code at the moment IIRC.
>
> Rehashing, you'd probe until you've found the right location *then* move the 16-byte struct. He was suggesting using stride-1 probing for better cache coherency - but it's aweful at cascading into linear searches for an empty slot.
>
> Plain old D doesn't use SSE or SSE2???
I have no idea. I thought you were asking if there were inline asm blocks. I suppose the codegen may use SSE2 instructions where appropriate.
Sean
| |||
Copyright © 1999-2021 by the D Language Foundation
Permalink
Reply