Thread overview
Associative Arrays and Interior Pointers
May 04, 2009
dsimcha
May 05, 2009
bearophile
May 10, 2009
Sean Kelly
May 10, 2009
dsimcha
May 10, 2009
Fawzi Mohamed
May 10, 2009
Leandro Lucarella
May 10, 2009
dsimcha
May 04, 2009
I've been playing around with associative array and it's led to two pretty interesting questions for other D users:

1.  I have a prototype associative array implementation that's designed with D's GC implementation in mind.  It is much easier on the GC (allocates much less frequently, causes almost no false pointers) than the current implementation.  However, this comes at the price of worse cache performance that can cause lookups to take up to twice as long as the current implementation for large AAs.  On the other hand, building these large AAs is an order of magnitude faster.  Furthermore, my implementation only allocates occasionally, meaning in multithreaded mode, inserting an element will seldom require a lock.  Therefore, the question is, how much cache/lookup performance are you willing to give up for an implementation that's more GC friendly, allows faster insertions, and requires less locking?  Also, in your opinion, should D's builtin AA be designed around the GC implementation, or is this kind of coupling unacceptable?

2.  Other than its abysmal interaction with the current GC and the lack of ability to iterate using ranges, the current AA implementation actually seems pretty good.  One way to remedy a large portion of the GC issues without a massive overhaul of the GC would be to introduce a feature into the GC where a block of memory can be flagged as NO_INTERIOR.  This would mean that it could only be kept alive by pointers to the base address.  If this feature were implemented and used in memory blocks allocated by the current AA implementation, it would drastically reduce the false pointer issue.  (What seems to happen is, first of all, the current implementation generates a lot of false pointers, and secondly, it allocates so many blocks of memory that a few of them are guaranteed to be kept around by false pointers, leading to massive heap fragmentation.)  I've submitted a patch that implements this NO_INTERIOR feature.  (See bugzilla 2927).  Do you believe that this patch belongs in the GC?
May 05, 2009
dsimcha:

I can see you are working a lot to try to improve the implementation of the language. Thank you for this.

> this comes at the price of worse cache performance
> that can cause lookups to take up to twice as long as the current
> implementation for large AAs.  On the other hand, building these large AAs is
> an order of magnitude faster.  Furthermore, my implementation only allocates
> occasionally, meaning in multithreaded mode, inserting an element will seldom
> require a lock.

The built-in AAs have to be flexible and not too much bad for most purposes.
This implementation of yours has different trade offs compared to the built-in one. One way to answer what implementation is to prefer is to collect some amount of typical D1 code, and benchmark it with the two different AA implementations, and keep the AA implementation that leads to an average smaller total running time.
Then one or more different implementations may be added to Phobos for special usages (even Python has a second kind of associative array in the std lib, named defaultdict(), it's based on the same hashing machinery, but has some differences).


> One way to remedy a large portion of the GC issues without a
> massive overhaul of the GC would be to introduce a feature into the GC where a
> block of memory can be flagged as NO_INTERIOR.

Improving current things in a gradual way like this seems a good starting point (this is essentially the same answer I have given to Don regarding complex numbers).

Bye,
bearophile
May 10, 2009
dsimcha wrote:
> 
> 2.  Other than its abysmal interaction with the current GC and the lack of
> ability to iterate using ranges, the current AA implementation actually seems
> pretty good.  One way to remedy a large portion of the GC issues without a
> massive overhaul of the GC would be to introduce a feature into the GC where a
> block of memory can be flagged as NO_INTERIOR.

Neat idea.  Some GCs (like the Boehm GC) can be set NO_INTERIOR globally, but it never crossed my mind to do this per block.  For certain data structures, this might be pretty useful.
May 10, 2009
== Quote from Sean Kelly (sean@invisibleduck.org)'s article
> dsimcha wrote:
> >
> > 2.  Other than its abysmal interaction with the current GC and the lack of ability to iterate using ranges, the current AA implementation actually seems pretty good.  One way to remedy a large portion of the GC issues without a massive overhaul of the GC would be to introduce a feature into the GC where a block of memory can be flagged as NO_INTERIOR.
> Neat idea.  Some GCs (like the Boehm GC) can be set NO_INTERIOR globally, but it never crossed my mind to do this per block.  For certain data structures, this might be pretty useful.

I get the impression that D's GC will always have a significant degree of conservatism.  Of course, there's always unions, but unioning a pointer type w/ a non-pointer type is such an edge case that, if this is the only thing that's conservative, then the GC can be considered precise for all practical purposes. For now, I'd love to see this added, because it would be an extremely "cheap" way to solve a lot of annoying problems.

A few questions:
1.  (For Leonardo, especially):  Is the GC likely to get precise enough in the
near future that something like this would end up being considered a piece of cruft?
2.  Other than that, is there any reason this should not go in?
3.  Sean, if you're seriously thinking of putting this in in the near future, let
me know so I can fix that patch, too.  I did it the same way as my other GC patch,
with no whitespace, so the line numbers might be screwy here, too.
May 10, 2009
On 2009-05-10 20:04:40 +0200, dsimcha <dsimcha@yahoo.com> said:

> == Quote from Sean Kelly (sean@invisibleduck.org)'s article
>> dsimcha wrote:
>>> 
>>> 2.  Other than its abysmal interaction with the current GC and the lack of
>>> ability to iterate using ranges, the current AA implementation actually seems
>>> pretty good.  One way to remedy a large portion of the GC issues without a
>>> massive overhaul of the GC would be to introduce a feature into the GC where a
>>> block of memory can be flagged as NO_INTERIOR.
>> Neat idea.  Some GCs (like the Boehm GC) can be set NO_INTERIOR
>> globally, but it never crossed my mind to do this per block.  For
>> certain data structures, this might be pretty useful.
> 
> I get the impression that D's GC will always have a significant degree of
> conservatism.  Of course, there's always unions, but unioning a pointer type w/ a
> non-pointer type is such an edge case that, if this is the only thing that's
> conservative, then the GC can be considered precise for all practical purposes.
> For now, I'd love to see this added, because it would be an extremely "cheap" way
> to solve a lot of annoying problems.
> 
> A few questions:
> 1.  (For Leonardo, especially):  Is the GC likely to get precise enough in the
> near future that something like this would end up being considered a piece of cruft?
> 2.  Other than that, is there any reason this should not go in?
> 3.  Sean, if you're seriously thinking of putting this in in the near future, let
> me know so I can fix that patch, too.  I did it the same way as my other GC patch,
> with no whitespace, so the line numbers might be screwy here, too.

Yes the idea of NO_INTERIOR was floated before, and it is a good idea and works for 99% of the usages of AA and similar objects.

Normally it is not very different from having a GC managed wrapper object that alloc non GC managed memory, which for example is what I do with multidimensional arrays.
For these there isn't a big advantage in NO_INTERIOR,because basically always you have a wrapper object, and for these wrapper objects one can explicitly say that internal pointers are valid only if a pointer to the AA is kept

For the AA usage these is still the 0.1% usage in which one might lose the pointer to the AA, but still have a pointer to some of its contents. At the moment this is legal (I think) your change would disallow it. This might still be ok, but one should be aware of it

Fawzi

May 10, 2009
dsimcha, el 10 de mayo a las 18:04 me escribiste:
> == Quote from Sean Kelly (sean@invisibleduck.org)'s article
> > dsimcha wrote:
> > >
> > > 2.  Other than its abysmal interaction with the current GC and the lack of ability to iterate using ranges, the current AA implementation actually seems pretty good.  One way to remedy a large portion of the GC issues without a massive overhaul of the GC would be to introduce a feature into the GC where a block of memory can be flagged as NO_INTERIOR.
> > Neat idea.  Some GCs (like the Boehm GC) can be set NO_INTERIOR globally, but it never crossed my mind to do this per block.  For certain data structures, this might be pretty useful.
> 
> I get the impression that D's GC will always have a significant degree of conservatism.  Of course, there's always unions, but unioning a pointer type w/ a non-pointer type is such an edge case that, if this is the only thing that's conservative, then the GC can be considered precise for all practical purposes. For now, I'd love to see this added, because it would be an extremely "cheap" way to solve a lot of annoying problems.

I really like the idea of NO_INTERIOR too. I don't think it's possible to be the default because if that is the case you can't have array slices and/or pointers to a member of a struct/class. But I think it's great for very special cases (specially to implement high performance data structures) to be able to instruct the GC to discard interior pointers.

> A few questions:
> 1.  (For Leonardo, especially):  Is the GC likely to get precise enough in the
> near future that something like this would end up being considered a piece of cruft?

I'm sorry, but I finally decided to add concurrency to the current GC. The main idea is to reduce (almost vanish) pauses. If is not too hard, and time help, I'll try to add some preciseness if it only involves changes in the internal runtime, but I honestly don't think I'll have the time.

I'm talking about finishing my thesis here. When it's finished I think and hope to keep working on the D GC...

> 2.  Other than that, is there any reason this should not go in?

I don't see any reason, is a backward compatible change.

> 3.  Sean, if you're seriously thinking of putting this in in the near future, let me know so I can fix that patch, too.  I did it the same way as my other GC patch, with no whitespace, so the line numbers might be screwy here, too.

I think it would be great to have a centralized place where to put this improvements. This is another situation where I think Tango vs. Phobos issue is killing D. When I started my work in the thesis I had to decide whether to work with Phobos or Tango. I finally decided for Tango, because is the only option for LDC and because is way better organized (and more receptive to patches). But I hate knowing that my work will be available (in the best case) only for people using Tango.

I hate to see the D "community" fragmented.

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------
Estamos cantando a la sombra de nuestra parra
una canción que dice que uno sólo conserva lo que no amarra
May 10, 2009
== Quote from Leandro Lucarella (llucax@gmail.com)'s article
> I really like the idea of NO_INTERIOR too. I don't think it's possible to be the default because if that is the case you can't have array slices and/or pointers to a member of a struct/class. But I think it's great for very special cases (specially to implement high performance data structures) to be able to instruct the GC to discard interior pointers.

Of course, I don't think any reasonable person thinks it should be the default. The idea is that it would be an unsafe optimization for people who really know what they're doing or are really having serious trouble with false pointers.  It should be thought of kind of like using manual memory management in D.

> > A few questions:
> > 1.  (For Leonardo, especially):  Is the GC likely to get precise enough in the
> > near future that something like this would end up being considered a piece of
cruft?
> I'm sorry, but I finally decided to add concurrency to the current GC. The
> main idea is to reduce (almost vanish) pauses. If is not too hard, and
> time help, I'll try to add some preciseness if it only involves changes in
> the internal runtime, but I honestly don't think I'll have the time.
> I'm talking about finishing my thesis here. When it's finished I think and
> hope to keep working on the D GC...

Ok, cool.  I just wanted to make sure that, if we took the time to put this in, you weren't likely to come back in 6 months and say "I've made the GC almost completely precise.  All false pointer stuff is now irrelevant and just a bunch of cruft."  On the other hand, I certainly wouldn't mind if the GC became more precise instead.

> > 2.  Other than that, is there any reason this should not go in?
> I don't see any reason, is a backward compatible change.
> > 3.  Sean, if you're seriously thinking of putting this in in the near future, let me know so I can fix that patch, too.  I did it the same way as my other GC patch, with no whitespace, so the line numbers might be screwy here, too.
> I think it would be great to have a centralized place where to put this
> improvements. This is another situation where I think Tango vs. Phobos
> issue is killing D. When I started my work in the thesis I had to decide
> whether to work with Phobos or Tango. I finally decided for Tango, because
> is the only option for LDC and because is way better organized (and more
> receptive to patches). But I hate knowing that my work will be available
> (in the best case) only for people using Tango.
> I hate to see the D "community" fragmented.

You're right, this is unfortunate.  Basically every contribution I've made to D is D2-oriented and completely irrelevant to D1.