May 11, 2022

On Wednesday, 11 May 2022 at 19:15:36 UTC, Steven Schveighoffer wrote:

> >
  • add some methods returning a copy of the value instead of a pointer to value

I return a reference to the value (via opIndex). Can you elaborate on why you would need it to be a copy? What is your API?

-Steve

Sorry, it is safe to return reference for your implementation, as you call new Entry on every insert, like standard AA. I keep entries in preallocated vector for performance reasons and every resize can break reference.

May 11, 2022

On 5/11/22 5:41 PM, ikod wrote:

>

On Wednesday, 11 May 2022 at 19:15:36 UTC, Steven Schveighoffer wrote:

> >
  • add some methods returning a copy of the value instead of a pointer to value

I return a reference to the value (via opIndex). Can you elaborate on why you would need it to be a copy? What is your API?

Sorry, it is safe to return reference for your implementation, as you call new Entry on every insert, like standard AA. I keep entries in preallocated vector for performance reasons and every resize can break reference.

Ah yes, with a non-gc allocator, it would be important to either ensure references do not get stale, or to at least identify the pitfalls. Maybe the "return a copy" version is @safe, whereas the normal opIndex is @system.

That reminds me though! with this, I can experiment with preallocating (reserve) entries, something that is oft-requested for AA.

-Steve

May 12, 2022
On Wed, May 11, 2022 at 11:31:02AM -0400, Steven Schveighoffer via Digitalmars-d-announce wrote:
> I just spent a couple hours making a library AA solution that is binary compatible with druntime's builtin AA.

This is awesome!  Don't have time to look at it in detail right now, but will definitely keep this in mind.


> The benefits:
> 
> 1. Proves that a library implementation is possible, also shows where shortcomings are.

What are the shortcomings that you found?


> 2. Usable at compile time to make an AA that can be used at runtime.

Awesome.


> 3. Much more approachable code than the AA runtime, does not require "faking" a typeinfo, dealing with typeinfo in general, or deal with magic compiler hooks. This gives a good base to start experimenting with.

Awesome.


> For the future:
> 
> 1. Implement all the things that AAs can do (which are possible, some
> are not).

Which things are not possible to do?


> 2. Look at alternatives to GC for allocation/deallocation. 3. Possible use with betterC?
[...]

Just use malloc/free?  Could have problems with dangling references to buckets, though, if somebody retains the pointer returned by `key in aa` expressions.  Or use ref-counting of some sort.  But hard to do this without changing binary compatibility.


T

-- 
"You know, maybe we don't *need* enemies." "Yeah, best friends are about all I can take." -- Calvin & Hobbes
May 12, 2022

On 5/12/22 1:15 PM, H. S. Teoh wrote:

>

On Wed, May 11, 2022 at 11:31:02AM -0400, Steven Schveighoffer via Digitalmars-d-announce wrote:

>

I just spent a couple hours making a library AA solution that is
binary compatible with druntime's builtin AA.

This is awesome! Don't have time to look at it in detail right now, but
will definitely keep this in mind.

>

The benefits:

  1. Proves that a library implementation is possible, also shows where
    shortcomings are.

What are the shortcomings that you found?

I mean it has the benefit of highlighting where the language doesn't provide the capability of replacing the existing AA with a library type. I literally threw this together in a few hours, so I haven't used it or tested it a lot. I know there are some things, but I haven't tried yet to exhaustively make it a drop-in replacement.

I was surprised at how short the code is once you throw out all the TypeInfo BS that is currently in druntime.

> >

For the future:

  1. Implement all the things that AAs can do (which are possible, some
    are not).

Which things are not possible to do?

The whole thing how the compiler knows that an outer AA is being used to initialize an inner AA.

e.g. this works currently, but is impossible to hook for a library (I think):

int[int][int] aa;
aa[0][1] = 5;

There's also possible problems with qualifiers that are yet-to-be-discovered, but usually show up when the compiler is cheating.

> >
  1. Look at alternatives to GC for allocation/deallocation.
  2. Possible use with betterC?
    [...]

Just use malloc/free? Could have problems with dangling references to
buckets, though, if somebody retains the pointer returned by key in aa expressions. Or use ref-counting of some sort. But hard to do this
without changing binary compatibility.

Yes, the lifetime issues are the real problem with not using the GC. Maybe you just avoid the in operator in those flavors? Instead you can use a match-style operation, something like:

hash.match(k, (ref v) {
  // work with v
});

The whole point of this "exercise" is to see what is missing, and explore what is possible.

-Steve

May 12, 2022
On Thu, May 12, 2022 at 01:46:19PM -0400, Steven Schveighoffer via Digitalmars-d-announce wrote:
> On 5/12/22 1:15 PM, H. S. Teoh wrote:
> > On Wed, May 11, 2022 at 11:31:02AM -0400, Steven Schveighoffer via Digitalmars-d-announce wrote:
[...]
> > > 1. Proves that a library implementation is possible, also shows where shortcomings are.
> > 
> > What are the shortcomings that you found?
[...]
> I was surprised at how short the code is once you throw out all the TypeInfo BS that is currently in druntime.

Yeah, the TypeInfo stuff takes up a lot of code. And is hackish and ugly.


> > > For the future:
> > > 
> > > 1. Implement all the things that AAs can do (which are possible,
> > > some are not).
> > 
> > Which things are not possible to do?
> 
> The whole thing how the compiler knows that an outer AA is being used to initialize an inner AA.
> 
> e.g. this works currently, but is impossible to hook for a library (I
> think):
> 
> ```d
> int[int][int] aa;
> aa[0][1] = 5;
> ```

I already saw this problem 8 years ago:

	https://issues.dlang.org/show_bug.cgi?id=7753

Maybe it's time for us to write a DIP for this? ;-)


> There's also possible problems with qualifiers that are yet-to-be-discovered, but usually show up when the compiler is cheating.

Ugh. AA + qualifiers == one of the dark dirty corners of D that I don't like looking at. It's a prime example of an issue that's papered over half-heartedly with a half-solution that doesn't really solve the problem:

1. In the bad ole days, AA's used to allow unqualified types for the key. This wasn't a problem with POD or immutable types, but shows up when you try to make an AA with, say, a char[] key type. You'd insert a key into the AA, then accidentally overwrite the array contents, and suddenly the key "vanishes" from the AA -- because the contents changed without the AA knowing, so it's still sitting in the old slot with the old hash value.

2. This problem was brought up, so somebody thought it was a good idea to implicitly force the key type into const. So when you declared an AA of type int[char[]] for example, it'd get implicitly converted to int[const(char[])].  But this doesn't solve anything!  All the const does is ensure the AA cannot modify the key. But the user still can!! So the original problem still exists.

AA keys really should be immutable, i.e., it should not be modifiable by *anyone*, not the AA code, not the caller, not anyone else. That's the only way you can guarantee the consistency of the hash values stored in the AA vs. the contents of the key.

OTOH, though, you *do* want to accept const-qualified versions of the key *during lookup*. (It would be onerous to require .idup'ing a char[] into a string just so you can do a lookup in the AA, for example.) This gets a bit hairy, though: if the AA entry may not exist and may need to be created, then the key must be immutable ('cos we'll potentially be storing a reference to the key in the AA). But if it's a pure lookup without new entry creation, then const is acceptable.


> > > 2. Look at alternatives to GC for allocation/deallocation. 3. Possible use with betterC?
> > [...]
> > 
> > Just use malloc/free?  Could have problems with dangling references to buckets, though, if somebody retains the pointer returned by `key in aa` expressions.  Or use ref-counting of some sort.  But hard to do this without changing binary compatibility.
> 
> Yes, the lifetime issues are the real problem with not using the GC.
> Maybe you just avoid the `in` operator in those flavors?
> Instead you can use a match-style operation, something like:
> 
> ```d
> hash.match(k, (ref v) {
>   // work with v
> });
> ```
[...]

That's a great idea.  Should be `scope ref`, though, to avoid the reference leaking out via a closure / global. ;-)


T

-- 
English is useful because it is a mess. Since English is a mess, it maps well onto the problem space, which is also a mess, which we call reality. Similarly, Perl was designed to be a mess, though in the nicest of all possible ways. -- Larry Wall
1 2
Next ›   Last »