May 15, 2019
OK! I finally remembered why we went with the design "any two ProtoObjects must be comparable". We should have a way to document this stuff, I'd internalized it so much I'd forgotten the reason.

It has to do with the way D implements built-in hash tables. They use indirect calls via TypeInfo classes that essentially require the existence of equality and hashing in class objects as virtual functions with fixed signatures.
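For illustration, here is roughly what that indirection looks like (a sketch; the helper name aaHashOfKey is made up, and the getHash signature is approximate rather than copied from druntime):

import std.stdio;

// The AA runtime holds only a TypeInfo for the key type, so hashing goes
// through TypeInfo.getHash; for class keys that ends up calling the
// virtual Object.toHash with its fixed signature.
size_t aaHashOfKey(TypeInfo keyTi, const void* key)
{
    return keyTi.getHash(key); // indirect call via the TypeInfo vtable
}

void main()
{
    Object o = new Object;
    // TypeInfo_Class.getHash expects a pointer to the class reference.
    writeln(aaHashOfKey(typeid(o), &o));
}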

So if we go with a statically-driven solution we lose hashtables keyed on any class not derived from Object. That would be awkward but hopefully acceptable until we have good containers (which are becoming possible only now with copy ctors and the upcoming __RefCounted type in druntime).

A possibility to bridge us is to create a wrapper struct:

struct HashKey(T) if(is(T == class)) {
    private T payload;
    bool opEquals(HashKey rhs) { ... }
    size_t toHash() { ... }
    alias payload this;
}

Inside the implementations, introspection would be used to figure out how type T chose to implement comparison and hashing.
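For concreteness, a minimal sketch of what those bodies might look like (the __traits checks and the identity/pointer fallbacks are my assumptions, not a settled design):

struct HashKey(T) if (is(T == class))
{
    private T payload;

    bool opEquals(HashKey rhs)
    {
        static if (__traits(hasMember, T, "opEquals"))
            return payload == rhs.payload;  // defer to T's own opEquals
        else
            return payload is rhs.payload;  // fall back to reference identity
    }

    size_t toHash()
    {
        static if (__traits(hasMember, T, "toHash"))
            return payload is null ? 0 : payload.toHash();
        else
            return cast(size_t) cast(void*) payload; // hash the reference itself
    }

    alias payload this;
}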

My thinking is, built-in hashtables are overdue for improvement anyway so keeping compatibility with them isn't awfully motivating.

Satisfactory or not?
May 15, 2019
On Wednesday, 15 May 2019 at 22:51:11 UTC, Andrei Alexandrescu wrote:
> It has to do with the way D implements built-in tables.

Huh, that is a kinda weird implementation, but it makes sense now...

> So if we go with a statically-driven solution we lose hashtables keyed on any class not derived from Object.

But I'm OK with that. I virtually never use classes in a hash table anyway... and if I did, I think I'd be fine using the workaround or just inheriting from Object.

Let's do it.
May 16, 2019
On 5/15/19 5:36 PM, H. S. Teoh wrote:
> FYI, even with the above amusingly elaborate hack, you still cannot
> achieve proper 4-way comparison results.

That was the joke.
May 16, 2019
On Wednesday, 15 May 2019 at 16:36:27 UTC, H. S. Teoh wrote:

> FYI, even with the above amusingly elaborate hack, you still cannot achieve proper 4-way comparison results.  Consider: what should OverengineeredCmpResult.opCmp return for payload == ionno, such that <, <=, >, >=, ==, != would all produce the correct result?
>
> Answer: it's not possible unless you return float, because x < y translates to x.opCmp(y) < 0, and x > y translates to x.opCmp(y) > 0, so the only way to represent an incomparable state is for opCmp to return some value z for which z < 0 and z > 0 are *both* false. There is no integer value that fits this description; the only candidate is float.nan. Substituting the return value of opCmp with a custom struct doesn't fix this problem; it only defers it to the custom struct's opCmp, which suffers from the same problem.
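To make the quoted point concrete, here is a minimal sketch of the float-returning opCmp technique (the Partial type is made up for illustration):

struct Partial
{
    int val;
    bool comparable;

    // Returning float lets opCmp express "incomparable": every ordering
    // comparison against float.nan is false, so x < y and x > y are both
    // false when opCmp returns nan.
    float opCmp(const Partial rhs) const
    {
        if (!comparable || !rhs.comparable)
            return float.nan;
        return val < rhs.val ? -1 : (val > rhs.val ? 1 : 0);
    }
}

unittest
{
    auto a = Partial(1, true);
    auto b = Partial(2, false);
    assert(!(a < b) && !(a > b)); // incomparable: both orderings are false
    assert(a < Partial(2, true)); // ordinary ordering still works
}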

I see.  I'm a little embarrassed I didn't realize that but happy you took the time to explain it.  Thank you.

Mike

May 16, 2019
On Wednesday, May 15, 2019 4:51:11 PM MDT Andrei Alexandrescu via Digitalmars-d wrote:
> OK! I finally remembered why we went with the design "any two ProtoObjects must be comparable". We should have a way to document this stuff, I'd internalized it so much I'd forgotten the reason.
>
> It has to do with the way D implements built-in tables. They use indirect calls via TypeInfo classes that essentially require the existence of equality and hashing in class objects as virtual functions with fixed signatures.
>
> So if we go with a statically-driven solution we lose hashtables keyed on any class not derived from Object. That would be awkward but hopefully acceptable until we have good containers (which are becoming possible only now with copy ctors and the upcoming __RefCounted type in druntime).
>
> A possibility to bridge us is to create a wrapper struct:
>
> struct HashKey(T) if(is(T == class)) {
>      private T payload;
>      bool opEquals(HashKey rhs) { ... }
>      size_t toHash() { ... }
>      alias payload this;
> }
>
> Inside the implementations introspection would be used to figure out how type T chose to implement comparison and hashing.
>
> My thinking is, built-in hashtables are overdue for improvement anyway so keeping compatibility with them isn't awfully motivating.
>
> Satisfactory or not?

I didn't realize that that was quite what the built-in AAs were doing, but I was aware that their implementation isn't templatized, so I didn't expect them to work with any classes not derived from Object without them being improved. Either way, that does seem like a good workaround, and I certainly wouldn't want to hurt what we're doing to improve classes because of internal issues with our AA implementation.

Having built-in AAs is nice, but given all of the problems that we've had with their implementation over the years, I'm inclined to think that having them built into the language was ultimately a mistake anyway. We probably should figure out how to make AA literals work with user-defined types (without actually creating a built-in AA) at some point here, since that's really the only thing that built-in AAs can do that a user-defined hash table type couldn't.
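For example (a sketch; HashMap is a made-up user type), the best a library type can do today is accept a built-in AA literal in its constructor, which still materializes the built-in AA first:

struct HashMap(K, V)
{
    private V[K] impl; // a real implementation would manage its own buckets

    // The literal ["a": 1, "b": 2] creates a built-in AA before we see it.
    this(V[K] items) { impl = items.dup; }

    V opIndex(K k) { return impl[k]; }
}

unittest
{
    auto m = HashMap!(string, int)(["a": 1, "b": 2]);
    assert(m["a"] == 1);
}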

- Jonathan M Davis



May 16, 2019
On Wednesday, 15 May 2019 at 22:51:11 UTC, Andrei Alexandrescu wrote:
> OK! I finally remembered why we went with the design "any two ProtoObjects must be comparable". We should have a way to document this stuff, I'd internalized it so much I'd forgotten the reason.
>
> It has to do with the way D implements built-in tables. They use indirect calls via TypeInfo classes that essentially require the existence of equality and hashing in class objects as virtual functions with fixed signatures.

Static logic is still preferable from my point of view. I wonder how often I use objects as keys.
May 16, 2019
On 5/15/19 11:51 PM, Andrei Alexandrescu wrote:
> OK! I finally remembered why we went with the design "any two ProtoObjects must be comparable". We should have a way to document this stuff, I'd internalized it so much I'd forgotten the reason.
> 
> It has to do with the way D implements built-in [hash]tables. They use indirect calls via TypeInfo classes that essentially require the existence of equality and hashing in class objects as virtual functions with fixed signatures.

Yes, but that can be redone. Basically the TypeInfo for an object uses Object.toHash, expecting the base to be Object (and BTW it calls this on a specifically non-CONST object, even though keys are supposed to be const!)

The TypeInfo_ProtoObject (which would be what we are talking about here) can use the same technique as structs -- the compiler writes an xtoHash function for it and puts it in the TypeInfo as a function pointer.
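A sketch of what that might look like (names and signatures are illustrative, patterned on TypeInfo_Struct's xtoHash, not actual druntime code):

// Hypothetical runtime-side record for a ProtoObject-derived class: the
// compiler fills in per-class thunks, so the AA machinery calls through
// plain function pointers instead of a virtual method with a fixed
// signature on the key object itself.
struct ProtoObjectHooks
{
    size_t function(in void* p) nothrow xtoHash;
    bool function(in void* p1, in void* p2) xopEquals;
}

size_t hashKey(ref const ProtoObjectHooks hooks, const void* key)
{
    return hooks.xtoHash(key); // no virtual dispatch on the key object
}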

> My thinking is, built-in hashtables are overdue for improvement anyway so keeping compatibility with them isn't awfully motivating.

We need to redo built-in hash tables. As I understand it, the biggest hurdle is this feature, which isn't exactly easy to do with our current operator overloading scheme:

int[int][int] nestedAA;

nestedAA[1][2] = 3; // magically knows to create the intermediate AA if it doesn't exist.
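For comparison, a sketch of roughly what a user-defined type has to do to emulate that (UserAA is hypothetical and delegates to a built-in AA for brevity):

struct UserAA(K, V)
{
    private V[K] impl;

    // opIndex must auto-create missing entries and return by ref so that
    // chained indexing like nested[1][2] = 3 has somewhere to write to.
    ref V opIndex(K k)
    {
        if (auto p = k in impl)
            return *p;
        impl[k] = V.init;
        return impl[k];
    }
}

unittest
{
    UserAA!(int, int[int]) nested;
    nested[1][2] = 3;
    assert(nested[1][2] == 3);
}

The catch is that plain lookups now insert entries too, which the built-in magic manages to avoid.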

-Steve
May 16, 2019
On Thursday, May 16, 2019 7:53:45 AM MDT Steven Schveighoffer via Digitalmars-d wrote:
> On 5/15/19 11:51 PM, Andrei Alexandrescu wrote:
> > OK! I finally remembered why we went with the design "any two ProtoObjects must be comparable". We should have a way to document this stuff, I'd internalized it so much I'd forgotten the reason.
> >
> > It has to do with the way D implements built-in [hash]tables. They use indirect calls via TypeInfo classes that essentially require the existence of equality and hashing in class objects as virtual functions with fixed signatures.
>
> Yes, but that can be redone. Basically the TypeInfo for an object uses Object.toHash expecting the base to be object (and BTW is calling this on a specifically non-CONST object, even though keys are supposed to be const!)

Really, they need to be immutable if you want to guarantee that they don't change. Having them be const doesn't really buy you anything other than ensuring that the AA implementation doesn't modify them (and that sort of thing in druntime has a nasty habit of casting to get around such things). If the key is not a value type, then other, mutable references to the data could change it. So either the key has to be immutable, or you have to rely on other code not mutating the key via another reference.
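A quick illustration of that last point (Key is a made-up class):

class Key { int x; }

unittest
{
    auto k = new Key;
    const Key inAA = k;   // the const view an AA might hold
    k.x = 42;             // mutate through the other, mutable reference
    assert(inAA.x == 42); // const on the AA's view prevented nothing
}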

- Jonathan M Davis



May 16, 2019
On 5/16/19 3:15 PM, Jonathan M Davis wrote:
> On Thursday, May 16, 2019 7:53:45 AM MDT Steven Schveighoffer via
> Digitalmars-d wrote:
>> On 5/15/19 11:51 PM, Andrei Alexandrescu wrote:
>>> OK! I finally remembered why we went with the design "any two
>>> ProtoObjects must be comparable". We should have a way to document this
>>> stuff, I'd internalized it so much I'd forgotten the reason.
>>>
>>> It has to do with the way D implements built-in [hash]tables. They use
>>> indirect calls via TypeInfo classes that essentially require the
>>> existence of equality and hashing in class objects as virtual functions
>>> with fixed signatures.
>>
>> Yes, but that can be redone. Basically the TypeInfo for an object uses
>> Object.toHash expecting the base to be object (and BTW is calling this
>> on a specifically non-CONST object, even though keys are supposed to be
>> const!)
> 
> Really, they need to be immutable if you want to guarantee that they don't
> change. Having them be const doesn't really buy you anything other than
> ensuring that the AA implementation doesn't modify them (and that sort of
> thing in druntime has a nasty habit of casting to get around such things).

That's beside the point. You shouldn't call a non-const function on something that is const *or* immutable, which is what druntime is currently doing.

For most intents and purposes, I think it should be fine to use non-const keys, as long as the keys are not modified while they are used as keys. You could, for instance, have a key that has some mutable data that doesn't affect the hash/equality. The language shouldn't impose restrictions that don't make sense for some cases; that just reduces the code you can write, even if that code would be perfectly valid.
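For instance (a sketch; CacheEntry is made up), a key class can exclude its mutable bookkeeping from equality and hashing entirely:

class CacheEntry
{
    string id;    // identity: the only thing opEquals/toHash look at
    int hitCount; // mutable stats; changing this never perturbs the table

    this(string id) { this.id = id; }

    override bool opEquals(Object o)
    {
        auto rhs = cast(CacheEntry) o;
        return rhs !is null && id == rhs.id;
    }

    override size_t toHash() @trusted nothrow
    {
        return hashOf(id);
    }
}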

-Steve
May 16, 2019
On Thu, May 16, 2019 at 08:15:07AM -0600, Jonathan M Davis via Digitalmars-d wrote:
> On Thursday, May 16, 2019 7:53:45 AM MDT Steven Schveighoffer via Digitalmars-d wrote:
[...]
> > Yes, but that can be redone. Basically the TypeInfo for an object uses Object.toHash expecting the base to be object (and BTW is calling this on a specifically non-CONST object, even though keys are supposed to be const!)
> 
> Really, they need to be immutable if you want to guarantee that they don't change. Having them be const doesn't really buy you anything other than ensuring that the AA implementation doesn't modify them
[...]

Yes, AA keys must be immutable. The current const requirement guarantees nothing.

There is more to this, though. Lookups using non-immutable (even mutable) keys ought to be allowed, since you only need to compare keys in that case. But assignment must require immutable. IOW:

	// Key is any type that contains references
	Key mk;
	const Key ck;
	immutable Key ik;

	//int[Key] aa; // not allowed, require immutable
	//int[const(Key)] aa; // not allowed, require immutable
	int[immutable(Key)] aa; // OK
	// Of course, to reduce verbosity we could simply define V[Key]
	// to mean V[immutable(Key)] implicitly.

	//aa[mk] = 1;	// not allowed, require immutable
	//aa[ck] = 1;	// not allowed, require immutable
	aa[ik] = 1;	// OK

	int i;
	i = aa[mk]; // OK: lookup only
	i = aa[ck]; // OK: lookup only
	i = aa[ik]; // OK

	i = aa.get(mk, 0); // OK: lookup only
	i = aa.get(ck, 0); // OK: lookup only
	i = aa.get(ik, 0); // OK

	int* p;
	p = mk in aa;	// OK: lookup only
	p = ck in aa;	// OK: lookup only
	p = ik in aa;	// OK

	aa.remove(mk); // OK: lookup only
	aa.remove(ck); // OK: lookup only
	aa.remove(ik); // OK


T

-- 
Let's not fight disease by killing the patient. -- Sean 'Shaleh' Perry