Object.toString, toHash, opCmp, opEquals (page 3)

April 26

Re: Object.toString, toHash, opCmp, opEquals

Posted by Per Nordlöw
in reply to Walter Bright

On Thursday, 25 April 2024 at 23:06:27 UTC, Walter Bright wrote:

> The prototypes are:
>
> string toString();
> size_t toHash() @trusted nothrow;
> int opCmp(Object o);
> bool opEquals(Object o);
>
> which long predated const. The trouble is, they should be:

Shouldn't some or all of them be qualified as scope as well?

April 27

Re: Object.toString, toHash, opCmp, opEquals

Posted by Richard (Rikki) Andrew Cattermole
in reply to Timon Gehr

On 27/04/2024 1:33 AM, Timon Gehr wrote:
> On 4/26/24 15:27, Timon Gehr wrote:
> 
>         The borrow checker does solve it, though. ...
> 
>     It does not, because it does not actually get aliasing under
>     control. It adds checks that are incomplete
> 
> (And also insufficient.)

Yes, this is something I've been trying to explain (highly unsuccessfully I might add) from pretty much day 1 of @live.

For a borrow checker to actually be useful, it must start from the point of allocation and track all the way to deallocation.

But in my view there are two behaviors here:

- Ownership transfer
- Owner/borrow relationship

The owner/borrow relationship is the thing Walter has just given me the green light to write a DIP for, something I've been wanting to do for years.

That relies on DIP1000 to detect relationships via the use of ``return`` (talked with Dennis, confirmed that this is what is *meant* to be happening).

The ownership transfer, however, is what I want to see solved with isolated. This solves aliasing, since it is known which subgraph of memory is in each variable at the point of a transfer.

April 26

Re: Object.toString, toHash, opCmp, opEquals

Posted by Walter Bright
in reply to Jonathan M Davis

D1 is an example of a language with no attributes and no const. D1 works as a good programming language.

But it gives the programmer no indication of whether the arguments get mutated or not. He'll have to read and understand the called function, as well as the functions it calls.

It is reasonable to use const parameters when the argument is not going to be mutated. I personally prefer to use that as much as possible, and I like very much that the compiler will enforce it. With the mutating 4 functions, I cannot use const class objects.

Mutating toString, toHash, opCmp, and opEquals is unusual behavior, which is why const should be the default for them. After all, who expects a==b to change a or b?

I showed how to use the toString, toHash, opCmp, and opEquals functions with objects that do want to use mutating implementations of those functions. It will also be clear to the user which toString is mutating and which is not. It satisfies the use cases Timon mentioned - he'll still be able to use a mutating toString that will be used by writeln().

April 26

Re: Object.toString, toHash, opCmp, opEquals

Posted by Walter Bright
in reply to Timon Gehr

On 4/26/2024 7:00 AM, Timon Gehr wrote:
> This is not a theoretical problem either. This kind of introspection would be the proper fix for the following issue with std.typecons.Tuple.toString:
> 
> ```d
> import std.stdio, std.typecons;
> class C{
>      override string toString()=>"correct";
> }
> 
> void main(){
>      writeln(new C()); // "correct"
>      writeln(tuple(new C())); // "Tuple!(C)(const(tt.C))"
> }
> ```
> 
> This is also the same issue that prevents tuples with range members from being printed properly.

I would like to see "new C()" and "tuple(new C())" mean exactly the same thing. After all, with the builtin-tuples (not the struct library version) they are the same thing.

April 26

Re: Object.toString, toHash, opCmp, opEquals

Posted by Walter Bright
in reply to Richard (Rikki) Andrew Cattermole

On 4/25/2024 11:10 PM, Richard (Rikki) Andrew Cattermole wrote:
> With struct destructors the unwinding table should already be in use.
> So it is a cost we are already paying, that shouldn't be something to worry about as it is not a new cost.

Structs are passed around by ref all the time to avoid this cost. With RC, that goes out the window.

> DIP1000 isn't doing what I would expect it to do for slices:

Please file bug reports, and tag DIP1000 bugs with the "safe" keyword.

Also, this is drifting off topic. If you want to continue, please start a new thread.

April 26

Re: Object.toString, toHash, opCmp, opEquals

Posted by Dennis
in reply to Walter Bright

On Friday, 26 April 2024 at 20:17:09 UTC, Walter Bright wrote:

> Mutating toString, toHash, opCmp, and opEquals is unusual behavior, which is why const should be the default for them. After all, who expects a==b to change a or b?

Timon has mentioned data structures with amortized time complexity several times now, but perhaps an example closer to home helps:

https://github.com/dlang/dmd/blob/9ffa763540e16228138b44c3731d9edc2a7728b6/compiler/src/dmd/dsymbol.d#L668

In this case, toString (or toPrettyChars, same idea) is logically const because it doesn't mutate the Dsymbol meaningfully, but it can't be D's memory const because it does change a class field to store a cached result.
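
To make the distinction concrete, here is a minimal sketch of the caching pattern. It is not the dmd code: the `Symbol` class and its `cached` field are hypothetical names chosen for illustration.

```d
import std.stdio : writeln;

// Hypothetical stand-in for a class like Dsymbol; all names are illustrative.
class Symbol
{
    string name;
    private string cached; // filled in lazily on first use

    this(string name)
    {
        this.name = name;
    }

    // Logically const: the observable result never changes.
    // Not memory const: the first call writes to `cached`.
    override string toString()
    {
        if (cached is null)
            cached = "Symbol(" ~ name ~ ")";
        return cached;
    }
}

// A const reference cannot call the mutable toString.
static assert(!__traits(compiles, { const Symbol s; auto r = s.toString(); }));

void main()
{
    auto s = new Symbol("foo");
    writeln(s.toString()); // computes the string and stores it
    writeln(s.toString()); // returns the stored string
}
```

Marking this `toString` as `const` would make the assignment to `cached` a compile error, which is the conflict described above.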

April 26

Re: Object.toString, toHash, opCmp, opEquals

Posted by Jonathan M Davis
in reply to Walter Bright

On Friday, April 26, 2024 2:17:09 PM MDT Walter Bright via Digitalmars-d wrote:
> D1 is an example of a language with no attributes and no const. D1 works as a good programming language.
>
> But it gives the programmer no indication of whether the arguments get mutated or not. He'll have to read and understand the called function, as well as the functions it calls.
>
> It is reasonable to use const parameters when the argument is not going to be mutated. I personally prefer to use that as much as possible, and I like very much that the compiler will enforce it. With the mutating 4 functions, I cannot use const class objects.

You can do that today. E.g., this code compiles and runs just fine.

```
void main()
{
    static class C
    {
        int i;

        this(int i)
        {
            this.i = i;
        }

        override bool opEquals(Object rhs)
        {
            if(auto r = cast(C)rhs)
                return opEquals(r);
            return false;
        }

        bool opEquals(const C rhs) const
        {
            return this.i == rhs.i;
        }

        override int opCmp(Object rhs)
        {
            if(auto r = cast(C)rhs)
                return opCmp(r);

            throw new Exception("Cannot compare C with types that aren't C");
        }

        int opCmp(const C rhs) const
        {
            if(this.i < rhs.i)
                return -1;
            if(this.i > rhs.i)
                return 1;
            return 0;
        }

        override string toString() const
        {
            import std.format : format;
            return format!"C(%s)"(i);
        }

        override size_t toHash() const @safe nothrow
        {
            return i;
        }
    }

    const c1 = new C(42);
    const c2 = new C(99);

    assert(c1 == c1);
    assert(c1 != c2);

    assert(c1 <= c1);
    assert(c1 < c2);

    assert(c1.toHash() == 42);

    import std.format : format;
    assert(format!"c1: %s"(c1) == "c1: C(42)");
}
```

All four functions worked with const references. What does not work is if you use const Object references.

    assert(c1 == c2);

gets lowered to the free function, opEquals:

    assert(opEquals(c1, c2));

and because that function is templated, the derived class overloads get used. This is what happens in almost all D code using classes. The exception is code that uses Object directly, and pretty much no code should be doing that.

Java passes Object all over the place and stores it in data structures such as containers, because Java doesn't have templates. For them, generic code has to operate on Object, because they don't have any other way to do it. In sharp contrast, we have templates. So, generic code has no need to operate on Object, and as such, it has no need to call opEquals, opCmp, toHash, or toString on Object. As it is, Object's opCmp throws, because there are plenty of classes where opCmp doesn't even make sense, and there really isn't a way to give Object an implementation that makes sense.
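
As a small illustration of the last point, the sketch below compares two instances of a class that defines no opCmp. The class name `Plain` is hypothetical, and the example assumes the comparison falls through to Object's default opCmp, which throws.

```d
import std.exception : assertThrown;

// Hypothetical class that defines no opCmp of its own.
class Plain
{
}

void main()
{
    auto a = new Plain;
    auto b = new Plain;

    // The comparison compiles, because Object declares opCmp,
    // but the default implementation throws at run time.
    assertThrown!Exception(a < b);
}
```

The failure only shows up at run time, even though the type never claimed to be ordered.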

Generic code in D is templated, and as such, we can do stuff like we've done with the free function, opEquals, and make it so the code that needs to operate on classes generically operates on the exact type that it's given instead of degrading to Object. And as such, we don't need any of these functions to be on Object.

It's already the case that code like format and writeln operate on the actual class type that they're given and not Object. You already saw that when you talked about using alternate overloads for toString.

D code in general does not operate on Object. AFAIK, the main place that anything in D operates on Object at present is in old druntime code that has yet to be templated. And if that code is templated, the need to have these functions on Object goes away entirely. Then the entire debate of which attributes these functions should have goes away. Classes can define them in whatever way the programmer sees fit so long as the parameters and return types match what's necessary for them to be called by the code that uses these functions - just like happens with structs. The only difference is that the derived classes within a particular class hierarchy will have to be built on top of whatever signatures those functions were given on the first class in the hierarchy that had them, whereas structs don't have to worry about inheritance. But those signatures can then be whatever is appropriate for that particular class hierarchy instead of trying to come up with a set of attributes that make sense for all classes (which isn't possible).

And given that Object really doesn't need to have any of these functions, we likely would have removed them years ago if it weren't for the fact that something like that would break code (in large part due to the override keyword; a lot of the code would have worked just fine with those functions being removed from Object if the derived classes didn't have to have override, which will then become an error when the base class version of the function is removed). Andrei also proposed ProtoObject as a way to change the class hierarchy so that we could remove these functions (as well as the monitor) from classes without breaking code built on top of Object. So, we've known for years that we could fix this problem if we could just remove these functions from Object.

Editions give us a way to make breaking changes in a manageable manner. This should give us the opportunity to remove these four functions from Object like we've discussed for years and couldn't do because it would break code.

And if we decide to not do that, putting const on these four functions would actually make the situation worse. Yes, you could then call those four functions on const Objects, but it would mean that every single class will be forced to have these functions even if they cannot actually implement them properly with const. And what do such types do at that point? Do they throw an exception?

```
    static class C
    {
        Mutex mutex;
        shared int* ptr;

        this()
        {
            this.ptr = new shared int;
        }

        override bool opEquals(const Object rhs) const
        {
            throw new Exception("C does not support const");
        }

        bool opEquals(C rhs)
        {
            mutex.lock();
            immutable left = cast()*this.ptr;
            mutex.unlock();

            rhs.mutex.lock();
            immutable right = cast()*rhs.ptr;
            rhs.mutex.unlock();

            return left == right;
        }
    }
```

Do they cast away const and mutate?

```
    static class C
    {
        Mutex mutex;
        shared int* ptr;

        this(int i)
        {
            this.ptr = new shared int;
        }

        override bool opEquals(const Object rhs) const
        {
            if(auto r = cast(C)rhs)
                return (cast()this).opEquals(r);
            return false;
        }

        bool opEquals(C rhs)
        {
            mutex.lock();
            immutable left = cast()*this.ptr;
            mutex.unlock();

            rhs.mutex.lock();
            immutable right = cast()*rhs.ptr;
            rhs.mutex.unlock();

            return left == right;
        }
    }
```

Do they just have different behavior in the Object overload?

```
    static class C
    {
        Mutex mutex;
        shared int* ptr;

        this(int i)
        {
            this.ptr = new shared int;
        }

        override bool opEquals(const Object rhs) const
        {
            return this is rhs;
        }

        bool opEquals(C rhs)
        {
            mutex.lock();
            immutable left = cast()*this.ptr;
            mutex.unlock();

            rhs.mutex.lock();
            immutable right = cast()*rhs.ptr;
            rhs.mutex.unlock();

            return left == right;
        }
    }
```

You'd have a type which could technically be used as a const Object but which would not do the correct thing if it ever is. In contrast, right now, while you can't call any of these functions with a const Object, you _can_ call them on a const reference of the derived type if the derived type has const on them.

So, the code right now will do the correct thing, and it will work with const in any normal situation, whereas if we put const on these functions, such classes will have overloads that will not - and cannot - do the correct thing. And while those Object overloads would not normally be used, if they ever are, you have a bug - one which could be pretty annoying to track down, depending on what the const overload does.

I completely agree with you that _most_ classes should have const on these functions so that they can work with const references, but not all classes will work with const, and I don't see why there is any need to make these functions work with const Objects - const class references, yes, but not const Objects.

Normal D code does not use Object directly, and we should be able to templatize what little druntime code there is left which operates on Object and needs to use one or more of these functions. Once that's done, we can use an Edition to remove these functions from Object, and this entire issue goes up in smoke.

Instead, what you're proposing also causes breakage, but it puts perfectly legitimate use cases in a situation where they have to implement functions which they literally cannot implement properly. And it's for a use case that normal D code shouldn't even be doing - that is operating on Object instead of whatever derived types the code base in question is actually using. D is not Java. It may have made sense to put these functions on Object with D1, but with D2, we have a powerful template system which generally obviates the need to operate on Object. We shouldn't need to have these functions on Object, and Editions should give us what we need to remove them in a manageable way.

- Jonathan M Davis

April 27

Re: Object.toString, toHash, opCmp, opEquals

Posted by Timon Gehr
in reply to Walter Bright

On 4/26/24 22:23, Walter Bright wrote:
> On 4/26/2024 7:00 AM, Timon Gehr wrote:
>> This is not a theoretical problem either. This kind of introspection would be the proper fix for the following issue with std.typecons.Tuple.toString:
>>
>> ```d
>> import std.stdio, std.typecons;
>> class C{
>>      override string toString()=>"correct";
>> }
>>
>> void main(){
>>      writeln(new C()); // "correct"
>>      writeln(tuple(new C())); // "Tuple!(C)(const(tt.C))"
>> }
>> ```
>>
>> This is also the same issue that prevents tuples with range members from being printed properly.
> 
> I would like to see "new C()" and "tuple(new C())" mean exactly the same thing. After all, with the builtin-tuples (not the struct library version) they are the same thing.
> 

Ok, then I will further pursue an implementation of `opArgs` I guess. It does have a couple of drawbacks, but if this is your preference we can try to make it work.

April 26

Re: Object.toString, toHash, opCmp, opEquals

Posted by Walter Bright
in reply to Dennis

On 4/26/2024 2:38 PM, Dennis wrote:
> Timon has mentioned data structures with amortized time complexity several times now, but perhaps an example closer to home helps:
> 
> https://github.com/dlang/dmd/blob/9ffa763540e16228138b44c3731d9edc2a7728b6/compiler/src/dmd/dsymbol.d#L668
> 
> In this case, `toString` (or `toPrettyChars`, same idea) is *logically* const because it doesn't mutate the Dsymbol meaningfully, but it can't be D's *memory* const because it does change a class field to store a cached result.
> 

I'm aware of that, and have investigated it several times looking for what can be made const. The compiler does a lot of lazy evaluation.

The simplest way to deal with that is to get the logical const value at the call site, then pass it to a const parameter.
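
A rough sketch of that approach, under assumed names (`LazyName` and `nameLength` are hypothetical and not taken from the compiler source): the lazily evaluated value is obtained from the mutable object at the call site, and only the result is handed to a function with a const parameter.

```d
import std.stdio : writeln;

// Hypothetical class whose toString caches its result, so it cannot be const.
class LazyName
{
    private string name;
    private string cached;

    this(string name)
    {
        this.name = name;
    }

    override string toString()
    {
        if (cached is null)
            cached = "LazyName(" ~ name ~ ")";
        return cached;
    }
}

// The callee only sees const data and cannot mutate anything.
size_t nameLength(const(char)[] text)
{
    return text.length;
}

void main()
{
    auto obj = new LazyName("foo");

    // The mutation (filling the cache) happens here, at the call site.
    const text = obj.toString();

    // Only the resulting const value crosses the function boundary.
    writeln(nameLength(text));
}
```

The trade-off is that the callee receives the computed value rather than the object, so it cannot ask for anything else that is lazily evaluated.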

April 26

Re: Tuple-icious

Posted by Walter Bright
in reply to Timon Gehr

On 4/26/2024 4:30 PM, Timon Gehr wrote:
> On 4/26/24 22:23, Walter Bright wrote:
>> I would like to see "new C()" and "tuple(new C())" mean exactly the same thing. After all, with the builtin-tuples (not the struct library version) they are the same thing.
>>
> 
> Ok, then I will further pursue an implementation of `opArgs` I guess. It does have a couple of drawbacks, but if this is your preference we can try to make it work.

I also mean that given:
```
int mul(int x, int y);
```
then:
```
mul(1, 2);
```
should mean the same thing as:
```
mul(tuple(1, 2));
```

This currently works with the builtin tuples. The trouble is that:

```
struct Args { int a; int b; }
Args args = { 1, 2 };
mul(args);
```

fundamentally does not work, because the binary function call API for arguments is not the same as the binary function call API for structs with the corresponding fields.

This has frustrated me for some time. Static arrays and structs are binary API compatible, and I've been careful not to break that in D's design. I.e. they are unified. I'd like to extend that unification to tuples-as-arguments.

I.e.:

```
struct <=> static array <=> tuple <=> argument list
```

should be binary interchangeable.
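
For comparison, here is a sketch of what can be written today with an explicit expansion step. It reuses `mul` and `Args` from the snippets above and assumes the library `tuple` from std.typecons; it says nothing about the binary layout, only about the source-level spelling.

```d
import std.typecons : tuple;

int mul(int x, int y)
{
    return x * y;
}

struct Args
{
    int a;
    int b;
}

// Passing the struct itself is rejected: it is not an argument list.
static assert(!__traits(compiles, mul(Args(1, 2))));

void main()
{
    auto args = Args(1, 2);
    // .tupleof expands the fields into separate arguments.
    assert(mul(args.tupleof) == 2);

    auto t = tuple(3, 4);
    // The library tuple has to be expanded explicitly as well.
    assert(mul(t.expand) == 12);
}
```

The unification described above would make the explicit `.tupleof` and `.expand` steps unnecessary.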

Note that I was very pleased to make the discovery that pointers to:

struct member functions <=> class member functions <=> nested functions

are all interchangeable delegates! This has been a big win for D.
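
A short sketch of that interchangeability, with hypothetical types `S` and `C`: delegates taken from a struct member function, a class member function, and a nested function all fit in the same `int delegate()` array.

```d
import std.stdio : writeln;

struct S
{
    int v;
    int get() { return v; }
}

class C
{
    int v;
    this(int v) { this.v = v; }
    int get() { return v; }
}

void main()
{
    auto s = S(1);
    auto c = new C(2);
    int local = 3;
    int nested() { return local; }

    int delegate()[] getters;
    getters ~= &s.get;  // struct member function
    getters ~= &c.get;  // class member function
    getters ~= &nested; // nested function

    // The caller cannot tell which kind of function it is invoking.
    foreach (dg; getters)
        writeln(dg());
}
```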
