Object.toString, toHash, opCmp, opEquals (page 2) - D Programming Language Discussion Forum

April 26

Re: Object.toString, toHash, opCmp, opEquals

Posted by Richard (Rikki) Andrew Cattermole
in reply to Walter Bright

On 26/04/2024 3:15 PM, Walter Bright wrote:
>     Stuff like this is always solvable if we acknowledge (in other words
>     write them all down) what the requirements are!
> 
> We had a requirement for memory safety. Without it, RC was more of a step sideways than forwards.

I have had a solution to this since before @live that I was screaming about in the guise of DIP1000's last big hole.

Xor mutable references with borrows.

Protects against assigns, function parameter passing, doesn't need any extra syntax...

I am highly annoyed by this. THIS WAS SOLVABLE WITH SOMETHING I HAVE BEEN SCREAMING ABOUT FOR ALMOST THIS EXACT THING.

If this is literally the *only* thing blocking RC, I can do the DIP for it.

April 25

Re: Object.toString, toHash, opCmp, opEquals

Posted by Walter Bright
in reply to Richard (Rikki) Andrew Cattermole

On 4/25/2024 8:21 PM, Richard (Rikki) Andrew Cattermole wrote:
> If this is literally the *only* thing blocking RC, I can do the DIP for it.

The other problem with RC is the exception handler for every decrement. If there's a DIP in it, please do so!

April 26

Re: Object.toString, toHash, opCmp, opEquals

Posted by Richard (Rikki) Andrew Cattermole
in reply to Walter Bright

On 26/04/2024 6:02 PM, Walter Bright wrote:
> On 4/25/2024 8:21 PM, Richard (Rikki) Andrew Cattermole wrote:
>> If this is literally the *only* thing blocking RC, I can do the DIP for it.
> 
> The other problem with RC is the exception handler for every decrement. If there's a DIP in it, please do so!

With struct destructors the unwinding table should already be in use.

So it is a cost we are already paying, that shouldn't be something to worry about as it is not a new cost.

On that note, unwinding tables need to be turned on for -betterC in dmd. Turning them off can only cause program corruption when calling non -betterC code including C/C++. https://github.com/dlang/dmd/pull/16177

But yes, I'll start working on a DIP!

Although I need to talk with Dennis. DIP1000 isn't doing what I would expect it to do for slices:

```d
import std;
void main() @safe {
    Context context;
    char[] str = context.acquire();
    char[] var = test(str);
    writeln(var); // Should be erroring with: scope variable `var` assigned to non-scope parameter `__param_0` calling `writeln`
}

struct Context {
    char[] acquire() scope return @trusted {
        return "Abc".dup;
    }
}

char[] test(return char[] input) @safe {
    return input;
}
```

It does for ``void*``.

April 25

Re: Object.toString, toHash, opCmp, opEquals

Posted by Walter Bright
in reply to Jonathan M Davis

Perhaps I can help things work for you and Timon:

```d
import std.stdio;

class A
{
    string xxx(const Object) const { return "A"; }
}

class B : A
{
    alias xxx = A.xxx;
    string xxx(Object) { return "B"; }
}

void main()
{
    const A a = new A();
    B b = new B();
    const B c = new B();
    writeln(a.xxx(a));
    writeln(b.xxx(b));
    writeln(c.xxx(c));
}
```
I'm calling this xxx instead of toString, just so I can show all the code. Compiling it and running it prints:

A
B
A

In other words, you can have a toString() that is mutable and it will work fine with writeln(), because writeln(x) does not look for Object.toString(), it looks for x.toString().

Does this work for you?

April 26

Re: Object.toString, toHash, opCmp, opEquals

Posted by Jonathan M Davis
in reply to Walter Bright

On Friday, April 26, 2024 12:44:03 AM MDT Walter Bright via Digitalmars-d wrote:
> Perhaps I can help things work for you and Timon:
>
> ```
> import std.stdio;
>
> class A
> {
>      string xxx(const Object) const { return "A"; }
> }
>
> class B : A
> {
>      alias xxx = A.xxx;
>      string xxx(Object) { return "B"; }
> }
>
> void main()
> {
>      const A a = new A();
>      B b = new B();
>      const B c = new B();
>      writeln(a.xxx(a));
>      writeln(b.xxx(b));
>      writeln(c.xxx(c));
> }
> ```
> I'm calling this xxx instead of toString, just so I can show all the code.
> Compiling it and running it prints:
>
> A
> B
> A
>
> In other words, you can have a toString() that is mutable and it will work
> fine with writeln(), because writeln(x) does not look for
> Object.toString(), it looks for x.toString().
>
> Does this work for you?

The ideal situation here is that none of these functions are on Object at all. They really aren't useful there, because it's not particularly useful or necessary to operate on Object. Some of the druntime code does, because it hasn't been templated yet, but once it has been, it won't need to operate on Object at all. At that point, we won't need to have any of these functions on Object, and Editions should give us the ability to remove them.

And then, yes, classes can define those functions as they see fit without having to worry about an implementation on Object, because the code that's using them will be templated. To an extent, that can be done now. However, you then have a problem if the Object version ever gets called, and the derived version does not override the Object version, because then the wrong version gets called. And if we add something like const to these functions, and Object gets used at all, then either the wrong overload will be called, or the derived class will need to be casting away const to have the correct version called (or druntime will be casting away const like it unfortunately does with opEquals right now if you compare two const Objects, and when that happens, it can easily violate the type system's guarantees with const).

Adding const to any of these functions on Object is just putting the code in a position where either we're at risk of the wrong overload being called, or const is going to end up being cast away if the Object overload ever gets used. And it really doesn't buy us much, since code that wants to have them work with const can overload them on const right now and have the non-const overload call the const one. Object can't be compared as const, but that's not generally necessary anyway, since normal code is going to use references that are typed as the actual classes, not Object. For the most part, the only code that's going to have that issue is the druntime code that we need to turn into templates anyway but currently uses Object because of how old it is. And once that's done, then there's no need to have the Object versions of these functions at all. So, I don't think that it makes any sense whatsoever to add const to the functions on Object. Rather, we need to be getting the druntime code to the point that it will work to remove them from Object. And that will fix far more than the issue of const working with these functions, because then it will allow user code to define these functions with whatever set of attributes make sense for that code, thereby fixing the problem for attributes in general.
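A minimal sketch of the pattern mentioned above, where the const overload does the work and the non-const `Object` override forwards to it. The class `Release` and its field are hypothetical, and the sketch assumes the current non-const `Object.opEquals(Object)` signature.

```d
class Release
{
    int number;

    this(int number) { this.number = number; }

    // The const overload does the real comparison and works on const references.
    bool opEquals(const Release rhs) const
    {
        return rhs !is null && number == rhs.number;
    }

    // The override of Object.opEquals forwards to the const overload.
    override bool opEquals(Object o)
    {
        auto rhs = cast(const Release) o;
        return rhs !is null && (cast(const Release) this).opEquals(rhs);
    }
}

void main()
{
    const Release a = new Release(2);
    const Release b = new Release(2);
    assert(a == b); // const references can be compared

    Object o = new Release(2);
    assert(o.opEquals(new Release(2)));  // the Object path reaches the same logic
    assert(!o.opEquals(new Release(3)));
}
```

The cast in the override only adds const, so it does not weaken any type system guarantee.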

Also, I would point out that if your motivation for trying to put const on these functions is related to DIP 1021, then that's going to cause a whole other set of problems anyway, because if I understand correctly, DIP 1021 is trying to disallow stuff like

    foo(bar, bar);

where bar is not taken as const via at least one of those parameters. And yet it's extremely common that neither parameter can be marked as const, because the code is either written to work with a type that does not work with const, or it's templated and therefore needs to not assume that the type it's given works with const (and will often not be instantiated with const types). The free function, opEquals, is precisely such a case, because it's not only designed to work with classes whether their opEquals is const or not (so long as they're not compared as Object), but it's specifically designed to be able to compare the same class object against itself (whether it's literally the same reference or two references which happen to point to the same object). So, the free function, opEquals, needs to be able to accept the same object for both arguments without using const at all. Stuff like

    auto eq = cls == cls;

or

    auto cls2 = cls;
    auto eq = cls == cls2;

need to compile (especially the second one), and that needs to work without requiring const, because not all classes can use const.
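As a self-contained sketch of that requirement, the hypothetical class `Account` below has an `opEquals` that is deliberately not const, and both forms of self-comparison compile and pass with current compilers:

```d
class Account
{
    int balance;

    this(int balance) { this.balance = balance; }

    // Deliberately not const, standing in for a class that cannot use const.
    override bool opEquals(Object o)
    {
        auto rhs = cast(Account) o;
        return rhs !is null && balance == rhs.balance;
    }
}

void main()
{
    auto cls = new Account(10);
    auto cls2 = cls;              // second reference to the same object
    auto other = new Account(10); // distinct object with equal contents

    assert(cls == cls);
    assert(cls == cls2);
    assert(cls == other);
    assert(cls != new Account(11));
}
```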

And if DIP 1021 is trying to force code to use const, that's going to be a non-starter for a lot of code because of how restrictive D's const is. How common it is for the same reference to be passed multiple times, I don't know (certainly, it's going to be far more common with opEquals or opCmp than with most functions), so the issue may be fairly restricted in practice, but with pretty much any part of D, it's going to be a problem any time that the language tries to require that stuff be const, because const is simply too restrictive to work with code in general. It would be like requiring that a feature be pure or @nogc. It will work in many cases, but it also won't work in many cases, so requiring it makes it so that code that really should work won't.

Of course, const will work with some types just fine (especially primitive types), but there are lots of cases in D code where const is avoided completely, because it's too restrictive for that code to use. And templated code typically avoids explicitly using it at all on its parameters, because if the parameters were explicitly marked as const, the code wouldn't work with any types that don't work as const, whereas if you don't mark them as const and then pass a const object, then the template is instantiated with const and works just fine so long as the type itself works as const. So, for most templated code, explicitly using const is not only completely unnecessary, but it's bad practice.
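A small sketch of that point about templates. All names here (`Counter`, `Fixed`, `pull`, `pullConst`) are made up for illustration. The unqualified template accepts both the mutable-only type and the const object, while the explicitly const one rejects the type whose method is not const:

```d
struct Counter
{
    int hits;
    int next() { return ++hits; }      // mutates, so not callable on a const Counter
}

struct Fixed
{
    int value;
    int next() const { return value; } // const-friendly
}

// No const on the parameter: T is deduced with whatever qualifier the caller has.
int pull(T)(ref T source) { return source.next(); }

// Explicit const on the parameter: rejects any type whose next() is not const.
int pullConst(T)(ref const T source) { return source.next(); }

void main()
{
    Counter counter;
    const Fixed fixed = Fixed(7);

    assert(pull(counter) == 1);   // instantiated with Counter
    assert(pull(fixed) == 7);     // instantiated with const(Fixed)
    assert(pullConst(fixed) == 7);
    static assert(!__traits(compiles, pullConst(counter)));
}
```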

So, I find it to be extremely concerning if we're trying to force const anywhere. We need to support it where we can, but requiring it is going to cause problems with any types that can't use it - or which can't use it for the particular operations that are involved with the code trying to require it. So, if a new language feature is trying to require const, we really need to revisit that feature.

- Jonathan M Davis

April 26

Re: Object.toString, toHash, opCmp, opEquals

Posted by Timon Gehr
in reply to Walter Bright

On 4/26/24 05:13, Walter Bright wrote:
> On 4/25/2024 6:32 PM, Timon Gehr wrote:
>> A range is useless unless it is mutable. The range interface is inherently mutable. To iterate a range, you have to call `popFront()` on it. There is no way to have a `const popFront()`.
> 
> I agree there's no reason to have a const popFront(). But opEquals() is inherently non-mutable.

That does not mean it can be D `const`. This is one of the two reasons I mentioned why "const correctness" is such a damaging concept for a D programmer.

Here, you are again conflating the logical with the physical semantics. It's a bit like saying "`opEquals` changes the state of the stack, hence obviously it cannot be `const`!", just one abstraction level higher.

Ideally, `opEquals` implements an equivalence relation. It is fine if it changes the representatives in the process, as long as it properly encapsulates the internal state such that whenever two values compare equal, the observable semantics of the two representatives is the same.

> Let's posit a mutating opEquals() and:
> 
> ```
> o.opEquals(o);
> ```
> 
> and the opEquals() mutated which one, or both, or what would happen if it did?
> ...

"If you stop brushing your teeth, you might get cavities! There is hence no reason not to record video evidence!"

Anyway, here is a simple contrived example of a mutating `opEquals` that is not a logical problem:

```d
struct int31{
    private int payload;
    bool opEquals(int31 rhs){
        payload^=1; // flips the hidden low bit; the observable 31-bit value is unchanged
        return (payload>>1)==(rhs.payload>>1);
    }
    int31 opBinary(string op:"+")(int31 rhs){
        return int31(((payload>>1)+(rhs.payload>>1))<<1);
    }
}
```
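A hedged usage sketch of the same idea, using a renamed, self-contained variant (`Int31` and its constructor are additions for illustration). Repeated comparisons keep giving the same answer even though each one changes the hidden low bit of the left operand:

```d
struct Int31
{
    private int payload;

    // Stores the value in the upper 31 bits; the low bit is internal scratch state.
    this(int value) { payload = value << 1; }

    bool opEquals(Int31 rhs)
    {
        payload ^= 1; // physical mutation that is invisible through the interface
        return (payload >> 1) == (rhs.payload >> 1);
    }
}

void main()
{
    auto a = Int31(5);
    auto b = Int31(5);
    auto c = Int31(6);

    assert(a == b);
    assert(a == b);    // same answer after the internal state has changed
    assert(!(a == c));
}
```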

If you want something that is actually useful, you will have to look into splay trees or something like that. Or e.g., maybe you have a ring buffer or something that compacts itself on iteration. As I said, amortized data structures. It may be incorrect to have a const opEquals. It can introduce a performance regression.

> 
>>> The utility is being able to write borrow-checker style code, so you can avoid things like double frees.
>>> ...
>>
>> `@live` does not enable this.
> 
> ```
> auto p = q;
> free(p);
> free(q);
> ```
> ...

Well, I can just not use `malloc` and `free`. Anyway, to me this is not "borrow-checker style" code. This is C-style `@system` code.

>> Anyway, you are trying to impose nonsensical restrictions on garbage-collected code. I have yet to run into a double-free using GC allocation and I doubt `@live` would help me avoid that if it were a thing.
> 
> D doesn't distinguish between gc pointers and non-gc pointers. It has been proposed, but I have very extensive experience with multiple pointer types and it is a cure worse than the disease.
> ...

I understand that there exist bad solutions to basically any problem. This very thread provides ample evidence of that fact. We have `scope` and non-`scope` pointers and the world has not ended yet.

> 
>>> As I recall, it was you that pointed out that reference counting can never be safe if two mutable pointers to the same ref counted object (one to the object, the other to its interior) were passed to a function. (Freeing the first can leave the second interior pointer pointing to a deleted object.) The entire ref counting scheme capsized because of this.
>> I provided the counterexample, but the unsound generalization is yours.
> 
> All it takes is one counterexample to capsize it.
> ...

Sure, I was just objecting to the characterization that I claimed a "Rust-style" mutation-restricting solution is the only possible one.

> 
>> (Technically, there would be ways to type check that code without banning mutation outright.)
> 
> Neither Andrei nor I nor anyone else working on it could figure out a solution (other than disallowing all pointers to payload).

This is not true, it seems they just did not explain it to you. You could have some sort of more precise type-state system that only disallows operations that may deallocate the payload. This is the kind of thing that Rust initially explored. Anyway, I am not even saying that this is necessarily better, I just don't like technically wrong words being put into my mouth. ;)

> The borrow checker does solve it, though.
> ...

It does not, because it does not actually get aliasing under control. It adds checks that are incomplete in some programs, and unnecessary in other programs.

> 
>>> Why would anyone need toHash(), toString(), opEquals() or opCmp() to mutate their data? Wouldn't that be quite surprising behavior?
>>>
>>
>> As I keep pointing out, there is a difference between mutating abstract data and concrete memory locations. For instance, data types with amortized guarantees usually have to reorganize the internal data representation on each query. (Think e.g. splay trees.)
>>
>> Anyway, let's for the sake of argument assume that I want to write functions that leave memory in exactly the state they encountered it in. Const will _still_ unduly restrict me because it is not fine-grained enough.
>>
>> ```d
>> import std.stdio, std.range, std.conv;
>>
>> struct S{
>>      auto r=iota(1,2);
>>      string toString()const{ return text(r); }
> 
> I agree that mutates the argument passed to toString(). That would consume the range. Calling toString() again would return an empty string.
> ...

No, this is not true. `text` does not accept its argument by `ref`. The range stays intact. This is similar to how in:

```d
int[] a = [1,2,3];
writeln(a);
```

The array `a` is not empty after printing.
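The same can be checked directly for a struct range. This sketch assumes a value-type range such as the result of `iota`; a range implemented as a class, or one sharing state through indirections, would behave differently:

```d
import std.conv : text;
import std.range : iota;

void main()
{
    auto r = iota(1, 4);

    // text takes r by value, so formatting iterates a copy of the range.
    auto first = text(r);
    auto second = text(r);

    assert(first == "[1, 2, 3]");
    assert(second == first);           // the second call does not see an empty range
    assert(!r.empty && r.front == 1);  // the original is untouched
}
```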

> 
>> Sometimes there is not even a safe workaround to get a mutable version of a range, because of transitive `const`. A range can have indirections in its implementation.
>> This is just one example establishing that `const` is not expressive enough to say _ONLY_ "this will not mutate anything". It also spells: "This code can be a huge pain in the ass at any point in the future for dumb, incidental reasons."
>>
>> I really do not want to deal with this. I'd much rather fork Phobos so it uses non-const alternatives to toHash and toString.
> 
> I suppose it wouldn't help if I suggest:
> 
> ```
> writeln(text(r));
> ```
> ...

No, it does not. I do not see how this would help.

> I only proposed the const toString() for Object.toString(), not for struct, where indeed you are free to have struct toString() do anything you want.
> ...

I happen to be already using classes. Forking Phobos is less effort than moving to structs. Or I could just switch to OpenD I guess.

> Class and struct are fundamentally different in that class is a universal hierarchy with a common root, and hence we must define what that common root is. Struct, on the other hand, is rootless, and hence the user can define it however he pleases.
> 
> I agree with you that Object shouldn't have had any members, and Andrei and I did discuss that, but since it had members, we couldn't really take them away. Note that COM classes also have a common root with one member QueryInterface().
> ...

I am amazed that you want to break most D code by imposing attributes on common root functions, but removing functions from the common root is a bridge too far even though the fix is usually simply to remove `override`.

> 
>> If you expect people to prove properties to an incomplete type system via annotations and to accept unnecessary restrictions, they have to get some value out of it. You also would not go: "Starting from tomorrow, you have to prove to me that you brush your teeth every day. I want video evidence." And then, when I refuse, you can't say: "Why would you not brush your teeth?" This is what this is.
>>
>> I caution you to now not miss the forest for the trees and engage in a "tooth-brushing related" argument (e.g., proposing a different range design or something like that). This is an inherent issue. Even if you make the type system more expressive, the annotation overhead is still real, and often uneconomical.
>>
>> I am perfectly fine with having some restricted system like Rust for people who want to do safe manual memory management. This would even be useful to me. But this has to be opt-in, based on data structures, and interoperate as seamlessly as possible with the full language.
> 
> 
> I think I see your point of view. Mine is a little different.

My point of view is D-focused, yours often enough seems to be C-focused. There are only so many insights about D's design that can be extracted from issues with C's design. Actual experience with D is increasingly important.

You will notice that all of the experience you mention in this thread is with systems that do not work well. I have considerable experience with D, and the only memory-management related issue that I care about is use after free. Yet `@live` does not solve this problem for me. (I am aware that you can write a snippet of code that is rejected by @live for use after free. Personally I care about code that is accepted and hence is guaranteed not to have use after free.)

> I have considerable experience with C. When I see:
> 
> ```
> int foo(T* p);
> ```
> 
> Is p an array? is foo() going to mutate what it points to? Is foo() going to free() it?

I agree with this point of view, this is not what I am objecting to. This is a "tooth-brushing related" argument.

Anyway, this is the C point of view. OTOH, in @safe D, `p` cannot be `free`d. It may e.g. be a GC pointer.

If you want to allow an `@safe foo` to free its argument, you will have to encode in the type of that argument that it is a malloc'd pointer. There is just no way around that unless you say "in this language, every non-scope pointer comes from malloc". That would be a bad outcome.

The best way to do such an encoding is to have a struct wrapper around the pointer, have proper move semantics and a borrow checker that works well, and soundly, with data abstraction. In this case, the borrow checker actually makes a difference in `@safe` code. Otherwise it does not.
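A rough sketch of the wrapper half of that idea, with hypothetical names (`Owned`, `make`, `get`, `empty`). It encodes "this pointer came from malloc" in a type and uses move semantics to keep a single owner. It is not a sound `@safe` interface on its own, since nothing here stops a borrowed reference from outliving the owner; that is the part a working borrow checker would have to cover.

```d
import core.lifetime : move;
import core.stdc.stdlib : free, malloc;

struct Owned
{
    private int* ptr;

    @disable this(this); // no copies, so there is exactly one owner

    static Owned make(int value) @trusted
    {
        auto p = cast(int*) malloc(int.sizeof);
        assert(p !is null, "allocation failed");
        *p = value;
        return Owned(p);
    }

    ~this() @trusted
    {
        free(ptr); // free(null) is a no-op, so a moved-from Owned is harmless
        ptr = null;
    }

    bool empty() const @safe { return ptr is null; }

    int get() const @trusted
    {
        assert(ptr !is null, "use of moved-from Owned");
        return *ptr;
    }
}

void main()
{
    auto a = Owned.make(42);
    auto b = move(a); // ownership transfers; a is reset to Owned.init

    assert(a.empty);
    assert(b.get == 42);
}   // b's destructor frees the allocation exactly once
```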

> How would I know without reading the implementation? (The documentation is always incomplete, wrong, or missing.) Annotations give me confidence that I understand what it does. const/ref/scope here answer my questions, and the compiler backs it up.
> ...

Your considerable experience with C contradicts your extensive experience with "multiple pointer types" and D's actual, existing DIP1000 and `const` design. I implore you to refine your position, otherwise it is simply internally inconsistent and hence allows you to dismiss any argument. This is very frustrating for an interlocutor.

Anyway, I agree that `const` and `scope` can be very useful in cases where they work. They are just not a panacea.

> 
>  > One thing I absolutely agree on with Robert is that it should always be
>  > _possible_ to write simple @safe D code without any advanced type system
>  > shenanigans. I think any design that strays from that principle is bad. This
>  > proposed change absolutely torpedoes that.
> 
> I agree with Robert, too. I asked him to prepare a list of his proposals so I can see what can be done.
> ...

One concrete thing that can be done is to change course here. If you want to do a breaking change, do one that causes less pain and does not make D code more complicated by default.

> P.S. const class Objects are more or less unusable with the non-const toString, toHash, opCmp and opEquals.
> ...

`const` class Objects are more or less unusable full stop. You can't even have a tail-const class reference.

Yet `const` class Objects are exactly what this proposal is trying to impose on unsuspecting D programmers. It just does not work.
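To illustrate the tail-const limitation: a `const(Node)` reference cannot be reassigned, and the usual library workaround is `std.typecons.Rebindable`. The class `Node` is hypothetical.

```d
import std.typecons : Rebindable;

class Node
{
    int id;
    this(int id) { this.id = id; }
}

void main()
{
    const(Node) fixed = new Node(1);
    // The const applies to the reference as well as the object, so no rebinding.
    static assert(!__traits(compiles, fixed = new Node(2)));

    // Rebindable emulates a tail-const reference in library code.
    auto r = Rebindable!(const Node)(new Node(1));
    r = new Node(2);
    assert(r.id == 2);
    static assert(!__traits(compiles, r.id = 3)); // the object itself stays const
}
```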

> P.P.S. all of D's annotations are subtractive. This means you can write code without annotations and it'll work.

That's great, but it will sometimes not interoperate with code that has annotations, as in this case. Hence if you start imposing annotations on code, you lose this property. This would be a significant loss for the approachability of D, particularly as a first language. Furthermore, it is also a slap in the face to experienced D developers that have come to understand the limitations and proper applications of D's annotations.

> But safe, probably not.
> ...

I do not understand. Do you agree with Robert or not?

A big strength of D is that you can start out prototyping stuff with the GC without unnecessary annotation overhead and then often it will be good enough. If it is not, you can then explore different memory management options, surgically for the parts of the program state where that actually makes a difference. Only at this point is it then okay to expect people to annotate things if they want checked safety.

> P.P.P.S. I almost never write a multiple free bug these days. But that doesn't translate to "don't need double free protection", as I spent many years making that mistake and tracking them down. I even wrote my own malloc/free debugger to help. Eventually, I simply internalized what not to do. But that isn't a transferable skill. I can't even explain what I do.
> ...

As I said many times, if you want `@live` to be a linter to avoid manual memory management bugs in `@system/@trusted` functions that avoid proper data abstraction with constructors and destructors, that is fine. But you cannot hold this position and at the same time turn around and claim it does anything for `@safe` reference counting. It just does not. A more careful approach is needed.

> Anyhow, thanks for the food for thought!
> 

My pleasure! Here is some more: Why did you not propose to add `pure` to the signatures? How about `@nogc`? `nothrow`? `@safe`? Why is `toHash` `@trusted nothrow`, but not other functions?

April 26

Re: Object.toString, toHash, opCmp, opEquals

Posted by Timon Gehr
in reply to Timon Gehr

On 4/26/24 15:27, Timon Gehr wrote:
>> The borrow checker does solve it, though.
>> ...
> 
> It does not, because it does not actually get aliasing under control. It adds checks that are incomplete

(And also insufficient.)

> in some programs, and unnecessary in other programs.

(I assumed you were talking about the `@live` borrow checker.)

April 26

Re: Object.toString, toHash, opCmp, opEquals

Posted by Timon Gehr
in reply to Timon Gehr

On 4/26/24 15:27, Timon Gehr wrote:
> Ideally, `opEquals` implements an equivalence relation. It is fine if it changes the representatives in the process, as long as it properly encapsulates the internal state such that whenever two values compare equal, the observable semantics of the two representatives is the same.
> ...
> If you want something that is actually useful, you will have to look into splay trees or something like that. Or e.g., maybe you have a ring buffer or something that compacts itself on iteration. As I said, amortized data structures. It may be incorrect to have a const opEquals. It can introduce a performance regression.

Jonathan's examples with concurrency are also a very good practical illustration of this.

April 26

Re: Object.toString, toHash, opCmp, opEquals

Posted by Timon Gehr
in reply to Walter Bright

On 4/26/24 08:44, Walter Bright wrote:
> Perhaps I can help things work for you and Timon:
> 
> ```
> import std.stdio;
> 
> class A
> {
>      string xxx(const Object) const { return "A"; }
> }
> 
> class B : A
> {
>      alias xxx = A.xxx;
>      string xxx(Object) { return "B"; }
> }
> 
> void main()
> {
>      const A a = new A();
>      B b = new B();
>      const B c = new B();
>      writeln(a.xxx(a));
>      writeln(b.xxx(b));
>      writeln(c.xxx(c));
> }
> ```
> I'm calling this xxx instead of toString, just so I can show all the code. Compiling it and running it prints:
> 
> A
> B
> A
> 
> In other words, you can have a toString() that is mutable and it will work fine with writeln(), because writeln(x) does not look for Object.toString(), it looks for x.toString().
> 
> Does this work for you?

It clutters the user code with aliases. Might be preferable to forking Phobos though. Also, it can give wrong results at runtime.

For example, if a templated library type uses DbI to check whether it should make `toString` `const` by checking whether there is a `const toString` on the argument type, it will find `Object.toString`. Then those types will not properly compose with my `toString`.

This is not a theoretical problem either. This kind of introspection would be the proper fix for the following issue with std.typecons.Tuple.toString:

```d
import std.stdio, std.typecons;
class C{
    override string toString()=>"correct";
}

void main(){
    writeln(new C()); // "correct"
    writeln(tuple(new C())); // "Tuple!(C)(const(tt.C))"
}
```

This is also the same issue that prevents tuples with range members from being printed properly.
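A sketch of what such an introspection-based fix could look like. The trait `hasConstToString` and the type `Wrapper` are hypothetical and are not how `std.typecons.Tuple` is implemented. With today's non-const `Object.toString`, the trait is false for a class that only defines a mutable `toString`, so the wrapper picks a matching mutable `toString`. With the alias workaround and a const `Object.toString`, the same check would presumably succeed by finding the inherited method and select the wrong branch.

```d
// Hypothetical trait: true if toString() can be called on a const T.
enum hasConstToString(T) = is(typeof((const T t) => t.toString()));

class Mutating
{
    override string toString() { return "mutating"; }
}

class Constant
{
    override string toString() const { return "constant"; }
}

struct Wrapper(T)
{
    T item;

    // Design by introspection: only promise const if the payload can deliver it.
    static if (hasConstToString!T)
        string toString() const { return "Wrapper(" ~ item.toString() ~ ")"; }
    else
        string toString() { return "Wrapper(" ~ item.toString() ~ ")"; }
}

void main()
{
    static assert(!hasConstToString!Mutating);
    static assert(hasConstToString!Constant);

    auto w = Wrapper!Mutating(new Mutating);
    assert(w.toString() == "Wrapper(mutating)");

    const c = Wrapper!Constant(new Constant);
    assert(c.toString() == "Wrapper(constant)");
}
```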

April 26

Re: Object.toString, toHash, opCmp, opEquals

Posted by Timon Gehr
in reply to Timon Gehr

On 4/26/24 16:00, Timon Gehr wrote:
> 
> 
> This is not a theoretical problem either. This kind of introspection would be the proper fix for the following issue with std.typecons.Tuple.toString:
> 
> ```d
> import std.stdio, std.typecons;
> class C{
>      override string toString()=>"correct";
> }
> 
> void main(){
>      writeln(new C()); // "correct"
>      writeln(tuple(new C())); // "Tuple!(C)(const(tt.C))"
> }
> ```
> 
> This is also the same issue that prevents tuples with range members from being printed properly.

For reference, this is how I deal with problems like that currently:
https://github.com/tgehr/util/blob/master/tuple.d

Copyright © 1999-2021 by the D Language Foundation