October 19, 2022
On Wednesday, 19 October 2022 at 18:38:55 UTC, mw wrote:
> That's because now we want to add @mustuse half-way, and it requires global system analysis.

It does not require global system analysis.

> If D had started from scratch as Eiffel did, and enforced that all query methods are @mustuse from the very beginning, this problem and remedy headache would never have existed.

Sure. It's the same with other attributes, like @safe and nothrow.

> So now we need to decide whether we want to prevent accidentally (but in most cases fatally) discarding function return values, or to prevent more coupling between modules.

It is possible to achieve both. If the compiler enforces the rules I laid out in my initial post [1] (i.e., inheritance can only remove @mustuse, not add it), then accidentally discarding the return value of a @mustuse function will be impossible, *and* there will be no additional coupling between modules.

Of course, this means that there will be some functions that can never be marked as @mustuse without a breaking change--just like with @safe and nothrow. But that's true even with your proposal. Global analysis does not prevent the addition of @mustuse from breaking code, it just increases the number of places where that breaking change can be made.

[1] https://forum.dlang.org/post/cqlwlnpcbtbkzqnhicwc@forum.dlang.org

October 19, 2022
On Wednesday, 19 October 2022 at 18:56:35 UTC, Paul Backus wrote:
> On Wednesday, 19 October 2022 at 18:38:55 UTC, mw wrote:
>
> It is possible to achieve both. If the compiler enforces the rules I laid out in my initial post [1] (i.e., inheritance can only remove @mustuse, not add it),

This won't work, as Teoh has shown the loophole:

1) Base class has @mustuse, derived class does not.  Then the base class's @mustuse can be circumvented by a derived class that omits the attribute, defeating the purpose of @mustuse in the base class.


> Of course, this means that there will be some functions that can never be marked as @mustuse without a breaking change--just like with @safe and nothrow. But that's true even with your proposal.

In any case it will be a breaking change; I never said otherwise, and it's not my concern. My only concern is how to make ideal D code more robust.

October 19, 2022
On Wed, Oct 19, 2022 at 06:50:41PM +0000, Paul Backus via Digitalmars-d wrote: [...]
> The difference between @mustuse and @safe is that adding @safe imposes additional restrictions on the *function*, but adding @mustuse imposes additional restrictions on the *calling code*.
> 
> Another way to think of it is: @safe is like an "out" contract, and @mustuse is like an "in" contract.
> 
> Derived classes are allowed to weaken in contracts and strengthen out contracts, but not the reverse. By the same logic, derived classes are allowed to remove @mustuse and add @safe, but not the reverse.

Hmm, this actually makes a lot of sense.

If a base class method Base.method has @mustuse but the derived class method Derived.method doesn't, that's not a problem: callers who hold a Base reference to the derived instance will respect @mustuse, but Derived.method doesn't care.  Conversely, you can only cast a Base to Derived if it's actually an instance of Derived, so calling .method afterwards without respecting @mustuse doesn't break anything (this does not allow you to circumvent @mustuse on AnotherDerived.method).
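The @safe half of that analogy can be checked with the compiler as it is today. A minimal sketch, with hypothetical class and function names; the @mustuse half is shown only as comments because, as far as I know, a function-level @mustuse is not implemented:

```d
class Base {
    // The base method makes no safety promise.
    // (Under the proposed rule, this is also where @mustuse would have to appear.)
    int method() @system { return 1; }
}

class Derived : Base {
    // Adding @safe in an override is accepted: the override promises more, like a stronger "out" contract.
    // (Under the proposed rule, an override could likewise drop @mustuse, like a weaker "in" contract.)
    override int method() @safe { return 2; }
}

// Compiles only because Derived.method is @safe.
int callThroughDerived(Derived d) @safe { return d.method(); }

// Through a Base reference, only the Base signature counts, so this caller cannot be @safe.
int callThroughBase(Base b) @system { return b.method(); }

void main() {
    auto d = new Derived;
    assert(callThroughDerived(d) == 2);
    assert(callThroughBase(d) == 2);      // same object, seen through the base interface
    assert(callThroughBase(new Base) == 1);
}
```

A caller holding a Base reference is checked against Base's signature, and a caller holding a Derived reference is checked against Derived's, which is the same split described above for @mustuse.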

So yes, @mustuse propagates up the class hierarchy, but not necessarily down.


T

-- 
He who does not appreciate the beauty of language is not worthy to bemoan its flaws.
October 19, 2022
On Wednesday, 19 October 2022 at 18:50:41 UTC, Paul Backus wrote:
> Another way to think of it is: @safe is like an "out" contract, and @mustuse is like an "in" contract.

ok yeah, that's exactly what i had in mind, i was just in a rush (made it to my meeting with 20s to spare! lol). I think this makes sense and is a useful framework for answering all the other questions.
October 19, 2022
On 10/19/22 11:16, mw wrote:

>> If I now decide to add @mustuse to Derived.fun, in the "derived"
>> module, and we apply your proposed global analysis, this will cause a
>> compilation error in the "doStuff" function in the "base" module!
>>
>> Note that the "base" module does not have any explicit dependency on
>> the "derived" module. It does not import it, or otherwise refer to it
>> in any way. In the real world, these two modules might even be in
>> separate dub packages.
>
> Yes, I know that. But this in my view is still a compiler implementation
> issue:

It is a human issue: Let's assume twenty developers have been using an Animal hierarchy. One day, Bertrand decides to add @mustuse to the new class Giraffe, a descendant of Animal. Now the code is broken all over the place.

Twenty developers chase Bertrand in and around the building until he/she removes @mustuse. (Commenting-out is acceptable as well.) Very human indeed... :o)

> even in separate dub packages, as long as the compiler visits that
> package, it needs to propagate the attribute to that package's relevant
> class.

The compiler does not and should not do anything like that. I can imagine the specification for compilers finding source code (of potentially pre-compiled libraries) and visiting all their source code would be very complicated and very different from @mustuse.

We started @mustuse as a simple concept of "this value must be used" and ended up coming up with a whole new compilation system. Not practical... :)

All aside, I agree with the fact that @mustuse should somehow be per-function. If it were left to me, I would make it the default... which would annoy even myself because it would make quick-and-dirty prototype test code unnecessarily noisy. Perhaps a compiler switch to set the default behavior would work.

Ali

October 19, 2022

On Wednesday, 19 October 2022 at 13:17:00 UTC, Paul Backus wrote:

> In other words, you cannot introduce @mustuse in a derived class; it must be present in the base class.

I just realized that the two of us have different design motivations / goals: my goal is to allow programmers to add the @mustuse annotation to any method that returns a value, and then make the whole thing work (to simulate Eiffel's behavior to the maximum extent). So I need global analysis.

Your goal, in contrast, is to add @mustuse to methods in a way that won't break the current system (or does so with minimal impact); if the impact is too big, e.g. a conflict with a prebuilt binary, then the annotation is constrained so that some uses are disallowed to avoid the conflicts.

> This is somewhat problematic for D, because we have a universal base class, Object, and neither it nor any of its methods are @mustuse.

This is not a concern/problem at all, especially given your design goal, because your rule will prohibit adding such an annotation to existing methods, e.g. Object.opEquals(); and the methods that programmers do care about, e.g. Paths.remove(i), do not exist in Object.

October 19, 2022

On Wednesday, 19 October 2022 at 21:13:00 UTC, mw wrote:

> On Wednesday, 19 October 2022 at 13:17:00 UTC, Paul Backus wrote:
>> In other words, you cannot introduce @mustuse in a derived class; it must be present in the base class.
>
> I just realized that the two of us have different design motivations / goals: my goal is to allow programmers to add the @mustuse annotation to any method that returns a value, and then make the whole thing work (to simulate Eiffel's behavior to the maximum extent). So I need global analysis.
>
> Your goal, in contrast, is to add @mustuse to methods in a way that won't break the current system (or does so with minimal impact); if the impact is too big, e.g. a conflict with a prebuilt binary, then the annotation is constrained so that some uses are disallowed to avoid the conflicts.

The only problem with your goal is that Walter and Atila would never have accepted a proposal that actually achieves it. :)

If you are designing your own language, you can do whatever you want, but if you are contributing to an existing language, you have to work within the limits of what the project leaders will allow.

My goal was to make @mustuse as useful as possible while (a) remaining inside those limits, and (b) keeping it simple enough that I could implement it on my own.

October 22, 2022

On Wednesday, 19 October 2022 at 21:13:00 UTC, mw wrote:

> On Wednesday, 19 October 2022 at 13:17:00 UTC, Paul Backus wrote:
>> In other words, you cannot introduce @mustuse in a derived class; it must be present in the base class.

I spent some more time thinking about this, and I do not agree with this rule even though I know what your design goals (and constraints) are.

I will make an improved proposal based on my initial transitive closure design, and I will write a longer post about it. This may take some time, so please bear with me.

As a summary, this is what I am going to propose: introduce two variants of @mustuse

  1. @mustUse_remedyLegacy: this annotation allows the programmer to flag existing legacy library code that s/he has no right to change, so that the compiler can help find violations; for these the compiler only outputs warnings instead of errors.

  2. @mustuse: proper, the default. For the programmer to use in new code, or in library code s/he can change (from the root of the inheritance tree). Violations of this annotation will cause a compiler error.

And:

a) in both cases, this function property will be a transitive closure, i.e. it is propagated upwards and downwards, in all directions.

b) in both cases, removing the annotation is not allowed in the derived class if the superclass method carries such an annotation.

I will write more about my rationale and considerations, and give examples when I get more time.

November 02, 2022

Finally I got some time to write this draft:

https://github.com/mw66/mustuse/blob/main/mustuse.md

which is copied and pasted below. Feel free to comment here, or log an issue on GitHub.

I will update, or address concerns by revising the doc there.


DIP: @mustuse as function return value annotation

There are two scenarios we need to handle:

  • existing legacy library code, which the programmer has no right to modify
  • new code, over which the programmer has full control from the root of the inheritance tree

@mustuse is neither covariant nor contravariant, it is invariant!

@mustuse as a function attribute only enforces that there must be a receiver for the function call result, i.e. that the
returned result is not thrown away. It has nothing to do with the (sub-)types of the returned value.

Let's consider the following example:

abstract class AbsPaths {
  // no annotation
  abstract AbsPaths remove(int i);  // returns an AbsPaths object with the i-th element removed
}

class ImperativePaths : AbsPaths {  // modifies `this` object itself
  // no @mustuse annotation, on the implicit assumption that the caller will continue to use `this` object
  override AbsPaths remove(int i) {
    ... // remove the i-th element of `this` object
    return this;
  }
}

class FunctionalPaths : AbsPaths {  // does NOT modify `this` object; returns a new (separate) object instead
  @mustuse  // should have this annotation
  override AbsPaths remove(int i) {
    AbsPaths other = new FunctionalPaths();
    ...  // `other` is a copy of `this` object with the i-th element removed; `this` object is unchanged
    return other;
  }
}

void main() {
  int i = 0;  // index of the element to remove
  AbsPaths paths = new ImperativePaths();  // or FunctionalPaths() interchangeably
  AbsPaths shortPaths = paths.remove(i);   // this line should always work, as long as the return value is not discarded
}

Here both ImperativePaths and FunctionalPaths inherit from AbsPaths. If one branch has @mustuse and the other
does not, and the programmer is not forced to explicitly take the return value and use it, then these two derived classes cannot be used
interchangeably. Even worse: in the case of FunctionalPaths, a call that discards the return value has no effect at all, so the result
will be totally wrong (the OP author's problem).
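The difference can be reproduced today without any annotation. The following is only a sketch under my own assumptions: the classes are backed by a plain string array, the index is a size_t rather than an int, and the two helper functions are hypothetical names introduced for the demonstration.

```d
abstract class AbsPaths {
    protected string[] items;
    this(string[] items) { this.items = items; }
    abstract AbsPaths remove(size_t i);
    size_t length() const { return items.length; }
}

class ImperativePaths : AbsPaths {
    this(string[] items) { super(items); }
    // Mutates `this` and returns it.
    override AbsPaths remove(size_t i) {
        items = items[0 .. i] ~ items[i + 1 .. $];
        return this;
    }
}

class FunctionalPaths : AbsPaths {
    this(string[] items) { super(items); }
    // Leaves `this` untouched and returns a new object.
    override AbsPaths remove(size_t i) {
        return new FunctionalPaths(items[0 .. i] ~ items[i + 1 .. $]);
    }
}

// The call site that discards the return value.
size_t lengthAfterDiscardedRemove(AbsPaths paths) {
    paths.remove(0);
    return paths.length;
}

// The call site that receives the return value.
size_t lengthAfterUsedRemove(AbsPaths paths) {
    paths = paths.remove(0);
    return paths.length;
}

void main() {
    assert(lengthAfterDiscardedRemove(new ImperativePaths(["a", "b"])) == 1);
    assert(lengthAfterDiscardedRemove(new FunctionalPaths(["a", "b"])) == 2); // silently unchanged
    assert(lengthAfterUsedRemove(new ImperativePaths(["a", "b"])) == 1);
    assert(lengthAfterUsedRemove(new FunctionalPaths(["a", "b"])) == 1);
}
```

Only the call site that receives the return value behaves the same for both implementations, which is the consistency the transitive rule below is meant to enforce.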

@mustuse propagation: transitive closure

In the following class inheritance tree:

----------Base-------
|         |         |
Derived1  Derived2  Derived3 <-user only manually marked Derived3.remove() as @mustuse here
|         |         |
GrandDr1  GrandDr2  GrandDr3
|
...

If the programmer only manually marked Derived3.remove() as @mustuse, then the .remove() method of every class in the tree (Base, Derived1, Derived2,
Derived3, GrandDr1, GrandDr2, GrandDr3, ...) will become @mustuse (as seen by the compiler internally).

With the transitive closure @mustuse rule:

Pros:

  1. the method interface is consistent, e.g. at any call-site of AbsPaths.remove(i), its return value must be received.
    The programmer only needs to remember one interface, instead of looking through docs/code for all the branches in the
    inheritance tree and checking, for a particular class, whether its remove(i) method is @mustuse or not.
  2. the programmer can easily switch between different derived implementation classes of AbsPaths to maximize efficiency /
    performance, without worrying about potential breakage.

Cons:

  1. a few more key-strokes at every call-site of a @mustuse-marked method.

Prior work

Paul Backus's proposal:

> In other words, you cannot introduce @mustuse in a derived class; it must be present in the base class.
>
> This is somewhat problematic for D, because we have a universal base class, Object, and neither it nor any of its methods are @mustuse.

His reasoning is logically correct by itself, given the constraint that legacy code is un-modifiable.
However, his proposal is not very useful because of that constraint:

  1. it won't help existing library code where there is no @mustuse present today. But this library code is where the programmer
    wants the compiler's help most, as demonstrated by the OP user who brought up this issue on the forum. (This is also why Paul talked about
    Object, although it's not a very good example; instead we can think about the std.lib.AbsPaths example above.)
  2. if the new rule has to fully honor the legacy (and deficient) code, how can we improve future D?
    Also, honoring legacy code does not mean we should not even check for potential problems.
    Even though we typically cannot modify the legacy code, at least we want the compiler to help check where the potential problems
    are; the compiler can give warning messages, and if they are manually verified, these findings should be formally
    logged as bugs and fixed in the next release.
  3. and if we follow this line of reasoning, it also raises the question of whether one can remove @mustuse in a derived class.
    E.g. let ImperativePaths (mutable implementation) inherit
    from FunctionalPaths (immutable implementation), so that the caller can just use the `this` object as the result of the computation. Again,
    this logic is correct by itself, but it makes the whole code base brittle: what if the library author decides one day that s/he wants to
    modify the class ImperativePaths again to implement another immutable implementation?

Introduce two variants: @mustuse and @mustuse_remedy_legacy

  1. @mustuse_remedy_legacy: this annotation allows the programmer to flag existing legacy library code that s/he has no
    right to change, so that the compiler can help find violations; the compiler only outputs warnings
    instead of errors. This annotation will be implicitly propagated by the compiler throughout the whole inheritance tree.

  2. @mustuse: proper, the default. For the programmer to use in new code, or in library code s/he can change (from the root of the
    inheritance tree). Violations of this annotation will cause a compiler error. This annotation must be explicit (just like the keyword
    override).

and:

a) in both cases, this function property will be a transitive closure, i.e. it is propagated upwards and downwards, in all directions.

b) in both cases, removing the annotation is not allowed in the derived class if any superclass method carries such an annotation.

Actually this rule is very simple: there is only one consistent interface for any method, and @mustuse is part of that method interface;
for legacy code, @mustuse_remedy_legacy will cause the compiler to generate a warning message, and for new code @mustuse will cause
the compiler to generate an error message. That's all.

@mustuse_remedy_legacy for legacy code base and pre-built binaries

Let's revisit the motivating example:

paths.remove(i); // compiles fine but does nothing
paths = paths.remove(i); // works - what I erroneously thought the previous line was doing

Suppose the type of paths is AbsPaths, and the programmer has no right to modify it (e.g. it is in std.lib, or even in a pre-built binary);
then the compiler can only issue warning (instead of error) messages. But the programmer must be made aware of where
these potential problems are located.

And, once the programmer has discovered one such misuse problem, s/he can try to find and fix all such potential problems by defining
a helper class RemedyPaths as follows:

// class AbsPaths is located in a source file that the programmer has no right to modify
class RemedyPaths : std.lib.AbsPaths {  // helper class to trace all occurrences of the @mustuse violation
  @mustuse_remedy_legacy
  override AbsPaths remove(int i) { return null; }
}

and the compiler will find all occurrences of the same issue in the code visited by the compilation, and issue
warnings (not errors), so the programmer has a chance to visit all the code locations where the return value of remove(i)
is discarded.

Informative detailed compiler warning messages

Since this kind of annotation is a transitive closure by design, some markings are made implicitly by the
compiler. As a debugging aid, the warning message should indicate the originating source of the annotation
to make it clear to the programmer, e.g.:

warning: std.lib.foo.d:123, AbsPaths.remove(int i)'s return value is discarded, violates the originating
annotation from user.codebase.RemedyPaths.d:456 @mustuse_remedy_legacy.

Summary

With universal (i.e. transitive closure) @mustuse,

  1. when the library author has decided to return a value from a function, it typically represents the computation result or a status report,
    which the function caller should either use or check instead of discarding. That is good engineering practice. It forces the programmer
    to pay attention to the returned value, instead of assuming the semantics of the function based e.g. purely on the function name.

    ResultType result = someFunction();
    ... // use or check `result`

    It will save lots of debugging time, at the expense of just a few more key-strokes.

  2. the library author can implement both ImperativePaths and FunctionalPaths, and the library users can choose
    between them interchangeably for maximal efficiency / performance without worrying about code breakage.
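For comparison, here is a minimal sketch of how a "must be received" rule can be approximated today with the existing type-level @mustuse from core.attribute (the attribute introduced by DIP 1038), assuming a compiler recent enough to ship it. RemoveStatus, removeAt and demo are hypothetical names invented for this illustration; this is a workaround through the return type, not the function-level annotation proposed in this document.

```d
import core.attribute : mustuse;

// Hypothetical status type; @mustuse on the type makes discarding a value of it an error.
@mustuse struct RemoveStatus {
    bool ok;          // whether an element was removed
    size_t remaining; // number of elements left afterwards
}

// Removes the i-th element in place and reports what happened.
RemoveStatus removeAt(ref int[] items, size_t i) {
    if (i >= items.length)
        return RemoveStatus(false, items.length);
    items = items[0 .. i] ~ items[i + 1 .. $];
    return RemoveStatus(true, items.length);
}

size_t demo() {
    int[] items = [10, 20, 30];
    // removeAt(items, 0);                 // rejected: a value of a @mustuse type is ignored
    RemoveStatus st = removeAt(items, 0);  // accepted: the result has a receiver
    cast(void) removeAt(items, 99);        // accepted: explicit, visible discard
    return st.ok ? st.remaining : items.length;
}

void main() {
    assert(demo() == 2);
}
```

Because the attribute lives on the type rather than on the method, it applies uniformly to every function returning that type, so the inheritance questions discussed above do not arise; the cost is that the return type has to be a dedicated struct.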

History: command query separation principle

Not discarding function return values has its roots in the command query separation principle.
As an informal exercise: let's derive the command query separation principle from DbC (Design by Contract).

The contract in DbC mostly shows up as assertions in the code.

The programmer can insert assertions at any place in the code without changing the code's original semantics (i.e. the behavior when the assertions are turned off, e.g. in release mode). In an OOP language, most of the time the assertions are checking some properties of an object, hence any method that can be called in an assertion must be a query (i.e. a query can be called on an object any number of times without changing that object's internal state). So now we have query methods.

But we do need to change an object's state in imperative programming, and the methods that do so are classified as commands. After changing an object's state, the command must NOT return any value. Why? Because otherwise the programmer may accidentally call that command and check the returned value in some assertion ... then you know what happens in the end: the program behaves differently when assertions are turned on in debug mode and off in release mode.

Therefore, we have this:

> every method should either be a command that performs an action, or a query that returns data to the caller, but not both.
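The assertion hazard described above can be sketched in a few lines of D. Stack, pushAndCount, mixedStyle and separatedStyle are hypothetical names used only for this illustration.

```d
import std.stdio : writeln;

class Stack {
    private int[] items;

    // Command and query mixed: mutates the stack AND returns a value.
    size_t pushAndCount(int x) { items ~= x; return items.length; }

    // Command: mutates the stack, returns nothing.
    void push(int x) { items ~= x; }

    // Query: returns data, never mutates.
    size_t count() const { return items.length; }
}

size_t mixedStyle() {
    auto s = new Stack;
    // The state change lives inside the assertion, so it exists only when assertions are compiled in.
    assert(s.pushAndCount(42) == 1);
    return s.count;
}

size_t separatedStyle() {
    auto s = new Stack;
    s.push(42);            // the command always runs
    assert(s.count == 1);  // the assertion only calls a query
    return s.count;
}

void main() {
    writeln("mixed:     ", mixedStyle());
    writeln("separated: ", separatedStyle());
}
```

In a default build both functions should report one element. When assertions are compiled out (for example with dmd's -release switch), the expression inside the assert in mixedStyle is no longer evaluated, so the push never happens and the two results diverge, while separatedStyle is unaffected.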

References:

