Thread overview
DIP: @mustuse as function return value annotation
Aug 11
mw
Aug 12
mw
Aug 12
mw
Aug 12
mw
Aug 19
user1234
August 11

https://github.com/mw66/mustuse/blob/main/mustuse.md

DIP: @mustuse as function return value annotation

There are two scenarios we need to handle:

  • existing legacy library code, which the programmer has no right to modify
  • new code, which the programmer has full control from the root of the inheritance tree

@mustuse is neither covariant nor contravariant, it is invariant!

@mustuse as a function attribute only enforces there must be a receiver of the function call result, and do not throw away
the returned result. It has nothing to do with the (sub-)types of the returned value.

Let's consider the following example:

abstract class AbsPaths {
  // no annotation
  abstract AbsPaths remove(int i);  // return an AbsPaths object with i-th element removed
}

class ImperativePaths : AbsPaths {  // will modify `this` object itself
  // no @mustuse annotation, with the implicit assumption that the caller will continue to use `this` object
  override AbsPaths remove(int i) {
    ... // remove the i-th element of `this` object
    return this;
  }
}

class FunctionalPaths : AbsPaths {  // will NOT modify `this` object, but return new (separate) object on modification
  @mustuse  // should have this annotation
  override AbsPaths remove(int i) {
    AbsPaths other = new ImmutablePaths();
    ...  // `other` is the object with the i-th element of `this` object being removed; `this` object did not change
    return other;
  }
}

void main() {
  AbsPaths paths = new ImperativePaths();  // or FunctionalPaths() interchangeably
  AbsPaths shortPaths = paths.remove(i);   // and this line should always work, as long as the return value is not discarded
}

Here both ImperativePaths and FunctionalPaths inherit from AbsPaths, if one branch has @mustuse and the other one
does not, and the programmer is not forced to explicitly take the return value and use it, these two derived classes cannot be used
interchangeably. And even worse: in the case of FunctionalPaths, the result will be totally wrong
(the OP author's problem).

@mustuse propagation: transitive closure

In the following class inheritance tree:

----------Base-------
|         |         |
Derived1  Derived2  Derived3 <-user only manually marked Derived3.remove() as @mustuse here
|         |         |
GrandDr1  GrandDr2  GrandDr3
|
...

If the programmer only manually marked Derived3.remove() as @mustuse, then everything else (Base, Derived1, Derived2,
Derived3, GrandDr1, GrandDr2 GrandDr3, ...)'s .remove() method will all become @mustuse (as seen by the compiler internally).

With transitive closure @mustuse rule,

Pros:

  1. the method interface is consistent, e.g. at any call-site of AbsPaths.remove(i), its return value must be received.
    the programmer only need to remember one interface, instead of looking through docs/code for all the branches in the
    inheritance tree, and check for a particular class, whether its remove(i) method is @mustuse or not.
  2. the programmer can easily switch between different derived implementation classes of AbsPaths to maximize efficiency /
    performance, without worrying about potential breakage.

Cons:

  1. a few more key-strokes on every call-site of @mustuse marked method.

Prior work

Paul Backus proposal:

>

In other words, you cannot introduce @mustuse in a derived class; it must be present in the base class.

This is somewhat problematic for D, because we have a universal base class, Object, and neither it nor any of its methods are @mustuse.

His reasoning is logically correct by itself, with the constraint that legacy code is un-modifiable.
However, his proposal is not very useful because of the constraint:

  1. it won't help the existing library code where there is no @mustuse presence today. But, these library code are WHERE the programmer
    want the compiler help most
    , as demonstrated by the OP user who brought up this issue on the forum. (This is also why Paul talked about
    class Object, although it's not a very good example; instead we can think about std.lib.AbsPath example above).
  2. if the new rule have to fully honor the legacy (with deficiency) code, how we can improve for future D?
    Also honoring legacy code, does not mean we should not even check for potential problems.
    Even typically we cannot modify the the legacy code, at least we want the compiler help to check where are the potential problems
    are; the compiler can give warning messages, and if they are manually verified, these findings should be formally
    logged as bugs, and be fixed in the next release.
  3. and if we follow this line of reasoning, it also beg the question: whether one can remove @mustuse in a derived class.
    E.g. let ImperativePaths (mutable implementation) inherit
    from FunctionalPaths (immutable implementation), and the caller can just use the this object as the result of the computation. Again,
    this logic is correct by itself, but it make the whole code base brittle: what if the library author decided later one day that s/he want to
    modify the class ImperativePaths again to implement another immutable implementation?

Introduce two variants: @mustuse and @mustuse_remedy_legacy

  1. @mustuse_remedy_legacy: this annotation is to allow programmer flag existing legacy library code that s/he has no
    right to change, but want to set a flag and let the compiler to help to find violations; the compiler only output warnings
    instead of errors. This annotation will be implicitly propagated by the compiler throughout the whole inheritance tree.

  2. @mustuse: proper, the default. For programmer to use in new code, or library code s/he can change (from the root of the
    inheritance tree). Violations of this annotation will cause compiler error. This annotation must be explicit (just as the keyword
    override).

and:

a) in both cases, this function property will be transitive closure, i.e. be propagated upwards and downwards, in all directions.

b) in both cases, removing the annotation is not allowed in the derived class if any supper class method carry such annotation.

Actually this rule is very simple: there is only one consistent interface for any method, @mustuse is part of that method interface;
for legacy code, @mustuse_remedy_legacy will cause compiler to generate warning message, and for new code @mustuse will cause
the compiler to generate error message. That's all.

@mustuse_remedy_legacy for legacy code base and pre-built binaries

Let's revisit the motivating example:

paths.remove(i); // compiles fine but does nothing
paths = paths.remove(i); // works - what I erroneously thought the previous line was doing

Suppose the paths' type is AbsPaths, and the programmer have no right to modify it (e.g. in std.lib, or even pre-built binaries),
the compiler can only issue warning (instead of error) messages. But the programmer must be made aware of where
these potential problems are located.

And, once the programmer discovered one such misuse problem, s/he can try to find and fix all such potential problems by defining
a helper class RemedyPaths as follows:

// class Paths is located in the source file that the programmer has no right to modify
class RemedyPaths : std.lib.AbsPaths {  // helper class to trace all the occurrences of the @mustuse violation
  @mustuse_remedy_legacy
  override Paths remove(int i) {return null;}
}

and the compiler will find out all the occurrences of the same issues in the code visited by the compilation, and issue
warnings (not errors), so the programmer have a chance to visit all the code locations where remove(i)'s function
return value are discarded.

Informative detailed compiler warning messages

Since this kind of warning message is transitive closure by design, so for the implicit markings made by the
compiler: as a debugging aid the warning message should indicate the originating source of the annotation
to make it clear to the programmers, e.g.:

warning: std.lib.foo.d:123, AbsPaths.remove(int i)'s return value is discarded, violates the originating
annotation from user.codebase.RemedyPaths.d:456 @mustuse_remedy_legacy.

Summary

With universal (i.e. transitive closure) @mustuse,

  1. when the library author has decided to return a value from a function, typically it's represent the computation result or status report,
    which the function caller should either use or check instead of discard. That is good engineering practice. It forces the programmer
    to pay attention to the returned value, instead of assuming the semantics of the function e.g. based purely on the function name.
  ResultType result = someFunction();
  ... // use or check `result`

it will save lots of debugging time, at the expense of just a few more key-strokes.

  1. the library author can implement both ImperativePaths and FunctionalPaths, and the library users can choose
    them interchangably for the maximal efficiency / performance without worrying about code breakage.

History: command query separation principle

Not discarding function return value has its root from the command query separation principle.
As an informal exercise: let's derive command query separation principle from DbC.

The contract in DbC mostly exhibits as assertions in the code.

The programmer can insert assertions at any place of the code, without changing the code's original semantics (i.e the behavior when the assertions are turned-off, e.g. in release mode). In an OOP language, most of the time the assertions are checking some properties of an object, hence any method that can be called in an assertion must be a query (i.e a query can be called on an object for any number times without changing that object's internal state). So now we have query method.

But we do need to change an object's state in imperative programming, then those methods are classified as commands. After changing an object state, the command must NOT return any value. Why? because otherwise, the programmer may accidentally want to call that command and check the returned value in some assertions ... then you know what happens in the end: the program behaves differently when assertions are turn on in debug mode and off in release mode.

Therefore, we have this:

>

every method should either be a command that performs an action, or a query that returns data to the caller, but not both.

References:

previous discussion:

https://forum.dlang.org/post/ojciqsikubbgzkkhdbdz@forum.dlang.org

August 12
If I understand this correctly, you want the ``@mustuse`` restriction to inherit upwards?

This will not work due to the way D's compilation model works. In a process not all compiler invocations will see this upward inheritance.

The only restriction I am aware of that can work is to simply disallow casting upwards into a class/interface that doesn't have it.

August 12
On Monday, 12 August 2024 at 06:02:20 UTC, Richard (Rikki) Andrew Cattermole wrote:
> If I understand this correctly, you want the ``@mustuse`` restriction to inherit upwards?

Yes. As said:

a) in both cases (@mustuse and @mustuse_remedy_legacy), this function property will be *transitive closure*, i.e. be propagated upwards and downwards, in all directions.


> This will not work due to the way D's compilation model works. In a process not all compiler invocations will see this upward inheritance.

I have no idea how the current D's compilation model works.

To implement what is proposed above, I assume there is only one global ASTree (singleton) of all the source files (even in the context of separate compilation of each D source file), which holds all the calculated annotation from different compilation passes, and the @mustuse checker is invoked after all the annotations are fully calculated.

August 12
On 12/08/2024 8:09 PM, mw wrote:
> I have no idea how the current D's compilation model works.
> 
> To implement what is proposed above, I assume there is only one global ASTree (singleton) of all the source files (even in the context of separate compilation of each D source file), which holds all the calculated annotation from different compilation passes, and the @mustuse checker is invoked after all the annotations are fully calculated.

That works for one process invocation.

But D is multi-process invocation.

A process is built by multiple compiler executions and do not share this knowledge.

This information simply wouldn't exist for later stages of compilation.
August 12
On Monday, 12 August 2024 at 08:11:58 UTC, Richard (Rikki) Andrew Cattermole wrote:
> On 12/08/2024 8:09 PM, mw wrote:
>> I have no idea how the current D's compilation model works.
>> 
>> To implement what is proposed above, I assume there is only one global ASTree (singleton) of all the source files (even in the context of separate compilation of each D source file), which holds all the calculated annotation from different compilation passes, and the @mustuse checker is invoked after all the annotations are fully calculated.
>
> That works for one process invocation.
>
> But D is multi-process invocation.
>
> A process is built by multiple compiler executions and do not share this knowledge.
>
> This information simply wouldn't exist for later stages of compilation.

Can we write the ASTree to temporary files, or a daemon process? so the calculated global annotation can be shared by later compilation process.





August 12
On 12/08/2024 8:29 PM, mw wrote:
> On Monday, 12 August 2024 at 08:11:58 UTC, Richard (Rikki) Andrew Cattermole wrote:
>> On 12/08/2024 8:09 PM, mw wrote:
>>> I have no idea how the current D's compilation model works.
>>>
>>> To implement what is proposed above, I assume there is only one global ASTree (singleton) of all the source files (even in the context of separate compilation of each D source file), which holds all the calculated annotation from different compilation passes, and the @mustuse checker is invoked after all the annotations are fully calculated.
>>
>> That works for one process invocation.
>>
>> But D is multi-process invocation.
>>
>> A process is built by multiple compiler executions and do not share this knowledge.
>>
>> This information simply wouldn't exist for later stages of compilation.
> 
> Can we write the ASTree to temporary files, or a daemon process? so the calculated global annotation can be shared by later compilation process.

This has been discussed a lot of times in the past including for the benefit to performance.

Also no, we cannot use a daemon process for this. The multiple compiler invocations may not be on the same machine, or started by the same person.

Unfortunately the only option is to either disallow adding it to classes, or disallow casting upwards, I am unaware of any other solution to this.

August 12
On 12/08/2024 8:35 PM, Richard (Rikki) Andrew Cattermole wrote:
> On 12/08/2024 8:29 PM, mw wrote:
>> On Monday, 12 August 2024 at 08:11:58 UTC, Richard (Rikki) Andrew Cattermole wrote:
>>> On 12/08/2024 8:09 PM, mw wrote:
>>>> I have no idea how the current D's compilation model works.
>>>>
>>>> To implement what is proposed above, I assume there is only one global ASTree (singleton) of all the source files (even in the context of separate compilation of each D source file), which holds all the calculated annotation from different compilation passes, and the @mustuse checker is invoked after all the annotations are fully calculated.
>>>
>>> That works for one process invocation.
>>>
>>> But D is multi-process invocation.
>>>
>>> A process is built by multiple compiler executions and do not share this knowledge.
>>>
>>> This information simply wouldn't exist for later stages of compilation.
>>
>> Can we write the ASTree to temporary files, or a daemon process? so the calculated global annotation can be shared by later compilation process.
> 
> This has been discussed a lot of times in the past including for the benefit to performance.
> 
> Also no, we cannot use a daemon process for this. The multiple compiler invocations may not be on the same machine, or started by the same person.
> 
> Unfortunately the only option is to either disallow adding it to classes, or disallow casting upwards, I am unaware of any other solution to this.

Oh and you can also disallow overriding functions that don't have it, and then adding it. That'll also work!
August 12
On Monday, 12 August 2024 at 08:36:36 UTC, Richard (Rikki) Andrew Cattermole wrote:

>>>
>>> Can we write the ASTree to temporary files, or a daemon process? so the calculated global annotation can be shared by later compilation process.
>> 
>> This has been discussed a lot of times in the past including for the benefit to performance.
>> 
>> Also no, we cannot use a daemon process for this. The multiple compiler invocations may not be on the same machine, or started by the same person.

Well, the daemon process can be made to accept RPC calls.


>> Unfortunately the only option is to either disallow adding it to classes, or disallow casting upwards, I am unaware of any other solution to this.
>
> Oh and you can also disallow overriding functions that don't have it, and then adding it. That'll also work!

No, I don't want to compromise the design logic, just because there are some (non overcomable) implementation constraints.
August 19

On Monday, 12 August 2024 at 08:35:19 UTC, Richard (Rikki) Andrew Cattermole wrote:

>

This has been discussed a lot of times in the past including for the benefit to performance.

Also no, we cannot use a daemon process for this. The multiple compiler invocations may not be on the same machine, or started by the same person.

Unfortunately the only option is to either disallow adding it to classes,

This the obvious solution. So many problems are caused by OOP. Conctract inheritance still not fixed, const or safety attributes are not either, etc. At some point D has to recognize that. OOP has something that just cannot work with static checks.