June 15, 2016
On Wednesday, 15 June 2016 at 07:16:18 UTC, Walter Bright wrote:
> On 6/14/2016 9:48 PM, tsbockman wrote:
>> `Int`? `Base`?
>
> 'Integer' would work fine.

`BaseInt`?

`SmartInt!Integer` looks weird to me, because of the repetition. Also, if we're going to use a long name like that I think it should hint at the fact that the true base type of a checked integer type is always a basic type, not a custom type like `SmartInt` or `BigInt`.

> Besides, the signature will be the actual type name, not the alias for it.

I'm not sure what you're getting at here?

I'm concerned about how stuff like this appears in the docs (which is the only place where the user will ever see this name). Obviously the docs must use the alias, since they are for the generic type, rather than a specific instantiation.

>> I have seen all of `S`, `F`, `G`, `X`, `R`, `C`, `A`, and `B` used as template
>> parameter names in Phobos. Often there is no particular significance to the
>> letter chosen, but the purpose of the parameter is obvious from the context,
>> anyway.
>
> 1. Each one should be evaluated on its own merits.
> 2. Style is not something cast in stone, we try to constant evolve better ways.

Obviously I don't think your way is better.  :-)

> 3. Bad practice in one case is not a rationale to use bad practice elsewhere :-)

But when a practice works out fine in one case, that *is* evidence that it is likely to work out in others.

> 4. You mentioned greppability - 'N' is as ungreppable as it gets!

Which is irrelevant, because `N`/`BaseIntegralType`/whatever will never appear in the user's code (but `bscal` will).

>> Using short template parameter names helps keep D's already-very-long signatures
>> from growing even longer than they already are.
>
> The parameter names don't appear in the signature.

They appear in the docs. I'm talking about the docs, because that's the only place anyone outside the Phobos dev team will ever see the `N`/`Integer`/whatever name.

>> Again, this needs to be short for readability and usability. Normally it's not
>> needed at all, but when it is needed (like in the implementation of `SmartInt`
>> and `SafeInt`), it tends to be needed *a lot*.
>
> Internal to the implementation does not require it to be short.

Keeping names that don't need to be long short(ish) makes it a lot easier for me to read code.

Also, in practice people usually choose convenience over safety. If writing an algorithm using `checkedint` instead of the basic integer types makes the algorithm twice as long, the vast majority of people simply won't use `checkedint`.

For me, there is no point to pursuing inclusion of `checkedint` in Phobos if doing so requires making the API so verbose that no one will want to use it. (Which is not to say that `.bscal` => `.integer` is such a problem.)

June 15, 2016
On Wednesday, 15 June 2016 at 07:17:56 UTC, Walter Bright wrote:
> On 6/14/2016 11:16 PM, tsbockman wrote:
>> On Wednesday, 15 June 2016 at 00:16:12 UTC, Walter Bright wrote:
>>> Remove all use of 'you' and 'your' from the documentation.
>>
>> Done.
>
> I hope you like the results, and are not doing it just because I asked.

Definitely the latter.  :-)

I have heard such rules before, and they seem silly to me, personally. But, it's not a big deal either way, so I changed it.

June 15, 2016
> PR: https://github.com/dlang/phobos/pull/4407
> DUB: http://code.dlang.org/packages/checkedint

Thanks for this work. Documentation can be seen here:

http://dtest.thecybershadow.net/artifact/website-7cc6e938154f90aa49fa6451a85b13e35ab2de99-1cfc1d833df5be7c570307da6f7bf5eb/web/phobos-prerelease/std_experimental_checkedint.html

I think there are a few considerable issues with the proposal, but also that all are fixable. Here are a few notes:

* The opening documentation states this proposal is concerned with adding checking capabilities to integral types. It is a full-blown package with 6 modules totaling 4690 lines as wc counts. (For comparison: std.experimental.allocator, which offers many facilities and implements a number of difficult structures and algorithms, has 11831 lines.) That's a high budget and a high imposition on users for adding checks to integral types. Such extensive style of coding goes against the D style we want to promote. Also it is worrisome that one type wasn't enough (in spite of extensive parameterization with policies) and two are needed, with subtle differences between them that need to be summarized in a table. The budget I'd establish for this is one parameterized type in one module of manageable size. Several parameterized types would be okay if they characterize distinct abstractions and use the same backend. Anything else is an indication of a design run awry.

* std.experimental.checkedint says "Normally this module should not be imported directly." -> this doesn't bode well. I suspect that's also the case for std.experimental.checkedint.flags. Modules should be import units.

* Naming: "Safe" has a different meaning throughout D and Phobos. Using it here is liable to confuse.

* Naming: The name of a type ("Int") in a name is often a red flag, and indeed here it's redundant. One should be able to say Smart!int, not stutter SmartInt!int (or Checked!int or whatever). Also, this suggests that other types should be considered, how about Smart!bool and Smart!double?

* One of the first things I looked for was establishing bounds for numbers, like Smart!(int, 0, 100) for percentage. For all its might, this package does not offer this basic facility, and from what I can tell does not allow users to enforce it via policies.

* Naming: I have no idea what "N bscal;" means.

* Documentation suddenly mentions isFixedPoint without having discussed it. Does this package do something about fixed-point numbers?

* Not sure about the value of wrapping bsf, bsr, ilogb and probably others. Unlike overflow etc,. calling them with the appropriate values is trivial on the user side. A completeness argument could be made but all of these wrappers are weight of little value.

* Typos: "equiavlents"

* Documentation uses "you" in several places, colloquialism that should be avoided.

* The idea of a loose federation of smart operators that sanitize the result of usual arithmetic operators is valuable. The idea of using different imports for the same name depending on design is clever, but can be simplified. Define functions such as enforcedOp, assertedOp (you don't need "unary" and "binary", they can be overloads), and then:

import std.checkedint : smartOp = enforcedOp;
...
auto x = smartOp!"+"(a, b);

* Not sure why divPow2 is needed, why not some enforcedOp!"<<" etc. Same about pow, why not enforcedOp!"^^"?

* Getting to the design: the root of the problem is a byzantine design that is closed to extension. The current definition goes:

struct SafeInt(N, IntFlagPolicy _policy, Flag!"bitOps" bitOps = Yes.bitOps);

Looking at the IntFlagPolicy, it offers three canned behavior: throws, asserts, and noex. Users cannot customize behavior and there is no information passed into the policy (e.g. the operands in case of overflow, or the numerator in case of division by zero). Passing the appropriate information would make it possible to implement various policies, e.g. the policy cannot say "I'm actually okay with this" or "here I'll implement a saturation policy".

A better design would use the policy as a last-resort hook, allowing it to substitute its own answer for the result if it can't be computed safely. For example, consider addition:

    bool overflow;
    auto result = myAdd(payload, rhs.payload, overflow);
    if (!overflow) return result;
    return onOverflow(payload, rhs.payload);

The onOverflow function is a policy, which allows user code to exactly define what should happen (including all current policies). The same goes about all other hooks - multiplication, division, power etc. Each should be configurable.

* I see little value in free functions such as e.g. abs() because they are trivial one-liners. I understand the need for completeness, but it seems a good aspiration for consistency is being marred by a bunch of code pulp that really does nothing interesting. Probably not worth it.

===

This suggests a much simpler design that uses DbI to choose behaviors in a flexible manner. The definition would go:

struct Checked(T, Hook, T min = T.min, T max = T.max) if (...)
{
    // state {
    static if (stateSize!Hook > 0) private Hook hook;
    else private alias hook = Hook;
    private T payload;
    // }
    ...
}

The struct Hook may have no member at all, in which case Checked!T will be as close to plain T as possible. Things become interesting when Hook defines optional methods that can be detected by introspection, such as onOverflow, onDiv0, onOutOfRangeCmp, etc. - essentially all hooks that may be ever needed. Whenever an operation cannot complete with the right behavior, the appropriate hook is statically introspected; if nonexistent, fine, return with the builtin behavior. If the hook does exist, pass responsibility to the hook along with enough information to actually let the hook supplant the computation.

The hook may have state (e.g. hysteresis, NaN flag, error state, etc) so that's why it may be embedded within Checked.

The only remaining matter is to implement a few preexisting policies (Hook implementations) to implement typical choices (such as the ones present today), and the core algorithms for doing bounded operations. The most interesting algorithms are for computing the bounds of operations such as |, &, and ^. The D compiler needs those as well, and currently implements them incorrectly. I have these in my mind and I can help with those.


Thanks,

Andrei

June 15, 2016
On 6/15/2016 9:40 AM, Andrei Alexandrescu wrote:
> Looking at the IntFlagPolicy, it offers three canned behavior: throws, asserts,
> and noex. Users cannot customize behavior and there is no information passed
> into the policy (e.g. the operands in case of overflow, or the numerator in case
> of division by zero). Passing the appropriate information would make it possible
> to implement various policies, e.g. the policy cannot say "I'm actually okay
> with this" or "here I'll implement a saturation policy".

Just a note to add to these excellent comments. One style of integer arithmetic is "saturation" arithmetic, whereby if it overflows, instead of an error it gets "stuck" at T.max. Same for underflow.

  https://en.wikipedia.org/wiki/Saturation_arithmetic

A good test for a policy API design is if saturation behavior could be done with a user-defined policy, instead of it being a pre-defined one.
June 15, 2016
On Wednesday, 15 June 2016 at 16:40:19 UTC, Andrei Alexandrescu wrote:
> Thanks for this work.
> [...]
> I think there are a few considerable issues with the proposal, but also that all are fixable.

Your message was very long, so for the moment I'm going to filter it down to just the high-level design criticism. (The rest is unimportant unless/until we reach consensus on the design, anyway.)

> * The opening documentation states this proposal is concerned with adding checking capabilities to integral types. It is a full-blown package with 6 modules totaling 4690 lines as wc counts. (For comparison: std.experimental.allocator, which offers many facilities and implements a number of difficult structures and algorithms, has 11831 lines.) That's a high budget and a high imposition on users for adding checks to integral types. Such extensive style of coding goes against the D style we want to promote.
> [...]
> The budget I'd  establish for this is one parameterized type in
> one module of manageable size. Several parameterized types would
> be okay if they characterize distinct abstractions and use the same backend. Anything else is an indication of a design run awry.

This can be summarized as, "It's too big and complicated."

`checkedint` as it stands is, I believe, fairly close to the smallest implementation possible in D today, within the constraints of the features demanded by the community in past discussions, coupled with my high-level design goals.

If you want something shorter or simpler, you will have to cut features or compromise the design in other ways. (Or improve the D language to facilitate a more elegant design.)

Some features and design goals that combine to motivate my design:

   1) Checked types must signal an error (somehow) whenever their behaviour
      deviates from that of an ideal mathematical integer.

   2) It should be possible to recover from errors - using `assert(false)` or
      a deliberate divide-by-zero to crash the program is bad design unless
      the condition that triggers it is never supposed to happen, ever.

   3) Performance (with respect to both speed and memory use) should be as
      close as possible to that of the built-in machine integer types.

   4) The API should minimize verbosity and ceremony, because otherwise hardly
      anyone will use it - people generally prefer convenience over safety.

   5) Writing generic code that works correctly with both checked and unchecked
      types must be easy.

   6) The API must make safe usage easy, and (accidental) unsafe usage hard,
      because people generally don't pay much attention to the docs (even if
      they're good). A false sense of security is worse than none at all.

   7) The API must be usable in `nothrow @nogc` code.

   8) The number of distinct template instantiations generated in natural use
      must be finite, to prevent excessive combinatorial explosion when
      checked types are used in public APIs. (Templates that are just aliases,
      and small functions that always inline don't count against this.)

> Also it is worrisome that one type wasn't enough (in spite of extensive
> parameterization with policies) and two are needed, with subtle differences
> between them that need to be summarized in a table.

The reason for the `SmartInt` versus `SafeInt` split is that with D's current semantics
(4) and (5) conflict.

`SmartInt` prioritizes (4); `SafeInt` and `DebugInt` prioritize (5).

> Getting to the design: the root of the problem is a byzantine design that is closed to extension.

The design was closed deliberately because of (8). Template bloat is a major concern, even with the current finite design.

I want `checkedint` to be usable in public APIs, and that requires some standardization of error handling and base types to be enforced upon the users. Otherwise, everyone will choose something different and all template instantiations involving integer types will become practically single-use.

> Looking at the IntFlagPolicy, it offers three canned behavior: throws, asserts, and noex.

The choice of policies is motivated by the natural incompatibility of (2), (4), (6), and (7). I built in enough variety to allow people to choose their own priorities among those goals, and no more because of (8).

> * One of the first things I looked for was establishing bounds for numbers, like Smart!(int, 0, 100) for percentage. For all its might, this package does not offer this basic facility, and from what I can tell does not allow users to enforce it via policies.

Here you are suggesting adding even more complexity to a design that you have already de-facto rejected as overly complex. As discussed earlier in this very thread, I studied adding support for arbitrary bounds and decided not to pursue that right now because implementing it in a performant way would greatly increase the complexity of `checkedint`, and make the template bloat problem much worse.

> Also, this suggests that other types should be considered, how about Smart!bool and Smart!double?

Neither `bool` nor `double` has the kind of severe-but-fixable safety and correctness issues that the machine integer types do, which motivates the `checkedint` design. No doubt someone can make up some sort of meaning to attach to those symbols, but it will most likely have nothing to do with `SmartInt`.

> * Not sure why divPow2 is needed, why not some enforcedOp!"<<"

Because (although similar) a bit shift is actually semantically different than dividing or multiplying by a power of two. The bit shift implies different rules for rounding and overflow.

I realized in testing that even for `SmartInt`, the bit shift semantics are still useful sometimes, and decided that it was better not to confuse the two. A `smartOp` version of the bit shifts is necessary because the built-in bit shifts have some undefined behaviour that needs to be fixed.

> etc. Same about pow, why not enforcedOp!"^^"?

`pow()` exists as a free function to satisfy (5), and because the `^^` and `^^=` operators both have language-level bugs that currently make their use incompatible with (6).

> I see little value in free functions such as e.g. abs() because they are trivial one-liners. I understand the need for completeness, but it seems a good aspiration for consistency is being marred by a bunch of code pulp that really does nothing interesting. Probably not worth it.

`checkedint`-aware versions of functions like `abs()` are necessary to satisfy (4) and (6) together.

> The hook may have state (e.g. hysteresis, NaN flag, error state, etc) so that's why it may be embedded within Checked.

The early iterations of `checkedint` worked this way (although I had no plans to support user-defined hooks). I implemented and debugged it, and thought it was about ready to submit many months ago.

Then I actually tried *using* it, and hated it. The problem with using a NaN state, is that in nothrow code you have to manually check it before *every* call to an external (non-`checkedint`-aware) function, or you may accidentally lose it.

As a result, safe NaN-based APIs violate (4), while concise APIs violate (6). The IEEE-inspired sticky flags feature is my solution to this problem, and it is far more pleasant to work with in practice - as well as faster and more memory efficient.

Meanwhile, in code that allows exceptions, there is no reason to pay the speed and memory penalty of carting around the NaN state to check later - just throw on the spot whenever something goes wrong.
June 15, 2016
On 06/15/2016 02:26 PM, Walter Bright wrote:
> On 6/15/2016 9:40 AM, Andrei Alexandrescu wrote:
>> Looking at the IntFlagPolicy, it offers three canned behavior: throws,
>> asserts,
>> and noex. Users cannot customize behavior and there is no information
>> passed
>> into the policy (e.g. the operands in case of overflow, or the
>> numerator in case
>> of division by zero). Passing the appropriate information would make
>> it possible
>> to implement various policies, e.g. the policy cannot say "I'm
>> actually okay
>> with this" or "here I'll implement a saturation policy".
>
> Just a note to add to these excellent comments. One style of integer
> arithmetic is "saturation" arithmetic, whereby if it overflows, instead
> of an error it gets "stuck" at T.max. Same for underflow.
>
>    https://en.wikipedia.org/wiki/Saturation_arithmetic
>
> A good test for a policy API design is if saturation behavior could be
> done with a user-defined policy, instead of it being a pre-defined one.

Also, the proposed design mentions "hysteresis" and allows Hook to introduce state. -- Andrei
June 15, 2016
On Wednesday, 15 June 2016 at 16:40:19 UTC, Andrei Alexandrescu wrote:
> I think there are a few considerable issues with the proposal, but also that all are fixable.

I already sent a much longer message detailing some of the reasons why I believe my design is sensible. But, before we continue this discussion much further, I need to stop and make one thing clear:

What you are proposing is *not* "fixing" my design - it is basically scrapping it and replacing it with a ground-up rewrite, with perhaps some bits and pieces and general inspiration taken from my work.

I'm OK with the community rejecting my design if that's what people really want to do, but I will not be implementing your proposal myself. I will simply leave `checkedint` on DUB, and move on with my life.

So, I'm putting out a general call now for the community to download the DUB package, try actually *using* it for something, and speak up as to whether the design seems good, or not.

If the decision is made to accept the high-level design, then we can go back to bikeshedding about names, fixing typos, tweaking/trimming the API, etc.

June 15, 2016
On Wednesday, 15 June 2016 at 18:50:32 UTC, tsbockman wrote:
> What you are proposing is *not* "fixing" my design - it is basically scrapping it and replacing it with a ground-up rewrite, with perhaps some bits and pieces and general inspiration taken from my work.
> [...]
> If the decision is made to accept the high-level design, then we can go back to bikeshedding about names, fixing typos, tweaking/trimming the API, etc.

I should mention that one possible "trimming" would be to remove either `SmartInt`/`smartOp` or `SafeInt`/`safeOp`. I don't *want* to do this, obviously, but I wouldn't categorize it as "scrapping the design", either.

June 15, 2016
On 15.06.2016 18:40, Andrei Alexandrescu wrote:
>
> The only remaining matter is to implement a few preexisting policies
> (Hook implementations) to implement typical choices (such as the ones
> present today), and the core algorithms for doing bounded operations.
> The most interesting algorithms are for computing the bounds of
> operations such as |, &, and ^. The D compiler needs those as well, and
> currently implements them incorrectly. I have these in my mind and I can
> help with those.

https://github.com/tgehr/d-compiler/blob/master/vrange.d#L338
June 15, 2016
On 06/15/2016 05:32 PM, Timon Gehr wrote:
> On 15.06.2016 18:40, Andrei Alexandrescu wrote:
>>
>> The only remaining matter is to implement a few preexisting policies
>> (Hook implementations) to implement typical choices (such as the ones
>> present today), and the core algorithms for doing bounded operations.
>> The most interesting algorithms are for computing the bounds of
>> operations such as |, &, and ^. The D compiler needs those as well, and
>> currently implements them incorrectly. I have these in my mind and I can
>> help with those.
>
> https://github.com/tgehr/d-compiler/blob/master/vrange.d#L338

Nice! On first glance (number of steps :o)) these looks like what I had in mind. Any chance you could make PRs for integrating these into the compiler? Thanks! -- Andrei