January 17, 2019
On Thursday, 17 January 2019 at 18:11:10 UTC, H. S. Teoh wrote:
> On Thu, Jan 17, 2019 at 02:39:18AM -0800, H. S. Teoh via Digitalmars-d wrote:
>> [...]
>
> Further thoughts on empty enums in D: according to the spec, an enum always has an underlying base type, the default of which is int.  So an empty enum as declared above, if we were to hypothetically allow it, would be a subtype of int (a bottom int subtype).  Which also means we can declare:
>
> 	enum BottomStr : string { }
>
> which, AFAIK, would be treated as something distinct from the bottom int type.  So it would represent a different bottom type.
>
> You'd need a "true" bottom type to get the direct equivalent of the Bottom of type theory.
>
> Of course, we could stipulate that all empty enums are interpreted as Tbottom, which would clear up this mess -- but this is yet another area where the DIP in question failed to address.
>
>
>> [...]
> [...]
>
> And furthermore, as somebody has already pointed out, Unit as declared above would be distinct from empty structs of any other name that you might declare.  So there'd be multiple incompatible unit types. Which again is a mess, since two functions that don't return values ("procedures") could ostensibly return distinct unit types (let's say one returns void and the other returns Unit, or one returns Unit and the other returns another empty struct of a different name), and you'd have a messy incompatibility situation.
>
>
> T

The compiler could "decay" these various "unit types" to the same type (with different names).

struct EmptyStruct { }
enum EmptyEnum { }

static assert( is (EmptyStruct == TUnit) );
static assert( is (EmptyEnum == TUnit) );

At that point they just become different ways of writing the same thing, but should keep the type theory sound.
January 17, 2019
On Thursday, 17 January 2019 at 17:53:10 UTC, H. S. Teoh wrote:
> For one thing, Top would be the supertype of every other type, so you could pass in values of any type to bar:
>
> 	bar(123);
> 	bar("abc");
> 	bar(null);
> 	bar(Unit.init);
>
> Which ultimately makes sense, because we could have gotten ptr from taking the address of a float, for example, and it would be UB to assign a string to *ptr.
>
> You could cast the Top* to something else, but that gets into the realm of implementation-defined behaviour (it could be just UB), and no longer something under the auspices of type theory.

All these sound like things that would make sense with void?

For instance, I think it would make sense to call a function `int foobar(void, void)` with as `foobar(1, "abd")`, which is basically the function saying it will ignore these parameters.
January 17, 2019
On Thursday, 17 January 2019 at 18:11:10 UTC, H. S. Teoh wrote:
> On Thu, Jan 17, 2019 at 02:39:18AM -0800, H. S. Teoh via Digitalmars-d wrote:
>> [...]
>
> Further thoughts on empty enums in D: according to the spec, an enum always has an underlying base type, the default of which is int.  So an empty enum as declared above, if we were to hypothetically allow it, would be a subtype of int (a bottom int subtype).  Which also means we can declare:
>
> 	enum BottomStr : string { }
>
> which, AFAIK, would be treated as something distinct from the bottom int type.  So it would represent a different bottom type.
>
> You'd need a "true" bottom type to get the direct equivalent of the Bottom of type theory.
>
> Of course, we could stipulate that all empty enums are interpreted as Tbottom, which would clear up this mess -- but this is yet another area where the DIP in question failed to address.
>
>
>> [...]
> [...]
>
> And furthermore, as somebody has already pointed out, Unit as declared above would be distinct from empty structs of any other name that you might declare.  So there'd be multiple incompatible unit types. Which again is a mess, since two functions that don't return values ("procedures") could ostensibly return distinct unit types (let's say one returns void and the other returns Unit, or one returns Unit and the other returns another empty struct of a different name), and you'd have a messy incompatibility situation.
>
>
> T

The compiler could "decay" these various "unit types" to the same type (with different names).

struct EmptyStruct { }
enum EmptyEnum { }

static assert( is (EmptyStruct == TUnit) );
static assert( is (EmptyEnum == TUnit) );

At that point they just become different ways of writing the same thing, but should keep the type theory sound.
January 17, 2019
On Thursday, 17 January 2019 at 18:38:54 UTC, Olivier FAURE wrote:
> For instance, I think it would make sense to call a function `int foobar(void, void)` with as `foobar(1, "abd")`, which is basically the function saying it will ignore these parameters.

To expand on this, I'm understanding supertyping (at least in D and similar languages) as the *discarding* of type information. Eg, A is the supertype of B if B has more specific information about what values it can hold than A, which means A can accept more different values.

In that mindset, void seems like the logical top type, because it describes no information at all about any value it could hold.
January 17, 2019
On Thursday, 17 January 2019 at 17:17:45 UTC, Tim wrote:
> On Tuesday, 15 January 2019 at 08:59:07 UTC, Mike Parker wrote:
>> DIP 1017, "Add Bottom Type", is now ready for Final Review. This is the last chance for community feedback before the DIP is handed off to Walter and Andrei for the Formal Assessment. Please read the procedures document for details on what is expected in this review stage:
>>
>> https://github.com/dlang/DIPs/blob/master/PROCEDURE.md#final-review
>>
>> The current revision of the DIP for this review is located here:
>>
>> https://github.com/dlang/DIPs/blob/4716033a8cbc2ee024bfad8942b2ff6b63521f63/DIPs/DIP1017.md
>>
>> In it you'll find a link to and summary of the previous review round. This round of review will continue until 11:59 pm ET on January 30 unless I call it off before then.
>>
>> Thanks in advance for your participation.
>
> An advantage of the bottom type over an attribute is, that it is part of the function type. A library allowing to set a custom error handler could specify that the error handler must not return:
> alias ErrorHandler = TBottom function(string msg);
>

Great example, let's include this in the DIP.

> Here is another example for TBottom* == typeof(null): If a library type allows custom user data with a template and you don't need, you could specify TBottom.
> struct LibraryType(UserData)
> {
>     // ...
>     UserData* userData;
> }
> Since the compiler knows, that userData is always null, it could eliminate dead code. Note that it would not be possible to specify typeof(null) as the template parameter of LibraryType, because then userData would be of type typeof(null)*.

This is another good example. I'd like to point out that the Unit type would allow you to do the same thing for a struct like:

struct AnotherLibraryType(UserData)
{
    UserData userData;
}

But in your example the template uses a pointer to UserData.  In this case the only way to express that you don't want any user data is to pass the Bottom type.

To summarize, the bottom type allows you to declare a special pointer with no storage!

    TBottom*.sizeof == 0

This is also something that should be in the DIP and Walter should chime in with whether or not he thinks these semantics can actually be implemented.

January 17, 2019
Am 17.01.19 um 18:17 schrieb Tim:
> An advantage of the bottom type over an attribute is, that it is part of
> the function type. A library allowing to set a custom error handler
> could specify that the error handler must not return:
> alias ErrorHandler = TBottom function(string msg);

You could also do this with an attribute. Consider the following:

```
void bar() @system
{

}

void main()
{
    alias SafeFun = void function() @safe;
    SafeFun safeFun = &bar; // Error: cannot implicitly convert
expression & bar of type void function() @system to void function() @safe
}
```

You could do the same thing for an `@noreturn` attribute:

```
alias ErrorHandler = void function(string msg) @noreturn;
```

What I am trying to say: `@noreturn` could also be part of the type of the function.


January 17, 2019
On Thu, Jan 17, 2019 at 06:47:33PM +0000, Olivier FAURE via Digitalmars-d wrote:
> On Thursday, 17 January 2019 at 18:38:54 UTC, Olivier FAURE wrote:
> > For instance, I think it would make sense to call a function `int
> > foobar(void, void)` with as `foobar(1, "abd")`, which is basically
> > the function saying it will ignore these parameters.
> 
> To expand on this, I'm understanding supertyping (at least in D and similar languages) as the *discarding* of type information. Eg, A is the supertype of B if B has more specific information about what values it can hold than A, which means A can accept more different values.

Think of Top as the analogue of D's Object class.  Every class inherits from Object.  Conversely, being handed an Object also gives you very little information about the underlying concrete class.  Top would just be Object as applied to *all* types in the language.


> In that mindset, void seems like the logical top type, because it describes no information at all about any value it could hold.

As I've said, `void` is an overloaded keyword that has different meanings in different contexts.  In the case of `void*`, it certainly behaves like a Top type -- void* can point to literally *anything*, but it also tells you basically nothing about the type it's pointing to.

However, `void` as used in function return types means something else altogether.  A void function is *not* the same as a function that returns Object, to use the OO analogy.  A function that returns Object (resp. returns Top) is returning an actual object with actual values -- it just so happens that the function doesn't specify anything more specific about said value.  A `void` function, in the sense that it's used in C/C++ and also D, is a function that does not return a meaningful value.  In that sense, it's closer to a Unit type: the Unit type can only ever hold a single value, and since this value cannot be distinguished from any other value of the same type (there is only one value of this type, so different values do not exist), it conveys zero information and can basically be elided. I.e., the function can be said to "return nothing" or "does not return a value (but it still does return)".  This is not the same thing as "function returns a value, but it doesn't tell you what you can do with this value".

The distinction is subtle, but important.  The problem is that we're so used to `void` as it is used in C/C++/D, that we're essentially working with a defective "arithmetic" where "1" and "infinity" are both regarded as "not a number", and so we have trouble telling them apart, even though their distinction is very important!

Furthermore, `void` as used in variable declarations behaves like a Bottom type -- it's illegal to declare a variable of type Bottom because Bottom by definition does not have *any* values -- yet the existence of a variable implies that it holds a value of that type, which is a contradiction.  Hence, a function declared to return Bottom is basically a function that never returns, because if it did, you'd have a value of type Bottom, which is impossible.  D does not allow you to declare a variable of type `void`, but if indeed `void` were Top, then variables of type `void` ought to be allowed, and would behave like std.variant.Variant!  The fact that `void` variables are not allowed implies that, in the context of variable declarations, `void` means Bottom.

So you see, `void` means different things depending on the context it's used in.  Sometimes it's Top, sometimes it's Unit, sometimes it's Bottom.  Again, this is like the deficient arithmetic of the ancients: "zero" was not a number, "infinity" was not a number, and the square root of a negative number was also not a number, so all three get lumped together as "not a number".  So you'd talk about X being "not a number" or NaN, and sometimes NaN means your calculation yielded a 0, sometimes NaN means your calculation yielded infinity, and sometimes NaN means your calculation tried to take the square root of a negative number. They are all lumped together under the label of NaN, but they are actually very different things.  Confusing the different things for one another would lead to contradictions and other (seemingly) unsolvable problems.

Similarly, confusing the various different meanings of `void` just because textually they are all written as v-o-i-d will eventually lead to trouble and inconsistencies.

If we want to "fix" `void` in D, the first step would be to write these different usages differently, so that they are clearly distinct from each other. Stop calling them all "void"; call them by three distinct names so that we don't confuse ourselves as to what we actually mean. Then we can talk about completing the type system with top/bottom/unit types.

//

And BTW, it makes me very scared that people seem to think I'm some kind of expert on type theory, which I most certainly am not.  I know what I know simply by googling on the subject and reading a few articles -- which anyone could do in a couple o' hours at the most. If *that* is enough to make me an "expert" relative to the majority of the forumites here, then I really have to wonder just how much hope we have of resolving this issue in any sane way that doesn't introduce even more problems.


T

-- 
Some days you win; most days you lose.
January 17, 2019
On Thu, Jan 17, 2019 at 06:51:13PM +0000, Jonathan Marler via Digitalmars-d wrote: [...]
> To summarize, the bottom type allows you to declare a special pointer with no storage!
> 
>     TBottom*.sizeof == 0
> 
> This is also something that should be in the DIP and Walter should chime in with whether or not he thinks these semantics can actually be implemented.

This would introduce an asymmetry with the rest of the pointer types in the language which all have equal sizes, and this asymmetry will percolate down to other language constructs, causing special cases and inconsistencies everywhere.  I don't recommend doing this.

A better approach might be to make TBottom* always equal to null -- i.e., it's always illegal to dereference it because no instances of TBottom can exist.

(Of course, TBottom.sizeof would have to be either 0 or some kind of non-existent value, because instances of TBottom cannot actually exist. Based on the principle of symmetry, I'd think it should be 0, because making it an error would imply specialcasing generic code that may unknowingly be handed TBottom as a template type argument, and making it return some other value like -1 or size_t.min may cause problems in other generic code that might, for example, wish to allocate storage of the given size.  OTOH, maybe a negative .sizeof might be just the ticket to prevent generic code from trying to instantiate TBottom.  Either way, it would have to be treated specially -- which also means there's a price to be paid for adding TBottom to the language: there is no free lunch.)


T

-- 
Those who've learned LaTeX swear by it. Those who are learning LaTeX swear at it. -- Pete Bleackley
January 17, 2019
Am 17.01.19 um 19:47 schrieb Olivier FAURE:
> On Thursday, 17 January 2019 at 18:38:54 UTC, Olivier FAURE wrote:
>> For instance, I think it would make sense to call a function `int
>> foobar(void, void)` with as `foobar(1, "abd")`, which is basically the
>> function saying it will ignore these parameters.
> 
> To expand on this, I'm understanding supertyping (at least in D and similar languages) as the *discarding* of type information. Eg, A is the supertype of B if B has more specific information about what values it can hold than A, which means A can accept more different values.
> 
> In that mindset, void seems like the logical top type, because it describes no information at all about any value it could hold.

I believe this view of things results directly from the fact that there are different meanings to `void`. In particular, `void*` is a pointer to _any_ type, so it is the "top type of all pointer types". If you think of `void` in the same way, then yes, `void` would be the top type.

However this is not the way `void` works. At the moment `void` signals "no information" which in type theory is the unit type's responsibility. Because think about it: Which values should a type that cannot convey any information hold? Exactly one, otherwise you can have at least two different values which you can tell apart by looking at them (printing them, or whatever else you want to do with them). It is true that a top type can hold values of any type, so naively you'd think that you loose the original type information and thus it is basically the same as "no information". But this is not true because the actual values can still be different at runtime. Also usually a major feature of a top type is that you can check at runtime what the original type of the variable it holds is. For an example, take a look std.Variant. It has a `type` property which returns the typeid of the type it currently holds. (Note that this is not the case for `void*`: Without remembering, you can't actually know what type of data a variable of type `void*` points to.)

If we were to fix `void`, we would need to let go of one of those views as they are inherently incompatible.

One option would indeed be to make `void` a top type. But I don't think that makes much sense because then we would have to change the meaning of returning `void` (it would then mean to return anything, not no information). In my opinion, the interpretation of `void` as "no information" is its _main_ interpretation and thus the correct way would be to preserve that meaning and get rid of the other incompatible meanings, which means that `void` would not a top type but a unit type.

We could even go a completely different route and get rid of the symbol `void` alltogether in order to not give people who are used to C, C++, Java, D (and 1000s of other languages which have copied C's _bad_ usage of void) false ideas.

The last idea is the route that Kotlin went: They don't have a `void` type. Their unit type is called `Unit`, their bottom type is called `Nothing` and their top type is called `Any`. The type `Any?` (optional any, a type that can hold a value of any type or `null`) is very similar to what our `void*` is at the moment. Similarly, `Nothing?` is basically the same as our `typeof(null)`.

This is actually quite nice, because it does not result in any wrong expectations–those types simply don't exist in C, C++, Java etc.

However, all of this does not consider at all how to get C-interoperability to work. Kotlin is able to work with Java, so maybe there is actually a chance of getting this to work (though they don't have to deal with `void*` which makes it a lot easier). Additionally it means massive breaking changes. Even if there is a way to actually make it work with C, I still fear that there is simply too much code breakage for all of this to ever become reality.
January 17, 2019
Am 17.01.19 um 21:04 schrieb H. S. Teoh:
> The distinction is subtle, but important.  The problem is that we're so used to `void` as it is used in C/C++/D, that we're essentially working with a defective "arithmetic" where "1" and "infinity" are both regarded as "not a number", and so we have trouble telling them apart, even though their distinction is very important!

This is a great analogy, I need to keep that in mind!

> Furthermore, `void` as used in variable declarations behaves like a Bottom type -- it's illegal to declare a variable of type Bottom because Bottom by definition does not have *any* values -- yet the existence of a variable implies that it holds a value of that type, which is a contradiction.  Hence, a function declared to return Bottom is basically a function that never returns, because if it did, you'd have a value of type Bottom, which is impossible.  D does not allow you to declare a variable of type `void`, but if indeed `void` were Top, then variables of type `void` ought to be allowed, and would behave like std.variant.Variant!  The fact that `void` variables are not allowed implies that, in the context of variable declarations, `void` means Bottom.

I don't actually think declaring a variable of the Bottom type is
required to be illegal. It just needs to be impossible for it to ever be
initialized. This can trivially be done by only allwoing it to be
initialized by copying because whenever you have an expression of type
Bottom to which you "initialize" a variable of type Bottom, that
expression never returns and thus the initialization never actually
takes place. This would still allow things like
```
Bottom var = assert(0);
```
and also
```
void fun(Bottom b)
{
    // whatever, this function can never actually be called
}

void main()
{
    fun(assert(0));
}
```
Why would anyone want this? Simple, you do not have to include a special
case for the Bottom type in generic code.

> And BTW, it makes me very scared that people seem to think I'm some kind of expert on type theory, which I most certainly am not.  I know what I know simply by googling on the subject and reading a few articles -- which anyone could do in a couple o' hours at the most. If *that* is enough to make me an "expert" relative to the majority of the forumites here, then I really have to wonder just how much hope we have of resolving this issue in any sane way that doesn't introduce even more problems.

I totally agree. It is basically the same for me. I do have a strong mathematical background which might help, but all my knowledge about type theory is basically from google, wikipedia and a few blog posts.