November 06, 2008
Walter Bright:
> For example, you should also be able to create a ranged int that can
> only contain values from n to m:
> RangedInt!(N, M) i;

I presume you meant something like this:
Ranged!(int, N, M) i;

That has to work (with a runtime assert, and a static assert whenever possible) in situations like the following too:

i = long.max; // static err
i = ulong.max; // static err

Ranged!(ulong, 0, ulong.max) ul, r1, r2;
ul = ulong.max + 10; // static err
r1 = ulong.max / 2 + 100;
r2 = ulong.max / 2 + 100;
r1 + r2 // runtime err
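
For example, here is a minimal sketch in today's D of the runtime-checked half of this, assuming just a plain library struct (not the proposed feature itself). The "static err" lines above would additionally need the compiler (or CTFE/template arguments) to see the assigned value at compile time, and the "runtime err" on r1 + r2 would need operator overloads that re-check the result (plus an explicit overflow test when the range covers the whole base type):

struct Ranged(T, T min, T max)
{
    private T value;

    // Runtime range check on every assignment from the base type.
    void opAssign(T v)
    {
        assert(min <= v && v <= max, "value out of range");
        value = v;
    }
}

unittest
{
    Ranged!(int, 1, 6) die;
    die = 4;             // ok, checked at runtime
    // die = 9;          // would trip the assert at runtime
    // die = long.max;   // does not compile: long does not implicitly convert to int
}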

As you have discussed recently, for a compiler designer choosing what goes in the language and what to keep out of it and in libraries is very important. Some time ago I read about a special Scheme *compiler* that is very small and very "pluggable": a large part of the language is implemented by external libs, even a *static* type system, and several other things that are usually assumed to be part of the compiler itself.

It's not just a matter of creating a very flexible compiler core: even if you somehow manage to create it, there are other problems to solve. Designing languages is quite hard, so you can't expect the average programmer to be a good language designer (that's also why AST macros may lead to some troubles). So if you push things out of the language, you probably have to put them into the standard library, so normal programmers can use a standard, well designed version of them. Otherwise it leads to a lot of troubles that I won't list now.

How can we establish whether ranged integral values have to be outside the compiler or inside it? We can list requirements, etc. Generally, the more things are pushed into the compiler, the more complex it becomes, the slower it is to improve and maintain, and such features can also become more rigid. This can be seen very well with D unittests and ddoc: I think that eventually D unittests and ddoc may have to be removed from the language and put into the standard library, and the language itself may have to grow some features (some more reflection? Maybe like Flectioned?) that allow the standard library code to define them with a handy, short syntax anyway. This would both reduce compiler complexity, allow those functionalities to evolve more, and allow the community of D programmers to improve them.

Some features of ranged integrals:
- Making integrals ranged serves a few different purposes: the main one is to avoid a class of runtime bugs, another is to shorten some code a little, and the final one is to have release code with zero speed penalty compared to the D code of today. Some of those bugs are caught by runtime checks, and others can probably be avoided at compile time by the type system. The compiler can also avoid emitting some runtime checks where it infers that values already lie within certain ranges. The code inside contracts (I mean contract programming) can also be used by the compiler to infer where it can remove more of those runtime checks.
- A short, handy syntax matters, because for such ranges to become part of the D culture they have to be convenient to write. If D has features that most D programmers don't use, those features become less useful.
- To avoid integral-related bugs, the compiler and the runtime probably have to check all integral values used by the program, because letting the programmer opt in only at a few special points is a good way to see the checks used rarely. For the same reason such checks probably need to be on by default, like array bounds checks.
- Recently I have shown a possible syntax to enable/disable some of these checks locally, for example:
safe(stack, ranges, arrays) {...}
unsafe(ranges) {...}
I think such syntax is better than the syntax used by ObjectPascal for such purposes.
- Where disabled, such checks must cost zero at runtime, because D is designed to be fast. I presume SafeD has to keep them always activated (that's why having the compiler remove some of them automatically, or by using contracts, is useful).
- From the links I have posted here recently you can see how common this class of integral-related bugs is. And generally you can't talk about a "Safe-D" if you can't sum two integral values reliably :-)
- Range types of integers/chars/enums are useful, but subranges are also useful; they are essentially subtypes specialized for just this purpose. So if:
typedef int:1..6 TyDice;
typedef TyDice:1..3 TyHalfDice;
then a function that takes a TyDice automatically accepts a TyHalfDice too (see the sketch after this list). Note that Haskell's type system is powerful enough to let the programmer define such semantics, subtypes, etc. But D's type system is quite a bit more "primitive", so some of these things may need to be wired into the language instead of being user-defined (by programmers who know a lot of type theory, of course).
- So I think that while unittests and ddoc may be better kept out of the compiler, range types may be better built into it (note that I use unittests and ddoc _all the time_, all my programs use them heavily, and I like them; I am not saying this because I dislike unittests and documentation strings).
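
As a point of comparison, here is a rough library-level sketch in today's D of the subrange idea from the list above (the Bounded name is hypothetical, and this is not the typedef syntax). A templated parameter with a template constraint lets a function accept any subrange whose bounds fit inside the expected range, with no runtime check at the call:

struct Bounded(int lo, int hi)
{
    int value;
    this(int v) { assert(lo <= v && v <= hi); value = v; }
}

alias TyDice     = Bounded!(1, 6);
alias TyHalfDice = Bounded!(1, 3);

// Accepts TyDice, TyHalfDice, or any other Bounded that fits inside 1..6.
int score(int lo, int hi)(Bounded!(lo, hi) d)
    if (1 <= lo && hi <= 6)
{
    return d.value;
}

unittest
{
    assert(score(TyDice(5)) == 5);
    assert(score(TyHalfDice(2)) == 2);  // the subrange is accepted as-is
}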

Bye,
bearophile
November 06, 2008
Walter Bright wrote:
> If that cannot be done in D, then D needs some design improvements. Essentially, any type should be "wrappable" in a struct which can alter the behavior of the wrapped type.
> 
> For example, you should also be able to create a ranged int that can only contain values from n to m:
> 
> RangedInt!(N, M) i;
> 
> Preserving this property of structs has driven many design choices in D, particularly with regards to how const fits into the type system.

Does that mean we're getting implicit cast overloads?
Because without RangedInt!(N, M).opImplicitCastFrom(int i)
you can't pass int values to functions accepting RangedInt
instances. You can't pass a different RangedInt!(X, Y) either.
It defeats the purpose of implicit range checking if you
have to write litanies like foo(RangedInt!(1,10).check(i))
just to call a function.
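
A small sketch of that call-site problem, assuming a plain library struct with only an explicit constructor (all names hypothetical):

struct RangedInt(int N, int M)
{
    int value;
    this(int i) { assert(N <= i && i <= M); value = i; }
}

void foo(RangedInt!(1, 10) x) { }

void main()
{
    int i = 5;
    // foo(i);                  // no implicit conversion from int: does not compile
    foo(RangedInt!(1, 10)(i));  // the explicit "litany" at every call site
}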

Sorry to jump the topic like that, but last time I asked
my thread got hijacked. :P

November 06, 2008
Hxal wrote:
> Does that mean we're getting implicit cast overloads?
> Because without RangedInt!(N, M).opImplicitCastFrom(int i)

opImplicitCast, yes.
November 06, 2008
On 2008-11-05 08:16:59 -0500, bearophile <bearophileHUGS@lycos.com> said:

> The same is true for making integral values become range values. If I want to write a function that takes an iterable of results of throwing a die, I can use an enum, or check every item of the iterable for being in the range 1 - 6. If range values are available I can just write:
> 
> StatsResults stats(Dice[] throwing_results) { ...
> 
> Where Dice is:
> typedef int:1..7 Dice;
> 
> I then don't need to remember to check that items are in 1-6 inside stats(), and the check is pushed up toward the place where throwing_results was created (or where it comes from disk, user input, etc). This avoids some bugs and reduces some code.

It's exactly the same thing, except that for numbers you may want much more than simple ranges. You could want non-zero numbers, odd or even numbers, square numbers, etc. I have the feeling that whatever the language tries to restrict about numbers, it will never be enough. So my feeling is that this is better left to a template.
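
For instance, a rough sketch of what "left to a template" could look like: a wrapper parameterized by an arbitrary predicate, so ranges, non-zero, even-only, etc. are all the same construct (the names here are hypothetical, not an existing library API):

struct Constrained(T, alias pred)
{
    private T value;

    this(T v) { assert(pred(v), "constraint violated"); value = v; }

    T get() { return value; }
}

alias NonZeroInt = Constrained!(int, (int x) => x != 0);
alias EvenInt    = Constrained!(int, (int x) => (x & 1) == 0);
alias DieFace    = Constrained!(int, (int x) => x >= 1 && x <= 6);

unittest
{
    auto d = DieFace(4);        // ok
    auto e = EvenInt(2);        // ok
    // auto z = NonZeroInt(0);  // would fail the assert at runtime
    assert(d.get == 4);
}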

-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/

November 06, 2008
Reply to Walter,

> Steven Schveighoffer wrote:
> 
>> Couldn't one design a struct wrapper that implements this behavior?
>> 
> If that cannot be done in D, then D needs some design improvements.
> Essentially, any type should be "wrappable" in a struct which can
> alter the behavior of the wrapped type.
> 
> For example, you should also be able to create a ranged int that can
> only contain values from n to m:
> 
> RangedInt!(N, M) i;
> 
> Preserving this property of structs has driven many design choices in
> D, particularly with regards to how const fits into the type system.
> 

Why not explicitly support this with bodied typedefs? 



typedef int MyInt(int m, int M)
{
   MyInt opAssign(int i) { assert(m <= i && i <= M); this = i; }

   MyInt opAssign(int m_, int M_)(MyInt!(m_,M_) i) // <- that doesn't work but...
   {
      static if(m > m_ && M_ > M) assert(m <= i && i <= M);  // only assert as needed
      else static if(m > m_) assert(m <= i);
      else static if(M < M_) assert(i <= M);
      this = i;
   }

   // this is only to define the return type, the normal code for int is still generated
   MyInt!(m+m_, M+M_) opAdd(int m_, int M_)(MyInt!(m_,M_) i) = this;
}


November 07, 2008
BCS wrote:
> Reply to Walter,
> 
>> Steven Schveighoffer wrote:
>>
>>> Couldn't one design a struct wrapper that implements this behavior?
>>>
>> If that cannot be done in D, then D needs some design improvements.
>> Essentially, any type should be "wrappable" in a struct which can
>> alter the behavior of the wrapped type.
>>
>> For example, you should also be able to create a ranged int that can
>> only contain values from n to m:
>>
>> RangedInt!(N, M) i;
>>
>> Preserving this property of structs has driven many design choices in
>> D, particularly with regards to how const fits into the type system.
>>
> 
> Why not explicitly support this with bodied typedefs?
> 
> 
> typedef int MyInt(int m, int M)
> {
>    MyInt opAssign(int i) { assert(m <= i && i <= M); this = i; }
> 
>    MyInt opAssign(int m_, int M_)(MyInt!(m_,M_) i) // <- that doesn't work but...
>    {
>       static if(m > m_ && M_ > M) assert(m <= i && i <= M);  // only assert as needed
>       else static if(m > m_) assert(m <= i);
>       else static if(M < M_) assert(i <= M);
>       this = i;
>    }
> 
>    // this is only to define the return type, the normal code for int is still generated
>    MyInt!(m+m_, M+M_) opAdd(int m_, int M_)(MyInt!(m_,M_) i) = this;
> }
> 
> 

Just create a struct with an int as its only member?
November 08, 2008
Reply to KennyTM~,

> Just create a struct with an int as its only member?
> 

But if you do that then you have to explicitly build all of the math overloads. Sometimes they ALL end up as simple shells, so why force the programmer to build them all? Also, with a struct the compiler can't just use the int code generator; it has to go through the overload functions, which might not get inlined.
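
Not to argue the point either way, here is a sketch of what those "simple shell" overloads look like with a plain struct in today's D (names hypothetical). A single templated opBinary can forward the arithmetic operators to int, but the forwarding and re-checking body is still the programmer's to write, and it goes through a function call unless the optimizer inlines it:

struct CheckedInt(int m, int M)
{
    int value;

    this(int i) { assert(m <= i && i <= M); value = i; }

    // Forward to the built-in int operation, then re-check the range.
    CheckedInt opBinary(string op)(CheckedInt rhs)
        if (op == "+" || op == "-" || op == "*" || op == "/")
    {
        return CheckedInt(mixin("value " ~ op ~ " rhs.value"));
    }
}

unittest
{
    auto a = CheckedInt!(1, 10)(3);
    auto b = CheckedInt!(1, 10)(4);
    assert((a + b).value == 7);
}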

