Dependent typing and Refinement Typing for D
December 20, 2019
Reviving bearophile's thread https://forum.dlang.org/thread/gjkzgykpgwqryfxonnsy@forum.dlang.org

> https://github.com/tomprimozic/type-systems/tree/master/refined_types
>
>>The implementation of a refined type-checker is actually very straightforward and turned out to be much simpler than I expected. Essentially, program expressions and contracts on function parameters and return types are converted into a series of mathematical formulas and logical statements, the validity of which is then assessed using an external automated theorem prover Z3.<
>
> It seems possible to integrate Z3 with D contract programming.
>
> Bye,
> bearophile

I find it easier if we do it the same way D does things at compile time. With gradual typing, the checks are done at runtime, and you don't need to propagate the runtime restrictions in the if clause upwards, as non-gradual dependent typing does with compile-time checking.

The idea is that you can apply any List with any Integer to "get(List list, int index) if index < list.length"; the compiler inserts a runtime check asserting the condition in the if clause and throws an IndexOutOfBoundException if it fails.

If the compiler is clever enough, it can remove some runtime assertions when the enclosing function already guarantees them, but that, together with overload resolution, might be a bit problematic and requires some sort of SAT solving; alternatively, we do it only partially and give errors for the negative cases instead.

The point is, even in non-gradual dependent typing you have to make some runtime assertion at some point; you never get runtime dependent typing for zero cost.
Further, gradual dependent typing fits easily into D's existing ecosystem.
I would use it purely for argument constraints, not for result constraints, as these incur unnecessary runtime checking, except in unit tests where it makes sense.
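To make this concrete, here is a rough D sketch (the names get and the error message are just for illustration) of what the compiler could lower such an argument constraint to, namely a runtime check at the function boundary:

```d
import std.exception : enforce;

// Hypothetical lowering of "ElemType get(List list, int index) if index < list.length":
// the if-clause constraint becomes a runtime check before the body runs.
int get(int[] list, size_t index)
{
    enforce(index < list.length,
        "Violation of constraint index < list.length");
    return list[index];
}

void main()
{
    auto xs = [1, 2, 3];
    assert(get(xs, 1) == 2);  // constraint holds, normal call
    // get(xs, 5) would throw instead of reading out of bounds
}
```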



December 20, 2019
On Friday, 20 December 2019 at 22:06:03 UTC, sighoya wrote:
> I find it easier if we do it the same way D does things at compile time. With gradual typing, the checks are done at runtime, and you don't need to propagate the runtime restrictions in the if clause upwards, as non-gradual dependent typing does with compile-time checking.

I understand what you mean, but I don't really think you mean gradual typing. That would be more common in dynamic languages where you may choose to gradually apply more and more constraints. So you can start with a runtime checked prototype and end up with a statically checked final program.

Which is kinda cool, but not possible for D.

> If the compiler is clever enough, it can remove some runtime assertions when the enclosing function already guarantees them, but that, together with overload resolution, might be a bit problematic and requires some sort of SAT solving; alternatively, we do it only partially and give errors for the negative cases instead.

You need more machinery and annotations than just a SAT solver. I think it is better for D to instead explore what it already does, that is to do more thorough data flow analysis or abstract interpretation. Essentially add flow-typing and refinement typing (of some sort).

> I would use it purely for argument constraints, not for result constraints, as these incur unnecessary runtime checking, except in unit tests where it makes sense.

Wouldn't the problem be exactly the same? You need to deduce what the arguments are within the constraints, so you need to analyze all paths that led to the value you submit to the function to establish such constraint.

Generally speaking, you need heavy inter-procedural analysis?


December 21, 2019
> I understand what you mean, but I don't really think you mean gradual typing. That would be more common in dynamic languages where you may choose to gradually apply more and more constraints.

I meant exactly gradual typing, and D already has dynamic typing with regard to generic constraints, though D's duck typing happens at compile time. So gradual typing of runtime values at runtime would be the dual of compile-time duck typing of compile-time values.

>So you can start with a runtime checked prototype
> and end up with a statically checked final program.

No, I would only use runtime checking. Even general dependent typing requires runtime assertions at some point, cf. path-dependent typing à la Scala.

> Which is kinda cool, but not possible for D.

Why?

> You need more machinery and annotations than just a SAT solver.

Why?
You only need SAT solving in cases where you want to propagate upwards the runtime assertions inserted by the compiler, or for overload resolution. It doesn't have to be perfect, though, and we could help with some kind of annotation, i.e. manually specifying the overload order.

> I think it is better for D to instead explore what it already does, that is to do more thorough data flow analysis or abstract interpretation. Essentially add flow-typing and refinement typing (of some sort).

This is non-gradual dependent typing (general/classic dependent typing).
First, it is very compile-time consuming because of the tracking involved; e.g. proving that a list was extended with m additional members requires tracking the number of push calls in scope.
Second, it is generally undecidable, so you need to insert proofs manually, which you don't need for gradual dependent typing, as all runtime variables become constants at runtime.

> Wouldn't the problem be exactly the same? You need to deduce what the arguments are within the constraints, so you need to analyze all paths that led to the value you submit to the function to establish such constraint.

No, return values depending on input values would be evaluated at runtime before the function is called.

The point is the following: it seems to add extra runtime checking, but you do that anyway in lists and vectors to throw IndexOutOfBoundException or NullPointerException. Instead of doing this manually, the compiler can automate this exception handling by inserting runtime asserts (generated at compile time) before the function is called. Instead of an NPE or IndexOutOfBound you will get a ConstraintViolationException.


December 21, 2019
>Wouldn't the problem be exactly the same? You need to deduce what the arguments are within the constraints, so you need to analyze all paths that led to the value you submit to the function to establish such constraint.

If you suggest taking values from the function body into the return value expression, then no, we shouldn't allow this.

If we have a constant value expression in the return value of the function signature, then we indeed need to check all possible paths leading to this value; therefore I would only use it for unit tests, as unit tests could then be executed at compile time (because of constant inputs).

Another option would be to add a runtime assertion before returning a value inside the function block.


December 21, 2019
On Saturday, 21 December 2019 at 12:52:47 UTC, sighoya wrote:
>>So you can start with a runtime checked prototype
>> and end up with a statically checked final program.
>
> No, I would only use runtime checking. Even general dependent typing requires runtime assertions at some point, cf. path-dependent typing à la Scala.

I don't know Scala, how does that work?


>> You need more machinery and annotations than just a SAT solver.
>
> Why?
> You only need SAT solving in cases where you want to propagate upwards the runtime assertions inserted by the compiler, or for overload resolution. It doesn't have to be perfect, though, and we could help with some kind of annotation, i.e. manually specifying the overload order.

It will be very limited, considering you can call functions in assertions. So you need to emit verification conditions to get anywhere, and that takes quite a bit of machinery.

> First, it is very compile-time consuming because of the tracking involved; e.g. proving that a list was extended with m additional members requires tracking the number of push calls in scope.

Verification is time consuming; dataflow/abstract interpretation is more reasonable (but less powerful).

> Second, it is generally undecidable, so you need to insert proofs manually, which you don't need for gradual dependent typing, as all runtime variables become constants at runtime.

Yes, it will not always succeed. However, you might be able to cache results from previous compiles or make it optional (i.e. just emit dynamic checks where applicable).

> No, return values depending on input values would be evaluated at runtime before the function is called.

After?

> The point is the following: it seems to add extra runtime checking, but you do that anyway in lists and vectors to throw IndexOutOfBoundException or NullPointerException. Instead of doing this manually, the compiler can automate this exception handling by inserting runtime asserts (generated at compile time) before the function is called.

Contracts are supposed to have their asserts evaluated at the calling point; that's just not implemented in D, IIRC.
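For comparison, D's existing in-contracts already express exactly this kind of argument constraint; the check just runs inside the callee (and only in non-release builds), not at the call site:

```d
// D's short contract syntax (since 2.081): the in-clause asserts the
// argument constraint before the function body executes.
int get(int[] list, size_t index)
in (index < list.length)
{
    return list[index];
}

void main()
{
    assert(get([10, 20, 30], 2) == 30);
    // get([10], 5) would fail the in-contract in a debug build
}
```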

December 21, 2019
On Saturday, 21 December 2019 at 13:13:26 UTC, sighoya wrote:
>>Wouldn't the problem be exactly the same? You need to deduce what the arguments are within the constraints, so you need to analyze all paths that led to the value you submit to the function to establish such constraint.
>
> If you suggest taking values from the function body into the return value expression, then no, we shouldn't allow this.

I think it would be easier to follow what the scope of your idea is if we looked at some pseudocode examples?


> If we have a constant value expression in the return value of the function signature, then we indeed need to check all possible paths leading to this value; therefore I would only use it for unit tests, as unit tests could then be executed at compile time (because of constant inputs).

Yes, that is a good point actually. Verification is kinda like unit-tests at compile time. :-)
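Indeed, D can already run such unit-test-like checks at compile time via CTFE; a static assert forces the compiler to evaluate a function with constant inputs and fails the build on violation:

```d
// clamp10 is an ordinary runtime function...
int clamp10(int a) { return a > 10 ? 10 : a; }

// ...but CTFE lets the compiler execute it during compilation,
// effectively a unit test that runs at compile time.
static assert(clamp10(42) == 10);
static assert(clamp10(7) == 7);

void main() {}
```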

December 21, 2019
> I think it would be easier to follow what the scope of your idea is if we looked at some pseudocode examples?

Good point:


ElemType method(Array array)
{
    uint index = readUInt(...);
    return array(index);
}

with

ElemType array(uint index) if index < array.length

becomes:

ElemType method(Array array)
{
    uint index = readUInt(...);
    if (index < array.length)
    {
        return array(index);
    }
    else
    {
        throw new ContradictionError("Violation of constraint index < array.length where index=$index and array.length=$array.length");
    }
}

If we have

ElemType method(Array array, uint index) if index < array.length
{
    return array(index);
}

instead, then the compiler could check whether the above constraint implies the constraint for array(index), but that may be complicated; maybe we add some hint to the compiler that it is trivial to check:

@constraintsForMethod
ElemType method(Array array, uint index) if index < array.length
{
    return array(index) @impliedBy(@constraintsForMethod);
}

such that no ContradictionException can be thrown inside method itself, but rather before each call of "method" at the call site; and if the call's constraint is implied by the constraint of the enclosing function, the runtime assertion propagates further up.


Overload resolution:

method1(uint32 i, uint32 j, uint32 k) if i^2 < j^2 < k^2
method2(int64 i, int64 j, int64 k) if i^2 < j^2 < k^2

This may be hard for the compiler, and likewise for Z3, to grok; a better approach would be to reject it and require the user to give an annotation:

@MoreSpecificThan(method2)
method(uint32 i, uint32 j, uint32 k) if i^2 < j^2 < k^2

@method2
method(int64 i, int64 j, int64 k) if i^2 < j^2 < k^2

Dispatching of "method" is done at runtime as follows:

If i, j, k are applicable to only one of the methods, that method is chosen.
If i, j, k are applicable to none of them, throw a ContradictionException.
If i, j, k are applicable to more than one, choose the most specific by following the annotations given to them.
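A rough D sketch of what such a lowering could look like (method1, method2, and the dispatch function are hypothetical stand-ins; the real thing would be compiler-generated):

```d
uint method1(uint i, uint j, uint k) { return i + j + k; }  // more specific overload
long method2(long i, long j, long k) { return i + j + k; }

// Runtime dispatch following the three rules above: try the most
// specific constraint first, fall through, throw if none matches.
long dispatch(long i, long j, long k)
{
    immutable constraint = i * i < j * j && j * j < k * k;
    immutable fitsUint = 0 <= i && 0 <= j && 0 <= k && k <= uint.max;
    if (constraint && fitsUint)
        return method1(cast(uint) i, cast(uint) j, cast(uint) k);
    if (constraint)
        return method2(i, j, k);
    throw new Exception("ContradictionException: no overload's constraint holds");
}

void main()
{
    assert(dispatch(1, 2, 3) == 6);   // both applicable: picks method1
    assert(dispatch(-3, 4, 5) == 6);  // negative i: falls back to method2
}
```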

Unit Test:

void method(): a if 2 < a < 10
{
    testObject = mock(...);
    return functionToTest(testObject); // will be run at compile time in order to check the signature of method
}


> Verification is time consuming, dataflow/abstract interpretation is more reasonable (but less powerful).

What's the difference? Can you give an example?
December 21, 2019
> Yes, it will not always succeed. However, you might be able to cache results from previous compiles or make it optional (i.e. just emit dynamic checks where applicable).

It is an option, but release builds will be slow, and runtime checking isn't good in this case, as you would prove argument-to-return-value relationships each time a function is called.
I know that general dependent typing is right from an idiomatic viewpoint, but it has the following drawbacks:

1.) Function signatures become much more complicated than gradually dependent typed ones, in that all return values are related to their input arguments.

2.) You can't characterize all properties of a function/type in the signature; you don't know them all immediately, and the compiler can't infer them for you. Adding some properties later might break your code.

3.) Signatures become more sensitive to changes (unstable); changing them causes a cascade of errors propagating up the stack.

4.) It doesn't fit D, as D is already lazy (and unsafe) with compile-time duck typing; e.g. you can create generic entities which can never be valid for any type parameter. Further, type errors are thrown at specialization time, deep in the call stack. Doing it differently for runtime values would be an idiomatic break from how D handles it.


Theoretically, general dependent typing could replace all kinds of unit testing, but to do so you would need to write the whole function body as the return value in the function signature, which is unreadable, and it is undecidable for the compiler to infer all such properties. It would further break the idiom of encapsulation, which is important for any kind of commercial use.
So practically, you still need to unit test.
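To illustrate point 4.) above, here is a minimal D example (the member name is made up) of how compile-time duck typing already permits generic entities that can never be valid; the template itself compiles, and the error only surfaces at instantiation time:

```d
// Compiles fine on its own, although no type has this (made-up)
// member, so no instantiation can ever succeed.
auto frob(T)(T x)
{
    return x.memberThatNothingHas();
}

void main()
{
    // The error would only appear here, at specialization time;
    // __traits(compiles, ...) lets us observe that without failing the build.
    static assert(!__traits(compiles, frob(42)));
}
```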
December 21, 2019
Generally speaking, gradual dependent typing is fine for enterprises, because if they don't want to get accustomed to it, they can simply choose not to propagate additional constraints up the stack, whereas general dependent typing forces them to do so, giving them a second Rust.
December 21, 2019
On Saturday, 21 December 2019 at 14:51:39 UTC, sighoya wrote:
> instead, then the compiler could check whether the above constraint implies the constraint for array(index), but that may be complicated; maybe we add some hint to the compiler that it is trivial to check:

Ok, I believe I understand, but how is that dynamic check different from the in/out contracts that D has? Well, except that in-contracts should be inlined before the function is called.

> such that no ContradictionException can be thrown inside method itself, but rather before each call of "method" at the call site; and if the call's constraint is implied by the constraint of the enclosing function, the runtime assertion propagates further up.

Understood.

> If i, j, k are applicable to only one of the methods, that method is chosen.
> If i, j, k are applicable to none of them, throw a ContradictionException.
> If i, j, k are applicable to more than one, choose the most specific by following the annotations given to them.

Do you want to have overloading on dependent/refined types?

That would be interesting, but then it has to be resolved at compile time...


>> Verification is time consuming, dataflow/abstract interpretation is more reasonable (but less powerful).
>
> What's the difference? Can you give an example?

Verification tries to prove assertions by transforming the code into something that can be handled by an SMT/SAT solver. It is often done by overapproximation (gross oversimplification: putting a line between two points instead of calculating the exact position of the points).

Flow analysis/abstract interpretation tries to collect information over the various paths in the code, usually using very specific measures that are dealt with in an approximate manner, like keeping track of variables that are negative/positive/unknown, or of value ranges. It can move both backwards and forwards.
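D already ships a small instance of the latter: value range propagation, which tracks the possible range of an expression and allows a narrowing assignment without a cast when the range provably fits:

```d
int someRuntimeValue() { return 1000; }

void main()
{
    int x = someRuntimeValue();
    ubyte b = x & 0xFF;  // ok: VRP proves 0 <= (x & 0xFF) <= 255
    assert(b == 232);    // 1000 & 0xFF == 232
    // ubyte c = x;      // error: int doesn't provably fit in ubyte
}
```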

