June 29, 2013
On 6/29/13 7:30 PM, Walter Bright wrote:
> On 6/29/2013 2:53 PM, Ary Borenszweig wrote:
>> On 6/29/13 6:01 PM, Walter Bright wrote:
>>> On 6/29/2013 12:18 PM, Ary Borenszweig wrote:
>>>> What you are asking is essentially what Crystal does for all variables
>>>> (and types):
>>>>
>>>> https://github.com/manastech/crystal/wiki/Introduction#type-inference
>>>>
>>>> Your example would be written like this:
>>>>
>>>> x = 3
>>>> y = f()
>>>> x = 3.9
>>>>
>>>> But since Crystal transforms your code to SSA
>>>> (http://en.wikipedia.org/wiki/Static_single_assignment_form) you
>>>> actually have
>>>> *two* "x" variables in your code. The first one is of type Int32, the
>>>> second of
>>>> type Float64.
>>>
>>> Sorry, but that seems like a solution in search of a problem.
>>>
>>> And besides, yuk. Imagine the bugs caused by "hey, it doesn't implicitly
>>> convert, so instead of letting the user know he goofed, let's just
>>> silently create a new variable!"
>>
>> Sorry, but I can't imagine those bugs. Can you show me an example?
>
> Sure:
>
> x = 3
> px = &x
> y = f()
> x = 3.9
> // uh-oh, *px points to a different x, and wasn't updated!
> printf("%d\n", x);  // uh-oh, I thought x was an int!

If the last statements were:

x = 4
printf("%d\n", *px);

I can see where the problem is (you would expect that to print 4, right?). That can be easily fixed by not transforming the last x to SSA if its address is taken.
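The stale pointer can be reproduced by writing the renamed variables out by hand. This is only a sketch in plain D, not Crystal output; the names x1, x2, x3 and staleRead are made up for illustration and stand for the separate variables an SSA-style renaming would create for each assignment to x.

```d
import std.stdio : writeln;

// Hand-written model of the renaming: every assignment to "x" gets its
// own variable. x1, x2, x3 and staleRead are hypothetical names.
int staleRead()
{
    int x1 = 3;       // x = 3
    int* px = &x1;    // px = &x   (points at the first "x")
    double x2 = 3.9;  // x = 3.9   (a second variable, of type double)
    int x3 = 4;       // x = 4     (a third variable)
    return *px;       // reads x1, which no later assignment touched
}

void main()
{
    writeln(staleRead()); // prints 3, not 4
}
```

Keeping a single variable once its address has been taken would make the read through px see the later assignment.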

That's a really good example you gave :-)

June 30, 2013
On 6/29/2013 4:08 PM, Ary Borenszweig wrote:
> That's a really good example you gave :-)

Thanks. I remember seeing it somewhere before, but can't recall just where.

June 30, 2013
On Saturday, 29 June 2013 at 19:18:13 UTC, Ary Borenszweig wrote:
> On 6/27/13 9:34 PM, JS wrote:
>> Would it be possible for a language(specifically d) to have the ability
>> to automatically type a variable by looking at its use cases without
>> adding too much complexity? It seems to me that most compilers already
>> can infer type mismatchs which would allow them to handle stuff like:
>>
>> main()
>> {
>>    auto x;
>>    auto y;
>>    x = 3;   // x is an int, same as auto x = 3;
>>    y = f(); // y is the same type as what f() returns
>>    x = 3.9; // x is really a float, no mismatch with previous type(int)
>> }
>>
>> in this case x and y's type is inferred from future use. The compiler
>> essentially just lazily infers the variable type. Obviously ambiguity
>> will generate an error.
>
> What you are asking is essentially what Crystal does for all variables (and types):
>
> https://github.com/manastech/crystal/wiki/Introduction#type-inference
>
> Your example would be written like this:
>
> x = 3
> y = f()
> x = 3.9
>
> But since Crystal transforms your code to SSA (http://en.wikipedia.org/wiki/Static_single_assignment_form) you actually have *two* "x" variables in your code. The first one is of type Int32, the second of type Float64. The above solves the problem mentioned by Steven Schveighoffer, where you didn't know what overloaded version you was calling:
>
> x = 3
> f(x) # always calls f(Int32), because at run-time
>      # x will always be an Int32 at this point
> x = 3.9
>
> But to have this in a language you need some things:
>
> 1. Don't have a different syntax for declaring and updating variables
> 2. Transform your code to SSA
> (maybe more?)
>
> So this is not possible in D right now, and I don't think it will ever be because it requires a huge change to the whole language.

This is not what I am talking about and it seems quite dangerous to have one variable name masquerade as multiple variables.

I am simply talking about having the compiler enlarge the type if needed. (this is mainly for built in types since the type hierarchy is explicitly known)

e.g.,

auto x = 3;
x = 3.0; // invalid, but there is really no reason for it to be

It's obvious that we want x to be a floating-point type... why not expand it to one at compile time? The worst that can happen, in general, is a performance hit.
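For reference, this is how the two lines behave in D as it stands. The helper names narrowAssignCompiles and wideDeclCompiles are made up for this sketch; `__traits(compiles, ...)` just reports whether the enclosed code would be accepted.

```d
// Hypothetical helpers that ask the compiler whether a snippet is valid.
bool narrowAssignCompiles()
{
    // auto fixes x as int from its initializer, so assigning 3.0 is rejected.
    return __traits(compiles, { auto x = 3; x = 3.0; });
}

bool wideDeclCompiles()
{
    // Declaring the wider type up front accepts both assignments.
    return __traits(compiles, { double x = 3; x = 3.0; });
}

void main()
{
    static assert(is(typeof(3) == int));
    assert(!narrowAssignCompiles());
    assert(wideDeclCompiles());
}
```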

One can argue, and it has already been stated, that one doesn't know which overloaded function is called. This is true, but if one uses auto (or rather a more appropriate keyword), then the programmer knows that the largest type will be used. In general, it will not be a problem at all because the programmer will not intentionally treat a variable as a multi-type (which seems to be what Crystal is doing).
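The overload concern can be made concrete with ordinary D. In this sketch the overload set f and the two caller functions are invented for illustration; the second caller spells out the widened type by hand, since `auto x;` with later widening is not valid D.

```d
// Hypothetical overload set.
string f(int v)    { return "f(int)"; }
string f(double v) { return "f(double)"; }

// x keeps the type of its first assignment.
string callWithFirstType()
{
    int x = 3;
    return f(x);
}

// x is given the largest type assigned anywhere in the scope (written out
// by hand here), so the same call resolves differently.
string callWithWidenedType()
{
    double x = 3;
    return f(x);
}

void main()
{
    assert(callWithFirstType() == "f(int)");
    assert(callWithWidenedType() == "f(double)");
}
```

Under the widening rule, which overload `f(x)` selects depends on assignments that appear after the call.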

What I am talking about allows us to do a few things easily:

auto x;
...
x = 3.0;  // x's type is set to double unless we later assign x a larger type compatible with double.

auto x;
...
x = 3;   // x is set to an int type, we don't have to immediately assign to x. this is not very useful though.


More importantly, we can have the compiler infer the type when we mix subtypes:

auto x;  // x is a string
x = 3;   // x is a string
x = 3.0; // x is a string
x = "";  // x is a string

but if we remove the last line we end up with

auto x;  // x is a double
x = 3;   // x is a double
x = 3.0; // x is a double

The important point is that the compiler chooses the most appropriate storage for us. x is not a multi-variable as in Crystal, nor a variant. It is simply an auto variable whose type is inferred from the entire scope rather than just its immediate assignment.
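Phobos already has a compile-time notion of "the type all of these convert to", `std.traits.CommonType`, which can serve as a rough stand-in for the "largest type" described here. It is not the proposed feature, only an existing facility used to check the two examples above.

```d
import std.traits : CommonType;

// int and double share a common type: double.
static assert(is(CommonType!(int, double) == double));

// With string added there is no type that all three implicitly convert to;
// CommonType reports that case as void.
static assert(is(CommonType!(int, double, string) == void));

void main() {}
```

So under D's existing implicit conversions the int/double case widens to double, while the int/double/string case has no common type and would need a conversion rule that D does not currently have.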

If one prefers,

{
   autoscope x;
   // x is defined as the largest type used
}

One problem is user defined types. Do we allow inheritance to be used:

{
   autoscope x;
   x = new A;
   x = new B;
} // x is of type B if B inherits A, else error

this would be the same as

auto x = cast(B)(new A);
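For comparison, here is what current D does with the two class types. This is a sketch with placeholder classes A and B and a made-up helper downcastIsNull; it shows existing behaviour, not the proposed autoscope semantics.

```d
import std.traits : CommonType;

// Placeholder hierarchy for illustration.
class A {}
class B : A {}

// The type both `new A` and `new B` implicitly convert to is the base class.
static assert(is(CommonType!(A, B) == A));

// Hypothetical helper: a checked downcast of a plain A object gives null.
bool downcastIsNull()
{
    A a = new A;
    return (cast(B) a) is null;
}

void main()
{
    assert(downcastIsNull());
}
```

With today's conversion rules, a variable that must hold both objects would have type A, and casting a plain A object to B yields null at run time.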


July 01, 2013
2013/7/1 JS <js.mdnq@gmail.com>

> I am simply talking about having the compiler enlarge the type if needed. (this is mainly for built in types since the type hierarchy is explicitly known)
>

One simple problem: it would *drastically* increase compilation time.

void foo()
{
    auto elem;
    auto arr = [elem];

    elem = 1;
    ....
    elem = 2.0;
    // typeof(elem) change should modify the result of typeof(arr)
}

Such type dependencies between multiple variables are common in realistic programs.

When `elem = 2.0;` is found, the compiler would have to run semantic analysis of the whole function body of foo _once again_, because changing the type of elem changes typeof(arr), and that can change the meaning of the code.

If another variable's type is then modified, that triggers semantic analysis of the whole function body yet again.

As a result, the number of semantic analysis passes would increase drastically.

I can easily imagine that the compilation cost would not be worth the small benefits.
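A small sketch of how the element type flows into dependent code, using valid D with the type passed in as a template parameter instead of being inferred late. The function halfOfFirst is made up for illustration.

```d
// typeof([elem]) follows typeof(elem).
static assert(is(typeof([1]) == int[]));
static assert(is(typeof([2.0]) == double[]));

// Hypothetical function: the same body means different things per T.
auto halfOfFirst(T)(T elem)
{
    auto arr = [elem];  // T[]
    return arr[0] / 2;  // integer division for int, floating point for double
}

void main()
{
    assert(halfOfFirst(1) == 0);     // int: 1 / 2 truncates to 0
    assert(halfOfFirst(1.0) == 0.5); // double: 1.0 / 2 is 0.5
}
```

Every statement that uses arr has to be analysed again once the type of elem changes, because the result of that analysis can differ.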

Kenji Hara


July 01, 2013
On Monday, 1 July 2013 at 01:08:49 UTC, Kenji Hara wrote:
> 2013/7/1 JS <js.mdnq@gmail.com>
>
>> I am simply talking about having the compiler enlarge the type if needed.
>> (this is mainly for built in types since the type hierarchy is explicitly
>> known)
>>
>
> Just a simple matter, it would *drastically* increase compilation time.
>
> void foo()
> {
>     auto elem;
>     auto arr = [elem];
>
>     elem = 1;
>     ....
>     elem = 2.0;
>     // typeof(elem) change should modify the result of typeof(arr)
> }
>
> Such type dependencies between multiple variables are common in the
> realistic program.
>
> When `elem = 2.0;` is found, compiler should run semantic analysis of the
> whole function body of foo _once again_, because the setting type of elem
> ignites the change of typeof(arr), and it would affect the code meaning.
>
> If another variable type would be modified, it also ignites the whole
> function body semantic again.
>
> After all, semantic analysis repetition would drastically increase.
>
> I can easily imagine that the compilation cost would not be worth the small
> benefits.
>
> Kenji Hara

No, that would be a brute-force approach. Only one "preprocessing pass" over the (#lines) statements would be required. Since parsing statement by statement already takes place, it should be an insignificant cost.

arr is of type typeof(elem)[], so when elem's type is known arr's type is immediately known. One would have to create a dependency tree, but this is relatively simple and in most cases the trees would be very small.

The type of elem is known in one pass since we just have to scan statement by statement and update elem's type (using if (newtype > curtype) curtype = newtype). At the end of the scope elem's type is known and the dependency tree can be updated.

The complexity of the algorithm would be small since each additional *autoscope* variable would not add much additional computation (just updating the type... we have to scan the scope anyway).
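The update rule can be sketched as a run-time model in D. Everything here is an assumption made for illustration: the Rank enum, its ordering, and the function inferScopeType are invented, and the model presumes that the type of each assigned expression is already known when the scan reaches it.

```d
// Hypothetical ranking of built-in types; a higher value means "larger".
enum Rank { unknown, int_, double_ }

// One linear scan over the types assigned to a single variable,
// keeping the largest one seen so far.
Rank inferScopeType(const Rank[] assignedTypes)
{
    Rank cur = Rank.unknown;
    foreach (t; assignedTypes)
        if (t > cur)
            cur = t;
    return cur;
}

void main()
{
    // x = 3; x = 3.0; x = 4;
    assert(inferScopeType([Rank.int_, Rank.double_, Rank.int_]) == Rank.double_);
    // x = 3; x = 4;
    assert(inferScopeType([Rank.int_, Rank.int_]) == Rank.int_);
}
```

The scan is linear in the number of assignments. It does not cover the case where the type of an assigned expression itself depends on the variable being inferred.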

July 01, 2013
On 07/01/2013 03:08 AM, Kenji Hara wrote:
> 2013/7/1 JS <js.mdnq@gmail.com <mailto:js.mdnq@gmail.com>>
>
>     I am simply talking about having the compiler enlarge the type if
>     needed. (this is mainly for built in types since the type hierarchy
>     is explicitly known)
>
>
> Just a simple matter, it would *drastically* increase compilation time.
>
> void foo()
> {
>      auto elem;
>      auto arr = [elem];
>
>      elem = 1;
>      ....
>      elem = 2.0;
>      // typeof(elem) change should modify the result of typeof(arr)
> }
>
> Such type dependencies between multiple variables are common in the
> realistic program.
>
> When `elem = 2.0;` is found, compiler should run semantic analysis of
> the whole function body of foo _once again_, because the setting type of
> elem ignites the change of typeof(arr), and it would affect the code
> meaning.
>
> If another variable type would be modified, it also ignites the whole
> function body semantic again.
>
> After all, semantic analysis repetition would drastically increase.
>
> I can easily imagine that the compilation cost would not be worth the
> small benefits.
>
> Kenji Hara

The described strategy can easily result in non-termination, and which template instantiations it performs can be non-obvious.

auto foo(T)(T arg){
    static if(is(T==int)) return 1.0;
    else return 1;
}

void main(){
    auto x;
    x = 1;
    x = foo(x);
}
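The oscillation can be checked at compile time in current D. This is a sketch under one reading of the re-analysis strategy, namely that each round recomputes the type of x from scratch; the alias Required is made up for illustration and uses `std.traits.CommonType` as the "largest type" of the two assignments.

```d
import std.traits : CommonType;

auto foo(T)(T arg)
{
    static if (is(T == int)) return 1.0;
    else return 1;
}

// Hypothetical: given an assumed type X for x, the type required by the
// two assignments `x = 1;` and `x = foo(x);`.
alias Required(X) = CommonType!(int, typeof(foo(X.init)));

// Assuming int forces double, and assuming double allows int again.
static assert(is(Required!int == double));
static assert(is(Required!double == int));

void main() {}
```

Neither int nor double is a fixed point of that step, so repeating the analysis until nothing changes would not stop.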
July 01, 2013
On Sun, 30 Jun 2013 21:56:21 -0400, Timon Gehr <timon.gehr@gmx.ch> wrote:

> The described strategy can easily result in non-termination, and which template instantiations it performs can be non-obvious.
>
> auto foo(T)(T arg){
>      static if(is(T==int)) return 1.0;
>      else return 1;
> }
>
> void main(){
>      auto x;
>      x = 1;
>      x = foo(x);
> }

Ouch!  That is better than Walter's case :)

-Steve
July 01, 2013
On 6/30/2013 6:08 PM, Kenji Hara wrote:
> I can easily imagine that the compilation cost would not be worth the small
> benefits.

There are arguably not even small benefits.

July 01, 2013
On Monday, 1 July 2013 at 01:56:22 UTC, Timon Gehr wrote:
> On 07/01/2013 03:08 AM, Kenji Hara wrote:
>> 2013/7/1 JS <js.mdnq@gmail.com <mailto:js.mdnq@gmail.com>>
>>
>>    I am simply talking about having the compiler enlarge the type if
>>    needed. (this is mainly for built in types since the type hierarchy
>>    is explicitly known)
>>
>>
>> Just a simple matter, it would *drastically* increase compilation time.
>>
>> void foo()
>> {
>>     auto elem;
>>     auto arr = [elem];
>>
>>     elem = 1;
>>     ....
>>     elem = 2.0;
>>     // typeof(elem) change should modify the result of typeof(arr)
>> }
>>
>> Such type dependencies between multiple variables are common in the
>> realistic program.
>>
>> When `elem = 2.0;` is found, compiler should run semantic analysis of
>> the whole function body of foo _once again_, because the setting type of
>> elem ignites the change of typeof(arr), and it would affect the code
>> meaning.
>>
>> If another variable type would be modified, it also ignites the whole
>> function body semantic again.
>>
>> After all, semantic analysis repetition would drastically increase.
>>
>> I can easily imagine that the compilation cost would not be worth the
>> small benefits.
>>
>> Kenji Hara
>
> The described strategy can easily result in non-termination, and which template instantiations it performs can be non-obvious.
>
> auto foo(T)(T arg){
>     static if(is(T==int)) return 1.0;
>     else return 1;
> }
>
> void main(){
>     auto x;
>     x = 1;
>     x = foo(x);
> }

Sorry, it only results in non-termination if you don't check all return types out of a function. It is a rather easy case to handle by just following all the return types and choosing the largest one. No big deal...  any other tries?
July 01, 2013
On 07/01/2013 05:44 AM, JS wrote:
> On Monday, 1 July 2013 at 01:56:22 UTC, Timon Gehr wrote:
>> ...
>> The described strategy can easily result in non-termination, and which
>> template instantiations it performs can be non-obvious.
>>
>> auto foo(T)(T arg){
>>     static if(is(T==int)) return 1.0;
>>     else return 1;
>> }
>>
>> void main(){
>>     auto x;
>>     x = 1;
>>     x = foo(x);
>> }
>
> Sorry,

That's fine.

> it only results in non-termination if you don't check all return
> types out of a function.

Why is this relevant? I was specifically responding to the method outlined in the post I was answering. There have not been any other attempts to formalize the proposal so far.

> It is a rather easy case to handle by just
> following all the return types and choosing the largest one.

That neither handles the above case in a sensible way nor is it a solution for the general issue. (Hint: D's type system is Turing complete.)
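To give a feel for the hint, here is a small sketch in which a type is selected by a compile-time computation. The templates Collatz and PickedType are invented for illustration. The example proves nothing about Turing completeness by itself; it only shows that the type an expression ends up with can depend on arbitrary template evaluation.

```d
// Number of Collatz steps needed to reach 1, computed by template recursion.
template Collatz(ulong n)
{
    static if (n == 1)
        enum Collatz = 0;
    else static if (n % 2 == 0)
        enum Collatz = 1 + Collatz!(n / 2);
    else
        enum Collatz = 1 + Collatz!(3 * n + 1);
}

// Hypothetical: pick a type from the parity of that step count.
template PickedType(ulong n)
{
    static if (Collatz!n % 2 == 0)
        alias PickedType = int;
    else
        alias PickedType = double;
}

static assert(Collatz!6 == 8);            // 6, 3, 10, 5, 16, 8, 4, 2, 1
static assert(is(PickedType!6 == int));
static assert(is(PickedType!3 == double)); // 3 takes 7 steps

void main() {}
```

An inference rule that has to "follow all the return types" of such templates must evaluate them first, and evaluating them can in turn depend on the type being inferred.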

> No big deal...  any other tries?

That's not how it goes. The proposed inference method has to be completely specified for all instances, not only for those instances that I can be bothered to provide to you as counterexamples.