inferred types

January 15, 2003
A small, simple feature suggestion.  The "var" keyword could introduce a variable with an inferred type:

void foo(int x) {
var y = x + 1;
printf("y is %d\n", y);
}

This feature is especially useful in situations where you don't *know* the type of an expression.  For example, if the above example used a template so that x has some unspecified type T, and the + operator for T were overloaded, then there would be no way to know the type of T+int.  One could of course just assume that it would be the same type as x itself, but that doesn't scale to letting y store the result of arbitrary expressions.
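
For comparison, C++11's `auto` behaves much like the proposed "var" for local variables. The sketch below is only an illustration in C++, not D; `Small`, `Big`, `to_long` and `incremented` are hypothetical names invented for the example. It shows generic code storing the result of `x + 1` without ever naming its type, even when the overloaded `+` returns something other than the type of `x`.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical types for illustration: adding an int to a Small
// yields a Big, so the type of `x + 1` differs from the type of x.
struct Small { short value; };
struct Big   { long value; };

Big operator+(Small lhs, int rhs) {
    return Big{static_cast<long>(lhs.value) + rhs};
}

// Helpers so the generic code below can report a plain number.
long to_long(int v) { return v; }
long to_long(Big b) { return b.value; }

// Generic code does not need to name the result type of `x + 1`;
// `auto` takes it from the initializer, as the proposed `var` would.
template <typename T>
long incremented(T x) {
    auto y = x + 1;
    static_assert(std::is_same<decltype(y), decltype(x + 1)>::value,
                  "y has exactly the static type of its initializer");
    return to_long(y);
}
```

Calling `incremented(41)` makes `y` an `int`, while `incremented(Small{41})` makes `y` a `Big`; in both cases the declaration of `y` is written the same way.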

Exploring the orthogonality of the feature... The "var" keyword could also be allowed for function arguments, where it could implicitly define a function template.  Also, for functions with consistent return types, you could allow it for the return value (again, this would be important for situations where you don't know the result type of an overloaded operator).  However, it would look a bit odd if used as a return type:

var foo(int x) {
return x+1;
}

It kind of looks like foo should be a variable, not a function.  So you might want to rename the keyword.  One choice would be to use "T" instead of "var", but then writing templates would be a bit more annoying because you'd have to come up with a name for your parameterized types.
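
As a point of reference, C++14 allows a placeholder in the return position, spelled `auto` rather than "var". This is a C++ sketch of the same idea, not a claim about how D would spell it.

```cpp
#include <cassert>
#include <type_traits>

// C++14: the return type is deduced from the return statement.
auto foo(int x) {
    return x + 1;
}

// The deduced type is exactly the static type of `x + 1`.
static_assert(std::is_same<decltype(foo(0)), int>::value,
              "foo is deduced to return int");
```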

Even if this feature is not adopted, D should probably still have *some* way of dealing with situations where you don't know the type of an expression.  I suppose you could allow use of a type() property wherever a type is expected:

(x+1).type() foo(SomeType x) {
return x+1;
}

But that strikes me as a lot uglier than using "var" or "T".
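
The "type of an expression wherever a type is expected" idea corresponds roughly to `decltype` in C++11. A minimal sketch, assuming a template parameter `T` in place of the post's `SomeType`:

```cpp
#include <cassert>
#include <type_traits>

// C++11: the return type is spelled as "the type of x + 1".
// Note that the expression has to be written twice.
template <typename T>
auto foo(T x) -> decltype(x + 1) {
    return x + 1;
}

static_assert(std::is_same<decltype(foo(1)), int>::value, "int + int is int");
static_assert(std::is_same<decltype(foo(1.5)), double>::value,
              "double + int is double");
```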

Kimberley Burchett
Endeca, Inc


January 15, 2003
"Kimberley Burchett" <kimbly at kimbly.com> wrote in message news:b03d3k$2m8r$1@digitaldaemon.com...
> A small, simple feature suggestion.  The "var" keyword could introduce a variable with an inferred type:
>
> void foo(int x) {
> var y = x + 1;
> printf("y is %d\n", y);
> }

YES!!!

for (var iter = container.begin(); iter != container.end(); ++iter)
{
    *iter = 0;
}

> This feature is especially useful in situations where you don't *know* the type
> of an expression.  For example, if the above example used a template so that x
> has some unspecified type T, and the + operator for T were overloaded, then
> there would be no way to know the type of T+int.  One could of course just
> assume that it would be the same type as x itself, but that doesn't scale to
> letting y store the result of arbitrary expressions.

Or if you don't want to type out a really lengthy type, or make a typedef for it.  See C++ iterators.

for (std::map<std::basic_string<char>, std::less<std::basic_string<char> > >::const_iterator iter = container.begin();
     iter != container.end(); ++iter)
{
    foo(*iter);
}
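
A side-by-side sketch of the same loop in C++11, once with the iterator type spelled out and once with `auto`. The function names and the `std::map<std::string, int>` container are assumptions made for the example.

```cpp
#include <cassert>
#include <map>
#include <string>

// Sum the mapped values, spelling out the iterator type in full.
int sum_explicit(const std::map<std::string, int>& container) {
    int total = 0;
    for (std::map<std::string, int>::const_iterator iter = container.begin();
         iter != container.end(); ++iter) {
        total += iter->second;
    }
    return total;
}

// Same loop; the iterator type is taken from container.begin().
int sum_inferred(const std::map<std::string, int>& container) {
    int total = 0;
    for (auto iter = container.begin(); iter != container.end(); ++iter) {
        total += iter->second;
    }
    return total;
}
```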

> Exploring the orthogonality of the feature... The "var" keyword could also be
> allowed for function arguments, where it could implicitly define a function
> template.

Interesting... you mean:

x.type() add(var x, var y)
{
    return x + y;
}

int u = add(2, 4);
float n = add(0.5f, 1.0f);
double j = add(2.5, 3.0);

Sounds nice;  it'd be even better if you could constrain the types to be derived from some interface:

x.type() add(var x : addable, var y : addable)
{
    return x + y;
}
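
Something close to this can be sketched with a C++11 function template. Two assumptions differ from the snippet above: the return type here is the type of `x + y` rather than the type of `x`, and the "addable" constraint is expressed indirectly, since the trailing `decltype` removes the template from consideration when `x + y` does not compile.

```cpp
#include <cassert>
#include <type_traits>

// Accepts any pair of types for which `x + y` compiles; the trailing
// decltype both names the result type and rejects non-addable arguments.
template <typename A, typename B>
auto add(A x, B y) -> decltype(x + y) {
    return x + y;
}

static_assert(std::is_same<decltype(add(2, 4)), int>::value, "int result");
static_assert(std::is_same<decltype(add(0.5f, 1.0f)), float>::value,
              "float result");
static_assert(std::is_same<decltype(add(2.5, 3.0)), double>::value,
              "double result");
```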

> Also, for functions with consistent return types, you could allow it
> for the return value (again, this would be important for situations where you
> don't know the result type of an overloaded operator).  However, it would look a
> bit odd if used as a return type:
>
> var foo(int x) {
> return x+1;
> }
>
> It kind of looks like foo should be a variable, not a function.  So you might
> want to rename the keyword.  One choice would be to use "T" instead of "var",
> but then writing templates would be a bit more annoying because you'd have to
> come up with a name for your parameterized types.
>
> Even if this feature is not adopted, D should probably still have *some* way of
> dealing with situations where you don't know the type of an expression.  I
> suppose you could allow use of a type() property wherever a type is expected:
>
> (x+1).type() foo(SomeType x) {
> return x+1;
> }
>
> But that strikes me as a lot uglier than using "var" or "T".

Yes.  For one, you have to build the expression twice, once to get its type so you can declare the variable, and again for the actual value.

for (container.begin().type() iter = container.begin(); iter != container.end(); ++iter)
{
    *iter = 0;
}

> Kimberley Burchett
> Endeca, Inc


January 15, 2003
Sean L. Palmer wrote:
> "Kimberley Burchett" <kimbly at kimbly.com>
> wrote in message news:b03d3k$2m8r$1@digitaldaemon.com...
> 
>>A small, simple feature suggestion.  The "var" keyword could introduce a

NOT SIMPLE i guess!

>>variable with an inferred type:
>>
>>void foo(int x) {
>>var y = x + 1;
>>printf("y is %d\n", y);
>>}
> 
> 
> YES!!!

Damn. It doesn't seem reliable to me. In Caml types are only inferred, but it has to do with a number of design limitations:
 - there are different sets of operators for all types of data. E.g.:
int + int, float +. float, bign +/ bign!!! If there's no such thing, you get error-prone situations. Well, that'd be no problem in Delphi, but in C derivatives it is, because integer division and FP division are the same operator here, but have a different meaning!
 - there are only two types of data structures: containing pointers and integers only, and containing no pointers.  Each float is a memory block/structure itself, except for designated array types, which can only be processed with special (C-external or VM-internal) functions. Both floats and arrays belong to no-pointer-block types.
 - all integers are the size of pointers -1 bit (e.g. 31 bit), because they are mangled so that the GC can tell a pointer from an integer.
 - either a function handles a pointer-sized type and would work with any of them, or it works with numeric types and can be identified by the operators.
 - "generic" functions working with pointers and integers never clean up and never change them, they can only copy them, leaving a ton of work to the mega-efficient GC.


>>This feature is especially useful in situations where you don't *know* the type
>>of an expression.  For example, if the above example used a template so that x
>>has some unspecified type T, and the + operator for T were overloaded, then
>>there would be no way to know the type of T+int.  One could of course just
>>assume that it would be the same type as x itself, but that doesn't scale to
>>letting y store the result of arbitrary expressions.

OUCH! safer:
operator+ (T, int).type
if one really wants to know. I agree there should be some mechanism, but it must work statically, and thus be semantically identical to this one.

And what happens if you take an unspecified type of an unspecified type of a still unspecified type?

A function cannot be compiled without a type. Thus it would turn into something like a macro or a template. Or into C++ overloaded functions. Either you bloat everything, or you have to give all those functions names. And here we go again.

> Or if you don't want to type out a really lengthy type, or make a typedef
> for it.  See C++ iterators.

:/
why not create an iterator object and then:

Iterator.Load("blah");
while (Iterator.While) {
  // action
}

January 15, 2003
In article <b04bav$6fd$1@digitaldaemon.com>, Ilya Minkov says...
>Damn. It doesn't seem reliable to me. In Caml types are only inferred, but it has to do with a number of design limitations:
>  - there are different sets of operators for all types of data. E.g.:
>int + int, float +. float, bign +/ bign!!! If there's no such thing, you get error-prone situations. Well, that'd be no problem in Delphi, but in C derivatives it is, because integer division and FP division are the same operator here, but have a different meaning!

I should clarify.  I was assuming that the right-hand-side was fully typed.  I'm not suggesting full Hindley-Milner-style type inference here.  No unification is necessary.  I probably shouldn't have used the word "infer" -- I meant something much simpler.

The D compiler already has to compute the type of all expressions anyway, in order to know whether they're a compile error, or whether they require an implicit cast, etc.  So this is just a shorthand to say that the type of the rhs should be used for the new variable.

It would be perfectly reasonable to, say, prevent the use of a var return type for recursive (or mutually recursive) functions. Otherwise you'd have to solve for the type, instead of simply knowing it.

However the original motivation for the feature didn't even involve return types -- it was just for local variables, which are syntactically prevented from having recursive definitions anyway.

>  - all integers are the size of pointers -1 bit (e.g. 31 bit), because
>they are mangled so that the GC can tell a pointer from an integer.

My proposal has nothing to do with tagged values.

>  - either a function handles a pointer-sized type and would work with
>any of them, or it works with numeric types and can be identified by the operators.

You're thinking I suggested type inference for function arguments.
But I didn't -- I suggested that if used as a parameter, then the "var"
keyword could actually introduce a *template*.  And templates in D
apparently have to be explicitly instantiated.  So this objection is
not relevant.  I personally don't even know if I like the implicit
template idea -- I was just exploring it for orthogonality reasons.

>  - "generic" functions working with pointers and integers never clean
>up and never change them, they can only copy them, leaving a ton of work to the mega-efficient GC.

You've jumped from the static type system into the runtime memory model, and implied that the first complicates the second.  But all that I'm proposing is a syntactic shorthand for something that can be statically computed.

If the compiler decides to replace a particular use of the "var" keyword with the type "float", then I could just as easily have typed "float" in the first place.  Therefore there are no runtime implications beyond what the language already allows.

>OUCH! safer:
>operator+ (T, int).type
>if one really wants to know. I agree there should be some mechanism, but
>it must work statically, and thus be semantically identical to this one.

This is the same as what I suggested with the .type() property.  But
I think it's ugly.  Perhaps in my original post I should have emphasized
the phrase "*consistent* return types".  What I meant to imply by that
was that the return type could only be omitted if the function
returned a value whose type was completely determinable from the types
of the arguments, and if all of its return points
return *exactly* the same static type.
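
C++14 return type deduction ended up with essentially this rule, so it can serve as an illustration. In the sketch below (function names are made up for the example), every return statement must deduce the same type, and a recursive call is accepted only once an earlier return has already fixed the type.

```cpp
#include <cassert>
#include <type_traits>

// Every return statement yields int, so deduction succeeds.
auto clamp_to_byte(int x) {
    if (x < 0)   return 0;
    if (x > 255) return 255;
    return x;
}

// Recursion is fine here because the first return already fixed the
// return type as int before the recursive call is seen.
auto factorial(int n) {
    if (n <= 1) return 1;
    return n * factorial(n - 1);
}

// Would not compile: the two returns deduce int and double.
// auto inconsistent(bool flag) {
//     if (flag) return 1;
//     return 1.5;
// }

static_assert(std::is_same<decltype(clamp_to_byte(0)), int>::value,
              "deduced return type is int");
```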

>And what happens if you take an unspecified type of an unspecified type of a still unspecified type?

The compiler builds a DAG, and if it discovers a cycle, then you get a compile error.
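
A rough sketch of what such a check could look like, written in C++ for illustration. It assumes a hypothetical model in which each declaration lists the declarations whose types it needs first; it is not a description of how any actual D compiler works. A depth-first search reports a cycle when it meets a declaration that is still being resolved.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical model: each declaration lists the declarations whose types
// must be known before its own type can be computed.
using DependencyGraph = std::map<std::string, std::vector<std::string>>;

enum class Mark { InProgress, Done };

// Depth-first search; returns true if a cycle is reachable from `name`.
bool visit(const DependencyGraph& graph, const std::string& name,
           std::map<std::string, Mark>& marks) {
    auto found = marks.find(name);
    if (found != marks.end()) {
        // Meeting a node that is still being resolved means a cycle.
        return found->second == Mark::InProgress;
    }
    marks[name] = Mark::InProgress;
    auto entry = graph.find(name);
    if (entry != graph.end()) {
        for (const auto& dependency : entry->second) {
            if (visit(graph, dependency, marks)) return true;
        }
    }
    marks[name] = Mark::Done;
    return false;
}

// True if any declaration's type depends, directly or not, on itself.
bool has_type_cycle(const DependencyGraph& graph) {
    std::map<std::string, Mark> marks;
    for (const auto& entry : graph) {
        if (visit(graph, entry.first, marks)) return true;
    }
    return false;
}
```

With `y` depending on `x` the graph is acyclic and every type can be computed in order; with two functions whose omitted return types depend on each other, `has_type_cycle` returns true and the compiler would report an error.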

>A function can not be compiled without a type. Thus it would turn into something like a macro or a template.

Exactly! :)  I was proposing that if the type of an argument was omitted, then you get a template.  And if the type of the return value is omitted, then it must be deducible from the types of the arguments.

>:/
>why not create an iterator object and then:
>
>Iterator.Load("blah");
>while (Iterator.While) {
>   // action
>}

Because then you have to have a single iterator class that's capable of iterating over everything.  Otherwise you'd have to use a more specific type than just "Iterator".


January 16, 2003
In article <b03d3k$2m8r$1@digitaldaemon.com>, Kimberley Burchett <kimbly at kimbly.com> says...
>
>A small, simple feature suggestion.  The "var" keyword could introduce a variable with an inferred type:
>
>void foo(int x) {
>var y = x + 1;
>printf("y is %d\n", y);
>}
>
>This feature is especially useful in situations where you don't *know* the type of an expression.  For example, if the above example used a template so that x has some unspecified type T, and the + operator for T were overloaded, then there would be no way to know the type of T+int.  One could of course just assume that it would be the same type as x itself, but that doesn't scale to letting y store the result of arbitrary expressions.
>
>Exploring the orthogonality of the feature... The "var" keyword could also be allowed for function arguments, where it could implicitly define a function template.  Also, for functions with consistent return types, you could allow it for the return value (again, this would be important for situations where you don't know the result type of an overloaded operator).  However, it would look a bit odd if used as a return type:
>
>var foo(int x) {
>return x+1;
>}
>
>It kind of looks like foo should be a variable, not a function.  So you might want to rename the keyword.  One choice would be to use "T" instead of "var", but then writing templates would be a bit more annoying because you'd have to come up with a name for your parameterized types.
>
>Even if this feature is not adopted, D should probably still have *some* way of dealing with situations where you don't know the type of an expression.  I suppose you could allow use of a type() property wherever a type is expected:
>
>(x+1).type() foo(SomeType x) {
>return x+1;
>}
>
>But that strikes me as a lot uglier than using "var" or "T".
>
>Kimberley Burchett
>Endeca, Inc
>
>

Hi,

The first kind of var usage seems nice, but in practice I think it can lead to some obscure pieces of code. In the middle of a large method how are you going to interpret this:


var foo = bar.testIt(baz, fred, 1, 1.9, "\");


With method overloading in D we need to verify the types of baz and fred so we
can discover which version is used. If baz and fred are also var, it can be
somewhat hard to understand what's going on.

Some weeks ago a programming pissing contest was going on in comp.arch and someone
said var is no longer than int. I agree that var is shorter than
almost anything else, but I don't think a new keyword is needed just for this
half-baked type-inference system. Sather 1.2 provides a ::= assignment that
declares the lvalue with the type and value of the rvalue. In Sather 1.3 they are warning
against this usage, because it can lead to maintenance problems. But Sather has
saner type names than C++, and builtin iterators, so they don't have the
std::ListIterator<std::Set<String>> problem ;-)

The second usage is similar to the current template mechanism, only terser. But this
is a problem of template verbosity. IMO it's better to make a simpler template
syntax than to add another concept to do the same job.

Just for the record, I like type systems where inference is possible (e.g. ML
or Haskell), but without unification we must always restrict inference to some
parts of the problem.

Best regards,
Daniel Yokomiso.


January 16, 2003
>The first kind of var usage seems nice, but in practice I think it can lead to some obscure pieces of code. In the middle of a large method how are you going to interpret this:
>
>var foo = bar.testIt(baz, fred, 1, 1.9, "\");
>
>With method overloading in D we need to verify the types of baz and fred so we can discover which version is used. If baz and fred are also var, it can be somewhat hard to understand what's going on.

I agree that the feature can be used in ways that make code more opaque. But then, so can pointers, templates, operator overloading, unions, exceptions, and many more! :)  However I still think something like it is necessary, otherwise templates will be hampered by the type system.

I'd like to point out that this:

var foo = bar.testIt(baz, fred, 1, 1.9, "\");
print(foo);

is completely equivalent to this:

print(bar.testIt(baz, fred, 1, 1.9, "\"));

However, by using the variable declaration you can avoid order of execution problems (e.g. if there were code between the assignment of foo and the call to print() that would change the return value of bar.testIt).  Also, the variable declaration lets you reuse the value more than once without having to call bar.testIt() again.

I've never heard anyone claim that you shouldn't be able to compose expressions because it's not clear what the type is of the intermediate values.  So why is it a problem here?

>Sather 1.2 provides a ::= assignment that declares the lvalue with the type and value of the rvalue.  In Sather 1.3 they are warning against this usage, because it can lead to maintenance problems.

That's interesting.  I didn't know about that feature in Sather.  It's also interesting that they're now deprecating it.  However I somehow doubt that they're deprecating it just because somebody managed to use it to botch up their code.

>IMO it's better to make a simpler template
>syntax than to add another concept to do the same job.

I would tend to agree.  I don't think I like it for function args.


January 16, 2003
Daniel Yokomiso wrote:
> The first kind of var usage seems nice, but in practice I think it can lead to
> some obscure pieces of code. In the middle of a large method how are you going
> to interpret this:
> 
> 
> var foo = bar.testIt(baz, fred, 1, 1.9, "\");

Such a thing is where an IDE comes in to help.  Put your mouse over 'var', and the IDE will tell you the type.  Or, if you prefer, just have the IDE display the underlying type even though you just typed 'var'.

> With method overloading in D we need to verify the types of baz and fred so we
> can discover which version is used. If baz and fred are also var, it can be
> somewhat hard to understand what's going on.

January 17, 2003
"Daniel Yokomiso" <Daniel_member@pathlink.com> wrote in message news:b06pjb$1kcf$1@digitaldaemon.com...
> Hi,
>
> The first kind of var usage seems nice, but in practice I think it can lead to
> some obscure pieces of code. In the middle of a large method how are you going
> to interpret this:

It's always possible to write obscure code.  Doesn't mean you have to.

> var foo = bar.testIt(baz, fred, 1, 1.9, "\");

You have the same problem anyway;  lack of named parameters is confusing enough by itself.

> With method overloading in D we need to verify the types of baz and fred so we
> can discover which version is used. If baz and fred are also var, it can be
> somewhat hard to understand what's going on.

You may not need to understand what's going on.  IDE code browsing features are improving all the time anyway.  Maybe your IDE would have an option to "bake in" all implicit variables, injecting the actual type name back into the source code.

And not all uses are arcane.  It'd probably find a lot of use when doing simple stuff.

> Some weeks ago a programming pissing contest was going on in comp.arch and someone
> said var is no longer than int. I agree that var is shorter than
> almost anything else, but I don't think a new keyword is needed just for this
> half-baked type-inference system.

:= would work fine.

> Sather 1.2 provides a ::= assignment that
> declares the lvalue with the type and value of the rvalue. In Sather 1.3 they are warning
> against this usage, because it can lead to maintenance problems. But Sather has
> saner type names than C++, and builtin iterators, so they don't have the
> std::ListIterator<std::Set<String>> problem ;-)

It may even *increase* maintainability of code.  If you change the parameters, all the implicit temp variables automatically change too (and if they change in a way that doesn't work, the compiler will tell you).  It'd be great for use in templates when you don't know the actual type name anyway.  Most of that could be taken care of by a typeof(x) or x.type() feature though.

The only difference between    x.type() y = x   and   y ::= x   is one of syntactic sugar.    y ::= x   looks a lot sweeter, especially if the actual x is a rather complicated expression.   The programmer's intent is clear;  they want a named temporary variable.  The first form clutters up this intent with syntactic baggage.
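
The same contrast can be sketched in C++11, with `decltype(expr) y = expr` standing in for the explicit form and `auto y = expr` for the sugared one. The helper `sum_of_squares` and the function `demo` are invented for the example. One caveat specific to C++: the two forms coincide here because the expression is a plain value, but they can differ when the expression is a reference.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Stand-in for "a rather complicated expression".
int sum_of_squares(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) total += v * v;
    return total;
}

int demo(const std::vector<int>& values) {
    // Explicit form: the expression appears twice.
    decltype(sum_of_squares(values)) a = sum_of_squares(values);
    // Sugared form: the expression appears once.
    auto b = sum_of_squares(values);
    static_assert(std::is_same<decltype(a), decltype(b)>::value,
                  "both declarations give the same type");
    return a + b;
}
```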

Sather looks interesting semantically but I find the flavor of the syntax disturbing.  I'll have to look into it more.

> The second usage is similar to the current template mechanism, only terser. But this
> is a problem of template verbosity. IMO it's better to make a simpler template
> syntax than to add another concept to do the same job.
> Just for the record, I like type systems where inference is possible (e.g. ML
> or Haskell), but without unification we must always restrict inference to some
> parts of the problem.

You could do some powerful stuff with type inference.  So long as there are no declarative cycles.

As with any feature that allows one to shoot themselves in the foot, care must be used to aim properly.

Exercise:  name one feature of C++ that can't possibly be used to shoot yourself in the foot, one way or another.  ;)

Sean



January 17, 2003
Kimberley Burchett  wrote:
> In article <b04bav$6fd$1@digitaldaemon.com>, Ilya Minkov says...
> 
>>Damn. It doesn't seem reliable to me. In Caml types are only inferred, but it has to do with a number of design limitations:
>> - there are different sets of operators for all types of data. E.g.:
>>int + int, float +. float, bign +/ bign!!! If there's no such thing, you get error-prone situations. Well, that'd be no problem in Delphi, but in C derivatives it is, because integer division and FP division are the same operator here, but have a different meaning!
> 
> 
> I should clarify.  I was assuming that the right-hand-side was fully
> typed.  I'm not suggesting full Hindley-Milner-style type inference
> here.  No unification is necessary.  I probably shouldn't have used
> the word "infer" -- I meant something much simpler.
> 
> The D compiler already has to compute the type of all expressions
> anyway, in order to know whether they're a compile error, or whether
> they require an implicit cast, etc.  So this is just a shorthand to
> say that the type of the rhs should be used for the new variable.
True. OK. That settles most of it. It is a minor change then.


> However the original motivation for the feature didn't even involve
> return types -- it was just for local variables, which are syntactically
> prevented from having recursive definitions anyway.
> 
> 
>> - all integers are the size of pointers -1 bit (e.g. 31 bit), because
>>they are mangled so that the GC can tell a pointer from an integer.
> 
> 
> My proposal has nothing to do with tagged values.
:/
It just explained how a "generic" type comes to be allowed in Caml. This "generic" is simply either an integer or a pointer to any kind of object, and as long as no operations other than equality comparison and copying are done on it, that works. But I don't think there should be such a thing in D.

> 
> If the compiler decides to replace a particular use of the "var"
> keyword with the type "float", then I could just as easily have
> typed "float" in the first place.  Therefore there are no runtime
> implications beyond what the language already allows.
> 
OK. Set.

> 
>>OUCH! safer:
>>operator+ (T, int).type
>>if one really wants to know. I agree there should be some mechanism, but it must work statically, and thus be semantically identical to this one.
> 
> 
> This is the same as what I suggested with the .type() property.  But
> I think it's ugly.  Perhaps in my original post I should have emphasized
> the phrase "*consistent* return types".  What I meant to imply by that
> was that the return type could only be omitted if the function
> returned a value whose type was completely determinable from the types
> of the arguments, and if all of its return points
> return *exactly* the same static type.



>>A function can not be compiled without a type. Thus it would turn into something like a macro or a template.
> 
> 
> Exactly! :)  I was proposing that if the type of an argument was
> omitted, then you get a template.  And if the type of the return
> value is omitted, then it must be deducible from the types of the
> arguments.
> 

OK. A template then. I'm sorry, I guess a lot of the misunderstanding comes from my mistake: I was answering a reply to your message, not your original one. In the statement:

var name expression;

expression is evaluated and a variable 'name' is created with the type of expression, if it has a type. As if there stood

typeof(expression) name;

and typeof() would exist. :) It would be semantically the same as in my earlier post, but much prettier. But then again: this may only be allowed within explicit template instantiation.


The problem with '/' is of a more general nature. In my opinion it is always undesirable when someone writes '7/4' and gets '1'; the operands may be variables whose types are hidden behind something else... In my opinion '/' should *always* return an "extended" type, and there should be a separate operator for integer division.

It's just like what was done with concatenation. Integer division is of a different nature than FP division, while the other operations are of the same nature and give the same results. Maybe take '\' for integer division, since it's not used for line splicing like in C?
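
The concern is easy to reproduce in C++, where `/` is shared between the two kinds of division and an inferred declaration hides which one was picked. A small sketch (the function names are just labels for the two cases):

```cpp
#include <cassert>
#include <type_traits>

// Both operands are int, so `/` is integer division and q is an int.
int int_quotient() {
    auto q = 7 / 4;
    return q;
}

// One operand is double, so `/` is floating-point division and q is a double.
double real_quotient() {
    auto q = 7 / 4.0;
    return q;
}

static_assert(std::is_same<decltype(7 / 4), int>::value,
              "int / int stays int");
static_assert(std::is_same<decltype(7 / 4.0), double>::value,
              "int / double is double");
```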

I'd say it would be generally good if it was possible to use any function, which takes two operands, with infix notation.

Eg:
first $chewTwo second
instead of:
chewTwo(first, second)

Well, it has some problems. First, with infix notation some measure needs to be taken to identify the function both visually and maybe also for the compiler. It may be done by some prefix (#, §, $, \), or maybe by enclosing it between two symbols ( {chewTwo}, /chewTwo/, \chewTwo\ ), or maybe by a postfix or even parentheses:
first chewTwo() second
But if prefix or enclosing are chosen, it could also clarify the syntax in some situations when used with one-parameter function, like:
$chewOne input
instead of:
chewOne (input)
which, though silly when standalone, may be of great use in expressions already choking on parentheses. Like in the old C "(((((lisp)))))" joke.
Another problem with infix notation is that the order of precedence is not specified. Either it has to be explicitly stated in function declarations, or all the functions would get the highest possible precedence, and then... it would possibly save only one level of parentheses or not much more than that, which is still a lot in some cases.

> 
>>:/
>>why not create an iterator object and then:
>>
>>Iterator.Load("blah");
>>while (Iterator.While) {
>>  // action
>>}
> 
> 
> Because then you have to have a single iterator class that's capable
> of iterating over everything.  Otherwise you'd have to use a more
> specific type than just "Iterator".

Well, I have used a similar thing once. I just put such iterator functionality into a class, to iterate over the data in that particular class, probably for lack of a better solution. Though it was OK for the situation, it is not generally usable for iterating over anything and everything.


> 
> 

January 17, 2003
Daniel Yokomiso wrote:
> Sather 1.2 provides
> a ::= assignment that declares the lvalue with the type and value of the rvalue. In
> Sather 1.3 they are warning against this usage, because it can lead to
> maintenance problems.

Careful: I was the maintainer of Sather during that time. I've done lots of code in Sather (a complete compiler written from scratch, at one time) and I can only speak in favor of that "::=" construct. I really urge everyone to consider the "var" suggestion. It definitely *improves* maintainability greatly. Of course, one can use that feature to produce unreadable code, but then, an expression where you can't infer the type by a quick glance is a bad idea anyway. If you've come to need the type written in variable declarations as documentation, you really have a much deeper problem.

One example of where I really hate the lack of such a possibility in C++ is iterators:

        for(Somecontainer<Sometype>::Reverse_iterator i = C.end();i;i--) {
                ...
        };

How about this:

        for(var i = C.end();i;i--) {
                ...
        };

Just consider you want to try out another container type and have to go through all your code to adjust the iterators. Of course, there are ways to avoid that, but they all need some overhead at some other place.
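
A C++11 sketch of that maintenance argument: the loop below never names the iterator type, so the same source works unchanged for different containers. It uses `rbegin()`/`rend()` instead of the pseudo-code loop condition above, and `digits_reversed` is a made-up function for the example.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Walks any container backwards and packs its elements into one number.
// The iterator type follows the container; nothing here has to be edited
// when the container type changes.
template <typename Container>
int digits_reversed(const Container& c) {
    int total = 0;
    for (auto i = c.rbegin(); i != c.rend(); ++i) {
        total = total * 10 + *i;
    }
    return total;
}
```

The same function body is instantiated for `std::vector<int>` and `std::list<int>` alike, whereas a spelled-out `Container::const_reverse_iterator` would at least need the extra `typename` boilerplate.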

Of course, in D, type inference is not always possible from an expression alone, but why not simply make the use of "var" an error in those cases?

Ciao,
Nobbi