why are types all keywords? - D Programming Language Discussion Forum

Thread overview

why are types all keywords?
Jul 08, 2005 Greg Smith
Jul 09, 2005 Anders F Björklund
Jul 10, 2005 AJG
Jul 11, 2005 Hasan Aljudy
Jul 11, 2005 Anders F Björklund
Jul 18, 2005 Greg Smith
Jul 18, 2005 Hasan Aljudy
Jul 19, 2005 Greg Smith
Jul 19, 2005 Derek Parnell
Jul 19, 2005 Hasan Aljudy
Jul 19, 2005 Unknown W. Brackets
Jul 19, 2005 Greg Smith
Jul 20, 2005 Charles Hixson
Jul 20, 2005 Greg Smith
Jul 20, 2005 Ben Hinkle
Jul 20, 2005 Charles Hixson
Jul 21, 2005 Greg Smith

July 08, 2005

why are types all keywords?

Posted by Greg Smith

One of the problems with C/C++ is that you can't parse it unless
you know which words are type names, or typedef/classes; and for
this reason C/C++ type names need to be keywords. D has modified
the syntax so that a parser does not need to know in advance that
certain identifiers represent user-defined types. I think this
is a great step forward.

The question is: why are all the built-in type names (and there
are a lot of them) still keywords? They don't need to be, and I don't
see how it can do any good to make them keywords. I count about
24 keywords which are types. These could all be predefined identifiers.

So why is this bad? Part of it is just a personal bias - if I plot
a chart of all the languages I've used, with 'niceness of language'
vs. 'number of keywords', there is a strong inverse correlation -
Python is in one corner, and DEC Compiled BASIC (yes, I'm that old)
is far into the other corner.

But there are some good reasons to avoid superfluous
keywords. Keywords by definition have the same enforced meaning
everywhere - if you add new keywords, you will break code which
has any local, global, struct member, or anything else with the
same name. I remember a long time ago, a buddy was baffled that
his C code wouldn't compile in C++; it turned out he had a struct
member called 'this' or 'catch' or something (this was before the
days of syntax coloring).
In D, if new types are ever added - or new predefined values such
as 'true' and 'false' - they can be added as predefined identifiers
without breaking anything. So, why not do it that way from the beginning?

Languages which implement built-in types (and constants) as predefined identifiers include Pascal, VHDL, and Python (to the extent that it has type names, they are __builtin__ type objects and not keywords).
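
The Python case can be checked directly. The following is a minimal sketch, assuming Python 3, where the module is named `builtins` rather than `__builtin__`; the helper names `shadow_int` and `try_rebinding_keyword` are made up for this illustration. It shows that type names are ordinary identifiers that can be shadowed locally, whereas a real keyword is rejected by the parser.

```python
import builtins
import keyword

# Type names are ordinary identifiers bound in the builtins namespace.
type_names = ["int", "float", "bool", "str"]
are_keywords = [keyword.iskeyword(name) for name in type_names]
are_builtins = [hasattr(builtins, name) for name in type_names]


def shadow_int():
    # Legal: 'int' is rebound only inside this function's scope.
    int = "just a local variable"
    return int


def try_rebinding_keyword():
    # 'while' is a real keyword, so this source text fails in the parser.
    try:
        compile("while = 1", "<example>", "exec")
    except SyntaxError:
        return "syntax error"
    return "accepted"


shadowed = shadow_int()
keyword_result = try_rebinding_keyword()
still_a_type = builtins.int("42")  # the built-in is untouched outside the function
```

One caveat: in Python 3 the constants `True`, `False` and `None` are keywords, so the comparison holds for type names but no longer for those constants.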

D does not define property names as keywords, why are 'true'
and 'false' and all those type names keywords?
It could be argued that 'this' doesn't need to be a keyword. I might want to have a struct member called 'this'; syntactically, it could be a predefined local variable.

In D as currently implemented,

    i = int + 2;

... is a syntax error, whereas

    alias int myint;
    i = myint + 2;

... is syntactically legal, but disallowed at the semantic level.
Is this difference important or desirable?
Making 'int' a predefined identifier would cause these two to be
treated the same way in terms of compiler diagnostics.

It might be argued that it would be very dangerous to allow
functions to define a local variable called 'float'. In C, this
could break code which is secretly inserted by macros or #include.
But (a) D doesn't have these (b) *anything* can be broken in C by these
things. In any case, you can always make it illegal to redefine float as
a variable, while still allowing it in, say, struct namespaces. With
a keyword, no such distinction is possible.

- greg

July 09, 2005

Re: why are types all keywords?

Posted by Anders F Björklund
in reply to Greg Smith

Greg Smith wrote:

> D does not define property names as keywords, why are 'true'
> and 'false' and all those type names keywords?

Good question, and I was asking the same thing actually...
(but didn't get any answers so I eventually gave up on it)

Another strange thing is that "bool" is *not* a keyword,
although you would expect it to be, as the type of those two.

I wouldn't mind if bool, true and false were all moved to
e.g. std.stdbool (which could still be included by default ?)

Just like in C99:
http://www.opengroup.org/onlinepubs/009695399/basedefs/stdbool.h.html

Then again, I'm secretly plotting the swift demise of the
"bit" type so you probably shouldn't pay any attention to me. ;-)

That is: it would be even better if bool, true and false were
a new type of their own - but that isn't ever going to happen.

--anders

July 10, 2005

Re: why are types all keywords?

Posted by AJG
in reply to Greg Smith

Hi,

# class Int { public int max = 1337; }
#
# int someFunc() {
#     Int int = new Int();
#     return (int.max); // Is this 1337 or 2147483647?
# }

I think making intrinsic types keywords is a Good Thing™, but perhaps I'm not understanding your proposal correctly. Would you want something like the above to be legal?

--AJG.




July 11, 2005

Re: why are types all keywords?

Posted by Hasan Aljudy
in reply to Greg Smith

Greg Smith wrote:
> One of the problems with C/C++ is that you can't parse it unless
> you know what words are  type names, or typedef/classes; and for
> this reason C/C++ type names need to be keywords. D has modified
> the syntax so that a parser does not need to know in advance that
> certain identifiers represent user-defined types. I think this
> is a great step forward.
> 
> The question is: why are all the built-in type names (and there
> are a lot of them) still keywords? They don't need to be, and I don't
> see how it can do any good to make them keywords. I count about
> 24 keywords which are types. These could all be predefined identifiers.

I just don't get it ...

What's the point of making something like "int" not a keyword?

#int int; //wth?
#class int
#{
# static int max = 1337; //wtf is int here? variable? type? class?
#}
#float double = int.max; //go figure
#double bit = cast(typeof( double )) int;


> 
> So why is this bad? Part of it is just a personal bias - if I plot
> a chart of all the languages I've used, with 'niceness of language'
> vs. 'number of keywords', there is a strong inverse correlation -
> python is in one corner, and Dec Compiled BASIC (yes, I'm that old)
> is far into the other corner.

I hope the method in which you measure "niceness of language" doesn't include "number of keywords" ...

> 
> But there are some good reasons to avoid superfluous
> keywords. Keywords by  definition have the enforced meaning
> everywhere - if you add new
> keywords, you will break code which has any local, global, struct
> member, or anything with the same name. 

A good compiler will quickly point out the error to you, and hopefully it can be easily fixed with find and replace in files :)

> I remember a long time
> ago, a buddy was baffled that his C code wouldn't compile in
> C++, it turned out he had a struct member called 'this' or 'catch'
> or something (this was before the days of syntax coloring).

Why didn't his compiler tell him that "this" is a keyword?

> In D, if new types are ever added - or new predefined values such
> as 'true' and 'false' - they can be added as predefined identifiers
> without breaking anything. So, why not do it that way from the beginning?
> 
> Languages which implement built-in types (and constants) as predefined identifiers include Pascal and VHDL, and python (to the extent that it has type names, they are __builtin__ type objects and not keywords).
> 


> D does not define property names as keywords, why are 'true'
> and 'false' and all those type names keywords?
I think true and false are not just aliases for 0 and 1 (or at least, I hope so)

> It could be argued that 'this' doesn't need to be a keyword. I might want to have a struct member called 'this'; syntactically, it could be a predefined local variable.

yeah, that's a bad argument.

> 
> In D as currently implemented,
> 
>     i = int + 2;
> 
> .. is a syntax error, whereas
> 
>         alias int myint;
>          i = myint + 2;
> 
>  ... is syntactically legal, but disallowed at the semantic level.
> Is this difference important or desirable?

I don't see your point .. both are errors.

> Making 'int' a predefined identifier would cause these two to be
> treated the same way in terms of compiler diagnostics.
> 
> It might be argued that it would be very dangerous to allow
> functions to define a local variable called 'float'. In C, this
> could break code which is secretly inserted by macros or #include.
> But (a) D doesn't have these (b) *anything* can be broken in C by these
> things. In any case, you can always make it illegal to redefine float as
> a variable, while still allowing it in, say, struct namespaces. With
> a keyword, no such distinction is possible.
> 
> - greg
> 
> 

I don't see one single real problem with the issue.

If it ain't broken, don't fix it.

July 11, 2005

Re: why are types all keywords?

Posted by Anders F Björklund
in reply to Hasan Aljudy

Hasan Aljudy wrote:

>> D does not define property names as keywords, why are 'true'
>> and 'false' and all those type names keywords?
> 
> I think true and false are not just aliases for 0 and 1 (or atleast, I hope so)

Sorry, but in D:
"true" is a constant bit of 1, and "false" is a constant bit of 0.

const bit true = 1;
const bit false = 0;

They just happen to be implemented inside the D compiler itself...

	case TOKtrue:
	    e = new IntegerExp(loc, 1, Type::tbit);
	    nextToken();
	    break;

	case TOKfalse:
	    e = new IntegerExp(loc, 0, Type::tbit);
	    nextToken();
	    break;

See http://www.prowiki.org/wiki4d/wiki.cgi?BitsAndBools

--anders

July 18, 2005

Re: why are types all keywords?

Posted by Greg Smith
in reply to Hasan Aljudy

Hasan Aljudy wrote:
> Greg Smith wrote:
> 
>>
>> The question is: why are all the built-in type names (and there
>> are a lot of them) still keywords? They don't need to be, and I don't
>> see how it can do any good to make them keywords. I count about
>> 24 keywords which are types. These could all be predefined identifiers.
> 
> 
> I just don't get it ...
> 
> What's the point of making something like "int" not a keyword?
> 
> #int int; //wth?
> #class int
> #{
> # static int max = 1337; //wtf is int here? variable? type? class?
> #}
> #float double = int.max; //go figure
> #double bit = cast(typeof( double )) int;
> 
What's the point of *making* it a keyword???

Yes, this change would allow you to redefine int. It's possible in
other languages, and they haven't self-destructed as a result.
If this is a problem, you could make it illegal to redefine built-in
names in certain scopes. If they are keywords, then this level of
control is not possible.

My point is, there's no reason to make it a keyword, unless you want
it to always be (effectively) a special punctuation mark, in *all*
possible contexts, and you want to extend that to *all* the built-in
types, despite the fact that user-defined types don't have or need this
special treatment, and you don't mind putting in extra grammar rules to
deal with the fact that type names could be these keywords *or* identifiers.
> 
>>
>> So why is this bad? Part of it is just a personal bias - if I plot
>> a chart of all the languages I've used, with 'niceness of language'
>> vs. 'number of keywords', there is a strong inverse correlation -
>> python is in one corner, and Dec Compiled BASIC (yes, I'm that old)
>> is far into the other corner.
> I hope the method in which you measure "niceness of language" doesn't include "number of keywords" ...

Actually, that was a contributing factor for the Compiled BASIC.
There were several pages of keywords, and every word that had
anything to do with computing was in there somewhere. So you had
to make variable names with spelling errors in them, like 'rekord'.
But, no, in general, languages with a large number of keywords
seem to be designed along the principle that you should cram
as much as possible into the core language, and that shows in
other ways as well.

Also, languages with a lot of keywords often have big, clumsy, bloated
grammars, and need to have a lot of keywords to direct the parser. Keywords were invented for the purpose of adding extra punctuation to
the token set, to help the grammar. I don't see any point in making more keywords than are needed for this purpose. D has >20 keywords which are not needed for the grammar, and therefore the grammar is more complicated than it needs to be, and error messages are generally less informative as a side-effect.
> 
>>
>> But there are some good reasons to avoid superfluous
>> keywords. Keywords by  definition have the enforced meaning
>> everywhere - if you add new
>> keywords, you will break code which has any local, global, struct
>> member, or anything with the same name. 
> 
> 
> A good compiler will quickly point out to you the error and hopefully it can be easily fixed, find and replace in files :)
>
In practice, you get baffling error messages. I've been through that,
after they added 'xor' and 'and', etc., to the C++ keyword list, without
checking first if it was OK with me :-).

> 
>> I remember a long time
>> ago, a buddy was baffled that his C code wouldn't compile in
>> C++, it turned out he had a struct member called 'this' or 'catch'
>> or something (this was before the days of syntax coloring).
> 
> 
> Why didn't his compiler tell him that "this" is a keyword?
>
Why on earth would it do that? It reported a syntax error,
since a keyword appeared in a position where it was not
allowed by the grammar. A lot of tokens other than 'identifier'
are allowed there - so you wouldn't even get something as helpful as
"error at 'try' - expected 'identifier'".
Try it with your favourite C++ compiler.
> 
>> In D, if new types are ever added - or new predefined values such
>> as 'true' and 'false' - they can be added as predefined identifiers
>> without breaking anything. So, why not do it that way from the beginning?
>>
...
> 
>> D does not define property names as keywords, why are 'true'
>> and 'false' and all those type names keywords?
> 
> I think true and false are not just aliases for 0 and 1 (or atleast, I hope so)
>
The point is, they are keywords, therefore these words are not available in contexts where they could otherwise be locally redefined.
> 
> 
>>
>> In D as currently implemented,
>>
>>     i = int + 2;
>>
>> .. is a syntax error, whereas
>>
>>         alias int myint;
>>          i = myint + 2;
>>
>>  ... is syntactically legal, but disallowed at the semantic level.
>> Is this difference important or desirable?
> 
> 
> I don't see your point .. both are errors.
>
Here's the point:
 (1) they are both, essentially, the same error, why should they
     produce completely different error messages?
 (2) the error message you get for the second one,
   "can't do that to a type", is much more useful than the one you get
   for the first, "syntax error".
 (3) The compiler writer's job is more complex for no benefit:
   you have a grammar with two different ways of matching types, which
   may be certain keywords or any identifier; and that grammar rejects
   certain erroneous constructs, such as adding things to built-in
   types, but you *still* need to check that  expressions are
   well formed since any identifier could in fact be a type name.

It should be quite clear that the D grammar[*] would be simpler if type
names were not keywords; and that the semantic checking required to
compensate is already in place to deal with user-defined types.

[*] by this I mean the one in the compiler, not the one in the documentation; the latter tends to assume you know a priori what names
are types, whereas you don't in practice.
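
To make points (1) to (3) concrete, here is a minimal Python sketch of a toy front end. Everything in it (the `check` function, the `BUILTIN_TYPES` set, the diagnostic strings) is hypothetical and is not how the D compiler is actually structured; it only handles expressions of the form `name + number`, and it assumes any name that is not a type is a declared variable.

```python
# Toy front end for expressions of the form "<name> + <number>".
# All names and messages here are hypothetical illustrations.

BUILTIN_TYPES = {"int", "float", "char"}


def check(expr, user_types, types_are_keywords):
    """Return the diagnostic a toy compiler would give for 'name + number'."""
    name, op, number = expr.split()
    assert op == "+" and number.isdigit()

    if types_are_keywords and name in BUILTIN_TYPES:
        # The lexer emits a keyword token; the expression grammar has no
        # rule that starts with a type keyword, so the parser gives up.
        return "syntax error"

    # Otherwise the name is lexed as a plain identifier and parsing succeeds;
    # the semantic pass then looks the name up in the symbol table.
    symbol_table = set(user_types)
    if not types_are_keywords:
        symbol_table |= BUILTIN_TYPES  # built-ins predefined as identifiers
    if name in symbol_table:
        return f"semantic error: '{name}' is a type, not a value"
    return "ok"  # assumed to be a declared variable


keyword_design = [
    check("int + 2", {"myint"}, types_are_keywords=True),
    check("myint + 2", {"myint"}, types_are_keywords=True),
]
identifier_design = [
    check("int + 2", {"myint"}, types_are_keywords=False),
    check("myint + 2", {"myint"}, types_are_keywords=False),
]
```

In the keyword design the two inputs take different paths and get different messages; in the identifier design both go through the same symbol-table lookup, which has to exist anyway for user-defined types.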

> I don't see one single real problem with the issue.
> 
> If it ain't broken, don't fix it.

Hey, then why bother with D? Use C/C++. It seems to work OK, people
use it for a lot of stuff.

How about this: the language is in its early development. It is still
possible to make changes like this. It will be much, much harder in the
future. I still don't see one reason why there *should* be so many keywords (other than the fact that it's already done that way), and I've pointed out a few reasons why IMHO it's better, and cleaner, not to.

The fact that C does the same thing does not qualify as a reason, since
it's a stated goal of D to eliminate the very reason C needs to do that.
So, having gone to the trouble to eliminate the need for keywords... why
are they still there???

July 18, 2005

Re: why are types all keywords?

Posted by Hasan Aljudy
in reply to Greg Smith

Greg Smith wrote:
> Hasan Aljudy wrote:
[snip]
>>
>> I just don't get it ...
>>
>> What's the point of making something like "int" not a keyword?
>>
>> #int int; //wth?
>> #class int
>> #{
>> # static int max = 1337; //wtf is int here? variable? type? class?
>> #}
>> #float double = int.max; //go figure
>> #double bit = cast(typeof( double )) int;
>>
> What's the point of *making* it a keyword???
> 
> Yes, this change would allow you to redefine int. it's possible in
> other languages, and they haven't self-destructed as a result.

Sorry, all the languages I've worked with are from the C family (C, C++, Java, D) with the exception of Pascal.

How do other languages implement that?

>  If this is a problem, you could make it illegal to redefine built-in names in certain scopes. If they are keywords, then
> this level of control is not possible.

The ability to use "int" or "float" or "this" for one's own purposes is not really an advantage.

> 
> My point is, there's no reason to make it a keyword, unless you want
> it to always be (effectively) a special punctuation mark, in *all*
> possible contexts, and you want to extend that to *all* the built-in
> types, despite the fact that user-defined types don't have or need this
> special treatment, and you don't mind putting in extra grammar rules to
> deal with the fact that type names could be these keywords *or* identifiers.

I still don't get your point ....
It's a keyword because, well, how do you define a variable to be of a certain type? Well, you use a "type name" to specify the type of a variable.

type_name variable_name;

You can define your own types, but your own types will always be defined in terms of other types.

typedef oldtype newtype;

struct new_type
{
    some_known_type field1;
    some_other_known_type field2;
    //.. etc
}

Every new type is defined in terms of other type(s), so in the end there must be a type which isn't defined in terms of anything.

int is such a type.

If it's not a keyword, then it can be turned on and off.
Well, how do you turn it "on"? And what would be the point of having it turned off?

[snip]
>>> I remember a long time
>>> ago, a buddy was baffled that his C code wouldn't compile in
>>> C++, it turned out he had a struct member called 'this' or 'catch'
>>> or something (this was before the days of syntax coloring).
>>
>>
>>
>> Why didn't his compiler tell him that "this" is a keyword?
> 
>  >
> Why on earth would it do that? it reported a syntax error,
> since a keyword appeared in a position where it was not
> allowed by the grammar. A lot of tokens other than 'identifier'
> are allowed there - so you wouldn't even get something as helpful as
> "error at 'try' - expected 'identifier'"
>  Try it with your favourite C++ compiler.

I'm just saying the problem here is the error message, not the keyword.

>>> In D as currently implemented,
>>>
>>>     i = int + 2;
>>>
>>> .. is a syntax error, whereas
>>>
>>>         alias int myint;
>>>          i = myint + 2;
>>>
>>>  ... is syntactically legal, but disallowed at the semantic level.
>>> Is this difference important or desirable?
>>
>>
>>
>> I don't see your point .. both are errors.
> 
>
> Here's the point:
>  (1) they are both, essentially, the same error, why should they
>      produce completely different error messages?

Because .. they can be treated differently.
For
#int + 2
there is no way around it except to use something other than int.
But for
#myint + 2
you can redefine myint to be a variable, or you can use something other than myint.

>  (2) the error message you get for the second one,
>    "can't do that to a type", is much more useful than the one you get
>    for the first, "syntax error".

So? Ask the compiler writer to produce a more informative error message!

>  (3) The compiler writer's job is more complex for no benefit:
>    you have a grammar with two different ways of matching types, which
>    may be certain keywords or any identifier; and that grammar rejects
>    certain erroneous constructs, such as adding things to built-in
>    types, but you *still* need to check that  expressions are
>    well formed since any identifier could in fact be a type name.
> 
> It should be quite clear that the D grammar[*] would be simpler if type
> names were not keywords; and that the semantic checking required to
> compensate is already in place to deal with user-defined types.
> 
> [*] by this I mean the one in the compiler, not the one in the documentation; the latter tends to assume you know a priori what names
> are types, whereas you don't in practice.

Ok, how would that help the language user?

I never wrote a compiler, and I don't have a clue what you are talking about.

But, assuming that you are correct, and that it does indeed make writing the compiler easier .. your point still doesn't stand.

The compiler has already been written!

I think it would be much easier for the compiler author to use what he has already written than to rewrite the compiler to accommodate your suggestion.

> 
>> I don't see one single real problem with the issue.
>>
>> If it ain't broken, don't fix it.
> 
> 
> Hey, then why bother with D? Use C/C++. It seems to work OK, people
> use it for a lot of stuff.

Because C++ is broken.
And Java is broken too.

> 
> How about this: the language is in its early development. It is still
> possible to make changes like this. It will be much, much harder in the
> future. I still don't see one reason why there *should* be so many keywords (other than the fact that's already done that way)  and I've pointed out a few reasons why IMHO it's better, and cleaner, not to.

Where are those reasons? I didn't see them.
The only reasons were:
1- so you can use "int" as a variable name or something else.
2- easier to implement in a compiler.

But #1 is not really a practical reason, and I already answered #2.

> The fact that C does the same thing does not qualify as a reason, since
> it's a stated goal of D to eliminate the very reason C needs to do that.
> So, having gone to the trouble to eliminate the need for keywords... why
> are they still there???

Where does the documentation state that D's goal is to eliminate C's need for keywords?

July 19, 2005

Re: why are types all keywords?

Posted by Unknown W. Brackets
in reply to Greg Smith

I don't agree, and my reason is exactly the thing you mention first:

Parsing it.

Wait, you say: but, we already made it clear that you don't need to know type names to parse it (in D.)  Quite astute, yes... but, consider the following code:

int dumb()
{
   static int i = 0;

   return 42 + i++;
}

Now, let's say that int isn't a keyword.  Let's even say, for the sake of argument, I can name a function int as well:

int int()
{
   return 42;
}

Now, again, this should be parseable because we know some things here... but, most basic highlighters will not do this correctly.

In fact, I would argue that the fact we can do this:

int bool()
{
   return 1;
}

(which we can...) is the problem here.  Not only is that confusing, but the only editor I can think of offhand which has highlighting powerful enough to handle that is Microsoft Visual Studio - and even then, it would be *fun* to make it tell the difference, based on the way it works.

I will agree that if D is going to make every text editor/IDE/etc. developer's head ache, it should do it from the start... otherwise, not at all (not for types!)

As it is, in languages like PHP... int can't really be highlighted by most highlighters - because you can use it as a type *AND* as a function name.  This is horrible, in my opinion, and detrimental - although its negative effects are limited, in this case, because variables have a $ prefix.
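
The highlighting concern can be sketched with a deliberately naive word-list highlighter, written here in Python. The `TYPE_WORDS` set and the `highlight` function are hypothetical and do not describe any real editor; the point is only that a highlighter working from a fixed word list cannot tell a type from a function that happens to share its name.

```python
import re

# Hypothetical word list a simple editor might ship for D.
TYPE_WORDS = {"int", "bool", "float"}


def highlight(source):
    """Wrap every word found in TYPE_WORDS in square brackets."""
    def mark(match):
        word = match.group(0)
        return f"[{word}]" if word in TYPE_WORDS else word

    return re.sub(r"[A-Za-z_]\w*", mark, source)


# 'bool' is a function name here, but the word list marks it as a type anyway.
marked = highlight("int bool() { return 1; }")
```

Getting this right needs at least a parser, since only the position of `bool` in the declaration shows that it is a function name.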

-[Unknown]



July 19, 2005

Re: why are types all keywords?

Posted by Greg Smith
in reply to Hasan Aljudy

Hasan Aljudy wrote:
> 
> Greg Smith wrote:
> 
>> Hasan Aljudy wrote:
> [snip]
>>>
>>> I just don't get it ...
>>>
>>> What's the point of making something like "int" not a keyword?
>>>
>>> #int int; //wth?
>>> #class int
>>> #{
>>> # static int max = 1337; //wtf is int here? variable? type? class?
>>> #}
>>> #float double = int.max; //go figure
>>> #double bit = cast(typeof( double )) int;
>>>
>> What's the point of *making* it a keyword???
>>
>> Yes, this change would allow you to redefine int. it's possible in
>> other languages, and they haven't self-destructed as a result.
> 
> Sorry, all the languages I'v worked with are from the C family (C, C++, Java, D) with the exception of Pascal.
> 
> How do other languages implement that?

Very simple. You go into the symbol table at startup -- the same one
into which the user names go -- and you predefine the names there as types. Pascal does this, and you've probably never noticed. See?
It doesn't hurt at all.
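
To make that concrete, here is a minimal sketch in Python (not D, and not how any real compiler is written) of a symbol table that is seeded with the built-in type names at startup. The class name `SymbolTable` and its methods are made up for illustration.

```python
class SymbolTable:
    """One table holds both predefined and user-declared names."""

    def __init__(self):
        self.symbols = {}
        # Seed the table at startup. The lexer never hears about these
        # names; they are ordinary identifiers with a predefined meaning.
        for name in ("int", "byte", "float", "double"):
            self.symbols[name] = ("type", "builtin")

    def declare_type(self, name):
        # User-defined types go into the very same table.
        self.symbols[name] = ("type", "user")

    def is_type(self, name):
        entry = self.symbols.get(name)
        return entry is not None and entry[0] == "type"


table = SymbolTable()
table.declare_type("myint")
```

After this, `table.is_type("int")` and `table.is_type("myint")` are answered by the same lookup, which is the whole point.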

> 
>>  If this is a problem, you could make it illegal to redefine built-in names in certain scopes. If they are keywords, then
>> this level of control is not possible.
> 
> 
> The ability to use "int" or "float" or "this" for one's own purposes is not really an advantage.
> 
No, that's not the point. You can still make it illegal to redefine these. What's the difference between making it illegal to redefine them and making them keywords?
  (1) By making them keywords, you complicate the grammar and gain no advantage by doing so; the grammar must still support type names which are identifiers.
  (2) By making them keywords, you cause them to be treated differently from user-defined types in the parser and semantic passes. Functionality has to be duplicated in the compiler, since 'int' is discovered to be a type in the parser, while 'myint' is seen as an identifier in the parser and is only discovered to be a type in the semantic processing. This means more complexity than needed, and leads to inconsistent, less useful diagnostics.
  (3) New built-in types can be added to the language in future as predefined identifiers, with much less likelihood of breaking old code than if they are added as new keywords.
  (4) If they are defined as identifiers, you can make it illegal to redefine them in specific contexts. With keywords there is no such control.

To appeal to the KISS principle:
  - If the built-in types can be implemented in the same way as the user-defined types, why not do so? If you want to make it illegal
to redefine these, fine - but why chisel them into stone in the parser
when the grammar doesn't need this, and would be simpler without it?

>>
>> My point is, there's no reason to make it a keyword, unless you want
>> it to always be (effectively) a special punctuation mark, in *all*
>> possible contexts, and you want to extend that to *all* the built-in
>> types, despite the fact that user-defined types don't have or need this
>> special treatment, and you don't mind putting in extra grammar rules to
>> deal with the fact that type names could be these keywords *or* identifiers.
> 
> 
> I still don't get your point ....
> It's a keyword because, well, how do you define a variable to be of a certain type? well, you use a "type name" to specify the type of a variable.
> 
> type_name variable_name;
> 
> You can define your own types, but your own types will always be defined in terms of other types.
> 
> typedef newtype oldtype;
> 
> struct new_type
> {
>     some_known_type field1;
>     some_other_known_type field2;
>     //.. etc
> }
> 
> every new type is defined in terms of other type(s), there must be in the end a type which isn't defined in terms of anything.
> 
> int is such a type.
> 
> if it's not a keyword, then it can be turned on and off.
> well, how do you turn it "on"? and what would be the point of having turned off?
> 
Clearly, all types have to start from built-in types. This is immaterial
to whether the built-in types are defined in the grammar as keywords, or
in the symbol table as predefined names, as in Pascal.

You are saying this: because there is no point in redefining them, they
should be cast in stone in the parser. I mildly disagree with the premise, and I utterly disagree with the conclusion.

Regarding the premise, as I have pointed out, what if you want to add a new built-in type? If you define it as a new keyword, it might conflict with a local variable name in some existing code.

If you want it to be illegal to redefine certain names, this is fine, but this does not by any means mean they need to be keywords!!
IMHO this should be done in the symbol table, not by making keywords that are not required by the grammar. This is much simpler in the long run; it leads to better error messages, e.g. "can't redefine 'int' in this namespace" vs. "Syntax error"; and it allows control by scope, e.g. you might want to allow some names to be used as struct members.
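
A rough sketch of what I mean, in Python for brevity. Everything here (`Scope`, `RedefinitionError`, the set of scope kinds allowed to reuse a name) is a hypothetical policy chosen for illustration, not anything D does today.

```python
class RedefinitionError(Exception):
    pass


BUILTIN_TYPES = {"int", "float", "byte"}
# Hypothetical policy: scope kinds in which a built-in name may be reused.
SCOPES_ALLOWING_REUSE = {"struct"}


class Scope:
    def __init__(self, kind, parent=None):
        self.kind = kind          # e.g. "module", "function", "struct"
        self.parent = parent
        self.names = {}

    def declare(self, name, meaning):
        # The check lives in the symbol table, so the message can say
        # exactly what went wrong instead of "syntax error".
        if name in BUILTIN_TYPES and self.kind not in SCOPES_ALLOWING_REUSE:
            raise RedefinitionError(
                f"can't redefine '{name}' in this {self.kind} scope")
        self.names[name] = meaning

    def lookup(self, name):
        scope = self
        while scope is not None:
            if name in scope.names:
                return scope.names[name]
            scope = scope.parent
        # Fall back to the predefined meaning.
        return "builtin type" if name in BUILTIN_TYPES else None
```

With this, `Scope("function").declare("float", "variable")` is rejected with a specific message, while `Scope("struct").declare("float", "member")` is accepted. A keyword cannot make that distinction.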


> 
> [snip]
> 
>>>> I remember a long time
>>>> ago, a buddy was baffled that his C code wouldn't compile in
>>>> C++, it turned out he had a struct member called 'this' or 'catch'
>>>> or something (this was before the days of syntax coloring).
..
>>> Why didn't his compiler tell him that "this" is a keyword?
 ..
>> Why on earth would it do that? it reported a syntax error,
>> since a keyword appeared in a position where it was not
>> allowed by the grammar. A lot of tokens other than 'identifier'
>> are allowed there - so you wouldn't even get something as helpful as
>> "error at 'try' - expected 'identifier'"
>>  Try it with your favourite C++ compiler.
> 
> I'm just saying the problem here is the error message, not the keyword.

I fully agree. And the best way to get better error messages is to allow the semantic pass to see these errors, rather than making them syntax errors, which is what happens when keywords are defined.
> 
>>>> In D as currently implemented,
>>>>
>>>>     i = int + 2;
>>>>
>>>> .. is a syntax error, whereas
>>>>
>>>>         alias int myint;
>>>>          i = myint + 2;
>>>>
>>>>  ... is syntactically legal, but disallowed at the semantic level.
>>>> Is this difference important or desirable?
>>>
>>>
>>>
>>>
>>> I don't see your point .. both are errors.
>>
>>
>>
>> Here's the point:
>>  (1) they are both, essentially, the same error, why should they
>>      produce completely different error messages?
> 
> 
> because .. they can be treated differently.
> for
> #int + 2
> there is no way around using something other than int.
> but for
> #myint + 2
> you can redefine myint to be a variable, or you can use something other than myint.
> 
Please step back and think about what I am trying to say with this example. Of course they can be, and are, treated differently; I know
why the behavior occurs.
I'm saying there's no advantage to this, and there are disadvantages
which are eliminated by eliminating the keywords.
They are treated differently because the *parser* knows 'int' is a type name, and has no rule allowing it to add a type to something; but the parser does have a rule saying an identifier can be added to something. What I'm saying is: if 'int'
were *not* a keyword, we could drop the keyword-specific rules, simplify the
grammar, get better diagnostics, shorten the keyword table (and thus
speed up the lexer) ... the compiler code which rejects 'int + 2' would
then be the same code which rejects 'myint + 2'.
Is there any advantage to treating them differently?

>>  (2) the error message you get for the second one,
>>    "can't do that to a type", is much more useful than the one you get
>>    for the first, "syntax error".
> 
> 
> so? ask the compiler writer to produce a more informative error message!

You say later that you aren't familiar with compilers, and no offence,
but that's showing here.
By far the easiest way to improve the error message is to do away with the unnecessary keywords. It's very hard to produce helpful messages for errors which arise because no grammar rule is applicable. A syntax error is basically the parser saying "huh?". At best, it can tell you where it became irrevocably confused, and tell you what kinds of tokens are legal at that point. It is possible to add additional grammar rules, solely for the purpose of matching specific illegal constructs, so that they can be given more meaningful error messages. This gets rather messy. And in this case, the desirable grammar rules already exist -- with 'identifier' in them, so that they don't apply when types happen to
be built-in types.

It is far easier in the semantic phase to guess at what the programmer was trying to do, and produce a useful error message. Imagine a language which allows array declarations sized by integer constants, or expressions formed of integer constants. It would be possible to make 'int a[-3]' a syntax error in such a language, by contriving the grammar so that no rule matched it. Far better to make it syntactically legal, so the message is "error: negative array dimension for 'a'", rather than "syntax error". The test would be needed anyhow, since the grammar can't make "int a[7-10]" illegal.
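
Here is a small sketch of that array-dimension check, written in Python. Python's own `ast` module stands in for the syntax phase of the imaginary language, and `fold` and `check_array_dimension` are names I made up for the example. The grammar accepts any integer constant expression; the sign is only examined afterwards.

```python
import ast
import operator

# Operators allowed in a constant dimension expression (an assumption
# for this sketch; a real language would define its own set).
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def fold(node):
    """Evaluate an expression built only from integer constants."""
    if isinstance(node, ast.Constant) and isinstance(node.value, int):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](fold(node.left), fold(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -fold(node.operand)
    raise ValueError("not an integer constant expression")


def check_array_dimension(name, dimension_source):
    """Semantic check: return an error message, or None if the size is fine."""
    value = fold(ast.parse(dimension_source, mode="eval").body)
    if value < 0:
        return f"error: negative array dimension for '{name}'"
    return None
```

Both `check_array_dimension("a", "-3")` and `check_array_dimension("a", "7-10")` go through the same test and produce the same message, which no grammar trick could arrange.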

Actually, we could get this improvement in D by modifying the grammar like this:
   identifier_or_type::
             IDENTIFIER  { $$ = lookup_ident($1); }
          |  INT     { $$ = /* .. type obj for 'int' */ }
          |  BYTE    { $$ = /* .. type obj for 'byte' */ }
        ...

... and eliminating all other rules referencing the type keywords, which, by D charter, are actually redundant; and then using 'identifier_or_type' in place of most IDENTIFIER references (not the ones where IDENTIFIER is assigned a meaning).

Thus, 'int + 2' would be caught by the same code as 'myint + 2'.
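
As a toy illustration of that (Python, with hypothetical names; the operands are plain strings rather than real parse nodes), the function `identifier_or_type` below mirrors the grammar rule: every alternative yields the same kind of object, so a single semantic check handles both cases.

```python
BUILTIN_TYPES = {"int", "byte"}


def identifier_or_type(name, symbols):
    """Mirror of the grammar rule: each alternative yields (name, kind)."""
    if name in BUILTIN_TYPES:      # the INT / BYTE alternatives
        return (name, "type")
    if name in symbols:            # the IDENTIFIER alternative
        return (name, symbols[name])
    raise NameError(f"undefined identifier '{name}'")


def check_add(left, right, symbols):
    """Semantic check for 'left + right'; returns a message or None."""
    for operand in (left, right):
        if operand.isdigit():      # integer literal, nothing to look up
            continue
        name, kind = identifier_or_type(operand, symbols)
        if kind == "type":
            return f"error: '{name}' is a type, not a value"
    return None


# 'myint' stands for "alias int myint;", 'i' for an ordinary variable.
symbols = {"myint": "type", "i": "variable"}
```

`check_add("int", "2", symbols)` and `check_add("myint", "2", symbols)` now fail in the same place with the same kind of message, and `check_add("i", "2", symbols)` passes.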

This change obtains most of the improvement I'm looking
for while still preventing the names from being redefined. It's then a relatively small step to eliminate this one weird bit of grammar and provide predefined symbols.

> 
> 
> Ok, how would that help the language user?
> 
> I never wrote a compiler, and I have no bit of clue about what you are talking about.
> 


> But, assuming that you are correct, and that it does indeed make writing the compiler easier .. your point still doesn't stand.
> 
> The compiler has already been written!
> 
> I think it would be much easier for the compiler author to use what he had already written than to rewrite the compiler to compensate for your suggestion.
>
This is a valid point in general, but there are times, and precious few of them, when there is an opportunity to get things right even if it means changing something which already works as it is. D is, by charter, in such a situation. All such opportunities should be considered in the long-term view, since there will *never* be an easier time to make such
a change. The cost of the change will be short-lived; the benefit will
stay on.

> 
>>
>> How about this: the language is in its early development. It is still
>> possible to make changes like this. It will be much, much harder in the
>> future. I still don't see one reason why there *should* be so many keywords (other than the fact that's already done that way)  and I've pointed out a few reasons why IMHO it's better, and cleaner, not to.
> 
> 
> Where are those reasons? I didn't see them.
> The only reasons were:
> 1- so you can use "int" as a variable name or something else.
> 2- easier to implement in a compiler.

> 
> but #1 is not really a practical reason. and I already answered #2
Regarding 2, the only reason you gave is the pre-existing code. Look
at the trouble Bill Gates got us all into with that thinking in the
early 80's. Do you really think the current D compiler will be the only one ever written?

 Also, you keep missing, or dismissing,

  3 - more consistent, useful error checking/error messages, by eliminating replication of semantic checking in the parser.

> 
>> The fact that C does the same thing does not qualify as a reason, since
>> it's a stated goal of D to eliminate the very reason C needs to do that.
>> So, having gone to the trouble to eliminate the need for keywords... why
>> are they still there???
> 
> 
> Where does the documentation state that D's goal is to eliminate C's need for keywords?
> 
Not quite that. The stated goal is to eliminate the need, which exists
in C, for the parser to know which identifiers are previously defined
as typedefs (or classes in C++), since C cannot be parsed otherwise.


This is what gives D a context-free grammar: you don't need to feed
information back to the parser from the symbol table.
C defines 'int' etc. as keywords for the same reason: they must
be distinguished (to the parser) from regular identifiers.
(Also, C has idioms like 'unsigned char' which do not apply to typedefs; these have likewise been eliminated in D.) So, making D's grammar context-free has, as a direct result, eliminated the need
for type names to be keywords.

-----------------

http://www.digitalmars.com/d/index.html

Major Goals of D
 ...
    * Make D substantially easier to implement a compiler for than C++.
 ...
    * Have a context-free grammar.
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-----------------

When I first encountered D, after reading the article on printf in the Jan/05 Dr. Dobb's, I read the 'context-free grammar' part in the goals, and my first thought was "Great!" and my second was "... and so built-in types aren't keywords any more..." but that turned out to be untrue, for reasons no-one has been able to supply.

I don't think I've encountered any other well-thought-out language (and D clearly is one) which defines a whole bunch of keywords which are not actually necessary to the parsing process.


Thank you for helping me clarify my argument.


BTW, I feel like I'm telling someone, "It's summer, you don't need to wear a snowsuit any more, you'll be more comfortable without it", and I keep getting back "you haven't really given me a strong enough reason to not wear it; the grocery store is a bit chilly, for instance; and I'm already wearing it and I know it fits..."

- greg

July 19, 2005

Re: why are types all keywords?

Posted by Greg Smith
in reply to Unknown W. Brackets

Unknown W. Brackets wrote:

> 
> Now, let's say that int isn't a keyword.  Let's even say, for the sake of argument, I can name a function int as well:
> 
> int int()
> {
>    return 42;
> }
> 
> Now, again, this should be parseable because we know some things here...
> but, most basic highlighters will not do this correctly.

Firstly, 'int' etc. can be protected from redefinition (on a scope-selective basis, if needed) without being a keyword; see my previous post for some reasons why this is an advantage over just using keywords.

Secondly, even if 'int' can be redefined at file scope,
I don't think it's a problem that 'int' would always appear colored as
a builtin-type in your editor. This would probably be preferred, since it would let you know you are doing something dubious.

If you wanted to go further, and make an editor which parses the whole file so that, e.g., local variables are displayed in a different color than globals, references to undefined variables are displayed in red, and you can hover over any variable and see its type... that's much, much easier for D than for C. The same process would allow 'int' to be highlighted properly (and/or let you know specifically that you were redefining a built-in type).

I have experience in the user side of this kind of issue, since I do a lot of Python coding. In python you can write

def add_dot(str):
	return str + '.'

... and it works, but it's poor practice, since 'str' is a predefined
(__builtin__) name which corresponds to the string type. You only get
in trouble when you modify it and fail to notice the conflict:

def add_dot_num(str,num):
	return str + '.' + str(num)   # oops!

The second 'str' refers to the local parameter rather than the builtin
'str' which converts 2 to '2'. However, this doesn't cause anywhere near
as much trouble as you might think:

   - many editors color 'str' differently, since it's in __builtin__; this makes it harder to redefine it by mistake.
   - automatic code checkers can easily determine that this code is redefining 'str' and warn you;
   - new builtins added to the language (and they are often added) do not break any code that happens to already use the same name as a variable. The resulting 'dubious usage' can be 'fixed' at your leisure.
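
To show how little such a checker needs, here is a minimal sketch using only the standard library. It assumes a current Python, where the module is called `builtins` rather than `__builtin__`; the function name `shadowed_builtins` is made up, and it only looks at plain positional parameters, not at assignments or other kinds of binding.

```python
import ast
import builtins


def shadowed_builtins(source):
    """Return (function, parameter) pairs where a parameter hides a builtin."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            for arg in node.args.args:
                # Any name defined in the builtins module counts as predefined.
                if hasattr(builtins, arg.arg):
                    hits.append((node.name, arg.arg))
    return hits


SOURCE = '''
def add_dot_num(str, num):
    return str + '.' + str(num)
'''

warnings = shadowed_builtins(SOURCE)
```

Here `warnings` contains the pair `("add_dot_num", "str")`. The checker works precisely because 'str' is an ordinary name in a table rather than a keyword; the source still parses, and the dubious usage is found afterwards.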


Copyright © 1999-2021 by the D Language Foundation