January 22, 2014 Re: So, You Want To Write Your Own Programming Language? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On 2014-01-22 05:29, Walter Bright wrote: > http://www.reddit.com/r/programming/comments/1vtm2l/so_you_want_to_write_your_own_language_dr_dobbs/ From the article: "Regex is just the wrong tool for lexing and parsing." I wonder why there are so many books about implementing compilers that spend, usually, quite a large chapter on regular expressions? -- /Jacob Carlborg |
January 22, 2014 Re: So, You Want To Write Your Own Programming Language? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | Walter Bright: > http://www.reddit.com/r/programming/comments/1vtm2l/so_you_want_to_write_your_own_language_dr_dobbs/ Thank you for the nice, simple article. >The poisoning approach. [...] This is the approach we've been using in the D compiler, and are very pleased with the results.< Yet, even in D, most of the error messages after the first few are often not very useful to me. So perhaps I'd like a compiler switch that shows only the first few error messages and then stops the compiler. >Automated documentation generator. [...] Before Ddoc, the documentation had only a random correlation with the code, and too often, they had nothing to do with each other. After Ddoc, the two were brought in sync.< And now the situation is even better: we have documentation unittests, and the function arguments are verified to be in sync with their ddoc comment. There's probably some room for further improvement. >One semantic technique that is obvious in hindsight (but took Andrei Alexandrescu to point out to me) is called "lowering."< In Haskell, the GHC compiler goes one step further: it translates all the Haskell code into an intermediate language named Core, which is not the language of a virtual machine. It's still a functional language, but a simpler one, in which many of the syntactic differences between language constructs are reduced to a much smaller number of mostly functional constructs. >My general rule is if the explanation for what the function does is more lines than the implementation code, then the function is likely trivia and should be booted out.< In Haskell there's a standard module named Prelude; it's imported by default and defines a lot of general-purpose functions, etc. Most functions in it are only a few lines long (often 2-3 lines, with some functions up to 10-13 lines long). 
Bonus: the cute idea of a language for students: http://www.iro.umontreal.ca/~felipe/IFT2030-Automne2002/Complements/tinyc.c (On Reddit I seem to see some comments, like structs not allowing constructors?) Bye, bearophile |
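For readers unfamiliar with the poisoning approach quoted above, here is a minimal sketch in Python of the general idea, not of how the D compiler implements it. The `ERROR` marker, the `Var` and `Add` node types, and the `check` function are all hypothetical names invented for this illustration. Once a sub-expression fails to type-check, it gets a poison type, and anything that depends on it is poisoned silently instead of producing a follow-on message.

```python
from dataclasses import dataclass

ERROR = "<error>"  # hypothetical poison type, not a real compiler's name


@dataclass
class Var:
    name: str


@dataclass
class Add:
    left: object
    right: object


def check(node, env, errors):
    """Return the type of node, reporting each root cause only once."""
    if isinstance(node, Var):
        if node.name not in env:
            errors.append(f"undefined identifier '{node.name}'")
            return ERROR  # poison the result
        return env[node.name]
    if isinstance(node, Add):
        lt = check(node.left, env, errors)
        rt = check(node.right, env, errors)
        if ERROR in (lt, rt):
            return ERROR  # operand already poisoned: stay silent
        if lt != rt:
            errors.append(f"cannot add {lt} and {rt}")
            return ERROR
        return lt
    raise TypeError(f"unknown node: {node!r}")


env = {"x": "int", "s": "string"}
errors = []
# (x + missing) + s: only the undefined identifier is reported.
result = check(Add(Add(Var("x"), Var("missing")), Var("s")), env, errors)
```

Without the poison check, the outer addition would also complain about mixing an unknown type with `string`, which is exactly the kind of cascading message the approach is meant to suppress.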
January 22, 2014 Re: So, You Want To Write Your Own Programming Language? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jacob Carlborg | > On Wednesday, 22 January 2014 at 10:36:31 UTC, Jacob Carlborg wrote: > I wonder why there are so many books about implementing compilers that spend, usually, quite a large chapter on regular expressions? I wonder about that too. For anything halfway useful, regex has too many limitations, which you only find out in a later chapter, or pretty soon in your parser :D |
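One concrete limitation of the kind mentioned above: a classical regular expression cannot match arbitrarily nested delimiters, such as D's nesting comments (`/+ ... +/`), because that requires counting. A hand-written scanner handles it with a depth counter. The following Python sketch is an illustration only; the function name `skip_nested_comment` is made up here and does not come from any real lexer.

```python
def skip_nested_comment(src, pos):
    """Return the index just past the nesting comment that starts at pos."""
    if not src.startswith("/+", pos):
        raise ValueError("no comment opener at pos")
    depth = 0
    i = pos
    while i < len(src):
        if src.startswith("/+", i):
            depth += 1          # entering one more level
            i += 2
        elif src.startswith("+/", i):
            depth -= 1          # leaving one level
            i += 2
            if depth == 0:
                return i        # outermost comment is closed
        else:
            i += 1
    raise SyntaxError("unterminated comment")


src = "/+ a /+ b +/ c +/ x"
rest = src[skip_nested_comment(src, 0):]
```

Regular expressions can still describe individual tokens such as identifiers or numbers; the sketch only shows a case where a few lines of explicit scanning do something a pure regex cannot.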
January 22, 2014 Re: So, You Want To Write Your Own Programming Language? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Wednesday, 22 January 2014 at 04:29:05 UTC, Walter Bright wrote:
> http://www.reddit.com/r/programming/comments/1vtm2l/so_you_want_to_write_your_own_language_dr_dobbs/
"A good syntax needs redundancy in order to diagnose errors and give good error messages."
This is also true of natural languages. The higher the redundancy, the easier it is to guess or reconstruct what a person tried to say (in a noisy environment) or write (if the message gets messed up somehow). Texts in highly inflectional languages (like German) can be "recovered" with higher accuracy than texts in English.
If grammatical relations are no longer expressed by inflectional endings (as is often the case in English), the word order is crucial.
"The dog bit the man."
In Latin and German you can turn the statement around and still know who bit who(m).
Over the centuries, natural languages have reduced redundancy, but there are still loads of redundancies, e.g. "two cats" (it would be enough to say "two cat", which some languages actually do; see also "a 15 _year_ old girl").
Syntax is getting simplified due to the fact that the listener "knows what we mean", e.g. "buy one get one free". I wonder to what extent languages will be simplified one day. But this is a topic for a whole book ...
|
January 22, 2014 Re: So, You Want To Write Your Own Programming Language? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Wednesday, 22 January 2014 at 04:29:05 UTC, Walter Bright wrote:
> http://www.reddit.com/r/programming/comments/1vtm2l/so_you_want_to_write_your_own_language_dr_dobbs/
Great article!
|
January 22, 2014 Re: So, You Want To Write Your Own Programming Language? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Chris | Chris:
> "A good syntax needs redundancy in order to diagnose errors and give good error messages."
I'd like to measure this statement experimentally: are error messages in Go and Scala any worse because of the optional use of semicolons? My initial supposition is that the answer is negative.
Bye,
bearophile
|
January 22, 2014 Re: So, You Want To Write Your Own Programming Language? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Wednesday, 22 January 2014 at 04:29:05 UTC, Walter Bright wrote:
> http://www.reddit.com/r/programming/comments/1vtm2l/so_you_want_to_write_your_own_language_dr_dobbs/
Great article. I was surprised that you mentioned lowering positively, though.
I think from DMD we have enough experience to say that although lowering sounds good, it's generally a bad idea. It gives you a mostly-working prototype very quickly, but you pay a heavy price for it. It destroys valuable semantic information. You end up with poor quality error messages, and counter-intuitively, you can end up with _more_ special cases (e.g., lowering ref-foreach in DMD means ref local variables can spread everywhere). And it reduces possibilities for the optimizer.
In DMD, lowering has caused *major* problems with AAs, foreach, and builtin functions, and with some of the transformations that the inliner makes. It's also caused problems with postincrement and exponentiation. Probably there are other examples.
It seems to me that what does make sense is to perform lowering as the final step before passing the code to the backend. If you do it too early, you're shooting yourself in the foot.
|
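To make the trade-off discussed in the post above concrete, here is a small Python sketch of lowering on a toy AST. It is not DMD's representation; the tuple-based node shapes (`"for_range"`, `"while"`, `"assign"`, `"print"`) and the functions `lower`, `evaluate`, and `run` are assumptions made up for this example. The surface-level `for_range` loop is rewritten into core statements, so the interpreter only has to understand the core.

```python
# Surface-only statement: ("for_range", var, lo, hi, body_list)
# Core statements: ("block", [stmts]), ("assign", var, expr),
#                  ("while", cond, body), ("print", expr)
# Expressions: int literals, variable names (str), ("lt", a, b), ("add", a, b)

def lower(stmt):
    """Rewrite surface constructs into core statements."""
    kind = stmt[0]
    if kind == "for_range":
        _, var, lo, hi, body = stmt
        step = ("assign", var, ("add", var, 1))
        loop_body = ("block", [lower(s) for s in body] + [step])
        return ("block", [("assign", var, lo),
                          ("while", ("lt", var, hi), loop_body)])
    if kind == "block":
        return ("block", [lower(s) for s in stmt[1]])
    if kind == "while":
        return ("while", stmt[1], lower(stmt[2]))
    return stmt  # already a core statement


def evaluate(expr, env):
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return env[expr]
    op, a, b = expr
    x, y = evaluate(a, env), evaluate(b, env)
    return x < y if op == "lt" else x + y


def run(stmt, env, out):
    """Interpret core statements only; for_range is rejected here."""
    kind = stmt[0]
    if kind == "block":
        for s in stmt[1]:
            run(s, env, out)
    elif kind == "assign":
        env[stmt[1]] = evaluate(stmt[2], env)
    elif kind == "while":
        while evaluate(stmt[1], env):
            run(stmt[2], env, out)
    elif kind == "print":
        out.append(evaluate(stmt[1], env))
    else:
        raise ValueError(f"not a core statement: {kind}")


program = ("for_range", "i", 0, 3, [("print", "i")])
out = []
run(lower(program), {}, out)
```

Note what the lowered tree no longer says: nothing in it records that the `while` came from a `for_range`, so any diagnostic or optimization that runs after `lower` sees only the generic loop and the synthetic increment. That loss of information is the cost the post describes, and it is why running such a pass late rather than early can matter.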
January 22, 2014 Re: So, You Want To Write Your Own Programming Language? | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | On Wednesday, 22 January 2014 at 11:59:30 UTC, bearophile wrote:
> I'd like to measure this statement experimentally: are error messages in Go and Scala any worse because of the optional use of semicolons? My initial supposition is that the answer is negative.
Error messages in SML are either really neat or catastrophic.
|
January 22, 2014 Re: So, You Want To Write Your Own Programming Language? | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | On Wednesday, 22 January 2014 at 10:38:40 UTC, bearophile wrote:
>
> In Haskell, the GHC compiler goes one step further: it translates all the Haskell code into an intermediate language named Core, which is not the language of a virtual machine. It's still a functional language, but a simpler one, in which many of the syntactic differences between language constructs are reduced to a much smaller number of mostly functional constructs.
>
It's the same story with Erlang, as far as I know.
|
Copyright © 1999-2021 by the D Language Foundation