[SAOC] "D backend for Bison" thread
September 09
Hello all!

My name is Adela Vais, and I am a 4th-year student at the "Politehnica" University of Bucharest, where I study Computer Engineering.
I love learning new programming languages, and during the Formal Languages and Automata course I took at university, I became interested in LR theory.
Eduard Stăniloiu and Răzvan Nițu introduced me to D at a workshop that took place during the “Ideas and Projects Workshop” Summer School 2019, and I found the language interesting because of its high expressiveness, its speed, and its memory safety features.
This is why I decided to do an internship at my university this summer, learning and contributing to Dlang and GNU Bison. I consider SAoC a great way to get involved in the D community.

The project I will pursue during SAoC aims to complete the D backend for Bison, by creating the GLR Parser for the D language.
Currently, Bison supports only the LALR1 Parser for D. While simple and fast, this parser has its limitations: it cannot handle non-deterministic or ambiguous grammars. A GLR Parser can handle such grammars and is not constrained to a single token of lookahead.
Combining my love for D, for learning programming languages, and for LR theory, this project is a tremendous learning opportunity for me.
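
To illustrate what "ambiguous" means here, the following is a minimal Python sketch (not a GLR parser and not Bison code) that enumerates by brute force every parse tree of the toy grammar E -> E '+' E | E '*' E | NUM. The grammar and the name `parse_trees` are assumptions made up for this example. A deterministic LALR1 parser must commit to one tree, while a GLR parser can follow the alternatives in parallel; in Bison the grammar author then typically resolves them with `%dprec` or `%merge`.

```python
from functools import lru_cache


def parse_trees(tokens):
    """Return every parse tree for E -> E '+' E | E '*' E | NUM (toy grammar)."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def trees(lo, hi):
        # All ways to derive tokens[lo:hi] from the nonterminal E.
        if hi - lo == 1:
            return (tokens[lo],) if tokens[lo].isdigit() else ()
        found = []
        # Try every operator position as the root of the tree.
        for mid in range(lo + 1, hi - 1):
            if tokens[mid] in ("+", "*"):
                for left in trees(lo, mid):
                    for right in trees(mid + 1, hi):
                        found.append((tokens[mid], left, right))
        return tuple(found)

    return list(trees(0, len(tokens)))


if __name__ == "__main__":
    for tree in parse_trees("1 + 2 * 3".split()):
        print(tree)
```

For the input `1 + 2 * 3` the sketch finds two trees, one grouping the addition first and one grouping the multiplication first, which is exactly the kind of conflict a single-lookahead deterministic parser cannot leave open.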

What I intend to do during SAoC, for each milestone, is:

Milestone 1 - Understand the already existing code

- Analyze the LALR1 D language parser and the C and C++ GLR parsers.
- Understand M4 (the language I will partly write the parser in) by working in the GNU Bison repository on smaller tasks (adding functionality such as lookahead correction, and starting work on a push parser, which will further my understanding of the LALR1 D parser).
- Create (at least) 4 small programs that will help me understand the differences between the C and C++ LALR1 and GLR parsers, using ambiguous and non-ambiguous grammars.

Milestone 2 - Write the GLR support for the D language

- Create the glr.d file. Similar to the C++ code, this parser will be a class that wraps around the C GLR parser (for easier maintenance).
- Provide the same interface as the LALR1 parser. The user should not notice any difference on their end. Add support so that the user can provide input from both stdin and files, and create the Lexer interface that allows the user to create a class implementing a lexer method, an error reporting method, and location tracking.
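
The shape of such a Lexer interface can be sketched as follows. This is a language-neutral illustration in Python, not the actual D skeleton: the names `Lexer`, `Location`, `CharLexer`, `next_token`, and `report_error` are all hypothetical, and the toy lexer only recognizes single-digit numbers, `+`, and `*`. It reads from any text stream, so the same class works with stdin or an opened file.

```python
import io
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Location:
    """1-based position of a token in the input (hypothetical type)."""
    line: int = 1
    column: int = 1


class Lexer(ABC):
    """What a parser would need from a user-supplied lexer (assumed shape)."""

    @abstractmethod
    def next_token(self):
        """Return a (kind, value, location) tuple for the next token."""

    @abstractmethod
    def report_error(self, location, message):
        """Record or print an error found at the given location."""


class CharLexer(Lexer):
    def __init__(self, stream):
        self.stream = stream  # any text stream: sys.stdin, an open file, ...
        self.location = Location()
        self.errors = []

    def next_token(self):
        while True:
            ch = self.stream.read(1)
            start = Location(self.location.line, self.location.column)
            if ch == "":
                return ("EOF", None, start)
            if ch == "\n":
                # Newline: move to the first column of the next line.
                self.location.line += 1
                self.location.column = 1
                continue
            self.location.column += 1
            if ch.isspace():
                continue
            if ch.isdigit():
                return ("NUM", int(ch), start)
            if ch in ("+", "*"):
                return (ch, None, start)
            # Unknown character: report it and keep scanning.
            self.report_error(start, f"unexpected character {ch!r}")

    def report_error(self, location, message):
        self.errors.append(f"{location.line}:{location.column}: {message}")


if __name__ == "__main__":
    lexer = CharLexer(io.StringIO("1 +\n2 ?"))
    token = lexer.next_token()
    while token[0] != "EOF":
        print(token)
        token = lexer.next_token()
    print(lexer.errors)
```

Keeping the input source behind a stream object is one way to meet the "both stdin and files" requirement without the parser knowing where the characters come from.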

Milestone 3 - Write the GLR support for the D language

- Continue working on the interface. Add the declarations currently supported by the LALR1 parser: %error-verbose, %parse-param, %union, %code, %locations, %initial-action.
- Merge the glr.d and lalr1.d files. The two files will likely end up duplicating a lot of code, so a merge will be needed.

Milestone 4 - Test and write documentation

- Pass all unit tests and fix any bugs that appear.
- If time allows, I plan to integrate this parser into a third-party project, to further test its correctness.
- Write the documentation.

I will post weekly (or biweekly) updates with my progress.

I already made some changes to the existing D language support and I will continue to do so beyond SAoC, too. I intend to pursue this project for my undergraduate thesis topic and continue as an ambassador on behalf of the D community after that.


Thanks!
Adela
September 10
On 2020-09-09 21:59, Adela Vais wrote:

> The project I will pursue during SAoC aims to complete the D backend for Bison, by creating the GLR Parser for the D language.

I don't know much about Bison. But isn't Bison a parser generator that takes some language grammar as input and outputs a parser implemented in some language? Bison seems to currently support C, C++ and Java for the parser implementation. Is the goal to add D to that list?

> - Pass all unit tests, fix any bug that appears.

Why is this a separate task? In my opinion, all unit tests should be written at the same time as the implementation, or before, if you're doing test-driven development. And the implementation is not done until the tests are done and all pass. The tests are part of the implementation.

-- 
/Jacob Carlborg
September 10
On Thu, Sep 10, 2020 at 05:04:42PM +0200, Jacob Carlborg via Digitalmars-d wrote: [...]
> Bison seems to currently support C, C++ and Java for the parser implementation. Is the goal to add D to that list?

Yes.


T

-- 
The trouble with TCP jokes is that it's like hearing the same joke over and over.
September 11
On Thursday, 10 September 2020 at 15:04:42 UTC, Jacob Carlborg wrote:
>> - Pass all unit tests, fix any bug that appears.
>
> Why is this a separate task? In my opinion, all unit tests should be written at the same time as the implementation, or before, if you're doing test-driven development. And the implementation is not done until the tests are done and all pass. The tests are part of the implementation.

I had in mind extensive, combine-all-the-features-you-possibly-can tests when I wrote that. Of course, I will also test along the way.