Want to help DMD bugfixing? Write a simple utility.

> On 3/23/11, Jonathan M Davis <jmdavisProg@gmx.com> wrote:
> > That would require a full-blown D lexer and parser.
> >
> > - Jonathan M Davis
>
> Isn't DDMD written in D? I'm not sure about how finished it is though.

Yes, but the lexer and parser in ddmd are not only GPL (which would be a problem for some stuff but not others - for something like Don's utility, it wouldn't be a problem), but more importantly, they are tied to the compiler code. They're not designed to be used by an arbitrary program. For that, you would need a lexer and parser which were designed with an API such that an arbitrary D program could use them. For instance, the lexer could produce a range of tokens to be processed, and a program which wants to use the lexer can then process that range.

- Jonathan M Davis
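To make the "range of tokens" idea concrete, here is a rough sketch of what such an API could look like to a client program. It is written in Python purely for illustration, with a generator standing in for a D range. The names `Token` and `tokens` are hypothetical, and the toy pattern only knows identifiers, integers and single-character punctuation; it is nowhere near a real D lexer (no comments, strings or multi-character operators).

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str   # "ident", "number" or "punct" (hypothetical categories)
    text: str   # the exact source text of the token


# Toy pattern: whitespace, identifiers, integers, anything else as punctuation.
_TOKEN_RE = re.compile(
    r"(?P<space>\s+)|(?P<ident>[A-Za-z_]\w*)|(?P<number>\d+)|(?P<punct>.)",
    re.DOTALL,
)


def tokens(src: str) -> Iterator[Token]:
    """Lazily yield the tokens of src, skipping whitespace."""
    for m in _TOKEN_RE.finditer(src):
        if m.lastgroup != "space":
            yield Token(m.lastgroup, m.group())


if __name__ == "__main__":
    # A client consumes the stream without knowing how the lexer works.
    for tok in tokens("int x = 42;"):
        print(tok.kind, tok.text)
```

The point of the design is that the lexer is decoupled from any compiler: a client just iterates over the stream and keeps whatever it needs.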

On 3/23/11, Jonathan M Davis <jmdavisProg@gmx.com> wrote:
>> On 3/23/11, Jonathan M Davis <jmdavisProg@gmx.com> wrote:
>> > That would require a full-blown D lexer and parser.
>> >
>> > - Jonathan M Davis
>>
>> Isn't DDMD written in D? I'm not sure about how finished it is though.
>
> Yes, but the lexer and parser in ddmd are not only GPL (which would be a
> problem for some stuff but not others - for something like Don's utility, it
> wouldn't be a problem), and more importantly, it is tied to the compiler
> code. It's not designed to be used by an arbitrary program. For that, you
> would need a lexer and parser which were designed with an API such that an
> arbitrary D program could use them. For instance, the lexer could produce a
> range of tokens to be processed, and a program which wants to use the lexer
> can then process that range.
>
> - Jonathan M Davis

I didn't even know it was GPL. It doesn't come with a license file.

> What about the artistic license, the front-end can be used with that license. Is that less restrictive than GPL?

I don't know what the exact licensing situation is. However, as I understand it, the C++ front-end is under the GPL, and therefore because ddmd is based on the C++ front-end, it is also under the GPL. If that's not the case, I don't know what the licensing situation really is. And I don't know what the artistic license says exactly, so I don't know what its restrictions are.

- Jonathan M Davis

> Currently, as far as I know, there are only two lexers and two parsers for D: the C++ front end which dmd, gdc, and ldc use and the D front end which ddmd uses and which is based on the C++ front end. Both of those are under the GPL (which makes them useless for a lot of stuff) and both of them are tied to compilers. Being able to lex D code and get the list of tokens in a D program and being able to parse D code and get the resultant abstract syntax tree would be very useful for a number of programs.

There is a third one: http://code.google.com/p/dil/. The main page says that the lexer and the parser are fully implemented for both D1 and D2. But the license is also the GPL.

On 03/24/2011 08:53 AM, Alexey Prokhin wrote:
> Currently, as far as I know, there are only two lexers and two parsers for
> D: the C++ front end which dmd, gdc, and ldc use and the D front end which
> ddmd uses and which is based on the C++ front end. Both of those are under
> the GPL (which makes them useless for a lot of stuff) and both of them are
> tied to compilers. Being able to lex D code and get the list of tokens in
> a D program and being able to parse D code and get the resultant abstract
> syntax tree would be very useful for a number of programs.

I fully support this. We desperately need it, I guess, working and maintained as the language evolves. This is the whole purpose of the GSOC proposal "D tools in D":
http://prowiki.org/wiki4d/wiki.cgi?GSOC_2011_Ideas#DtoolsinD

Semantic analysis, introduced step by step, would be a huge plus.

Denis
--
_________________
vita es estrany
spir.wikidot.com

> Is there a copy of the official D grammar somewhere online? I wrote a lexer for my Compiler class and would love to try and apply it to another grammar.

The official D grammar is spread throughout the specification. But I recall that someone compiled a complete grammar for D1 some time ago.

On Wed, 23 Mar 2011 21:16:02 -0000, Jonathan M Davis <jmdavisProg@gmx.com> wrote:
> There are tasks for which you need to be able to lex and parse D code. To 100% correctly remove unit tests would be one such task.

Is that last bit true? You definitely need to be able to lex it, but instead of actually parsing it you just count { and } and remove 'unittest' plus { plus } plus everything in between, right?

--
Using Opera's revolutionary email client: http://www.opera.com/mail/

On 03/25/2011 12:08 PM, Regan Heath wrote:
> On Wed, 23 Mar 2011 21:16:02 -0000, Jonathan M Davis <jmdavisProg@gmx.com> wrote:
>> There are tasks for which you need to be able to lex and parse D code. To
>> 100% correctly remove unit tests would be one such task.
>
> Is that last bit true? You definitely need to be able to lex it, but instead of
> actually parsing it you just count { and } and remove 'unittest' plus { plus }
> plus everything in between right?

At first sight, you're both wrong: you'd need to count { } levels. Also, I think true lexing is not really needed: you'd only need to put apart strings and comments that could hold non-code { & }.
(But these are only very superficial notes.)

Denis
--
_________________
vita es estrany
spir.wikidot.com

March 25, 2011

Re: Want to help DMD bugfixing? Write a simple utility.

Posted by Don
in reply to spir

spir wrote:
> On 03/25/2011 12:08 PM, Regan Heath wrote:
>> On Wed, 23 Mar 2011 21:16:02 -0000, Jonathan M Davis <jmdavisProg@gmx.com> wrote:
>>> There are tasks for which you need to be able to lex and parse D code. To
>>> 100% correctly remove unit tests would be one such task.
>>
>> Is that last bit true? You definitely need to be able to lex it, but instead of
>> actually parsing it you just count { and } and remove 'unittest' plus { plus }
>> plus everything in between right?
> 
> At first sight, you're both wrong: you'd need to count { } levels. Also, I think true lexing is not really needed: you'd only need to put apart strings and comments that could hold non-code { & }.
> (But these are only very superficial notes.)
> 
> Denis

Yes, exactly: you just need to lex strings (including q{}), comments (which you remove),
unittest, and count levels of {.
You need to worry about backslashes in comments, but that's about it.

I even did this in a CTFE function once, so I know it isn't complicated.
Should be possible in < 50 lines of code.
I just didn't want to have to do it myself.

In fact, it would be adequate to replace:
unittest
{
   blah...
}
with:
unittest{}

Then you don't need to worry about special cases like:

version(XXX)
unittest
{
...
}
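
A minimal sketch of the approach described in this post, written in Python for illustration only (it is not the CTFE version mentioned above, and the thread does not prescribe a language). The helper names `token_end`, `match_brace` and `strip_unittests` are hypothetical. The idea is to step over comments, strings (including q{}) and character literals as whole units so that braces inside them are never counted, and to replace each unittest body with {}. Assumptions: comments outside unittest blocks are passed through rather than removed, and delimited strings (q"...") are not handled.

```python
def token_end(src, i):
    """Return the index just past the lexical unit that starts at src[i]."""
    n = len(src)
    two = src[i:i + 2]
    if two == "//":                        # line comment (newline excluded)
        j = src.find("\n", i)
        return n if j < 0 else j
    if two == "/*":                        # block comment, does not nest
        j = src.find("*/", i + 2)
        return n if j < 0 else j + 2
    if two == "/+":                        # nesting comment
        depth, j = 1, i + 2
        while j < n and depth:
            if src.startswith("/+", j):
                depth, j = depth + 1, j + 2
            elif src.startswith("+/", j):
                depth, j = depth - 1, j + 2
            else:
                j += 1
        return j
    if two == "q{":                        # token string: braces nest
        return match_brace(src, i + 1)
    if two == 'r"':                        # wysiwyg string, no escapes
        j = src.find('"', i + 2)
        return n if j < 0 else j + 1
    c = src[i]
    if c == "`":                           # wysiwyg string, no escapes
        j = src.find("`", i + 1)
        return n if j < 0 else j + 1
    if c in "\"'":                         # string or char literal
        j = i + 1
        while j < n:
            if src[j] == "\\":             # skip the escaped character
                j += 2
            elif src[j] == c:
                return j + 1
            else:
                j += 1
        return n
    if c.isalpha() or c == "_":            # identifier or keyword
        j = i + 1
        while j < n and (src[j].isalnum() or src[j] == "_"):
            j += 1
        return j
    return i + 1                           # any other single character


def match_brace(src, i):
    """Given src[i] == '{', return the index just past the matching '}'."""
    depth, n = 0, len(src)
    while i < n:
        if src[i] == "{":
            depth, i = depth + 1, i + 1
        elif src[i] == "}":
            depth, i = depth - 1, i + 1
            if depth == 0:
                return i
        else:
            i = token_end(src, i)          # strings and comments count as one unit
    return n                               # unterminated block: consume the rest


def strip_unittests(src):
    """Replace every 'unittest { ... }' in D source text with 'unittest{}'."""
    out, i, n = [], 0, len(src)
    while i < n:
        j = token_end(src, i)
        tok = src[i:j]
        out.append(tok)
        i = j
        if tok != "unittest":
            continue
        k = i                              # skip blanks and comments before '{'
        while k < n and (src[k].isspace() or src[k:k + 2] in ("//", "/*", "/+")):
            k = k + 1 if src[k].isspace() else token_end(src, k)
        if k < n and src[k] == "{":
            out.append("{}")
            i = match_brace(src, k)
    return "".join(out)


if __name__ == "__main__":
    sample = 'version(XXX)\nunittest\n{\n    assert(s == "}");\n}\nint x;\n'
    print(strip_unittests(sample))
```

Because only the body is replaced and the unittest keyword stays in place, a preceding version(XXX) still has a declaration to attach to, which is the special case the post is concerned with.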
