May 11, 2012
On 11/05/2012 10:01, Roman D. Boiko wrote:
> There were several discussions about the need for a D compiler library.
>
> I propose my draft implementation of a lexer for community review:
> https://github.com/roman-d-boiko/dct
>
> The lexer is based on Brian Schott's project
> https://github.com/Hackerpilot/Dscanner, but it has been refactored and
> extended (and more changes are on the way).
>
> The goal is to have source code loading, lexer, parser and semantic
> analysis available as parts of Phobos. These libraries should be
> designed to be usable in multiple scenarios (e.g., refactoring, code
> analysis, etc.).
>
> My commitment is to have at least the front end built this year (and
> conforming to the D2 specification unless explicitly stated otherwise
> for some particular aspect).
>
> Please post any feedback here. A dedicated project web-site will be created
> later.
>

I have started something similar, and it is currently more advanced than DCT. I kind of decapitated SDC to do it.

Maybe we should join efforts instead of running two separate projects?
May 11, 2012
On 11/05/2012 11:02, Jacob Carlborg wrote:
> I think that the end goal of a project like this, putting a D frontend
> in Phobos, should be that the compiler is built using this
> library. This would result in the compiler and library always being in
> sync and having the same behavior. Otherwise this could easily become
> just another tool that tries to lex and parse D code, always being out
> of sync with the compiler and not having the same behavior.
>
> For this to happen, for Walter to start using this, I think there would
> be a greater chance if the frontend was a port of the DMD frontend and
> not changed too much.
>

From the beginning, I have been thinking about AST macros using CTFE.
May 11, 2012
On 11/05/2012 12:01, dennis luehring wrote:
> On 11.05.2012 11:33, Roman D. Boiko wrote:
>>> -very fast in parsing/lexing - there needs to be a benchmark
>>> environment from the very start
>> Will add that to May roadmap.
>
> are you using slices to prevent copying everything around?
>
> the parser/lexer needs to be as fast as the original one - maybe even
> faster - else it won't replace Walter's at any time - because speed does
> matter here very much

The best optimization is the one that brings your code from a non-working state to a working state.
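
On the slicing question quoted above: in D a slice is a pointer/length pair into existing memory, so a token's text can refer to the source buffer without copying it. A minimal sketch, assuming a hypothetical helper named leadingIdentifier (not part of DCT or Dscanner) that handles ASCII only and does not reject a leading digit:

```d
import std.ascii : isAlphaNum;

/// Hypothetical helper, for illustration only.
/// Returns the run of identifier characters at the start of `source`
/// as a slice of `source`; no characters are copied.
string leadingIdentifier(string source)
{
    size_t end = 0;
    while (end < source.length
        && (isAlphaNum(source[end]) || source[end] == '_'))
    {
        ++end;
    }
    // Slicing creates a new (pointer, length) view of the same memory.
    return source[0 .. end];
}
```

For `"foo_1 = bar;"` the result compares equal to `"foo_1"` and its `.ptr` is the same as the input's `.ptr`, which is what makes slice-based tokens cheap.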
May 11, 2012
On 2012-05-11 12:35, Roman D. Boiko wrote:
> On Friday, 11 May 2012 at 09:36:28 UTC, Jacob Carlborg wrote:

>> Aha, clever. As long as I can get out the information I'm happy :) How
>> about adding properties for this in the token struct?
> There is a method for that in the Lexer interface; to me it looks like it
> belongs there and not in the token. A version accepting a token and producing a
> pair of start/end Locations will be added.

Found it now, "calculateFor". I'm not sure it's the most intuitive name though. I get the feeling: "calculate what?".

>> That might be the case. But I don't think it belongs in the parser.
> I will provide example code and a dedicated post later to illustrate my
> point.

I guess I'll have to wait for that then :)

>> Ok, fair enough. Perhaps this could be a property in the Token struct
>> as well. In that case I would suggest renaming "value" to
>> lexeme/spelling/representation, or something like that, and then name
>> the new property "value".
> I was going to rename value, but couldn't find a nice term. Thanks for
> your suggestions!
> As for the property with strongly typed literal value, currently I plan
> to put it into AST.

I stole "spelling" from Clang :)

-- 
/Jacob Carlborg
May 11, 2012
On 11/05/2012 11:31, Roman D. Boiko wrote:
> On Friday, 11 May 2012 at 09:02:12 UTC, Jacob Carlborg wrote:
>> I think that the end goal of a project like this, putting a D
>> frontend in Phobos, should be that the compiler is built using
>> this library. This would result in the compiler and library always
>> being in sync and having the same behavior. Otherwise this could easily
>> become just another tool that tries to lex and parse D code, always
>> being out of sync with the compiler and not having the same behavior.
>>
>> For this to happen, for Walter to start using this, I think there
>> would be a greater chance if the frontend was a port of the DMD
>> frontend and not changed too much.
>
> My plan is to create a frontend that is much better than the existing one,
> both in design and implementation. I decided to work on this full time
> for several months.
>

If this is your plan, I'm very happy. We really should talk, to avoid duplicating effort as we are doing right now.
May 11, 2012
On 5/11/12 4:22 PM, Roman D. Boiko wrote:
>> What about line and column information?
> Indices of the first code unit of each line are stored inside the lexer, and
> a function will compute a Location (line number, column number, file
> specification) for any index. This way the size of a Token instance is reduced
> to the minimum. It is assumed that a Location can be computed on demand,
> and is not needed frequently. So the column is calculated by a reverse walk
> to the previous end of line, etc. It will be possible to calculate Locations
> either taking into account special token sequences (e.g., #line 3
> "ab/c.d"), or discarding them.

But then how do you compute line numbers efficiently (if a reverse walk is efficient at all)?
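
One possible answer, as a sketch only: if the stored line-start indices are kept in ascending order, a binary search finds the line in O(log n), and the column follows by subtraction from that line's start instead of a reverse walk. The names below (Location, LineIndex, locate) are hypothetical and not DCT's actual API. As simplifications, only '\n' is treated as a line terminator, columns are counted in code units, and #line directives are ignored.

```d
import std.range : assumeSorted;

/// Hypothetical types, for illustration only.
struct Location
{
    size_t line;   // 1-based line number
    size_t column; // 1-based column, counted in code units
}

struct LineIndex
{
    // Index of the first code unit of each line, in ascending order.
    size_t[] lineStarts;

    this(string source)
    {
        lineStarts = [0];
        foreach (i, char c; source)
        {
            if (c == '\n')
                lineStarts ~= i + 1; // the next line starts after the newline
        }
    }

    /// Maps a code-unit index to a line/column pair by binary search.
    Location locate(size_t index)
    {
        // The number of line starts <= index is the 1-based line number.
        auto line = lineStarts.assumeSorted.lowerBound(index + 1).length;
        return Location(line, index - lineStarts[line - 1] + 1);
    }
}
```

With this layout a token only needs to store its start index; the cost of a lookup is paid only when a Location is actually requested.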

Usually tokens are used and discarded. I mean, somebody who uses the lexer asks for tokens, processes them (for example to highlight code or to build an AST) and then discards them. So you can reuse the same Token instance. If you want to peek at the next token, or keep a buffer of tokens, you can use a freelist ( http://dlang.org/memory.html#freelists , one of the many nice things I learned by looking at DMD's source code ).

So adding line and column information does not waste much memory: just 8 bytes more for each token in the freelist.
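
A minimal sketch of the freelist idea, modelled on the pattern described at the link above. The Token class and its field names are hypothetical, not DMD's or DCT's; line and column are assumed to be two 32-bit integers, which is where the 8 bytes come from.

```d
/// Hypothetical token type, for illustration only.
final class Token
{
    string spelling; // slice of the source text
    uint line;       // 4 bytes
    uint column;     // 4 bytes; with `line`, the 8 extra bytes per token

    private Token next;            // link used only while on the free list
    private static Token freelist; // head of the list (thread-local in D2)

    /// Reuses a discarded instance when one is available.
    /// Fields are not reset, so the caller must overwrite them.
    static Token allocate()
    {
        if (freelist is null)
            return new Token;
        auto t = freelist;
        freelist = t.next;
        t.next = null;
        return t;
    }

    /// Returns a token to the list; the caller must not use it afterwards.
    static void release(Token t)
    {
        t.next = freelist;
        freelist = t;
    }
}
```

A consumer that handles one token at a time calls Token.allocate(), fills the fields, and calls Token.release() when done, so the number of live instances stays bounded by the lookahead depth rather than by the size of the file.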
May 11, 2012
On Friday, 11 May 2012 at 11:41:34 UTC, alex wrote:
> Ever thought of asking the VisualD developer to integrate your library into his IDE extension? Might be cool to do so because of extended completion abilities etc. (lol I'm the Mono-D dev -- but why not? ;D)
Didn't think about that yet, because I don't use VisualD.
I actually planned to analyse whether DCT could be integrated into Mono-D, so your feedback is welcome :)
May 11, 2012
On Friday, 11 May 2012 at 11:50:14 UTC, deadalnix wrote:
> On 11/05/2012 11:31, Roman D. Boiko wrote:
>> My plan is to create a frontend that is much better than the existing one,
>> both in design and implementation. I decided to work on this full time
>> for several months.
>>
>
> If this is your plan, I'm very happy. We really should talk, to avoid duplicating effort as we are doing right now.

> I have started something similar, and it is currently more advanced than DCT. I kind of decapitated SDC to do it.
>
> Maybe we should join efforts instead of running two separate projects?

That makes sense. Is it possible to switch SDC to the Boost license? I'm trying to keep it for all DCT code.

May 11, 2012
On Friday, 11 May 2012 at 11:49:23 UTC, Jacob Carlborg wrote:
> On 2012-05-11 12:35, Roman D. Boiko wrote:
>> On Friday, 11 May 2012 at 09:36:28 UTC, Jacob Carlborg wrote:
>
>>> Aha, clever. As long as I can get out the information I'm happy :) How
>>> about adding properties for this in the token struct?
>> There is a method for that in the Lexer interface; to me it looks like it
>> belongs there and not in the token. A version accepting a token and producing a
>> pair of start/end Locations will be added.
>
> Found it now, "calculateFor". I'm not sure it's the most intuitive name though. I get the feeling: "calculate what?".
calculateLocation was the original name, but I don't like repeating the return type in method names. I changed it to make it clear that another renaming is needed ;) Any suggestions?

>>> That might be the case. But I don't think it belongs in the parser.
>> I will provide example code and a dedicated post later to illustrate my
>> point.
>
> I guess I'll have to wait for that then :)
I'll try to do that ahead of the roadmap; it is important.
May 11, 2012
On 11/05/2012 14:04, Roman D. Boiko wrote:
> That makes sense. Is it possible to switch SDC to the Boost license? I'm
> trying to keep it for all DCT code.
>

Let me put together a clean package of my code this weekend. For now it is mixed with the SDC source code, which was fine while I was working alone, but isn't quite right for other people to join the effort.