February 05, 2013
On 2/4/13 11:05 PM, Andrej Mitrovic wrote:
> On 2/5/13, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>> Suggestion: take lexer.c and convert it to D. Should take one day, and
>> you'll have performance on par.
>
> This was already done for DDMD, and the more recent minimal version of it:
>
> https://github.com/zachthemystic/ddmd-clean/blob/master/dmd/lexer.d

Awesome! Has anyone measured the speed?

Andrei
February 05, 2013
On Tuesday, 5 February 2013 at 04:19:54 UTC, Andrei Alexandrescu wrote:
> Awesome! Has anyone measured the speed?
>
> Andrei

I gave up on getting it to compile.

ddmd (the project this one is based on) was last updated to 2.040 according to its page on dsource, and I also haven't gotten it to compile.
February 05, 2013
On Tuesday, 5 February 2013 at 03:22:52 UTC, Andrei Alexandrescu wrote:
> On 2/4/13 10:19 PM, Brian Schott wrote:
>> More optimizing:
>> http://hackerpilot.github.com/experimental/std_lexer/images/times2.png
>>
>> Still only half speed. I'm becoming more and more convinced that Walter
>> is actually a wizard.
>
> Suggestion: take lexer.c and convert it to D. Should take one day, and you'll have performance on par.
>

DMD's lexer is not suitable for Phobos IMO. It doesn't take a range as input and doesn't produce a range. It also lacks the features you may want from a multi-purpose D lexer.
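
To make "range in, range out" concrete, here is a minimal sketch of a lexer that exposes its tokens through the input-range primitives (`empty`, `front`, `popFront`). It is only an illustration under stated assumptions: `SimpleLexer`, `Token`, and `TokenKind` are hypothetical names, it consumes a plain `string` rather than an arbitrary range, and it handles only ASCII identifiers, decimal integers, and single-character symbols. It is not the DMD lexer and not any proposed std.lexer API.

```d
import std.ascii : isAlpha, isAlphaNum, isDigit, isWhite;
import std.stdio : writeln;

// Hypothetical token kinds for this sketch only.
enum TokenKind { identifier, number, symbol }

struct Token
{
    TokenKind kind;
    string text; // slice of the source, no copying
}

// Hypothetical lexer that is itself an input range of Token.
struct SimpleLexer
{
    private string src;    // remaining, not yet lexed input
    private Token current; // token returned by front
    private bool done;     // true once the input is exhausted

    this(string source)
    {
        src = source;
        popFront(); // prime the first token
    }

    @property bool empty() { return done; }
    @property Token front() { return current; }

    void popFront()
    {
        // Skip leading whitespace.
        size_t i = 0;
        while (i < src.length && isWhite(src[i])) ++i;
        src = src[i .. $];
        if (src.length == 0) { done = true; return; }

        // Classify by the first character and find the token's end.
        i = 1;
        TokenKind kind = TokenKind.symbol;
        if (isAlpha(src[0]) || src[0] == '_')
        {
            kind = TokenKind.identifier;
            while (i < src.length && (isAlphaNum(src[i]) || src[i] == '_')) ++i;
        }
        else if (isDigit(src[0]))
        {
            kind = TokenKind.number;
            while (i < src.length && isDigit(src[i])) ++i;
        }
        current = Token(kind, src[0 .. i]);
        src = src[i .. $];
    }
}

void main()
{
    // Because SimpleLexer is a range, foreach works on it directly.
    foreach (tok; SimpleLexer("foo = 42;"))
        writeln(tok.kind, " ", tok.text);
}
```

Since the lexer is a range, it can be fed to foreach or to the std.algorithm functions without any adapter, which is the property the DMD lexer lacks.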
February 05, 2013
On 2013-02-05 04:22, Andrei Alexandrescu wrote:

> Suggestion: take lexer.c and convert it to D. Should take one day, and
> you'll have performance on par.

There's a reason why nobody has just extracted the lexer from DMD. It would probably take more than a day just to extract the lexer so that it can be used without the rest of DMD.

It's probably easier to do the actual porting than to extract the lexer.

Actually, when I think about it, Jonathan is working on porting the DMD lexer to D.

-- 
/Jacob Carlborg
February 05, 2013
On 02/05/2013 07:19 AM, Brian Schott wrote:
> More optimizing:
> http://hackerpilot.github.com/experimental/std_lexer/images/times2.png
>
> Still only half speed. I'm becoming more and more convinced that Walter
> is actually a wizard.

Time to do some hacking on your lexer, I guess. I'll try adding a couple of tricks and see if it helps.

What command do you use for benchmarking?
February 05, 2013
On Tuesday, February 05, 2013 09:14:35 Jacob Carlborg wrote:
> Actually, when I think about it, Jonathan is working on porting the DMD lexer to D.

Not exactly. I decided that it would be of greater benefit to just write the thing according to the grammar and make sure that the compiler matched the spec. I've already found and fixed several bugs in the spec because of that. Also, given how I'm writing it, I expect that its speed will be similar to that of dmd's lexer, because I'm writing it such that it will generally do the minimum number of operations required. But it wouldn't surprise me at all if no lexer written in D can possibly match dmd's at the moment as long as it's compiled with dmd (at least on Linux). gcc's optimizer is so much better than dmd's, and dmd itself is compiled with gcc, whereas the competing lexer in D is compiled with dmd.

I don't have a lot of time to work on my lexer at the moment, but I'd really like to get it done soon, and I have most of the features in place. Unfortunately, when I went to try and work on it again the other day, the code wasn't compiling anymore, and I need to figure out why. I suspect that it's a regression related to string mixins, but I have to investigate further to sort it out.

- Jonathan M Davis
February 05, 2013
On 2013-02-05 10:07, Jonathan M Davis wrote:

> Not exactly. I decided that it would be of greater benefit to just write the
> thing according to the grammar and make sure that the compiler matched the
> spec.

Aha, I see.

-- 
/Jacob Carlborg
February 05, 2013
On Tuesday, February 05, 2013 09:14:35 Jacob Carlborg wrote:
> On 2013-02-05 04:22, Andrei Alexandrescu wrote:
> > Suggestion: take lexer.c and convert it to D. Should take one day, and you'll have performance on par.
> 
> There's a reason why nobody has just extracted the lexer from DMD. It would probably take more than a day just to extract the lexer so that it can be used without the rest of DMD.

There are basic ideas in how dmd's lexer works which are obviously good and should be in the finished product in D, but it's not range-based, which forces you to do things differently. It's also not configurable, which again forces you to do things differently.

If it could be ported as-is and then compared for speed, then that would be a great test, since it would be able to show how much of the speed problem is purely a compiler issue as opposed to a design issue, but you wouldn't be able to actually use it for anything more than what Brian is doing with his performance testing, because as you point out, it's too integrated into dmd. It _would_ be valuable though as a performance test of the compiler.

- Jonathan M Davis
February 05, 2013
On Tuesday, February 05, 2013 01:07:48 Jonathan M Davis wrote:
> I don't have a lot of time to work on my lexer at the moment, but I'd really like to get it done soon, and I have most of the features in place. Unfortunately, when I went to try and work on it again the other day, the code wasn't compiling anymore, and I need to figure out why. I suspect that it's a regression related to string mixins, but I have to investigate further to sort it out.

It turns out that it has nothing to do with string mixins (though it does have to do with CTFE):

http://d.puremagic.com/issues/show_bug.cgi?id=9452

Fortunately, there's a simple workaround that'll let me continue until the bug is fixed.

- Jonathan M Davis
February 05, 2013
On 2013-02-05 11:44, Jonathan M Davis wrote:

> If it could be ported as-is and then compared for speed, then that would be a
> great test, since it would be able to show how much of the speed problem is
> purely a compiler issue as opposed to a design issue, but you wouldn't be able
> to actually use it for anything more than what Brian is doing with his
> performance testing, because as you point out, it's too integrated into dmd.
> It _would_ be valuable though as a performance test of the compiler.

Yeah, that could be useful.

-- 
/Jacob Carlborg