March 02, 2013
I noticed it too late: decode comes from readText, since readText validates your text. I will now use std.file.read instead.

The new trace output is here, and it seems you're right: isKeyword and isType (yes, I decided to separate them again and keep both in the lexer) take a lot of time:
http://dpaste.1azy.net/0d6aff6b
 154874      106186       95403           0     pure nothrow bool Puzzle.Lexer.isKeyword(immutable(char)[])
 159688       78020       69289           0     pure nothrow bool Puzzle.Lexer.isType(immutable(char)[])

I never would have thought that. :)
March 02, 2013
I think that thanks to your suggestions isKeyword and isType are entirely inlined, because they aren't listed in the trace.log anymore (http://dpaste.1azy.net/b94b19ff). The only thing that still causes trouble is the isNext function.
Maybe I should also use a StringStream (implemented as a Range). What do you think?
March 03, 2013
On 03-Mar-2013 03:52, Namespace wrote:
> I think that thanks to your suggestions isKeyword and isType are
> entirely inlined, because they aren't listed in the trace.log anymore
> (http://dpaste.1azy.net/b94b19ff). The only thing that still causes
> trouble is the isNext function.
> Maybe I should also use a StringStream (implemented as a Range). What
> do you think?

I'd repeat that I think it makes no sense to separately treat isType. In any case there is far more to types than the built-in ones, and it is the job of the parser (to assume types) and the semantic step (to type-check and infer).

Another thing is to run -profile without inline to better understand the structure of time spent in each subroutine.

-- 
Dmitry Olshansky
March 03, 2013
On 03-Mar-2013 03:12, Namespace wrote:
> I noticed it too late: decode comes from readText, since readText
> validates your text. I will now use std.file.read instead.
>

That's why Walter promotes profilers. They help verify hypotheses about performance and constantly surprise people with what they actually find.

> The new trace output is here and it seems you're right, isKeyword and
> isType (yes I'd decided to separate it again and keep both in the lexer)
> take a lot of time:
> http://dpaste.1azy.net/0d6aff6b
>   154874      106186       95403           0     pure nothrow bool
> Puzzle.Lexer.isKeyword(immutable(char)[])
>   159688       78020       69289           0     pure nothrow bool
> Puzzle.Lexer.isType(immutable(char)[])
>
> I never would have thought that. :)



-- 
Dmitry Olshansky
March 03, 2013
> I'd repeat that I think it makes no sense to separately treat isType. In any case there is far more to types than the built-in ones, and it is the job of the parser (to assume types) and the semantic step (to type-check and infer).
Yes, I understand, but for now I want to keep it this way.

> Another thing is to run -profile without inline to better understand the structure of time spent in each subroutine.
I did this and I refactored a lot of my code.
Now I get 170 - 180 msecs and this is my current trace.log:
http://dpaste.1azy.net/b94b19ff

But currently I have no more ideas for how to gain more performance.
Maybe I should disable the GC while looping through the text?
Or maybe I should allocate more space for the Token array up front (e.g. toks.length = 100;)?
I hope that you or anyone else has further ideas.
But anyway, thanks for the help! :)
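As a side note on the pre-allocation idea: D dynamic arrays support reserve, which pre-allocates capacity without changing the length, so subsequent appends don't reallocate. A minimal sketch, assuming a hypothetical Token struct (not the original code):

```d
struct Token {
    string value;
}

void main() {
    Token[] toks;
    toks.reserve(1024); // pre-allocate capacity; length stays 0
    assert(toks.length == 0);
    assert(toks.capacity >= 1024);

    foreach (i; 0 .. 100)
        toks ~= Token("tok"); // appends reuse the reserved block
    assert(toks.length == 100);
}
```

Unlike `toks.length = 100;`, reserve doesn't create default-initialized elements, so you can still append normally afterwards.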
March 03, 2013
On 03-Mar-2013 18:28, Namespace wrote:
>> I'd repeat that I think it makes no sense to separately treat isType.
>> In any case there is far more to types than the built-in ones, and it
>> is the job of the parser (to assume types) and the semantic step (to
>> type-check and infer).
> Yes, I understand, but for now I want to keep it this way.
>
>> Another thing is to run -profile without inline to better understand
>> the structure of time spent in each subroutine.
> I did this and I refactored a lot of my code.
> Now I get 170 - 180 msecs and this is my current trace.log:
> http://dpaste.1azy.net/b94b19ff
>
> But currently I have no more ideas for how to gain more performance.
> Maybe I should disable the GC while looping through the text?
> Or maybe I should allocate more space for the Token array up front
> (e.g. toks.length = 100;)?
> I hope that you or anyone else has further ideas.
> But anyway, thanks for the help! :)

Simple - don't use array append, and don't produce an array at all.
Just produce a lazy forward range that is iterated.

At the very least use Appender!(Token[]); it ought to be much faster.
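A minimal sketch of the Appender approach, assuming a hypothetical Token struct (names are illustrative, not from the original code):

```d
import std.array : appender;

struct Token {
    string value;
    size_t line;
}

void main() {
    auto toks = appender!(Token[])();
    toks.reserve(4096); // allocate once up front instead of growing on every append

    toks.put(Token("int", 1));
    toks.put(Token("x", 1));

    assert(toks.data.length == 2);
    assert(toks.data[0].value == "int");
}
```

Appender amortizes growth much better than the built-in `~=`, because it caches the capacity instead of querying the GC on every append.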

-- 
Dmitry Olshansky
March 03, 2013
> Simple - don't use array append, and don't produce an array at all.
> Just produce a lazy forward range that is iterated.
>
> At the very least use Appender!(Token[]) it ought to be much faster.

But doesn't each range have an array internally?
Or how does that work?

I'll try to use Appender, but this lazy forward range sounds interesting.
March 03, 2013
On Sunday, 3 March 2013 at 16:19:27 UTC, Namespace wrote:
>> Simple - don't use array append, and don't produce an array at all.
>> Just produce a lazy forward range that is iterated.
>>
>> At the very least use Appender!(Token[]) it ought to be much faster.
>
> But doesn't each range have an array internally?
> Or how does that work?
>
> I'll try to use Appender, but this lazy forward range sounds interesting.

Appender does not work for my Token struct because I have const/immutable members.
So I cannot append anything.

Example: http://dpaste.1azy.net/21f100d3
Without 'const' it works fine.
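The root cause is that a struct with a const or immutable member is not assignable, and Appender's internals relied on assignment. A minimal sketch of the problem (Token here is an illustrative stand-in, not the linked code):

```d
struct Token {
    const string value; // const field: the compiler generates no opAssign
}

void main() {
    Token a = Token("x");
    Token b = Token("y");

    // Assignment is rejected, since it would overwrite the const member.
    static assert(!__traits(compiles, a = b));

    // Construction and copying still work.
    Token c = a;
    assert(c.value == "x");
}
```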
March 03, 2013
On 03-Mar-2013 20:19, Namespace wrote:
>> Simple - don't use array append, and don't produce an array at all.
>> Just produce a lazy forward range that is iterated.
>>
>> At the very least use Appender!(Token[]) it ought to be much faster.
>
> But doesn't each range have an array internally?

Of course not, or the whole range concept would be meaningless.

> Or how does that work?

By having a struct, or generally any object, that in fact works as a generator of tokens.

popFront lexes a new token, front returns the current token, and empty indicates whether there are any more tokens. See std.range for some simple ranges.

If you throw in a .save method you can replay tokenization from some point in the past (= some older state; think lookahead when parsing).
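A minimal sketch of such a lazy forward range, using a toy whitespace splitter in place of the real lexing logic (Token and all names here are illustrative assumptions):

```d
import std.range : isForwardRange;

struct Token {
    string value;
}

// Toy lexer: yields space-separated words as Tokens, one per popFront.
struct Lexer {
    private string input;
    private Token current;
    private bool done;

    this(string src) {
        input = src;
        popFront(); // prime the first token
    }

    @property bool empty() const { return done; }
    @property Token front() const { return current; }

    void popFront() {
        size_t i = 0;
        while (i < input.length && input[i] == ' ') ++i; // skip spaces
        if (i == input.length) { done = true; return; }
        size_t j = i;
        while (j < input.length && input[j] != ' ') ++j; // scan the word
        current = Token(input[i .. j]);
        input = input[j .. $]; // advance past the token just lexed
    }

    // Copying the struct snapshots the lexer state -> forward range.
    @property Lexer save() { return this; }
}

static assert(isForwardRange!Lexer);

void main() {
    import std.algorithm : equal, map;

    auto lx = Lexer("int x = 5");
    auto mark = lx.save; // can replay from here later (lookahead)
    assert(lx.map!(t => t.value).equal(["int", "x", "=", "5"]));
    assert(mark.front.value == "int");
}
```

No token array exists anywhere: each token lives only while the consumer looks at it, which is what makes the range lazy.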

> I'll try to use Appender, but this lazy forward range sounds interesting.


-- 
Dmitry Olshansky
March 03, 2013
On 03-Mar-2013 20:36, Namespace wrote:
> On Sunday, 3 March 2013 at 16:19:27 UTC, Namespace wrote:
>>> Simple - don't use array append, and don't produce an array at all.
>>> Just produce a lazy forward range that is iterated.
>>>
>>> At the very least use Appender!(Token[]) it ought to be much faster.
>>
>> But doesn't each range have an array internally?
>> Or how does that work?
>>
>> I'll try to use Appender, but this lazy forward range sounds interesting.
>
> Appender does not work for my Token struct because I have
> const/immutable members.
> So I cannot append anything.
>
> Example: http://dpaste.1azy.net/21f100d3
> Without 'const' it works fine.

Just rip off the const; why would you need it anyway?

-- 
Dmitry Olshansky