January 27, 2013 Re: Request for comments: std.d.lexer | ||||
---|---|---|---|---|
| ||||
Posted in reply to Dmitry Olshansky | > The hint is that your question is a bit faulty: by calling it "the D grammar" do you mean the exact one listed on the website or any equivalent that parses the same language (including the ones obtained by simple transformations)?
The latter. The one I use for Pegged to generate (what is hopefully) a D parser is already modified: it discards constructs like NameList := Name NameList in favor of Name+.
Anyway, let's stop here. Back to lexing proper :)
|
January 27, 2013 Re: Request for comments: std.d.lexer | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Sunday, 27 January 2013 at 19:46:12 UTC, Walter Bright wrote:
> On 1/27/2013 1:51 AM, Brian Schott wrote:
>> I'm interested in ideas on the API design and other high-level issues at the
>> moment. I don't consider this ready for inclusion. (The current module being
>> reviewed for inclusion in Phobos is the new std.uni.)
>
> Just a quick comment: byToken() should not accept a filename. Its input should be via an InputRange, not a file.
The file name is accepted for eventual error reporting purposes. The actual input for the lexer is the parameter called "range".
Regarding the times that I posted, my point was that it's not slower than "dmd -c", nothing more.
|
January 27, 2013 Re: Request for comments: std.d.lexer | ||||
---|---|---|---|---|
| ||||
Posted in reply to Brian Schott | On 01/27/2013 10:39 PM, Brian Schott wrote:
> ...
>
> Regarding the times that I posted, my point was that it's not slower
> than "dmd -c", nothing more.
Sure. The point you brought across, however, was that it is not significantly faster yet. :o)
|
January 27, 2013 Re: Request for comments: std.d.lexer | ||||
---|---|---|---|---|
| ||||
Posted in reply to Brian Schott | On 1/27/2013 1:39 PM, Brian Schott wrote:
> The file name is accepted for eventual error reporting purposes.
Use an OutputRange for that.
|
January 28, 2013 Re: Request for comments: std.d.lexer | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Sunday, 27 January 2013 at 23:49:11 UTC, Walter Bright wrote:
> On 1/27/2013 1:39 PM, Brian Schott wrote:
>> The file name is accepted for eventual error reporting purposes.
>
> Use an OutputRange for that.
What about that delegate-based design? I thought everyone agreed that it was nice?
David
|
January 28, 2013 Re: Request for comments: std.d.lexer | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Nadlinger | On 1/27/2013 4:48 PM, David Nadlinger wrote:
> On Sunday, 27 January 2013 at 23:49:11 UTC, Walter Bright wrote:
>> On 1/27/2013 1:39 PM, Brian Schott wrote:
>>> The file name is accepted for eventual error reporting purposes.
>>
>> Use an OutputRange for that.
>
> What about that delegate-based design? I thought everyone agreed that it was nice?
An OutputRange is a way of doing that. The advantage of OutputRanges is that they are TheWayToDoThings in Phobos, so that components can all interoperate and plug into each other.
|
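To make the OutputRange suggestion above concrete, here is a minimal sketch in D. It is not the std.d.lexer API: `LexError` and `checkStrings` are hypothetical names invented for illustration, and the "lexer" is a toy that only notices an unterminated string literal. The point is only that the scanner reports into whatever sink the caller supplies.

```d
import std.array : appender;
import std.range : isOutputRange, put;

// Hypothetical error record; not part of std.d.lexer.
struct LexError
{
    string message;
    size_t line;
}

// Hypothetical toy scanner: reports an unterminated string literal to
// any output range of LexError instead of printing or throwing.
void checkStrings(Sink)(string source, ref Sink errors)
    if (isOutputRange!(Sink, LexError))
{
    size_t line = 1;
    size_t openedOn = 0; // 0 means "not inside a string literal"
    foreach (char c; source)
    {
        if (c == '"')
            openedOn = openedOn ? 0 : line; // toggle on every quote
        else if (c == '\n')
            ++line;
    }
    if (openedOn)
        put(errors, LexError("unterminated string literal", openedOn));
}

void main()
{
    // An Appender is one possible sink; the caller picks it.
    auto errors = appender!(LexError[])();
    checkStrings("auto a = \"ok\";\nauto b = \"oops;\n", errors);
    assert(errors.data.length == 1);
    assert(errors.data[0].line == 2);
}
```

Because `std.range.put` also accepts a callable that takes the element type, a delegate taking a `LexError` could be passed as the sink as well, which is one way to read "An OutputRange is a way of doing that".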
January 28, 2013 Re: Request for comments: std.d.lexer | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Sunday, 27 January 2013 at 23:49:11 UTC, Walter Bright wrote:
> On 1/27/2013 1:39 PM, Brian Schott wrote:
>> The file name is accepted for eventual error reporting purposes.
>
> Use an OutputRange for that.
I think you misunderstand. The file name is so that if you pass in "foo.d" the lexer can say "Error: unterminated string literal beginning on line 123 of foo.d". It's not so that error messages will be written to a file of that name.
On the topic of performance, I realized that the numbers posted previously were actually for a debug build. Fail.
For whatever reason, the current version of the lexer code isn't triggering my heisenbug[1] and I was able to build with -release -inline -O.
Here's what avgtime has to say:
$ avgtime -q -h -r 200 dscanner --tokenCount ../phobos/std/datetime.d
------------------------
Total time (ms): 51409.8
Repetitions    : 200
Sample mode    : 250 (169 ocurrences)
Median time    : 255.57
Avg time       : 257.049
Std dev.       : 4.39338
Minimum        : 252.931
Maximum        : 278.658
95% conf.int.  : [248.438, 265.66]  e = 8.61087
99% conf.int.  : [245.733, 268.366]  e = 11.3166
EstimatedAvg95%: [256.44, 257.658]  e = 0.608881
EstimatedAvg99%: [256.249, 257.849]  e = 0.800205
Histogram      :
    msecs: count  normalized bar
      250:   169  ########################################
      260:    22  #####
      270:     9  ##
Which works out to 1,327,784 tokens per second on my Ivy Bridge i7.
I created a small program that demangles the output of valgrind so that tools like KCachegrind can display profiling information more clearly. It's now on the wiki[2].
The bottleneck in std.d.lexer as it stands is the appender instances that assemble Token.value during iteration and front() on the array of char[]. (As I'm sure everyone expected)
[1] http://forum.dlang.org/thread/bug-9353-3@http.d.puremagic.com%2Fissues%2F
[2] http://wiki.dlang.org/Other_Dev_Tools
|
January 28, 2013 Re: Request for comments: std.d.lexer | ||||
---|---|---|---|---|
| ||||
Posted in reply to Brian Schott | On 1/27/2013 4:53 PM, Brian Schott wrote:
> On Sunday, 27 January 2013 at 23:49:11 UTC, Walter Bright wrote:
>> On 1/27/2013 1:39 PM, Brian Schott wrote:
>>> The file name is accepted for eventual error reporting purposes.
>>
>> Use an OutputRange for that.
>
> I think you misunderstand. The file name is so that if you pass in "foo.d" the
> lexer can say "Error: unterminated string literal beginning on line 123 of
> foo.d". It's not so that error messages will be written to a file of that name.
Yes, I did misunderstand. I suggest updating the documentation to clear up the misunderstanding.
|
January 28, 2013 Re: Request for comments: std.d.lexer | ||||
---|---|---|---|---|
| ||||
Posted in reply to Brian Schott | On Monday, 28 January 2013 at 00:53:03 UTC, Brian Schott wrote:
> On Sunday, 27 January 2013 at 23:49:11 UTC, Walter Bright wrote:
>> On 1/27/2013 1:39 PM, Brian Schott wrote:
>>> The file name is accepted for eventual error reporting purposes.
>>
>> Use an OutputRange for that.
>
> I think you misunderstand. The file name is so that if you pass in "foo.d" the lexer can say "Error: unterminated string literal beginning on line 123 of foo.d". It's not so that error messages will be written to a file of that name.
>
I don't think that is a good idea. For instance, mixins need to be lexed but don't come from a file.
The lexer should report the error; what is done on error is up to the user of the lexer.
|
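The objection above (the lexer reports, the user decides) can be sketched with a plain callback. This is an illustration under assumptions, not the design that was adopted: `ErrorHandler` and `lexSketch` are hypothetical names, and the origin label for mixin text is made up. The lexer takes no file name at all; the caller attaches whatever origin it knows about.

```d
import std.format : format;

// Hypothetical callback type: the lexer only reports, the caller decides
// whether to print, collect, or throw.
alias ErrorHandler = void delegate(string message, size_t line);

// Hypothetical toy check standing in for a real lexer. It knows nothing
// about files, so it works equally well on text from a string mixin.
void lexSketch(string source, scope ErrorHandler onError)
{
    size_t line = 1;
    size_t openedOn = 0; // 0 means "not inside a string literal"
    foreach (char c; source)
    {
        if (c == '"')
            openedOn = openedOn ? 0 : line;
        else if (c == '\n')
            ++line;
    }
    if (openedOn)
        onError("unterminated string literal", openedOn);
}

void main()
{
    string[] collected;
    // The caller supplies the origin; this label is an invented example.
    immutable origin = "<mixin at foo.d:10>";
    lexSketch("int x = \"abc;", (string msg, size_t line) {
        collected ~= format("%s(%s): %s", origin, line, msg);
    });
    assert(collected == ["<mixin at foo.d:10>(1): unterminated string literal"]);
}
```

A caller that prefers exceptions could throw from the delegate instead of collecting, which keeps that policy out of the lexer itself.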
January 28, 2013 Re: Request for comments: std.d.lexer | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Monday, 28 January 2013 at 00:51:28 UTC, Walter Bright wrote:
> On 1/27/2013 4:48 PM, David Nadlinger wrote:
>> On Sunday, 27 January 2013 at 23:49:11 UTC, Walter Bright wrote:
>>> On 1/27/2013 1:39 PM, Brian Schott wrote:
>>>> The file name is accepted for eventual error reporting purposes.
>>>
>>> Use an OutputRange for that.
>>
>> What about that delegate-based design? I thought everyone agreed that it was nice?
>
> An OutputRange is a way of doing that. The advantage of OutputRanges is that they are TheWayToDoThings in Phobos, so that components can all interoperate and plug into each other.
I was talking about the design you proposed yourself here: http://forum.dlang.org/post/jvp9ke$2m45$1@digitalmars.com
Oh, and you really don't need to give me the basic Phobos/ranges sales pitch, I think I'm quite aware of their advantages. I'm just not sure that e.g. having an "exception thrower" output range would be a wise design decision.
David
|