std.d.lexer: pre-voting review / discussion - D Programming Language Discussion Forum



September 11, 2013

std.d.lexer: pre-voting review / discussion

Posted by Dicebot

std.d.lexer is a standard module for lexing D code, written by Brian Schott.

---- Input ----

Code: https://github.com/Hackerpilot/phobos/tree/master/std/d

Documentation:
http://hackerpilot.github.io/experimental/std_lexer/phobos/lexer.html

Initial discussion:
http://forum.dlang.org/thread/dpdgcycrgfspcxenzrjf@forum.dlang.org

Usage example in real project:
https://github.com/Hackerpilot/Dscanner
(as stdx.d.lexer; Brian, please correct me if the versions do not match)

---- Information for reviewers ----

(yes, I am mostly copy-pasting this :P)

The goal of this thread is to detect whether there are any outstanding
issues that need to be fixed before the formal "yes"/"no" voting
happens. If no critical objections arise, voting will begin
next week. Otherwise it depends on how much time the module author needs to implement suggestions.

Please take this part seriously: "If you identify problems along the
way, please note if they are minor, serious, or showstoppers."
(http://wiki.dlang.org/Review/Process). This information will later
be used to determine whether the library is ready for voting.

Frequent Phobos contributors and core developers: please pay extra
attention to the submission's code style and how well it fits
the overall Phobos guidelines and structure.

The most important goal of this review is to identify any API / design problems. Internal implementation tweaks can happen after inclusion in Phobos, but it is important to ensure that no breaking changes will be required soon after the module gets wider usage.

---- Information request from module author ----

Performance was a major topic of discussion in the previous thread. Could you please provide benchmarking data for the version currently under review?

September 11, 2013

Re: std.d.lexer: pre-voting review / discussion

Posted by Tove
in reply to Dicebot

On Wednesday, 11 September 2013 at 15:02:00 UTC, Dicebot wrote:
> std.d.lexer is standard module for lexing D code, written by Brian Schott

I remember reading there were some interesting hash-advances in dmd recently.

http://forum.dlang.org/thread/kq7ov0$2o8n$1@digitalmars.com?page=1

Maybe it's worth benchmarking those hashes for std.d.lexer as well.

September 11, 2013

Re: std.d.lexer: pre-voting review / discussion

Posted by Johannes Pfau
in reply to Dicebot

On Wed, 11 Sep 2013 17:01:58 +0200,
"Dicebot" <public@dicebot.lv> wrote:

> std.d.lexer is standard module for lexing D code, written by Brian Schott
> 

Question / Minor issue:

As we already have a range based interface I'd love to have partial lexing / parsing, especially for IDEs.

Say I have this source code:
--------------------------------------------
1: module a;
2:
3: void test(int a)
4: {
5:     [...]
6: }
7:
8: void test2()
9: [...]
--------------------------------------------

Then I first do a full parse pass over the source. Now line 5 is being edited. I know from the full parse that line 5 is part of a FunctionDeclaration which starts at line 3 and ends at line 6. Now I'd like to re-parse only that part:

--------------------------------------------
FunctionDeclaration decl = document.getDeclByLine(5);
decl.reparse(/*start_line=*/ 3, docBuffer);
--------------------------------------------

I think these are the two critical points related to this for the proposed std.d.lexer:

* How can I tell the lexer to start lexing at line/character n? Of
  course the input could be sliced, but then the line number and position
  information in the Token struct is wrong.
* I guess std.d.lexer slices the original input? This could make things
  difficult if the file buffer is edited in place. But this can
  probably be dealt with outside of std.d.lexer (by reallocating all
  .value members).


(And once this is working, an example in the docs would be great)

But to be honest I'm not sure how important this really is. I think it should help for more responsive IDEs but maybe parsing is not a bottleneck at all?

September 11, 2013

Re: std.d.lexer: pre-voting review / discussion

Posted by qznc
in reply to Dicebot

On Wednesday, 11 September 2013 at 15:02:00 UTC, Dicebot wrote:
> std.d.lexer is standard module for lexing D code, written by Brian Schott
>
> Documentation:
> http://hackerpilot.github.io/experimental/std_lexer/phobos/lexer.html

The documentation for Token twice says "measured in ASCII characters or UTF-8 code units", which sounds confusing to me.

Is it UTF-8, which includes ASCII? Then it should not be "or".

This is nitpicking. Overall, I like the proposal. Great work!

September 11, 2013

Re: std.d.lexer: pre-voting review / discussion

Posted by Walter Bright
in reply to Dicebot

On 9/11/2013 8:01 AM, Dicebot wrote:
> std.d.lexer is standard module for lexing D code, written by Brian Schott

Thank you, Brian! This is important work.

Not a thorough review, just some notes from reading the doc file:

1. I don't like the _ suffix for keywords. Just call it kwimport or something like that.

2. The example uses an if-then sequence of isBuiltType, isKeyword, etc. Should be an enum so a switch can be done for speed.

3. I assumed TokenType is a type. But it's not, it's an enum. Even the document says it's a 'type', but it's not a type.

4. When naming tokens like .. 'slice', it is giving it a syntactic/semantic name rather than a token name. This would be awkward if .. took on new meanings in D. Calling it 'dotdot' would be clearer. Ditto for the rest. For an example that is done better: '*' is called 'star' rather than 'dereference'.

5. The LexerConfig initialization should be a constructor rather than a sequence of assignments. LexerConfig documentation is awfully thin. For example, 'tokenStyle' is explained as being 'Token style', whatever that is.

6. No clue how lookahead works with this. Parsing D requires arbitrary lookahead.

7. uint line; Should indicate that lines start with '1', not '0'. Ditto for columns.

8. 'default_' Again with the awful practice of appending _.

9. Need to insert intra-page navigation links, such as when 'byToken()' appears in the text, it should be a link to where byToken is described.
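
Point 2 above (an enum plus a switch instead of a chain of predicate calls) might look roughly like the sketch below. It is an illustration only and not std.d.lexer's API: `Kind`, `Category`, and `categorize` are hypothetical names, and the token kinds are a tiny made-up subset that happens to follow the naming from points 1 and 4.

```d
import std.stdio : writeln;

/// Hypothetical token kinds; a tiny subset for illustration only.
enum Kind { int_, import_, star, dotdot, identifier }

/// Hypothetical coarse category a consumer might want to branch on.
enum Category { builtinType, keyword, op, other }

/// Maps a token kind to its category with a single switch
/// instead of a sequence of isXxx() tests.
Category categorize(Kind k)
{
    // `final switch` makes the compiler reject the code
    // if a Kind member is not handled.
    final switch (k)
    {
        case Kind.int_:       return Category.builtinType;
        case Kind.import_:    return Category.keyword;
        case Kind.star:
        case Kind.dotdot:     return Category.op;
        case Kind.identifier: return Category.other;
    }
}

void main()
{
    writeln(categorize(Kind.import_)); // prints "keyword"
}
```

Because the switch is `final`, adding a new member to `Kind` without extending `categorize` is a compile-time error, which is one argument for the enum-based design.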

> Goal of this thread is to detect if there are any outstanding
> issues that need to fixed before formal "yes"/"no" voting
> happens. If no critical objections will arise, voting will begin
> starting with a next week. Otherwise it depends on time module author needs to implement suggestions.

I believe the state of the documentation is a showstopper, and needs to be extensively fleshed out before it can be considered ready for voting.

September 11, 2013

Re: std.d.lexer: pre-voting review / discussion

Posted by Brian Schott
in reply to Walter Bright

The choice of ending token names with underscores was made according to the Phobos style guide.

http://dlang.org/dstyle.html

September 11, 2013

Re: std.d.lexer: pre-voting review / discussion

Posted by Walter Bright
in reply to Brian Schott

On 9/11/2013 12:10 PM, Brian Schott wrote:
> The choice of ending token names with underscores was made according to the
> Phobos style guide.
>
> http://dlang.org/dstyle.html

I didn't realize that was in the style guide. I guess I can't complain about it, then :-)

September 11, 2013

Re: std.d.lexer: pre-voting review / discussion

Posted by Walter Bright
in reply to qznc

On 9/11/2013 11:45 AM, qznc wrote:
> On Wednesday, 11 September 2013 at 15:02:00 UTC, Dicebot wrote:
>> std.d.lexer is standard module for lexing D code, written by Brian Schott
>>
>> Documentation:
>> http://hackerpilot.github.io/experimental/std_lexer/phobos/lexer.html
>
> The documentation for Token twice says "measured in ASCII characters or UTF-8
> code units", which sounds confusing to me.
>
> Is it UTF-8, which includes ASCII? Then it should not be "or".

Pedantically, it is just UTF-8 code units, which are a superset of ASCII.
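
The distinction is easy to check in D itself. This small sketch uses only the built-in `.length` of a `string` (which counts UTF-8 code units) and `std.range.walkLength` (which counts decoded code points); it says nothing about how std.d.lexer computes its positions internally.

```d
import std.range : walkLength;
import std.stdio : writeln;

void main()
{
    string ascii = "abc";
    string accented = "\u00E9"; // U+00E9, encoded as two UTF-8 code units

    writeln(ascii.length);        // 3: one code unit per ASCII character
    writeln(accented.length);     // 2: code units, not characters
    writeln(accented.walkLength); // 1: a single code point
}
```

For pure ASCII input the two measures coincide, which is why "ASCII characters or UTF-8 code units" reduces to just "UTF-8 code units".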

September 11, 2013

Re: std.d.lexer: pre-voting review / discussion

Posted by Walter Bright
in reply to Johannes Pfau

On 9/11/2013 11:43 AM, Johannes Pfau wrote:
> But to be honest I'm not sure how important this really is. I think it
> should help for more responsive IDEs but maybe parsing is not a
> bottleneck at all?

It is important, and I'm glad you brought it up. The LexerConfig can provide a spot to put a starting line/column value.
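
A minimal sketch of that idea, under stated assumptions: `SketchConfig`, `Tok`, and `lexWords` are hypothetical names invented for this illustration, not part of std.d.lexer, and the "lexer" only splits on whitespace. It shows how a starting line/column in the config lets a sliced input still report positions relative to the whole document.

```d
import std.stdio : writeln;

/// Hypothetical config: where the slice begins in the full document.
struct SketchConfig
{
    uint startLine = 1;   // 1-based line of the slice's first character
    uint startColumn = 1; // 1-based column of the slice's first character
}

/// Hypothetical token: a slice of the input plus its document position.
struct Tok
{
    string text;
    uint line;
    uint column; // measured in UTF-8 code units
}

/// Splits `src` into whitespace-separated words, reporting positions
/// relative to the full document rather than to the slice.
Tok[] lexWords(string src, SketchConfig cfg)
{
    Tok[] toks;
    uint line = cfg.startLine;
    uint col = cfg.startColumn;
    size_t i = 0;
    while (i < src.length)
    {
        immutable c = src[i];
        if (c == '\n')
        {
            ++line;
            col = 1;
            ++i;
        }
        else if (c == ' ' || c == '\t' || c == '\r')
        {
            ++col;
            ++i;
        }
        else
        {
            immutable start = i;
            immutable startCol = col;
            while (i < src.length && src[i] != ' ' && src[i] != '\t'
                    && src[i] != '\r' && src[i] != '\n')
            {
                ++i;
                ++col;
            }
            // The token text is a slice of the input; nothing is copied.
            toks ~= Tok(src[start .. i], line, startCol);
        }
    }
    return toks;
}

void main()
{
    // Pretend this slice begins at line 3 of the full file.
    auto toks = lexWords("void test(int a)\n{", SketchConfig(3, 1));
    foreach (t; toks)
        writeln(t.line, ":", t.column, " ", t.text);
}
```

Since each `Tok.text` is a slice of the input, the in-place editing concern raised earlier still applies: a caller that mutates or reallocates the buffer would have to refresh or copy those slices itself.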

September 11, 2013

Re: std.d.lexer: pre-voting review / discussion

Posted by Manfred Nowak
in reply to Walter Bright

Walter Bright wrote:
> Parsing D requires arbitrary lookahead.

Why---and since which version?

-manfred
