November 24, 2010
Re: Looking for champion - std.lang.d.lex
On 19/11/2010 23:39, Andrei Alexandrescu wrote:
> On 11/19/10 1:03 PM, Bruno Medeiros wrote:
>> On 22/10/2010 20:48, Andrei Alexandrescu wrote:
>>> On 10/22/10 14:02 CDT, Tomek Sowiński wrote:
>>>> On 22-10-2010 at 00:01:21, Walter Bright <newshound2@digitalmars.com>
>>>> wrote:
>>>>
>>>>> As we all know, tool support is important for D's success. Making
>>>>> tools easier to build will help with that.
>>>>>
>>>>> To that end, I think we need a lexer for the standard library -
>>>>> std.lang.d.lex. It would be helpful in writing color syntax
>>>>> highlighting filters, pretty printers, repl, doc generators, static
>>>>> analyzers, and even D compilers.
>>>>>
>>>>> It should:
>>>>>
>>>>> 1. support a range interface for its input, and a range interface for
>>>>> its output
>>>>> 2. optionally not generate lexical errors, but just try to recover and
>>>>> continue
>>>>> 3. optionally return comments and ddoc comments as tokens
>>>>> 4. the tokens should be a value type, not a reference type
>>>>> 5. generally follow along with the C++ one so that they can be
>>>>> maintained in tandem
>>>>>
>>>>> It can also serve as the basis for creating a javascript
>>>>> implementation that can be embedded into web pages for syntax
>>>>> highlighting, and eventually an std.lang.d.parse.
>>>>>
>>>>> Anyone want to own this?
>>>>
>>>> Interesting idea. Here's another: D will soon need bindings for CORBA,
>>>> Thrift, etc, so lexers will have to be written all over to grok
>>>> interface files. Perhaps a generic tokenizer which can be parametrized
>>>> with a lexical grammar would bring more ROI, I got a hunch D's
>>>> templates
>>>> are strong enough to pull this off without any source code generation
>>>> ala JavaCC. The books I read on compilers say tokenization is a solved
>>>> problem, so the theory part on what a good abstraction should be is
>>>> done. What do you think?
>>>
>>> Yes. IMHO writing a D tokenizer is a wasted effort. We need a tokenizer
>>> generator.
>>>
>>
>> Agreed, of all the things desired for D, a D tokenizer would rank pretty
>> low I think.
>>
>> Another thing, even though a tokenizer generator would be much more
>> desirable, I wonder if it is wise to have that in the standard library?
>> It does not seem to be of wide enough interest to be in a standard
>> library. (Out of curiosity, how many languages have such a thing in
>> their standard library?)
>
> Even C has strtok.
>
> Andrei

That's just a fancy splitter; I wouldn't call that a proper tokenizer. I
meant something that, at the very least, would tokenize based on regular
expressions (and have heterogeneous tokens).
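
To make the distinction concrete, here is a minimal sketch of a
regex-driven tokenizer with heterogeneous tokens. It is written in Python
purely for illustration and is not code from this thread; the token kinds,
the toy grammar in TOKEN_SPEC, and the names Token and lex are all
assumptions made up for the example, and the grammar is far simpler than
D's real lexical grammar. It also loosely mirrors three of the
requirements quoted above: lazy output (a generator standing in for a
range), optional comment tokens, and recovery from bad input instead of
raising an error.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    """An immutable, value-like token (hypothetical layout)."""
    kind: str   # e.g. "NUMBER", "IDENT", "OP", "COMMENT", "ERROR"
    text: str   # the exact source slice that matched
    pos: int    # character offset of the token in the input


# Toy lexical grammar, tried in order. NOT D's actual grammar.
TOKEN_SPEC = [
    ("COMMENT", r"//[^\n]*"),        # line comment, must precede OP's "/"
    ("NUMBER",  r"\d+(?:\.\d+)?"),   # integer or simple decimal
    ("IDENT",   r"[A-Za-z_]\w*"),    # identifier
    ("OP",      r"[-+*/=;()]"),      # single-character operators
    ("SKIP",    r"\s+"),             # whitespace, dropped
    ("ERROR",   r"."),               # anything else: report and continue
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def lex(source: str, keep_comments: bool = False) -> Iterator[Token]:
    """Lazily yield tokens; unknown characters become ERROR tokens."""
    for m in MASTER.finditer(source):
        kind = m.lastgroup
        if kind == "SKIP" or (kind == "COMMENT" and not keep_comments):
            continue
        yield Token(kind, m.group(), m.start())


if __name__ == "__main__":
    for tok in lex("x = 42; // answer", keep_comments=True):
        print(tok)
```

A strtok-style splitter applied to the same input only cuts at delimiters:
splitting "x = 42;" on whitespace leaves "42;" as one piece and attaches no
kind to any of the pieces, whereas the sketch above separates the number
from the semicolon and labels each token.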

-- 
Bruno Medeiros - Software Engineer
November 24, 2010
Re: Looking for champion - std.lang.d.lex
On 19/11/2010 23:56, Michael Stover wrote:
> so that was 4 months ago - how do things currently stand on that initiative?
>
> -Mike
>
> On Fri, Nov 19, 2010 at 6:37 PM, Bruno Medeiros
> <brunodomedeiros+spam@com.gmail> wrote:
>
>     On 19/11/2010 22:25, Michael Stover wrote:
>
>         As for D lexers and tokenizers, what would be nice is to
>         A) build an antlr grammar for D
>         B) build D targets for antlr so that antlr can generate lexers and
>         parsers in the D language.
>
>         For B) I found http://www.mbutscher.de/antlrd/index.html
>
>         For A) A good list of antlr grammars is at
>         http://www.antlr.org/grammar/list, but there isn't a D grammar.
>
>         These things wouldn't be an enormous amount of work to create and
>         maintain, and, if done, anyone could parse D code in many languages,
>         including Java and C which would make providing IDE features for D
>         development easier in those languages (eclipse for instance),
>         and you
>         could build lexers and parsers in D using antlr grammars.
>
>         -Mike
>
>
>     Yes, that would be much better. It would be directly and immediately
>     useful for the DDT project:
>
>     "But better yet would be to start coding our own custom parser
>     (using a parser generator like ANTLR for example), that could really
>     be tailored for IDE needs. In the medium/long term, that's probably
>     what needs to be done. "
>     in
>     http://www.digitalmars.com/d/archives/digitalmars/D/ide/Future_of_Descent_and_D_Eclipse_IDE_635.html
>
>     --
>     Bruno Medeiros - Software Engineer
>
>

I don't know about Ellery, as you can see in that thread he/she(?) 
mentioned interest in working on that, but I don't know anything more.

As for me, I didn't work on that, nor did I plan to.
Nor am I planning to anytime soon, DDT can handle things with the 
current parser for now (bugs can be fixed on the current code, perhaps 
some limitations can be resolved by merging some more code from DMD), so 
I'll likely work on other more important features before I go there. For 
example, I'll likely work on debugger integration, and code completion 
improvements before I would go on writing a new parser from scratch. 
Plus, it gives more time for someone else to hopefully work on it. :P

Unlike Walter, I can't write a D parser in a weekend... :) Not even in a
week, especially since I've never done anything of this kind before.


-- 
Bruno Medeiros - Software Engineer
November 24, 2010
Re: Looking for champion - std.lang.d.lex
On 11/24/2010 09:13 AM, Bruno Medeiros wrote:
>
> I don't know about Ellery, as you can see in that thread he/she(?)
> mentioned interest in working on that, but I don't know anything more.
>

Normally I go by 'it'.

Been pretty busy this semester, so I haven't been doing much.

But the bottom line is, yes I have working antlr grammars for D1 and D2 
if you don't mind
1) they're slow
2) they're tied to a hacked-out version of the netbeans fork of ANTLR2
3) they're tied to some custom java code
4) I haven't been keeping the tree grammars so up to date

I've not released them for those reasons. Semester will be over in about 
3 weeks, though, and I'll have time then.

> As for me, I didn't work on that, nor did I plan to.
> Nor am I planning to anytime soon, DDT can handle things with the
> current parser for now (bugs can be fixed on the current code, perhaps
> some limitations can be resolved by merging some more code from DMD), so
> I'll likely work on other more important features before I go there. For
> example, I'll likely work on debugger integration, and code completion
> improvements before I would go on writing a new parser from scratch.
> Plus, it gives more time to hopefully someone else work on it. :P
>
> Unlike Walter, I can't write a D parser in a weekend... :) Not even on a
> week, especially since I never done anything of this kind before.
>
>

It took me like 3 months to read his parser to figure out what was going on.
November 24, 2010
Re: Looking for champion - std.lang.d.lex
On 24/11/2010 13:30, Bruno Medeiros wrote:
> On 19/11/2010 23:39, Andrei Alexandrescu wrote:
>> On 11/19/10 1:03 PM, Bruno Medeiros wrote:
>>> On 22/10/2010 20:48, Andrei Alexandrescu wrote:
>>>> On 10/22/10 14:02 CDT, Tomek Sowiński wrote:
>>>>> On 22-10-2010 at 00:01:21, Walter Bright <newshound2@digitalmars.com>
>>>>> wrote:
>>>>>
>>>>>> As we all know, tool support is important for D's success. Making
>>>>>> tools easier to build will help with that.
>>>>>>
>>>>>> To that end, I think we need a lexer for the standard library -
>>>>>> std.lang.d.lex. It would be helpful in writing color syntax
>>>>>> highlighting filters, pretty printers, repl, doc generators, static
>>>>>> analyzers, and even D compilers.
>>>>>>
>>>>>> It should:
>>>>>>
>>>>>> 1. support a range interface for its input, and a range interface for
>>>>>> its output
>>>>>> 2. optionally not generate lexical errors, but just try to recover
>>>>>> and
>>>>>> continue
>>>>>> 3. optionally return comments and ddoc comments as tokens
>>>>>> 4. the tokens should be a value type, not a reference type
>>>>>> 5. generally follow along with the C++ one so that they can be
>>>>>> maintained in tandem
>>>>>>
>>>>>> It can also serve as the basis for creating a javascript
>>>>>> implementation that can be embedded into web pages for syntax
>>>>>> highlighting, and eventually an std.lang.d.parse.
>>>>>>
>>>>>> Anyone want to own this?
>>>>>
>>>>> Interesting idea. Here's another: D will soon need bindings for CORBA,
>>>>> Thrift, etc, so lexers will have to be written all over to grok
>>>>> interface files. Perhaps a generic tokenizer which can be parametrized
>>>>> with a lexical grammar would bring more ROI, I got a hunch D's
>>>>> templates
>>>>> are strong enough to pull this off without any source code generation
>>>>> ala JavaCC. The books I read on compilers say tokenization is a solved
>>>>> problem, so the theory part on what a good abstraction should be is
>>>>> done. What do you think?
>>>>
>>>> Yes. IMHO writing a D tokenizer is a wasted effort. We need a tokenizer
>>>> generator.
>>>>
>>>
>>> Agreed, of all the things desired for D, a D tokenizer would rank pretty
>>> low I think.
>>>
>>> Another thing, even though a tokenizer generator would be much more
>>> desirable, I wonder if it is wise to have that in the standard library?
>>> It does not seem to be of wide enough interest to be in a standard
>>> library. (Out of curiosity, how many languages have such a thing in
>>> their standard library?)
>>
>> Even C has strtok.
>>
>> Andrei
>
> That's just a fancy splitter, I wouldn't call that a proper tokenizer. I
> meant something that, at the very least, would tokenize based on regular
> expressions (and have heterogeneous tokens).
>

In other words, a lexer; that might be a better term in this context.

-- 
Bruno Medeiros - Software Engineer
November 24, 2010
Re: Looking for champion - std.lang.d.lex
On 20/11/2010 01:29, Jonathan M Davis wrote:
> On Friday, November 19, 2010 15:17:35 Bruno Medeiros wrote:
>> On 19/11/2010 22:02, Jonathan M Davis wrote:
>>> On Friday, November 19, 2010 13:53:12 Bruno Medeiros wrote:
>>>> On 19/11/2010 21:27, Jonathan M Davis wrote:
>>>>
>>>> And by providing a lexer and a parser outside the standard library,
>>>> wouldn't it make it just as easy for those tools to be written? What's
>>>> the advantage of being in the standard library? I see only
>>>> disadvantages: to begin with it potentially increases the time that
>>>> Walter or other Phobos contributors may have to spend on it, even if
>>>> it's just reviewing patches or making sure the code works.
>>>
>>> If nothing else, it makes it easier to keep in line with dmd itself.
>>> Since the dmd front end is LGPL, it's not possible to have a Boost port
>>> of it (like the Phobos version will be) without Walter's consent. And
>>> I'd be surprised if he did that for a third party library (though he
>>> seems to be pretty open on a lot of that kind of stuff). Not to mention,
>>> Walter and the core developers are _exactly_ the kind of people that you
>>> want working on a lexer or parser of the language itself, because
>>> they're the ones who work on it.
>>>
>>> - Jonathan M Davis
>>
>> Eh? That license argument doesn't make sense: if the lexer and parser
>> were to be based on DMD itself, then putting it in the standard library
>> is equivalent (in licensing terms) to licensing the lexer and parser
>> parts of DMD in Boost. More correctly, what I mean by equivalent, is
>> that there is no reason why Walter would allow one thing and not the
>> other... (because in both cases he would have to issue that license)
>
> It's very different to have a D implementation of something - which is based on a
> C++ version but definitely different in some respects - be under Boost and
> generally available, and having the C++ implementation be under Boost -
> particularly when the C++ version covers far more than just a lexer and parser.
> Someone _could_ port the D code back to C++ and have that portion useable under
> Boost, but that's a lot more work than just taking the C++ code and using it,
> and it's only the portions of the compiler which were ported to D that could
> be re-used that way. And since the Boost code could be used in a commercial
> product while the LGPL is more restricted, it could make a definite difference.
>
> I'm not a licensing expert, and I'm not an expert on what Walter does and
> doesn't want done with his code, but he put the compiler front end under the
> LGPL, not Boost, and he's given his permission to have the lexer alone ported to
> D and put under the Boost license in the standard library, which is very
> different from putting the entire front end under Boost. I expect that the parser
> will follow eventually, but even if it does, that's still not the entire front
> end. So, the difference in licenses does have a real impact. And no one
> can take the LGPL C++ code and port it to D - for the standard library or
> otherwise - without Walter's permission, because it's his copyright on the code.
>

There are some misunderstandings here. First, the DMD front-end is
licensed under the GPL, not the LGPL.
Second, and more importantly, it is also licensed under the
Artistic license, a very permissive license. This is the basis for my
stating that Walter almost certainly would not mind licensing the DMD
parser and lexer under Boost, as it's actually not that different from
the Artistic license.


>
> Ideally, Phobos would be huge in a manner similar to how C# or Java's libraries
> are huge. It will take time to get there, and we'll need more developers, but I

That point actually works in my favor. C# and Java's libraries are much 
bigger than Phobos, and yet they have no functionality for 
lexing/parsing their own languages (or any other for that matter)!


-- 
Bruno Medeiros - Software Engineer
November 24, 2010
Re: Looking for champion - std.lang.d.lex
On 24/11/2010 16:19, Ellery Newcomer wrote:
> On 11/24/2010 09:13 AM, Bruno Medeiros wrote:
>>
>> I don't know about Ellery, as you can see in that thread he/she(?)
>> mentioned interest in working on that, but I don't know anything more.
>>
>
> Normally I go by 'it'.
>

I didn't mean to offend or anything, I was just unsure of that. To me
Ellery seems like a female name (but that can be a bias due to English
not being my first language, or some other cultural thing). On the other
hand, I would be surprised if a person of the female variety would be
that interested in D, to the point of contributing in such a way.

> Been pretty busy this semester, so I haven't been doing much.
>
> But the bottom line is, yes I have working antlr grammars for D1 and D2
> if you don't mind
> 1) they're slow
> 2) they're tied to a hacked-out version of the netbeans fork of ANTLR2
> 3) they're tied to some custom java code
> 4) I haven't been keeping the tree grammars so up to date
>
> I've not released them for those reasons. Semester will be over in about
> 3 weeks, though, and I'll have time then.
>

Hum, doesn't sound like it might be suitable for DDT, but I wasn't 
counting on that either.

>> As for me, I didn't work on that, nor did I plan to.
>> Nor am I planning to anytime soon, DDT can handle things with the
>> current parser for now (bugs can be fixed on the current code, perhaps
>> some limitations can be resolved by merging some more code from DMD), so
>> I'll likely work on other more important features before I go there. For
>> example, I'll likely work on debugger integration, and code completion
>> improvements before I would go on writing a new parser from scratch.
>> Plus, it gives more time to hopefully someone else work on it. :P
>>
>> Unlike Walter, I can't write a D parser in a weekend... :) Not even on a
>> week, especially since I never done anything of this kind before.
>>
>>
>
> It took me like 3 months to read his parser to figure out what was going
> on.

Not 3 man-months for sure, right? (Man-month in the sense of someone
working 40 hours per week during a month.)


-- 
Bruno Medeiros - Software Engineer
November 24, 2010
Re: Looking for champion - std.lang.d.lex
On 11/24/2010 02:09 PM, Bruno Medeiros wrote:
>
> I didn't mean to offend or anything, I was just unsure of that.

None taken; I'm just laughing at you. As I understand it, though, 
'Ellery' is a unisex name, so it is entirely ambiguous.

>> It took me like 3 months to read his parser to figure out what was going
>> on.
>
> Not 3 man-months for sure!, right? (Man-month in the sense of someone
> working 40 hours per week during a month.)
>
>

Probably not
November 24, 2010
Re: Looking for champion - std.lang.d.lex
Bruno Medeiros:

> On the other hand, I would be surprised if a person of the female variety
> would be that interested in D, to the point of contributing in such way.

In Python newsgroups I have seen a few women, now and then, but in the D newsgroup so far... not many. So far D seems a male thing. I don't know why. At my university, the Computer Science course has a fair number of female students (and a few female teachers too).

Bye,
bearophile
November 24, 2010
Re: Looking for champion - std.lang.d.lex
bearophile wrote:
> Bruno Medeiros:
> 
>> On the other hand, I would be surprised if a person of the female variety
>> would be that interested in D, to the point of contributing in such way.
> 
> In Python newsgroups I have seen few women, now and then, but in the D newsgroup so far... not many. So far D seems a male thing. I don't know why. At the university at the Computer Science course there are a good enough number of female students (and few female teachers too).
> 
> Bye,
> bearophile

At my university there are *very* few women studying computer science.
Most women sitting in CS lectures here are studying maths and have to do some 
basic CS lectures (I don't think they're the kind that would try D voluntarily).
We have two female professors though.
November 25, 2010
Re: Looking for champion - std.lang.d.lex
"Daniel Gibson" <metalcaedes@gmail.com> wrote in message 
news:icjv6l$p1r$2@digitalmars.com...
> bearophile wrote:
>> Bruno Medeiros:
>>
>>> On the other hand, I would be surprised if a person of the female 
>>> variety
>>> would be that interested in D, to the point of contributing in such way.
>>
>> In Python newsgroups I have seen few women, now and then, but in the D 
>> newsgroup so far... not many. So far D seems a male thing. I don't know 
>> why. At the university at the Computer Science course there are a good 
>> enough number of female students (and few female teachers too).
>>
>> Bye,
>> bearophile
>
> At my university there are *very* few women studying computer science.
> Most women sitting in CS lectures here are studying maths and have to do 
> some basic CS lectures (I don't think they're the kind that would try D 
> voluntarily).
> We have two female professors though.

See, that's the #1 worst thing about the field of programming: Total sausage 
fest.