January 16, 2015
On 1/11/15 3:48 PM, Walter Bright wrote:
> On 1/11/2015 9:45 AM, Stefan Koch wrote:
>> I'm writing a parser-generator, that will be able to
>> transform the
>> generated parse-tree back into source automatically.
>> writing a rule-based formatter should be pretty doable.
>
> Formatting the AST into text is straightforward, dmd already does that
> for .di file generation.
>
> The main problem is what to do about comments, which don't fit into the
> grammar.
>
> A secondary problem is what to do when the line length limit is
> exceeded, such as for long expressions.

The way I did it in Descent (I copied the logic from JDT) is to parse the code into an AST, and then walk the AST in sync with a lexer. So if you have this:

void /* comment */ foo() {}

the AST would be a FunctionDecl (whatever the name is) so you'd expect a type (consume that AST node, in sync with the lexer), then check for comments/newlines/etc., skip/print them, then consume the name, check for comments/newlines/etc.

That way the AST doesn't have to know anything about comments, but comments need to be known by the lexer (via a flag, probably).
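To make the idea concrete, here is a minimal sketch in Python (not Descent's or JDT's actual code). The toy `FunctionDecl` node, the `TOKEN_RE` pattern and the `SyncedWalker` class are all made up for illustration, and only `/* */` comments and empty parameter lists and bodies are handled.

```python
import re
from dataclasses import dataclass

# Assumed toy lexer: block comments, words, or single punctuation characters.
TOKEN_RE = re.compile(r"/\*.*?\*/|\w+|[^\s\w]", re.S)


def lex(source):
    """Return every token, comments included; whitespace is dropped."""
    return TOKEN_RE.findall(source)


@dataclass
class FunctionDecl:
    """Toy AST node (hypothetical). It knows nothing about comments."""
    return_type: str
    name: str


class SyncedWalker:
    """Consumes lexer tokens in step with what the AST says comes next."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def take(self, expected):
        """Return any comments found before `expected`, then `expected` itself."""
        taken = []
        while self.pos < len(self.tokens) and self.tokens[self.pos].startswith("/*"):
            taken.append(self.tokens[self.pos])
            self.pos += 1
        if self.pos >= len(self.tokens) or self.tokens[self.pos] != expected:
            raise ValueError(f"lexer out of sync with AST at {expected!r}")
        taken.append(self.tokens[self.pos])
        self.pos += 1
        return taken


def format_function(source, node):
    """Print a FunctionDecl, re-inserting comments where the lexer saw them."""
    walker = SyncedWalker(lex(source))
    words = walker.take(node.return_type) + walker.take(node.name)
    params = "".join(walker.take("(") + walker.take(")"))
    body = "".join(walker.take("{") + walker.take("}"))
    return " ".join(words) + params + " " + body


print(format_function("void   /* comment */foo ( ) { }", FunctionDecl("void", "foo")))
# prints: void /* comment */ foo() {}
```

The AST drives the order in which things are printed, while the token stream is the only place the comment lives.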

Considering how flexible JDT's formatter is, I think this solution is pretty good.
January 16, 2015
On Friday, 16 January 2015 at 15:06:42 UTC, Ary Borenszweig wrote:
> The way I did it in Descent (I copied the logic from JDT) is to parse the code into an AST, and then walk the AST in sync with a lexer.

My dfmt tool does something similar. The parser runs over the code first and makes notes on things like the location of unary operators and which braces end function/aggregate declarations. Then the formatter iterates over the tokens (including comments) and is able to correctly print "a* b;" instead of "a * b;".
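A rough Python sketch of that two-pass idea, not dfmt's real code: the `pointer_stars` set stands in for the notes a parser pass would produce, and both it and `format_tokens` are hypothetical names used only for illustration.

```python
def format_tokens(tokens, pointer_stars):
    """Join tokens using notes from an earlier parser pass.

    pointer_stars is an assumed annotation: the indices of '*' tokens
    that the parser decided belong to a pointer type, not a product.
    """
    out = ""
    for index, token in enumerate(tokens):
        is_pointer_star = token == "*" and index in pointer_stars
        if index == 0 or token == ";" or is_pointer_star:
            out += token          # hug the previous token
        else:
            out += " " + token    # default: one space between tokens
    return out


tokens = ["a", "*", "b", ";"]
print(format_tokens(tokens, {1}))    # prints: a* b;
print(format_tokens(tokens, set()))  # prints: a * b;
```

The formatter itself never parses anything; it only looks up what the parser already decided about each token.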
January 16, 2015
Hi there, I'm a C++/Python refugee, new to D.

> clang-format seems to do a pretty good job with both of these. Comments seem to be intact unless they're too long, then they're wrapped. It seems to wrap at a space or other non-identifier character. Same thing with expressions that are too long.

I would love such a tool for D, especially based on the ideas of clang-format. I first heard about clang-format from a talk by Google's Chandler Carruth: the way he said it, the LLVM guys looked at what their C++ programmers wasted the most time on, and it turned out to be whitespace, surprisingly enough. So they implemented what is essentially LaTeX for source code (optimal placement of spaces and line breaks using Dijkstra's algorithm, the works). And he said it changed the way he codes, and at that point I had to go get it, and I 100% agree with him: it's awesome to have.
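For anyone curious what "line breaks using Dijkstra's algorithm" can look like, here is a toy Python sketch. It is not clang-format's implementation: the graph model (one node per possible line start), the `precedence_penalty` weights and the lack of indentation are all assumptions made for illustration.

```python
import heapq


def precedence_penalty(token):
    """Assumed cost of breaking the line right after `token`."""
    return {"+": 1, "=": 5, "*": 10}.get(token, 20)


def optimal_breaks(tokens, limit, break_penalty=precedence_penalty):
    """Split tokens into lines of at most `limit` columns with minimal penalty.

    Node i means "a line starts at token i"; an edge i -> j puts
    tokens[i:j] on one line. Dijkstra finds the cheapest path from 0 to n.
    """
    n = len(tokens)
    best = {0: 0}
    prev = {}
    heap = [(0, 0)]
    while heap:
        cost, i = heapq.heappop(heap)
        if i == n:
            break
        if cost > best.get(i, float("inf")):
            continue  # stale heap entry
        for j in range(i + 1, n + 1):
            width = len(" ".join(tokens[i:j]))
            if width > limit and j > i + 1:
                break  # adding more tokens only makes the line wider
            step = 0 if j == n else break_penalty(tokens[j - 1])
            if cost + step < best.get(j, float("inf")):
                best[j] = cost + step
                prev[j] = i
                heapq.heappush(heap, (cost + step, j))
    lines = []
    j = n
    while j > 0:  # walk the predecessor chain back to the start
        i = prev[j]
        lines.append(" ".join(tokens[i:j]))
        j = i
    return lines[::-1]


print(optimal_breaks("x = a + b * c + d * e".split(), 12))
# prints: ['x = a +', 'b * c +', 'd * e']
```

With these made-up weights the search prefers two cheap breaks after `+` over a single expensive break after `*`, so the products stay together on their own lines.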

It works with Vim, Emacs, Visual Studio, Sublime, etc., because it provides a simple Python wrapper to the executable that anything can call any time to format any code. It works with C/C++, Objective-C, and JavaScript; when I discovered it could do JavaScript, my use of that language rose substantially (could be coincidence :).

Reading some of the discussion here about whether to integrate it into the compiler, etc., makes me realize that one of the nice things about clang-format is real-time interactivity, i.e., I can type-type-type code quickly and without bothering with whitespace, just getting the idea out of my head and into the editor, then when I reach a breathing space I can hit a keycombo, and my editor reformats the block I just typed. I find that the hits to my flow are much smaller than having to do this manually, and I think that's what Carruth meant when he said it changed his coding.

Another nice thing about it is that it's fully parameterized (how many spaces, whether to indent this, what penalty to assign that, etc., example at [1]). It can search the current and parent directories for a .clang-format file with those parameters. It can also generate stock .clang-format files conforming to various coding styles, viz., LLVM, Google, Mozilla, WebKit, & Chromium. I throw this into my Git repos; happiness ensues.

Hope this helps outline the paths others have taken!

[1] Example JavaScript .clang-format file: https://github.com/fasiha/kanjiwild/blob/gh-pages/.clang-format
January 16, 2015
On Friday, 16 January 2015 at 15:55:53 UTC, Brian Schott wrote:
> On Friday, 16 January 2015 at 15:06:42 UTC, Ary Borenszweig wrote:
>> The way I did it in Descent (I copied the logic from JDT) is to parse the code into an AST, and then walk the AST in sync with a lexer.
>
> My dfmt tool does something similar. The parser runs over the code first and makes notes on things like the location of unary operators and which braces end function/aggregate declarations. Then the formatter iterates over the tokens (including comments) and is able to correctly print "a* b;" instead of "a * b;".

This is a short talk about clang-format's design and implementation:
https://www.youtube.com/watch?v=s7JmdCfI__c

They have this concept of "unwrapped lines" into which the source code is split by a "structural parser". I am not sure if dfmt really needs it, because we don't have cpp macros littered around in D code.

While dfmt already works well for some code (e.g. the snippet on the dlang.org front page), the biggest hurdle (imho) is an expression formatter, one that understands operator precedence, for example. Currently, dfmt uses a greedy line fill algorithm.
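To illustrate why greedy filling falls short for expressions, here is a small Python sketch of a generic greedy line fill. It is not dfmt's code; `greedy_fill` and the four-space continuation indent are assumptions for the example.

```python
def greedy_fill(tokens, limit, indent="    "):
    """Pack tokens onto each line until the next one would exceed `limit`."""
    lines = []
    current = ""
    for token in tokens:
        candidate = token if not current else current + " " + token
        if current and len(candidate) > limit:
            lines.append(current)
            current = indent + token  # continuation lines are indented
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines


for line in greedy_fill("x = a + b * c + d * e".split(), 12):
    print(line)
# prints:
# x = a + b *
#     c + d *
#     e
```

Each line is as full as possible, but both breaks land right after a `*`, splitting the tightest-binding subexpressions. A precedence-aware formatter would rather break after a `+`.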