June 15, 2010
Bruno Medeiros wrote:

> On 05/06/2010 09:17, Lutger wrote:
>> Bruno Medeiros wrote:
>>
>>> On 25/05/2010 14:06, Jacob Carlborg wrote:
>>>> On 2010-05-25 04.15, Ary Borenszweig wrote:
>>>>> On 05/24/2010 08:38 AM, Bruno Medeiros wrote:
>>>>>> Hi.
>>>>>>
>>>>>> I'm now in a position where I can dedicate a lot of free time for an open source project, and I'm seriously considering going back working on the D Eclipse IDE project. I worked on Mmrnmhrm a couple of years ago, as part of my thesis, which led to some restrictions on the kinds of tasks I should work on. Now I don't have that issue, I have (almost) complete freedom on what I can work on, and in particular I would like to unify the two current efforts for a D Eclipse IDE: Descent and Mmrnmhrm, as there is a lot of work being put into both (especially Descent ^^ )
>>>>>>
>>>>>> Now, Ary has been inactive for quite a while, and he said he wasn't interested in working in Descent any more :(
>>>>>
>>>>> I know I told you that, but now that I think of it, it's not that I'm not interested. The project has grown too big and in the last releases I added nice features without thinking much about the design and the flexibility of growth... so now I feel the project is kind of a mess and it's very hard to continue it. The problems I see are:
>>>>>
>>>>> * Porting DMD source to Java was done manually and it's a very boring and long task, and we need to find a way to automate it if we'd like to support really good integration with the language (I mean, real semantic value, and because D is not dynamic I think this is worth it).
>>>>
>>>> I have two suggestions for this problem:
>>>>
>>>> 1. Could DMD be compiled to a dynamic library and then be used like a plugin, using JNI to interact between the compiler and the plugin.
>>>>
>>>
>>> Nope, the compiler generates a big structure, an AST, which is composed of nodes from a complex hierarchy of classes. Transferring such a structure across JNI would be incredibly hard to implement, not to mention probably inefficient. JNI (as with most any C interfacing) is good mostly just for calling C methods with simple parameters.
>>>
>>> The other option would be for the compiler to expose just a thin API (without big data structures), and have the IDE query the compiler directly for semantic functionality. But then the frontend itself would have to be extended a lot, which would mean a lot of complicated coding in C... argh, no way...
>>>
>>
>> Is working with ddmd an option?
> 
> Nah. Since DDMD is a direct port of DMD, the situation would not be much
> different: you would have a lot of C-like code in D, as well as a
> Walter-like design :P, and you wouldn't be able to change either of
> those things, because then you would make it significantly hard (if not
> impractical) to update the DDMD port with new DMD changes.
> That's exactly the same problem there is with the Java port of DMD.
> 
> I'm now pretty sure the only good long-term strategy is to have a
> compiler/engine designed from scratch to work with IDE functionality.
> Although I didn't realize until now, thanks to your comment, that it
> might be feasible for this engine to be separate, (either as a process,
> or as library), from the Java IDE, and thus the engine could be in a
> different language than Java.
> Hum, this is quite interesting, one very big reason being that then it
> could be reused by different D IDEs, or other sorts of projects...
> 

That, and being written in D, would make it more attractive. There is another option with a head start: dil, if the GPL is not a problem.
June 15, 2010
On 2010-06-15 21:37, Lutger wrote:
> Bruno Medeiros wrote:
>
>> On 05/06/2010 09:17, Lutger wrote:
>>> Bruno Medeiros wrote:
>>>
>>>> On 25/05/2010 14:06, Jacob Carlborg wrote:
>>>>> On 2010-05-25 04.15, Ary Borenszweig wrote:
>>>>>> On 05/24/2010 08:38 AM, Bruno Medeiros wrote:
>>>>>>> Hi.
>>>>>>>
>>>>>>> I'm now in a position where I can dedicate a lot of free time for an
>>>>>>> open source project, and I'm seriously considering going back working
>>>>>>> on the D Eclipse IDE project. I worked on Mmrnmhrm a couple of years
>>>>>>> ago, as part of my thesis, which led to some restrictions on the
>>>>>>> kinds of tasks I should work on. Now I don't have that issue, I have
>>>>>>> (almost) complete freedom on what I can work on, and in particular I
>>>>>>> would like to unify the two current efforts for a D Eclipse IDE:
>>>>>>> Descent and Mmrnmhrm, as there is a lot of work being put into both
>>>>>>> (especially Descent ^^ )
>>>>>>>
>>>>>>> Now, Ary has been inactive for quite a while, and he said he wasn't
>>>>>>> interested in working in Descent any more :(
>>>>>>
>>>>>> I know I told you that, but now that I think of it, it's not that I'm
>>>>>> not interested. The project has grown too big and in the last releases
>>>>>> I added nice features without thinking much about the design and the
>>>>>> flexibility of growth... so now I feel the project is kind of a mess
>>>>>> and it's very hard to continue it. The problems I see are:
>>>>>>
>>>>>> * Porting DMD source to Java was done manually and it's a very boring
>>>>>> and long task, and we need to find a way to automate it if we'd like
>>>>>> to support really good integration with the language (I mean, real
>>>>>> semantic value, and because D is not dynamic I think this is worth
>>>>>> it).
>>>>>
>>>>> I have two suggestions for this problem:
>>>>>
>>>>> 1. Could DMD be compiled to a dynamic library and then be used like a
>>>>> plugin, using JNI to interact between the compiler and the plugin.
>>>>>
>>>>
>>>> Nope, the compiler generates a big structure, an AST, which is composed
>>>> of nodes from a complex hierarchy of classes. Transferring such a
>>>> structure across JNI would be incredibly hard to implement, not to
>>>> mention probably inefficient. JNI (as with most any C interfacing) is
>>>> good mostly just for calling C methods with simple parameters.
>>>>
>>>> The other option would be for the compiler to expose just a thin API
>>>> (without big data structures), and have the IDE query the semantic
>>>> functionality directly to the compiler. But then the frontend itself
>>>> would have to be extended a lot, which would mean a lot of complicated
>>>> coding in C... argh, no way...
>>>>
>>>
>>> Is working with ddmd an option?
>>
>> Nah. Since DDMD is a direct port of DMD, the situation would not be much
>> different: you would have a lot of C-like code in D, as well as a
>> Walter-like design :P, and you wouldn't be able to change either of
>> those things, because then you would make it significantly hard (if not
>> impractical) to update the DDMD port with new DMD changes.
>> That's exactly the same problem there is with the Java port of DMD.
>>
>> I'm now pretty sure the only good long-term strategy is to have a
>> compiler/engine designed from scratch to work with IDE functionality.
>> Although I didn't realize until now, thanks to your comment, that it
>> might be feasible for this engine to be separate, (either as a process,
>> or as library), from the Java IDE, and thus the engine could be in a
>> different language than Java.
>> Hum, this is quite interesting, one very big reason being that then it
>> could be reused by different D IDEs, or other sorts of projects...
>>
>
> That, and being written in D, would make it more attractive. There is another option with a head start: dil, if the GPL is not a problem.

GPL is generally a problem with Eclipse, since the Eclipse license (EPL) is incompatible with the GPL. The nature of the GPL is that you cannot even dynamically link to a GPL library without affecting the license of your own code and any third-party code you use. Basically the only way around that is inter-process communication. The CDT plugin for Eclipse uses this method to communicate with GCC.
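
To make the inter-process idea concrete, here is a minimal sketch of what the IDE side could look like: the engine runs as a separate process and the plugin only exchanges lines of text with it over stdin/stdout. Everything specific here is an assumption for illustration: the `EngineClient` class, the one-line request/reply protocol, and whatever command is passed to `launch` are hypothetical, not something dil, CDT or any existing tool provides.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.List;

/** IDE-side handle to a semantic engine running in another process (hypothetical). */
final class EngineClient {
    private final BufferedReader fromEngine;
    private final Writer toEngine;

    /** Takes plain streams so the protocol can be exercised without a real process. */
    EngineClient(Reader fromEngine, Writer toEngine) {
        this.fromEngine = new BufferedReader(fromEngine);
        this.toEngine = toEngine;
    }

    /** Spawns the engine as a child process and talks to it over its stdin/stdout. */
    static EngineClient launch(List<String> command) throws IOException {
        Process process = new ProcessBuilder(command)
                .redirectError(ProcessBuilder.Redirect.INHERIT) // engine diagnostics go to our stderr
                .start();
        return new EngineClient(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8),
                new OutputStreamWriter(process.getOutputStream(), StandardCharsets.UTF_8));
    }

    /** Sends one request line and returns the single reply line (assumed protocol). */
    String request(String line) throws IOException {
        toEngine.write(line);
        toEngine.write('\n');
        toEngine.flush();
        String reply = fromEngine.readLine();
        if (reply == null) {
            throw new IOException("engine closed the connection");
        }
        return reply;
    }
}
```

Only text crosses the process boundary, so no engine code is linked into the plugin, and the same engine could in principle be driven by other IDEs as well.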

-- 
/Jacob Carlborg
July 12, 2010
On 25/05/2010 21:31, Bruno Medeiros wrote:
> On 25/05/2010 03:15, Ary Borenszweig wrote:
>> On 05/24/2010 08:38 AM, Bruno Medeiros wrote:
>>
>> * Porting DMD source to Java was done manually and it's a very boring
>> and long task, and we need to find a way to automate it if we'd like to
>> support really good integration with the language (I mean, real semantic
>> value, and because D is not dynamic I think this is worth it).
>> * The DMD source was modified a little bit for performance reasons and
>> for integration with Descent so now it contains bugs that are very hard
>> to fix.
>
> That's actually exactly the concerns I had from the start with doing
> that approach. Even if we had an easy way to automate the port, I think
> we would still have a code that has issues of performance and
> scalability. And yet if we would try to modify it to address those
> issues, we would most likely introduce hard to correct bugs, and/or make
> the automated porting process (if we had one) much more complicated, if
> not impossible (aka, reverting to doing a lot of manual work).
>
> That's why I think the approach of porting the full DMD frontend is not
> good, even if it were more automated. So for me it's either:
>
> * Port and use the DMD parser only, and implement semantic analysis
> features from scratch. This is Mmrnmhrm's current approach, and would mean
> the semantic capabilities would be much less than Descent: likely just
> resolving references (for code completion), but no semantic errors or
> CTFE debug.
>
> * Try and build the parser and semantic engine with the help of some
> other tool, like IMP for example.
>
> I'll have to take a look at IMP again to think about this some more.
>
>> * The code is just too big because it has a lot of code from JDT,
>> modified a little bit, and that makes it hard for other developers to
>> join.
>>
>> .. This means I'm currently
>>> the only Eclipse developer, so I could just do things in the way I'm
>>> thinking of, but Ary, I would still like to get your input, because I
>>> hope one day you might be interested in coming back to Descent
>>> development. :)
>>
>> I can program now and then if I have time, but what I'd really like is
>> to plan how to start things almost from scratch and think of a plan of
>> doing it well. Maybe Descent could be done with IMP, I don't think using
>> JDT's source code will be good for the project. The only problem I see
>> with IMP (or DLTK) is that they don't support some of the features
>> Descent already provides... but as IMP gets better Descent will
>> automatically get those updates.
>>
>
> Yes, the other issue is how to do all the IDE infrastructure
> stuff. I also think that porting JDT is less than ideal. Porting JDT has
> similar problems to the DMD approach (although to a much lesser degree).
> The code is not easy to understand, and we risk introducing bugs. And it
> is also hard to maintain and keep updated with new JDT versions.
> If there was no project such as IMP or DLTK, I think it would still be
> worth to do the JDT porting. But since those projects exist and are
> currently actively developed, I think they are a much better choice.
> In the cases where they don't provide the features that Descent already
> supports, I think that we might be able to do a little bit of
> mix-and-match, and use some of JDT's ported code. But the feasibility
> of that will depend a lot on the particular feature, and whether IMP or
> DLTK is used.
>
> As for which to use, I'm still inclined towards DLTK. DLTK has more IDE
> infrastructure support and functionality than IMP, and might make it
> easier to mix-and-match with JDT porting, since a lot of DLTK itself is
> also based on JDT ported code. We might even submit contributions to
> DLTK itself. But it has no support for parsing or language semantic
> analysis, which IMP has. So that's an advantage for IMP.
> But I'll have to take another look at IMP (and also the new DLTK version,
> since I'm not yet familiar with all the changes that have occurred in
> the last year or so).
>
>
>>
>>>
>>> Ary, I would like to know if by any chance you are interested in doing
>>> that separation in the near future. If not, I can do it myself, but it
>>> might take a bit longer.
>>
>> Yes, I would like to do it, but first I think we need to think if we
>> want to a) keep using the Java port and little by little upgrade it to
>> the latest dmd versions, b) start another port but build a tool to more
>> or less automate that process (Frank Benoit started something in that
>> regard but I don't know what happened then) or c) make a less
>> semantic-aware IDE like Mmrnmhrm (the good thing is that it might be
>> simpler and faster, but the bad thing is that we might not get all the
>> features we'd want).
>>
>> Thanks for bringing this topic! A lot of effort has been put in Descent
>> and it would maybe be sad if the project is abandoned... (but I really
>> feel I'm stuck now and I don't know how to advance... or at least I know
>> I would have to keep porting C++ code to Java, but...
>>
>> http://www.explosm.net/comics/1804/
>>
>> )
>>
>> Cheers!
>
> Cool to see you a bit more re-motivated. :)
>
>
> So let me say what I want to do next: Mmrnmhrm at the moment has no
> support for D2 whatsoever (well, it's not completely unusable, one can
> open D2 files but it will treat them as D1 files). I want to take the
> current Descent parser, and update Mmrnmhrm's semantic engine to this
> new parser version, such that Mmrnmhrm's current semantic functionality
> (go to references, find references) works in D2 as well. While doing
> this, it should give a good idea of how complex it is to keep up the
> basic IDE functionality on par with language changes, with the approach
> of implementing the code from scratch. I'm confident this can be done
> without much effort, and thus it would validate itself as the best
> approach.
>
>

OK, status update. I've done the above and updated Mmrnmhrm to use Descent's parser. It now works OK for D2 (same functionality as Mmrnmhrm had with D1.0), well, at least up to D 2.030 (you get syntax errors with annotations, for example).
It was a fairly quick task; it only took 2 man-days.

I stick by my conclusion that it is fairly easy to keep a parser up to date with language changes, together with a basic custom IDE semantic engine that provides reference resolution and search (which is the core of features such as proper code completion, find references, hover text, etc.). That's not surprising, since these aspects of a language rarely change. Other aspects of the language that are not directly related to reference resolution, but that affect it, change more often, but on the other hand they are easier to adapt to (and also, if they are not adapted, the IDE is not too crippled).
This is in contrast to, for example, maintaining a custom semantic engine for semantic error reporting. This would make it much harder to keep up to date with language changes, and much more crippling to the IDE if not kept up to date (if outdated, one would likely get a lot of false errors in the editor).
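
To give an idea of how small the core of reference resolution is, here is a minimal sketch of a scope chain: each scope holds its definitions and a link to the enclosing scope, and resolving a name walks outward until it finds a match. This is not Mmrnmhrm's actual code; the names `DefUnit`, `Scope` and `ResolveDemo` are made up for the illustration, and real D lookup (imports, overloads, templates, mixins) is a lot more involved.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** A named definition together with the source offset where it was declared. */
final class DefUnit {
    final String name;
    final int offset;

    DefUnit(String name, int offset) {
        this.name = name;
        this.offset = offset;
    }
}

/** A lexical scope: a table of definitions plus a link to the enclosing scope. */
final class Scope {
    private final Scope parent;
    private final Map<String, DefUnit> defs = new HashMap<>();

    Scope(Scope parent) {
        this.parent = parent;
    }

    void define(String name, int offset) {
        defs.put(name, new DefUnit(name, offset));
    }

    /** Walks outward through the enclosing scopes until the name is found. */
    Optional<DefUnit> resolve(String name) {
        for (Scope scope = this; scope != null; scope = scope.parent) {
            DefUnit def = scope.defs.get(name);
            if (def != null) {
                return Optional.of(def);
            }
        }
        return Optional.empty();
    }
}

final class ResolveDemo {
    public static void main(String[] args) {
        Scope module = new Scope(null);
        module.define("count", 40);
        Scope function = new Scope(module);
        function.define("count", 120); // shadows the module-level definition

        System.out.println(function.resolve("count").get().offset); // prints 120
        System.out.println(module.resolve("count").get().offset);   // prints 40
        System.out.println(function.resolve("missing").isPresent()); // prints false
    }
}
```

"Go to definition" is then a lookup from the reference under the cursor, and code completion is the same walk collecting every visible name instead of stopping at the first match.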

Anyways, I again took a closer look at IMP, and also XText (a project very similar in purpose to IMP), but I'm not convinced they offer a significant advantage over DLTK. Yes, DLTK is only about IDE "infrastructure": it's not a complete solution, and in particular it has almost nothing to support language semantic engine functionality. IMP and XText, on the other hand, do aim to support this kind of functionality, but in practice I think that this support is somewhat superficial, and that for non-simple languages (that is, not DSLs) you still have to write the bulk of the code for the semantic functionality.

In more detail, looking at this quote from IMP intro:
""
The goal of the project is to ease the development of commercial-quality IDE support for new programming languages, including the following features:
[...]
 * refactoring support (not only "Move" and "Rename", but type- and code-related refactorings requiring non-trivial analysis, e.g. "Extract Method" and "Infer Type Arguments")
""
However, this particular support has not yet been implemented in IMP, nor has it even been shown how it could be implemented or what its usage would be. So this particular promise is vaporware at the moment.
The only main semantic functionality that IMP provides is reference resolving. But even for that, check this quote from the IMP User Guide, section "Creating a Reference-Resolution Service":
""
The customization of this service implementation for a particular language may be simple or complex, depending on language semantics and on the availability of information that can facilitate the resolution process.
""

So basically this is exactly the same stuff I did for Mmrnmhrm already. At the moment, not much is gained from using IMP it seems, versus DLTK.

I also looked at XText, a more recent project that has gained a lot of traction in Eclipse lately. It has even matured as an Eclipse project (like DLTK, but unlike IMP), and was featured in the recent Helios release.

I've tried out XText, and actually tried coding a very simple language on it. It seems XText actually delivers something on that promise of providing support for core semantic functionality: they have their own grammar language which allows you to annotate and specify certain kinds of semantic information, namely regarding error validation/reporting and reference resolving. With XText it is actually possible to build an IDE/editor with reference resolving (that means Code Completion, Find Reference, etc.) without writing any code for something like a ReferenceResolver. You just annotate the language grammar to specify which particular nodes/productions are references (and what kind of entity they refer to). I find that XText is quite a step ahead of IMP, semantics-wise.
However, I still don't think it would be worthwhile to adopt XText. It works great for simple and moderately complex DSLs, but for a full-blown general-purpose language like D (which is quite complex, even among general-purpose languages), I think that support would erode, and you would have to write the bulk of the semantic code anyway, with little to gain.
One thing of interest, though, is their use of EMF: the ASTs that XText generates are based on EMF, and they claim that this provides them with a lot of tools and functionality for manipulating the AST using generic EMF tooling. I haven't looked into this in depth, but it seems potentially substantial and worthwhile to keep in mind, should we ever get to the point of doing AST manipulations (for refactoring, etc.).


So, yeah, as things are, I'm maintaining my approach with Mmrnmhrm: writing and improving the (very basic) semantic engine from scratch, and using DLTK.
As for the parser, it will continue to use Descent's parser, although the parser is already a bit outdated for D2. :/ It would be great if someone were to volunteer to update it... *wink* *wink* :p
But better yet would be to start coding our own custom parser (using a parser generator like ANTLR for example), that could really be tailored for IDE needs. In the medium/long term, that's probably what needs to be done.

-- 
Bruno Medeiros - Software Engineer
July 12, 2010
On 07/12/2010 02:45 PM, Bruno Medeiros wrote:
> So, yeah, as things are, I'm maintaining my approach with Mmrnmhrm:
> writing and improving the (very basic) semantic engine from scratch, and
> using DLTK.
> As for the parser, it will continue to use Descent's parser, although
> the parser is already a bit outdated for D2. :/ It would be great if
> someone were to volunteer to update it... *wink* *wink* :p

sigh, point me at the source

> But better yet would be to start coding our own custom parser (using a
> parser generator like ANTLR for example), that could really be tailored
> for IDE needs. In the medium/long term, that's probably what needs to be
> done.
>

What in particular does an IDE need tailored to it?
July 14, 2010
On 12/07/2010 21:31, Ellery Newcomer wrote:
> On 07/12/2010 02:45 PM, Bruno Medeiros wrote:
>> So, yeah, as things are, I'm maintaining my approach with Mmrnmhrm:
>> writing and improving the (very basic) semantic engine from scratch, and
>> using DLTK.
>> As for the parser, it will continue to use Descent's parser, although
>> the parser is already a bit outdated for D2. :/ It would be great if
>> someone were to volunteer to update it... *wink* *wink* :p
>
> sigh, point me at the source
>

Great, but are you aware of what this task entails? In other words, are you familiar with descent.compiler? It is a Java port of the DMD frontend, and I'm not sure how easy it is for someone other than Ary to learn how to do this; there might be quite a bit of private knowledge. (We can ask him about this, of course.)

>> But better yet would be to start coding our own custom parser (using a
>> parser generator like ANTLR for example), that could really be tailored
>> for IDE needs. In the medium/long term, that's probably what needs to be
>> done.
>>
>
> What in particular does an IDE need tailored to it?

Well, in terms of *pure functionality*, there is not much else an IDE needs, other than making sure it can get all source information (like node source text ranges, comments, etc.)
The big thing that can be tailored is performance: incremental parsing, being able to do different kinds of parsing with different levels of detail ("sub-parsing"?). Some of these performance features become very important for IDE scalability.
Good error recovery is also important, although that is usually useful for a compiler as well (even if not as much as for an IDE).
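
As a rough illustration of two of these points (source ranges on every node, and error recovery), here is a minimal sketch of a hand-written parser for a toy language whose only statement is `name = number;`. It is not based on Descent's parser or on ANTLR; `ToyParser`, `Assign` and `SyntaxProblem` are hypothetical names, and the recovery strategy shown (skip to just past the next `;`) is just the simplest one I could use here.

```java
import java.util.ArrayList;
import java.util.List;

/** An AST node that remembers the source range it was parsed from. */
final class Assign {
    final String name;
    final int start; // offset of the first character of the statement
    final int end;   // offset one past the terminating ';'

    Assign(String name, int start, int end) {
        this.name = name;
        this.start = start;
        this.end = end;
    }
}

/** A syntax problem, also tied to a source range so an editor could underline it. */
final class SyntaxProblem {
    final int start;
    final int end;

    SyntaxProblem(int start, int end) {
        this.start = start;
        this.end = end;
    }
}

final class ToyParser {
    final List<Assign> nodes = new ArrayList<>();
    final List<SyntaxProblem> problems = new ArrayList<>();
    private final String src;
    private int pos;

    ToyParser(String src) {
        this.src = src;
    }

    void parse() {
        skipSpace();
        while (pos < src.length()) {
            int start = pos;
            if (!parseAssign(start)) {
                // Recovery: skip to just past the next ';' and keep parsing,
                // so one broken statement does not hide the rest of the file.
                int semi = src.indexOf(';', pos);
                pos = (semi < 0) ? src.length() : semi + 1;
                problems.add(new SyntaxProblem(start, pos));
            }
            skipSpace();
        }
    }

    private boolean parseAssign(int start) {
        String name = take(true);
        skipSpace();
        if (name.isEmpty() || !eat('=')) return false;
        skipSpace();
        String digits = take(false);
        skipSpace();
        if (digits.isEmpty() || !eat(';')) return false;
        nodes.add(new Assign(name, start, pos));
        return true;
    }

    /** Consumes a run of letters (identifier) or a run of digits (number). */
    private String take(boolean letters) {
        int from = pos;
        while (pos < src.length()
                && (letters ? Character.isLetter(src.charAt(pos)) : Character.isDigit(src.charAt(pos)))) {
            pos++;
        }
        return src.substring(from, pos);
    }

    private boolean eat(char c) {
        if (pos < src.length() && src.charAt(pos) == c) {
            pos++;
            return true;
        }
        return false;
    }

    private void skipSpace() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }
}
```

For the input `a = 1; b = ; c = 3;` this yields two `Assign` nodes (for `a` and `c`) and one `SyntaxProblem` covering the broken middle statement, which is what an editor needs in order to keep offering outline and completion while the user is in the middle of typing. Incremental parsing would go further and re-parse only the ranges touched by an edit; that part is not shown here.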

-- 
Bruno Medeiros - Software Engineer
July 14, 2010
On 14/07/2010 11:46, Bruno Medeiros wrote:
> On 12/07/2010 21:31, Ellery Newcomer wrote:
>> On 07/12/2010 02:45 PM, Bruno Medeiros wrote:
>>> So, yeah, as things are, I'm maintaining my approach with Mmrnmhrm:
>>> writing and improving the (very basic) semantic engine from scratch, and
>>> using DLTK.
>>> As for the parser, it will continue to use Descent's parser, although
>>> the parser is already a bit outdated for D2. :/ It would be great if
>>> someone were to volunteer to update it... *wink* *wink* :p
>>
>> sigh, point me at the source
>>
>
> Great, but are you aware what this task entails? In other words, are you
> familiar with descent.compiler? Because this is a Java port of the DMD
> frontend, and I'm not sure how easy it is for someone else other than
> Ary to learn how to do this, there might be quite a bit of private
> knowledge. (we can ask him about this, of course)
>
>>> But better yet would be to start coding our own custom parser (using a
>>> parser generator like ANTLR for example), that could really be tailored
>>> for IDE needs. In the medium/long term, that's probably what needs to be
>>> done.
>>>

Ah, forgot to say, source is:
http://www.dsource.org/projects/descent/browser/trunk/descent.compiler


-- 
Bruno Medeiros - Software Engineer
July 14, 2010
On 07/14/2010 05:46 AM, Bruno Medeiros wrote:
> On 12/07/2010 21:31, Ellery Newcomer wrote:
>> On 07/12/2010 02:45 PM, Bruno Medeiros wrote:
>>> So, yeah, as things are, I'm maintaining my approach with Mmrnmhrm:
>>> writing and improving the (very basic) semantic engine from scratch, and
>>> using DLTK.
>>> As for the parser, it will continue to use Descent's parser, although
>>> the parser is already a bit outdated for D2. :/ It would be great if
>>> someone were to volunteer to update it... *wink* *wink* :p
>>
>> sigh, point me at the source
>>
>
> Great, but are you aware what this task entails? In other words, are you
> familiar with descent.compiler? Because this is a Java port of the DMD
> frontend, and I'm not sure how easy it is for someone else other than
> Ary to learn how to do this, there might be quite a bit of private
> knowledge. (we can ask him about this, of course)

Nope, but it doesn't look too much different from dmd's source.

I am, however, intimidated by large codebases wrt build (particularly java based ones).

>
>>> But better yet would be to start coding our own custom parser (using a
>>> parser generator like ANTLR for example), that could really be tailored
>>> for IDE needs. In the medium/long term, that's probably what needs to be
>>> done.
>>>
>>
>> What in particular does an IDE need tailored to it?
>
> Well, in terms of *pure functionality*, there is not much else an IDE
> needs, other than making sure it can get all source information (like
> node source text ranges, comments, etc.)
> The big thing that can be tailored is performance: incremental parsing,
> being able to do different kinds of parsing with different levels of
> detail ("sub-parsing"?). Some of these performance features become very
> important for IDE scalability.
> Good error-recovery is also important, although that is also usually
> useful for a compiler (even if not so much as an IDE).
>

hmm, for incremental parsing, ANTLR isn't going to be your friend
July 19, 2010
On 14/07/2010 14:50, Ellery Newcomer wrote:
>
> I am, however, intimidated by large codebases wrt build (particularly
> java based ones).
>

What do you mean with regards to build?

-- 
Bruno Medeiros - Software Engineer
July 19, 2010
On 07/19/2010 11:13 AM, Bruno Medeiros wrote:
> On 14/07/2010 14:50, Ellery Newcomer wrote:
>>
>> I am, however, intimidated by large codebases wrt build (particularly
>> java based ones).
>>
>
> What do you mean with regards to build?
>

I mean I might need some help setting things up.

I haven't tried yet, and I actually have time today, so I'll see if I can't figure things out.
July 21, 2010
On 19/07/2010 21:42, Ellery Newcomer wrote:
> On 07/19/2010 11:13 AM, Bruno Medeiros wrote:
>> On 14/07/2010 14:50, Ellery Newcomer wrote:
>>>
>>> I am, however, intimidated by large codebases wrt build (particularly
>>> java based ones).
>>>
>>
>> What do you mean with regards to build?
>>
>
> I mean I might need some help setting things up.
>
> I haven't tried yet, and I actually have time today, so I'll see if I
> can't figure things out.

Shouldn't be too complicated. You only need to check out the descent.compiler project, but you will need to run it with Eclipse+PDE, because descent.compiler does have some dependencies on Eclipse code (I want to remove that eventually, shouldn't be too hard).

The dtool project might be of interest as well; I am moving some tests there that also exercise the parser in descent.compiler.


-- 
Bruno Medeiros - Software Engineer