May 11, 2012
On 2012-05-11 14:14, Roman D. Boiko wrote:
> On Friday, 11 May 2012 at 11:47:18 UTC, deadalnix wrote:
>> From the beginning, I've been thinking of AST macros using CTFE.
> Could you please elaborate?
>
> I plan to strictly follow published D specification.

That won't be easy; nobody knows what the specification is. TDPL, DMD or dlang.org?

> Exceptions from this rule are possible provided either of the following
> is true:
> * new functionality has been implemented in DMD but is not included into
> specification yet
> * specification is incorrect (has a bug) or incomplete, especially if
> DMD behavior differs from specification
> * change is compatible with specification and brings some significant
> improvement (e.g., this seems to be the case for my decision to
> introduce post-processor after lexer)
>
> Some more exceptions might be added later, but the goal is to minimize
> differences.
>
>


-- 
/Jacob Carlborg
May 11, 2012
On 2012-05-11 15:01, Roman D. Boiko wrote:

> What about the following signature: Location locate(size_t index)?
> Or even better:
> alias size_t CodeUnitIndex;
> Location locateFor(CodeUnitIndex position);

That is better although I would prefer to pass in a token (assuming that is where index is declared). Then it would be an implementation detail that "index" is used to get the location.

Another option would be to turn it around:

struct/class Location
{
    Location find/locate (Token token);
}

> The problem with placing it in Token is that Token should not know
> anything about source as a whole.

I see. For convenience, there could be properties defined that just forward the call to some other function.

-- 
/Jacob Carlborg
May 11, 2012
On 11/05/2012 15:01, Roman D. Boiko wrote:
> On Friday, 11 May 2012 at 12:55:58 UTC, Jacob Carlborg wrote:
>> On 2012-05-11 14:07, Roman D. Boiko wrote:
>>> On Friday, 11 May 2012 at 11:49:23 UTC, Jacob Carlborg wrote:
>>
>>>> Found it now, "calculateFor". I'm not sure if it's the most intuitive
>>>> name though. I get the feeling: "calculate what?".
>>
>>> calculateLocation was original name, but I don't like repeating return
>>> type in method names, I decided to change it so that it is clear that
>>> another renaming is needed ;) Any suggestions?
>>>
>>
>> My original suggestion was to have the functionality in Token, which
>> would have made for intuitive names: line, column and file. But since
>> you didn't like that I have to give it some thought.
>
> What about the following signature: Location locate(size_t index)?
> Or even better:
> alias size_t CodeUnitIndex;
> Location locateFor(CodeUnitIndex position);
>
> The problem with placing it in Token is that Token should not know
> anything about source as a whole.

I don't really see the benefit of this. You are trading an O(1) operation for an O(log(n)) one. It can only be faster in specific cases, which should be measured.

It is likely to be slower when compiling erroneous code or when used for something other than compiling, and likely to make no significant difference when compiling correct code (I suspect the performance win is negligible compared to the time required to build a fully operational executable).

You are overcomplicating things for little to no benefit.
May 11, 2012
? My guess is that the spec is TDPL + TDPL errata. dlang.org should be
updated as people notice inaccuracies.
This project would be an ideal time to update dlang.org as people notice
it's not in sync with TDPL.
If TDPL doesn't cover it then the community should review it.


On Fri, May 11, 2012 at 3:17 PM, Jacob Carlborg <doob@me.com> wrote:

> On 2012-05-11 14:14, Roman D. Boiko wrote:
>
>> On Friday, 11 May 2012 at 11:47:18 UTC, deadalnix wrote:
>>
>>> From the beginning, I've been thinking of AST macros using CTFE.
>>>
>> Could you please elaborate?
>>
>> I plan to strictly follow published D specification.
>>
>
> That won't be easy; nobody knows what the specification is. TDPL, DMD or dlang.org?
>
>
>  Exceptions from this rule are possible provided either of the following
>> is true:
>> * new functionality has been implemented in DMD but is not included into
>> specification yet
>> * specification is incorrect (has a bug) or incomplete, especially if
>> DMD behavior differs from specification
>> * change is compatible with specification and brings some significant
>> improvement (e.g., this seems to be the case for my decision to
>> introduce post-processor after lexer)
>>
>> Some more exceptions might be added later, but the goal is to minimize differences.
>>
>>
>>
>
> --
> /Jacob Carlborg
>


May 11, 2012
On Friday, 11 May 2012 at 13:28:21 UTC, deadalnix wrote:
> On 11/05/2012 15:01, Roman D. Boiko wrote:
>> The problem with placing it in Token is that Token should not know
>> anything about source as a whole.
>
> I don't really see the benefit of this. You are trading an O(1) operation for an O(log(n)) one. It can only be faster in specific cases, which should be measured.
It would be interesting to see benchmarks. But anyway, log2(1M) is only about 20, and I haven't seen any source file with 1M lines :). The main point is design, not performance. I'm trying to build a minimal design that handles common use cases very well, and others well enough. I don't see why a Location would be needed for every Token.

> It is likely to be slower when compiling erroneous code or when used for something other than compiling, and likely to make no significant difference when compiling correct code (I suspect the performance win is negligible compared to the time required to build a fully operational executable).
IMO, it is unrelated.
> You are overcomplicating things for little to no benefit.
Well, it depends... However, I will provide a design rationale document this month.
May 11, 2012
On Friday, 11 May 2012 at 13:25:53 UTC, Jacob Carlborg wrote:
> On 2012-05-11 15:01, Roman D. Boiko wrote:
>
>> What about the following signature: Location locate(size_t index)?
>> Or even better:
>> alias size_t CodeUnitIndex;
>> Location locateFor(CodeUnitIndex position);
>
> That is better although I would prefer to pass in a token (assuming that is where index is declared). Then it would be an implementation detail that "index" is used to get the location.
That is also what I was planning to do after I posted my last suggestion, unless I discover some conceptual problem with it, which is unlikely.

> Another option would be to turn it around:
>
> struct/class Location
> {
>     Location find/locate (Token token);
> }
>
>> The problem with placing it in Token is that Token should not know
>> anything about source as a whole.
>
> I see. For convenience, there could be properties defined that just forward the call to some other function.
Both of these cases would introduce a circular dependency between the Lexer and its output data (Location or Token). It is possible to break such a dependency by complicating things even more, or to live with it. But I don't like circular dependencies when there is no compelling reason to introduce them.
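
To make the dependency direction concrete, here is a minimal sketch, not the project's actual code. Token carries only its index, and a separate type that owns the source text maps a token to a Location, so Token never refers back to the source. All names here (Token.index, Token.length, SourceMap, locate) are assumptions made for illustration, and the lookup is a plain linear scan only to keep the example short.

```d
// Hypothetical types; nothing here is taken from the real lexer.
struct Token
{
    size_t index;  // offset of the token's first code unit in the source
    size_t length; // number of code units in the token
}

struct Location
{
    size_t line;   // 1-based line number
    size_t column; // 1-based column, counted in code units
}

// Owns the source text. Token and Location know nothing about it,
// so the dependency points one way only: SourceMap -> Token, Location.
struct SourceMap
{
    string source;

    // Linear scan over the text before the token (assumed, for brevity).
    Location locate(Token token) const
    {
        size_t line = 1;
        size_t lineStart = 0;
        foreach (i, char c; source[0 .. token.index])
        {
            if (c == '\n')
            {
                ++line;
                lineStart = i + 1;
            }
        }
        return Location(line, token.index - lineStart + 1);
    }
}

unittest
{
    auto map = SourceMap("int a;\nint b;\n");
    auto loc = map.locate(Token(11, 1)); // the identifier `b`
    assert(loc.line == 2 && loc.column == 5);
}
```

With this shape, a convenience property on Token would have to reach back into SourceMap, which is exactly the cycle described above.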

May 11, 2012
On Friday, 11 May 2012 at 13:30:49 UTC, Rory McGuire wrote:
> ? My guess is that the spec is TDPL + TDPL errata. dlang.org should be
> updated as people notice inaccuracies.
> This project would be an ideal time to update dlang.org as people notice
> it's not in sync with TDPL.
> If TDPL doesn't cover it then the community should review it.
Whenever I'm in doubt, I try to analyze both TDPL and dlang.org, and also consult the DMD sources. But in general, when I write "specification" I mean dlang.org.

I agree that it should be maintained, but I understand that it is difficult to do. I'll try to summarize the problems I discover (I have already found many), but it is quite a time-consuming activity.
May 11, 2012
On Friday, 11 May 2012 at 13:28:21 UTC, deadalnix wrote:
> On 11/05/2012 15:01, Roman D. Boiko wrote:
>> The problem with placing it in Token is that Token should not know
>> anything about source as a whole.
>
> I don't really see the benefit of this. You are trading an O(1) operation for an O(log(n)) one. It can only be faster in specific cases, which should be measured.
Technically, I'm trading N*O(1) operations, needed to track line and column while consuming each character, for M*O(log(n)) operations when calculating them on demand. Here N is the number of characters, n is the number of lines, and M is the number of actual usages of Location. My assumption is that M << N (M is much smaller than N).
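
A rough sketch of the on-demand variant, under stated assumptions: the lexer records only the offset at which each line starts (one append per newline instead of line/column bookkeeping per character), and a location is computed later by binary search over those offsets. LineTable and its members are hypothetical names; CodeUnitIndex and locateFor follow the signature suggested earlier in the thread. This is an illustration of the idea, not the actual implementation.

```d
alias CodeUnitIndex = size_t;

struct Location
{
    size_t line;   // 1-based line number
    size_t column; // 1-based column, counted in code units
}

// Hypothetical helper: stores where each line begins in the source.
struct LineTable
{
    // Ascending offsets of the first code unit of every line;
    // lineStarts[0] is always 0.
    immutable(size_t)[] lineStarts;

    this(string source)
    {
        size_t[] starts = [0];
        foreach (i, char c; source)
        {
            if (c == '\n')
                starts ~= i + 1;
        }
        lineStarts = starts.idup;
    }

    // O(log(n)) in the number of lines: binary search for the last
    // line start that is <= position.
    Location locateFor(CodeUnitIndex position) const
    {
        size_t lo = 0;
        size_t hi = lineStarts.length;
        while (hi - lo > 1)
        {
            immutable mid = lo + (hi - lo) / 2;
            if (lineStarts[mid] <= position)
                lo = mid;
            else
                hi = mid;
        }
        return Location(lo + 1, position - lineStarts[lo] + 1);
    }
}

unittest
{
    auto table = LineTable("int a;\nint b;\n");
    auto loc = table.locateFor(11); // the identifier `b`
    assert(loc.line == 2 && loc.column == 5);
}
```

Whether this beats per-character tracking depends on how small M really is relative to N, which is the part that would need measuring.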
May 11, 2012
On Friday, 11 May 2012 at 12:20:27 UTC, Roman D. Boiko wrote:
> On Friday, 11 May 2012 at 12:13:53 UTC, alex wrote:
>> Mono-D is written in C#, VisualD uses D -- so it actually should be easier to integrate into the second one :)
> Sorry, I meant D-IDE. But there might also be a reason to consume a D implementation from C#. I would happily collaborate to make it usable for that.
Oops, D-IDE is also C#...
May 11, 2012
On 2012-05-11 15:30, Rory McGuire wrote:
> ? My guess is that the spec is TDPL + TDPL errata. dlang.org should be
> updated as people notice inaccuracies.
> This project would be an ideal time to update dlang.org as people notice
> it's not in sync with TDPL.
> If TDPL doesn't cover it then the community should review it.
>

None of these are in sync.

-- 
/Jacob Carlborg