September 23, 2013
On Sep 23, 2013 6:30 PM, "Sean Kelly" <sean@invisibleduck.org> wrote:
>
> On Sep 21, 2013, at 10:22 PM, Walter Bright <newshound2@digitalmars.com> wrote:
>
> > On 9/21/2013 8:54 PM, Michel Fortin wrote:
> >> I don't think it should be a priority, but rejecting the idea outright is
> >> shortsighted in my opinion.
> >
> > I'm not rejecting the idea outright. I've actually implemented this in the dmc compiler. It's just not terribly useful, and it has costs.
>
> I'd consider it in a similar class as the dictionary lookup that occurs when an unknown symbol is encountered.  Totally unnecessary, but it's a nice time-saver.  Is it clang that displays the line in error with a caret underneath the error?  Though if there really isn't an efficient way to do it in DMD then I don't think it's worthwhile.  I was only thinking of the parser when I mentioned the beginning-of-line pointer.  I hadn't considered the AST.

GCC has a caret too now.

Regards
-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';


September 23, 2013
Regarding improving the error messages of D compilers, I think having the column number is nice, but there are two improvements that I think are more important:

1) One of them is the "aka", that is showing both the name of aliases and the aliased types/values:
http://d.puremagic.com/issues/show_bug.cgi?id=5004

2) The other improvement I'd like for D error messages and warnings is to give a standard error number. This is a simple improvement, but it makes it simpler to write explanation pages for the errors. The C# compiler and other compilers have them.

Bye,
bearophile
September 23, 2013
On 9/23/13, Andrej Mitrovic <andrej.mitrovich@gmail.com> wrote:
> I have a partial implementation of this in one of my branches

Also, I will not continue work on this unless Walter greenlights it. I'm not going to put in any work that's inevitably going to be thrown away.
September 23, 2013
On 9/23/13, bearophile <bearophileHUGS@lycos.com> wrote:
> 1) One of them is the "aka", that is showing both the name of aliases and the aliased types/values: http://d.puremagic.com/issues/show_bug.cgi?id=5004

I have a partial implementation of this in one of my branches, but IIRC it was difficult to cover all cases since the compiler internally inserts a bunch of aliases as well. Those could be marked that they're internal, so that's fixable. Another issue I ran into is that diagnostics are called after the retrieval of the aliased-to type, which basically means by the time the compiler issues errors the alias declaration is gone, the compiler only works on the target symbol. I had a workaround for this, but it's going to take more work to get done.
September 23, 2013
On 2013-09-21 23:12, Andrei Alexandrescu wrote:

> I'm ambivalent because the matter is fuzzy. It is factually true that
> new releases will break code. On the other hand, that is the case with
> most compiler releases even for mature languages (at Facebook upgrading
> across minor gcc releases _always_ entails significant disruption). On
> the third hand (sic), there are companies and projects using D in the
> real world so stating that it is unstable would do little else than
> shoo people away for no good reason.

Some of these companies still use D1.

About the code breakage. I think it's still an issue that bugfixes and language changes occur in the same release. There is no notion of major, minor and patch releases.

-- 
/Jacob Carlborg
September 23, 2013
On 09/22/2013 09:27 PM, Walter Bright wrote:
> On 9/22/2013 11:43 AM, Timon Gehr wrote:
>> Tracking line numbers is likely worth it. I don't believe that
>> providing column
>> numbers in error messages necessitates a slowdown though.
>
> Please consider that:
>
>       IT ISN'T JUST FOR ERROR MESSAGES
>
> It would go in the symbolic debug info, too, where it will be required
> everywhere and will be right there on the fast path through the
> lexer/compiler.
> ...

There is no such thing as a law that obliges compiler writers to add column numbers in debug info when such information is available in frontend error messages. The trade-offs involved in both cases may be different and deserve separate consideration.

> Now consider the lexer doing a fast skip over comment text (this ranks
> fairly high in the profile). This operation gets a lot slower if you're
> also keeping track of column number.

I am not keeping track of column number.

> Please note that:
>
>       COLUMN NUMBER ISN'T THE OFFSET FROM THE START OF THE LINE
> ...

Obviously. I compute the correct column number exactly in the case when an error message should actually be printed. It is not necessary to do any of this on the fast path. The additional memory word per location that I waste in comparison to DMD could be shaved off by using more computation in the error case (or by giving up support for exact underlining), but the project has not yet reached a stage where this is worth considering/measuring.

Excerpts from actual code I wrote roughly two years ago:

class Source{
    // computes a slice of the entire first line
    // where some given slice occurs in the source buffer.
    // this allows to recover column information on the fly, and we
    // will also be able to print the line where an error occurred
    // without storing it explicitly.
    // running time is linear in output length
    string getLineOf(string rep)in{/*...*/}out{/*...*/}body{
        string before=code[0..rep.ptr-code.ptr];
        string after=code[rep.ptr-code.ptr..$];
        immutable(char)* start=code.ptr, end=code.ptr+code.length;

        // It is fine to skip decoding here, because we are just
        // searching for ASCII characters.
        // TODO: support unicode line breaks?
        foreach_reverse(ref c; before)
            if(c=='\n'||c=='\r'){start = &c+1; break;}
        foreach(ref c; after)
            if(c=='\n'||c=='\r'){end = &c; break;}
        return start[0..end-start];
    }
    // ...
}

struct Location{
    string rep;    // slice of the code representing the Location
    int line;      // line number at start of location

    @property Source source()const{
        auto src = Source.get(rep); // (currently just a linear search)
        assert(src, "source for '"~rep~"' not found!");
        return src;
    }
    // ...
}

int getColumn(Location loc, int tabsize){
    int res=1;
    auto l=loc.source.getLineOf(loc.rep);
    for(;!l.empty&&l[0]&&l.ptr<loc.rep.ptr; l.popFront()){
        if(l.front=='\t') res=res-res%tabsize+tabsize;
        else res++;
    }
    return res;
}
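
A self-contained sketch of the same on-demand idea, not the excerpted code itself: it works from a byte offset rather than a slice, and the names lineStart, lineEnd and columnAt are hypothetical. It assumes 1-based columns, tabs that advance to the next tab stop, and an offset that falls on a UTF-8 code point boundary. It counts code points, which is not always the same as display columns (wide and combining characters are ignored).

```d
import std.stdio : writeln;

// Hypothetical helper: offset of the first byte of the line containing `pos`.
size_t lineStart(string code, size_t pos)
{
    // Searching for ASCII line breaks, so no UTF-8 decoding is needed here.
    while (pos > 0 && code[pos - 1] != '\n' && code[pos - 1] != '\r')
        --pos;
    return pos;
}

// Hypothetical helper: offset just past the last byte of that line.
size_t lineEnd(string code, size_t pos)
{
    while (pos < code.length && code[pos] != '\n' && code[pos] != '\r')
        ++pos;
    return pos;
}

// 1-based column of byte offset `pos`, computed only when it is needed.
size_t columnAt(string code, size_t pos, size_t tabsize = 8)
{
    size_t col = 1;
    // foreach with dchar decodes the UTF-8 prefix one code point at a time.
    foreach (dchar c; code[lineStart(code, pos) .. pos])
    {
        if (c == '\t')
            col += tabsize - (col - 1) % tabsize; // jump to the next tab stop
        else
            ++col;
    }
    return col;
}

void main()
{
    string code = "int a;\n\tint x x;\n";
    size_t pos = 14; // byte offset of the second 'x' on the second line
    writeln(code[lineStart(code, pos) .. lineEnd(code, pos)]);
    writeln("column: ", columnAt(code, pos));
}
```

Nothing in this sketch runs while lexing; the only thing the fast path has to keep per location is the offset (or slice) itself.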



September 24, 2013
On 9/23/2013 10:29 AM, Sean Kelly wrote:
> On Sep 21, 2013, at 10:22 PM, Walter Bright <newshound2@digitalmars.com>
> wrote:
>> I'm not rejecting the idea outright. I've actually implemented this in the
>> dmc compiler. It's just not terribly useful, and it has costs.
>
> I'd consider it in a similar class as the dictionary lookup that occurs when
> an unknown symbol is encountered.  Totally unnecessary, but it's a nice
> time-saver.

It's not in the same category, because that feature has zero cost.

Again, it's about the cost of it.
September 24, 2013
On 9/23/2013 10:33 AM, Iain Buclaw wrote:
> GCC has a caret too now.

DMC has had a caret for 30 years now.

  int x x;
        ^
  test2.c(2) : Error: missing ',' between declaration of 'x' and 'x'

Nobody ever gave a damn about that feature, i.e. not one single person commented on it, including not a single D user.
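
For reference, a minimal sketch of how a caret line like the one above can be built from the source line and the byte offset of the error within it. The function name caretLine is hypothetical and this is not DMC's implementation. Tabs in the prefix are copied through so the caret lines up whatever the terminal's tab width is; wide and combining characters are not handled.

```d
import std.array : appender;
import std.stdio : writeln;

// Hypothetical helper: whitespace padding followed by '^', aligned under
// the character at byte `offset` of `line`.
string caretLine(string line, size_t offset)
{
    auto pad = appender!string();
    // One padding character per code point before the error position.
    foreach (dchar c; line[0 .. offset])
        pad.put(c == '\t' ? '\t' : ' ');
    pad.put('^');
    return pad.data;
}

void main()
{
    string line = "  int x x;";
    writeln(line);
    writeln(caretLine(line, 8)); // offset 8 is the second 'x'
    writeln("test2.c(2) : Error: missing ',' between declaration of 'x' and 'x'");
}
```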
September 24, 2013
On 9/23/2013 11:38 AM, bearophile wrote:
> 2) The other improvement I'd like for D error messages and warnings is to give a
> standard error number. This is a simple improvement, but it makes it simpler to
> write explanation pages for the errors. The C# compiler and other compilers have
> them.

I used to do that, but again, it was a completely unwanted feature, and I abandoned it.

It's simple enough to grep for the error message text, and I myself prefer to do the grep method.

What makes me grumpy is people only want these things when some other compiler does it, sort of a bandwagon thing.

September 24, 2013
On 9/23/2013 12:07 PM, Andrej Mitrovic wrote:
> On 9/23/13, bearophile <bearophileHUGS@lycos.com> wrote:
>> 1) One of them is the "aka", that is showing both the name of
>> aliases and the aliased types/values:
>> http://d.puremagic.com/issues/show_bug.cgi?id=5004
>
> I have a partial implementation of this in one of my branches, but
> IIRC it was difficult to cover all cases since the compiler internally
> inserts a bunch of aliases as well. Those could be marked that they're
> internal, so that's fixable. Another issue I ran into is that
> diagnostics are called after the retrieval of the aliased-to type,
> which basically means by the time the compiler issues errors the alias
> declaration is gone, the compiler only works on the target symbol. I
> had a workaround for this, but it's going to take more work to get
> done.
>

I worry that this is too complicated to be worthwhile.