No bounds checking for dynamic arrays at compile time? (page 3)

On Thursday, 13 December 2012 at 09:38:18 UTC, bearophile wrote: > That program seems to have a bug, unless the signature of foo becomes (ref int[]). Indeed. I'm learning how to type on a new keyboard, so most of my brain power is being spent on figuring out where the keys are. It took me over 40 minutes to type that whole message. :( I said it right in my head, fwiw. > Right, but there are several cases where a little smarter compiler is able to see at compile-time that something bad is present in the code. The problem is that it can't solve anything but the most basic of errors in code. This type of analysis will simultaneously be computationally expensive and useless in _real_ code. My point is that, ultimately, run time problems have to be dealt with at run time. I wouldn't mind this suggestion except for the fact that it will take away human resources from much more useful endeavors (not to mention increasing compilation time). Many suggestions I see the value in... but I see almost none here (as I explained, it won't do anything in non-trivial code that you might actually see using dynamic arrays).

On Thu, Dec 13, 2012 at 05:31:05PM +0100, Chris Cain wrote: > On Thursday, 13 December 2012 at 09:38:18 UTC, bearophile wrote: [...] > >Right, but there are several cases where a little smarter compiler is able to see at compile-time that something bad is present in the code. > > The problem is that it can't solve anything but the most basic of errors in code. This type of analysis will simultaneously be computationally expensive and useless in _real_ code. My point is that, ultimately, run time problems have to be dealt with at run time. > > I wouldn't mind this suggestion except for the fact that it will take away human resources from much more useful endeavors (not to mention increasing compilation time). Many suggestions I see the value in... but I see almost none here (as I explained, it won't do anything in non-trivial code that you might actually see using dynamic arrays). Yeah, I think this is a case of premature optimization, a very tempting trap to fall into for efficiency-minded people (like myself). Many of these "optimizations" actually hurt, rather than help. Like you said, making the compiler detect dynamic array length at compile-time is non-trivial -- in fact, it amounts to solving the Halting Problem, which is unsolvable -- so the only practical cases that can be done are trivial and most likely nowhere near the performance bottleneck that makes optimization necessary in the first place. Definitely not worth the significant increase in compilation times. I used to be a big fan of optimizing everywhere, until one day I actually took the time to profile my code, and discovered to my great surprise that my hotspots were NOT where I thought they were. They were all the places where I didn't even think to look because they were so trivial and so obviously already "optimal". Which meant that 90% of my "optimization" efforts had been wasted on code that wasn't even important to optimize, while neglecting the real hotspots. (In fact, some of my "optimizations" turned out to actually degrade performance, because I eschewed a certain implementation technique as "slow" where it would have helped with the actual hotspots in the code. Not to mention wasting much of my time in thinking up clever ways of "optimizing", where a straightforward implementation would have sufficed.) I wouldn't be surprised if many (most?) programmers will be shocked to learn where the real hotspots in their code are, contrary to whatever preconceived notions they may have had. T -- A computer doesn't mind if its programs are put to purposes that don't match their names. -- D. Knuth

On 12/13/2012 3:07 AM, bearophile wrote: > I agree that putting lot of similar special cased tests in the compiler is a bad > idea (also because code like $+$-$+1 is very uncommon). > But can't the already present expression range analysis be used to cover some > simple but common enough bugs? I've seen no evidence that these are "common enough". >> Since the bug is caught anyway, such is an extremely low priority because it's >> got such a low payoff. > > If that's true then why aren't you programming in Python? :-) Spotting mistakes > at compile time is usually better because the sooner you find bugs, the less > problems they cause. Although that is an advantage to static typing, it is hardly the only advantage. Static typing has a tremendous advantage in generating high performance code.

On 12/13/2012 4:05 AM, Jacob Carlborg wrote: > On 2012-12-13 10:54, Walter Bright wrote: > >> I just don't see the point in adding flow analysis for that, and it'll >> ding you at runtime anyway > > What about code that is only executed at compile time? > CTFE would catch it.

On 12/13/2012 9:40 AM, H. S. Teoh wrote: > I wouldn't > be surprised if many (most?) programmers will be shocked to learn where > the real hotspots in their code are, contrary to whatever preconceived > notions they may have had. I can vouch for this. I've been programming for 35 years, and I still get gobsmacked by where the real bottleneck is. For errors, what I try to do is look at the kinds of patterns of error that are commonplace, and try to devise ways to head them off. Expending effort on better detection of errors that people don't make is a waste of time. (Note I said "better" detection. Errors still need to be detected.)

December 13, 2012

Re: No bounds checking for dynamic arrays at compile time?

Posted by bearophile
in reply to Walter Bright

Permalink

bearophile

Posted in reply to Walter Bright

Permalink

Walter Bright:

> For errors, what I try to do is look at the kinds of patterns of error that are commonplace, and try to devise ways to head them off.

This was a bug commonly found, I think you accepted it, but it's not fixed yet. I hope it's not forgotten, it's a little breaking change:

http://d.puremagic.com/issues/show_bug.cgi?id=5409

- - - - - - - - - - - - - - - -

Some other common bug patterns:

Issue http://d.puremagic.com/issues/show_bug.cgi?id=4407

class Foo {
    int x, y;
    this(int x_, int y_) {
        this.x = x;
        y = y;

    }
}
void main() {}

- - - - - - - - - - - - - - - -

Issue http://d.puremagic.com/issues/show_bug.cgi?id=3878

Arguments and members with the same name:

class Foo {
    int x;
    this(int x) { x = x; }
    void inc(int x) { this.x += x; }
}
class Bar {
    int x;
    this() { x = 5; }
}
struct Spam {
    static int x;
    void inc(int x) { Spam.x += x; }
}
void main() {}

- - - - - - - - - - - - - - - -

Issue http://d.puremagic.com/issues/show_bug.cgi?id=5187

C# refuses code similar to this:


public class Foo {
    public int x = 10;
}
public class Test : Foo {
    public int x = 20;
}
void main() {}

- - - - - - - - - - - - - - - -

Issue http://d.puremagic.com/issues/show_bug.cgi?id=5212


class Foo {
    int[] args;
    this(int[] args_...) {
        args = args_;
    }
}
Foo foo() {
    return new Foo(1, 2, 3); // passes stack data to Foo
}
void main() {
    assert(foo().args == [1, 2, 3]);
}

- - - - - - - - - - - - - - - -

Issue http://d.puremagic.com/issues/show_bug.cgi?id=8757


auto x1 = y1 ? z1 : w1; // OK
auto x2 = x0 + (y1 ? z1 : w1); // OK
auto x3 = (x0 + y1) ? z1 : w1; // OK
auto x4 = x0 + y1 ? z1 : w1; // Not good
auto x5 = y1 ? z1 : (y2 ? z2 : w2); // OK
auto x6 = y1 ? z1 : y2 ? z2 : w2; // Not good

- - - - - - - - - - - - - - - -

> Expending effort on better detection of errors that people don't make is a waste of time.

I agree. Bugs 5409 and 8757 are demonstrably common in already debugged C/C++ code. Bug 5212 is a trap.


Now this issue is fixed:
http://d.puremagic.com/issues/show_bug.cgi?id=6883

So this code:

// program#1
void main() {
    int[5] x;
    x[x.length] = 1;
    x[$] = 1;
    enum size_t n = 2;
    x[x.length + n] = 2;
    x[$ + n] = 2;
}


Generates the errors:
test.d(3): Error: array index 5 is out of bounds x[0 .. 5]
test.d(4): Error: array index 5 is out of bounds x[0 .. 5]
test.d(6): Error: array index 7 is out of bounds x[0 .. 5]
test.d(7): Error: array index 7 is out of bounds x[0 .. 5]


If I keep the same code but I replace x with a dynamic array no compile-time errors are generated:

// program#2
void main() {
    auto x = new int[5];
    x[x.length] = 1;
    x[$] = 1;
    enum size_t n = 2;
    x[x.length + n] = 2;
    x[$ + n] = 2;
}


program#1 code that uses fixed-sized arrays is flagged as wrong at compile time. program#2 is equally wrong, why isn't it good to give the same compilation errors for all or part of those four cases in program#2? Do they need lot of special casing?

Bye,
bearophile

On Friday, December 14, 2012 08:34:44 Jacob Carlborg wrote: > On 2012-12-13 22:26, Walter Bright wrote: > > CTFE would catch it. > > Didn't you just say that flow analysis is needed for that? No. You'd get a RangeError in CTFE just like you'd get at runtime. It's just that the function is being run at compile time instead of runtime. What would require flow analysis would be to statically determine that there was an indexing problem. It's the difference between examining the code to determine whether it has a problem and running it to see if it has a problem. - Jonathan M Davis

On 12/13/2012 11:34 PM, Jacob Carlborg wrote: > On 2012-12-13 22:26, Walter Bright wrote: > >> CTFE would catch it. > > Didn't you just say that flow analysis is needed for that? > CTFE executes at compile time, no flow analysis is needed for that. DFA is something very different - it "executes" all paths simultaneously.

Forums