Implicit Asserts as Optimization/Compile Warnings/Debugging

Nov 30, 2003

Russ Lewis

Nov 30, 2003

Walter

Dec 01, 2003

Dec 01, 2003

Dec 10, 2003

Dec 10, 2003

November 30, 2003

Implicit Asserts as Optimization/Compile Warnings/Debugging

Posted by Russ Lewis

Permalink

Russ Lewis

Permalink

What if an assert in one place in the program was translated to many different places in the program, reflecting the explicit and implicit assumptions of the programmer?  What if this system was used to allow for better optimization and compile-time error detection?

First, as I enter this, let me say that I am aware of the Halting Problem.  Most likely, any such optimization system would have to have a cap...it would stop after searching x deep (and, probably, issue a warning saying that it had to stop).  But let's explore the potential it has...

Second, I am aware that the mechanism I describe here only works if you compile all of the .d files together at once...if any function implementations are hidden, it becomes much more difficult (or impossible) to perform these optimizations.

My theory is that at any given line of code there are any number of assert()s which are true.  Some are explicit (stated in an earlier line, where variable values have not been altered) and some are implicit (stated later in the program, stated in calling functions, in called functions, etc.).  I'm thinking that the compiler should propogate assert()s as much as possible at compile time in an attempt to find dead code, bugs, and optimization opportunities.

The interesting thing about this is that it allows, in a sense, to have the program to describe its own compile-time error.  Basically, you make it a compile-time error (or warning?) if the compiler can detect any path where an assert() is hit.  First, the trivial case.  This function generates a compile-time error:

void foo()
{
   int a = 1;
   assert(a == 0);
}

However, this section of code also causes an error:

void foo(int arg)
body { assert(arg > 0); ... }

void bar()
{
   foo(0);
}

This is pretty trivial.  But things get more interesting if the compiler communicates implicit asserts around the program.  Take this example:

void foo(int arg)
body { assert(arg > 0); ... }

void bar(int arg)
{
   foo(arg);
   if(arg == 0)
       {    ...    }
}

Now, the compiler knows from the assert() in foo() that arg must be >0.  So it adds an implicit assert at compile time to bar(), which calls foo():

void bar(int arg)
{
   /* implicit */ assert(arg > 0);
   foo(arg)
   if(arg == 0)
       {    /* dead code...can't be reached */    }
}

(Of course, this implicit assert, once added to bar(), gets propogated up to all functions which call bar(), and so on.)

So far, this isn't very advanced.  But let's push things a little further, using the example where the implicit default case (of a switch) throws an exception (something people are debating a lot in a previous thread). First, let's make this assumption: any exception that gets all the way out past main() to the loader function is a program bug.  It means we didn't handle something properly.  So, we place an assertion in the loader function

int loader_function_foo(...)
{
   try {
       ...call main() here, with the right args...
   }
   catch    /* catch anything & everything */
   {
       assert(0);
   }
}

So, this means that if the compiler can detect any code path which results in an exception that is not caught, it will treat it as a compile-time error.  So, look at how the compiler can transform a switch statement that lacks a default case:

switch(var)
{
case 1:    ...    break;
case 2:    ...    break;
case 3:    ...    break;
/* this line implicitly added by the compiler */ default: throw NoMatchingCaseException;
}

Now, if the compiler looks up the call stack (as much as it is able) and finds no catch() for that exception, it further transforms the switch:

switch(var)
{
case 1:    ...    break;
case 2:    ...    break;
case 3:    ...    break;
/* this line implicitly added by the compiler */ default: assert(0);
}

Finally, this assertion is moved up in the function:

/* this line implicitly added by the compiler */ assert(var == 1 || var == 2 || var == 3);
switch(var)
{
case 1:    ...    break;
case 2:    ...    break;
case 3:    ...    break;
}

This assertion can continue to be carried up to calling functions.  The compiler will (in most cases, giving our respects to the Halting Problem) be able to determine that either the default case will never be hit, or that the default case can be hit.  In the latter situation, it fires off a compile-time error.  In the former case, the code is more optimized because the function does not have to have an added default: statement.

Thoughts?

"Russ Lewis" <spamhole-2001-07-16@deming-os.org> wrote in message news:bqd07o$che$1@digitaldaemon.com... > Thoughts? I think you're right. There is a lot of potential for an advanced compiler to make use of asserts. I also agree that if the compiler can prove that an assert will be tripped, that should be a compile time error.

Walter wrote: > "Russ Lewis" <spamhole-2001-07-16@deming-os.org> wrote in message > news:bqd07o$che$1@digitaldaemon.com... > >>Thoughts? > > > I think you're right. There is a lot of potential for an advanced compiler > to make use of asserts. I also agree that if the compiler can prove that an > assert will be tripped, that should be a compile time error. We could include this into the language standard now, in a way that allows space both for compilers that support the feature and those that do not. I would suggest a syntax where we overload assert() (at least) 2 ways: A: assert(expr); B: assert(expr,uint); A: Existing assert(). Non-searching compilers (like current dmd) just compile in a test, with a throw if it fails, for debug builds. Searching compilers can copy this assert as many times as they deem appropriate. B: Programmer requests that the compiler search this assert() at least this many plys deep. (Compiler may search even further if so desired.) Non-searching compilers should issue a WARNING stating that they are unable to search. Searching compilers should offer compile-time WARNING if the search cannot conclusively determine that the assert() is safe before running out of time. In all cases, searching compilers would issue a compile time ERROR if they detected any path where the assert might fail. Thoughts? Is this a good syntax? If so, can we add it to the official syntax now, even if dmd doesn't support searching yet?

"Russ Lewis" <spamhole-2001-07-16@deming-os.org> wrote in message news:bqemvh$2ohu$1@digitaldaemon.com... > Walter wrote: > > "Russ Lewis" <spamhole-2001-07-16@deming-os.org> wrote in message news:bqd07o$che$1@digitaldaemon.com... > > > >>Thoughts? > > > > > > I think you're right. There is a lot of potential for an advanced compiler > > to make use of asserts. I also agree that if the compiler can prove that an > > assert will be tripped, that should be a compile time error. > > We could include this into the language standard now, in a way that allows space both for compilers that support the feature and those that do not. I would suggest a syntax where we overload assert() (at least) 2 ways: > > A: assert(expr); > B: assert(expr,uint); > > A: > Existing assert(). Non-searching compilers (like current dmd) just > compile in a test, with a throw if it fails, for debug builds. > Searching compilers can copy this assert as many times as they deem > appropriate. > > B: > Programmer requests that the compiler search this assert() at least this > many plys deep. (Compiler may search even further if so desired.) > Non-searching compilers should issue a WARNING stating that they are > unable to search. Searching compilers should offer compile-time WARNING > if the search cannot conclusively determine that the assert() is safe > before running out of time. > > In all cases, searching compilers would issue a compile time ERROR if they detected any path where the assert might fail. > > Thoughts? Is this a good syntax? If so, can we add it to the official syntax now, even if dmd doesn't support searching yet? I think the compiler can potentially handle this automatically without any additional syntax.

Walter wrote: > I think the compiler can potentially handle this automatically without any > additional syntax. Sure, it can, but I think that that has problems. The programmer who uses a searching compiler will assume, when he writes an assert, that one of two things are going to happen. Either a) The compiler searches the program and determines that my assert is for-sure safe; it thus removes it from the program altogether, or b) The compiler searches the program and finds and error; it tells me so with a compile-time error. Thus, asserts become not just a sanity checking tool, but a very powerful unittesting tool. However, what happens when there is a third case: c) The compiler tries to search the assert() but gives up after a little while. It doesn't tell me, though, so I have potential of runtime errors at any arbitrary time in the future. If (c) is possible, then I think that programmers won't use assert()s to their full potential, nor will they trust them. So, I would argue, if the programmer is expecting the compiler to search a given assert(), but the compiler gives up because of Halting Problem troubles, then it really ought to report this to the user with a warning. However, you have repeatedly stated (and I agree) that a multiplicity of warnings just causes troubles. Too many warnings is just as bad as none at all. So, I proposed two syntaxes. One syntax (without the uint arg) simply makes an assert as we currently have them. Searching compilers may use them, but need not report warnings if the search runs out of time. The other syntax (with the uint arg) is an explicit statement from the programmer that he wants this assert() to be searched. Thus, compilers are required to do their best to search, and if they cannot acheive it, then they are required to post a warning. IMHO, this gives the best of both worlds - it gives a "clean" output for the simply written program, and it gives a "detailed" output for programs which require it. Moreover, it gives a clear, language-level division between which assert()s MUST be searched, and which can be searched at the convenience of the compiler. Russ

Russ Lewis wrote: > > If (c) is possible, then I think that programmers won't use assert()s to their full potential, nor will they trust them. So, I would argue, if the programmer is expecting the compiler to search a given assert(), but the compiler gives up because of Halting Problem troubles, then it really ought to report this to the user with a warning. > I dunno. Java has runtime assertion checking, and the thought process there is pretty simple: Assertions are debug tools. When assertions are used, they should never have side effects (i.e. we are only checking state, not making changes to state). Therefore, I can use assertions to do complex validations that are expensive to do. At runtime if the assertions are enabled (usually during development) then we perform all those checks. However, assertions can also be disabled which means those expensive checks are no longer performed. The thought here is that any reasons for those assertions have already been shaken out. I like using assertions as debug tools. If assertions can never be disabled at runtime, then there are certain types of checks that I won't do because it would be too computationally expensive. If the language performs runtime checking that is fine--it is a different tool than an assertion. Bounds checking is critical with arrays--I would never want that turned off.

Forums