August 04, 2014
On Monday, 4 August 2014 at 01:19:28 UTC, Andrei Alexandrescu wrote:
> On 8/3/14, 6:17 PM, John Carter wrote:
>> Well, I'm the dogsbody who has the job of upgrading the toolchain and
>> handling the fallout of doing so.
>>
>> So I have been walking multimegaline code bases through every gcc
>> version in the last 15 years.
>
> Truth. This man speaks it.
>
> Great post, thanks!
>
>
> Andrei

His post basically says that his real-life experience leads him to believe that a static analyzer that uses information from asserts will very likely generate a ton of warnings/errors, because real-life code is imperfect.

In other words, if you use that information to optimize instead, you are going to get a ton of bugs, because the asserts are inconsistent with the code.
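
To make the failure mode concrete, here is a minimal sketch (hypothetical code, assuming the proposed -release semantics):

int f(int x)
{
    assert(x != 0); // stale assumption; callers now legitimately pass 0
    if (x == 0)     // defensive check the optimizer may now delete,
        return -1;  // since the assert lets it assume x != 0
    return 100 / x; // in -release a zero slips through to the division
}

In debug builds the assert fires and the bug is caught; under the proposal, -release removes the assert, assumes it instead, folds away the x == 0 branch, and the "impossible" division by zero is reached.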

So his post completely supports the conclusion that you've disagreed with, unless this has convinced you and you're switching sides now (could it be?) :)
August 04, 2014
On Monday, 4 August 2014 at 01:26:10 UTC, Daniel Gibson wrote:
> On 04.08.2014 03:17, John Carter wrote:
>> But that's OK.
>>
>> Because I bet 99.999% of those warnings will be pointing straight at
>> bona fide defects.
>>
>
> Well, that would make the problem more acceptable...
> However, it has been argued that it's very hard to warn about code that will be eliminated, because that code often only becomes dead or redundant due to inlining, template instantiation, mixins, ... and you can't warn in those cases.
> So I doubt that the compiler will warn every time it removes checks that are considered superfluous because of a preceding assert().
>
> Cheers,
> Daniel

It is possible, just not as a default-enabled warning.

Some compilers offer optimization diagnostics that can be enabled by a switch. I'm quite fond of those: it's much faster to go through a list of compiler-highlighted failed/successful optimizations than to be forced to check the asm output after every new compiler version or minor code refactoring.

In my experience, this actually works fine in huge projects. Even if there are false positives, you can analyse what changed from the previous version, and you can ignore modules which you know are not performance critical.
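
For example, with a GCC-based compiler the workflow looks roughly like this (a sketch; I'm assuming gdc forwards -fopt-info to the middle end the same way gcc does):

// Compile with, e.g.:
//   gdc -O2 -frelease -fopt-info-optimized=opts.txt sum.d
// then diff opts.txt against the report from the previous toolchain.
double sum(const double[] a)
{
    double s = 0;
    foreach (x; a) // a vectorizer remark (or its absence) here tells you
        s += x;    // whether the loop is still optimized after an upgrade
    return s;
}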
August 04, 2014
On Monday, 4 August 2014 at 02:18:12 UTC, David Bregman wrote:
>
> His post basically says that his real-life experience leads him to believe that a static analyzer that uses information from asserts will very likely generate a ton of warnings/errors, because real-life code is imperfect.
>
> In other words, if you use that information to optimize instead, you are going to get a ton of bugs, because the asserts are inconsistent with the code.

No.

My experience says deeper optimization comes from deeper understanding of the dataflow; with deeper understanding of the dataflow come stricter warnings about defective usage.

i.e. Good compiler writers, as Walter and the gcc guys clearly are, don't just slap in an optimization pass out of nowhere.

They are all too painfully aware that if their optimization pass breaks anything, they will be fending off thousands of complaints that "Optimization X broke....".

Compiler users always blame the optimizer long before they blame their crappy code.

Watching the gcc mailing list over the years, I've seen those guys bend over backwards to prevent that from happening.

But since an optimization has to be based on additional hard information, they have, with every new version of gcc, used that information both for warnings and optimization.
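
To illustrate with a hypothetical fragment: the very fact that licenses the optimization is exactly what's needed for the warning.

void f(ref int x)
{
    assert(x > 0);
    if (x < 0)  // given the assert-derived range (x > 0), a dataflow-aware
    {           // compiler can both warn ("condition is always false")
        x = -x; // and delete this branch as dead code
    }
}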

August 04, 2014
On Sunday, 3 August 2014 at 21:57:08 UTC, John Carter wrote:
> On Sunday, 3 August 2014 at 19:47:27 UTC, David Bregman wrote:
>
>> Walter has proposed a change to D's assert function as follows [1]:
>> "The compiler can make use of assert expressions to improve optimization, even in -release mode."
>
> Hmm. I really really do like that idea.
>
> I suspect it is one of those ideas of Walter's that has consequences reaching further than anyone foresees... but that's OK, because it is fundamentally the correct course of action; its implications, foreseen and unforeseen, will be correct.
>
> One "near term" implication is to permit deeper static checking of the code.
>

Allow me to chime in. I don't have much time to follow the whole thing, but I have had this in mind for quite a while.

First things first: the proposed behavior is what I have had in mind for SDC since pretty much day 1. It already uses hints to tell the optimizer that the branch won't be taken, but I definitely want to go further.

By definition, when an assert has been removed in release that would have failed in debug, you are in undefined behavior land already. So there is no reason not to optimize.
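
A minimal sketch of what I mean, assuming the proposed -release semantics:

int lookup(const int[] table, size_t i)
{
    assert(i < table.length);
    // In debug, a violation fails the assert before reaching the access.
    // In the proposed -release, a violation is undefined behavior anyway,
    // so the optimizer is free to elide the bounds check below.
    return table[i];
}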
August 04, 2014
On Monday, 4 August 2014 at 02:31:36 UTC, John Carter wrote:

> But since an optimization has to be based on additional hard information, they have, with every new version of gcc, used that information both for warnings and optimization.

Hmm. Not sure I made that clear.

i.e. Yes, it is possible that a defect may be injected by an optimization that assumes an assert is true when it isn't.

However, experience suggests that many (maybe two full orders of magnitude) more defects will be flagged.

i.e. In terms of defect reduction, it's a big win rather than a loss.

The tragedy of C optimization and static analysis is that the language is so loosely defined, in terms of how it is used, that the compiler has very little to go on.

This proposal looks to me to be a Big Win, because it gives the compiler (and any analysis tools) a huge amount of eminently usable information.
August 04, 2014
On Monday, 4 August 2014 at 02:31:36 UTC, John Carter wrote:
> On Monday, 4 August 2014 at 02:18:12 UTC, David Bregman wrote:
>>
>> His post basically says that his real-life experience leads him to believe that a static analyzer that uses information from asserts will very likely generate a ton of warnings/errors, because real-life code is imperfect.
>>
>> In other words, if you use that information to optimize instead, you are going to get a ton of bugs, because the asserts are inconsistent with the code.
>
> No.
>
> My experience says deeper optimization comes from deeper understanding of the dataflow; with deeper understanding of the dataflow come stricter warnings about defective usage.

Yes, that isn't what is being proposed though. This is about optimization, not warnings or errors.

> i.e. Good compiler writers, as Walter and the gcc guys clearly are, don't just slap in an optimization pass out of nowhere.
>
> They are all too painfully aware that if their optimization pass breaks anything, they will be fending off thousands of complaints that "Optimization X broke....".

If you read the earlier threads, you will see that Walter freely admits this will break code. Actually, he says that such code is already broken. This doesn't involve new warnings; it will just break silently. It would be very difficult to do otherwise (see Daniel Gibson's reply to your post).
August 04, 2014
On Monday, 4 August 2014 at 00:34:30 UTC, Andrei Alexandrescu wrote:
> On 8/3/14, 4:51 PM, Mike Farnsworth wrote:
>> This all seems to have a very simple solution: use something like expect().
>> expect()
>>
>> GCC for example has an intrinsic, __builtin_expect() that is used to
>> notify the compiler of a data constraint it can use in optimization for
>> branches.  Why not make something like this a first-class citizen in D
>> (and even expand the concept to more than just branch prediction)?
>
> __builtin_expect is actually not that. It still generates code when the expression is false. It simply uses the static assumption to minimize jumps and maximize straight execution for the true case. -- Andrei

Yes, that's why I pointed out expanding it to actually throw an exception when the expectation isn't met. I guess that's really more like the assume() that has been mentioned?

At EA we used two versions of an assertion: assert(), which compiled out in non-debug builds, etc.; and verify(), which was kept in non-debug builds but just boiled back down to the condition. The latter was for when we relied on the side effects of the logic (used the condition in a real runtime branch), but really we wanted to know if it ever took the else branch, so to speak, as that was an error we never wanted to ship with.

FWIW, at my current job, in C++ we use an assert() that compiles out for final builds (very performance-driven code, so even the conditions tested have to scat), and we also have likely() and unlikely() macros that take advantage of __builtin_expect(). There are only a few places where we do both, where the assertion may be violated and we still want to recover nicely from it, but still don't want the performance suck of the else-case code polluting the instruction cache.
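
In D terms, the verify() half of that split might look something like this (a sketch; verify is our in-house name, not a library function):

// Evaluates the condition in every build mode: hard stop in debug,
// plain boolean result in release so callers can recover.
bool verify(bool cond, string msg = "verify failed")
{
    debug
    {
        if (!cond)
            assert(false, msg); // traps in debug builds only
    }
    return cond; // release builds just hand the condition back
}

// Usage: branch on it, so the "never ship with this" path stays recoverable:
//   if (!verify(ptr !is null)) return;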
August 04, 2014
On Monday, 4 August 2014 at 02:40:49 UTC, deadalnix wrote:
> Allow me to chime in. I don't have much time to follow the whole thing, but I have had this in mind for quite a while.
>
> First things first: the proposed behavior is what I have had in mind for SDC since pretty much day 1. It already uses hints to tell the optimizer that the branch won't be taken, but I definitely want to go further.

Not everyone had that definition in mind when writing their asserts.

> By definition, when an assert has been removed in release that would have failed in debug, you are in undefined behavior land already. So there is no reason not to optimize.

By the new definition, yes. But is it reasonable to change the definition and then retroactively declare previous code broken? Maybe the ends justify the means in this case, but it certainly isn't obvious that they do. I don't understand why breaking code is sacrilege one time, while the next time it can be done without any justification.
August 04, 2014
On 8/3/14, 6:59 PM, David Bregman wrote:
> w.r.t the one question about performance justification: I'm not
> necessarily asking for research papers and measurements, but based on
> these threads I'm not aware that there is any justification at all. For
> all I know this is all based on a wild guess that it will help
> performance "a lot", like someone who optimizes without profiling first.
> That certainly isn't enough to justify code breakage and massive UB
> injection, is it? I hope we can agree on that much at least!

I think at this point (without more data) a bit of trust in one's experience would be needed. I've worked on performance on and off for years, and so has Walter. We have plenty of war stories that inform our expertise in the matter, including weird stuff like "swap these two enum values and you'll get a massive performance regression although the code is correct either way".

I draw from numerous concrete cases that the right/wrong optimization at the right/wrong place may well be the difference between winning and losing. Consider the recent php engine that gets within 20% of hhvm; heck, I know where to go to make hhvm 20% slower with 50 lines of code (compare at 2M+). Conversely, gaining those 20% took months multiplied by Facebook's best engineers.

Efficiency is hard to come by and easy to waste. I consider Walter's take on "assert" a modern, refreshing treatment of an old pattern that nicely preserves its spirit, and a good opportunity and differential advantage for D. If anything, these long threads have strengthened that belief. They have also clarified to me that:

(a) We must make sure we don't transform @safe code into unsafe code; to a first approximation that may simply mean assert() has no special meaning in release mode. Also, bounds checking should probably not be elided on the strength of an assert. I consider these challenging but in good, gainful ways. (A sketch of the hazard follows below.)

(b) Deployment of optimizations must be carefully staggered and documented.
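
Regarding (a), the hazard is easy to sketch (hypothetical code, assuming assert-driven elision were allowed to reach into @safe):

@safe int get(const int[] a, size_t i)
{
    assert(i < a.length);
    // If -release let the optimizer trust the assert and drop the bounds
    // check, a false assert would turn this @safe access into an
    // out-of-bounds read, which is precisely what (a) rules out.
    return a[i];
}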


Andrei

August 04, 2014
On 8/3/14, 8:22 PM, Andrei Alexandrescu wrote:
> (a) We must make sure we don't transform @safe code into unsafe code; to
> a first approximation that may simply mean assert() has no special
> meaning in release mode.

... in @safe code! -- Andrei