checkedint call removal (page 41)

August 02, 2014

Re: checkedint call removal

Posted by Artur Skawina
in reply to Andrew Godfrey

On 08/02/14 05:34, Andrew Godfrey via Digitalmars-d wrote:
> Suppose I call some logging function which has a faulty assertion in it. What about Walter's position prevents that assertion's effects from escaping the logging function and infecting my code?

Nothing. Walter _wants_ the assumptions to not be contained inside the assert expression.

> I know cross-module optimization is hard hence this may be unlikely, but still it shows something missing.

Like I said in my very first post about this proposal -- this is *not* a theoretical issue. It's not some abstract problem that might surface once a sufficiently smart compiler arrives. It affects code *right now*. Fortunately, compilers are not using Walter's assert definition. But that's not because doing it would be 'hard'; the support for this kind of optimization is there. Look:

ass1.d:
---------------------------------------------------------------------
   static import gcc.attribute;
   enum inline = gcc.attribute.attribute("forceinline");

   @inline void assert_()(bool c) {
      import gcc.builtins;
      assert(c);
      if (!c)                         // *1
         __builtin_unreachable();     // *1
   }

   extern (C) void alog(int level, char* msg) {
      assert_(level<10);
      //...
   }

   extern (C) int cf();
   void main() { if (!cf()) assert(0); }
---------------------------------------------------------------------

ass2.c:
---------------------------------------------------------------------
   void alog(int level, char* msg);

   volatile int g = 10;

   int cf() {
      int a = /* an expression that might return 10 at RT */g;

      alog(a, "blah");

      return a==10;
   }
---------------------------------------------------------------------

Compile this with "gdc -O3 ass1.d ass2.c -o ass -flto -frelease" and
this program will fail because the assert triggers. Now, either remove
the lines in the `assert_` function marked with `*1` (it's gcc-speak
for 'assume'), or fix the assert condition -- now the program will
succeed.
The point here is that a simple typo, an off-by-one error, an "<" instead
of "<=", etc., in the assert condition gives you undefined behavior. The
program can then do absolutely anything: return a wrong result, crash
or execute `rm -rv /`.
In this case gdc compiles it to just:

0000000000402970 <_Dmain>:
  402970:       48 83 ec 08             sub    $0x8,%rsp
  402974:       8b 05 26 ad 25 00       mov    0x25ad26(%rip),%eax        # 65d6a0 <g>
  40297a:       e8 91 f1 ff ff          callq  401b10 <abort@plt>

_`assume` is extremely dangerous_. Redefining `assert` to include `assume` would result in D's `assert` being banned from the whole code base, if 'D' would even be considered a 'sane' enough language to use...

artur
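
For readers without a gdc/LTO setup, the same pattern can be approximated in a single C file. This is only a sketch under assumptions: `assert_assume`, `alog_sketch` and `cf_sketch` are hypothetical names standing in for the post's `assert_`, `alog` and `cf`, and `__builtin_unreachable()` is the GCC/Clang builtin the post already uses. Whether a given compiler actually folds the comparison depends on version and flags.

```c
#include <assert.h>

/* Hypothetical C counterpart of the post's assert_(): check in debug builds,
   then tell the optimizer that the condition can never be false. */
static inline void assert_assume(int cond) {
    assert(cond);                    /* compiled out under -DNDEBUG */
    if (!cond)
        __builtin_unreachable();     /* GCC/Clang builtin: reaching this is UB */
}

/* Stand-in for alog(): the condition is deliberately too strict,
   modelling the faulty assertion discussed in the thread. */
void alog_sketch(int level, const char *msg) {
    assert_assume(level < 10);
    (void)msg;
}

/* Stand-in for cf(): once alog_sketch is inlined in an NDEBUG build, the
   optimizer is allowed to treat a == 10 as impossible and return 0. */
int cf_sketch(int a) {
    alog_sketch(a, "blah");
    return a == 10;
}
```

Calling `cf_sketch` with any value below 10 is well defined and returns 0. Calling it with 10 in an optimized `-DNDEBUG` build is exactly the undefined case described above, so it should only be tried deliberately.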

On Saturday, 2 August 2014 at 11:12:42 UTC, Artur Skawina via Digitalmars-d wrote:
> _`assume` is extremely dangerous_.

You sure can come up with an example where -release (and only with release the problem exists) results in equally dangerous behaviour by overwriting memory due to disabled bound checks.

> Redefining `assert` to include `assume`
> would result in D's `assert` being banned from the whole code base, if 'D'
> would even be considered a 'sane' enough language to use...

Or you ban -release from the project.

On 08/02/14 14:12, Tobias Pankrath via Digitalmars-d wrote:
> On Saturday, 2 August 2014 at 11:12:42 UTC, Artur Skawina via Digitalmars-d wrote:
>>
>> _`assume` is extremely dangerous_.
>
> You sure can come up with an example where -release (and only with release the problem exists) results in equally dangerous behaviour by overwriting memory due to disabled bound checks.

`assume` (ie Walter's version of assert) is much worse because even if there are unconditionally-enabled open-coded bounds checks, the compiler will silently skip them. This:

------------------------------------------------------------------
   auto fx(ubyte* p, size_t len) @safe {
      assert_(len>0);
      if (len>=1)
         return p[0];
      return -1;
   }
------------------------------------------------------------------

turns into:

------------------------------------------------------------------
00000000004029a0 <@safe int fx(ubyte*, ulong)>:
  4029a0:       0f b6 07                movzbl (%rdi),%eax
  4029a3:       c3                      retq
------------------------------------------------------------------

Keep in mind that the `assert` can be elsewhere, in a different
function and/or module, and can even be written in a different
language. The D-asserts will propagate into C code, just like in
my previous example.

artur
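
The difference between a plain assert and an assume can be sketched in C as a build-time switch. The names `CHECK`, `SKETCH_ASSUME` and `first_byte` are hypothetical and not from the thread; the function mirrors `fx` above. In the default build the condition is checked by `assert` (or ignored under `-DNDEBUG`) and the open-coded `len >= 1` test stays meaningful. With `-DSKETCH_ASSUME` the compiler is told `len > 0` always holds, so it may drop that test, which is presumably what happened in the disassembly above.

```c
#include <assert.h>
#include <stddef.h>

#ifdef SKETCH_ASSUME                 /* hypothetical build switch */
/* "assume": a false condition is undefined behaviour (GCC/Clang builtin). */
#define CHECK(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
#else
/* Plain assert: checked normally, a no-op under -DNDEBUG, never an assumption. */
#define CHECK(cond) assert(cond)
#endif

/* C analogue of fx(): returns the first byte, or -1 for an empty buffer. */
int first_byte(const unsigned char *p, size_t len) {
    CHECK(len > 0);
    if (len >= 1)                    /* open-coded bounds check */
        return p[0];
    return -1;                       /* dead code once len > 0 is assumed */
}
```

Comparing the output of `gcc -O2 -DNDEBUG -S` with and without `-DSKETCH_ASSUME` is one way to see whether the bounds check survives on a particular compiler.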

On Saturday, 2 August 2014 at 12:44:26 UTC, Artur Skawina via Digitalmars-d wrote:
> ------------------------------------------------------------------
>
> Keep in mind that the `assert` can be elsewhere, in a different
> function and/or module, and can even be written in a different
> language. The D-asserts will propagate into C code, just like in
> my previous example.
>
> artur

I agree that this might hide bugs, but I don't agree that the additional trouble is bigger than the additional payoffs. If I would use -release now, I would use it after the change and vice-versa.

On 08/02/14 14:54, Tobias Pankrath via Digitalmars-d wrote:
> I agree that this might hide bugs, but I don't agree that the additional trouble is bigger than the additional payoffs.

The bug was _introduced_ by the assert, the code was 100% correct.

Imagine working on a project with dozen+ developers that use asserts
extensively ("it never hurts to have more assertions"). If one of them
makes a simple mistake or forgets to update an assert expression
somewhere, your own perfectly fine and safe code becomes buggy and
exploitable. If you're lucky the problem will be found in testing, but
that's far from certain.

We use high level languages to (aot) protect ourselves from our own
mistakes. Just because I can write:

   S* p = 0x12345678;

does not mean that the compiler has to accept it.

artur

On Saturday, 2 August 2014 at 13:21:07 UTC, Artur Skawina via Digitalmars-d wrote:
> On 08/02/14 14:54, Tobias Pankrath via Digitalmars-d wrote:
>> I agree that this might hide bugs, but I don't agree that the additional trouble is bigger than the additional payoffs.
>
> The bug was _introduced_ by the assert, the code was 100% correct.

If an assert fails, it's a bug in my book.

> Imagine working on a project with dozen+ developers that use asserts
> extensively ("it never hurts to have more assertions"). If one of them
> makes a simple mistake or forgets to update an assert expression
> somewhere, your own perfectly fine and safe code becomes buggy and
> exploitable.

If there is a wrong assert in the code, it's not perfectly fine.

To fail to update some if condition somewhere and to corrupt memory or to forget to fix an assert somewhere and to corrupt memory, are both bugs that will happen with the same likelihood.

The first will get you with disabled bound checks, the latter might get you with this optimization. I just don't see how I would take a stand, where I care about one but not about the other.

On 08/02/14 15:32, Tobias Pankrath via Digitalmars-d wrote:
> On Saturday, 2 August 2014 at 13:21:07 UTC, Artur Skawina via Digitalmars-d wrote:
>> On 08/02/14 14:54, Tobias Pankrath via Digitalmars-d wrote:
> If there is a wrong assert in the code, it's not perfectly fine.

The code is perfectly fine in isolation. The bug has leaked from some other subsystem or library. When you look at or audit this code, everything seems fine and there appears to be no problem.

Of course such a program is buggy. This is about a) how easy it is to get to the buggy state; b) how hard it is to identify and find the bug; c) the impact of such a bug. `assume` introduces _user-defined_ conditions that trigger UB.

> To fail to update some if condition somewhere and to corrupt memory or to forget to fix an assert somewhere and to corrupt memory, are both bugs that will happen with the same likelihood.
>
> The first will get you with disabled bound checks, the latter might get you with this optimization.

@safe was supposed to protect from that.

artur

On Saturday, 2 August 2014 at 03:07:25 UTC, Tofu Ninja wrote:
> D3 anyone? :)

If "assert()" is turned into "assume()" then D will go down as the buggiest compiler in history. No point in continuing with D* then. Then again, there are plenty of other letters. E, F, G, H…

On 8/2/14, 5:44 AM, Artur Skawina via Digitalmars-d wrote:
> auto fx(ubyte* p, size_t len) @safe {
>    assert_(len>0);
>    if (len>=1)
>       return p[0];
>    return -1;
> }

As an aside I think it's a bug that this function passes @safe. It should not be able to safely dereference the pointer because it may be e.g. just past the end of the array. Has this been submitted as a bug? -- Andrei

On 08/02/2014 05:08 PM, Andrei Alexandrescu wrote:
> On 8/2/14, 5:44 AM, Artur Skawina via Digitalmars-d wrote:
>> auto fx(ubyte* p, size_t len) @safe {
>>    assert_(len>0);
>>    if (len>=1)
>>       return p[0];
>>    return -1;
>> }
>
> As an aside I think it's a bug that this function passes @safe. It
> should not be able to safely dereference the pointer because it may be
> e.g. just past the end of the array. Has this been submitted as a bug?
> -- Andrei

So far I have been under the impression that dereferencing pointers in @safe is intended to be ok, but creating pointers to inexistent data is intended to be un-@safe.
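
The "just past the end" case can be illustrated in C, where forming a one-past-the-end pointer is legal but dereferencing it is undefined. The sketch below is an illustration only, not how D's @safe is implemented: `byte_slice`, `slice_at` and `slice_end` are hypothetical names. It shows the usual way to keep the dereference checkable, which is to carry the length alongside the pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slice type: a pointer that travels with its length. */
struct byte_slice {
    const unsigned char *ptr;
    size_t len;
};

/* Checked access: returns the byte at index i, or -1 if i is out of range. */
int slice_at(struct byte_slice s, size_t i) {
    assert(s.ptr != NULL);           /* invariant of this sketch */
    return i < s.len ? s.ptr[i] : -1;
}

/* One-past-the-end pointer: legal to form and compare, not to dereference. */
const unsigned char *slice_end(struct byte_slice s) {
    assert(s.ptr != NULL);
    return s.ptr + s.len;
}
```

A bare pointer such as the `p` in `fx` carries no such length, which is why nothing in the function itself can tell a valid element from the one-past-the-end position.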
