dip1000 + pure is a DEADLY COMBO
May 12, 2021

Sorry for the attention-grabbing title, but I think it's warranted, because the gist of it is this:

With -preview=dip1000 enabled, the compiler will happily compile valid, @safe D code into memory corrupting machine code.

The root cause is:
Issue 20150 - -dip1000 defeated by pure

The compiler ignores "reference to local variable x assigned to non-scope parameter y" errors when the function is annotated or inferred pure. The idea is, presumably, that pure functions can't escape references because they have no interaction with global variables. This is false of course, since they can still return them or assign them to other parameters.
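
To make that last point concrete, here is a minimal sketch of two pure functions that touch no globals yet hand their pointer argument back to the caller: once through the return value, once through another parameter. The names passThrough and storeInto are made up for illustration. The example uses a GC-allocated int so that it compiles with or without -preview=dip1000. The commented-out function only shows the shape of code that issue 20150 is about; whether a given compiler version accepts it is not something this sketch asserts.

```d
// Hypothetical helper: pure, no globals, yet the result aliases the argument.
int* passThrough(int* p) @safe pure nothrow @nogc
{
    return p;
}

// Hypothetical helper: pure, returns nothing, yet escapes p through `dest`.
void storeInto(int* p, ref int* dest) @safe pure nothrow @nogc
{
    dest = p;
}

// Shape of the problem case (left commented out on purpose):
// int* dangling() @safe
// {
//     int local = 42;
//     return passThrough(&local); // pointer into a stack frame that is about to die
// }

void main() @safe
{
    int* heap = new int(42); // GC memory, so letting it escape is harmless here

    int* viaReturn = passThrough(heap);
    int* viaParam;
    storeInto(heap, viaParam);

    // Both routes give the caller the very same pointer back.
    assert(viaReturn is heap);
    assert(viaParam is heap);
    assert(*viaParam == 42);
}
```

Neither helper reads or writes a global variable, so "pure" alone says nothing about whether a parameter can outlive the call.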

The deadly part is that, using this flawed logic, the compiler sometimes turns GC allocations into stack allocations too eagerly. Here I got memory corruption because the compiler allocated an array literal on the stack instead of the heap:

Issue 21291 - Array literal that escapes scope is allocated on stack

Later I encountered another instance of it where a closure was not heap-allocated, which looked something like this:

import core.thread;
@safe:
void main() {
    S s;
    s.memberFunc();
}

struct S {
    int a;
    auto memberFunc() {
        auto t = new Thread({
            auto pa = &a; // pointer to stack frame of main!
        });
    }
}

I'm not the only one who has run into memory corruption bugs this way. User Dechcaudron commented on my issue: "This has also happened to me, no idea it could be due to -dip1000".

And most recently:
Issue 21912 - Invalid stack closure when calling delegate inside lambda

Why is this not fixed?

Walter made a PR for fixing the behavior in dmd: (March 2020)
https://github.com/dlang/dmd/pull/10924

Later, aG0aep6G made a better fix: (November 2020)
https://github.com/dlang/dmd/pull/12010

But they're both blocked by the fact that Phobos relies on the bug to compile with -dip1000. This makes sense, because the conversion process was mostly "add scope and return annotations until the compile errors go away". pure functions did not give error messages, so they did not get those annotations.

Regarding this extra work, aG0aep6G commented: (January 2021)

> I had started on it, but it's tedious work tracking down the errors through templates and overloads. If I remember correctly, dup gave me some trouble, too.
>
> So I've put it on ice for the time being. If someone else wants to give it a shot, that would be great.

And that's where we are now.

Future of dip1000

Matthias asked "Is there a plan to enable DIP1000 by default?" during DConf Online 2020 Day One Q & A Livestream, at 4:50:11.
Walter mentioned "we can do it now" and Atila mentioned how the first step would be to change -dip1000 errors into equivalent deprecation warnings.

Clearly, issue 20150 is a blocker for dip1000 by default.

In the meantime, since I absolutely don't want another unfortunate soul debugging memory corruption bugs that dip1000 introduces, this post is meant to raise awareness, and discuss intermediate solutions.

Maybe the compiler can defensively heap-allocate for now, though that would break @nogc code. Or maybe we can add another switch, -preview=dip1000proper, since the fix is a breaking change. What do you think?

May 12, 2021

On Wednesday, 12 May 2021 at 13:14:30 UTC, Dennis wrote:

> With -preview=dip1000 enabled, the compiler will happily compile valid, @safe D code into memory corrupting machine code. [...]

Should it be a bug with Pure rather than dip1000?

-Alex

May 12, 2021

On Wednesday, 12 May 2021 at 14:58:23 UTC, 12345swordy wrote:

> Should it be a bug with Pure rather than dip1000?

Purity is inferred correctly. The problem is that function parameters get the scope storage class for free when the function is strongly pure. In any case, I don't think it makes a difference whether you call it a "bug with pure" or a "bug with dip1000".
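
As a rough illustration of why an implied scope is only sound in some cases, consider the sketch below. It uses made-up names (total, sameSlice) and is not a description of dmd's actual inference rules. Both functions are strongly pure, since their only parameter is immutable. The first returns a plain int, so nothing the caller gets back can alias the argument. The second returns the slice itself, so treating its parameter as scope would be wrong.

```d
// Hypothetical: strongly pure, and the int result cannot alias `data`.
int total(immutable(int)[] data) @safe pure nothrow @nogc
{
    int sum = 0;
    foreach (x; data)
        sum += x;
    return sum;
}

// Hypothetical: strongly pure as well, but the result aliases the argument.
immutable(int)[] sameSlice(immutable(int)[] data) @safe pure nothrow @nogc
{
    return data;
}

void main() @safe
{
    immutable(int)[] numbers = [1, 2, 3];

    assert(total(numbers) == 6);
    // `is` compares pointer and length: the returned slice is the argument itself.
    assert(sameSlice(numbers) is numbers);
}
```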

May 12, 2021

On Wednesday, 12 May 2021 at 15:47:24 UTC, Dennis wrote:

> On Wednesday, 12 May 2021 at 14:58:23 UTC, 12345swordy wrote:
>> Should it be a bug with Pure rather than dip1000?
>
> Purity is inferred correctly. The problem is that function parameters get the scope storage class for free when the function is strongly pure. In any case, I don't think it makes a difference whether you call it a "bug with pure" or a "bug with dip1000".

Can phobos be rewritten, such that it doesn't depend on the bug?

-Alex

May 12, 2021

On Wednesday, 12 May 2021 at 21:20:03 UTC, 12345swordy wrote:

> On Wednesday, 12 May 2021 at 15:47:24 UTC, Dennis wrote:
>> On Wednesday, 12 May 2021 at 14:58:23 UTC, 12345swordy wrote:
>>> Should it be a bug with Pure rather than dip1000?
>>
>> Purity is inferred correctly. The problem is that function parameters get the scope storage class for free when the function is strongly pure. In any case, I don't think it makes a difference whether you call it a "bug with pure" or a "bug with dip1000".
>
> Can phobos be rewritten, such that it doesn't depend on the bug?
>
> -Alex

This was discussed here:

https://github.com/dlang/dmd/pull/12010

Short answer: yes, but it requires a lot of difficult and tedious debugging work, and nobody has volunteered yet.

May 12, 2021

On Wednesday, 12 May 2021 at 22:11:36 UTC, Paul Backus wrote:

> Short answer: yes, but it requires a lot of difficult and tedious debugging work, and nobody has volunteered yet.

... and will probably break code in several other libraries.

May 12, 2021

On Wednesday, 12 May 2021 at 22:51:24 UTC, MoonlightSentinel wrote:

> On Wednesday, 12 May 2021 at 22:11:36 UTC, Paul Backus wrote:
>> Short answer: yes, but it requires a lot of difficult and tedious debugging work, and nobody has volunteered yet.
>
> ... and will probably break code in several other libraries.

If they're using -preview=dip1000, yes. Do preview flags come with a stability guarantee?

May 13, 2021

On Wednesday, 12 May 2021 at 13:14:30 UTC, Dennis wrote:

> With -preview=dip1000 enabled, the compiler will happily compile valid, @safe D code into memory corrupting machine code.

This is indeed horrible. Thanks for bringing it up.

> Maybe the compiler can defensively heap-allocate for now, though that would break @nogc code. Or maybe we can add another switch, -preview=dip1000proper, since the fix is a breaking change. What do you think?

The compiler switch, but with some changes:

  • It would be inverse. Correct behaviour by default.
  • You have to list modules/packages where the buggy behaviour would apply.
  • One can set that flag to core,std to continue using Phobos with -dip1000, while still getting the fix for the user's own code.
  • The error messages caused by the bug fix should clearly point users to instructions for the above.

May 14, 2021

On Wednesday, 12 May 2021 at 22:53:20 UTC, Paul Backus wrote:

> On Wednesday, 12 May 2021 at 22:51:24 UTC, MoonlightSentinel wrote:
>> On Wednesday, 12 May 2021 at 22:11:36 UTC, Paul Backus wrote:
>>> Short answer: yes, but it requires a lot of difficult and tedious debugging work, and nobody has volunteered yet.
>>
>> ... and will probably break code in several other libraries.
>
> If they're using -preview=dip1000, yes. Do preview flags come with a stability guarantee?

Nope, that's the point of a preview flag, it can break from release to release.

Our company uses -preview=in and -checkaction=context (the latter isn't a preview, but it's still very experimental), and we know the tradeoff: you will most likely need to stick to one version of the compiler if you want to keep your sanity.

May 14, 2021

On Wednesday, 12 May 2021 at 13:14:30 UTC, Dennis wrote:

> The idea is, presumably, that pure functions can't escape references because they have no interaction with global variables. This is false of course, since they can still return them or assign them to other parameters.

No, a pure function can access neither module-global nor process-global (__gshared) variables whatsoever, regardless of any scope or return qualifiers on its parameters and return type.

Please show an example that contradicts this statement.

I fail to see the general issue here.
