Thread overview
(DMD) problem with closures/delegate literals and structs?
Dec 07, 2013
Johannes Pfau
Dec 07, 2013
Iain Buclaw
Dec 07, 2013
Iain Buclaw
Dec 07, 2013
H. S. Teoh
Dec 07, 2013
Johannes Pfau
Dec 07, 2013
H. S. Teoh
Dec 08, 2013
Iain Buclaw
Dec 08, 2013
H. S. Teoh
December 07, 2013
I found that ARM bug I said I found when testing dub on ARM.

Turns out I was pretty stupid not testing this on X86 first, cause it's not ARM specific.... Debugging on a fast x86 machine would have been much more fun. Anyway, GDC right now can't compile dub:

gdc -o bin/dub -fversion=DubUseCurl  -I source
-lcurl @build-files.txt
bin/dub fetch vibe-d
Fetching vibe-d 0.7.18...
[1]    4139 segmentation fault (core dumped)  bin/dub fetch vibe-d

So I debugged this stuff and it's a stack corruption. Have a look at
this example:
http://dpaste.dzfl.pl/433c0a3d

Please note that the this reference in the delegate points to the stack. Of course copying the struct doesn't magically change the address, so it still refers the old data.

It looks like we actually generate a closure here which contains the this pointer instead of directly using the struct as a context pointer. That is probably an optimization bug in dmd, but it doesn't matter in this case as the problem would exist for closures and normal delegates.

I'm wondering if this is according to the spec, but I guess it is. If we have a struct and take the address of a member funtion, the context pointer is a pointer to the struct (probably on stack) as well. So if we're in a member function and we have a delegate literal which accesses this it seems correct that we have a pointer to the struct (probably on stack). It can however be confusing:

struct S
{
    int a;
    void delegate() getDelegate(int b)
    {
         //b is in a closure and safe to access
         //but a is still accessed via this->a where
         //this may point to the stack.
         return () { return a + b; };
    }
}

So I guess I'm asking: Is this correct D behavior?


BTW: The reason why we see this bug in gdc and not in dmd is simple: Because of GDC bug 52 NRVO doesn't work in gdc as in dmd. In dmd the address of the returned variable doesn't change. Dub returns HTTP structs by value. As long as the address doesn't change there's no issue, so DMDs NRVO hides this bug.

However, as I now know how to break it it's easy to also break it on
DMD ;-)
http://dpaste.dzfl.pl/c109c2fd

If the currect compiler behavior is correct - and I think it is - then std.net.curl is a ticking time bomb and needs to be fixed ASAP.
December 07, 2013
On Dec 7, 2013 9:10 AM, "Johannes Pfau" <nospam@example.com> wrote:
>
> I found that ARM bug I said I found when testing dub on ARM.
>
> Turns out I was pretty stupid not testing this on X86 first, cause it's not ARM specific.... Debugging on a fast x86 machine would have been much more fun. Anyway, GDC right now can't compile dub:
>
> gdc -o bin/dub -fversion=DubUseCurl  -I source
> -lcurl @build-files.txt
> bin/dub fetch vibe-d
> Fetching vibe-d 0.7.18...
> [1]    4139 segmentation fault (core dumped)  bin/dub fetch vibe-d
>
> So I debugged this stuff and it's a stack corruption. Have a look at
> this example:
> http://dpaste.dzfl.pl/433c0a3d
>
> Please note that the this reference in the delegate points to the stack. Of course copying the struct doesn't magically change the address, so it still refers the old data.
>
> It looks like we actually generate a closure here which contains the this pointer instead of directly using the struct as a context pointer. That is probably an optimization bug in dmd, but it doesn't matter in this case as the problem would exist for closures and normal delegates.
>
> I'm wondering if this is according to the spec, but I guess it is. If we have a struct and take the address of a member funtion, the context pointer is a pointer to the struct (probably on stack) as well. So if we're in a member function and we have a delegate literal which accesses this it seems correct that we have a pointer to the struct (probably on stack). It can however be confusing:
>
> struct S
> {
>     int a;
>     void delegate() getDelegate(int b)
>     {
>          //b is in a closure and safe to access
>          //but a is still accessed via this->a where
>          //this may point to the stack.
>          return () { return a + b; };
>     }
> }
>
> So I guess I'm asking: Is this correct D behavior?
>
>
> BTW: The reason why we see this bug in gdc and not in dmd is simple: Because of GDC bug 52 NRVO doesn't work in gdc as in dmd. In dmd the address of the returned variable doesn't change. Dub returns HTTP structs by value. As long as the address doesn't change there's no issue, so DMDs NRVO hides this bug.
>
> However, as I now know how to break it it's easy to also break it on
> DMD ;-)
> http://dpaste.dzfl.pl/c109c2fd
>
> If the currect compiler behavior is correct - and I think it is - then std.net.curl is a ticking time bomb and needs to be fixed ASAP.

Ouch. I remember a similar conversation with David when he was writing std.parallelism (the bug he ran into was code that assumed left to right evaluation).  I should hope that the author of std.net.curl didn't write the module relying on this behavior.

Regards
-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';


December 07, 2013
On Dec 7, 2013 2:45 PM, "Iain Buclaw" <ibuclaw@gdcproject.org> wrote:
>
> Ouch. I remember a similar conversation with David when he was writing
std.parallelism (the bug he ran into was code that assumed left to right evaluation).  I should hope that the author of std.net.curl didn't write the module relying on this behavior.
>

Didn't intentionally write. :)

Regards
-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';


December 07, 2013
On Sat, Dec 07, 2013 at 10:08:25AM +0100, Johannes Pfau wrote: [...]
> gdc -o bin/dub -fversion=DubUseCurl  -I source
> -lcurl @build-files.txt
> bin/dub fetch vibe-d
> Fetching vibe-d 0.7.18...
> [1]    4139 segmentation fault (core dumped)  bin/dub fetch vibe-d
> 
> So I debugged this stuff and it's a stack corruption. Have a look at
> this example:
> http://dpaste.dzfl.pl/433c0a3d
> 
> Please note that the this reference in the delegate points to the stack. Of course copying the struct doesn't magically change the address, so it still refers the old data.

Yeah, this is a pretty nasty gotcha in the way D implements structs. Adam Ruppe & I ran into this problem in ConsoleD.terminal: his Terminal struct registers a bunch of dtors that are invoked when the struct goes out of scope, and initially, they were delegates that close over struct member variables. Bad idea. When the struct was returned to the caller, it was copied to a new location (perfectly valid -- structs are value types). This left the old copy of the struct on the stack, so the delegates still worked, but as soon as you reuse that portion of the stack for anything else, the dtor would segfault because the variables it closed over are now garbage.

Moral of the story: don't close over struct members in delegates if you're returning the struct to the caller (or expect the delegate to get invoke while passing the struct to a callee -- since the callee gets a *copy* of the struct, not the original). Here be dragons. :P


> It looks like we actually generate a closure here which contains the this pointer instead of directly using the struct as a context pointer.  That is probably an optimization bug in dmd, but it doesn't matter in this case as the problem would exist for closures and normal delegates.

The problem is that delegates are never passed the this pointer from the caller; it is assumed that it's part of their context. You never write `myStruct.dg(args)`, but simply `dg(args)`. But by the time the delegate is invoked, you can't guarantee that `this` is still valid. Only the caller knows what copy of the struct it still holds, but this isn't communicated to the delegate. (On second thought, even if it *did* pass the updated `this` to the delegate, it would still be wrong, because there could be multiple copies of the struct by then, and who knows which copy the delegate was supposed to be operating on?)


> I'm wondering if this is according to the spec, but I guess it is. If we have a struct and take the address of a member funtion, the context pointer is a pointer to the struct (probably on stack) as well. So if we're in a member function and we have a delegate literal which accesses this it seems correct that we have a pointer to the struct (probably on stack). It can however be confusing:
> 
> struct S
> {
>     int a;
>     void delegate() getDelegate(int b)
>     {
>          //b is in a closure and safe to access
>          //but a is still accessed via this->a where
>          //this may point to the stack.
>          return () { return a + b; };
>     }
> }
> 
> So I guess I'm asking: Is this correct D behavior?
[...]

I don't know if it's *correct*, but that's how it works currently. It's certainly a very nasty pitfall for those not in the know, though.

The major issue here is that normally, if your delegate closes over a local variable, the compiler will automatically allocate the local variable on the heap rather than the stack. This prevents crashes caused by referencing out-of-scope local variables. But this is not done for struct members -- it can't be, since the only way is to make struct members pointers to the heap instead of embedded in the struct. Since structs are freely copied around, you can't guarantee that the closed-over member variable is the correct copy -- or even still exists -- by the time the delegate is called.

I'm almost tempted to say that it should be illegal for a delegate to close over struct members, but that seems unnecessarily restrictive (there are cases where you might want to do this, and can do it in a safe way). I almost feel like the compiler should complain about escaping references to local variables in this case, though.


T

-- 
In a world without fences, who needs Windows and Gates? -- Christian Surchi
December 07, 2013
Am Sat, 7 Dec 2013 08:43:53 -0800
schrieb "H. S. Teoh" <hsteoh@quickfur.ath.cx>:

> On Sat, Dec 07, 2013 at 10:08:25AM +0100, Johannes Pfau wrote: [...]
> > It looks like we actually generate a closure here which contains the this pointer instead of directly using the struct as a context pointer.  That is probably an optimization bug in dmd, but it doesn't matter in this case as the problem would exist for closures and normal delegates.
> 
> The problem is that delegates are never passed the this pointer from the caller; it is assumed that it's part of their context. You never write `myStruct.dg(args)`, but simply `dg(args)`. But by the time the delegate is invoked, you can't guarantee that `this` is still valid. Only the caller knows what copy of the struct it still holds, but this isn't communicated to the delegate. (On second thought, even if it *did* pass the updated `this` to the delegate, it would still be wrong, because there could be multiple copies of the struct by then, and who knows which copy the delegate was supposed to be operating on?)

True, but that's not what I meant ;-)
I always get the terminology related to closures wrong so sorry if that
didn't make sense.

What I meant is this: In the first example I posted,
http://dpaste.dzfl.pl/433c0a3d
the delegate does not access _function variables_. It only accesses the
this pointer. So there's no need for it to be a real closure and
allocate memory, it could instead be a normal delegate with the context
pointer simply set to the this pointer at the moment the delegate is
created.
December 07, 2013
On Sat, Dec 07, 2013 at 07:32:46PM +0100, Johannes Pfau wrote:
> Am Sat, 7 Dec 2013 08:43:53 -0800
> schrieb "H. S. Teoh" <hsteoh@quickfur.ath.cx>:
> 
> > On Sat, Dec 07, 2013 at 10:08:25AM +0100, Johannes Pfau wrote: [...]
> > > It looks like we actually generate a closure here which contains the this pointer instead of directly using the struct as a context pointer.  That is probably an optimization bug in dmd, but it doesn't matter in this case as the problem would exist for closures and normal delegates.
> > 
> > The problem is that delegates are never passed the this pointer from the caller; it is assumed that it's part of their context. You never write `myStruct.dg(args)`, but simply `dg(args)`. But by the time the delegate is invoked, you can't guarantee that `this` is still valid.  Only the caller knows what copy of the struct it still holds, but this isn't communicated to the delegate. (On second thought, even if it *did* pass the updated `this` to the delegate, it would still be wrong, because there could be multiple copies of the struct by then, and who knows which copy the delegate was supposed to be operating on?)
> 
> True, but that's not what I meant ;-)
> I always get the terminology related to closures wrong so sorry if
> that didn't make sense.
> 
> What I meant is this: In the first example I posted,
> http://dpaste.dzfl.pl/433c0a3d
> the delegate does not access _function variables_. It only accesses
> the this pointer. So there's no need for it to be a real closure and
> allocate memory, it could instead be a normal delegate with the
> context pointer simply set to the this pointer at the moment the
> delegate is created.

The problem is that when `this` is closed over, it's one thing, but when the delegate is invoked later, `this` may no longer be valid. For example:

	struct S {
		int x = 123; // canary
		void delegate() getDelegate() {
			// Closes over `this`
			return () => x;
		}
	}

	void delegate() dg;

	S makeStruct() {
		S s;
		dg = s.getDelegate();
		return s;
	}

	void main() {
		auto s = makeStruct();
		dg(); // <-- NG
	}


T

-- 
Indifference will certainly be the downfall of mankind, but who cares? -- Miquel van Smoorenburg
December 08, 2013
On 7 December 2013 20:30, H. S. Teoh <hsteoh@quickfur.ath.cx> wrote:
> On Sat, Dec 07, 2013 at 07:32:46PM +0100, Johannes Pfau wrote:
>> Am Sat, 7 Dec 2013 08:43:53 -0800
>> schrieb "H. S. Teoh" <hsteoh@quickfur.ath.cx>:
>>
>> > On Sat, Dec 07, 2013 at 10:08:25AM +0100, Johannes Pfau wrote: [...]
>> > > It looks like we actually generate a closure here which contains the this pointer instead of directly using the struct as a context pointer.  That is probably an optimization bug in dmd, but it doesn't matter in this case as the problem would exist for closures and normal delegates.
>> >
>> > The problem is that delegates are never passed the this pointer from the caller; it is assumed that it's part of their context. You never write `myStruct.dg(args)`, but simply `dg(args)`. But by the time the delegate is invoked, you can't guarantee that `this` is still valid.  Only the caller knows what copy of the struct it still holds, but this isn't communicated to the delegate. (On second thought, even if it *did* pass the updated `this` to the delegate, it would still be wrong, because there could be multiple copies of the struct by then, and who knows which copy the delegate was supposed to be operating on?)
>>
>> True, but that's not what I meant ;-)
>> I always get the terminology related to closures wrong so sorry if
>> that didn't make sense.
>>
>> What I meant is this: In the first example I posted,
>> http://dpaste.dzfl.pl/433c0a3d
>> the delegate does not access _function variables_. It only accesses
>> the this pointer. So there's no need for it to be a real closure and
>> allocate memory, it could instead be a normal delegate with the
>> context pointer simply set to the this pointer at the moment the
>> delegate is created.
>
> The problem is that when `this` is closed over, it's one thing, but when the delegate is invoked later, `this` may no longer be valid. For example:
>
>         struct S {
>                 int x = 123; // canary
>                 void delegate() getDelegate() {
>                         // Closes over `this`
>                         return () => x;
>                 }
>         }
>
>         void delegate() dg;
>
>         S makeStruct() {
>                 S s;
>                 dg = s.getDelegate();
>                 return s;
>         }
>
>         void main() {
>                 auto s = makeStruct();
>                 dg(); // <-- NG
>         }
>

I feel inclined to make that always make that sort of usage an error. :)
December 08, 2013
On Sun, Dec 08, 2013 at 12:17:02AM +0000, Iain Buclaw wrote:
> On 7 December 2013 20:30, H. S. Teoh <hsteoh@quickfur.ath.cx> wrote:
> > On Sat, Dec 07, 2013 at 07:32:46PM +0100, Johannes Pfau wrote:
[...]
> >> What I meant is this: In the first example I posted,
> >> http://dpaste.dzfl.pl/433c0a3d
> >> the delegate does not access _function variables_. It only accesses
> >> the this pointer. So there's no need for it to be a real closure
> >> and allocate memory, it could instead be a normal delegate with the
> >> context pointer simply set to the this pointer at the moment the
> >> delegate is created.
> >
> > The problem is that when `this` is closed over, it's one thing, but when the delegate is invoked later, `this` may no longer be valid. For example:
> >
> >         struct S {
> >                 int x = 123; // canary
> >                 void delegate() getDelegate() {
> >                         // Closes over `this`
> >                         return () => x;
> >                 }
> >         }
> >
> >         void delegate() dg;
> >
> >         S makeStruct() {
> >                 S s;
> >                 dg = s.getDelegate();
> >                 return s;
> >         }
> >
> >         void main() {
> >                 auto s = makeStruct();
> >                 dg(); // <-- NG
> >         }
> >
> 
> I feel inclined to make that always make that sort of usage an error. :)

I agree. But the problem is that currently the compiler will happily accept this and emit code for it, which will screw up at runtime. So the compiler should be fixed to either reject this outright, or somehow generate code that does the Right Thing.


T

-- 
First Rule of History: History doesn't repeat itself -- historians merely repeat each other.