Hi Luis,
Thanks for spending some time on it.
Here some inline remarks from my side.
On Friday, 28 March 2025 at 00:35:50 UTC, Luís Marques wrote:
>
- The main issue: The distributed
libdruntime-ldc-shared.so
appears to be compiled without version=SupportSanitizers
, so the fake stack scanning isn't enabled.
This is intentional, because there is some overhead associated with it. If this is the main issue then that's "OK" and has always been the case. Perhaps FakeStack is now enabled by default. We should either disable that default, or start shipping including the small overhead in druntime.
>
- Fake stack support should work when properly enabled: If you build LDC with
-DRT_SUPPORT_SANITIZERS=True
, the fake stack is scanned as expected.
Great.
> From what I can see (by checking for the proper GC scan calls and addresses), everything is being done correctly. The test tests/sanitizers/asan_fakestack_GC.d
still fails, but not because of broken fake stack scanning—instead, the test itself is broken. An assertion fails in test_non_null_does_not_trigger_collection
because memory from a previous iteration is being freed, not because it's freeing memory incorrectly for the current iteration.
The idea of the test is to test that enabling fakestack does not change collection behavior. The test is clunky, I'll admit. If it is indeed failing already on the fakestack-disabled case, then it needs to be adjusted for sure.
>
- Redundant function calls: The runtime function
scanAllTypeImpl
calls scanStackForASanFakeStack
multiple times for the same fake stack frame, using identical memory ranges each time. While technically correct, this is unnecessary and inefficient. Based on the comment "This is the pointer we are catching with"
, it seems there's a misunderstanding regarding how the fake stack should be located.
I'm not sure how this can be related to specifically fakestack implementation issue. If you mean that scanAllTypeImpl
calls scanStackForASanFakeStack
multiple times, then that is not a problem of our fakestack support code, but of scanAllTypeImpl
itself because it would also call the normal scan
method multiple times with same memory range. That would point towards an issue with duplicates in the ThreadBase.sm_cbeg
list.
Is that what you meant? (it is literally what you typed, but the sentences following it do not make sense with that interpretation...)
LLVM does not give us many options to find the fake stack. At the time I think I used the only public API method available, which indeed results in multiple times getting the same fakestack. We can poke into ASan internals but I am afraid it will quickly bitrot/break. Perhaps by now ASan's public interface has been extended, otherwise we should ask for it (or submit PRs to LLVM)...
I'm quite sure we are doing the right thing of locating the fakestack, it's done similarly here: https://www.aovivo.ilutas.com.br/node-v17.9.1/deps/v8/src/heap/base/stack.cc
>
- Misleading comment in the test: The comment
// Large enough so it will be on fakestack heap (not inlined in local stack frame)
in asan_fakestack_GC.d
appears to be incorrect. The size of the allocation shouldn't determine this—if its address escapes, it must go into the heap-allocated fake stack regardless.
I vaguely remember there being some optimization with small memory sizes, but I cannot find it and cannot trigger it anymore. I think it may be related to this comment of mine: https://johanengelen.github.io/posts/2017-12-25-ldc-and-addresssanitizer/#fn:4
Hmm... I no longer remember. Regardless, you are right that to detect any stackmemory use after return, that memory cannot sit on the regular processor stack.
cheers,
Johan