Thread overview | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
|
July 21, 2017 AddressSanitizer and the GC | ||||
---|---|---|---|---|
| ||||
Hi all, I've added AddressSanitizer [1] support to LDC, and it is able to nicely catch bugs like this one: ``` void foo(int* arr) { arr[10] = 1; } void main() { int[10] a; foo(&a[0]); } ``` It works for malloc'ed memory, but not yet for GC'd memory. Our GC pool memory is malloc'ed, so it is "safe" to access from ASan's current point of view. To make it work with our GC, we have to mark all GC pool memory (malloc'ed) as poisoned, such that access to that memory will trigger an ASan error. Then when a user asks memory from the pool (e.g. `new int`), we should allocate a little more GC'd memory than what the user requests (red zone), and then mark only the user size as unpoisoned. By allocating a little more, we make sure that user allocated memory blocks are separated by at least an amount of red zone that helps detect read/write memory overflows. Pieces of the solution: a. Call __asan_poison_memory_region on all memory internally allocated by the GC b. ptr = GC.alloc(user_request_size+redzone) c. Call __asan_unpoison_memory_region on ptr[0..user_request_size] I've already been toying with several options but would like your opinion: what is the best way for me to add this to the GC? 1. What are the best places to modify the conservative GC with the above pieces? 2. Should I create a new GC type "asan"? We now have "conservative" (default) and "manual". Adding an option to the default GC is made hard because of linking problems (need to link with asan library, but I can define @weak function stubs), but I am worried about slow down in non-asan mode (which is the 99.9999999% use case). Thanks, Johan [1] https://github.com/google/sanitizers/wiki/AddressSanitizer |
July 21, 2017 Re: AddressSanitizer and the GC | ||||
---|---|---|---|---|
| ||||
Posted in reply to Johan Engelen | On 7/21/2017 12:06 PM, Johan Engelen wrote: < [...] Thanks for your work on this. It is important to support such valuable tools. One way making such things pretty much cost-free in the olden days was to trick the linker into writing NOPs over the function calls by using specially crafted fixup records. This worked with dumb linkers, but today's linkers do much more semantic inference into what's going on, I'm afraid this may be too brittle. Unless there is a linker feature to support it? Are some linkers able to NOP out a function call if it sees that the function body consists solely of a RET? Another option is to use a pointer to the asan functions, like: if (asan_fp) (*asan_fp)(args...) The asan_fp can then be set to NULL and there's only the overhead of a compare/branch rather than a function call. Well, also the load of a global variable. |
July 21, 2017 Re: AddressSanitizer and the GC | ||||
---|---|---|---|---|
| ||||
Posted in reply to Johan Engelen | On 7/21/2017 12:06 PM, Johan Engelen wrote:
> I've added AddressSanitizer [1] support to LDC, and it is able to nicely catch bugs like this one:
If anyone is game, it would be cool to do an LLVM compile of DMD so AddressSanitizer can look for bugs in DMD itself.
|
July 21, 2017 Re: AddressSanitizer and the GC | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Friday, 21 July 2017 at 20:56:02 UTC, Walter Bright wrote:
> On 7/21/2017 12:06 PM, Johan Engelen wrote:
>> I've added AddressSanitizer [1] support to LDC, and it is able to nicely catch bugs like this one:
>
> If anyone is game, it would be cool to do an LLVM compile of DMD so AddressSanitizer can look for bugs in DMD itself.
Yep, I've got fuzzing to work too, so... expect nice blogs "soon". ;-)
(btw, the work is because of wanting fuzz testing at Weka, so that's my stress test for this work)
|
July 21, 2017 Re: AddressSanitizer and the GC | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Friday, 21 July 2017 at 20:53:18 UTC, Walter Bright wrote: > On 7/21/2017 12:06 PM, Johan Engelen wrote: > < [...] > > Thanks for your work on this. It is important to support such valuable tools. > > One way making such things pretty much cost-free in the olden days was to trick the linker into writing NOPs over the function calls by using specially crafted fixup records. This worked with dumb linkers, but today's linkers do much more semantic inference into what's going on, I'm afraid this may be too brittle. > > Unless there is a linker feature to support it? Are some linkers able to NOP out a function call if it sees that the function body consists solely of a RET? ARM's toolchain has something that does this: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka14171.html Can't find something like it for other toolchains. > Another option is to use a pointer to the asan functions, like: > > if (asan_fp) > (*asan_fp)(args...) > > The asan_fp can then be set to NULL and there's only the overhead of a compare/branch rather than a function call. Well, also the load of a global variable. I can mark the if-clause as unlikely, so the performance difference would be small (allocation is slow already, of course). Let me try that for now. We can decide later, let's first find an implementation that works and test that the whole idea is valid and I didn't miss anything important. Can you advise me on where best to add the asan calls? Thanks, Johan PS. The interface is easy, I don't need much info at the call sites: // Poisons memory region [addr, addr+size) void __asan_poison_memory_region(const(void*) addr, size_t size); // Unpoisons memory region [addr, addr+size) void __asan_unpoison_memory_region(const(void*) addr, size_t size); |
July 21, 2017 Re: AddressSanitizer and the GC | ||||
---|---|---|---|---|
| ||||
Posted in reply to Johan Engelen | On Friday, 21 July 2017 at 21:37:39 UTC, Johan Engelen wrote:
> On Friday, 21 July 2017 at 20:53:18 UTC, Walter Bright wrote:
>> On 7/21/2017 12:06 PM, Johan Engelen wrote:
>> < [...]
>>
>> Another option is to use a pointer to the asan functions, like:
>>
>> if (asan_fp)
>> (*asan_fp)(args...)
>>
>> The asan_fp can then be set to NULL and there's only the overhead of a compare/branch rather than a function call. Well, also the load of a global variable.
>
> I can mark the if-clause as unlikely, so the performance difference would be small (allocation is slow already, of course).
Quick extra note: the function pointer solution is not friendly to LTO, whereas a weak-linking solution would easily be optimized-out fully with LTO. So that's something to consider too.
|
July 21, 2017 Re: AddressSanitizer and the GC | ||||
---|---|---|---|---|
| ||||
Posted in reply to Johan Engelen | On 7/21/2017 2:14 PM, Johan Engelen wrote:
> On Friday, 21 July 2017 at 20:56:02 UTC, Walter Bright wrote:
>> If anyone is game, it would be cool to do an LLVM compile of DMD so AddressSanitizer can look for bugs in DMD itself.
>
> Yep, I've got fuzzing to work too, so... expect nice blogs "soon". ;-)
>
> (btw, the work is because of wanting fuzz testing at Weka, so that's my stress test for this work)
Pretty dazz! Please continue!
|
July 21, 2017 Re: AddressSanitizer and the GC | ||||
---|---|---|---|---|
| ||||
Posted in reply to Johan Engelen | On 7/21/2017 2:39 PM, Johan Engelen wrote:
> Quick extra note: the function pointer solution is not friendly to LTO, whereas a weak-linking solution would easily be optimized-out fully with LTO. So that's something to consider too.
This code should be version'd with:
version (AddressSanitizer)
so that:
1. it will be easy to try different methods
2. all the different places it is used will be easy to find
3. if it does make things too slow, it can be easily 'removed' at the option of whoever built druntime
|
July 21, 2017 Re: AddressSanitizer and the GC | ||||
---|---|---|---|---|
| ||||
Posted in reply to Johan Engelen | On 7/21/2017 2:37 PM, Johan Engelen wrote:
> Can you advise me on where best to add the asan calls?
Offhand, I don't know. It's been maybe 15 years since I worked on it, and a lot of people have moved things about since then.
Dmitry Olshansky has worked on it very recently, in fact he is working on it at the moment. He might be a good person to ask and can likely give you a better answer than I can.
|
Copyright © 1999-2021 by the D Language Foundation