Thread overview
Mesh: a new approach to memory compaction for C/C++
Sep 17, 2019
Per Nordlöw
Sep 17, 2019
Stefan Koch
Sep 22, 2019
Emery Berger
Sep 22, 2019
Stefan Koch
Sep 18, 2019
Gregor Mückl
Sep 18, 2019
Stefan Koch
Sep 22, 2019
Emery Berger
Sep 24, 2019
Gregor Mückl
Nov 11, 2019
zoujiaqing
September 17, 2019
strangeloop talk here:

https://www.youtube.com/watch?v=c1UBJbfR-H0

and github repo here:

https://github.com/plasma-umass/Mesh

Sounds to me like making D's GC precise by default to enable memory compaction is worth pursuing.
September 17, 2019
On Tuesday, 17 September 2019 at 11:42:51 UTC, Per Nordlöw wrote:
> strangeloop talk here:
>
> https://www.youtube.com/watch?v=c1UBJbfR-H0
>
> and github repo here:
>
> https://github.com/plasma-umass/Mesh
>
> Sounds to me like making D's GC precise by default to enable memory compaction is worth pursuing.

I've just skimmed the paper.
It seems that all this does is to defragment the memory at runtime.
Which it does at regular intervals unless a steady-state heuristic stops mesh from doing that.
It also seems this allocation scheme only works for small to medium size allocations done in large quantities.
And therefore would not work if the application does allocation in large chunks itself (as is the case for dmd by default).

I am going to look into this a bit further.
September 18, 2019
On Tuesday, 17 September 2019 at 11:42:51 UTC, Per Nordlöw wrote:
> strangeloop talk here:
>
> https://www.youtube.com/watch?v=c1UBJbfR-H0
>
> and github repo here:
>
> https://github.com/plasma-umass/Mesh
>
> Sounds to me like making D's GC precise by default to enable memory compaction is worth pursuing.

I'm am thinking about the implications for debugging. Mapping the same page to two virtual memory addresses on the heap creates weird situations where writing over the bounds of one object doesn't trash allocations with nearby memory addresses, but can affect seemingly far away allocations that ended up randomly aliased to the same page. Even worse, Mesh randomizes allocations within pages to get more opportunities to alias pages. This means that some heap-smashing bug will likely present itself differently on each run. Have fun finding the root cause of that :/.
September 18, 2019
On Wednesday, 18 September 2019 at 09:31:07 UTC, Gregor Mückl wrote:
> On Tuesday, 17 September 2019 at 11:42:51 UTC, Per Nordlöw wrote:
>> strangeloop talk here:
>>
>> https://www.youtube.com/watch?v=c1UBJbfR-H0
>>
>> and github repo here:
>>
>> https://github.com/plasma-umass/Mesh
>>
>> Sounds to me like making D's GC precise by default to enable memory compaction is worth pursuing.
>
> I'm am thinking about the implications for debugging. Mapping the same page to two virtual memory addresses on the heap creates weird situations where writing over the bounds of one object doesn't trash allocations with nearby memory addresses, but can affect seemingly far away allocations that ended up randomly aliased to the same page. Even worse, Mesh randomizes allocations within pages to get more opportunities to alias pages. This means that some heap-smashing bug will likely present itself differently on each run. Have fun finding the root cause of that :/.

Good point. I hadn't thought about that.
All the additional OS calls to copy stuff into different physical
pages will also most likely also not be great for perf.
If the allocator increases the runtime of perlbench by 4%
that's acutally quite significant as I can't imagine perlbench runs fast.
September 22, 2019
On Tuesday, 17 September 2019 at 14:06:06 UTC, Stefan Koch wrote:
> On Tuesday, 17 September 2019 at 11:42:51 UTC, Per Nordlöw wrote:
>> strangeloop talk here:
>>
>> https://www.youtube.com/watch?v=c1UBJbfR-H0
>>
>> and github repo here:
>>
>> https://github.com/plasma-umass/Mesh
>>
>> Sounds to me like making D's GC precise by default to enable memory compaction is worth pursuing.
>

(Mesh co-author here)

FWIW, Mesh eliminates the need for precise garbage collection.

> I've just skimmed the paper.
> It seems that all this does is to defragment the memory at runtime.

"all this does"?? That's a lot! :)

Mesh is the first system to automatically perform defragmentation for C-like languages.

-- emery

PS We'd love to see it adopted by D, and we welcome comments and pull requests!





September 22, 2019
On Wednesday, 18 September 2019 at 09:31:07 UTC, Gregor Mückl wrote:
> On Tuesday, 17 September 2019 at 11:42:51 UTC, Per Nordlöw wrote:
>> strangeloop talk here:
>>
>> https://www.youtube.com/watch?v=c1UBJbfR-H0
>>
>> and github repo here:
>>
>> https://github.com/plasma-umass/Mesh
>>
>> Sounds to me like making D's GC precise by default to enable memory compaction is worth pursuing.
>
> I'm am thinking about the implications for debugging. Mapping the same page to two virtual memory addresses on the heap creates weird situations where writing over the bounds of one object doesn't trash allocations with nearby memory addresses, but can affect seemingly far away allocations that ended up randomly aliased to the same page. Even worse, Mesh randomizes allocations within pages to get more opportunities to alias pages. This means that some heap-smashing bug will likely present itself differently on each run. Have fun finding the root cause of that :/.

You can always seed the random number generator with a fixed value to get reproducible runs.

More interestingly, it's also possible to exploit an Exterminator-like approach  (see https://people.cs.umass.edu/~emery/pubs/p87-novark.pdf) to automatically find the source of such bugs! This is specifically enabled by randomization. We implemented Exterminator for a previous randomizing allocator (DieHard: code here, including Exterminator -- https://github.com/plasma-umass/DieHard), and there's no technical barrier to doing the same for Mesh.

-- emery (Mesh co-author)


September 22, 2019
On Sunday, 22 September 2019 at 16:12:13 UTC, Emery Berger wrote:
> On Tuesday, 17 September 2019 at 14:06:06 UTC, Stefan Koch wrote:
>> On Tuesday, 17 September 2019 at 11:42:51 UTC, Per Nordlöw wrote:
>>> strangeloop talk here:
>>>
>>> https://www.youtube.com/watch?v=c1UBJbfR-H0
>>>
>>> and github repo here:
>>>
>>> https://github.com/plasma-umass/Mesh
>>>
>>> Sounds to me like making D's GC precise by default to enable memory compaction is worth pursuing.
>>
>
> (Mesh co-author here)
>
> FWIW, Mesh eliminates the need for precise garbage collection.
>
>> I've just skimmed the paper.
>> It seems that all this does is to defragment the memory at runtime.
>
> "all this does"?? That's a lot! :)
>
> Mesh is the first system to automatically perform defragmentation for C-like languages.
>
> -- emery
>
> PS We'd love to see it adopted by D, and we welcome comments and pull requests!

Hi Emery,

I didn't mean to sound patronizing. :-/

The Paper is well written and easy to understand.
I wish more papers were like yours.

Cheers
September 23, 2019
On 9/22/19 12:12 PM, Emery Berger wrote:
> On Tuesday, 17 September 2019 at 14:06:06 UTC, Stefan Koch wrote:
>> On Tuesday, 17 September 2019 at 11:42:51 UTC, Per Nordlöw wrote:
>>> strangeloop talk here:
>>>
>>> https://www.youtube.com/watch?v=c1UBJbfR-H0
>>>
>>> and github repo here:
>>>
>>> https://github.com/plasma-umass/Mesh
>>>
>>> Sounds to me like making D's GC precise by default to enable memory compaction is worth pursuing.
>>
> 
> (Mesh co-author here)
> 
> FWIW, Mesh eliminates the need for precise garbage collection.
> 
>> I've just skimmed the paper.
>> It seems that all this does is to defragment the memory at runtime.
> 
> "all this does"?? That's a lot! :)
> 
> Mesh is the first system to automatically perform defragmentation for C-like languages.
> 
> -- emery
> 
> PS We'd love to see it adopted by D, and we welcome comments and pull requests!

Thanks, Emery. I'm cc'ing my students with this and will make the point to consider this a possible project for our upcoming students. Also, we should put this on the ideas page for GSoC and SAoC.
September 24, 2019
On Sunday, 22 September 2019 at 16:15:08 UTC, Emery Berger wrote:
> On Wednesday, 18 September 2019 at 09:31:07 UTC, Gregor Mückl wrote:
>> On Tuesday, 17 September 2019 at 11:42:51 UTC, Per Nordlöw wrote:
>>> strangeloop talk here:
>>>
>>> https://www.youtube.com/watch?v=c1UBJbfR-H0
>>>
>>> and github repo here:
>>>
>>> https://github.com/plasma-umass/Mesh
>>>
>>> Sounds to me like making D's GC precise by default to enable memory compaction is worth pursuing.
>>
>> I'm am thinking about the implications for debugging. Mapping the same page to two virtual memory addresses on the heap creates weird situations where writing over the bounds of one object doesn't trash allocations with nearby memory addresses, but can affect seemingly far away allocations that ended up randomly aliased to the same page. Even worse, Mesh randomizes allocations within pages to get more opportunities to alias pages. This means that some heap-smashing bug will likely present itself differently on each run. Have fun finding the root cause of that :/.
>
> You can always seed the random number generator with a fixed value to get reproducible runs.
>
> More interestingly, it's also possible to exploit an Exterminator-like approach  (see https://people.cs.umass.edu/~emery/pubs/p87-novark.pdf) to automatically find the source of such bugs! This is specifically enabled by randomization. We implemented Exterminator for a previous randomizing allocator (DieHard: code here, including Exterminator -- https://github.com/plasma-umass/DieHard), and there's no technical barrier to doing the same for Mesh.
>
> -- emery (Mesh co-author)

I've recently had a cursory look into some of the work of your group. You've got some wonderfully crazy ideas and it's nice to see the proofs of concept.

I'm not sure I see the practical value of Exterminator for application developers. As I understand it, MS included a derivative of DieHard on Windows to be able to cheat misbehaving applications into crashing less often. As I see it, this is a pretty desperate workaround for when you can't fix the actual bug directly.

I am really curious about one thing: what can Exterminator do that valgrind cannot when I'm debugging a program locally? For the sake of discussion, let's assume that valgrind gets full heap allocation information from the GC (which currently isn't the case for the D garbage collector as I understand it; other GCs like moarvm's have that integration).

Also, to put my foot squarely into my own mouth: I have a fair bit of reservation regarding probabilistic algorithms on a system level. I generally prefer the underlying systems I work with to be as deterministic as possible. I also develop Monte Carlo simulations for a living... ;)
November 11, 2019
On Tuesday, 17 September 2019 at 11:42:51 UTC, Per Nordlöw wrote:
> strangeloop talk here:
>
> https://www.youtube.com/watch?v=c1UBJbfR-H0
>
> and github repo here:
>
> https://github.com/plasma-umass/Mesh
>
> Sounds to me like making D's GC precise by default to enable memory compaction is worth pursuing.

I think, performance is often more important than momory compaction :)