View mode: basic / threaded / horizontal-split · Log in · Help
June 30, 2006
StackThreads and Phobos
Hello fellow D users!  I've been working diligently on improving StackThreads
(http://assertfalse.com), and so far D has worked wonderfully.  The same can not
be said for Phobos...  (This may get a bit long winded.)

---

1.  removeRange is really slow.

In order to prevent the garbage collector from accidentally deleting valid
objects, we need to mark the stack of each user-context.

One naive solution is to simply mark the entire stack region, however this isn't
a very good idea.

Say I create a user context which calls a function and then yields.  If that
function performed any allocations, then they would remain on the stack, and get
scanned by the GC.  Until I overwrite those values, the GC will always see them,
and therefore never collect their memory. 

This results in memory leaks, and a dramatic loss of performance.  The solution
is to dynamically resize the range used by the stack.  When we switch into a
context, we remove the range, and before we leave, we add it back.  This nicely
solves the problem, except for one thing - removeRange requires O(n) time.

This performance penalty applies to each Context, increasing the cost of
switching dramatically.

A simple solution is to use a hash set or another efficient data structure.
This would also be able to detect range address conflicts, such as adding a
range twice.

As a short term fix, I propose changing line 987 of internal/gcx.d from:

memmove(ranges + i, ranges + i + 1, (nranges - i) * ranges[0].sizeof);

to:

memmove(ranges + i, ranges + nranges, ranges[0].sizeof);

While this does not remove the sluggish linear search, it does eliminate a very
large memory copy.  The cost of this approach is that the ordering of ranges is
not preserved, which may increase look up time in some circumstances.

Another possibility is to allow for 'dynamic' ranges, which the user would be
able to resize.  In this situation, we could still use a linear search for
removal, however changing the range's dimensions is a constant time operation.
Thread safety could be insured via an atomic operation, ie CMPXCHG.


2.  It is impossible to safely handle a page fault.

Another problem with StackThreads right now is stack growth.  In a regular
context, the operating system automatically grows the stack once the program
hits the bottom.  It keeps on doing this until there is no more address
space/memory left, and then it throws an overflow.

In StackThreads, I'd like to do the same thing.  However, DMD transforms any
page faults into exceptions, which it then unwinds up the call stack.  This
makes it impossible to trap, since the exception handler will unwind the stack
until it hits a try block with a matching catch - and only then will it get
handled.  From this state, there is no way to recover the program, since we
cannot undo the unwind.  The only tenable option is a global fault and
termination of the context.

In windows, you could try to install a custom exception handler, and trap any
stack overflows yourself - except DMD automatically installs an SEH for each
try..catch block.  It is pretty much inevitable that a hapless user will eat
your page fault when he is trying to catch some other sort of exception.

Call backs are one solution, user programs can carefully handle page faults and
stack overflows without disturbing the state of the program.


---

There are a few more issues, but these are the only two I have encountered that
I couldn't hack around.

My personal view is that integrated user-context switching would fix most of
those issues, and provide a great deal of flexibility.  Once the GC issues are
resolved, it can be made very fast.  In fact, the only overhead (outside GC
management) is equivalent to a single function call.

Standard user level contexts make all sorts of things possible, like coroutines
or micro threads.  They can be used to iterate over complex data structures
trivially.  They are much simpler and faster than threads for GUIs, and
eliminate the need for many state machine objects.

In conjunction with standard preemptive threading, user-contexts provide an
elegant solution to many practical programming problems.  Standard library
integration will result in a much faster context switching and better future
support.


-Mikola Lysenko
July 02, 2006
Re: StackThreads and Phobos
mclysenk@mtu.edu wrote:
> Hello fellow D users!  I've been working diligently on improving StackThreads
> (http://assertfalse.com), and so far D has worked wonderfully.  The same can not
> be said for Phobos...  (This may get a bit long winded.)
> 
> ---
> 
> 1.  removeRange is really slow.
> 
> In order to prevent the garbage collector from accidentally deleting valid
> objects, we need to mark the stack of each user-context.
> 
> One naive solution is to simply mark the entire stack region, however this isn't
> a very good idea.
> 
> Say I create a user context which calls a function and then yields.  If that
> function performed any allocations, then they would remain on the stack, and get
> scanned by the GC.  Until I overwrite those values, the GC will always see them,
> and therefore never collect their memory. 
> 
> This results in memory leaks, and a dramatic loss of performance.  The solution
> is to dynamically resize the range used by the stack.  When we switch into a
> context, we remove the range, and before we leave, we add it back.  This nicely
> solves the problem, except for one thing - removeRange requires O(n) time.
> 

But isn't that the correct behavior? If one can return to that function 
instance(context), shouldn't all allocations made previously by that 
function remain valid?
(note: I haven't used your library, I just read your post and this 
question came up)
Also: Where do you keep the data from each context? In the heap?


-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
July 02, 2006
Re: StackThreads and Phobos
In article <e88ltq$kvc$3@digitaldaemon.com>, Bruno Medeiros says...
>
>But isn't that the correct behavior? If one can return to that function 
>instance(context), shouldn't all allocations made previously by that 
>function remain valid?
>(note: I haven't used your library, I just read your post and this 
>question came up)
>

It is not correct behavior.  Here's an example which may illustrate this better:

#void func1()
#{
#     func2();
#     yield();
#}
#
#void func2()
#{
#     int[] arr = new int[20];
#}

Here is a trace of the execution of func1 alongside it's stack

# Instruction              |      Stack
#--------------------------+--------------
#  func2()                 | SP> &func1 + offset
#                          |
#  int[] arr = new int[20] |     &func1 + offset
#                          | SP> &arr
#                          |
#  return from func2()     | SP> &func1 + offset
#                          |     &arr             <-- Old unused reference
#                          |
#  yield()                 |     &func1 + offset
#                          |     &arr             <-- Will NEVER get collected
#                          |

In this situation, it is doubtful that arr will ever get collected.  In order
for the GC to remove it, it must not find ANY references to arr in memory.
Period.  If the GC scans the entire stack, it will see a reference to arr, and
therefore it will never collect arr.  This is a trivial example, but it is not
difficult to see that these sorts of memory leaks can become very substantial
and very difficult to track down.


The problem is how to tell the garbage collector not to scan past the stack
pointer.  The only way at the moment is to use addRange and removeRange, or zero
out all of the unused stack memory every time you yield.  Either option has a
substantial overhead.

>
>Also: Where do you keep the data from each context? In the heap?
>
>

Heap.  The next version of stackthreads will replace malloc/free with system
specific allocators, such as mmap and VirtualAlloc.

-Mikola Lysenko
July 03, 2006
Re: StackThreads and Phobos
mclysenk@mtu.edu wrote:
> In article <e88ltq$kvc$3@digitaldaemon.com>, Bruno Medeiros says...
>> But isn't that the correct behavior? If one can return to that function 
>> instance(context), shouldn't all allocations made previously by that 
>> function remain valid?
>> (note: I haven't used your library, I just read your post and this 
>> question came up)
>>
> 
> It is not correct behavior.  Here's an example which may illustrate this better:
> 
> #void func1()
> #{
> #     func2();
> #     yield();
> #}
> #
> #void func2()
> #{
> #     int[] arr = new int[20];
> #}
> 
> Here is a trace of the execution of func1 alongside it's stack
> 
> # Instruction              |      Stack
> #--------------------------+--------------
> #  func2()                 | SP> &func1 + offset
> #                          |
> #  int[] arr = new int[20] |     &func1 + offset
> #                          | SP> &arr
> #                          |
> #  return from func2()     | SP> &func1 + offset
> #                          |     &arr             <-- Old unused reference
> #                          |
> #  yield()                 |     &func1 + offset
> #                          |     &arr             <-- Will NEVER get collected
> #                          |
> 
> In this situation, it is doubtful that arr will ever get collected.  In order
> for the GC to remove it, it must not find ANY references to arr in memory.
> Period.  If the GC scans the entire stack, it will see a reference to arr, and
> therefore it will never collect arr.  This is a trivial example, but it is not
> difficult to see that these sorts of memory leaks can become very substantial
> and very difficult to track down.
> 
> 
> The problem is how to tell the garbage collector not to scan past the stack
> pointer.  The only way at the moment is to use addRange and removeRange, or zero
> out all of the unused stack memory every time you yield.  Either option has a
> substantial overhead.
> 
>> Also: Where do you keep the data from each context? In the heap?
>>
>>
> 
> Heap.  The next version of stackthreads will replace malloc/free with system
> specific allocators, such as mmap and VirtualAlloc.
> 
> -Mikola Lysenko
> 
> 

Unless my understanding of this is wrong, the use of the word "never" 
(gets collected) is too strong. If during normal program flow the stack 
is overwritten again, the reference becomes free. Or am I missing something?
Also, doesn't this happen with any regular function?


-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
July 03, 2006
Re: StackThreads and Phobos
In article <e8artd$16e0$1@digitaldaemon.com>, Bruno Medeiros says...
>
>Unless my understanding of this is wrong, the use of the word "never" 
>(gets collected) is too strong. If during normal program flow the stack 
>is overwritten again, the reference becomes free. Or am I missing something?
>Also, doesn't this happen with any regular function?
>

Well, you are correct, that it may still get collected in the future -
regardless, its prospects are fairly dim.  You are also right in your statement
that this is not abnormal behavior for functions, and the GC knows this.  It has
a special case which handles the stack of a regular function.

When the GC does a full collect, it pauses all threads and marks the range from
the stack pointer to the bottom of the stack.  This prevents it from scanning
stuff which is no longer valid, since that range never gets marked.  This works
very well.

The bad part happens when we create multiple stacks for one thread.  In order
for the GC to properly scan them, we need to mark their stack when it is not in
use.  If the GC didn't scan their stack, then we'd accidentally over-collect
stuff they were using.  If we mark the entire stack range, then we'd
under-collect and get a bunch of memory leaks.  To fix this, we use addRange to
mark the active part of their stack before we switch out their context, and
removeRange to delete their old - and potentially invalid - stack range when we
switch into their context.

This part is tricky, and it burned me many times while I was working on stack
threads before I finally caught on.  It is not a big deal to fix once you know
what's happening.  The main disadvantage at the moment is removeRange - it's
just too slow!  It can be fixed, but it requires changing phobos.  

-Mikola Lysenko
July 03, 2006
Re: StackThreads and Phobos
mclysenk@mtu.edu wrote:
> In article <e8artd$16e0$1@digitaldaemon.com>, Bruno Medeiros says...
>> Unless my understanding of this is wrong, the use of the word "never" 
>> (gets collected) is too strong. If during normal program flow the stack 
>> is overwritten again, the reference becomes free. Or am I missing something?
>> Also, doesn't this happen with any regular function?
>>
> 
> Well, you are correct, that it may still get collected in the future -
> regardless, its prospects are fairly dim.  You are also right in your statement
> that this is not abnormal behavior for functions, and the GC knows this.  It has
> a special case which handles the stack of a regular function.
> 
> When the GC does a full collect, it pauses all threads and marks the range from
> the stack pointer to the bottom of the stack.  This prevents it from scanning
> stuff which is no longer valid, since that range never gets marked.  This works
> very well.
> 
> The bad part happens when we create multiple stacks for one thread.  In order
> for the GC to properly scan them, we need to mark their stack when it is not in
> use.  If the GC didn't scan their stack, then we'd accidentally over-collect
> stuff they were using.  If we mark the entire stack range, then we'd
> under-collect and get a bunch of memory leaks.  To fix this, we use addRange to
> mark the active part of their stack before we switch out their context, and
> removeRange to delete their old - and potentially invalid - stack range when we
> switch into their context.
> 
> This part is tricky, and it burned me many times while I was working on stack
> threads before I finally caught on.  It is not a big deal to fix once you know
> what's happening.  The main disadvantage at the moment is removeRange - it's
> just too slow!  It can be fixed, but it requires changing phobos.  
> 
> -Mikola Lysenko
> 
> 

That was clear now, but for what I understand, then your problem is 
simply that you must use removeRange(and it is slow).
That is, it doesn't really have to do with function contexts and stacks. 
Anyone who allocates manually managed memory for any purpose, and then 
uses addRange on it, will have the same problem, right? (because they 
too will have to use removeRange)

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
July 03, 2006
Re: StackThreads and Phobos
In article <e8baer$1ond$1@digitaldaemon.com>, Bruno Medeiros says...
>
>
>That was clear now, but for what I understand, then your problem is 
>simply that you must use removeRange(and it is slow).
>That is, it doesn't really have to do with function contexts and stacks. 
>Anyone who allocates manually managed memory for any purpose, and then 
>uses addRange on it, will have the same problem, right? (because they 
>too will have to use removeRange)
>


Correct.  This problem will affect everyone who uses addRange/removeRange.

-Mikola Lysenko
Top | Discussion index | About this forum | D home