Thread overview
Functional Muti-threading
Jan 21, 2007
janderson
Jan 21, 2007
Saaa
Jan 22, 2007
janderson
Jan 22, 2007
janderson
Jan 22, 2007
janderson
Jan 22, 2007
Daniel Keep
Jan 22, 2007
janderson
Jan 22, 2007
janderson
Jan 25, 2007
Kevin Bealer
January 21, 2007
Here's one idea about how D could help support multi-threading.  What if you could specify which functions where thread-able.  ie:

threadable int foo(int X)
{
...
}

void bar()
{
   int a = 0;
   int array[3];
   x[0] = foo(a);  //Separate worker thread
   x[1] = foo(a);  //Separate worker thread (or could be main thread)
   x[2] = foo(a);  //Main thread
   //Would sleep here until things are done
}

Say foo was a relativity small function.  The compiler may decide to pack 2 thread operations together for greater efficiency.  The compiler would have the ultimate decision about when something gets placed on another thread.  Whether an application is compiled threaded or not would be a compiler flag (-threaded).

The compiler should also be able to determine functions that can be threaded saftly and add these to the list (kinda like how inline works in C++ -> you can mark something as inline to let the compiler know but it also makes its own decisions).

Consider the for loop:

int array = new int[];
array.length = GetSize();
for (int n=0; n<array.length; ++n)
{
   array[0] = foo(4);
}
//Sleep until finished

In this particular case the compiler knows at runtime what the size is just before the loop starts (and size does not change).  It could divide the work into the number of hardware threads.

Consider the for loop:

int array[] = new int[];
for (int n=0; n<GetSize(); ++n)
{
   array ~= foo(4);
}
//Sleep until finished

In this case the compiler has no-idea about how many times the loop will run.  It would have to hand out the tasks for foo one at a time to whatever worker thread were least busy.

Another thought for the for loop was to do something like:

int array[] = new int[];
Threadable(4) for (int n=0; n<GetSize(); ++n)
{
   array ~= foo(4);
}

Where threadable(4) tells the compiler that the work should be served out in lots of 4.  It could compile 4 versions of of the function (ie one with 4 foos, one with 3, one with 2 and one with 1).

Because these optimisation could get really complex, I think you'd start by just allowing the threadable option and then slowly add optimizations to support it.

The downside to this approach is that you would probably not be able to get the performance a of hand tuned threading application.  But then again, that would take a lot longer to write.

Anyways the point is, the compiler would decide when to do multithreading and what threadable functions can be compounded together at compile time to make for an optimal program.

Just a ruff idea.  I think compiler support for multi-threading in some form would be helpful in D's success.

What do you think?
January 21, 2007
I'd recommend looking at the futurism lib in D.announce and see how much of this is already there ; )


January 22, 2007
Saaa wrote:
> I'd recommend looking at the futurism lib in D.announce and see how much of this is already there ; ) 
> 

The futurism library was one of the things that triggered my memory about this idea.  Well done on a great library!

However I think it could be done better by the compiler in many regards.  Things like futurism would still have their place.

The difference is that in futurism library threading based on the call, not the function itself.  It doesn't determine if something should be threaded or not, users have to decide on a case by case basis.  It doesn't compound threaded functions together. Its more work for programmer to maintain.  Its harder to read.  Its harder to turn on and off.  Placing sleep-until-done-messages would also need to be done by hand.

One thought though, you may be able to get pretty close to that for-loop thing with delegates and templates (for compounding).  Although if you had 2 loops in a row (or a loop and threaded function) it wouldn't be able to figure out that these both can be running at once.

-Joel
January 22, 2007
janderson wrote:
> Saaa wrote:
>> I'd recommend looking at the futurism lib in D.announce and see how much of this is already there ; )
> 
> The futurism library was one of the things that triggered my memory about this idea.  Well done on a great library!
> 
> However I think it could be done better by the compiler in many regards.  Things like futurism would still have their place.
> 
> The difference is that in futurism library threading based on the call, not the function itself.  It doesn't determine if something should be threaded or not, users have to decide on a case by case basis.  It doesn't compound threaded functions together. Its more work for programmer to maintain.  Its harder to read.  Its harder to turn on and off.  Placing sleep-until-done-messages would also need to be done by hand.
> 
> One thought though, you may be able to get pretty close to that for-loop thing with delegates and templates (for compounding).  Although if you had 2 loops in a row (or a loop and threaded function) it wouldn't be able to figure out that these both can be running at once.
> 
> -Joel

Also how would futurism figure out if a function (that isn't labeled) is  candidate for multi-threading?

-Joel
January 22, 2007
Case in point:

To take an example from Kevin Bealer,

int sum(int[] x)
{
    int rv = 0;
    foreach(int y; x) {
        rv += y;
    }
    return rv;
}

int sum_4(int[] x)
{
    alias Future!(int) FInt;
    int part = x.length/4;

    // Each of these queues a work item to be done either on some
    // other thread, or by the main thread here.

    FInt x1 = new FInt({ return sum(x[0..part]); });
    FInt x2 = new FInt({ return sum[part..part*2]); });
    FInt x3 = new FInt({ return sum[part*2..part*3]); });
    FInt x4 = new FInt({ return sum[part*3..$]; });

    // The addition below waits for each results to complete
    // before adding it to the total.

    return x1.value + x2.value + x3.value + x4.value;
}


Could be written like:
int sum(int[] x)
{
    int rv = 0;
    threadable(4) foreach(int y; x) {
        rv += y;
    }
    return rv;
}

Or even better
threadable int sum(int[] x)
{
    int rv = 0;
    threadable(4) foreach(int y; x) {
        rv += y;
    }
    return rv;
}

Now sum can be spliced in with other CPU independent operations. ir

int foo(int[] a, int b[])
{
  return sum(a) + sum(b); //Both run in parallel
}

Better still, we can turn it off with a simple compiler flag to get single threaded performance.

Obviously we have Futurism library now however I make this suggestion for D2.0.  I'm sure there are a few problems in the syntax that i presented that will need to be worked though.  Or maybe the edge cases are just too difficult to make this feasible?

Oh, and for anyone who still hasn't see Herb Sutter's video:

http://video.google.com/videoplay?docid=7625918717318948700&q=sutter

It explains why it is so important to figure out how multi-threading is going to be handled in D.
January 22, 2007
janderson wrote:
> [snip]
> 
> What do you think?

"I think all the right thinking people in this country are sick and tired of being sick and tired of being sick and tired.  I know I'm not, and I'm sick and tired of being told that I am!"

Or something to that effect...

*ahem*

This is one of the things I've thought about quite a bit over the years.  What you call "threadable" I call "pure", as in a pure function. Basically, a pure function is one which has *no* side effects.  If a function has no side effects, then it doesn't matter when it gets executed.  The nice thing is that "pure function" is more well-defined than "threadable" :3

The idea comes from functional programming languages like LISP.  If you're writing pure LISP, then all of your functions are pure, which leaves LISP free to thread your program however it likes.

I thought about suggesting that the compiler add a purity check in the same vein as inlining: any function found (or marked) as pure could be threaded off by the compiler without any adverse side-effects.

Of course, I never did since this would be a huge amount of work.  Which got me thinking about other kinds of multithreading.  Which is why I wrote my "future" implementation (not to be confused with "futurism" :P).

To be honest, I think we are more likely to see array operations *first* (ie: a[] = b[] + c[] as an element-wise addition) than this stuff.  The nice thing is that array operations can also be parrallelised (using things like SIMD, or even multi-threaded SIMD on multi-core systems) but much more easily.

So yeah, anything that makes multithreading easier is a good thing in my books.

</2ยข>

	-- Daniel
January 22, 2007
Daniel Keep wrote:
> janderson wrote:
>> [snip]
>>
>> What do you think?
> 
> "I think all the right thinking people in this country are sick and tired of being sick and tired of being sick and tired.  I know I'm not, and I'm sick and tired of being told that I am!"
> 
> Or something to that effect...
> 
> *ahem*
> 
> This is one of the things I've thought about quite a bit over the years.  What you call "threadable" I call "pure", as in a pure function. Basically, a pure function is one which has *no* side effects.  If a function has no side effects, then it doesn't matter when it gets executed.  The nice thing is that "pure function" is more well-defined than "threadable" :3

I fear pure may be mixed up with pure virtual.  I don't like threadable either.

Naming aside.  I'm glad that I'm not the only one who has thought about this technique (means I'm probably not crazy).

It also brings up an interesting point.  There are probably other non-threadable operations that can occur on these "pure function"'s. For instance order of operations can be switched around (if that resulted in some optimisation).  Or the function could be silently rewritten to work on an array of inputs/outputs rather then a single output so it could take advantage of SIMD.

> 
> The idea comes from functional programming languages like LISP.  If you're writing pure LISP, then all of your functions are pure, which leaves LISP free to thread your program however it likes.

Neat, so there are already examples of this at work.  Any procedural languages that use it?

> I thought about suggesting that the compiler add a purity check in the same vein as inlining: any function found (or marked) as pure could be threaded off by the compiler without any adverse side-effects.

Exactly.  It may be possible to get the expressiveness of the syntax where we can specify functions as pure virtual ourselves.  However then  it would be impossible to generalize the syntax so that the compiler could add pure functions at it's will.

> Of course, I never did since this would be a huge amount of work.  Which 

I think if it only worked in a few cases to begin with and that got expanded as time went on, I'd be fine with that.

> got me thinking about other kinds of multithreading.  Which is why I wrote my "future" implementation (not to be confused with "futurism" :P).

Nice work on the future implementation BTW...  The great thing about future is that you can still run other more complex threads at the same time and have "future" fill some of the idle gaps.

> 
> To be honest, I think we are more likely to see array operations *first* (ie: a[] = b[] + c[] as an element-wise addition) than this stuff.  The nice thing is that array operations can also be parrallelised (using things like SIMD, or even multi-threaded SIMD on multi-core systems) but much more easily.

Yeah, I was hanging out for this stuff too when Walter mentioned it a few years back.

...

Heres another con: It would make stepping though the code a PITA, unless you turned it off for debug.

-Joel
January 22, 2007
janderson wrote:
> Here's one idea about how D could help support multi-threading.  What if you could specify which functions where thread-able.  ie:
> 
> threadable int foo(int X)
> {
> ...
> }
> 
> void bar()
> {
>    int a = 0;
>    int array[3];
>    x[0] = foo(a);  //Separate worker thread
>    x[1] = foo(a);  //Separate worker thread (or could be main thread)
>    x[2] = foo(a);  //Main thread
>    //Would sleep here until things are done
> }
> 
> 
> What do you think?


Here's an extension to the idea.  It would be good to have an "un-pure", "!pure" or "!threadable" functions as well.  That way if you included an "un-pure" function in an pure function the compiler would give an error.     If an included function was not marked un-pure the user would have to face the side-effect consequences at run-time.

-Joel
January 25, 2007
I like this idea, but I think hand tuning and design will be necessary for quite a while.  For a big application, it might be enough to run with "-profile", then find the most expensive dozen or so functions and try to split those up.

On the other hand, the more CPUs there are in the box that lands on someone's desktop, the more important it will be to have apps that can use a dozen hardware threads.

(I kind of wish I had a big CPU intensive "real world" app written in D to play with though, so that I could test my theories about this stuff in a more realistic setting.)

Kevin