Posted by Era Scarecrow
Just a little experience and perhaps some help on the subject. This is a partial repost from another forum. I've always found threading annoying to follow along with (the API alone), but programming it is more annoying. I've never actually done multi-threaded programming, so this is a first for me.
First the problem. Loading a fairly big data structure can take a fair amount of time, but if the records and structures never need to touch each other, there's no reason they can't be handled on separate cores/threads (or that's my logic on it anyway).
To try to use more cores, I've split the loading and unpacking into separate stages. First, within half a second, the whole 80 MB of data is read into memory and all the records are separated. Now that they are separated, they can all be unpacked by the different cores.
Part of the problem is that starting a thread doesn't mean it runs right away (it runs when it's ready), and any data it still relies on via a delegate effectively becomes volatile (at least in VisualD): that data may change before the thread gets to read it. So...
[code]
class Record {
    //and stuff
    void loadSubRecords();
}
Record[] recordList; //and stuff
foreach(rec; recordList) {
    //every delegate refers to the same loop variable 'rec'
    Thread th = new Thread( () { rec.loadSubRecords(); } );
    th.start();
}
[/code]
rec (and even ref rec) may change at any time (worse, during its update or before the thread starts). Copying an index instead improves things a bit. So long as the index is copied before the next foreach iteration it's fine; otherwise i may still change and the thread may do something unwanted.
[code]
foreach(i, rec; recordList) {
    Thread th = new Thread( ()
    {
        size_t index = i; //copied only once the thread actually runs
        recordList[index].loadSubRecords();
    });
    th.start();
}
[/code]
Several other combinations came up. I think I found an easy way to handle it without adding in unneeded mutexes and whatnot. What seems to work is if I pack all the data for the job I need in a structure, and have that structure start the thread (inside), then the chances of the problem happening go away (hopefully completely).
[code]
//or something similar
struct Packed {
    Thread thread;
    Record record;
    void run() {
        assert(record);
        thread = new Thread( (){ record.loadSubRecords(); } );
        thread.start();
    }
}
//bad way of thread handling, but makes sense.
Packed[] obj;
obj.length = recordList.length;
foreach(i, rec; recordList) {
    obj[i].record = rec; //class is reference type remember
    obj[i].run(); //returns right away, but thread is running too
}
thread_joinAll(); //from core.thread; waits for every started thread
[/code]
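A lighter variant of the same idea, as a minimal sketch: instead of a struct, a small helper function builds the delegate, so each job holds its own copy of the record reference. The helper name `makeJob`, the wrapper `loadAll`, and the `loaded` flag are made up for illustration, and `loadSubRecords` is only a stand-in for the real unpacking work.

```d
import core.thread : Thread, thread_joinAll;

class Record {
    bool loaded; // hypothetical flag, only here to make the effect visible
    void loadSubRecords() { loaded = true; } // stand-in for the real work
}

// Hypothetical helper: every call gets its own closure frame, so the
// delegate keeps the 'rec' it was created with after the loop moves on.
void delegate() makeJob(Record rec) {
    return () { rec.loadSubRecords(); };
}

void loadAll(Record[] recordList) {
    foreach (rec; recordList) {
        new Thread(makeJob(rec)).start();
    }
    thread_joinAll(); // wait for all started threads to finish
}
```

Calling `loadAll(recordList)` still starts one thread per record, so it shares the setup cost problem of the struct version; it only addresses the changing loop variable.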
So long as the records (and subrecords) never touch each other, mutexes and semaphores aren't needed 90% of the time.
Now since the record count in the original file is 40k, having 40k threads is not only dumb but also expensive to set up. So instead I set up job groups.
[code]
struct PackedList {
    Thread thread;
    Record[] recordList;
    void runWork() {
        foreach(rec; recordList)
            rec.loadSubRecords();
    }
    void run() {
        assert(recordList.length > 0); //an empty group has nothing to do
        thread = new Thread( (){ this.runWork(); } );
        thread.start();
    }
}
[/code]
With this basic idea, drop a thousand in one PackedList and start it, then grab another thousand and drop them into another PackedList. They'll run until their workload is done.
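The grouping step described above could be sketched roughly like this, assuming plain array slicing is enough to hand each thread its own block of records. `loadInGroups`, `makeGroupJob`, and the `loaded` flag are hypothetical names, and the group size is passed in rather than fixed at a thousand.

```d
import core.thread : Thread, thread_joinAll;
import std.algorithm.comparison : min;

class Record {
    bool loaded; // hypothetical flag, only here to make the effect visible
    void loadSubRecords() { loaded = true; } // stand-in for the real work
}

// Hypothetical helper: the delegate owns its own slice of the records.
void delegate() makeGroupJob(Record[] group) {
    return () { foreach (rec; group) rec.loadSubRecords(); };
}

// Starts one thread per group of up to groupSize records and
// returns how many groups (threads) were started.
size_t loadInGroups(Record[] recordList, size_t groupSize) {
    assert(groupSize > 0);
    size_t groups = 0;
    for (size_t start = 0; start < recordList.length; start += groupSize) {
        immutable end = min(start + groupSize, recordList.length);
        new Thread(makeGroupJob(recordList[start .. end])).start();
        ++groups;
    }
    thread_joinAll(); // wait for every group to finish
    return groups;
}
```

With 40k records and a group size of 1000 this would start 40 threads. The slices never overlap, so no locking is needed as long as the records themselves stay independent.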
Is there a suggested magic number for how many threads per core you should use? If you have, say, a quad core, you can have 4 threads going (obviously), but if they go to sleep waiting on system resources (loading a file, saving, or something else), then that core may sit unused. It makes sense to have 2 per core, since then if one goes silent the core has another it can pick up. I'm guessing 2-4 threads per core would be about right for this type of work.
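One thing that might sidestep picking the number by hand, as a sketch rather than a recommendation: Phobos has `std.parallelism`, whose default task pool sizes itself from the detected core count (`totalCPUs`). The wrapper name `loadAllParallel`, the `loaded` flag, and the work unit size of 1000 are illustrative choices; the 1000 only mirrors the group size above.

```d
import std.parallelism : parallel;

class Record {
    bool loaded; // hypothetical flag, only here to make the effect visible
    void loadSubRecords() { loaded = true; } // stand-in for the real work
}

void loadAllParallel(Record[] recordList) {
    // The default task pool hands out blocks of 1000 records to its
    // worker threads; the foreach returns once every record is done.
    foreach (rec; parallel(recordList, 1000)) {
        rec.loadSubRecords();
    }
}
```

This does not answer whether 2-4 threads per core is better when the work blocks on I/O; it only uses a pool sized from the core count instead of one thread per group.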