June 16, 2005
Dual Core Support
The shipping of the "AMD Athlon 64 X2" is announced to start at the 
end of this month.

A review is available:
http://www.amdreview.com/reviews.php?rev=athlonx24200

As the review suggests, WinXP and Sandra are prepared to use more than 
one CPU.

Will D be outdated before the release of 1.0 because D has no support 
for multi core units?

-manfred
June 16, 2005
Re: Dual Core Support
Manfred Nowak wrote:
> The shipping of the "AMD Athlon 64 X2" is announced to start at the 
> end of this month.
> 
> A review is available:
> http://www.amdreview.com/reviews.php?rev=athlonx24200
> 
> As the review suggests, WinXP and Sandra are prepared to use more than 
> one CPU.
> 
> Will D be outdated before the release of 1.0 because D has no support 
> for multi core units?
> 
> -manfred
How do you mean?  You can program in a multithreaded manner in D, which 
should take advantage of multiple cpus/cores.  Or am I missing something?

Brad
June 17, 2005
Re: Dual Core Support
| Will D be outdated before the release of 1.0 because D has no support
| for multi core units?

There's nothing special about multi-core processors, at least when it comes 
to the compiler, it's all the same. A PC with a dual-core CPU (or two 
'single-core' CPUs for that matter) can simply run two programs at full 
speed, at the same time.

On a single-core CPU, the operating system lets each running program use the 
CPU for a fraction of a second, so it seems they are running at the same 
time, but they never really are.

L.
June 17, 2005
Re: Dual Core Support
"Lionello Lunesu" <lio@lunesu.removethis.com> wrote:

> There's nothing special about multi-core processors, at least
> when it comes to the compiler, it's all the same.

Thank you both for your responses, Brad and Lionello.

In essence both of you seem to want the OS to represent a
multicore system as a virtual single core system to you. In this
case you are right: neglecting the fact that you have a multicore
system does not raise any need to use its capabilities. 

On the other hand the OS has to do the work to make the multicore
system appear as a virtual single core system to you. 

| If control of Northbridge functions is shared between software
| on both cores, software must ensure that only one core at a time
| is allowed to access the shared MSR. 
http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/26094.PDF (p. 324)

So there is a need to address the special features of dual core machines. 


Please recall that an AMD Athlon64 system can contain up to 8 dual
core units and that one of D's major goals is to 

| Provide low level bare metal access as required
http://www.digitalmars.com/d/overview.html

Is this really true when all bare metal access has to use the asm 
statement?


Please look deeper into the D specs:
http://www.digitalmars.com/d/statement.html

The throw-statement:
| The Object reference is thrown as an exception.

What will happen if both cores throw an exception at the same
clock impulse? 

The volatile statement:
| Memory writes occurring before the Statement are performed
| before any reads within or after the Statement. Memory reads
| occurring after the Statement occur after any writes before or
| within Statement are completed.

What does this mean for a multi core system, which shares the main 
memory between all activated cores?


Algorithmically it is simply not true that a dual core system is 
equivalent to a higher clocked single core system!

Please recall the simple task of deciding whether a given, fixed 
value is present in a sufficiently large array.

Using a virtual single core machine you would simply loop through all 
indices until you find the given value or end up not finding it, then 
issuing the appropriate result.

Given a natural number n (n>=2 && n <=16) and a machine with n cores 
you would divide the array into n equal sized pieces and assign a 
core to each piece of the array. In case of not finding the searched 
value you would in essence end up having cut down the number of clock 
cycles needed to an n-th of the time of a virtual single core system.

But if you cannot assign a core to a task because the used language 
does not allow this assignment you can do nothing more than assigning 
the n parts of the array to n threads and then _hope_ that the OS 
will execute them in parallel.
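
To make the thread-based variant concrete, here is a minimal sketch in C++ 
rather than D. The helper name parallel_contains is made up for 
illustration, and std::thread is a later C++ facility that did not exist 
when this was posted. It cuts the array into n pieces, gives each piece 
to a thread, and leaves it to the OS scheduler whether those threads 
really run on separate cores:

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: returns true if `value` occurs in `data`.
// The array is searched in n contiguous pieces, one thread per piece.
// Whether the pieces are searched in parallel is decided by the OS.
bool parallel_contains(const std::vector<int>& data, int value, std::size_t n) {
    assert(n >= 1);
    std::atomic<bool> found{false};
    std::vector<std::thread> workers;
    const std::size_t chunk = (data.size() + n - 1) / n;  // ceiling division

    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t begin = std::min(i * chunk, data.size());
        const std::size_t end = std::min(begin + chunk, data.size());
        workers.emplace_back([&data, &found, value, begin, end] {
            for (std::size_t j = begin; j < end; ++j) {
                // Stop early if another piece has already produced a hit.
                if (found.load(std::memory_order_relaxed)) return;
                if (data[j] == value) {
                    found.store(true, std::memory_order_relaxed);
                    return;
                }
            }
        });
    }
    for (auto& t : workers) t.join();  // wait for every piece to finish
    return found.load();
}
```

A call such as parallel_contains(v, 9, 4) searches v with four threads. 
Nothing in the sketch binds a thread to a core, which is exactly the 
limitation described above.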

Would you trust your life to a system that is usually fast but 
cannot be guaranteed to be free of reaction time prolongations by a 
factor of more than ten?

You may want to answer with "no", and in this case the answer to my 
initial question on the outdatedness of D is "yes".

-manfred
June 17, 2005
Re: Dual Core Support
Manfred Nowak wrote:
> "Lionello Lunesu" <lio@lunesu.removethis.com> wrote:
> 
> 
>>There's nothing special about multi-core processors, at least
>>when it comes to the compiler, it's all the same.
> 
> 
> Thank you both for your responses, Brad and Lionello.
> 
> In essence both of you seem to want the OS to represent a
> multicore system as a virtual single core system to you. In this
> case you are right: neglecting the fact that you have a multicore
> system does not raise any need to use its capabilities. 

AFAIK, multi-core processors are almost exactly the same as having 
multiple cpus, except they're in a single box and share a single bus to 
the outside world. So, I'd say that there's nothing much that can be 
done beyond what is already done (which is basically multi-threading 
support and synchronization objects).

I don't think starting a thread is light-weight enough that the compiler 
should try to multi-thread code automatically, because in 99.9% of cases 
there'd be no benefit.


> On the other hand the OS has to do the work to make the multicore
> system appear as a virtual single core system to you. 

I think the OS does just the opposite - by scheduling and 
task-switching, it hides the actual CPUs/cores, and makes the system 
appear as having any number of them (where the number is the number of 
threads that are running).


> | If control of Northbridge functions is shared between software
> | on both cores, software must ensure that only one core at a time
> | is allowed to access the shared MSR. 
> http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/26094.PDF (p. 324)
> 
> So there is a need to address the special features of dual core machines.

You should've also mentioned the title of the white paper, which is BIOS 
and Kernel Developer's Guide for [AMD processors]. I disagree that D 
should be specialized for those types of software, and I think you'd 
still need assembler anyway; much important kernel code is both 
speed-critical and extremely specific, so coding it in a high-level 
language is just not a realistic option.


> Please look deeper into the D specs:
> http://www.digitalmars.com/d/statement.html
> 
> The throw-statement:
> | The Object reference is thrown as an exception.
> 
> What will happen if both cores throw an exception at the same
> clock impulse? 

Each thread will unwind its stack, like it does now, until it gets to an 
exception handler. I don't see the difference when there is more than 
one core.
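
A small sketch of that behaviour, in C++ since the thread has no D code 
to build on. The function name count_handled_exceptions is hypothetical. 
Two threads throw at roughly the same moment, and each exception is 
caught by the handler on the throwing thread's own stack:

```cpp
#include <atomic>
#include <stdexcept>
#include <thread>

// Hypothetical demo: returns how many exceptions were caught in total.
// Each thread throws and catches independently of the other one.
int count_handled_exceptions() {
    std::atomic<int> handled{0};

    auto worker = [&handled](const char* name) {
        try {
            throw std::runtime_error(name);  // unwinds this thread's stack only
        } catch (const std::runtime_error&) {
            handled.fetch_add(1);            // this thread's own handler runs
        }
    };

    std::thread a(worker, "thread on core 0");
    std::thread b(worker, "thread on core 1");
    a.join();
    b.join();
    return handled.load();
}
```

Both exceptions are handled whether the two threads run on one core or 
on two, because unwinding never touches another thread's stack.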


> The volatile statement:
> | Memory writes occurring before the Statement are performed
> | before any reads within or after the Statement. Memory reads
> | occurring after the Statement occur after any writes before or
> | within Statement are completed.
> 
> What does this mean for a multi core system, which shares the main 
> memory between all activated cores?

Again, you skipped an important part: A volatile statement does not 
guarantee atomicity. Whenever more than one thread can access the same 
memory (where at least one is writing to it), the accesses should be 
synchronized, multi-core or not. Providing synchronization methods is 
the job of OS and/or hardware, and using them is already simple in D.
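
As an illustration of that point, here is a minimal C++ sketch (the name 
synchronized_count is invented for this example). Several threads 
increment one shared counter, and a mutex makes each read-modify-write 
indivisible; D's synchronized statement fills the same role as the lock 
used here:

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical example: n_threads threads each add increments_per_thread
// to a shared counter. The mutex serializes the increments, so the
// result is exact no matter how many cores execute the threads.
long synchronized_count(int n_threads, int increments_per_thread) {
    assert(n_threads >= 0 && increments_per_thread >= 0);
    long counter = 0;
    std::mutex guard;
    std::vector<std::thread> workers;

    for (int t = 0; t < n_threads; ++t) {
        workers.emplace_back([&counter, &guard, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                std::lock_guard<std::mutex> lock(guard);  // held for one increment
                ++counter;
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

Without the lock, two threads could read the same old value and both 
write back old + 1, losing an update. That can happen on a single core 
as well, just less often.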


> Algorithmically it is simply not true that a dual core system is 
> equivalent to a higher clocked single core system!

Unfortunately, no, it isn't.

> [snip]
> 
> But if you cannot assign a core to a task because the used language 
> does not allow this assignment you can do nothing more than assigning 
> the n parts of the array to n threads and then _hope_ that the OS 
> will execute them in parallel.

The OS is in charge of both cores anyway; you can't bypass it and 
somehow take control of the cores, so you hope for the best in any case. 
That's another reason why automatically multi-threading doesn't make 
much sense.


> Would you trust your life to a system that is usually fast but 
> cannot be guaranteed to be free of reaction time prolongations by a 
> factor of more than ten?

No, but luckily both software and OSs in such systems are usually 
written with hard guarantees about how much time anything takes.


> You may want to answer with "no", and in this case the answer to my 
> initial question on the outdatedness of D is "yes".

Well, I certainly wouldn't like D to be outdated so soon, but I think 
that as far as performance is concerned, there are several better things 
that could be done first (any-order loops, array ops, easier MMX/SSE 
utilization, etc.). I think that only after single-thread optimizations 
are exhausted, we (or D or Walter) should be moving towards 
multi-cpu/core stuff.


xs0
June 17, 2005
Re: Dual Core Support
I need to read up a bit on multi-core systems, but they act the same as SMP
systems, correct?  So your concern is having library facilities which allow you
to assign tasks to different processors and so on?  If so, I think at least some
basic functionality is a candidate for 1.0, especially if some motivated person
is willing to write it :)  I'm currently experimenting with some lockless synch.
functionality in Ares, and would be happy to build processor affinity support
and such into the Thread class if someone is willing to supply the assembly for
it... and I believe Walter would do the same for Phobos.


Sean
June 17, 2005
Re: Dual Core Support
Sean Kelly wrote:
> I need to read up a bit on multi-core systems, but they act the same as SMP
> systems, correct?  So your concern is having library facilities which allow you
> to assign tasks to different processors and so on?  If so, I think at least some
> basic functionality is a candidate for 1.0, especially if some motivated person
> is willing to write it :)  I'm currently experimenting with some lockless synch.
> functionality in Ares, and would be happy to build processor affinity support
> and such into the Thread class if someone is willing to supply the assembly for
> it... and I believe Walter would do the same for Phobos.
> 
> 
> Sean
> 
> 
This I agree with, library support for multi-processor systems is a good 
idea.  Of course, as far as I am aware at the application level you 
don't really get to choose anyhow - you can provide hints to the OS 
about processor affinity, but that is about it.  Writing software for 
multicore systems is almost the same as writing multithreaded programs - 
the main difference being that even more subtle bugs can show due to the 
fact that threads actually are executing at the same time rather than 
merely being interleaved on one CPU.
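
For what such a hint can look like in practice, here is a sketch in C++ 
that is specific to Linux with glibc; other systems use different calls. 
The function name pin_to_first_allowed_cpu is made up. It asks which 
CPUs the calling thread may run on and then restricts the thread to the 
first of them:

```cpp
#include <pthread.h>
#include <sched.h>

// Linux/glibc only (pthread_*affinity_np are non-portable extensions).
// Hypothetical helper: pins the calling thread to the first CPU it is
// already allowed to use. Returns that CPU index, or -1 on failure.
int pin_to_first_allowed_cpu() {
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (pthread_getaffinity_np(pthread_self(), sizeof(allowed), &allowed) != 0) {
        return -1;
    }
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &allowed)) {
            cpu_set_t one;
            CPU_ZERO(&one);
            CPU_SET(cpu, &one);
            if (pthread_setaffinity_np(pthread_self(), sizeof(one), &one) != 0) {
                return -1;
            }
            return cpu;
        }
    }
    return -1;  // no permitted CPU found
}
```

The request goes through the OS rather than through assembly, and the 
scheduler still decides when the thread actually gets to run.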

As an aside, I don't particularly see the true use for multicore systems 
in real life applications at the moment.  Right now most CPUs, unless 
you program very carefully, are memory bound - they spend a lot of their 
time waiting for memory accesses.  Having multiple cores just increases 
the demand on the main memory bus, so the CPUs (unless executing 
completely out of cache) will still be waiting a lot.  But I guess that 
is why we are seeing larger and larger L1 caches.

Brad
June 17, 2005
Re: Dual Core Support
In article <d8uq8m$1heq$1@digitaldaemon.com>, Brad Beveridge says...
>
>As an aside, I don't particularly see the true use for multicore systems 
>in real life applications at the moment.  Right now most CPUs, unless 
>you program very carefully, are memory bound - they spend a lot of their 
>time waiting for memory accesses.  Having multiple cores just increases 
>the demand on the main memory bus, so the CPUs (unless executing 
>completely out of cache) will still be waiting a lot.  But I guess that 
>is why we are seeing larger and larger L1 caches.

Exactly.  And that leaves us with cache coherency problems.  I think we're
getting close to a fundamental change in how applications are designed, but I
haven't seen any suggestion for how to handle SMP efficiently and easily as
locks and such just don't cut it.  It's an interesting time for software design
:)


Sean
June 17, 2005
Re: Dual Core Support
Sean Kelly wrote:
> In article <d8uq8m$1heq$1@digitaldaemon.com>, Brad Beveridge says...
> 
<snip>
> 
> Exactly.  And that leaves us with cache coherency problems.  I think we're
> getting close to a fundamental change in how applications are designed, but I
> haven't seen any suggestion for how to handle SMP efficiently and easily as
> locks and such just don't cut it.  It's an interesting time for software design
> :)
> 
> 
> Sean
> 
> 
Thinking along these lines, performance programming in D would possibly 
benefit more from a library that lets you manipulate the cache.  Such a 
library could possibly provide functions to prefill the cache, lock 
portions of it, etc.  Of course, messing with caches is not the kind of 
thing that you want to do even 1% of the time - there is just too much 
chance that locking the cache down will negatively impact performance. 
Especially if the OS wants to do a context switch.  Sigh, programming 
just ain't what it used to be when you could cycle count your assembler 
instructions & figure out how fast your loop would be :)

Brad
June 17, 2005
Re: Dual Core Support
In article <d8utov$1khp$1@digitaldaemon.com>, Brad Beveridge says...
>
>Thinking along these lines, performance programming in D would possibly 
>benefit more from a library that lets you manipulate the cache.  Such a 
>library could possibly provide functions to prefill the cache, lock 
>portions of it, etc.  Of course, messing with caches is not the kind of 
>thing that you want to do even 1% of the time - there is just too much 
>chance that locking the cache down will negatively impact performance. 
>Especially if the OS wants to do a context switch.  Sigh, programming 
>just ain't what it used to be when you could cycle count your assembler 
>instructions & figure out how fast your loop would be :)

True enough :)  And things are changing for x86 architectures in this regard.
Until recently, x86 machines only had full memory fence facilities (with the LOCK
prefix) but IIRC acquire/release instructions were added to the Itanium,
and I think things are moving towards more fine-grained cache control.  But this
is something that is sufficiently complex (even for experts) that it really
needs to be done right in a library so that the average joe doesn't have to
worry about it.  Lockless containers are one such feature, and perhaps some
other design patterns would be appropriate to support as well.  Ben's work is a
definite step in the right direction, and it may well be a basis for some of the
stuff that ends up in Ares.  As for the rest... it's worth keeping an eye on the C++
standardization process as they're facing similar issues for the next release.
But D has a lead on C++ at the moment because of the way Walter implemented
'volatile'.  It's my hope that D will be well suited for concurrent
programming years before the next iteration of the C++ standard is finalized.
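
For readers who want to see acquire/release ordering in code, here is a 
minimal sketch using the atomics that C++ later standardized in C++11. 
It is not D's 'volatile' and not anything from Ares, and the name 
publish_and_read is invented. One thread writes ordinary data and then 
sets a flag with release ordering; the other waits on the flag with 
acquire ordering and only then reads the data:

```cpp
#include <atomic>
#include <thread>

// Hypothetical demo: returns the value the consumer thread observed.
// The release store / acquire load pair guarantees that the write to
// `payload` is visible once the consumer has seen `ready == true`.
int publish_and_read(int value) {
    int payload = 0;                  // ordinary, non-atomic data
    std::atomic<bool> ready{false};
    int seen = -1;

    std::thread producer([&] {
        payload = value;                               // 1. write the data
        ready.store(true, std::memory_order_release);  // 2. publish it
    });
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();                 // wait for the flag
        }
        seen = payload;                                // safe to read now
    });

    producer.join();
    consumer.join();
    return seen;
}
```

No lock is taken, which is the attraction of this style. The ordering 
arguments are easy to get wrong, which is the case for keeping them 
inside a library.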


Sean