Dual Core Support
June 16, 2005
The shipping of the "AMD Athlon 64 X2" is announced to start at the end of this month.

A review is available: http://www.amdreview.com/reviews.php?rev=athlonx24200

As the review suggests, WinXP and Sandra are prepared to use more than one CPU.

Will D be outdated before the release of 1.0 because D has no support for multi-core units?

-manfred
June 16, 2005
Manfred Nowak wrote:
> The shipping of the "AMD Athlon 64 X2" is announced to start at the end of this month.
> 
> A review is available:
> http://www.amdreview.com/reviews.php?rev=athlonx24200
> 
> As the review suggests, WinXP and Sandra are prepared to use more than one CPU.
> 
> Will D be outdated before the release of 1.0 because D has no support for multi-core units?
> 
> -manfred
How do you mean?  You can program in a multithreaded manner in D, which should take advantage of multiple CPUs/cores.  Or am I missing something?

Brad
June 17, 2005
| Will D be outdated before the release of 1.0 because D has no support for multi-core units?

There's nothing special about multi-core processors, at least when it comes to the compiler, it's all the same. A PC with a dual-core CPU (or two 'single-core' CPUs for that matter) can simply run two programs at full speed, at the same time.

On a single-core CPU, the operating system lets each running program use the CPU for a fraction of a second, so it seems they are running at the same time, but they never really are.

L.


June 17, 2005
"Lionello Lunesu" <lio@lunesu.removethis.com> wrote:

> There's nothing special about multi-core processors, at least when it comes to the compiler, it's all the same.

Thank you both for your responses, Brad and Lionello.

In essence both of you seem to want the OS to present a multicore system to you as a virtual single-core system. In that case you are right: ignoring the fact that you have a multicore system means there is no need to use its capabilities.

On the other hand, the OS has to do the work to make the multicore system appear as a virtual single-core system to you.

| If control of Northbridge functions is shared between software
| on both cores, software must ensure that only one core at a time
| is allowed to access the shared MSR.
http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/26094.PDF (p. 324)

So there is a need to address the special characteristics of dual-core machines.


Please recall that an AMD Athlon64 system can contain up to 8 dual-core units and that one of D's major goals is to

| Provide low level bare metal access as required
http://www.digitalmars.com/d/overview.html

Is this really true when all bare metal access has to use the asm statement?


Please look deeper into the D specs: http://www.digitalmars.com/d/statement.html

The throw-statement:
| The Object reference is thrown as an exception.

What will happen if both cores throw an exception in the same clock cycle?

The volatile statement:
| Memory writes occurring before the Statement are performed
| before any reads within or after the Statement. Memory reads
| occurring after the Statement occur after any writes before or
| within Statement are completed.

What does this mean for a multi-core system, which shares the main memory among all active cores?


Algorithmically it is simply not true that a dual-core system is equivalent to a higher-clocked single-core system!

Please recall the simple task of deciding whether a given, fixed value is present in a sufficiently large array.

Using a virtual single-core machine you would simply loop through all indices until you either find the given value or exhaust the array, and then issue the appropriate result.

Given a natural number n (n >= 2 && n <= 16) and a machine with n cores, you would divide the array into n equal-sized pieces and assign a core to each piece. If the value is not found, you would in essence have cut the number of clock cycles needed down to an n-th of the time a virtual single-core system would require.

But if you cannot assign a core to a task because the language does not allow this assignment, you can do nothing more than assign the n parts of the array to n threads and then _hope_ that the OS will execute them in parallel.
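
A minimal sketch of the n-way split just described, written in D against the modern core.thread and core.atomic modules (which did not exist when this thread was written); containsParallel and spawn are illustrative names, and whether the threads really land on separate cores remains up to the OS:

import core.thread;
import core.atomic;

// Search arr for needle by splitting it into up to nThreads pieces,
// one OS thread per piece.
bool containsParallel(const(int)[] arr, int needle, size_t nThreads = 4)
{
    shared bool found = false;

    // Helper so each thread captures its own slice (avoids the D pitfall
    // of all closures in a loop sharing one captured variable).
    Thread spawn(const(int)[] slice)
    {
        auto t = new Thread({
            foreach (x; slice)
            {
                if (atomicLoad(found))        // another piece already matched
                    return;
                if (x == needle)
                {
                    atomicStore(found, true);
                    return;
                }
            }
        });
        t.start();
        return t;
    }

    Thread[] threads;
    immutable chunk = (arr.length + nThreads - 1) / nThreads;
    for (size_t lo = 0; lo < arr.length; lo += chunk)
    {
        immutable hi = lo + chunk < arr.length ? lo + chunk : arr.length;
        threads ~= spawn(arr[lo .. hi]);
    }

    foreach (t; threads)
        t.join();

    return atomicLoad(found);
}

If the OS really runs those threads on separate cores, the unsuccessful search takes roughly an n-th of the single-core time; if it does not, nothing in the language forces the issue, which is exactly the point above.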

Would you trust your life to a system that is usually fast, but whose reaction time cannot be guaranteed not to stretch by a factor of more than ten?

You may want to answer with "no", and in that case my initial question about the outdatedness of D receives a positive answer.

-manfred


June 17, 2005
Manfred Nowak wrote:
> "Lionello Lunesu" <lio@lunesu.removethis.com> wrote:
> 
> 
>>There's nothing special about multi-core processors, at least
>>when it comes to the compiler, it's all the same.
> 
> 
> Thank you both for your responses, Brad and Lionello.
> 
> In essence both of you seem to want the OS to present a multicore
> system to you as a virtual single-core system. In that case you are
> right: ignoring the fact that you have a multicore system means there
> is no need to use its capabilities.

AFAIK, multi-core processors are almost exactly the same as having multiple CPUs, except they're in a single package and share a single bus to the outside world. So, I'd say that there's not much that can be done beyond what is already done (which is basically multi-threading support and synchronization objects).

I don't think starting a thread is light-weight enough that the compiler should try to multi-thread code automatically, because in 99.9% of cases there'd be no benefit.


> On the other hand, the OS has to do the work to make the multicore
> system appear as a virtual single-core system to you.

I think the OS does just the opposite - by scheduling and task-switching, it hides the actual CPUs/cores and makes the system appear to have any number of them (where the number is the number of threads that are running).


> | If control of Northbridge functions is shared between software
> | on both cores, software must ensure that only one core at a time
> | is allowed to access the shared MSR.
> http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/26094.PDF (p. 324)
> 
> So there is a need to address the special characteristics of dual-core machines.

You should've also mentioned the title of the white paper, which is the BIOS and Kernel Developer's Guide for [AMD processors]. I disagree that D should be specialized for those types of software, and I think you'd still need assembler anyway; much important kernel code is both speed-critical and extremely specific, so coding it in a high-level language is realistically just not an option.


> Please look deeper into the D specs:
> http://www.digitalmars.com/d/statement.html
> 
> The throw-statement:
> | The Object reference is thrown as an exception.
> 
> What will happen if both cores throw an exception in the same
> clock cycle?

Each thread will unwind its stack, like it does now, until it gets to an exception handler.. I don't see the difference when there is more than one core..


> The volatile statement:
> | Memory writes occurring before the Statement are performed
> | before any reads within or after the Statement. Memory reads
> | occurring after the Statement occur after any writes before or
> | within Statement are completed.
> 
> What does this mean for a multi-core system, which shares the main memory among all active cores?

Again, you skipped an important part: A volatile statement does not guarantee atomicity. Whenever more than one thread can access the same memory (where at least one is writing to it), the accesses should be synchronized, multi-core or not. Providing synchronization methods is the job of the OS and/or hardware, and using them is already simple in D.
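
As a small illustration of that last point, here is what such synchronization looks like in D, using the built-in synchronized statement; counter, counterLock and bump are made-up names for this sketch, and the core.thread import reflects the current runtime rather than what existed in 2005:

import core.thread;

__gshared int counter;        // shared between threads (not thread-local)
__gshared Object counterLock; // lock object guarding counter

void bump()
{
    foreach (i; 0 .. 1_000)
    {
        synchronized (counterLock)   // only one thread/core at a time in here
        {
            ++counter;
        }
    }
}

void main()
{
    counterLock = new Object;
    auto threads = [new Thread(&bump), new Thread(&bump)];
    foreach (t; threads) t.start();
    foreach (t; threads) t.join();
    assert(counter == 2_000);        // holds on one core or on several
}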


> Algorithmically it is simply not true that a dual-core system is equivalent to a higher-clocked single-core system!

Unfortunately, no, it isn't.

> [snip]
> 
> But if you cannot assign a core to a task because the language does not allow this assignment, you can do nothing more than assign the n parts of the array to n threads and then _hope_ that the OS will execute them in parallel.

The OS is in charge of both cores anyway; you can't bypass it and somehow take control of the cores, so you hope for the best in any case. That's another reason why automatic multi-threading doesn't make much sense.


> Would you trust your life to a system that is usually fast, but whose reaction time cannot be guaranteed not to stretch by a factor of more than ten?

No, but luckily both software and OSs in such systems are usually written with hard guarantees about how much time anything takes..


> You may want to answer with "no", and in that case my initial question about the outdatedness of D receives a positive answer.

Well, I certainly wouldn't like D to be outdated so soon, but I think that as far as performance is concerned, there are several better things that could be done first (any-order loops, array ops, easier MMX/SSE utilization, etc.). I think that only after single-thread optimizations are exhausted should we (or D or Walter) move towards multi-CPU/core stuff.


xs0
June 17, 2005
I need to read up a bit on multi-core systems, but they act the same as SMP systems, correct?  So your concern is having library facilities which allow you to assign tasks to different processors and so on?  If so, I think at least some basic functionality is a candidate for 1.0, especially if some motivated person is willing to write it :)  I'm currently experimenting with some lockless synch. functionality in Ares, and would be happy to build processor affinity support and such into the Thread class if someone is willing to supply the assembly for it... and I believe Walter would do the same for Phobos.
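
For what it's worth, on Windows this kind of affinity control needs no assembly at all; a single Win32 call suffices. The following is only a sketch against the later core.sys.windows bindings, with pinCurrentThreadToCore as a made-up name, not anything that was in Ares or Phobos at the time:

version (Windows)
{
    import core.sys.windows.windows;

    // Pin the calling thread to one core (0-based index) via its affinity mask.
    bool pinCurrentThreadToCore(uint core)
    {
        auto mask = cast(size_t) 1 << core;
        // SetThreadAffinityMask returns the previous mask on success, 0 on failure.
        return SetThreadAffinityMask(GetCurrentThread(), mask) != 0;
    }
}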


Sean


June 17, 2005
Sean Kelly wrote:
> I need to read up a bit on multi-core systems, but they act the same as SMP
> systems, correct?  So your concern is having library facilities which allow you
> to assign tasks to different processors and so on?  If so, I think at least some
> basic functionality is a candidate for 1.0, especially if some motivated person
> is willing to write it :)  I'm currently experimenting with some lockless synch.
> functionality in Ares, and would be happy to build processor affinity support
> and such into the Thread class if someone is willing to supply the assembly for
> it... and I believe Walter would do the same for Phobos.
> 
> 
> Sean
> 
> 
This I agree with; library support for multi-processor systems is a good idea.  Of course, as far as I am aware, at the application level you don't really get to choose anyhow - you can provide hints to the OS about processor affinity, but that is about it.  Writing software for multicore systems is almost the same as writing multithreaded programs - the main difference being that even more subtle bugs can show up, due to the fact that the threads really are executing at the same time rather than merely interleaved.

As an aside, I don't particularly see the true use for multicore systems in real life applications at the moment.  Right now most CPUs, unless you program very carefully, are memory bound - they spend a lot of their time waiting for memory accesses.  Having multiple cores just increases the demand on the main memory bus, so the CPUs (unless executing completely out of cache) will still be waiting a lot.  But I guess that is why we are seeing larger and larger L1 caches.

Brad
June 17, 2005
In article <d8uq8m$1heq$1@digitaldaemon.com>, Brad Beveridge says...
>
>As an aside, I don't particularly see the true use for multicore systems in real life applications at the moment.  Right now most CPUs, unless you program very carefully, are memory bound - they spend a lot of their time waiting for memory accesses.  Having multiple cores just increases the demand on the main memory bus, so the CPUs (unless executing completely out of cache) will still be waiting a lot.  But I guess that is why we are seeing larger and larger L1 caches.

Exactly.  And that leaves us with cache coherency problems.  I think we're getting close to a fundamental change in how applications are designed, but I haven't seen any suggestion for how to handle SMP efficiently and easily, as locks and such just don't cut it.  It's an interesting time for software design :)


Sean


June 17, 2005
Sean Kelly wrote:
> In article <d8uq8m$1heq$1@digitaldaemon.com>, Brad Beveridge says...
> 
<snip>
> 
> Exactly.  And that leaves us with cache coherency problems.  I think we're
> getting close to a fundamental change in how applications are designed, but I
> haven't seen any suggestion for how to handle SMP efficiently and easily as
> locks and such just don't cut it.  It's an interesting time for software design
> :)
> 
> 
> Sean
> 
> 
Thinking along these lines, performance programming in D would possibly benefit more from a library that lets you manipulate the cache.  Such a library could possibly provide functions to prefill the cache, lock portions of it, etc.  Of course, messing with caches is not the kind of thing that you want to do even 1% of the time - there is just too much chance that locking the cache down will negatively impact performance. Especially if the OS wants to do a context switch.  Sigh, programming just ain't what it used to be when you could cycle count your assembler instructions & figure out how fast your loop would be :)
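
D never did get a cache-manipulation library of that sort, but for the read side, later versions of druntime expose a software prefetch template in core.simd on x86 targets. A rough sketch of the kind of call Brad has in mind, with sumWithPrefetch as a made-up name:

import core.simd : prefetch;

// Walk a large array while asking the CPU to start fetching data a little
// ahead of the current read position, so the loop waits on memory less often.
long sumWithPrefetch(const(long)[] data)
{
    enum lookahead = 64;         // elements, i.e. a few cache lines ahead
    long total = 0;
    foreach (i, x; data)
    {
        if (i + lookahead < data.length)
            prefetch!(false, 3)(&data[i + lookahead]);  // read fetch, keep it cached
        total += x;
    }
    return total;
}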

Brad
June 17, 2005
In article <d8utov$1khp$1@digitaldaemon.com>, Brad Beveridge says...
>
>Thinking along these lines, performance programming in D would possibly benefit more from a library that lets you manipulate the cache.  Such a library could possibly provide functions to prefill the cache, lock portions of it, etc.  Of course, messing with caches is not the kind of thing that you want to do even 1% of the time - there is just too much chance that locking the cache down will negatively impact performance. Especially if the OS wants to do a context switch.  Sigh, programming just ain't what it used to be when you could cycle count your assembler instructions & figure out how fast your loop would be :)

True enough :)  And things are changing for x86 architectures in this regard. Until recently, x86 machines only had full mfence facilities (with the LOCK instruction), but IIRC acquire/release instructions were added to the Itanium, and I think things are moving towards more fine-grained cache control.  But this is something that is sufficiently complex (even for experts) that it really needs to be done right in a library so that the average Joe doesn't have to worry about it.  Lockless containers are one such feature, and perhaps some other design patterns would be appropriate to support as well.  Ben's work is a definite step in the right direction, and it may well be a basis for some of the stuff that ends up in Ares.  As for the rest... it's worth keeping an eye on the C++ standardization process, as they're facing similar issues for the next release.  But D has a lead on C++ at the moment because of the way Walter implemented 'volatile'.  It's my hope that D will be well suited for concurrent programming years before the next iteration of the C++ standard is finalized.
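
A taste of what that lockless style looks like in present-day D, using the later core.atomic module rather than D1's volatile statement; atomicIncrement is just an illustrative name:

import core.atomic;

// Lock-free increment: read the current value, then try to swap in value + 1;
// if another core changed it in between, the CAS fails and we retry.
void atomicIncrement(ref shared int value)
{
    int old;
    do
    {
        old = atomicLoad(value);
    }
    while (!cas(&value, old, old + 1));
}

core.atomic's atomicOp!"+=" does the same thing in a single call; the explicit loop just spells out the compare-and-swap retry that lockless containers are built from.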


Sean

