December 07, 2008
Fawzi Mohamed wrote:
> On 2008-12-06 17:13:34 +0100, Sean Kelly <sean@invisibleduck.org> said:
> 
>> Fawzi Mohamed wrote:
>>>
>>> a memory barrier would be needed, and atomic decrements, but I see that it is not portable...
>>
>> It would also somewhat defeat the purpose of thread_needLock, since IMO this routine should be fast.  If memory barriers are involved then it may as well simply use a mutex itself, and this is exactly what it's intended to avoid.
> 
> the memory barrier would be needed in the code that decrements the number of active threads, so that you are sure that no pending writes are still there, (that is the problem that you said brought you to switch to a multithreaded flag), not in the code of thread_needLock...

Not true.  You would need an acquire barrier in thread_needLock. However, on x86 the point is probably moot since loads have acquire semantics anyway.
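
To make the barrier placement concrete, here is a minimal sketch in C++ (not D, and not the actual runtime code). It assumes a hypothetical active-thread counter: the exiting thread decrements it with release semantics, and the check in the fast path reads it with acquire semantics. All names (`g_active_threads`, `thread_need_lock`, `on_thread_start`, `on_thread_exit`, `bump_shared`, `run_demo`) are made up for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-ins for runtime state; not the real runtime's names.
static std::atomic<int> g_active_threads{1};  // the main thread counts as one
static std::mutex g_runtime_lock;
static int g_shared_counter = 0;              // guarded by g_runtime_lock when needed

// Fast check. The acquire load pairs with the release decrement in
// on_thread_exit, so once this returns false the exited thread's writes
// are visible to the caller.
bool thread_need_lock() {
    return g_active_threads.load(std::memory_order_acquire) > 1;
}

// Assumed to be called by the parent before it spawns the new thread.
void on_thread_start() {
    g_active_threads.fetch_add(1);
}

// Assumed to be the last thing an exiting thread does.
void on_thread_exit() {
    g_active_threads.fetch_sub(1, std::memory_order_release);
}

void bump_shared() {
    if (thread_need_lock()) {
        std::lock_guard<std::mutex> guard(g_runtime_lock);
        ++g_shared_counter;
    } else {
        ++g_shared_counter;  // single-threaded fast path, no lock taken
    }
}

int run_demo() {
    bump_shared();                    // single-threaded: fast path
    on_thread_start();                // register the worker before spawning it
    std::thread worker([] {
        for (int i = 0; i < 1000; ++i) bump_shared();
        on_thread_exit();             // the worker touches nothing shared after this
    });
    for (int i = 0; i < 1000; ++i) bump_shared();
    worker.join();
    assert(!thread_need_lock());      // back to one thread
    bump_shared();                    // fast path again
    return g_shared_counter;
}
```

The sketch only works because a thread can leave the locked path solely after every other thread has finished and published its writes; whether that is worth the portability cost is the question debated in this thread.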

> But again I would say that this optimization is not really worth it (as you also said it), even if it is relevant for GUI applications.

:-)


Sean
December 07, 2008
Leandro Lucarella wrote:
> Christopher Wright, el  6 de diciembre a las 09:06 me escribiste:
>> Fawzi Mohamed wrote:
>>> So yes one could probably switch back to the old Phobos style.
>>> I would guess that it is not really a common situation for a program to become single threaded again, though...
>>> Fawzi
>> At work, we have a single-threaded application -- everything happens on the GUI thread. There are some operations that take a long time, though. For those, we throw up a spinny dialog box. But if these operations happened on the GUI thread, the spinny dialog box would not spin. So we do the expensive operations on a background thread.
>>
>> So, our application becomes multithreaded on rare occasions and becomes single-threaded again after.
>>
>> Not sure how common this is.
> 
> I think this is pretty common in GUI applications, but I don't think GUI
> applications usually are performance critical, right?
> 

Maya? Combustion? Final Cut Pro? Photoshop? Visual Studio (it shouldn't be, but it can get damn slow on occasion)?

Heck, most GUI programs seem like they "could be faster". Opening Outlook takes 30 seconds. Firefox takes 5-10 seconds to start. Even Windows Explorer feels sluggish (to its credit, much less so than Gnome or KDE). I'm not sure if this translates to "performance-critical", but it's certainly something to think about.
December 07, 2008
On 2008-12-07 06:34:20 +0100, Robert Fraser <fraserofthenight@gmail.com> said:

> Leandro Lucarella wrote:
>> Christopher Wright, el  6 de diciembre a las 09:06 me escribiste:
>>> Fawzi Mohamed wrote:
>>>> So yes one could probably switch back to the old Phobos style.
>>>> I would guess that it is not really a common situation for a program to become single threaded again, though...
>>>> Fawzi
>>> At work, we have a single-threaded application -- everything happens on the GUI thread. There are some operations that take a long time, though. For those, we throw up a spinny dialog box. But if these operations happened on the GUI thread, the spinny dialog box would not spin. So we do the expensive operations on a background thread.
>>> 
>>> So, our application becomes multithreaded on rare occasions and becomes single-threaded again after.
>>> 
>>> Not sure how common this is.
>> 
>> I think this is pretty common in GUI applications, but I don't think GUI
>> applications usually are performance critical, right?
>> 
> 
> Maya? Combustion? Final Cut Pro? Photoshop? Visual Studio (it shouldn't be, but it can get damn slow on occasion)?
> 
> Heck, most GUI programs seem like they "could be faster". Opening Outlook takes 30 seconds. Firefox takes 5-10 seconds to start. Even Windows Explorer feels sluggish (to its credit, much less so than Gnome or KDE). I'm not sure if this translates to "performance-critical", but it's certainly something to think about.

All the examples you gave are heavily multithreaded as far as I know (Visual Studio I do not know about).
And the standard way to make a GUI more responsive is to make it multithreaded (offloading computationally intensive tasks).
The GUI is driven by a single thread, but the application itself is multithreaded.
If you have a single-threaded application that is too slow in its single-threaded parts, you would probably want to make it multithreaded to speed it up.
So is the speedup of the single-threaded parts worth making the runtime depend on memory barriers, which are not implemented for every platform?
I don't think so.
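
For illustration, the offloading pattern mentioned above (one thread drives the GUI, heavy work runs elsewhere) could look roughly like this in C++. This is only a sketch: `expensive_operation`, `Result` and `run_with_spinner` are hypothetical names, the sleep stands in for real work, and the loop body stands in for a real event loop repainting a spinner.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Hypothetical long-running task; the sleep stands in for real work.
int expensive_operation() {
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    return 42;
}

struct Result {
    int value;           // what the background task produced
    int spinner_frames;  // how often the "GUI" got to update meanwhile
};

Result run_with_spinner() {
    // Run the task on a background thread so the calling thread stays free.
    std::future<int> pending =
        std::async(std::launch::async, expensive_operation);
    assert(pending.valid());

    int frames = 0;
    // The calling ("GUI") thread keeps running while the work is in flight.
    while (pending.wait_for(std::chrono::milliseconds(5)) !=
           std::future_status::ready) {
        ++frames;  // a real GUI would pump events and repaint the spinner here
    }
    return Result{pending.get(), frames};
}
```

Once `run_with_spinner` returns, the program is effectively single-threaded again, which is the situation Christopher Wright described.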

Fawzi

December 07, 2008
On 2008-12-07 03:48:40 +0100, Sean Kelly <sean@invisibleduck.org> said:

> Fawzi Mohamed wrote:
>> On 2008-12-06 17:13:34 +0100, Sean Kelly <sean@invisibleduck.org> said:
>> 
>>> Fawzi Mohamed wrote:
>>>> 
>>>> a memory barrier would be needed, and atomic decrements, but I see that it is not portable...
>>> 
>>> It would also somewhat defeat the purpose of thread_needLock, since IMO this routine should be fast.  If memory barriers are involved then it may as well simply use a mutex itself, and this is exactly what it's intended to avoid.
>> 
>> the memory barrier would be needed in the code that decrements the number of active threads, so that you are sure that no pending writes are still there, (that is the problem that you said brought you to switch to a multithreaded flag), not in the code of thread_needLock...
> 
> Not true.  You would need an acquire barrier in thread_needLock. However, on x86 the point is probably moot since loads have acquire semantics anyway.

You would need a very good processor to reorder speculative loads before a function call and a branch. As far as I know even the Alpha did not do it.
A volatile statement will probably be enough in all cases, but you are right that to be really correct a load barrier should be used, and even on a processor where this might matter its cost in the fast path will be basically 0 (so still better than a lock).

> 
>> But again I would say that this optimization is not really worth it (as you also said it), even if it is relevant for GUI applications.
> 
> :-)
> 
> 
> Sean


December 07, 2008
Fawzi Mohamed wrote:
> On 2008-12-07 03:48:40 +0100, Sean Kelly <sean@invisibleduck.org> said:
>>
>> Not true.  You would need an acquire barrier in thread_needLock. However, on x86 the point is probably moot since loads have acquire semantics anyway.
> 
> You would need a very good processor to reorder speculative loads before a function call and a branch. As far as I know even the Alpha did not do it.

But if thread_needLock() is inlined...

> A volatile statement will probably be enough in all cases, but you are right that to be really correct a load barrier should be used, and even on a processor where this might matter its cost in the fast path will be basically 0 (so still better than a lock).

Aye.  I'd do this if there were a common use case that justified it, but I don't see one.


Sean
December 07, 2008
On 2008-12-07 09:23:01 +0100, Sean Kelly <sean@invisibleduck.org> said:

> Fawzi Mohamed wrote:
>> On 2008-12-07 03:48:40 +0100, Sean Kelly <sean@invisibleduck.org> said:
>>> 
>>> Not true.  You would need an acquire barrier in thread_needLock. However, on x86 the point is probably moot since loads have acquire semantics anyway.
>> 
>> You would need a very good processor to reorder speculative loads before a function call and a branch. As far as I know even the Alpha did not do it.
> 
> But if thread_needLock() is inlined...
> 
>> A volatile statement will probably be enough in all cases, but you are right that to be really correct a load barrier should be used, and even on a processor where this might matter its cost in the fast path will be basically 0 (so still better than a lock).
> 
> Aye.  I'd do this if there were a common use case that justified it, but I don't see one.

I fully agree with you (see my answer to Robert Fraser).
> 
> 
> Sean


December 07, 2008
Reply to Robert,

> Opening
> Outlook takes 30 seconds.

The early (internal) versions of Outlook took 5 MINUTES to open an e-mail. IIRC, the solution ended up using fewer threads (though probably not a single thread).

