September 25, 2014
On Monday, 22 September 2014 at 19:58:31 UTC, Dmitry Olshansky wrote:
> 22-Sep-2014 13:45, Ola Fosheim Grostad пишет:
>> Locking fibers to threads will cost you more than using threadsafe
>> features. One 300ms request can then starve waiting fibers even if you
>> have 7 free threads.
>
> This statement doesn't make any sense taken in isolation. It lacks way too much context to be informative. For instance, "locking a thread for 300ms" is easily averted if all I/O and blocking sys-calls are managed in a separate thread pool (which may grow far beyond the fiber-scheduled "web" thread pool).
>
> And if "locked" means CPU-bound locked, then it's
> a) hard to fix without help from OS: re-scheduling a fiber without explicit yield ain't possible (it's cooperative, preemption is in the domain of OS).

If you process and compress a large dataset in one fiber you don't need rescheduling. You just need the scheduler to pick fibers according to priority regardless of the origin thread.

> b) If CPU-bound is happening more often then once in a while, then fibers are poor fit anyway - threads (and pools of 'em) do exactly what's needed in this case by being natively preemptive and well suited for running multiple CPU intensive tasks.

Not really the issue. Load comes in spikes; if on average you only have a couple of heavy fibers at the same time then you are fine. You can spawn more threads if needed, but that won't help if fibers are stuck on a slow thread.

>> That's bad for latency, because then all fibers on
>> that thread will get 300+ms in latency.
>
> E-hm locking threads to fibers and arbitrary latency figures have very little to do with each other. The nature of that latency is extremely important.

If you are in line behind a CPU-heavy fiber then you get that effect.

>> How anyone can disagree with this is beyond me.
>
> IMHO poorly formed problem statements are not going to prove your point. Pardon me making a personal statement, but for instance showing how Go avoids your problem and clearly specifying the exact conditions that cause it would go a long way toward demonstrating whatever you wanted to.

Any decent framework that is concerned about latency solves this the same way: light threads, or events, or whatever are not locked to a specific thread.

Isolates are fine, but D does not provide it afaik.

September 26, 2014
On Tuesday, 23 September 2014 at 06:19:58 UTC, deadalnix wrote:
> On Tuesday, 23 September 2014 at 03:03:49 UTC, Manu via Digitalmars-d wrote:
>> I still think most of those users would accept RC instead of GC. Why not
>> support RC in the language, and make all of this library noise redundant?
>> Library RC can't really optimise well, RC requires language support to
>> elide ref fiddling.
>
> I think a library solution + intrinsic for increment/decrement (so they can be better optimized) would be the best option.

Yes, an inc/dec intrinsic is needed to support TSX. I.e. you don't have to inc/dec to keep the object alive within a transaction; you only need to read something on the same cache line as the ref count. Essentially zero overhead in many cases, afaik.
September 26, 2014
Analysis of Go growth / usage.

http://redmonk.com/dberkholz/2014/03/18/go-the-emerging-language-of-cloud-infrastructure/
September 26, 2014
24-Sep-2014 18:55, Andrei Alexandrescu пишет:
> On 9/24/14, 3:31 AM, Dmitry Olshansky wrote:
>> 23-Sep-2014 19:13, Andrei Alexandrescu пишет:
>>> On 9/23/14, 12:17 AM, Dmitry Olshansky wrote:
>>>> In my imagination it would be along the lines of
>>>> @ARC
>>>> struct MyCountedStuff{ void opInc(); void opDec(); }
>>>
>>> So that would be a pointer type or a value type? Is there copy on write
>>> somewhere? -- Andrei
>>
>> It would be an intrusively counted type with pointer somewhere in the
>> body. To put it simply MyCountedStuff is a kind of smart pointer.
>
> Then that would be confusing seeing as structs are value types. What
> you're saying is that a struct with opInc() and opDec() has pointer
> semantics whereas one with not has value semantics. That design isn't
> going to fly.

Read that as
struct RefCounted(T){

	void opInc();
	void opDec();
}

The main thing is to let compiler know the stuff is ref-counted in some generic way.

>
> For classes such a design makes sense as long as the class is no longer
> convertible to Object. That's what I'm proposing for RCObject (and
> Throwable that would inherit it).
>
>
> Andrei
>

-- 
Dmitry Olshansky
September 26, 2014
On 9/26/14, 2:50 PM, Dmitry Olshansky wrote:
> 24-Sep-2014 18:55, Andrei Alexandrescu пишет:
>> On 9/24/14, 3:31 AM, Dmitry Olshansky wrote:
>>> 23-Sep-2014 19:13, Andrei Alexandrescu пишет:
>>>> On 9/23/14, 12:17 AM, Dmitry Olshansky wrote:
>>>>> In my imagination it would be along the lines of
>>>>> @ARC
>>>>> struct MyCountedStuff{ void opInc(); void opDec(); }
>>>>
>>>> So that would be a pointer type or a value type? Is there copy on write
>>>> somewhere? -- Andrei
>>>
>>> It would be an intrusively counted type with pointer somewhere in the
>>> body. To put it simply MyCountedStuff is a kind of smart pointer.
>>
>> Then that would be confusing seeing as structs are value types. What
>> you're saying is that a struct with opInc() and opDec() has pointer
>> semantics whereas one with not has value semantics. That design isn't
>> going to fly.
>
> Read that as
> struct RefCounted(T){
>
>      void opInc();
>      void opDec();
> }

Consider:

struct MyRefCounted {
    void opInc();
    void opDec();
    int x;
}

MyRefCounted a;
a.x = 42;
MyRefCounted b = a;
b.x = 43;

What is a.x after this?


Andrei
September 27, 2014
> Consider:
>
> struct MyRefCounted {
>     void opInc();
>     void opDec();
>     int x;
> }
>
> MyRefCounted a;
> a.x = 42;
> MyRefCounted b = a;
> b.x = 43;
>
> What is a.x after this?
>
>
> Andrei

a.x == 42
a.ref_count == 1 (1 for init, +1 for copy, -1 for destruction)
b.x == 43
b.ref_count == 1 (only init)
September 27, 2014
27-Sep-2014 02:51, Andrei Alexandrescu пишет:
> On 9/26/14, 2:50 PM, Dmitry Olshansky wrote:
>> 24-Sep-2014 18:55, Andrei Alexandrescu пишет:
>>> On 9/24/14, 3:31 AM, Dmitry Olshansky wrote:
>>>> 23-Sep-2014 19:13, Andrei Alexandrescu пишет:
>>>>> On 9/23/14, 12:17 AM, Dmitry Olshansky wrote:
>>>>>> In my imagination it would be along the lines of
>>>>>> @ARC
>>>>>> struct MyCountedStuff{ void opInc(); void opDec(); }
>>>>>
>>>>> So that would be a pointer type or a value type? Is there copy on
>>>>> write
>>>>> somewhere? -- Andrei
>>>>
>>>> It would be an intrusively counted type with pointer somewhere in the
>>>> body. To put it simply MyCountedStuff is a kind of smart pointer.
>>>
>>> Then that would be confusing seeing as structs are value types. What
>>> you're saying is that a struct with opInc() and opDec() has pointer
>>> semantics whereas one with not has value semantics. That design isn't
>>> going to fly.
>>
>> Read that as
>> struct RefCounted(T){
>>
>>      void opInc();
>>      void opDec();
>> }
>
> Consider:
>
> struct MyRefCounted {
>      void opInc();
>      void opDec();
>      int x;
> }
>
> MyRefCounted a;
> a.x = 42;
> MyRefCounted b = a;
> b.x = 43;
>
> What is a.x after this?

Okay, it serves no good for me to make these tiny comments while on the go.

As usual, structs are value types, so this feature can be misused, no two thoughts about it. It may need a bit of improvement in user-friendliness; the compiler may help there by auto-detecting common misuse.

Theoretically classes would be the better choice, except that the question of allocation pops up immediately; then consider, for instance, COM objects.

The good thing w.r.t. memory about structs is that they are themselves already allocated "somewhere", and it's only the ref-counted payload that is allocated and destroyed in a user-defined way.

And now the killer reason to go for structs is the following:

The compiler _already_ does all of the lifetime management and has had numerous bug fixes to make sure it does the right thing. In contrast there is nothing for classes that tracks their lifetimes to call the proper hooks.

Let's REUSE the mechanism we have with structs and go as lightly as possible on the untested-LOCs budget.

Full outline of a generic-to-the-max, dirt-cheap implementation with a bit of lowering:

ARC, or anything close to it, is implemented as follows:
1. Any struct that has @ARC attached must have the following methods:
	void opInc();
	bool opDec(); // true - time to destroy
It also MUST NOT have a postblit, and MUST have a destructor.

2. The compiler takes the user-defined destructor and creates the proper destructor, as the equivalent of this:
	if(opDec()){
		user_defined_dtor;
	}
3. The postblit is defined as opInc().

4. Any ctor has opInc() appended to its body.

Everything else is taken care of by the very nature of structs.
Now this is enough to make ref-counted stuff a bit simpler to write, but not much beyond that. So here are the next "consequences" that we can then implement:

5. The compiler is expected to assume, anywhere in fully inlined code, that opInc()/opDec() pairs are a no-op. It should do so even in debug mode (though there is less opportunity to do so without inlining). Consider it the NRVO of the new age, a required optimization.

6. If we extend opInc/opDec to take an argument, the compiler may go further and batch up multiple opInc-s and opDec-s, as long as it's safe to do so (e.g. there could be exceptions thrown!):

Consider:

auto a = File("some-file.txt");
//pass to some structs for future use
B b = B(a);
C c = C(a);
a = File("other file");

May become (this is overly simplified!):

File a = void, b = void, c = void;
a = File.user_ctor("some-file.txt");
a.opInc(2);
b = B(a);
c = C(a);
a = File.user_ctor("other file");
a.opInc();
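The lowering in the outline above can be sketched by hand. Below is a minimal C++ analogue (C++ standing in for D: a copy constructor plays the role of the postblit, and the destructor wraps the user's dtor body in the opDec() check). All names are illustrative assumptions, not a real API.

```cpp
#include <cassert>

// Observable side effect of the "user-defined dtor" body.
int live_payloads = 0;

// Hypothetical @ARC-style struct, hand-lowered per the outline:
// every ctor/copy ends with opInc(), and the generated destructor runs the
// user's dtor body only when opDec() reports the count hit zero.
struct Counted {
    int* count;  // shared payload: the ref count lives with the resource

    // rule 4: any ctor has opInc() appended -- count starts at 1
    Counted() : count(new int(0)) { ++live_payloads; opInc(); }

    // rule 3: the postblit (copy) is defined as opInc()
    Counted(const Counted& o) : count(o.count) { opInc(); }

    // assignment would need a dec-then-inc; omitted from this sketch
    Counted& operator=(const Counted&) = delete;

    // rule 2: generated dtor = if (opDec()) user_defined_dtor();
    ~Counted() {
        if (opDec()) {
            --live_payloads;  // "user-defined dtor" body
            delete count;
        }
    }

    void opInc() { ++*count; }
    bool opDec() { return --*count == 0; }  // true: time to destroy
};

// Three handles alive at once -> count peaks at 3; all decs run on scope
// exit and the payload is freed exactly once.
int make_and_copy() {
    Counted a;       // count == 1
    Counted b = a;   // count == 2
    Counted c = b;   // count == 3
    return *a.count;
}
```

The point of the exercise is that nothing here needed new lifetime machinery: the existing ctor/copy/dtor hooks carry the whole scheme, which is the "reuse what structs already have" argument.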


-- 
Dmitry Olshansky
September 27, 2014
27-Sep-2014 12:11, Foo пишет:
>> Consider:
>>
>> struct MyRefCounted {
>>     void opInc();
>>     void opDec();
>>     int x;
>> }
>>
>> MyRefCounted a;
>> a.x = 42;
>> MyRefCounted b = a;
>> b.x = 43;
>>
>> What is a.x after this?
>>
>>
>> Andrei
>
> a.x == 42
> a.ref_count == 1 (1 for init, +1 for copy, -1 for destruction)
> b.x == 43
> b.ref_count == 1 (only init)

There is no implicit ref-count. opInc may just as well create a file on the hard drive and count refs there. Granted, that would be an idiotic idea, but the mechanism itself opens the door to some cool alternatives like:

- separate tables for ref-counts (many gamedevs seem to favor this; also see Objective-C)
- using the padding of some fields for the ref-count
- going further and using e.g. 1 byte for the ref-count at one's own risk, or even a couple of bits here and there

I may go on and on. But also consider:

GObject of GLib (the GNOME libraries)
XPCOM (something I think Mozilla did as a sort-of COM)
MS COM
etc.

Ref-counting is the process of add(x) and sub(x), and calling the destructor should the subtract call report zero. Everything else is in the hands of the creator.
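As one concrete instance of the "separate tables" alternative mentioned above, here is a minimal C++ sketch of a side-table ref count keyed by object address (roughly the strategy Objective-C used for objects without an inline count field). The names and the single-map design are assumptions for illustration; a real implementation would shard and synchronize the table.

```cpp
#include <cassert>
#include <unordered_map>

// Side table: ref counts live outside the objects, keyed by address.
// The counted objects themselves need no count member at all.
std::unordered_map<const void*, int> side_counts;

void opInc(const void* obj) { ++side_counts[obj]; }

// Returns true when the last reference is gone (time to destroy).
bool opDec(const void* obj) {
    auto it = side_counts.find(obj);
    if (--it->second == 0) {
        side_counts.erase(it);  // drop the table entry with the last ref
        return true;
    }
    return false;
}

// Two incs, two decs: only the second dec reports "destroy".
// Encodes the two results as bits: dead1 -> 1, dead2 -> 2.
int demo() {
    int object = 42;             // any object, no intrusive count field
    opInc(&object);
    opInc(&object);
    bool dead1 = opDec(&object); // still referenced -> false
    bool dead2 = opDec(&object); // last reference   -> true
    return (dead1 ? 1 : 0) + (dead2 ? 2 : 0);
}
```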

-- 
Dmitry Olshansky
September 27, 2014
25-Sep-2014 17:31, Ola Fosheim Grostad пишет:
> On Monday, 22 September 2014 at 19:58:31 UTC, Dmitry Olshansky wrote:
>> 22-Sep-2014 13:45, Ola Fosheim Grostad пишет:
>>> Locking fibers to threads will cost you more than using threadsafe
>>> features. One 300ms request can then starve waiting fibers even if you
>>> have 7 free threads.
>>
>> This statement doesn't make any sense taken in isolation. It lacks way
>> too much context to be informative. For instance, "locking a thread
>> for 300ms" is easily averted if all I/O and blocking sys-calls are
>> managed in a separate thread pool (that may grow far beyond
>> fiber-scheduled "web" thread pool).
>>
>> And if "locked" means CPU-bound locked, then it's
>> a) hard to fix without help from OS: re-scheduling a fiber without
>> explicit yield ain't possible (it's cooperative, preemption is in the
>> domain of OS).
>
> If you process and compress a large dataset in one fiber you don't need
> rescheduling. You just need the scheduler to pick fibers according to
> priority regardless of the origin thread.

So do not. A large dataset is not something a single thread should process anyway; just post it to the "workers" thread pool and wait on that (by yielding).

There is no FUNDAMENTAL problem.
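The "post it to the workers pool and yield" design can be sketched as follows, in C++ for concreteness. Here `std::async` stands in for a real worker pool, and the request handler stands in for a fiber; in a fiber framework the `get()` would be a yield point where the scheduler runs other fibers on the same thread instead of blocking it. All names are illustrative.

```cpp
#include <future>
#include <numeric>
#include <vector>

// Placeholder for an expensive CPU-bound job (compression, etc.).
long heavy_work(const std::vector<int>& data) {
    return std::accumulate(data.begin(), data.end(), 0L);
}

// The "fiber": it never crunches the dataset itself. It posts the work to
// another thread and waits for the result, keeping its own thread free for
// other lightweight tasks in the meantime.
long handle_request(const std::vector<int>& data) {
    std::future<long> result =
        std::async(std::launch::async, heavy_work, std::cref(data));
    // In a fiber scheduler this wait would be a yield, not an OS-level block.
    return result.get();
}
```

The design point: latency for the other requests is bounded by the handler's own (short) work, not by the heavy job, regardless of whether fibers migrate between threads.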

>
>> b) If CPU-bound is happening more often then once in a while, then
>> fibers are poor fit anyway - threads (and pools of 'em) do exactly
>> what's needed in this case by being natively preemptive and well
>> suited for running multiple CPU intensive tasks.
>
> Not really the issue. Load comes in spikes,

You are trying to change the issue itself.
Load is a multitude of requests; we are speaking of a SINGLE one taking a lot of time. So load makes no difference here; we are talking about a DoS-ish kind of thing, not DDoS.

And my postulate is as follows: as long as one request may take a long time, there are going to be arbitrarily many such "long" requests in a row, esp. on public services, which everybody tries hard to abuse.

>if you on average only have
> a couple of heavy fibers at the same time then you are fine. You can
> spawn more threads if needed, but that won't help if fibers are stuck on
> a slow thread.

Well, that's convenient, I won't deny, but by itself it just patches up the problem, and in a non-transparent way - oh, hey, 10 requests are taking too much time, let's spawn an 11th thread.

But - if some requests may take arbitrarily long to complete, just use a separate pool for heavy work; it's a _better_ design and more resilient to "heavy" requests anyway.

>>> That's bad for latency, because then all fibers on
>>> that thread will get 300+ms in latency.
>>
>> E-hm locking threads to fibers and arbitrary latency figures have very
>> little to do with each other. The nature of that latency is extremely
>> important.
>
> If you are in line behind a CPU-heavy fiber then you get that effect.

Aye, I just don't see myself doing hard work on a fiber. They are not meant for that.

>>> How anyone can disagree with this is beyond me.
>>
>> IMHO poorly formed problem statements are not going to prove your
>> point. Pardon me making a personal statement, but for instance showing
>> how Go avoids your problem and clearly specifying the exact conditions
>> that cause it would go a long way toward demonstrating whatever you wanted to.
>
> Any decent framework that is concerned about latency solves this the
> same way: light threads, or events, or whatever are not locked to a
> specific thread.

They do not have thread-local data by default. But anyway - ad populum.

>
> Isolates are fine, but D does not provide it afaik.
>
Would you explain?

-- 
Dmitry Olshansky
September 27, 2014
26-Sep-2014 06:49, Ola Fosheim Grostad пишет:
> Analysis of Go growth / usage.
>
> http://redmonk.com/dberkholz/2014/03/18/go-the-emerging-language-of-cloud-infrastructure/
>

Google was popular last time I heard, and so is their language.

-- 
Dmitry Olshansky