May 08, 2015
On Friday, 8 May 2015 at 19:54:26 UTC, Vladimir Panteleev wrote:
> I don't know enough about TLS to argue, but it strikes me as odd that it would be slower than the layers of un-inlinable extern(C) calls, going through lifetime.d, gc.d, gcx.d, there locking on a global mutex, and allocating memory according to a general-purpose GC (vs. specialized allocator).

No, it won't, but I'd like us to be compared to state-of-the-art allocators (jemalloc, Java's G1, and so on) rather than the current thing that we have, which is universally recognized as being not good, to put it nicely.

You want to compare yourself to what the best guys in town are doing, not the drunk hobo wandering around.
May 08, 2015
On 5/8/15 1:17 PM, deadalnix wrote:
> On Friday, 8 May 2015 at 19:54:26 UTC, Vladimir Panteleev wrote:
>> I don't know enough about TLS to argue but it strikes me as odd that
>> it would be slower than the layers of un-inlinable extern(C) calls,
>> going through lifetime.d, gc.d, gcx.d, there locking on a global
>> mutex, and allocating memory according to a general-purpose GC (vs.
>> specialized allocator).
>
> No, it won't, but I'd like us to be compared to state-of-the-art
> allocators (jemalloc, Java's G1, and so on) rather than the current thing
> that we have, which is universally recognized as being not good, to put it
> nicely.
>
> You want to compare yourself to what the best guys in town are doing, not
> the drunk hobo wandering around.

Of course! I actually do think std.allocator is a net improvement over the state of the art. And if it isn't, we need to make it so. -- Andrei

May 10, 2015
On 2015-05-08 21:55, Andrei Alexandrescu wrote:

> a few measurements would be in order. -- Andrei

Be sure you do that on more than one platform. For example, the emulated TLS on OS X can be quite slow, I've heard.

-- 
/Jacob Carlborg
May 11, 2015
On Sunday, 10 May 2015 at 16:56:27 UTC, Jacob Carlborg wrote:
> On 2015-05-08 21:55, Andrei Alexandrescu wrote:
>
>> a few measurements would be in order. -- Andrei
>
> Be sure you do that on more than one platform. For example, the emulated TLS on OS X can be quite slow, I've heard.

I was trying to come up with a good benchmark for TLS, but it is remarkably difficult.

Usually, you have one TLS segment per linker module (meaning one for your app plus one per shared object). One of these segments is kept around by the compiler, ready to be used.

Once you access TLS in your code, one of the following happens:
1/ The compiler knows you have the right segment around, so no segment lookup needs to take place.
2/ The compiler doesn't know it, but you have the right segment. In that case you do a round trip through the runtime, but take the fast path.
3/ You have the wrong segment, in which case the runtime has to figure out which segment is the right one; that is slow, often implies locks, and in the worst case even requires a round trip to the OS.

A good benchmark must have TLS accessed from both the application and some shared object, be big enough that the compiler does not see through all these accesses (otherwise it will simply keep both segments around, which it won't do by default but will if the need is apparent), and have a realistic access pattern (it is fairly easy to trash performance by ping-ponging between the two TLS segments, but that is probably not very realistic).
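
As a very rough starting point, and only for the first case in the list above (a single module, where the compiler can keep the segment around), the C sketch below times the same loop over an ordinary global and a thread-local one. The names bump_plain, bump_tls and seconds_for are made up for illustration. It is not the benchmark described above: it has no shared object and no realistic access pattern, so it says nothing about the slow path.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical counters for illustration: one ordinary global and one
   thread-local. volatile keeps the compiler from collapsing each loop
   into a single addition. */
static volatile long plain_counter = 0;
static _Thread_local volatile long tls_counter = 0;

/* Reset the ordinary global, increment it n times, return its value. */
long bump_plain(long n) {
    assert(n >= 0);
    plain_counter = 0;
    for (long i = 0; i < n; i++) {
        plain_counter++;
    }
    return plain_counter;
}

/* Same loop, but every access goes to the thread-local variable. */
long bump_tls(long n) {
    assert(n >= 0);
    tls_counter = 0;
    for (long i = 0; i < n; i++) {
        tls_counter++;
    }
    return tls_counter;
}

/* CPU time, in seconds, consumed by a single call of fn(n). */
double seconds_for(long (*fn)(long), long n) {
    clock_t start = clock();
    fn(n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling seconds_for(bump_plain, n) and seconds_for(bump_tls, n) from a main function with a large n gives two numbers to compare. Covering the second and third cases would mean moving bump_tls and its counter into a shared object, which this sketch deliberately leaves out.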

Long story short, I'm worried by this TLS issue, but I'd welcome more data.
May 11, 2015
On Thursday, 7 May 2015 at 18:26:47 UTC, Andrei Alexandrescu wrote:
> https://git-scm.com/book/tr/v2/Git-Internals-Plumbing-and-Porcelain
>
> Made perfect sense the second I first saw it. -- Andrei

I always thought that it was a bit vulgar, myself, but git has made the term at least somewhat common in this context.

- Jonathan M Davis
May 11, 2015
On Thursday, 7 May 2015 at 02:28:45 UTC, Andrei Alexandrescu wrote:
> http://erdani.com/d/phobos-prerelease/std_experimental_allocator_porcelain.html

On the face of it, it's doing roughly what I expected, though the devil's in the details, and it's likely taking care of quite a few things that I hadn't thought of. I was surprised to see that arrays didn't just use make, but on further inspection, I would guess that it was simply easier to have a separate makeArray because the arguments differ. Otherwise, you'd need to muck around with make's auto ref parameters so that, when make is instantiated with an array type, they line up with the parameters that makeArray currently expects, and give an error otherwise. A separate makeArray also makes the parameters clear in the signature.

- Jonathan M Davis