April 25, 2020
On 4/24/2020 12:27 PM, Arine wrote:
> There most definitely is a difference and the assembly generated with rust is better.
D's @live functions can indeed do such optimizations, though I haven't got around to implementing them in DMD's optimizer. There's nothing particularly difficult about it.
April 25, 2020
On Saturday, 25 April 2020 at 10:15:33 UTC, Walter Bright wrote:
> On 4/24/2020 12:27 PM, Arine wrote:
>> There most definitely is a difference and the assembly generated with rust is better.
> D's @live functions can indeed do such optimizations, though I haven't got around to implementing them in DMD's optimizer. There's nothing particularly difficult about it.

In any case, I seriously doubt those kinds of optimization have anything to do with the web framework performance differences.

My experience of writing number-crunching stuff in D and Rust is that Rust seems to have a small but consistent performance edge, which could quite possibly be down to the kind of optimizations that Arine mentions (that's speculation: I haven't verified).  However, these are small differences, not order-of-magnitude stuff.

I suppose that in a more complicated app there could be some multiplicative impact, but where high-throughput web frameworks are concerned I'm pretty sure that the memory allocation and reuse strategy is going to be what makes 99% of the difference.

There may also be a bit of an impact from the choice of futures vs. fibers for managing asynchronous tasks (there's a context switching cost for fibers), but I would expect that to only make a difference at the extreme upper end of performance, once other design factors have been addressed.

BTW, on the memory allocation front, Mathias Lang has pointed out that there is quite a nasty impact from `assumeSafeAppend`.  Imagine that your request processing looks something like this:

    // extract array instance from reusable pool,
    // and set its length to zero so that you can
    // write into it from the start
    x = buffer_pool.get();
    x.length = 0;
    assumeSafeAppend(x);   // a cost each time you do this

    // now append stuff into x to
    // create your response

    // now publish your response

    // with the response published, clean
    // up by recycling the buffer back into
    // the pool
    buffer_pool.recycle(x);

This is the kind of pattern that Sociomantic used a lot.  In D1 it was easy because there was no array stomping prevention -- you could just set length == 0 and start appending.  But having to call `assumeSafeAppend` each time does carry a performance cost.

IIRC Mathias has suggested that it should be possible to tag arrays as intended for this kind of re-use, so that stomping prevention will never trigger, and you don't have to `assumeSafeAppend` each time you reduce the length.
April 25, 2020
On 24.04.2020 22:27, Arine wrote:
> On Thursday, 23 April 2020 at 15:57:01 UTC, drug wrote:
>> And your statement that Rust assembly output is better is wrong.
> 

Yes, your statement that Rust assembly output is better is wrong, because one single optimization applicable in some cases does not make Rust better in general. Period.

Once again, Rust assembly output can be better in some cases. But there is a big difference between these two statements: "better in some cases" and "better in general". Moreover, you are wrong twice, because this optimization is not free at all. You pay for it in the form of a restriction: you cannot have more than one mutable reference. This means that cyclic data structures are unusually difficult compared to almost any other programming language.
Also, this optimization has been available in C for a long time. Even more: in some cases a GC-based application can be faster than one with manual memory management, because it avoids numerous allocations/deallocations. What you are talking about is, in fact, premature optimization.

> 
> There most definitely is a difference and the assembly generated with rust is better. This is just a simple example to illustrate the difference, if you don't know why the difference is significant or why it is happening. There are a lot of great articles out there; sadly there are people such as yourself spreading misinformation who don't know what a borrow checker is and don't know Rust or why it has gone as far as it has. This is why the borrow checker for D is going to fail: because the person designing it, such as yourself, doesn't have any idea what they are redoing and has never even bothered to touch Rust or learn about it. Anyways, I'm not your babysitter; if you don't understand the above, as most people seem not to bother to learn assembly anymore, you're on your own.
> 

Self-importance is written all over your post. Here you make your third mistake: you are very far from being able to be my babysitter. Trying to show your competence, you show only your blind ignorance. The world is much less trivial than a function with two mutable references not performing any useful work.
April 25, 2020
On Saturday, April 25, 2020 4:34:44 AM MDT Joseph Rushton Wakeling via Digitalmars-d wrote:
> IIRC Mathias has suggested that it should be possible to tag arrays as intended for this kind of re-use, so that stomping prevention will never trigger, and you don't have to `assumeSafeAppend` each time you reduce the length.

You could probably do that, but I'm not sure that it could be considered @safe. It would probably make more sense to just use a custom array type if that's what you really needed, though of course, that causes its own set of difficulties (including having to duplicate the array appending logic).

- Jonathan M Davis



April 25, 2020
On 4/25/2020 3:34 AM, Joseph Rushton Wakeling wrote:
> In any case, I seriously doubt those kinds of optimization have anything to do with the web framework performance differences.

I agree. I also generally structure my code so that optimization wouldn't make a difference. But it's still a worthwhile benefit to add it for @live functions.


> I suppose that in a more complicated app there could be some multiplicative impact, but where high-throughput web frameworks are concerned I'm pretty sure that the memory allocation and reuse strategy is going to be what makes 99% of the difference.

My experience is that if the code has never been profiled, there's one obscure function unexpectedly consuming the bulk of the run time, which is easily recoded. Programs that have been profiled tend to have a pretty flat graph of which functions eat the time.


>      // extract array instance from reusable pool,
>      // and set its length to zero so that you can
>      // write into it from the start
>      x = buffer_pool.get();
>      x.length = 0;
>      assumeSafeAppend(x);   // a cost each time you do this
> 
>      // now append stuff into x to
>      // create your response
> 
>      // now publish your response
> 
>      // with the response published, clean
>      // up by recycling the buffer back into
>      // the pool
>      buffer_pool.recycle(x);
> 
> This is the kind of pattern that Sociomantic used a lot.  In D1 it was easy because there was no array stomping prevention -- you could just set length == 0 and start appending.  But having to call `assumeSafeAppend` each time does carry a performance cost.

This is why I use OutBuffer for such activities.


> IIRC Mathias has suggested that it should be possible to tag arrays as intended for this kind of re-use, so that stomping prevention will never trigger, and you don't have to `assumeSafeAppend` each time you reduce the length.

Sounds like an idea worth exploring. How about taking point on that? But I would be concerned about tagging such arrays, and then stomping them unintentionally, leading to memory corruption bugs. OutBuffer is memory safe.
April 26, 2020
On 25.04.20 12:15, Walter Bright wrote:
> On 4/24/2020 12:27 PM, Arine wrote:
>> There most definitely is a difference and the assembly generated with rust is better.
> D's @live functions can indeed do such optimizations, though I haven't got around to implementing them in DMD's optimizer. There's nothing particularly difficult about it.

What's an example of such an optimization and why won't it introduce UB to @safe code?
April 25, 2020
On 4/24/2020 8:06 PM, SrMordred wrote:
> Also you can achieve the same asm with @llvmAttr("noalias") in front of at least one argument.

The following C code:

    int test(int * __restrict__ x, int * __restrict__ y) {
        *x = 0;
        *y = 1;
        return *x;
    }

compiled with gcc -O:

test:
                mov     dword ptr [RDI],0
                mov     dword ptr [RSI],1
                mov     EAX,0
                ret

It's not a unique property of Rust, C99 has it too. DMC doesn't implement it, but it probably should.
April 25, 2020
On 4/25/2020 4:00 PM, Timon Gehr wrote:
> What's an example of such an optimization and why won't it introduce UB to @safe code?

    @live void test() { int a,b; foo(a, b); }

    @live int foo(ref int a, ref int b) {
        a = 0;
        b = 1;
        return a;
    }

ref a and ref b cannot refer to the same memory object.
April 26, 2020
On 26.04.20 04:22, Walter Bright wrote:
> On 4/25/2020 4:00 PM, Timon Gehr wrote:
>> What's an example of such an optimization and why won't it introduce UB to @safe code?
> 
>      @live void test() { int a,b; foo(a, b); }
> 
>      @live int foo(ref int a, ref int b) {
>          a = 0;
>          b = 1;
>          return a;
>      }
> 
> ref a and ref b cannot refer to the same memory object.

Actually they can, even in @safe @live code.
April 26, 2020
On 4/26/2020 12:45 AM, Timon Gehr wrote:
> On 26.04.20 04:22, Walter Bright wrote:
>> ref a and ref b cannot refer to the same memory object.
> Actually they can, even in @safe @live code.

Bug reports are welcome. Please tag them with the 'live' keyword in bugzilla.