April 26, 2020
On Saturday, 25 April 2020 at 15:21:03 UTC, Jonathan M Davis wrote:
> You could probably do that, but I'm not sure that it could be considered @safe.

I think it would be OK to have it as a non-@safe tool.  But ...

> It would probably make more sense to just use a custom array type if that's what you really needed, though of course, that causes its own set of difficulties (including having to duplicate the array appending logic).

... I think that could possibly make more sense.  One thing that I really don't like about the original idea of an `alwaysAssumeSafeAppend(x)` is that it makes behaviour dependent on the instance rather than the type.  It would probably be better to have a clear type-based separation.

OTOH in my experience custom types are often finicky in terms of how they interact with functions that expect a slice as input.  So there could be a convenience in having it as an option for regular dynamic arrays.  Or it could just be that the custom type would need a bit more work in its implementation :-)
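
For illustration, a minimal sketch of what such a type-based separation might look like (`ReusableBuffer` is hypothetical, not a real library type).  The `opSlice` at the end is what lets slice-expecting functions consume it, though call sites still need an explicit `buf[]`, which is exactly the finicky part:

    // Hypothetical sketch: a buffer that owns its memory outright,
    // so "stomping" is always allowed by design.
    struct ReusableBuffer(T)
    {
        private T[] storage;   // backing allocation, only ever grows
        private size_t used;   // current logical length

        // append without any stomping check or opaque runtime call
        void opOpAssign(string op : "~")(T value)
        {
            if (used == storage.length)
                storage.length = storage.length ? storage.length * 2 : 16;
            storage[used++] = value;
        }

        // recycle for the next request; no assumeSafeAppend needed
        void reset() { used = 0; }

        // expose contents so slice-expecting functions can consume them
        inout(T)[] opSlice() inout { return storage[0 .. used]; }
    }

A function `void f(ubyte[] data)` would have to be called as `f(buf[])`: workable, but it is the kind of friction described above.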
April 26, 2020
On Saturday, 25 April 2020 at 22:15:44 UTC, Walter Bright wrote:
> My experience is if the code has never been profiled, there's one obscure function unexpectedly consuming the bulk of the run time, which is easily recoded. Programs that have been runtime profiled tend to have a pretty flat graph of which functions eat the time.

Yes! :-)  And in particular, computational complexity in the real world is very different from theoretical arguments about O(...).  One can gain a lot by being clear-headed about what the actual problem is and what is optimal for that particular problem with that particular data.

> This is why I use OutBuffer for such activities.

Yes, I have some memory of talking with Dicebot about whether this would be an appropriate tool for the other side of the D2 conversion.  I don't remember if any firm conclusions were drawn, though.

> Sounds like an idea worth exploring. How about taking point on that? But I would be concerned about tagging such arrays, and then stomping them unintentionally, leading to memory corruption bugs. OutBuffer is memory safe.

Yes, it's clear (as Jonathan noted) that an always-stompable array could probably not be @safe.  That said, what does OutBuffer do that means that it _is_ safe in this context?
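
For readers following along, the OutBuffer pattern in question looks roughly like this (a sketch assuming Phobos's `std.outbuffer` API, with `clear` rewinding the buffer for reuse; note that `toBytes` returns a view into the buffer, so a response must be consumed or copied before the buffer is recycled):

    import std.outbuffer : OutBuffer;

    void main()
    {
        auto buf = new OutBuffer();
        buf.reserve(4096);          // size the allocation once, reuse afterwards

        foreach (request; 0 .. 3)
        {
            buf.clear();            // rewind for reuse; no assumeSafeAppend involved
            buf.write("response body ...");
            auto response = buf.toBytes();  // view of the written bytes
            // ... publish/copy the response before the next clear() ...
        }
    }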

Of course, Sociomantic never had @safe to play with.  In practice I don't recall there ever being an issue with unintentional stomping (I'm not saying it never happened, but I have no recollection of it being a common issue).  That did however rest on a program structure that made it less likely anyone would make such a mistake.

About stepping up with a feature contribution: the idea is lovely but I'm very aware of how limited my time is right now, so I don't want to make offers I can't guarantee to follow up on.  There's a reason I post so rarely in the forums these days!  But I will ping Mathias L. to let him know, as the idea was his to start with.
April 26, 2020
On Saturday, 25 April 2020 at 10:34:44 UTC, Joseph Rushton Wakeling wrote:
> In any case, I seriously doubt those kinds of optimization have anything to do with the web framework performance differences.
>
> My experience of writing number-crunching stuff in D and Rust is that Rust seems to have a small but consistent performance edge that could quite possibly be down to the kind of optimizations that Arine mentions (that's speculation: I haven't verified).  However, it's small differences, not order-of-magnitude stuff.
>
> I suppose that in a more complicated app there could be some multiplicative impact, but where high-throughput web frameworks are concerned I'm pretty sure that the memory allocation and reuse strategy is going to be what makes 99% of the difference.
>
> There may also be a bit of an impact from the choice of futures vs. fibers for managing asynchronous tasks (there's a context switching cost for fibers), but I would expect that to only make a difference at the extreme upper end of performance, once other design factors have been addressed.
>
> BTW, on the memory allocation front, Mathias Lang has pointed out that there is quite a nasty impact from `assumeSafeAppend`.
>  Imagine that your request processing looks something like this:
>
>     // extract array instance from reusable pool,
>     // and set its length to zero so that you can
>     // write into it from the start
>     x = buffer_pool.get();
>     x.length = 0;
>     assumeSafeAppend(x);   // a cost each time you do this
>
>     // now append stuff into x to
>     // create your response
>
>     // now publish your response
>
>     // with the response published, clean
>     // up by recycling the buffer back into
>     // the pool
>     buffer_pool.recycle(x);
>
> This is the kind of pattern that Sociomantic used a lot.  In D1 it was easy because there was no array stomping prevention -- you could just set `length = 0` and start appending.  But having to call `assumeSafeAppend` each time does carry a performance cost.
>
> IIRC Mathias has suggested that it should be possible to tag arrays as intended for this kind of re-use, so that stomping prevention will never trigger, and you don't have to `assumeSafeAppend` each time you reduce the length.

I understand that it was an annoying breaking change, but aside from the difficulty of migrating I don't understand why a custom type isn't the appropriate solution for this problem. I think I heard "We want to use the built-in slices", but I never understood the technical argument behind that, or how it stacked up against not getting the desired behaviour.

My sense was that the irritation at the breakage was influencing the technical debate.
April 26, 2020
On Sunday, 26 April 2020 at 11:40:49 UTC, John Colvin wrote:
>
> I understand that it was an annoying breaking change, but aside from the difficulty of migrating I don't understand why a custom type isn't the appropriate solution for this problem. I think I heard "We want to use the built-in slices", but I never understood the technical argument behind that, or how it stacked up against not getting the desired behaviour.
>
Can you imagine replacing every usage of slices with a custom type in your code?
And making sure programmers joining the company do the same?
And having converters that e.g. accept ubyte arrays from libraries and convert them into your type?
April 26, 2020
On Sunday, 26 April 2020 at 11:59:27 UTC, Stefan Koch wrote:
> On Sunday, 26 April 2020 at 11:40:49 UTC, John Colvin wrote:
>>
>> I understand that it was an annoying breaking change, but aside from the difficulty of migrating I don't understand why a custom type isn't the appropriate solution for this problem. I think I heard "We want to use the built-in slices", but I never understood the technical argument behind that, or how it stacked up against not getting the desired behaviour.
>>
> Can you imagine replacing every usage of slices with a custom type in your code?
> And making sure programmers joining the company do the same?
> And having converters that e.g. accept ubyte arrays from libraries and convert them into your type?

I suppose nowadays that custom type can use a scoped ubyte slice to expose its temp buffer.
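
Roughly like this, perhaps (a sketch with a hypothetical `TempBuffer` type; with `-dip1000` in force, a callee that takes the contents as a `scope` parameter can't escape them, so the buffer can be recycled afterwards without leaving dangling references):

    struct TempBuffer
    {
        private ubyte[] storage;    // backing memory, reused across requests
        private size_t used;

        void put(scope const(ubyte)[] bytes)
        {
            const end = used + bytes.length;
            if (end > storage.length)
                storage.length = end * 2;   // grows but never shrinks: no stomping
            storage[used .. end] = bytes[];
            used = end;
        }

        void reset() { used = 0; }          // recycle without assumeSafeAppend

        // expose the contents only for the duration of the call
        void withData(scope void delegate(scope ubyte[]) sink)
        {
            sink(storage[0 .. used]);
        }
    }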
April 26, 2020
On Sunday, 26 April 2020 at 11:40:49 UTC, John Colvin wrote:
> I understand that it was an annoying breaking change, but aside from the difficulty of migrating I don't understand why a custom type isn't the appropriate solution for this problem. I think I heard "We want to use the built-in slices", but I never understood the technical argument behind that, or how it stacked up against not getting the desired behaviour.
>
> My sense was that the irritation at the breakage was influencing the technical debate.

That's not entirely unfair, but I think it does help to appreciate the magnitude of the problem:

  * there's a very large codebase, including many different applications and
    a large amount of common library code, all containing a lot of functions
    that expect slice input (because the concept of a range never existed in
    D1, and slices were the only use case)

  * most of the library functionality shouldn't have to care whether its input
    is a reusable buffer or any other kind of slice

  * you can't rewrite to use range-based generics because that's D2 only and
    you need to keep D1 compatibility until the last application has migrated

  * there are _very_ extreme performance and reliability constraints on some
    of the key applications, meaning that validating D2 transition efforts is
    very time consuming

  * you can't use any Phobos functionality until the codebase is D2 only, and
    even then you probably want to limit how much of it you use because it is
    not written with these extreme performance concerns in mind

  * all the time spent on those transitional efforts is time taken away from
    feature development

It's very easy to look back and say something like, "Well, if you'd written with introspection-based design from the start, you would have had a much easier migration effort", but that in itself would have been trickier to do in D1, and would have carried extra maintenance and development costs (particularly w.r.t. forcing devs to write what would have seemed like very boilerplate-y code compared to the actual set of use cases).
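
To make the boilerplate point concrete: instead of a plain `void process(ubyte[] data)`, an introspection-based D2 signature might have looked something like this (illustrative only):

    // accepts plain slices and buffer-like types alike, at the cost of a
    // template constraint on what would otherwise be a simple function
    void process(Buf)(ref Buf buf)
        if (is(typeof(buf[]) : const(ubyte)[]))
    {
        auto data = buf[];   // works for ubyte[] and for types with opSlice
        // ... operate on data ...
    }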

Even with the D1 compatibility requirement dropped, there still remains a big burden to transition all the reusable buffers to a different type.  IIRC the focus would probably have been on using `Appender`.
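
For reference, a minimal sketch of that `Appender`-based pattern (assuming Phobos's `std.array.Appender`, whose `clear` keeps the allocation around for reuse):

    import std.array : appender;

    void main()
    {
        auto buf = appender!(char[])();
        buf.reserve(4096);              // allocate once up front

        foreach (request; 0 .. 3)
        {
            buf.clear();                // rewind; memory is retained for reuse
            buf ~= "HTTP/1.1 200 OK\r\n\r\n";
            buf ~= "...body...";
            auto response = buf[];      // slice of the accumulated data
            // ... send the response before the next clear() ...
        }
    }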

Note that many of these concerns still apply if we want to preserve a future for any of the (very well crafted) library and application code that Sociomantic open-sourced.  They are all now D2-only, but the effort required to rewrite around dedicated reusable-buffer types would still be quite substantial.
April 26, 2020
On Friday, 24 April 2020 at 13:44:18 UTC, serge wrote:
>
> Could you please elaborate on that? what are you referring to as backend?

I was referring to the compiler backend: LLVM vs GCC vs Intel, i.e. the part that converts code to instructions after the original language is out of sight.

> To me the techempower stats are a pretty good indicator - they show JSON processing, single/multi-query requests, database access, and static content. Overall performance across those stats gives a pretty good idea of how the language and web framework are built, and of their ecosystem.
> For example, if a language is fast on basic operations but two frameworks show less than adequate performance, then obviously something is wrong with the whole ecosystem - it could be difficult for the average developer to create fast and efficient apps. For example Scala - a powerful yet very complicated language with tons of problems. Most Scala projects failed. It is very difficult and slow for the average developer to create efficient applications. It kind of requires a rocket scientist to write good code in Scala.  Does D exhibit the same problem?

Very fair reasoning.

I don't think D has as many problems as Scala; D has a very gentle learning curve and it's not difficult to be productive in. But I'd say most of D's problems are indeed ecosystem-related, possibly because of the kind of personalities that D attracts: the reluctance of D programmers to gather around the same piece of code makes the ecosystem more insular than it needs to be, as is typical with native programming. D code today has a tendency to balkanize based on various requirements such as exceptions or not, runtime or not, @safe or not, -betterC or not... It seems to me languages where DIY is frowned upon (Java) or discouraged by the practice of FFI have better library ecosystems, for better or worse.
April 26, 2020
On 4/25/20 6:34 AM, Joseph Rushton Wakeling wrote:
> On Saturday, 25 April 2020 at 10:15:33 UTC, Walter Bright wrote:
>> On 4/24/2020 12:27 PM, Arine wrote:
>>> There most definitely is a difference and the assembly generated with Rust is better.
>> D's @live functions can indeed do such optimizations, though I haven't got around to implementing them in DMD's optimizer. There's nothing particularly difficult about it.
> 
> In any case, I seriously doubt those kinds of optimization have anything to do with the web framework performance differences.
> 
> My experience of writing number-crunching stuff in D and Rust is that Rust seems to have a small but consistent performance edge that could quite possibly be down to the kind of optimizations that Arine mentions (that's speculation: I haven't verified). However, it's small differences, not order-of-magnitude stuff.
> 
> I suppose that in a more complicated app there could be some multiplicative impact, but where high-throughput web frameworks are concerned I'm pretty sure that the memory allocation and reuse strategy is going to be what makes 99% of the difference.
> 
> There may also be a bit of an impact from the choice of futures vs. fibers for managing asynchronous tasks (there's a context switching cost for fibers), but I would expect that to only make a difference at the extreme upper end of performance, once other design factors have been addressed.
> 
> BTW, on the memory allocation front, Mathias Lang has pointed out that there is quite a nasty impact from `assumeSafeAppend`. Imagine that your request processing looks something like this:
> 
>      // extract array instance from reusable pool,
>      // and set its length to zero so that you can
>      // write into it from the start
>      x = buffer_pool.get();
>      x.length = 0;
>      assumeSafeAppend(x);   // a cost each time you do this
> 
>      // now append stuff into x to
>      // create your response
> 
>      // now publish your response
> 
>      // with the response published, clean
>      // up by recycling the buffer back into
>      // the pool
>      buffer_pool.recycle(x);
> 
> This is the kind of pattern that Sociomantic used a lot.  In D1 it was easy because there was no array stomping prevention -- you could just set `length = 0` and start appending.  But having to call `assumeSafeAppend` each time does carry a performance cost.

In terms of performance, depending on the task at hand, D1 code is slower than D2 appending, because there's a thread-local cache for appending in D2, while D1 only has a global one-array cache for the same purpose. However, I'm assuming that since you were focused on D1, your usage was naturally written to take advantage of what D1 has to offer.

The assumeSafeAppend call also uses this cache, and so it should be quite fast. But setting length to 0 is a ton faster, because you aren't calling an opaque function.

So depending on the usage pattern, D2 with assumeSafeAppend can be faster, or it could be slower.
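
As a rough illustration of that trade-off, a micro-benchmark sketch (hypothetical; the relative numbers depend heavily on the usage pattern, as noted above):

    import std.datetime.stopwatch : benchmark;
    import std.stdio : writeln;

    void main()
    {
        int[] buf;
        buf.length = 1024;

        auto results = benchmark!(
            // stomping prevention triggers: each append reallocates
            () { buf.length = 0; buf ~= 1; },
            // opaque call per iteration, but the append reuses the block
            () { buf.length = 0; assumeSafeAppend(buf); buf ~= 1; }
        )(100_000);

        writeln(results);   // compare the two timings
    }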

> 
> IIRC Mathias has suggested that it should be possible to tag arrays as intended for this kind of re-use, so that stomping prevention will never trigger, and you don't have to `assumeSafeAppend` each time you reduce the length.

I spoke for a while with Dicebot at DConf 2016 or 17 about this issue. IIRC, I suggested either using a custom type or a custom runtime. He was not interested in either of these ideas, which makes sense (large existing codebase, no desire to stray from mainline D).

By far, the best mechanism to use is a custom type. Not only will that fix this problem as you can implement whatever behavior you want, but you also do not need to call opaque functions for appending either. It should outperform everything you could do in a generic runtime.

Note that this was before (I think) destructor calls were added. Destructor calls are something that assumeSafeAppend is going to do, and they won't happen with just setting length to 0.

However, there are other options. We could introduce a druntime configuration option so that when this specific situation happens (the slice points at the start of the block and has zero length), assumeSafeAppend is called automatically on the first append. Jonathan is right that this is not @safe, but it could be an opt-in configuration option.

I don't think configuring specific arrays makes a lot of sense, as this would require yet another optional bit that would have to be checked and allocated for all arrays.

-Steve
April 26, 2020
On Fri, Apr 24, 2020 at 3:46 PM serge via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
> To me the techempower stats are a pretty good indicator - they show JSON processing, single/multi-query requests, database access, and static content. Overall performance across those stats gives a pretty good idea of how the language and web framework are built, and of their ecosystem.

Unfortunately there is a big issue with techempower. Because it is so popular, almost every framework (language) tries to get the best score in it, and in many cases this means using hacks or tricks to achieve that. So in general the techempower results are useless. From my own experience, D performance is really good in real-world scenarios.

The other issue with the techempower benchmark is that there is almost zero complexity: all the tests do some basic operations on really small datasets.
April 26, 2020
On Sunday, 26 April 2020 at 12:37:48 UTC, Guillaume Piolat wrote:
> But I'd say most of D's problems are indeed ecosystem-related, possibly because of the kind of personalities that D attracts: the reluctance of D programmers to gather around the same piece of code makes the ecosystem more insular than it needs to be, as is typical with native programming. D code today has a tendency to balkanize based on various requirements such as exceptions or not, runtime or not, @safe or not, -betterC or not... It seems to me languages where DIY is frowned upon (Java) or discouraged by the practice of FFI have better library ecosystems, for better or worse.

These are connected. Languages like Java don't give you options: you will use the GC, you will use OOP. Imagine an XML library. Any Java XML DOM library will offer an XMLDocument object with a load method (or constructor). This is expected and more or less the same in every library.

D doesn't force the paradigm on you. Some people will want to use the GC, some won't, some will want to use OOP, some will avoid it like fire. It's a tradeoff, for higher flexibility and power you trade some composability.