April 26, 2020
On Wednesday, 22 April 2020 at 22:34:32 UTC, Arine wrote:
> Not quite. Rust will generate better assembly as it can guarantee that use of an object is unique. Similar to C's "restrict" keyword but you get it for "free" across the entire application.

Cool, I did not know that. I know that different languages have different semantics, and code that looks the same might produce different results; that's why I used the word "equivalent" instead of "same". You can achieve the same goal in D as in Rust, but the code would be different.
April 26, 2020
On Sunday, 26 April 2020 at 16:59:44 UTC, Daniel Kozak wrote:
> Unfortunately there is a big issue with TechEmpower. Because it is so
> popular, almost every framework [language] tries to have the best score
> in it. And in many cases this means they use some hacks or tricks to
> achieve that. So in general TechEmpower results are useless. From my
> own experience, D performance is really good in real-world scenarios.
> The other issue with the TechEmpower benchmark is that there is almost
> zero complexity. All tests do some basic operations on really small
> datasets.

It's nice to have a moral victory and claim to be above "those cheaters", but links to these benchmarks are shared in many places. If someone wants to see how fast D is, they will type "programming language benchmark" into their web search of choice, and TechEmpower will be high in the results list. They'll click and go "oh wow, even PHP is faster than that D stuff".

Whether it's cheating or not, perception matters, and people will base their decisions on such benchmarks, even if that's unreasonable and doesn't apply to real-world scenarios.
April 26, 2020
On Saturday, 25 April 2020 at 22:15:44 UTC, Walter Bright wrote:
> On 4/25/2020 3:34 AM, Joseph Rushton Wakeling wrote:
>> In any case, I seriously doubt those kinds of optimization have anything to do with the web framework performance differences.
>
> I agree. I also generally structure my code so that optimization wouldn't make a difference. But it's still a worthwhile benefit to add it for @live functions.
I have heard that proving that two pointers do not alias is a big problem in compiler backends, and that some or most auto-vectorization optimizations do not fire because the compiler can't prove the absence of aliasing.

A new language used in the Unity game engine is designed so that references do not alias by default, for optimization reasons. I haven't looked into this topic further, but I believe it's worth checking out. Data science people would benefit greatly from auto-vectorization.
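
To illustrate (a hypothetical sketch, not from the original post): in the loop below the compiler cannot prove that `dst` and `src` refer to disjoint memory, so it must either emit a runtime overlap check or fall back to scalar code.

void scale(double[] dst, const(double)[] src, double k)
{
    // Each store to dst[i] could, for all the compiler knows,
    // modify an element of src read in a later iteration, so
    // straightforward vectorization of this loop is blocked.
    foreach (i; 0 .. dst.length)
        dst[i] = src[i] * k;
}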
April 26, 2020
On Sun, Apr 26, 2020 at 9:35 PM JN via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
> It's nice to have a moral victory and claim to be above "those cheaters", but links to these benchmarks are shared in many places. If someone wants to see how fast D is, they will type "programming language benchmark" into their web search of choice, and TechEmpower will be high in the results list. They'll click and go "oh wow, even PHP is faster than that D stuff".
>
> Whether it's cheating or not, perception matters, and people will base their decisions on such benchmarks, even if that's unreasonable and doesn't apply to real-world scenarios.

Yes, I agree. That is the reason why I improve those benchmarks from time to time: to make D faster than PHP :D
April 26, 2020
On Sunday, 26 April 2020 at 16:59:44 UTC, Daniel Kozak wrote:
> On Fri, Apr 24, 2020 at 3:46 PM serge via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>>
>> To me the TechEmpower stats are a pretty good indicator - they show JSON processing, single/multi-query requests, database access, static content. Overall performance across those stats gives a pretty good idea of how a language and web framework are built, and of their ecosystem.
>
> Unfortunately there is a big issue with TechEmpower. Because it is so
> popular, almost every framework [language] tries to have the best score
> in it. And in many cases this means they use some hacks or tricks to
> achieve that. So in general TechEmpower results are useless.

As somebody who implemented the Swoole+PHP and Crystal code at TechEmpower, I can state that this statement is factually wrong.

The code is the very idiomatic code that anybody writes: basic database calls, a connection pool, prepared statements, the standard HTTP module or framework. There is no magic in the code that tries to do direct system calls, uses stripped-down drivers, or does any other stuff that people normally would not use.

Where you can see some funny business is in the top 10 or 20, where Rust and co. have some extremely optimized code that is not how most people will write it. But those are the extreme cases, which anybody with half a brain ignores, because that is not how you write normal code. I always say: look at the code to see if the results are normal or over-optimized/unrealistic crap.

If we compare normal implementations ( https://www.techempower.com/benchmarks/#section=test&runid=c7152e8f-5b33-4ae7-9e89-630af44bc8de&hw=ph&test=plaintext ) like

Fortunes:

vibed-ldc-pgsql: 58k
Crystal: 206k
PHP+Swoole: 289k

D's results are simply abysmal. We are talking basic idiomatic code here. This tells me that D has an issue with its DB handling in those tests.

We need to look at stuff like "hello world" (plaintext) and JSON, where the performance difference drops down to 2x.

The plaintext test is literally: take a string and output it. We are talking route + echo in PHP or any other language. Or basic encoding of a JSON object and outputting it. A few lines of code, that is it. Yet D still suffers a 2x gap in those tests. Does that not tell you that D or vibe.d suffers from an actual performance issue? A point that clearly needs to be looked into.
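
For reference, the plaintext round trip in vibe.d amounts to roughly this (a minimal sketch, not the actual benchmark code):

import vibe.vibe;

void handle(HTTPServerRequest req, HTTPServerResponse res)
{
    // The whole "plaintext" test: write a fixed string back.
    res.writeBody("Hello, World!", "text/plain");
}

void main()
{
    auto settings = new HTTPServerSettings;
    settings.port = 8080;
    listenHTTP(settings, &handle);
    runApplication();
}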

If the argument is that D is not properly optimized, then what are PHP+Workerman/Swoole/... and Crystal?


> The other issue with the TechEmpower benchmark is that there is almost
> zero complexity. All tests do some basic operations on really small
> datasets.

Fortunes shows a more realistic real-world web scenario. The rest mostly show weaknesses in each language's or framework's specific section. If your JSON score is low, there is a problem with your JSON library or with the way your framework handles the requests. If your plaintext results are low... you get the drill.

If you simply try to scoff at an issue by stating "in the real world we are faster" while you have benchmarks like this online... The nice thing about TechEmpower is that it really shows whether your language is fast for basic web tasks or not. It does not give a darn that your language can run "real world" code fast if there are underlying issues. For people who are interested in D for website hosting, it is simply slower than the competitors.

Do not like it? Then see where the issues are and fix them, be it in the TechEmpower code, in D, or in vibe.d. But clearly there is an issue if D cannot compete with implementations in other languages (again, talking normal implementations, stuff that anybody will use).

If given the choice, what will people pick? D, which simply ignores the web market, or other languages/frameworks where the speed out of the box is great?

It's funny seeing comments like this, where a simple question by the OP turns into a long and totally useless technical discussion, followed by some comment that comes down to "ignore it because everybody cheats". And people here wonder why D has issues with popularity. Really! Get out much?

From my point of view, the comment is insulting and tantamount to calling people like me, who implemented a few of the other languages, "cheaters", when it's literally basic code that is used everywhere (trust me, I am not some magic programmer who knows C++ like the back of his hand; I barely scrape by on my own with PHP and Ruby).

If the issue is on D's end, be it D, vibe.d, or the code used, then fix it, but do not insult everybody else (especially the people who wrote normal code).

As the saying goes: "always clean your own house first, before criticizing your neighbor's house".
April 26, 2020
On 4/26/20 10:19 AM, Walter Bright wrote:
> On 4/26/2020 12:45 AM, Timon Gehr wrote:
>> On 26.04.20 04:22, Walter Bright wrote:
>>> ref a and ref b cannot refer to the same memory object.
>> Actually they can, even in @safe @live code.
> 
> Bug reports are welcome. Please tag them with the 'live' keyword in bugzilla.

I can't do that because you did not agree it was a bug. According to your DIP and past discussions, the following is *intended* behavior:

int bar(ref int x, ref int y) @safe @live {
    x = 0;
    y = 1;
    return x;
}

void main() @safe {
    int x;
    import std.stdio;
    writeln(bar(x, x)); // 1
}

I have always criticized this design, but so far you have stuck to it. I have stated many times that the main reason it is bad is that you don't actually enforce any new invariant, so @live does not enable any new patterns, at least in @safe code.

In particular, if you start optimizing based on non-enforced and undocumented @live assumptions, @safe @live code will not be memory safe.

You can't optimize based on @live and preserve memory safety. This is because @live is tied to functions instead of types (given that you want to preserve interoperability). @live in its current form is useless, except perhaps as a linting tool.
April 26, 2020
On Sunday, 26 April 2020 at 16:20:19 UTC, Steven Schveighoffer wrote:
> In terms of performance, depending on the task at hand, D1 code is slower than D2 at appending, because there's a thread-local cache for appending in D2, while D1 only has a global one-array cache for the same. However, I'm assuming that since you were focused on D1, your usage was naturally written to take advantage of what D1 has to offer.
>
> The assumeSafeAppend call also uses this cache, and so it should be quite fast. But setting length to 0 is a ton faster, because you aren't calling an opaque function.
>
> So depending on the usage pattern, D2 with assumeSafeAppend can be faster, or it could be slower.

That makes sense.  I just know that Mathias L. seemed to be quite concerned about the `assumeSafeAppend` performance impact.  I think he was not looking for a D1/D2 comparison, but at getting the most performant behaviour possible in future.

It's not that it was slower than D1, it's that it was a per-use speed hit.

> I spoke for a while with Dicebot at Dconf 2016 or 17 about this issue. IIRC, I suggested either using a custom type or custom runtime. He was not interested in either of these ideas, and it makes sense (large existing code base, didn't want to stray from mainline D).

Yes.  To be fair, I think in that context, at that stage of the transition, that probably made more sense: it was easier to just mandate that everybody start putting `assumeSafeAppend` into their code (actually we implemented a transitional wrapper, `enableStomping`, which was a no-op in D1 and called `assumeSafeAppend` in D2; see the sketch below).
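
For readers who don't know the idiom, such a wrapper essentially boils down to this in D2 (a sketch; the actual `enableStomping` was Sociomantic's internal helper):

void resetBuffer(ref char[] buf)
{
    buf.length = 0;         // drop the contents but keep the allocation
    buf.assumeSafeAppend(); // tell the runtime stomping is fine, so the
                            // next append reuses the block in place
}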

> By far, the best mechanism to use is a custom type. Not only will that fix this problem as you can implement whatever behavior you want, but you also do not need to call opaque functions for appending either. It should outperform everything you could do in a generic runtime.
>
> Note that this was before (I think) destructor calls were added. The destructor calls are something that assumeSafeAppend is going to do, and won't be done with just setting length to 0.
>
> However, there are other options. We could introduce a druntime configuration option so when this specific situation happens (slice points at start of block and has 0 length), assumeSafeAppend is called automatically on the first append. Jonathan is right that this is not @safe, but it could be an opt-in configuration option.
>
> I don't think configuring specific arrays makes a lot of sense, as this would require yet another optional bit that would have to be checked and allocated for all arrays.

The druntime option does sound interesting, although I'm leery about the idea of creating two different language behaviours.
April 27, 2020
On Sunday, 26 April 2020 at 16:20:19 UTC, Steven Schveighoffer wrote:
>
> In terms of performance, depending on the task at hand, D1 code is slower than D2 at appending, because there's a thread-local cache for appending in D2, while D1 only has a global one-array cache for the same. However, I'm assuming that since you were focused on D1, your usage was naturally written to take advantage of what D1 has to offer.
>
> The assumeSafeAppend call also uses this cache, and so it should be quite fast. But setting length to 0 is a ton faster, because you aren't calling an opaque function.
>
> So depending on the usage pattern, D2 with assumeSafeAppend can be faster, or it could be slower.

Well, Sociomantic didn't use any kind of multi-threading in "user code". We had single-threaded fibers for concurrency, and process-level scaling for parallelism. Some corner cases used threads, but that was for low-level things (e.g. low-latency file IO on Linux), which were highly scrutinized and stayed clear of the GC, AFAIK.

Note that accessing TLS *does* have a cost, which is higher than accessing a global. By this reasoning, I would assume that D2 appending would definitely be slower, although I never profiled it. What I did profile, though, is `assumeSafeAppend`. The fact that it looks up GC metadata (taking the GC lock in the process) made it quite expensive given how often it was called (in D1 it was simply a no-op, and was called defensively).

>> IIRC Mathias has suggested that it should be possible to tag arrays as intended for this kind of re-use, so that stomping prevention will never trigger, and you don't have to `assumeSafeAppend` each time you reduce the length.
>
> I spoke for a while with Dicebot at Dconf 2016 or 17 about this issue. IIRC, I suggested either using a custom type or custom runtime. He was not interested in either of these ideas, and it makes sense (large existing code base, didn't want to stray from mainline D).
>
> By far, the best mechanism to use is a custom type. Not only will that fix this problem as you can implement whatever behavior you want, but you also do not need to call opaque functions for appending either. It should outperform everything you could do in a generic runtime.

Well... Here's something I never really quite understood, actually: Mihails *did* introduce a buffer type. See https://github.com/sociomantic-tsunami/ocean/blob/36c9fda09544ee5a0695a74186b06b32feda82d4/src/ocean/core/Buffer.d#L116-L130
And we also had a (very old) similar utility here: https://github.com/sociomantic-tsunami/ocean/blob/36c9fda09544ee5a0695a74186b06b32feda82d4/src/ocean/util/container/ConcatBuffer.d
I always wanted to unify the two, but never got to it. But if you look at the first link, it calls `assumeSafeAppend` twice, before and after setting the length. In practice it is only necessary *after* reducing the length; as I mentioned, this is defensive programming (see the sketch below).
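
In other words, the defensive pattern looks like this (a sketch of the idiom, not the ocean code itself):

void shrink(ref ubyte[] buf, size_t newLength)
{
    buf.assumeSafeAppend(); // defensive call: redundant here
    buf.length = newLength;
    buf.assumeSafeAppend(); // the only call actually required, since
                            // shrinking invalidated the append capacity
}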

For reference, most of our applications made principled use of buffers. A buffer would rarely be appended to from more than one, perhaps two, places. However, slices of the buffer would be passed around quite liberally. So a buffer type from which one could borrow would indeed have been optimal.

> Note that this was before (I think) destructor calls were added. The destructor calls are something that assumeSafeAppend is going to do, and won't be done with just setting length to 0.
>
> However, there are other options. We could introduce a druntime configuration option so when this specific situation happens (slice points at start of block and has 0 length), assumeSafeAppend is called automatically on the first append. Jonathan is right that this is not @safe, but it could be an opt-in configuration option.
>
> I don't think configuring specific arrays makes a lot of sense, as this would require yet another optional bit that would have to be checked and allocated for all arrays.
>
> -Steve

I don't even know if we had a single case where we had arrays of objects with destructors. The vast majority of our buffers were `char[]` and `ubyte[]`. We had some elaborate types, but I think destructors + buffers would have been frowned upon in code review.

Also, the reason we didn't modify druntime to just restore the D1 behavior (that would have been a trivial change) was how dependent on the new behavior druntime itself had become. It was also the motivation for the suggestion Joe mentioned. AFAIR I mentioned it in an internal issue and did a PoC implementation, but never got it to a state where it was mergeable.

Also, while a custom type might sound better, it doesn't really interact well with the rest of the runtime, and it's an extra word to pass around (if passed by value).
April 26, 2020
On 4/26/2020 2:52 PM, Timon Gehr wrote:
> I can't do that because you did not agree it was a bug. According to your DIP and past discussions, the following is *intended* behavior:
> 
> int bar(ref int x, ref int y) @safe @live {
>     x = 0;
>     y = 1;
>     return x;
> }
> 
> void main() @safe {
>     int x;
>     import std.stdio;
>     writeln(bar(x, x)); // 1
> }
> 
> I have always criticized this design, but so far you have stuck to it. I have stated many times that the main reason it is bad is that you don't actually enforce any new invariant, so @live does not enable any new patterns, at least in @safe code.
> 
> In particular, if you start optimizing based on non-enforced and undocumented @live assumptions, @safe @live code will not be memory safe.
> 
> You can't optimize based on @live and preserve memory safety. This is because @live is tied to functions instead of types (given that you want to preserve interoperability). @live in its current form is useless, except perhaps as a linting tool.

@live's invariants rely on the arguments passed to it conforming to its requirements. It's analogous to @safe code relying on its arguments conforming.

To get the checking here, main would have to be declared @live, too.
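
A sketch of what that would look like (hedged: whether the current checker rejects this exact case is the claim above, not verified here):

void main() @safe @live {
    int x;
    import std.stdio;
    writeln(bar(x, x)); // with @live on the caller, passing x by
                        // mutable ref twice is meant to be rejected
}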
April 26, 2020
On 4/26/2020 4:30 AM, Joseph Rushton Wakeling wrote:
> That said, what does OutBuffer do that means that it _is_ safe in this context?

It manages its own memory privately, and presents the results as dynamic arrays which do their own bounds checking. It's been a reliable solution for me for maybe 30 years.
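
For reference, Phobos's `std.outbuffer.OutBuffer` follows that pattern; a minimal usage sketch:

import std.outbuffer : OutBuffer;

void demo()
{
    auto buf = new OutBuffer();
    buf.write("hello, ");         // appends into privately managed storage
    buf.write(42);                // integer overloads exist as well
    ubyte[] data = buf.toBytes(); // bounds-checked dynamic-array view
}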