received 100000000 messages in 2906 msec sum=4999999950000000 speed=34411 msg/msec
so, it's ~2.7x faster than Java:
https://github.com/mingwugmail/liblfdsd/tree/master/comparison
And your https://github.com/nin-jin/go.d on my machine:
go.d is 2~4 times slower than Go.
9.7638ms (Go) vs. [19 ms ~ 40 ms] (go.d)
Hello everyone, I've done a little refactoring and optimization of jin.go:
- I got rid of the vibe.d dependency because it is slow and big, and I haven't been able to get along with version 2. With only 1000 vibe fibers running, not only did the application crash, but once even the graphics driver crashed, and I had to restart the laptop.
- For now, I've settled on native threads with a small stack size (4 KB).
- I'm really looking forward to photon stabilizing so that I can bring fiber support back. It would be awesome to see it in the standard library.
- I had to abandon move semantics because I couldn't get them to work with delegates. Currently, the number of references to the queue is tracked by the copy constructor.
Good news! After all the optimizations, the channels show impressive speed in the benchmark above, which pumps messages between two threads.
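To illustrate the last point, here is a minimal sketch of counting references with a copy constructor and a destructor. It is not jin.go's actual code: `QueueState`, `QueueHandle`, and `refsSeenByCallee` are hypothetical names, and the counter is a plain `int`, so the sketch is only valid within a single thread. Handles shared across threads would need atomic operations instead.

```d
import std.stdio;

/// Hypothetical shared state behind a channel; not jin.go's actual type.
final class QueueState
{
    int refs;    // number of live handles
    bool closed; // set once the last handle is destroyed
}

/// Hypothetical handle that counts its own copies.
struct QueueHandle
{
    private QueueState state;

    this(QueueState state)
    {
        this.state = state;
        state.refs++;
    }

    // Copy constructor: each copy registers one more reference.
    this(ref return scope QueueHandle other)
    {
        state = other.state;
        if (state !is null)
            state.refs++;
    }

    // Destructor: drop one reference, close the queue on the last one.
    ~this()
    {
        if (state is null)
            return;
        if (--state.refs == 0)
            state.closed = true;
    }
}

/// Takes the handle by value, so the call itself makes a counted copy.
int refsSeenByCallee(QueueHandle h)
{
    return h.state.refs;
}

void main()
{
    auto state = new QueueState;
    {
        auto a = QueueHandle(state);
        auto b = a; // copy constructor runs here
        writeln("refs with two handles: ", state.refs);
        writeln("refs inside by-value call: ", refsSeenByCallee(a));
    }
    writeln("refs after scope exit: ", state.refs, ", closed: ", state.closed);
}
```

The trade-off compared with move semantics is that every by-value pass, including capturing a handle in a delegate, costs a counter update, but no call site has to give up ownership explicitly.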
```d
import std.datetime.stopwatch;
import std.range;
import std.stdio;
import jin.go;

const long n = 100_000_000;

// Producer side: yields the numbers 0 .. n - 1.
auto threadProducer() {
    return n.iota;
}

void main() {
    auto queue = go!threadProducer;

    StopWatch sw;
    sw.start();

    // Consumer side: sum everything received from the queue.
    long sum = 0;
    foreach (p; queue) {
        sum += p;
    }

    sw.stop();

    writefln("received %d messages in %d msec sum=%d speed=%d msg/msec", n,
            sw.peek.total!"msecs", sum, n / sw.peek.total!"msecs");

    assert(sum == (n * (n - 1) / 2));
}
```
```sh
received 100000000 messages in 718 msec sum=4999999950000000 speed=139275 msg/msec
```
I've almost caught up with Go in my goroutines benchmark:
```sh
> go run app.go --release
Workers Result      Time
8       49995000000 109.7644ms
> dub --quiet --build=release
Workers Result      Time
0       49995000000 124 ms
```
Bad news. Sometimes I get incorrect results and I can't figure out why.
```sh
> dub --quiet --build=release
Workers Result      Time
0       49945005000 176 ms
```
I use atomic acquire and release operations. x86 does not need extra barrier instructions for them, but I hope the compiler takes them into account and does not reorder instructions. Even with stricter memory barriers, though, the results are still not stable. If someone can tell me what could be wrong here, I would be very grateful.
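For anyone who wants to reason about the ordering question, here is a minimal sketch of how acquire and release are usually paired in a single-producer, single-consumer ring buffer, using `core.atomic`. It is not the jin.go implementation and does not diagnose the bug above: `SpscRing`, `ring`, and `pumpSum` are hypothetical names, and the sketch assumes exactly one producer thread and one consumer thread.

```d
import core.atomic;
import core.thread;
import std.stdio;

/// Hypothetical fixed-size ring buffer for one producer and one consumer.
struct SpscRing(T, size_t capacity)
{
    private T[capacity] items;
    private shared size_t head; // next slot to read; written only by the consumer
    private shared size_t tail; // next slot to write; written only by the producer

    bool tryPut(T value)
    {
        immutable t = atomicLoad!(MemoryOrder.raw)(tail);
        // Acquire: the consumer's read of the slot happens before we reuse it.
        immutable h = atomicLoad!(MemoryOrder.acq)(head);
        if (t - h == capacity)
            return false; // full
        items[t % capacity] = value;
        // Release: publish the slot only after the value has been written.
        atomicStore!(MemoryOrder.rel)(tail, t + 1);
        return true;
    }

    bool tryTake(out T value)
    {
        immutable h = atomicLoad!(MemoryOrder.raw)(head);
        // Acquire: pairs with the producer's release store to tail.
        immutable t = atomicLoad!(MemoryOrder.acq)(tail);
        if (h == t)
            return false; // empty
        value = items[h % capacity];
        // Release: hand the slot back only after the value has been read.
        atomicStore!(MemoryOrder.rel)(head, h + 1);
        return true;
    }
}

// __gshared so that both threads see the same instance (D globals are thread-local by default).
__gshared SpscRing!(long, 1024) ring;

/// Pumps 0 .. n - 1 from a producer thread to the caller and returns the sum.
long pumpSum(long n)
{
    auto producer = new Thread({
        foreach (long i; 0 .. n)
            while (!ring.tryPut(i))
                Thread.yield();
    });
    producer.start();

    long sum = 0;
    foreach (_; 0 .. n)
    {
        long v;
        while (!ring.tryTake(v))
            Thread.yield();
        sum += v;
    }
    producer.join();
    return sum;
}

void main()
{
    enum long n = 1_000_000;
    writeln(pumpSum(n) == n * (n - 1) / 2 ? "sum matches" : "sum mismatch");
}
```

The point of the sketch is that each index has a single writer, and every data access to a slot sits between an acquire load of the other side's index and a release store of one's own. A lost batch of messages usually means one of those pairings is missing somewhere, or that more than one thread writes the same index.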