Slower than Python
March 01, 2013
I tried to port my SIP parser from Python to D to boost performance
but got the opposite result.
I created a simple test script which splits a SIP REGISTER message
10 million times. The Python version takes about 14 seconds to
execute; the D version takes about 23 seconds, which is 1.6 times slower.
I used DMD 2.062 and compiled my script with the options -release and
-O. I used Python 3.3 64-bit.
I ran both scripts on the same hardware with Windows 7 64-bit.
Is this a general problem with string performance in D or just
a splitter() issue?
Has anybody compared the performance of D's string manipulation
functions with other languages like Python, Perl, Java and C++?


Here is the Python version of the test script:

import time

message = "REGISTER sip:comm.example.com SIP/2.0\r\n\
Content-Length: 0\r\n\
Contact: <sip:12345@10.1.3.114:59788;transport=tls>;expires=4294967295;events=\"message-summary\";q=0.9\r\n\
To: <sip:12345@comm.example.com>\r\n\
User-Agent: (\"VENDOR=MyCompany\" \"My User Agent\")\r\n\
Max-Forwards: 70\r\n\
CSeq: 1 REGISTER\r\n\
Via: SIP/2.0/TLS 10.1.3.114:59788;branch=z9hG4bK2910497772630690\r\n\
Call-ID: 2910497622026445\r\n\
From: <sip:12345@comm.example.com>;tag=2910497618150713\r\n\r\n"

t1 = time.perf_counter()
for i in range(10000000):
   for notused in message.split("\r\n"):
     pass
print(time.perf_counter()-t1)


Here is the D version:
import std.stdio,std.algorithm,std.datetime;

void main()
{
   auto message = "REGISTER sip:example.com SIP/2.0\r\n~
Content-Length: 0\r\n~
Contact: <sip:12345@10.1.3.114:59788;transport=tls>;expires=4294967295;events=\"message-summary\";q=0.9\r\n~
To: <sip:12345@comm.example.com>\r\n~
User-Agent: (\"VENDOR=MyCompany\" \"My User Agent\")\r\n~
Max-Forwards: 70\r\n~
CSeq: 1 REGISTER\r\n~
Via: SIP/2.0/TLS 10.1.3.114:59788;branch=z9hG4bK2910497772630690\r\n~
Call-ID: 2910497622026445\r\n~
From: <sip:12345@comm.example.com>;tag=2910497618150713\r\n\r\n";

   auto t1 = Clock.currTime();
   foreach(i; 0..10000000)
   {
     foreach(notused; splitter(message, "\r\n"))
     {
     }
   }
   writeln(Clock.currTime()-t1);
}
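
For a quicker check of the Python side of this benchmark, the sketch below uses the standard timeit module rather than a hand-written timing loop. It is only an illustration: the message is a shortened stand-in for the full SIP REGISTER text, the default iteration count is far below the 10 million used above, and the names MESSAGE, split_all and time_splits are made up for this example.

```python
import timeit

# Shortened stand-in for the SIP REGISTER message (assumption: not the
# full text used in the benchmark above).
MESSAGE = (
    "REGISTER sip:comm.example.com SIP/2.0\r\n"
    "Content-Length: 0\r\n"
    "Max-Forwards: 70\r\n"
    "CSeq: 1 REGISTER\r\n\r\n"
)


def split_all(message=MESSAGE):
    """Split once and walk the pieces, like the benchmark's inner loop."""
    count = 0
    for _ in message.split("\r\n"):
        count += 1
    return count


def time_splits(iterations=100_000):
    """Return the seconds taken by `iterations` calls of split_all."""
    return timeit.timeit(split_all, number=iterations)


print(time_splits())
```

Scaling the iteration count up to 10 million should give a figure comparable to the original loop, although timeit disables garbage collection during the run by default, so the numbers are not strictly identical in meaning.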

March 01, 2013
On 3/1/13 3:30 PM, cvk012c wrote:
> Tried to port my SIP parser from Python to D to boost performance
> but got opposite result.
> I created a simple test script which splits SIP REGISTER message
> 10 million times. Python version takes about 14 seconds to
> execute, D version is about 23 seconds which is 1.6 times slower.
> I used DMD 2.062 and compiled my script with options -release and
> -O. I used Python 3.3 64 bit.
> I ran both scripts on the same hardware with Windows 7 64.

Add -inline to the options.

Andrei
March 01, 2013
On Friday, 1 March 2013 at 20:50:15 UTC, Andrei Alexandrescu wrote:
> On 3/1/13 3:30 PM, cvk012c wrote:
>> Tried to port my SIP parser from Python to D to boost performance
>> but got opposite result.
>> I created a simple test script which splits SIP REGISTER message
>> 10 million times. Python version takes about 14 seconds to
>> execute, D version is about 23 seconds which is 1.6 times slower.
>> I used DMD 2.062 and compiled my script with options -release and
>> -O. I used Python 3.3 64 bit.
>> I ran both scripts on the same hardware with Windows 7 64.
>
> Add -inline to the options.
>
> Andrei

-noboundscheck can also help if you don't mind missing the safety net.

$ rdmd -O -release sip
22 secs, 977 ms, 299 μs, and 8 hnsecs
$ rdmd -O -release -inline sip
12 secs, 245 ms, 567 μs, and 9 hnsecs
$ rdmd -O -release -inline -noboundscheck sip
10 secs, 171 ms, 209 μs, and 9 hnsecs
March 01, 2013
On Friday, 1 March 2013 at 20:30:24 UTC, cvk012c wrote:
> Tried to port my SIP parser from Python to D to boost performance
> but got opposite result.
> I created a simple test script which splits SIP REGISTER message
> 10 million times. Python version takes about 14 seconds to
> execute, D version is about 23 seconds which is 1.6 times slower.
> I used DMD 2.062 and compiled my script with options -release and
> -O. I used Python 3.3 64 bit.
> I ran both scripts on the same hardware with Windows 7 64.
> Is this general problem with string performance in D or just
> splitter() issue?
> Did anybody compared performance of D string manipulating
> functions with other languages like Python,Perl,Java and C++?

I'm guessing you are building without optimization options. When compiled with "dmd -O -inline -noboundscheck -release tmp", the D code takes 11.1 seconds on my machine and the Python script takes 16.1 seconds. You can make the D code faster by building with LDC or GDC:

ldmd2 -O -noboundscheck -release tmp:
6.8 seconds

gdmd -O -noboundscheck -inline -release tmp:
6.1 seconds

So no, not slower than Python. I also suspect that much of the work in the Python code is actually done by functions that are implemented in C.
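
The suspicion that the Python side mostly runs C code can be checked from Python itself. The sketch below assumes CPython 3.7 or later (other implementations such as PyPy may report differently), and py_split is a hypothetical pure-Python wrapper added only for contrast.

```python
import inspect
import types

# Bound method of a str instance; in CPython this is a C-implemented builtin.
bound_split = "a b".split
# The same method looked up on the type is a C-level method descriptor.
unbound_split = str.split

is_builtin_bound = inspect.isbuiltin(bound_split)
is_descriptor = isinstance(unbound_split, types.MethodDescriptorType)


def py_split(text, sep):
    """Pure-Python wrapper, used only as a contrast case."""
    return text.split(sep)


is_builtin_py = inspect.isbuiltin(py_split)

print(is_builtin_bound, is_descriptor, is_builtin_py)
```

So the inner loop of the Python benchmark spends its time in compiled C, with the interpreter only driving the outer iteration.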
March 01, 2013
Hm, just recently a friend of mine and I hacked together at the Facebook
Hacker Cup, he in Python and I in D. My solutions were always faster by a
factor of at least 80. For your example I could not get a factor of 80,
but with -inline it is at least faster than the Python version (about 30%
faster on my machine).

Best regards,

Robert


March 01, 2013
On 3/1/13 3:58 PM, simendsjo wrote:
> On Friday, 1 March 2013 at 20:50:15 UTC, Andrei Alexandrescu wrote:
>> On 3/1/13 3:30 PM, cvk012c wrote:
>>> Tried to port my SIP parser from Python to D to boost performance
>>> but got opposite result.
>>> I created a simple test script which splits SIP REGISTER message
>>> 10 million times. Python version takes about 14 seconds to
>>> execute, D version is about 23 seconds which is 1.6 times slower.
>>> I used DMD 2.062 and compiled my script with options -release and
>>> -O. I used Python 3.3 64 bit.
>>> I ran both scripts on the same hardware with Windows 7 64.
>>
>> Add -inline to the options.
>>
>> Andrei
>
> -noboundscheck can also help if you don't mind missing the safety net.
>
> $ rdmd -O -release sip
> 22 secs, 977 ms, 299 μs, and 8 hnsecs
> $ rdmd -O -release -inline sip
> 12 secs, 245 ms, 567 μs, and 9 hnsecs
> $ rdmd -O -release -inline -noboundscheck sip
> 10 secs, 171 ms, 209 μs, and 9 hnsecs

Also, the D version has a different string to parse (~ is not a line continuation character). The fixed version:

   auto message = "REGISTER sip:example.com SIP/2.0\r
Content-Length: 0\r
Contact: <sip:12345@10.1.3.114:59788;transport=tls>;expires=4294967295;events=\"message-summary\";q=0.9\r
To: <sip:12345@comm.example.com>\r
User-Agent: (\"VENDOR=MyCompany\" \"My User Agent\")\r
Max-Forwards: 70\r
CSeq: 1 REGISTER\r
Via: SIP/2.0/TLS 10.1.3.114:59788;branch=z9hG4bK2910497772630690\r
Call-ID: 2910497622026445\r
From: <sip:12345@comm.example.com>;tag=2910497618150713\r\n\r\n";

That shaves one extra second bringing it down to 9.


Andrei
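
To make the effect of the stray "~" concrete, here is a small Python simulation of what the two D literals contain. It is a sketch under assumptions: the messages are shortened, and with_tildes spells out by hand the characters a D string literal would hold (the "~" plus the source line break, which D treats as "\n"). The variable names are illustrative.

```python
SEP = "\r\n"

# What the original D literal holds: every "~" and the line break after it
# are ordinary characters inside the string.
with_tildes = (
    "REGISTER sip:example.com SIP/2.0\r\n~\n"
    "Content-Length: 0\r\n~\n"
    "Max-Forwards: 70\r\n\r\n"
)

# The intended message, as in the fixed literal.
intended = (
    "REGISTER sip:example.com SIP/2.0\r\n"
    "Content-Length: 0\r\n"
    "Max-Forwards: 70\r\n\r\n"
)

tilde_fields = with_tildes.split(SEP)
intended_fields = intended.split(SEP)

# repr() makes the stray "~\n" prefix visible.
print(repr(tilde_fields[1]))
print(repr(intended_fields[1]))
```

Both strings split into the same number of pieces, but every header after the first carries a leading "~\n" in the first case, so the two programs were not splitting the same data.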
March 01, 2013
On Friday, 1 March 2013 at 20:58:09 UTC, simendsjo wrote:
> On Friday, 1 March 2013 at 20:50:15 UTC, Andrei Alexandrescu wrote:
>> On 3/1/13 3:30 PM, cvk012c wrote:
>>> Tried to port my SIP parser from Python to D to boost performance
>>> but got opposite result.
>>> I created a simple test script which splits SIP REGISTER message
>>> 10 million times. Python version takes about 14 seconds to
>>> execute, D version is about 23 seconds which is 1.6 times slower.
>>> I used DMD 2.062 and compiled my script with options -release and
>>> -O. I used Python 3.3 64 bit.
>>> I ran both scripts on the same hardware with Windows 7 64.
>>
>> Add -inline to the options.
>>
>> Andrei
>
> -noboundscheck can also help if you don't mind missing the safety net.
>
> $ rdmd -O -release sip
> 22 secs, 977 ms, 299 μs, and 8 hnsecs
> $ rdmd -O -release -inline sip
> 12 secs, 245 ms, 567 μs, and 9 hnsecs
> $ rdmd -O -release -inline -noboundscheck sip
> 10 secs, 171 ms, 209 μs, and 9 hnsecs


On my hardware, with the -inline option it now takes about 15 secs, which is still slower than Python, but with both -inline and -noboundscheck it takes 13 secs and finally beats Python.
But I am still kind of disappointed, because I expected a much better performance boost and got only 7%. Considering that Python is not the fastest scripting language, I think that similar Perl and Java scripts will outperform D easily.
Thanks Andrei and simendsjo for a quick response though.
March 01, 2013
cvk012c:

> I think that similar Perl and Java scripts will outperform D easily.
> Thanks Andrei and simendsjo for a quick response though.

Why don't you write a Java version? It takes only a few minutes, and you will have one more data point.

Python string functions are written in C, compiled very efficiently (the standard Python binaries on Windows are compiled with the Microsoft compiler, but builds made with the Intel compiler can also be found), and they have been well optimized over several years of work by people like Hettinger :-)

You will see Python2 code like this easily beat D for normal text files:

for line in file("foo.txt"): ...

Neither D nor performance in general is a magical thing. Performance comes from long work optimizing algorithms, code and the compilers/virtual machines that run them.

Bye,
bearophile
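
The Python 2 one-liner above relies on the file() builtin, which no longer exists in Python 3. A rough Python 3 form of the same line-by-line idiom might look like the sketch below; count_lines and demo are hypothetical helpers, and the temporary file only stands in for foo.txt.

```python
import os
import tempfile


def count_lines(path):
    """Iterate a text file line by line and count the lines."""
    total = 0
    with open(path) as handle:
        for _line in handle:
            total += 1
    return total


def demo():
    """Write a small temporary file, count its lines, then clean up."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write("first\nsecond\nthird\n")
        path = tmp.name
    try:
        return count_lines(path)
    finally:
        os.remove(path)


print(demo())
```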
March 01, 2013
On 3/1/13 4:28 PM, cvk012c wrote:
> On my hardware with -inline options it now takes about 15 secs which is
> still slower than Python but with both -inline and -noboundscheck it
> takes 13 secs and finally beats Python.
> But I still kind of disappointed because I expected a much better
> performance boost and got only 7%. Counting that Python is not the
> fastest scripting language I think that similar Perl and Java scripts
> will outperform D easily.

I doubt that.

1. Microbenchmarks are a crapshoot, they exercise a tiny portion of the language and library.

2. With Python, after comparing 2-3 idioms, that is pretty much it. With D, we doubled the speed in no time by just tuning options. D being a systems language allows you to take your code to a million places if you want to optimize.

3. split has been discussed and improved for years in the Python community (just google for e.g. python split performance). Perl isn't all that fast actually, try this (takes 30+ seconds):

$message = "REGISTER sip:comm.example.com SIP/2.0\r
Content-Length: 0\r
Contact: <sip:12345@10.1.3.114:59788;transport=tls>;expires=4294967295;events=\"message-summary\";q=0.9\r
To: <sip:12345@comm.example.com>\r
User-Agent: (\"VENDOR=MyCompany\" \"My User Agent\")\r
Max-Forwards: 70\r
CSeq: 1 REGISTER\r
Via: SIP/2.0/TLS 10.1.3.114:59788;branch=z9hG4bK2910497772630690\r
Call-ID: 2910497622026445\r
From: <sip:12345@comm.example.com>;tag=2910497618150713\r\n\r\n";

foreach my $i (0 .. 10000000)
{
    foreach my $notused (split(/\r\n/, $message))
    {
    }
}


Andrei
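
One of the few alternative idioms available on the Python side is a lazy generator, which is closer in spirit to D's splitter (a lazy range) than to the list-building str.split. A minimal sketch follows; lazy_split is a hypothetical helper, not part of any standard library, and no claim is made here about which variant is faster.

```python
def lazy_split(text, sep):
    """Yield the pieces of text separated by sep, one at a time.

    Produces the same pieces as text.split(sep) for a non-empty sep,
    but without building the whole list up front.
    """
    if not sep:
        raise ValueError("empty separator")
    start = 0
    while True:
        end = text.find(sep, start)
        if end == -1:
            # No more separators: the rest of the text is the last piece.
            yield text[start:]
            return
        yield text[start:end]
        start = end + len(sep)


message = "REGISTER sip:comm.example.com SIP/2.0\r\nContent-Length: 0\r\n\r\n"
pieces = list(lazy_split(message, "\r\n"))
print(pieces == message.split("\r\n"))
```

Timing both variants with the same message is the kind of idiom comparison point 2 refers to.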
March 01, 2013
On 03/01/2013 10:28 PM, cvk012c wrote:
> ...
>
> On my hardware with -inline options it now takes about 15 secs which is
> still slower than Python but with both -inline and -noboundscheck it
> takes 13 secs and finally beats Python.
> But I still kind of disappointed because I expected a much better
> performance boost and got only 7%. Counting that Python is not the
> fastest scripting language I think that similar Perl and Java scripts
> will outperform D easily.

Never make such statements without doing actual measurements. Furthermore, it is completely meaningless anyway. Performance benchmarks always compare language implementations, not languages.

(Case in point: You get twice the speed by using another compiler backend implementation.)
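
Since a benchmark number describes an implementation rather than a language, it helps to record which implementation produced it. A small sketch using the standard platform module; describe_runtime and timed_split are illustrative names and the message is a shortened stand-in.

```python
import platform
import timeit


def describe_runtime():
    """Return a one-line description of the running Python implementation."""
    return "{} {} [{}]".format(
        platform.python_implementation(),
        platform.python_version(),
        platform.python_compiler(),
    )


def timed_split(iterations=100_000):
    """Time str.split on a short stand-in message."""
    message = "REGISTER sip:comm.example.com SIP/2.0\r\nContent-Length: 0\r\n\r\n"
    return timeit.timeit(lambda: message.split("\r\n"), number=iterations)


# Report the timing together with the implementation that produced it.
print("{}: {:.3f} s".format(describe_runtime(), timed_split()))
```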