Slower than Python

March 01, 2013

Posted by cvk012c

I tried to port my SIP parser from Python to D to boost performance
but got the opposite result.
I created a simple test script that splits a SIP REGISTER message
10 million times. The Python version takes about 14 seconds to
execute; the D version takes about 23 seconds, which is 1.6 times slower.
I used DMD 2.062 and compiled my script with the options -release and
-O. I used 64-bit Python 3.3.
I ran both scripts on the same hardware with 64-bit Windows 7.
Is this a general problem with string performance in D or just
a splitter() issue?
Has anybody compared the performance of D's string manipulation
functions with other languages like Python, Perl, Java and C++?


Here is Python version of test script:

import time

message = "REGISTER sip:comm.example.com SIP/2.0\r\n\
Content-Length: 0\r\n\
Contact: <sip:12345@10.1.3.114:59788;transport=tls>;expires=4294967295;events=\"message-summary\";q=0.9\r\n\
To: <sip:12345@comm.example.com>\r\n\
User-Agent: (\"VENDOR=MyCompany\" \"My User Agent\")\r\n\
Max-Forwards: 70\r\n\
CSeq: 1 REGISTER\r\n\
Via: SIP/2.0/TLS 10.1.3.114:59788;branch=z9hG4bK2910497772630690\r\n\
Call-ID: 2910497622026445\r\n\
From: <sip:12345@comm.example.com>;tag=2910497618150713\r\n\r\n"

t1 = time.perf_counter()
for i in range(10000000):
   for notused in message.split("\r\n"):
     pass
print(time.perf_counter()-t1)


Here is D version:
import std.stdio,std.algorithm,std.datetime;

void main()
{
   auto message = "REGISTER sip:example.com SIP/2.0\r\n~
Content-Length: 0\r\n~
Contact: <sip:12345@10.1.3.114:59788;transport=tls>;expires=4294967295;events=\"message-summary\";q=0.9\r\n~
To: <sip:12345@comm.example.com>\r\n~
User-Agent: (\"VENDOR=MyCompany\" \"My User Agent\")\r\n~
Max-Forwards: 70\r\n~
CSeq: 1 REGISTER\r\n~
Via: SIP/2.0/TLS 10.1.3.114:59788;branch=z9hG4bK2910497772630690\r\n~
Call-ID: 2910497622026445\r\n~
From: <sip:12345@comm.example.com>;tag=2910497618150713\r\n\r\n";

   auto t1 = Clock.currTime();
   foreach(i; 0..10000000)
   {
     foreach(notused; splitter(message, "\r\n"))
     {
     }
   }
   writeln(Clock.currTime()-t1);
}

March 01, 2013

Re: Slower than Python

Posted by Andrei Alexandrescu
in reply to cvk012c

On 3/1/13 3:30 PM, cvk012c wrote:
> Tried to port my SIP parser from Python to D to boost performance
> but got opposite result.
> I created a simple test script which splits SIP REGISTER message
> 10 million times. Python version takes about 14 seconds to
> execute, D version is about 23 seconds which is 1.6 times slower.
> I used DMD 2.062 and compiled my script with options -release and
> -O. I used Python 3.3 64 bit.
> I ran both scripts on the same hardware with Windows 7 64.

Add -inline to the options.

Andrei

March 01, 2013

Re: Slower than Python

Posted by simendsjo
in reply to Andrei Alexandrescu

On Friday, 1 March 2013 at 20:50:15 UTC, Andrei Alexandrescu wrote:
> On 3/1/13 3:30 PM, cvk012c wrote:
>> Tried to port my SIP parser from Python to D to boost performance
>> but got opposite result.
>> I created a simple test script which splits SIP REGISTER message
>> 10 million times. Python version takes about 14 seconds to
>> execute, D version is about 23 seconds which is 1.6 times slower.
>> I used DMD 2.062 and compiled my script with options -release and
>> -O. I used Python 3.3 64 bit.
>> I ran both scripts on the same hardware with Windows 7 64.
>
> Add -inline to the options.
>
> Andrei

-noboundscheck can also help if you don't mind missing the safety net.

$ rdmd -O -release sip
22 secs, 977 ms, 299 μs, and 8 hnsecs
$ rdmd -O -release -inline sip
12 secs, 245 ms, 567 μs, and 9 hnsecs
$ rdmd -O -release -inline -noboundscheck sip
10 secs, 171 ms, 209 μs, and 9 hnsecs

March 01, 2013

Re: Slower than Python

Posted by jerro
in reply to cvk012c

On Friday, 1 March 2013 at 20:30:24 UTC, cvk012c wrote:
> Tried to port my SIP parser from Python to D to boost performance
> but got opposite result.
> I created a simple test script which splits SIP REGISTER message
> 10 million times. Python version takes about 14 seconds to
> execute, D version is about 23 seconds which is 1.6 times slower.
> I used DMD 2.062 and compiled my script with options -release and
> -O. I used Python 3.3 64 bit.
> I ran both scripts on the same hardware with Windows 7 64.
> Is this general problem with string performance in D or just
> splitter() issue?
> Did anybody compared performance of D string manipulating
> functions with other languages like Python,Perl,Java and C++?

I'm guessing you are building without optimization options. When compiled with "dmd -O -inline -noboundscheck -release tmp" the D code takes 11.1 seconds on my machine and the python script takes 16.1 seconds. You can make the D code faster by building with LDC or GDC:

ldmd2 -O -noboundscheck -release tmp:
6.8 seconds

gdmd -O -noboundscheck -inline -release tmp:
6.1 seconds

So no, not slower than Python. I also suspect that much of the work in the Python code is actually done by functions that are implemented in C.

March 01, 2013

Re: Slower than Python

Posted by Robert
in reply to cvk012c

Hm, just recently a friend of mine and I hacked together at the FB
hacking cup, he in Python and I in D. My solutions were always faster
by a factor of at least 80. For your example, I could not get a factor
of 80, but with -inline it is at least faster than the Python version
(about 30% faster on my machine).

Best regards,

Robert

March 01, 2013

Re: Slower than Python

Posted by Andrei Alexandrescu
in reply to simendsjo

On 3/1/13 3:58 PM, simendsjo wrote:
> On Friday, 1 March 2013 at 20:50:15 UTC, Andrei Alexandrescu wrote:
>> On 3/1/13 3:30 PM, cvk012c wrote:
>>> Tried to port my SIP parser from Python to D to boost performance
>>> but got opposite result.
>>> I created a simple test script which splits SIP REGISTER message
>>> 10 million times. Python version takes about 14 seconds to
>>> execute, D version is about 23 seconds which is 1.6 times slower.
>>> I used DMD 2.062 and compiled my script with options -release and
>>> -O. I used Python 3.3 64 bit.
>>> I ran both scripts on the same hardware with Windows 7 64.
>>
>> Add -inline to the options.
>>
>> Andrei
>
> -noboundscheck can also help if you don't mind missing the safety net.
>
> $ rdmd -O -release sip
> 22 secs, 977 ms, 299 μs, and 8 hnsecs
> $ rdmd -O -release -inline sip
> 12 secs, 245 ms, 567 μs, and 9 hnsecs
> $ rdmd -O -release -inline -noboundscheck sip
> 10 secs, 171 ms, 209 μs, and 9 hnsecs

Also, the D version parses a different string: ~ is not a line continuation character, so each ~ and the line break after it end up in the string. The fixed version:

   auto message = "REGISTER sip:example.com SIP/2.0\r
Content-Length: 0\r
Contact: <sip:12345@10.1.3.114:59788;transport=tls>;expires=4294967295;events=\"message-summary\";q=0.9\r
To: <sip:12345@comm.example.com>\r
User-Agent: (\"VENDOR=MyCompany\" \"My User Agent\")\r
Max-Forwards: 70\r
CSeq: 1 REGISTER\r
Via: SIP/2.0/TLS 10.1.3.114:59788;branch=z9hG4bK2910497772630690\r
Call-ID: 2910497622026445\r
From: <sip:12345@comm.example.com>;tag=2910497618150713\r\n\r\n";

That shaves one extra second bringing it down to 9.


Andrei
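
To make the difference between the two literals concrete, here is a small Python sketch that mimics both. It is only an illustration: the message is shortened to two headers, and the names `intended` and `with_tilde` are made up for this example. In Python a backslash at the end of a line inside a string literal is a continuation, while in the original D literal each ~ and the line break after it are ordinary characters.

```python
# Hypothetical, shortened message: two headers are enough to show the effect.

# What the Python script builds: backslash-newline inside the literal is a
# line continuation, so nothing extra ends up in the string.
intended = "REGISTER sip:example.com SIP/2.0\r\n\
Content-Length: 0\r\n\r\n"

# What the original D literal contains: "~" does not continue the line,
# so the "~" and the newline after it stay in the string.
with_tilde = "REGISTER sip:example.com SIP/2.0\r\n~\nContent-Length: 0\r\n\r\n"

intended_fields = intended.split("\r\n")
tilde_fields = with_tilde.split("\r\n")

print(intended_fields)
print(tilde_fields)

# Each continued line carries two extra characters ("~" and "\n").
extra_chars = len(with_tilde) - len(intended)
print(extra_chars)
```

Both strings split into the same number of fields, but every field after the first starts with stray characters in the second one, so the two benchmarks were not splitting the same input.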

March 01, 2013

Re: Slower than Python

Posted by cvk012c
in reply to simendsjo

On Friday, 1 March 2013 at 20:58:09 UTC, simendsjo wrote:
> On Friday, 1 March 2013 at 20:50:15 UTC, Andrei Alexandrescu wrote:
>> On 3/1/13 3:30 PM, cvk012c wrote:
>>> Tried to port my SIP parser from Python to D to boost performance
>>> but got opposite result.
>>> I created a simple test script which splits SIP REGISTER message
>>> 10 million times. Python version takes about 14 seconds to
>>> execute, D version is about 23 seconds which is 1.6 times slower.
>>> I used DMD 2.062 and compiled my script with options -release and
>>> -O. I used Python 3.3 64 bit.
>>> I ran both scripts on the same hardware with Windows 7 64.
>>
>> Add -inline to the options.
>>
>> Andrei
>
> -noboundscheck can also help if you don't mind missing the safety net.
>
> $ rdmd -O -release sip
> 22 secs, 977 ms, 299 μs, and 8 hnsecs
> $ rdmd -O -release -inline sip
> 12 secs, 245 ms, 567 μs, and 9 hnsecs
> $ rdmd -O -release -inline -noboundscheck sip
> 10 secs, 171 ms, 209 μs, and 9 hnsecs

On my hardware, with the -inline option it now takes about 15 secs, which is still slower than Python, but with both -inline and -noboundscheck it takes 13 secs and finally beats Python.
But I'm still kind of disappointed because I expected a much better performance boost and got only 7%. Considering that Python is not the fastest scripting language, I think that similar Perl and Java scripts will outperform D easily.
Thanks to Andrei and simendsjo for the quick response, though.

March 01, 2013

Re: Slower than Python

Posted by bearophile
in reply to cvk012c

cvk012c:

> I think that similar Perl and Java scripts will outperform D easily.
> Thanks Andrei and simendsjo for a quick response though.

Why don't you write a Java version? It takes only a few minutes, and you will have one more data point.

Python string functions are written in C, compiled very efficiently (the standard Python binaries on Windows are compiled with the Microsoft compiler, but builds made with the Intel compiler can also be found), and they have been well optimized over several years of work by people like Hettinger :-)

You will see Python2 code like this easily beat D for normal text files:

for line in file("foo.txt"): ...

Neither D nor performance in general is a magical thing. Performance comes from long work on optimizing algorithms, code and the compilers/virtual machines that run them.

Bye,
bearophile
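
One rough way to see how much of the Python benchmark runs in C is to time the built-in `str.split` against a splitter written as a Python-level loop. This is only a sketch: `split_pure` is a hypothetical helper written for this comparison, the message is shortened, and even this loop still leans on `str.find`, which is itself implemented in C. No timings are quoted here because they depend on the machine and the interpreter.

```python
import timeit


def split_pure(text, sep):
    """Split text on sep with an explicit Python-level loop (no str.split)."""
    parts = []
    start = 0
    while True:
        idx = text.find(sep, start)
        if idx == -1:
            # No separator left: the rest of the text is the last field.
            parts.append(text[start:])
            return parts
        parts.append(text[start:idx])
        start = idx + len(sep)


# Shortened, hypothetical message in the same shape as the one in the thread.
message = ("REGISTER sip:comm.example.com SIP/2.0\r\n"
           "Content-Length: 0\r\n"
           "Max-Forwards: 70\r\n\r\n")

# Both approaches must agree before timing them means anything.
same_result = split_pure(message, "\r\n") == message.split("\r\n")
print(same_result)

builtin_time = timeit.timeit(lambda: message.split("\r\n"), number=100_000)
pure_time = timeit.timeit(lambda: split_pure(message, "\r\n"), number=100_000)
print(f"built-in: {builtin_time:.3f}s, Python loop: {pure_time:.3f}s")
```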

March 01, 2013

Re: Slower than Python

Posted by Andrei Alexandrescu
in reply to cvk012c

On 3/1/13 4:28 PM, cvk012c wrote:
> On my hardware with -inline options it now takes about 15 secs which is
> still slower than Python but with both -inline and -noboundscheck it
> takes 13 secs and finally beats Python.
> But I'm still kind of disappointed because I expected a much better
> performance boost and got only 7%. Counting that Python is not the
> fastest scripting language I think that similar Perl and Java scripts
> will outperform D easily.

I doubt that.

1. Microbenchmarks are a crapshoot, they exercise a tiny portion of the language and library.

2. With Python, after comparing 2-3 idioms - well, this is pretty much it. We doubled the speed in no time by just tuning options. D, being a systems language, allows you to take your code to a million places if you want to optimize.

3. split has been discussed and improved for years in the Python community (just google for e.g. python split performance). Perl isn't all that fast actually, try this (takes 30+ seconds):

# \@ keeps Perl from interpolating e.g. @comm as an array inside the string.
$message = "REGISTER sip:comm.example.com SIP/2.0\r
Content-Length: 0\r
Contact: <sip:12345\@10.1.3.114:59788;transport=tls>;expires=4294967295;events=\"message-summary\";q=0.9\r
To: <sip:12345\@comm.example.com>\r
User-Agent: (\"VENDOR=MyCompany\" \"My User Agent\")\r
Max-Forwards: 70\r
CSeq: 1 REGISTER\r
Via: SIP/2.0/TLS 10.1.3.114:59788;branch=z9hG4bK2910497772630690\r
Call-ID: 2910497622026445\r
From: <sip:12345\@comm.example.com>;tag=2910497618150713\r\n\r\n";

foreach my $i (0 .. 10000000)
{
    foreach my $notused (split(/\r\n/, $message))
    {
    }
}

Andrei
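
For the "comparing 2-3 idioms" point, a comparison on the Python side might look like the sketch below. It assumes a shortened message and picks two idioms (`str.split` and a precompiled `re` pattern) purely as examples; the `idioms` table and the variable names are made up for this illustration, and no timings are quoted because they vary by machine and Python version.

```python
import re
import timeit

# Shortened, hypothetical message in the same shape as the one in the thread.
message = ("REGISTER sip:comm.example.com SIP/2.0\r\n"
           "Content-Length: 0\r\n"
           "CSeq: 1 REGISTER\r\n\r\n")

crlf = re.compile(r"\r\n")  # compiled once, outside the timed code

idioms = {
    "str.split": lambda: message.split("\r\n"),
    "re.split": lambda: crlf.split(message),
}

# Check that every idiom yields the same fields before comparing speed.
expected = message.split("\r\n")
all_agree = all(fn() == expected for fn in idioms.values())
print(all_agree)

for name, fn in idioms.items():
    # repeat() returns one total per run; the minimum is the least noisy.
    best = min(timeit.repeat(fn, number=100_000, repeat=3))
    print(f"{name}: {best:.3f}s per 100,000 splits")
```

Checking that the idioms agree first matters: two variants that split different strings, or split them differently, are not measuring the same work.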

March 01, 2013

Re: Slower than Python

Posted by Timon Gehr
in reply to cvk012c

On 03/01/2013 10:28 PM, cvk012c wrote:
> ...
>
> On my hardware with -inline options it now takes about 15 secs which is
> still slower than Python but with both -inline and -noboundscheck it
> takes 13 secs and finally beats Python.
> But I'm still kind of disappointed because I expected a much better
> performance boost and got only 7%. Counting that Python is not the
> fastest scripting language I think that similar Perl and Java scripts
> will outperform D easily.

Never make such statements without doing actual measurements. Furthermore, it is completely meaningless anyway. Performance benchmarks always compare language implementations, not languages.

(Case in point: You get twice the speed by using another compiler backend implementation.)
