my first D program (and benchmark against perl)

On Thu, 12 Nov 2015 11:03:38 +0000, Tobias Pankrath via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com> wrote:
> > or with ~ operator:
> >
> > import std.stdio;
> >
> > [...]
>
> Did anyone check that the last loop isn't optimized out?

Yes, it is not optimized out.

> Could also be improved further if you make the function take an output range and reuse one appender for every call, but that might be too far off the original perl solution.

I agree, that would be too far off the original solution.
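For readers curious what Tobias's suggestion might look like, here is a minimal sketch. The function name `fmttableTo`, the small table, and the `checksum` variable are assumptions made for illustration and do not come from the thread. The function writes into any output range, so the caller can keep a single appender and clear it between calls instead of allocating a new buffer each time.

```d
import std.array : appender;
import std.range.primitives : put;
import std.stdio : write, writeln;

// Hypothetical variant of fmttable: writes into a caller-supplied output range.
void fmttableTo(Sink)(ref Sink sink, string[][] table)
{
    if (table.length == 0) return;

    // Widest cell in each column.
    auto widths = new size_t[](table[0].length);
    foreach (row; table)
        foreach (colnum, cell; row)
            if (cell.length > widths[colnum])
                widths[colnum] = cell.length;

    foreach (row; table)
    {
        put(sink, '|');
        foreach (colnum, cell; row)
        {
            put(sink, cell);
            // Pad the cell with spaces up to the column width.
            foreach (_; cell.length .. widths[colnum])
                put(sink, ' ');
            put(sink, '|');
        }
        put(sink, '\n');
    }
}

void main()
{
    auto table = [
        ["row1.1", "row1.2  ", "row1.3"],
        ["row2.1", "row2.2", "row2.3"],
    ];

    auto buf = appender!(char[])();  // one buffer shared by every call
    size_t checksum;
    foreach (i; 0 .. 1000)
    {
        buf.clear();                 // resets the length, keeps the capacity
        fmttableTo(buf, table);
        checksum += buf.data.length; // use the result in every iteration
    }
    write(buf.data);
    writeln(checksum);
}
```

As noted in the post above, this changes the shape of the function compared with the Perl original, which returns a fresh string on every call, so the two are no longer doing quite the same work.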

On Wednesday, 11 November 2015 at 14:26:32 UTC, Andrea Fontana wrote:
> Did you try rdmd -O -noboundscheck -release yourscript.d ?

I just did. It improves speed from 17.127s to 14.831s. Nice, but nowhere near gdc/ldc level.

> You should try using appender!string rather than concatenate (http://dlang.org/phobos/std_array.html#.Appender) using capacity (http://dlang.org/phobos/std_array.html#.Appender.capacity) to improve performance.
> You should also switch from for to foreach.

Thanks for the above 2 tips.

On Wednesday, 11 November 2015 at 14:20:51 UTC, Rikki Cattermole wrote:
> I turned it into mostly using large allocations, instead of small ones.
> Although I'd recommend using Appender instead of my custom functions for this.
>
> Oh and for me, I got it at 2 secs, 513 ms, 397 μs, and 5 hnsecs. Unoptimized, using dmd.
> When release mode is enabled on dmd: 1 sec, 550 ms, 838 μs, and 9 hnsecs. So significant improvement even with dmd's awful optimizer.

Hi Rikki,

Thanks. With your version, I've managed to be ~4x faster:

dmd          : 0m1.588s
dmd (release): 0m1.010s
gdc          : 0m2.093s
ldc          : 0m1.594s

Perl version : 0m11.391s

So, I'm satisfied enough with the speed for now. Turns out dmd is not always slower.
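The timings Rikki quotes appear to be a printed `Duration`, which suggests in-program timing as opposed to the shell's `time`. A minimal sketch of that approach follows, assuming a recent Phobos that ships `std.datetime.stopwatch` (in 2015 the stopwatch lived directly in `std.datetime`). The `work` function is a hypothetical stand-in for the code being measured. Summing its results into `checksum` and printing it also makes it less likely that an optimizer removes the loop.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writeln;

// Hypothetical stand-in for the function being benchmarked.
size_t work(size_t n)
{
    size_t sum;
    foreach (i; 0 .. n)
        sum += i % 7;
    return sum;
}

void main()
{
    auto sw = StopWatch(AutoStart.yes);  // starts timing immediately
    size_t checksum;
    foreach (i; 0 .. 1_000)
        checksum += work(1_000);         // keep and use every result
    sw.stop();

    writeln(sw.peek());   // elapsed time as a core.time.Duration
    writeln(checksum);    // printing the sum keeps the loop observable
}
```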

On Thu, 12 Nov 2015 12:13:10 +0000, perlancar via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com> wrote:
> On Wednesday, 11 November 2015 at 14:20:51 UTC, Rikki Cattermole wrote:
> > I turned it into mostly using large allocations, instead of
> > small ones.
> > Although I'd recommend using Appender instead of my custom
> > functions for this.
> >
> > Oh and for me, I got it at 2 secs, 513 ms, 397 μs, and 5
> > hnsecs. Unoptimized, using dmd.
> > When release mode is enabled on dmd: 1 sec, 550 ms, 838 μs, and
> > 9 hnsecs. So significant improvement even with dmds awful
> > optimizer.
>
> Hi Rikki,
>
> Thanks. With your version, I've managed to be ~4x faster:
>
> dmd          : 0m1.588s
> dmd (release): 0m1.010s
> gdc          : 0m2.093s
> ldc          : 0m1.594s
>
> Perl version : 0m11.391s
>
> So, I'm satisfied enough with the speed for now. Turns out dmd is not always slower.

It depends on which flags you use with ldc and gdc:

ldc (-singleobj -release -O3 -boundscheck=off)
gdc (-O3 -finline -frelease -fno-bounds-check)

November 12, 2015

Re: my first D program (and benchmark against perl)

Posted by Daniel Kozak
in reply to Daniel Kozak


On Thursday, 12 November 2015 at 12:25:08 UTC, Daniel Kozak wrote:
> On Thu, 12 Nov 2015 12:13:10 +0000,
> perlancar via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com>
> wrote:
>
>> On Wednesday, 11 November 2015 at 14:20:51 UTC, Rikki Cattermole wrote:
>> > I turned it into mostly using large allocations, instead of
>> > small ones.
>> > Although I'd recommend using Appender instead of my custom
>> > functions for this.
>> >
>> > Oh and for me, I got it at 2 secs, 513 ms, 397 μs, and 5
>> > hnsecs. Unoptimized, using dmd.
>> > When release mode is enabled on dmd: 1 sec, 550 ms, 838 μs, and
>> > 9 hnsecs. So significant improvement even with dmds awful
>> > optimizer.
>> 
>> Hi Rikki,
>> 
>> Thanks. With your version, I've managed to be ~4x faster:
>> 
>> dmd          : 0m1.588s
>> dmd (release): 0m1.010s
>> gdc          : 0m2.093s
>> ldc          : 0m1.594s
>> 
>> Perl version : 0m11.391s
>> 
>> So, I'm satisfied enough with the speed for now. Turns out dmd is not always slower.
>
> It depends on which flags you use with ldc and gdc:
>
>
> ldc (-singleobj -release -O3 -boundscheck=off)
> gdc (-O3 -finline -frelease -fno-bounds-check)

import std.stdio;

auto fmttable(string[][] table) {

    import std.array : appender, uninitializedArray;
    import std.range : take, repeat;
    import std.exception : assumeUnique;


    if (table.length == 0) return "";
    // column widths
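    auto widths = new int[](table[0].length);
    // one '|' per column plus a leading '|' for each row, and one newline per row
    size_t total = (table[0].length + 1) * table.length + table.length;

    foreach (rownum, row; table) {
        foreach (colnum, cell; row) {
            if (cell.length > widths[colnum])
                widths[colnum] = cast(int)cell.length;
        }
    }

    // every cell is padded to the width of its column
    foreach (colWidth; widths) {
        total += colWidth * table.length;
    }

    // preallocate the whole buffer, then reset the length but keep the capacity
    auto res = appender(uninitializedArray!(char[])(total));
    res.clear();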

    foreach (row; table) {
        res ~= "|";
        foreach (colnum, cell; row) {
            int l = widths[colnum] - cast(int)cell.length;
            res ~= cell;
            if (l)
                res ~= ' '.repeat().take(l);
            res ~= "|";
        }
        res.put("\n");
    }

    return res.data.assumeUnique();
}

void main() {

    auto table = [
        ["row1.1", "row1.2  ", "row1.3"],
        ["row2.1", "row2.2", "row2.3"],
        ["row3.1", "row3.2", "row3.3  "],
        ["row4.1", "row4.2", "row4.3"],
        ["row5.1", "row5.2", "row5.3"],
    ];

    writeln(fmttable(table));
    for (int i=0; i < 1000000; ++i) {
        fmttable(table);
    }
}

dmd -O -release -inline -boundscheck=off  asciitable.d

real	0m1.463s
user	0m1.453s
sys	0m0.003s


ldc2 -singleobj -release -O3 -boundscheck=off asciitable.d

real	0m0.945s
user	0m0.940s
sys	0m0.000s

gdc -O3 -finline -frelease -fno-bounds-check -o asciitable asciitable.d

real	0m0.618s
user	0m0.613s
sys	0m0.000s


perl:

real	0m14.198s
user	0m14.170s
sys	0m0.000s

On Thursday, 12 November 2015 at 12:49:55 UTC, Daniel Kozak wrote:
> On Thursday, 12 November 2015 at 12:25:08 UTC, Daniel Kozak wrote:
> ...
> auto res = appender(uninitializedArray!(char[])(total));
> res.clear();
> ...

This is faster for dmd and ldc:

auto res = appender!(string)();
res.reserve(total);

But for gdc (frontend version 2.066) it makes it two times slower (same for dmd and ldc 2.066 and older).
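To make the two-line change above concrete, here is a sketch of how the whole function might read with `appender!string` and `reserve`. The name `fmttableReserve` and the `rowLen` helper variable are assumptions for illustration. The size computed is the same as `total` in the earlier listing, only calculated per row first. Whether this beats the `uninitializedArray` version depends on the compiler and frontend version, as the post notes.

```d
import std.array : appender;
import std.range : repeat, take;
import std.stdio : write;

// Hypothetical variant of fmttable using appender!string plus reserve.
string fmttableReserve(string[][] table)
{
    if (table.length == 0) return "";

    // Widest cell in each column.
    auto widths = new size_t[](table[0].length);
    foreach (row; table)
        foreach (colnum, cell; row)
            if (cell.length > widths[colnum])
                widths[colnum] = cell.length;

    // Per row: a leading '|', one '|' per column, a newline, and the padded cells.
    size_t rowLen = table[0].length + 2;
    foreach (w; widths)
        rowLen += w;

    auto res = appender!string();
    res.reserve(rowLen * table.length);  // request all the space up front

    foreach (row; table)
    {
        res ~= '|';
        foreach (colnum, cell; row)
        {
            res ~= cell;
            res ~= ' '.repeat.take(widths[colnum] - cell.length);
            res ~= '|';
        }
        res ~= '\n';
    }
    return res.data;  // already a string, so no assumeUnique is needed
}

void main()
{
    write(fmttableReserve([["row1.1", "row1.2  "], ["row2.1", "row2.2"]]));
}
```

One small difference from the original listing: because the appender's element type is immutable here, the result can be returned directly, without the `assumeUnique` cast.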

On Thursday, 12 November 2015 at 12:49:55 UTC, Daniel Kozak wrote:
> dmd -O -release -inline -boundscheck=off asciitable.d
>
> real 0m1.463s
> user 0m1.453s
> sys 0m0.003s
>
>
> ldc2 -singleobj -release -O3 -boundscheck=off asciitable.d
>
> real 0m0.945s
> user 0m0.940s
> sys 0m0.000s
>
> gdc -O3 -finline -frelease -fno-bounds-check -o asciitable asciitable.d
>
> real 0m0.618s
> user 0m0.613s
> sys 0m0.000s
>
>
> perl:
>
> real 0m14.198s
> user 0m14.170s
> sys 0m0.000s

Nice! Seems like I can get a further 100% improvement in speed from the last version (so a total of ~8x speedup from my original D version). Now I wonder how C would fare...
