November 12, 2015
On Thu, 12 Nov 2015 11:03:38 +0000
Tobias Pankrath via Digitalmars-d-learn
<digitalmars-d-learn@puremagic.com> wrote:

> > or with ~ operator:
> >
> > import std.stdio;
> >
> > [...]
> 
> Did anyone check that the last loop isn't optimized out?

Yes, it is not optimized out.

> Could also be improved further if you make the function take an output range and reuse one appender for every call, but that might be too far off the original Perl solution.

I agree, that would be too far off the original solution.
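For reference, the output-range idea could look roughly like the sketch below. The name `fmttableTo` and the tiny table in `main` are made up for illustration and are not code from this thread; the sketch only assumes Phobos' `isOutputRange`, `put` and `Appender.clear`. Note that it still allocates the `widths` array on every call.

```d
import std.array : appender;
import std.range.primitives : isOutputRange, put;

// Hypothetical variant of fmttable: writes into any char output range
// instead of returning a freshly allocated string.
void fmttableTo(Sink)(ref Sink sink, string[][] table)
    if (isOutputRange!(Sink, char))
{
    if (table.length == 0) return;

    // Widest cell per column, taken over all rows.
    auto widths = new size_t[](table[0].length);
    foreach (row; table)
        foreach (colnum, cell; row)
            if (cell.length > widths[colnum])
                widths[colnum] = cell.length;

    foreach (row; table) {
        put(sink, '|');
        foreach (colnum, cell; row) {
            put(sink, cell);
            // Pad the cell with spaces up to the column width.
            foreach (_; cell.length .. widths[colnum])
                put(sink, ' ');
            put(sink, '|');
        }
        put(sink, '\n');
    }
}

void main() {
    import std.stdio : writeln;

    auto table = [["a", "bbb"], ["cc", "d"]];
    auto buf = appender!(char[])();

    foreach (i; 0 .. 3) {
        buf.clear();            // drops the contents, keeps the buffer for reuse
        fmttableTo(buf, table);
    }
    writeln(buf.data);
}
```

The caller owns the buffer here, so repeated calls can reuse one allocation, which is exactly why it no longer resembles the Perl version that returns a new string each time.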

November 12, 2015
On Wednesday, 11 November 2015 at 14:26:32 UTC, Andrea Fontana wrote:
> Did you try rdmd -O -noboundscheck -release yourscript.d ?

I just did. It improves speed from 17.127s to 14.831s. Nice, but nowhere near gdc/ldc level.

> You should try using appender!string rather than concatenation (http://dlang.org/phobos/std_array.html#.Appender), using capacity (http://dlang.org/phobos/std_array.html#.Appender.capacity), to improve performance.

> You should also switch from for to foreach.

Thanks for the above 2 tips.
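A minimal sketch of how I understand the two tips, not the actual script: the helper names `joinConcat` and `joinAppender` are made up for illustration. The first grows a string with `~=`, the second uses `appender!string` with `reserve` and a `foreach` over a range instead of a C-style `for`.

```d
import std.array : appender;

// Builds n copies of piece with the ~= operator.
string joinConcat(string piece, size_t n) {
    string s;
    foreach (_; 0 .. n)
        s ~= piece;             // may reallocate as the string grows
    return s;
}

// Same result via Appender, reserving the final size up front.
string joinAppender(string piece, size_t n) {
    auto app = appender!string();
    app.reserve(piece.length * n);
    foreach (_; 0 .. n)
        app.put(piece);
    return app.data;
}

void main() {
    import std.stdio : writeln;

    assert(joinConcat("ab", 3) == joinAppender("ab", 3));
    writeln(joinAppender("ab", 3)); // prints "ababab"
}
```

Whether the appender version is actually faster depends on the compiler and Phobos version, so it is worth timing both.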

November 12, 2015
On Wednesday, 11 November 2015 at 14:20:51 UTC, Rikki Cattermole wrote:
> I turned it into mostly using large allocations, instead of small ones.
> Although I'd recommend using Appender instead of my custom functions for this.
>
> Oh and for me, I got it at 2 secs, 513 ms, 397 μs, and 5 hnsecs. Unoptimized, using dmd.
> When release mode is enabled on dmd: 1 sec, 550 ms, 838 μs, and 9 hnsecs. So significant improvement even with dmd's awful optimizer.

Hi Rikki,

Thanks. With your version, I've managed to be ~4x faster:

dmd          : 0m1.588s
dmd (release): 0m1.010s
gdc          : 0m2.093s
ldc          : 0m1.594s

Perl version : 0m11.391s

So, I'm satisfied enough with the speed for now. Turns out dmd is not always slower.
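My numbers above are from the shell's `time`. For in-program timing in the "secs, ms, μs, hnsecs" format, something like the following sketch should work. It assumes a compiler recent enough to ship `std.datetime.stopwatch` (older releases had `StopWatch` in `std.datetime`), and `timeIt` is a made-up helper name. The counter is printed at the end so the timed loop has a visible side effect and is less likely to be optimized away.

```d
import core.time : Duration;
import std.datetime.stopwatch : AutoStart, StopWatch;

// Runs work() the given number of times and returns the elapsed time.
Duration timeIt(void delegate() work, size_t iterations) {
    auto sw = StopWatch(AutoStart.yes);
    foreach (_; 0 .. iterations)
        work();
    return sw.peek();
}

void main() {
    import std.stdio : writeln;

    size_t counter;
    auto elapsed = timeIt({ counter += 1; }, 1_000_000);

    // Duration prints in the "secs, ms, μs, hnsecs" style.
    writeln(elapsed, " (counter = ", counter, ")");
}
```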
November 12, 2015
On Thu, 12 Nov 2015 12:13:10 +0000
perlancar via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com>
wrote:

> On Wednesday, 11 November 2015 at 14:20:51 UTC, Rikki Cattermole wrote:
> > I turned it into mostly using large allocations, instead of
> > small ones.
> > Although I'd recommend using Appender instead of my custom
> > functions for this.
> >
> > Oh and for me, I got it at 2 secs, 513 ms, 397 μs, and 5
> > hnsecs. Unoptimized, using dmd.
> > When release mode is enabled on dmd: 1 sec, 550 ms, 838 μs, and
> > 9 hnsecs. So significant improvement even with dmd's awful
> > optimizer.
> 
> Hi Rikki,
> 
> Thanks. With your version, I've managed to be ~4x faster:
> 
> dmd          : 0m1.588s
> dmd (release): 0m1.010s
> gdc          : 0m2.093s
> ldc          : 0m1.594s
> 
> Perl version : 0m11.391s
> 
> So, I'm satisfied enough with the speed for now. Turns out dmd is not always slower.

It depends on which flags you use with ldc and gdc:

ldc (-singleobj -release -O3 -boundscheck=off)
gdc (-O3 -finline -frelease -fno-bounds-check)

November 12, 2015
On Thursday, 12 November 2015 at 12:25:08 UTC, Daniel Kozak wrote:
> On Thu, 12 Nov 2015 12:13:10 +0000
> perlancar via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com>
> wrote:
>
>> On Wednesday, 11 November 2015 at 14:20:51 UTC, Rikki Cattermole wrote:
>> > I turned it into mostly using large allocations, instead of
>> > small ones.
>> > Although I'd recommend using Appender instead of my custom
>> > functions for this.
>> >
>> > Oh and for me, I got it at 2 secs, 513 ms, 397 μs, and 5
>> > hnsecs. Unoptimized, using dmd.
>> > When release mode is enabled on dmd: 1 sec, 550 ms, 838 μs, and
>> > 9 hnsecs. So significant improvement even with dmd's awful
>> > optimizer.
>> 
>> Hi Rikki,
>> 
>> Thanks. With your version, I've managed to be ~4x faster:
>> 
>> dmd          : 0m1.588s
>> dmd (release): 0m1.010s
>> gdc          : 0m2.093s
>> ldc          : 0m1.594s
>> 
>> Perl version : 0m11.391s
>> 
>> So, I'm satisfied enough with the speed for now. Turns out dmd is not always slower.
>
> It depends on which flags you use with ldc and gdc:
>
> ldc (-singleobj -release -O3 -boundscheck=off)
> gdc (-O3 -finline -frelease -fno-bounds-check)

import std.stdio;

auto fmttable(string[][] table) {

    import std.array : appender, uninitializedArray;
    import std.range : take, repeat;
    import std.exception : assumeUnique;


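    if (table.length == 0) return "";

    // Widest cell in each column.
    auto widths = new int[](table[0].length);
    // Output size: one '|' per column, plus a leading '|' and a '\n' per row...
    size_t total = (table[0].length + 1) * table.length + table.length;

    foreach (row; table) {
        foreach (colnum, cell; row) {
            if (cell.length > widths[colnum])
                widths[colnum] = cast(int)cell.length;
        }
    }

    // ...plus the padded cell text: every row uses the full width of each column.
    foreach (colWidth; widths) {
        total += colWidth * table.length;
    }

    // Allocate the whole buffer up front, then clear() so that appending
    // starts at the beginning while the allocated memory is kept.
    auto res = appender(uninitializedArray!(char[])(total));
    res.clear();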

    foreach (row; table) {
        res ~= "|";
        foreach (colnum, cell; row) {
            int l = widths[colnum] - cast(int)cell.length;
            res ~= cell;
            if (l)
                res ~= ' '.repeat().take(l);
            res ~= "|";
        }
        res.put("\n");
    }

    return res.data.assumeUnique();
}

void main() {

    auto table = [
        ["row1.1", "row1.2  ", "row1.3"],
        ["row2.1", "row2.2", "row2.3"],
        ["row3.1", "row3.2", "row3.3  "],
        ["row4.1", "row4.2", "row4.3"],
        ["row5.1", "row5.2", "row5.3"],
    ];

    writeln(fmttable(table));
    for (int i=0; i < 1000000; ++i) {
        fmttable(table);
    }
}

dmd -O -release -inline -boundscheck=off  asciitable.d

real	0m1.463s
user	0m1.453s
sys	0m0.003s


ldc2 -singleobj -release -O3 -boundscheck=off asciitable.d

real	0m0.945s
user	0m0.940s
sys	0m0.000s

gdc -O3 -finline -frelease -fno-bounds-check -o asciitable asciitable.d

real	0m0.618s
user	0m0.613s
sys	0m0.000s


perl:

real	0m14.198s
user	0m14.170s
sys	0m0.000s
November 12, 2015
On Thursday, 12 November 2015 at 12:49:55 UTC, Daniel Kozak wrote:
> On Thursday, 12 November 2015 at 12:25:08 UTC, Daniel Kozak wrote:
> ...	
>     auto res = appender(uninitializedArray!(char[])(total));
>     res.clear();
> ...

This is faster for dmd and ldc:

auto res = appender!(string)();
res.reserve(total);

But for gdc (frontend version 2.066) it is two times slower (the same goes for dmd and ldc 2.066 and older).


November 12, 2015
On Thursday, 12 November 2015 at 12:49:55 UTC, Daniel Kozak wrote:
> dmd -O -release -inline -boundscheck=off  asciitable.d
>
> real	0m1.463s
> user	0m1.453s
> sys	0m0.003s
>
>
> ldc2 -singleobj -release -O3 -boundscheck=off asciitable.d
>
> real	0m0.945s
> user	0m0.940s
> sys	0m0.000s
>
> gdc -O3 -finline -frelease -fno-bounds-check -o asciitable asciitable.d
>
> real	0m0.618s
> user	0m0.613s
> sys	0m0.000s
>
>
> perl:
>
> real	0m14.198s
> user	0m14.170s
> sys	0m0.000s

Nice! Seems like I can get a further 100% improvement in speed from the last version (so a total of ~8x speedup from my original D version). Now I wonder how C would fare...
