Thread overview
Why is my tail implementation so much slower than coreutils?
Nov 04, 2012
bioinfornatics
Nov 04, 2012
Timon Gehr
Nov 04, 2012
bioinfornatics
Nov 05, 2012
Marco Leise
November 04, 2012
My tail implementation does not provide all the features of coreutils tail. It can only read from a file (no stdin or piping), and my D implementation is slower than coreutils tail.


---- coreutils tail -----
$ time tail tail.d
                }
                else currentPosition--;
            }
            else
                currentPosition--;
        }
        if( currentPosition != f.length ) writeln( cast(string) f[currentPosition .. f.length] );
    }
    exit(0);
}

real	0m0.002s
user	0m0.001s
sys	0m0.001s
---- my tail implementation -----
$ time ./tail tail.d
                else currentPosition--;
            }
            else
                currentPosition--;
        }
        if( currentPosition != f.length ) writeln( cast(string) f[currentPosition .. f.length] );
    }
    exit(0);
}


real	0m0.011s
user	0m0.004s
sys	0m0.007s

----- Code tail.d ----
import std.mmfile;
import std.getopt;
import std.stdio    : writeln, writefln;
import std.ascii    : isWhite;
import core.stdc.stdlib : atexit, exit; // std.c.stdlib is the older, deprecated name for this module
import core.runtime : Runtime;

string programName      = "" ;
enum   programVersion   = "0.0.1";

extern (C) void terminateRuntime (){
    Runtime.terminate();
}

// True for line feed (0x0A) or carriage return (0x0D).
bool isNewline(dchar c) @safe pure nothrow {
    return c == 0x0A || c == 0x0D;
}


void main( string[] args ){
    void usage(){
        writefln( "%s [OPTION]... [FILE]...", programName );
        writefln( "%s is under GPL v3+", programName );
        writefln( "version: %s", programVersion );
        writeln("    -n, --lines=K   output the last K lines, instead of the last 10; or use -n +K to output lines starting with the Kth" );
        writeln("    -h, --help      display this help and exit" );
        writeln("    --version       output version information and exit" );
        exit(0);
    }

    void printVersion(){
        writefln( "%s is under GPL v3+", programName );
        writefln( "version: %s", programVersion ); // --version should report the version as well
        exit(0);
    }

    programName                 = args[0];
    size_t lineNumber           = 10;
    size_t currentLineNumber    = 0;
    ulong  currentPosition      = 0;
    bool   isRunning            = true;

    atexit(&terminateRuntime);

    if ( args.length == 1 ) usage();

    getopt(
        args,
        "lines|n", &lineNumber,
        "help|h" , &usage,
        "version", &printVersion
    );

    foreach( filename; args[1..args.length] ){
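        MmFile  f       = new MmFile( filename );
        // reset the per-file state so that each file is scanned independently
        currentLineNumber = 0;
        isRunning         = true;
        if ( f.length > 0 )
            currentPosition = f.length - 1 ;
        else {
            currentPosition = 0;
            isRunning       = false;
        }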

        while( isRunning ){
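            // currentPosition is unsigned, so "currentPosition - 1 < 0" can never be true;
            // test for zero to stop at the start of the file
            if ( currentLineNumber >= lineNumber || currentPosition == 0 )
                isRunning = false;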
            else if( isNewline( cast(dchar) f[currentPosition] ) ){
                currentLineNumber++;
                if(currentLineNumber >= lineNumber){
                    isRunning = false;
                    currentPosition++; // do not include the newline in the output
                }
                else currentPosition--;
            }
            else
                currentPosition--;
        }
        if( currentPosition != f.length ) writeln( cast(string) f[currentPosition .. f.length] );
    }
    exit(0);
}


November 04, 2012
How long does the hello world program take? I guess it is just that the D program has a slightly higher startup cost than the C program.
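
A minimal baseline for that comparison could look like the sketch below. The file name hello.d and the helper name greeting are only illustrative. Build it with the same compiler and flags as tail.d and run it under `time`; whatever it costs is startup and shutdown overhead that the tail program pays as well.

```d
// hello.d: does no real work, so its run time approximates
// the fixed cost of starting and stopping the D runtime.
import std.stdio : writeln;

// Hypothetical helper, kept separate only so the text can be checked.
string greeting()
{
    return "Hello, world!";
}

void main()
{
    writeln(greeting());
}
```

If `time ./hello` already reports something close to the 0.011s measured for ./tail, the scanning loop is not the place to look.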
November 04, 2012
On Sunday, 4 November 2012 at 17:36 +0100, bioinfornatics wrote:
> My tail implementation does not provide all the features of coreutils tail. It can only read from a file (no stdin or piping), and my D implementation is slower than coreutils tail.

A more convenient way to view the code: http://pastebin.geany.org/wWUYu/

Maybe MmFile is unnecessary, since I read the file only once and never go back to an earlier position in it. Does mapping the file add a cost too?
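
One way to test that idea is to drop the mapping and read the end of the file backwards in fixed-size chunks with std.stdio.File. The sketch below is only an assumption about how that could be done, not a measured improvement. The name tailOffset and the 4096-byte chunk size are illustrative, it counts only '\n' as a line ending, and it treats a newline at the very end of the file as the terminator of the last line rather than as an extra line.

```d
import std.stdio : File, stdout;

/// Returns the byte offset at which the last `n` lines of `path` begin.
/// Hypothetical helper: reads backwards in chunks instead of mapping the file.
ulong tailOffset(string path, size_t n, size_t chunkSize = 4096)
{
    auto f = File(path, "rb");
    immutable ulong size = f.size;
    if (size == 0 || n == 0) return size;

    auto buffer = new ubyte[chunkSize];
    ulong pos = size;        // everything from pos to the end is already scanned
    size_t newlines = 0;

    while (pos > 0)
    {
        immutable ulong start = pos > chunkSize ? pos - chunkSize : 0;
        f.seek(cast(long) start);
        auto chunk = f.rawRead(buffer[0 .. cast(size_t)(pos - start)]);

        foreach_reverse (i, b; chunk)
        {
            // Skip a newline that is the very last byte of the file.
            if (b == '\n' && start + i != size - 1)
            {
                if (++newlines == n) return start + i + 1;
            }
        }
        pos = start;
    }
    return 0; // fewer than n lines: output the whole file
}

void main(string[] args)
{
    if (args.length < 2) return;
    auto f = File(args[1], "rb");
    f.seek(cast(long) tailOffset(args[1], 10));
    foreach (chunk; f.byChunk(4096))
        stdout.rawWrite(chunk); // raw copy, no extra newline appended
}
```

Timing this against the MmFile version on the same input would show whether the mapping itself contributes anything measurable for a file this small.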

November 05, 2012
tail from coreutils was compiled with GCC, so use the same compiler backend (GDC) for a fair comparison. Any difference that remains comes from runtime initialization/termination, the overhead of wrapper classes around the C routines (for the memory mapping or for printing to the terminal), and possibly a faster scan routine for line endings in the Linux tool. Implementations that use SSE to scan for \n in blocks of 16 bytes are possible.
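
To illustrate the block-scanning idea without SSE intrinsics, here is a sketch in plain D that tests 8 bytes at a time with the well-known "has a zero byte" bit trick. It is a stand-in for the 16-byte SSE variant, not what coreutils actually does, and the names hasNewline and lastNewline are made up for this example.

```d
import core.stdc.string : memcpy;

/// True if any of the eight bytes packed in `w` equals '\n' (0x0A).
bool hasNewline(ulong w) pure nothrow @safe
{
    enum ulong ones  = 0x0101_0101_0101_0101;
    enum ulong highs = 0x8080_8080_8080_8080;
    immutable ulong x = w ^ (ones * 0x0A);  // bytes equal to '\n' become zero
    return ((x - ones) & ~x & highs) != 0;  // non-zero iff some byte of x is zero
}

/// Index of the last '\n' in `data`, or -1 if there is none.
ptrdiff_t lastNewline(const(ubyte)[] data)
{
    size_t end = data.length;

    // Walk backwards in 8-byte blocks and skip blocks without a newline.
    while (end >= 8)
    {
        ulong w;
        memcpy(&w, data.ptr + (end - 8), 8); // unaligned-safe 8-byte load
        if (hasNewline(w)) break;
        end -= 8;
    }

    // Finish byte by byte: either inside the block that matched,
    // or over the fewer than 8 bytes left at the front.
    while (end > 0)
    {
        if (data[end - 1] == '\n') return cast(ptrdiff_t)(end - 1);
        --end;
    }
    return -1;
}
```

With a memory-mapped file, `cast(const(ubyte)[]) f[0 .. f.length]` could be passed to lastNewline repeatedly, shrinking the slice after each hit. Whether this beats the simple byte loop on a file of a few kilobytes would have to be measured.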

-- 
Marco