MD5 hash on a file and rawRead
Feb 10, 2011
Andrej Mitrovic
February 10, 2011
I'm trying to use the std.md5.sum function. Its first argument is the digest buffer that receives the hash, and the second argument is the plain data to hash.

So I'm trying to read an entire file at once. I thought about using rawRead, but I get a runtime exception:
        auto filename = r"C:\file.dat";
        File file;
        try
        {
            file = File(filename, "r");
        }
        catch (ErrnoException exc)
        {
            return;
        }
        ubyte[] buffer;
        file.rawRead(buffer);

error: stdio.d:rawRead must take a non-empty buffer

There are no size methods for the File structure (why?). There's a getSize function but it's in std.file, and I can't use it because:

        auto filename = r"C:\file.dat";
        File file;
        try
        {
            file = File(filename, "r");
        }
        catch (ErrnoException exc)
        {
            return;
        }

        ubyte[] buffer = new ubyte[](getSize(filename));
        ubyte[16] digest;
        file.rawRead(buffer);
        std.md5.sum(digest, buffer);

Error: cannot implicitly convert expression (getSize(cast(const(char[]))this._libFileName)) of type ulong to uint
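
The conversion error above comes from getSize returning a ulong while an array length is a size_t, which is a uint on 32-bit targets. A minimal sketch of one way around it, assuming an explicit checked conversion with std.conv.to is acceptable; the helper name readWhole is hypothetical, and the "rb" open mode is an assumption made so that binary data is not altered by newline translation on Windows:

```d
import std.conv : to;
import std.file : getSize;
import std.stdio : File;

/// Hypothetical helper: reads a whole file through File.rawRead.
ubyte[] readWhole(string filename)
{
    // getSize returns ulong; to!size_t narrows it and throws
    // if the value does not fit into size_t.
    auto size = to!size_t(getSize(filename));
    if (size == 0)
        return null; // rawRead rejects an empty buffer

    auto buffer = new ubyte[](size);
    // "rb" opens the file in binary mode.
    auto file = File(filename, "rb");
    // rawRead returns the slice of buffer that was actually filled.
    return file.rawRead(buffer);
}
```

The checked conversion is used instead of a plain cast so that an oversized file produces an exception rather than a silently truncated buffer.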

I can use the buffered version fine:
        auto filename = r"C:\file.dat";
        File file;
        try
        {
            file = File(filename, "r");
        }
        catch (ErrnoException exc)
        {
            return;
        }

        ubyte[16] digest;
        MD5_CTX context;
        context.start();

        foreach (ubyte[] buffer; file.byChunk(4096 * 1024))
        {
            context.update(buffer);
        }

        context.finish(digest);
        writefln("MD5 (%s) = %s", filename, digestToString(digest));

But I'd prefer to write simpler code and use rawRead to read the entire file at once. I'm reading really small files, so rawRead should be fine.

Also, why do we have file handling in two different modules? I'd expect to find all file handling ops in std.file, not scattered around Phobos.

Let me know if I'm doing something obviously stupid. :)
February 10, 2011
Also disregard that the error shows "_libFileName", I was in the middle of refactoring so the name stayed.
February 10, 2011
On 2/10/11, Andrej Mitrovic <andrej.mitrovich@gmail.com> wrote:
> Also disregard that the error shows "_libFileName", I was in the middle of refactoring so the name stayed.
>

*I mean disregard that it's called _libFileName, when it's really "filename".
February 10, 2011
On Wed, 09 Feb 2011 23:01:47 -0500, Andrej Mitrovic wrote:

> I'm trying to use the std.md5.sum function. Its first argument is the digest buffer that receives the hash, and the second argument is the plain data to hash.
> 
> So I'm trying to read an entire file at once. I thought about using
> rawRead, but I get a runtime exception:
>         auto filename = r"C:\file.dat";
>         File file;
>         try
>         {
>             file = File(filename, "r");
>         }
>         catch (ErrnoException exc)
>         {
>             return;
>         }
>         ubyte[] buffer;
>         file.rawRead(buffer);
> 
> error: stdio.d:rawRead must take a non-empty buffer
> 
> There are no size methods for the File structure (why?). There's a getSize function but it's in std.file, and I can't use it because:
> 
>         auto filename = r"C:\file.dat";
>         File file;
>         try
>         {
>             file = File(filename, "r");
>         }
>         catch (ErrnoException exc)
>         {
>             return;
>         }
> 
>         ubyte[] buffer = new ubyte[](getSize(filename));
>         ubyte[16] digest;
>         file.rawRead(buffer);
>         std.md5.sum(digest, buffer);
> 
> Error: cannot implicitly convert expression
> (getSize(cast(const(char[]))this._libFileName)) of type ulong to uint
> 
> I can use the buffered version fine:
>         auto filename = r"C:\file.dat";
>         File file;
>         try
>         {
>             file = File(filename, "r");
>         }
>         catch (ErrnoException exc)
>         {
>             return;
>         }
> 
>         ubyte[16] digest;
>         MD5_CTX context;
>         context.start();
> 
>         foreach (ubyte[] buffer; file.byChunk(4096 * 1024)) {
>             context.update(buffer);
>         }
> 
>         context.finish(digest);
>         writefln("MD5 (%s) = %s", filename, digestToString(digest));
> 
> But I'd prefer to write simpler code and use rawRead to read the entire file at once. I'm reading really small files, so rawRead should be fine.

To read an entire file at once, you should use std.file.read(), or std.file.readText() if it's a UTF-encoded text file.
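
A minimal sketch of that suggestion, combining std.file.read with an MD5 computation. It is written against std.digest.md, the module that later replaced std.md5 in Phobos, so it is a swapped-in API rather than the std.md5.sum call from the question; the helper name md5OfFile is hypothetical:

```d
import std.digest.md; // provides md5Of and toHexString
import std.file : read;

/// Hypothetical helper: returns the MD5 of a file as a hex string.
string md5OfFile(string filename)
{
    // std.file.read loads the whole file and returns void[].
    auto data = cast(const(ubyte)[]) read(filename);
    ubyte[16] digest = md5Of(data);
    // toHexString returns a fixed-size char array (uppercase by default);
    // copy it into an immutable string before returning.
    auto hex = toHexString(digest);
    return hex[].idup;
}
```

For the small files mentioned in the question this keeps everything in one call chain; for large files the byChunk loop remains the better fit because it never holds the whole file in memory.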


> Also, why do we have file handling in two different modules? I'd expect to find all file handling ops in std.file, not scattered around Phobos.
> 
> Let me know if I'm doing something obviously stupid. :)

There are actually three modules for file handling, but I think they are nicely separated:

  - std.file handles files as isolated units, i.e. it reads,
    writes and manipulates entire files.

  - std.path manipulates file/directory names as strings, and
    performs no disk I/O.

  - std.stdio is for more advanced file I/O, as it lets you
    open files and manipulate them through the File handle.
    (This includes reading, writing, seeking, etc.)
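
The division of labour in the list above can be sketched as follows. This is only an illustration under stated assumptions: the function name demoModules and the file name example.dat are hypothetical, and the file is created in the current directory and removed again at the end.

```d
import std.file : read, write, remove;  // whole-file operations
import std.path : baseName, buildPath;  // pure string manipulation
import std.stdio : File;                // handle-based I/O

/// Hypothetical demo touching each of the three modules once.
string demoModules()
{
    // std.path: builds and inspects names without touching the disk.
    auto path = buildPath(".", "example.dat");
    assert(baseName(path) == "example.dat");

    // std.file: writes and reads the file as a single unit.
    write(path, "hello world");
    auto whole = cast(const(char)[]) read(path);
    assert(whole == "hello world");

    // std.stdio: opens a handle, seeks, and reads part of the file.
    auto file = File(path, "rb");
    file.seek(6);
    auto tail = file.rawRead(new char[](5));
    file.close();

    remove(path);
    return tail.idup;
}
```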

Hope this clears things up. :)

-Lars
February 10, 2011
On 2/10/11, Lars T. Kyllingstad <public@kyllingen.nospamnet> wrote:
>
> To read an entire file at once, you should use std.file.read(), or
> std.file.readText() if it's a UTF-encoded text file.

I missed that function while browsing through the docs. Thanks.

>
> There are actually three modules for file handling, but I think they are nicely separated:
>
>   - std.file handles files as isolated units, i.e. it reads,
>     writes and manipulates entire files.
>
>   - std.path manipulates file/directory names as strings, and
>     performs no disk I/O.
>
>   - std.stdio is for more advanced file I/O, as it lets you
>     open files and manipulate them through the File handle.
>     (This includes reading, writing, seeking, etc.)
>
> Hope this clears things up. :)
>
> -Lars
>

Yeah, I know there are three modules, but I'd still prefer having one module for file manipulation and one for the path string functionality. Right now I have to keep switching between the std.stdio and std.file documentation all the time, which is how I managed to miss the read function. Thanks again though.