Difference between chunks(stdin, 1) and stdin.rawRead?
Mar 28
jms
March 28
Why, in the silly program below, do I read both the \r and \n characters when using rawRead in block a, but apparently only the \n characters when looping over 1-byte chunks in block b?

I'm on Windows 11 using DMD64 D Compiler v2.107.1, if that matters, but I suspect this has something to do with stdin in general that I'm not aware of. Any pointers to understanding what's going on would be appreciated.

import std.stdio;

void main() {
    int i;
a: {
        i = 0;
        writeln("\nin a");
        ubyte[1] buffer;
        while (true) {
            i++;
            stdin.rawRead(buffer);
            if (buffer[0] == 13) {
                write("CR");
            } else if (buffer[0] == 10) {
                write("LF");
            }
            if (i > 5) {
                goto b;
            }

        }
    }
b: {

        writeln("\n\nin b");
        i = 0;
        foreach (ubyte[] buffer; chunks(stdin, 1)) {
            i++;
            if (buffer[0] == 13) {
                write("cr");
            } else if (buffer[0] == 10) {
                write("lf");
            }
            if (i > 5) {
                goto a;
            }
        }
    }

}



Output:
in a

CRLF
CRLF
CRLF

in b

lf
lf
lf
lf
lf
lf
in a
March 28
On Thursday, 28 March 2024 at 02:30:11 UTC, jms wrote:
[...]

I think I figured it out and the difference is probably in the mode. This documentation https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/fread?view=msvc-170 mentions that "If the given stream is opened in text mode, Windows-style newlines are converted into Unix-style newlines. That is, carriage return-line feed (CRLF) pairs are replaced by single line feed (LF) characters."

And rawRead's documentation mentions that "rawRead always reads in binary mode on Windows.", which I guess should have given me a clue. chunks must be using text mode.
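
For anyone who wants to check this without typing into a console, here is a minimal sketch that compares the two read paths on a scratch file instead of stdin. The file name and the helper names (writeSample, countCrRawRead, countCrChunks) are made up for illustration. The file is opened with "r" in both cases, so any difference comes from how the bytes are read, not from how the file was opened. On Windows I would expect rawRead to report two CR bytes and chunks to report none, matching the output above; on other systems, where text and binary mode are the same, both should report two.

```d
import std.stdio;
static import std.file;
import std.path : buildPath;

/// Writes the six bytes "a\r\nb\r\n" to a scratch file (hypothetical name)
/// and returns its path. "wb" keeps the bytes exactly as given.
string writeSample()
{
    auto path = buildPath(std.file.tempDir(), "crlf_mode_demo.txt");
    auto f = File(path, "wb");
    f.rawWrite("a\r\nb\r\n");
    f.close();
    return path;
}

/// Counts CR (13) bytes seen when the file is read with rawRead.
size_t countCrRawRead(string path, string mode)
{
    auto f = File(path, mode);
    ubyte[64] buffer;
    size_t count;
    while (true)
    {
        // rawRead returns the slice it actually filled; empty means end of file.
        auto got = f.rawRead(buffer[]);
        if (got.length == 0)
            break;
        foreach (b; got)
            if (b == 13)
                count++;
    }
    return count;
}

/// Counts CR (13) bytes seen when the file is read in 1-byte chunks.
size_t countCrChunks(string path, string mode)
{
    auto f = File(path, mode);
    size_t count;
    foreach (ubyte[] chunk; chunks(f, 1))
        if (chunk[0] == 13)
            count++;
    return count;
}

void main()
{
    auto path = writeSample();
    scope (exit) std.file.remove(path);
    // Same open mode for both; only the reading call differs.
    writeln("rawRead, mode \"r\": ", countCrRawRead(path, "r"), " CR bytes");
    writeln("chunks,  mode \"r\": ", countCrChunks(path, "r"), " CR bytes");
}
```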

March 28
On Thu, Mar 28, 2024 at 10:10:43PM +0000, jms via Digitalmars-d-learn wrote:
> On Thursday, 28 March 2024 at 02:30:11 UTC, jms wrote:
[...]
> I think I figured it out and the difference is probably in the mode.
> This documentation
> https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/fread?view=msvc-170
> mentions that "If the given stream is opened in text mode,
> Windows-style newlines are converted into Unix-style newlines. That
> is, carriage return-line feed (CRLF) pairs are replaced by single line
> feed (LF) characters."
> 
> And rawRead's documentation mentions that "rawRead always reads in binary mode on Windows.", which I guess should have given me a clue. chunks must be using text mode.

It's not so much that chunks is using text mode, but that you opened the file in text mode.  On Windows, if you don't want CRLF translation you need to open your file with File(filename, "rb"), not just File(filename, "r"), because the latter defaults to text mode.
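
A small sketch of the "rb" suggestion, assuming a scratch file in the temp directory (the file name and the readViaChunks helper are hypothetical, not part of std.stdio). It writes "a\r\nb\r\n" as raw bytes and then collects what chunks yields for each open mode. With "rb" the CR bytes should come through untouched on every platform; with "r" I would expect them to be dropped on Windows only.

```d
import std.stdio;
static import std.file;
import std.path : buildPath;

/// Collects every byte that chunks() yields for `path` opened with `mode`.
ubyte[] readViaChunks(string path, string mode)
{
    auto f = File(path, mode);
    ubyte[] result;
    // chunks reuses its buffer between iterations, so append a copy of each chunk.
    foreach (ubyte[] chunk; chunks(f, 4))
        result ~= chunk;
    return result;
}

void main()
{
    auto path = buildPath(std.file.tempDir(), "crlf_rb_demo.txt");
    // std.file.write stores the bytes as given, with no newline translation.
    std.file.write(path, "a\r\nb\r\n");
    scope (exit) std.file.remove(path);

    writeln("rb: ", readViaChunks(path, "rb")); // binary mode
    writeln("r:  ", readViaChunks(path, "r"));  // text mode on Windows
}
```

Note that stdin is already open when main starts, so there is no mode string to change there; reading it with rawRead, as block a of the original program does, is the route this thread shows to work.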


T

-- 
There's light at the end of the tunnel. It's the oncoming train.