[phobos] stdin.byChunk seems broken
Jun 04, 2010
Sean Kelly
June 04, 2010
Given the sample:

    import std.algorithm, std.concurrency, std.stdio;


    void main()
    {
        enum bufferSize = 1;
        auto tid = spawn( &fileWriter );
        // Read loop
        /+ BUG: stdio can't handle the immutable buffer
        foreach( immutable(ubyte)[] buffer; stdin.byChunk( bufferSize ) )
            send( tid, buffer );
        +/
        foreach( const(ubyte)[] buffer; stdin.byChunk( bufferSize ) )
            send( tid, buffer );
    }


    void fileWriter()
    {
        // Write loop
        for( ; ; )
        {
            // BUG: stdio can't handle the immutable buffer
            //auto buffer = receiveOnly!(immutable(ubyte)[])();
            auto buffer = receiveOnly!(const(ubyte)[])();
            writeln( "rx: ", buffer.field[0] );
        }
    }

The output I see is:

	abacus:tdpl sean$ ch13_7
	aaaa
	rx: 10
	rx: 10
	rx: 10
	rx: 10
	rx: 10
	^C
	abacus:tdpl sean$

Why, if I send four 'a' characters, do I receive five \n characters (byte value 10)?  It seems like the last character read into the buffer is overwriting the preceding data.  I just thought I'd mention this in case someone has the time and inclination to look into it.
June 04, 2010
The reason is very simple - byChunk reuses the buffer for each chunk. That's why you are getting garbled data, and that's why you can't use immutable with byChunk.

There are two issues now:

1. send() should NEVER accept const(ubyte)[], since a const slice may still alias mutable thread-local data. So you have a bug: the program as written shouldn't compile.

2. We need a sort of byImmutableChunk or something that creates a new buffer every pass, or simply recommend that people use .idup when they want to send stuff over.


Andrei
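
The buffer reuse described above can be reproduced without threads. The following is a minimal sketch, not code from the thread: the scratch file name and the two helper functions (collectAliased, collectCopied) are hypothetical names introduced only for illustration. It reads the same four bytes twice, once keeping the slices byChunk hands out and once copying each with .idup.

```d
import std.file : remove, write;
import std.stdio : File, writeln;

// Hypothetical helper: keeps every chunk slice as-is, without copying.
ubyte[][] collectAliased(string path, size_t size)
{
    ubyte[][] result;
    foreach (ubyte[] chunk; File(path).byChunk(size))
        result ~= chunk;      // every slice refers to byChunk's single buffer
    return result;
}

// Hypothetical helper: copies each chunk before keeping it.
immutable(ubyte)[][] collectCopied(string path, size_t size)
{
    immutable(ubyte)[][] result;
    foreach (ubyte[] chunk; File(path).byChunk(size))
        result ~= chunk.idup; // fresh immutable copy, safe to keep or send
    return result;
}

void main()
{
    enum path = "bychunk_demo.tmp"; // assumption: scratch file in the working directory
    write(path, "abcd");
    scope (exit) remove(path);

    writeln(collectAliased(path, 1));
    writeln(collectCopied(path, 1));
}
```

Given the buffer reuse documented for byChunk, the first list is expected to repeat the last byte read (100, i.e. 'd'), while the second preserves 97 through 100. That is the same effect as the five "rx: 10" lines in the original report, where the trailing newline overwrote each 'a'.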

June 04, 2010
Thanks.  I haven't seen the final copy of TDPL (obviously) but I'm guessing the example in 13.7 should be something like this then?

    import std.algorithm, std.concurrency, std.stdio;

    void main()
    {
        enum bufferSize = 1024 * 100;
        auto tid = spawn( &fileWriter );
        // Read loop
        foreach( ubyte[] buffer; stdin.byChunk( bufferSize ) )
            send( tid, buffer.idup );
    }

    void fileWriter()
    {
        auto tgt = stdout; // assumption: tgt is not declared in this post; stdout is a guess
        // Write loop
        for( ; ; )
        {
            auto buffer = receiveOnly!(immutable(ubyte)[])();
            tgt.write( buffer );
        }
    }

That builds and runs correctly for me, though having such a large chunk size for stdin is a bit weird :-)
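
For comparison, here is a hedged sketch of the other option raised earlier in the thread, a byImmutableChunk-style range that hands out a fresh buffer on every pass. No such function is stated to exist in Phobos; the helper below is hypothetical and simply wraps byChunk with map and .idup. The writer also handles OwnerTerminated so that it stops once main returns, and it uses stdout.rawWrite as an assumed stand-in for the undeclared tgt.

```d
import std.algorithm : map;
import std.concurrency : OwnerTerminated, receive, send, spawn;
import std.stdio : File, stdin, stdout;

// Hypothetical helper (not a Phobos function): yields an immutable
// copy of each chunk instead of a slice of the shared buffer.
auto byImmutableChunk(File f, size_t size)
{
    return f.byChunk(size).map!(chunk => chunk.idup);
}

void main()
{
    enum bufferSize = 4096; // assumption: any positive size works here
    auto tid = spawn(&fileWriter);
    // Read loop: each buffer is already immutable, so send() can take it as-is
    foreach (immutable(ubyte)[] buffer; stdin.byImmutableChunk(bufferSize))
        send(tid, buffer);
}

void fileWriter()
{
    // Write loop: runs until the owner thread (main) terminates
    for (bool running = true; running; )
    {
        receive(
            (immutable(ubyte)[] buffer) { stdout.rawWrite(buffer); },
            (OwnerTerminated _) { running = false; }
        );
    }
}
```

The trade-off is one allocation per chunk, which is the price of being able to pass the data to another thread without sharing a mutable buffer.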
