October 17, 2014 String created from buffer has wrong length and strip() result is incorrect
When creating a string from a ubyte[], I have an invalid length and string.strip() doesn't strip off all whitespace. I'm new to the language. Is this a compiler issue?

    import std.string : strip;
    import std.stdio : writefln;

    int main()
    {
        const string ATA_STR = " ATA ";

        // this works fine
        {
            ubyte[] buffer = [' ', 'A', 'T', 'A', ' ' ];
            string test = strip(cast(string)(buffer));
            assert(test == strip(ATA_STR));
        }

        // This is where things breaks
        {
            ubyte[] buff = new ubyte[16];
            buff[0..ATA_STR.length] = cast(ubyte[])(ATA_STR);

            // read the string back from the buffer, stripping whitespace
            string stringFromBuffer = strip(cast(string)(buff[0..16]));
            // this shows strip() doesn't remove all whitespace
            writefln("StrFromBuff is '%s'; length %d", stringFromBuffer, stringFromBuffer.length);

            // !! FAILS. stringFromBuffer is length 15, not 3.
            assert(stringFromBuffer.length == strip(ATA_STR).length);
        }

        return 0;
    }

October 17, 2014 Re: String created from buffer has wrong length and strip() result is incorrect
Posted in reply to Lucas Burson
On Friday, 17 October 2014 at 06:29:24 UTC, Lucas Burson wrote:
> // This is where things breaks
> {
> ubyte[] buff = new ubyte[16];
> buff[0..ATA_STR.length] = cast(ubyte[])(ATA_STR);
>
> // read the string back from the buffer, stripping whitespace
> string stringFromBuffer = strip(cast(string)(buff[0..16]));
> // this shows strip() doesn't remove all whitespace
> writefln("StrFromBuff is '%s'; length %d", stringFromBuffer, stringFromBuffer.length);
>
> // !! FAILS. stringFromBuffer is length 15, not 3.
> assert(stringFromBuffer.length == strip(ATA_STR).length);
Unlike in C, strings in D are not zero-terminated by default; they are just arrays, i.e. a pair of pointer and length. You create an array of 16 bytes and cast it to string, so now you have a 16-char string. You fill the first few chars with data from ATA_STR, but the rest 10 bytes of the array are still part of the string. They were never overwritten, so they still hold their default value of zero. Since this tail of zeroes is not whitespace (tabs, spaces, etc.), 'strip' doesn't remove it.
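To make the explanation above concrete, here is a minimal sketch (not from the original posts) that rebuilds the 16-byte buffer and checks each claim with an assert. It assumes the payload is the 5-byte text " ATA " from the question; the variable names are only illustrative.

```d
import std.string : strip;

void main()
{
    // new ubyte[16] is default-initialized, so every element starts as 0.
    ubyte[] buff = new ubyte[16];
    buff[0 .. 5] = cast(const(ubyte)[]) " ATA ";

    // The cast only reinterprets the bytes; the length stays 16.
    auto text = cast(const(char)[]) buff;
    assert(text.length == 16);

    // strip removes the leading space, but the string ends in '\0',
    // which is not whitespace, so nothing is removed on the right.
    auto stripped = strip(text);
    assert(stripped.length == 15);
    assert(stripped[$ - 1] == '\0');

    // Slicing to the bytes that were actually written gives the expected result.
    assert(strip(text[0 .. 5]) == "ATA");
}
```

The last assert is the simplest fix when the payload length is known: slice first, then cast and strip.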
October 17, 2014 Re: String created from buffer has wrong length and strip() result is incorrect
Posted in reply to thedeemon
> You fill first few chars with data from
> ATA_STR but the rest 10 bytes of the array are still part of the string
Edit: you fill the first 5 chars and have 11 bytes of zeroes in the tail. My counting skills are too bad. ;)
October 17, 2014 Re: String created from buffer has wrong length and strip() result is incorrect
Posted in reply to thedeemon
On 17/10/14 09:29, thedeemon via Digitalmars-d-learn wrote:
> On Friday, 17 October 2014 at 06:29:24 UTC, Lucas Burson wrote:
>
>> // This is where things breaks
>> {
>> ubyte[] buff = new ubyte[16];
>> buff[0..ATA_STR.length] = cast(ubyte[])(ATA_STR);
>>
>> // read the string back from the buffer, stripping whitespace
>> string stringFromBuffer = strip(cast(string)(buff[0..16]));
>> // this shows strip() doesn't remove all whitespace
>> writefln("StrFromBuff is '%s'; length %d", stringFromBuffer,
>> stringFromBuffer.length);
>>
>> // !! FAILS. stringFromBuffer is length 15, not 3.
>> assert(stringFromBuffer.length == strip(ATA_STR).length);
>
> Unlike C, strings in D are not zero-terminated by default, they are just arrays,
> i.e. a pair of pointer and size. You create an array of 16 bytes and cast it to
> string, now you have a 16-chars string. You fill first few chars with data from
> ATA_STR but the rest 10 bytes of the array are still part of the string, not
> initialized with data, so having zeroes. Since this tail of zeroes is not
> whitespace (tabs or spaces etc.) 'strip' doesn't remove it.
Side-note: since your string has those zeroes at the end, strip only removes the space at the start (hence the final length of 15) instead of at both ends.
d
October 17, 2014 Re: String created from buffer has wrong length and strip() result is incorrect
Posted in reply to spir
On Friday, 17 October 2014 at 08:31:04 UTC, spir via Digitalmars-d-learn wrote:
> On 17/10/14 09:29, thedeemon via Digitalmars-d-learn wrote:
>> On Friday, 17 October 2014 at 06:29:24 UTC, Lucas Burson wrote:
>>
>>> // This is where things breaks
>>> {
>>> ubyte[] buff = new ubyte[16];
>>> buff[0..ATA_STR.length] = cast(ubyte[])(ATA_STR);
>>>
>>> // read the string back from the buffer, stripping whitespace
>>> string stringFromBuffer = strip(cast(string)(buff[0..16]));
>>> // this shows strip() doesn't remove all whitespace
>>> writefln("StrFromBuff is '%s'; length %d", stringFromBuffer,
>>> stringFromBuffer.length);
>>>
>>> // !! FAILS. stringFromBuffer is length 15, not 3.
>>> assert(stringFromBuffer.length == strip(ATA_STR).length);
>>
>> Unlike C, strings in D are not zero-terminated by default, they are just arrays,
>> i.e. a pair of pointer and size. You create an array of 16 bytes and cast it to
>> string, now you have a 16-chars string. You fill first few chars with data from
>> ATA_STR but the rest 10 bytes of the array are still part of the string, not
>> initialized with data, so having zeroes. Since this tail of zeroes is not
>> whitespace (tabs or spaces etc.) 'strip' doesn't remove it.
>
> Side-note: since your string has those zeroes at the end, strip only removes the space at start (thus, final size=15), instead of at both ends.
>
> d
Okay, things are becoming clearer. The cast to string is nothing like the C++ string ctor; I made a bad assumption.
So, given the buffer below, would I use fromStringz (is this in the stdlib?) to cast it from a null-terminated buffer to a good string? Shouldn't the compiler give a warning about casting a buffer to a string without using fromStringz?
Buffer = [ 0x20, 0x41, 0x54, 0x41, 0x20, 0x00, 0x00, ...]
October 17, 2014 Re: String created from buffer has wrong length and strip() result is incorrect
Posted in reply to Lucas Burson
On Fri, 17 Oct 2014 15:24:21 +0000 Lucas Burson via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com> wrote:
> So given the below buffer would I use fromStringz (is this in the stdlib?) to cast it from a null-terminated buffer to a good string? Shouldn't the compiler give a warning about casting a buffer to a string without using fromStringz?
if you are really-really sure that your buffer is null-terminated, you can use this trick:

    import std.conv;
    string s = to!string(cast(char*)buff.ptr);

please note, that this is NOT SAFE. you'd better doublecheck that your buffer is not empty and is null-terminated.
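On the fromStringz question: current Phobos does provide `fromStringz` in `std.string`. The sketch below (not from the original posts) puts it next to the `to!string` trick, assuming the same zero-padded 16-byte buffer as in the question; the buffer contents are made up for illustration. Both calls scan for the first '\0' with no length limit, so the "NOT SAFE" caveat applies to either one.

```d
import std.conv : to;
import std.string : fromStringz, strip;

void main()
{
    // Hypothetical buffer: 5 bytes of payload followed by zero padding.
    ubyte[] buff = new ubyte[16];
    buff[0 .. 5] = cast(const(ubyte)[]) " ATA ";

    // fromStringz returns a slice of the buffer itself (no copy),
    // ending just before the first '\0'.
    const(char)[] view = fromStringz(cast(const(char)*) buff.ptr);
    assert(view == " ATA ");

    // to!string on a char* also stops at the first '\0', but makes a copy.
    string copied = to!string(cast(const(char)*) buff.ptr);
    assert(copied == " ATA ");

    // With the padding gone, strip behaves as expected.
    assert(strip(copied) == "ATA");
}
```

The practical difference is ownership: the slice from `fromStringz` changes if the buffer is overwritten later, while the `to!string` result does not.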
October 17, 2014 Re: String created from buffer has wrong length and strip() result is incorrect
On Fri, 17 Oct 2014 18:30:43 +0300 ketmar via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com> wrote:
> > Shouldn't the compiler give a warning about casting a buffer to a string without using fromStringz?
nope. such casting is perfectly legal, as D strings can contain embedded '\0's.
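A tiny sketch of what "embedded '\0's" means in practice (illustrative only, not from the original posts): the zero byte is an ordinary element of the array and counts toward the length.

```d
void main()
{
    string s = "AT\0A";

    // The '\0' is just another char; nothing is cut off at that position.
    assert(s.length == 4);
    assert(s[2] == '\0');
    assert(s[3] == 'A');

    // So the string is not equal to its C-style prefix.
    assert(s != "AT");
}
```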
October 17, 2014 Re: String created from buffer has wrong length and strip() result is incorrect
Posted in reply to ketmar
On Friday, 17 October 2014 at 15:30:52 UTC, ketmar via Digitalmars-d-learn wrote:
> On Fri, 17 Oct 2014 15:24:21 +0000
> Lucas Burson via Digitalmars-d-learn
> <digitalmars-d-learn@puremagic.com> wrote:
>
>> So given the below buffer would I use fromStringz (is this in the stdlib?) to cast it from a null-terminated buffer to a good string? Shouldn't the compiler give a warning about casting a buffer to a string without using fromStringz?
> if you are really-really sure that your buffer is null-terminated, you
> can use this trick:
>
> import std.conv;
> string s = to!string(cast(char*)buff.ptr);
>
> please note, that this is NOT SAFE. you'd better doublecheck that your
> buffer is not empty and is null-terminated.
The buffer is populated from a SCSI ioctl, so it "should" be only ASCII and null-terminated, but it's a good idea to harden the code a bit.
Thank you for your help!
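One possible way to harden this, sketched under assumptions: instead of handing a raw pointer to a routine that scans until it finds a '\0', search for the terminator inside the slice, so the scan can never run past the end of the buffer. The helper name `textFromBuffer` is hypothetical and not from the thread, and the sketch assumes the bytes are ASCII as described above (invalid UTF-8 could make `strip` throw).

```d
import std.algorithm : countUntil;
import std.string : strip;

/// Hypothetical helper: the text up to the first '\0', or the whole slice
/// if there is no '\0', with surrounding whitespace removed.
string textFromBuffer(const(ubyte)[] buf)
{
    auto end = buf.countUntil(ubyte(0));        // -1 when no zero byte exists
    auto bytes = end < 0 ? buf : buf[0 .. end];
    return strip(cast(const(char)[]) bytes).idup;
}

unittest
{
    // Zero-padded buffer, as in the original question.
    ubyte[] padded = new ubyte[16];
    padded[0 .. 5] = cast(const(ubyte)[]) " ATA ";
    assert(textFromBuffer(padded) == "ATA");

    // A buffer without a terminator is handled as well.
    ubyte[] unterminated = [' ', 'A', 'T', 'A', ' '];
    assert(textFromBuffer(unterminated) == "ATA");

    // An empty buffer gives an empty string.
    assert(textFromBuffer(null) == "");
}
```

The unittest block can be run with `dmd -unittest -main`. The search is bounded by `buf.length`, and `.idup` copies the result so the returned string does not alias the mutable buffer.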
October 17, 2014 Re: String created from buffer has wrong length and strip() result is incorrect
Posted in reply to Lucas Burson
On Fri, 17 Oct 2014 16:08:04 +0000 Lucas Burson via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com> wrote:
> The buffer is populated from a scsi ioctl so it "should" be only
> ascii and null-terminated but it's a good idea to harden the code
> a bit.
> Thank you for your help!
i developed a habit of making such buffers one byte bigger than necessary and just setting the last byte to 0 before converting. this way it's guaranteed to be 0-terminated.
October 18, 2014 Re: String created from buffer has wrong length and strip() result is incorrect
Posted in reply to ketmar
On Friday, 17 October 2014 at 17:40:09 UTC, ketmar via Digitalmars-d-learn wrote:
> i developed a habit of making such buffers one byte bigger than
> necessary and just setting the last byte to 0 before converting. this
> way it's guaranteed to be 0-terminated.
Perfect, great idea. Below is my utility method to pull strings out of a buffer.

    import std.conv : to;
    import std.string : strip;

    /**
     * Get a string from buffer where the string spans [offset_start, offset_end).
     * The result ends at the first null byte inside that span, if there is one,
     * and has leading and trailing whitespace stripped.
     * Params:
     *  buffer = Buffer with an ASCII string to obtain.
     *  offset_start = Beginning byte offset within the buffer where the string starts.
     *  offset_end = Ending byte offset which is not included in the string.
     */
    string bufferGetString(ubyte[] buffer, size_t offset_start, size_t offset_end)
    in
    {
        assert(buffer !is null);
        assert(offset_start < offset_end);
        assert(offset_end <= buffer.length);
    }
    do
    {
        size_t bufflen = offset_end - offset_start;

        // add one to the length for null-termination
        ubyte[] temp = new ubyte[bufflen + 1];
        temp[0 .. bufflen] = buffer[offset_start .. offset_end];
        temp[bufflen] = '\0';

        // to!string on a char* reads up to the first '\0' and copies the text
        return strip(to!string(cast(const char*) temp.ptr));
    }

    unittest
    {
        ubyte[] no_null = [' ', 'A', 'B', 'C', ' '];

        assert("ABC" == bufferGetString(no_null, 0, no_null.length));
        assert("ABC" == bufferGetString(no_null, 1, no_null.length-1));
        assert("A" == bufferGetString(no_null, 1, 2));
    }