to delete the '\0' characters (page 2)

September 22, 2022

Re: to delete the '\0' characters

Posted by Ali Çehreli
in reply to Salih Dincer

On 9/22/22 14:31, Salih Dincer wrote:

> string splitz(string s)
> {
>    import std.string : indexOf;
>    auto seekPos = s.indexOf('\0');
>    return seekPos > 0 ? s[0..seekPos] : s;
> }

If you have multiple '\0' chars that you will continue looking for, how about the following?

import std;

auto splitz(string s) {
    return s.splitter('\0');
}

unittest {
    auto data = [ "hello", "and", "goodbye", "world" ];
    auto hasZeros = data.joiner("\0").text;
    assert(hasZeros.count('\0') == 3);
    assert(hasZeros.splitz.equal(data));
}

void main() {
}

Ali

September 23, 2022

Re: to delete the '\0' characters

Posted by Salih Dincer
in reply to Ali Çehreli

On Thursday, 22 September 2022 at 21:49:36 UTC, Ali Çehreli wrote:

> If you have multiple '\0' chars that you will continue looking for, how about the following?

It may be preferable if you want to work with ranges. But it is less useful when more than one '\0' occurs in a row, and it returns a range rather than a string. For example:

    auto data = [ "hello", "and", "goodbye", "world" ];
    auto hasZeros = data.joiner("\0\0").text; // "hello\0\0and\0\0goodbye\0\0world"

    assert(hasZeros.count('\0') == 6); // 3 separators, 2 characters each
    assert(hasZeros.splitz.walkLength == data.length * 2 - 1);

    auto range = hasZeros.splitz; // ("hello", "", "and", "", "goodbye", "", "world")

SDB@79

September 23, 2022

Re: to delete the '\0' characters

Posted by Quirin Schroll
in reply to Salih Dincer

On Thursday, 22 September 2022 at 10:53:32 UTC, Salih Dincer wrote:

> Is there a more accurate way to delete the '\0' characters at the end of the string?

Accurate? No. Your code works. Correct is correct, no matter efficiency or style.

> I tried functions in this module: https://dlang.org/phobos/std_string.html
>
> [code]

You won’t do it any shorter than this if returning a range of dchar is fine:

auto removez(const(char)[] str, char ch = '\0')
{
    import std.algorithm.iteration : joiner, splitter;
    // Split at every ch and lazily concatenate the pieces (yields dchar).
    return str.splitter(ch).joiner;
}

If dchar is a problem and a range is not what you want,

inout(char)[] removez(inout(char)[] chars) @safe pure nothrow
{
    import std.array, std.algorithm.iteration;
    auto data = cast(const(ubyte)[])chars;
    auto result = data.splitter(0).joiner.array;
    return (() inout @trusted => cast(inout(char)[])result)();
}

Bonus: Works with any kind of array of qualified char. As string is simply immutable(char)[], removez returns a string given a string, but returns a char[] given a char[], etc.

Warning: I do not know if the @trusted expression is really okay. The cast is not @safe because of type qualifiers: If inout becomes nothing (i.e. mutable), the cast removes const. I suspect that it is still okay because the result of array is unique. Maybe others know better?
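
If a plain string is all that is needed, one way to sidestep both the dchar range and the @trusted cast might be std.array.replace. This is only a sketch under that assumption, not the splitter/joiner approach above; the name removeZeros is made up for illustration, and it handles string only, not every qualified char array.

```d
import std.array : replace;

// Hypothetical helper (not from the thread): removes every '\0',
// not only the trailing ones.
string removeZeros(string s)
{
    // Replace each "\0" with the empty string.
    return s.replace("\0", "");
}

void main()
{
    assert(removeZeros("a\0b\0\0c\0") == "abc");
    assert(removeZeros("no zeros") == "no zeros");
    assert(removeZeros("\0\0") == "");
}
```

Unlike the inout version, this gives up the "char[] in, char[] out" property in exchange for having no casts.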

September 23, 2022

Re: to delete the '\0' characters

Posted by Jesse Phillips
in reply to Salih Dincer

On Friday, 23 September 2022 at 08:50:42 UTC, Salih Dincer wrote:

> On Thursday, 22 September 2022 at 21:49:36 UTC, Ali Çehreli wrote:
>
>> If you have multiple '\0' chars that you will continue looking for, how about the following?
>
> It may be preferable if you want to work with ranges. But it is less useful when more than one '\0' occurs in a row, and it returns a range rather than a string. For example:
>
>     auto data = [ "hello", "and", "goodbye", "world" ];
>     auto hasZeros = data.joiner("\0\0").text; // "hello\0\0and\0\0goodbye\0\0world"
>
>     assert(hasZeros.count('\0') == 6); // 3 separators, 2 characters each
>     assert(hasZeros.splitz.walkLength == data.length * 2 - 1);
>
>     auto range = hasZeros.splitz; // ("hello", "", "and", "", "goodbye", "", "world")
>
> SDB@79

You should be explicit with requirements. It was hard to tell if your original code was correct.

auto splitz(string s) {
    return s.splitter('\0')
            .filter!(x => !x.empty);
}
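
As a rough check of what this filtered version yields on the doubled-separator data discussed above, here is a small self-contained sketch. The explicit imports and the trailing "\0\0" in the sample are assumptions added for illustration.

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : filter, splitter;
import std.range.primitives : empty;

auto splitz(string s)
{
    // Split at every '\0', then drop the empty pieces that
    // consecutive or trailing '\0' characters produce.
    return s.splitter('\0')
            .filter!(x => !x.empty);
}

void main()
{
    auto hasZeros = "hello\0\0and\0\0goodbye\0\0world\0\0";
    assert(hasZeros.splitz.equal(["hello", "and", "goodbye", "world"]));
}
```

Note that the result is still a lazy range of slices, not a single string.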

September 23, 2022

Re: to delete the '\0' characters

Posted by Salih Dincer
in reply to Jesse Phillips

On Friday, 23 September 2022 at 14:38:35 UTC, Jesse Phillips wrote:

> You should be explicit with requirements.

Sorry, I mostly speak Turkish, and English is a foreign language for me, but I think what I wrote was clear. What do you think when you look at the text I point to below?

On Thursday, 22 September 2022 at 10:53:32 UTC, Salih Dincer wrote:

> Is there a more accurate way to delete the '\0' characters at the end of the string?

* characterS
* at the END
* of the STRING

> auto splitz(string s) {
>     return s.splitter('\0')
>             .filter!(x => !x.empty);
> }

By the way, if we're going to filter, why are we splitting? Anyway, indexOf() is a powerful enough tool for this. It is also pretty fast here: there are at most 8 '\0' characters, all at the end of the string, so the search only needs to start 8 characters before the end. For example:

void main()
{
  string[] samples = ["the one\0", "the two\0\0", "the three\0\0\0",
                      "the four\0\0\0\0", "the five\0\0\0\0\0",
                      "the six\0\0\0\0\0\0", "the seven\0\0\0\0\0\0\0",
                      "the eight\0\0\0\0\0\0\0\0"];

  import std.stdio : writefln;
  foreach(s; samples)
  {
    auto start = s.length > 8 ? s.length - 8 : 0; // avoid underflow on short strings
    string res = s.splitZeros!false(start);
    writefln("%(%02X%)", cast(const(ubyte)[])res); // do not cast away immutable
  }
}

string splitZeros(bool keepSep)(string s, size_t start = 0)
{
  auto keep = keepSep ? 0 : 1;

  import std.string : indexOf;
  if(auto seekPos = s.indexOf('\0', start) + 1)
  {
    return s[0..seekPos - keep];
  }
  return s;
}

SDB@79

September 23, 2022

Re: to delete the '\0' characters

Posted by Ali Çehreli
in reply to Salih Dincer

On 9/23/22 11:37, Salih Dincer wrote:

> * character**S**
> * at the **END**
> * of the **STRING**

I think the misunderstanding is due to the following data you've posted earlier (I am abbreviating):

53 F6 6E 6D 65 64 65 6E 20 79 75 72 64 75 6D 75 6E 20 FC 73 74 FC 6E 64 65 20 74 FC 74 65 6E 20 65 6E 20 73 6F 6E 20 6F 63 61 6B 0
4F 20 62 65 6E 69 6D 20 6D 69 6C 6C 65 74 69 6D 69 6E 20 79 131 6C 64 131 7A 131 64 131 72 20 70 61 72 6C 61 79 61 63 61 6B 0 0 0 0

You must have meant there were multiple strings there (apparently on separate lines) but I assumed you were showing a single string with 0 bytes inside the string. (Word wrap must have contributed to the misunderstanding.)

Ali

P.S. With that understanding, now I think searching from the end for the first non-zero byte may be faster than searching from the beginning for the first zero; but again, it depends on the data.
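
The idea in the P.S. (scan backwards for the first non-zero byte) could be sketched roughly as follows. The function name trimTrailingZeros is hypothetical, and nothing here has been benchmarked against the indexOf version.

```d
// Hypothetical helper: drops trailing '\0' characters by scanning
// from the end until a non-zero code unit is found.
string trimTrailingZeros(string s)
{
    size_t end = s.length;
    while (end > 0 && s[end - 1] == '\0')
        --end;
    return s[0 .. end]; // a slice of the original, no allocation
}

void main()
{
    assert(trimTrailingZeros("the two\0\0") == "the two");
    assert(trimTrailingZeros("a\0b\0") == "a\0b"); // interior '\0' is kept
    assert(trimTrailingZeros("\0\0") == "");
    assert(trimTrailingZeros("") == "");
}
```

The work done is proportional to the number of trailing zeros, so it should suit data where the padding is short compared to the text.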

September 23, 2022

Re: to delete the '\0' characters

Posted by Paul Backus
in reply to Salih Dincer

On Friday, 23 September 2022 at 18:37:59 UTC, Salih Dincer wrote:

> On Thursday, 22 September 2022 at 10:53:32 UTC, Salih Dincer wrote:
>
>> Is there a more accurate way to delete the '\0' characters at the end of the string?
>
> * characterS
> * at the END
> * of the STRING

Apologies for the confusion. You can use stripRight for this:

import std.string: stripRight;
import std.stdio: writeln;

void main()
{
    string[] samples = [
        "the one\0", "the two\0\0", "the three\0\0\0", "the four\0\0\0\0",
        "the five\0\0\0\0\0", "the six\0\0\0\0\0\0",
        "the seven\0\0\0\0\0\0\0", "the eight\0\0\0\0\0\0\0\0"
    ];

    foreach (s; samples) {
        writeln(s.stripRight("\0"));
    }
}
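
A few assertions may make the behavior of the two-argument stripRight easier to see. This is a small sketch with made-up sample strings: the second argument is treated as a set of characters to strip, and only the right end is touched.

```d
import std.string : stripRight;

void main()
{
    // Trailing '\0' characters are removed.
    assert("the two\0\0".stripRight("\0") == "the two");
    // A '\0' inside the string is left alone.
    assert("a\0b\0\0".stripRight("\0") == "a\0b");
    // The argument is a set: spaces and '\0' are both stripped here.
    assert("the one \0 \0".stripRight(" \0") == "the one");
    // Strings without trailing '\0' come back unchanged.
    assert("no zeros".stripRight("\0") == "no zeros");
}
```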

September 24, 2022

Re: to delete the '\0' characters

Posted by Salih Dincer
in reply to Paul Backus

On Friday, 23 September 2022 at 22:17:51 UTC, Paul Backus wrote:

> Apologies for the confusion. You can use stripRight

We have a saying: Estaghfirullah!

Thank you all so much; this has been very useful for me.

I learned two things:

First, we can use strip() functions with parameters:
https://dlang.org/phobos/std_algorithm_mutation.html#.strip

(examples are very nice)

Second, we can walk through the string in reverse and test each character with indexOf():
https://github.com/dlang/phobos/blob/master/std/string.d#L3418

Source Code:

//import std.string : stripRight;/*
string stripRight(string str, const(char)[] chars)
{
  import std.range.primitives : back, empty, popBack; // range primitives for strings
  import std.string : indexOf;
  for (; !str.empty; str.popBack())
  {
    if (chars.indexOf(str.back) == -1)
      break;
  }
  return str;
}//*/

Delicious...

SDB@79
