Trying to alias this a grapheme range + making it a forward range

July 08, 2019
Problem 1:

I'm trying to get a string to behave as a .byGrapheme range by default, but I can't figure out Grapheme. I'm trying to replicate this behavior:

foreach (g; "hello".byGrapheme) {
    write(g[]);
}

In a custom type:

struct ustring {
    string data;
    this(string data) {
        this.data = data;
    }
    auto get() {
        static struct Range {
            typeof(string.init.byGrapheme) source;
            bool empty() { return source.empty; }
            void popFront() { source.popFront; }
            auto front() { return source.front[]; }
            auto save() { return this; };
        }
        return Range(this.data.byGrapheme);
    }
    alias get this;
}

But I keep on ending up with a UTFException: "Encoding an invalid code point in UTF-8" with code like:

writeln("hello".ustring);

Problem 2:

How can I get the aliased ustring type to behave as a ForwardRange? If I add the save method to the Voldemort range type, isForwardRange!ustring fails, because isForwardRange requires that save return the same type it is called on. That is not the case here, since typeof(ustring.save) == ustring.get.Range. Nonetheless, the range does have a save method.

Cheers,
- Ali

July 09, 2019
On 08.07.19 23:55, aliak wrote:
> struct ustring {
>      string data;
>      this(string data) {
>          this.data = data;
>      }
>      auto get() {
>          static struct Range {
>              typeof(string.init.byGrapheme) source;
>              bool empty() { return source.empty; }
>              void popFront() { source.popFront; }
>              auto front() { return source.front[]; }
>              auto save() { return this; };
>          }
>          return Range(this.data.byGrapheme);
>      }
>      alias get this;
> }
> 
> But I keep on ending up with a UTFException: "Encoding an invalid code point in UTF-8" with code like:
> 
> writeln("hello".ustring);

`source.front` is a temporary `Grapheme` and you're calling `opSlice` on it. The documentation for `Grapheme.opSlice` warns: "Invalidates when this Grapheme leaves the scope, attempts to use it then would lead to memory corruption." [1]

So you can't return `source.front[]` from your `front`. You'll have to store the current `front` in your struct, I guess.
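
A minimal sketch of one way to avoid the dangling slice, assuming it is acceptable to allocate a fresh `string` for every grapheme. Instead of storing the `Grapheme`, `front` copies the code points out while a named local `Grapheme` is still alive. The name `GraphemeStrings` is hypothetical and used only for illustration:

```d
import std.array : array;
import std.uni : byGrapheme;
import std.utf : toUTF8;

// Hypothetical wrapper: yields each grapheme as an owned string.
struct GraphemeStrings
{
    typeof(string.init.byGrapheme) source;

    bool empty() { return source.empty; }
    void popFront() { source.popFront(); }

    string front()
    {
        // Keep the Grapheme in a named local so the slice stays valid
        // for the duration of this call.
        auto g = source.front;
        // Copy the code points into a dchar[] and encode them as UTF-8,
        // so nothing refers to `g` after it goes out of scope.
        return g[].array.toUTF8;
    }
}

unittest
{
    string[] got;
    foreach (s; GraphemeStrings("noe\u0308l".byGrapheme))
        got ~= s;
    // 'e' followed by a combining diaeresis forms a single grapheme.
    assert(got == ["n", "o", "e\u0308", "l"]);
}
```

This can be compiled and run with something like `dmd -unittest -main -run sketch.d` (the file name is arbitrary). The allocation per element is the price paid for not holding on to the temporary.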

Also, returning a fresh range on every `alias this` call is asking for trouble. This is an infinite loop:

    auto u = "hello".ustring;
    while (!u.empty) u.popFront();

because `u.empty` and `u.popFront` are called on fresh, non-empty, independent ranges.
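
The effect can be seen without actually looping forever. The following sketch uses a hypothetical `Wrapper` over an `int[]` as a stand-in for `ustring` (so it is not the original code), with the same `alias get this` pattern:

```d
// Hypothetical stand-in for ustring: `get` builds a new range on every call.
struct Wrapper
{
    int[] data;

    auto get()
    {
        static struct Range
        {
            int[] source;
            bool empty() { return source.length == 0; }
            int front() { return source[0]; }
            void popFront() { source = source[1 .. $]; }
        }
        return Range(data);
    }
    alias get this;
}

unittest
{
    auto w = Wrapper([1, 2, 3]);
    w.popFront();          // becomes w.get().popFront(): pops a temporary range
    w.popFront();          // same again, on another temporary
    assert(w.front == 1);  // yet another fresh range, still at the first element
    assert(!w.empty);      // so `while (!w.empty) w.popFront();` would never end
}
```

Each member access goes through `get`, so no state survives from one call to the next.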

> Problem 2:
> 
> How can I get the aliased ustring type to behave as a ForwardRange? If I add the save method to the Voldemort range type, isForwardRange!ustring fails, because isForwardRange requires that save return the same type it is called on. That is not the case here, since typeof(ustring.save) == ustring.get.Range. Nonetheless, the range does have a save method.

You must provide a `save` that returns a `ustring`. There's no way around it.

Maybe make `ustring` itself the range. In the code you've shown, the `alias this` only seems to make everything more complicated. But you might have good reasons for it, of course.

By the way, you're not calling `source.save` in `Range.save`. You're just copying `source`. I don't know if that's effectively the same, and even if it is, I'd advise calling `.save` explicitly. Better safe than sorry.
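
A rough sketch of what "make `ustring` itself the range" could look like, under the assumptions that the `alias this` is dropped, that `front` may allocate a new `string` per grapheme, and that the original `data` field is not needed afterwards. This is one possible layout, not the only one:

```d
import std.algorithm.comparison : equal;
import std.array : array;
import std.range.primitives : isForwardRange;
import std.uni : byGrapheme;
import std.utf : toUTF8;

struct ustring
{
    private typeof(string.init.byGrapheme) source;

    this(string data) { source = data.byGrapheme; }

    @property bool empty() { return source.empty; }
    void popFront() { source.popFront(); }

    @property string front()
    {
        auto g = source.front;     // named local keeps the slice valid
        return g[].array.toUTF8;   // owned copy of the grapheme's code points
    }

    // Returns a ustring, which is what isForwardRange checks for.
    @property ustring save()
    {
        ustring copy;
        copy.source = source.save; // explicit .save rather than a plain copy
        return copy;
    }
}

static assert(isForwardRange!ustring);

unittest
{
    auto u = ustring("noe\u0308l");
    auto saved = u.save;
    u.popFront();
    assert(u.front == "o");
    assert(saved.front == "n"); // the saved copy did not advance
    assert(saved.equal(["n", "o", "e\u0308", "l"]));
}
```

Because the state now lives in the `ustring` value itself, `empty`, `front` and `popFront` all operate on the same range, and `save` has the return type that `isForwardRange` expects.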


[1] https://dlang.org/phobos/std_uni.html#.Grapheme.opSlice
July 09, 2019
On Monday, 8 July 2019 at 23:01:49 UTC, ag0aep6g wrote:
> On 08.07.19 23:55, aliak wrote:
>> [...]
>
> `source.front` is a temporary `Grapheme` and you're calling `opSlice` on it. The documentation for `Grapheme.opSlice` warns: "Invalidates when this Grapheme leaves the scope, attempts to use it then would lead to memory corruption." [1]

Ah. Right. Thanks!

>
> [...]

Hah, yes, I realized this as well.

>
> [...]

No, you're right. It was indeed just making things more complicated and was a bad idea.

>
> [...]

Cheers,
- Ali