December 31, 2019
On Tue, Dec 31, 2019 at 09:33:14AM -0500, Steven Schveighoffer via Digitalmars-d-learn wrote:
> On 12/30/19 6:31 PM, H. S. Teoh wrote:
> > On Mon, Dec 30, 2019 at 03:09:58PM -0800, H. S. Teoh via Digitalmars-d-learn wrote:
[...]
> > Haha, it's actually right there in the Grapheme docs for the opSlice overloads:
> > 
> >          Random-access range over Grapheme's $(CHARACTERS).
> > 
> >          Warning: Invalidates when this Grapheme leaves the scope,
> >          attempts to use it then would lead to memory corruption.
> > 
> > Looks like when you use .map over the Grapheme, it gets copied into a temporary, which gets invalidated when map.front returns. Somewhere we're missing a 'scope' qualifier...
[...]
> Then the original example should be fixable by putting "ref" in for all the lambdas.
> 
> But this is kind of disturbing. Why does the grapheme do this? The original data is not scoped.

Honestly I have no idea, but glancing at the code in std.uni reveals that the returned slice is actually a wrapper object that contains a pointer to the parent Grapheme object.  So if the parent was a temporary and goes out of scope before the wrapper does, we're left with a dangling pointer.
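
To make the mechanism concrete, here is a minimal sketch of a slice wrapper that stores a pointer to its parent rather than a copy of the data. The names `Small` and `View` are hypothetical stand-ins, not the actual std.uni types, and the sketch only shows that the view reads through the parent. It deliberately keeps the parent alive so that nothing dangles.

```d
import std.stdio : writeln;

// Hypothetical stand-in for Grapheme: a small value type with inline storage.
struct Small
{
    int[4] data;
    size_t length;

    // Analogue of Grapheme.opSlice: the view stores &this, not a copy.
    View opSlice() { return View(&this, 0, length); }
}

// Hypothetical stand-in for the wrapper range returned by opSlice.
struct View
{
    Small* parent;   // only valid while the parent object is alive
    size_t lo, hi;

    bool empty() const { return lo >= hi; }
    int front() { return parent.data[lo]; }
    void popFront() { ++lo; }
}

void main()
{
    auto s = Small([1, 2, 3, 0], 3);
    auto v = s[];        // v points at s
    s.data[0] = 42;      // mutate the parent after slicing
    writeln(v.front);    // prints 42: the view reads through the pointer
}
```

If `s` were a temporary that had already been destroyed, `v.parent` would point at dead storage, which is the failure mode described above.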


> e.g.:
> 
> writeln(" Text = ", gr1.map!((ref g) => g[]).joiner.to!string);
[...]

Unfortunately this doesn't work. Somehow the ref parameter doesn't match whatever it is std.algorithm.map is trying to pass to it.


T

-- 
They say that "guns don't kill people, people kill people." Well I think the gun helps. If you just stood there and yelled BANG, I don't think you'd kill too many people. -- Eddie Izzard, Dressed to Kill
December 31, 2019
On 12/31/19 2:58 PM, H. S. Teoh wrote:
> On Tue, Dec 31, 2019 at 09:33:14AM -0500, Steven Schveighoffer via Digitalmars-d-learn wrote:
>> e.g.:
>>
>> writeln(" Text = ", gr1.map!((ref g) => g[]).joiner.to!string);
> [...]
> 
> Unfortunately this doesn't work. Somehow the ref parameter doesn't match
> whatever it is std.algorithm.map is trying to pass to it.

Huh, it seemed to work for me. Got the full "Robert" with an R. map does support ref-ness. Maybe you didn't put ref in the right place?

-Steve
December 31, 2019
On Tue, Dec 31, 2019 at 04:02:47PM -0500, Steven Schveighoffer via Digitalmars-d-learn wrote:
> On 12/31/19 2:58 PM, H. S. Teoh wrote:
> > On Tue, Dec 31, 2019 at 09:33:14AM -0500, Steven Schveighoffer via Digitalmars-d-learn wrote:
> > > e.g.:
> > > 
> > > writeln(" Text = ", gr1.map!((ref g) => g[]).joiner.to!string);
> > [...]
> > 
> > Unfortunately this doesn't work. Somehow the ref parameter doesn't match whatever it is std.algorithm.map is trying to pass to it.
> 
> Huh, it seemed to work for me. Got the full "Robert" with an R. map does support ref-ness. Maybe you didn't put ref in the right place?

Here's my full non-working code:

	import std;
	void main() {
		auto x = "Bla\u0301hbla\u0310h\u0309!";
		auto r = x.byGrapheme;
		writefln("%s", r.map!((ref g) => g[]).joiner.to!string);
	}

The compiler says:

	/usr/src/d/phobos/std/algorithm/iteration.d(604): Error: template D main.__lambda1 cannot deduce function from argument types !()(Grapheme), candidates are:
	test.d(5):        __lambda1
	/usr/src/d/phobos/std/algorithm/iteration.d(499): Error: template instance test.main.MapResult!(__lambda1, Result!string) error instantiating
	test.d(5):        instantiated from here: map!(Result!string)

What did I do wrong?


T

-- 
What do you get if you drop a piano down a mineshaft? A flat minor.
December 31, 2019
On 12/31/19 4:22 PM, H. S. Teoh wrote:
> On Tue, Dec 31, 2019 at 04:02:47PM -0500, Steven Schveighoffer via Digitalmars-d-learn wrote:
>> On 12/31/19 2:58 PM, H. S. Teoh wrote:
>>> On Tue, Dec 31, 2019 at 09:33:14AM -0500, Steven Schveighoffer via Digitalmars-d-learn wrote:
>>>> e.g.:
>>>>
>>>> writeln(" Text = ", gr1.map!((ref g) => g[]).joiner.to!string);
>>> [...]
>>>
>>> Unfortunately this doesn't work. Somehow the ref parameter doesn't
>>> match whatever it is std.algorithm.map is trying to pass to it.
>>
>> Huh, it seemed to work for me. Got the full "Robert" with an R. map
>> does support ref-ness. Maybe you didn't put ref in the right place?
> 
> Here's my full non-working code:
> 
> 	import std;
> 	void main() {
> 		auto x = "Bla\u0301hbla\u0310h\u0309!";
> 		auto r = x.byGrapheme;
> 		writefln("%s", r.map!((ref g) => g[]).joiner.to!string);
> 	}
> 
> The compiler says:
> 
> 	/usr/src/d/phobos/std/algorithm/iteration.d(604): Error: template D main.__lambda1 cannot deduce function from argument types !()(Grapheme), candidates are:
> 	test.d(5):        __lambda1
> 	/usr/src/d/phobos/std/algorithm/iteration.d(499): Error: template instance test.main.MapResult!(__lambda1, Result!string) error instantiating
> 	test.d(5):        instantiated from here: map!(Result!string)
> 
> What did I do wrong?

auto r = x.byGrapheme.array;

This is how Robert originally had it if you look a few messages up.

Otherwise, it's not an lvalue.

The fact that a Grapheme's return requires you keep the grapheme in scope for operations seems completely incorrect and dangerous IMO (note that operators are going to always have a ref this, even when called on an rvalue). So even though using ref works, I think the underlying issue here really is the lifetime problem.
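
Putting the two pieces together (the `.array` call and the `ref` lambda), a minimal sketch of the working version might look like this. The helper name `graphemesToString` is made up for illustration. The point is that the Graphemes live in an array, so `map` hands the lambda an lvalue and the slices refer to storage that outlives the pipeline.

```d
import std.algorithm : joiner, map;
import std.array : array;
import std.conv : to;
import std.stdio : writeln;
import std.uni : byGrapheme;

// Hypothetical helper: split text into graphemes and join them back.
string graphemesToString(string text)
{
    // .array stores the Graphemes, so each element is an lvalue.
    auto graphemes = text.byGrapheme.array;

    // (ref g) makes g[] slice the array element itself, not a temporary copy.
    return graphemes.map!((ref g) => g[]).joiner.to!string;
}

void main()
{
    writeln(graphemesToString("Bla\u0301hbla\u0310h\u0309!"));
}
```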

-Steve
December 31, 2019
On Tue, Dec 31, 2019 at 04:36:56PM -0500, Steven Schveighoffer via Digitalmars-d-learn wrote:
> On 12/31/19 4:22 PM, H. S. Teoh wrote:
[...]
> > 	import std;
> > 	void main() {
> > 		auto x = "Bla\u0301hbla\u0310h\u0309!";
> > 		auto r = x.byGrapheme;
> > 		writefln("%s", r.map!((ref g) => g[]).joiner.to!string);
> > 	}
[...]
> > What did I do wrong?
> 
> auto r = x.byGrapheme.array;

Haha, in my hurry I totally forgot about the .array. Mea culpa.


[...]
> The fact that a Grapheme's return requires you keep the grapheme in scope for operations seems completely incorrect and dangerous IMO (note that operators are going to always have a ref this, even when called on an rvalue). So even though using ref works, I think the underlying issue here really is the lifetime problem.
[...]

After my wrong recollection of the history surrounding indexOf vs. countUntil, I'm not sure I can rely on my memory anymore, :-P but AIUI Dmitri implemented it this way because he wanted to avoid allocations (GC or otherwise) in the most common case of Grapheme containing just a small number of code points (usually 1 or 2). When the number of combining diacritics exceeds the size of the Grapheme struct, then it would quietly switch to malloc or some such for holding the data.  My guess is that this is the reason for passing &this to the wrapper range returned by opSlice(). And possibly it's also to allow mutation of the Grapheme via the returned slice?

Perhaps this whole approach should be looked at again. Certainly, unless I'm missing something, it *ought* to be possible to implement Grapheme in a way that doesn't require this scoped reference business.


T

-- 
The diminished 7th chord is the most flexible and fear-instilling chord. Use it often, use it unsparingly, to subdue your listeners into submission!
January 04, 2020
On 2019-12-31 21:36:56 +0000, Steven Schveighoffer said:

> The fact that a Grapheme's return requires you keep the grapheme in scope for operations seems completely incorrect and dangerous IMO (note that operators are going to always have a ref this, even when called on an rvalue). So even though using ref works, I think the underlying issue here really is the lifetime problem.

Thanks for all the answers, pretty enlightening, even if I'm not sure I get everything 100%.

So, what to do for now? File a bug-report? What needs to be fixed?

I'm using the ref approach for now, in the hope that it will be OK for my use case, which is converting a wstring to a Grapheme[], altering the array, and mapping it back to a wstring. Sounds like a lot of processing for handling Unicode text, but I don't think it gets a lot simpler than that.
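
As a rough sketch of that round trip, assuming the ref approach discussed in this thread: the function name `dropLastGrapheme` and the "alteration" it performs are invented for illustration only.

```d
import std.algorithm : joiner, map;
import std.array : array;
import std.conv : to;
import std.stdio : writeln;
import std.uni : byGrapheme, Grapheme;

// Hypothetical example: wstring -> Grapheme[] -> alter -> wstring.
wstring dropLastGrapheme(wstring text)
{
    // Store the graphemes so they stay alive while we slice them.
    Grapheme[] graphemes = text.byGrapheme.array;

    // The "alter" step: remove the last user-perceived character.
    if (graphemes.length > 0)
        graphemes = graphemes[0 .. $ - 1];

    // Map back to code points by reference and re-encode as UTF-16.
    return graphemes.map!((ref g) => g[]).joiner.to!wstring;
}

void main()
{
    // The base letter and its combining accent are removed together.
    assert(dropLastGrapheme("ba\u0301"w) == "b"w);
    writeln(dropLastGrapheme("Bla\u0301h"w));
}
```

Working on Grapheme[] rather than on code units is what keeps a combining mark attached to its base letter during the alteration.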

-- 
Robert M. Münch
http://www.saphirion.com
smarter | better | faster

January 04, 2020
On Sat, Jan 04, 2020 at 08:19:14PM +0100, Robert M. Münch via Digitalmars-d-learn wrote:
> On 2019-12-31 21:36:56 +0000, Steven Schveighoffer said:
> 
> > The fact that a Grapheme's return requires you keep the grapheme in scope for operations seems completely incorrect and dangerous IMO (note that operators are going to always have a ref this, even when called on an rvalue). So even though using ref works, I think the underlying issue here really is the lifetime problem.
> 
> Thanks for all the answers, pretty enlightening, even if I'm not sure I get everything 100%.
> 
> So, what to do for now? File a bug-report? What needs to be fixed?

At a minimum, I think we should file a bug report to investigate whether Grapheme.opSlice can be implemented differently, such that we avoid this obscure referential behaviour that makes it hard to work with in complex code. I'm not sure if this is possible, but IMO we should at least investigate the possibilities.


> I'm using the ref approach for now, in the hope that it will be OK for my use case, which is converting a wstring to a Grapheme[], altering the array, and mapping it back to a wstring. Sounds like a lot of processing for handling Unicode text, but I don't think it gets a lot simpler than that.
[...]

Unicode is a beast. Be glad that you can even do this in the first place.  If I were writing this in C, I wouldn't even know where to begin!


T

-- 
No! I'm not in denial!
January 06, 2020
On 2020-01-05 04:18:34 +0000, H. S. Teoh said:

> At a minimum, I think we should file a bug report to investigate whether
> Grapheme.opSlice can be implemented differently, such that we avoid this
> obscure referential behaviour that makes it hard to work with in complex
> code. I'm not sure if this is possible, but IMO we should at least
> investigate the possibilities.

Done... my first bug report :-) I copied together all the findings from this thread.

-- 
Robert M. Münch
http://www.saphirion.com
smarter | better | faster

January 06, 2020
On Monday, 6 January 2020 at 08:39:19 UTC, Robert M. Münch wrote:
> On 2020-01-05 04:18:34 +0000, H. S. Teoh said:
>
>> At a minimum, I think we should file a bug report to investigate whether
>> Grapheme.opSlice can be implemented differently, such that we avoid this
>> obscure referential behaviour that makes it hard to work with in complex
>> code. I'm not sure if this is possible, but IMO we should at least
>> investigate the possibilities.
>
> Done... my first bug report :-) I copied together all the findings from this thread.

For the sake of completeness
https://issues.dlang.org/show_bug.cgi?id=20483