On Thu, Aug 14, 2014 at 3:59 AM, Ali Çehreli <digitalmars-d-learn@puremagic.com> wrote:

On 08/13/2014 10:57 PM, Carl Sturtivant wrote:
>
> This is the idea I mean.
> http://citeseer.ist.psu.edu/viewdoc/download?doi=10.1.1.14.9450&rep=rep1&type=pdf
>
> Here's a C++ implementation supported I think by gcc.
> http://www.sgi.com/tech/stl/Rope.html
>
> Is there a D implementation of a rope out there somewhere?

I am not aware of any but std.range.chain can be useful:

chain is very limited compared to what ropes provide. Eg: efficient inserting of a string inside a rope, space efficient maintaining of entire edit history, rebalancing, etc.

Ropes are very well suited for various applications (text editors, file system indexing, large scale text processing etc).

std.rope would be a very welcome addition for real world applications.

import std.range;

void main()
{
auto r = chain(chain("The qui", "ck brown"), " fox");
assert(r.equal("The quick brown fox"));

}

> How is unicode handled? Just by using dchar?

dchar is not necessary when using the rope as a sequence but it looks like each section must be a valid UTF-8 sequence. The following example where a Unicode character (ü) is split between two sections fails:

auto r = chain(chain("The q\xc3", "\xbcick brown"), " fox");
// std.utf.UTFException@./phobos/std/utf.d(1109): Attempted
// to decode past the end of a string (at index 1)
assert(r.equal("The qüick brown fox"));

Of course, it works when the character is complete:

auto r = chain(chain("The qü", "ick brown"), " fox");
assert(r.equal("The qüick brown fox"));

But yes, for O(1) character access dchar is needed. (As you know, it can't be exactly O(1) when there are a large amount of sections and one needs to be memory-efficient for character index lookup.)

Ali