Thread overview
Ropes (concatenation trees) for strings in D ?
Aug 14, 2014
Carl Sturtivant
Aug 14, 2014
Messenger
Aug 14, 2014
Ali Çehreli
Aug 16, 2014
Timothee Cour
Aug 16, 2014
ketmar
Aug 16, 2014
MrSmith
Aug 16, 2014
ketmar
Aug 17, 2014
ketmar
Aug 15, 2014
Kagamin
Aug 18, 2014
Justin Whear
August 14, 2014
This is the idea I mean.
http://citeseer.ist.psu.edu/viewdoc/download?doi=10.1.1.14.9450&rep=rep1&type=pdf
Here's a C++ implementation supported I think by gcc.
http://www.sgi.com/tech/stl/Rope.html

Is there a D implementation of a rope out there somewhere?

How is unicode handled? Just by using dchar?

August 14, 2014
On Thursday, 14 August 2014 at 05:57:18 UTC, Carl Sturtivant wrote:
>
> This is the idea I mean.
> http://citeseer.ist.psu.edu/viewdoc/download?doi=10.1.1.14.9450&rep=rep1&type=pdf
> Here's a C++ implementation supported I think by gcc.
> http://www.sgi.com/tech/stl/Rope.html
>
> Is there a D implementation of a rope out there somewhere?
>
> How is unicode handled? Just by using dchar?

Hmm. Appender?
> http://dlang.org/phobos/std_array.html#.Appender

    import std.array;
    import std.range;

    enum arrayLength = 1_000_000;

    void main() {
        auto writer = appender!string;
        auto xes = new char[](arrayLength);  // dynamic on heap

        xes[] = 'x';
        assert(xes[0] == 'x');
        assert(xes[100] == 'x');
        assert(xes[1000] == 'x');

        writer.reserve((2 * arrayLength) + 3);  // optional
        writer.put(xes);  // or writer ~= xes;
        writer.put("abc");
        writer.put(xes);

        // normal slicing op
        assert(writer.data[arrayLength..(arrayLength+3)] == "abc");

        // .array allocates a new array but not sure how else to do it
        assert(writer.data.retro.array[arrayLength..(arrayLength+3)] == "cba");
    }
August 14, 2014
On 08/13/2014 10:57 PM, Carl Sturtivant wrote:
>
> This is the idea I mean.
> http://citeseer.ist.psu.edu/viewdoc/download?doi=10.1.1.14.9450&rep=rep1&type=pdf 

>
> Here's a C++ implementation supported I think by gcc.
> http://www.sgi.com/tech/stl/Rope.html
>
> Is there a D implementation of a rope out there somewhere?

I am not aware of any but std.range.chain can be useful:

import std.range;

void main()
{
    auto r = chain(chain("The qui", "ck brown"), " fox");
    assert(r.equal("The quick brown fox"));
}

> How is unicode handled? Just by using dchar?

dchar is not necessary when using the rope as a sequence but it looks like each section must be a valid UTF-8 sequence. The following example where a Unicode character (ü) is split between two sections fails:

    auto r = chain(chain("The q\xc3", "\xbcick brown"), " fox");
    // std.utf.UTFException@./phobos/std/utf.d(1109): Attempted
    // to decode past the end of a string (at index 1)
    assert(r.equal("The qüick brown fox"));

Of course, it works when the character is complete:

    auto r = chain(chain("The qü", "ick brown"), " fox");
    assert(r.equal("The qüick brown fox"));

But yes, for O(1) character access dchar is needed. (As you know, it can't be exactly O(1) when there are a large amount of sections and one needs to be memory-efficient for character index lookup.)

Ali

August 15, 2014
Search gave result "enough rope to hang yourself".
August 16, 2014
On Thu, Aug 14, 2014 at 3:59 AM, Ali Çehreli < digitalmars-d-learn@puremagic.com> wrote:

> On 08/13/2014 10:57 PM, Carl Sturtivant wrote:
> >
> > This is the idea I mean. http://citeseer.ist.psu.edu/viewdoc/download?doi=10.1.1.
> 14.9450&rep=rep1&type=pdf
> >
> > Here's a C++ implementation supported I think by gcc. http://www.sgi.com/tech/stl/Rope.html
> >
> > Is there a D implementation of a rope out there somewhere?
>
> I am not aware of any but std.range.chain can be useful:
>


chain is very limited compared to what ropes provide. Eg: efficient inserting of a string inside a rope, space efficient maintaining of entire edit history, rebalancing, etc.

Ropes are very well suited for various applications (text editors, file system indexing, large scale text processing etc).

std.rope would be a very welcome addition for real world applications.



>
> import std.range;
>
> void main()
> {
>     auto r = chain(chain("The qui", "ck brown"), " fox");
>     assert(r.equal("The quick brown fox"));
>
> }
>
> > How is unicode handled? Just by using dchar?
>
> dchar is not necessary when using the rope as a sequence but it looks like each section must be a valid UTF-8 sequence. The following example where a Unicode character (ü) is split between two sections fails:
>
>     auto r = chain(chain("The q\xc3", "\xbcick brown"), " fox");
>     // std.utf.UTFException@./phobos/std/utf.d(1109): Attempted
>     // to decode past the end of a string (at index 1)
>     assert(r.equal("The qüick brown fox"));
>
> Of course, it works when the character is complete:
>
>     auto r = chain(chain("The qü", "ick brown"), " fox");
>     assert(r.equal("The qüick brown fox"));
>
> But yes, for O(1) character access dchar is needed. (As you know, it can't
> be exactly O(1) when there are a large amount of sections and one needs to
> be memory-efficient for character index lookup.)
>
> Ali
>
>


August 16, 2014
On Fri, 15 Aug 2014 19:04:10 -0700
Timothee Cour via Digitalmars-d-learn
<digitalmars-d-learn@puremagic.com> wrote:

sounds like my C library based on this article: http://e98cuenc.free.fr/wordprocessor/piecetable.html

i'm slowly converting my C code to D (nothing fancy yet, still C-style).

it's not a 'real' rope -- it's designed for text editing tasks, and it allocates like crazy now, but it's pretty usable to writing rich text editors and renderers in various GUI toolkits.

don't know when i'll finish first working version though, it's all little tedious and i have alot other things to do. but i'll announce it when it will be done. ;-)


August 16, 2014
On Saturday, 16 August 2014 at 02:26:29 UTC, ketmar via Digitalmars-d-learn wrote:
> On Fri, 15 Aug 2014 19:04:10 -0700
> Timothee Cour via Digitalmars-d-learn
> <digitalmars-d-learn@puremagic.com> wrote:
>
> sounds like my C library based on this article:
> http://e98cuenc.free.fr/wordprocessor/piecetable.html
>
> i'm slowly converting my C code to D (nothing fancy yet, still C-style).
>
> it's not a 'real' rope -- it's designed for text editing tasks, and it
> allocates like crazy now, but it's pretty usable to writing rich text
> editors and renderers in various GUI toolkits.
>
> don't know when i'll finish first working version though, it's all
> little tedious and i have alot other things to do. but i'll announce it
> when it will be done. ;-)

I've done some progress on making piece table some time ago, but have no time atm.
Have a look https://github.com/MrSmith33/textedit-d
It supports inserting, removing, undo, redo and slicing. Provides forward range interface. Can you also share your progress?
August 16, 2014
On Sat, 16 Aug 2014 10:25:01 +0000
MrSmith via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com>
wrote:

> Can you also share your progress?
sure. the project has no public repo yet, but you can take a look at it's current state here: http://ketmar.no-ip.org/etx.txz

no undo/redo support for now, and it accepts only dchars, but you can find working (i hope, at least unittests works ;-) rb-tree and piecetable there. no slicing yet, and range support is nearly non-existing, but piecetable supports extensible text styles! ;-)

i'm not really happy with design though. it mindlessly allocates alot of small classes (it should use pools instead, i believe) and high-level text rendering interface is not ported yet.

i'm planning to announce it in d.announces when it will be ready to 'go public'. not sure yet if i'll leave it 'as is' or will try to rewrite in in something like 'D style' before announcing.

ah, and it needs at least minimal documentation too. ;-)


August 17, 2014
On Sat, 16 Aug 2014 10:25:01 +0000
MrSmith via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com>
wrote:

> Can you also share your progress?
here is public repo: http://repo.or.cz/w/etx.d.git
remember, this is not even alpha-quality code (but it seems to be
stable enough to build something upon it). API sux, no docs and so on.
enjoy.

GPLv3.


August 18, 2014
On Thu, 14 Aug 2014 05:57:17 +0000, Carl Sturtivant wrote:

> This is the idea I mean. http://citeseer.ist.psu.edu/viewdoc/download?
doi=10.1.1.14.9450&rep=rep1&type=pdf
> Here's a C++ implementation supported I think by gcc. http://www.sgi.com/tech/stl/Rope.html
> 
> Is there a D implementation of a rope out there somewhere?
> 
> How is unicode handled? Just by using dchar?

I was working on a D Rope implementation years ago (now lost to time), when I came across this article and ditched it: http://scienceblogs.com/goodmath/2009/02/18/gap-buffers-or-why-bother- with-1/

The short version is that while Ropes are interesting theoretically, they have a bad implementation complexity compared to much simpler structures which give more than adequate results for most tasks.