Class-based string library

June 30, 2004
Posted by Sam McCall
Okay, I got sick of dealing with UTF-8 :-)
I've got a proof-of-concept of a class-based String with a bunch of operations. All data manipulation is character-based: you manipulate Unicode codepoints (characters) and don't worry about encodings.
It's fairly slow at the moment. The proof-of-concept version stores strings internally as dchar arrays, and it hasn't been optimised. Barring any killer bugs, it should be usable anywhere that string performance isn't a bottleneck.
It's probably only useful to you if you're not entirely happy with D's strings and/or arrays.
I plan to do some optimisations and write a backend that uses UTF-8 (internally only), which should be faster (I hope).
Interaction with libraries should be easy: converting char[]/wchar[]/dchar[] to String is just String(data) (or String.valueOf(data)), and converting a String back to array form is s.toUTF8(), s.toUTF16() or s.toUTF32().
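
To show why a dchar array makes character-based access simple, here is a minimal sketch of the underlying conversions. It assumes a current D compiler and uses only the standard std.utf functions toUTF8, toUTF16 and toUTF32. It is not the String class from string.d, and the variable names are just for illustration.

```d
import std.utf : toUTF8, toUTF16, toUTF32;

void main()
{
    // "héllo": U+00E9 takes two code units in UTF-8, one in UTF-16 and UTF-32.
    string utf8 = "h\u00E9llo";
    wstring utf16 = toUTF16(utf8);
    dstring utf32 = toUTF32(utf8);

    // Array length counts code units, not characters.
    assert(utf8.length == 6);
    assert(utf16.length == 5);
    assert(utf32.length == 5);

    // Indexing the UTF-8 array lands in the middle of a character...
    assert(utf8[1] == 0xC3);
    // ...while indexing the dchar array yields the whole codepoint.
    assert(utf32[1] == '\u00E9');

    // Converting back to UTF-8 round-trips losslessly.
    assert(toUTF8(utf32) == utf8);
}
```

With one dchar per codepoint, indexing and slicing never split a character. The cost is four bytes per character, which is part of what a UTF-8 backend would aim to reduce.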
I'll write up a pretty-looking example sometime that isn't 3am ;-)

A simple reference is here:
http://tunah.net/~tunah/d-string/doc.txt
And the code is here:
http://tunah.net/~tunah/d-string/string.d

If you try it, let me know what you think or any suggestions you have.
Sam