| October 11, 2006 - New string implementation: dstring 1.0 | ||||
|---|---|---|---|---|
| 
 | ||||
| Check out the FAQ at http://www.dprogramming.com/dstring.php and give it a spin. Documentation is online at http://www.dprogramming.com/docs/dstring/dstring.html Let me know what you think! | ||||
| October 11, 2006 - Re: New string implementation: dstring 1.0 | ||||
|---|---|---|---|---|
| 
 | ||||
| Posted in reply to Chris Miller | Chris Miller wrote:
> Check out the FAQ at http://www.dprogramming.com/dstring.php and give it a  spin.
> Documentation is online at  http://www.dprogramming.com/docs/dstring/dstring.html
> 
> Let me know what you think!
Looks cool!
I'm too strapped for time to try it out right now, but I will when I get the chance.  I do have a couple of questions and a comment based on the documentation:
- Is toString() the same as toUTF8()?  If so, I'd like to see something in the documentation to say they are the same.
- Are there plans to extend this to act as a string-manipulation library as well as a string type, adding stuff from Phobos like toUpper(), toLower(), capitalize(), split(), etc?  That would be cool.
- Seems like it would be handy to have functions for converting to C style null terminated strings.  Something like char* toUTF8C() and wchar* toUTF16C().
This looks like a very useful string type.  I just hope that if people like it, it becomes part of the standard library or some such so that newbs aren't caught by indexing/slicing in the current string-as-array approach.  Anyhow, thanks for doing that.
 | |||
| October 12, 2006 - Re: New string implementation: dstring 1.0 | ||||
|---|---|---|---|---|
| 
 | ||||
| Posted in reply to Chad J | On Wed, 11 Oct 2006 17:27:52 -0400, Chad J wrote:
> Chris Miller wrote:
>> Check out the FAQ at http://www.dprogramming.com/dstring.php and give it a spin.
>> Documentation is online at http://www.dprogramming.com/docs/dstring/dstring.html
>> Let me know what you think!
>
> Looks cool!
>
> I'm too strapped for time to try it out right now, but I will when I get the chance. I do have a couple of questions and a comment based on the documentation:
>
> - Is toString() the same as toUTF8()? If so, I'd like to see something in the documentation to say they are the same.
Yes, they are the same; both are documented as returning UTF-8, so I didn't think it was necessary. toString is there mainly for consistency with other D types and so that it works directly with writef.
>
> - Are there plans to extend this to act as a string-manipulation library as well as a string type, adding stuff from Phobos like toUpper(), toLower(), capitalize(), split(), etc? That would be cool.
If there is enough interest, yes.
>
> - Seems like it would be handy to have functions for converting to C style null terminated strings. Something like char* toUTF8C() and wchar* toUTF16C().
I wasn't sure about this because it's not guaranteed that C uses Unicode; but I guess toUTF8z(), toUTF16z() and toUTF32z() would be fine, since those names only mean zero-terminated, not necessarily compatible with C.
>
> This looks like a very useful string type. I just hope that if people like it, it becomes part of the standard library or some such so that newbs aren't caught by indexing/slicing in the current string-as-array approach. Anyhow, thanks for doing that.
Thanks
 | |||
| October 13, 2006 - Re: New string implementation: dstring 1.0 | ||||
|---|---|---|---|---|
| 
 | ||||
| Posted in reply to Chris Miller | On Wed, 11 Oct 2006 20:40:18 +0300, Chris Miller <chris@dprogramming.com> wrote:
> Check out the FAQ at http://www.dprogramming.com/dstring.php and give it a spin.
> Documentation is online at http://www.dprogramming.com/docs/dstring/dstring.html
>
> Let me know what you think!
Good work indeed. Now slicing can be done... Thanks. :)
I am wondering if D itself should support something like this. Even if 'char[]' were merely aliased to 'string', it would make code clearer for everybody (IMHO): it would say that this is a string, not an array of characters. (That alone would make the alias meaningful, not to mention that you cannot slice char[] safely.)
I think one should be aware of dstring's worst-case memory consumption. For example, suppose a huge file is read into a dstring. A single character could force the whole string to use dchars instead of chars. The string would then take four times as much space as char[] would. Of course, with char[] one couldn't slice it (in O(1) time). :) (And it would probably be necessary to write special routines for special cases anyway.)
 | |||
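The worst case described in the post above can be put in numbers. What follows is a minimal sketch, assuming (this is not taken from the dstring source) that the whole string is stored at the fixed width demanded by its widest code point: 1 byte per code point below U+0080, 2 bytes below U+10000, and 4 bytes otherwise. `fixedWidthBytes` is a hypothetical helper, not a dstring function.

```d
// Hypothetical helper, not part of dstring: bytes needed when every code
// point is stored at the width required by the widest code point present.
size_t fixedWidthBytes(const(char)[] utf8)
{
    size_t codePoints = 0;
    dchar widest = 0;
    foreach (dchar c; utf8) // foreach with a dchar variable decodes the UTF-8
    {
        ++codePoints;
        if (c > widest)
            widest = c;
    }
    // Assumed thresholds: 1 byte (ASCII), 2 bytes (below U+10000), else 4.
    size_t unit = widest < 0x80 ? 1 : widest < 0x10000 ? 2 : 4;
    return codePoints * unit;
}

unittest
{
    assert(fixedWidthBytes("abc") == 3);           // all ASCII: 1 byte each
    assert(fixedWidthBytes("ab\U0001F600") == 12); // one wide code point: 4 bytes each
    assert("ab\U0001F600".length == 6);            // the same text as plain UTF-8
}
```

Under these assumptions a mostly-ASCII text pays the 4x cost only when it contains a code point at or above U+10000; a single accented Latin letter would cost 2x.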
| October 24, 2006 - Re: New string implementation: dstring 1.0 | ||||
|---|---|---|---|---|
| 
 | ||||
| Posted in reply to Chris Miller | Chris Miller wrote:
> Check out the FAQ at http://www.dprogramming.com/dstring.php and give it a spin.
> Documentation is online at http://www.dprogramming.com/docs/dstring/dstring.html
> 
> Let me know what you think!
I love it! This is very much needed and should go into Phobos yesterday!
Solves the problem of:
char[] foo = "hög";
assert(foo.length == 3); // Sorry UTF-8, this is == 4
assert(foo[1] == 'ö');   // Not a chance!
Your implementation of string could be a perfect wrapper that hides from the programmer the fact that UTF-8 characters have variable width.
// Fredrik Olsson
 | |||
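The mismatch shown in the post above can be reproduced with Phobos alone. This is a minimal sketch using `std.utf.count` and `std.utf.toUTF32` from current Phobos; `codePointLength` and `codePointAt` are hypothetical helper names, not part of dstring, and the conversion in `codePointAt` is O(n), which is exactly the cost a fixed-width string type avoids.

```d
import std.utf : count, toUTF32;

// Hypothetical helper: number of code points, not UTF-8 code units.
size_t codePointLength(const(char)[] s)
{
    return count(s);
}

// Hypothetical helper: the i-th code point, found by converting to UTF-32.
dchar codePointAt(const(char)[] s, size_t i)
{
    return toUTF32(s)[i];
}

unittest
{
    string foo = "h\u00F6g";                 // "hög"
    assert(foo.length == 4);                 // .length counts UTF-8 code units
    assert(codePointLength(foo) == 3);       // three characters as a reader sees them
    assert(codePointAt(foo, 1) == '\u00F6'); // 'ö'
}
```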
| October 24, 2006 - Re: New string implementation: dstring 1.0 | ||||
|---|---|---|---|---|
| 
 | ||||
| Posted in reply to Fredrik Olsson | Fredrik Olsson wrote:
> Chris Miller wrote:
>> Check out the FAQ at http://www.dprogramming.com/dstring.php and give it a spin.
>> Documentation is online at http://www.dprogramming.com/docs/dstring/dstring.html
>>
>> Let me know what you think!
> 
> I love it! This is very much needed and should go into Phobos yesterday!
> 
> Solves the problem of:
> char[] foo = "hög";
> assert(foo.length == 3); // Sorry UTF-8, this is == 4
> assert(foo[1] == 'ö');   // Not a chance!
> 
> Your implementation of string could be a perfect wrapper that hides from the programmer the fact that UTF-8 characters have variable width.
> 
> 
> // Fredrik Olsson
I didn't say so when DString was first introduced, but I also have very positive feelings about it.
As a programmer, in common cases I should not have to care about the implementation details of strings. It should not matter whether I work with char[], wchar[] or dchar[].
I agree that it should be put into Phobos immediately! (Maybe some size optimizations could be added, so that adding one dchar to a char[] string does not convert the whole string from char[] to dchar[], but instead converts the dchar to chars.)
Regards
Marcin Kuszczak
Aarti_pl
 | |||
| October 24, 2006 - Re: New string implementation: dstring 1.0 | ||||
|---|---|---|---|---|
| 
 | ||||
| Posted in reply to Aarti_pl | On Tue, 24 Oct 2006 05:12:21 -0400, Aarti_pl <aarti@interia.pl> wrote:
> Fredrik Olsson wrote:
>> Chris Miller wrote:
>>> Check out the FAQ at http://www.dprogramming.com/dstring.php and give it a spin.
>>> Documentation is online at http://www.dprogramming.com/docs/dstring/dstring.html
>>>
>>> Let me know what you think!
>> I love it! This is very much needed and should go into Phobos yesterday!
>> Solves the problem of:
>> char[] foo = "hög";
>> assert(foo.length == 3); // Sorry UTF-8, this is == 4
>> assert(foo[1] == 'ö');   // Not a chance!
>> Your implementation of string could be a perfect wrapper that hides from the programmer the fact that UTF-8 characters have variable width.
>> // Fredrik Olsson
>
> I didn't say so when DString was first introduced, but I also have very positive feelings about it.
>
> As a programmer, in common cases I should not have to care about the implementation details of strings. It should not matter whether I work with char[], wchar[] or dchar[].
Thanks guys.
>
> I agree that it should be put into Phobos immediately! (Maybe some size optimizations could be added, so that adding one dchar to a char[] string does not convert the whole string from char[] to dchar[], but instead converts the dchar to chars.)
But then it won't be ultra fast at finding dchar codepoints.
 | |||
| October 24, 2006 - Re: New string implementation: dstring 1.0 | ||||
|---|---|---|---|---|
| 
 | ||||
| Posted in reply to Chris Miller | Chris Miller wrote:
<snip>
>> I agree that it should be put into Phobos immediately! (Maybe some size optimizations could be added, so that adding one dchar to a char[] string does not convert the whole string from char[] to dchar[], but instead converts the dchar to chars.)
> 
> But then it won't be ultra fast at finding dchar codepoints.
A little thought. Two bits are now used to represent the internal format, but there are only three formats available. Maybe the fourth format code could be "size optimal, but slightly slower"?
I mean, just what else should quad 2 GHz machines do? :)
// Fredrik Olsson
 | |||
| October 24, 2006 - Re: New string implementation: dstring 1.0 | ||||
|---|---|---|---|---|
| 
 | ||||
| Posted in reply to Chris Miller | Chris Miller wrote:
> Check out the FAQ at http://www.dprogramming.com/dstring.php and give it a spin.
> Documentation is online at http://www.dprogramming.com/docs/dstring/dstring.html
> 
> Let me know what you think!
Hi!
The dstring module is very nice, but it lacks one thing that I, at least, have gotten used to: you cannot assign null to it.
string str = null;
or
string getAStringWhichMightBeNull() { return null; }
It's not a big thing but makes using the string a bit clumsy.
One other thing, not necessarily dstring's fault: I tried compiling it with C::B in release mode. I had enabled every compiler option available for the release build, and when I compiled I started getting errors about functions not returning any values. These were functions containing switch statements whose default cases had return statements.
I don't know which flag causes this behavior, but a debug build works just fine.
O.
 | |||
| October 24, 2006 - Re: New string implementation: dstring 1.0 | ||||
|---|---|---|---|---|
| 
 | ||||
| Posted in reply to Fredrik Olsson | On Tue, 24 Oct 2006 06:20:44 -0400, Fredrik Olsson <peylow@gmail.com> wrote:
> Chris Miller wrote:
> <snip>
>>> I agree that it should be put into Phobos immediately! (Maybe some size optimizations could be added, so that adding one dchar to a char[] string does not convert the whole string from char[] to dchar[], but instead converts the dchar to chars.)
>>  But then it won't be ultra fast at finding dchar codepoints.
>
> A little thought. Two bits are now used to represent the internal format, but there are only three formats available. Maybe the fourth format code could be "size optimal, but slightly slower"?
>
> I mean, just what else should quad 2ghz machines do? :)
Neat idea. Currently it uses that 4th state to represent an uninitialized string, but that's not so important. I'll think about this.
 | |||
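The two-bit format tag discussed above, with three encodings plus a fourth state for an uninitialized string, can be sketched as follows. The layout is an assumption for illustration only: the thread does not say where dstring keeps those bits, and `Format` and `PackedLength` are hypothetical names. Here the tag occupies the top two bits of a length word, and the all-zero pattern doubles as "uninitialized".

```d
// Hypothetical format tag: three encodings plus the fourth, spare state.
enum Format : ubyte { none = 0, utf8 = 1, utf16 = 2, utf32 = 3 }

// Hypothetical layout: top two bits hold the tag, the rest hold the length.
struct PackedLength
{
    private size_t bits; // zero by default, so .init reads as Format.none

    private enum shift = size_t.sizeof * 8 - 2;
    private enum size_t lengthMask = (cast(size_t) 1 << shift) - 1;

    this(size_t len, Format fmt)
    {
        assert(len <= lengthMask); // the length must fit below the tag bits
        bits = (cast(size_t) fmt << shift) | len;
    }

    size_t length() const { return bits & lengthMask; }
    Format format() const { return cast(Format)(bits >> shift); }
}

unittest
{
    auto p = PackedLength(5, Format.utf16);
    assert(p.length == 5);
    assert(p.format == Format.utf16);
    assert(PackedLength.init.format == Format.none); // the "uninitialized" state
}
```

Reusing the fourth code for a "size optimal, but slightly slower" mode, as suggested in the thread, would require representing the uninitialized state some other way in a layout like this one.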
Copyright © 1999-2021 by the D Language Foundation