January 09, 2011 eliminate junk from std.string?
There's a lot of junk in std.string that should be gone. I'm trying to motivate myself to port some functions to different string widths and... it's not worth it.
What functions do you think we should remove from std.string? Let's make a string and then send them the way of the dino.
Thanks,
Andrei
January 09, 2011 Re: eliminate junk from std.string?
Posted in reply to Andrei Alexandrescu
I think the tr, replace, and translate functions are a bit awkward.
January 09, 2011 Re: eliminate junk from std.string?
IIRC someone on this NG mentioned that several functions are going away from std.string and into std.algorithm. This would be nice, considering I frequently get name clashes when importing both modules (but at least there's no function hijacking. Thanks, D!).
January 10, 2011 Re: eliminate junk from std.string?
On Sunday 09 January 2011 15:19:31 Jimmy Cao wrote:
> I think the tr, replace, and translate functions are a bit awkward.
Really? I use replace() fairly heavily in string-processing code, and I don't see anything about it which could be considered awkward.
tr() is definitely cool. I do think that it is a bit awkward if you want to deal with the modifiers, but my only real gripe is that I can't replace a character with a string of characters (e.g. replace all ":" with " - ") or a string of characters with a single character (e.g. replace all " - " with ":"). But I don't see a good way to do that. Maybe something with regexes would be better. I don't know. But in many cases the alternative to tr() is multiple calls to replace(), which would cause unnecessary heap allocations.
- Jonathan M Davis
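For illustration only, here is a rough sketch of the two cases mentioned above. The helper names are made up, and the string-valued table passed to translate() is an assumption about present-day Phobos rather than the 2011 API under discussion. Each replace() call builds a new string, which is the allocation cost referred to above.

```d
import std.array : replace;
import std.string : translate;

// One replace() call per substitution; each call allocates a new string.
string colonsToDashes(string s) { return s.replace(":", " - "); }
string dashesToColons(string s) { return s.replace(" - ", ":"); }

// Hypothetical helper: several single characters expanded in one pass,
// assuming the translate() overload that takes a string[dchar] table.
string expandSeparators(string s)
{
    string[dchar] table = [':' : " - ", ',' : " / "];
    return translate(s, table);
}

void main()
{
    assert(colonsToDashes("a:b:c") == "a - b - c");
    assert(dashesToColons("a - b - c") == "a:b:c");
    assert(expandSeparators("a:b,c") == "a - b / c");
}
```

The reverse direction (a multi-character string collapsed to one character) still goes through replace() in this sketch.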
January 10, 2011 Re: eliminate junk from std.string?
Posted in reply to Andrei Alexandrescu
On Sun, 09 Jan 2011 16:51:57 -0600, Andrei Alexandrescu wrote:
> There's a lot of junk in std.string that should be gone. I'm trying to motivate myself to port some functions to different string widths and... it's not worth it.
>
> What functions do you think we should remove from std.string? Let's make a string and then send them the way of the dino.
My suggestions for things to remove:
hexdigits, digits, octdigits, lowercase, letters, uppercase, whitespace
- What are these arrays useful for?
capwords()
- It tries to do too much.
zfill()
- The ljustify(),rjustify(), and center() functions
should instead take an optional padding character
that defaults to a space.
maketrans(), translate()
- I don't even understand what these do.
inPattern(), countchars(), removechars()
- Pattern matching is std.regex's charter.
squeeze(), succ(), tr(), soundex(), column()
- I am having a very hard time imagining myself ever
using these functions...
-Lars
January 10, 2011 Re: eliminate junk from std.string?
Posted in reply to Lars T. Kyllingstad
Lars T. Kyllingstad:
> My suggestions for things to remove:
>
> hexdigits, digits, octdigits, lowercase, letters, uppercase, whitespace
> - What are these arrays useful for?
>
> capwords()
> - It tries to do too much.
>
> zfill()
> - The ljustify(),rjustify(), and center() functions
> should instead take an optional padding character
> that defaults to a space.
>
> maketrans(), translate()
> - I don't even understand what these do.
>
> inPattern(), countchars(), removechars()
> - Pattern matching is std.regex's charter.
>
> squeeze(), succ(), tr(), soundex(), column()
> - I am having a very hard time imagining myself ever
> using these functions...
I agree with about nothing you have said :-)
How much string processing do you do day to day? I use most of those things... If you are used to Python or Ruby, you will probably find most of them useful. If Andrei removes arrays like lowercase, letters, and uppercase, I will have to write them myself. ljustify(), rjustify(), and center() are very useful, even if they could be improved in some ways. maketrans() and translate() (like other things) come from Python's string functions, and I have used them a hundred times in string-processing code. I have used squeeze() a few times. soundex does no harm: even if it's rarely needed, its name is easy to understand and hard to mistake for something else, so it doesn't add much noise to the library. And I've seen that it's easy to implement soundex incorrectly, while the one in std.string is correct.
I agree that too much stuff is generally bad in a library, because finding something takes longer when there are more items to search through. In Bugzilla I have three or four bug reports that ask for a few small changes in std.string (like removing chop and keeping chomp). But please don't remove too much. In a library more is often better.
Bye,
bearophile
January 10, 2011 Re: eliminate junk from std.string?
Posted in reply to bearophile
On Mon, 10 Jan 2011 03:41:51 -0500, bearophile wrote:
> Lars T. Kyllingstad:
>
>> My suggestions for things to remove:
>>
>> hexdigits, digits, octdigits, lowercase, letters, uppercase, whitespace
>> - What are these arrays useful for?
>>
>> capwords()
>> - It tries to do too much.
>>
>> zfill()
>> - The ljustify(),rjustify(), and center() functions
>> should instead take an optional padding character that defaults to a
>> space.
>>
>> maketrans(), translate()
>> - I don't even understand what these do.
>>
>> inPattern(), countchars(), removechars()
>> - Pattern matching is std.regex's charter.
>>
>> squeeze(), succ(), tr(), soundex(), column()
>> - I am having a very hard time imagining myself ever
>> using these functions...
>
> I agree with about nothing you have said :-)
>
> How much string processing do you do day to day? I use most of those things... If you are used to Python or Ruby, you will probably find most of them useful. If Andrei removes arrays like lowercase, letters, and uppercase, I will have to write them myself. ljustify(), rjustify(), and center() are very useful, even if they could be improved in some ways. maketrans() and translate() (like other things) come from Python's string functions, and I have used them a hundred times in string-processing code. I have used squeeze() a few times. soundex does no harm: even if it's rarely needed, its name is easy to understand and hard to mistake for something else, so it doesn't add much noise to the library. And I've seen that it's easy to implement soundex incorrectly, while the one in std.string is correct.
I think you may have misunderstood some of my suggestions. For instance, I never proposed to remove ljustify(), rjustify(), and center(). Rather, I would have them take an extra 'padding' parameter, so we can eliminate zfill().
Before: auto s = zfill("123", 6);
After: auto s = rjustify("123", 6, '0');
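A minimal sketch of that idea, assuming a present-day Phobos in which the justification function is spelled rightJustify() and already accepts an optional fill character. zeroFill() is a hypothetical stand-in for zfill(), not a library function.

```d
import std.string : rightJustify;

// Hypothetical replacement for zfill(): right-justify with '0' as padding.
string zeroFill(string s, size_t width)
{
    return rightJustify(s, width, '0');
}

void main()
{
    assert(zeroFill("123", 6) == "000123");
    // The fill character defaults to a space.
    assert(rightJustify("123", 6) == "   123");
    // Input already wider than the requested width comes back unchanged.
    assert(zeroFill("1234567", 6) == "1234567");
}
```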
As for the other things I suggested, well... those are the things I vote to remove from std.string. If they only get that one vote, they stay. ;)
By the way, since you seem to be using these things quite often, maybe you can answer this:
1. What are the hexdigits, digits, octdigits, lowercase, letters, uppercase, and whitespace arrays useful for? The only thing I can think of is to check whether a character belongs to one of them, but I think that is better done with the std.ctype functions.
2. What do maketrans() and translate() do? (A brief example would be nice.)
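As a rough illustration of question 2: translate() maps each character through a lookup table and can optionally delete some characters along the way. The sketch below assumes the associative-array form of translate() found in present-day Phobos, not the 256-entry table built by maketrans() at the time of this thread. The helper names are made up.

```d
import std.string : translate;

// Hypothetical helper: replace each key in the table with its value.
string leet(string s)
{
    dchar[dchar] table = ['e' : '5', 'o' : '7'];
    return translate(s, table);
}

// Same table, but also delete every 'l' (the optional third argument).
string leetWithoutL(string s)
{
    dchar[dchar] table = ['e' : '5', 'o' : '7'];
    return translate(s, table, "l");
}

void main()
{
    assert(leet("hello world") == "h5ll7 w7rld");
    assert(leetWithoutL("hello world") == "h57 w7rd");
}
```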
-Lars
January 10, 2011 Re: eliminate junk from std.string?
Posted in reply to Lars T. Kyllingstad
"Lars T. Kyllingstad" <public@kyllingen.NOSPAMnet> wrote in message news:igeia5$1t4a$4@digitalmars.com...
> 1. What are the hexdigits, digits, octdigits, lowercase, letters, uppercase, and whitespace arrays useful for? The only thing I can think of is to check whether a character belongs to one of them, but I think that is better done with the std.ctype functions.
They're good for people like me who never noticed std.ctype ;)
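To make the comparison concrete, here is a small sketch. It assumes present-day Phobos, where (as far as I know) std.ctype has been superseded by std.ascii, which carries both the character arrays and the predicate functions. The helper names are made up for the example.

```d
import std.algorithm.searching : canFind;
import std.ascii : digits, hexDigits, isDigit;

// Membership test by searching the array (linear scan).
bool isDigitViaArray(dchar c) { return digits.canFind(c); }

// Membership test via the predicate function.
bool isDigitViaPredicate(dchar c) { return isDigit(c); }

// One thing the arrays do that the predicates cannot: map an index to a
// character. hexDigits holds the uppercase hexadecimal digits.
char toHexChar(uint nibble) { return hexDigits[nibble & 0xF]; }

void main()
{
    assert(isDigitViaArray('7') && isDigitViaPredicate('7'));
    assert(!isDigitViaArray('x') && !isDigitViaPredicate('x'));
    assert(toHexChar(10) == 'A');
}
```

So for plain membership the predicate is the simpler tool, while the arrays remain handy as lookup tables.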
January 11, 2011 Re: eliminate junk from std.string?
Posted in reply to bearophile
On 1/10/11 2:41 AM, bearophile wrote:
> Lars T. Kyllingstad:
>
>> My suggestions for things to remove:
>>
>> hexdigits, digits, octdigits, lowercase, letters, uppercase, whitespace
>> - What are these arrays useful for?
>>
>> capwords()
>> - It tries to do too much.
>>
>> zfill()
>> - The ljustify(),rjustify(), and center() functions
>> should instead take an optional padding character
>> that defaults to a space.
>>
>> maketrans(), translate()
>> - I don't even understand what these do.
>>
>> inPattern(), countchars(), removechars()
>> - Pattern matching is std.regex's charter.
>>
>> squeeze(), succ(), tr(), soundex(), column()
>> - I am having a very hard time imagining myself ever
>> using these functions...
>
> I agree with about nothing you have said :-)
>
> How much string processing do you do day to day? I use most of those things... If you are used to Python or Ruby, you will probably find most of them useful. If Andrei removes arrays like lowercase, letters, and uppercase, I will have to write them myself.
The arrays letters, uppercase, and lowercase aren't all that useful because they only make sense for ASCII. Besides, they should be encoded as functions.
> ljustify(), rjustify(), and center() are very useful, even if they could be improved in some ways.
Hmmm. I suspected everyone's list would be different :o). I personally think the justification and centering functions are rarely useful - how often does one need to justify plain text? If you generate HTML the markup will do that for you, and if you generate some nice text then the font will be proportional, so the functions are useless. Nevertheless, I ported them (and also fixed them - they were broken for anything non-ASCII, which probably is telling of the extent of their usage).
What are your use cases for these three functions?
> maketrans() and translate() (like other things) come from Python's string functions, and I have used them a hundred times in string-processing code. I have used squeeze() a few times. soundex does no harm: even if it's rarely needed, its name is easy to understand and hard to mistake for something else, so it doesn't add much noise to the library. And I've seen that it's easy to implement soundex incorrectly, while the one in std.string is correct.
I think maketrans/translate are okay (if a bit arcane) but they need to be ported to Unicode. Python apparently does mind Unicode as of 3.x, although I'm not sure exactly what the semantics are: http://stackoverflow.com/questions/3031045/how-come-string-maketrans-does-not-work-in-python-3-1. One odd thing is that you'd expect a dynamic language like Python to dynamically detect ASCII vs. non-ASCII. The example shows that Python rejects string-based translation tables even when they are, in fact, ASCII.
> I agree that too much stuff is generally bad in a library, because finding something takes longer when there are more items to search through. In Bugzilla I have three or four bug reports that ask for a few small changes in std.string (like removing chop and keeping chomp). But please don't remove too much. In a library more is often better.
I think we should remove all functions that rely on patterns represented as strings: inPattern, countchars, removechars, squeeze, munch. Representing patterns as a convention on top of otherwise untyped strings doesn't seem a good solution for D. We should either go with regex or with a simple pattern structure and a helper function. That way people can say e.g. munch(s, pattern("[0-9]")).
Andrei
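One possible shape for the "simple pattern structure and a helper function" idea, purely as a sketch. Pattern, CharRange, pattern() and this munch() are all hypothetical and written for the example; none of them is the Phobos API, and the parser only understands a bracketed list of characters and lo-hi ranges.

```d
import std.conv : to;
import std.exception : enforce;
import std.utf : decode;

// Hypothetical typed pattern: a list of inclusive character ranges.
struct CharRange { dchar lo; dchar hi; }

struct Pattern
{
    CharRange[] ranges;

    bool matches(dchar c) const
    {
        foreach (r; ranges)
            if (c >= r.lo && c <= r.hi)
                return true;
        return false;
    }
}

// Hypothetical helper: parses a tiny subset such as "[0-9]" or "[a-fA-F_]".
Pattern pattern(string spec)
{
    enforce(spec.length >= 2 && spec[0] == '[' && spec[$ - 1] == ']',
            "expected a bracketed character class such as \"[0-9]\"");
    auto inner = spec[1 .. $ - 1].to!dstring;
    Pattern p;
    for (size_t i = 0; i < inner.length; ++i)
    {
        if (i + 2 < inner.length && inner[i + 1] == '-')
        {
            p.ranges ~= CharRange(inner[i], inner[i + 2]); // "lo-hi"
            i += 2;
        }
        else
        {
            p.ranges ~= CharRange(inner[i], inner[i]);     // single character
        }
    }
    return p;
}

// Removes the leading run of matching characters from s and returns it.
string munch(ref string s, Pattern p)
{
    size_t end = 0;
    while (end < s.length)
    {
        size_t next = end;
        immutable c = decode(s, next); // advances next past one code point
        if (!p.matches(c))
            break;
        end = next;
    }
    auto head = s[0 .. end];
    s = s[end .. $];
    return head;
}

void main()
{
    string s = "123abc";
    assert(munch(s, pattern("[0-9]")) == "123");
    assert(s == "abc");
}
```

The point of the sketch is only that the pattern becomes a distinct type, so a plain string can no longer be passed where a pattern is expected.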
January 11, 2011 Re: eliminate junk from std.string?
Posted in reply to Andrei Alexandrescu
> What are your use cases for these three functions?
I don't know about bearophile, but I used a lot of the functions you are talking about removing in my HTML -> Plain Text conversion function used for emails and other similar environments. squeeze the whitespace, align text, wrap for the target, etc.
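A rough sketch of that kind of pipeline, assuming present-day Phobos names (center() and wrap() in std.string, split() and join() in std.array). squeezeWhitespace() is a made-up helper standing in for squeeze(), and the sample text and widths are arbitrary.

```d
import std.array : join, split;
import std.stdio : write, writeln;
import std.string : center, wrap;

// Hypothetical helper: collapse every run of whitespace into one space.
string squeezeWhitespace(string s)
{
    return s.split.join(" ");
}

void main()
{
    auto text = squeezeWhitespace("Hello,   plain \t text\n\n world");
    assert(text == "Hello, plain text world");

    writeln(center("Subject", 20)); // centered heading in a 20-column field
    write(wrap(text, 20));          // body wrapped for a 20-column target
}
```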
Copyright © 1999-2021 by the D Language Foundation