eliminate junk from std.string?
January 09, 2011
There's a lot of junk in std.string that should be gone. I'm trying to motivate myself to port some functions to different string widths and... it's not worth it.

What functions do you think we should remove from std.string? Let's make a string and then send them the way of the dino.


Thanks,

Andrei
January 09, 2011
I think the tr, replace, and translate functions are a bit awkward.


January 09, 2011
IIRC someone on this NG mentioned that several functions are going away from std.string and into std.algorithm. This would be nice, considering I frequently get name clashes when importing both modules (but at least there's no function hijacking. Thanks, D!).
January 10, 2011
On Sunday 09 January 2011 15:19:31 Jimmy Cao wrote:
> I think the tr, replace, and translate functions are a bit awkward.

Really? I use replace() fairly heavily in string-processing code, and I don't see anything about it which could be considered awkward.

tr() is definitely cool. I do think that it is a bit awkward if you want to deal with the modifiers, but my only real gripe is that I can't replace a character with a string of characters (e.g. replace all ":" with " - ") or a string of characters with a single character (e.g. replace all " - " with ":"). But I don't see a good way to do that. Maybe something with regexes would be better. I don't know. But the alternative to tr() in many cases is multiple calls to replace(), each of which causes a separate heap allocation.
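For illustration, here is a minimal sketch of the two substitutions mentioned above, done with replace() from std.array. The helper names colonToDash and dashToColon are hypothetical and exist only for this example; each call allocates a new string, which is the cost noted above.

```d
import std.array : replace;
import std.stdio : writeln;

// Hypothetical helpers, named only for this illustration.

// Replace every single ':' with the three-character string " - ".
string colonToDash(string s)
{
    return s.replace(":", " - ");
}

// Replace every " - " with a single ':'.
string dashToColon(string s)
{
    return s.replace(" - ", ":");
}

void main()
{
    writeln(colonToDash("10:30:15"));     // 10 - 30 - 15
    writeln(dashToColon("10 - 30 - 15")); // 10:30:15
}
```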

- Jonathan M Davis
January 10, 2011
On Sun, 09 Jan 2011 16:51:57 -0600, Andrei Alexandrescu wrote:

> There's a lot of junk in std.string that should be gone. I'm trying to motivate myself to port some functions to different string widths and... it's not worth it.
> 
> What functions do you think we should remove from std.string? Let's make a string and then send them the way of the dino.


My suggestions for things to remove:

hexdigits, digits, octdigits, lowercase, letters, uppercase, whitespace
 - What are these arrays useful for?

capwords()
 - It tries to do too much.

zfill()
 - The ljustify(), rjustify(), and center() functions
   should instead take an optional padding character
   that defaults to a space.

maketrans(), translate()
 - I don't even understand what these do.

inPattern(), countchars(), removechars()
 - Pattern matching is std.regex's charter.

squeeze(), succ(), tr(), soundex(), column()
 - I am having a very hard time imagining myself ever
   using these functions...


-Lars
January 10, 2011
Lars T. Kyllingstad:

> My suggestions for things to remove:
> 
> hexdigits, digits, octdigits, lowercase, letters, uppercase, whitespace
>  - What are these arrays useful for?
> 
> capwords()
>  - It tries to do too much.
> 
> zfill()
>  - The ljustify(),rjustify(), and center() functions
>    should instead take an optional padding character
>    that defaults to a space.
> 
> maketrans(), translate()
>  - I don't even understand what these do.
> 
> inPattern(), countchars(), removechars()
>  - Pattern matching is std.regex's charter.
> 
> squeeze(), succ(), tr(), soundex(), column()
>  - I am having a very hard time imagining myself ever
>    using these functions...

I agree with about nothing you have said :-)

How much string processing do you do day to day? I use most of those things... If you are used to Python or Ruby, you will probably find most of them useful. If Andrei removes arrays like lowercase, letters, and uppercase, I will have to write them myself. ljustify(), rjustify(), and center() are very useful, even if they could be improved in some ways. maketrans() and translate() (like other things) come from Python's string functions, and I have used them a hundred times in string processing code. I have used squeeze() a few times. soundex does no harm: even if it's rarely needed, its name is easy to understand and hard to mistake for something else, so it doesn't add much noise to the library. And I've seen that it's easy to implement soundex incorrectly, while the one in std.string is correct.

I agree that too much stuff is generally bad in a library, because searching for something takes more time if there are more items to search through. In Bugzilla I have three or four bug reports that ask for a few small changes in std.string (like removing chop and keeping chomp). But please don't remove too much. In a library more is often better.

Bye,
bearophile
January 10, 2011
On Mon, 10 Jan 2011 03:41:51 -0500, bearophile wrote:

> Lars T. Kyllingstad:
> 
>> My suggestions for things to remove:
>> 
>> hexdigits, digits, octdigits, lowercase, letters, uppercase, whitespace
>>  - What are these arrays useful for?
>> 
>> capwords()
>>  - It tries to do too much.
>> 
>> zfill()
>>  - The ljustify(),rjustify(), and center() functions
>>    should instead take an optional padding character that defaults to a
>>    space.
>> 
>> maketrans(), translate()
>>  - I don't even understand what these do.
>> 
>> inPattern(), countchars(), removechars()
>>  - Pattern matching is std.regex's charter.
>> 
>> squeeze(), succ(), tr(), soundex(), column()
>>  - I am having a very hard time imagining myself ever
>>    using these functions...
> 
> I agree with about nothing you have said :-)
> 
> How much string processing do you do day to day? I use most of those things... If you are used to Python or Ruby, you will probably find most of them useful. If Andrei removes arrays like lowercase, letters, and uppercase, I will have to write them myself. ljustify(), rjustify(), and center() are very useful, even if they could be improved in some ways. maketrans() and translate() (like other things) come from Python's string functions, and I have used them a hundred times in string processing code. I have used squeeze() a few times. soundex does no harm: even if it's rarely needed, its name is easy to understand and hard to mistake for something else, so it doesn't add much noise to the library. And I've seen that it's easy to implement soundex incorrectly, while the one in std.string is correct.

I think you may have misunderstood some of my suggestions.  For instance, I never proposed to remove ljustify(), rjustify(), and center().  Rather, I would have them take an extra 'padding' parameter, so we can eliminate zfill().

  Before:  auto s = zfill("123", 6);
  After:   auto s = rjustify("123", 6, '0');

As for the other things I suggested, well... those are the things I vote to remove from std.string.  If they only get that one vote, they stay. ;)
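As a sketch of the padding-parameter idea: in later Phobos releases the justification functions in std.string are spelled leftJustify, rightJustify, and center, and they take an optional fill character that defaults to a space. The wrapper name zeroFill below is hypothetical and only stands in for the zfill() call shown above.

```d
import std.stdio : writeln;
import std.string : center, leftJustify, rightJustify;

// Hypothetical stand-in for zfill(): right-justify with '0' as the fill.
string zeroFill(string s, size_t width)
{
    return rightJustify(s, width, '0');
}

void main()
{
    writeln(zeroFill("123", 6));           // 000123
    writeln(leftJustify("123", 6, '.'));   // 123...
    writeln(center("ab", 6, '*'));         // **ab**
    writeln("[", rightJustify("123", 6), "]"); // fill defaults to a space
}
```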


By the way, since you seem to be using these things quite often, maybe you can answer this:

1. What are the hexdigits, digits, octdigits, lowercase, letters, uppercase, and whitespace arrays useful for?  The only thing I can think of is to check whether a character belongs to one of them, but I think that is better done with the std.ctype functions.

2. What do maketrans() and translate() do?  (A brief example would be nice.)
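A brief example of the kind asked for in question 2, as a sketch against the later Phobos API rather than the 2011 one: translate() maps each character through a table, here a dchar[dchar] associative array, and can optionally drop a set of characters. The function names leet and leetNoL are hypothetical.

```d
import std.stdio : writeln;
import std.string : translate;

// Hypothetical helper: map 'e' -> '5' and 'o' -> '7', leave the rest alone.
string leet(string s)
{
    dchar[dchar] table = ['e': '5', 'o': '7'];
    return translate(s, table);
}

// Same mapping, but also delete every 'l' (third argument = chars to remove).
string leetNoL(string s)
{
    dchar[dchar] table = ['e': '5', 'o': '7'];
    return translate(s, table, "l");
}

void main()
{
    writeln(leet("hello world"));    // h5ll7 w7rld
    writeln(leetNoL("hello world")); // h57 w7rd
}
```

Because the table is keyed by dchar, the same call also works for non-ASCII characters.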

-Lars
January 10, 2011
"Lars T. Kyllingstad" <public@kyllingen.NOSPAMnet> wrote in message news:igeia5$1t4a$4@digitalmars.com...
>
> 1. What are the hexdigits, digits, octdigits, lowercase, letters, uppercase, and whitespace arrays useful for?  The only thing I can think of is to check whether a character belongs to one of them, but I think that is better done with the std.ctype functions.
>

They're good for people like me who never noticed std.ctype ;)
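To make the comparison concrete, here is a small sketch of both ways to test a character, assuming the later std.ascii module (the successor to std.ctype), which offers both the digits array and the isDigit predicate. The two wrapper function names are hypothetical.

```d
import std.algorithm.searching : canFind;
import std.ascii : digits, isDigit;
import std.stdio : writeln;

// Membership test against the array: a linear search over "0123456789".
bool inDigitsArray(char c)
{
    return digits.canFind(c);
}

// The predicate form suggested above: a simple range check.
bool viaPredicate(char c)
{
    return isDigit(c);
}

void main()
{
    writeln(inDigitsArray('7'), " ", viaPredicate('7')); // true true
    writeln(inDigitsArray('x'), " ", viaPredicate('x')); // false false
}
```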


January 11, 2011
On 1/10/11 2:41 AM, bearophile wrote:
> Lars T. Kyllingstad:
>
>> My suggestions for things to remove:
>>
>> hexdigits, digits, octdigits, lowercase, letters, uppercase, whitespace
>>   - What are these arrays useful for?
>>
>> capwords()
>>   - It tries to do too much.
>>
>> zfill()
>>   - The ljustify(),rjustify(), and center() functions
>>     should instead take an optional padding character
>>     that defaults to a space.
>>
>> maketrans(), translate()
>>   - I don't even understand what these do.
>>
>> inPattern(), countchars(), removechars()
>>   - Pattern matching is std.regex's charter.
>>
>> squeeze(), succ(), tr(), soundex(), column()
>>   - I am having a very hard time imagining myself ever
>>     using these functions...
>
> I agree with about nothing you have said :-)
>
> How much string processing do you do day to day? I use most of
> those things... If you are used to Python or Ruby, you will probably
> find most of them useful. If Andrei removes arrays like
> lowercase, letters, and uppercase, I will have to write them myself.

The arrays letters, uppercase, and lowercase aren't all that useful because they only make sense for ASCII. Besides, they should be encoded as functions.

> ljustify(), rjustify(), and center() are very useful, even if
> they could be improved in some ways.

Hmmm. I suspected everyone's list would be different :o). I personally think the justification and centering functions are rarely useful - how often does one need to justify plain text? If you generate HTML, the markup will do that for you, and if you generate some nice text, the font will be proportional, so the functions are useless.

Nevertheless, I ported them (and also fixed them - they were broken for anything non-ASCII, which probably is telling of the extent of their usage).

What are your use cases for these three functions?

> maketrans() and translate() (like
> other things) come from Python's string functions, and I have used them
> a hundred times in string processing code. I have used squeeze() a few
> times. soundex does no harm: even if it's rarely
> needed, its name is easy to understand and hard to mistake
> for something else, so it doesn't add much noise to the library.
> And I've seen that it's easy to implement soundex incorrectly, while the
> one in std.string is correct.

I think maketrans/translate are okay (if a bit arcane) but they need to be ported to Unicode.

Python apparently does mind Unicode as of 3.x, although I'm not sure exactly what the semantics are: http://stackoverflow.com/questions/3031045/how-come-string-maketrans-does-not-work-in-python-3-1. One odd thing is that you'd expect a dynamic language like Python to dynamically detect ASCII vs. non-ASCII. The example shows that Python rejects string-based translation tables even when they are, in fact, ASCII.

> I agree that too much stuff is generally bad in a library, because
> searching for something takes more time if there are more items to
> search through. In Bugzilla I have three or four bug reports that ask
> for a few small changes in std.string (like removing chop and keeping
> chomp). But please don't remove too much. In a library more is often
> better.

I think we should remove all functions that rely on patterns represented as strings: inPattern, countchars, removechars, squeeze, munch.

Representing patterns as a convention on top of otherwise untyped strings doesn't seem like a good solution for D. We should either go with regex or with a simple pattern structure and a helper function. That way people can say e.g. munch(s, pattern("[0-9]")).
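One way the regex route might look, as a rough sketch only: the function munchClass below is hypothetical (it is not the proposed munch/pattern API) and assumes std.regex's matchFirst. It consumes the leading characters of s that belong to a character class, returns them, and advances s past them.

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;

// Hypothetical helper: charClass is a regex character class such as "[0-9]".
// The pattern "^<class>*" always matches (possibly the empty string),
// so hit and post are always valid here.
string munchClass(ref string s, string charClass)
{
    auto m = matchFirst(s, regex("^" ~ charClass ~ "*"));
    string head = m.hit; // the leading run that matched
    s = m.post;          // everything after the match
    return head;
}

void main()
{
    string s = "123abc";
    auto head = munchClass(s, "[0-9]");
    writeln(head); // 123
    writeln(s);    // abc
}
```

Here the pattern is still passed as a string, but it is compiled into a typed Regex value before use, which is the distinction drawn above.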


Andrei
January 11, 2011
> What are your use cases for these three functions?

I don't know about bearophile, but I used a lot of the functions you are talking about removing in my HTML -> Plain Text conversion function used for emails and other similar environments. squeeze the whitespace, align text, wrap for the target, etc.