On Tue, Mar 13, 2012 at 1:27 PM, Dmitry Olshansky <dmitry.olsh@gmail.com> wrote:
For a couple of releases now we have had a new, revamped std.regex that, as far as I'm concerned, works nicely, thanks to my GSoC commitment last summer. Yet there has been a certain dark trend around std.regex/std.regexp, as both had severe bugs, missing documentation and whatnot, enough to consider them unusable or to dismiss them prematurely.

It's about time to break this gloomy aura, and show that std.regex is actually easy to use, that it does the thing and has some nice extras.

Link: http://blackwhale.github.com/regular-expression.html

Comments are welcome from experts and newbies alike, in fact it should encourage people to try out a few tricks ;)

This is intended as a replacement for the article on dlang.org
about the outdated (and soon to disappear) std.regexp:
http://dlang.org/regular-expression.html

[Spoiler] one example relies on a parser bug being fixed (blush):
https://github.com/D-Programming-Language/phobos/pull/481
Well, it was a specific lookahead inside a lookaround, so that's not a severe bug ;)

P.S. I've been following through a bunch of new bug reports recently, thanks to everyone involved :)


--
Dmitry Olshansky

Second paragraph:
- "...expressions, though one though one should..." has too many "though one"s

Third paragraph:
- "...keeping it's implementation..." should be "its"
- "We'll see how close to built-ins one can get this way." was kind of confusing. I'd consider just doing away with the distinction between built-in and non-built-in regex, since it's an implementation detail most programmers who use it don't even need to know about. Maybe say that it is not built in and explain why that is a neat thing to have (meaning the language itself is powerful enough to express it in user code).

Fourth paragraph:
- "...article you'd have..." should probably be "you'll" or, preferably, "you will".
- "...utilize it's API..." should be "its"
- "yet it's not required to get an understanding of the API." I'd probably change this to "...yet it's not required to understand the API"

Lost track of which paragraph:
- "... that allows writing a regex pattern in it's natural notation" another "its"
- "trying to match special characters like" I'd write "trying to match special regex characters like" for clarity
- "over input like e.g. search or simillar" I'd remove the e.g., write search as "search()" to show it's a function in other languages and fix the spelling of similar :P
- "An element type is Captures for the string type being used, it is a random access range." I just found this confusing.  Not sure what it's trying to say.
- "I won't go into full detail of the range conception, suffice to say," I'd change "conception" to "concept" and remove "suffice to say". (It's a shame we don't have a range article we can link to.)
- "At that time ancors like" misspelled "anchors"
- "Needless to say, one need not" I'd remove the "Needless to say," because I think it's actually important to say :P
- "replace(text, regex(r"([0-9]{1,2})/([0-9]{1,2})/([0-9]{4})","g"), "--");" Is this code example correct? The explanatory paragraph below it references $1, $2, etc., but they are nowhere to be found in the code.
- When you are explaining named captures, it sounds like you are about to show them in the subsequent code example, but you are actually showing what it'd look like without them, which was a bit confusing.
- Maybe add some more words on what lookaround/lookahead do, as I was lost.
- "Amdittedly, barrage of ? and ! makes regex rather obscure, more then it's actually is. However" should be "Admittedly, the barrage of ? and ! makes the regex rather obscure, more than it actually is." Maybe change "obscure" to a different adjective, perhaps "complex looking" or "complicated". (Note that I've removed the "However", as the upcoming sentence isn't contradicting what you just said.)
- "Needless to say it's", again, I think it's rather important to say :P
- "Run-time version took around 10-20us on my machine, admittedly no statistics." here, borrow this "µ" :P.  Also, I'd get rid of "admittedly no statistics".
- "meaningful tasks, it's features" another "its"
- "together it's major" and another :P
- "...flexible tools: match, replace, spliter" should be spelled "splitter"
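
On the replace example again: my guess (and it is only a guess, nothing the article confirms) is that the format string was meant to carry the $1/$2/$3 references and they got lost somewhere along the way. A rough sketch of what I expected to see is below. The helper name toIsoDate and the year-month-day output order are my own assumptions, made up for illustration.

```d
import std.regex : regex, replace;
import std.stdio : writeln;

// Hypothetical helper (not from the article): rewrites MM/DD/YYYY dates
// as YYYY-MM-DD. The output order is an assumption for illustration.
string toIsoDate(string text)
{
    // Three capture groups: month, day, year.
    // The "g" flag makes replace() act on every match, not just the first.
    auto datePattern = regex(r"([0-9]{1,2})/([0-9]{1,2})/([0-9]{4})", "g");

    // $1, $2 and $3 in the format string refer to the captured groups.
    return replace(text, datePattern, "$3-$1-$2");
}

void main()
{
    writeln(toIsoDate("Posted on 3/13/2012"));
}
```

With the references spelled out like this, the paragraph that talks about $1, $2, etc. would have something to point at.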


Great article.  I didn't even know about the replacement delegate feature which is something I've often wished I could use in other regex systems.  D and Phobos need more articles like this.  We should have a link to it from the std.regex documentation once this is added to the website.

Regards,
Brad Anderson