January 09, 2012 [Issue 7260] New: "g" on default in std.regex.match | ||||
---|---|---|---|---|
| ||||
http://d.puremagic.com/issues/show_bug.cgi?id=7260

Summary: "g" on default in std.regex.match
Product: D
Version: D2
Platform: All
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: Phobos
AssignedTo: nobody@puremagic.com
ReportedBy: bearophile_hugs@eml.cc

--- Comment #0 from bearophile_hugs@eml.cc 2012-01-09 13:52:08 PST ---
D2 code:

import std.stdio: write, writeln;
import std.regex: regex, match;
void main() {
    string text = "abc312de";
    foreach (c; text.match("1|2|3|4"))
        write(c, " ");
    writeln();
    foreach (c; text.match(regex("1|2|3|4", "g")))
        write(c, " ");
    writeln();
}

It outputs (DMD 2.058 Head):

["3"]
["3"] ["1"] ["2"]

In my code I have seen that usually the "g" option (that means "repeat over the whole input") is what I want. So what do you think about making "g" the default?

Note: I have not marked this issue as "enhancement" because of this comment by Dmitry Olshansky (found by drey_ on IRC #D):

http://dfeed.kimsufi.thecybershadow.net/discussion/thread/jc9hrl$2lpp$1@digitalmars.com#post-jc9mag:2430tq:241:40digitalmars.com

> Yet I have to issue yet another warning about new std.regex compared with old one:
>
> import std.stdio;
> import std.regex;
>
> void main() {
>     string src = "4.5.1";
>     foreach (c; match(src, regex(r"(\d+)")))
>         writeln(c.hit);
> }
>
> previously this will find all matches, now it finds only first one. To get all of matches use "g" option.
>
> Seems like 100% compatibility was next to impossible.

--
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
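The behavior reported above can be checked with a small helper that collects every hit into an array. This is only a sketch: `hits` is a hypothetical name, not a Phobos function, and it assumes the std.regex behavior described in this report (without the "g" flag, iteration yields only the first match).

```d
import std.algorithm : map;
import std.array : array;
import std.regex : match, regex;

// Hypothetical helper: collect the matched text of every hit that
// match() yields for the given pattern and flags.
string[] hits(string text, string pattern, string flags = "")
{
    return text.match(regex(pattern, flags)).map!(m => m.hit).array;
}
```

Calling `hits("abc312de", "1|2|3|4")` and `hits("abc312de", "1|2|3|4", "g")` side by side makes the first-match versus all-matches difference visible without a `foreach` loop.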
February 24, 2012 [Issue 7260] "g" on default in std.regex.match | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile_hugs@eml.cc

http://d.puremagic.com/issues/show_bug.cgi?id=7260

Dmitry Olshansky <dmitry.olsh@gmail.com> changed:

What    |Removed |Added
----------------------------------------------------------------------------
CC      |        |dmitry.olsh@gmail.com

--- Comment #1 from Dmitry Olshansky <dmitry.olsh@gmail.com> 2012-02-24 12:21:44 PST ---
I dunno how to "fix" this bug. "g" by default implies there is a way to override it. regex("blah","") ?
Leaving it as is now breaks old codebases that rely on "g" (though there should be more of legacy std.regexp code out there).
Making "g" on by default affects old code only inside foreach and generic constructs that show all matches or iterate on them, it's rare but non-zero.

Another way would be to ditch the current API, which is not ideal btw ;)
February 24, 2012 [Issue 7260] "g" on default in std.regex.match | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile_hugs@eml.cc

http://d.puremagic.com/issues/show_bug.cgi?id=7260

--- Comment #2 from bearophile_hugs@eml.cc 2012-02-24 12:45:08 PST ---
(In reply to comment #1)
> I dunno how to "fix" this bug. "g" by default implies there is a way to override
> it. regex("blah","") ?
> Leaving it as is now breaks old codebases that rely on "g" (though there should
> be more of legacy std.regexp code out there).
> Making "g" on by default affects old code only inside foreach and generic constructs
> that show all matches or iterate on them, it's rare but non-zero.
>
> Another way would be to ditch the current API, which is not ideal btw ;)

Fully ditching the currently used API is probably too much.

A possible idea:

regex("blah")     <<== repeat over the whole input.
regex("blah","")  <<== repeat over the whole input.
regex("blah","g") <<== repeat over the whole input.
regex("blah","d") <<== doesn't repeat over the whole input.

So far you have done good work on the regular expression implementation, so I trust your work. Thank you.
April 19, 2012 [Issue 7260] "g" on default in std.regex.match | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile_hugs@eml.cc

http://d.puremagic.com/issues/show_bug.cgi?id=7260

SomeDude <lovelydear@mailmetrash.com> changed:

What     |Removed |Added
----------------------------------------------------------------------------
CC       |        |lovelydear@mailmetrash.com
Severity |normal  |enhancement
April 19, 2012 [Issue 7260] "g" on default in std.regex.match | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile_hugs@eml.cc

http://d.puremagic.com/issues/show_bug.cgi?id=7260

bearophile_hugs@eml.cc changed:

What     |Removed     |Added
----------------------------------------------------------------------------
Severity |enhancement |normal

--- Comment #3 from bearophile_hugs@eml.cc 2012-04-19 15:18:13 PDT ---
This is not an enhancement request (I consider it more like a little Phobos regression).
January 25, 2013 [Issue 7260] "g" on default in std.regex | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile_hugs@eml.cc

http://d.puremagic.com/issues/show_bug.cgi?id=7260

bearophile_hugs@eml.cc changed:

What    |Removed             |Added
----------------------------------------------------------------------------
Summary |"g" on default in   |"g" on default in std.regex
        |std.regex.match     |

--- Comment #4 from bearophile_hugs@eml.cc 2013-01-24 19:21:14 PST ---
If changing std.regex.regex is not possible, then an alternative solution is to introduce the new little function "std.regex.re", that repeats on default, that is like:

re(someString) === regex(someString, "g")

re(someString, "d") === regex(someString, "dg")
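The `re` function proposed in comment #4 could be sketched roughly as follows. It is hypothetical and was never part of Phobos; the `hitsOf` helper is likewise an assumption added only so the effect can be observed. Note that the "d" in the comment appears to stand for any other flag, so the sketch simply forwards whatever flags the caller passes.

```d
import std.algorithm : canFind, map;
import std.array : array;
import std.regex : match, regex;

// Hypothetical wrapper from comment #4: a regex that repeats by default.
auto re(string pattern, string flags = "")
{
    // Add "g" only when the caller has not already supplied it.
    return regex(pattern, flags.canFind('g') ? flags : flags ~ "g");
}

// Hypothetical helper: collect the matched text of every hit.
string[] hitsOf(string text, string pattern, string flags = "")
{
    return text.match(re(pattern, flags)).map!(m => m.hit).array;
}
```

With this wrapper the example from the original report, `"4.5.1"` against `r"\d+"`, would yield all three numbers without the caller writing "g".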
January 25, 2013 [Issue 7260] "g" on default in std.regex | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile_hugs@eml.cc

http://d.puremagic.com/issues/show_bug.cgi?id=7260

--- Comment #5 from Dmitry Olshansky <dmitry.olsh@gmail.com> 2013-01-25 12:22:46 PST ---
(In reply to comment #4)
> If changing std.regex.regex is not possible, then an alternative solution is to introduce the new little function "std.regex.re", that repeats on default, that is like:
>
> re(someString) === regex(someString, "g")
>
> re(someString, "d") === regex(someString, "dg")

Frankly this is stupid (sorry).

Obviously the wrong turn is that people (rightfully so) associate "find all" vs "find first" with the operation, that is "match"/"replace", not with the "regex" as in the pattern itself.

Personally I think that we better go with explicit overrides on "match"/"replace"/etc. and very slowly deprecate the "g" switch.

What the override will look like is then up for debate.

match(someString, pattern).all //range of all matches
match(someString, pattern).first //only the first one
match(someString, pattern) // using the "g" flag to decide

Or pass the override as an optional parameter to match:

match(someString, pattern, Regex.all);
match(someString, pattern, Regex.first);
match(someString, pattern); //use the flag

I'll probably open a poll to pick the better one.
March 10, 2013 [Issue 7260] "g" on default in std.regex | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile_hugs@eml.cc

http://d.puremagic.com/issues/show_bug.cgi?id=7260

--- Comment #6 from Dmitry Olshansky <dmitry.olsh@gmail.com> 2013-03-10 10:43:30 PDT ---
(In reply to comment #4)
> If changing std.regex.regex is not possible, then an alternative solution is to introduce the new little function "std.regex.re", that repeats on default, that is like:
>
> re(someString) === regex(someString, "g")
>
> re(someString, "d") === regex(someString, "dg")

Here is a plan based on one of my previous ideas that I think is clean enough, given the circumstances and the fact that e.g. this Perl-ism is fairly popular in certain circles. (Namely attaching the mode of operation to the pattern itself as in /`pattern`/`mode-suffix`).

What we do is at first specify that "g" serves only as the intended default "mode" of this pattern. Then introduce a simple and elegant way to explicitly specify what mode of matching to use: first, all or the default for this pattern.

Then your code looks like this (I'm still pondering better names/ways for overriding the default):

void main() {
    string text = "abc312de";
    foreach (c; text.match("1|2|3|4").first)
        write(c, " ");
    writeln();
    foreach (c; text.match(regex("1|2|3|4")).all) //could use string pattern as above
        write(c, " ");
    writeln();
}

Then I'd try to do the same with replace.

No overrides used would imply "use whatever the default mode is".

How does it sound?

Then we place a nice bold warning that use of the "g" option is discouraged and is provided only for compatibility and is going to be deprecated in future.

A year later and depending on the mood of people it gets finally deprecated and slowly shifted towards oblivion.

I'll probably cross-post this to NG to collect opinions since this is the largest pain point of the otherwise fine interface.
March 10, 2013 [Issue 7260] "g" on default in std.regex | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile_hugs@eml.cc

http://d.puremagic.com/issues/show_bug.cgi?id=7260

--- Comment #7 from bearophile_hugs@eml.cc 2013-03-10 11:09:31 PDT ---
(In reply to comment #5)

> match(someString, pattern).all //range of all matches
> match(someString, pattern).first //only the first one
> match(someString, pattern) // using the "g" flag to decide

(In reply to comment #6)

> No overrides used would imply "use whatever the default mode is".
>
> How does it sound?
>
> Then we place a nice bold warning that use of the "g" option is discouraged and is provided only for compatibility and is going to be deprecated in future.
>
> A year later and depending on the mood of people it gets finally deprecated and slowly shifted towards oblivion.

Once "g" is deprecated what is match(someString, pattern) (without all and first) doing?
March 10, 2013 [Issue 7260] "g" on default in std.regex | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile_hugs@eml.cc

http://d.puremagic.com/issues/show_bug.cgi?id=7260

--- Comment #8 from Dmitry Olshansky <dmitry.olsh@gmail.com> 2013-03-10 11:54:55 PDT ---
(In reply to comment #7)
> (In reply to comment #5)
>
> > match(someString, pattern).all //range of all matches
> > match(someString, pattern).first //only the first one
> > match(someString, pattern) // using the "g" flag to decide
>
> (In reply to comment #6)
>
> > No overrides used would imply "use whatever the default mode is".
> >
> > How does it sound?
> >
> > Then we place a nice bold warning that use of the "g" option is discouraged and is provided only for compatibility and is going to be deprecated in future.
> >
> > A year later and depending on the mood of people it gets finally deprecated and slowly shifted towards oblivion.
>
> Once "g" is deprecated what is match(someString, pattern) (without all and
> first) doing?

Could go both ways.

The other possibility I just thought about is:

match(...).first - is the same as the current match(...).front, i.e. it simplifies the interface for the case when 1 match is needed
match(...).all - the same as the current match(... with "g" overridden), i.e. a range

Then once "g" is off we could either make .all a nop.

The alternative is to make it an opaque object that has only 2 methods, .first/.all.

The third alternative is to add alias this to make .first implicit. I feel it won't work reliably with range-based templates as it would make it "2 ranges in one".

So only the first 2 are viable. I'd go with the 1st, which gets upgraded to the second once people forget about the "g" switch entirely.
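The "opaque object with only .first/.all" alternative from comment #8 might look roughly like the sketch below. All names here (`Matching`, `matching`) are hypothetical and exist only for illustration; the sketch is built on the existing `match` and the "g" flag and simplifies the results to plain strings rather than capture objects.

```d
import std.algorithm : map;
import std.array : array;
import std.regex : match, regex;

// Hypothetical opaque result: the caller must pick .first or .all explicitly.
struct Matching
{
    string input;
    string pattern;

    // Text of the first match, or null when nothing matches.
    string first()
    {
        auto r = input.match(regex(pattern));
        return r.empty ? null : r.front.hit;
    }

    // Text of every match, found by forcing the "g" flag.
    string[] all()
    {
        return input.match(regex(pattern, "g")).map!(m => m.hit).array;
    }
}

// Hypothetical entry point standing in for the proposed match(...).
Matching matching(string input, string pattern)
{
    return Matching(input, pattern);
}
```

Because `Matching` is not itself a range, it cannot be passed to range-based templates by accident, which is the ambiguity the "alias this" variant was said to risk.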
Copyright © 1999-2021 by the D Language Foundation