August 13, 2005 negative assertion support for RegExp? | ||||
---|---|---|---|---|
| ||||
Attachments: | Is there any D library that offers regular expressions with negative assertion support?

There seems to be no documented way to use negative assertions in Phobos's regular expressions. (http://digitalmars.com/ctg/regular.html)

Usually the syntax "(?!doNotMatch)" is used for that on Linux systems.

Thomas

-- sample code ---
import std.regexp;
import std.stdio;

int main(){
    char[] log=
        "IP:127.0.0.1; USER:some; additional info\n"
        "IP:123.3.8.17; USER:other; additional info\n";

    char[] pattern = "^(?!IP:(127[.]0[.]0[.]1)); USER:([^;@]*);";
    char[] format = "; USER:$2@$1;";
    char[] attributes = "g";

    char[] filtered = sub(log, pattern, format, attributes);

    writef("---unfiltered---\n%s\n", log);
    writef("---filtered---\n%s\n", filtered);

    return 0;
}

/* Expected Output:

---unfiltered---
IP:127.0.0.1; USER:some; additional info
IP:123.3.8.17; USER:other; additional info

---filtered---
IP:127.0.0.1; USER:some; additional info
IP:123.3.8.17; USER:other@123.3.8.17; additional info

*/ |
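For readers on a current compiler, here is a minimal sketch of the requested behaviour. It assumes the present-day `std.regex` module (not the 2005 `std.regexp` discussed in this thread), which documents `(?!...)` as a negative lookahead. The function name `tagUsersRegex` is made up for illustration. Note that the pattern differs from the one in the post: a capture group inside a negative lookahead never holds a value after a successful match, so the IP address is captured outside the assertion.

```d
import std.regex : regex, replaceAll;
import std.stdio : write;

/// Appends "@<ip>" to the user name on every line that does not
/// start with "IP:127.0.0.1;". (Hypothetical helper, not from the thread.)
string tagUsersRegex(string log)
{
    // (?!...) is a zero-width negative lookahead: it consumes no input and
    // makes the match fail when the line begins with "IP:127.0.0.1;".
    // The "m" flag lets ^ match at the start of every line.
    auto re = regex(`^(?!IP:127[.]0[.]0[.]1;)IP:([^;]*); USER:([^;@]*);`, "m");

    // $1 is the IP address, $2 the user name.
    return replaceAll(log, re, "IP:$1; USER:$2@$1;");
}

void main()
{
    string log =
        "IP:127.0.0.1; USER:some; additional info\n" ~
        "IP:123.3.8.17; USER:other; additional info\n";
    write(tagUsersRegex(log));
}
```

With the sample log from the post, only the second line should gain the `@123.3.8.17` suffix.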
August 14, 2005 Re: negative assertion support for RegExp? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Thomas Kühne | Thomas Kühne <thomas-dloop@kuehne.THISISSPAM.cn> wrote:

[...]
> Is there any D library that offers regular expressions with negative assertion support?
[...]

Why do you need such? With a little bit of programming with split, find and rfind you should be able to use std.regexp for that purpose.

-manfred |
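The manual approach suggested above could look roughly like the following sketch. It is an assumption about what was meant, not Manfred's code: it uses `splitLines`, `indexOf` and `startsWith` from current Phobos in place of the 2005-era `split`, `find` and `rfind`, and the name `tagUsersManual` is hypothetical.

```d
import std.algorithm.searching : startsWith;
import std.stdio : write;
import std.string : indexOf, splitLines;

/// Appends "@<ip>" to the user name on every line whose IP differs
/// from `skipIp`, without any regular expression.
string tagUsersManual(string log, string skipIp)
{
    enum ipTag = "IP:";
    enum userTag = "; USER:";
    string result;

    foreach (line; log.splitLines())
    {
        string outLine = line;
        immutable userPos = line.indexOf(userTag);

        if (line.startsWith(ipTag) && userPos >= 0)
        {
            // The IP sits between "IP:" and "; USER:".
            immutable ip = line[ipTag.length .. userPos];
            immutable nameStart = userPos + userTag.length;
            // The user name ends at the next ';'.
            immutable nameLen = line[nameStart .. $].indexOf(';');

            if (ip != skipIp && nameLen >= 0)
            {
                immutable nameEnd = nameStart + nameLen;
                outLine = line[0 .. nameEnd] ~ "@" ~ ip ~ line[nameEnd .. $];
            }
        }
        result ~= outLine ~ "\n";
    }
    return result;
}

void main()
{
    string log =
        "IP:127.0.0.1; USER:some; additional info\n" ~
        "IP:123.3.8.17; USER:other; additional info\n";
    write(tagUsersManual(log, "127.0.0.1"));
}
```

This handles the simple sample; every extra condition has to be coded by hand, which is the cost of doing without assertions in the pattern.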
August 14, 2005 Re: negative assertion support for RegExp? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Manfred Nowak | In article <ddmogt$19aq$1@digitaldaemon.com>, Manfred Nowak says...
>
>Thomas Kühne <thomas-dloop@kuehne.THISISSPAM.cn> wrote:
>
>[...]
>> Is there any D library that offers regular expressions with negative assertion support?
>[...]
>
>Why do you need such? With a little bit of programming with split, find and rfind you should be able to use std.regexp for that purpose.

To save himself that bit of programming? ;) Regexes are currently somewhat limited in phobos. I find myself missing Perl features all the time.

--AJG. |
August 14, 2005 Re: neg. support for RegExp? (Yes, PCRE) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Thomas Kühne | Hi Thomas,

Actually, I ported PCRE version 5 to D about a month ago when Walter told me phobos didn't support named groups. AFAIK it works correctly; I compiled the test program (a version of grep) and it didn't show any errors.

The only problem is that it's not object-oriented (it's the C API).

Anyway, I'm going to upload the code and maybe you can use that. You can find example code in main.d. All you need essentially is:

# import pcre;

And off you go. If you have Build you can do:

% build main

And that's it.

Let me know if you find it useful. If there's enough interest, I could develop a D-based OO interface for it, and maybe Walter will consider it for inclusion in phobos to replace the old regex.

Some technical notes:

I ported the code with SUPPORT_UTF8, but _not_ with SUPPORT_UCP because that was just a lot of bloat. Also, the LINK_SIZE I selected was 2, the default.

Here's the link:

http://pantheon.yale.edu/~ajg36/pcre.zip

Enjoy!
--AJG.

In article <ddkoss$2u5m$1@digitaldaemon.com>, Thomas Kühne says...
>
>Is there any D library that offers regular expressions with negative assertion support?
>
>There seems to be no documented way to use negative assertions in Phobos's regular expressions. (http://digitalmars.com/ctg/regular.html)
>
>Usually the syntax "(?!doNotMatch)" is used for that on Linux systems.
>
>Thomas
>
>-- sample code ---
>import std.regexp;
>import std.stdio;
>
>int main(){
>    char[] log=
>        "IP:127.0.0.1; USER:some; additional info\n"
>        "IP:123.3.8.17; USER:other; additional info\n";
>
>    char[] pattern = "^(?!IP:(127[.]0[.]0[.]1)); USER:([^;@]*);";
>    char[] format = "; USER:$2@$1;";
>    char[] attributes = "g";
>
>    char[] filtered = sub(log, pattern, format, attributes);
>
>    writef("---unfiltered---\n%s\n", log);
>    writef("---filtered---\n%s\n", filtered);
>
>    return 0;
>}
>
>/* Expected Output:
>
>---unfiltered---
>IP:127.0.0.1; USER:some; additional info
>IP:123.3.8.17; USER:other; additional info
>
>---filtered---
>IP:127.0.0.1; USER:some; additional info
>IP:123.3.8.17; USER:other@123.3.8.17; additional info
>
>*/ |
August 14, 2005 Re: negative assertion support for RegExp? | ||||
---|---|---|---|---|
| ||||
Posted in reply to AJG Attachments: | AJG wrote:
> In article <ddmogt$19aq$1@digitaldaemon.com>, Manfred Nowak says...
>
>>Thomas Kühne <thomas-dloop@kuehne.THISISSPAM.cn> wrote:
>>
>>[...]
>>
>>>Is there any D library that offers regular expressions with negative assertion support?
>>
>>[...]
>>
>>Why do you need such? With a little bit of programming with split, find and rfind you should be able to use std.regexp for that purpose.
>
>
> To save himself that bit of programming? ;) Regexes are currently somewhat limited in phobos. I find myself missing Perl features all the time.
What I gave was a very simple regex. The production ones are nested, include alternatives and contain more than one negative assertion.
Thomas
|
August 14, 2005 Re: negative assertion support for RegExp? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Thomas Kühne | Thomas Kühne <thomas-dloop@kuehne.THISISSPAM.cn> wrote:

[...]
> What I gave was a very simple regex. The production ones are nested, include alternatives and contain more than one negative assertion.
[...]

Then I do not believe that an approach with REs and "assertions" is feasible, in the first place in terms of run-time requirements, but also in terms of development and maintenance time, because you are implementing some sort of lexer/parser for a language for which you have neither an explicit formal grammar nor definitions of the lexical tokens.

I do not know the details of the implementation of PCRE, but I do not believe that a tool whose emphasis is on REs incidentally also implements an LALR parser.

-manfred |
August 14, 2005 Re: neg. support for RegExp? (Yes, PCRE) | ||||
---|---|---|---|---|
| ||||
Posted in reply to AJG | On Sun, 14 Aug 2005 08:02:41 +0000 (UTC), AJG wrote:

> Hi Thomas,
>
> Actually, I ported PCRE version 5 to D about a month ago when Walter told me phobos didn't support named groups. AFAIK it works correctly; I compiled the test program (a version of grep) and it didn't show any errors.
>
> The only problem is that it's not object-oriented (it's the C API).

I don't see that O-O is a requirement. A simple procedural API is quite satisfactory.

--
Derek Parnell
Melbourne, Australia
14/08/2005 11:07:28 PM |
August 14, 2005 Re: neg. support for RegExp? (Yes, PCRE) | ||||
---|---|---|---|---|
| ||||
Posted in reply to AJG Attachments: | AJG wrote:
> Hi Thomas,
>
> Actually, I ported PCRE version 5 to D about a month ago when Walter told me phobos didn't support named groups. AFAIK it works correctly; I compiled the test program (a version of grep) and it didn't show any errors.
>
> The only problem is that it's not object-oriented (it's the C API).
>
> Anyway, I'm going to upload the code and maybe you can use that. You can find example code in main.d. All you need essentially is:
>
> # import pcre;
>
> And off you go. If you have Build you can do:
>
> % build main
>
> And that's it.
>
> Let me know if you find it useful. If there's enough interest, I could develop a D-based OO interface for it, and maybe Walter will consider it for inclusion in phobos to replace the old regex.
>
> Some technical notes:
>
> I ported the code with SUPPORT_UTF8, but _not_ with SUPPORT_UCP because that was just a lot of bloat. Also, the LINK_SIZE I selected was 2, the default.
>
> Here's the link:
>
> http://pantheon.yale.edu/~ajg36/pcre.zip

Thanks for the code :)))

The main.d sample requires two small changes:

line 1
< private import pcre_c;
> private import pcre;

line 8
< pcre *re;
> pcre.pcre *re;

I think PCRE_D - after a bit of clean up and some unittests - might become a valuable Phobos addon.

Thomas |
Copyright © 1999-2021 by the D Language Foundation