February 16, 2006
On Wed, 15 Feb 2006 15:51:40 -0800, Walter Bright wrote:

> 
> There's a link in the std_regexp page to it: www.digitalmars.com/ctg/regular.html

There are a couple of problems with this link. It doesn't work when one uses the downloaded html docs, because it links to a file that is not part of the download. But more importantly, the syntax is wrong.

The actual html you use is (notice the twin double quotes)

	<a href=""../../ctg/regular.html"">Regular expressions</a>

but it would be better to use something like ...

	<a href="http://www.digitalmars.com/ctg/regular.html">Regular expressions</a>

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocracy!"
16/02/2006 10:59:41 AM
February 16, 2006
Walter Bright wrote:
> "Sean Kelly" <sean@f4.ca> wrote in message news:dt0ds9$226k$1@digitaldaemon.com...
>> Walter Bright wrote:
>>> Added match expressions.
>> Interesting.  So where can I find documentation on pattern syntax?  The docs for std.regexp don't seem to mention it.  Is it just the classic textbook syntax, or are there differences?
> 
> There's a link in the std_regexp page to it: www.digitalmars.com/ctg/regular.html
> 
> It's the classic syntax. 

Awesome!  This will take some getting used to, but it promises to be of tremendous use.  Don't ask me why a built-in feature seems preferable to the same thing in library code, but it does :-p  Perhaps some of it is that this will work for both compile-time and run-time evaluation, while the library version would likely be different for each.


Sean
February 16, 2006
Derek Parnell wrote:
> On Wed, 15 Feb 2006 15:51:40 -0800, Walter Bright wrote:
> 
>> There's a link in the std_regexp page to it: www.digitalmars.com/ctg/regular.html
> 
> There are a couple of problems with this link. It doesn't work when one uses
> the downloaded html docs. This is because it uses a link to a file that is
> not a part of the downloaded stuff. But more importantly, the syntax is
> wrong.

Got me.  I'm looking at the online docs (http://www.digitalmars.com/d/phobos/std_regexp.html) and both links at the top of the page just link to std_regexp.html.  Thus my question.


Sean
February 16, 2006
I obviously lack the necessary terminology.
Trying with pseudo-code:

// --- Situation: find matches, chunks of data in a list, no continuous
// buffer, memcpys to duplicate and concatenate data inefficient

slist List    = ...; // containers with some data differing in size
rxres Results = List ~~ "regex+";

// Hopefully indexed all potentially dangling matches between two
// chunks (?)...

while( !Results )
.. = Results.nFirst,
.. = Results.nLast,
.. = Results.get_ptr,
Results++;

// --- Situation: find matches in a continuous buffer:

utf16 Text[]  = ...;
rxres Results = Text ~~ "foo+";

while( !Results )
print( Results++ );

---

Maybe this explains what I meant, maybe it is just absurd.
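
For the continuous-buffer case, here is a minimal sketch of the same loop using the later std.regex module. That is an assumption: it is neither the `~~` operator nor std.regexp as discussed in this thread, and `findAll` is a hypothetical helper name. The chunked-list case is not covered here.

```d
import std.regex : matchAll, regex;
import std.stdio : writeln;

// Collect every match of `pattern` found in one contiguous buffer.
// `findAll` is a hypothetical helper, not part of any library.
string[] findAll(string text, string pattern)
{
    string[] hits;
    foreach (m; matchAll(text, regex(pattern)))
        hits ~= m.hit; // m.hit is the slice of `text` that matched
    return hits;
}

void main()
{
    writeln(findAll("fo foo fooo bar", "foo+"));
}
```

Each element yielded by `matchAll` plays roughly the role of one step of `Results++` in the pseudo-code above.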

Chr. Grade


In article <dt0b6l$1vpe$1@digitaldaemon.com>, Walter Bright says...
>
>
>"Chr. Grade" <Chr._member@pathlink.com> wrote in message news:dt0ait$1v78$1@digitaldaemon.com...
>>
>> Nifty feature. Would be handy if regex searches be included as well - for continuous buffers and for chunked buffers.
>
>Not sure what you mean?
>
>


February 16, 2006
On Wed, 15 Feb 2006 15:50:03 -0800, Walter Bright wrote:


> Also, operating system wildcard thing isn't the one used, it's real regular expressions from std.regexp. So you'd write it as:
> 
>     assert(".wav$" ~~ filename);
> 
> which means any string ending in ".wav".

Should that be ...

       assert("\.wav$" ~~ filename);

otherwise it would also match things like "somefile.awav", because doesn't the "." in the regexp represent 'any character'?
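
The difference can be checked with a small sketch. It assumes the later std.regex module rather than the `~~` operator, and `matches` is a hypothetical helper name. The escaped pattern is written as a raw string (`r"..."`), since in an ordinary double-quoted D literal the backslash would itself need escaping.

```d
import std.regex : matchFirst, regex;

// True if `pattern` matches somewhere in `name`.
// `matches` is a hypothetical helper, not part of any library.
bool matches(string name, string pattern)
{
    return !matchFirst(name, regex(pattern)).empty;
}

void main()
{
    // Unescaped dot: any single character may precede "wav".
    assert(matches("song.wav", ".wav$"));
    assert(matches("somefile.awav", ".wav$"));

    // Escaped dot: only a literal "." may precede "wav".
    assert(matches("song.wav", r"\.wav$"));
    assert(!matches("somefile.awav", r"\.wav$"));
}
```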

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocracy!"
16/02/2006 11:16:13 AM
February 16, 2006
On Thu, 16 Feb 2006 00:13:16 +0000 (UTC), Chr. Grade wrote:

> I obviously lack the necessary terminology.
> Trying with pseudo-code:
> 
> // --- Situation: find matches, chunks of data in a list, no continuous
> // buffer, memcpys to duplicate and concatenate data inefficient
> 
> slist List    = ...; // containers with some data differing in size
> rxres Results = List ~~ "regex+";
> 
> // Hopefully indexed all potentially dangling matches between two
> // chunks (?)...
> 
> while( !Results )
> .. = Results.nFirst,
> .. = Results.nLast,
> .. = Results.get_ptr,
> Results++;
> 
> // --- Situation: find matches in a continuous buffer:
> 
> utf16 Text[]  = ...;
> rxres Results = Text ~~ "foo+";
> 
> while( !Results )
> print( Results++ );
> 
> ---
> 
> Maybe this explains what I meant, maybe it is just absurd.
> 

I'm really sorry, but this has just made it worse for me. I have absolutely no idea what you are trying to do or say.

Are you talking about a list of pointers to strings and searching over the referenced strings in one ~~ operation?

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocracy!"
16/02/2006 11:20:32 AM
February 16, 2006
>
>Currently, evaluating ("^abc"~~string) invokes the full std.regexp machinery. But a compiler is free to optimize (1) into (2). I'm thinking of Eric and Don's examples of generating custom recognizers for static regex strings. This could make D's regex support into a real screamer.
>
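
As an illustration of a recognizer generated for a static pattern: the later std.regex module offers `ctRegex`, which builds a specialised matcher at compile time. This is only a sketch under that assumption. It is not what the compiler or std.regexp did at the time of this thread, and `startsWithAbc` is a hypothetical name.

```d
import std.regex : ctRegex, matchFirst;

// The pattern is a compile-time constant, so ctRegex can generate
// a matcher specialised for it during compilation.
bool startsWithAbc(string s)
{
    auto re = ctRegex!(`^abc`);
    return !matchFirst(s, re).empty;
}

void main()
{
    assert(startsWithAbc("abcdef"));
    assert(!startsWithAbc("xabc"));
}
```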

Static regex? Umm...
Again, this might be absurd, but there could be a type "regex".

regex rxSome  = "§|&|=";
regex rxMore  = "[a-n]";
regex rxMerge = "foo($rxSome)?($rxMore)+";

Whereas...
char[] cpSome  = "...";
char[] cpMore  = "...";
.. would lead to a less readable:
char[] cpMerge = "foo" . cpSome . "?" . cpMore . "+";
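
For comparison, a sketch of the string-based composition in present-day D, assuming the later std.regex module. `mergePattern` and the fragment values are illustrative, not from any library. Note that D concatenates with `~`, and the parentheses have to be written out by hand to get the grouping of the `rxMerge` line.

```d
import std.regex : matchFirst, regex;

// Build the combined pattern from two fragments by plain concatenation.
// `mergePattern` is a hypothetical helper name.
string mergePattern(string some, string more)
{
    return "foo(" ~ some ~ ")?(" ~ more ~ ")+";
}

void main()
{
    string merged = mergePattern("&|=", "[a-n]");
    assert(merged == "foo(&|=)?([a-n])+");

    auto re = regex(merged);
    assert(!matchFirst("foo&abc", re).empty); // optional "&", then letters a-n
    assert(matchFirst("bar", re).empty);      // no "foo" prefix, so no match
}
```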

---

Chr. Grade


February 16, 2006
"Derek Parnell" <derek@psych.ward> wrote in message news:1fyp16zonzb9q$.1qxsxpiy1s1ry.dlg@40tude.net...
> On Wed, 15 Feb 2006 15:50:03 -0800, Walter Bright wrote:
>> So you'd write it as:
>>
>>     assert(".wav$" ~~ filename);
>>
>> which means any string ending in ".wav".
>
> Should that be ...
>
>       assert("\.wav$" ~~ filename);
>
> otherwise it would also match things like "somefile.awav", because doesn't the "." in the regexp represent 'any character'?

Yes. <g>


February 16, 2006
"Sean Kelly" <sean@f4.ca> wrote in message news:dt0fp5$23qb$1@digitaldaemon.com...
> Awesome!  This will take some getting used to, but it promises to be of tremendous use.  Don't ask me why a built-in feature seems preferable to the same thing in library code, but it does :-p

I don't really know why either. std.regexp has been in D since day 1, but it's been completely overlooked, and I regularly get comments about D not doing regular expressions. If this is what it takes, then so be it <g>.


February 16, 2006
"Derek Parnell" <derek@psych.ward> wrote in message news:k0lbfijz1ng3$.7oaf5rf2w9ut$.dlg@40tude.net...
> On Wed, 15 Feb 2006 13:52:12 -0800, Walter Bright wrote:
>
>> Added match expressions.
>
> Too lazy to test sorry. Do match expressions support Unicode or just ASCII?

I know it works with ASCII, and it's supposed to work with UTF. I wouldn't be surprised if the latter is buggy, though, since I haven't written test cases for it.

It's designed, however, so the compiler itself need know nothing about regular expressions. The work is all done by std.regexp.