regexp suggestion
February 08, 2002
It would be really nice to have a method of RegExp similar to test(),
but matching the regexp only at the given position, without scanning
ahead on failure, and returning the number of bytes matched (or 0 on failure).
It could be used for easy token parsing:

    RegExp identifier = new RegExp('\w', "");
    char[] code, token;
    int pos;
    ...
    int count = identifier.get(code, pos);
    if (count)
    {
        token = code[pos .. pos + count];
        pos += count;    // next token
    }
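
A minimal sketch of the requested behaviour, assuming present-day D's std.regex rather than the RegExp class discussed in this thread. The name matchAt is hypothetical, and the sketch relies on the caller anchoring the pattern with `^`, so that a match can only start at the beginning of the slice code[pos .. $]:

```d
import std.regex : Regex, matchFirst, regex;

/// Hypothetical helper: length of the match that starts exactly at `pos`,
/// or 0 if the pattern does not match there.
/// Assumption: `re` begins with `^`, which pins the match to the slice start.
size_t matchAt(Regex!char re, string code, size_t pos)
{
    auto m = matchFirst(code[pos .. $], re);
    return m.empty ? 0 : m.hit.length;
}

void main()
{
    auto identifier = regex(`^[A-Za-z_]\w*`);
    string code = "foo123 = bar456 + 789;";

    assert(matchAt(identifier, code, 0) == 6); // "foo123"
    assert(matchAt(identifier, code, 6) == 0); // a space: no scan-ahead to "bar456"
}
```

Slicing does not copy in D, so the helper allocates no new string per call. Whether the engine stops early on a failed anchored match is an implementation detail this sketch does not depend on.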




February 08, 2002
I believe you can already do that with regexp by looking at the match array and using it to slice the input array.

"Pavel Minayev" <evilone@omen.ru> wrote in message news:a41ccn$2m50$1@digitaldaemon.com...
> It would be really nice to have a method of RegExp similar to test(),
> but matching the regexp only at the given position, without scanning
> ahead on failure, and returning the number of bytes matched (or 0 on failure).
> It could be used for easy token parsing:
>
>     RegExp identifier = new RegExp('\w', "");
>     char[] code, token;
>     int pos;
>     ...
>     int count = identifier.get(code, pos);
>     if (count)
>     {
>         token = code[pos .. pos + count];
>         pos += count;    // next token
>     }
>
>
>
>


February 08, 2002
"Walter" <walter@digitalmars.com> wrote in message news:a41imc$2pnk$1@digitaldaemon.com...
> I believe you can already do that with regexp by looking at the match array and using it to slice the input array.

Yes, but it's sloooooow!


February 08, 2002
You can also use the "g" attribute.

"Pavel Minayev" <evilone@omen.ru> wrote in message news:a41jep$2q3p$1@digitaldaemon.com...
> "Walter" <walter@digitalmars.com> wrote in message news:a41imc$2pnk$1@digitaldaemon.com...
> > I believe you can already do that with regexp by looking at the match array and using it to slice the input array.
>
> Yes, but it's sloooooow!
>
>


February 09, 2002
"Walter" <walter@digitalmars.com> wrote in message news:a41oek$2se5$1@digitaldaemon.com...

> You can also use the "g" attribute.

Sorry, I'm not very familiar with regexp... how is
it supposed to do what I want?


February 09, 2002
"Pavel Minayev" <evilone@omen.ru> wrote in message news:a42jse$6h1$1@digitaldaemon.com...
> "Walter" <walter@digitalmars.com> wrote in message news:a41oek$2se5$1@digitaldaemon.com...
>
> > You can also use the "g" attribute.
>
> Sorry, I'm not very familiar with regexp... how is
> it supposed to do what I want?

If you pass the "g" attribute to the RegExp constructor, repeated calls to exec() will each pick up where the previous one left off.
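
As a rough illustration of that global-search behaviour, here is a sketch using present-day D's std.regex, where matchAll plays the role of repeated exec() calls with "g". This is an assumed analogue, not the RegExp API from this thread, and allMatches is a hypothetical helper name:

```d
import std.algorithm : map;
import std.array : array;
import std.regex : matchAll, regex;

/// Hypothetical helper: collects every successive match of `pattern` in `input`.
/// Each match search resumes where the previous match ended.
string[] allMatches(string input, string pattern)
{
    return matchAll(input, regex(pattern)).map!(m => m.hit).array;
}

void main()
{
    string code = "foo123 = bar456 + 789;";

    // The search skips over non-matching text between hits.
    assert(allMatches(code, `[A-Za-z_]\w*`) == ["foo123", "bar456"]);
    assert(allMatches(code, `[0-9]+`) == ["123", "456", "789"]);
}
```

Note that each search scans forward until it finds a match, so text between two hits is silently skipped.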


February 09, 2002
"Walter" <walter@digitalmars.com> wrote in message news:a42tc9$hrc$1@digitaldaemon.com...

> If you pass the "g" attribute to the RegExp constructor, repeated calls to exec() will each pick up where the previous one left off.

But doesn't it search further ahead for the regexp if it doesn't match at the current position?


February 09, 2002
"Pavel Minayev" <evilone@omen.ru> wrote in message news:a433vk$l3i$1@digitaldaemon.com...
> "Walter" <walter@digitalmars.com> wrote in message news:a42tc9$hrc$1@digitaldaemon.com...
>
> > If you pass the "g" attribute to the RegExp constructor, repeated calls to exec() will each pick up where the previous one left off.
>
> But doesn't it search further ahead for the regexp if it doesn't match at the current position?

Yes.


February 09, 2002
"Walter" <walter@digitalmars.com> wrote in message news:a43tq3$11uk$2@digitaldaemon.com...

> > But doesn't it search further ahead for the regexp if it doesn't match at the current position?
>
> Yes.

Then I don't understand how it can be used to tokenize the string. Suppose I have:

    foo123 = bar456 + 789;

Now I first search for identifiers and get "foo123" and "bar456". Then I search for numbers and get "123", "456" and "789" - and only the last one is correct...

With my suggestion implemented, however, it would look somewhat different. First I check for an identifier and get "foo123". Then I advance past the end of that token and perform another check... when I get to "789", I check whether it matches an identifier /\w.../ - it doesn't, so I check whether it is a number /[0-9]+/ and succeed... that's how it is supposed to work.
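
The loop described above might be sketched as follows, again assuming present-day D's std.regex in place of the proposed RegExp method. Token and tokenize are hypothetical names, the three patterns are illustrative guesses rather than anything specified in this thread, and each one is anchored with `^` so it can only match at the current position:

```d
import std.regex : matchFirst, regex;

/// Hypothetical token record for this sketch.
struct Token
{
    string kind;
    string text;
}

/// Tries each anchored pattern at the current position and advances past
/// whatever matched; any other character becomes a one-character symbol.
Token[] tokenize(string code)
{
    auto space      = regex(`^\s+`);
    auto identifier = regex(`^[A-Za-z_]\w*`);
    auto number     = regex(`^[0-9]+`);

    Token[] tokens;
    size_t pos = 0;
    while (pos < code.length)
    {
        auto rest = code[pos .. $];

        auto ws = matchFirst(rest, space);
        if (!ws.empty) { pos += ws.hit.length; continue; }

        auto id = matchFirst(rest, identifier);
        if (!id.empty)
        {
            tokens ~= Token("identifier", id.hit);
            pos += id.hit.length;
            continue;
        }

        auto num = matchFirst(rest, number);
        if (!num.empty)
        {
            tokens ~= Token("number", num.hit);
            pos += num.hit.length;
            continue;
        }

        tokens ~= Token("symbol", rest[0 .. 1]);
        pos += 1;
    }
    return tokens;
}

void main()
{
    auto tokens = tokenize("foo123 = bar456 + 789;");
    assert(tokens == [
        Token("identifier", "foo123"), Token("symbol", "="),
        Token("identifier", "bar456"), Token("symbol", "+"),
        Token("number", "789"),        Token("symbol", ";"),
    ]);
}
```

Because every pattern is tried only at pos, "123" and "456" are never reported as numbers: they have already been consumed as part of the identifiers.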


February 09, 2002
I think sscanf could do this if it could return a pointer to how far it got in the input string during processing in addition to how many fields were converted.  sscanf as it exists in C is not so useful.

Sean
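
For reference, C's sscanf does provide the %n directive, which stores the number of input characters consumed so far and is not counted in the return value. A small sketch calling it from D through core.stdc.stdio (the input string is just an example):

```d
import core.stdc.stdio : sscanf;

void main()
{
    int value;
    int consumed;

    // D string literals are zero-terminated, so passing .ptr to C is safe here.
    int fields = sscanf("789;".ptr, "%d%n", &value, &consumed);

    assert(fields == 1);   // %n does not count as a converted field
    assert(value == 789);
    assert(consumed == 3); // scanning stopped just before ';'
}
```

This reports an offset rather than a pointer, and sscanf only understands its own conversion specifiers, not regular expressions.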

"Pavel Minayev" <evilone@omen.ru> wrote in message news:a443lq$147s$1@digitaldaemon.com...
> "Walter" <walter@digitalmars.com> wrote in message news:a43tq3$11uk$2@digitaldaemon.com...
>
> > > But doesn't it search further ahead for the regexp if it doesn't match at the current position?
> >
> > Yes.
>
> Then I don't understand how it can be used to tokenize the string. Suppose I have:
>
>     foo123 = bar456 + 789;
>
> Now I first search for identifiers and get "foo123" and "bar456". Then I search for numbers and get "123", "456" and "789" - and only the last one is correct...
>
> With my suggestion implemented, however, it would look somewhat different. First I check for an identifier and get "foo123". Then I advance past the end of that token and perform another check... when I get to "789", I check whether it matches an identifier /\w.../ - it doesn't, so I check whether it is a number /[0-9]+/ and succeed... that's how it is supposed to work.


