February 18, 2006
"Oskar Linde" <olREM@OVEnada.kth.se> wrote in message news:dt40sg$29nc$1@digitaldaemon.com...
> Andrew Fedoniouk wrote:
>
>> 1) Will builtin RegExp increase minimal size of D executable? I mean if this executable is not using regexp at all.
>
> No. This was as far as I understood one of the considerations.
>
>> 2) Is it possible to override operator ~~ ?
>
> Yes. opMatch() and opNext().

And what exactly does this opNext() do?
Does it give the next sub-expression, or the next match after the last matched position (as with /g)?

>
>> 3) What is the main purpose of incorporating
>> interprettable regexps in natively compileable language?
>
> To make regexps more accessible I guess. Makes D seem like an alternative to
> scripting languages.

???

An alternative to a scripting language can be another scripting language. An alternative to a natively compilable language can be another natively compilable language.

>
>> 4) When happens check of regexp for syntax correctness -
>> at compile time or at runtime?  "..." ~~ "..."
>> If ~~ is a part of language syntax then one can assume that expression
>> is getting compiled somehow.
>
> At runtime. For now at least. In the future it could possibly be compiled at
> compile time, but there will still always be a need to support run-time
> regexps anyway.
>

Having "builtin" regexps without strings in the language seems unnatural.

Andrew.


February 18, 2006
"Andrew Fedoniouk" <news@terrainformatica.com> wrote in message news:dt591g$erk$1@digitaldaemon.com...
> Next questions then:
> [char string literal] ~~ [char string literal]
>
> 1) For what object I need to override opMatch to be able
> to get it invoked in the line above?

None. Operator overloading requires one object be a class or a struct. But you could do:

    RegExp("string") ~~ "string"

and overload opMatch for RegExp.

> 2) For some types of RE (alike) expressions there is no need
> to create instance of RegExp, e.g. test
> "*.ext" ~~ file_name
> can be implemented times faster than standard RE creation/invocation.

Sure. Create your own MyReg object, and use it like:

    MyReg("*.ext") ~~ filename

> 3) Some objects has no string representation of match operation. For example CSS selector as an object has match operation with DOM element as an argument. But you have a requirement:
>
> "Both operands must be implicitly convertible to char[]."
>
> What to do in this case?

Operator overloading happens before implicit conversions.

>>> 3) What is the main purpose of incorporating
>>> interprettable regexps in natively compileable language?
>>
>> Make them easier to use.
>
> Easier? What is wrong with standard way:
>
>   regexp re = new regexp(".....");
>   re.test(...);

For whatever reason, people find that confusing and impractical.

> And easier is not mean more effective.

True. I didn't say it was more effective.

> If it does not compile this regexp at compile time, then this is just a
> fake and not a solution at all for a language of D's level.
> Even Perl compiles its regular expressions at compile time.

It isn't worth trying to do them at compile time if the feature itself doesn't catch on.
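
For readers wondering what a compile-time regexp could look like in D: a minimal sketch, assuming a later Phobos that ships `std.regex` with `ctRegex` (a library that postdates this thread and is not the `~~` feature under discussion). The function name `isIdentifier` and the pattern are made up for illustration.

```d
import std.regex : ctRegex, matchFirst;

// Hypothetical helper: true if s looks like a C-style identifier.
bool isIdentifier(string s)
{
    // ctRegex parses the pattern during compilation, so a malformed
    // pattern is reported as a compile error rather than at run time.
    auto re = ctRegex!(`^[A-Za-z_]\w*$`);
    return !matchFirst(s, re).empty;
}
```

Run-time patterns would still go through `regex(...)`, which matches the point made earlier in the thread that run-time regexps need to be supported anyway.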

> So the real meaning of
>  arg1 ~~ arg2
> notation is just a shortcut of
>  arg1.test(arg2)

It's more than that, because of the implicit declaration of the match results.

> In general shortcuts are good but in this particular case
> it has hidden side effects in creation of new RegExp object on each test
> invocation.

Yes, but why is that a bad thing?


February 18, 2006
"Andrew Fedoniouk" <news@terrainformatica.com> wrote in message news:dt5eo9$kgu$1@digitaldaemon.com...
> will be more a) compact b) human readable c) maintainable d) natural

For startsWith(), sure. But if that was all regex was used for, nobody would have ever invented them. Regexes can search for arbitrarily complex patterns, and are used that way. Writing a library of custom functions for each is out of the question.

What you're also missing in the examples is using the match result, not just testing for the match.
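
To illustrate the difference between testing for a match and using the match result: a minimal sketch, assuming a later Phobos with `std.regex` (not the `~~` operator or the `RegExp` class discussed in this thread). `extensionOf` is a hypothetical helper.

```d
import std.regex : matchFirst, regex;

// Hypothetical helper: returns the text after the last '.', or an
// empty string when the name has no extension.
string extensionOf(string fileName)
{
    auto m = matchFirst(fileName, regex(`\.(\w+)$`));
    // m.empty answers "did it match?"; m[1] is the first capture group,
    // which a plain boolean test would throw away.
    return m.empty ? "" : m[1];
}
```

A `startsWith()`-style function can only answer yes or no; here the caller also gets the captured substring back.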


February 18, 2006
"Andrew Fedoniouk" <news@terrainformatica.com> wrote in message news:dt5ton$10qu$1@digitaldaemon.com...
> There is also the /g flag, which allows scanning the whole string (Perl):
>
> $i = 0;
> while ($string =~ m/regex/g) {
>  print "Gotcha #" . $i . "!\n";
> }
>
> So what exactly does this ~~ do?
>
> Andrew.

m/regex/g  =>  RegExp("regex", "g")
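
The Perl loop quoted above visits every match in turn. A rough equivalent, assuming a later Phobos with `std.regex` (where `matchAll` plays the role of the "g" attribute; this is not the `RegExp("regex", "g")` API named in this post). `allHits` is a hypothetical helper.

```d
import std.regex : matchAll, regex;

// Hypothetical helper: collects the text of every non-overlapping match.
string[] allHits(string input, string pattern)
{
    string[] hits;
    // matchAll yields one Captures object per match, scanning left to
    // right and resuming after the end of the previous match.
    foreach (m; matchAll(input, regex(pattern)))
        hits ~= m.hit;
    return hits;
}
```

The iteration state lives in the range returned by `matchAll`, not in the pattern object, so the same compiled regex can be reused for several scans.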


February 18, 2006
"Walter Bright" <newshound@digitalmars.com> wrote in message news:dt6da8$1ci6$1@digitaldaemon.com...
>
> "Andrew Fedoniouk" <news@terrainformatica.com> wrote in message news:dt591g$erk$1@digitaldaemon.com...
>> Next questions then:
>> [char string literal] ~~ [char string literal]
>>
>> 1) For what object I need to override opMatch to be able
>> to get it invoked in the line above?
>
> None. Operator overloading requires one object be a class or a struct. But you could do:
>
>    RegExp("string") ~~ "string"
>
> and overload opMatch for RegExp.

And this RegExp("string") ~~ "string" is more honest, isn't it?

Or as in Harmonia:

string s = ....
bool r = s.like("str*");


>
>> 2) For some types of RE (alike) expressions there is no need
>> to create instance of RegExp, e.g. test
>> "*.ext" ~~ file_name
>> can be implemented times faster than standard RE creation/invocation.
>
> Sure. Create your own MyReg object, and use it like:
>
>    MyReg("*.ext") ~~ filename

But I want my own function for char[] ~~ char[] !
Simple pattern match does not require compilation phase
or even memory allocation...
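
A sketch of the kind of allocation-free wildcard test meant here, assuming only `*` (any run of characters) and `?` (any single character) as metacharacters. This is an illustration written for this point, not Harmonia's actual `like()` implementation.

```d
// Hypothetical glob matcher: no pattern compilation, no heap allocation.
bool like(const(char)[] text, const(char)[] pattern) @safe pure nothrow @nogc
{
    size_t t = 0, p = 0;
    size_t starP = size_t.max; // index of the most recent '*' in pattern
    size_t starT = 0;          // text index where that '*' started matching

    while (t < text.length)
    {
        if (p < pattern.length && pattern[p] == '*')
        {
            // Remember the star and first try matching it against nothing.
            starP = p++;
            starT = t;
        }
        else if (p < pattern.length && (pattern[p] == '?' || pattern[p] == text[t]))
        {
            ++t;
            ++p;
        }
        else if (starP != size_t.max)
        {
            // Mismatch: let the last '*' swallow one more character.
            p = starP + 1;
            t = ++starT;
        }
        else
            return false;
    }
    // Only trailing stars may remain in the pattern.
    while (p < pattern.length && pattern[p] == '*')
        ++p;
    return p == pattern.length;
}
```

With this, `like(file_name, "*.ext")` answers the question from point 2 without constructing any regexp object.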


>
>> 3) Some objects has no string representation of match operation. For example CSS selector as an object has match operation with DOM element as an argument. But you have a requirement:
>>
>> "Both operands must be implicitly convertible to char[]."
>>
>> What to do in this case?
>
> Operator overloading happens before implicit conversions.

I don't understand why this is not allowed:
bool opMatch(char[] a, char[] b) ?


>
>>>> 3) What is the main purpose of incorporating
>>>> interprettable regexps in natively compileable language?
>>>
>>> Make them easier to use.
>>
>> Easier? What is wrong with standard way:
>>
>>   regexp re = new regexp(".....");
>>   re.test(...);
>
> For whatever reason, people find that confusing and impractical.

uh, people....  I see.

>
>> And easier is not mean more effective.
>
> True. I didn't say it was more effective.
>
>> If it does not compile this regexp at compile time, then this is just a
>> fake and not a solution at all for a language of D's level.
>> Even Perl compiles its regular expressions at compile time.
>
> It isn't worth trying to do them at compile time if the feature itself doesn't catch on.
>
>> So the real meaning of
>>  arg1 ~~ arg2
>> notation is just a shortcut of
>>  arg1.test(arg2)
>
> It's more than that, because of the implicit declaration of the match results.
>
>> In general shortcuts are good but in this particular case
>> it has hidden side effects in creation of new RegExp object on each test
>> invocation.
>
> Yes, but why is that a bad thing?

You need to explain very well what is going on under the hood of this ~~: it is a stateful operator (if it is /g).

<ot>

I am using a stream tokenizer in Harmonia instead of this /g.
(class TokenizerT(CHAR) // harmonia/string.d)

A simple like(pattern) method is enough in 90% of cases.

Perl is a completely different story: it is built around RegExp. And it is typeless.

</ot>

BTW: Have you seen Nemerle and its way of meta-programming? http://nemerle.org/

Andrew.


February 18, 2006
On Fri, 17 Feb 2006 20:46:01 -0800, Andrew Fedoniouk <news@terrainformatica.com> wrote:
> "Oskar Linde" <olREM@OVEnada.kth.se> wrote in message
> news:dt40sg$29nc$1@digitaldaemon.com...
>> Andrew Fedoniouk wrote:
>>
>>> 1) Will builtin RegExp increase minimal size of D executable?
>>> I mean if this executable is not using regexp at all.
>>
>> No. This was as far as I understood one of the considerations.
>>
>>> 2) Is it possible to override operator ~~ ?
>>
>> Yes. opMatch() and opNext().
>
> And what is this opNext() doing exactly?
> next sub-expression, next match from last position matched (/g) ?
>
>>
>>> 3) What is the main purpose of incorporating
>>> interprettable regexps in natively compileable language?
>>
>> To make regexps more accessible I guess. Makes D seem like an alternative to
>> scripting languages.
>
> ???
>
> An alternative to a scripting language can be another scripting language.
> An alternative to a natively compilable language can be another natively
> compilable language.

I think you're thinking inside the box. :)
With the recent additions is it not possible to write scripts in D?

Regan


February 18, 2006
"Regan Heath" <regan@netwin.co.nz> wrote in message news:ops45qq5rn23k2f5@nrage.netwin.co.nz...
> On Fri, 17 Feb 2006 20:46:01 -0800, Andrew Fedoniouk <news@terrainformatica.com> wrote:
>> "Oskar Linde" <olREM@OVEnada.kth.se> wrote in message news:dt40sg$29nc$1@digitaldaemon.com...
>>> Andrew Fedoniouk wrote:
>>>
>>>> 1) Will builtin RegExp increase minimal size of D executable? I mean if this executable is not using regexp at all.
>>>
>>> No. This was as far as I understood one of the considerations.
>>>
>>>> 2) Is it possible to override operator ~~ ?
>>>
>>> Yes. opMatch() and opNext().
>>
>> And what is this opNext() doing exactly?
>> next sub-expression, next match from last position matched (/g) ?
>>
>>>
>>>> 3) What is the main purpose of incorporating
>>>> interprettable regexps in natively compileable language?
>>>
>>> To make regexps more accessible I guess. Makes D seem like an alternative to
>>> scripting languages.
>>
>> ???
>>
>> An alternative to a scripting language can be another scripting language. An alternative to a natively compilable language can be another natively compilable language.
>
> I think you're thinking inside the box. :)
> With the recent additions is it not possible to write scripts in D?

I believe there is a sort of misunderstanding about what scripting is and
why there are scripting (typeless) languages, compiled bytecode languages and
compiled native languages.
These three groups have their own niches. D as a compiled language will never
reach the flexibility of e.g. prototype-based JavaScript or Ruby. There are just
different definitions of flexibility for these groups: different and sometimes
even orthogonal tasks.

Andrew.


February 18, 2006
"Andrew Fedoniouk" <news@terrainformatica.com> wrote in message news:dt6gbc$1eig$1@digitaldaemon.com...
>
> "Walter Bright" <newshound@digitalmars.com> wrote in message news:dt6da8$1ci6$1@digitaldaemon.com...
>> None. Operator overloading requires one object be a class or a struct. But you could do:
>>
>>    RegExp("string") ~~ "string"
>>
>> and overload opMatch for RegExp.
>
> And this RegExp("string") ~~ "string" is more honest, isn't it?
>
> Or as in Harmonia:
>
> string s = ....
> bool r = s.like("str*");

That doesn't give the match results, though.


>> Sure. Create your own MyReg object, and use it like:
>>    MyReg("*.ext") ~~ filename
> But I want my own function for char[] ~~ char[] !

Consider overloading the '+' in '1+2'? To overload operators, one of the operands must be a user defined type.


> I don't understand why not allow this:
> bool opMatch(char[] a, char[] b) ?

For the same reason opAdd(int a, int b) is not allowed. Such a function would apply globally, all the library code will break, etc.
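
The comparison with `1+2` can be made concrete. A minimal sketch, assuming present-day D operator overloading syntax (`opBinary`, rather than the `opAdd`-style names used at the time of this thread); `Wrapped` and `sum` are hypothetical names.

```d
// Hypothetical user-defined type that carries the overload.
struct Wrapped
{
    int value;

    // Considered by the compiler because an operand is a struct.
    Wrapped opBinary(string op)(Wrapped rhs) const
        if (op == "+")
    {
        return Wrapped(value + rhs.value);
    }
}

// There is no way to write a free function that changes what `1 + 2`
// means: with two built-in operands no user overload is consulted.

int sum(int a, int b)
{
    // Wrapping the operands opts in to the overload for this expression only.
    return (Wrapped(a) + Wrapped(b)).value;
}
```

The same reasoning applies to `RegExp("string") ~~ "string"`: the wrapper type makes the custom behavior explicit at the call site instead of changing the meaning of every `char[] ~~ char[]` in library code.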


> BTW: Have you seen Nemerle and its way of meta-programming? http://nemerle.org/

I don't know anything about it. I'll take a look at the link.


February 18, 2006
On Sat, 18 Feb 2006 00:36:23 -0800, Andrew Fedoniouk <news@terrainformatica.com> wrote:
> "Regan Heath" <regan@netwin.co.nz> wrote in message
> news:ops45qq5rn23k2f5@nrage.netwin.co.nz...
>> On Fri, 17 Feb 2006 20:46:01 -0800, Andrew Fedoniouk
>> <news@terrainformatica.com> wrote:
>>> "Oskar Linde" <olREM@OVEnada.kth.se> wrote in message
>>> news:dt40sg$29nc$1@digitaldaemon.com...
>>>> Andrew Fedoniouk wrote:
>>>>
>>>>> 1) Will builtin RegExp increase minimal size of D executable?
>>>>> I mean if this executable is not using regexp at all.
>>>>
>>>> No. This was as far as I understood one of the considerations.
>>>>
>>>>> 2) Is it possible to override operator ~~ ?
>>>>
>>>> Yes. opMatch() and opNext().
>>>
>>> And what is this opNext() doing exactly?
>>> next sub-expression, next match from last position matched (/g) ?
>>>
>>>>
>>>>> 3) What is the main purpose of incorporating
>>>>> interprettable regexps in natively compileable language?
>>>>
>>>> To make regexps more accessible I guess. Makes D seem like an alternative to
>>>> scripting languages.
>>>
>>> ???
>>>
>>> An alternative to a scripting language can be another scripting language.
>>> An alternative to a natively compilable language can be another natively
>>> compilable language.
>>
>> I think you're thinking inside the box. :)
>> With the recent additions is it not possible to write scripts in D?
>
> I believe there is a sort of misunderstanding about what scripting is and
> why there are scripting (typeless) languages, compiled bytecode languages and
> compiled native languages.
> These three groups have their own niches. D as a compiled language will never
> reach the flexibility of e.g. prototype-based JavaScript or Ruby. There are just
> different definitions of flexibility for these groups: different and sometimes
> even orthogonal tasks.

I think there is some overlap, i.e. some scripting tasks do not require the flexibility you mention; instead the important factor may be one or more of:
 - how fast can I code the solution
 - how easily can I code the solution
 - how easily can I maintain the solution
 - how likely is my solution to contain bugs
 - how easy will it be to find those bugs

Assuming you're a D programmer and assuming the D std lib contains the tools to achieve your task, why not use D?

Regan


February 18, 2006
"Andrew Fedoniouk" <news@terrainformatica.com> wrote in message news:dt6ma6$1jt0$1@digitaldaemon.com...
> I believe there is a sort of misunderstanding about what scripting is and
> why there are scripting (typeless) languages, compiled bytecode languages and
> compiled native languages.
> These three groups have their own niches. D as a compiled language will never
> reach the flexibility of e.g. prototype-based JavaScript or Ruby. There are just
> different definitions of flexibility for these groups: different and sometimes
> even orthogonal tasks.

I agree. But I don't believe that there's anything special about scripting that makes it especially suited for regex, but regex is a large reason people use scripting languages.