January 30, 2006
Matthew wrote:
> "Walter Bright" <newshound@digitalmars.com> wrote in message news:drgk9o$22kr$1@digitaldaemon.com...
> 
>>"Charles" <noone@nowhere.com> wrote in message news:drg2mr$1hrg$1@digitaldaemon.com...
>>
>>>>        D has built-in reg-exp, a la Ruby, &&
>>>
>>>Is this really practical in a compiled language ?  Does the D community have
>>>a desperate need for built-in regexes ?
>>>
>>>Can you give me an example why built-in would win over a regex library ?
>>>( genuinely curious ).
>>
>>Not only that, but D's support for regex's is far superior to C++'s (see Eric Anderton's regex template library).
> 
> 
> So what? Who ever mentioned C++'s regex, which, fwiw, I rate very poorly. No-one in their right mind would do regex in C++ unless they had to. D's being better than C++'s is a furphy. (As for how good it is, I have no opinion, although I'm led to believe it's good.)
> 
> 
>>I don't see that Ruby has more than a trivial advantage here, if that.
> 
> 
> One man's trivia ...
> 
> My point is, D's casting about like a beached fish looking for someone/something to be better than. _If_ it wants to be considered as looking like an option for hard-core regex, then it needs to be as usable as Perl and Ruby in this respect. If it isn't then forget it. Can't bring a knife to a gun-fight - go to a knife fight instead.
> 
> If that's not what D wants to be, then what does it want to be? (This is not rhetorical, I'm genuinely interested in getting an update on this issue.) Whatever that may be, can you give me an update on what current/forthcoming features will give it the advantage in that arena?
> 
> 
> 

Sorry for my apparent ignorance, but what are regex'es used for?! Who uses them?! Can you give me an example scenario or something like that?
January 30, 2006
Hasan Aljudy wrote:

> Sorry for my apparent ignorance, but what are regex'es used for?! Who uses them?! Can you give me an example scenario or something like that?

http://www.regular-expressions.info/

"You can think of regular expressions as wildcards on steroids."

http://en.wikipedia.org/wiki/Regex

--anders
January 30, 2006
> >>"Charles" <noone@nowhere.com> wrote in message news:drg2mr$1hrg$1@digitaldaemon.com...
> >>
> >>>>        D has built-in reg-exp, a la Ruby, &&
> >>>
> >>>Is this really practical in a compiled language ?  Does the D community have
> >>>a desperate need for built-in regexes ?
> >>>
> >>>Can you give me an example why built-in would win over a regex library ?
> >>>( genuinely curious ).
> >>
> >>Not only that, but D's support for regex's is far superior to C++'s (see Eric Anderton's regex template library).
> >
> >
> > So what? Who ever mentioned C++'s regex, which, fwiw, I rate very poorly. No-one in their right mind would do regex in C++ unless they had to. D's being better than C++'s is a furphy. (As for how good it is, I have no opinion, although I'm led to believe it's good.)
> >
> >
> >>I don't see that Ruby has more than a trivial advantage here, if that.
> >
> >
> > One man's trivia ...
> >
> > My point is, D's casting about like a beached fish looking for someone/something to be better than. _If_ it wants to be considered as looking like an option for hard-core regex, then it needs to be as usable as Perl and Ruby in this respect. If it isn't then forget it. Can't bring a knife to a gun-fight - go to a knife fight instead.
> >
> > If that's not what D wants to be, then what does it want to be? (This is not rhetorical, I'm genuinely interested in getting an update on this issue.) Whatever that may be, can you give me an update on what current/forthcoming features will give it the advantage in that arena?
> >
> >
> >
>
> Sorry for my apparent ignorance, but what are regex'es used for?! Who uses them?! Can you give me an example scenario or something like that?

Don't apologise! Ignorance is just a gap in your brain waiting to be filled. ;-)

Regular Expressions are a very powerful syntax for expressing search strings, and reg ex libraries define mechanisms for applying that syntax and retrieving results. Here are some sample expressions from some of my scripts:


    if line =~ /^# include <#{projectName}\/#{projectName}\.h>/

This matches "line" against a format whereby the line begins (^ anchors the search at the start of the line) with "# include <" and then contains the string equal to the "projectName" variable, followed by "/", followed by the string equal to the "projectName", followed by ".h>". In other words it matches includes such as "# include <stlsoft/stlsoft.h>" and "# include <comstl/comstl.h>" but not "# include <comstl/error_functions.h>" nor "# include "stlsoft/stlsoft.h"".
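
A minimal, self-contained sketch of that match, for anyone who wants to try it. The `project_include?` helper and the sample lines are illustrative assumptions, not taken from the original scripts. The "/" and "." are escaped so the literal parses and matches exactly, and `Regexp.escape` guards against metacharacters in the project name:

```ruby
# Hypothetical helper: true when `line` includes the project's own root header.
def project_include?(line, project_name)
  # Escape the name so any regex metacharacters in it are matched literally.
  name = Regexp.escape(project_name)
  !!(line =~ /^# include <#{name}\/#{name}\.h>/)
end

puts project_include?('# include <stlsoft/stlsoft.h>', 'stlsoft')
puts project_include?('# include <comstl/error_functions.h>', 'comstl')
puts project_include?('# include "stlsoft/stlsoft.h"', 'stlsoft')
```

Only the first line is the project's root header in angle brackets, so it is the only one expected to match.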

        elsif line =~ /(^.*?Updated\:\W+)(.*)/
            new_lines << $1 + currentDate + "\n"

This matches any line containing the word "Updated:" followed by one or more non-word characters (\W, which here is typically whitespace), and returns two variables (in Ruby and Perl these are returned in the implicit variables $1 and $2) containing all the text up to and including the matched separator, and everything afterwards. The first of these is used to form a new line with the current date, which is then appended to the new_lines array.
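
The same capture-and-rebuild step as a runnable sketch. The `restamp` helper and the sample comment line are assumptions made for illustration; only the regular expression itself comes from the snippet above:

```ruby
# Hypothetical helper: replace whatever follows "Updated:" with a new date.
def restamp(line, current_date)
  if line =~ /(^.*?Updated\:\W+)(.*)/
    # $1 holds the text up to and including the separator; $2 holds the old date.
    $1 + current_date + "\n"
  else
    # Lines without an "Updated:" marker pass through unchanged.
    line
  end
end

print restamp(" * Updated: 12th January 2006\n", "30th January 2006")
print restamp(" * Created: 1st March 2002\n", "30th January 2006")
```

Note that everything after the separator is dropped, so this form suits lines where the date runs to the end of the line.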

Those are pretty rudimentary reg-exps, but that's about as far as I go. There's a lot more to learn, but I find this "level" affords one incredible productivity.

For example, using regexp and recls/Ruby, I can effect wholesale changes to the thousands of source files in my libraries with reasonably simple scripts. Examples include updating date/version, changing licence info, changing layout, fixing common bugs, etc. etc.
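
A rough sketch of what such a bulk edit might look like. It uses Ruby's standard `Dir.glob` as a stand-in for the file search, which is an assumption: this is not the recls API. The `restamp_files` helper and the sample file are likewise hypothetical, and the demo runs in a throwaway directory:

```ruby
require 'tmpdir'

# Hypothetical helper: restamp every "Updated:" line in files matching `pattern`.
# Dir.glob stands in for recls here (an assumption, not the recls API).
def restamp_files(pattern, current_date)
  Dir.glob(pattern).each do |path|
    text = File.read(path)
    # Keep the prefix up to the separator, replace the rest of that line.
    updated = text.gsub(/^(.*?Updated\:\W+).*$/) { "#{$1}#{current_date}" }
    File.write(path, updated) unless updated == text
  end
end

# Demo against a temporary directory so no real sources are touched.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'a.h'), "// Updated: 1st January 2005\nint x;\n")
  restamp_files(File.join(dir, '**', '*.h'), '30th January 2006')
  puts File.read(File.join(dir, 'a.h'))
end
```

Files are rewritten only when something actually changed, which keeps timestamps stable on untouched sources.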

HTH
January 30, 2006
Matthew wrote:
>>>>"Charles" <noone@nowhere.com> wrote in message
>>>>news:drg2mr$1hrg$1@digitaldaemon.com...
>>>>
>>>>
>>>>>>       D has built-in reg-exp, a la Ruby, &&
>>>>>
>>>>>Is this really practical in a compiled language ?  Does the D community have
>>>>>a desperate need for built-in regexes ?
>>>>>
>>>>>Can you give me an example why built-in would win over a regex library ?
>>>>>( genuinely curious ).
>>>>
>>>>Not only that, but D's support for regex's is far superior to C++'s (see
>>>>Eric Anderton's regex template library).
>>>
>>>
>>>So what? Who ever mentioned C++'s regex, which, fwiw, I rate very poorly.
>>>No-one in their right mind would do regex in C++ unless they had to. D's
>>>being better than C++'s is a furphy. (As for how good it is, I have no
>>>opinion, although I'm led to believe it's good.)
>>>
>>>
>>>
>>>>I don't see that Ruby has more than a trivial advantage here, if that.
>>>
>>>
>>>One man's trivia ...
>>>
>>>My point is, D's casting about like a beached fish looking for
>>>someone/something to be better than. _If_ it wants to be considered as
>>>looking like an option for hard-core regex, then it needs to be as usable as
>>>Perl and Ruby in this respect. If it isn't then forget it. Can't bring a
>>>knife to a gun-fight - go to a knife fight instead.
>>>
>>>If that's not what D wants to be, then what does it want to be? (This is not
>>>rhetorical, I'm genuinely interested in getting an update on this issue.)
>>>Whatever that may be, can you give me an update on what current/forthcoming
>>>features will give it the advantage in that arena?
>>>
>>>
>>>
>>
>>Sorry for my apparent ignorance, but what are regex'es used for?! Who
>>uses them?! Can you give me an example scenario or something like that?
> 
> 
> Don't apologise! Ignorance is just a gap in your brain waiting to be filled.
> ;-)
> 
> Regular Expressions are a very powerful syntax for expressing search
> strings, and reg ex libraries define mechanisms for applying that syntax and
> retrieving results. Here are some sample expressions from some of my
> scripts:
> 
> 
>     if line =~ /^# include <#{projectName}/#{projectName}.h>/
> 
> This matches "line" against a format whereby the line begins (^ anchors the
> search at the start of the line) with "# include <" and then contains the
> string equal to the "projectName" variable, followed by "/", followed by the
> string equal to the "projectName", followed by ".h>". In other words it
> matches includes such as "# include <stlsoft/stlsoft.h>" and "# include
> <comstl/comstl.h>" but not "# include <comstl/error_functions.h>" nor "#
> include "stlsoft/stlsoft.h""
> 
>         elsif line =~ /(^.*?Updated\:\W+)(.*)/
>             new_lines << $1 + currentDate + "\n"
> 
> This matches any line containing the word "Updated:" followed by one or more
> whitespace token, and returns two variables (in Ruby and Perl these are
> returned in the implicit variables $1 and $2) containing all the text up to
> and including the matched tokens, and everything afterwards. This is used to
> form a new line with the current date, which is then inserted into the
> new_lines array.
> 
> Those are pretty rudimentary reg-exps, but that's about as far as I go.
> There's a lot more to learn, but I find this "level" affords one incredible
> productivity.

Matthew,

Evidently you find std.regexp from Phobos to be unsatisfactory. Can you explain what's wrong with it (and exactly how the Ruby one is better)?
And what's wrong with boost.regex, for that matter.
Eric and I are developing a compile-time regexp parser, and it would help to know what the ideal would be. We can certainly get syntax that is quite close to the examples you've shown above... but the subtle details can be crucial.

eg, would this be acceptable?

char [] projectname;
...
if ( matches!("^#include<#{1}/#{1}.h>")(projectname, line) ) {

}

(not sure about the syntax for #{1} ).
which is eminently possible right now, or do we need to go further?

if (line.matches!("^#include<#{1}/#{1}.h>")(projectname) ){
}

or even something like:

if (matches!("^#include<#{projectname}/#{projectname}.h>")(line) ){
}

(this last one isn't currently possible, but I _think_ it could be done with a small language change -- would be easier to argue for it, if it was believed to be necessary).

January 30, 2006
Don Clugston wrote:

> 
> Evidently you find std.regexp from Phobos to be unsatisfactory. Can you explain what's wrong with it (and exactly how the Ruby one is better)?
> And what's wrong with boost.regex, for that matter.
> Eric and I are developing a compile-time regexp parser, and it would help to know what the ideal would be. We can certainly get syntax that is quite close to the examples you've shown above... but the subtle details can be crucial.
> 
> eg, would this be acceptable?
> 
> char [] projectname;
> ....
> if ( matches!("^#include<#{1}/#{1}.h>")(projectname, line) ) {
> 
> }
> 
> (not sure about the syntax for #{1} ).
> which is eminently possible right now, or do we need to go further?
> 
> if (line.matches!("^#include<#{1}/#{1}.h>")(projectname) ){
> }
> 
> or even something like:
> 
> if (matches!("^#include<#{projectname}/#{projectname}.h>")(line) ){
> }
> 
> (this last one isn't currently possible, but I _think_ it could be done with a small language change -- would be easier to argue for it, if it was believed to be necessary).
> 


My trivial opinion: this looks great, Don.  I don't think everything has to be in the language in order for it to compete with scripting languages like Ruby and Perl.  The best that can be done for D, at the moment, is to complete the powerful and flexible template system.

Walter has already decided D's fate; there's little to be done to change the foundations now; but there's still much that can be done to decorate the exterior.

The argument/debate phase for D feature adoption is rapidly grinding to a halt.

-JJR
January 30, 2006
Yea I feel your pain.  I've kind of given up actively plugging D, it's just taking so long.  Hopefully before 2009 ( C++0x ) we will have a 1.0.

I'd like to see the compile-time regex lib before I say yes or no to built-in regexes, but I guess built-in couldn't really hurt the language.

On a side note I read that article in one of the other posts comparing ruby/c++/python/java -- and after looking at the source code for the examples, ruby looks uber!  I definitely want to give it a try now -- having retired perl long ago.



"Matthew" <matthew@stlsoft.com> wrote in message news:drkd1t$1tnk$1@digitaldaemon.com...
> >>         D has built-in reg-exp, a la Ruby, &&
> >
> > Is this really practical in a compiled language ?
>
> I'm not an expert, but I can see no practical impediment. Obviously there's the theoretical issue whereby one likes to avoid gratuitous mixing of language and libraries. I am usually a very strong advocate against, but in this case would make an exception because of the incredible utility.
>
> >  Does the D community have
> > a desperate need for built-in regexes ?
> >
> > Can you give me an example why built-in would win over a regex library ?
> > ( genuinely curious ).
> >
>
> Look, it's like this. I use the 2000 boot of my Windows machine in preference to the XP for basically one reason: I can select text from command boxes without first having to do Alt-Space E K, which I find incredibly tiresome when I've already been inconvenienced by having to use the damn mouse. The other influencing factor is the fact that XP is a steaming heap of omni-crashing shite, but it's really just the usability of the command-box. I know it's crazy, but it's the truth.
>
> By the same token, I always reach for Ruby in preference to Python or Perl. Its built-in regex means it pantses Python every time, and the fact that I can read it and write extensions (recls/Ruby, OpenRJ/Ruby, and other proprietary ones) for it (which I've thus far found impossible to do for Perl) rules out Perl. I don't care how much more comprehensive Python's libraries are. Anytime I find a missing Ruby library, even if Python has it, I'll write one or write an extension, just so I can keep using that built-in regex and recls/Ruby.
>
> So, if D wants to be considered a viable alternative to Python and Ruby, then I believe it needs built-in regex. If it doesn't, then it doesn't. Simple as that. Since it seems like D's not yet decided what it wants to be, or who it wants to serve (apart from people interested in language design and compiler implementation, of course), then the potential of being a scripting alternative raises this point. If that's not a viable option, then so be it. It can't be everything to everyone. It's just that as such it will, just like C/C++/Java/.NET, always be a poor cousin to Ruby and Perl for hardcore regex processing.
>
> As I said in another thread the other day, the lack of libraries/track record/tools/v1.1 rules out D for things I'd use C/C++/.NET/Java for, so it's just not a competitor there. It's not even an issue. (I can't comment on when/if that'll change, as I've not been around enough, but I'd be interested to hear if/when people think it will ever reach this point.) Second, it's not, as yet, suitable as an alternative for scripting for most things I, for one, write scripts for.
>
> Unfortunately, at least for me, the things that D _would_ be useful for, writing small utilities with short-medium term lifespans, are also not an option because Walter's not updated std.recls in Phobos since ~2003/4, and I use recls in just about every such utility I write, meaning that they're always done in C++ or in Ruby and never in D. I consider this to be a real shame, even if, perhaps, that's only for me. There was a time, a couple of years ago, when I used D a lot for such things, but now it's barely more than a passing fancy - there's just nothing useful that I can use D for any more. I'm hoping (1) that someone will come along and take Phobos out of Walter's hands and make it into something coherent and useful, and (2) that I can squeeze the time this year to do some DTL (and to include some of that in the next volume of my book, which I'll be starting work on in Mar/Apr). Absent such changes, I guess I'll continue to be a part-timer. ;-/
>
> Matthew
>
> P.S. Sorry to sound so negative. I still have high hopes for D, just my glass-half-empty side is beginning to "Show me the money" on all my non-paying activities.
>
>


January 30, 2006
John Reimer wrote:
> Don Clugston wrote:
> 
>>
>> Evidently you find std.regexp from Phobos to be unsatisfactory. Can you explain what's wrong with it (and exactly how the Ruby one is better)?
>> And what's wrong with boost.regex, for that matter.
>> Eric and I are developing a compile-time regexp parser, and it would help to know what the ideal would be. We can certainly get syntax that is quite close to the examples you've shown above... but the subtle details can be crucial.
>>
>> eg, would this be acceptable?
>>
>> char [] projectname;
>> ....
>> if ( matches!("^#include<#{1}/#{1}.h>")(projectname, line) ) {
>>
>> }
>>
>> (not sure about the syntax for #{1} ).
>> which is eminently possible right now, or do we need to go further?
>>
>> if (line.matches!("^#include<#{1}/#{1}.h>")(projectname) ){
>> }
>>
>> or even something like:
>>
>> if (matches!("^#include<#{projectname}/#{projectname}.h>")(line) ){
>> }
>>
>> (this last one isn't currently possible, but I _think_ it could be done with a small language change -- would be easier to argue for it, if it was believed to be necessary).
>>
> 
> 
> My trivial opinion: this looks great, Don.  I don't think everything has to be in the language in order for it to compete with scripting languages like Ruby and Perl.  The best that can be done for D, at the moment, is to complete the powerful and flexible template system.

The language change I'm thinking of is the __identifier() keyword, and nothing to do with regular expressions.
Essentially identical to the one in recent versions of MSVC, where it is used to allow you to have identifier names with the same names as keywords. eg
int __identifier("abstract") = 2 ;
creates an integer called abstract -- even though "abstract" is normally a reserved word. In C++, it's not very exciting.
But in D, with the existing template system, this would allow all kinds of crazy stuff, like extracting variable names from strings as in the last example above. Would help with compile-time reflection, too.
I'm trying to come up with some good use cases to justify it, and prove that the idea would work (I fear that the case above would require a mixin, which would defeat the purpose).

> Walter has already decided D's fate; there's little to be done to change the foundations now; but there's still much that can be done to decorate the exterior.

> 
> The argument/debate phase for D feature adoption is rapidly grinding to a halt.
> 
> -JJR
January 30, 2006
Don Clugston wrote:

> The language change I'm thinking of is the __identifier() keyword, and nothing to do with regular expressions.
> Essentially identical to the one in recent versions of MSVC, where it is used to allow you to have identifier names with the same names as keywords. eg
> int __identifier("abstract") = 2 ;
> creates an integer called abstract -- even though "abstract" is normally a reserved word. In C++, it's not very exciting.
> But in D, with the existing template system, this would allow all kinds of crazy stuff, like extracting variable names from strings as in the last example above. Would help with compile-time reflection, too.
> I'm trying to come up with some good use cases to justify it, and prove that the idea would work (I fear that the case above would require a mixin, which would defeat the purpose).


I'm almost to the point where I would love to see anything added that improves compile-time reflection in D.  It's one area that I think could really strengthen D's position in multiple aspects.

From my limited point of view, your suggestion seems reasonable (although I dislike any keyword that must be prefixed with __).

Please continue to make these useful suggestions, Don! :)

You're getting your way more than most of us! Must be more of that ninjitsu. ;)

-JJR
January 30, 2006
On Mon, 30 Jan 2006 17:47:31 +1100, Matthew <matthew@stlsoft.com> wrote:
> Look, it's like this. I use the 2000 boot of my Windows machine in
> preference to the XP for basically one reason: I can select text from
> command boxes without first having to do Alt-Space E K, which I find
> incredibly tiresome when I've already been inconvenienced by having to use
> the damn mouse. The other influencing factor is the fact that XP is a
> steaming heap of omni-crashing shite, but it's really just the usability of the command-box. I know it's crazy, but it's the truth.

Have you tried "QuickEdit mode"?
(left-click the upper left corner of a cmd window, choose properties, choose options, tick "QuickEdit mode")

Now you can select and paste with the mouse, left-click-drag highlights, right-click copies (deselects) and then pastes (multiple times). I've come to like it, despite having to remove my right hand from the keyboard to do it.

Regan
January 30, 2006
Regan Heath wrote:
> On Mon, 30 Jan 2006 17:47:31 +1100, Matthew <matthew@stlsoft.com> wrote:
>> Look, it's like this. I use the 2000 boot of my Windows machine in
>> preference to the XP for basically one reason: I can select text from
>> command boxes without first having to do Alt-Space E K, which I find
>> incredibly tiresome when I've already been inconvenienced by having to use
>> the damn mouse. The other influencing factor is the fact that XP is a
>> steaming heap of omni-crashing shite, but it's really just the usability of the command-box. I know it's crazy, but it's the truth.
> 
> Have you tried "QuickEdit mode"?
> (left-click the upper left corner of a cmd window, choose properties, choose options, tick "QuickEdit mode")
> 
> Now you can select and paste with the mouse, left-click-drag highlights, right-click copies (deselects) and then pastes (multiple times). I've come to like it, despite having to remove my right hand from the keyboard to do it.

Wow, I'd never seen this option for some reason.  Makes my life much easier.  Thanks!


Sean