February 16, 2005
Matthew wrote:
> I'm really surprised that D's above Ruby.
> 
> All we need now is to use that reserved $ for built-in regex, and D will be on its way to top place. :-)

Man, I'm proud of D  !!!!!!!!!

Thanks, Walter, for such a wonderful language!
And thanks, Matthew, for bringing up the regexp issue!

Think about it, it's only two days since then, and already
we have figured out how to smoothly incorporate regexps
into D!

Both syntax wise, and (I guess, and hope) also smoothly
for Walter, who has to do the hard work.

As the regexp issue stands now, I see awesome power in it.
If this doesn't take us way higher than #29, I'll eat
quiche for a month!

---------

You know, today I had a business meeting. I got there
late, and couldn't even bring myself to the issues at
hand. All I went on about was how we now have regexps
in D, and how easy it was -- all it took was a bit of
serious thinking.

Some of the audience actually got excited.   :-)

There is something right in D!
February 16, 2005
Georg Wrede wrote:
> Matthew wrote:
> 
>> I'm really surprised that D's above Ruby.
>>
>> All we need now is to use that reserved $ for built-in regex, and D will be on its way to top place. :-)
> 
> 
> Man, I'm proud of D  !!!!!!!!!
> 
> Thanks, Walter, for such a wonderful language!
> And thanks, Matthew, for bringing up the regexp issue!
> 
> Think about it, it's only two days since then, and already
> we have figured out how to smoothly incorporate regexps
> into D!
> 
> Both syntax wise, and (I guess, and hope) also smoothly
> for Walter, who has to do the hard work.
> 
> As the regexp issue stands now, I see awesome power in it.
> If this doesn't take us way higher than #29, I'll eat
> quiche for a month!
> 
> ---------
> 
> You know, today I had a business meeting. I got there
> late, and couldn't even bring myself to the issues at
> hand. All I went on about was how we now have regexps
> in D, and how easy it was -- all it took was a bit of
> serious thinking.
> 
> Some of the audience actually got excited.   :-)
> 
> There is something right in D!

I would be excited too but...

Did I miss a post?  Did Walter endorse the idea already, or say he was working on implementing it in D?

Otherwise, we can't celebrate yet! :-P

February 16, 2005

John Reimer wrote:
> Did I miss a post?  Did Walter endorse the idea 
> already or say he was in the works of implementing
> it in D?
> 
> Otherwise, we can't celebrate yet! :-P

Hmm. You're right!

However, even if he doesn't, I'm still going to raise
a glass of champagne this Friday for D! Merely being
able to figure out a smooth way to do it, is something
a bunch of other languages don't offer!!   :-P
February 16, 2005
On Thu, 17 Feb 2005 00:26:12 +0200, Georg Wrede wrote:

> 
> 
> John Reimer wrote:
>> Did I miss a post?  Did Walter endorse the idea
>> already or say he was in the works of implementing
>> it in D?
>> 
>> Otherwise, we can't celebrate yet! :-P
> 
> Hmm. You're right!
> 
> However, even if he doesn't, I'm still going to raise
> a glass of champagne this Friday for D! Merely being
> able to figure out a smooth way to do it, is something
> a bunch of other languages don't offer!!   :-P

He he... true.  Never hurts to be merry about such things. :-)

February 16, 2005
"Georg Wrede" <georg.wrede@nospam.org> wrote in message news:42125F89.2070109@nospam.org...
>
>
> John Reimer wrote:
>> Georg Wrede wrote:
>>
>>>>>>>>>>  if line =~ /unquoted-regex-pattern-here/
>>>>>>>>>>      gr1 = $1
>>>>>>>>>>      gr2 = $2
>>>>>>>>>>      etc ...
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Just off hand, suggesting this syntax for D scares the living daylights out of me.
>>>
>>>
>>>
>>> if file1line =~ /unquoted-regex-pattern-here/
>>> mystring = $3 ~ "bla" ~ $5;
>>> if ($3<$8 && $9==$1 || $3>$2 && $5=<($2 ~ foo)
>>>     mystring ~= $9 ~ "xx" ~ $3;
>>> else {
>>>     if file2line =~ /unquoted-regex-pattern-here/
>>>     mystring ~= $2 ~ $3 ~ " got out of hand!";
>>> }
>>> mystring ~= "\nAnd he never knew why.";
>>
>>
>> Actually, it doesn't look all that bad.
>
> Yeah, but there is one bug already in this code, made by
> the "author" (i.e. me, writing the code). And there is
> another that'll hit him the second he starts writing the
> rest (of this imagined example).
>
> My problem is that neither you, nor Matthew, noticed it.
> So probably the average user, or the next guy won't either.
>
> Thus, I'm not happy with the syntax.

I wasn't looking at it in that way. I don't look at _any_ code on ng posts for bugworthiness. That's what I do in code reviews, or in development of code, or when determining whether or not a particular library has enough merit to bother unzipping it. On ngs, I assume we're talking 'concept' and glimpse code for its feel.

Hence, I wouldn't read anything into my having not seen the bug, since I didn't even delve into what the code did. I took it in the spirit of the context of the discussion at that time.

> ---------
>
> D can brag about being very easy to understand. Just browsing
> someone's code gives you immediately an idea of what's
> going on. But the above code takes a pencil and paper,
> and peace and quiet, before one can figure it out.
>
> I really see worms and anacondas taking over, if we
> unlock the door and let Larry Wall in.

I'm sure Regan would be better able to give a definitive logical objection here, but it seems like the conclusion you've drawn has had at least one hopeful leap from the premise of earlier in the post.



February 17, 2005
"John Reimer" <brk_6502@yahoo.com> wrote in message news:cuuoes$2c4h$1@digitaldaemon.com...
> Georg Wrede wrote:
>>
>>
>> John Reimer wrote:
>>
>>> Georg Wrede wrote:
>>>
>>>>>>>>>>>  if line =~ /unquoted-regex-pattern-here/
>>>>>>>>>>>      gr1 = $1
>>>>>>>>>>>      gr2 = $2
>>>>>>>>>>>      etc ...
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Just off hand, suggesting this syntax for D scares the living daylights out of me.
>>>>
>>>>
>>>>
>>>>
>>>> if file1line =~ /unquoted-regex-pattern-here/
>>>> mystring = $3 ~ "bla" ~ $5;
>>>> if ($3<$8 && $9==$1 || $3>$2 && $5=<($2 ~ foo)
>>>>     mystring ~= $9 ~ "xx" ~ $3;
>>>> else {
>>>>     if file2line =~ /unquoted-regex-pattern-here/
>>>>     mystring ~= $2 ~ $3 ~ " got out of hand!";
>>>> }
>>>> mystring ~= "\nAnd he never knew why.";
>>>
>>>
>>>
>>> Actually, it doesn't look all that bad.
>>
>>
>> Yeah, but there is one bug already in this code, made by
>> the "author" (i.e. me, writing the code). And there is
>> another that'll hit him the second he starts writing the
>> rest (of this imagined example).
>>
>> My problem is that neither you, nor Matthew, noticed it.
>> So probably the average user, or the next guy won't either.
>>
>> Thus, I'm not happy with the syntax.
>>
>> ---------
>>
>> D can brag about being very easy to understand. Just browsing
>> someone's code gives you immediately an idea of what's
>> going on. But the above code takes a pencil and paper,
>> and peace and quiet, before one can figure it out.
>>
>> I really see worms and anacondas taking over, if we
>> unlock the door and let Larry Wall in.
>
> Frankly, I didn't look at this as an exercise in bug-hunting.  I thought the point of the post was to show how ugly the $ sign notation was. Thus my response to the contrary.  So the fact that I didn't notice the problem is of no relevance.
>
> I'm not too surprised that you could make a buggy example out of a not-yet-defined D syntax.  Someone else could make a non-buggy example too, I'm sure, with a nice, clear presentation.
>
> I was just musing that, aesthetically, it looked tolerable.  This doesn't rule out the possibility of a better way, of course; and, by far, I'm in no position to offer any expert opinion on what that way might be.
>
> Matthew on the other hand...

... what? :-)

If you mean expert, I'm not. This whole sub-thread came from my experience in doing lots of regexp in Perl, Python and Ruby and some in C++ and D. There's no expertise, no theoretical basis to what I say, just the facts (from my pov) of my experience.

Just this morning I was able to come back to Ruby from Python (this is because I didn't need XML and I _did_ need recls; the recls/Python mapping's not done yet), and it was like that wonderful feeling of getting home after a long hard day in a client's office, debating the relative merits of whether we should go to production on the basis of unit testing only.

I'm not saying "D must have built-in regex", nor am I saying that "D's built-in regex must use Perl/Ruby syntax for the matched groups", but I am saying that "I believe that if D has built-in regex it will be more generally useful, and therefore more popular/successful, than if it does not".

If we don't have $1, $2, but we do have, say (thread-specific) _1, _2 etc., then I'd be content with that.

As for the buggy code, I'd say it's a furphy in the extreme. Again, and this is only me speaking from my own experience, I've found the built-in way is _far_, _far_, _far_ (three times - must be a politician) easier to get correct first time than when having to use a function-call-based approach.



February 17, 2005
> Context and commenting are what REALLY matter, even in complicated examples.  We could keep trading examples and get nowhere.  The fact is, this:
>
> if (url ~= /^http:\/\/(.+)$/)
>    url = $1;
>
> Is just as self explanatory as:
>
> for (int i = 0; i < url.length; i++)

Agreed.

>
> If it's not, the person at fault is not the programmer, but rather the reader.  I'm sorry if that sounds harsh.

Agreed, in this case. Because regexp is a very widely recognised, valued, and understood para-language. If anyone's using regexp _in any language_, then they need to understand it. The myriad possible configurations of squiggles that go to make up the expression make any confusion of use of $n irrelevant. I can truthfully say I've _never_ experienced a single moment of confusion about the use of matched groups in regexp-client code; of course that cannot be said wrt the expressions themselves (they're a daily cause for crying).

>  However, it should of course be commented when at all necessary:
>
> // Strip off the http:// because we know it's HTTP.
> if (url ~= /^http:\/\/(.+)$/)
>    url = $1;
>
> If you really think this is MORE readable and MORE self explanatory:
>
> // Strip off the http:// because we know it's HTTP.
> Parameters parameters;
> if (new Regexp(`^http://(.+)$`, "").test(url, parameters))
> {
>    with (parameters)
>       url = p1;
> }
>
> Or, perhaps:
>
> // Strip off the http:// because we know it's HTTP.
> HTTPStripper parameters;
> if (new Regexp(`^http://(.+)$`, "").test(url, parameters))
>    url = parameters.urlWithoutHttp;

These are less clear, to be sure. But they're not 'unclear'.

Indeed, the with() idea does help with a lot of valid objections to the global nature of matched groups.
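
For what it's worth, here is a minimal sketch of how the with() variant might look as runnable code. It assumes present-day Phobos `std.regex` (`regex`, `matchFirst`) rather than the `Regexp` class in the quoted example, and the `Groups` struct, `tryMatch` and `stripHttp` are hypothetical names made up for illustration, not an existing API.

```d
import std.regex : matchFirst, regex;

// Hypothetical holder for the matched groups (stands in for "Parameters").
struct Groups
{
    string p0; // the whole match
    string p1; // first parenthesised group
}

// Hypothetical helper: fills `g` and returns true when the pattern matches.
bool tryMatch(string input, string pattern, out Groups g)
{
    auto m = matchFirst(input, regex(pattern));
    if (m.empty)
        return false;
    g.p0 = m[0];
    g.p1 = m.length > 1 ? m[1] : null;
    return true;
}

// Strip off the http:// because we know it's HTTP.
string stripHttp(string url)
{
    Groups groups;
    if (tryMatch(url, `^http://(.+)$`, groups))
    {
        with (groups)
            url = p1; // p1 is visible only inside this with() scope
    }
    return url;
}
```

The groups live in a local variable, so nothing is global; the price is the extra declaration and the function call.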

> Then we must simply think differently.  For one thing, the second example looks more complicated and "scary".  What's this "HTTPStripper" thing?  Gotta go look it up... oh, it's just a struct with fullMatch and urlWithoutHttp in it.  What's fullMatch?  Doesn't appear to be used... err... weird... I don't understand it.  That's what people reading my code will say to the last two examples.

For my part, I can live with with(), but the non-built-in-ness of the regexp itself is not attractive.

> Then again, I'm more of an open source guy.  I want my code to be fast, efficient, clean, understandable, structured, and most importantly commented.  I don't think adding identifiers or using longer names where completely unnecessary (e.g. when the variable is only used the once and a comment can make it 100% clear) achieves any of the above goals.

This is all a bit off point, methinks.

> Yes, it's just syntactical sugar, perhaps, but arguing against it because it's less readable... well, that's like saying driving a car (whatever your age) should be illegal because doing so can kill people. Okay, that's true, but should it be illegal?


Please put the top back on that worm can immediately! :-)



February 17, 2005
> No; the real issue is with respect to global variables, thread-locals, encapsulation, and whatever else one wishes to throw into that basket. The fact is that match() is a function emitting a (potential) set of sub-matches, an indication of success, and an index into the matched content. In other words, it is a multi-return function.
>
> This thread was started with a suggestion to give that particular function very, very, special standing within D by making its multi-return values inherent within the language spec. I think that's overlooking what D can support right now for such cases, and it would introduce a raft of other issues related to that lack of encapsulation noted previously.
>
> I'm really not even vaguely interested in debating syntactical or stylistic preferences with anyone. What I will bitch and whine about is a notion to introduce special syntax into the language (along with a bunch of potential scoping, liveness, and concurrency problems) when the functionality can be supported perfectly well with the language as it is, without creating further special-case scenarios.
>
> Further, given the 'positive' argument for a built-in syntax, there's been a conspicuous absence of resolution for the lack of encapsulation and so on.
>
> Sure, one can argue all day long about how some built-in syntax for such a notion can be made to look more attractive (or more cryptic) than leveraging the existing capabilities. But why bother? If the primary purpose of D were to be a regex language, then perhaps there'd be a point to all this. Yet that is simply not the case.
>
> Thus, I feel we should look for a way to make RegExp as 'acceptable' to all concerned, whilst taking advantage of what we already have with the current language constructs. If you don't think that's even possible or desirable, then we should abate right now :-)
>
> What are your thoughts, Matthew?

There are several comments one would make:

- given that D already has an uncomfortable (to some at least) coupling
between language and library (Object.print(), anyone?), one might say
that one more _worthy_ case should be considered
- there's a general principle - often ascribed to B.S. himself - that
anything that can be handled in a library has no business being in the
language. This is a very wise principle.
- there's the practical fact that using regex in Python is a pain, and
in C++ it's downright torture. (Same can be said of D, IMO and IMLX)

so, I'm somewhat straddling the spikey spikes here:

    I'm coming down on the side of "with()" for scoping the variables,
since they must, at worst, be thread-local. Even when thread-local, one
must be very careful about re-entrancy. (What if D evolves - as I think
it should - to allow custom implicit toString() stuff and inside such a
toString() another regex is used. Nasty!)

    However, I'm also informed by (as I acknowledge, my (not im)partial)
experience of how disheartening doing regexp in anything other than Perl
& Ruby is.

Maybe my position is something like:


    - in principle, language and library should always be maximally
independent. However, given that they are not, and cannot be, in D, we
should have our minds ever so slightly open to VERY IMPORTANT CASES.
    - assuming for the moment that we agree that regex is a VERY
IMPORTANT CASE, then we must ensure that use of the regexp is maximally
convenient, maximally conventional, while not betraying D's aims of
being robust, thread-safe, programmer protective (albeit not programmer
'cossetive' / responsibility abrogative), efficient, maintainable, etc.
    - this means that D *cannot* use process-local global variables for
pattern matches. Furthermore, thread-local globals are still not safe
from intra-thread re-entrancy. That being the case, the issue of using
$1 syntax is moot, so we most certainly should use with().
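
To make the re-entrancy worry concrete, here is a rough sketch, not a proposal. It assumes present-day Phobos `std.regex` (`regex`, `matchFirst`) and present-day D, where module-level variables are thread-local by default. The variable `group1` is a hypothetical stand-in for a built-in `$1`, and `matchGlobal`, `describe`, `reentrancyDemo` and `scopedDemo` are invented names.

```d
import std.regex : matchFirst, regex;

// Hypothetical stand-in for a built-in $1: one slot per thread.
string group1;

// Hypothetical matcher that publishes its first group through the global.
bool matchGlobal(string input, string pattern)
{
    auto m = matchFirst(input, regex(pattern));
    if (m.empty || m.length < 2)
        return false;
    group1 = m[1];
    return true;
}

// Some helper (think: a toString()) that happens to use a regex internally.
string describe(string s)
{
    if (matchGlobal(s, `^(\w+)`))
        return "word:" ~ group1;
    return "none";
}

// Same thread, nested use: the inner match clobbers the outer group.
string reentrancyDemo()
{
    string result;
    if (matchGlobal("http://digitalmars.com", `^http://(.+)$`))
    {
        auto label = describe("hello world"); // overwrites group1
        result = label ~ " " ~ group1;        // no longer the URL part
    }
    return result;
}

// Scoped alternative: the outer match result is a local value.
string scopedDemo()
{
    auto m = matchFirst("http://digitalmars.com", regex(`^http://(.+)$`));
    if (m.empty)
        return "";
    auto label = describe("hello world");     // cannot touch m
    return label ~ " " ~ m[1];
}
```

Here `reentrancyDemo()` yields "word:hello hello" while `scopedDemo()` yields "word:hello digitalmars.com". Thread-locality does nothing for the first case, which is why the groups need a scope of their own.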

At this point I think rather than making any further half-arsed prognostications, I should maybe go and play with D's existing regexp more, and identify the specific things that turned me off it so much in the past. Then maybe that'll inform, at least for me, on any future ideas.

Derek the D i t h e r e d




February 17, 2005
On Thu, 17 Feb 2005 10:55:05 +1100, Matthew <admin@stlsoft.dot.dot.dot.dot.org> wrote:
> "Georg Wrede" <georg.wrede@nospam.org> wrote in message
> news:42125F89.2070109@nospam.org...
>>
>>
>> John Reimer wrote:
>>> Georg Wrede wrote:
>>>
>>>>>>>>>>>  if line =~ /unquoted-regex-pattern-here/
>>>>>>>>>>>      gr1 = $1
>>>>>>>>>>>      gr2 = $2
>>>>>>>>>>>      etc ...
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Just off hand, suggesting this syntax for D scares the
>>>>>>>>>> living daylights out of me.
>>>>
>>>>
>>>>
>>>> if file1line =~ /unquoted-regex-pattern-here/
>>>> mystring = $3 ~ "bla" ~ $5;
>>>> if ($3<$8 && $9==$1 || $3>$2 && $5=<($2 ~ foo)
>>>>     mystring ~= $9 ~ "xx" ~ $3;
>>>> else {
>>>>     if file2line =~ /unquoted-regex-pattern-here/
>>>>     mystring ~= $2 ~ $3 ~ " got out of hand!";
>>>> }
>>>> mystring ~= "\nAnd he never knew why.";
>>>
>>>
>>> Actually, it doesn't look all that bad.
>>
>> Yeah, but there is one bug already in this code, made by
>> the "author" (i.e. me, writing the code). And there is
>> another that'll hit him the second he starts writing the
>> rest (of this imagined example).
>>
>> My problem is that neither you, nor Matthew, noticed it.
>> So probably the average user, or the next guy won't either.
>>
>> Thus, I'm not happy with the syntax.
>
> I wasn't looking at it in that way. I don't look at _any_ code on ng
> posts for bugworthiness. That's what I do in code reviews, or in
> development of code, or when determining whether or not a particular
> library has enough merit to bother unzipping it. On ngs, I assume we're
> talking 'concept' and glimpse code for its feel.
>
> Hence, I wouldn't read anything into my having not seen the bug, since I
> didn't even delve into what the code did. I took it in the spirit of the
> context of the discussion at that time.
>
>> ---------
>>
>> D can brag about being very easy to understand. Just browsing
>> someone's code gives you immediately an idea of what's
>> going on. But the above code takes a pencil and paper,
>> and peace and quiet, before one can figure it out.
>>
>> I really see worms and anacondas taking over, if we
>> unlock the door and let Larry Wall in.
>
> I'm sure Regan would be better able to give a definitive logical
> objection here, but it seems like the conclusion you've drawn has had at
> least one hopeful leap from the premise of earlier in the post.

I'll post my link again:
http://www.infidels.org/news/atheism/logic.html

It describes the form of a deductive argument using boolean logic, how/when it's useful and how to go about it in addition to common fallacies in logic.

In this case it sounds like Matthew disagrees with an earlier premise or the inference from an earlier premise to a later one.

Regan
February 17, 2005
"Ant" <Ant_member@pathlink.com> wrote in message news:cuvje9$77s$1@digitaldaemon.com...
> In article <cuo83m$1mjn$1@digitaldaemon.com>, Walter says...
> >
> >http://www.tiobe.com/tpci.htm
> >
> >We've risen to number 29!
> >
>
> Doesn't anyone else find it strange that D is higher than Objective-C?

Not me. Back in the ancient times, when dinosaurs and C ruled, I was looking for a way to set my C compiler (Datalight C) apart from the crowd. I ran across Objective-C by Stepstone and C++ by AT&T. Both were used about equally, and there was fierce debate about which one was going to be the future.

But Stepstone wanted royalties for anyone wanting to do Objective-C. I contacted AT&T, and they graciously said I could implement a C++ compiler, call it C++, and not pay royalties to AT&T. (They also thanked me for being the only one who ever bothered to even ask!)

That settled it for me, C++ was the future.