February 16, 2006
Oskar Linde wrote:
> Sean Kelly wrote:
> 
>> Hold on.  Walter, can you explain this injection business a bit?  For
>> example, the effect here seems clear:
>>
>> if( "x" ~~ "y" ) {
>>      _match.blah;
>> }
>>
>> But what about this:
>>
>> if( "x" ~~ "y" && "y" ~~ "z" ) {
>>      _match.blah;
>> }
>>
>> And this:
>>
>> if( "x" ~~ "y" || "y" ~~ "z" ) {
>>      _match.blah;
>> }
> 
> Those are AndAndExpression and OrOrExpression and will not inject anything.
> Only a pure if(MatchExpression) injects anything.

Very weird.  So a MatchExpression by itself has a boolean result but injects a value into the following scope?

>> Also, with respect to the above proposal, how might this work:
>>
>> int numStudents();
>> float avgGrade();
>>
>> if( numStudents() < 10 || avgGrade() > 50.0 ) {
>>
>> }
> 
> In this case, $ would always refer to the value of (numStudents() < 10 ||
> avgGrade() > 50.0), which is bool and must always be true. (It would be
> interesting to change the || expression into returning the left value if it
> is nonzero and the right value otherwise, without converting anything to
> bool, but I'm not fully sure what implications that would have...)

So based on the above, your suggestion would only be useful for single call expressions:

if( numStudents() )
    printf( "%i students\n", $.whatever );

Seems reasonable I suppose.
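
For comparison, current D compilers accept a declaration inside the `if` condition, which gives a named version of what `$` would provide in the single-call case. This is only a minimal sketch: the `numStudents()` body below is a hypothetical stub returning a fixed count, and nothing here uses the proposed `$` syntax itself.

```d
import std.stdio;

// Hypothetical stub standing in for the numStudents() declared above.
int numStudents()
{
    return 7;
}

void main()
{
    // The condition's value is bound to `n`, which is visible only inside
    // the then-branch; that branch runs when `n` is nonzero.
    if (auto n = numStudents())
        writefln("%d students", n);
    else
        writeln("no students");
}
```

As with the proposal, this only helps when the whole condition is the one value of interest; a compound `||` or `&&` condition would bind just the final boolean.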

>> While the result of each subexpression is actually boolean (just as in
>> the match expression above), the values we'd be interested in are the
>> integer and float.  But in the above example, the float might not be
>> evaluated at all.  I'd merely like to voice this as a qualifier to my
>> initial support of this idea above :-)
> 
> This is probably impossible. How would the compiler know what subexpressions
> are interesting and how would those be referred to?

That's fine.  I was merely trying to sort out the implications of this new feature.


Sean
February 16, 2006
Walter Bright wrote:
> D dramatically improves the convenience of string handling over C++. But while I think using the library std.regexp is straightforward, obviously it just isn't gaining traction. People like the shortcut approaches Ruby and Perl use for regular expressions, hence the new D match-expression support.
> 
> So, now we have:
> 
>     if (regular_expression ~~ string)
>     {
>             _match.pre
>             _match.post
>             _match.match(n)
>     }
> 
> Should we do some aliases:
> 
>     $` => _match.pre
>     $' => _match.post
>     $& => _match.match(0)
>     $n => _match.match(n)
> 
> ? Syntactic sugar is often a good idea, but at what point do they become cyclamates and cause cancer in laboratory animals? Will these $ tokens render D more accessible, but perhaps too unreadable? 
> 
> 

I haven't read this whole thread, so pardon me if this has already been suggested.
Why doesn't the regular expression stuff use foreach?

struct Match {
  short start, end;
}

foreach( Match m ; "[0-9]" ~~ mystring )
{
  writefln( "Found number:%s", mystring[m.start..m.end] );
}

Basically this implements a callback methodology for regexes, similar to:

void match( char[] regex, char[] str, bool delegate( Match m, char[] s )  dg );

Obviously this doesn't cover all cases, but I'm just curious why it isn't used.
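
The callback idea can be sketched with `opApply`, which is how D lets a struct drive a `foreach` body as a delegate. This is only an illustration under some assumptions: `Matches` is a hypothetical helper name, the matching is done with `std.regex` from current Phobos rather than the `~~` operator, and the indices are `size_t` instead of `short`.

```d
import std.regex;
import std.stdio;

// Half-open index range of one match within the searched string.
struct Match
{
    size_t start, end;
}

// Hypothetical helper: lets foreach walk every match of `pattern` in `text`.
struct Matches
{
    string pattern;
    string text;

    // foreach turns its body into the delegate `dg`; a nonzero result from
    // `dg` means the body left the loop early, so iteration stops.
    int opApply(scope int delegate(ref Match) dg)
    {
        foreach (c; matchAll(text, regex(pattern)))
        {
            // `pre` is everything in the input before the current hit.
            auto m = Match(c.pre.length, c.pre.length + c.hit.length);
            if (auto stop = dg(m))
                return stop;
        }
        return 0;
    }
}

void main()
{
    string mystring = "a1b22c";
    auto matches = Matches("[0-9]+", mystring);
    foreach (Match m; matches)
        writefln("Found number: %s", mystring[m.start .. m.end]);
}
```

A `break` inside the loop body works as expected, because `opApply` passes the delegate's nonzero result straight back to the caller.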

-DavidM
February 16, 2006
Sean Kelly wrote:

> Oskar Linde wrote:
>> 
>> Those are AndAndExpression and OrOrExpression and will not inject anything. Only a pure if(MatchExpression) injects anything.
> 
> Very weird.  So a MatchExpression by itself has a boolean result but injects a value into the following scope?

No, not boolean. A MatchExpression has a _Match* result. This result is what gets injected into the following scope. My suggestion is just a generalization of this.
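
A rough modern analogue of "the result is what gets injected" is shown below. It is a sketch, not the `~~` implementation: it uses `matchFirst` from `std.regex` in current Phobos, whose result is a captures object rather than a `_Match*`, and the input string is made up for the example.

```d
import std.regex;
import std.stdio;

void main()
{
    string line = "id=42;";

    // `m` holds the captures. It converts to true only when the pattern
    // matched, and it is visible only inside the if-body.
    if (auto m = matchFirst(line, regex(`[0-9]+`)))
    {
        writeln(m.pre);   // text before the match
        writeln(m.hit);   // the matched text
        writeln(m.post);  // text after the match
    }
}
```

The difference from the injected `_match` is that the name is chosen by the programmer, so nested or combined matches do not fight over one implicit variable.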

> So based on the above, your suggestion would only be useful for single call expressions:
> 
> if( numStudents() )
>      printf( "%i students\n", $.whatever );
> 

Yes.

/Oskar
February 16, 2006
kris wrote:
> Walter Bright wrote:
> 
>> I've been blasted for putting strings in the language (instead of as a library String class), for putting complex numbers in, and for associative arrays. I think the results speak for these being a success. If regex's are heavily used, then the extra sugar for them becomes worthwhile as well.
> 
> That's getting a bit off topic, isn't it? OK, I'll go with it:
> 
> I'm an advocate for getting regex support in the grammar, but I'm certainly not an advocate for tying Phobos to the compiler (RegExp has a notable resultant import set; because of this I refactored it for Ares and Mango).
> 
> Without a clearly defined means to decouple Phobos from the compiler, you're effectively erecting barriers for other solutions to clamber over (as Sean vaguely intimated earlier). What's missing from all this built-in stuff is a clean and documented means to have it supported outside of Phobos. After all, the compiler is injecting explicit references for AA code, utf conversion code, regex code, and a variety of other things. What's next?

I'm branching Ares before I check in this last block of changes.  In the new branch I'm simply going to move all necessary Phobos std code required into dmdrt/util and will plan to trim it down over time.  Not ideal, I know, but better than trying to play catch-up with heavily modified code such as the version of RegExp you provided.  For the rest, I agree completely, but then I've already said as much in d.D.announce :-)

>> Who uses regex in C++? Hardly anyone. I'm betting it's because using them sucks in C++, not because people don't use regex's.
> 
> Again, it's horses for courses. BTW, regex does not suck in C, so why C++ ?

The lack of a standard library component is a significant factor IMO, as are the widely divergent syntaxes supported by third-party libraries.  Personally, I haven't used regular expressions in D because I haven't needed to yet, not because they weren't a language feature.  But I can't help liking this being built-in from a language perspective, even if this is balanced by practical concerns.

>> I thought it fit in well with D's new capability of being runnable in a script-like fashion. If this opens up a reasonably broad new range of applications that D is a good fit for, that's good. I might be wrong, of course, as I've been with the bit data type (a complete botch). Match expressions don't break anything, were not expensive to implement, and the only way to see how they'll work out is to try them. 
> 
> I figured that was the motivation. The "cost" you speak of considers only how much effort it takes you to get the functionality into the compiler, test it a bit, document the usage, and respond to the flak ;-)
>
> BTW: perhaps it would be appropriate to deprecate bit[] before 1.0 and provide a nice library class/struct instead? You might even reuse the old code from Zortech/Zorland days.

If it helps, I'll send you a case of beer or something ;-)  But if there's universal agreement that packed bit arrays were a mistake then they need to be out pre-1.0 and broken code be damned.  I really don't want to see a 1.0 D release containing features that even the designer thinks should not exist.


Sean
February 16, 2006
Oskar Linde wrote:
> Sean Kelly wrote:
> 
>> Oskar Linde wrote:
>>> Those are AndAndExpression and OrOrExpression and will not inject
>>> anything. Only a pure if(MatchExpression) injects anything.
>> Very weird.  So a MatchExpression by itself has a boolean result but
>> injects a value into the following scope?
> 
> No, not boolean. A MatchExpression has a _Match* result. This result is what
> gets injected into the following scope. My suggestion is just a
> generalization of this.

Oh right.  And pointers can be implicitly evaluated as logical expressions.  Makes sense now.


Sean
February 16, 2006
Sean Kelly wrote on 2006-02-16:
> kris wrote:

[snip]

>> BTW: perhaps it would be appropriate to deprecate bit[] before 1.0 and provide a nice library class/struct instead? You might even reuse the old code from Zortech/Zorland days.
>
> If it helps, I'll send you a case of beer or something ;-)  But if there's universal agreement that packed bit arrays were a mistake then they need to be out pre-1.0 and broken code be damned.  I really don't want to see a 1.0 D release containing features that even the designer thinks should not exist.

What is the cost of keeping bit[] in the language?

Currently, every type - including void - can be used as the type of an array element. What would be the consequences for generic programming if T -> T[] isn't guaranteed to succeed?

Thomas



February 16, 2006
Thomas Kuehne wrote:
> Sean Kelly wrote on 2006-02-16:
>> kris wrote:
> 
> [snip]
> 
>>> BTW: perhaps it would be appropriate to deprecate bit[] before 1.0 and provide a nice library class/struct instead? You might even reuse the old code from Zortech/Zorland days.
>> If it helps, I'll send you a case of beer or something ;-)  But if there's universal agreement that packed bit arrays were a mistake then they need to be out pre-1.0 and broken code be damned.  I really don't want to see a 1.0 D release containing features that even the designer thinks should not exist.
> 
> What is the cost of keeping bit[] in the language?
> 
> Currently, every type - including void - can be used as the type of an
> array element. What would be the consequences for generic programming
> if T -> T[] isn't guaranteed to succeed?

The same as the problems with std::vector<bool> in C++ (though I don't have any specific references handy).  I think the true ramifications of this in D won't be completely apparent until the language has been in use a bit longer however.

One thought I had was to leave bit in place, perhaps deprecated, and add 'bool' as a non-packed but otherwise equivalent type.
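
To make the generic-programming concern a little more concrete, here is a small sketch. It rests on assumptions: `firstElement` is a hypothetical template, and `BitArray` from `std.bitmanip` in current Phobos stands in for a packed bit[]; it is not the built-in type being discussed.

```d
import std.bitmanip : BitArray;

// Generic code commonly assumes each element of T[] has its own address.
T* firstElement(T)(T[] items)
{
    return &items[0];
}

void main()
{
    bool[] flags = [false, true, false];
    *firstElement(flags) = true;   // fine: every bool occupies its own byte
    assert(flags[0]);

    // A packed representation stores eight flags per byte, so a single
    // element has no address of its own. BitArray's indexing therefore
    // returns the bit by value instead of a reference.
    auto packed = BitArray([false, true, false]);
    bool second = packed[1];
    assert(second);
}
```

This is the same shape of problem as `std::vector<bool>`: code written against "an array of T" quietly stops working when the elements cannot be addressed individually.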


Sean
February 17, 2006
"Ivan Senji" <ivan.senji_REMOVE_@_THIS__gmail.com> wrote in message news:dt1u3t$d97$1@digitaldaemon.com...
> I hope this isn't a bug that this works?

It's supposed to work <g>.


February 17, 2006
"Thomas Kuehne" <thomas-dloop@kuehne.cn> wrote ...
> Currently, every type - including void - can be used as the type of an array element. What would be the consequences for generic programming if T -> T[] isn't guaranteed to succeed?

Easy fix ~ change the bool alias to byte, instead of bit :-)


February 17, 2006
Kris wrote:
> "Thomas Kuehne" <thomas-dloop@kuehne.cn> wrote ...
>> Currently, every type - including void - can be used as the type of an
>> array element. What would be the consequences for generic programming
>> if T -> T[] isn't guaranteed to succeed?
> 
> Easy fix ~ change the bool alias to byte, instead of bit :-) 

I already use byte in some cases :-)  But it lacks the boolean value safety of bit, so I tend to litter my code with asserts just to be sure something didn't get screwed up... or simply make sure I'm only comparing to zero and not-zero.  Either way, it's more error prone than I'd like.
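
A tiny sketch of the kind of slip meant here, with made-up values: a byte used as a flag can hold something other than 0 or 1, so truthiness and equality tests can disagree.

```d
void main()
{
    byte flag = 2;        // nothing stops a "boolean" byte from holding 2

    assert(flag);         // truthy, because it is nonzero
    assert(flag != 1);    // yet an equality test against 1 fails

    // Comparing against zero only, as described above, avoids the trap.
    bool normalized = flag != 0;
    assert(normalized == true);
}
```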


Sean