February 17, 2006
On Thu, 16 Feb 2006 17:25:23 -0800, Walter Bright wrote:

> "Regan Heath" <regan@netwin.co.nz> wrote in message news:ops43d5lmc23k2f5@nrage.netwin.co.nz...
>> In the end it's just syntactic sugar for & | and ^. The question is, does it make the code clearer, I think so. Does it make bit manipulation easier to code, I think so. Is that enough to make it a valuable feature?
> 
> I regularly do bit masking and shifting on ints. I'm so used to it, I don't think that adding sugar for it would help any.

YOU ARE DEAD WRONG! Sheesh!!! Not all of us are blessed with your abilities. That's why I don't do Assembler anymore and that's why we use higher level languages than machine code.

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocracy!"
17/02/2006 12:40:36 PM
February 17, 2006
"Derek Parnell" <derek@psych.ward> wrote in message news:edpqlnztl599.19xc3uf14ntbh.dlg@40tude.net...
> On Thu, 16 Feb 2006 17:25:23 -0800, Walter Bright wrote:
>
>> "Regan Heath" <regan@netwin.co.nz> wrote in message news:ops43d5lmc23k2f5@nrage.netwin.co.nz...
>>> In the end it's just syntactic sugar for & | and ^. The question is,
>>> does
>>> it make the code clearer, I think so. Does it make bit manipulation
>>> easier
>>> to code, I think so. Is that enough to make it a valuable feature?
>>
>> I regularly do bit masking and shifting on ints. I'm so used to it, I
>> don't
>> think that adding sugar for it would help any.
>
> YOU ARE DEAD WRONG! Sheesh!!! Not all of us are blessed with your abilities.

What about using some functions instead:

    int setBit(inout int v, int b)
    {
        return v |= 1 << b;
    }

?
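A sketch of how a small family of such helpers might look, assuming a current D2 compiler, where `ref` does the job that `inout` does in the snippet above. Only `setBit` comes from this post; `clearBit`, `toggleBit` and `testBit` are names made up for illustration.

```d
// Hypothetical bit helpers; only setBit appears in the thread.
int setBit(ref int v, int b)
{
    return v |= 1 << b;        // turn bit b on
}

int clearBit(ref int v, int b)
{
    return v &= ~(1 << b);     // turn bit b off
}

int toggleBit(ref int v, int b)
{
    return v ^= 1 << b;        // flip bit b
}

bool testBit(int v, int b)
{
    return (v & (1 << b)) != 0; // read bit b without modifying v
}
```

Each mutating helper returns the updated value, so a call can be used inside a larger expression just like the raw `|=` form.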

> That's why I don't do Assembler anymore and that's why we use higher level languages than machine code.

<g>


February 17, 2006
"Sean Kelly" <sean@f4.ca> wrote in message news:dt394j$1n2j$1@digitaldaemon.com...
> This is really more of a library issue than a compiler issue.  My concern is that, since internal/object.d now imports std.regexp, the runtime code can no longer be built without at least a skeleton regexp module available.  And if the regexp implementation changes then the runtime must be rebuilt.  I'll admit that the current approach is probably best given that std.regexp exists and code duplication is a Bad Thing, but it still creates a language dependency on library code, even if the compiler isn't emitting RegExp calls directly.

I was concerned that code that did not use MatchExpressions might inadvertently link in the std.regexp module, which would be a Bad Thing. It does not, so I'm not convinced this is a bad thing.

>> Regex is non-trivial. There's no way to have any sort of language support for it without it being in the library. Anyone working on D libraries or other things is welcome to use RegExp, so I am just not understanding what the problem is. Phobos isn't a closed shop, the license on the files allows anyone to do pretty much anything they want with it.
>
> I agree.  And this works fine for Phobos.  But if Phobos is to be a template for future standard library implementations, then it should be designed in a way that allows for closed-source compiler implementations as well.

Sure, and std.regexp's license allows it to be used in closed source. It's a different license from dmd's source code, and the reason for the difference is so that people can use it for just the purpose you suggest. If one wanted to reimplement (or better, extend) RegExp in order to support, say, Perl 6 regex, all that object._Match needs are about 4 trivial members, which shouldn't be a burden.

Other than that, why reimplement RegExp?

> Also, what if a library writer decides to exploit the regular expression support provided by the language, and merely implements his RegExp class as a veneer over the built-in functionality?  It creates an odd sort of circular dependency.

At some point, he'll need a regex implementation. And the license for std.regexp allows him to use/adapt it as required.

> I assume there's no plan to remove std.regexp from Phobos now that language support is in place?

I'm just not getting it - why should it be removed? There never was a plan to remove it. And why would an implementation of a D runtime library not want to do a regex implementation? Of course, it's a lot of work to implement a regex, but one can just copy over std.regexp and use/adapt it as required, as the license allows that. So I am just not getting what the problem is.


February 17, 2006
Walter Bright wrote:
> "kris" <fu@bar.org> wrote 
>> Walter Bright wrote:

>>At least "in" has some relevant meaning to it.
> 
> It would be overloading its existing meaning, which means that it'll take semantic, rather than syntactic, analysis to disambiguate. This is potential trouble.

Sad. "in" did sound good. :-)

>>>That is a problem, one that would get solved when RegExp can do wchar and dchar. That isn't a technical problem, it's more of a getting around to it problem.
>>
>>Well, since grammar supported regex has elevated itself to the top of the priority list, perhaps wchar/dchar support might tag along with it?
> 
> The thing is, RegExp has been in there from the beginning, but it has gone unused and even its existence is overlooked. I don't believe that's because it isn't useful - look at Ruby, Perl, Javascript, etc. Those languages heavily use regex. Is there something inherent about *script* languages that makes them nice for regex? I don't believe there is, I think it gets heavily used in those languages because the syntactic sugar makes it easy to use.

There are 2 things reducing its usage.

First, using it has been awkward.

Second, and more important, most real-world uses of regex involve literals. And that implies compile-time compilation, if they are to be perceived as efficient.
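To make "compile-time compilation" of a literal concrete, here is a sketch that assumes a Phobos shipping `std.regex` with its `ctRegex` template, which is not the `std.regexp` module discussed in this thread. The function name `hasDate` and the pattern are made up for illustration.

```d
import std.regex : ctRegex, matchFirst;

// Hypothetical helper: true if `line` contains a date shaped like 2006-02-17.
bool hasDate(string line)
{
    // The pattern is a literal, so ctRegex can build the matcher
    // while the program is being compiled rather than at run time.
    auto datePattern = ctRegex!(`\d{4}-\d{2}-\d{2}`);

    // matchFirst returns the captures of the first match;
    // they are empty when nothing matched.
    return !matchFirst(line, datePattern).empty;
}
```

The trade-off is longer compile times in exchange for not parsing the pattern each time the program starts.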

>>>I don't think this takes away from the regex templates. I hope to use the regex templates in conjunction with this syntactic sugar to create optimized regex evaluation.
>>
>>Perhaps, but I really don't see the need for this sudden rush to get regex support into the grammar. Experience with regex templates is almost certain to uncover some conflict in this regard ~ one that will likely have to be compromised to fit in with the current syntax. That's just Murphy's law. What's the big hurry?
> 
> I thought it fit in well with D's new capability of being runnable in a script-like fashion.

Experience has shown that using D as a scripting language in a production environment currently needs some method of compiler-version-locking.

In other words, if a script is written for D.130, then something should ensure that it stays compiled with that version, even after the system D compiler gets updated.

If this is not done, then system scripts break at unexpected times (i.e. the first time that particular script is run after the compiler is updated to the first version that breaks the script). In a production environment it is plain impossible to search and test-run each D script any time the compiler gets updated.
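One partial mitigation, sketched under the assumption of a compiler that provides the `__VERSION__` special token: the script can at least refuse to compile under an untested compiler instead of breaking in some subtler way. This does not select a compiler for you. The bounds below are placeholders and `compilerIsSupported` is a made-up name.

```d
// Hypothetical guard; the bounds are placeholders, not recommendations.
enum minCompiler = 2000; // oldest front-end version the script was tested with
enum maxCompiler = 2999; // newest front-end version the script was tested with

// Fails the build immediately if the compiler is outside the tested range.
static assert(__VERSION__ >= minCompiler && __VERSION__ <= maxCompiler,
    "this script has not been tested with this compiler version");

// The same check as an ordinary function, e.g. for logging at startup.
bool compilerIsSupported()
{
    return __VERSION__ >= minCompiler && __VERSION__ <= maxCompiler;
}
```

A failed `static assert` at least turns "breaks at unexpected times" into a clear message naming the cause, though someone still has to run the script to see it.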

This problem is made even worse by the run-time library not having any version identifier. It sure would be nice if one could leave the old run-time libraries as-is, and only add the new one next to them. The binaries should choose the right one automagically.

The way we are using D scripting (digitalmars.D.announce:2674) is version independent (meaning we can use _any_ DMD), but of course the individual D scripts introduce compiler version dependencies by themselves.

One solution to all the above mentioned problems, would of course be a "dscript.d" binary, that takes care of everything. (A good starting point would be to use the above mentioned scripting script.) Then every D script would start with

#! /usr/local/bin/dscript

but that would then totally obviate the DMD -run parameter!

> If this opens up a reasonably broad new range of applications that D is a good fit for, that's good. I might be wrong, of course, as I've been with the bit data type (a complete botch). Match expressions don't break anything, were not expensive to implement, and the only way to see how they'll work out is to try them. 

I think the current implementation is good. I don't like to see any $whatever (or even worse, $` $´ $' $") implemented!!!! We don't like to see D become Perl.

And hey, Perl itself has been moving away from the $-unrememberable-fly-droppings stuff. AND even _bash_ has been starting to avoid them lately! (See man bash.)

Syntactic sugar is ok in general. But not "semantic" or "hieroglyphic" sugar. Let's see how the brand new stuff works, and whether any additional sugar ever becomes needed here!
February 17, 2006
kris wrote:

> I'm an advocate for getting regex support in the grammar, but I'm certainly not an advocate for tying Phobos to the compiler (RegExp has a notable resultant import set; because of this I refactored it for Ares and Mango).

Would it be correct to assume that if we had compile-time regexps, then the resultant import set would be effectively zero? (As long as we of course don't also use regexps that aren't compile-time compilable?)

Since (IMHO) most shortish programs only use literal regexes, this would be quite important.
February 17, 2006
"Walter Bright" <newshound@digitalmars.com> wrote ... [snip]
>> One is asking the question "does this thing on the left exist within the thing on the right". It even takes care of getting the operand ordering correct. Thus, I'd urge you to at least see if there's actually a notable problem for the compiler to handle this before writing the idea off.
>
> It's not a problem with the compiler. It's a conceptual problem for the user. When I see 'in' I think of containers. That's completely different from regex.

Can't say that I agree, but my opinion matters rather little anyway <g>


>> I'm an advocate for getting regex support in the grammar,
>
> I thought you were arguing against that <g>.

Not at all. I've been an advocate for it in the past also. It's certain other aspects of built-in functionality that I consistently have a beef with.


>> In short: you're (a) building more and more library functionality directly into the language without providing a means to cleanly support alternate implementations, extensions, or otherwise decouple the compiler. And (b) by doing so, you're (perhaps inadvertently) stifling some innovation and causing some headaches for the very people who are trying to help D along the road to acceptance. It would really help if you'd be somewhat sensitive to these aspects rather than persistently ignoring them.
>>
>> For instance, how does one change .sort to use a different sorting algorithm? How does one change the hashing function for non-classes? How can one unhook RegExp+OutBuffer+String+Others, and replace it? etc. etc. If D is intended to be a closed-shop, Phobos-only environment, then some of us are presumably wasting our time supporting the language; right?
>
> Regex is non-trivial. There's no way to have any sort of language support for it without it being in the library. Anyone working on D libraries or other things is welcome to use RegExp, so I am just not understanding what the problem is. Phobos isn't a closed shop, the license on the files allows anyone to do pretty much anything they want with it.

It's one thing to hear you say that; yet the proof is in the pudding. It's actually quite tricky to disentangle the compiler from Phobos. Some parts simply cannot be decoupled at all (at this time). It's not a criticism of you personally, but the above concerns are very real and the frustration is something you perhaps need to know about.

If I read your answer a particular way, it can be interpreted as saying "why would you *not* want to use Phobos?". That would be an example of stifling innovation, for all kinds of reasons.


> Also, let me reiterate that the compiler does *not* emit any hardcoded references to RegExp, nor does it know anything at all about regex's. It uses object._Match, which is a proxy to whatever the language implementor wants to use.
>
> RegExp could probably remove its dependence on OutBuffer, though.

Probably. On the same topic, you've often 'lectured' about the need to decouple such that the "libraries don't end up like Java". Yet RegExp imports String too, which in turn imports all these (std.format in particular):

private import std.stdio;
private import std.utf;
private import std.uni;
private import std.array;
private import std.format;
private import std.ctype;
private import std.stdarg;

It's quite easy to eliminate OutBuffer and String from RegExp. There's an adjusted version of it in circulation, if you'd like to forego the effort.


> It sucks in C, and why do I say that? I've shipped a C compiler for 22 years now, and not once, not ever, did anyone ask for a regex library for it. Regex wasn't put in the C standard, or the C++ one. Yet regex is considered a core capability of several other languages. There are many ways to interpret that - I am interpreting it as meaning that regex sucks in C, and so people seem to just never even think of using C when they need to process strings.

I'm surprised that you'd interpret it that way. I've used regex in C for decades. There was one great implementation from, uhhh, Ian somebody from Edinburgh Uni, which generated x86 code on the fly. I used that to great effect ~ a truly impressive utility.


February 17, 2006
"Walter Bright" <newshound@digitalmars.com> wrote
>
> "Regan Heath" <regan@netwin.co.nz> wrote in message news:ops43d5lmc23k2f5@nrage.netwin.co.nz...
>> In the end it's just syntactic sugar for & | and ^. The question is, does it make the code clearer, I think so. Does it make bit manipulation easier to code, I think so. Is that enough to make it a valuable feature?
>
> I regularly do bit masking and shifting on ints. I'm so used to it, I don't think that adding sugar for it would help any.

Besides, it's easy to use op-overloads for such things as necessary.
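As a rough sketch of what that could look like, here is a small wrapper that lets an integer be indexed as if it were an array of bits. The struct name `Bits` and its layout are made up for illustration; `opIndex` and `opIndexAssign` are the usual index overloads.

```d
// Hypothetical wrapper: treats a uint as an indexable set of bits.
struct Bits
{
    uint value;

    // bits[i] reads bit i
    bool opIndex(size_t i) const
    {
        return ((value >> i) & 1) != 0;
    }

    // bits[i] = on sets or clears bit i
    bool opIndexAssign(bool on, size_t i)
    {
        if (on)
            value |= 1u << i;
        else
            value &= ~(1u << i);
        return on;
    }
}
```

With this in place, `b[3] = true;` replaces `b.value |= 1 << 3;`, and the masking stays in one spot.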


February 17, 2006
On Thu, 16 Feb 2006 17:48:54 -0800, Walter Bright wrote:

> "Derek Parnell" <derek@psych.ward> wrote in message news:edpqlnztl599.19xc3uf14ntbh.dlg@40tude.net...
>> On Thu, 16 Feb 2006 17:25:23 -0800, Walter Bright wrote:
>>
>>> "Regan Heath" <regan@netwin.co.nz> wrote in message news:ops43d5lmc23k2f5@nrage.netwin.co.nz...
>>>> In the end it's just syntactic sugar for & | and ^. The question is,
>>>> does
>>>> it make the code clearer, I think so. Does it make bit manipulation
>>>> easier
>>>> to code, I think so. Is that enough to make it a valuable feature?
>>>
>>> I regularly do bit masking and shifting on ints. I'm so used to it, I
>>> don't
>>> think that adding sugar for it would help any.
>>
>> YOU ARE DEAD WRONG! Sheesh!!! Not all of us are blessed with your abilities.
> 
> What about using some functions instead:
> 
>     int setBit(inout int v, int b)
>     {
>         return v |= 1 << b;
>     }
> 
> ?

You mean like std.regexp library functions? Oh that's right ... we have ~~ now; silly me.


-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocracy!"
17/02/2006 1:49:07 PM
February 17, 2006
Walter Bright wrote:

> It sucks in C, and why do I say that? I've shipped a C compiler for
> 22 years now, and not once, not ever, did anyone ask for a regex
> library for it. Regex wasn't put in the C standard, or the C++ one.
> Yet regex is considered a core capability of several other languages.
> There are many ways to interpret that - I am interpreting it as
> meaning that regex sucks in C, and so people seem to just never even
> think of using C when they need to process strings.

Hmm.

Regexes being a big thing for interpreted languages is largely thanks to the Q&D convenience. Also, systems scripting needs them for nontrivial filtering, and of course complicated line rewriting.

C folks tend to "peek directly" into the strings because it's cheap, and you have a sense of complete control.

Using regexps in C needs a total change of paradigm. Regexps are kind of "top down" things, whereas traditional "peeking into strings" is bottom-up programming.

You'd also have to learn regexps. The trivial things are trivial in C-style too, and the non-trivial stuff gets avoided because of the up-front investment. Folks rather do nested ifs and stuff.

Conversely, many interpreted languages make it inefficient to do "peek" kind of programming, as compared to using regexps.
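To illustrate the two styles contrasted above, here is the same small check written both ways. It is only a sketch, assuming a Phobos with `std.regex` and `std.ascii`; both function names are made up.

```d
import std.ascii : isDigit;
import std.regex : matchFirst, regex;

// Bottom-up: peek at the characters one by one, C style.
bool startsWithNumberByHand(string s)
{
    size_t i = 0;

    // optional sign
    if (i < s.length && (s[i] == '+' || s[i] == '-'))
        ++i;

    // at least one digit must follow
    size_t digits = 0;
    while (i < s.length && isDigit(s[i]))
    {
        ++i;
        ++digits;
    }
    return digits > 0;
}

// Top-down: describe the shape and let the regex engine do the walking.
bool startsWithNumberByRegex(string s)
{
    return !matchFirst(s, regex(`^[+-]?[0-9]+`)).empty;
}
```

For something this trivial the hand-written loop is arguably just as clear, which fits the point that the trivial cases are trivial in either style.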
February 17, 2006
"Kris" <fu@bar.com> wrote in message news:dt3cc2$1pc7$1@digitaldaemon.com...
>> It sucks in C, and why do I say that? I've shipped a C compiler for 22 years now, and not once, not ever, did anyone ask for a regex library for it. Regex wasn't put in the C standard, or the C++ one. Yet regex is considered a core capability of several other languages. There are many ways to interpret that - I am interpreting it as meaning that regex sucks in C, and so people seem to just never even think of using C when they need to process strings.
>
> I'm surprised that you'd interpret it that way. I've used regex in C for decades. There was one great implementation from, uhhh, Ian somebody from Edinburgh Uni, which generated x86 code on the fly. I used that to great effect ~ a truly impressive utility.

How do you interpret the fact that it has failed to gain traction among the general C population?