February 18, 2006
Georg Wrede wrote:
> BCS wrote:
> 
>> To elaborate, the use of templates to implement compile time regex just seems like an error-prone mess (a fantastic mess, made by a genius, but still a mess). 
> 
> 
> Probably nobody thinks that compile time regexes should be implemented with template metaprogramming.
> 
> While D template programming screams compared to C++, shoving work on the template system does slow down compilation unnecessarily, compared to doing things 'the proper way'. This would erode the absolutely coolest feature DMD has: a blazing compilation speed.
> 
>> While templates can be very powerful and get a lot of stuff done, I think that the language should include support for compile time programming that is not just a side effect of other features.
> 
> 
> At the time, I think they just served to demonstrate a few things:
> 
>  - that you (or actually, Don) really can do most amazing things with templates
> 
>  - that showing this would motivate Walter to add effort and priority to implementing them properly (i.e. non-template)
> 
>  - serve as a vehicle to demonstrate the utility (of both template programming in itself, and the utility of compile-time regexes)

DMD has the speed; that's great and all, but we simply can't assume all implementations of the D language will be equivalent in performance. (Someone is going to write one in Java, I just know it.)  IMO, basing language decisions on the reference implementation is a Bad Thing.

-- 
Regards,
James Dunne
February 18, 2006
James Dunne wrote:
> Georg Wrede wrote:
>> BCS wrote:
>>
>>> To elaborate, the use of templates to implement compile time regex just seems like an error-prone mess (a fantastic mess, made by a genius, but still a mess). 
>>
>>
>> Probably nobody thinks that compile time regexes should be implemented with template metaprogramming.
>>
>> While D template programming screams compared to C++, shoving work on the template system does slow down compilation unnecessarily, compared to doing things 'the proper way'. This would erode the absolutely coolest feature DMD has: a blazing compilation speed.
>>
>>> While templates can be very powerful and get a lot of stuff done, I think that the language should include support for compile time programming that is not just a side effect of other features.
>>
>>
>> At the time, I think they just served to demonstrate a few things:
>>
>>  - that you (or actually, Don) really can do most amazing things with templates
>>
>>  - that showing this would motivate Walter to add effort and priority to implementing them properly (i.e. non-template)
>>
>>  - serve as a vehicle to demonstrate the utility (of both template programming in itself, and the utility of compile-time regexes)
> 
> DMD has the speed; that's great and all, but we simply can't assume all implementations of the D language will be equivalent in performance. (Someone is going to write one in Java, I just know it.)  IMO, basing language decisions on the reference implementation is a Bad Thing.
> 

No, but we can assume implementations of D will be faster than C++ since Walter's DMD is twice as fast as DMC for building Empire, even though they share the same optimizer, code gen, and linker. The D frontend, which is open source, gives D its speed.

For me, the fast compile times compared to C++ are a big feature of D.

February 18, 2006
Walter Bright wrote:
> "Sean Kelly" <sean@f4.ca> wrote in message news:dt5cbc$i49$1@digitaldaemon.com...
>> Walter Bright wrote:
>>> For the language implementor, the stuff in std.gc. How operator new interfaces with the gc is up to the language implementor.
>> But what if the user wants to employ a non-standard GC?  There have already been questions about this for real-time programming and other specialized applications.
> 
> I don't know what you mean by non-standard. It must implement the interface in std.gc, and operator new and delete need to work. Other than that, there are a wide range of gc implementation strategies one can use.

By non-standard I simply meant a different implementation.

>> First, it's important to note that I consider the runtime to be a distinct library containing anything required for basic language support, the garbage collector similarly separated and devoted to memory management, and the standard library as a third distinct entity which contains all components and interfaces the user is expected to actually interact with. Phobos already has this basic separation, but the points of interaction between each component aren't particularly well-defined. For example, if someone wants to provide a specialized garbage collector, what does he do?  A bit of research will reveal that some modules from internal/gc should be removed and a new class of type GC should be created, but this requires more interaction with low-level code than most users want to have.
> 
> Writing a gc is non-trivial, and someone who is up to that task I doubt will have much difficulty with the interface to it. You're right in that one can't casually create a GC class, but I don't see that as a fault in the interface.

True enough.

>> Second, I believe it's important that the need to import modules across these library boundaries should be avoided if at all possible, as doing so creates a compile-time dependency between them.  Also, it seems logical to assume that the runtime and GC code might not be written in D at all, so the points of interaction should be equally accessible from other languages, implying that all such points of interaction should be extern (C) functions.  This also has the side benefit of allowing the functions to be declared in the module where they're called, as the name mangling scheme ignores declaration placement.
> 
> I don't see the reason why one would want to write a new GC that is not in D. If one wants to use an existing one, say the Boehm GC which is in C, all one needs is a simple wrapper of D functions around the Boehm ones.

I meant that more as a general statement rather than regarding the GC specifically--I think it's more likely that portions of the runtime code will not be written in D than the GC.  But as D code can call C functions directly, why not use that for library interaction?  It seems more straightforward than creating wrappers.  Also, I think the thread control functions may be useful for a debugger (which may well be written in C/C++), and the GC functions might be useful in mixed-code applications.  Wrappers could again be created, but I don't see the point.
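The kind of extern (C) boundary described here might look like the following sketch, written in C since the whole point is an interface callable from any language. Every name below (gc_malloc, gc_free, gc_collect) is hypothetical, not the actual Ares or Phobos API:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical boundary functions: the runtime, a debugger, or
   mixed-language code calls these by name; whichever GC library is
   linked in provides the definitions.  No D module need be imported. */
void *gc_malloc(size_t size);
void  gc_free(void *p);
void  gc_collect(void);

/* A trivial stand-in implementation so the sketch is self-contained;
   a real collector would do far more here. */
void *gc_malloc(size_t size) { return malloc(size); }
void  gc_free(void *p)       { free(p); }
void  gc_collect(void)       { /* no-op in this stub */ }
```

Swapping in an alternate GC then means linking a different library that defines the same three symbols, with no recompilation against its modules.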

>> So you can see that, so far, there has been no need to import any modules across library boundaries--all imports are either internal or of C headers (which can be easily declared in the module where they're called if desired).  I think Phobos could ultimately benefit from such an arrangement, but it's really not critical at this point.
> 
> I see what you're doing, but what is the advantage of avoiding doing the import if you're going to need that code anyway? 

Largely to avoid compile-time dependencies between libraries, as I feel it's important that a user should be able to download an alternate standard library or GC and use it simply by linking it in.  And while this could also be accomplished by documenting that UDTs should perhaps not be used and compiling against header modules, it seems more straightforward to simply define things at the code level.

Another benefit I discovered is that this approach allows specialized functionality to be exposed or code paths to be optimized specifically for library use.  For example, I'd originally defined a Thread.count method which I knew was being called by the GC.  But when I got around to looking at the GC code I realized that it didn't actually care how many threads were running so much as whether critical sections were necessary to ensure correct behavior.  And this revealed that the way I was tracking thread count--modified by the newly created thread before entering user code--was not only incorrect, but the fact that Thread.count passed through a critical section of its own made it effectively useless to the GC code.  The redesigned function serves one purpose: to indicate whether Thread.start has ever been called by the application, and thus whether memory synchronization issues might be present or mutual exclusion might be necessary.  No critical sections are used, and indeed, a count of threads isn't even maintained--just a bit flag.  This approach was obvious in light of what the GC needed, but it was not at all apparent from the context of what a standard library user might be interested in.
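The counter-to-flag change described above can be sketched in a few lines of C (all names hypothetical; the real Ares code differs):

```c
#include <stdbool.h>

/* Set once, never cleared, so no critical section is needed to read it. */
static bool multithreaded = false;

/* Called at the top of Thread.start, from the *parent* thread, before
   the new thread runs any user code--avoiding the race in the original
   scheme where the child updated the count itself. */
void thread_mark_started(void) { multithreaded = true; }

/* The GC's actual question is not "how many threads?" but "might any
   other thread exist at all?"  If not, it can skip locking entirely. */
bool thread_needs_lock(void) { return multithreaded; }
```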

Finally, defining specific means of interaction allows behavior to be modified quite easily.  When a system error occurs in Ares, rather than throwing an exception directly the runtime instead passes relevant information to a callback exposed by the standard library.  Thus the runtime has no dependence on the exception object definition (aside from the requirements imposed by the stack unwinding code itself), and the user has a clear means of hooking the error handling mechanism if different behavior is desired--the behavior of onAssertError can be modified, for example, so the user can signal the debugger immediately instead of waiting for an exception to propagate.  If the modules were imported and exceptions thrown directly, this would obviously not be possible.
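The callback arrangement might look like this C sketch; set_assert_handler and report_assert are illustrative names, not the real Ares interface:

```c
#include <stdio.h>

typedef void (*assert_handler_t)(const char *file, unsigned line);

/* Default behavior if nothing is hooked: just print the location. */
static void default_handler(const char *file, unsigned line) {
    fprintf(stderr, "assertion failed: %s(%u)\n", file, line);
}

static assert_handler_t on_assert_error = default_handler;

/* The standard library (or the user) hooks error handling by swapping
   the callback--e.g. to break into a debugger instead of unwinding. */
void set_assert_handler(assert_handler_t h) { on_assert_error = h; }

/* Called by the runtime when an assert trips; the runtime never needs
   to know what exception type, if any, the handler constructs. */
void report_assert(const char *file, unsigned line) {
    on_assert_error(file, line);
}
```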

Since this approach seems to provide at least marginal benefit, I would like to turn things around and ask what the advantage is of importing modules directly as Phobos does?  I can see that it offers immediate relief if the library writer decides he needs more functionality than has been predetermined, but with a prototype standard library already in place I would think that such needs should already be obvious.  Are there other advantages as well?


Sean
February 20, 2006
"Sean Kelly" <sean@f4.ca> wrote in message news:dt7u8d$2o5g$1@digitaldaemon.com...
> Since this approach seems to provide at least marginal benefit, I would like to turn things around and ask what the advantage is of importing modules directly as Phobos does?  I can see that it offers immediate relief if the library writer decides he needs more functionality than has been predetermined, but with a prototype standard library already in place I would think that such needs should already be obvious.  Are there other advantages as well?

You have made some good points.


February 21, 2006
Georg Wrede wrote:
> Walter Bright wrote:
>> "Wang Zhen" <nehzgnaw@gmail.com> wrote in message news:dt49iv$2hm5$1@digitaldaemon.com...
>>
>>> Although syntactically correct, MatchExpression in
>>> StaticIfCondition or StaticAssert do not compile. For example:
>>>
>>> void main(){static if(!("" ~~ "")){}static assert("" ~~ "");}
>>>
>>> Is this intended or an unimplemented feature?
>>
>> The problem is that getting it to work requires the compiler itself
>> to understand regular expressions. Currently, it does not.
> 
> Intriguing. I'd sure love to hear more about this.
> 
> I take it understanding regular expressions is much more than just compiling them? (Like what the runtime does, or Perl, etc.?)

A problem is that there are a number of dialects of regexp.  The spec doesn't seem to indicate which dialect is being used.

Among the differences between them is whether subexpressions are parenthesised by \(...\) or simply (...).  Another issue is whether we expect implementations to support the Unicode extensions to regexps described here

http://www.textpad.info/forum/viewtopic.php?t=4778

No doubt there are other differences....

Whichever we choose, the behaviour of using std.regexp directly, of ~~ evaluated at runtime, and of ~~ evaluated at compile time must be consistent.  But that isn't hard - the compiler would just call the same code that std.regexp uses.
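The consistency point can be sketched as follows, in C with a deliberately trivial stand-in for the regexp engine (all names hypothetical). Because both paths go through the one shared matcher, they cannot disagree on dialect:

```c
#include <string.h>
#include <stdbool.h>

/* Stand-in "engine": matches when pat is a literal prefix of s.
   A real implementation would be the std.regexp matcher itself. */
static bool regex_match(const char *pat, const char *s) {
    return strncmp(pat, s, strlen(pat)) == 0;
}

/* Runtime path: what evaluating s ~~ pat would call. */
bool runtime_match(const char *s, const char *pat) { return regex_match(pat, s); }

/* "Compile-time" path: the constant folder calls the same engine. */
bool fold_match(const char *s, const char *pat) { return regex_match(pat, s); }
```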

Stewart.

February 21, 2006
Walter Bright wrote:
> "Georg Wrede" <georg.wrede@nospam.org> wrote in message news:43F658D2.2000608@nospam.org...
> 
>> Walter Bright wrote:
>> 
>>> "Wang Zhen" <nehzgnaw@gmail.com> wrote in message news:dt49iv$2hm5$1@digitaldaemon.com...
>>> 
>>> 
>>>> Although syntactically correct, MatchExpression in StaticIfCondition or StaticAssert do not compile. For example:
>>>> 
>>>> void main(){static if(!("" ~~ "")){}static assert("" ~~ "");}
>>>> 
>>>> Is this intended or an unimplemented feature?
>>> 
>>> The problem is that getting it to work requires the compiler
>>> itself to understand regular expressions. Currently, it does not.
>>> 
>> 
>> Intriguing. I'd sure love to hear more about this.
> 
> 
> If the compiler is to constant fold regular expressions, then it
> needs to build in to the compiler exactly what would happen if the
> regex code was evaluated at runtime.

Yes. IMHO, in essence, the binary machine code, which the runtime would also build. What I have a hard time seeing is how this differs from building a normal function at compile time.

And eventually storing both in the executable image.

(I'd give you more intelligent questions, but I'm too baffled.)
February 21, 2006
"Georg Wrede" <georg.wrede@nospam.org> wrote in message news:43FB25FC.8090806@nospam.org...
> Walter Bright wrote:
>> If the compiler is to constant fold regular expressions, then it needs to build in to the compiler exactly what would happen if the regex code was evaluated at runtime.
> Yes. IMHO, in essence, the binary machine code, which the runtime would also build. What I have a hard time seeing is how this differs from building a normal function at compile time.

Consider the strlen() function. Compiling a strlen() function and generating machine code for it is a very different thing from the compiler knowing what strlen is and replacing:

    strlen("abc")

with:

    3


February 22, 2006
Interesting indeed.

Is there no way to "fold constants" in this kind of code too? If you know the inputs to a function are all constant, can't you simply replace the inputs + function call with the function's output?

Would be really cool if this kind of general constant folding could take place. The compiler would need to keep track of all constant variables, and flag the outputs of operations on constants as constants too. In your example, since the input to the strlen function is a constant, the compiler could just run the strlen code itself and replace the actual call with that call's output.
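For what it's worth, C programs already fold one special case of this by hand: the length of a string literal is available at compile time via sizeof, and it necessarily agrees with what strlen would compute at runtime. A small illustration:

```c
#include <string.h>

#define LIT "abc"

/* sizeof on a string literal is evaluated by the compiler: it counts
   the literal's bytes including the terminating NUL, so subtracting
   one yields the same value strlen(LIT) returns at runtime. */
enum { FOLDED_LEN = sizeof(LIT) - 1 };
```

General constant folding of calls would extend this idea to any function whose result depends only on constant inputs.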

I have no experience whatsoever with compiler writing, so I'm probably overlooking MANY things :-)

Lio.

"Walter Bright" <newshound@digitalmars.com> wrote in message news:dtfin6$29hi$1@digitaldaemon.com...
>
> "Georg Wrede" <georg.wrede@nospam.org> wrote in message news:43FB25FC.8090806@nospam.org...
>> Walter Bright wrote:
>>> If the compiler is to constant fold regular expressions, then it needs to build in to the compiler exactly what would happen if the regex code was evaluated at runtime.
>> Yes. IMHO in essence, the binary machine code, which the runtime also would build. What I have a hard time seeing is, how this differs from building a normal function at compile time?
>
> Consider the strlen() function. Compiling a strlen() function and generating machine code for it is a very different thing from the compiler knowing what strlen is and replacing:
>
>    strlen("abc")
>
> with:
>
>    3
> 


February 22, 2006
Lionello Lunesu skrev:
> Interesting indeed.
> 
> Is there no way to "fold constants" in this kind of code too? If you know the inputs to a function are all constant, can't you simply replace the inputs + function call with the function's output?

Disclaimer: I don't know much about this. Most is pure speculation.

I guess it is theoretically possible, but the compiler has to know that the function is pure. That is:

a) The function can not have any side effects.
b) The result has to be deterministic and only depend on the arguments.

This means that the function cannot call any function not fulfilling a and b, and that it cannot rely on things like floating-point rounding state etc.

In the general case, the compiler has no way of knowing this. The function may be externally defined, and only resolved at link time. For standard library functions, the compiler could of course be given this knowledge beforehand (as with strlen).

For functions fully known to the compiler, inlining followed by constant folding could theoretically have the same effect, but I don't think any compilers are smart enough to identify pure blocks of code in a general fashion and evaluate them at compile time. Somewhat easier would be to identify pure functions and evaluate those at compile time. I guess this goes much further than current constant folding. The problems I see are:

a) It is hard for the compiler to tell whether a function is pure. In many cases it is not even possible: purity is undecidable in general, by reduction from the halting problem.
b) The compiler needs a way to evaluate the function at compile time.
c) The compiler has no way of knowing the function's space and time complexity.

It would be interesting if there was a way to flag functions as being pure. The compiler could then try to evaluate the function at compile time, or reduce the number of calls to it at run time, similar to what common subexpression elimination does.
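For what it's worth, GCC and Clang already let the programmer flag exactly this, via a function attribute rather than a keyword; "const" is the stricter form and matches conditions a and b above. A small C sketch (the attribute is real GCC/Clang syntax; the functions are made up):

```c
/* __attribute__((const)) promises: no side effects, and a result that
   depends only on the arguments.  The optimizer may then fold calls
   with constant arguments at compile time and merge repeated calls,
   much like common subexpression elimination. */
__attribute__((const))
static int square(int x) { return x * x; }

/* With the attribute, square(x) need only be evaluated once here. */
static int twice_square(int x) { return square(x) + square(x); }
```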

/Oskar
February 22, 2006
Oskar Linde wrote:
> It would be interesting if there was a way to flag functions as being pure.

This is what I've always thought declaring a function as "const", as can be done in C++, should do. Optimisation avenues galore.