June 30, 2004
quetzal wrote:

> Is there any rationale for making the language support dynamic arrays natively? We've
> got the standard library for that. IMHO C++ has got it almost right with std::vector.
> Seriously, there's nothing that prevents one from coding a portable and reliable
> implementation of a dynamic array class. Why do we need direct support from the
> language here?
It's not about need, it's about optimisation, as others have pointed out.
Resizing in place, particularly. I think that if the compiler is clever enough and the accessors for array elements were inlined, we might get vectorization anyway?
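
For what it's worth, the resize case is easy to picture (a minimal sketch in ordinary D; whether the copy is actually avoided is up to the runtime):

    int[] a = new int[100];
    a.length = 150;  // the runtime may extend the block in place
                     // instead of allocating anew and copying;
                     // that is the optimisation in question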

On the other hand, I'm tempted to agree with the desire for a class-based dynamic array as standard, for consistency with other objects. Having arrays in Java work the same as class references was very nice.

Sam
June 30, 2004
pragma <EricAnderton at yahoo dot com> wrote:
> In article <cbsdjs$2k4n$1@digitaldaemon.com>, Andy Friesen says...
> 
>>Here's some ideas I was scribbling earlier: <http://andy.tadan.us/d/macro_example.d.html> (warning, not close to being fully baked)
> 
> 
> Gotcha.  I like the examples.  Really got me thinking.
> 
> I like your approach, but the syntax still felt too much like Perl to me. :)
> IMO, it didn't really feel like D once you entered the 'meta{}' space (but what
> a neat example).  So I tried meshing our ideas together to see what we get.

I suspect the $ symbol is to blame for that.  It's funny how a single language can sour someone on something as basic as one specific symbol. (I dislike $ for the same reason, even though I know full well that it's completely irrational.)

At any rate, I'm not especially attached to that particular construct.

> Andy, feel free to abuse this post.  ;)

Oboy!

> The result is a meta syntax that lets you generate code as string data, which
> then is wrapped by the compiler to create an extension.  The extension is then
> invoked to add the appropriate handles to the D parser.  When a meta symbol is
> parsed, its handle is invoked which in turn generates a substitute expression,
> method or whatever.
> 
> # module etc.macros;
> #
> # meta{
> # 	expression char[] format(...) {
> # 		char[] result;
> # 		// _args is implicit to variadic meta methods
> # 		for (DCompiler.MethodArg arg; _args) {
> # 			result ~= ".format(" ~ arg.toString() ~ ")";
> # 		}
> # 		result ~= ";";
> # 		return result;
> # 	}
> # }

The problem with this is that code is no longer being handled like code, but like a string of characters.  This is certainly a powerful metaprogramming mechanism (it can express anything you could write by hand, for obvious reasons), but it discards the notion of modelling the code in the abstract sense.

For instance, it becomes more cumbersome to filter a macro through another macro because all the compiler gets is a string.  Compilers shouldn't have to parse the code they've just created.

My suggested meta{} syntax is just shorthand for creating AST nodes directly.  For instance

    Expression e = meta { $x = $x + 1; };

would be more or less synonymous with

    Expression e = new AssignStatement(
        x,
        new AddExpression(x, new IntegerLiteral(1))
    );

If nothing else, this is going to be a bit faster, as no strings are generated just so they can be reconsumed.

The other advantage is that we're staying very close to the problem domain.  When generating code, we want to have a "code literal".  A meta{} block is precisely that.  The $ notation is exactly what it is in Perl: interpolating a code literal with variables.
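
To make the composition point concrete, a hypothetical sketch (meta{} and $ are of course the proposal itself, not existing D):

    // One macro's output feeds another without any re-parsing: each
    // meta block yields an Expression object, and $ splices such
    // objects in directly.
    Expression increment(Expression x)
    {
        return meta { $x = $x + 1; };
    }

    Expression incrementTwice(Expression x)
    {
        Expression once = increment(x);
        return meta { $once; $once; };  // splicing ASTs, not strings
    }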

Frankly, though, I can't help but think that all this could itself be implemented as a macro.  The core compiler would only have to expose a standard class hierarchy representing its various AST nodes (which would probably coincide with the compiler's internal structures, but doesn't necessarily have to--the compiler merely has to do some extra work to convert between the two).

It may be worthwhile to require that macros be "activated" with something like:

    import macro foobar; // activates all macros defined in foobar.dll

This minimizes potential screwups due to unexpected macro expansion.

> The compiler would expand the meta statement into the following to generate an
> extension .dll. (it could also make a first pass to collect all meta statements
> into a single dll if need be).

Proposed massacre (dropping namespaces for brevity; assume all type names are standard library types):

> import std.compiler;
>
> Expression format_metahandle(Expression[] _args) {
> 	Expression result;
> 	for (Expression arg; _args) {
> 		result = meta {
> 			// still using $ because I can't think of anything better
> 			( $result ).format( $arg );
> 		};
> 	}
> 	return result;
> }
> 
> MacroArgumentList format_arguments = MacroArgumentList.createVariadicArgumentList();
> 
> // DLLNAME is the same name as the .dll file, sans extension.
> extern (C) MacroDefinition[] init_DLLNAME() {
> 	MacroDefinition[] macros;
> 
> 	// macros are expanded before argument resolution, so
> 	// we don't need to talk about whether it has a return type
> 	// or what that type is.  We do, however, need to tell the
> 	// compiler how the macro is used syntactically.
> 
> 	macros ~= new MacroDefinition("format", &format_metahandle, format_arguments);
> 
> 	return macros;
> }

Additionally, I don't think any of this should be automatic at all.  The programmer should have to type all this crap out manually for a very, very simple reason:

    We could write a macro to do it for us.

It would be cool if the meta{} syntax could also be written as a macro.  If it were, the compiler's responsibilities would amount to three things:

    (1) import macro x;

    (2) Recognizing macro invocations and replacing them with the code they return.

    (3) Converting the 'public' AST class hierarchy to and from its own internal classes. (easy if they're one and the same)

I'm starting to think that the compiler should not compile and run macros within the same project, for a few reasons.  First is the obvious simplification of the implementation.  Requiring that the compiler execute macros defined in the very compilation unit it is working on necessitates either that the compiler be able to compile and link a complete DLL, or that it implement an interpreter.  Relaxing this restriction allows the compiler to remain ignorant of linking.

Second, separate compilation makes it abundantly clear when the macro DLLs are used. The compiler needs them to build the software, the resulting application does not need them at all.  This goes a long way towards clearing confusion as to what is being compiled and executed at what stage.

The last reason is that there is necessarily a huge potential for obfuscation.  Extending the language is a pretty big deal and should not be done at the drop of a hat.

 -- andy
June 30, 2004
In article <cbt0kv$dbb$1@digitaldaemon.com>, Andy Friesen says...
>
>The problem with this is that code is no longer being handled like code, but like a string of characters.  This is certainly a powerful metaprogramming mechanism (it can express anything you could write by hand, for obvious reasons), but it discards the notion of modelling the code in the abstract sense.

I'm with you there.  My stab at using text to generate an expression was really based on my more practical experience using things like 'eval' in PHP and the like.  Personally, I have yet to write anything even approaching a compiler, hence my being somewhat naive on the topic.

>For instance, it becomes more cumbersome to filter a macro through another macro because all the compiler gets is a string.  Compilers shouldn't have to parse the code they've just created.

I agree that it puts more work on the compiler, but I can't help but feel that both styles of code generation have their place.  After all, many (scripted) languages have an 'eval' statement somewhere that works with raw text; it's had to have some merit to hang on conceptually.

Besides, what if you wanted to generate code based on external input, like a file?  You could nest arbitrary code snippets in that external file and they would fold neatly into a generic piece of metacode.  Now you can take your code-DB business-object generation suite and chuck it: D can build code out of virtually anything.
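
Something along these lines, say (a hypothetical sketch; std.file.read is real Phobos, but the machinery that would consume the resulting string is the proposal under discussion):

    import std.file;

    // Pull a code snippet out of an external file and splice it into
    // a generated function body, 'eval'-style.
    char[] generateFromFile(char[] filename)
    {
        char[] snippet = cast(char[]) std.file.read(filename);
        return "char[] generated() { " ~ snippet ~ " }";
    }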

C#'s Reflection.Emit is a good example of what I'm talking about here.  From one family of interfaces, you can work with raw expressions or go all the way down to IL opcodes if you really want to go there.

>Frankly, though, I can't help but think that all this could itself be implemented as a macro.  The core compiler would only have to expose a standard class hierarchy representing its various AST nodes (which would probably coincide with the compiler's internal structures, but doesn't necessarily have to--the compiler merely has to do some extra work to convert between the two).

Now this really gets me thinking.  Is there wiggle room in the DMD frontend to expose this kind of functionality?  Or is this yet another motivation to construct a D-based D compiler from the ground up?

>Additionally, I don't think any of this should be automatic at all.  The programmer should have to type all this crap out manually for a very, very simple reason:
>
>     We could write a macro to do it for us.

Better yet, you could write the entire D language with this kind of facility.

In essence, we're really turning the compiler inside out and making the entire mess accessible if not completely mutable.  In fact, macros now wouldn't even have to be explicit or opaque.  You could even write in those exception stack traces that you wanted. :)

>I'm starting to think that the compiler should not compile and run macros within the same project, for a few reasons.  First is the obvious simplification of the implementation.  Requiring that the compiler execute macros defined in the very compilation unit it is working on necessitates either that the compiler be able to compile and link a complete DLL, or that it implement an interpreter.  Relaxing this restriction allows the compiler to remain ignorant of linking.
>
>Second, separate compilation makes it abundantly clear when the macro DLLs are used. The compiler needs them to build the software, the resulting application does not need them at all.  This goes a long way towards clearing confusion as to what is being compiled and executed at what stage.
>
>The last reason is that there is necessarily a huge potential for obfuscation.  Extending the language is a pretty big deal and should not be done at the drop of a hat.

Okay, that makes things much clearer.  My only motivation for including macros side by side with code was that a macro tends to be very tightly coupled to the code that uses it.  Macros tend to be very domain-specific, at least in C programming.  But then again, there probably isn't much merit in sharing C preprocessor macros, given how weak that macro language is.

But I see what you're saying with 'import macro foobar;'.  By explicitly prodding the compiler to treat a file as a compiler extension, rather than source code for the current target, it makes things far more manageable.

D might be one of the first C-style languages to get a full-on macro distribution.  As long as you have the compiler extension for the macro grammar of your choice, things should compile along just peachy.

- Pragma


June 30, 2004
pragma <EricAnderton at yahoo dot com> wrote:

> Besides, what if you wanted to generate code based on external input, like a
> file?

I hadn't thought of that at all.  I think you're right.

> Is there wiggle room in the DMD frontend to
> expose this kind of functionality?  Or is this yet another motivation to
> construct a D-based D compiler from the ground up? 

Since the whole point is to transform one piece of code into another, it should be wholly implementable in the frontend, just like templates.

The trick is making DMD (which is implemented in C++) talk to D classes.  Worst case: inline assembly.  Second-worst case (probably the best case): lots of C++ and D glue code (a C API to manipulate AST things, plus D classes which connect to it).  The third choice is to eschew objects in the compile-time API entirely, so that C++ can connect to D via the extern (C) ABI.
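
Roughly like so, perhaps (purely illustrative names; none of this exists in DMD today):

    // The C half: flat functions over opaque handles, callable from
    // the C++ compiler and from D alike.
    extern (C)
    {
        void* ast_newIntegerLiteral(long value);
        void* ast_newAddExpression(void* lhs, void* rhs);
    }

    // The D half: thin classes wrapping the opaque handles, so that
    // macro code gets a real object hierarchy to work with.
    class Expression
    {
        void* handle;
        this(void* h) { handle = h; }
    }

    Expression addOne(Expression e)
    {
        return new Expression(
            ast_newAddExpression(e.handle, ast_newIntegerLiteral(1)));
    }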

Given that, though, it should be quite doable.

> Better yet, you could write the entire D language with this kind of facility.   

Probably, but that'd be a whole lot more work. ;)

> In essence, we're really turning the compiler inside out and making the entire
> mess accessable if not completely mutable.  In fact, macros now wouldn't even
> have to be explicit or opaque.  You could even write in those
> exception-stack-traces that you wanted. :)

There are literally a ton of interesting things you can do.  The majority of the Nemerle language syntax is implemented with macros, including the primitive operators and conditional constructs like if() and while().

 -- andy
June 30, 2004
quetzal wrote:

> In article <cbso6e$1g4$1@digitaldaemon.com>, Norbert Nemec says...
>>
>>There is one aspect of arrays where no library will ever reach native arrays: vectorizing expressions!
>>
>>In C++, expression templates go some way in that direction, but they are still way off what a good vectorizing compiler can do.
> 
>>Years ago, this would only have been a matter for high-performance specialists coding for multi-processor number-crunching machines. Nowadays, every PC has plenty of vectorizing capabilities (super-scalar technology, etc.), therefore, high-level language elements really are necessary to allow the compiler to do the work of optimizing the code.
> 
> There's nothing that prevents a library from implementing dynamic arrays as a pointer to data + size (just like the language does now). So it can be vectorized just the same way. Also the programmer gets control and can fine-tune the array implementation for his own needs. I think the way to go is an interface-based standard library.. if a programmer wants to change array behaviour he can just write a class that implements a given interface and alias it.

The problem is not the implementation of the array itself, but that of vectorized expressions like:

        A[] = B[] + 3*C[];

Implementing this in the library would be possible, of course, but you could never get to the same level of optimization as it is possible for a good compiler that knows all the details of the processor.
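
To make that concrete: the compiler is free to lower the whole expression into a single fused loop (or its SIMD equivalent), along the lines of:

    // What a compiler might generate for A[] = B[] + 3*C[]: one pass,
    // no temporaries. A library of overloaded operators naively
    // produces a temporary array per operation instead.
    for (int i = 0; i < A.length; i++)
        A[i] = B[i] + 3 * C[i];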

And, to have vectorized expressions in the language, you of course need to have the arrays in the language first.

Beyond this special example, there are a ton of other cases where the compiler can do optimizations on arrays that are not possible for library-implemented arrays.


June 30, 2004
quetzal wrote:

>>I think so.  By making dynamic arrays first class citizens of the language, they work a bit smoother.
> But the programmer also loses control. He can't change how memory is managed in the array, how the array is sorted (bloody .sort) and other stuff like that.

I agree with you that .sort as a language feature was not really a good idea. Of course, it seems convenient at first glance, but if everything that is convenient is packed into the language, it will become a complete mess.

Anyhow, for arrays in general: nobody prevents you from ignoring the language-level arrays and rolling your own.
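
For instance, a minimal sketch (illustrative only; error handling omitted):

    import std.c.stdlib;

    // A hand-rolled array: memory management is entirely under your
    // control, and the built-in .sort never enters the picture.
    struct MyArray
    {
        int*   data;
        size_t len;
    }

    void resize(inout MyArray a, size_t n)
    {
        a.data = cast(int*) realloc(a.data, n * int.sizeof);
        a.len  = n;
    }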



July 08, 2004
"Sam McCall" <tunah.d@tunah.net> wrote in message news:cbt02j$ci2$1@digitaldaemon.com...
> On the other hand, I'm tempted to agree with the desire for a class-based dynamic array as standard, for consistency with other objects. Having arrays in java working the same as class references was very nice.

I'm pretty familiar with Java arrays, since I implemented a Java compiler and worked on a Java VM. D arrays can do many things Java arrays cannot:

1) can be resized
2) can be sliced
3) can exist as 'lightweight' arrays on the stack
4) integrate seamlessly with C arrays
5) can have bounds checking turned off
6) have no extra length overhead when using static arrays
7) can exist in static data
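
A few of these in action (ordinary D; cPrint is a hypothetical C function, shown only for point 4):

    extern (C) void cPrint(int* p, size_t n);  // hypothetical C API

    void demo()
    {
        int[10] buf;             // 3) lightweight array on the stack
        int[] d = buf;           // a dynamic array sliced from it, no copy
        int[] part = d[2 .. 5];  // 2) slicing
        d = new int[20];
        d.length = 30;           // 1) resizing
        cPrint(&d[0], d.length); // 4) passes straight to a C API
    }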

Having arrays in the syntax rather than as a vector<> class offers the advantages:

1) you can declare them with a specialized array syntax, like:
    int[3] foo;
2) specialized array literal syntax is possible:
    int[3] foo = [0:1,2,3];
3) seamless interaction between static and dynamic arrays
4) the compiler knows about arrays, so can give sensible error messages rather than incomprehensible errors related to template implementation internals
5) the compiler knowing they are arrays means better code can be generated, particularly for things like foreach loops
6) vector ops are possible with specialized array syntax
7) arrays and strings can be the same thing, rather than incompatible
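
A couple of those, concretely (ordinary D):

    int[3] foo = [0:1, 2, 3];           // 1) declaration and 2) literal syntax
    char[] s = "a string is an array";  // 7) strings and arrays are one thing

    void examples()
    {
        int[] dyn = foo;                // 3) static and dynamic interact
        foreach (int x; dyn) { }        // 5) compiler-aware loop codegen
    }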

