Thread overview
[Suggestion] Syntax for typesafe Multiargument Functions
Mar 26, 2004
Manfred Nowak
Mar 26, 2004
Stewart Gordon
Mar 27, 2004
Manfred Nowak
Mar 29, 2004
Stewart Gordon
Mar 31, 2004
Manfred Nowak
March 26, 2004
I would like to extend D's syntax for multiargument functions so that semicolons `;' are allowed in actual parameter lists, in addition to the comma `,', as a separator for arguments. The semicolon would serve to divide an arbitrarily long actual parameter list into manageable sublists, à la:

    result= foo( bar, 10; x1, x2, x3; "hello", cast(uint) 5, 2 + 3i);

Or a somewhat more familiar example:

    char[] s1= ...
    char*  s2= ...
    char[] s3= ...
    print( buffer, "%D %D %.*D %D %D\n"; s1; s2; 10, s3; 1_000; 20, 18, PI);

This syntax is accompanied by the rule that the sublists created by the semicolons are evaluated from left to right, i.e. the evaluation of the first argument of sublist sub(n+1) starts if and only if all arguments of sublist sub(n) have already been evaluated. The order of evaluation of the arguments within such a sublist, i.e. those arguments that are separated by commas, remains undefined.

In the spec the grammar rule

    PostfixExpression:
            PostfixExpression ( ArgumentList )

would have to be deleted and replaced by

    PostfixExpression:
            PostfixExpression ( MultiArgumentList )

And the following rules, or similar, must be inserted:

    MultiArgumentList:
            ArgumentList
            MultiArgumentList ; ArgumentList
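
As a rough illustration of the grouping this grammar produces, here is a minimal sketch that works on the text of an argument list rather than inside a real parser. The names `splitTopLevel' and `parseMultiArgumentList' are made up for this example, the code assumes a current D compiler, and string or character literals containing separators are deliberately not handled.

```d
import std.string : strip;

/// Splits `text` at every occurrence of `sep` that is not nested inside
/// parentheses. Assumption: no separators hidden in string literals.
string[] splitTopLevel(string text, char sep)
{
    string[] parts;
    size_t start = 0;
    int depth = 0;
    foreach (i, char c; text)
    {
        if (c == '(')
            ++depth;
        else if (c == ')')
            --depth;
        else if (c == sep && depth == 0)
        {
            parts ~= text[start .. i].strip;
            start = i + 1;
        }
    }
    parts ~= text[start .. $].strip;
    return parts;
}

/// MultiArgumentList: semicolons separate sublists, commas separate the
/// arguments within one sublist.
string[][] parseMultiArgumentList(string text)
{
    string[][] sublists;
    foreach (sub; splitTopLevel(text, ';'))
        sublists ~= splitTopLevel(sub, ',');
    return sublists;
}

unittest
{
    auto lists = parseMultiArgumentList("bar, 10; x1, x2, x3; cast(uint) 5, 2 + 3i");
    assert(lists == [["bar", "10"], ["x1", "x2", "x3"], ["cast(uint) 5", "2 + 3i"]]);
}
```

The unittest can be run with `dmd -unittest -main -run' on a file containing the block.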

If this is acceptable, then I will explain in a further posting what background leads to this suggestion.

So long!
March 26, 2004
Manfred Nowak wrote:

> I would like to extend D's syntax for multiargument functions by the way,
> that semicolons `;' are allowed in actual parameter lists in addition to the
> comma `,' as a separator for arguments. The semicolon would serve the
> purpose to divide the arbitrarily long actual parameter list into manageable
> sublists à la:
> 
>     result= foo( bar, 10; x1, x2, x3; "hello", cast(uint) 5, 2 + 3i);
<snip>
> If this is acceptable, then I will explain in a further posting, what
> background leads to this suggestion.

If what's acceptable?

(a) Your whole idea - before anyone can decide, I think it's necessary to know the foreground first, i.e.

- how this mere syntactic sugar leads to typesafety
- how the program would use such an argument list

(b) To explain the background - that would most certainly be welcome.

Stewart.

-- 
My e-mail is valid but not my primary mailbox, aside from its being the unfortunate victim of intensive mail-bombing at the moment.  Please keep replies on the 'group where everyone may benefit.
March 27, 2004
Stewart Gordon wrote:

> If what's acceptable?

If it is not acceptable to extend the syntax for multiargument functions in the way I suggested, or by a similar extension, then it is a mere waste of time to talk any further, and I would rather not be a starry-eyed idealist.

There is one more requirement that has to be acceptable, although it is not a syntax change: one more operator has to be accepted. I would like to call it the end-of-medium operator `opEOM'; again, someone may come up with an even better name.

This new operator may also (I am not quite sure about this) come with a rule that has to be accepted:

If an `opCall' is declared in some struct/class, then an `opEOM' must be declared as well. In general it is not required that the struct/class declaring that `opEOM' be the same struct/class that declares the `opCall'. It is required, however, that the struct/class declaring the `opEOM' be "reachable" from the struct/class that declares the `opCall'. What is meant by "reachable" will hopefully become clear on further reading.

Again: if it is not acceptable to have a further operator or the (maybe)
further rule to include a line `someType opEOM(){ return someType;}' or
`void opEOM(){}' into a struct/union that declares an `opCall', then any
further dispute is useless.


> (a) Your whole idea - before anyone can decide, I think it's necessary to know the foreground first, i.e.
> 
> - how this mere syntactic sugar leads to typesafety

This syntactic sugar does not lead to typesafety. It is little more than a
mental aid for the typesafety that is already built into D as a
language. The only requirement fulfilled by this syntax is that the end
of the multi argument parameter list is signalled. That signal is hidden in
the right parenthesis, which enables the compiler to ask for the existence
of the `opEOM'.


> - how the program would use such an argument list

If the requirements I suggested were met, it would be dead easy to code transducers, i.e. finite state machines whose edges are labeled with actions, to analyze the actual multi argument list.

And because the languages accepted by finite state machines are the same as those that can be expressed by regular expressions, enabling such coding might already be considered overkill for the problem usually posed by the need to analyze multi argument lists.

To drive this overkill to its ultimate end: although I have not yet analyzed it, it does not seem impossible even to code LALR(1) parsers to analyze multi argument lists, thereby being able to accept every language for which a context free grammar exists.


> (b) To explain the background - that would most certainly be welcome.

You may have noticed that the second example I gave was a print statement. I am experimenting with a typesafe wrapper for `printf'. Because it is typesafe, the format string of such a wrapper needs to mention little more than where an argument has to be printed; stating the type of the argument is superfluous. Therefore the format string consists almost entirely of white space and the "%D" specifier, which I chose for the "here" denotation, as suggested by someone in an earlier post.

Actual parameter lists for `printf' can be analyzed by a transducer that has only two states.

I have got this wrapper working in the current version of dmd for int, uint, char[] and char*. From the latter string literals must be excluded, because DigitalMars has made up a dual typedness for them, as stated in another thread.
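
The wrapper itself is not shown in this thread, so the following is only a guess at its general shape: a minimal sketch, assuming chained `opCall' overloads and a current D compiler. The class name `Printer' and its members `emit' and `output' are invented for the illustration, and `string' stands in for `char[]'. Overload resolution picks the conversion for each argument, which is why the format string only needs the "%D" placeholder.

```d
import std.conv : to;
import std.exception : enforce;
import std.string : indexOf;

/// Hypothetical typesafe formatter: each chained call fills the next "%D".
final class Printer
{
    private string rest;   // part of the format string not yet consumed
    private string result; // formatted text produced so far

    this(string format) { rest = format; }

    // One overload per supported type; the compiler selects the match.
    Printer opCall(int value)    { return emit(value.to!string); }
    Printer opCall(uint value)   { return emit(value.to!string); }
    Printer opCall(string value) { return emit(value); }

    /// Text produced so far, followed by the unconsumed format remainder.
    string output() { return result ~ rest; }

    private Printer emit(string s)
    {
        immutable pos = rest.indexOf("%D");
        // Too many arguments can be detected as soon as they arrive.
        enforce(pos >= 0, "more arguments than %D placeholders");
        result ~= rest[0 .. pos] ~ s;
        rest = rest[pos + 2 .. $];
        return this;
    }
}

unittest
{
    auto p = new Printer("%D has %D items\n");
    p("cart")(3);
    assert(p.output == "cart has 3 items\n");

    // Too few arguments go unnoticed: nothing marks the end of the list.
    auto q = new Printer("%D and %D");
    q(7u);
    assert(q.output == "7 and %D");
}
```

A surplus argument is caught at once in this sketch, but a missing one leaves a stray "%D" in the output because nothing tells the object that the list has ended.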

This working wrapper has two shortcomings:

1) The syntax for the actual parameter list is extremely ugly to my taste.

2) Because nothing signals the end of the multi parameter list, I am unable to check whether the supplied format string conforms to the rest of the actual parameter list.

Number 1) leads to my suggestion for the syntax change.

Number 2) denotes a general problem, because in general transducers (I do not want to talk about general parsers here) contain states that are _not_ accepting states. Without a compiler that asks for, or generates, an `opEOM', every state of the transducer _must_ be an accepting state, thereby imposing a huge restriction on the languages that can be recognized.
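
To make the point about non-accepting states concrete, here is a small sketch of a two-state transducer that recognizes argument lists of the form (name, value)*. It assumes a current D compiler; the class `Pairs' is invented for the example, and `opEOM' is written as an ordinary method that has to be called by hand, since no compiler generates that call.

```d
import std.exception : assertThrown, enforce;

/// Hypothetical transducer: accepts alternating string and int arguments
/// and collects them into a table as its action.
final class Pairs
{
    private enum State { expectName, expectValue }
    private State state = State.expectName;
    private string pendingName;
    private int[string] table;

    Pairs opCall(string name)
    {
        enforce(state == State.expectName, "expected a value, got a name");
        pendingName = name;
        state = State.expectValue; // not an accepting state
        return this;
    }

    Pairs opCall(int value)
    {
        enforce(state == State.expectValue, "expected a name, got a value");
        table[pendingName] = value;
        state = State.expectName;  // accepting state
        return this;
    }

    /// Stand-in for the proposed end-of-medium operator.
    int[string] opEOM()
    {
        enforce(state == State.expectName, "argument list ends after a name");
        return table;
    }
}

unittest
{
    auto ok = new Pairs;
    assert(ok("width")(80)("height")(24).opEOM() == ["width": 80, "height": 24]);

    auto bad = new Pairs;
    assertThrown(bad("width")(80)("height").opEOM());
}
```

Without the final `opEOM' call the list ending in "height" could not be rejected, because the object never learns that no value will follow.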

So long!
March 29, 2004
Manfred Nowak wrote:
<snip>
> There is one more requirement that has to be acceptable, but is not a syntax change: one more operator has to be accepted, I would like to call it end-of-medium operator `opEOM', and again someone may come up with an even better name.

Even after looking down, it still isn't quite clear how your opEOM is going to work.  Maybe you could provide an example?

> Also this new operator may (I am not quite sure about this) come with a rule that has to be accepted:
> 
> If an `opCall' is declared in some struct/class, then an `opEOM' must be declared also.

Why should it be necessary to force every struct/class with an opCall to become more complex and break backward compatibility, when only a handful of them will want to use multiarguments?

>> (a) Your whole idea - before anyone can decide, I think it's necessary to know the foreground first, i.e.
>> 
>> - how this mere syntactic sugar leads to typesafety
> 
> This syntactic sugar does not lead to typesafety. It is nearly only a mental support for the typesafety that is already built into D as a language. The only requirement that is fulfilled by this syntax is, that it is signalled, that the end of the multi argument parameter list is reached. And this is hidden in the right parenthesis, which enables the compiler to ask for the existence of the `opEOM'.

Don't forget that, with the flexibility you seem to be proposing, you need to be able to determine which type each argument is!

>> - how the program would use such an argument list
> 
> If the requirements, that I suggested, were enabled it would be dead easy to code transducers, i.e. finite state machines whose edges are labeled with actions, to analyze the actual multi argument list.
<snip>

Sorry, I actually meant literally "how" rather than "for what purpose".  I.e. if you're going to invent a syntax for declaring a function to take a variable argument list, and for passing such a list into a function, you'll also need to invent a syntax to be used within the body of the function to access the variable arguments.

> To drive this overkill to its ultimate end: although I have not yet analyzed it, it does not seem impossible to even code LALR(1) parsers to analyze multi argument lists, thereby being able to accept every language for which a context free grammar exists.

A typical parser (by the BNF I've seen) doesn't care how many arguments a particular function has, so it certainly shouldn't care if this number is variable for some function.

>> (b) To explain the background - that would most certainly be welcome.
> 
> 
> You may have noticed, that the second example I gave, was a print statement. I am experimenting with a typesafe wrapper for `printf'.

I wonder what Walter does plan (if anything) for a robust system in D for formatted I/O.  BTW have you seen my idea?
Re: Possible solution for D's I/O system
http://www.digitalmars.com/drn-bin/wwwnews?D/25096

> Because it is typesafe the format string of such a wrapper needs nearly only mention where an argument has to be printed, to state the type of the argument is superfluous. Therefore the format string consists nearly only of white space and the "%D" specifier, which I chose for the "here" denotation as suggested by someone in an earlier post.
<snip>

Yes, you have an idea there....

Stewart.

-- 
My e-mail is valid but not my primary mailbox, aside from its being the
unfortunate victim of intensive mail-bombing at the moment.  Please keep
replies on the 'group where everyone may benefit.
March 31, 2004
Stewart Gordon wrote:

> Maybe you could provide an example?

Coming soon. Just want to become more sure about this matter.


> Why should it be necessary to force every struct/class with an opCall to become more complex and break backward compatibility, when only a handful of them will want to use multiarguments?

Agreed. So for this requirement I want to leave `opCall' untouched. Instead I request one further new operator, resulting in two operators: the `opEOM' as already requested, and `opMultArg', which is semantically identical to `opCall' except that a corresponding `opEOM' must exist.
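
The rule that an `opMultArg' needs a matching `opEOM' would be enforced by the compiler under this proposal. As a rough library-level approximation only, and assuming the `__traits' facility of current D compilers, the check could be sketched as below. The template `lacksOpEOM' and the three structs are hypothetical, and the sketch only looks inside the same type, ignoring the "reachable" case.

```d
/// True if `T` declares `opMultArg` without also declaring `opEOM`.
/// Assumption: both members live in the same struct/class.
enum bool lacksOpEOM(T) =
    __traits(hasMember, T, "opMultArg") && !__traits(hasMember, T, "opEOM");

struct Complete   { void opMultArg(int) {} void opEOM() {} }
struct Incomplete { void opMultArg(int) {} }
struct Plain      { void opCall(int) {} } // plain opCall stays unaffected

// These checks run at compile time.
static assert(!lacksOpEOM!Complete);
static assert( lacksOpEOM!Incomplete);
static assert(!lacksOpEOM!Plain);
```

A type like `Plain', which only uses `opCall', passes untouched, which is the backward compatibility asked for above.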


> Don't forget, with the flexibility you seem to be proposing, being able to determine which type each argument is!

I do not need to; the compiler will do that, and is already doing it.


> Sorry, I actually meant literally "how" rather than "for what purpose".
>   I.e. if you're going to invent a syntax for declaring a function to
> take a variable argument list, and for passing such a list into a
> function, you'll also need to invent a syntax to be used within the body
> of the function to access the variable arguments.

No new syntax is needed inside. But because the code that accesses the supplied actual parameter list is highly formal, the wish for an extra syntax for it may arise.


> I wonder what Walter does plan (if anything) for a robust system in D for formatted I/O.  BTW have you seen my idea? Re: Possible solution for D's I/O system http://www.digitalmars.com/drn-bin/wwwnews?D/25096

Thanks. Meanwhile I have commented on that.


> Yes, you have an idea there....

Thanks for your agreement.

Just got a template running that implements general processing of lists of arguments when all arguments are of the same type. I.e. let `I' and `O' be arbitrary types, where `I' is the type of the input and `O' the type of the output, that is, the result of processing a list of arguments of type `I'.

Then the template is able to instantiate a function `f' that recognizes the list of `I'-typed arguments, or in terms of a regular expression an "I*", and returns an `O'-typed value, as if the syntax of D were extended to be able to recognize a declaration like:

   O f( I*){};
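
The template itself is not posted here, so what follows is only one way such a thing might look: a minimal sketch for a current D compiler. The names `ListFn', `step' and `Sum' are assumptions made for the example, and `opEOM' again has to be called explicitly to obtain the `O'-typed result.

```d
/// Hypothetical generic list processor: consumes any number of `I` values
/// through chained calls and yields one `O` value at the end.
final class ListFn(I, O, alias step)
{
    private O acc;

    this(O initial) { acc = initial; }

    /// One transition per argument; the single state is accepting ("I*").
    ListFn opCall(I arg)
    {
        acc = step(acc, arg);
        return this;
    }

    /// Stand-in for the proposed end-of-medium operator.
    O opEOM() { return acc; }
}

// Example instantiation: I = int, O = long, action = running sum.
alias Sum = ListFn!(int, long, (long total, int x) => total + x);

unittest
{
    auto sum = new Sum(0);
    assert(sum(1)(2)(3).opEOM() == 6);

    auto empty = new Sum(0);
    assert(empty.opEOM() == 0); // the empty list is accepted as well
}
```

Current D compilers also accept typesafe variadic parameters such as `long f(int[] args...)', which cover the same-typed case directly.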

So long!