March 17, 2007
Derek Parnell wrote:
> On Fri, 16 Mar 2007 15:54:26 -0700, Walter Bright wrote:
> 
>> Derek Parnell wrote:
>>> Given ...
>>>
>>>    int[int**][]*[char[]][3][17][3,17]* x;
>>>
>>> and I want to make the 'x' immutable and the 'char[]' immutable, how does
>>> one write the declaration? Where does one place the 'final'/'const'/'super
>>> const' and where do the parenthesis go?
>> final int[int**][]*[ const(char[]) ][3][17][3,17]* x;
> 
> Are you serious???? 
> 
> I want 'x' to be immutable so I need to use 'final' and place it the
> furthest away from what it is qualifying.
> 
> I want 'char[]' to be immutable so I must use 'const' and place it next to
> what it is qualifying plus use parenthesis around its target too.
> 
> Why do you think that this is intuitive, consistent, and easy to read?

Because you didn't suggest a better design. :o)

Andrei
March 17, 2007
Walter Bright wrote:
> eao197 wrote:
>> On Sat, 17 Mar 2007 01:41:54 +0300, Walter Bright <newshound@digitalmars.com> wrote:
>>
>>> eao197 wrote:
>>>> No, at first there must be 'D with macros' (like 'C with classes') and only then -- D++ :))
>>>
>>> D will get macros - but they won't be text processing macros, they'll be abstract syntax tree (AST) processing macros.
>>
>> It is very interesting to hear that. Could you say more about the future macro system in D? Or it is a secret now?
> 
> It's pretty simple:
> 
>     macro foo(args)
>     {
>         ...syntax that gets inserted...
>     }
> 
> and the args and syntax is evaluated in the context of the invocation of the macro, not the definition of the macro. You'll be able to do things like:
> 
>     macro print(arg)
>     {
>         writefln(__FILE__, __LINE__, arg);
>     }
> 
> which get the file and line in the right context. There's no way to do that right now.

On naming, why not use mixin, since they are so similar?

mixin print(arg)
{

}

Then make them typesafe (using alias to get out of type safety)?


What if you want the file in some other context?  It would be nice to have a complete solution to that, which allows some sort of stack traversal.  Although I'm not sure it's possible, due to not wanting to keep this sort of information around at release time.

ie:

     macro print(arg)
     {
         writefln(__FILE__[stack_level], __LINE__[stack_level], arg);
     }

Or even better:

Stack[0].Line; //Current line
Stack[1].Line; //Line one level up
Stack[0].File;
Stack[0].Module;
Stack[0].Function;
Stack[0].NumbArgs; //Some form of reflection
Stack[0].Arg[N];   //Access to the value of the argument (i.e. Tuple of values)
Stack[0].ArgIdentifier[N] //String name of the identifier
Stack[0].FuncType  //The type of function we are in (is it a macro, a compile time function, a member, a regular function)
etc...

     macro print(arg)
     {
         with(Stack[0])
         {
             writefln(File, Line, arg);
         }
     }

I guess that would have problems for larger depths.  Perhaps it could be limited to two levels (current and parent) or only work for ones that can be evaluated at compile time (you would get a compile time error if the line number couldn't be evaluated.)  There could be a debug version that would work at run time.
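
For what it's worth, the "current and parent" case can be approximated without macros in later versions of D, where __FILE__, __LINE__ and __FUNCTION__ used as default arguments are evaluated at the call site rather than at the definition. A minimal sketch, assuming a D2 compiler; the names CallSite and here are made up for illustration and are not part of any proposal in this thread:

```d
import std.stdio;

// Hypothetical helper type: one "stack level" worth of call-site information.
struct CallSite
{
    string file;
    size_t line;
    string func;
}

// Default arguments using __FILE__, __LINE__ and __FUNCTION__ are filled in
// at the place where here() is called, not where it is defined.
CallSite here(string file = __FILE__, size_t line = __LINE__,
              string func = __FUNCTION__)
{
    return CallSite(file, line, func);
}

// print() picks up its caller's location the same way.
void print(T)(T arg, string file = __FILE__, size_t line = __LINE__)
{
    writefln("%s(%s): %s", file, line, arg);
}

void main()
{
    auto a = here();
    auto b = here();
    assert(b.line == a.line + 1);   // each call records its own line
    assert(a.file == __FILE__);
    print("hello");                 // reports this line, not one inside print()
}
```

This only covers one level up; deeper levels would have to be forwarded by hand through each intermediate function.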

-Joel
March 17, 2007
On Sat, 17 Mar 2007 11:49:59 +0300, Walter Bright <newshound@digitalmars.com> wrote:

> It's pretty simple:
>
> 	macro foo(args)
> 	{
> 		...syntax that gets inserted...
> 	}
>
> and the args and syntax is evaluated in the context of the invocation of the macro, not the definition of the macro. You'll be able to do things like:
>
> 	macro print(arg)
> 	{
> 		writefln(__FILE__, __LINE__, arg);
> 	}
>
> which get the file and line in the right context. There's no way to do that right now.

As far as I can see, the content of the macro body will be inserted into the AST at the place of invocation. But that is not much different from existing C/C++ macro handling (although inserting the right __FILE__ and __LINE__ is a good thing anyway).

Is any access to previously parsed entities allowed? For example, can I define a macro:

macro some_class_extender(class_name) {
  ...modification of 'class_name' class structure...
}

class MyCoolClass { ... }
some_class_extender( MyCoolClass );

and get the modified version of MyCoolClass after the some_class_extender invocation?

-- 
Regards,
Yauheni Akhotnikau
March 17, 2007
Bruno Medeiros wrote:
> What is the status of the experimental designs ...


I asked this because yesterday, out of nowhere, I started thinking about this problem, and as a kind of mental exercise I came up with a working design. It seems kinda pointless, since you already made your design, but I'll show this one nevertheless. Consider it a late entry to the max challenge :P . Still, there are some aspects presented here that I don't know how would work in your planned design (like those keyed by the questions).
Most of the terms here are tentative, and so is the syntax. Please consider the syntax separately from the semantics and conceptualization. Errors may be present in the text. This design is presented as is, without any warranty of any kind. :P

CONCEPTUALIZATION

There are 3 major kinds of D entities:

Values (expressions), types, and templates. (and labels too I guess...)

Values are characterized by properties that define what one can do with the value. The most important of these properties is the type. D offers a very rich mechanism to query and manipulate types (typeof(..), is(..) expression, auto, type parameters, etc.), much better than any other statically typed language I know. But the type is not the only "property" of an expression. There are others, such as whether an expression is an lvalue or not, if it can be assigned, etc. The problem so far is that D does not offer a good mechanism to query and manipulate such "type properties".

Let's consider the following type properties:

R: is the value readable.
C: is the value readable at compile time (compile time constant).
&: is the value referenciable.
W: is the value writable. (explained later)

I can't find a good term for these "extended type properties" or "extended value properties" so I'll just call them QUX for now. Silly, I know, but whatever. And I will use "core type" for the current notion of type, which is what typeof(..) returns.

So, with these QUX, what are the valid combinations of them?
They are:

R
CR
R&
RW&
W&

(I'm skipping the explanation of why, since I think that's clear, see examples below). Furthermore, these property combinations are related in the following hierarchy of conversion:

  R
 /  \
CR  R&   W&
     \  /
      RW&

Examples of the QUX for various values in current D:

42	// CR - A literal is a constant.
var	// RW& - As in:  int var;
fvar	// R& - As in:  const fvar; fvar = 2; It's like 'final'
func()	// R - the value of a function is readonly and not referenciable
	// W - No example in current D for W

Let's give some tentative keywords for the possible QUX combinations:

CR  - const
RW& - ref
W&  - wronly ref
R&  - rdonly ref
R   - rdonly

Now, recall the following: QUX describe the properties of a value. The first thing you may think now is that one can use QUX to declare variables of the same QUX type. That's not entirely accurate. For example you can't declare a variable of QUX R, because a var is always either referenciable, or a (compile-time) constant. There are no R variables.
Specifying a var as rdonly will create an R&. Specifying no QUX will create an RW&. Specifying ref in a var declaration will also create an RW& var, but the identity will be the same as the var in the given initializer. ref will preserve the "reference" (memory location) of the initializer. This is mentioned to clarify the declaration semantics.
Examples:

int varA = varX; 	// var is RW&
const int varB = 1;	// var is CR
rdonly int varC = 2;	// var is R&
ref int varD = &varX; // var is RW& too
wronly ref int varE = &varX; // var is W&

varA will be a copy of varX, while varD will be the same as varX (same identity). After definition, varA and varD will have the same QUX.

And what about the definitions of function parameters, and function return types?
There are some minor differences. In function return types, if no QUX is specified, then the QUX is rdonly (as it is currently in D). In function parameters, if no QUX is specified the QUX is rdonly too (this is different from current D, but is considered a nice improvement. Function arguments are almost never modified anyway).

What about composing types? That is, when one has a composite type (array, pointers, etc.), how does one specify QUX for each of the type components?

Well for this var:
  int[]* var;
then QUX are specified like this:

rdonly int[]* var; // var (the pointer) is rdonly
(rdonly int[])* var; // the pointer target (the array) is rdonly
(rdonly int)[]* var; // the members of the array are rdonly
rdonly (rdonly (rdonly int)[])* var; // All are rdonly

Note that some QUX don't make sense in certain declarations, like declaring an array member as ref, like this:
  (ref int)[] var;
because the members of arrays are refs already. This could be an error or simply ignored.

What about auto?
auto does not in any way capture the QUX, just the core type (as in typeof(..) ).
  rdonly var;
  auto foo = var;  // foo is not rdonly

How do we templatize and parameterize QUX?
Let's see by example, looking at previous design challenges:

The id function:

T id(expr T) (T a) {
  return a;
}

So, "expr T" denotes that T is not a normal type parameter, but an "extended type parameter". Besides the core type, it will also hold information about the QUX. id can be instanced manually or with IFTI.

The max function (challenge #3) will show more advanced scenarios of QUX manipulation, but let's first recall what max does.
Consider these vars:

a = 3;
b = 9;
const fvar;  // fvar is 'final'
fvar = 1;

And now some examples of max usage:

max(1, 2)	2 of QUX CR
max(a, 2)	a of QUX R
max(a, b)	b of QUX RW&
max(a, func())	a of QUX R
max(a, fvar)	a of QUX R&


As requested, max preserves the greatest common QUX information.
Here's how we define max:


maxExtType!(A,B) max(expr A :: rdonly, expr B :: rdonly) (A a, B b) {
  if(a >= b)
    return a;
  else
    return b;
}


Of note: The 'T :: U' syntax means specialize the template if T can be converted to U. This is a variant of the current 'T : U' syntax which means specialize if T is the *same* as U. In both these constructs, U can be a QUX, but only if T is an "extended type parameter" (a parameter declared with expr).

maxExtType is the key to complete the challenge. It defines the maximum common extended type of A and B. This is defined as:


template maxExtType(expr A, expr B) {
  static if( !is(typeof(A) == typeof(B)) ) {
    alias maxCoreType!(A, B) maxType; // Type cannot be ref
  } else {
    static if( is(A == ref) && is(B == ref))
      alias (ref typeof(A)) maxType;
    else static if( is(A :: rdonly ref) && is(B :: rdonly ref))
      alias (rdonly ref typeof(A)) maxType;
    else static if( is(A :: wronly ref) && is(B :: wronly ref) )
      alias (wronly ref typeof(A)) maxType;
    else static if( is(A :: rdonly) && is(B :: rdonly) )
      alias (rdonly typeof(A)) maxType;
    else
      static assert(false, "No common extended type for: " ~ A.stringof ~ " and " ~ B.stringof);
  }
}


So, as mentioned in the original challenge thread, if the core types of A and B are not the same, then maxExtType cannot be a ref. That's what the first static if checks for (note: an exception can be made for object types). The subsequent static ifs check for increasingly less restrictive common QUX. It's possible that a common QUX does not exist, for example if one is R& and the other is W&.


What about lazy?
In this design, lazy isn't considered a QUX, as it simply is not a property of an expression. There are no lazy expressions. After a lazy FOO variable is created (which must be initialized), the variable becomes for *all effects* indistinguishable from a FOO delegate(), that is, a delegate returning type FOO. Thus, lazy can't be parameterized/templatized either.


IMMUTABILITY

Immutability, as in, "transitive immutability" is achieved as a type modifier with the keyword "immut". An immut value means that any other member obtained from the original value cannot be modified, and so on. The members of immut values are rdonly and immut. immut is not a QUX, it is a type modifier that modifies (and is part of) the core type. This means that immut appears in "typeof(..)", and consequently is also captured by auto. This is the only sensible behavior, since immut describes a property of the referenced data of that expression, and must be preserved upon assignments (and thus part of the core type). This is unlike QUX, since QUX only describe properties of the immediate value of an expression, which is copied in assignments. I.e., you can assign a rdonly value to a non-rdonly var, but you can't assign an immut value to a non-immut (normal) var. This shows how immut and QUX are somewhat different in nature. Also, immut vars are not automatically rdonly, they are rdonly only if 'rdonly' is also specified.
An example:

  immut Foo[] fooar;
then:
  typeof(fooar[0]) == rdonly immut Foo;


TODO
*Syntax to specify the "this" of a method as immut. Maybe do it like C++?
*A way to conveniently specify/templatize methods that are identical and only vary in the mutability of their types (like 'romaybe' in Javari).


The following describes a particular use case for rdonly and wronly:

VARIANT COMPOSITE TYPES.

Consider this hierarchy:

FooBar extends Foo extends Object
Xpto extends Object

Suppose we have an array of Foo:
  Foo[] fooarr;

The classic covariance problem is: is fooarr safely castable to Object[] ? At first sight one might think yes: since a Foo is an Object, a Foo[] is an Object[]. But that is not the case, since an Object[] array is an array that one can put an Xpto object into:
  (cast(Object[]) fooarr)[0] = new Xpto();
which would break type safety, since we would have an Xpto in an array of Foos. What happens is that some array operations (like reads) remain safe, but others (like writes) do not.
Java allows that cast, but has runtime checks on array member assignments, and throws an exception if the type safety is violated like in the example above.
Can a language provide (compile time) support for safe casting? With rdonly and wronly it can.
  We have that Foo[] cannot be cast to Object[], but it can be safely cast to (rdonly Object)[]. And then:

  fooarr2 = cast((rdonly Object)[]) fooarr;
  fooarr2[0] = new Xpto(); // Not allowed
  fooarr2[0].doSomething(); // Allowed

Assignments won't be allowed, but reading is allowed. Conversely, the array type parameter can be contravariantly cast, from Foo[] to (wronly FooBar)[].

  // Ok because fooarr[] is of type wronly FooBar
  fooarr[0] = new FooBar();
  // Not allowed because you can't read fooarr[0];
  fooarr[0].doSomething();
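
The read-only half of this idea can be tried in later versions of D, where an array of a derived class converts implicitly to an array of const base-class references. A minimal sketch, assuming a D2 compiler; the name() method is added here only so there is something to call, and there is no counterpart shown for wronly:

```d
class Foo
{
    string name() const { return "Foo"; }
}

class FooBar : Foo
{
    override string name() const { return "FooBar"; }
}

void main()
{
    FooBar[] bars = [new FooBar, new FooBar];

    // Covariant, read-only view: allowed because elements cannot be
    // replaced through it.
    const(Foo)[] view = bars;
    assert(view.length == 2);
    assert(view[0].name() == "FooBar");

    // Writing through the view is rejected at compile time.
    static assert(!__traits(compiles, view[0] = new Foo));
}
```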

This was the main motivation I saw for the use of wronly, however, this mechanism is quite simple (as in, simplistic) and limited. It's not as powerful as Java's generics, which allow a greater degree of functionality with lower-bounded types. As such, it may not be worth having wronly just because of this. Still, I guess wronly could also be used in place of 'out' parameters.


SYNTAX AND TERMS SUBJECT TO CHANGE:

QUX - EVP (Extended Value Properties) or ETP (Extended Type Properties) ? Or 'attributes' instead of 'properties' ? But definitely not "storage class", that term sucks. :o
immut  - perhaps 'immutable'?
rdonly - perhaps 'readonly' or 'final'?
wronly - perhaps 'writeonly'?
expr
"::" - Ideally, it would be better that the ":" of template specialization would behave the same as the ":" of the is(..) expression.



Comments are welcome.

-- 
Bruno Medeiros - MSc in CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
March 18, 2007
Walter Bright wrote:
> 
> There are 3 distinct, and very different, flavors of constant:
> 
> 1) rebinding of a value to a name
> 2) a read-only view of a data structure
> 3) a value that never changes
> 
> C++ tries to do all three with one keyword, and makes a confusing hash of it.
> 
> For the purposes of discussion, and to avoid confusing ourselves, we adopted working names of:
> 1) final
> 2) const
> 3) super const
> 

Well, I'm a C++ programmer and I think the last two are much more confusing than "readonly" and a different "const".  If the keywords are simple and intuitive then the learning curve will be quick.  Can't get much better than "readonly" (I can't write to this data, but it may change) and "const" (the data will not change).

Consider these two statements:

The problem with C++ "const" is that it isn't really constant.  D fixes this by having "readonly" for an immutable view, and "const" actually meaning that the data is guaranteed not to change.

The problem with C++ "const" is that it isn't really constant.  D's "const" is also not really constant, but "super const" has been added for data that is guaranteed not to change.

Also, I am slightly confused why examples of final have been given with value types:
  final int x = 3;
But x can never be rebound as it isn't a reference - what the user really means is the value never changes, therefore keyword (3) rather than keyword (1) seems appropriate here.


> The hard part is to figure out how these 3 interact, how they look & feel, how they work with type deduction, type inference, type combining, how do we sidestep the clumsiness in C++'s handling of it, etc.
> 

Regardless of syntax, I would love to hear the advantages of the proposed system.  It seems the really big change is the addition of (3), but I can't quite see how you could create complex data sets that are guaranteed constant apart from using "trust me" casts.

For example, a function may take a data set as super const, thereby enabling optimisations.  But the creation of such a data set (say a map of ints to an array of strings, or a graph of nodes) may well be done at runtime, such as loading it from a file.  How do you "pin" runtime data to get a super constant type?

Can you even create super const data using custom classes?  Must everything be done in the constructor?  What about pointers to non-owned non-constant data, such as back to an owner object?  I'm intrigued.  Any chance of posting some "behind the scenes" with the discussion and resolution to the issues you guys have grappled with?
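
For reference, later versions of D ended up offering two routes for the "pinning" question: copy the finished data into immutable storage, or use an explicit "trust me" conversion. A minimal sketch, assuming a D2 compiler and the standard library's std.exception.assumeUnique; the function loadTable is a made-up stand-in for loading from a file:

```d
import std.exception : assumeUnique;

// Build the data with ordinary mutable code, then hand out an immutable view.
immutable(int)[] loadTable(size_t n)
{
    auto buf = new int[n];
    foreach (i, ref x; buf)
        x = cast(int)(i * i);
    // "Trust me": we promise no other mutable reference to buf survives.
    return assumeUnique(buf);
}

void main()
{
    immutable(int)[] table = loadTable(4);
    assert(table == [0, 1, 4, 9]);
    static assert(!__traits(compiles, table[0] = 42));

    // Copying is the other route: .idup makes an independent immutable copy.
    int[] scratch = [1, 2, 3];
    immutable(int)[] frozen = scratch.idup;
    scratch[0] = 99;
    assert(frozen[0] == 1);
}
```

The assumeUnique route is exactly the kind of unchecked promise the post worries about; the copying route is checked by the compiler but costs an allocation.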

March 18, 2007
Andrei Alexandrescu (See Website For Email) wrote:
>> How about “const const”?  No new keywords, no reuse of keywords in disparate contexts, and equally noticeable to the eye.
> 
> It was on the plate briefly. Probably it would be an uphill battle to convince people to enjoy it.

I’m still kinda shaky on the question of why the various extra specifiers are needed.  If a variable is being declared, “const” ought to mean nobody can change it; if it’s a function parameter, it’s a promise that the function won’t change it.  How vital is it that the language express “put this in read-only memory” as opposed to “don’t put this in read-only memory, but complain if I try to change it”?  Why should a function care if it got a pointer to read-only memory or if it just promised not to alter the value?

For numeric constants that don’t need addresses, I’d suggest “define” rather than “final”, as in:
	define float PI = 3.14159;
If the “#if” syntax for “static if” is adopted, I might even suggest “#define”, just to make it clear that something is happening at compile-time rather than run-time.

--Joel
March 18, 2007
See new thread for reply.
March 18, 2007
Andrei Alexandrescu (See Website For Email) wrote:
> kris wrote:
>> So, "invariant" is already a keyword ... what about that?
> I completely missed that one. I think it's a good idea to look into it as a candidate for a qualifier name. Thank you.

I agree. I think:

final
const
invariant

for the three cases looks pretty good.
March 18, 2007
On Sat, 17 Mar 2007 19:09:16 -0700, Walter Bright wrote:

I'm sorry I'm so thick, but have I got the idea right yet ... ?

> final
 This applies only to assignments to Write-Once RAM locations. This can be
done by either the compiler or at run time depending on the amount of
knowledge the compiler has about the location's usage.

> const
 This is applied to declarations to prevent code in the same scope as the
declaration from being able to modify the item being declared.

> invariant
 This is applied to declarations to prevent code in the same application as
the declaration from being able to modify the item being declared.


As you can see, I'm confused as to how the qualifier affects which code is allowed to change which items. Even more so when it comes to reference items ... 'cos I'm not sure how to use these qualifiers to specify whether the reference and/or the data being referenced can be changed, and by whom.


-- 
Derek Parnell
Melbourne, Australia
"Justice for David Hicks!"
skype: derek.j.parnell
March 18, 2007
Derek Parnell wrote:
>> final
>  This applies only to assignments to Write-Once RAM locations. This can be
> done by either the compiler or at run time depending on the amount of
> knowledge the compiler has about the location's usage.

No. This applies to rebinding of a name:
	final x = 3;
	x = 4;		// error, x is final
i.e. final applies to the declared name, not the type.

>> const
>  This is applied to declarations to prevent code in the same scope as the
> declaration from being able to modify the item being declared.

No. This means that it is a readonly view of data - other views of the same data may change it.
	char[] s = "hello".dup;
	const char[] t = s;
	t[0] = 'j';	// error, const char[] is a readonly view
	s[0] = 'j';	// ok
	writefln(t);	// prints "jello"
	t = s[1..2];	// ok, as t is not final
Note that const applies to the type, not the name.
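
The same example can be checked mechanically in later versions of D. A minimal sketch, assuming a D2 compiler: there, a plain `const char[] t` also fixes the name, so the non-final view described above is spelled const(char)[] instead.

```d
void main()
{
    char[] s = "hello".dup;
    const(char)[] t = s;             // read-only view of s's characters

    // Writing through the view does not compile.
    static assert(!__traits(compiles, t[0] = 'j'));

    s[0] = 'j';                      // the owner can still write
    assert(t == "jello");            // and the change shows through the view

    t = s[1 .. 2];                   // rebinding the name t is fine
    assert(t == "e");
}
```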

>> invariant
>  This is applied to declarations to prevent code in the same application as
> the declaration from being able to modify the item being declared.

Almost right. It isn't the declaration, but the *type* that is invariant. Invariant applies to the type, not the name.

> As you can see, I'm confused as to how the qualifier affects which code is
> allowed to change which items. Even more so when it comes to reference
> items ... 'cos I'm not sure how to use these qualifiers to specify whether
> the reference and/or the data being referenced can be changed, and by whom.

'final' is a storage class, like 'static'. It doesn't apply to the type of the symbol, only the symbol itself.
'const' and 'invariant' apply to the type of the symbol, not the symbol.
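
How the two type qualifiers relate can be spelled out as compile-time checks. A minimal sketch, assuming a later D2 compiler, in which 'invariant' was renamed 'immutable'; it does not model 'final', which has no direct equivalent there for variables:

```d
void main()
{
    // Mutable and immutable data both convert to a const (read-only) view...
    static assert(is(char[] : const(char)[]));
    static assert(is(immutable(char)[] : const(char)[]));

    // ...but mutable data does not implicitly become immutable, and a
    // const view does not convert back to mutable.
    static assert(!is(char[] : immutable(char)[]));
    static assert(!is(const(char)[] : char[]));

    // The qualifier is part of the type; rebinding the name is separate.
    immutable(char)[] s = "abc";
    s = "xyz";                                        // the name can be rebound
    static assert(!__traits(compiles, s[0] = 'q'));   // the data cannot change
    assert(s == "xyz");
}
```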