January 27, 2007
Andrei Alexandrescu (See Website For Email) wrote:
> Kevin Bealer wrote:
>> I would prefer for the types passed to a template to carry a lot more info across, and let the user decide which parts to throw away.  You could call this a "strict" template, if you want backward compatibility (but it's early in D's template design so maybe we don't care).
>>
>> I'm thinking that this (current) syntax:
>>
>> template foo(T) {
>>    T plus(T x, T y) { return x + y; }
>> }
>>
>> would be equivalent to:
>>
>> strict template foo(T) {
>>     T.value plus(T.value x, T.value y) { return x + y; }
>> }
>>
>> In a 'strict' template the type "T" would carry a lot of baggage.  It would know whether T was constant (as in a local "const int x = 10") or (maybe) a constant view, and whether it is an lvalue, rvalue, literal, known-at-compile-time, static array, result of arithmetic, etc.
>>
>> I couldn't tell you the actual names and definitions of all of these attributes -- I imagine they could be added one at a time as the need arose, starting with an equivalent for "storageof".  
> [snip]
> 
> These are great points. The direction I hope to steer things in would be to (backward-compatibly) continue matching types "lossily", but to also give you the ability to pattern-match the extra info when you want.
> 
> The syntax that I am proposing does away with storageof and is very much in spirit with the current D:
> 
> S int foo(S, T)(S T t) { }
> 
> It does exactly what it says it does, in a clear and terse manner, and is 100% within the spirit of existing D:
> 
> * It's customized on symbols S and T
> 
> * It's matching (by sheer position of S and T) the storage (i.e., all of the gooey fuzzy nice information about the argument passed) and the type of the incoming argument
> 
> * In this example it uses the storage in the result type (e.g. const goes to const)
> 
> * Inside the function you can easily use either T separately or S T as a group that transports the storage around. You have total flexibility (and don't forget that juxtaposition is always easier than extraction).
> 
> So far we're only thinking of storage-like attributes, but a good deal of information can be encoded under the loose declaration of "storage".
> 
> 
> Andrei

I like this; does this mean that when declaring an instance, you would do something like this?  Or do the 'const' parts get deduced?

alias Foo!(const, int) TheFoo;

Kevin
January 27, 2007
Walter Bright wrote:
> BCS wrote:
>> Reply to Walter,
>>
>>> I didn't invalidate it because it is a bug. The lookup thing was a debate about whether Java-style or C++-style overriding is done. D does C++ style overriding.
>>>
>>
>> Is it? VS complains if there is any question; I haven't checked GCC. Or are you talking about something different?
> 
> Looks like something different.

Could you maybe explain what you mean by "C++-style
overriding" as opposed to Java-style overriding?  In
what way is D more like C++ here than like Java?

The overriding rules seem similar in C++ and in Java,
though the name/function lookup rules are very
different, and there are restrictions on Java related
to changing visibility when overriding.

-- James
January 27, 2007
Andrei Alexandrescu (See Website For Email) wrote:
> Frits van Bommel wrote:
>> Andrei Alexandrescu (See Website For Email) wrote:
>>> The syntax that I am proposing does away with storageof and is very much in spirit with the current D:
>>>
>>> S int foo(S, T)(S T t) { }
>>>
>>> It does exactly what it says it does, in a clear and terse manner, and is 100% within the spirit of existing D:
>>>
>>> * It's customized on symbols S and T
>>>
>>> * It's matching (by sheer position of S and T) the storage (i.e., all of the gooey fuzzy nice information about the argument passed) and the type of the incoming argument
>>>
>>> * In this example it uses the storage in the result type (e.g. const goes to const)
>>>
>>> * Inside the function you can easily use either T separately or S T as a group that transports the storage around. You have total flexibility (and don't forget that juxtaposition is always easier than extraction).
>>>
>>> So far we're only thinking of storage-like attributes, but a good deal of information can be encoded under the loose declaration of "storage".
>>
>> Would it also be possible to 'cherry-pick' attributes?
>> So that e.g. something like S.constness expands to either 'const' or ''?
>>
>> And would this mean 'raw' storage attributes would be valid template parameters? So that something like foo!(const) would be valid syntax?
> 
> You will be able to cherry-pick with "is" tests. Walter is opposed to manipulating the raw storage attributes.

If you have to cherry-pick with "is" tests to separate the bits out of the S above, then why not just use "is" tests to pick the S info off the type to begin with?

--bb
January 27, 2007
Andrei Alexandrescu (See Website For Email) wrote:
> kris wrote:
> [about implicit conversion rules]
> 
>> extern (C) int printf (char*, ...);
>>
>> class Foo
>> {
>>         void write (int x) {printf("int\n");}
>>         void write (uint x) {printf("uint\n");}
>>         void write (long x) {printf("long\n");}
>>         void write (char x) {printf("char\n");}
>>         void write (wchar x) {printf("wchar\n");}
>>         void write (double x) {printf("double\n");}
>>
>>         void write (char[] x) {printf("char[]\n");}
>>         void write (wchar[] x) {printf("wchar[]\n");}
>> }
>>
>> void main()
>> {
>>         auto foo = new Foo;
>>
>>         foo.write ('c');
>>         foo.write (1);
>>         foo.write (1u);
>>         foo.write (3.14);
>>         //foo.write ("asa");
>> }
>>
>> prints:
>>
>> char
>> int
>> uint
>> double
>>
>>
>> DMD has actually become smarter than the last time I tried something like this: it manages to select the correct overload for 'c' whereas before it couldn't decide whether int or uint was a better match for the char instead. This is good.
>> It seems clear from the above that D /defaults/ the type of character, and undecorated integers, to something appropriate? In the above case 'c' is defaulted to char, rather than wchar, for example. The undecorated int constant is defaulted to int, rather than uint or long. This is good.
> 
> 
> Yup. So far so good.
> 
>> Now for the broken part. When you uncomment the string constant, the compiler gets all confused about whether it's a char[] or wchar[]. There is no defaulting to one type, as there is for other constants (such as char). It /is/ possible to decorate the string constant in a similar manner to decorating integer constants:
>>
>> foo.write ("qwe"c);
>>
>> And this, of course, compiles. It's a PITA though, and differs from the rules for other constants.
> 
> 
> I talked to Walter about this and it's not a bug, it's a feature :o). Basically it's hard to decide what to do with an unadorned string when both wchar[] and char[] would want to "attract" it. I understand you're leaning towards defaulting to char[]? Then probably others will be unhappy.
> 


You'll have noticed that the constant 'c' defaults to /char/, and that there's no compile-time conflict between the write(char) & write(wchar)?  Are people unhappy about that too? Perhaps defaulting of char constants and int constants should be abolished also?

I just want the compiler to be consistent, and it's quite unlikely that I'm alone in that regard -- consistency is a very powerful tool. Besides, aren't so-called 'features' actually bugs, coyly renamed by the marketing department? :)

BTW: if you remove the write(char) overload, the compiler says it doesn't know which of the int/uint overloads to select for 'c', and completely ignores write(wchar) as a viable option. That seems reasonable, but it clearly shows that a char constant is being defaulted to type char; and it's vaguely amusing in a twisted manner <g>


>> Things start to go south when using templates with string constants. For example, take this template sig:
>>
>> uint locate(T) (T[] source, T match, uint start=0)
>>
>> This is intended to handle types of char[], wchar[] and dchar[]. There's a uint on the end, as opposed to an int. Suppose I call it like this:
>>
>> locate ("abc", "ab", 1);
>>
>> we get a compile error, since the int-constant does not match a uint in the sig (IFTI currently needs exact sig matches). In order to get around this, we wrap the template with a few functions:
>>
>> uint locate (char[] source, char[] match, uint start=0)
>> {
>>     return locateT!(char) (source, match, start);
>> }
>>
>> uint locate (wchar[] source, wchar[] match, uint start=0)
>> {
>>     return locateT!(wchar) (source, match, start);
>> }
>>
>> and dchar too.
>>
>> Now we call it:
>>
>> locate ("abc", "ab", 1);
>>
>> Well, the int/uint error goes away (since function matching operates differently than IFTI matching), but we've now got our old friend back again -- the constant char[], wchar[], dchar[] mismatch problem.
> 
> 
> I think a sound solution to this should be found. It's kind of hard, because char[] is the worst match but also probably the most used one. The most generous match is dchar[] but wastes much time and space for the minority of cases in which it's useful.

In the FWIW department, after writing several truckloads of text-oriented library code and wrappers for external text-processing libs, I've reached a simple conclusion: utf8 is where it's at for ~80% of code written, IMO. There's probably a ~15% need to go to utf32 for serious text handling (Word Processor, etc), and utf16 is the half-way house that the remaining percentage resort to when compromising (such as when stuffing things into ROM).

Before the flames rise up and engulf that claim, let's consider one major exclusion: certain GUI APIs use utf16 throughout. What the heck does one do in that situation if the compiler defaults string constants to char[] instead of wchar[]?

Well, it's actually no issue at all, since those APIs typically don't also have method overloads for char/dchar. They have only utf16 signatures for any given method name, because they deal only in utf16. Thus, the compiler can happily morph a string constant to a wchar[] instead of the default -- *just as it happily does today*.

Let's also keep in mind we're talking string constants only, rather than all strings. I'll just try not to harp on about the consistency mismatch between char & char[] constants any more than I have done already.


> 
>> How about another type of template? Here's one that does some simple text processing:
>>
>> T[] layout(T) (T[] output, T[] format, T[][] subs...)
>>
>> This has an output buffer, a format string, and a set of optional args; all of the same type. If I call it like so:
>>
>> char[128] tmp;
>> char[]    world = "world";
>> layout (tmp, "hello %1", world);
>>
>> that compiles ok. If I use wchar[] instead, it doesn't compile:
>>
>> wchar[128] tmp;
>> wchar[]    world = "world";
>> layout (tmp, "hello %1", world);
>>
>> In this case, the constant string used for formatting remains as a char[], so the template match fails (args: wchar[], char[], wchar[])
>>
>> However, if I change the template signature to this instead:
>>
>> T[] layout(T) (T[] output, T[][] subs...)
>>
>> then everything works the way I want it to, but the design is actually wrong (the format string is now not required). String constants can be a royal PITA.
> 
> 
> Color me convinced. :o) I have no bright ideas on solving it though.

Perhaps a change to a default type might take care of it? After all, this particular issue is specific to string-constants only; not for other types (such as char/int/long/float).

Certainly worth a try, one would think?


> 
>> inout
>> -----
>>
>> Since you're working on inout also, I'd like to ask what the plan is relating to a couple of examples. Tango uses this style of call quite regularly:
>>
>> get (inout int x);
>> get (inout char[] x);
>> etc.
>>
>> This is a clean way to pass-by-reference, instead of dealing with all the pointer syntax. I sure hope D will retain reference semantics like this, in some form?
> 
> 
> It will, and in the same form.

Grand!

> 
>> One current problem with inout, which you might not be aware of, is with regard to const structs. I need to pass structs by reference, because I don't want to pass them by value. Applying inout is the mechanism for describing this:
>>
>> struct Bar {int a, b;}
>>
>> Bar b = {1, 2};
>>
>> void parf (inout Bar x) {}
>>
>> void main()
>> {
>>    parf (b);
>> }
>>
>> That all works fine. However, when I want to make those structs /const/ instead, I cannot use inout since it has mutating semantics: I get a compile error to that effect:
>>
>> const Bar b = {1, 2};
>>
>>  >> Error: cannot modify const variable 'b'
>>
>> That is, there's no supported way to pass a const struct by reference. The response from Walter in the past has been "just use a pointer instead" ... well, yes I could do that. But it appears to be indicative of a problem with the language design?
> 
> 
> This case is on the list. You will definitely have a sane and simple way to pass const structs by reference, while having a guarantee that they can't be changed by the callee.

Praise the lord!


>> Why do I want to use const? Well, the data held therein is for reference only, and (via some future D vendor) I want that reference data placed into a ROM segment. I do a lot of work with MCUs, and this sort of thing is a common requirement.
> 
> 
> I agree.
> 
> 
> Andrei


Cheers; this kind of detailed reply (above) is very much appreciated.

- Kris
January 27, 2007
Kevin Bealer wrote:
> Andrei Alexandrescu (See Website For Email) wrote:
[about deducing storage types]
>> The syntax that I am proposing does away with storageof and is very much in spirit with the current D:
>>
>> S int foo(S, T)(S T t) { }
>>
>> It does exactly what it says it does, in a clear and terse manner, and is 100% within the spirit of existing D:
>>
>> * It's customized on symbols S and T
>>
>> * It's matching (by sheer position of S and T) the storage (i.e., all of the gooey fuzzy nice information about the argument passed) and the type of the incoming argument
>>
>> * In this example it uses the storage in the result type (e.g. const goes to const)
>>
>> * Inside the function you can easily use either T separately or S T as a group that transports the storage around. You have total flexibility (and don't forget that juxtaposition is always easier than extraction).
>>
>> So far we're only thinking of storage-like attributes, but a good deal of information can be encoded under the loose declaration of "storage".
>>
>>
>> Andrei
> 
> I like this; does this mean that when declaring an instance, you would do something like this?  Or do the 'const' parts get deduced?
> 
> alias Foo!(const, int) TheFoo;

The intent is to deduce the storage, but specifying it explicitly works just as well.

There is strong resistance against my notation because manipulating the storage class separately is something entirely new, which means a lot of work in the compiler implementation. Also, S is not a type but a (new kind of) alias, which might confuse people who read:

template Foo(S, T)
{
  ...
}

and expect S and T to be types, just to discover in the body of the template that S is actually a qualifier.

The strawman we're discussing now looks like this:

void foo(auto T)(T t) { }

meaning, T will match whatever you throw at foo, including storage:

int a = 5;
foo(a);     // T == inout int
foo(a + 1); // T == int

Either way, one thing is clear - defining storage classes properly is a focal area.


Andrei
January 27, 2007
James Dennett wrote:
> Could you maybe explain what you mean by "C++-style
> overriding" as opposed to Java-style overriding?  In
> what way is D more like C++ here than like Java?

In C++, overriding one function in a base class prevents use of any of the base class function's overloads. In Java, you can override one overload and inherit the rest.
January 27, 2007
Walter Bright wrote:
> James Dennett wrote:
>> Could you maybe explain what you mean by "C++-style
>> overriding" as opposed to Java-style overriding?  In
>> what way is D more like C++ here than like Java?
> 
> In C++, overriding one function in a base class prevents use of any of the base class function's overloads. In Java, you can override one overload and inherit the rest.

OK; I'd say this is a name lookup difference, not a difference in how overriding is done.

In C++ it doesn't prevent the use of the base class
function's overloads: they can still be used if you
name their scope, or if you bring them into the scope
of the derived class with a using declaration.  (It
certainly does prevent certain idiomatic uses of those
base class functions unless you add a using declaration.
It turns out to generally be a bad idea to overload
virtual functions for design reasons unrelated to the
specifics of a particular language though, so this
isn't much of an issue in practice.)

In Java, lookup is done by signature, not just name,
and there's no such thing as "hiding" of a name in
a base class by a name in a derived class.

Both systems have their pros and cons, and D could
reasonably resemble either, so long as the choice
"feels" consistent with other decisions in the design
of D.

-- James
January 27, 2007
kris wrote:
> Andrei Alexandrescu (See Website For Email) wrote:
>> kris wrote:
>> [about implicit conversion rules]
>>
>>> extern (C) int printf (char*, ...);
>>>
>>> class Foo
>>> {
>>>         void write (int x) {printf("int\n");}
>>>         void write (uint x) {printf("uint\n");}
>>>         void write (long x) {printf("long\n");}
>>>         void write (char x) {printf("char\n");}
>>>         void write (wchar x) {printf("wchar\n");}
>>>         void write (double x) {printf("double\n");}
>>>
>>>         void write (char[] x) {printf("char[]\n");}
>>>         void write (wchar[] x) {printf("wchar[]\n");}
>>> }
>>>
>>> void main()
>>> {
>>>         auto foo = new Foo;
>>>
>>>         foo.write ('c');
>>>         foo.write (1);
>>>         foo.write (1u);
>>>         foo.write (3.14);
>>>         //foo.write ("asa");
>>> }
>>>
>>> prints:
>>>
>>> char
>>> int
>>> uint
>>> double
[snip]
>>> Now for the broken part. When you uncomment the string constant, the compiler gets all confused about whether it's a char[] or wchar[]. There is no defaulting to one type, as there is for other constants (such as char). It /is/ possible to decorate the string constant in a similar manner to decorating integer constants:
>>>
>>> foo.write ("qwe"c);
>>>
>>> And this, of course, compiles. It's a PITA though, and differs from the rules for other constants.
>>
>> I talked to Walter about this and it's not a bug, it's a feature :o). Basically it's hard to decide what to do with an unadorned string when both wchar[] and char[] would want to "attract" it. I understand you're leaning towards defaulting to char[]? Then probably others will be unhappy.
> 
> You'll have noticed that the constant 'c' defaults to /char/, and that there's no compile-time conflict between the write(char) & write(wchar)?  Are people unhappy about that too? Perhaps defaulting of char constants and int constants should be abolished also?

It's a bit more complicated with character literals than just defaulting to 'char':
-----
import std.stdio;
void main() {
    writefln(typeid(typeof('a')));          // c <= \u007f
    writefln(typeid(typeof('\uabcd')));     // c <= \uffff
    writefln(typeid(typeof('\U000abcde'))); // c <= \U0010ffff

}
-----
outputs:
"""
char
wchar
dchar
"""
So it defaults to the *smallest* character type that can hold it in one element.
Pretty cool, actually.

This also applies to other types, by the way. If you type an integer literal that won't fit into an 'int', it'll be a 'long' constant (assuming it fits), not an 'int' constant.

Perhaps we should do something similar with string literals, defaulting them to an array of the smallest character type that can hold all of the characters in the string (i.e. the maximum "character type" of the component characters)?
That seems like a reasonable rule. And it has the added benefit that "some string constant".length will always be the number of characters (code points) in the string.
January 27, 2007
Bill Baxter wrote:
> Andrei Alexandrescu (See Website For Email) wrote:
>> You will be able to cherry-pick with "is" tests. Walter is opposed to manipulating the raw storage attributes.
> 
> If you have to cherry-pick with "is" tests to separate the bits out of the S above, then why not just use "is" tests to pick the S info off the type to begin with?

Because just 'S' is much shorter when you don't want to cherry-pick?
January 27, 2007
Andrei Alexandrescu (See Website For Email) wrote:
> kris wrote:
>>
>> Now we call it:
>>
>> locate ("abc", "ab", 1);
>>
>> Well, the int/uint error goes away (since function matching operates differently than IFTI matching), but we've now got our old friend back again -- the constant char[], wchar[], dchar[] mismatch problem.
> 
> I think a sound solution to this should be found. It's kind of hard, because char[] is the worst match but also probably the most used one. The most generous match is dchar[] but wastes much time and space for the minority of cases in which it's useful.

I suppose this is where the difference of opinion comes in, but why do you think char[] is the worst match?  From what I understand, UTF-8 seems preferable to UTF-16 in most cases.  The algorithm for dealing with such strings must be essentially the same in both cases, and UTF-8 tends to be more compact on average.


Sean