July 19, 2006
Sounds like a great idea to me.  Easy to implement, improves correctness and performance.  What are we waiting for?

-Craig


July 19, 2006
Craig Black wrote:
> Sounds like a great idea to me.  Easy to implement, improves correctness and performance.  What are we waiting for?

Personally, I'm waiting/hoping for Walter to see the proposal and say what he thinks :)

I'm also wondering whether the "overwhelming" response to the proposal is because
- I didn't write "proposal" in the subject
- it's from me (I used to argue in a bad way too much, I'm sure I'm being filtered at least by some people :)
- it's so bad it's not even worth a comment
- it's so good everybody is already waiting for Walter to say yes ;)


xs0


July 19, 2006
"Reiner Pope" <reiner.pope@gmail.com> wrote in message news:e9kunq$qli$1@digitaldaemon.com...
> You get the speed gains from avoiding all unnecessary duplications, a feat which simple (a la C++) static const-checking can't achieve. Imagine that we had a static const-checking system in D:
>
> const char[] tolower(const char[] input)
> // the input must be const, because we agree with CoW, so we won't change it
> // Because of below, we also declare the output of the function const
> {
>   // do some stuff
>   if ( a write is necessary )
>   { // copy it into another variable, since we can't change input (it's const)
>   }
>   return something;
> // This something could possibly be input, so it also needs to be declared const. So we go back and make the return value of the function also a const.
> }
>
> // Now, since the return value is const, we *must* dup it whenever we call it. This is *very* inefficient if we own the string, because we get two unnecessary dups. This is a big price to pay just to keep static const-checking.
>
>
>> c) can be implemented now by defining:
>> struct vector
>> {
>>     bool readonly;
>>     T*  data;
>>     uint length;
>> }
> Yes and no. It can be implemented like that because that would effectively copy exactly what an array does already, but a) it takes up more memory than what xs0 proposed, and b) it isn't supported natively by the language's arrays, so it is less likely to be used.
>

The proposed readonly flag solves one particular, pretty narrow case of COW (only for arrays, and only in functions aware of this flag).

C++ has a better and more universal mechanism for this.

inline string &
    string::operator= ( const string &s )
  {
    release_data();
    set_data ( s.data );
    return *this;
  }

inline string & string::operator += ( const string &s )
  {
    mutate(*this);
    resize( length() + s.length() );
    .....
    return *this;
  }
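
For anyone following along, the two fragments above could be fleshed out roughly as follows. This is only a minimal single-threaded sketch, not the actual string class being quoted: `CowString`, `set`, `str` and `shares_buffer_with` are hypothetical names, and `std::shared_ptr` stands in for the hand-written `release_data` / `set_data` reference counting.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical copy-on-write string (single-threaded sketch).
// std::shared_ptr replaces the manual release_data / set_data bookkeeping.
class CowString {
public:
    CowString(const char* s = "") : data_(std::make_shared<std::string>(s)) {}

    // The compiler-generated copy constructor and operator= copy only the
    // shared_ptr, so assignment shares the buffer instead of duplicating it.

    CowString& operator+=(const CowString& s) {
        mutate();                  // get a private buffer before writing
        data_->append(*s.data_);
        return *this;
    }

    void set(std::size_t i, char c) {
        mutate();
        (*data_)[i] = c;
    }

    char operator[](std::size_t i) const { return (*data_)[i]; }
    std::size_t length() const { return data_->size(); }
    const std::string& str() const { return *data_; }

    // True when both strings still point at the same buffer.
    bool shares_buffer_with(const CowString& o) const { return data_ == o.data_; }

private:
    // Duplicate the buffer only if someone else also references it.
    void mutate() {
        if (data_.use_count() > 1)
            data_ = std::make_shared<std::string>(*data_);
    }

    std::shared_ptr<std::string> data_;
};
```

Assigning one `CowString` to another costs a reference-count update; the duplication is deferred until the first `+=` or `set` on a shared buffer.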

I believe that COW arrays (strings in particular), if they are needed, cannot
be implemented in D without operator= for structures.
Reference counting cannot be done in D with the same elegance as in C++.

But in a pure GC world, COW strings are not used.
Strings in Java, C#, JavaScript, etc. are immutable character ranges -
string as a type simply has no such thing as str[i] = 'o';
There are strong reasons for that.

An extended typedef and alias would allow D to have strings as value types without any additional runtime cost.

Andrew Fedoniouk.
http://terrainformatica.com



July 19, 2006
Andrew Fedoniouk wrote:
>>> typedef  string char[]
>>> {
>>>     disable opAssign;
>>>     ....
>>>     char[] tolower() { ..... }
>>> }

Is there any particular difference from

struct string
{
    char[] data;

    char[] tolower() { .... }
}

?

> I think that such extended typedef makes sense for other basic types:
> 
> typedef color uint
> {
>     uint red() {  .... }
>     uint blue() {  .... }
>     uint green() {  .... }
> }

Is there any particular difference from

struct color {
    uint value;
    uint red() { ... }
    ...
}

?


> Also such typedef makes sense for classes too.

I don't get that.. Since you seem to want a new type, what's wrong with deriving?

> To avoid vtbl pollution. Especially relevant for templated classes.

Make the methods or class final, then they don't go into vtbl (or at least shouldn't).


xs0


July 19, 2006
"xs0" <xs0@xs0.com> wrote in message news:e9lu7n$2jv0$1@digitaldaemon.com...
> Andrew Fedoniouk wrote:
>>>> typedef  string char[]
>>>> {
>>>>     disable opAssign;
>>>>     ....
>>>>     char[] tolower() { ..... }
>>>> }
>
> Is there any particular difference from
>
> struct string
> {
>     char[] data;
>
>     char[] tolower() { .... }
> }

The difference is in the disabled opAssign, so if you define, say, the following:

typedef  string char[]
{
    disable opSliceAssign;
    ....
    char[] tolower() { ..... }
}

then you will not be able to compile the following:

string s = "something read-only";
s[0..s.length] = '\0'; // compile time error.

>
> ?
>
>> I think that such extended typedef makes sense for other basic types:
>>
>> typedef color uint
>> {
>>     uint red() {  .... }
>>     uint blue() {  .... }
>>     uint green() {  .... }
>> }
>
> Is there any particular difference from
>
> struct color {
>     uint value;
>     uint red() { ... }
>     ...
> }
>
> ?

The difference is that such a color is inherently a uint, so
you can do the following:

color c = 0xFF00FF;
c <<= 8;
uint r = c.red();

>
>
>> Also such typedef makes sense for classes too.
>
> I don't get that.. Since you seem to want a new type, what's wrong with deriving?
>

See:

alias NewClass OldClass
{
    void foo() { .... }
}

will not create a new VTBL for NewClass.
It is just syntactic sugar:

Instead of defining and using:

void foo_x( OldClass c ) { ..... }

You can use

NewClass nc = ....;
nc.foo();


>> To avoid vtbl pollution. Especially relevant for templated classes.
>
> Make the methods or class final, then they don't go into vtbl (or at least shouldn't).
>
>

All classes in D have a VTBL by definition, as far as I remember.

Andrew Fedoniouk.
http://terrainformatica.com




July 19, 2006
Andrew Fedoniouk wrote:
> "xs0" <xs0@xs0.com> wrote in message news:e9lu7n$2jv0$1@digitaldaemon.com...
>> Andrew Fedoniouk wrote:
>>>>> typedef  string char[]
>>>>> {
>>>>>     disable opAssign;
>>>>>     ....
>>>>>     char[] tolower() { ..... }
>>>>> }
>> Is there any particular difference from
>>
>> struct string
>> {
>>     char[] data;
>>
>>     char[] tolower() { .... }
>> }
> 
> The difference is in the disabled opAssign, so if you define,
> say, the following:
> 
> typedef  string char[]
> {
>     disable opSliceAssign;
>     ....
>     char[] tolower() { ..... }
> }
> 
> then you will not be able to compile the following:
> 
> string s = "something read-only";
> s[0..s.length] = '\0'; // compile time error.

But you can also not define opSliceAssign in struct string, and get the compile time error?


>>> I think that such extended typedef makes sense for other basic types:
>>>
>>> typedef color uint
>>> {
>>>     uint red() {  .... }
>>>     uint blue() {  .... }
>>>     uint green() {  .... }
>>> }
>> Is there any particular difference from
>>
>> struct color {
>>     uint value;
>>     uint red() { ... }
>>     ...
>> }
>>
>> ?
> 
> The difference is that such a color is inherently a uint, so
> you can do the following:
> 
> color c = 0xFF00FF;
> c <<= 8;
> uint r = c.red();

Besides the first line, you can do the same with a struct. And I'd say it's good that color and uint are not fully interchangeable, considering how they have nothing in common; one is a 32-bit integer, the other is more a byte[3] or byte[4], and even then you can't really say that a 'generic 8-bit integer' and a 'level of red intensity' have much in common..
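
As a rough illustration of the struct approach, here is a C++ sketch. The name `Color`, the 0x00RRGGBB channel layout and the accessor bodies are assumptions made for the example; nothing in the thread specifies them.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical wrapper: a distinct type around a 32-bit value.
// The 0x00RRGGBB layout is an assumption for this sketch.
struct Color {
    std::uint32_t value;

    // explicit: a plain integer does not silently become a Color.
    explicit Color(std::uint32_t v) : value(v) {}

    // Only the integer operations we choose to expose are available.
    Color& operator<<=(unsigned n) { value <<= n; return *this; }

    std::uint32_t red()   const { return (value >> 16) & 0xFF; }
    std::uint32_t green() const { return (value >> 8) & 0xFF; }
    std::uint32_t blue()  const { return value & 0xFF; }
};

// The "first line" (color c = 0xFF00FF;) is the part that does not carry
// over: there is no implicit conversion from an integer.
static_assert(!std::is_convertible<std::uint32_t, Color>::value,
              "Color and uint32_t are deliberately not interchangeable");
```

With this, `Color c(0xFF00FF); c <<= 8; c.red();` compiles, while `Color c = 0xFF00FF;` does not.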


>>> Also such typedef makes sense for classes too.
>> I don't get that.. Since you seem to want a new type, what's wrong with deriving?
> 
> See:
> 
> alias NewClass OldClass
> {
>     void foo() { .... }
> }
> 
> will not create a new VTBL for NewClass.
> It is just syntactic sugar:
> 
> Instead of defining and using:
> 
> void foo_x( OldClass c ) { ..... }
> 
> You can use
> 
> NewClass nc = ....;
> nc.foo();

Well, if you override a function, it should be virtual. If it didn't exist and is final, the compiler should be able to determine it can call it directly. If it is not final (meaning you plan to override it in a further derived class), it should again be virtual.. So I don't see the problem..

Also, why would you want a non-member function to look like it is a member function? Just causes confusion..

Finally, bar.foo() isn't really a shorthand for foo(bar), being one character longer.. Seems more like syntactic saccharin :)


>>> To avoid vtbl pollution. Especially relevant for templated classes.
>> Make the methods or class final, then they don't go into vtbl (or at least shouldn't).
> 
> All classes in D have a VTBL by definition, as far as I remember.

Yup.

xs0


July 19, 2006
xs0 wrote:
> But you can also not define opSliceAssign in struct string, and get the compile time error?

I think you are missing the point of this proposal (which I like a lot by the way).

(Andrew Fedoniouk, if I have misinterpreted your proposal, please excuse me.)

The purpose is not to extend a type, as with class inheritance, or to create a new type, as with a new class or struct, but to alter the behavior of an existing type, to allow for things like read-only types and color types that can be passed as is to common graphics APIs that expect uints, etc.

This would add power to the language that is not available at the moment. An alternative, weaker (and in my opinion worse) syntax for this would be:

struct string : char[]
{
	// possibly
	override opSliceAssign(){throw new Exception("");}

	//or
	disable opSliceAssign();
}

Here we introduce struct inheritance and the use of built-in types as base classes, but due to the extending nature of inheritance (the child being a superset of the parent) the disable syntax is bad, and the use of inheritance forces the use of the virtual table (which structs and built-in types don't have).

The proposed syntax, on the other hand, allows for this:

typedef char[] string
{
	disable opSliceAssign;
	string toLower(){..}
}

If we used inheritance this would not be possible, because we remove opSliceAssign from the interface of the type.

Here we use the new syntax to describe a "starting from" relationship, while inheritance creates an "is a" relationship.

We could create an entirely new type using structs and so on, but then we would have to specify all the methods and fields of the new type rather than just our changes. This is very much against the principles of code reuse.

/Johan Granberg

P.S. Walter, this could be a nice feature to have, as it would allow the creation of subsets of types (and intersecting types) as well as supersets.

July 19, 2006
I think I need to explain the idea using different words.

In terms of C++

"char[]" and
"const char[]"
are two distinct types.

"const char[]" is a reduced version of "char[]"

Reduced means that "const char[]" as a type has no
mutating methods like length(uint newLength),
opIndexAssign, etc.

An extended typedef allows you to define
such const types explicitly by reducing
the set of operations (which C++ does implicitly),
and also allows you to extend such types with new
methods.

The main value of the approach is for array and
pointer types, I guess.
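
To make the "reduced type" idea concrete, here is how it might be spelled out by hand in C++. This is only a sketch: `ReadOnlyString` is a hypothetical name, and it uses `std::string` as the starting type instead of `char[]`.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical "reduced" type: starts from std::string, keeps the read-only
// operations, drops the mutating ones, and adds one new method.
class ReadOnlyString {
public:
    ReadOnlyString(std::string s) : data_(std::move(s)) {}

    // Kept operations. operator[] returns by value, so s[i] = 'o' cannot compile.
    char operator[](std::size_t i) const { return data_[i]; }
    std::size_t length() const { return data_.size(); }

    // Added operation: returns a fresh, mutable lowercase copy.
    std::string tolower() const {
        std::string out = data_;
        for (char& ch : out)
            ch = static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));
        return out;
    }

private:
    std::string data_;
};

// Compile-time check that element assignment is not part of the interface.
static_assert(
    !std::is_assignable<decltype(std::declval<ReadOnlyString&>()[0]), char>::value,
    "element assignment is not available on ReadOnlyString");
```

Every kept operation has to be forwarded by hand here, which is the repetition the extended typedef is meant to avoid.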

Andrew Fedoniouk.
http://terrainformatica.com


July 20, 2006
"Johan Granberg" <lijat.meREM@OVEgmail.com> wrote in message news:e9m79k$2sf$1@digitaldaemon.com...
> xs0 wrote:
>> But you can also not define opSliceAssign in struct string, and get the compile time error?
>
> I think you are missing the point of this proposal (which I like a lot by the way).
>
> (Andrew Fedoniouk, if I have misinterpreted your proposal, please excuse me.)

Johan, you've got it right.

>
> The purpose is not to extend a type, as with class inheritance, or to create a new type, as with a new class or struct, but to alter the behavior of an existing type, to allow for things like read-only types and color types that can be passed as is to common graphics APIs that expect uints, etc.

Exactly. The main purpose is not to extend classes but to
give an opportunity to extend intrinsic and value types.
It makes real sense for arrays, integers, enums, etc.

Again, external methods for arrays are here anyway - this is a good chance to legalize them.

>
> This would add power to the language that is not available at the moment. An alternative, weaker (and in my opinion worse) syntax for this would be:
>
> struct string : char[]
> {
> // possibly
> override opSliceAssign(){throw new Exception("");}
>
> //or
> disable opSliceAssign();
> }
>
> Here we introduce struct inheritance and the use of built-in types as base classes, but due to the extending nature of inheritance (the child being a superset of the parent) the disable syntax is bad, and the use of inheritance forces the use of the virtual table (which structs and built-in types don't have).
>
> The proposed syntax, on the other hand, allows for this:
>
> typedef char[] string
> {
> disable opSliceAssign;
> string toLower(){..}
> }
>
> If we used inheritance this would not be possible, because we remove opSliceAssign from the interface of the type.
>
> Here we use the new syntax to describe a "starting from" relationship, while inheritance creates an "is a" relationship.
>
> We could create an entirely new type using structs and so on, but then we would have to specify all the methods and fields of the new type rather than just our changes. This is very much against the principles of code reuse.

Exactly!

Consider this:
typedef uint color { ubyte red() {....} }
I want color to keep all the attributes and operations of uint, but to
give it a couple of specific methods. The thing is that the declaration
of such a type will force all methods to be declared in one place.
An Intellisense engine will like such declarations....

>
> /Johan Granberg
>
> P.S. Walter, this could be a nice feature to have, as it would allow the creation of subsets of types (and intersecting types) as well as supersets.
>

Yep. You've got the idea. Subsets and supersets are the right words.

Andrew Fedoniouk.


July 20, 2006
Andrew Fedoniouk wrote:
> "Reiner Pope" <reiner.pope@gmail.com> wrote in message news:e9kunq$qli$1@digitaldaemon.com...
>> You get the speed gains from avoiding all unnecessary duplications, a feat which simple (a la C++) static const-checking can't achieve. Imagine that we had a static const-checking system in D:
>>
>> const char[] tolower(const char[] input)
>> // the input must be const, because we agree with CoW, so we won't change it
>> // Because of below, we also declare the output of the function const
>> {
>>   // do some stuff
>>   if ( a write is necessary )
>>   { // copy it into another variable, since we can't change input (it's const)
>>   }
>>   return something;
>> // This something could possibly be input, so it also needs to be declared const. So we go back and make the return value of the function also a const.
>> }
>>
>> // Now, since the return value is const, we *must* dup it whenever we call it. This is *very* inefficient if we own the string, because we get two unnecessary dups. This is a big price to pay just to keep static const-checking.
>>
>>
>>> c) can be implemented now by defining:
>>> struct vector
>>> {
>>>     bool readonly;
>>>     T*  data;
>>>     uint length;
>>> }
>> Yes and no. It can be implemented like that because that would effectively copy exactly what an array does already, but a) it takes up more memory than what xs0 proposed, and b) it isn't supported natively by the language's arrays, so it is less likely to be used.
>>
> 
> The proposed readonly flag solves one particular, pretty narrow case of COW
> (only for arrays, and only in functions aware of this flag).
You have to be aware of CoW if you are writing a CoW function. It's like saying that the opIndexAssign property of arrays is limited because it can only be used by the functions that know about it. I see this proposal as an alternative to C++-style const, and with regard to functions being aware of the features, xs0's solution is better because it avoids const propagation throughout the code.

> 
> C++ has a better and more universal mechanism for this.
> 
> inline string &
>     string::operator= ( const string &s )
>   {
>     release_data();
>     set_data ( s.data );
>     return *this;
>   }
> 
This appears to be copying the contents of s into this. In D terms, this is a duplication, which is the runtime cost we are trying to avoid.

> inline string & string::operator += ( const string &s )
>   {
>     mutate(*this);
>     resize( length() + s.length() );
>     .....
>     return *this;
>   }
> 

The other point to make is that this seems not to be a C++ feature, but rather a library feature. I'm probably not understanding your examples, but can you, say, provide C++ code to match the following D code's functionality while avoiding unnecessary duplicates _and having const safety_:

char[] foo = "foo";
foo = tolower(toupper(foo));

I don't see how you can manage that with static const-checking. Please explain, and maybe then I can understand how the C++ solution is 'better'.
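
For comparison, here is roughly what the run-time flag approach looks like when transliterated into C++. It is only a sketch under assumptions: the layout is modeled on the `readonly` / `data` / `length` struct quoted above, not on the exact proposal. `Chars`, `literal`, `transform_cow`, `to_upper` and `to_lower` are hypothetical names, and `std::shared_ptr` stands in for memory the D garbage collector would manage.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical array-with-flag. readonly == true means "whoever holds this
// must copy before the first write"; false means "the holder owns the buffer".
struct Chars {
    std::shared_ptr<std::string> data;  // stands in for GC-managed storage
    bool readonly;
};

// A string literal starts out read-only.
inline Chars literal(const char* s) {
    return Chars{std::make_shared<std::string>(s), true};
}

// Apply fn to every character, duplicating at most once and only if a
// write is actually needed on a read-only buffer.
template <typename Fn>
Chars transform_cow(Chars in, Fn fn) {
    for (std::size_t i = 0; i < in.data->size(); ++i) {
        const char old_c = (*in.data)[i];
        const char new_c = fn(old_c);
        if (new_c == old_c) continue;       // nothing to write
        if (in.readonly) {                  // first write to a buffer we don't own
            in.data = std::make_shared<std::string>(*in.data);
            in.readonly = false;            // the copy is ours
        }
        (*in.data)[i] = new_c;
    }
    return in;  // writable if we copied or already owned it
}

inline Chars to_upper(Chars in) {
    return transform_cow(std::move(in), [](char c) {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    });
}

inline Chars to_lower(Chars in) {
    return transform_cow(std::move(in), [](char c) {
        return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    });
}
```

With these definitions, `to_lower(to_upper(literal("foo")))` duplicates the buffer once, inside `to_upper`; `to_lower` then receives a writable result and changes it in place.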

> I believe that COW arrays (strings in particular), if they are needed, cannot
> be implemented in D without operator= for structures.
This seems to be a tangential issue. xs0's solution appears to work, and you haven't outlined a technical reason for it not working. If Walter integrates it into D, then that isn't going to cause any problems.

> Reference counting cannot be done in D with the same elegance as in C++.
I don't see why it can't, but ignoring that, I also don't see why we need ref-counting for CoW strings. Doesn't mark-and-sweep manage it better?

> But in a pure GC world, COW strings are not used.
> Strings in Java, C#, JavaScript, etc. are immutable character ranges -
But Java, C# and JavaScript are not fast languages, so the importance of fast string processing is largely diminished. However, D's string processing capabilities are good, and since it is possible to keep them, why shouldn't we?
> string as a type simply has no such thing as str[i] = 'o';
> There are strong reasons for that.
I'm not aware of them. From my experience in Java and C# it is extremely cumbersome to process strings, with all the calls to foo.substring(0, 2); and so on. The other downside is that the processing is *slow* *as*.

Cheers,

Reiner