November 19, 2009
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail@erdani.org)'s article
> Steven Schveighoffer wrote:
> > On Thu, 19 Nov 2009 12:01:25 -0500, dsimcha <dsimcha@yahoo.com> wrote:
> >
> >> == Quote from Andrei Alexandrescu (SeeWebsiteForEmail@erdani.org)'s
> >> article
> >>> dsimcha wrote:
> >>> > == Quote from Andrei Alexandrescu (SeeWebsiteForEmail@erdani.org)'s
> >>> article
> >>> >> Yes, it will be because the book has a few failing unittests. In
> >>> fact, I
> >>> >> was hoping I could talk you or David into doing it :o). Andrei
> >>> >
> >>> > Unfortunately, I've come to hate the MRU idea because it would fail
> >>> miserably for
> >>> > large arrays.  I've explained this before, but not particularly
> >>> thoroughly, so
> >>> > I'll try to explain it more thoroughly here.  Let's say you have an
> >>> array that
> >>> > takes up more than half of the total memory you are using.  You try
> >>> to append to
> >>> > it and:
> >>> >
> >>> > 1.  The GC runs.  The MRU cache is therefore cleared.
> >>> >
> >>> > 2.  Your append succeeds, but the array is reallocated.
> >>> >
> >>> > 3.  You try to append again.  Now, because you have a huge piece of
> >>> garbage that
> >>> > you just created by reallocating on the last append, the GC needs
> >>> to run again.
> >>> > The MRU cache is cleared again.
> >>> >
> >>> > 4.  Goto 2.
> >>> This is not a matter of principles, but one of implementation. When you GC, you can adjust the cache instead of clearing it.
> >>
> >> Technically true, but what is a matter of principles is whether the
> >> implementation
> >> of arrays should be very tightly coupled to the implementation of the
> >> GC.  Fixing
> >> this issue would have massive ripple effects throughout the already
> >> spaghetti
> >> code-like GC, and might affect GC performance.  For every single
> >> object the GC
> >> freed, it would have to look through the MRU cache and remove it from
> >> there if
> >> present, too.
> >
> > You perform the lookup via MRU cache (after mark, before sweep).  I see it as a single function call at the right place in the GC.
> >
> >> The point is that this *can* be done, but we probably don't *want* to
> >> introduce this kind of coupling, especially if we want our GC model to
> >> be sane
> >> enough that people might actually come along and write us a better GC
> >> one day.
> >
> > What about implementing it as a hook "do this between mark and sweep"? Then it becomes decoupled from the GC.
> >
> > -Steve
> I think these are great ideas, but you'd need to transport certain
> information to the cache so it can adjust its pointers. Anyhow, I
> believe this is worth exploring because it can help with a great many
> other things such as weak pointers and similar checks and adjustments
> (there was a paper on GC assertions that I don't have time to dig right
> now. Aw what the heck, found it:
> http://www.eecs.tufts.edu/~eaftan/gcassertions-mspc-2008.pdf
> Andrei

The hook doesn't sound like a bad idea, but it raises a lot of issues with the
implementation details.  These are things I could figure out given plenty of time.
 I'd like weak refs, too.  However, I don't think this makes the short list for D2
because:

1.  Doing it at all properly requires a lot of thought about what a good design for such an API should be and how to implement it efficiently.

2.  I think we still need an ArrayBuilder or something because, while the MRU would be reasonably efficient, it still wouldn't be as efficient as an ArrayBuilder, and would do nothing to solve the uniqueness problem.  Therefore, I think fleshing out ArrayBuilder is a higher priority.  I was thinking of a design something like this:

abstract class Array {
    // A bunch of final methods for .length, opIndex, etc.
    // No .ptr or opSlice.
}

class UniqueArray : Array {
   // Still no .ptr or opSlice.  Has .toImmutable, which allows
   // for conversion to immutability iff the elements are either
   // pure value types or themselves immutable.
   //
   // Also, can deterministically delete old arrays on reallocation,
   // since it owns a unique reference, leading to more GC-efficient
   // appending.
}

class ArrayBuilder : Array {
   // Add opSlice and .ptr.  Appending doesn't deterministically
   // delete old arrays, even if the GC supports this.  No guarantees
   // about uniqueness.
}
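
Since the D classes above are only commented outlines, here is a runnable sketch (in Python, with hypothetical names) of the ownership semantics they describe: a read-only base, a unique owner with destructive conversion to immutability, and a builder that exposes raw access.

```python
class Array:
    """Base: read-only access; no way to leak the underlying storage."""
    def __init__(self, data):
        self._data = list(data)

    def __len__(self):
        return len(self._data)

    def __getitem__(self, i):       # opIndex analogue: returns by value
        return self._data[i]


class UniqueArray(Array):
    """Owns the only reference, so it can hand the data off as immutable."""
    def append(self, x):
        # As the sole owner it could free the old buffer deterministically
        # on reallocation (Python's list does the equivalent for us here).
        self._data.append(x)

    def to_immutable(self):
        # Destructive extraction: give up the mutable reference so the
        # returned value really is the only view of the data.
        data, self._data = tuple(self._data), None
        return data


class ArrayBuilder(Array):
    """Adds raw access (the analogue of .ptr/opSlice); no uniqueness."""
    def append(self, x):
        self._data.append(x)

    def slice(self):
        return self._data           # deliberately exposes mutable storage
```

The key design point the sketch captures: only `ArrayBuilder` ever hands out the mutable storage, and `UniqueArray` invalidates itself when it converts to immutable.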
November 19, 2009
aarti_pl writes:
> Andrei Alexandrescu writes:
>> 2. User-defined operators must be revamped. Fortunately Don already put in an important piece of functionality (opDollar). What we're looking at is a two-pronged attack motivated by Don's proposal:
>>
>> http://prowiki.org/wiki4d/wiki.cgi?LanguageDevel/DIPs/DIP7
>>
>> The two prongs are:
>>
>> * Encode operators by compile-time strings. For example, instead of the plethora of opAdd, opMul, ..., we'd have this:
>>
>> T opBinary(string op)(T rhs) { ... }
>>
>> The string is "+", "*", etc. We need to design what happens with read-modify-write operators like "+=" (should they be dispatched to a different function? etc.) and also what happens with index-and-modify operators like "[]=", "[]+=", etc. Should we go with proxies? Absorb them in opBinary? Define another dedicated method? etc.
>>
>> * Loop fusion that generalizes array-wise operations. This idea of Walter is, I think, very good because it generalizes and democratizes "magic". The idea is that, if you do
>>
>> a = b + c;
>>
>> and b + c does not make sense but b and c are ranges for which a.front = b.front + c.front does make sense, to automatically add the iteration paraphernalia.
>>
(..)
>> Andrei
> 
> I kinda like this proposal. But I would rather declare the templates like below:
> 
> T opInfix(string op)(T rhs) { ... }
> T opPrefix(string op)(T rhs) { ... }
> T opPostfix(string op)(T rhs) { ... }
> 
> and allow users to define their own operators (though it doesn't have to be done now).
> 
> I know that quite a few people here don't like letting users define their own operators, because it might obfuscate code. But it doesn't have to be like this. Someone here already mentioned that it is not a real problem in C++ programs. Good libraries don't abuse this functionality.
> 
> User-defined operators would allow easy definition of Domain Specific Languages in D. I already wrote about this some time ago:
> 
> http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=81026 
> 
> http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=81352 
> 
> 
> BR
> Marcin Kuszczak
> (aarti_pl)

Of course the signatures for opPrefix/opPostfix will be different:
T opPrefix(string op)() { ... }
T opPostfix(string op)() { ... }

Sorry for mistake.

BR
Marcin Kuszczak
(aarti_pl)
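
The corrected signatures above differ in arity: an infix operator takes a right-hand side, while prefix and postfix operators take none. A rough sketch of that shape (in Python, dispatching on a runtime string, whereas in D the string would be a compile-time template argument; all names here are hypothetical):

```python
class Num:
    def __init__(self, value):
        self.value = value

    def op_infix(self, op, rhs):      # T opInfix(string op)(T rhs)
        if op == "+":
            return Num(self.value + rhs.value)
        if op == "~~":                # a user-invented infix operator:
            return Num(int(f"{self.value}{rhs.value}"))  # digit concat
        raise NotImplementedError(op)

    def op_prefix(self, op):          # T opPrefix(string op)()
        if op == "-":
            return Num(-self.value)
        raise NotImplementedError(op)

    def op_postfix(self, op):         # T opPostfix(string op)()
        if op == "!":                 # factorial as a postfix operator
            out = 1
            for k in range(2, self.value + 1):
                out *= k
            return Num(out)
        raise NotImplementedError(op)
```

One method per fixity, keyed by the operator's spelling, is exactly the consolidation the proposal is after: no plethora of opAdd/opMul-style names.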
November 19, 2009
aarti_pl wrote:
> aarti_pl writes:
>> Andrei Alexandrescu writes:
>>> 2. User-defined operators must be revamped. Fortunately Don already put in an important piece of functionality (opDollar). What we're looking at is a two-pronged attack motivated by Don's proposal:
>>>
>>> http://prowiki.org/wiki4d/wiki.cgi?LanguageDevel/DIPs/DIP7
>>>
>>> The two prongs are:
>>>
>>> * Encode operators by compile-time strings. For example, instead of the plethora of opAdd, opMul, ..., we'd have this:
>>>
>>> T opBinary(string op)(T rhs) { ... }
>>>
>>> The string is "+", "*", etc. We need to design what happens with read-modify-write operators like "+=" (should they be dispatched to a different function? etc.) and also what happens with index-and-modify operators like "[]=", "[]+=", etc. Should we go with proxies? Absorb them in opBinary? Define another dedicated method? etc.
>>>
>>> * Loop fusion that generalizes array-wise operations. This idea of Walter is, I think, very good because it generalizes and democratizes "magic". The idea is that, if you do
>>>
>>> a = b + c;
>>>
>>> and b + c does not make sense but b and c are ranges for which a.front = b.front + c.front does make sense, to automatically add the iteration paraphernalia.
>>>
> (..)
>>> Andrei
>>
>> I kinda like this proposal. But I would rather declare the templates like below:
>>
>> T opInfix(string op)(T rhs) { ... }
>> T opPrefix(string op)(T rhs) { ... }
>> T opPostfix(string op)(T rhs) { ... }
>>
>> and allow users to define their own operators (though it doesn't have to be done now).
>>
>> I know that quite a few people here don't like letting users define their own operators, because it might obfuscate code. But it doesn't have to be like this. Someone here already mentioned that it is not a real problem in C++ programs. Good libraries don't abuse this functionality.
>>
>> User-defined operators would allow easy definition of Domain Specific Languages in D. I already wrote about this some time ago:
>>
>> http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=81026 
>>
>> http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=81352 
>>
>>
>> BR
>> Marcin Kuszczak
>> (aarti_pl)
> 
> Of course the signatures for opPrefix/opPostfix will be different:
> T opPrefix(string op)() { ... }
> T opPostfix(string op)() { ... }
> 
> Sorry for mistake.
> 
> BR
> Marcin Kuszczak
> (aarti_pl)

I think we'll solve postfix "++" without requiring the user to define it. Do you envision user-defined postfix operators?

Andrei
November 19, 2009
dsimcha wrote:
> == Quote from Andrei Alexandrescu (SeeWebsiteForEmail@erdani.org)'s article
>> Steven Schveighoffer wrote:
>>> On Thu, 19 Nov 2009 12:01:25 -0500, dsimcha <dsimcha@yahoo.com> wrote:
>>>
>>>> == Quote from Andrei Alexandrescu (SeeWebsiteForEmail@erdani.org)'s
>>>> article
>>>>> dsimcha wrote:
>>>>>> == Quote from Andrei Alexandrescu (SeeWebsiteForEmail@erdani.org)'s
>>>>> article
>>>>>>> Yes, it will be because the book has a few failing unittests. In
>>>>> fact, I
>>>>>>> was hoping I could talk you or David into doing it :o).
>>>>>>> Andrei
>>>>>> Unfortunately, I've come to hate the MRU idea because it would fail
>>>>> miserably for
>>>>>> large arrays.  I've explained this before, but not particularly
>>>>> thoroughly, so
>>>>>> I'll try to explain it more thoroughly here.  Let's say you have an
>>>>> array that
>>>>>> takes up more than half of the total memory you are using.  You try
>>>>> to append to
>>>>>> it and:
>>>>>>
>>>>>> 1.  The GC runs.  The MRU cache is therefore cleared.
>>>>>>
>>>>>> 2.  Your append succeeds, but the array is reallocated.
>>>>>>
>>>>>> 3.  You try to append again.  Now, because you have a huge piece of
>>>>> garbage that
>>>>>> you just created by reallocating on the last append, the GC needs
>>>>> to run again.
>>>>>> The MRU cache is cleared again.
>>>>>>
>>>>>> 4.  Goto 2.
>>>>> This is not a matter of principles, but one of implementation. When you
>>>>> GC, you can adjust the cache instead of clearing it.
>>>> Technically true, but what is a matter of principles is whether the
>>>> implementation
>>>> of arrays should be very tightly coupled to the implementation of the
>>>> GC.  Fixing
>>>> this issue would have massive ripple effects throughout the already
>>>> spaghetti
>>>> code-like GC, and might affect GC performance.  For every single
>>>> object the GC
>>>> freed, it would have to look through the MRU cache and remove it from
>>>> there if
>>>> present, too.
>>> You perform the lookup via MRU cache (after mark, before sweep).  I see
>>> it as a single function call at the right place in the GC.
>>>
>>>> The point is that this *can* be done, but we probably don't *want* to
>>>> introduce this kind of coupling, especially if we want our GC model to
>>>> be sane
>>>> enough that people might actually come along and write us a better GC
>>>> one day.
>>> What about implementing it as a hook "do this between mark and sweep"?
>>> Then it becomes decoupled from the GC.
>>>
>>> -Steve
>> I think these are great ideas, but you'd need to transport certain
>> information to the cache so it can adjust its pointers. Anyhow, I
>> believe this is worth exploring because it can help with a great many
>> other things such as weak pointers and similar checks and adjustments
>> (there was a paper on GC assertions that I don't have time to dig right
>> now. Aw what the heck, found it:
>> http://www.eecs.tufts.edu/~eaftan/gcassertions-mspc-2008.pdf
>> Andrei
> 
> The hook doesn't sound like a bad idea, but it raises a lot of issues with the
> implementation details.  These are things I could figure out given plenty of time.
>  I'd like weak refs, too.  However, I don't think this makes the short list for D2
> because:
> 
> 1.  Doing it at all properly requires a lot of thought about what a good design
> for such an API should be and how to implement it efficiently.
> 
> 2.  I think we still need an ArrayBuilder or something because, while the MRU
> would be reasonably efficient, it still wouldn't be as efficient as an
> ArrayBuilder, and would do nothing to solve the uniqueness problem.  Therefore, I
> think fleshing out ArrayBuilder is a higher priority.  I was thinking of a design
> something like this:
> 
> abstract class Array {
>     // A bunch of final methods for .length, opIndex, etc.
>     // No .ptr or opSlice.
> }
> 
> class UniqueArray : Array {
>    // Still no .ptr or opSlice.  Has .toImmutable, which allows
>    // for conversion to immutability iff the elements are either
>    // pure value types or themselves immutable.
>    //
>    // Also, can deterministically delete old arrays on reallocation,
>    // since it owns a unique reference, leading to more GC-efficient
>    // appending.
> }
> 
> class ArrayBuilder : Array {
>    // Add opSlice and .ptr.  Appending doesn't deterministically
>    // delete old arrays, even if the GC supports this.  No guarantees
>    // about uniqueness.
> }

What does .toImmutable return? As far as I can tell making UniqueArray a class can't work because by definition you give up controlling how many references to the array there could be in a program.

Andrei
November 19, 2009
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail@erdani.org)'s article
> dsimcha wrote:
> > == Quote from Andrei Alexandrescu (SeeWebsiteForEmail@erdani.org)'s article
> >> Steven Schveighoffer wrote:
> >>> On Thu, 19 Nov 2009 12:01:25 -0500, dsimcha <dsimcha@yahoo.com> wrote:
> >>>
> >>>> == Quote from Andrei Alexandrescu (SeeWebsiteForEmail@erdani.org)'s
> >>>> article
> >>>>> dsimcha wrote:
> >>>>>> == Quote from Andrei Alexandrescu (SeeWebsiteForEmail@erdani.org)'s
> >>>>> article
> >>>>>>> Yes, it will be because the book has a few failing unittests. In
> >>>>> fact, I
> >>>>>>> was hoping I could talk you or David into doing it :o). Andrei
> >>>>>> Unfortunately, I've come to hate the MRU idea because it would fail
> >>>>> miserably for
> >>>>>> large arrays.  I've explained this before, but not particularly
> >>>>> thoroughly, so
> >>>>>> I'll try to explain it more thoroughly here.  Let's say you have an
> >>>>> array that
> >>>>>> takes up more than half of the total memory you are using.  You try
> >>>>> to append to
> >>>>>> it and:
> >>>>>>
> >>>>>> 1.  The GC runs.  The MRU cache is therefore cleared.
> >>>>>>
> >>>>>> 2.  Your append succeeds, but the array is reallocated.
> >>>>>>
> >>>>>> 3.  You try to append again.  Now, because you have a huge piece of
> >>>>> garbage that
> >>>>>> you just created by reallocating on the last append, the GC needs
> >>>>> to run again.
> >>>>>> The MRU cache is cleared again.
> >>>>>>
> >>>>>> 4.  Goto 2.
> >>>>> This is not a matter of principles, but one of implementation. When you GC, you can adjust the cache instead of clearing it.
> >>>> Technically true, but what is a matter of principles is whether the
> >>>> implementation
> >>>> of arrays should be very tightly coupled to the implementation of the
> >>>> GC.  Fixing
> >>>> this issue would have massive ripple effects throughout the already
> >>>> spaghetti
> >>>> code-like GC, and might affect GC performance.  For every single
> >>>> object the GC
> >>>> freed, it would have to look through the MRU cache and remove it from
> >>>> there if
> >>>> present, too.
> >>> You perform the lookup via MRU cache (after mark, before sweep).  I see it as a single function call at the right place in the GC.
> >>>
> >>>> The point is that this *can* be done, but we probably don't *want* to
> >>>> introduce this kind of coupling, especially if we want our GC model to
> >>>> be sane
> >>>> enough that people might actually come along and write us a better GC
> >>>> one day.
> >>> What about implementing it as a hook "do this between mark and sweep"? Then it becomes decoupled from the GC.
> >>>
> >>> -Steve
> >> I think these are great ideas, but you'd need to transport certain
> >> information to the cache so it can adjust its pointers. Anyhow, I
> >> believe this is worth exploring because it can help with a great many
> >> other things such as weak pointers and similar checks and adjustments
> >> (there was a paper on GC assertions that I don't have time to dig right
> >> now. Aw what the heck, found it:
> >> http://www.eecs.tufts.edu/~eaftan/gcassertions-mspc-2008.pdf
> >> Andrei
> >
> > The hook doesn't sound like a bad idea, but it raises a lot of issues with the
> > implementation details.  These are things I could figure out given plenty of time.
> >  I'd like weak refs, too.  However, I don't think this makes the short list for D2
> > because:
> >
> > 1.  Doing it at all properly requires a lot of thought about what a good design for such an API should be and how to implement it efficiently.
> >
> > 2.  I think we still need an ArrayBuilder or something because, while the MRU would be reasonably efficient, it still wouldn't be as efficient as an ArrayBuilder, and would do nothing to solve the uniqueness problem.  Therefore, I think fleshing out ArrayBuilder is a higher priority.  I was thinking of a design something like this:
> >
> > abstract class Array {
> >     // A bunch of final methods for .length, opIndex, etc.
> >     // No .ptr or opSlice.
> > }
> >
> > class UniqueArray : Array {
> >    // Still no .ptr or opSlice.  Has .toImmutable, which allows
> >    // for conversion to immutability iff the elements are either
> >    // pure value types or themselves immutable.
> >    //
> >    // Also, can deterministically delete old arrays on reallocation,
> >    // since it owns a unique reference, leading to more GC-efficient
> >    // appending.
> > }
> >
> > class ArrayBuilder : Array {
> >    // Add opSlice and .ptr.  Appending doesn't deterministically
> >    // delete old arrays, even if the GC supports this.  No guarantees
> >    // about uniqueness.
> > }
> What does .toImmutable return? As far as I can tell making UniqueArray a
> class can't work because by definition you give up controlling how many
> references to the array there could be in a program.
> Andrei

Sorry, forgot to flesh out a few details.

1.  .toImmutable() returns an immutable slice, but also sets UniqueArray's pointer
member to null, so no instance of UniqueArray has a mutable reference any longer.
 After this is called, the UniqueArray object will be invalid unless reinitialized.

2.  After thinking about this some more, the big issue I see is ref opIndex.  We
can either:
    a.  Disallow it for both UniqueArray and ArrayBuilder.
    b.  Allow it for both UniqueArray and ArrayBuilder and accept
        that a sufficiently dumb programmer can invalidate the
        guarantees of UniqueArray by taking the address of one of the
        elements and saving it somewhere.  Probably a bad idea, since
        assumeUnique() already works for the careful programmer, and
        UniqueArray is supposed to provide ironclad guarantees.
    c.  Don't define opIndex in the abstract base class at all, thus
        making Array almost useless as an abstract base class.
November 19, 2009
dsimcha wrote:
> == Quote from Andrei Alexandrescu (SeeWebsiteForEmail@erdani.org)'s article
>> dsimcha wrote:
>>> == Quote from Andrei Alexandrescu (SeeWebsiteForEmail@erdani.org)'s article
>>>> Steven Schveighoffer wrote:
>>>>> On Thu, 19 Nov 2009 12:01:25 -0500, dsimcha <dsimcha@yahoo.com> wrote:
>>>>>
>>>>>> == Quote from Andrei Alexandrescu (SeeWebsiteForEmail@erdani.org)'s
>>>>>> article
>>>>>>> dsimcha wrote:
>>>>>>>> == Quote from Andrei Alexandrescu (SeeWebsiteForEmail@erdani.org)'s
>>>>>>> article
>>>>>>>>> Yes, it will be because the book has a few failing unittests. In
>>>>>>> fact, I
>>>>>>>>> was hoping I could talk you or David into doing it :o).
>>>>>>>>> Andrei
>>>>>>>> Unfortunately, I've come to hate the MRU idea because it would fail
>>>>>>> miserably for
>>>>>>>> large arrays.  I've explained this before, but not particularly
>>>>>>> thoroughly, so
>>>>>>>> I'll try to explain it more thoroughly here.  Let's say you have an
>>>>>>> array that
>>>>>>>> takes up more than half of the total memory you are using.  You try
>>>>>>> to append to
>>>>>>>> it and:
>>>>>>>>
>>>>>>>> 1.  The GC runs.  The MRU cache is therefore cleared.
>>>>>>>>
>>>>>>>> 2.  Your append succeeds, but the array is reallocated.
>>>>>>>>
>>>>>>>> 3.  You try to append again.  Now, because you have a huge piece of
>>>>>>> garbage that
>>>>>>>> you just created by reallocating on the last append, the GC needs
>>>>>>> to run again.
>>>>>>>> The MRU cache is cleared again.
>>>>>>>>
>>>>>>>> 4.  Goto 2.
>>>>>>> This is not a matter of principles, but one of implementation. When you
>>>>>>> GC, you can adjust the cache instead of clearing it.
>>>>>> Technically true, but what is a matter of principles is whether the
>>>>>> implementation
>>>>>> of arrays should be very tightly coupled to the implementation of the
>>>>>> GC.  Fixing
>>>>>> this issue would have massive ripple effects throughout the already
>>>>>> spaghetti
>>>>>> code-like GC, and might affect GC performance.  For every single
>>>>>> object the GC
>>>>>> freed, it would have to look through the MRU cache and remove it from
>>>>>> there if
>>>>>> present, too.
>>>>> You perform the lookup via MRU cache (after mark, before sweep).  I see
>>>>> it as a single function call at the right place in the GC.
>>>>>
>>>>>> The point is that this *can* be done, but we probably don't *want* to
>>>>>> introduce this kind of coupling, especially if we want our GC model to
>>>>>> be sane
>>>>>> enough that people might actually come along and write us a better GC
>>>>>> one day.
>>>>> What about implementing it as a hook "do this between mark and sweep"?
>>>>> Then it becomes decoupled from the GC.
>>>>>
>>>>> -Steve
>>>> I think these are great ideas, but you'd need to transport certain
>>>> information to the cache so it can adjust its pointers. Anyhow, I
>>>> believe this is worth exploring because it can help with a great many
>>>> other things such as weak pointers and similar checks and adjustments
>>>> (there was a paper on GC assertions that I don't have time to dig right
>>>> now. Aw what the heck, found it:
>>>> http://www.eecs.tufts.edu/~eaftan/gcassertions-mspc-2008.pdf
>>>> Andrei
>>> The hook doesn't sound like a bad idea, but it raises a lot of issues with the
>>> implementation details.  These are things I could figure out given plenty of time.
>>>  I'd like weak refs, too.  However, I don't think this makes the short list for D2
>>> because:
>>>
>>> 1.  Doing it at all properly requires a lot of thought about what a good design
>>> for such an API should be and how to implement it efficiently.
>>>
>>> 2.  I think we still need an ArrayBuilder or something because, while the MRU
>>> would be reasonably efficient, it still wouldn't be as efficient as an
>>> ArrayBuilder, and would do nothing to solve the uniqueness problem.  Therefore, I
>>> think fleshing out ArrayBuilder is a higher priority.  I was thinking of a design
>>> something like this:
>>>
>>> abstract class Array {
>>>     // A bunch of final methods for .length, opIndex, etc.
>>>     // No .ptr or opSlice.
>>> }
>>>
>>> class UniqueArray : Array {
>>>    // Still no .ptr or opSlice.  Has .toImmutable, which allows
>>>    // for conversion to immutability iff the elements are either
>>>    // pure value types or themselves immutable.
>>>    //
>>>    // Also, can deterministically delete old arrays on reallocation,
>>>    // since it owns a unique reference, leading to more GC-efficient
>>>    // appending.
>>> }
>>>
>>> class ArrayBuilder : Array {
>>>    // Add opSlice and .ptr.  Appending doesn't deterministically
>>>    // delete old arrays, even if the GC supports this.  No guarantees
>>>    // about uniqueness.
>>> }
>> What does .toImmutable return? As far as I can tell making UniqueArray a
>> class can't work because by definition you give up controlling how many
>> references to the array there could be in a program.
>> Andrei
> 
> Sorry, forgot to flesh out a few details.
> 
> 1.  .toImmutable() returns an immutable slice, but also sets UniqueArray's pointer
> member to null, so no instance of UniqueArray has a mutable reference any longer.
>  After this is called, the UniqueArray object will be invalid unless reinitialized.

Ok, so destructive extraction. Perfect.

> 2.  After thinking about this some more, the big issue I see is ref opIndex.  We
> can either:
>     a.  Disallow it for both UniqueArray and ArrayBuilder.
>     b.  Allow it for both UniqueArray and ArrayBuilder and accept
>         that a sufficiently dumb programmer can invalidate the
>         guarantees of UniqueArray by taking the address of one of the
>         elements and saving it somewhere.  Probably a bad idea, since
>         assumeUnique() already works for the careful programmer, and
>         UniqueArray is supposed to provide ironclad guarantees.
>     c.  Don't define opIndex in the abstract base class at all, thus
>         making Array almost useless as an abstract base class.

Welcome to my demons :o).

One possibility that I thought of for a long time would be to disallow taking the address of a ref. That reduces the scope of the problem but doesn't eliminate it:

void main() {
    auto a = new UniqueArray(10);
    fun(a[0], a);
}

void fun(ref int a, UniqueArray b)
{
   auto imm = b.toImmutable();
   // here a is a mutable alias into an immutable array!!!
}

So that doesn't work, but I thought I'd mention it :o).

Another possibility is to have opIndex return by value and also define opIndexAssign to set the value. That would be a no-no in C++ because copying can be arbitrarily expensive, but I have a feeling that in D it is sensible to foster the convention that all objects be cheap to copy (e.g. via refcounting, COW, etc.). If we go by the notion that in D we can always assume copy costs are reasonable, this last possibility would work. With the newfangled operators, it would even work beautifully, because you can do all sorts of things like a[1] += 4 without ever exposing a ref to the user.
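
Incidentally, Python's indexing already works this way, which makes for a quick illustration of the value-return scheme (the class name is mine): a[1] += 4 desugars to a read by value followed by a write, so no reference to an element ever escapes.

```python
class SafeArray:
    def __init__(self, data):
        self._data = list(data)

    def __getitem__(self, i):         # opIndex analogue: return by value
        return self._data[i]

    def __setitem__(self, i, value):  # opIndexAssign analogue: write value
        self._data[i] = value


a = SafeArray([0, 1, 2])
a[1] += 4   # read element 1 by value, add 4, write 5 back; no ref exposed
```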


Andrei
November 19, 2009
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail@erdani.org)'s article
> > 2.  After thinking about this some more, the big issue I see is ref opIndex.  We
> > can either:
> >     a.  Disallow it for both UniqueArray and ArrayBuilder.
> >     b.  Allow it for both UniqueArray and ArrayBuilder and accept
> >         that a sufficiently dumb programmer can invalidate the
> >         guarantees of UniqueArray by taking the address of one of the
> >         elements and saving it somewhere.  Probably a bad idea, since
> >         assumeUnique() already works for the careful programmer, and
> >         UniqueArray is supposed to provide ironclad guarantees.
> >     c.  Don't define opIndex in the abstract base class at all, thus
> >         making Array almost useless as an abstract base class.
> Welcome to my demons :o).
> One possibility that I thought of for a long time would be to disallow
> taking the address of a ref. That reduces the scope of the problem but
> doesn't eliminate it:
> void main() {
>      auto a = new UniqueArray(10);
>      fun(a[0], a);
> }
> void fun(ref int a, UniqueArray b)
> {
>     auto imm = b.toImmutable();
>     // here a is a mutable alias into an immutable array!!!
> }
> So that doesn't work, but I thought I'd mention it :o).
> Another possibility is to expose opIndex to return by value and also
> opIndexAssign that sets the value. That would be a no-no in C++ because
> copying is arbitrarily expensive, but I have a feeling that in D it is
> sensible to consider and foster that all objects should be defined to be
> cheap to copy (e.g. refcounting, COW etc.) If we go by the notion that
> in D we can always assume copy costs are reasonable, this last
> possibility would work. With the newfangled operators, it would even
> work beautifully because you can do all sorts of things like a[1] += 4
> without ever exposing a ref to the user.
> Andrei

I wonder if it would be feasible to allow overloading on ref vs. non-ref return. Technically this would be overloading on return type, but without many of the practical problems.  If the return value is used as an lvalue, the ref return function gets called.  If the return value is only used as an rvalue, the non-ref function gets called.

This would allow return by value to be defined in the base class and return by reference to only be defined in ArrayBuilder.
November 19, 2009
aarti_pl:

> T opInfix(string op)(T rhs) { ... }
> T opPrefix(string op)(T rhs) { ... }
> T opPostfix(string op)(T rhs) { ... }

So you can use opInfix to define operators like a ~~ b :-)
Now you just need a way to define the operator precedence level, with an int number in 0-14, and you're done (I don't think you need to specify whether operator associativity is left-to-right or right-to-left) :-)
T opInfix(string op, int prec)(T rhs) { ... }
There are programming languages that allow specifying operator precedence levels too, but I think this is overkill in D.

Regarding operators, in D they are named according to their purpose and not according to their look, and I think this is a good idea. But opDollar doesn't follow that, so isn't a name like opEnd better?

Bye,
bearophile
November 19, 2009
Andrei Alexandrescu writes:
> aarti_pl wrote:
>> aarti_pl writes:
>>> Andrei Alexandrescu writes:
>>>> 2. User-defined operators must be revamped. Fortunately Don already put in an important piece of functionality (opDollar). What we're looking at is a two-pronged attack motivated by Don's proposal:
>>>>
>>>> http://prowiki.org/wiki4d/wiki.cgi?LanguageDevel/DIPs/DIP7
>>>>
>>>> The two prongs are:
>>>>
>>>> * Encode operators by compile-time strings. For example, instead of the plethora of opAdd, opMul, ..., we'd have this:
>>>>
>>>> T opBinary(string op)(T rhs) { ... }
>>>>
>>>> The string is "+", "*", etc. We need to design what happens with read-modify-write operators like "+=" (should they be dispatched to a different function? etc.) and also what happens with index-and-modify operators like "[]=", "[]+=", etc. Should we go with proxies? Absorb them in opBinary? Define another dedicated method? etc.
>>>>
>>>> * Loop fusion that generalizes array-wise operations. This idea of Walter is, I think, very good because it generalizes and democratizes "magic". The idea is that, if you do
>>>>
>>>> a = b + c;
>>>>
>>>> and b + c does not make sense but b and c are ranges for which a.front = b.front + c.front does make sense, to automatically add the iteration paraphernalia.
>>>>
>> (..)
>>>> Andrei
>>>
>>> I kinda like this proposal. But I would rather declare the templates like below:
>>>
>>> T opInfix(string op)(T rhs) { ... }
>>> T opPrefix(string op)(T rhs) { ... }
>>> T opPostfix(string op)(T rhs) { ... }
>>>
>>> and allow users to define their own operators (though it doesn't have to be done now).
>>>
>>> I know that quite a few people here don't like letting users define their own operators, because it might obfuscate code. But it doesn't have to be like this. Someone here already mentioned that it is not a real problem in C++ programs. Good libraries don't abuse this functionality.
>>>
>>> User-defined operators would allow easy definition of Domain Specific Languages in D. I already wrote about this some time ago:
>>>
>>> http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=81026 
>>>
>>> http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=81352 
>>>
>>>
>>> BR
>>> Marcin Kuszczak
>>> (aarti_pl)
>>
>> Of course the signatures for opPrefix/opPostfix will be different:
>> T opPrefix(string op)() { ... }
>> T opPostfix(string op)() { ... }
>>
>> Sorry for mistake.
>>
>> BR
>> Marcin Kuszczak
>> (aarti_pl)
> 
> I think we'll solve postfix "++" without requiring the user to define it. Do you envision user-defined postfix operators?
> 
> Andrei

Well, maybe something like below:

auto a = 2²;  // 2 squared
auto a = 5!;  //factorial of 5
auto a = 2Ƴ + 3ɛ; //solving equations
auto weight = 5kg; //units of measurement

The point is that this covers the whole space of operators. In fact, even built-in operators could be defined using it.
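
The units example (5kg) could be sketched with string-keyed postfix operators like so; this is a hypothetical Python illustration (class name, unit factors, and the lowering of `5kg` to a dispatch call are all my assumptions), since Python has no user-defined postfix syntax:

```python
class Quantity:
    def __init__(self, value, unit="g"):
        self.value = value
        self.unit = unit

    def op_postfix(self, op):
        # "5kg" would lower to Quantity(5).op_postfix("kg"):
        # the suffix selects a conversion into the base unit (grams).
        factors = {"kg": 1000, "g": 1, "mg": 0.001}
        if op in factors:
            return Quantity(self.value * factors[op], "g")
        raise NotImplementedError(op)


weight = Quantity(5).op_postfix("kg")   # the hypothetical lowering of 5kg
```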

Postfix operator ++ can be defined by delegating to prefix operator ++, and this could be the default.

Best Regards
Marcin Kuszczak
(aarti_pl)
November 19, 2009
bearophile writes:
> aarti_pl:
> 
>> T opInfix(string op)(T rhs) { ... }
>> T opPrefix(string op)(T rhs) { ... }
>> T opPostfix(string op)(T rhs) { ... }
> 
> So you can use opInfix to define operators like a ~~ b :-)

Exactly. But the question is whether you *REALLY* need it :-) IMHO the answer is up to the designer.

> Now you just need a way to define the operator precedence level, with an int number in 0-14, and you're done (I don't think you need to specify whether operator associativity is left-to-right or right-to-left) :-)
> T opInfix(string op, int prec)(T rhs) { ... }
> There are programming languages that allow specifying operator precedence levels too, but I think this is overkill in D.

I agree. That's too much.

> Regarding operators, in D they are named according to their purpose and not according to their look, and I think this is a good idea. But opDollar doesn't follow that, so isn't a name like opEnd better?

I think the proposed name exactly reflects the meaning. So I would say it is perfectly consistent with the D convention.

> 
> Bye,
> bearophile