October 03, 2008
Andrei Alexandrescu wrote:
> Bruno Medeiros wrote:
>> In fact, I'm also not a fan of those complex property mechanisms, à la C#. I think a fair candidate would be Bill Baxter's proposal, the 'property' keyword:
>>
>>   property int foo() { return _foo; };
>>   property void foo(int foo) { _foo = foo; };
>>
>> The property keyword would make a function callable *only* with property syntax (either for reading, 'bar = foo;', or for writing, 'foo = 42;'). A function signature that was not suitable for property access would be a compile-time error.
> 
> Then what would obj.fun mean when fun is not a property?
> 
> Andrei

If I understand the proposal correctly, it would be a compile-time error.

--benji
October 03, 2008
On Fri, Oct 3, 2008 at 11:43 PM, Sergey Gromov <snake.scaly@gmail.com> wrote:
> Fri, 03 Oct 2008 20:59:54 +0800,
> KennyTM~ wrote:
>> Sergey Gromov wrote:
>> > What's wrong with a.opIndexAssign(b, a.opIndex(b) + c)?
>>
>> Probably performance.
>>
>> Consider seeking to the end of a 100M-node singly linked list and increasing its content by 1.
>>
>> But I agree that if something like .opIndexAddAssign() is not defined,
>> the compiler should fall back to use a.opIndexAssign(b, a.opIndex(b)+c).
>>
>> (The same idea can be extended to properties and .opSlice() )
>
> No, if you want performance in this particular case you define
>
> ref int opIndex()
>
> because I think whenever the compiler encounters a[x]++ it should first test whether a[x] is an lvalue and, if yes, use it accordingly.  Only if it is not should it fall back to .opIndexAddButProbablyNotAssign() special overloads.

Yeh, but then you lose encapsulation.  That's fine for some cases, like a vector, but in other cases it can be very handy to be able to get a notification any time one of your values changes.  You lose that ability once you start handing out pointers to whoever asks for one.
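
A minimal sketch of that trade-off, assuming two hypothetical containers (`Vec` and `Watched` are illustrative names, not from any library). With a ref-returning opIndex the element is updated in place, but the container never sees the write; with a value-returning opIndex plus opIndexAssign, every write passes through a hook at the cost of a second lookup.

```d
// Hypothetical container: opIndex returns a reference, so a[x]++ and
// a[x] += c operate directly on the stored element.
struct Vec
{
    int[] data;

    ref int opIndex(size_t i)
    {
        return data[i];
    }
}

// Hypothetical container that wants to observe every change: it only
// hands out copies, and counts writes in opIndexAssign.
struct Watched
{
    int[] data;
    size_t writes;

    int opIndex(size_t i) const
    {
        return data[i];
    }

    void opIndexAssign(int value, size_t i)
    {
        data[i] = value;
        ++writes; // the "notification" hook
    }
}

void main()
{
    auto v = Vec([1, 2, 3]);
    v[1]++;      // lvalue path: one lookup, but no hook can run
    v[2] += 10;
    assert(v.data == [1, 3, 13]);

    auto w = Watched([1, 2, 3]);
    w[1] = w[1] + 1; // read, then write through opIndexAssign
    assert(w[1] == 3);
    assert(w.writes == 1);
}
```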

--bb
October 03, 2008
On Fri, Oct 3, 2008 at 11:02 PM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> Sergey Gromov wrote:
>>
>> Thu, 02 Oct 2008 15:03:42 -0500,
>> Andrei Alexandrescu wrote:
>>>
>>> Yah, overloaded ops are due for an overhaul. I'm almost afraid to ask... any ideas? :o)
>>>
>>> One goal is to fix opIndexAssign and make it work similar to the way it works in arrays, e.g. a[b] += c. Indexing into hash tables is a good test bed.
>>
>> What's wrong with a.opIndexAssign(b, a.opIndex(b) + c)?
>
> One problem is that for a hashtable that does not have b yet, opIndex will throw an exception.

I don't see why you expect   a[b] += c to work on a key for which a[b] is undefined.  If it's undefined how can you increment it?

Unless you've defined your accessor for undefined keys to return some other value.  And if it returns some other value then Sergey's rule is fine.

Sparse matrices are a good example of a hash-like data-structure that
should return a default value (namely zero) for unset keys.  For such
a sparse matrix      a.opIndexAssign(b, a.opIndex(b) + c)   will work
fine.


> Another problem (assuming the above is fixed) is that b will be looked up
> twice in the hash.

That is a problem.  But like Sergey said, if performance is your #1 concern then return references.

Though, that solution is a little problematic for the sparse matrices.
 In order to return a reference to an element that didn't previously
exist, you must create it.  But if you're just scanning through your
sparse matrix printing out the values with a[b], you don't expect to
end up with a dense matrix full of zeros!  In a C++ lib you'd probably
provide an lvalue-returning operator[] and another function that
returns just an rvalue.

Ooh, how about passing in the manipulator @= function somehow?  Either delegate or template alias param.

Example for sparse matrix case:
// Called for  a[b] @= c type operations
void opIndexUpdate(void delegate(ref float val) updater, uint idx) {
      float *ptr = getPtrToElement(idx);  // creates new zero element if needed
      updater(*ptr);
      // here you can veto the change
      // or throw an overflow exception
      // or clamp *ptr
      // or round it
      //  or whatever you want...
}

Compiler would generate the different updaters needed for the various @= operations.

Maybe with a template alias param there'd be better hope of the compiler being able to inline it all, but then you have the no-inheritance problem.

void opIndexUpdate(alias updater)(uint idx) {
      float *ptr = getPtrToElement(idx);  // creates new zero element if needed
      updater(*ptr);
}
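
A runnable sketch of the delegate variant, under stated assumptions: `SparseVec` is a hypothetical type backed by a built-in associative array, the lookup is inlined instead of calling `getPtrToElement`, and since no compiler generates the call, `opIndexUpdate` is invoked by hand where `a[3] += c` would be written. The hook also drops elements that end up at zero, which is one possible use of the "whatever you want" step.

```d
// Hypothetical sparse vector; absent indices read as zero.
struct SparseVec
{
    float[uint] data;

    // rvalue read: never creates an element
    float opIndex(uint idx) const
    {
        if (auto p = idx in data)
            return *p;
        return 0;
    }

    // Stand-in for the proposed hook for  a[idx] @= c
    void opIndexUpdate(scope void delegate(ref float) updater, uint idx)
    {
        float* ptr = idx in data;
        if (ptr is null)
        {
            data[idx] = 0;      // create a zero element on demand
            ptr = idx in data;
        }
        updater(*ptr);
        if (*ptr == 0)
            data.remove(idx);   // keep the container sparse
    }
}

void main()
{
    SparseVec a;
    float c = 2.5f;

    // what the compiler would generate for  a[3] += c
    a.opIndexUpdate((ref float v) { v += c; }, 3);
    assert(a[3] == 2.5f);
    assert(a[7] == 0);            // reading does not create elements
    assert(a.data.length == 1);

    // a[3] -= c brings the element back to zero, so it is removed
    a.opIndexUpdate((ref float v) { v -= c; }, 3);
    assert(a.data.length == 0);
}
```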

--bb
October 03, 2008
Bill Baxter wrote:
> On Fri, Oct 3, 2008 at 11:02 PM, Andrei Alexandrescu
> <SeeWebsiteForEmail@erdani.org> wrote:
>> Sergey Gromov wrote:
>>> Thu, 02 Oct 2008 15:03:42 -0500,
>>> Andrei Alexandrescu wrote:
>>>> Yah, overloaded ops are due for an overhaul. I'm almost afraid to ask...
>>>> any ideas? :o)
>>>>
>>>> One goal is to fix opIndexAssign and make it work similar to the way it
>>>> works in arrays, e.g. a[b] += c. Indexing into hash tables is a good test
>>>> bed.
>>> What's wrong with a.opIndexAssign(b, a.opIndex(b) + c)?
>> One problem is that for a hashtable that does not have b yet, opIndex will
>> throw an exception.
> 
> I don't see why you expect   a[b] += c to work on a key for which a[b]
> is undefined.  If it's undefined how can you increment it?

Because accessing it as an rvalue is different than accessing it as an lvalue. You just wrote a very sensible post in the same vein!

> Unless you've defined your accessor for undefined keys to return some
> other value.  And if it returns some other value then Sergey's rule is
> fine.

Yah, in that case things would work. I wouldn't dislike it, but I think Walter won't want to make that change.

> Sparse matrices are a good example of a hash-like data-structure that
> should return a default value (namely zero) for unset keys.  For such
> a sparse matrix      a.opIndexAssign(b, a.opIndex(b) + c)   will work
> fine.

Glad you brought sparse arrays up. Yah, it would work if a[b] would return a default value instead of throwing.

>> Another problem (assuming the above is fixed) is that b will be looked up
>> twice in the hash.
> 
> That is a problem.  But like Sergey said, if performance is your #1
> concern then return references.

It's one of the concerns. And there are several solutions that make everything work, including performance.

> Though, that solution is a little problematic for the sparse matrices.
>  In order to return a reference to an element that didn't previously
> exist, you must create it.  But if you're just scanning through your
> sparse matrix printing out the values with a[b], you don't expect to
> end up with a dense matrix full of zeros!  In a C++ lib you'd probably
> provide an lvalue-returning operator[] and another function that
> returns just an rvalue.

Exactly.

> Ooh, how about passing in the manipulator @= function somehow?  Either
> delegate or template alias param.
> 
> Example for sparse matrix case:
> // Called for  a[b] @= c type operations
> void opIndexUpdate(void delegate(ref float val) updater, uint idx) {
>       float *ptr = getPtrToElement(idx);  // creates new zero element if needed
>       updater(*ptr);
>       // here you can veto the change
>       // or throw an overflow exception
>       // or clamp *ptr
>       // or round it
>       //  or whatever you want...
> }
> 
> Compiler would generate the different updaters needed for the various
> @= operations.
> 
> Maybe with a template alias param there'd be better hope of the
> compiler being able to inline it all, but then you have the no
> inheritance problem.
> 
> void opIndexUpdate(alias updater)(uint idx) {
>       float *ptr = getPtrToElement(idx);  // creates new zero element if needed
>       updater(*ptr);
> }

I am very glad you brought up this idea. I think it can work. I brought it up to Walter and Bartosz (in the alias form), and the status was to think about it some more. For whatever reason suggestions are viewed with increased negativity when coming from me around here, so I often wish someone else comes up with it, gets support, Walter implements it, and we all get it over with. (That's what I hoped for the property thing, but that didn't go through.) In general, it's much easier to attack an imperfect proposal than to help make it better, and it's all the more enticing when it comes from a perceived authority. (Of course the next logical step is the question "then why didn't you help my proposal?" :o))


Andrei
October 03, 2008
"Andrei Alexandrescu" wrote
> Sergey Gromov wrote:
>> Fri, 3 Oct 2008 09:37:46 +0900,
>> Bill Baxter wrote:
>>> On Fri, Oct 3, 2008 at 9:32 AM, Bill Baxter <wbaxter@gmail.com> wrote:
>>>> On Fri, Oct 3, 2008 at 8:04 AM, Sergey Gromov <snake.scaly@gmail.com> wrote:
>>>>> Thu, 02 Oct 2008 15:03:42 -0500,
>>>>> Andrei Alexandrescu wrote:
>>>>>> Yah, overloaded ops are due for an overhaul. I'm almost afraid to
>>>>>> ask...
>>>>>> any ideas? :o)
>>>>>>
>>>>>> One goal is to fix opIndexAssign and make it work similar to the way
>>>>>> it
>>>>>> works in arrays, e.g. a[b] += c. Indexing into hash tables is a good
>>>>>> test bed.
>>>>> What's wrong with a.opIndexAssign(b, a.opIndex(b) + c)?
>>>>>
>>>> Indeed.  I thought there wasn't a lot of debate needed on this, just action.
>>> .
>>> ... except these extras do have the same issue that plain property
>>> assignment does.  They would open up a new class of things that are
>>> valid code but don't behave as expected.  writefln += 5.
>>>
>>> And also, a[b] += c should probably be rewritten as  "a.opIndex(b) += c"  *if* a returns a reference of some sort.  Ok, so maybe there is a little to talk about.  :-)
>>
>> I think that any expression "a @= b" should first check whether the expression on the left is an lvalue.  If yes then use it as any other lvalue, otherwise try to re-write an expression using op@Assign.  This works in many scenarios, like opIndex returning a reference.
>
> That's a good rule, but a[b] @= c on a hash is not helped by it.

I think Sergey didn't write his rules correctly.  If a is an lvalue, use it as any other lvalue, which *includes* trying op@Assign.

If a is not an lvalue, then rewrite the expression as a = a @ b, without expanding operators, then use normal operator expansion to compile the statement.

If neither works, then it is a syntax error.

This should be valid for all cases.  For example a hash that returns a reference for opIndex:

a[b] += c;

a.opIndex(b) is an lvalue, so try: a.opIndex(b).opAddAssign(c).  If that doesn't work, then syntax error.

case 2, a[b] is an rvalue:

rewrite as a[b] = a[b] + c;

expand operators:

a.opIndexAssign(a[b] + c, b);
a.opIndexAssign(a.opIndex(b) + c, b);

An example of a failure:

int foo(int x) {...}

foo(5) += 6;

left side is an rvalue, so try rewriting:

foo(5) = foo(5) + 6;

syntax error, foo(5) is still an rvalue

Any cases that break with these rules?

-Steve


October 03, 2008
Sat, 4 Oct 2008 04:34:46 +0900,
Bill Baxter wrote:
> On Fri, Oct 3, 2008 at 11:43 PM, Sergey Gromov <snake.scaly@gmail.com> wrote:
> > Fri, 03 Oct 2008 20:59:54 +0800,
> > KennyTM~ wrote:
> >> Sergey Gromov wrote:
> >> > What's wrong with a.opIndexAssign(b, a.opIndex(b) + c)?
> >>
> >> Probably performance.
> >>
> >> Consider seeking to the end of a 100M-node singly linked list and increasing its content by 1.
> >>
> >> But I agree that if something like .opIndexAddAssign() is not defined,
> >> the compiler should fall back to use a.opIndexAssign(b, a.opIndex(b)+c).
> >>
> >> (The same idea can be extended to properties and .opSlice() )
> >
> > No, if you want performance in this particular case you define
> >
> > ref int opIndex()
> >
> > because I think whenever the compiler encounters a[x]++ it should first test whether a[x] is an lvalue and, if yes, use it accordingly.  Only if it is not should it fall back to .opIndexAddButProbablyNotAssign() special overloads.
> 
> Yeh, but then you lose encapsulation.  That's fine for some cases, like a vector, but in other cases it can be very handy to be able to get a notification any time one of your values changes.  You lose that ability once you start handing out pointers to whoever asks for one.

Your two last posts are very close to what I have on my mind myself, and I agree with your concerns.  The 'notification' thing is exactly what's missing.

My proposal is quite flexible overall in that opIndex or opIndexLvalue can return a struct implementing each and every op for the underlying type.  But it's tedious, bug-prone and not forward-compatible.

What is needed is an ability to easily wrap one type with another, exposing the underlying type's functionality as much as possible, but still receiving notifications whenever the underlying object is updated.

Now that I write this it doesn't look as smooth as I thought.  But anyway, here's how it works:

struct TypeWrapper(T)
{
  private T contents;
  ref T opRef() { return contents; }
  void postModify() {...}
}

Here opIndex of your sparse matrix returns this wrapper.  The wrapper knows whether it wraps a real element or a zero.  It can detect that a real element became zero and remove it from matrix.

The opRef works exactly like opCast right now in that it exposes exactly one underlying type.  But I hope we'll get a polymorphic opCast at some point so I'm using a different name for this.  The postModify is a weird callback which is called after the value received via opRef is modified.

Your delegate idea is great in that all the op@Assign can be replaced with opModify(T delegate(T) modify), giving you the ultimate notification.  Even more than that, the minimal type wrapper could look like this:

struct TypeWrapper(T)
{
  private T value;
  T opCast()
  {
    return value;
  }
  T opAssign(T v)
  {
    value = v;
    return value;
  }
  T opModify(T delegate(T v) mod)
  {
    value = mod(value);
    return value;
  }
  T opProduct(T delegate(T v) prod)
  {
    return prod(value);
  }
}

where opCast is for rvalue usage, opAssign is for pure '=', opModify is for '@=' and opProduct is for any regular unary or binary operators.
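
A small sketch of how the opModify part might be exercised, assuming a hypothetical `Notifying` wrapper with a `modifications` counter as the notification and a `get` accessor in place of opCast. No compiler performs this rewrite, so the calls that `n += c` and `n *= k` would turn into are written out by hand.

```d
// Hypothetical wrapper: every '@=' style update goes through opModify.
struct Notifying(T)
{
    private T value;
    size_t modifications;

    // rvalue access
    T get() const
    {
        return value;
    }

    T opModify(scope T delegate(T) mod)
    {
        value = mod(value);
        ++modifications; // notification hook
        return value;
    }
}

void main()
{
    Notifying!int n;
    int c = 5;
    int k = 3;

    n.opModify((int v) => v + c); // stands in for  n += c
    n.opModify((int v) => v * k); // stands in for  n *= k

    assert(n.get == 15);
    assert(n.modifications == 2);
}
```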
October 03, 2008
"Steven Schveighoffer" wrote
> "Andrei Alexandrescu" wrote
>> Sergey Gromov wrote:
>>> Fri, 3 Oct 2008 09:37:46 +0900,
>>> Bill Baxter wrote:
>>>> On Fri, Oct 3, 2008 at 9:32 AM, Bill Baxter <wbaxter@gmail.com> wrote:
>>>>> On Fri, Oct 3, 2008 at 8:04 AM, Sergey Gromov <snake.scaly@gmail.com> wrote:
>>>>>> Thu, 02 Oct 2008 15:03:42 -0500,
>>>>>> Andrei Alexandrescu wrote:
>>>>>>> Yah, overloaded ops are due for an overhaul. I'm almost afraid to
>>>>>>> ask...
>>>>>>> any ideas? :o)
>>>>>>>
>>>>>>> One goal is to fix opIndexAssign and make it work similar to the way
>>>>>>> it
>>>>>>> works in arrays, e.g. a[b] += c. Indexing into hash tables is a good
>>>>>>> test bed.
>>>>>> What's wrong with a.opIndexAssign(b, a.opIndex(b) + c)?
>>>>>>
>>>>> Indeed.  I thought there wasn't a lot of debate needed on this, just action.
>>>> .
>>>> ... except these extras do have the same issue that plain property
>>>> assignment does.  They would open up a new class of things that are
>>>> valid code but don't behave as expected.  writefln += 5.
>>>>
>>>> And also, a[b] += c should probably be rewritten as  "a.opIndex(b) += c"  *if* a returns a reference of some sort.  Ok, so maybe there is a little to talk about.  :-)
>>>
>>> I think that any expression "a @= b" should first check whether the expression on the left is an lvalue.  If yes then use it as any other lvalue, otherwise try to re-write an expression using op@Assign.  This works in many scenarios, like opIndex returning a reference.
>>
>> That's a good rule, but a[b] @= c on a hash is not helped by it.
>
> I think Sergey didn't write his rules correctly.  If a is an lvalue, use it as any other lvalue, which *includes* trying op@Assign.
>
> If a is not an lvalue, then rewrite the expression as a = a @ b, without expanding operators, then use normal operator expansion to compile the statement.
>
> If neither works, then it is a syntax error.
>
> This should be valid for all cases.  For example a hash that returns a reference for opIndex:
>
> a[b] += c;
>
> a.opIndex(b) is an lvalue, so try: a.opIndex(b).opAddAssign(c).  If that doesn't work, then syntax error.
>
> case 2, a[b] is an rvalue:
>
> rewrite as a[b] = a[b] + c;
>
> expand operators:
>
> a.opIndexAssign(a[b] + c, b);
> a.opIndexAssign(a.opIndex(b) + c, b);
>
> An example of a failure:
>
> int foo(int x) {...}
>
> foo(5) += 6;
>
> left side is an rvalue, so try rewriting:
>
> foo(5) = foo(5) + 6;
>
> syntax error, foo(5) is still an rvalue
>
> Any cases that break with these rules?

After thinking about this some more, I don't even think the rvalue/lvalue check is necessary.  Just try:

a.op@Assign(b);

If that doesn't compile under the current rules, try rewriting as:

a = a @ b;

If that doesn't compile under the current rules, then it's a syntax error.
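
The two-step rule can be emulated in library code as a rough sketch. Everything here is an assumption for illustration: `addAssign` is a hypothetical helper standing in for the compiler's lowering of `a += b`, and `opAddAssign` / `opAdd` are called as plain named methods. If neither step compiles, instantiating the helper fails, which corresponds to the error case.

```d
// Hypothetical helper emulating the proposed lowering of  a += b.
void addAssign(A, B)(ref A a, B b)
{
    static if (__traits(compiles, a.opAddAssign(b)))
        a.opAddAssign(b);   // step 1: dedicated in-place operator
    else
        a = a.opAdd(b);     // step 2: fall back to  a = a + b
}

// Provides only the in-place operator.
struct InPlace
{
    int value;

    void opAddAssign(int rhs)
    {
        value += rhs;
    }
}

// Provides only the value-returning operator.
struct ValueOnly
{
    int value;

    ValueOnly opAdd(int rhs) const
    {
        return ValueOnly(value + rhs);
    }
}

void main()
{
    auto x = InPlace(1);
    addAssign(x, 2);        // takes step 1
    assert(x.value == 3);

    auto y = ValueOnly(1);
    addAssign(y, 2);        // takes step 2
    assert(y.value == 3);
}
```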

-Steve


October 03, 2008
KennyTM~ wrote:
> Sergey Gromov wrote:
>> Thu, 02 Oct 2008 15:03:42 -0500,
>> Andrei Alexandrescu wrote:
>>> Yah, overloaded ops are due for an overhaul. I'm almost afraid to ask... any ideas? :o)
>>>
>>> One goal is to fix opIndexAssign and make it work similar to the way it works in arrays, e.g. a[b] += c. Indexing into hash tables is a good test bed.
>>
>> What's wrong with a.opIndexAssign(b, a.opIndex(b) + c)?
> 
> Probably performance.
> 
> Consider seeking to the end of a 100M-node singly linked list and increasing its content by 1.
> 
> But I agree that if something like .opIndexAddAssign() is not defined, the compiler should fall back to use a.opIndexAssign(b, a.opIndex(b)+c).
> 
> (The same idea can be extended to properties and .opSlice() )

But if the requirement is that it should have the same *meaning* as a.opIndexAssign(b, a.opIndex(b) + c), then the actual implementation for performance can be considered a compiler optimization feature.

Just because one piece of code means the same thing as another doesn't mean it has to be implemented the same way, or can't be optimized.
October 04, 2008
Charles Hixson wrote:
> KennyTM~ wrote:
>> Sergey Gromov wrote:
>>> Thu, 02 Oct 2008 15:03:42 -0500,
>>> Andrei Alexandrescu wrote:
>>>> Yah, overloaded ops are due for an overhaul. I'm almost afraid to ask... any ideas? :o)
>>>>
>>>> One goal is to fix opIndexAssign and make it work similar to the way it works in arrays, e.g. a[b] += c. Indexing into hash tables is a good test bed.
>>>
>>> What's wrong with a.opIndexAssign(b, a.opIndex(b) + c)?
>>
>> Probably performance.
>>
>> Consider seeking to the end of a 100M-node singly linked list and increasing its content by 1.
>>
>> But I agree that if something like .opIndexAddAssign() is not defined, the compiler should fall back to use a.opIndexAssign(b, a.opIndex(b)+c).
>>
>> (The same idea can be extended to properties and .opSlice() )
> But if the requirement is that it should have the same *meaning* as a.opIndexAssign(b, a.opIndex(b) + c), then the actual implementation for performance can be considered a compiler optimization feature.
> 
> Just because one piece of code means the same thing as another doesn't mean it has to be implemented the same way, or can't be optimized.

That's a good point. Compiler magic can make sure stuff like that gets optimized, but user-defined objects cannot benefit from such an equivalence.

Andrei
October 04, 2008
Bill Baxter wrote:
> On Fri, Oct 3, 2008 at 11:02 PM, Andrei Alexandrescu
> <SeeWebsiteForEmail@erdani.org> wrote:
>> Sergey Gromov wrote:
>>> Thu, 02 Oct 2008 15:03:42 -0500,
>>> Andrei Alexandrescu wrote:
>>>> Yah, overloaded ops are due for an overhaul. I'm almost afraid to ask...
>>>> any ideas? :o)
>>>>
>>>> One goal is to fix opIndexAssign and make it work similar to the way it
>>>> works in arrays, e.g. a[b] += c. Indexing into hash tables is a good test
>>>> bed.
>>> What's wrong with a.opIndexAssign(b, a.opIndex(b) + c)?
>> One problem is that for a hashtable that does not have b yet, opIndex will
>> throw an exception.
> 
> I don't see why you expect   a[b] += c to work on a key for which a[b]
> is undefined.  If it's undefined how can you increment it?

Actually I did use it once to count things, so I could just use a[b]++ instead of the clumsy if(b in a){a[b]++;}else{a[b]=1;}.
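
For reference, a minimal sketch of that counting idiom with a built-in associative array, assuming a current D compiler; `countWords` is a hypothetical helper name. Incrementing a missing key inserts it starting from the value type's initial value (0 here), so no explicit `in` check is needed.

```d
// Hypothetical helper: counts how often each word occurs.
size_t[string] countWords(const string[] words)
{
    size_t[string] counts;
    foreach (w; words)
        ++counts[w];   // a missing key starts from size_t.init (0)
    return counts;
}

void main()
{
    auto counts = countWords(["a", "b", "a"]);
    assert(counts["a"] == 2);
    assert(counts["b"] == 1);
    assert("c" !in counts);
}
```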