January 10, 2009
Walter Bright wrote:
> Jason House wrote:
>> Walter Bright Wrote:
>>
>>> Jason House wrote:
>>>> When considering the standard library changes, I realized that object.d could change. I believe a pure opCmp or toString could
>>>> break user code with impure versions of those functions. Would
>>>> that kind of change to object.d cause any real problems for D2
>>>> users?
>>> As Andrei pointed out, the trouble with making the Object functions
>>> pure is if you want to do an override that caches its value.
>>
>> One could easily generalize that to mean non-final virtual functions
>> should never be pure. Memoization is a pretty basic need. I thought
>> that was in D's future. Is that not true?
> 
> Memoization and purity don't mix. For each function, you'll have to pick which way you want to go. I don't see what that has to do with D's future, though; the language allows either.

Memoization and purity can and should go together. We need to sit down and discuss that someday.

Andrei
January 10, 2009
Walter Bright Wrote:

> Michel Fortin wrote:
> > Hum, could the compiler be trusted to add the memoization code to pure functions so they can stay pure?
> 
> If the compiler does general memoization on pure functions, all it has
> to do is use the bits of the arguments passed on the stack to the
> function as an index into an associative array of the return values.
> 
> The problem is identifying if this would be faster than recomputing the
> return value.

What about a compiler hint similar to inline or register?
January 10, 2009
== Quote from Walter Bright (newshound1@digitalmars.com)'s article
> Michel Fortin wrote:
> > Hum, could the compiler be trusted to add the memoization code to pure functions so they can stay pure?
> If the compiler does general memoization on pure functions, all it has to do is use the bits of the arguments passed on the stack to the function as an index into an associative array of the return values. The problem is identifying if this would be faster than recomputing the return value.

Wouldn't this also cause threading issues?  The obvious solution would be to use TLS, but this requires duplicating the cache across threads.  Also, using AAs internally like this would lead to very deeply hidden memory allocations, and therefore possibly more frequent GC, etc.
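For illustration, here is roughly what the TLS approach might look like (a hypothetical sketch; the cache is assumed to be thread-local, and expensiveSquare is a made-up stand-in for a costly pure computation):

    int[int] _cache;  // assume one independent copy of this per thread (TLS)

    int expensiveSquare(int x)
    {
        if (auto p = x in _cache)  // 'in' yields a pointer to the cached value
            return *p;
        int r = x * x;  // stand-in for the real, costly computation
        _cache[x] = r;  // no lock needed: the cache is never shared
        return r;
    }

No synchronization is required, but every thread pays to rebuild its own copy of the cache, and each copy is a hidden allocation of its own.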
January 10, 2009
On Sat, 10 Jan 2009 07:14:07 +0300, dsimcha <dsimcha@yahoo.com> wrote:

> == Quote from Walter Bright (newshound1@digitalmars.com)'s article
>> Michel Fortin wrote:
>> > Hum, could the compiler be trusted to add the memoization code to pure
>> > functions so they can stay pure?
>> If the compiler does general memoization on pure functions, all it has
>> to do is use the bits of the arguments passed on the stack to the
>> function as an index into an associative array of the return values.
>> The problem is identifying if this would be faster than recomputing the
>> return value.
>
> Wouldn't this also cause threading issues?  The obvious solution would be to use
> TLS, but this requires duplicating the cache across threads.  Also, using AAs
> internally like this would lead to very deeply hidden memory allocations, and
> therefore possibly more frequent GC, etc.

A thread-safe (lock-free?) container would do the trick just fine.
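As a sketch of that alternative (hypothetical code; the cache is assumed to be shared by all threads, and memoizedAdd is a made-up example), even a plain lock would already work, though a lock-free table would scale better:

    import std.typecons;  // tuple, Tuple

    int[Tuple!(int, int)] _sharedCache;  // assume a single cache shared by all threads

    int memoizedAdd(int i, int j)
    {
        auto key = tuple(i, j);
        synchronized  // coarse global lock guarding both lookup and insertion
        {
            if (auto p = key in _sharedCache)
                return *p;
            int r = i + j;  // stand-in for the costly computation
            _sharedCache[key] = r;
            return r;
        }
    }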

January 10, 2009
Michel Fortin wrote:
> On 2009-01-09 20:45:57 -0500, Jason House <jason.james.house@gmail.com> said:
> 
>>> For each function, you'll have to pick
>>> which way you want to go. I don't see what that has to do with D's
>>> future, though, the language allows either.
>>
>> Somehow, I had the impression that D might use memoization as some kind of optimization once pure functions are embraced.  That was probably a misunderstanding on my part.
> 
> Hum, could the compiler be trusted to add the memoization code to pure functions so they can stay pure? Something like this:
> 
>     pure memoize int costlyCalculus(int i, int j)
>     {
>         return i + j;
>     }
> 
> being transformed into this by the compiler?
> 
>     int[Tuple!(int, int)] _costlyCalculus_cache;
>     pure int costlyCalculus(int i, int j)
>     {
>         {
>             int* _found_result = tuple(i, j) in _costlyCalculus_cache;
>             if (_found_result)
>                 return *_found_result;
>         }
>         {
>             int _calculated_result = i + j;
>             _costlyCalculus_cache[tuple(i, j)] = _calculated_result;
>             return _calculated_result;
>         }
>     }
> 
> Surely this way the compiler would be able to bypass purity checks for accessing the cache while still ensuring that the function has no side effects and always returns the same result for the same arguments. The downside being that a hash table may not always be the best memoization technique.
> 
> Perhaps then it belongs more in the standard library.
> 
>     pure int costlyCalculus(int i, int j);
>     Memoizer!(costlyCalculus) memoizedCostlyCalculus;
> 
> where Memoizer is a struct template defining a pure opCall and containing the cache it can access by casting away purity in a few places. But it makes it more difficult to add memoization while overriding a function.
> 

Yes!

Memoization is One Great Litmus Test of compile-time reflection. I implemented a couple of simple memoizers along the lines you mention, and I plan to make memoization examples part of TDPL.

The Even Greater Litmus Test would be to expose the memoizer itself as a pure function (which it essentially is).
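For concreteness, here is a minimal sketch along the lines Michel describes (hypothetical code, not the TDPL version):

    import std.traits;    // ReturnType, ParameterTypeTuple
    import std.typecons;  // Tuple, tuple

    struct Memoizer(alias fun)
    {
        private alias ReturnType!fun R;
        private alias ParameterTypeTuple!fun Args;
        private R[Tuple!Args] cache;  // maps argument tuples to results

        R opCall(Args args)
        {
            auto key = tuple(args);
            if (auto p = key in cache)
                return *p;  // cache hit
            auto result = fun(args);
            cache[key] = result;
            return result;
        }
    }

    // Usage, following Michel's example:
    //     pure int costlyCalculus(int i, int j) { return i + j; }
    //     Memoizer!costlyCalculus memoizedCostlyCalculus;
    //     auto x = memoizedCostlyCalculus(2, 3);

Note that opCall above is not marked pure: making it so requires casting away purity around the cache access, which is exactly the interesting part.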


Andrei
January 10, 2009
Walter Bright wrote:
> Michel Fortin wrote:
>> Hum, could the compiler be trusted to add the memoization code to pure functions so they can stay pure?
> 
> If the compiler does general memoization on pure functions, all it has to do is use the bits of the arguments passed on the stack to the function as an index into an associative array of the return values.

Not just the bits: the full objects, if they have indirection. (Pure functions should accept const parameters!)

And it's not only a simple associative array. It could also be a dense array, a little interpolation engine, etc.

> The problem is identifying if this would be faster than recomputing the return value.

I used memoizers for exp and log (the most called functions in some code I wrote) and it made the original version feel like it was driving in reverse.
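As a rough illustration of the dense-array flavor mentioned above (hypothetical code; the range, table size, and linear interpolation are arbitrary choices, and unlike a true memoizer this trades exactness for speed):

    import std.math;  // exp

    enum double lo = 0.0, hi = 1.0;
    enum size_t N = 1024;
    double[N] _expTable;

    static this()  // fill the table once at startup
    {
        foreach (i; 0 .. N)
            _expTable[i] = exp(lo + (hi - lo) * i / (N - 1));
    }

    double fastExp(double x)
    {
        if (x < lo || x > hi)
            return exp(x);  // outside the table: fall back to the real thing
        auto t = (x - lo) / (hi - lo) * (N - 1);
        auto i = cast(size_t) t;
        if (i >= N - 1)
            return _expTable[N - 1];
        auto f = t - i;  // fractional position between two table entries
        return _expTable[i] * (1 - f) + _expTable[i + 1] * f;  // linear interpolation
    }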


Andrei
January 10, 2009
Jason House wrote:
> Walter Bright Wrote:
> 
>> Michel Fortin wrote:
>>> Hum, could the compiler be trusted to add the memoization code to pure functions so they can stay pure?
>> If the compiler does general memoization on pure functions, all it has to do is use the bits of the arguments passed on the stack to the function as an index into an associative array of the return values.
>>  The problem is identifying if this would be faster than recomputing the return value.
> 
> What about a compiler hint similar to inline or register?

Maybe making a conscious effort to leave a language feature as a last-resort option would be better. Creating a memoization feature is a fish; creating a language that allows implementing memoization is fishing. :o/

Andrei
January 10, 2009
Andrei Alexandrescu Wrote:

> Jason House wrote:
> > Walter Bright Wrote:
> > 
> >> Michel Fortin wrote:
> >>> Hum, could the compiler be trusted to add the memoization code to pure functions so they can stay pure?
> >> If the compiler does general memoization on pure functions, all it has
> >> to do is use the bits of the arguments passed on the stack to the
> >> function as an index into an associative array of the return values.
> >> 
> >> The problem is identifying if this would be faster than recomputing the
> >> return value.
> > 
> > What about a compiler hint similar to inline or register?
> 
> Maybe making a conscious effort to leave a language feature as a last-resort option would be better. Creating a memoization feature is a fish; creating a language that allows implementing memoization is fishing. :o/
> 
> Andrei

I chose nearly dead keywords for a reason ;) If I got Walter to think about how to implement "pure" memoizers for an extra half hour, I'll consider myself highly successful. I must leave the fishing discussion up to the two of you in a coffee shop! (I'd love to sit in and participate in that discussion, but online forums make the conversation far too difficult.)
January 10, 2009
On 2009-01-10 00:10:11 -0500, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> said:

>> The problem is identifying if this would be faster than recomputing the return value.
> 
> I used memoizers for exp and log (the most called functions in some code I wrote) and it made the original version feel like it was driving in reverse.

That's only true because, in your specific context, those functions were called often with the same input. In a program that rarely reuses the same inputs, your memoizing functions will just waste time and memory.

So to determine whether a function is worth memoizing, you have to answer yes to two questions:

1.	Is the memoized lookup faster than recomputing the return value?
2.	Is this function called often with the same inputs?

While the compiler could take an educated guess at question 1 given the code of a function, question 2 can only be answered by knowing the usage pattern of the various functions.
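One way to get at question 2 is simply to measure it under a realistic workload (a hypothetical sketch; the function and counters are made up):

    import std.typecons;  // tuple, Tuple

    size_t _hits, _misses;
    int[Tuple!(int, int)] _cache;

    int instrumented(int i, int j)
    {
        auto key = tuple(i, j);
        if (auto p = key in _cache)
        {
            ++_hits;  // the inputs repeated: memoization paid off
            return *p;
        }
        ++_misses;
        int r = i + j;  // stand-in for the costly computation
        _cache[key] = r;
        return r;
    }

    // After a representative run, _hits / cast(double)(_hits + _misses)
    // is the hit rate; a low rate means the cache is pure overhead.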

-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/

January 10, 2009
On 2009-01-09 22:41:01 -0500, Walter Bright <newshound1@digitalmars.com> said:

> If the compiler does general memoization on pure functions, all it has to do is use the bits of the arguments passed on the stack to the function as an index into an associative array of the return values.

True, but only in the absence of pointers to mutable data (pointers to immutable data should be fine). (Hum, and beware of moving GCs, and ignore any memoization data which may be present inside the arguments... isn't determining equality a job for opEquals?)


> The problem is identifying if this would be faster than recomputing the return value.

That's why I was using a "memoize" attribute in the function declaration, so the programmer decides whether a function needs memoization or not. I know it's often a tough decision even for the designer of a function, but if adding memoization becomes easy, as a library designer you could simply not worry about it and let users create a memoization wrapper when necessary.


-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/