Memoize function in D (page 4)

July 29, 2005

Re: Memoize function in D

Posted by Burton Radons
in reply to Tony

Permalink

Burton Radons

Posted in reply to Tony

Permalink

Tony wrote:
> Hi.
> 
> I was wondering if someone could demonstrate how to write a memoize function
> in D?
> 
> A memoize routine should accept a function as an argument, and return a new
> function.  This new function should return the same values as the original
> function, but should also maintain a cache of results from previous calls
> and return the value(s) from the cache rather than recalculating them,
> whenever possible.
> 
> This can be done quite simply in Lisp, and an example is available in Paul
> Grahams "On Lisp" (freely available as a pdf) if anyone is interested.
> 
> If there is an elegant solution to this in D, I will be very impressed.

Taking any number of arguments isn't possible not because of a constraint of static typing, but because D doesn't have variadic templates; no language I know of does.

Here's a way in which it could be extended to include it (this would support keyword arguments, if the language did):

    // Specify that the template can take any number of argument types
    // that are stashed in the virtual tuple ArgumentTypes.
    class Memoize (ReturnType, ArgumentTypes...)
    {
        // Expand ArgumentTypes into the type so that it takes all the
        // specified parameters.

        alias ReturnType function (ArgumentTypes...) FunctionType;

        FunctionType call;

        // In this context it's treated as its tuple, which is a kind of
        // struct with overloaded opCmp, opEquals, and opHash, so that
        // it can be used in associative arrays like this.

        ReturnType [ArgumentTypes] cache;

        this (FunctionType vcall)
        {
            this.call = vcall;
        }

        // This function takes the argument list and stashes it in args.

        synchronized ReturnType opCall (ArgumentTypes args...)
        {
            ReturnType *pointer = args in cache;

            // The final context: when used in a function call, expand
            // a tuple into the arguments.

            if (pointer is null)
                return cache [args] = call (args...);
            return *pointer;
        }
    }

    Memoize! (int, int) memo = new Memoize! (int, int) (&fib);

This is still all statically-typed, but that's not a word with much meaning.  A statically-typed language is the same as a dynamically-typed language - it merely has constraints on what values can be assigned to a variable.  D limits constraints to a given type and its inheritors, so to make D dynamically-typed, you only need to specify a type which all other types inherit from or can be considered as, such as box or a void pointer or Object in some scenarios.  A statically-typed language like Cecil allows constraints based on arbitrary predicates, and doesn't have obstructing constructs like value types (which in D's case are simply reference types which initialise implicitly, duplicate on assignment, and are not considered inheritors of Object).

If you try to treat D - but not Cecil - as a dynamically-typed language, you'll get into troubles because you'll have to cast an object before you can send a given message to it.  But that is purely a difference in message dispatching; it doesn't have anything to do with static typing.

A purely dynamically-typed language would be functionally useless, so dynamically-typed languages actually aren't.  For example, this Python code is not dynamically typed:

    class foo:
        def bar (self):
            pass

    class baz:
        def bar (self):
            pass

    foo ().bar ()

It is actually functionally identical to this code using an extension:

    class foo:
        pass

    class baz:
        pass

    def bar (foo self):
        pass

    def bar (baz self):
        pass

    # Call the first version of bar based on the
    # type of its first argument.
    bar (foo ())

A purely dynamically-typed language doesn't have fields or methods; it's all functions at the global scope and the only way to tell two with the same name apart is by the number of arguments they have.  Cecil actually supports this style of programming, although at some depth the code must turn into pure strong static typing for it to be evaluatable (unless if it's empty).

If I sound like I'm talking pure gibberish, sorry.  I'm sure these issues are given high-falutin' names in computer science or maybe these same names but with different meanings, but I'm just a hobbyist so I wouldn't know.

In article <dcc8t7$hce$1@digitaldaemon.com>, Burton Radons says... > >Stefan wrote: >> >> "float opCmp(Box other)" >> >> Why float as return type, that's quite unusal, no? > >If the compiler sees "a <= b" and "a" is a struct that has an opCmp or a class, it converts that into "a.opCmp (b) <= 0"; it only uses opEquals for == and !=. So if you return negative for below, positive for above, zero for equivalent, and float.nan for unordered (nan is unordered with everything including itself), then all of the comparison operators will work correctly. > >This behaviour is undocumented, of course. :) And classes are resigned to using int opCmp because that's what's in Object. That should really be fixed. Ah, so it's float to be able to say "it's unordered" by returning float.nan Thanks, that cleared it up for me (though, I always lived under the impression that opCmp could/should only be implemented for types that have at least _some_ natural ordering :). Best regards, Stefan

"Tony" <talktotony@email.com> wrote: > I have not been seeking a totally general memoize function in D. > In fact, there are many instances where it does not make sense > to apply a memoize function. Namely when: > > 1. The function to be memoized involves less processing than the > cache lookup in a memoized version, or > 2. When a function is not referentially transparent. Then the set of memoizable functions in the sense above is nearly empty and in addition I suggest that this set is undecidable. Informal argument resulting out of condition 1: Every cache lookup needs at least Omega( 1) space and Omega( log (n)) time, where n is the number of arguments stored in the cache. Because the growth of log(n) is unbound memoizing is not suitable for functions whose execution time can be bound by an unknown but fix constant. On the other hand if the execution time of the function to be memoized cannot be bound by a constant the time complexity of the function to be memoized has to grow at least at the magnitude of Omega( log(n)) where n is the number of previous calls(!), regardless of the values of the parameters of the calls. If wihite-box-memoizing would be possible, then this would exclude for example the fibonacci-function from being memoizable, because if n-1 values are cached the n-th value can be computed in constant time. Further restriction resulting out of condition 2 for D: because the evaluation order for functions with more than one parameter is undefined in D, only functions with only one in- parameter are referentially transparent. > What I am saying is that for the subset of functions that can benefit from memoizing, it is desirable to be able to write a single memoize function rather than needing to write multiple versions for each different function signature. > > So far the consensus seems to be that the strong static typing in D precludes this. If this is a consensus it is wrong. Bens first approach uses an implementation with a class, which can be easily templated and extended to handle function-types as arguments. The fact that this template has to be instantiated for each function that has to be memoized and therefore consumes time and space is neglectable because this requirements can be bound by Big-O( l), where l is the length of the description of the type of the memoizee, i.e. no more time and space than needed to describe the memoizee itself. Even if you prefer absolute values for time and space the reqirements are still neglectable compared to the requirements imposed by caching. -manfred

Burton Radons wrote: > Here's a way in which it could be extended to include it (this would support keyword arguments, if the language did): > > // Specify that the template can take any number of argument types > // that are stashed in the virtual tuple ArgumentTypes. > class Memoize (ReturnType, ArgumentTypes...) > { I like it, but would prefer something like: # class Memoize (R function(A...)) { /*blah*/ } With A of type TypeInfo[] like the _arguments local given to variadic functions. The quirk then becomes how to handle out/inout arguments. -- Chris Sauls

Forums