Thread overview | |||||||||
---|---|---|---|---|---|---|---|---|---|
|
March 09, 2018 Lazy caching map()? | ||||
---|---|---|---|---|
| ||||
Today I found myself needing a lazy, caching version of map() on an array. More precisely, given an array `T[] src` of source data and a function func(T) that's pretty expensive to compute, return an object `result` such that:
- result[i] == func(src[i]), for 0 ≤ i < src.length.
- If result[j] is never actually used, func(src[j]) is never invoked
(lazy).
- If result[j] is referenced multiple times, a cached value is returned
after the first time, i.e., the result of func(src[j]) is cached, and
func(src[j]) is never invoked more than once.
I couldn't figure out how to build this using Phobos primitives, so I wrote my own implementation of it. Pretty simple, really, it's just a wrapper struct that lazily initializes an array of Nullable!T and lazily populates it with func(j) when opIndex(j) is invoked, and just returns the cached value if opIndex(j) has been invoked before.
Can this be done using current Phobos primitives?
T
--
Don't throw out the baby with the bathwater. Use your hands...
|
March 10, 2018 Re: Lazy caching map()? | ||||
---|---|---|---|---|
| ||||
Posted in reply to H. S. Teoh | On Friday, 9 March 2018 at 19:41:46 UTC, H. S. Teoh wrote:
> Today I found myself needing a lazy, caching version of map() on an array. More precisely, given an array `T[] src` of source data and a function func(T) that's pretty expensive to compute, return an object `result` such that:
>
> - result[i] == func(src[i]), for 0 ≤ i < src.length.
>
> - If result[j] is never actually used, func(src[j]) is never invoked
> (lazy).
>
> - If result[j] is referenced multiple times, a cached value is returned
> after the first time, i.e., the result of func(src[j]) is cached, and
> func(src[j]) is never invoked more than once.
>
> I couldn't figure out how to build this using Phobos primitives, so I wrote my own implementation of it. Pretty simple, really, it's just a wrapper struct that lazily initializes an array of Nullable!T and lazily populates it with func(j) when opIndex(j) is invoked, and just returns the cached value if opIndex(j) has been invoked before.
>
> Can this be done using current Phobos primitives?
>
>
> T
Unless I'm misunderstanding your requirements, are you looking for std.functionl.memoize?
auto fun(int a) {
writeln("here");
return a + 1;
}
void main() {
auto a = [1, 2, 3];
auto b = a.map!(memoize!fun);
writeln(b[1]);
writeln(b[1]);
writeln(b[1]);
}
will print "here" just once.
|
March 19, 2018 Re: Lazy caching map | Mir version | ||||
---|---|---|---|---|
| ||||
Posted in reply to H. S. Teoh | On Friday, 9 March 2018 at 19:41:46 UTC, H. S. Teoh wrote: > Today I found myself needing a lazy, caching version of map() on an array. More precisely, given an array `T[] src` of source data and a function func(T) that's pretty expensive to compute, return an object `result` such that: > > - result[i] == func(src[i]), for 0 ≤ i < src.length. > > - If result[j] is never actually used, func(src[j]) is never invoked > (lazy). > > - If result[j] is referenced multiple times, a cached value is returned > after the first time, i.e., the result of func(src[j]) is cached, and > func(src[j]) is never invoked more than once. > > Can this be done using current Phobos primitives? > > > T Phobos implementation based on memoize would use hash map internally, which is slow for this use case. mir-algorithm v0.9.5 (DUB would be updated soon) got `cached` and `cachedGC` topologies. Example: // Mir's map is required import mir.ndslice: cachedGC, map, sliced; auto ar = new int[5]; // init ar ... auto cachedLazyMap = ar.map!fun.cachedGC; `cachedGC` [1] internally allocates a vector of caches and a packed bitarray. `cached` [2] allows to reuse already allocated buffers. The result support all random access range primitive, slicing and etc. ndslice architecture allows to add new random access topologies very easily. [1] http://docs.algorithm.dlang.io/latest/mir_ndslice_topology.html#.cachedGC [2] http://docs.algorithm.dlang.io/latest/mir_ndslice_topology.html#.cached Best regards, Ilya Yaroshenko |
March 19, 2018 Re: Lazy caching map()? | ||||
---|---|---|---|---|
| ||||
Posted in reply to H. S. Teoh | On Friday, 9 March 2018 at 19:41:46 UTC, H. S. Teoh wrote:
> Today I found myself needing a lazy, caching version of map() on an array. More precisely, given an array `T[] src` of source data and a function func(T) that's pretty expensive to compute, return an object `result` such that:
>
> [...]
Map the sources to std.parallelism.Task instances, and yieldForce() the instances when you need the results?
Graham
|
March 19, 2018 Re: Lazy caching map()? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Graham Fawcett | On Monday, 19 March 2018 at 16:47:16 UTC, Graham Fawcett wrote:
> On Friday, 9 March 2018 at 19:41:46 UTC, H. S. Teoh wrote:
>> Today I found myself needing a lazy, caching version of map() on an array. More precisely, given an array `T[] src` of source data and a function func(T) that's pretty expensive to compute, return an object `result` such that:
>>
>> [...]
>
> Map the sources to std.parallelism.Task instances, and yieldForce() the instances when you need the results?
On second thought, that won't work. AFIAK there's no way to take an unscheduled task and force it to execute on the current thread immediately and yield the result. But with such an operation, you could have a thread-local future/promise type that would meet your needs.
Graham
|
Copyright © 1999-2021 by the D Language Foundation