Ranges to deal with corner cases and "random access"

Oct 05, 2019

Brett

Oct 06, 2019

9il

Oct 06, 2019

Brett

Oct 08, 2019

Paul Backus

Oct 08, 2019

Jonathan M Davis

October 05, 2019

Ranges to deal with corner cases and "random access"

Posted by Brett

Permalink

Brett

Permalink

Typically a lot of algorithms have corner cases such as referencing elements that end up out of bounds at the start or end (k-c or k+c).

Is there any way to handle such things easily without having to worry about the corners?

E.g., have it return a default value such as 0, or repeat the values or periodically extend it.


The idea is not to have to code for the corners like

if (k - 1 < 0) { return 0 }
return data[k-1]/(k+1);



Many algorithms need to reference other parts of the data by index and which that index may be out of bounds for some corner cases. The idea I'm wanting is to be able to unify the corner cases and a very simple way so they can be ignored and ranges can be used simply and properly to avoid these corners. This simplifies coding and having extra corner case algorithms that only deal with the corners yet are functionally identical.



The idea with history:

https://forum.dlang.org/thread/yfppxywqonzsdjpsjbta@forum.dlang.org

But the issue is that it retains the history which is wasteful.


I see no easy way to access other data elements in a unified way.

With ranges one has to recompute the values but for certain inputs such as already computed arrays this is not necessary. Having an optimal way to memorize exactly what is needed(no excess) and to 0, constant, or continuously extend data for corner cases would make using ranges, in many mathematical algorithms, very useful...

else in many of these cases they offer nothing over the long winded coding. Ideally the the range semantics can make it very efficient such as:

1. If no non-local access is used(other values) then no memorization takes place.  Else it memorizes only up to what is used with the option for recomputation or storage. If the input is an array then the values are already computed so it is not necessary to do anything.

2. The ability to extend data outside the "accessible" range in a common format. Generally there are 4 methods:
   Any values accessed outside the "range" are set
   a) Zero.
   b) The end point value(constant extend) = continuously extended
   c) By extrapolating from the "inside" values(linear, quadratic, etc) = differentially extended
   d) Periodic/wrapping. Uses the other side of the boundary.

   but a 5th is common which is specifically setting some values.

   For optimization usually an algorithm is split in two multiple parts that deal with the boundaries separately since it conditionals which are not needed inside the data set.

3. Ability to access already computed values with the same solutions above!

These generally will cover 99% of algorithms and unify them.


I imagine D has the ability to do this but I'm not sure how to make it all work with ranges.

If one can carry over compile time information then maybe the "history" can be references in the common range algorithms by extending, e.g:

someRange.zeroExtend.map!((a,k,r){ return a + a[k-1]*r*r[k-1]; })

Here a is a special element, it is wrapped to handle out of bounds access given an opIndex to return them(could even be used to assign). r is previously computed values. If [j] is out of bounds or not computed yet then it returns 0(rather than crashing).

Only 2 to 3 value needs to be "memorized", r and the previous(and they are overwritten each iteration). The a's do not need to be memorized if the source is an array, else we would need to memorize 1 value.


For example..

iota(-10,10).zeroExtend.setValues(0=>1,1=>1).map!((a,k,r)=>r + r[k-1]);

Would produce the Fibonacci sequence.

Now recurrence basically does this but is less general.

One of the things that have limited my use of ranges is precisely these issues. I find that if I can't deal with certain common issues easily with ranges then I just do things the old fashion way.

Most algorithms I have corner cases and I either have to deal with it either with ranges or loops... each one requires separation... with loops though sometimes I can make things efficient by some tricks(e.g., using goto).

If I can get ranges to do these things simply and naturally then I'm more likely to use them.

On Saturday, 5 October 2019 at 00:38:06 UTC, Brett wrote: > Typically a lot of algorithms have corner cases such as referencing elements that end up out of bounds at the start or end (k-c or k+c). > > [...] mir-algorithm package provides lazy padding and concatenation routines http://mir-algorithm.libmir.org/mir_ndslice_concatenation.html It may be slightly more complex then you expect as the library was created for a multidimensional random-access ranges (ndslices). Lazy `Concatenation` structure can be effectively eagerly evaluated to a memory allocated ndslice or be lazily iterated using `opIndex` for random access or input range primitives for sequential access. It doesn't provide a backward range primitive but you are welcome to open PR to add them if required. Best, Ilya

Here is a sort of proof of concept: struct CtsExtend { static auto opCall(T)(T a) { struct _CtsExtend { T _a; auto opIndex(int i) { if (i < 0) return a[0]; if (i >= a.length) return a[$-1]; return a[i]; } } _CtsExtend x; x._a = a; return x; } } Not sure if it can be simplified(I had to create a sub struct to get things to work, hopefully it would be optimized out or can be simplified. I tried originally to do it all using meta programming but I couldn't figure it out). For any indexable it will override the index and modify it's behavior to constantly extend the ends. CtsExtend(arr)[-4] In general it would be nice to get this type of thing full featured(the various extensions, for it to be optimized, to work with ranges and other types of indexables that might allow negative indices, override the extension values, keep a history, etc... If it can be done and make to work well with ranges it would allow many algorithms to be very easily expressed and make ranges more powerful.

On Sunday, 6 October 2019 at 20:34:55 UTC, Brett wrote: > If it can be done and make to work well with ranges it would allow many algorithms to be very easily expressed and make ranges more powerful. You can make it a range by adding an "alias this" to the original array: struct ExtendedArray(T) { T[] a; alias a this; T opIndex(int i) { if (i < 0) return a[0]; else if (i >= a.length) return a[$-1]; else return a[i]; } } Full example: https://run.dlang.io/is/2x6LKD

On Tuesday, October 8, 2019 9:40:33 AM MDT Paul Backus via Digitalmars-d- learn wrote: > On Sunday, 6 October 2019 at 20:34:55 UTC, Brett wrote: > > If it can be done and make to work well with ranges it would allow many algorithms to be very easily expressed and make ranges more powerful. > > You can make it a range by adding an "alias this" to the original array: > > struct ExtendedArray(T) > { > T[] a; > alias a this; > > T opIndex(int i) > { > if (i < 0) return a[0]; > else if (i >= a.length) return a[$-1]; > else return a[i]; > } > } > > Full example: https://run.dlang.io/is/2x6LKD It would be less error-prone to just implement the appropriate functions on the struct. alias this and generic code are a terrible combination. It's way too easy for a type to pass a template constraint thanks to alias this and then have trouble because it passed based on the implicit conversion, but the conversion wasn't forced in the code using the type. You can get some really subtle problems if the code converts to the alias in some cases but not in others. - Jonathan M Davis

Forums