Disjoint slices of an array as reference

Aug 20, 2020

data pulverizer

I have been trying to create a new array from an existing array that is effectively a view on the original array. This can be done with slices in D:

```
auto x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15];
auto y = x[0..5];
```

`y` is a subset of the `x` array. If I do:

```
y[0] = 11;
```

`x[0]` will be modified to 11. However, I would like to have disjoint slices, something like this:

```
auto y = x[0..5, 9..14];
```

which selects different disjoint parts of the array `x`, but this is not allowed on `T[]`. I have tried something like:

```
double[] y;
y ~= x[0..5];
y ~= x[9..14];
```

But the act of appending copies the array, which breaks the reference to the original array. The only other thing I have considered is creating an array of references to each of the elements of `x` I would like, but that just seems like overkill.

It would be good to know if there is a more straightforward way of doing this type of disjoint selection.

Thanks in advance.
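For readers following along, here is a minimal sketch of the two behaviours described in this post: a slice aliases the original array, while appending slices to an empty array copies the elements. The helper name `copyParts` is made up for illustration, and `int[]` is used instead of `double[]` so the element type matches `x`.

```d
import std.stdio;

/// Hypothetical helper: builds a new array from two slices of x.
/// Appending to an empty array allocates fresh storage, so this copies.
int[] copyParts(int[] x)
{
    int[] z;
    z ~= x[0 .. 5];
    z ~= x[9 .. 14];
    return z;
}

void main()
{
    auto x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15];

    // A slice shares memory with the array it was taken from.
    auto y = x[0 .. 5];
    y[0] = 11;
    assert(x[0] == 11);

    // The appended array is a copy: writes through z do not reach x.
    auto z = copyParts(x);
    z[0] = 99;
    assert(x[0] == 11);
    assert(z.length == 10);
    writeln(z);
}
```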

On Thursday, 20 August 2020 at 02:21:15 UTC, data pulverizer wrote:
> However I would like to have disjoint slices, something like this:
>
> ```
> auto y = x[0..5, 9..14];
> ```

I've been thinking about this some more and I don't think it is possible. An array in D is effectively two pointers, one at either side of a memory block. When you create a slice you are creating another array: two pointers somewhere in the same memory block. A disjoint slice array would need more than two pointers, which is not possible since an array only has two.

On Thursday, 20 August 2020 at 02:38:33 UTC, data pulverizer wrote:
> I've been thinking about this some more and I don't think it is possible. An array in D is effectively two pointers either side of a memory block. When you create a slice you are creating another array two pointers somewhere in the same memory block. A disjoint slice array would need more than two pointers - which is not possible since an array only has 2.

P.S. An array in D is either two pointers or one pointer and a length (I don't know which), but the point still stands.

On Thursday, 20 August 2020 at 02:21:15 UTC, data pulverizer wrote:
> ```
> double[] y;
> y ~= x[0..5];
> y ~= x[9..14];
> ```
>
> But the act of appending results in array copying - breaks the reference with the original array. The only other thing I have considered is creating an array of references to each of the elements of x I would like but that just seems like overkill.
>
> It would be good to know if there is a more straightforward way of doing this type of disjoint selection.
>
> Thanks in advance.

double[][] y;
y ~= x[0..5];
y ~= x[9..14];
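A small sketch of why the array-of-slices suggestion keeps the link to the original data: each element of `y` is itself a slice (pointer plus length), so appending it stores the reference rather than copying the elements. The helper name `pieces` is hypothetical, and `x` is assumed to be a `double[]` here.

```d
/// Hypothetical helper: collects two slices of x without copying their elements.
double[][] pieces(double[] x)
{
    double[][] y;
    y ~= x[0 .. 5];  // appends one slice as a single element of y
    y ~= x[9 .. 14];
    return y;
}

void main()
{
    double[] x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15];
    auto y = pieces(x);

    // Writing through the second piece changes the original array.
    y[1][0] = 100;
    assert(x[9] == 100);
    assert(y.length == 2 && y[0].length == 5);
}
```

The trade-off is that indexing is two-level (`y[piece][offset]`) rather than a single flat index.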

On Thursday, 20 August 2020 at 03:47:15 UTC, Paul Backus wrote:
> double[][] y;
> y ~= x[0..5];

Thanks. I might go for a design like this:

```
struct View(T){
  T* data;
  long[2][] ranges;
}
```

The ranges are where the slices are stored and `T*` (maybe even `immutable(T*)`) is a pointer to the start of the original array. I'll use an opIndex that calculates the correct index in the original array to obtain the right data.
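One way the `opIndex` described above might look, as a rough sketch only. It assumes half-open `[begin, end)` ranges, uses `size_t` instead of `long` for the bounds, and adds a `length` member that the post does not mention.

```d
/// Sketch of a view over disjoint half-open ranges of an existing array.
struct View(T)
{
    T* data;            // start of the original array
    size_t[2][] ranges; // [begin, end) pairs into the original array

    /// Total number of elements covered by all ranges (assumed addition).
    size_t length() const
    {
        size_t n = 0;
        foreach (r; ranges)
            n += r[1] - r[0];
        return n;
    }

    /// Maps a view index to the matching element of the original array.
    ref T opIndex(size_t i)
    {
        foreach (r; ranges)
        {
            immutable len = r[1] - r[0];
            if (i < len)
                return data[r[0] + i];
            i -= len; // skip past this range
        }
        assert(0, "View index out of bounds");
    }
}

void main()
{
    auto x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15];
    size_t[2][] ranges = [[0, 5], [9, 14]];
    auto v = View!int(x.ptr, ranges);

    assert(v.length == 10);
    assert(v[5] == 10); // first element of the second range, i.e. x[9]
    v[5] = 100;         // opIndex returns by ref, so this writes into x
    assert(x[9] == 100);
}
```

The linear scan over `ranges` is fine for a handful of ranges; with many ranges, precomputed cumulative lengths and a binary search would be the usual alternative.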

On 8/19/20 7:40 PM, data pulverizer wrote:
> An array in D is either two pointers or one pointer and a length (I
> don't know which)

It is the length, followed by the pointer, equivalent of the following struct:

struct A {
  size_t length_;
  void * ptr;

  size_t length() {
    return length_;
  }

  size_t length(size_t newLength) {
    // Modify length_ and ptr as necessary
  }
}

Ali
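The length-plus-pointer layout can be observed from user code through the `.length` and `.ptr` properties. A short sketch follows; the helper `offsetIn` is made up for illustration and only makes sense when `part` really was sliced from `whole`.

```d
/// Hypothetical helper: element offset of `part` within `whole`,
/// assuming `part` was sliced from `whole`.
ptrdiff_t offsetIn(T)(const T[] part, const T[] whole)
{
    return part.ptr - whole.ptr;
}

void main()
{
    auto x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15];
    auto y = x[9 .. 14];

    // A slice is just a length and a pointer into the same memory block.
    assert(y.length == 5);
    assert(offsetIn(y, x) == 9);

    // The slice itself occupies one length field and one pointer.
    static assert((int[]).sizeof == size_t.sizeof + (void*).sizeof);
}
```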

On 8/19/20 9:11 PM, data pulverizer wrote:
> On Thursday, 20 August 2020 at 03:47:15 UTC, Paul Backus wrote:
>> double[][] y;
>> y ~= x[0..5];
>
> Thanks. I might go for a design like this:
>
> ```
> struct View(T){
>   T* data;
>   long[2][] ranges;
> }
> ```
> The ranges are were the slices are stored and T* (maybe even immutable(T*)) is a pointer is to the start of the original array. I'll use an opIndex that calculates the correct index in the original array to obtain the right data.
>

I implemented the same idea recently; it's a fun exercise. :) I didn't bother with opIndex because my use case was happy with just the InputRange primitives (and .length I think).

And I had to implement it because std.range.chain works only with statically known number of sub-ranges. :/ If the number of ranges are known, then this works:

import std.stdio;
import std.range;

void main() {
  auto x = [1,  2,  3, 4,  5,
            6,  7,  8, 9,  10,
            11, 12, 13, 14, 15];
  auto y = chain(x[0..5], x[9..14]);
  writeln(y);
}

[1, 2, 3, 4, 5, 10, 11, 12, 13, 14]

Ali
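Since the original question was about keeping a reference to `x`, here is a short sketch showing that a `chain` of two slices of the same array can be indexed and assigned through, with the writes landing in the original array. The helper name `disjoint` is hypothetical.

```d
import std.range : chain;

/// Hypothetical helper: a lazy view over two disjoint parts of x.
auto disjoint(int[] x)
{
    return chain(x[0 .. 5], x[9 .. 14]);
}

void main()
{
    auto x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15];
    auto y = disjoint(x);

    assert(y.length == 10);
    assert(y[5] == 10); // first element of the second slice

    // Assigning through the chained range modifies x itself.
    y[0] = 11;
    y[5] = 100;
    assert(x[0] == 11);
    assert(x[9] == 100);
}
```

No elements are copied here; the chained range only holds the two slices.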

August 20, 2020

Re: Disjoint slices of an array as reference

Posted by data pulverizer
in reply to Ali Çehreli

On Thursday, 20 August 2020 at 08:26:59 UTC, Ali Çehreli wrote:
> On 8/19/20 9:11 PM, data pulverizer wrote:
>> Thanks. I might go for a design like this:
>> 
>> ```
>> struct View(T){
>>    T* data;
>>    long[2][] ranges;
>> }
>> ```
>> [...]
>
> I implemented the same idea recently; it's a fun exercise. :) I didn't bother with opIndex because my use case was happy with just the InputRange primitives (and .length I think).
>
> And I had to implement it because std.range.chain works only with statically known number of sub-ranges. :/ If the number of ranges are known, then this works:
>
> import std.stdio;
> import std.range;
>
> void main() {
>   auto x = [1,  2,  3, 4,  5,
>             6,  7,  8, 9,  10,
>             11, 12, 13, 14, 15];
>   auto y = chain(x[0..5], x[9..14]);
>   writeln(y);
> }
>
> [1, 2, 3, 4, 5, 10, 11, 12, 13, 14]
>
> Ali

Many thanks for confirming the internal structure of D's arrays and for the tip on std.range's chain function. It's exactly what I need. In fact the number of sub-ranges is statically known ...

As an aside, the reason for this query is that I have written a small module for multidimensional arrays to be included in a GLM (generalized linear models) package I am writing in D (in fact this will be the second major version of the package in D). I know about Mir, but my array doesn't have to be feature rich and only requires very few methods. I'm also probably going to write an article about its internal structure, I've learned a lot creating it, and it means the GLM library won't have dependencies outside D's standard library and BLAS/LAPACK.

The multidimensional array is structured like this:

```
struct Array(long N, T)
if(isFloatingPoint!T && (N >= 1))
{
  T[] data;
  long[N] dim;
  ...
}
```

Where N is the number of dimensions. The indexing and stuff works fine, but I wasn't happy that subsetting the array with slices returns a copy (for example a 2D array `A[0..2, 1..$]`), and since the subsetting is not necessarily contiguous, directly slicing from the data array was not feasible. But now N is known at compile time (`A[r[0][0]..r[0][1], r[1][0]..r[1][1], ... , r[N-1][0]..r[N-1][1]]`), so doing a subset using std.range's `chain` on `data` is easy. It also means I don't have to implement a separate `View` struct. This chain function will also simplify my indexing code. At the moment I am using string mixins to generate code for the for loops over all the different dimensions, where the function creating the string is recursive ... which was fun to write [:laugh:]!
