October 09, 2020
On 10/9/20 9:38 AM, Simen Kjærås wrote:
> On Friday, 9 October 2020 at 13:31:18 UTC, Steven Schveighoffer wrote:
>> We are talking about a range that is also contiguous in memory. The only logical reason to require this is to get a ptr and a length and use the CPU specialized instructions for these things. We have that type pattern, it's a slice.
> 
> There are ranges with contiguous memory that aren't slices - stride and occasionally map over a contiguous range comes to mind. Converting these to slices may not be possible, and certainly not efficient.

Map over a contiguous range is going to return from the stack, not from the data.

A stride is not contiguous, it skips data.

I'm struggling to understand what the benefit of defining these as contiguous is. What can you do when defining something that isn't an array as contiguous memory? What are the primitives that you use, for what purpose?

-Steve
October 09, 2020
On Friday, 9 October 2020 at 15:44:27 UTC, Steven Schveighoffer wrote:
> I'm struggling to understand what the benefit of defining these as contiguous is. What can you do when defining something that isn't an array as contiguous memory? What are the primitives that you use, for what purpose?

All kinds of numerics. Like, getting the real parts from a complex set and use an algorithm that has been defined on reals only.



October 09, 2020
On 10/9/20 11:47 AM, Ola Fosheim Grøstad wrote:
> On Friday, 9 October 2020 at 15:44:27 UTC, Steven Schveighoffer wrote:
>> I'm struggling to understand what the benefit of defining these as contiguous is. What can you do when defining something that isn't an array as contiguous memory? What are the primitives that you use, for what purpose?
> 
> All kinds of numerics. Like, getting the real parts from a complex set and use an algorithm that has been defined on reals only.

So basically, a contiguous primitive might be ptr, distance between elements, and total elements? And this is something you can plug into a low-level API? I can see that being a possibility, but also the underlying type has to be an array anyway.

I don't know that we would have to define a range primitive for this. Just a set of operations on an array of data, with parameters for the other pieces.

-Steve
October 09, 2020
On Friday, 9 October 2020 at 15:44:27 UTC, Steven Schveighoffer wrote:
> What can you do when defining something that isn't an array as contiguous memory? What are the primitives that you use, for what purpose?

Ok, so here is another example. Priority queue. Traversing it in order would jump over the place and be slow. A sequential scan has major cache-prefetch benefits.

All you need is a new filter "unordered", when that is applied to a container that provides an unordered contiguous memory interface it can do a sequential scan.

So you would get:

"container.somerangefunction" is in order, but slow.

"container.unordered.somerangefunction" may not be in order, but is fast.


October 09, 2020
On Friday, 9 October 2020 at 15:14:19 UTC, Ola Fosheim Grøstad wrote:
>
> I guess it in some cases would be technically possible to emulate fixed stride with a recast to a slice of a struct with padding before and after the element you want.
>
> But not very clean.
>
> Microsoft's experimental gsl::span implementation did support strides, but I don't think that is possible with the std::span that ended up in the standard.  (C++ span is similar to D slices)

This thread makes me happy that I don't need to meddle with modern C++. In order to use modern C++ you need to become some kind of botanist in order to keep track of all the primitives in C++. Just like in the vegetable kingdom, you might discover new primitives that you didn't know and new primitives seem to cross breed all the time creating new ones.
October 09, 2020
On Friday, 9 October 2020 at 16:05:09 UTC, Steven Schveighoffer wrote:
> So basically, a contiguous primitive might be ptr, distance between elements, and total elements? And this is something you can plug into a low-level API? I can see that being a possibility, but also the underlying type has to be an array anyway.

I guess that is possible. I think a cacheline aligned bitmask version is needed to do better than what C++ numerics TS will end up with, but hard to design.

> I don't know that we would have to define a range primitive for this. Just a set of operations on an array of data, with parameters for the other pieces.

It depends on how common it is to process data without changing it?

I don't know. Maybe look at existing code and see what people do. Not only D code, but look at what people who use dataflow frameworks do.

I guess github would be a good resource.

October 09, 2020
On Friday, 9 October 2020 at 16:09:08 UTC, IGotD- wrote:
> This thread makes me happy that I don't need to meddle with modern C++. In order to use modern C++ you need to become some kind of botanist in order to keep track of all the primitives in C++. Just like in the vegetable kingdom, you might discover new primitives that you didn't know and new primitives seem to cross breed all the time creating new ones.

LOL IIRC Bjarne Stroustrup said that he does not know what will be in C++23 because he will need a couple of years to play with what is in C++20. (Or something along those lines).


October 09, 2020
On 10/9/20 12:05 PM, Ola Fosheim Grøstad wrote:
> On Friday, 9 October 2020 at 15:44:27 UTC, Steven Schveighoffer wrote:
>> What can you do when defining something that isn't an array as contiguous memory? What are the primitives that you use, for what purpose?
> 
> Ok, so here is another example. Priority queue. Traversing it in order would jump over the place and be slow. A sequential scan has major cache-prefetch benefits.
> 
> All you need is a new filter "unordered", when that is applied to a container that provides an unordered contiguous memory interface it can do a sequential scan.
> 
> So you would get:
> 
> "container.somerangefunction" is in order, but slow.
> 
> "container.unordered.somerangefunction" may not be in order, but is fast.
> 
> 

It's fast because the unordered accessor gets a slice, and somerangefunction sees its a slice and does something magic because it's contiguous memory.

The question is, would there be any purpose to returning something OTHER than a slice for unordered? And I don't mean, you happen to get benefits because it's sequential, it has to be something that the algorithm will do differently because it sees it's a special type.

-Steve
October 09, 2020
On Friday, 9 October 2020 at 16:57:52 UTC, Steven Schveighoffer wrote:
> The question is, would there be any purpose to returning something OTHER than a slice for unordered? And I don't mean, you happen to get benefits because it's sequential, it has to be something that the algorithm will do differently because it sees it's a special type.

Yes, at least the following:

1. alignment guarantees (so you don't have to check that at runtime)
2. min-size guarantees (e.g. all slices are at least 2 elements)
3. annotation of elements being valid or not (scanning hashtables etc)
4. other guarantees (not-null, not-Nan etc)

You also have the bitsequences/masks I've mentioned that could be more extensive, giving various qualities of the slice, so that you can scan faster without even loading the data. Both point 3.) and 4.) could be bitmasks.

There are also some compression techniques that provide skip-pointers. Kinda like skip-lists:

https://en.wikipedia.org/wiki/Skip_list

There are certainly many interesting possibilities, but I think bitmasks would go a long way (with the option to skip all-zero masks).


October 09, 2020
On Friday, 9 October 2020 at 17:09:44 UTC, Ola Fosheim Grøstad wrote:
> There are also some compression techniques that provide skip-pointers. Kinda like skip-lists:

Not necessarily compression, just as likely performance optimization of scanning. Although compression techniques use such skip-pointers (or rather offsets) to jump from one header to the next, bypassing a variable sized data chunk.