March 14, 2018
On Tuesday, 13 March 2018 at 17:10:03 UTC, jmh530 wrote:
> "Note that using row-major ordering may require more memory and time than column-major ordering, because the routine must transpose the row-major order to the column-major order required by the underlying LAPACK routine."

Maybe we should use only column major order. --Ilya
March 13, 2018
On 12 March 2018 at 20:37, 9il via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> Hi All,
>
> The Dlang multidimensional range type, ndslice, is a struct composed of an iterator, lengths, and possibly strides. It does not own memory and does not know anything about its content. ndslice is a faster and extended version of numpy.ndarray.
>
> After some work on commercial projects based on Lubeck[1] and ndslice, I figured out what API and memory management are required to make Dlang super fast and math friendly at the same time.
>
> The concept is the following:
> 1. All memory is managed by a global BetterC thread-safe ARC allocator.
> Optionally the allocator can be overloaded.
> 2. Users can take an internal ndslice to use the mir.ndslice API internally in
> functions.
> 3. auto matrixB = matrixA; // increases the ARC count
> 4. auto matrixB = matrixA.dup; // allocates a new matrix
> 5. matrix[i] returns a Vec and increases the ARC count; matrix[i, j] returns
> the contents of the cell.
> 6. Clever `=` expression-based syntax. For example:
>
>    // performs a CBLAS GEMM call and does zero memory allocations
>    C = alpha * A * B + beta * C;
>
> `Mat` and other types will support any numeric types and POD-like structs, plus a special overload for `bool` based on `bitwise` [2].
>
> I have a lot of work for the next months, but I am looking for a good opportunity to make Mat happen.
>
> For contributing or co-financing:
> Ilya Yaroshenko at
> gmail com
>
> Best Regards,
> Ilya

I'd like to understand why implement a distinct vector type, rather
than just an Nx1/1xN matrix?
That kind of sounds like a hassle to me... although there is already
precedent for it, in that a scalar is distinct from a 1x1 matrix (or a
1-length vector).

I want to talk to you about how we interact with colours better... since in that world, a matrix can't just be a grid of independent storage cells. That will throw some spanners into the works, and I'd like to think that designs will support a classy way of expressing images as matrices of pixel data.
March 14, 2018
On Wednesday, 14 March 2018 at 05:40:42 UTC, Manu wrote:
> I'd like to understand why implement a distinct vector type, rather
> than just an Nx1/1xN matrix?

This is just an API question of how elements of an Nx1/1xN matrix should be accessed,
e.g. whether one should specify one or two arguments to access an element.
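To make the question concrete, here is a minimal sketch of the two indexing styles. All names (`Vec`, `Mat`) and the column-major layout are illustrative assumptions, not the actual Mir API:

```d
// A distinct vector type: elements are accessed with one index.
struct Vec(T)
{
    T[] data;
    T opIndex(size_t i) { return data[i]; }
}

// A matrix type: elements are accessed with two indices.
struct Mat(T)
{
    T[] data;
    size_t rows, cols;
    // column-major: element (i, j) lives at offset i + j * rows
    T opIndex(size_t i, size_t j) { return data[i + j * rows]; }
}

void main()
{
    auto v = Vec!double([1.0, 2.0, 3.0]);
    auto m = Mat!double([1.0, 2.0, 3.0], 3, 1); // same data as a 3x1 matrix

    assert(v[1] == 2.0);    // one argument
    assert(m[1, 0] == 2.0); // two arguments for the same element
}
```

With a distinct `Vec`, `v[1]` suffices; if a vector is just an Nx1 `Mat`, the user must write `m[1, 0]` everywhere.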

> That kinds sounds like a hassle to me... although there is already
> precedent for it, in that a scalar is distinct from a 1x1 matrix (or a
> 1-length vector).

Yes, I have the same thoughts. A very high level of API abstraction may not look good to end users.

> I want to talk to you about how we interact with colours better... since in that world, a matrix can't just be a grid of independent storage cells. That will throw some spanners into the works, and I'd like to think that designs will support a classy way of expressing images as matrices of pixel data.

I have just replied to your email.

Thanks,
Ilya
March 14, 2018
On Wednesday, 14 March 2018 at 05:01:38 UTC, 9il wrote:
>
> Maybe we should use only column major order. --Ilya

In my head I had been thinking that the Mat type you want to introduce would be just an alias to a 2-dimensional Slice with a particular SliceKind and iterator. Am I right on that?

If that's the case, why not introduce a Tensor type that corresponds to Slice but in column-major storage and then have Mat (or Matrix) be an alias to 2-dimensional Tensor with a particular SliceKind and iterator.
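A purely illustrative sketch of that aliasing idea (`Tensor` is hypothetical here, not an existing Mir type): if `Tensor!(T, N)` were the column-major counterpart of `Slice`, then `Matrix` would be nothing more than a 2-dimensional alias of it.

```d
// Hypothetical column-major N-dimensional container; in a real
// implementation, lengths, strides, and an iterator would go here.
struct Tensor(T, size_t N)
{
}

// Matrix is then just an alias, exactly as Mat would be for Slice.
alias Matrix(T) = Tensor!(T, 2);

void main()
{
    static assert(is(Matrix!double == Tensor!(double, 2)));
}
```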


March 14, 2018
On 03/14/2018 01:01 AM, 9il wrote:
> On Tuesday, 13 March 2018 at 17:10:03 UTC, jmh530 wrote:
>> "Note that using row-major ordering may require more memory and time than column-major ordering, because the routine must transpose the row-major order to the column-major order required by the underlying LAPACK routine."
> 
> Maybe we should use only column major order. --Ilya

Has row-major fallen into disuse?

Generally: it would be great to have a standard collection of the typical data formats used in linear algebra and scientific coding. This would allow interoperation without having each library define its own types with identical layout but different names. I'm thinking of:

* multidimensional hyperrectangular
* multidimensional jagged
* multidimensional hypertriangular if some libraries use it
* sparse vector (whatever formats are most common; I assume an array of integral/floating-point pairs, with the integral index either before or after the floating-point value)

No need for a heavy interface on top of these. These structures would be low-maintenance and facilitate a common data language for libraries.
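As a sketch of the sparse-vector layout described above (the names and the field order are illustrative choices, not a proposed standard):

```d
// One nonzero entry: an integral index paired with a floating-point value.
struct SparseEntry(I, T) { I index; T value; }

// A sparse vector as an array of (index, value) pairs.
struct SparseVector(I, T)
{
    size_t length;                // logical length of the vector
    SparseEntry!(I, T)[] entries; // one entry per nonzero element

    T opIndex(I i)
    {
        foreach (e; entries)
            if (e.index == i)
                return e.value;
        return T(0); // elements not stored are implicitly zero
    }
}

void main()
{
    auto v = SparseVector!(uint, double)(10,
        [SparseEntry!(uint, double)(2, 3.5),
         SparseEntry!(uint, double)(7, -1.0)]);

    assert(v[2] == 3.5);
    assert(v[3] == 0.0);
}
```

Such a struct carries no behavior beyond element access, so any library agreeing on the layout could consume it directly.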


Andrei
March 14, 2018
On Wednesday, 14 March 2018 at 16:16:55 UTC, Andrei Alexandrescu wrote:
>
> Has row-major fallen into disuse?
> [snip]

C has always been row major and is not in disuse (the GSL library has gsl_matrix and that is row-major). However, Fortran and many linear algebra languages/frameworks have also always been column major. Some libraries allow both. I use Stan for Bayesian statistics. It has an array type and a matrix type. Arrays are row-major and matrices are column major. (Now that I think on it, an array of matrices would probably resolve most of my needs for an N-dimensional column major type)
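For reference, the two storage orders differ only in how a (row, column) pair maps to a flat offset; a minimal illustration:

```d
void main()
{
    enum rows = 2, cols = 3;
    // logical matrix: [[1, 2, 3], [4, 5, 6]]
    double[] rowMajor = [1, 2, 3, 4, 5, 6]; // rows contiguous (C, GSL)
    double[] colMajor = [1, 4, 2, 5, 3, 6]; // columns contiguous (Fortran, LAPACK)

    size_t i = 1, j = 2; // the element with value 6
    assert(rowMajor[i * cols + j] == 6);
    assert(colMajor[i + j * rows] == 6);
}
```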
March 14, 2018
On Wednesday, 14 March 2018 at 16:16:55 UTC, Andrei Alexandrescu wrote:
>> Maybe we should use only column major order. --Ilya
>
> Has row-major fallen into disuse?

Leaving aside interop between libraries and domains where row major is often used (e.g. computer graphics), the issue is semantic, I think. AFAIK, basically all "modern" domains which involve linear algebra ("data science", machine learning, scientific computing, statistics, etc.) at least formulate their models using the standard linear algebra convention, which is to treat vectors as column vectors (column major). It seems, at least tacitly, that this question is partly about whether to treat this use case as a first-class citizen---column vectors are the norm in these areas.

It's great to have flexibility at the low level to deal with whatever contingency may arise, but it seems that it might be worth taking a page out of MATLAB/numpy/Julia's book and to try to make the common case as easy as possible (what you were alluding to with a "common data language", I think). This seems to match the D notion of having "high torque". Start with svd(X) and ramp up as necessary.

Along these lines, including a zoo of different types of hypermatrices (triangular, banded, sparse, whatever) is cool, but the common cases for plain ol' matrices are dense, sparse, diagonal, triangular, Toeplitz, etc. Ideally data structures and algorithms covering this would be in the standard library?

Also, I think Armadillo is an extremely nice library, but even it can be a little frustrating and clunky at times. Another case study: the Python scientific toolchain (numpy/scipy/matplotlib) seems to be going in the direction of deprecating "ipython --pylab" (basically start ipython in a MATLAB compatibility mode so that typing typical MATLAB commands "just works"). This seems to me to be a huge mistake---the high-level "scripting mode" provided by "ipython --pylab" is extremely valuable.

This comment is coming from someone who has been sitting by the edge of the pool, waiting to hop in and check out D as a replacement for a combination of C++/Python/Julia/MATLAB for research in scientific computing. Take all this with a grain of salt since I haven't contributed anything to the D community. :^)
March 14, 2018
On Wednesday, 14 March 2018 at 17:22:16 UTC, Sam Potter wrote:
> Ideally data structures and algorithms covering this would be in the standard library?

I sure hope not. At least not for a long time anyway. It would be hard to make any progress if it were in the standard library. At this stage functionality is more important than having a tiny amount of code that is written perfectly and has satisfied the 824 rules necessary to get into Phobos.

> This comment is coming from someone who has been sitting by the edge of the pool, waiting to hop in and check out D as a replacement for a combination of C++/Python/Julia/MATLAB for research in scientific computing. Take all this with a grain of salt since I haven't contributed anything to the D community. :^)

What has been holding you back from using D? I've experienced no difficulties, but maybe I've been lucky.
March 14, 2018
On Wednesday, 14 March 2018 at 17:36:18 UTC, bachmeier wrote:
> On Wednesday, 14 March 2018 at 17:22:16 UTC, Sam Potter wrote:
>> Ideally data structures and algorithms covering this would be in the standard library?
>
> I sure hope not. At least not for a long time anyway. It would be hard to make any progress if it were in the standard library. At this stage functionality is more important than having a tiny amount of code that is written perfectly and has satisfied the 824 rules necessary to get into Phobos.

Sure. The key word in my statement was "ideally". :-)

For what it's worth, there is already an "informal spec" in the form of the high-level interface for numerical linear algebra and sci. comp. that has been developed (over three decades?) in MATLAB. This spec has been replicated (more or less) in Julia, Python, Octave, Armadillo/Eigen, and others. I'm not aware of all the subtleties involved in incorporating it into any standard library, let alone D's, but maybe this is an interesting place where D could get an edge over other competing languages. Considering that people in Python land have picked up D as a "faster Python", there might be more traction here than is readily apparent. Just some thoughts---I'm very biased. The reality is that these algorithms are equally (if not more) important in their domain than the usual "CS undergrad algorithms". It's easy enough to use a library, but again, "high torque" would seem to point to making it easier.

>
>> This comment is coming from someone who has been sitting by the edge of the pool, waiting to hop in and check out D as a replacement for a combination of C++/Python/Julia/MATLAB for research in scientific computing. Take all this with a grain of salt since I haven't contributed anything to the D community. :^)
>
> What has been holding you back from using D? I've experienced no difficulties, but maybe I've been lucky.

Hard to say. Right now I write numeric C libraries using C++ "as a backend". I've used D for some very small projects, and my feeling is that I'm using C++ as a worse D. I lean heavily on C++'s templates and metaprogramming abilities, and it looks like I could do what I do now in C++ more easily in D, and could do many more things besides. I think what's holding me back now is a combination of my own inertia (I have to get things done!) and possibly a degree of perceived "instability" (it's totally unclear to me how valid this perception is). I haven't tried "BetterC" yet, but once I have some downtime, I'm going to give it a whirl and see how easily it hooks up with my target languages. If it's a nice experience, I would very much like to try a larger project.
March 14, 2018
On Wednesday, 14 March 2018 at 20:21:15 UTC, Sam Potter wrote:
>
> Sure. The key word in my statement was "ideally". :-)
>
> For what it's worth, there is already an "informal spec" in the form of the high-level interface for numerical linear algebra and sci. comp. that has been developed (over three decades?) in MATLAB. This spec has been replicated (more or less) in Julia, Python, Octave, Armadillo/Eigen, and others. I'm not aware of all the subtleties involved in incorporating it into any standard library, let alone D's, but maybe this is an interesting place where D could get an edge over other competing languages. Considering that people in Python land have picked up D as a "faster Python", there might be more traction here than is readily apparent
> [snip]

libmir [1] originally started as std.experimental.ndslice (that component is now mir-algorithm). It was removed from std.experimental because it wasn't stable enough yet and needed breaking changes. I think it's doing just fine as a standalone library, rather than as part of the standard library. As this thread makes clear, there's certainly more work to be done on it, but I'm sure Ilya would appreciate any feedback or assistance.

I'm sympathetic to your point about D getting an edge by having a better linear algebra experience. I came to D for faster Python/R/Matlab (and not C++), though if I need to do something quickly, I still defer to Python/R. However, if you look at the TIOBE index, R and Matlab are at 18/20. Python is quite a bit higher, but its growth in popularity was not largely due to the Numpy/Scipy ecosystem. So while I think that D could get more traction if libmir turns itself into a premier linear algebra library, we should be realistic that linear algebra is a relatively small segment of how people use programming languages. Maybe these firms might be willing to pay for more support, though... (if a user could replace pandas with libmir, I would imagine some financial firms would be interested).

[1] https://github.com/libmir