June 11, 2015
On Tuesday, 9 June 2015 at 03:26:25 UTC, Ilya Yaroshenko wrote:

> There are
> https://github.com/9il/simple_matrix and
> https://github.com/9il/cblas .
> I will try to rework them for Phobos.
>
> Any ideas and suggestions?
>

A well-supported matrix math library would definitely lead to me using D more. I applaud any work being done on this subject, but I still feel there are some enhancements (most seemingly minor) that would really make a matrix math library easy and fun to use.

Most of what I discuss below is just syntactical sugar for some stuff that could be accomplished with loops or std.algorithm, but having it built-in would make practical use of a matrix math library much easier. I think Armadillo implements some of these as member functions, whereas other languages like R and Matlab have them more built-in.

Disclaimer: I don't consider myself a D expert, so I could be horribly wrong on some of this stuff.

1) There is no support for indexing an array with an array of indices (and assigning the result). For example:
int[] A = [-1, 1, 5];
int[] B = [1, 2];
int[] C = A[B];

You would have to use int[] C = A[1..3]; instead (slices exclude the end index). In this simple example, it’s not really a big deal, but if I have a function that returns B, then I can’t just throw B in there; I would have to loop through B and assign into C. So this type of assignment is possible, but if you’re frequently doing this kind of array manipulation, the number of loops you need starts increasing.
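In the meantime, the closest loop-free spelling I know of goes through map; this is just a sketch of the workaround, not a proposed API:

import std.algorithm : map;
import std.array : array;

void main()
{
    int[] A = [-1, 1, 5];
    int[] B = [1, 2];
    auto C = B.map!(i => A[i]).array; // picks out A[1] and A[2]
    assert(C == [1, 5]);
}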

2) Along the same lines, there is no support for replacing the B above with an array of bools like
bool[] B = [false, true, true];
or
auto B = A.map!(a => a < 0);

Again, it is doable with a loop, but this form of logical indexing is a pretty common idiom for people who use Matlab or R quite a bit.
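Today the closest I can get is a zip/filter chain; a sketch of the workaround, not of the wished-for A[B]:

import std.algorithm : filter, map;
import std.array : array;
import std.range : zip;

void main()
{
    int[] A = [-1, 1, 5];
    bool[] B = [false, true, true];
    // keep the elements of A where the mask is true
    auto C = zip(A, B).filter!(t => t[1]).map!(t => t[0]).array;
    assert(C == [1, 5]);
}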

3) In addition to being able to index by a range of values or bools, you would want to be able to make assignments based on this. So something like
A[B] = c;

This is a very common operation in R or Matlab.
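For comparison, the loop you currently have to write in place of A[B] = c; just a sketch:

void main()
{
    int[] A = [-1, 1, 5];
    bool[] B = [true, false, false];
    int c = 0;
    foreach (i, keep; B) // the wished-for A[B] = c;
        if (keep)
            A[i] = c;
    assert(A == [0, 1, 5]);
}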

4) Along the lines of #2, as an alternative to map, there is no support for array comparison operators. Something like
int[3] B;
B[] = A[] + 5;

works, but

bool[3] B;
B[] = A[] > 0;

doesn’t (I’m also not sure why I can’t just write auto B[] = A[] + 5; but that’s neither here nor there). Moreover, it seems that only the mathematical operators work this way. Mathematical functions from std.math, like exp, don’t: you have to use map (or a loop) with exp to get the result. I don’t have an issue with map per se, but it seems inconsistent that some things work and others don’t.
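For reference, the map spelling of both cases; a sketch:

import std.algorithm : map;
import std.array : array;
import std.math : exp;

void main()
{
    int[] A = [-1, 1, 5];
    auto B = A.map!(a => a > 0).array;               // element-wise comparison
    auto E = A.map!(a => exp(cast(double) a)).array; // element-wise exp
    assert(B == [false, true, true]);
}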

5) You can only assign scalars to slices of arrays. There doesn’t seem to be an ability to assign an array to a slice. For instance, in #1, I couldn’t write A[0..1] = B; or A[0, 1] = B; instead of what I had written for C.

6) std.range and std.algorithm seem to have much better support for one-dimensional containers than for containers treated as two-dimensional. If you have a two-dimensional array and want to use map on every element, then there’s no issue. However, if you want to apply a function to each column or row, then you’d have to use a for loop (not even foreach).

This seems to be a more difficult problem to solve than the others. I’m not sure what the best approach is, but it makes sense to look at other languages/libraries. In R, you have apply, which can operate on arrays of any dimension. Matlab has arrayfun. Numpy has apply_along_axis. Armadillo has .each_col and .each_row (another nice thing about Armadillo is that you can switch the underlying matrix math library, e.g. OpenBLAS vs. Intel MKL).
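For what it's worth, here is roughly what is expressible today: map over rows falls out of a 2-D array being a range of rows, and columns can be reached with std.range.transversal, though neither is as direct as R's apply. A sketch:

import std.algorithm : map, sum;
import std.array : array;
import std.range : iota, transversal;

void main()
{
    int[][] M = [[1, 2], [3, 4]];
    // per-row: a 2-D array is already a range of rows
    auto rowSums = M.map!(row => row.sum).array;
    // per-column: walk the j-th element of every row
    auto colSums = iota(M[0].length).map!(j => M.transversal(j).sum).array;
    assert(rowSums == [3, 7]);
    assert(colSums == [4, 6]);
}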
June 11, 2015
On Thursday, 11 June 2015 at 21:30:22 UTC, jmh530 wrote:
>
> Most of what I discuss below is just syntactical sugar for some stuff that could be accomplished with loops or std.algorithm,

Your post reminds me of two things I've considered attempting in the past:
1) a set of operators that have no meaning unless an overload is specifically provided (for dot product, dyadic transpose, etc.) and
2) a library implementing features of array-oriented languages to the extent it's possible (APL functions, rank awareness, trivial reshaping, aggregate lifting, et al).

Syntax sugar can be important.
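To illustrate the status quo behind (1): today a dot product has to hijack an existing token via opBinary. A minimal sketch (the Vec type is just an invented example):

import std.numeric : dotProduct;

struct Vec
{
    double[] data;
    // no dedicated dot-product operator exists, so reuse "*"
    double opBinary(string op : "*")(Vec rhs)
    {
        return dotProduct(data, rhs.data);
    }
}

void main()
{
    assert(Vec([1.0, 2.0]) * Vec([3.0, 4.0]) == 11.0); // 1*3 + 2*4
}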

-Wyatt
June 12, 2015
On Thursday, 11 June 2015 at 22:36:28 UTC, Wyatt wrote:

> 1) a set of operators that have no meaning unless an overload is specifically provided (for dot product, dyadic transpose, etc.) and


I see your point, but I think it might be a bit risky to allow too much freedom in overloading operators. For instance, what if two people implement separate packages for matrix multiplication, one adopting the syntax of R (%*%) and the other the new Python syntax (@)? It could lead to some confusion.
June 12, 2015
On 10 June 2015 at 03:04, John Colvin via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> On Tuesday, 9 June 2015 at 16:45:33 UTC, Manu wrote:
>>
>> On 10 June 2015 at 02:32, John Colvin via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>>>
>>> On Tuesday, 9 June 2015 at 16:18:06 UTC, Manu wrote:
>>>>
>>>>
>>>> On 10 June 2015 at 01:26, Ilya Yaroshenko via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>>>>>
>>>>> [...]
>>>>
>>>>
>>>>
>>>> A complication for linear algebra (or other mathsy things in general)
>>>> is the inability to detect and implement compound operations.
>>>> We don't declare mathematical operators to be algebraic operations,
>>>> which I think is a lost opportunity.
>>>> If we defined the operators along with their properties
>>>> (commutativity, transitivity, invertibility, etc), then the compiler
>>>> could potentially do an algebraic simplification on expressions before
>>>> performing codegen and optimisation.
>>>> There are a lot of situations where the optimiser can't simplify
>>>> expressions because it runs into an arbitrary function call, and I've
>>>> never seen an optimiser that understands exp/log/roots, etc, to the
>>>> point where it can reduce those expressions properly. To compete with
>>>> maths benchmarks, we need some means to simplify expressions properly.
>>>
>>>
>>>
>>> Optimising floating point is a massive pain because of precision concerns and IEEE-754 conformance. Just because something is analytically the same doesn't mean you want the optimiser to go ahead and make the switch for you.
>>
>>
>> We have flags to control this sort of thing (fast-math, strict ieee, etc).
>> I will worry about my precision, I just want the optimiser to do its
>> job and do the very best it possibly can. In the case of linear
>> algebra, the optimiser generally fails and I must manually simplify
>> expressions as much as possible.
>
>
> If the compiler is free to rewrite by analytical rules then "I will worry about my precision" is equivalent to either "I don't care about my precision" or "I have checked the codegen". A simple rearrangement of an expression can easily turn a perfectly good result in to complete garbage. It would be great if compilers were even better at fast-math mode, but an awful lot of applications can't use it.

This is fine, those applications would continue not to use it.
Personally, I've never written code in 20 years where I didn't want fast-math.
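And I'm well aware of what a rearrangement can do; the classic absorption case, as a sketch (values chosen so the effect shows up even at extended precision):

void main()
{
    double big = 1e30, small = 1.0;
    // analytically (big + small) - big == small, but in floating
    // point the 1.0 is absorbed and the result is 0.0; exactly the
    // kind of thing an analytical rewrite can silently change
    assert((big + small) - big == 0.0);
}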
June 12, 2015
On 10 June 2015 at 02:40, Ilya Yaroshenko via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> On Tuesday, 9 June 2015 at 16:18:06 UTC, Manu wrote:
>>
>> On 10 June 2015 at 01:26, Ilya Yaroshenko via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>>>
>>>
>>>> I believe that Phobos must support some common methods of linear algebra
>>>> and general mathematics. I have no desire to join D with Fortran
>>>> libraries
>>>> :)
>>>
>>>
>>>
>>> D definitely needs BLAS API support for matrix multiplication. The best
>>> BLAS libraries are written in assembler, like OpenBLAS. Otherwise D will
>>> take last place in the corresponding math benchmarks.
>>
>>
>> A complication for linear algebra (or other mathsy things in general)
>> is the inability to detect and implement compound operations.
>> We don't declare mathematical operators to be algebraic operations,
>> which I think is a lost opportunity.
>> If we defined the operators along with their properties
>> (commutativity, transitivity, invertibility, etc), then the compiler
>> could potentially do an algebraic simplification on expressions before
>> performing codegen and optimisation.
>> There are a lot of situations where the optimiser can't simplify
>> expressions because it runs into an arbitrary function call, and I've
>> never seen an optimiser that understands exp/log/roots, etc, to the
>> point where it can reduce those expressions properly. To compete with
>> maths benchmarks, we need some means to simplify expressions properly.
>
>
> Simplified expressions would not help because:
> 1. At the matrix (high) level, optimisation can be done very well by the
> programmer (matrix algorithms are small in terms of the number of matrix
> multiplications).

Perhaps you've never worked with incompetent programmers (in my
experience, >50% of the professional workforce).
Programmers, on average, don't know maths. They literally have no idea
how to simplify an algebraic expression.
I think there are about 3-4 (being generous!) people in my office (of
30-40) that could do it properly, and without spending heaps of time
on it.

> 2. Low-level optimisation requires CPU/cache-specific tuning. Modern implementations are optimised for all cache levels. See the work by Kazushige Goto: http://www.cs.utexas.edu/users/pingali/CS378/2008sp/papers/gotoPaper.pdf

Low-level optimisation is a sliding scale, not a binary position. Reaching the 'optimal' state definitely requires careful consideration of all the details you refer to, but there are a lot of improvements that can be gained from quickly written code without full low-level optimisation. A lot of basic low-level optimisations (like just using appropriate opcodes, or eliding redundant operations, e.g. squares followed by sqrt) can't be applied without first simplifying expressions.
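To make that last point concrete, here is the kind of redundancy I mean, which today has to be simplified by hand (a sketch; note the two forms can differ in their last bits, which is exactly why a strict-IEEE optimiser won't do the rewrite itself):

import std.math;

// naive: computes a length only to square it again
double distSquaredNaive(double dx, double dy)
{
    return sqrt(dx * dx + dy * dy) ^^ 2;
}

// hand-simplified: faster, and with one rounding step fewer
double distSquared(double dx, double dy)
{
    return dx * dx + dy * dy;
}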
June 12, 2015
On Friday, 12 June 2015 at 00:51:04 UTC, Manu wrote:
> Perhaps you've never worked with incompetent programmers (in my
> experience, >50% of the professional workforce).
> Programmers, on average, don't know maths. They literally have no idea
> how to simplify an algebraic expression.
> I think there are about 3-4 (being generous!) people in my office (of
> 30-40) that could do it properly, and without spending heaps of time
> on it.

But you don't think you need to look up to programmers who aren't able to quickly simplify an algebraic expression, do you? :)

For example, I'm a little addicted to competitive programming, and I could really use matrices and other maths in the standard library.
June 12, 2015
On Friday, 12 June 2015 at 00:11:16 UTC, jmh530 wrote:
> On Thursday, 11 June 2015 at 22:36:28 UTC, Wyatt wrote:
>
>> 1) a set of operators that have no meaning unless an overload is specifically provided (for dot product, dyadic transpose, etc.) and
>
>
> I see your point, but I think it might be a bit risky to allow too much freedom in overloading operators. For instance, what if two people implement separate packages for matrix multiplication, one adopting the syntax of R (%*%) and the other the new Python syntax (@)? It could lead to some confusion.

From the outset, my thought was to strictly define the set of (eight or so?) symbols for this. If memory serves, it was right around the time Walter rejected wholesale user-defined operators, because of exactly the problem you mention. (Compounded by Unicode: what the hell is "2 🐵 8" supposed to be!?) I strongly suspect you don't need many simultaneous extra operators on a type to cover most cases.

-Wyatt
June 12, 2015
On Friday, 12 June 2015 at 01:55:15 UTC, Wyatt wrote:
> From the outset, my thought was to strictly define the set of (eight or so?) symbols for this. If memory serves, it was right around the time Walter rejected wholesale user-defined operators, because of exactly the problem you mention. (Compounded by Unicode: what the hell is "2 🐵 8" supposed to be!?) I strongly suspect you don't need many simultaneous extra operators on a type to cover most cases.
>
> -Wyatt

What would the new order of operations be for these new operators?
June 12, 2015
On 12/06/2015 9:30 a.m., jmh530 wrote:
> [...]


Humm, work on getting gl3n into Phobos or work on my ODBC driver manager. Tough choice.
June 12, 2015
On Friday, 12 June 2015 at 00:51:04 UTC, Manu wrote:
> On 10 June 2015 at 02:40, Ilya Yaroshenko via Digitalmars-d
> <digitalmars-d@puremagic.com> wrote:
>> [...]
>
> Perhaps you've never worked with incompetent programmers (in my
> experience, >50% of the professional workforce).
> Programmers, on average, don't know maths. They literally have no idea
> how to simplify an algebraic expression.
> I think there are about 3-4 (being generous!) people in my office (of
> 30-40) that could do it properly, and without spending heaps of time
> on it.
>
>> 2. Low-level optimisation requires CPU/cache-specific tuning. Modern
>> implementations are optimised for all cache levels. See the work by
>> Kazushige Goto:
>> http://www.cs.utexas.edu/users/pingali/CS378/2008sp/papers/gotoPaper.pdf
>
> Low-level optimisation is a sliding scale, not a binary position.
> Reaching the 'optimal' state definitely requires careful consideration
> of all the details you refer to, but there are a lot of improvements
> that can be gained from quickly written code without full low-level
> optimisation. A lot of basic low-level optimisations (like just using
> appropriate opcodes, or eliding redundant operations, e.g. squares
> followed by sqrt) can't be applied without first simplifying
> expressions.

OK, generally you are talking about something we could name MathD. I understand the reasons. However, I am strictly against algebraic transformations (or the eliding of redundant floating-point operations) for the basic routines of a systems programming language. Even the internal conversion of float/double to real in maths expressions is a huge headache when implementing maths algorithms (see the first two comments at https://github.com/D-Programming-Language/phobos/pull/2991 ). In a systems PL, sqrt(x)^^2 should compile as is.

Such optimisations can instead be implemented on top of the basic routines (pow, sqrt, gemv, gemm, etc.). We could use an approach similar to D's compile-time regex.
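For those unfamiliar with the reference: std.regex.ctRegex builds the matching engine at compile time from the pattern, and a maths library could specialise expressions into calls to the basic routines in the same way. A usage sketch of the existing regex API, purely as an analogy:

import std.regex : ctRegex, matchFirst;

void main()
{
    // the engine for this pattern is generated at compile time,
    // not interpreted at run time
    enum re = ctRegex!`\d+`;
    assert(!matchFirst("abc123", re).empty);
}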

Best,
Ilya