June 09, 2015
On Tuesday, 9 June 2015 at 16:40:56 UTC, Ilya Yaroshenko wrote:
> On Tuesday, 9 June 2015 at 16:18:06 UTC, Manu wrote:
>> On 10 June 2015 at 01:26, Ilya Yaroshenko via Digitalmars-d
>> <digitalmars-d@puremagic.com> wrote:
>>>
>>>> I believe that Phobos must support some common methods of linear algebra
>>>> and general mathematics. I have no desire to tie D to Fortran libraries
>>>> :)
>>>
>>>
>>> D definitely needs BLAS API support for matrix multiplication. The best
>>> BLAS libraries, like OpenBLAS, are written in assembler. Otherwise D will
>>> place last in the corresponding math benchmarks.
>>
>> A complication for linear algebra (or other mathsy things in general)
>> is the inability to detect and implement compound operations.
>> We don't declare mathematical operators to be algebraic operations,
>> which I think is a lost opportunity.
>> If we defined the operators along with their algebraic properties
>> (commutativity, transitivity, invertibility, etc.), then the compiler
>> could potentially perform algebraic simplification on expressions
>> before codegen and optimisation.
>> There are a lot of situations where the optimiser can't simplify
>> expressions because it runs into an arbitrary function call, and I've
>> never seen an optimiser that understands exp/log/roots, etc, to the
>> point where it can reduce those expressions properly. To compete with
>> maths benchmarks, we need some means to simplify expressions properly.
>
> Simplified expressions would help because:
> 1. At the matrix (high) level, optimisation can be done very well by the programmer (matrix algorithms, measured in the number of matrix multiplications, are small).
> 2. Low-level optimisation requires CPU- and cache-specific tuning. Modern implementations are optimised for all cache levels. See the work by Kazushige Goto: http://www.cs.utexas.edu/users/pingali/CS378/2008sp/papers/gotoPaper.pdf
EDIT: simplified expressions would NOT help.
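
To make the cache point concrete, here is a minimal one-level blocking sketch in D (my illustration, not from the paper; real kernels such as Goto's add operand packing and hand-tuned assembly micro-kernels on top of this):

import std.algorithm : min;

enum blk = 64; // block size chosen so the working set stays in cache

// C += A * B for n x n row-major matrices, blocked for cache reuse.
void matmulBlocked(const double[] a, const double[] b, double[] c, size_t n)
{
    for (size_t ii = 0; ii < n; ii += blk)
    for (size_t kk = 0; kk < n; kk += blk)
    for (size_t jj = 0; jj < n; jj += blk)
        foreach (i; ii .. min(ii + blk, n))
        foreach (k; kk .. min(kk + blk, n))
        {
            const aik = a[i * n + k];
            foreach (j; jj .. min(jj + blk, n))
                c[i * n + j] += aik * b[k * n + j];
        }
}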
June 09, 2015
On Tuesday, 9 June 2015 at 16:08:40 UTC, Andrei Alexandrescu wrote:
> On 6/9/15 1:50 AM, John Colvin wrote:
>> On Tuesday, 9 June 2015 at 06:59:07 UTC, Andrei Alexandrescu wrote:
>>> (a) Provide standard data layouts in std.array for the typical shapes
>>> supported by linear algebra libs: row major, column major, along with
>>> striding primitives.
>>
>> I don't think this is quite the right approach. Multidimensional arrays
>> and matrices are about accessing and iterating over data, not about the
>> data structures themselves. The standard layouts are common special cases.
>
> I see. So what would be the primitives necessary? Strides (in the form of e.g. special ranges)? What are the things that would make a library vendor or user go, "OK, now I know what steps to take to use my code with D"?
>

An N-dimensional slice can be expressed as N slices and N shifts,
where each shift is the number of elements in the source range between the front elements of neighboring sub-slices at the corresponding slice level.

private struct Slice(size_t N, Range)
{
    size_t[2][N] slices; // [begin, end) pair per dimension
    size_t[N] shifts;    // stride per dimension, in elements of the source
    Range range;         // the underlying flat source range
}
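
For intuition, a sketch (my illustration, not the actual design) of how the shifts act as strides: viewing a flat 12-element range as a 3x4 row-major matrix, element (i, j) of the slice lives at i * shifts[0] + j * shifts[1] in the source range.

import std.stdio;

void main()
{
    auto data = new double[12];
    foreach (k, ref e; data)
        e = k;                     // 0, 1, ..., 11

    size_t[2] shifts = [4, 1];     // row stride 4, column stride 1
    size_t i = 1, j = 2;
    writeln(data[i * shifts[0] + j * shifts[1]]); // element (1, 2): prints 6
}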
June 09, 2015
On Tuesday, 9 June 2015 at 16:45:33 UTC, Manu wrote:
> On 10 June 2015 at 02:32, John Colvin via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>> On Tuesday, 9 June 2015 at 16:18:06 UTC, Manu wrote:
>>>
>>> On 10 June 2015 at 01:26, Ilya Yaroshenko via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>>>> [...]
>>>
>>>
>>> A complication for linear algebra (or other mathsy things in general)
>>> is the inability to detect and implement compound operations.
>>> We don't declare mathematical operators to be algebraic operations,
>>> which I think is a lost opportunity.
>>> If we defined the operators along with their algebraic properties
>>> (commutativity, transitivity, invertibility, etc.), then the compiler
>>> could potentially perform algebraic simplification on expressions
>>> before codegen and optimisation.
>>> There are a lot of situations where the optimiser can't simplify
>>> expressions because it runs into an arbitrary function call, and I've
>>> never seen an optimiser that understands exp/log/roots, etc, to the
>>> point where it can reduce those expressions properly. To compete with
>>> maths benchmarks, we need some means to simplify expressions properly.
>>
>>
>> Optimising floating point is a massive pain because of precision concerns and IEEE-754 conformance. Just because something is analytically the same doesn't mean you want the optimiser to go ahead and make the switch for you.
>
> We have flags to control this sort of thing (fast-math, strict ieee, etc).
> I will worry about my precision, I just want the optimiser to do its
> job and do the very best it possibly can. In the case of linear
> algebra, the optimiser generally fails and I must manually simplify
> expressions as much as possible.

If the compiler is free to rewrite by analytical rules then "I will worry about my precision" is equivalent to either "I don't care about my precision" or "I have checked the codegen". A simple rearrangement of an expression can easily turn a perfectly good result into complete garbage. It would be great if compilers were even better at fast-math mode, but an awful lot of applications can't use it.
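
To make the tension concrete (my example, not from the thread): an optimiser rewriting by the algebraic rules Manu wants would simplify (x + 1.0) - x to 1.0, yet under IEEE arithmetic the two are not the same computation:

import std.stdio;

void main()
{
    double x = 1.0e16;
    // At this magnitude adjacent doubles are 2 apart, so the added 1.0
    // is absorbed and the IEEE-conformant answer is 0, not 1.
    writeln((x + 1.0) - x); // prints 0
    // A compiler free to rewrite algebraically would print 1 instead.
}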
June 09, 2015
On 6/9/15 9:16 AM, Dennis Ritchie wrote:
> On Tuesday, 9 June 2015 at 16:14:24 UTC, Dennis Ritchie wrote:
>> To solve these problems you need something like BLAS. Perhaps BLAS is
>> the more practical way to enrich D's techniques for working with matrices.
>
> Actually, this is what needs to be implemented in D:
> http://www.boost.org/doc/libs/1_58_0/libs/numeric/ublas/doc/index.html

"And finally uBLAS offers good (but not outstanding) performance." -- Andrei
June 09, 2015
On Tuesday, 9 June 2015 at 17:19:28 UTC, Andrei Alexandrescu wrote:
> On 6/9/15 9:16 AM, Dennis Ritchie wrote:
>> On Tuesday, 9 June 2015 at 16:14:24 UTC, Dennis Ritchie wrote:
>>> To solve these problems you need something like BLAS. Perhaps BLAS is
>>> the more practical way to enrich D's techniques for working with matrices.
>>
>> Actually, this is what needs to be implemented in D:
>> http://www.boost.org/doc/libs/1_58_0/libs/numeric/ublas/doc/index.html
>
> "And finally uBLAS offers good (but not outstanding) performance." -- Andrei

OK, but...
I can say the same thing about BigInt in Phobos:
"And finally `std.bigint` offers good (but not outstanding) performance."

I have solved 17 math problems, and for most of them I needed a `BigInt`:
http://i.imgur.com/CmOSm7V.png
https://projecteuler.net/language=D

If D didn't have `BigInt`, I probably would have used Boost.Multiprecision in C++:
http://www.boost.org/doc/libs/1_58_0/libs/multiprecision/doc/html/index.html

Or written some slow Python.

Maybe none of this delivers huge performance, but it helps across a wide range of mathematical problems.

So it is better to have something than nothing :) And BLAS is more than something...
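
For flavour, the kind of problem where `std.bigint` earns its keep (Project Euler #16, the digit sum of 2^1000; picked as an illustration, not necessarily one of the 17):

import std.algorithm : map, sum;
import std.bigint;
import std.conv : to;
import std.stdio;

void main()
{
    auto n = BigInt(2) ^^ 1000;                  // 302 decimal digits
    writeln(n.to!string.map!(c => c - '0').sum); // prints 1366
}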
June 09, 2015
On 6/9/15 11:42 AM, Dennis Ritchie wrote:
> "And finally `std.bigint` offers good (but not outstanding) performance."

BigInt should use reference counting. Its current approach to allocating new memory for everything is a liability. Could someone file a report for this please. -- Andrei
June 09, 2015
On Tuesday, 9 June 2015 at 18:58:56 UTC, Andrei Alexandrescu wrote:
> On 6/9/15 11:42 AM, Dennis Ritchie wrote:
>> "And finally `std.bigint` offers good (but not outstanding) performance."
>
> BigInt should use reference counting. Its current approach to allocating new memory for everything is a liability. Could someone file a report for this please. -- Andrei

Done:
https://issues.dlang.org/show_bug.cgi?id=14673
June 09, 2015
On 6/9/15 2:59 PM, Andrei Alexandrescu wrote:
> On 6/9/15 11:42 AM, Dennis Ritchie wrote:
>> "And finally `std.bigint` offers good (but not outstanding) performance."
>
> BigInt should use reference counting. Its current approach to allocating
> new memory for everything is a liability. Could someone file a report
> for this please. -- Andrei

Slightly OT, but this reminds me.

RefCounted is not viable when using the GC, because any references on the heap may race against stack-based references.

Can we make RefCounted use atomicInc and atomicDec? It will hurt performance a bit, but the current state is not good.
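
For reference, the kind of change this asks for is roughly the following (a sketch only, not the actual RefCounted internals; core.atomic spells these operations with atomicOp rather than literal atomicInc/atomicDec):

import core.atomic : atomicOp;

struct Store
{
    shared size_t count;
    // ... payload ...
}

void incRef(Store* s)
{
    atomicOp!"+="(s.count, 1);
}

bool decRef(Store* s) // true when the last reference just went away
{
    return atomicOp!"-="(s.count, 1) == 0;
}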

I spoke with Erik about this, as he was planning on using RefCounted, but didn't know about the hairy issues with the GC.

If we get to a point where we can have a thread-local GC, we can remove the implementation detail of using atomic operations when possible.

-Steve
June 09, 2015
On 6/9/15 12:21 PM, Dennis Ritchie wrote:
> On Tuesday, 9 June 2015 at 18:58:56 UTC, Andrei Alexandrescu wrote:
>> On 6/9/15 11:42 AM, Dennis Ritchie wrote:
>>> "And finally `std.bigint` offers good (but not outstanding)
>>> performance."
>>
>> BigInt should use reference counting. Its current approach to
>> allocating new memory for everything is a liability. Could someone
>> file a report for this please. -- Andrei
>
> Done:
> https://issues.dlang.org/show_bug.cgi?id=14673

Thanks! -- Andrei
June 09, 2015
On 6/9/15 1:53 PM, Steven Schveighoffer wrote:
> On 6/9/15 2:59 PM, Andrei Alexandrescu wrote:
>> On 6/9/15 11:42 AM, Dennis Ritchie wrote:
>>> "And finally `std.bigint` offers good (but not outstanding)
>>> performance."
>>
>> BigInt should use reference counting. Its current approach to allocating
>> new memory for everything is a liability. Could someone file a report
>> for this please. -- Andrei
>
> Slightly OT, but this reminds me.
>
> RefCounted is not viable when using the GC, because any references on
> the heap may race against stack-based references.

How do you mean that?

> Can we make RefCounted use atomicInc and atomicDec? It will hurt
> performance a bit, but the current state is not good.
>
> I spoke with Erik about this, as he was planning on using RefCounted,
> but didn't know about the hairy issues with the GC.
>
> If we get to a point where we can have a thread-local GC, we can remove
> the implementation detail of using atomic operations when possible.

The obvious solution that comes to mind is adding a Flag!"interlocked". -- Andrei
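
Something along those lines might look like this (a hypothetical shape, not a committed design):

import std.typecons : Flag, No, Yes;

struct RefCounted(T, Flag!"interlocked" interlocked = No.interlocked)
{
    // when interlocked is Yes.interlocked, bump the count with
    // core.atomic.atomicOp; otherwise use plain increments
}

alias SharedRefCounted(T) = RefCounted!(T, Yes.interlocked);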