Thread overview
Mir Algorithm v0.5.8: Interpolation, Timeseries and 17 new functions
May 08, 2017
9il
May 08, 2017
jmh530
May 08, 2017
jmh530
May 09, 2017
9il
May 09, 2017
9il
May 09, 2017
jmh530
May 10, 2017
9il
May 10, 2017
jmh530
May 08, 2017
## New modules

 - mir.interpolation
 - mir.interpolation.linear
 - mir.interpolation.pchip
 - mir.timeseries
 - mir.ndslice.mutation: transposeInPlace

## New functions for existing modules

 - mir.ndslice.topology: diff
 - mir.ndslice.topology: slide
 - mir.ndslice.algorithm: findIndex
 - mir.ndslice.algorithm: minPos, maxPos
 - mir.ndslice.algorithm: minIndex, maxIndex
 - mir.ndslice.algorithm: minmaxPos
 - mir.ndslice.algorithm: minmaxIndex

## New features

 - Syntax sugar for `indexed` and `cartesian` [v0.5.1]
 - Syntax sugar for map + RefTuple combination [v0.5.0]
 - Specialisation for `map!"a"`.


## Bug fixes

 - front!1 and back!1 were wrong for Canonical and Contiguous ndslices.

Docs: http://docs.algorithm.dlang.io/latest/index.html
[v0.5.8] https://github.com/libmir/mir-algorithm/releases/tag/v0.5.8
[v0.5.1] https://github.com/libmir/mir-algorithm/releases/tag/v0.5.1
[v0.5.0] https://github.com/libmir/mir-algorithm/releases/tag/v0.5.0
May 08, 2017
On Monday, 8 May 2017 at 08:51:32 UTC, 9il wrote:
> ## New modules
> ...

Great work.

Some comments:

mir.timeseries is a welcome addition. Calling (time, data) pairs "moments" may confuse readers, because "moment" has another meaning in statistics. Perhaps "observation"? Head and tail are also pretty common timeseries functions (I would probably need to go through pandas for a reminder of other common ones). Series might also include data labels for columns, with access by data label.

The second part of the example for
mir.ndslice.topology: slide
is not that intuitive. It seems like what you're basically doing is the same as
assert(sw == [8, 12, 16, 20, 24, 28, 32, 36]);
(or something) but it's just less obvious to do it by a formula.
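
To make the point about literal values versus a formula concrete, here is a minimal sketch. It uses Phobos' `std.range.slide` as a stand-in, on the assumption that it behaves like a plain sliding window; it is not Mir's `slide`, and the numbers are illustrative rather than the ones from the Mir docs.

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : map, sum;
import std.range : iota, slide;

void main()
{
    // Windows of length 3 over 0 .. 9, advancing one element at a time.
    auto windows = iota(10).slide(3);

    // Reduce each window to the sum of its elements.
    auto sums = windows.map!(w => w.sum);

    // Expected values spelled out literally: easy to check by eye.
    assert(sums.equal([3, 6, 9, 12, 15, 18, 21, 24]));

    // The same expectation as a formula: window k sums to 3 * (k + 1).
    assert(sums.equal(iota(8).map!(k => 3 * (k + 1))));
}
```

Both assertions check the same thing; the first is simply easier to read in documentation.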

I don't know how strongly I feel about this, but I find the naming between minIndex/maxIndex and minPos/maxPos and minmaxIndex/minmaxPos strange. All three produce indices, it's just that the Pos do it backwards and minmax give both min and max. It seems like a lot of separate functions for things that could be done with one multi-purpose template. Regardless, if you keep it the way it is, then maybe given the plethora of finding functions, split it off to a separate module?
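
For comparison, Phobos separates "index" from "position" as sketched below. This only shows the Phobos functions of the same names from `std.algorithm.searching`; it makes no claim about what the Mir versions return.

```d
import std.algorithm.searching : maxIndex, minIndex, minPos;

void main()
{
    auto a = [3, 1, 4, 1, 5];

    // An index is a number usable with a[i]; ties resolve to the first match.
    assert(a.minIndex == 1);
    assert(a.maxIndex == 4);

    // A position is the rest of the range, starting at the element found.
    assert(a.minPos == [1, 4, 1, 5]);
}
```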

Would it make sense to bump that thread you posted earlier in case people didn't see it due to dconf?
May 08, 2017
On Monday, 8 May 2017 at 14:26:35 UTC, jmh530 wrote:
>
> mir.timeseries is a welcome addition. Calling (time, data) pairs moments will confuse because moment has another meaning in statistics. Perhaps observation? Head and tail are also pretty common timeseries functions (probably would need to go through pandas to get a reminder on other common stuff). Also, Series might also include data labels for columns. And access by data label.
>
You might also be interested in Python's xarray
http://xarray.pydata.org/en/stable/why-xarray.html
May 09, 2017
On Monday, 8 May 2017 at 14:26:35 UTC, jmh530 wrote:
> On Monday, 8 May 2017 at 08:51:32 UTC, 9il wrote:
>> ## New modules
>> ...
>
> Great work.
>
> Some comments:
>
> mir.timeseries is a welcome addition. Calling (time, data) pairs moments will confuse because moment has another meaning in statistics. Perhaps observation?

Thanks. Fixed.
> Also, Series might also include data labels for columns. And access by data label.

I do not see a good @nogc solution for now. PRs are welcome!

> The second part of the example for
> mir.ndslice.topology: slide
> is not that intuitive. It seems like what you're basically doing is the same as
> assert(sw == [8, 12, 16, 20, 24, 28, 32, 36]);
> (or something) but it's just less obvious to do it by a formula.

Fixed

> I don't know how strongly I feel about this, but I find the naming between minIndex/maxIndex and minPos/maxPos and minmaxIndex/minmaxPos strange. All three produce indices, it's just that the Pos do it backwards and minmax give both min and max. It seems like a lot of separate functions for things that could be done with one multi-purpose template. Regardless, if you keep it the way it is, then maybe given the plethora of finding functions, split it off to a separate module?

OK, I changed return type for *Pos functions.
No they return positions :-)

Thank you for the comments!

Best,
Ilya
May 09, 2017
On Tuesday, 9 May 2017 at 17:35:18 UTC, 9il wrote:
> OK, I changed return type for *Pos functions.
> No they return positions :-)

Now*

May 09, 2017
On Tuesday, 9 May 2017 at 17:35:18 UTC, 9il wrote:
>> Also, Series might also include data labels for columns. And access by data label.
>
> I do not see good @nogc solution for now. PRs are welcome!
>

Ah, I see the issue. I may just focus on generalizing the time dimension to coordinates for now.


Some other comments after spending some more time with it:
The opIndexOpAssign documentation says:

Special [] op= index-op-assign operator for time-series. Op-assigns data from r for time intersection. This and r series are assumed to be sorted.

It is not entirely obvious what "for time intersection" means. Looking at the code and examples, it behaves more like a left join, so I think it could be explained a little more clearly, for example (with similar adjustments for opIndexAssign):

Special [] op= index-op-assign operator for time-series. Op-assigns data from r with time intersection. If a time index in r is not in the time index for this series, then no op-assign will take place. This and r series are assumed to be sorted.


Regardless, the behavior still seems risky to me. It may be prudent for the documentation to warn users to check that both series have the same time index first. Functions to align time series would also be useful.
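
A minimal sketch of the behavior as I read it, written with plain arrays rather than Mir's Series type. The helper `addOnIntersection` and its parameter names are hypothetical and not part of Mir; the sketch assumes both time indices are sorted and free of duplicates.

```d
/// Hypothetical helper, not part of Mir: adds rData into lData wherever the
/// two sorted time indices share a value. Times found only in rTime are
/// skipped, and entries of lData without a match are left unchanged.
void addOnIntersection(const int[] lTime, double[] lData,
                       const int[] rTime, const double[] rData)
{
    assert(lTime.length == lData.length);
    assert(rTime.length == rData.length);

    size_t i, j;
    while (i < lTime.length && j < rTime.length)
    {
        if (lTime[i] < rTime[j])
            ++i;                     // time only in the left series
        else if (rTime[j] < lTime[i])
            ++j;                     // time only in the right series: ignored
        else
        {
            lData[i] += rData[j];    // shared time: apply the op-assign
            ++i;
            ++j;
        }
    }
}

void main()
{
    auto time = [1, 2, 3, 4];
    auto data = [10.0, 20.0, 30.0, 40.0];

    // Time 5 exists only on the right, so its value is silently dropped.
    addOnIntersection(time, data, [2, 4, 5], [0.5, 0.25, 100.0]);

    assert(data == [10.0, 20.5, 30.0, 40.25]);
}
```

The silently dropped value at time 5 is exactly the kind of surprise that a check on the two time indices would catch.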

More generally, I think the documentation on mir.ndslice.slice could use some work. Some suggestions:
1) Universal/Canonical/Contiguous link to short one-line explanations in SliceKind, but the Slice struct has much more detail. Perhaps those entries could refer readers to the Internal Binary Representation sections of Slice?

2) I have had trouble understanding this Internal Binary Representation section. a) I would recommend that the first paragraph be split in two at the "For ranges...". The first paragraph could then provide more details on lengths/strides/pointer and how that relates to the template arguments of the Struct. In particular, I find packs/strides to be among the most difficult to understand aspects based on the current documentation.
b) The sections for Canonical and Contiguous do not have examples or representations like the Universal does.
c) I think many would be interested in use cases for the different SliceKinds and performance implications.

3) After this section, you might use a bigger header for Examples.

4) The Definitions heading begins a discussion on operator overloading. You might instead change the heading to Operator Overloading. For instance, the table and discussion do not have definitions of other concepts, such as strides or packs, that I said above I don't think are well described (but I don't necessarily think the definitions should go here either).

5) The first line at the top really has little detail before going into the Definitions section. You might expand this a bit and provide a simple example and then have links to the different sections and what information is in each.
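
To show the kind of example that would have helped me with strides, here is a sketch in plain D with no Mir types. The helper `at` is hypothetical. My understanding, which should be treated as an assumption, is that a Universal slice stores a stride for every dimension, a Canonical slice takes the last dimension's stride to be 1, and a Contiguous slice stores no strides at all.

```d
/// Hypothetical helper: offset of element (i, j) in a flat buffer,
/// given one stride per dimension.
size_t at(size_t i, size_t j, size_t s0, size_t s1)
{
    return i * s0 + j * s1;
}

void main()
{
    // A 3x4 matrix stored row by row in one flat buffer.
    auto buf = [0, 1,  2,  3,
                4, 5,  6,  7,
                8, 9, 10, 11];
    enum cols = 4;

    // Row-major view: the next row is `cols` elements away, the next column 1.
    assert(buf[at(1, 2, cols, 1)] == 6);

    // Transposed view of the same buffer: swap the strides, copy nothing.
    assert(buf[at(2, 1, 1, cols)] == 6);

    // View of every other column: the column stride becomes 2.
    assert(buf[at(1, 1, cols, 2)] == 6);
}
```

Only the first view could get away without stored strides; the other two need a stride for every dimension.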
May 10, 2017
On Tuesday, 9 May 2017 at 20:35:18 UTC, jmh530 wrote:
> On Tuesday, 9 May 2017 at 17:35:18 UTC, 9il wrote:
>>[...]
>
> Ah, I see the issue. I may just focus on generalizing the time dimension to coordinates for now.
>
> [...]

I have fixed small parts. I have invited you to the Mir Github team. It would be awesome to see your documentation PRs :) You can always ask me about implementation details on Gitter.

Thanks,
Ilya
May 10, 2017
On Wednesday, 10 May 2017 at 11:16:37 UTC, 9il wrote:
>
> I have fixed small parts. I have invited you to the Mir Github team. Would be awesome to see your documentation PRs) You can always ask me about implementation details in the Gitter
>
> Thanks,
> Ilya

Thanks. Your documentation seems like it assumes that the person reading it is as great a programmer as you are. And I am most assuredly not.

More generally, I sometimes get frustrated with the documentation on dlang, but I'm not as pro-active as I should be in trying to improve it. It might be less frustrating to help with yours than dlang.