Thread overview
problem with multiwayMerge and chunkBy
Nov 04, 2017
Matthew Gamble
Nov 05, 2017
Nicholas Wilson
Nov 05, 2017
Matthew Gamble
Nov 05, 2017
Nicholas Wilson
Nov 05, 2017
Nicholas Wilson
November 04, 2017
Dear most helpful and appreciated D community,

I'm a non-pro academic biologist trying to code a modeler of transcription in D. I've run into a small roadblock. Any help would be greatly appreciated. I'm hoping someone can tell me why I get the following run-time error from this code. I've reduced it to something simple:

    import std.algorithm;
    import std.range;
    import std.stdio; // needed for writeln

    auto d = [2, 4, 6, 8];
    auto e = [1, 2, 3, 5, 7];
    auto f = [d, e];

    writeln(f.multiwayMerge.chunkBy!"a == b"); // error happens
    writeln(f.multiwayMerge.array.chunkBy!"a == b"); // no error, but there must be a better way!

My understanding is that chunkBy should be able to take an input range. Is that not true? I'm trying to get a merged sorted view of two sorted ranges followed by merging records based on a predicate without allocating memory or swapping the underlying values. Speed will be very important at the end of the day and sticking the ".array" in the middle kills me, given the size of the actual ranges.

Thank you so much for your help. The full tale-of-the-tape is below.

Thanks,
Matt

error:
[[1], [2, 2], [3], [4], [5], [6], [7], [8
core.exception.AssertError@C:\D\dmd2\windows\bin\..\..\src\phobos\std\range\primitives.d(2055): Attempting to popFront() past the end of an array of int
----------------
0x00007FF7A30775D3 in d_assert_msg
0x00007FF7A303E497 in std.range.primitives.popFront!int.popFront at C:\D\dmd2\windows\bin\..\..\src\phobos\std\range\primitives.d(2056)
0x00007FF7A304450C in std.algorithm.setops.MultiwayMerge!("a < b", int[][]).MultiwayMerge.popFront at C:\D\dmd2\windows\bin\..\..\src\phobos\std\algorithm\setops.d(877)
0x00007FF7A304A700 in std.algorithm.iteration.ChunkByChunkImpl!(binaryFun, std.algorithm.setops.MultiwayMerge!("a < b", int[][])).ChunkByChunkImpl.popFront at C:\D\dmd2\windows\bin\..\..\src\phobos\std\algorithm\iteration.d(1624)
0x00007FF7A3054877 in std.format.formatRange!(std.stdio.File.LockingTextWriter, std.algorithm.iteration.ChunkByChunkImpl!(binaryFun, MultiwayMerge!("a < b", int[][])), char).formatRange at C:\D\dmd2\windows\bin\..\..\src\phobos\std\format.d(2960)
0x00007FF7A3054796 in std.format.formatValue!(std.stdio.File.LockingTextWriter, std.algorithm.iteration.ChunkByChunkImpl!(binaryFun, MultiwayMerge!("a < b", int[][])), char).formatValue at C:\D\dmd2\windows\bin\..\..\src\phobos\std\format.d(3676)
0x00007FF7A3054704 in std.format.formatElement!(std.stdio.File.LockingTextWriter, std.algorithm.iteration.ChunkByChunkImpl!(binaryFun, MultiwayMerge!("a < b", int[][])), char).formatElement at C:\D\dmd2\windows\bin\..\..\src\phobos\std\format.d(3180)
0x00007FF7A305410E in std.format.formatRange!(std.stdio.File.LockingTextWriter, std.algorithm.iteration.ChunkByImpl!("a == b", MultiwayMerge!("a < b", int[][])), char).formatRange at C:\D\dmd2\windows\bin\..\..\src\phobos\std\format.d(2964)
0x00007FF7A3053F66 in std.format.formatValue!(std.stdio.File.LockingTextWriter, std.algorithm.iteration.ChunkByImpl!("a == b", MultiwayMerge!("a < b", int[][])), char).formatValue at C:\D\dmd2\windows\bin\..\..\src\phobos\std\format.d(3676)
0x00007FF7A304B053 in std.format.formattedWrite!(std.stdio.File.LockingTextWriter, char, std.algorithm.iteration.ChunkByImpl!("a == b", MultiwayMerge!("a < b", int[][]))).formattedWrite at C:\D\dmd2\windows\bin\..\..\src\phobos\std\format.d(568)
0x00007FF7A3069BA2 in std.stdio.File.write!(std.algorithm.iteration.ChunkByImpl!("a == b", MultiwayMerge!("a < b", int[][])), char).write at C:\D\dmd2\windows\bin\..\..\src\phobos\std\stdio.d(1407)
0x00007FF7A30699DB in std.stdio.writeln!(std.algorithm.iteration.ChunkByImpl!("a == b", MultiwayMerge!("a < b", int[][]))).writeln at C:\D\dmd2\windows\bin\..\..\src\phobos\std\stdio.d(3604)
0x00007FF7A303E258 in gappedIntervals.__unittestL70_4 at C:\Users\matth\Documents\GambleLabCodeBaseLocal\intervals\gappedIntervals.d(78)
0x00007FF7A3061917 in void gappedIntervals.__modtest()
0x00007FF7A307C08B in int core.runtime.runModuleUnitTests().__foreachbody1(object.ModuleInfo*)
0x00007FF7A3082663 in int object.ModuleInfo.opApply(scope int delegate(object.ModuleInfo*)).__lambda2(immutable(object.ModuleInfo*))
0x00007FF7A308594F in int rt.minfo.moduleinfos_apply(scope int delegate(immutable(object.ModuleInfo*))).__foreachbody2(ref rt.sections_win64.SectionGroup)
0x00007FF7A30858BF in int rt.minfo.moduleinfos_apply(scope int delegate(immutable(object.ModuleInfo*)))
0x00007FF7A3082637 in int object.ModuleInfo.opApply(scope int delegate(object.ModuleInfo*))
0x00007FF7A307C005 in runModuleUnitTests
0x00007FF7A3070B7D in void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).runAll()
0x00007FF7A3070ADF in void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).tryExec(scope void delegate())
0x00007FF7A30708DF in d_run_main
0x00007FF7A306F9F2 in __entrypoint.main
0x00007FF7A30BA0C5 in __scrt_common_main_seh at f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl(283)
0x00007FF8876A2774 in BaseThreadInitThunk
0x00007FF887DA0D51 in RtlUserThreadStart
November 05, 2017
On Saturday, 4 November 2017 at 18:57:17 UTC, Matthew Gamble wrote:
> [...]
>
> My understanding is that chunkBy should be able to take an input range. Is that not true? I'm trying to get a merged sorted view of two sorted ranges followed by merging records based on a predicate without allocating memory or swapping the underlying values. Speed will be very important at the end of the day and sticking the ".array" in the middle kills me, given the size of the actual ranges.

It should. This looks like a bug somewhere; please file one at issues.dlang.org.

In the meantime:

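import std.algorithm : group, map, multiwayMerge;
import std.stdio : writeln;
import std.typecons : Tuple;

// Expands one (value, count) pair from group into `count` copies of `value`.
struct Replicate(T)
{
    Tuple!(T, uint) e;
    @property bool empty() { return e[1] == 0; }
    @property auto front() { return e[0]; }
    void popFront() { --e[1]; }
}

Replicate!T replicate(T)(Tuple!(T, uint) e)
{
    return typeof(return)(e);
}

f.multiwayMerge.group!"a == b".map!(replicate).writeln;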

Does the same thing provided your predicate is "a == b".
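
To make the idea above concrete, here is a minimal sketch of what `group` produces on the merged range and how the (value, count) pairs can be expanded back into runs. It swaps the hand-written Replicate struct for `std.range.repeat`, which is a substitution on my part and not what the post uses; the helper name `equalRuns` is hypothetical. Like the original workaround, it only covers the plain "a == b" case.

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : group, map;
import std.algorithm.setops : multiwayMerge;
import std.range : repeat;
import std.typecons : tuple;

// Hypothetical helper (the name is an assumption): expands each
// (value, count) pair produced by `group` into a lazy run of `count`
// copies of `value`, without allocating.
auto equalRuns(R)(R sorted)
{
    return sorted.group.map!(t => t[0].repeat(t[1]));
}

void main()
{
    auto f = [[2, 4, 6, 8], [1, 2, 3, 5, 7]];

    // `group` yields (value, count) tuples for consecutive equal elements.
    assert(f.multiwayMerge.group.equal([
        tuple(1, 1u), tuple(2, 2u), tuple(3, 1u), tuple(4, 1u),
        tuple(5, 1u), tuple(6, 1u), tuple(7, 1u), tuple(8, 1u)]));

    // multiwayMerge pops elements from the inner slices of its argument
    // in place, so a fresh input is built for the second pass.
    auto g = [[2, 4, 6, 8], [1, 2, 3, 5, 7]];
    assert(g.multiwayMerge.equalRuns.equal!equal(
        [[1], [2, 2], [3], [4], [5], [6], [7], [8]]));
}
```

Because every element of a run is equal under "a == b", storing one value plus a count loses nothing. With a looser predicate the elements of a run can differ, so this shortcut no longer applies.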
November 05, 2017
On Sunday, 5 November 2017 at 03:21:06 UTC, Nicholas Wilson wrote:
> On Saturday, 4 November 2017 at 18:57:17 UTC, Matthew Gamble wrote:
>> [...]
>
> It should, this looks like a bug somewhere, please file one at issues.dlang.org/ .
>
> in the mean time
>
> [...]
>
> Does the same thing provided your predicate is "a == b".


Thanks Nicholas.
I posted the bug as you suggested. My predicate is not quite a == b, otherwise I would never have needed chunkBy in the first place. But thanks, I'm pursuing a workaround.

Matt
November 05, 2017
On Sunday, 5 November 2017 at 13:32:57 UTC, Matthew Gamble wrote:
> On Sunday, 5 November 2017 at 03:21:06 UTC, Nicholas Wilson wrote:
>> [...]
>
>
> Thanks Nicholas.
> I posted the bug as you suggested. My predicate is not quite a == b, otherwise I would never have needed chunkBy in the first place. But thanks, I'm pursuing a workaround.
>
> Matt

One thing you might try: instead of using .array to eagerly evaluate the whole range, eagerly evaluate only a part at a time (say 128 elements) and .joiner the pieces back together.

import std.algorithm : chunkBy, joiner;
import std.range : chunks;

f.multiwayMerge.chunks(128).joiner.chunkBy!(pred).writeln;

This might help, since it seems to be the iteration that stuffs things up, and this changes how the range is iterated.
November 05, 2017
On Sunday, 5 November 2017 at 22:47:10 UTC, Nicholas Wilson wrote:
> f.multiwayMerge.chunks(128).joiner.chunkBy!(pred).writeln;
>
> since it seems to be the iteration that stuff things up and this changes it.

If that doesn't work, you could try rolling your own version of `chunks` with `take` and a static array.
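
A rough sketch of what such a hand-rolled buffer could look like follows. The names `Buffered` and `buffered` are hypothetical and not part of Phobos, and the sketch fills the static array with a plain loop where the post suggests `take`. It only shows that the buffered range yields the merged elements in order; whether `chunkBy` then behaves on top of it has not been checked here.

```d
import std.algorithm.comparison : equal;
import std.algorithm.setops : multiwayMerge;
import std.range.primitives : ElementType, empty, front, isInputRange, popFront;

// Hypothetical helper (not part of Phobos): an input range that pulls up to
// N elements at a time from `source` into a fixed-size buffer stored inside
// the struct, so no GC allocation is needed.
struct Buffered(R, size_t N)
if (isInputRange!R)
{
    static assert(N > 0, "buffer size must be positive");

    private R source;
    private ElementType!R[N] buf; // static array
    private size_t pos, len;

    this(R source)
    {
        this.source = source;
        refill();
    }

    // Copies up to N elements from the source into the buffer.
    private void refill()
    {
        pos = 0;
        len = 0;
        while (len < N && !source.empty)
        {
            buf[len++] = source.front;
            source.popFront();
        }
    }

    @property bool empty() const { return pos == len; }
    @property auto front() { return buf[pos]; }

    void popFront()
    {
        // Refill once the current buffer has been fully consumed.
        if (++pos == len)
            refill();
    }
}

// Convenience constructor; the default of 128 mirrors the size suggested above.
auto buffered(size_t N = 128, R)(R source)
{
    return Buffered!(R, N)(source);
}

void main()
{
    auto f = [[2, 4, 6, 8], [1, 2, 3, 5, 7]];
    // A small buffer size is used here so that several refills happen.
    assert(f.multiwayMerge.buffered!4.equal([1, 2, 2, 3, 4, 5, 6, 7, 8]));
}
```

The buffer size is a template parameter so that the array can live inside the struct. The next step would be to try `f.multiwayMerge.buffered.chunkBy!(pred)` and see whether the assertion failure goes away.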