June 11, 2020 Looking for a Code Review of a Bioinformatics POC | ||||
---|---|---|---|---|
| ||||
Hi! I'm new to dlang but loving it so far! One of my favorite first things to implement in a new language is an interval library. In this case I want to submit to a benchmark repo: https://github.com/lh3/biofast

If anyone is willing to take a look and give some feedback I'd be very appreciative! Specifically if you have any performance improvement ideas: https://github.com/sstadick/dgranges/pull/1

Currently my D version is a few seconds slower than the Crystal version, putting it very solidly in third place overall. I'm not really sure where it's falling behind Crystal since `-release` removes bounds checking. I have not looked at the assembly between the two, but I suspect that Crystal inlines the callback and D does not.

I also think there is room for improvement in the IO, as I'm just using the defaults.
|
June 11, 2020 Re: Looking for a Code Review of a Bioinformatics POC | ||||
---|---|---|---|---|
| ||||
Posted in reply to duck_tape | On Thursday, 11 June 2020 at 16:13:34 UTC, duck_tape wrote:
> Hi! I'm new to dlang but loving it so far! One of my favorite first things to implement in a new language is an interval library. In this case I want to submit to a benchmark repo: https://github.com/lh3/biofast
>
> I also think there is room for improvement in the IO, as I'm just using the defaults.
Are you building with DMD or with LDC/GDC?
|
June 11, 2020 Re: Looking for a Code Review of a Bioinformatics POC | ||||
---|---|---|---|---|
| ||||
Posted in reply to CraigDillabaugh | On Thursday, 11 June 2020 at 17:25:13 UTC, CraigDillabaugh wrote:
> Are you building with DMD or with LDC/GDC?
I'm building with LDC. I haven't pulled up a Linux box to test drive GDC yet.
`ldc2 -O -release`
|
June 11, 2020 Re: Looking for a Code Review of a Bioinformatics POC | ||||
---|---|---|---|---|
| ||||
Posted in reply to duck_tape | On Thursday, 11 June 2020 at 16:13:34 UTC, duck_tape wrote:
> Hi! I'm new to dlang but loving it so far! One of my favorite first things to implement in a new language is an interval library. In this case I want to submit to a benchmark repo: https://github.com/lh3/biofast
>
> If anyone is willing to take a look and give some feedback I'd be very appreciative! Specifically if you have any performance improvement ideas: https://github.com/sstadick/dgranges/pull/1
>
> Currently my D version is a few seconds slower than the Crystal version, putting it very solidly in third place overall. I'm not really sure where it's falling behind Crystal since `-release` removes bounds checking. I have not looked at the assembly between the two, but I suspect that Crystal inlines the callback and D does not.
>
> I also think there is room for improvement in the IO, as I'm just using the defaults.
Move as much code as possible to compile time.
Do not allocate inside loops.
Keep GC collections away from performance-critical parts by calling GC.disable.
Also, "dflags-ldc": ["-mcpu=native"] in dub.json might give you some edge.
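
As a minimal sketch of the GC advice above (the function `countOverlaps` and its data are made up for illustration and are not taken from the pull request), the hot loop can be marked `@nogc` so the compiler rejects accidental allocations, and collections can be paused around it with `core.memory.GC`:

```d
import core.memory : GC;
import std.stdio : writeln;

/// Hypothetical hot loop: counts intervals overlapping [qStart, qStop).
/// `@nogc` makes the compiler reject any GC allocation inside it.
long countOverlaps(const int[] starts, const int[] stops,
                   int qStart, int qStop) @nogc nothrow pure
{
    long hits = 0;
    foreach (i; 0 .. starts.length)
    {
        if (starts[i] < qStop && qStart < stops[i])
            ++hits;
    }
    return hits;
}

void main()
{
    // Allocate everything up front, before the critical section.
    auto starts = [1, 10, 20];
    auto stops = [5, 15, 30];

    GC.disable();             // pause automatic collections
    scope (exit) GC.enable(); // always re-enable on the way out

    writeln(countOverlaps(starts, stops, 4, 12));
}
```

Note that `GC.disable` only suppresses automatic collections; the runtime may still collect if it runs out of memory, and allocations made while it is disabled simply accumulate, so avoiding allocation in the loop matters more than the switch itself.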
|
June 11, 2020 Re: Looking for a Code Review of a Bioinformatics POC | ||||
---|---|---|---|---|
| ||||
Posted in reply to duck_tape | On Thursday, 11 June 2020 at 16:13:34 UTC, duck_tape wrote:
> Hi! I'm new to dlang but loving it so far! One of my favorite first things to implement in a new language is an interval library. In this case I want to submit to a benchmark repo: https://github.com/lh3/biofast
>
> If anyone is willing to take a look and give some feedback I'd be very appreciative! Specifically if you have any performance improvement ideas: https://github.com/sstadick/dgranges/pull/1
>
> Currently my D version is a few seconds slower than the Crystal version, putting it very solidly in third place overall. I'm not really sure where it's falling behind Crystal since `-release` removes bounds checking. I have not looked at the assembly between the two, but I suspect that Crystal inlines the callback and D does not.
>
> I also think there is room for improvement in the IO, as I'm just using the defaults.
Add to your dub.json the following:
"""
"buildTypes": {
    "release": {
        "buildOptions": [
            "releaseMode",
            "inline",
            "optimize"
        ],
        "dflags": [
            "-boundscheck=off"
        ]
    }
}
"""
dub build --compiler=ldc2 --build=release
Mir slices are faster than standard D arrays. Although, looking at your code, I don't see where you can plug them in. Just keep it in mind.
|
June 11, 2020 Re: Looking for a Code Review of a Bioinformatics POC | ||||
---|---|---|---|---|
| ||||
Posted in reply to tastyminerals | On Thursday, 11 June 2020 at 20:24:37 UTC, tastyminerals wrote:
> Mir slices are faster than standard D arrays. Although, looking at your code, I don't see where you can plug them in. Just keep it in mind.
Thanks for taking a look! What is it about Mir Slices that makes them faster? I hadn't seen the Mir package before but it looks very useful and intriguing.
|
June 11, 2020 Re: Looking for a Code Review of a Bioinformatics POC | ||||
---|---|---|---|---|
| ||||
Posted in reply to tastyminerals | On Thursday, 11 June 2020 at 20:24:37 UTC, tastyminerals wrote:
> Mir slices are faster than standard D arrays. Although, looking at your code, I don't see where you can plug them in. Just keep it in mind.
I just started following links, sweet blog! Your reason for getting into D is exactly the same as mine. Awesome blog!
|
June 11, 2020 Re: Looking for a Code Review of a Bioinformatics POC | ||||
---|---|---|---|---|
| ||||
Posted in reply to duck_tape | On Thu, Jun 11, 2020 at 04:13:34PM +0000, duck_tape via Digitalmars-d-learn wrote:
[...]
> Currently my D version is a few seconds slower than the Crystal version, putting it very solidly in third place overall. I'm not really sure where it's falling behind Crystal since `-release` removes bounds checking. I have not looked at the assembly between the two, but I suspect that Crystal inlines the callback and D does not.
To encourage inlining, you could make it an alias parameter instead of a delegate, something like this:
	void overlap(alias cb)(SType start, SType stop) { ... }
	...
	bed[chr].overlap!callback(st0, en0);
This doesn't guarantee inlining, though. And there's no guarantee it will actually improve performance.
> I also think there is room for improvement in the IO, as I'm just using the defaults.
I wouldn't spend too much time optimizing I/O without profiling it first, to check that it's actually a bottleneck. If I/O turns out to be a real bottleneck, you could try using std.mmfile.MmFile to mmap the input directly into the program's address space, which should give you a speed boost.
T
-- An imaginary friend squared is a real enemy.
|
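
For reference, a rough sketch of the `std.mmfile.MmFile` suggestion, assuming the input is a plain newline-delimited text file. The helper `countLines`, the temporary file name, and the sample records are all invented for the example; real code would map the actual BED file and parse each line instead of counting it.

```d
import std.algorithm.iteration : splitter;
import std.file : remove, tempDir, write;
import std.mmfile : MmFile;
import std.path : buildPath;
import std.stdio : writeln;

/// Counts the non-empty lines of a file through a read-only memory mapping.
size_t countLines(string path)
{
    auto mmf = new MmFile(path);            // opens the file for reading
    scope (exit) destroy(mmf);              // unmap promptly instead of waiting for the GC
    auto data = cast(const(char)[]) mmf[];  // the whole file as one slice, no copy
    size_t n = 0;
    foreach (line; data.splitter('\n'))     // each `line` is a slice into the mapping
    {
        if (line.length)
            ++n;
    }
    return n;
}

void main()
{
    // Write a tiny sample file so the sketch is self-contained.
    auto path = buildPath(tempDir, "mmfile_sketch.bed");
    write(path, "chr1\t10\t20\nchr1\t15\t30\n");
    scope (exit) remove(path);

    writeln(countLines(path));
}
```

Because every line is a slice into the mapped region, nothing is copied per record, but the slices must not outlive the `MmFile` object. Whether this beats buffered reads depends on the workload, so it is worth profiling first, as the post says.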
June 11, 2020 Re: Looking for a Code Review of a Bioinformatics POC | ||||
---|---|---|---|---|
| ||||
Posted in reply to H. S. Teoh | On Thursday, 11 June 2020 at 22:19:27 UTC, H. S. Teoh wrote:
> To encourage inlining, you could make it an alias parameter instead of a delegate, something like this:
>
> void overlap(alias cb)(SType start, SType stop) { ... }
> ...
> bed[chr].overlap!callback(st0, en0);
>
I don't think LDC can handle that yet. I get an error saying
```
source/app.d(72,7): Error: function app.main.overlap!(callback).overlap requires a dual-context, which is not yet supported by LDC
```
And I see an open ticket for it on the LDC project.
|
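
One possible way around this error, assuming it arises because `overlap` is a member function while `callback` is a nested function (so the instantiation needs both `this` and the enclosing stack frame), is to make `overlap` a free function template and call it through UFCS. This is the same pattern `std.algorithm` uses for its alias predicates. The types `Interval` and `IntervalList` below are simplified stand-ins, not the ones from the pull request:

```d
import std.stdio : writeln;

/// Hypothetical half-open interval [start, stop).
struct Interval
{
    int start, stop;
}

/// Hypothetical stand-in for the interval container discussed in the thread.
struct IntervalList
{
    Interval[] items;
}

/// Free function template: `cb` is bound at compile time, so the call target
/// is known to the compiler and may be inlined. Only the caller's context is
/// needed because the container arrives as an ordinary parameter.
void overlap(alias cb)(ref const IntervalList list, int start, int stop)
{
    foreach (ref iv; list.items)
    {
        if (iv.start < stop && start < iv.stop)
            cb(iv);
    }
}

void main()
{
    auto list = IntervalList([Interval(1, 5), Interval(10, 15), Interval(20, 30)]);

    int hits = 0;
    // Nested function: captures `hits` from main's stack frame.
    void callback(ref const Interval iv) { ++hits; }

    list.overlap!callback(4, 12); // UFCS keeps the call site looking like a method call
    writeln(hits);
}
```

As with the member-function version, this only makes inlining possible; it does not guarantee it or any speedup.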
June 11, 2020 Re: Looking for a Code Review of a Bioinformatics POC | ||||
---|---|---|---|---|
| ||||
Posted in reply to duck_tape | On Thursday, 11 June 2020 at 21:54:31 UTC, duck_tape wrote:
> On Thursday, 11 June 2020 at 20:24:37 UTC, tastyminerals wrote:
>> Mir slices are faster than standard D arrays. Although, looking at your code, I don't see where you can plug them in. Just keep it in mind.
>
> Thanks for taking a look! What is it about Mir Slices that makes them faster? I hadn't seen the Mir package before but it looks very useful and intriguing.
Mir is fine-tuned for LLVM, pointer magic and SIMD optimizations.
|