May 22, 2017
On Friday, 19 May 2017 at 22:53:39 UTC, crimaniak wrote:
> On Friday, 19 May 2017 at 12:55:05 UTC, Biotronic wrote:
>> revComp6 seems to be the fastest, but it's probably also the least readable (a common trade-off).
> Try revComp7 with -release :)
>
> string revComp7(string bps)
> {
>     char[] result = new char[bps.length];
>     auto p1 = result.ptr;
>     auto p2 = &bps[$ - 1];
>     enum AT = 'A'^'T';
>     enum CG = 'C'^'G';
>
>     while (p2 > bps.ptr)
>     {
>        *p1 = *p2 ^ ((*p2 == 'A' || *p2 == 'T') ? AT : CG);
>         p1++;
>         p2--;
>     }
>     return result.assumeUnique;
> }
>
> In fact, when the size of the sequence is growing time difference between procedures is shrinking, so it's much more important to use memory-efficient presentation than to optimize logic.

revComp7 is pretty fast, but I followed ag0aep6g's advice:

On Friday, 19 May 2017 at 13:38:20 UTC, ag0aep6g wrote:
> Use `static immutable` instead. It still forces compile-time calculation, but it doesn't have copy/paste behavior. Speeds up revComp3 a lot.

Also, with DMD (2.073.0) -release -O instead of -debug from this point. I'd blame someone else, but this is my fault. :p

Anyways, full collection of the various versions I've written, plus crimaniak's revComp7 (rebranded as revComp8, because I already had 7 at the time):

https://gist.github.com/Biotronic/20daaf0ed1262d313830bc8cd4199896

Timings:
revComp0:        158 ms, 926 us
revComp1: 1 sec, 157 ms,  15 us
revComp2:        604 ms,  37 us
revComp3:         18 ms, 545 us
revComp4:         92 ms, 293 us
revComp5:         86 ms, 731 us
revComp6:         94 ms,  56 us
revComp7:        917 ms, 576 us
revComp8:         62 ms, 917 us

This actually matches my expectations - the table lookup version should be crazy fast, and it is. It beats even your revComp7 (revComp8 in the table).

LDC (-release -O3) timings:

revComp0: 166 ms, 190 us
revComp1: 352 ms, 917 us
revComp2: 300 ms, 493 us
revComp3:  10 ms, 950 us
revComp4: 148 ms, 106 us
revComp5: 144 ms, 152 us
revComp6: 142 ms, 307 us
revComp7: 604 ms, 274 us
revComp8:  26 ms, 612 us

Interesting how revComp4-6 got slower. What I really wanted to see with this though, was the effect on revComp1, which uses ranges all the way.
May 22, 2017
On Monday, 22 May 2017 at 06:50:45 UTC, Biotronic wrote:
> On Friday, 19 May 2017 at 22:53:39 UTC, crimaniak wrote:
>> On Friday, 19 May 2017 at 12:55:05 UTC, Biotronic wrote:
>>> revComp6 seems to be the fastest, but it's probably also the least readable (a common trade-off).
>> Try revComp7 with -release :)
>>
>> string revComp7(string bps)
>> {
>>     char[] result = new char[bps.length];
>>     auto p1 = result.ptr;
>>     auto p2 = &bps[$ - 1];
>>     enum AT = 'A'^'T';
>>     enum CG = 'C'^'G';
>>
>>     while (p2 > bps.ptr)
>>     {
>>        *p1 = *p2 ^ ((*p2 == 'A' || *p2 == 'T') ? AT : CG);
>>         p1++;
>>         p2--;
>>     }
>>     return result.assumeUnique;
>> }
>>
>> In fact, when the size of the sequence is growing time difference between procedures is shrinking, so it's much more important to use memory-efficient presentation than to optimize logic.
>
> revComp7 is pretty fast, but I followed ag0aep6g's advice:
>
> On Friday, 19 May 2017 at 13:38:20 UTC, ag0aep6g wrote:
>> Use `static immutable` instead. It still forces compile-time calculation, but it doesn't have copy/paste behavior. Speeds up revComp3 a lot.
>
> Also, with DMD (2.073.0) -release -O instead of -debug from this point. I'd blame someone else, but this is my fault. :p
>
> Anyways, full collection of the various versions I've written, plus crimaniak's revComp7 (rebranded as revComp8, because I already had 7 at the time):
>
> https://gist.github.com/Biotronic/20daaf0ed1262d313830bc8cd4199896
>
> Timings:
> revComp0:        158 ms, 926 us
> revComp1: 1 sec, 157 ms,  15 us
> revComp2:        604 ms,  37 us
> revComp3:         18 ms, 545 us
> revComp4:         92 ms, 293 us
> revComp5:         86 ms, 731 us
> revComp6:         94 ms,  56 us
> revComp7:        917 ms, 576 us
> revComp8:         62 ms, 917 us
>
> This actually matches my expectations - the table lookup version should be crazy fast, and it is. It beats even your revComp7 (revComp8 in the table).
>
> LDC (-release -O3) timings:
>
> revComp0: 166 ms, 190 us
> revComp1: 352 ms, 917 us
> revComp2: 300 ms, 493 us
> revComp3:  10 ms, 950 us
> revComp4: 148 ms, 106 us
> revComp5: 144 ms, 152 us
> revComp6: 142 ms, 307 us
> revComp7: 604 ms, 274 us
> revComp8:  26 ms, 612 us
>
> Interesting how revComp4-6 got slower. What I really wanted to see with this though, was the effect on revComp1, which uses ranges all the way.

Wow!!! Someone grab me a chair, I need to sit down. I can't tell enough how grateful I am to all you guys. This is so much fun to learn. Some specific comments and replies:

@Nicolas Wilson: Your explanation of the enum is clear and very helpful. I can recall to the same technique used in kh_hash in samtools and the associated. With that said, the chars enum is only to 'T' (85) elements.

 Regarding BioD, I have plan to work on it to add some more functionality. But first I need to sharpen my D skills a bit more.

@Laeeth Isharc: I do like ldc as well. I've came across several projects that use ldc, and learnt that it is a good choice for speed in general.


@ag0aep6g
>You fell into a trap there. The value is calculated at compile time, but it has >copy/paste-like behavior. That is, whenever you use `chars`, the code behaves as if you >typed out the array literal. That means, the whole array is re-created on every iteration.

>Use `static immutable` instead. It still forces compile-time calculation, but it doesn't > have copy/paste behavior. Speeds up revComp3 a lot.

With  'iteration' here you mean running lifetime of the function, or in other words, each one of the 10_000 cycles in the benchmark?

Could you provide some more reading for what you are telling here? I can only guess it is intrinsic behavior of an 'enum'.

@crimaniak, Nicolas Wilson and Biotronic: I've realized before the reversible/negate property of XOR: 'A'^'T'^'T' = 'A' and 'A'^'T'^'A' = 'T'; To help myself and see it in bit patterns, I wrote this snippet:


void main(){

  enum AT = 'A'^'T';
  enum CG = 'C'^'G';
  enum chars = [Repeat!('A'-'\0', '\0'), 'T',
                Repeat!('C'-'A'-1, '\0'), 'G',
                Repeat!('G'-'C'-1, '\0'), 'C',
                Repeat!('T'-'G'-1, '\0'), 'A'];


  writef("BIN %0 8b DEC %d\n", 'A', 'A');
  writef("BIN %0 8b DEC %d\n", 'T', 'T');
  writef("XOR %0 8b DEC %d\n", AT, AT);
  writef("TOR %0 8b DEC %d\n", AT^'T', AT^'T', AT^'T');
  writef("AOR %0 8b DEC %d\n", AT^'A', AT^'A', AT^'A');

  foreach (i, c; chars){
    if (i >= 60) writef("%02d: %0 8b, %d\n",i, c, c); // elements before 60 are all \0
  }
}


// Output
BIN 01000001 DEC 65
BIN 01010100 DEC 84
XOR 00010101 DEC 21
TOR 01000001 DEC 65
AOR 01010100 DEC 84
60: 00000000, 0
61: 00000000, 0
62: 00000000, 0
63: 00000000, 0
64: 00000000, 0
65: 01010100, 84
66: 00000000, 0
67: 01000111, 71
68: 00000000, 0
69: 00000000, 0
70: 00000000, 0
71: 01000011, 67
72: 00000000, 0
73: 00000000, 0
74: 00000000, 0
75: 00000000, 0
76: 00000000, 0
77: 00000000, 0
78: 00000000, 0
79: 00000000, 0
80: 00000000, 0
81: 00000000, 0
82: 00000000, 0
83: 00000000, 0
84: 01000001, 65

May 22, 2017
On Monday, 22 May 2017 at 08:58:24 UTC, biocyberman wrote:
> @Nicolas Wilson: Your explanation of the enum is clear and very helpful. I can recall to the same technique used in kh_hash in samtools and the associated. With that said, the chars enum is only to 'T' (85) elements.

The reason for having only 85 elements in the array was pure laziness - the problem description seems to forbid non-ACGT letters, so I saw no reason to write more code to handle them. :p My code will crash if the input string contains lower-case letters, Zs, or any other letter beyond 'T' (or read random bits of memory if you're lucky).

> @ag0aep6g
>>You fell into a trap there. The value is calculated at compile time, but it has >copy/paste-like behavior. That is, whenever you use `chars`, the code behaves as if you >typed out the array literal. That means, the whole array is re-created on every iteration.
>
>>Use `static immutable` instead. It still forces compile-time calculation, but it doesn't > have copy/paste behavior. Speeds up revComp3 a lot.
>
> With  'iteration' here you mean running lifetime of the function, or in other words, each one of the 10_000 cycles in the benchmark?
>
> Could you provide some more reading for what you are telling here? I can only guess it is intrinsic behavior of an 'enum'.

ag0aep6g is absolutely correct in his observation, and the resulting code is basically this:

string revComp3(string bps) {
    const N = bps.length;
    static immutable chars_saved = [Repeat!('A'-'\0', '\0'), 'T',
                Repeat!('C'-'A'-1, '\0'), 'G',
                Repeat!('G'-'C'-1, '\0'), 'C',
                Repeat!('T'-'G'-1, '\0'), 'A'];
    char[] result = new char[N];
    for (int i = 0; i < N; ++i) {
        auto chars = chars_saved.dup; // Bad stuff happens here
        result[i] = chars[bps[N-i-1]];
    }
    return result.assumeUnique;
}

As we can see, it copies the entire array for every character in the input string. That's basically an allocation and a memcpy in the innermost, hottest loop. Roughly as bad as it gets. (yup, that's 20 million allocations)

As for why this happens, enum can be thought of as the analog of C's #define - the compiler precalculates the data to fill the array, and then copies that into the source code every time it's used.
May 22, 2017
On 05/22/2017 10:58 AM, biocyberman wrote:
> @ag0aep6g
>> You fell into a trap there. The value is calculated at compile time, but it has >copy/paste-like behavior. That is, whenever you use `chars`, the code behaves as if you >typed out the array literal. That means, the whole array is re-created on every iteration.
> 
>> Use `static immutable` instead. It still forces compile-time calculation, but it doesn't > have copy/paste behavior. Speeds up revComp3 a lot.
> 
> With  'iteration' here you mean running lifetime of the function, or in other words, each one of the 10_000 cycles in the benchmark?

For reference, here is the version of revComp3 I commented on:

----
string revComp3(string bps) {
    const N = bps.length;
    enum chars = [Repeat!('A'-'\0', '\0'), 'T',
                Repeat!('C'-'A'-1, '\0'), 'G',
                Repeat!('G'-'C'-1, '\0'), 'C',
                Repeat!('T'-'G'-1, '\0'), 'A'];

    char[] result = new char[N];
    for (int i = 0; i < N; ++i) {
        result[i] = chars[bps[N-i-1]];
    }
    return result.assumeUnique;
}
----

By "iteration" I mean every execution of the body of the `for` loop. For every new `i`, a new array is created.

The loop above is equivalent to this:

----
    for (int i = 0; i < N; ++i) {
        result[i] = [Repeat!('A'-'\0', '\0'), 'T',
                Repeat!('C'-'A'-1, '\0'), 'G',
                Repeat!('G'-'C'-1, '\0'), 'C',
                Repeat!('T'-'G'-1, '\0'), 'A'][bps[N-i-1]];
    }
----

Used like that, the array literal

    [Repeat!('A'-'\0', '\0'), 'T',
    Repeat!('C'-'A'-1, '\0'), 'G',
    Repeat!('G'-'C'-1, '\0'), 'C',
    Repeat!('T'-'G'-1, '\0'), 'A']

allocates a new array on every execution of `result[i] = ...;`.

> Could you provide some more reading for what you are telling here? I can only guess it is intrinsic behavior of an 'enum'.

Unfortunately, the spec page (<https://dlang.org/spec/enum.html>) doesn't seem to mention this.

But Ali Çehreli covers it in his book on the "immutability" page (I would have expected to find it on the "enum" page):

http://ddili.org/ders/d.en/const_and_immutable.html#ix_const_and_immutable.enum

The details can be confusing here. There is an element of copy/paste-like behavior, but it's not as simple as taking the right-hand side of the enum declaration and substituting it for the left-hand name.

The right-hand side is evaluated at compile time. The result of that can be thought of as an array literal. It's that array literal that gets substituted for the name.

An example with comments:

----
import std.stdio: writeln;

/* f prints a message when called at run time. Then it returns its
argument times ten. */
int f(int x)
{
    if (!__ctfe) writeln("f(", x, ")");
    return x * 10;
}

void main()
{
     /* The following line prints f's messages. The f calls are normal
    run-time calls. Then the writeln prints "false" because each array
    literal creates a new, distinct array.
    */
    writeln([f(1), f(2)] is [f(1), f(2)]); /* false */

    /* The next `enum` line does not print f's messages. The calls go
    through CTFE.
    The `writeln` line afterwards prints "false". ea gets pre-computed
    via CTFE, but the result acts like an array literal. So it's the
    same as writing `writeln([10, 20] is [10, 20]);`.
    */
    enum int[] ea = [f(1), f(2)];
    writeln(ea is ea); /* false */

    /* No messages either with `static immutable`. Like ea, the
    right-hand side goes through CTFE.
    But unlike ea, ia does not act like an array literal. `writeln`
    prints "true".
    */
    static immutable int[] ia = [f(1), f(2)];
    writeln(ia is ia); /* true */
}
----
May 22, 2017
On Monday, 22 May 2017 at 10:35:36 UTC, ag0aep6g wrote:
> On 05/22/2017 10:58 AM, biocyberman wrote:
>> [...]
>
> For reference, here is the version of revComp3 I commented on:
>
> ----
> string revComp3(string bps) {
>     const N = bps.length;
>     enum chars = [Repeat!('A'-'\0', '\0'), 'T',
>                 Repeat!('C'-'A'-1, '\0'), 'G',
>                 Repeat!('G'-'C'-1, '\0'), 'C',
>                 Repeat!('T'-'G'-1, '\0'), 'A'];
>
> [...]

Very illustrative. I could easily miss and I did miss this subtle but important aspect. I wonder how D should widen the 'pit of success' that Scott Meyers mentioned about more than once. A take home message for myself, if one ever use an array as a lookup table, make it 'static immutable. And enum array does not make sense'. And in Ali's book:

> Consider the hidden cost of enum arrays and enum associative arrays. Define them as immutable variables if the arrays are large and they are used more than once in the program.

One thing also became clear: 'is' is not '=='. Therefore

writeln([10,20] is [10,20]); /* false */
writeln([10,20] == [10,20]); /* true */

I did not notice that because I haven't come across 'is' so often.

May 25, 2017
On 05/22/2017 03:35 AM, ag0aep6g wrote:

> But Ali Çehreli covers it in his book on the "immutability" page (I
> would have expected to find it on the "enum" page):
>
> http://ddili.org/ders/d.en/const_and_immutable.html#ix_const_and_immutable.enum

Thank you for your feedback. I'm inserting a forward reference from the enum chapter.

I would like to acknowledge you in the book hopefully with your real name but if you don't want or care, I will use ag0aep6g. :)

Ali

May 25, 2017
On 05/25/2017 10:41 PM, Ali Çehreli wrote:
> I would like to acknowledge you in the book hopefully with your real name but if you don't want or care, I will use ag0aep6g. :)

Thanks, but I don't really care for the recognition. If anything, please use ag0aep6g.
May 25, 2017
On Friday, 19 May 2017 at 07:46:13 UTC, Stefan Koch wrote:
> On Friday, 19 May 2017 at 07:29:44 UTC, biocyberman wrote:
>> I am solving this problem http://rosalind.info/problems/revc/ as an exercise to learn D. This is my solution:
>>
>> https://dpaste.dzfl.pl/8aa667f962b7
>>
>> Is there some D tricks I can use to make the `reverseComplement` function more concise and speedy? Any other comments for improvement of the whole solution are also much appreciated.
>
> I think doing a switch or even a if-else chain would be faster then using an AA.

As part of your work on newCTFE, you are in a position to write a book titled "Fast D", about code that compiles, execute fast and efficiently.
1 2
Next ›   Last »