September 07

I have a question about a process pipeline. The following code always hangs at the second command, cmds[1]. When I run the equivalent command in a shell, it finishes quickly and produces output.bam.

import std.process;
import std.stdio;
import std.conv;
void executeCommandPipe(string[][] cmds) {

    // pipe init
    auto temp_pipe = pipe(); // save cmd1, cmd2, cmd3 ...

    // process first
    auto pid_first = spawnProcess(cmds[0], stdin, temp_pipe.writeEnd);
    scope(exit) wait(pid_first);

    // process cmd2 ~ cmdN-1
    for (int i = 1; i < cmds.length - 1; i++) {
        auto new_pipe = pipe(); // create next pipe

        auto pid = spawnProcess(cmds[i], temp_pipe.readEnd, new_pipe.writeEnd);
        scope(exit) wait(pid);

        temp_pipe = new_pipe; // update the pipe
    }

    // process final, output to stdout
    auto pid_final = spawnProcess(cmds[$-1], temp_pipe.readEnd, stdout);
    scope(exit) wait(pid_final);
}
void main() {
    string[][]cmds=[["samtools", "fixmate", "-@", "8", "-m", "-u", "input.bam", "-"], ["samtools", "sort", "-@", "8", "-u", "-",], ["samtools", "markdup", "-@", "8", "-u", "-", "output.bam"]];
    executeCommandPipe(cmds);
}

The corresponding Linux command is:

samtools fixmate -@ 8 -m -u input.bam - | samtools sort -@ 8 -u - | samtools markdup -u - output.bam
September 07

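Thanks go to Paul Backus, who pointed out that because scope(exit) wait(pid) is inside the loop, the code waits at the end of the current loop iteration, not at the end of the function. As a result, the program waits for cmds[1] before the final command that reads its output has been spawned, so cmds[1] can block as soon as the pipe buffer fills up.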
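
For anyone who hits the same problem, here is a minimal sketch of one way to restructure the function so that nothing is waited on until every stage has been started. The name runPipeline and the returned list of exit statuses are my own additions for illustration, not part of the original code or of Paul's answer. It relies on the default std.process behaviour of closing a pipe end in the parent once it has been handed to spawnProcess.

```d
import std.process;
import std.stdio;

// Hypothetical helper (not from the original post): runs cmds as a
// shell-style pipeline and returns each command's exit status, in order.
int[] runPipeline(string[][] cmds)
{
    Pid[] pids;
    File input = stdin; // the first command reads from our own stdin

    foreach (i, cmd; cmds)
    {
        if (i + 1 == cmds.length)
        {
            // The last command writes to our own stdout.
            pids ~= spawnProcess(cmd, input, stdout);
        }
        else
        {
            auto p = pipe();
            pids ~= spawnProcess(cmd, input, p.writeEnd);
            input = p.readEnd; // the next command reads what this one writes
        }
    }

    // Wait only after every stage is running, so no stage can stall
    // on a pipe whose reader has not been spawned yet.
    int[] statuses;
    foreach (pid; pids)
        statuses ~= wait(pid);
    return statuses;
}
```

Calling runPipeline(cmds) with the cmds array from the question should then behave like the shell pipeline, assuming samtools is on the PATH. Collecting the Pid values in an array avoids scope(exit) inside the loop altogether, and returning the statuses makes it possible to check whether any stage failed.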