March 23, 2014
I decided to redirect stdout to nul and print the stopwatch messages to stderr.
So, basically like this.

import std.stdio;
import std.datetime;
import std.cstream;

StopWatch sw;
sw.start();

// ... measured code ...

sw.stop();
derr.writefln("time: ", sw.peek().msecs, "[ms]");

Here are the Windows results comparing two versions for n=2001. With stdout redirected to nul, one version is about 3x faster than the other.

D:\diamond\diamond\diamond\Release>diamond 1>nul
time: 15[ms]
time: 42[ms]
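
For anyone trying this with a recent Phobos, where std.cstream is no longer available and StopWatch lives in std.datetime.stopwatch, the same idea might look roughly like the sketch below. The elapsedMsecs helper and the loop inside main are made-up stand-ins for the measured code, not part of the original program.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : stderr, writeln;

/// Runs `work` once and returns the elapsed wall-clock time in milliseconds.
/// (Hypothetical helper, named here only for illustration.)
long elapsedMsecs(void delegate() work)
{
    auto sw = StopWatch(AutoStart.yes); // starts timing immediately
    work();
    sw.stop();
    return sw.peek.total!"msecs";
}

void main()
{
    // Stand-in for the measured code: it writes to stdout, which the
    // caller can discard with `1>nul` (Windows) or `>/dev/null` (Linux).
    const ms = elapsedMsecs(() {
        foreach (i; 0 .. 1000)
            writeln(i);
    });

    // The timing goes to stderr, so it survives the stdout redirection.
    stderr.writeln("time: ", ms, "[ms]");
}
```

Run it as `prog 1>nul` and only the timing line should remain visible.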






March 23, 2014
On 03/22/2014 06:03 PM, Jay Norwood wrote:

> derr.writefln("time: ", sw.peek().msecs, "[ms]");

Cool. stderr should work too:

    stderr.writefln(/* ... */);

Ali

March 23, 2014
On Saturday, 22 March 2014 at 14:41:48 UTC, Jay Norwood wrote:
> The computation times of different methods can differ a lot.   How do you suggest to measure this effectively without the overhead of the write and writeln output?   Would a count of 100001 and stubs like below be reasonable, or would there be something else that would  prevent the optimizer from getting too aggressive?

I used this to benchmark H. S. Teoh's calendar formatter:

version(benchmark)
{
    int main(string[] args)
    {
        enum MonthsPerRow = 3;
        auto t = benchmark!(function() {
                foreach(formattedYear; iota(1800, 2000).map!(year => formatYear(year, MonthsPerRow)))
                {
                    foreach(_; formattedYear){};
                }
            })(30);
        writeln(t[0].msecs * 0.001);
        return 0;
    }
}

While the optimizer could probably remove all of that, it doesn't. I also tested it against other options like walkLength; this ended up being the better choice.
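
If you would rather not rely on the optimizer leaving an empty loop alone, one possible variation is to fold every element into a value and print it, so the result visibly depends on the work. This is only a sketch: makeRows and drain are hypothetical names, and makeRows is a trivial stand-in for the real code under test.

```d
import std.algorithm : map;
import std.range : iota;
import std.stdio : stderr;

/// Hypothetical stand-in for the code under test: lazily yields `n` values.
auto makeRows(int n)
{
    return iota(n).map!(i => i * 2 + 1);
}

/// Drains a range and folds every element into a sum, so the returned
/// value depends on each element having actually been computed.
long drain(R)(R rows)
{
    long sum = 0;
    foreach (row; rows)
        sum += row;
    return sum;
}

void main()
{
    // Printing the result gives the compiler a reason to keep the loop.
    stderr.writeln("sum: ", drain(makeRows(1000)));
}
```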

(BTW, using joiner instead of join I was able to more than double the performance: https://github.com/luismarques/dcal/tree/benchmark . Once the pipeline is made lazy end to end that will probably have even more impact.)
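
To illustrate the difference in general terms (this is not the dcal code, just a small sketch with made-up rows): std.array.join builds the whole result eagerly, while std.algorithm.joiner returns a lazy range that produces elements on demand.

```d
import std.algorithm : equal, joiner;
import std.array : join;

void main()
{
    // Made-up input rows, only for illustration.
    string[] rows = ["*", "***", "*"];

    // join is eager: it allocates and returns a new string.
    string eager = rows.join("\n");

    // joiner is lazy: it returns a range that walks the rows on demand,
    // without allocating the combined string.
    auto lazyRange = rows.joiner("\n");

    assert(eager == "*\n***\n*");
    assert(lazyRange.equal(eager)); // same characters, produced lazily
}
```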
March 23, 2014
Hmmm, looks like stderr.writefln requires format specs, else it omits the additional parameters. (not so on derr.writefln)

stderr.writefln("time: %s%s",sw.peek().msecs, "[ms]");

D:\diamond\diamond\diamond\Release>diamond 1>nul
time: 16[ms]
time: 44[ms]
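
A small sketch of the two styles side by side, in case it helps: writefln expects a format string with one spec per argument, while writeln simply prints its arguments in order. The viaFormat and viaText helpers are hypothetical and only exist to show that both styles build the same line.

```d
import std.conv : text;
import std.format : format;
import std.stdio : stderr;

/// Builds the report line from a format string (writefln style).
string viaFormat(long ms)
{
    return format("time: %s[ms]", ms);
}

/// Builds the same line by concatenating the arguments (writeln style).
string viaText(long ms)
{
    return text("time: ", ms, "[ms]");
}

void main()
{
    stderr.writefln("time: %s[ms]", 16);  // one %s per argument
    stderr.writeln("time: ", 16, "[ms]"); // no format string needed
    assert(viaFormat(16) == viaText(16));
}
```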



March 23, 2014
I converted the solution examples to functions and wrote a test that measures each one 100 times with a diamond of size 1001. These are release build times. timon's crashed, so I took it out. Maybe I made a mistake copying it; I'll have to go back and look.


D:\diamond\diamond\diamond\Release>diamond 1>nul
brad: time: 78128[ms]
printDiamond1: time: 1166[ms]
printDiamond2: time: 1659[ms]
printDiamond3: time: 631[ms]
jay1: time: 466[ms]
sergei: time: 11944[ms]
jay2: time: 414[ms]


These are the measurement functions:


void measure( void function(in int a) func, int times, int diamondsz, string name ){
  StopWatch sw;
  sw.start();
  for (int i=0; i<times; i++){
    func(diamondsz);
  }
  sw.stop;
  stderr.writeln(name, ": time: ", sw.peek().msecs, "[ms]");
}

void measureu( void function(in uint a) func, int times, uint diamondsz, string name ){
  StopWatch sw;
  sw.start();
  for (int i=0; i<times; i++){
    func(diamondsz);
  }
  sw.stop;
  stderr.writeln(name, ": time: ", sw.peek().msecs, "[ms]");
}

int main(string[] argv)
{
	int times = 100;
	int dsz = 1001;
	uint dszu = 1001;
	measure (&brad,times,dsz,"brad");
	//measure (&timon,times,dsz,"timon");
	measureu (&printDiamond1,times,dszu,"printDiamond1");
	measure (&printDiamond2,times,dsz,"printDiamond2");
	measure (&printDiamond3,times,dsz,"printDiamond3");
	measure (&jay1,times,dsz,"jay1");
	measure (&sergei,times,dsz,"sergei");
	measure (&jay2,times,dsz,"jay2");

	return 0;

}
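
As a possible simplification, measure and measureu could be merged into one template that takes the function as an alias parameter and infers the size type from the argument. This is a sketch under the assumption of a recent Phobos (std.datetime.stopwatch); takesInt and takesUint are hypothetical stand-ins for the real solutions.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : stderr;

/// Calls `func(size)` `times` times, reports the total on stderr and
/// returns it. `T` is inferred from `size`, so one template serves
/// functions taking `in int` as well as `in uint`.
long measure(alias func, T)(int times, T size, string name)
{
    auto sw = StopWatch(AutoStart.yes);
    foreach (i; 0 .. times)
        func(size);
    sw.stop();
    const ms = sw.peek.total!"msecs";
    stderr.writeln(name, ": time: ", ms, "[ms]");
    return ms;
}

// Hypothetical stand-ins for the solutions being timed.
void takesInt(in int n) {}
void takesUint(in uint n) {}

void main()
{
    measure!takesInt(100, 1001, "takesInt");
    measure!takesUint(100, 1001u, "takesUint");
}
```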

All the functions are like this:
void brad(in int length){
  import std.algorithm, std.range, std.stdio, std.conv;

  auto rng =
    chain(iota(length), iota(length, -1, -1))
    .map!((a => " ".repeat(length-a)),
    (a => "#".repeat(a*2+1)))
    .map!(a => chain(a[0].joiner, a[1].joiner, "\n"))
    .joiner;

  writeln(rng);
}

void timon(in int s){
  import std.stdio, std.range, std.algorithm, std.math;

  writef("%(%s\n%)", (i=>i.map!(a=>i.map!(b=>"* "[a+b>s/2])))
    (iota(-s/2,s/2+1).map!abs));
}
March 23, 2014
A problem with the previous brad measurement is that his solution creates a diamond of size 2n+1 for an input of n. After correcting the size passed to brad's function and re-running, I get the times below. So the computational overhead of the various solutions can differ by roughly 40x, depending on the implementation.

D:\diamond\diamond\diamond\Release>diamond 1>nul
brad: time: 19554[ms]
printDiamond1: time: 1154[ms]
printDiamond2: time: 1637[ms]
printDiamond3: time: 622[ms]
jay1: time: 475[ms]
sergei: time: 11939[ms]
jay2: time: 413[ms]
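
The size arithmetic can be checked with a few lines. The sketch below assumes the correction was to pass (size-1)/2 to brad's function, which is the value that makes 2n+1 come out to the requested size; bradDiamondSize and bradArgFor are hypothetical names used only for this check.

```d
/// Rows (and width of the widest row) produced by a solution built on
/// chain(iota(length), iota(length, -1, -1)): length + (length + 1) rows.
int bradDiamondSize(int length)
{
    return 2 * length + 1;
}

/// Argument to pass so that such a solution yields a diamond of `size`.
/// (Assumed correction; `size` is expected to be odd.)
int bradArgFor(int size)
{
    return (size - 1) / 2;
}

void main()
{
    assert(bradDiamondSize(1001) == 2003); // about double the intended size
    assert(bradArgFor(1001) == 500);
    assert(bradDiamondSize(bradArgFor(1001)) == 1001);
}
```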



March 23, 2014
Jay Norwood:

> A problem with the previous brad measurement is that his solution creates a diamond of size 2n+1 for an input of n.  Correcting the size input for brad's function call, and re-running, I get this.  So the various solutions can have overhead computation time of 40x difference, depending on the implementation.

The task didn't ask for a computationally efficient solution :-) So you are measuring something that was not optimized for. So there's a lot of variance.

Bye,
bearophile
March 23, 2014
On Sunday, 23 March 2014 at 17:30:20 UTC, bearophile wrote:

>
> The task didn't ask for a computationally efficient solution :-) So you are measuring something that was not optimized for. So there's lot of variance.
>
> Bye,
> bearophile

Yes, this is just for my own education. My builds use the dmd compiler on Windows, and some posts indicate I should currently expect better optimization from the ldc compiler, so maybe I'll get on a Linux box and retest with ldc.



March 24, 2014
These were the times with dmd on 64-bit Ubuntu. I added diamondShape, slightly modified to be consistent with the others: I removed the second parameter and do the writeln calls within the function, as in the other solutions. This is still with dmd. I've downloaded ldc.

Also, I posted the test code at dpaste.com/hold/1753517


brad: time: 20837[ms]
printDiamond1: time: 482[ms]
printDiamond2: time: 944[ms]
printDiamond3: time: 490[ms]
jay1: time: 62[ms]
sergei: time: 4154[ms]
jay2: time: 30[ms]
diamondShape: time: 3384[ms]

void diamondShape(in int N)
{
    import std.range : chain, iota, repeat;
    import std.algorithm : map;
    import std.conv : text;
    import std.stdio : writeln;
    import std.string : center, format;
    import std.exception : enforce;
    dchar fillChar = '*';
    enforce(N % 2, format("Size must be an odd number. (%s)", N));

    // Row widths run 1, 3, ..., N-2, then N, N-2, ..., 1.
    foreach (ln; chain(iota(1, N, 2), iota(N, 0, -2))
                 .map!(i => fillChar.repeat(i))
                 .map!(s => s.text)
                 .map!(s => s.center(N)))
        writeln(ln);
}

March 24, 2014
On Sunday, 23 March 2014 at 18:28:18 UTC, Jay Norwood wrote:
> On Sunday, 23 March 2014 at 17:30:20 UTC, bearophile wrote:
>
>>
>> The task didn't ask for a computationally efficient solution :-) So you are measuring something that was not optimized for. So there's lot of variance.
>>
>> Bye,
>> bearophile
>
> Yes, this is just for my own education.   My builds are using the dmd compiler on windows, and some  posts indicate I should expect better optimization currently with the ldc compiler... so maybe I'll get on a linux box and retest with ldc.

So it's about speed now? Then I submit this:

//----
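void printDiamond(size_t N)
{
    import std.stdio : writeln;

    // Stack buffers cover N <= 64; larger sizes fall back to the GC heap.
    char[32] rawSpace = void;
    char[64] rawStars = void;
    char* pSpace = rawSpace.ptr;
    char* pStars = rawStars.ptr;
    if (N > 64)
    {
        pSpace = new char[](N/2).ptr;
        pStars = new char[](N).ptr;
    }
    pSpace[0 .. N/2] = ' ';
    pStars[0 ..   N] = '*';

    // Assumes N is odd: for even N the widest row (2*(N/2)+1 stars)
    // would slice one char past the filled part of the buffer.
    N/=2;
    // Each row is just a pair of slices into the prefilled buffers.
    foreach         (n ; 0 .. N + 1)
        writeln(pSpace[0 .. N - n], pStars[0 .. 2*n+1]);
    foreach_reverse (n ; 0 .. N)
        writeln(pSpace[0 .. N - n], pStars[0 .. 2*n+1]);
}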
//----