Thread overview
reporting positive real-world experience
May 12, 2004
Helmut Leitner
May 12, 2004
Ben Hinkle
May 12, 2004
May 12, 2004
I'm reporting first positive results of translating the commandline-tool HFIND from C to D (=DFIND).

HFIND is a mix of CTAGS and GREP, a multi-file, multi-project
search utility that knows about C-family programming-language syntax.
You can ask it to search only certain types of tokens (e.g. "only strings"
or, more typically, "only commands without strings or comments"),
it knows about words within identifiers, you can control your
project vocabulary and style conventions with it ... and more.

HFIND relies heavily on some C-libraries, so to simplify the translation
we gathered everything necessary into a single 150 KB / 6000 LOC file.

After translation DFIND is 69 KB / 2600 LOC. Most of this code reduction comes from the replacement of dynamic array and hash modules by the simpler D built-ins. A lot of the string handling also became simpler.
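
To illustrate the kind of reduction described above, here is a minimal sketch of D's built-in dynamic arrays and associative arrays. It is not taken from DFIND: the function name countWords and the sample words are made up for illustration, and the syntax is that of present-day D rather than the 2004 compiler.

```d
import std.stdio;

// Hypothetical helper (not from DFIND): counts how often each word occurs.
int[string] countWords(string[] words)
{
    int[string] counts;      // built-in hash table, no separate module needed
    foreach (word; words)
        ++counts[word];      // a missing key is created with value 0 first
    return counts;
}

void main()
{
    string[] found;          // built-in dynamic array
    found ~= "foo";          // appending grows the array as needed
    found ~= "bar";
    found ~= "foo";

    auto counts = countWords(found);
    writeln(counts["foo"]);  // prints 2
    writeln(counts["bar"]);  // prints 1
}
```

In C the same task needs a hash module plus explicit memory management for the array, which is presumably where much of the removed code lived.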

The executables are 80 KB for HFIND, 99 KB for DFIND. The difference
is remarkably small, given the larger initial D footprint.

The runtime results seem to fall into two groups:

 (1) Benchmarks that mainly do searching and tokenizing.
     In this group D is on par with C, typically a few percent faster.

 (2) Benchmarks that use hash functions heavily.
     In this group D is about 4x faster. The C hash-modules
     which use a standard OO function call interface are
     slow compared to the D built-ins.

More testing and optimizing will be done, which can only improve the D results further.

Borland C 5.0x was used as the C compiler of choice
for the comparisons. Only the Win32 D-compiler was tested.

- C pointer juggling translates to array slices smoothly.
- Initial bad performance was traced most of the time to the
  avoidable use of toString (unnecessary object generation).
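
As a rough illustration of the first point, a tokenizer step that C would write with a pair of char pointers can be written in D with a slice. This is only a sketch under assumptions: identifierAt is a hypothetical function, not code from DFIND, and it uses present-day D syntax.

```d
import std.stdio;
import std.ascii : isAlphaNum;

// Hypothetical tokenizer step (not from DFIND): returns the identifier that
// starts at text[start]. The result is a slice, so it carries its own length
// and refers to the original memory without copying anything.
const(char)[] identifierAt(const(char)[] text, size_t start)
{
    size_t end = start;
    while (end < text.length && (isAlphaNum(text[end]) || text[end] == '_'))
        ++end;
    return text[start .. end];
}

void main()
{
    auto line = "count = old_count + 1;";
    writeln(identifierAt(line, 8));   // prints old_count
}
```

Because the slice is just a view into the line, no new object is generated, which also fits the second point about avoiding unnecessary allocation.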

The results suggest that D is competitive for real-world
commandline-tool development and a good replacement for C
in this respect.

Helmut Leitner
Graz, Austria
May 12, 2004
> - Initial bad performance was traced most of the time to the
>   avoidable use of toString (unnecessary object generation).

I wonder if it is worth having overloaded toStrings with an optional buffer.
So for example
 char[] toString(int x, char[] buffer);
would
1) return one of the digits "0" ... "9" for numbers 0 to 9 (as does the
existing toString)
2) fill and return buffer if the result would fit (this would be new)
3) otherwise allocate a new string, fill and return it

So the buffer would be a scratch pad to scribble on if needed but there would be no guarantee to use it. The less garbage generated the better!
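
The three cases above could be sketched roughly as follows. This is only an illustration under assumptions, not an existing library function: the name toStringBuf is made up, the code uses present-day D syntax, and it returns const(char)[] rather than char[] because case 1 hands out a slice of a read-only digit table.

```d
import std.stdio;

// Hypothetical sketch of the proposal (not a Phobos function).
const(char)[] toStringBuf(int x, char[] buffer)
{
    // 1) single digits: return a slice of a static table, no allocation
    static immutable string digits = "0123456789";
    if (x >= 0 && x <= 9)
        return digits[x .. x + 1];

    // Format backwards into a local scratch area; an int needs at most
    // 11 characters ("-2147483648"). A long avoids overflow on int.min.
    char[11] tmp;
    size_t pos = tmp.length;
    long v = x;
    immutable bool negative = v < 0;
    if (negative)
        v = -v;
    do
    {
        tmp[--pos] = cast(char)('0' + v % 10);
        v /= 10;
    } while (v != 0);
    if (negative)
        tmp[--pos] = '-';
    const(char)[] text = tmp[pos .. $];

    // 2) the result fits: fill the caller's buffer and return a slice of it
    if (text.length <= buffer.length)
    {
        buffer[0 .. text.length] = text[];
        return buffer[0 .. text.length];
    }

    // 3) otherwise allocate a new string, fill and return it
    return text.dup;
}

void main()
{
    char[16] scratch;
    writeln(toStringBuf(7, scratch[]));       // prints 7
    writeln(toStringBuf(-1234, scratch[]));   // prints -1234
    char[2] tiny;
    writeln(toStringBuf(123456, tiny[]));     // prints 123456
}
```

As in the proposal, the caller cannot assume the buffer was used: the result may point into the digit table, into the buffer, or at freshly allocated memory.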


May 12, 2004
This is great news! Thanks for putting this together.