Commercial video processing app in D (experience report)

April 27, 2016

Posted by thedeemon

Hi,
I just wanted to share some experience of using D in industry.
Recently my little company released version 2.0 of our flagship product Video Enhancer, a video processing application for Windows, and this time it's written in D.
http://www.infognition.com/VideoEnhancer/

A couple of screenshots:
http://data.infognition.com/VideoEnhancer/ve2d-filters.jpg
http://data.infognition.com/VideoEnhancer/ve2e-save.jpg

Version 1 was born some 10 years ago and was of course written in C++. It consisted of a main GUI executable and 5 dynamically loaded DirectShow filters. For the GUI, version 1 used MFC and a third-party skinning engine. This skinning engine had its own problems, but since we didn't have its sources we couldn't fix them, and meanwhile its author disappeared in the sands of time. So when the time came to create version 2, I chose the best available language and an open source GUI library with 100% control and customizability: DLangUI. Overall, I'm pretty happy with this choice.

Version 2 is quite different from v1 in feature set and internal structure; it's not a direct translation. It consists of two executables running in tandem (one does the GUI, the other deals with video) and 2 dynamically loaded DirectShow filters. Both main executables are purely in D, while the DirectShow filters are still in C++. Heavy number crunching, including our main feature, the motion-based video upscaler, is still in C++ because of heavy SIMD usage and the Intel compiler.

The main executable of version 1 was ~34K lines of C++ (half of which were libraries like pugixml) and a full build took ~90 seconds.
The main executables of version 2 total ~7.5K lines of D (of which 2K are auto-generated by the IDL2D tool). A full build of the GUI app takes 7 seconds and the worker app builds in 3-4 seconds, so that's a really nice improvement.
Thanks to Phobos we don't need many libraries: things like XML parsing, ZIP unpacking and many others are all covered by the standard library. Only two additional libraries were used: Cerealed for serialization of messages the two processes exchange, and DLangUI.

Some things to reflect on:

Compiler
The compiler used is DMD 2.070, 32-bit target. Video Enhancer supports 200+ plugins from VirtualDub and they happen to be 32-bit, so our app has to be 32-bit too. The speed of code generated by DMD is more than enough; even debug builds were fast enough. The default linker, optlink, is used (not the MS one). I was worried there might be trouble with antivirus false positives (that happened before when using optlink), but no, everything went smoothly and no problems with optlink arose whatsoever.

IDE
Visual Studio 2010 with VisualD. I've used this combo for many years, generally quite successfully. Last year its D parser had some problems that made it crash on code that used DLangUI, and that was painful. I even made a patch that made the crash silent, so VisualD would silently reload the parser and continue working. It worked, but luckily the authors of the D parser and VisualD quickly found the cause of the crash and fixed it, and since then everything works smoothly out of the box. I like how well DML works there, with syntax highlighting and autocompletion:
http://data.infognition.com/VideoEnhancer/dml.png

Builds
We're not using Dub to build the app: it tends to be slow and rebuilds dependencies too often (or maybe I just haven't learnt to use it properly). Instead we use Dub to build the libraries and produce .lib files, then reference the libraries' sources and .lib files in the VisualD projects of the main apps, and then use VisualD's simple build process that just invokes DMD.

Cerealed
This compile-time-introspection-based serialization lib is really great: powerful and easy to use. We're probably using an old version, as we haven't updated for some time, and the version we use sometimes had problems serializing certain types (like bool[], IIRC), so sometimes we had to tweak our message types to make it compile, but most of the time it just works.

DLangUI
Very nice library. Documentation is very sparse though, so learning to use DLangUI often means reading the source code of the examples and of the lib itself, and sometimes even that's not enough and you need to learn some Android basics, since it originates from the Android world. But once you learn how to use it, how to encode what you need in DML (a QML counterpart) or add required functionality by overriding some method of its class, it's really great and pleasant to use. Many times I was so happy the source code is available, first for learning, then for tweaking and fixing bugs. I've found a few minor bugs and sent a few trivial fixes that were merged quickly. DLangUI is cross-platform and has several backends for drawing and font rendering. We're using its minimal build targeted to use the Win32 API (had to tweak dub.json a bit). We don't use OpenGL, as it's not really guaranteed to work well on every Windows box. Using just WinAPI makes our app smaller, more stable and avoids dependencies.

Garbage Collection
Totally fine if you know what you're doing. Some people say you can't make responsive GUI apps in GC-ed languages. That's a myth. In this application we rely on the GC almost as freely as one would in C#: we allocate strings, messages, widgets and other small litter without worrying about deallocation, and it works well. I've never seen any performance problems related to GC in this app, and no GC pauses were visible at all. However, this is a 32-bit app and some care must be taken to avoid excessive memory use, first and foremost where bitmaps for GUI elements and video frames are allocated. DLangUI tends to allocate everything in the GC heap without a second thought and has been allocating some arrays even just to draw rescaled pictures. If you're resizing a window that contains a big bitmap that gets resized too, by default DLangUI tends to allocate a new buffer for each new size, which does not look good in the memory section of Task Manager. However, if instead of allocating a new ColorDrawBuf, for example, you use its resize() method and don't forget to call assumeSafeAppend() on its buffer, it stops eating memory and behaves well, even though it's still "managed" and no manual management is required. In a couple of places I changed ordinary arrays to std.container.array to reduce allocations; the code did not change much but there was much less work for the GC. One such change where it mattered (allocation during bitmap drawing) was merged back into DLangUI, so don't worry about that one.
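
To illustrate the assumeSafeAppend() point, here is a minimal sketch using a plain D array rather than DLangUI's ColorDrawBuf. The function name resizeInPlace and the sizes are made up for the example; the idea is only that a slice which has been shrunk can grow again inside its original GC block once the runtime has been told the tail is free to reuse.

```d
import std.stdio : writeln;

/// Runs one GC-managed buffer through several sizes (none larger than the
/// initial 1000 elements) and reports whether it stayed in the same block.
/// resizeInPlace is a hypothetical helper, not part of DLangUI.
bool resizeInPlace(size_t[] sizes)
{
    uint[] pixels = new uint[](1000);   // one initial allocation
    const original = pixels.ptr;

    foreach (n; sizes)
    {
        pixels.length = n;              // shrink or grow the same slice
        // Promise the runtime that nothing else uses the data past the new
        // end, so a later growth can reuse it instead of reallocating.
        pixels.assumeSafeAppend();
        if (pixels.ptr !is original)
            return false;               // the buffer was moved to a new block
    }
    return true;
}

void main()
{
    writeln(resizeInPlace([500, 800, 1000, 640]));
}
```

Without the assumeSafeAppend() call, growing the slice after a shrink makes the runtime allocate a fresh block, because it cannot know whether another slice still refers to the old tail.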

COM
To deal with video (reading, parsing, decoding, encoding...) we use DirectShow, which is COM-based. I've found a lot of good stuff for working with COM in the VisualD sources. There is the IDL2D utility that converts IDL files from the Windows SDK to D source files. We used it actively (~2K lines generated). There is also a TLB to IDL converter if you need it (I haven't tried to use it myself). When IDL2D converts COM interfaces, it adds their IIDs (GUIDs) as a static member of the interface, so when you need to QueryInterface() you don't need to provide the IID for the interface, it's already there. This is used by the ComPtr smart pointer, which also deals with most COM stuff like reference counting. I borrowed the initial ComPtr implementation from VisualD and then changed it a lot. Most importantly, my version also checks for errors automatically, so now I have an Error Monad for free:
http://www.infognition.com/blog/2016/error_checking_smart_pointer.html

D features used
Apart from the smart pointer above, we used compile-time introspection to pass and automatically dispatch messages between the two processes; there is a blog post about that too. We used User Defined Attributes to describe what to do with errors if they arise in message handlers
http://stuff.thedeemon.com/lj/onerror.png
and then proper error handling was performed automatically, no need to repeat it in each method, just add one word of annotation and it works.
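
A rough sketch of how such UDA-driven handling can look is below. It is not the code from the app: the LogError attribute, the Open and Seek message types and the dispatch function are all invented for illustration, and a Variant stands in for a deserialized inter-process message.

```d
import std.stdio : writeln;
import std.traits : Parameters, hasUDA;
import std.variant : Variant;

struct LogError {}            // hypothetical UDA: "log the error and carry on"

struct Open { string path; }  // hypothetical message types
struct Seek { long frame; }

class Handlers
{
    string[] log;

    void handle(Open m) { log ~= "open " ~ m.path; }

    @LogError void handle(Seek m)
    {
        if (m.frame < 0)
            throw new Exception("negative frame");
        log ~= "seek";
    }
}

/// Finds the handle() overload whose parameter type matches the message
/// and calls it, applying the error policy named by the overload's UDA.
bool dispatch(H)(H h, Variant msg)
{
    // getOverloads yields a compile-time sequence, so this loop is unrolled.
    foreach (overload; __traits(getOverloads, H, "handle"))
    {
        alias Msg = Parameters!overload[0];
        if (auto p = msg.peek!Msg)
        {
            static if (hasUDA!(overload, LogError))
            {
                try { h.handle(*p); }
                catch (Exception e) { h.log ~= "error: " ~ e.msg; }
            }
            else
            {
                h.handle(*p);   // no annotation: exceptions propagate
            }
            return true;
        }
    }
    return false;               // no handler accepts this message type
}

void main()
{
    auto h = new Handlers;
    dispatch(h, Variant(Open("a.avi")));
    dispatch(h, Variant(Seek(-1)));
    writeln(h.log);
}
```

The handler bodies stay free of try/catch; the policy lives in the one-word annotation and is applied in a single place.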
We used std.concurrency message passing to talk between GUI and other threads.
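
For readers who haven't used std.concurrency, a minimal sketch of the pattern follows. The Frame and Done message types, the worker function and roundTrip are assumptions made up for this example, not taken from the app.

```d
import std.concurrency : Tid, receive, receiveOnly, send, spawn, thisTid;
import std.stdio : writeln;

struct Frame { int index; }   // hypothetical request from the GUI thread
struct Done {}                // hypothetical "please stop" message

/// Background thread: answers each Frame and exits on Done.
void worker(Tid gui)
{
    bool running = true;
    while (running)
    {
        // receive() picks the handler whose parameter type matches.
        receive(
            (Frame f) { gui.send(f.index * 2); },
            (Done _)  { running = false; }
        );
    }
}

/// Sends one request to a fresh worker and waits for its reply.
int roundTrip(int index)
{
    auto tid = spawn(&worker, thisTid);
    tid.send(Frame(index));
    const reply = receiveOnly!int();
    tid.send(Done());
    return reply;
}

void main()
{
    writeln(roundTrip(21));
}
```

Messages are copied between threads and matched by type, so the threads share no mutable state and need no locks.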
We enjoyed the ability to call C++ interfaces directly from D and D methods from C++, no FFI required, that was really nice and helped a lot.
The ability to include binary data as simply as import("file.dat") was also very nice. This way we include some images and also bytecode for a little VM, generated by a compiler written in OCaml, but that is another story.
That's what comes to mind now, the rest of the code is quite boring I guess.

TL;DR: at least in some areas D is a fine successor to C++ and can be used to make real world video processing apps that people around the world use and pay for.

On Wednesday, 27 April 2016 at 12:42:05 UTC, thedeemon wrote:
>
> Compiler
> Compiler used is DMD 2.070, 32-bit target. Video Enhancer supports 200+ plugins from VirtualDub and they happen to be 32-bit, so our app has to be 32-bit too. Speed of code generated by DMD is more than enough, even debug builds were fast enough. The default linker is used (not the MS one), and I was worried there might be some troubles with antivirus false positives (that happened before when using optlink) but no, everything went smooth and no problems with optlink arose whatsoever.
>

Note that starting with the newest LDC releases you can have Win32 builds.

The parameters to pass to the linker to avoid a VS runtime dependency are:

    link.exe [...stuff...] libcmt.lib /nodefaultlib:msvcrt.lib /nodefaultlib:vcruntime.lib

Users report such builds working on XP, Vista and later of course.
The advantages are faster binaries (typically 2x faster) and, importantly, a lack of backend regressions.
