Thread overview
commonmark-d: A fast CommonMark and Github Flavoured Markdown parser, translation of MD4C
Sep 30, 2019
Guillaume Piolat
Oct 01, 2019
Mike Parker
Oct 01, 2019
Dennis
Oct 02, 2019
Guillaume Piolat
Oct 01, 2019
bachmeier
Oct 02, 2019
Guillaume Piolat
Oct 02, 2019
zoujiaqing
Oct 02, 2019
Guillaume Piolat
Oct 03, 2019
LocoDelPueblo
Oct 05, 2019
zoujiaqing
September 30, 2019
Hello,

commonmark-d is a D translation of MD4C, a fast SAX-like Markdown parser.
MD4C achieves remarkable parsing speed through the lack of AST and careful memory usage.

The route of translation was chosen because parsing Markdown is much more involved than first thought. The D translation largely preserves the speed benefits of MD4C.


Usage:

    // Parse CommonMark, generate HTML
    import commonmarkd;
    string html = convertMarkdownToHTML(markdown);

Key Performance Numbers:
    - commonmark-d compiles 3x faster than dmarkdown and 40x faster than hunt-markdown.
    - commonmark-d parses Markdown 2x faster than dmarkdown and 15x faster than hunt-markdown (see GitHub for benchmark details)

I haven't measured memory usage of either compile time or run time, but I feel like it's also better.

Available now on DUB: http://code.dlang.org/packages/commonmark-d
GitHub page: https://github.com/p0nce/commonmark-d






October 01, 2019
On Monday, 30 September 2019 at 23:06:42 UTC, Guillaume Piolat wrote:
> Hello,
>
> commonmark-d is a D translation of MD4C, a fast SAX-like Markdown parser.
>

Thumbs up!


October 01, 2019
Cool!

On Monday, 30 September 2019 at 23:06:42 UTC, Guillaume Piolat wrote:
> Key Performance Numbers:

Have you compared it with the original C code from MD4C?
October 01, 2019
On Monday, 30 September 2019 at 23:06:42 UTC, Guillaume Piolat wrote:
> Hello,
>
> commonmark-d is a D translation of MD4C, a fast SAX-like Markdown parser.
> MD4C achieves remarkable parsing speed through the lack of AST and careful memory usage.
>
> The route of translation was chosen because parsing Markdown is much more involved than first thought. The D translation largely preserves the speed benefits of MD4C.
>
>
> Usage:
>
>     // Parse CommonMark, generate HTML
>     import commonmarkd;
>     string html = convertMarkdownToHTML(markdown);
>
> Key Performance Numbers:
>     - commonmark-d compiles 3x faster than dmarkdown and 40x faster than hunt-markdown.
>     - commonmark-d parses Markdown 2x faster than dmarkdown and 15x faster than hunt-markdown (see GitHub for benchmark details)
>
> I haven't measured memory usage of either compile time or run time, but I feel like it's also better.
>
> Available now on DUB: http://code.dlang.org/packages/commonmark-d
> GitHub page: https://github.com/p0nce/commonmark-d

This is really nice. The examples show only conversion to html. Is there an easy way to get the intermediate output and convert to PDF through latex, to org-mode, etc., or to change the html conversion? One use case that is easy with Pandoc is to copy just the code from markdown into its own source file as a simple form of literate programming.
October 02, 2019
On Tuesday, 1 October 2019 at 11:37:00 UTC, Dennis wrote:
> Cool!
>
> On Monday, 30 September 2019 at 23:06:42 UTC, Guillaume Piolat wrote:
>> Key Performance Numbers:
>
> Have you compared it with the original C code from MD4C?

No. It's entirely possible that there is a small difference; however, most of the code is under nothrow @nogc and only uses the GC to allocate the output buffer (the grow strategy might matter there). I don't expect much difference, but yeah, haven't tested :)
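
A minimal sketch of that split, for readers unfamiliar with the attributes: the worker functions are `nothrow @nogc` and only the wrapper touches the GC, once, for the output buffer. This is not code from commonmark-d; `entityFor`, `escapedLength`, `escapeInto` and `escapeHtml` are hypothetical names, and the sketch sidesteps the grow-strategy question by computing the exact output size up front instead of growing a buffer.

```d
import std.exception : assumeUnique;

/// Entity for characters that need escaping in HTML text, null otherwise.
/// (Hypothetical helper, not part of commonmark-d.)
string entityFor(char c) pure nothrow @nogc @safe
{
    switch (c)
    {
        case '&': return "&amp;";
        case '<': return "&lt;";
        case '>': return "&gt;";
        case '"': return "&quot;";
        default:  return null;
    }
}

/// Size of `text` once escaped. Allocates nothing.
size_t escapedLength(const(char)[] text) pure nothrow @nogc @safe
{
    size_t n = 0;
    foreach (char c; text)
    {
        immutable e = entityFor(c);
        n += e.length ? e.length : 1;
    }
    return n;
}

/// Writes the escaped text into a caller-provided buffer. Allocates nothing.
/// `dest` must hold at least `escapedLength(text)` chars.
void escapeInto(const(char)[] text, char[] dest) pure nothrow @nogc @safe
{
    size_t pos = 0;
    foreach (char c; text)
    {
        immutable e = entityFor(c);
        if (e.length)
        {
            dest[pos .. pos + e.length] = e[];
            pos += e.length;
        }
        else
            dest[pos++] = c;
    }
}

/// The only place that uses the GC: one allocation for the output buffer.
string escapeHtml(const(char)[] text) pure nothrow
{
    auto buf = new char[](escapedLength(text));
    escapeInto(text, buf);
    return assumeUnique(buf); // buf is not referenced anywhere else
}

unittest
{
    assert(escapeHtml("a < b & c") == "a &lt; b &amp; c");
}
```

A real renderer usually cannot know the output size in advance, which is where a growing buffer (and its grow strategy) comes in.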
October 02, 2019
On Tuesday, 1 October 2019 at 16:02:47 UTC, bachmeier wrote:
> On Monday, 30 September 2019 at 23:06:42 UTC, Guillaume Piolat wrote:
>> Hello,
>>
>> commonmark-d is a D translation of MD4C, a fast SAX-like Markdown parser.
>> MD4C achieves remarkable parsing speed through the lack of AST and careful memory usage.
>>
>> The route of translation was chosen because parsing Markdown is much more involved than first thought. The D translation largely preserves the speed benefits of MD4C.
>>
>>
>> Usage:
>>
>>     // Parse CommonMark, generate HTML
>>     import commonmarkd;
>>     string html = convertMarkdownToHTML(markdown);
>>
>> Key Performance Numbers:
>>     - commonmark-d compiles 3x faster than dmarkdown and 40x faster than hunt-markdown.
>>     - commonmark-d parses Markdown 2x faster than dmarkdown and 15x faster than hunt-markdown (see GitHub for benchmark details)
>>
>> I haven't measured memory usage of either compile time or run time, but I feel like it's also better.
>>
>> Available now on DUB: http://code.dlang.org/packages/commonmark-d
>> GitHub page: https://github.com/p0nce/commonmark-d
>
> This is really nice. The examples show only conversion to html. Is there an easy way to get the intermediate output and convert to PDF through latex, to org-mode, etc., or to change the html conversion? One use case that is easy with Pandoc is to copy just the code from markdown into its own source file as a simple form of literate programming.

MD4C is a push parser without an AST, so you have to give it callbacks to generate any kind of intermediate output. You'd have to make md_parse public in commonmark-d; it is a C-style API.
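
To illustrate what "push parser with callbacks" means for the literate-programming use case, here is a toy sketch. It is not the md_parse API and not code from commonmark-d: `CodeBlockHandlers`, `pushParseFences` and `extractCode` are hypothetical names, and the "parser" only recognizes fenced code blocks whose fence starts at the beginning of a line. The point is the shape: the parser builds no tree, it just pushes events at the caller, who keeps whatever it needs.

```d
import std.algorithm.searching : startsWith;
import std.string : lineSplitter;

/// Hypothetical event sink: the parser calls these as it scans the input.
struct CodeBlockHandlers
{
    void delegate(const(char)[] infoString) enterCodeBlock;
    void delegate(const(char)[] line) codeLine;
    void delegate() leaveCodeBlock;
}

/// Toy push parser: handles only ``` fences, builds no AST.
void pushParseFences(const(char)[] markdown, CodeBlockHandlers h)
{
    bool inside = false;
    foreach (line; markdown.lineSplitter)
    {
        if (line.startsWith("```"))
        {
            if (!inside)
                h.enterCodeBlock(line[3 .. $]); // text after the fence, e.g. "d"
            else
                h.leaveCodeBlock();
            inside = !inside;
        }
        else if (inside)
            h.codeLine(line);
    }
}

/// Collects the code from all fenced blocks, one line per source line.
string extractCode(const(char)[] markdown)
{
    char[] result;
    CodeBlockHandlers h;
    h.enterCodeBlock = delegate(const(char)[] info) { };
    h.codeLine = delegate(const(char)[] line) { result ~= line; result ~= '\n'; };
    h.leaveCodeBlock = delegate() { };
    pushParseFences(markdown, h);
    return result.idup;
}

unittest
{
    auto doc = "Intro\n```d\nint x = 1;\n```\nOutro\n";
    assert(extractCode(doc) == "int x = 1;\n");
}
```

Swapping the three delegates for ones that emit LaTeX or org-mode would give a different back end over the same event stream, which is the flexibility a callback API offers in place of an AST.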

My long-term goal is indeed super fast conversion of Markdown to PDF. Now that we have the CommonMark parser and the PDF generation, I just need the time to manage layout. Possibly making a minimal browser is a better route, dunno.

October 02, 2019
On Monday, 30 September 2019 at 23:06:42 UTC, Guillaume Piolat wrote:
> Hello,
>
> I haven't measured memory usage of either compile time or run time, but I feel like it's also better.
>

Thanks, I like this project.

Because hunt-markdown is strictly abstract in design, the performance is not particularly good:)
October 02, 2019
On Wednesday, 2 October 2019 at 09:33:03 UTC, zoujiaqing wrote:
> On Monday, 30 September 2019 at 23:06:42 UTC, Guillaume Piolat wrote:
>> Hello,
>>
>> I haven't measured memory usage of either compile time or run time, but I feel like it's also better.
>>
>
> Thanks, I like this project.
>
> Because hunt-markdown is strictly abstract in design, the performance is not particularly good:)

I wanted to use hunt-markdown but was thinking it could use a bit less RAM :) Translations look like the originals. I'd be very happy if you would consider commonmark-d for your use case. Having no AST is less flexible but has nice properties.
October 03, 2019
On Monday, 30 September 2019 at 23:06:42 UTC, Guillaume Piolat wrote:
> Hello,
>
> commonmark-d is a D translation of MD4C, a fast SAX-like Markdown parser.
> MD4C achieves remarkable parsing speed through the lack of AST and careful memory usage.
>
> The route of translation was chosen because parsing Markdown is much more involved than first thought. The D translation largely preserves the speed benefits of MD4C.
>
>
> Usage:
>
>     // Parse CommonMark, generate HTML
>     import commonmarkd;
>     string html = convertMarkdownToHTML(markdown);
>
> Key Performance Numbers:
>     - commonmark-d compiles 3x faster than dmarkdown and 40x faster than hunt-markdown.
>     - commonmark-d parses Markdown 2x faster than dmarkdown and 15x faster than hunt-markdown (see GitHub for benchmark details)
>
> I haven't measured memory usage of either compile time or run time, but I feel like it's also better.
>
> Available now on DUB: http://code.dlang.org/packages/commonmark-d
> GitHub page: https://github.com/p0nce/commonmark-d

dmarkdown was actually extracted from vibe-d a few years ago, mostly for a piece of software called "harbored-mod", to add support for Markdown in DDoc comments, so the vibe-d Markdown module should still be in the same order of magnitude of "sub-optimal-ity".

For conversions from Markdown to HTML in a static context (i.e. not a server), I'd just use Pandoc. dmarkdown had some bugs. They may be fixed in the newest vibe-d, since the fork you compare against was basically stillborn.

October 05, 2019
On Thursday, 3 October 2019 at 08:19:12 UTC, LocoDelPueblo wrote:
>
> dmarkdown was actually extracted from vibe-d a few years
>

But it is not compatible with CommonMark syntax.