November 01, 2010
bearophile:
> Now the total running time is about 0.3 seconds instead of 1.1 seconds.

The program allocates 169_000 TextNode  and 245_001 TagNode. Just allocating two dynamic arrays of them, even with disabled GC, takes about 0.16 seconds of the about 0.30 of running time.

The children arrays inside TagNode receive a total of 414_000 appends, they cause reallocations.

I'll try to study the code some more.

Bye,
bearophile
November 01, 2010
On 10/31/2010 16:57, Simen kjaeraas wrote:
> For very short functions, strings are better, because of the length of the 'return' keyword. Had D instead always returned the result of the last line of a function (unless specified to be of void return type), map/filter/reduce would likely not take strings.

In other words: function literals in D are too verbose to be convenient.
 This language-level shortcoming is plastered over at the library level.


-- 
Rainer Deyke - rainerd@eldwood.com
November 01, 2010
Converting the nodes to tagged structs, and allocating the nodes with a memory pool data structure (that allocates large chunks from the C heap), seems to reduce the total running time to about 0.15 seconds. I don't know a good language to write this kind of code.

Bye,
bearophile
November 01, 2010
On Sun, 31 Oct 2010 20:24:59 -0600
Rainer Deyke <rainerd@eldwood.com> wrote:

> On 10/31/2010 16:57, Simen kjaeraas wrote:
> > For very short functions, strings are better, because of the length of the 'return' keyword. Had D instead always returned the result of the last line of a function (unless specified to be of void return type), map/filter/reduce would likely not take strings.
> 
> In other words: function literals in D are too verbose to be convenient.
>  This language-level shortcoming is plastered over at the library level.

True. But it is very hard to find a notation that is at the same time:
* compact
* general
* clear
In many language communities, there are endless discussions of this topic. One common issue is the contradiction between compact and general. Oftentimes people come up with a cooool format without realising they only consider their little egotic use case (typically one single arg, one single expr, etc...).
Also, the need for clarity is difficult to achieve for historical reasons in languages of the C & Pascal lines: namely that func defs (and type defs) were basically special cased (because then they were not first-class elements):
	string Text (...) {...}
instead of
	Text = func string (...) {...}
This makes theoretically superior formats difficult to integrate nicely with the general syntax of the language. The solution is indeed to get rid of the legacy syntax and use only literals (1), but...


Denis

(1) Once the bug that prevents using literals in func defs, using the form
	auto f = function type (params) {block}
is solved, we do not need the legacy format at all anymore.
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com

November 01, 2010
On Sun, 31 Oct 2010 22:02:10 -0400
bearophile <bearophileHUGS@lycos.com> wrote:

> The children arrays inside TagNode receive a total of 414_000 appends, they cause reallocations.

Then an optimization would be to somehow grossly predict array sizes and preallocate?


Denis
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com

November 01, 2010
spir:

> Then an optimization would be to somehow grossly predict array sizes and preallocate?

In some situations that's not handy to do, because you don't know what's a good size to preallocate. So I suggest a more general solution, to create a struct that inside keeps a dynamic array of pointers to fixed-sized chunks of structs. The size of the chunk size is chosen at compile time to be "good" (close to 64 KB, for example). A member function of this struct may return a pointer to a new struct on request, and allocate a new chunk when the last allocated chunk is full. Another member function may clear the data structure with no deallocation (just resetting the pointer to the first free struct), and another member function may really free all the chunks. This data structure may allocate its chunks from the GC heap or the C heap. This is the data structure I have used for this parser.

Bye,
bearophile
November 01, 2010
spir:

> Then an optimization would be to somehow grossly predict array sizes and preallocate?

I have answered the wrong question sorry :-) My answer was about the tree node allocations.

I have not studied enough the allocation patterns of those arrays, so I can't answer. You may collect better statistics yourself.

Bye,
bearophile
November 02, 2010
On Sun, 31 Oct 2010 19:20:04 -0400, bearophile <bearophileHUGS@lycos.com> wrote:

> Simen kjaeraas:
>
>> They take both, in fact:
>>
>> auto cubes = map!((a){ return a*a*a; })(arr);
>
> But that needs to be compile-time constant, so if you have several functions, you need to put them inside a typetuple, or duplicate the code. And some idioms are just not possible.

I don't think it needs to be a compile-time constant.

I think one of the most horrible problems with documenting D templates is it's impossible to tell what's possible without trying it.

-Steve
November 03, 2010
I'll reply to several posts at once.

The code comments you quoted weren't meant as a criticism. They are mostly notes to myself in case I want to document my learning experience with D on some personal website no one ever visits. I'll post criticisms separately if/when I have any. :)

When I was speaking about several constructors, I referred to stuff like this:

    public TagNode(String name){
        this.name = name;
        this.children = new ArrayList<Node>(1);
    }

    public TagNode(String name, Node... children){
        this.children = Arrays.asList(children);
        for (Node child : children) {
            child.parent = this;
        }
    }

It's allowed in Java, but I can see why this can be prohibited.

You're right about unit tests. One thing that surprised me is that unit tests are run when the program is run. For some reason I thought they were run immediately after the compilation with the appropriate flag.

== Quote from bearophile (bearophileHUGS@lycos.com)'s article
> If you import:
> import core.memory: GC;
> And then you disable the GC just before parsing:
> GC.disable();
> The parsing runtime on my PC becomes about one third.

Interesting. I didn't think about GC, because there aren't any objects that get unreferenced. But, of course, garbage collector can't know about that until it runs. (Well, unless there is a mechanism for marking objects as non-collectible manually.)

Tangent question. Is there a way to disable GC per application thread?

> Disabling the GC when you have to build a large data structure and re-enabling
it after the creation is a normal optimization both in CPython and D.
> You may also import:
> import std.c.stdlib: exit;
> And add this last line to the main:
> exit(0);
> to kill garbage collection at the end to remove the final collection time you
weren't timing (but was there).
> Now the total running time is about 0.3 seconds instead of 1.1 seconds.
> All this means that your code is GC-bound :-) D GC is far worse than the Java one.
> I will try to optimize other things.
> Later,
> bearophile

Thanks for such informative replies. I can't quite keep up on the weekdays, but I will experiment with the suggestions this weekend and probably post an update of some sort.
November 03, 2010
T.D.Spense:

>     public TagNode(String name){
>         this.name = name;
>         this.children = new ArrayList<Node>(1);
>     }
> 
>     public TagNode(String name, Node... children){
>         this.children = Arrays.asList(children);
>         for (Node child : children) {
>             child.parent = this;
>         }
>     }
> 
> It's allowed in Java, but I can see why this can be prohibited.


Java code, prints "12":


class Node {}

class Main {
  public void foo(String name) {
    System.out.print("1");
  }

  public void foo(String name, Node... nodes) {
    System.out.print("2");
  }

   public static void main(String[] args) {
    Main m = new Main();
    m.foo("hello");
    m.foo("hello", new Node());
  }
}


In D the T[]... syntax means zero or more T, so if you supply zero T, the compiler can't know what of the two methods you are calling, so it's statically forbidden.

This prints 22 in Java:


class Node {}

class Main {
  public void foo(String name, Node... nodes) {
    System.out.print("2");
  }

   public static void main(String[] args) {
    Main m = new Main();
    m.foo("hello");
    m.foo("hello", new Node());
  }
}

Here I prefer how D is designed, it looks tidier (often it's the opposite, the C# looks designed in a more clear way compared to D).


> You're right about unit tests. One thing that surprised me is that unit tests are run when the program is run. For some reason I thought they were run immediately after the compilation with the appropriate flag.

If you compile your code with:

dmd foo.d
And then you run the program:
foo.exe
or:
./foo

Your unit tests aren't run.

If you instead compile with this:
dmd -unittest foo.d

And then you run the program, the unittests are run and then the program is run. In D2 with a version(unittest) you may disable the running as you like.


> Interesting. I didn't think about GC, because there aren't any objects that get unreferenced. But, of course, garbage collector can't know about that until it runs.

Right, the program builds the data structure and then now and then the GC transversed it all and marks all objects as reachable. This takes a lot of time.


> (Well, unless there is a mechanism for marking objects as non-collectible manually.)

A simple way to make objects not collectable is to allocate them from the C heap (there is a kind of placement new in D).


> Tangent question. Is there a way to disable GC per application thread?

I am not expert on this, I think there is no way yet. Maybe it will be added.

Bye,
bearophile
1 2
Next ›   Last »