Thread overview
Local static class fields
Aug 12, 2019
Bert
Aug 13, 2019
DanielG
Aug 14, 2019
Bert
Aug 13, 2019
Paul Backus
Aug 13, 2019
Simen Kjærås
Aug 13, 2019
Bert
Aug 13, 2019
Simen Kjærås
Aug 14, 2019
Bert
Aug 13, 2019
Simen Kjærås
August 12, 2019
Making a field static is effectively a global variable to the class.

I have a recursive class structure(think of a graph or tree) and I need to keep a global state for it, but this state actually needs to be different for each tree object. The reason for this is that structurally it will not change per tree and if it it is not static then it wastes space unnecessarily.

class X
{
    X[] x;
    static State s;
}

making s non-static has the problem that every X will allocate storage for s wasting space since for any instance chain of X's they will all have the same s. Making it static though then makes it the same for all instance chain's of X, of which it should be local to each.

One way around this is to use an dictionary that associates each state to each instance chain.

That is there is a initial X that is the parent of the chain and we can then associate a state to it's address, any other chain will have a different initial X and so we can get a new state.

class X
{
    X[] x;
    static State[X] s;
}


This is not a huge difficulty to do but does require finding the initial X to use, which is not difficult but wastes cycles for large chains. There will, of course, be a constructor that constructs a chain and creates a new state for it and updates s.


Is there any way to do this more naturally in D?

August 13, 2019
On Monday, 12 August 2019 at 22:48:43 UTC, Bert wrote:
> I have a recursive class structure(think of a graph or tree) and I need to keep a global state for it,

If I'm understanding the problem correctly, it seems like you have a choice to make: either "bloat" the child nodes by the size of a pointer/reference to the shared tree state (4 or 8 bytes), or forego that by traversing the tree (from child to root node) every time you want to access the state.

But if you don't want traversal every time, you're going to have to cache some kind of pointer (or Tree ID in the case of a central dictionary). I suppose if you got down to the nitty-gritty maybe you could use an 8- or 16-bit value for the key ID? (Disclosure: I know absolutely zero about how D lays out classes and structs in memory, this might not save any memory at all)

It would help if you could provide some rough metrics:

- How deep you expect each recursive structure to be (from root to deepest leaf node)

- How many total nodes you expect to have across all trees (such that adding a single pointer to each instance would be memory-prohibitive)

- How often each node will have to access the shared state (such that traversal-every-time would be too slow)

That might help in figuring out the best approach.
August 13, 2019
On Monday, 12 August 2019 at 22:48:43 UTC, Bert wrote:
> Making a field static is effectively a global variable to the class.
>
> I have a recursive class structure(think of a graph or tree) and I need to keep a global state for it, but this state actually needs to be different for each tree object. The reason for this is that structurally it will not change per tree and if it it is not static then it wastes space unnecessarily.
>
> [...]
>
> Is there any way to do this more naturally in D?

It seems to me like the obvious solution is to use two different classes, one to store the global state, and one to store the individual objects in your structure. For example:

class Tree {
    State state;
    Node root;
}

class Node {
    Node[] children;
}
August 13, 2019
On Monday, 12 August 2019 at 22:48:43 UTC, Bert wrote:
> Making a field static is effectively a global variable to the class.
>
> I have a recursive class structure(think of a graph or tree) and I need to keep a global state for it, but this state actually needs to be different for each tree object. The reason for this is that structurally it will not change per tree and if it it is not static then it wastes space unnecessarily.
>
> class X
> {
>     X[] x;
>     static State s;
> }
>
> making s non-static has the problem that every X will allocate storage for s wasting space since for any instance chain of X's they will all have the same s. Making it static though then makes it the same for all instance chain's of X, of which it should be local to each.
>
> One way around this is to use an dictionary that associates each state to each instance chain.
>
> That is there is a initial X that is the parent of the chain and we can then associate a state to it's address, any other chain will have a different initial X and so we can get a new state.
>
> class X
> {
>     X[] x;
>     static State[X] s;
> }
>
>
> This is not a huge difficulty to do but does require finding the initial X to use, which is not difficult but wastes cycles for large chains. There will, of course, be a constructor that constructs a chain and creates a new state for it and updates s.
>
>
> Is there any way to do this more naturally in D?

First: Are you low on memory? Is traversing the tree actually slow? Since you don't know which of these issues to prioritize, in all likelihood the answer to both questions is no, and you should use the solution that's easier to read.

Of course, that's not the answer you want to hear. Let's consider the alternatives:

1) Embed it in each node. Costs 1 pointer's worth of memory per node. Very fast. Very easy to understand.

2) Use a dictionary with entries for each node. Costs more than 1 pointer's worth of memory per node. Fast, but not quite as fast as 1). 1) is almost definitely a better option.

3) Use a dictionary with entries for each root node. Costs more than 1 pointer's worth of memory, but per tree instead of per node. Requires traversal of the tree to find the root, so slower than both 1) and 2), especially for deep hierarchies. Complicated code.

4) Use a derived class for the root node (class RootX : X { State s }). Still requires traversing the tree to find the root, likely via a virtual method. Also requires a bit more work setting up the tree. Might be faster than 3), definitely smallest memory footprint. Relatively easy to follow.

There may be other options, but from the above I'd go with either 1) or 4).

Now, I've noticed that X doesn't have a pointer to its parent, so traversing the tree to find the root is currently impossible. If this is true, and you have no other use for traversing up the tree, then 1) is the way to go, as 2) is almost never a better option, and 3) and 4) requires adding a parent pointer so you'll pay the memory price anyway.

--
  Simen
August 13, 2019
On Tuesday, 13 August 2019 at 04:43:29 UTC, Paul Backus wrote:
> On Monday, 12 August 2019 at 22:48:43 UTC, Bert wrote:
>> Making a field static is effectively a global variable to the class.
>>
>> I have a recursive class structure(think of a graph or tree) and I need to keep a global state for it, but this state actually needs to be different for each tree object. The reason for this is that structurally it will not change per tree and if it it is not static then it wastes space unnecessarily.
>>
>> [...]
>>
>> Is there any way to do this more naturally in D?
>
> It seems to me like the obvious solution is to use two different classes, one to store the global state, and one to store the individual objects in your structure. For example:
>
> class Tree {
>     State state;
>     Node root;
> }
>
> class Node {
>     Node[] children;
> }

So I have a Node. How do I find the Tree it belongs to?

--
  Simen
August 13, 2019
On Tuesday, 13 August 2019 at 04:43:29 UTC, Paul Backus wrote:
> On Monday, 12 August 2019 at 22:48:43 UTC, Bert wrote:
>> Making a field static is effectively a global variable to the class.
>>
>> I have a recursive class structure(think of a graph or tree) and I need to keep a global state for it, but this state actually needs to be different for each tree object. The reason for this is that structurally it will not change per tree and if it it is not static then it wastes space unnecessarily.
>>
>> [...]
>>
>> Is there any way to do this more naturally in D?
>
> It seems to me like the obvious solution is to use two different classes, one to store the global state, and one to store the individual objects in your structure. For example:
>
> class Tree {
>     State state;
>     Node root;
> }
>
> class Node {
>     Node[] children;
> }

Yes, I forgot to mention this as a solution.

I was going to go in to more detail. This is not a great solution because it requires a base class simply to hold the "global" state and it requires nodes to have an accessor(they won't know which tree they come from).

What I was thinking though, with D's capabilities, if a constructor could somehow return a "Voldemort" type that makes it all work out.

e.g.,

class Tree {
    State state;
    Node root;
    alias root this;
}

then the constructor just returns tree which acts as an node.

It doesn't have to be a voldemort but it at least hides some of the details...

Ultimately it is not too different from using a dictionary though, effectively does the same thing.

The main issue with your solution is that Nodes themselves do not have access to the state directly and that is a similar issue with the dictionary. One has to traverse up to the root and then get the state, in your case, Tree would have to inherit from Node so it could be cast and be the root(else the Node would have to reference the tree).

There may be no way around such a problem in that these might be "optimal". Your's, which added parent in node, might be better since a dictionary doesn't have to be directly maintained.

I'd like to get away from actually having a different "root" type though. Mainly because it reduces uniformity and complicates the design I already have. If I could hide all these things and it is not too complicated then it would work better for me.



August 13, 2019
On Tuesday, 13 August 2019 at 08:41:02 UTC, Bert wrote:
> On Tuesday, 13 August 2019 at 04:43:29 UTC, Paul Backus wrote:
>> It seems to me like the obvious solution is to use two different classes, one to store the global state, and one to store the individual objects in your structure. For example:
>>
>> class Tree {
>>     State state;
>>     Node root;
>> }
>>
>> class Node {
>>     Node[] children;
>> }
>
> Yes, I forgot to mention this as a solution.
>
> I was going to go in to more detail. This is not a great solution because it requires a base class simply to hold the "global" state and it requires nodes to have an accessor(they won't know which tree they come from).

What you're essentially looking at here is my solution 4):

class Node {
    Node parent;
    Node[] children;
    @property State state() {
        return parent.state;
    }
}

class Root : Node {
    State treeState;
    override @property State state() {
        return treeState;
    }
}

class State {}



> What I was thinking though, with D's capabilities, if a constructor could somehow return a "Voldemort" type that makes it all work out.
>
> e.g.,
>
> class Tree {
>     State state;
>     Node root;
>     alias root this;
> }
>
> then the constructor just returns tree which acts as an node.
>
> It doesn't have to be a voldemort but it at least hides some of the details...
>
> Ultimately it is not too different from using a dictionary though, effectively does the same thing.

Something like this may be possible, but it will still need to keep around a pointer to the context (likely the root), costing you the exact same as having a pointer to either the root or the state in each node:

class Root {
    State state;
    this() { state = new State(); }
    auto newNode() {
        class Node {
            State getState() { return state; }
        }
        return new Node();
    }
}

class State { }

unittest {
    auto root1 = new Root();
    auto node = root1.newNode();

    assert(node.getState() == root1.state);
    // Note that Node keeps track of its context, thus taking up more space:
    assert(__traits(classInstanceSize, typeof(node)) > __traits(classInstanceSize, State));
}


> There may be no way around such a problem in that these might be "optimal". Your's, which added parent in node, might be better since a dictionary doesn't have to be directly maintained.

Also, if space is a concern, the dictionary is actually worse than adding a pointer to the state in each node, as the internal book-keeping of the dictionary is more than one pointer's worth per entry (at the very least it needs to keep both the hash of the Node and the pointer to the state).


> I'd like to get away from actually having a different "root" type though. Mainly because it reduces uniformity and complicates the design I already have. If I could hide all these things and it is not too complicated then it would work better for me.

The solution of having a Root class that derives from Node causes complications in three situations: When you create a new tree, when you move a subtree to become a new tree, and when you move a tree to become a subtree. The last two may or may not actually occur in your code, but without knowing more, I have no idea if that's the case. I'm fairly certain you will at some point be creating new trees, though. If the creation of new trees is confined to one or two functions while the creation of new nodes is widespread, this special-casing of roots should be easily manageable.

However, since you specifically mention in this post that Paul 'added parent in node', you should definitely go with a pointer to the state in the Node, as per my earlier post. It's faster, you pay the memory cost anyway, and it's easy for maintainers (that's you in 6 months cursing your young-and-careless self) to understand what's going on.

Actually, there's one complication with having a pointer to the state in each node that we haven't mentioned: is the state often replaced wholesale? That is, not just its contents changed, but can the state of a tree suddenly point to a wholly different instance of the same class? If each node holds a pointer to it, then this operation is costly. However, this is generally easy to fix/avoid by a) favor mutation over reassignment, or b) another layer of indirection.

--
  Simen
August 14, 2019
On Tuesday, 13 August 2019 at 12:22:45 UTC, Simen Kjærås wrote:
> On Tuesday, 13 August 2019 at 08:41:02 UTC, Bert wrote:
>> On Tuesday, 13 August 2019 at 04:43:29 UTC, Paul Backus wrote:
>>> It seems to me like the obvious solution is to use two different classes, one to store the global state, and one to store the individual objects in your structure. For example:
>>>
>>> class Tree {
>>>     State state;
>>>     Node root;
>>> }
>>>
>>> class Node {
>>>     Node[] children;
>>> }
>>
>> Yes, I forgot to mention this as a solution.
>>
>> I was going to go in to more detail. This is not a great solution because it requires a base class simply to hold the "global" state and it requires nodes to have an accessor(they won't know which tree they come from).
>
> What you're essentially looking at here is my solution 4):
>
> class Node {
>     Node parent;
>     Node[] children;
>     @property State state() {
>         return parent.state;
>     }
> }
>
> class Root : Node {
>     State treeState;
>     override @property State state() {
>         return treeState;
>     }
> }
>
> class State {}
>
>
>
>> What I was thinking though, with D's capabilities, if a constructor could somehow return a "Voldemort" type that makes it all work out.
>>
>> e.g.,
>>
>> class Tree {
>>     State state;
>>     Node root;
>>     alias root this;
>> }
>>
>> then the constructor just returns tree which acts as an node.
>>
>> It doesn't have to be a voldemort but it at least hides some of the details...
>>
>> Ultimately it is not too different from using a dictionary though, effectively does the same thing.
>
> Something like this may be possible, but it will still need to keep around a pointer to the context (likely the root), costing you the exact same as having a pointer to either the root or the state in each node:
>
> class Root {
>     State state;
>     this() { state = new State(); }
>     auto newNode() {
>         class Node {
>             State getState() { return state; }
>         }
>         return new Node();
>     }
> }
>
> class State { }
>
> unittest {
>     auto root1 = new Root();
>     auto node = root1.newNode();
>
>     assert(node.getState() == root1.state);
>     // Note that Node keeps track of its context, thus taking up more space:
>     assert(__traits(classInstanceSize, typeof(node)) > __traits(classInstanceSize, State));
> }
>
>
>> There may be no way around such a problem in that these might be "optimal". Your's, which added parent in node, might be better since a dictionary doesn't have to be directly maintained.
>
> Also, if space is a concern, the dictionary is actually worse than adding a pointer to the state in each node, as the internal book-keeping of the dictionary is more than one pointer's worth per entry (at the very least it needs to keep both the hash of the Node and the pointer to the state).
>
>
>> I'd like to get away from actually having a different "root" type though. Mainly because it reduces uniformity and complicates the design I already have. If I could hide all these things and it is not too complicated then it would work better for me.
>
> The solution of having a Root class that derives from Node causes complications in three situations: When you create a new tree, when you move a subtree to become a new tree, and when you move a tree to become a subtree. The last two may or may not actually occur in your code, but without knowing more, I have no idea if that's the case. I'm fairly certain you will at some point be creating new trees, though. If the creation of new trees is confined to one or two functions while the creation of new nodes is widespread, this special-casing of roots should be easily manageable.
>
> However, since you specifically mention in this post that Paul 'added parent in node', you should definitely go with a pointer to the state in the Node, as per my earlier post. It's faster, you pay the memory cost anyway, and it's easy for maintainers (that's you in 6 months cursing your young-and-careless self) to understand what's going on.
>
> Actually, there's one complication with having a pointer to the state in each node that we haven't mentioned: is the state often replaced wholesale? That is, not just its contents changed, but can the state of a tree suddenly point to a wholly different instance of the same class? If each node holds a pointer to it, then this operation is costly. However, this is generally easy to fix/avoid by a) favor mutation over reassignment, or b) another layer of indirection.
>

Generally no, I think that your solution is not great. There is no reason to contain a pointer in each node. I mentioned in the other reply that there might be better alternative, a hybrid.

Basically


class StateNode : Node

and then use a limited traversal. Replace every kth node with a StateNode and simply traverse up to it. One is guaranteed to find one within kN steps and on average half that. It also then reduces memory requires by kN.

This seems to be the best all around approach. It still requires some type of construction but this might be done behind the scenes by silently creating and removing StateNodes as necessary.

With only the root as a StateNode, i.e., your solution, the path length is the total depth. With each node a StateMode(other extreme) then the memory waste is the entire tree size * ptr.sizeof.

In this hybrid approach, one can limit both to reasonable levels. Each tree could be optimized to reduce overall cost. It should significantly reduce memory in a typical tree. One might not need to traverse to the root, but can traverse to a leaf or a neighbor, but I'm not sure if this would be worth it. This could, on average reduce the traversal cost even further.

In my use case the tree is fixed, but in others it may not be. I'm trying to find a good solution. This seems to be the best all around. I haven't thought about it too much in the general case(for tree operations) but with proper balancing it should be nearly ideal.

I'll probably have to do some testing to figure out what would be the proper way to distribute these "StateNodes" in the tree would be and what are the ramifications on tree operations. I imagine if they just point to the root and then have the root contain the state, it might remove some of the problems, but when merging tree's and such these notes would still have to be updated and all that.

What do you think? Seems good?






August 14, 2019
On Tuesday, 13 August 2019 at 00:17:13 UTC, DanielG wrote:
> On Monday, 12 August 2019 at 22:48:43 UTC, Bert wrote:
>> I have a recursive class structure(think of a graph or tree) and I need to keep a global state for it,
>
> If I'm understanding the problem correctly, it seems like you have a choice to make: either "bloat" the child nodes by the size of a pointer/reference to the shared tree state (4 or 8 bytes), or forego that by traversing the tree (from child to root node) every time you want to access the state.
>

Yes. It's the typical size/speed issue, if, for example, one could somehow use pointer manipulation.

If, say, all the nodes could be allocated using special pointer where the first node could be calculated then it would solve both problems optimally(e.g., ptr & 0xFFFF0000/class_size). Of course this is someone rediculous. In my use case the tree is fixed but it might not always be.

I was just curious if there was some trick or if's impossible to really make it happen.


> But if you don't want traversal every time, you're going to have to cache some kind of pointer (or Tree ID in the case of a central dictionary). I suppose if you got down to the nitty-gritty maybe you could use an 8- or 16-bit value for the key ID? (Disclosure: I know absolutely zero about how D lays out classes and structs in memory, this might not save any memory at all)
>

I guess I could wrap some nodes that contain a pointer to the data equally spaced throughout the hierarchy and that will reduce overhead.

E.g., The kNth node has a pointer to the state.

This reduces the wasted memory by kN and limits the traversal depth to a max of kN.
This might be the easiest method that provides the best of both words.

Just inherit from Node and cast to check if it contains the info.


> It would help if you could provide some rough metrics:
>
> - How deep you expect each recursive structure to be (from root to deepest leaf node)
>
> - How many total nodes you expect to have across all trees (such that adding a single pointer to each instance would be memory-prohibitive)
>
> - How often each node will have to access the shared state (such that traversal-every-time would be too slow)
>
> That might help in figuring out the best approach.

These are all unknown, it is a general solution. Obviously since these are tree's the depth isn't going to be astronomical but they could be quite deep since arrays are technically tree's.