September 16, 2019
Anyone have a better way to accomplish the following task:

An algorithm computes various values for a graph structure. The graph contains many loops and nested loops. Each node has a chain associated with it, which is how it is connected to the rest of the graph.

The problem is to save memory because many nodes will have overlapping chains, there is no reason to store them all separately as it will waste several orders of magnitude of space.

The idea is to create arrays of arrays and to reuse parts of the arrays that will be common and to slice parts of the arrays and append them when necessary. Each Chain acts like a normal array in that it can be indexed as if it were a single array, but, in fact, it is a sequence of arrays.

Two chains may have parts that refer to the same data and so "connect" but there will never be more than 1 part in memory, so no wasted space.

It's starting to get a little more complicated than just wasting memory(which will limit the size of the input space and usually the input space will be quite large).

normally one would have something like

A - [ArrayA]
B - [ArrayB]
C - [ArrayC]

where the arrays may have sub-arrays in common. With a chain we have

A - [ChainA] = [Array1, ArrayQ, ArrayN]
B - [ChainB] = [Array2, ArrayQ, Array3]
C - [ChainC] = [Array1, ArrayN, Array2]

And the chain's can refer to common data to reduce the overlap from N to 1. (any duplicate arrays in the chain will be singletons. E.g., ArrayQ used in ChainA and ChainB are references to the same array. In the first case they will be duplicated. In my algorithms I always know when a array will be duplicated but the order of creation is chaotic which makes it more difficult than I'd like(and I'm not sure how correct the code is)



The problem is not so much the structure as it is dealing with corner cases in overlap detection.

Ideally all this would be handled under the hood and the user would not know any difference and would just see a chain as an array. The problem with that approach, as best I can tell, is that it would be a big performance hit to have to find duplicates because it would require searching the arrays. Currently I deal with the overlapping of the chains outside the structure tailored to the algorithm and it seems to work ok but is not very general.

One can think of this as sort of linearized a graph per node. Each node in the graph has a potential path to any other node. One simply lists all paths(it can even deal with loops by referencing the chain itself(which requires mimicking an array or to have some type of recursive structure).. but in doing this there are lots of paths with lots of redundancy that needs to be dealt with.


struct sChain(T)
{
	T[][] Parts;

	// Indexes parts like one large array
	auto opIndex(int i)
	{
		int c = 0;
		for(int k = 0; k < Parts.length; k++)
		{
			if (c <= i && i < c + Parts[k].length)
			{
				return Parts[k][i - c];
			
			}
			c += Parts[k].length;
		}
		assert(0, "Element not found");
	}

	auto FindIndex(alias D)(T v)
	{
		for(int i = 0; i < Length; i++)
		{
			if (D(v, this[i])) return i;
		}
		return -1;
	}

	auto FindPartOnIndex(int i)
	{
		int c = 0;
		for(int k = 0; k < Parts.length; k++)
		{
			if (c <= i && i < c + Parts[k].length)
			{
				return k;
			}
		}
		return -1;
	}

	auto Length(int m = -1)
	{
		if (m == -1) m = Parts.length;
		int c = 0;
		for(int k = 0; k < m; k++)
		{
			c += Parts[k].length;
		}
		return c;
	}

	auto NumParts()
	{
		return Parts.length;
	}
	auto AddPart(T[] p = null)
	{
		if (p)
			Parts ~= p;
		else
			Parts ~= new T[0];
	}

	auto opCast(T : bool)()
	{
		return !!Parts;
	}

	// Appends element to last part
	auto opOpAssign(string op : "~")(T r)
	{
		if (!Parts) Parts ~= new T[0];
		Parts[$-1] ~= r;	
	}

	// Appends part to parts
	auto opOpAssign(string op : "~")(T[] r)
	{
		//if (!Parts) Parts ~= new T[0];
		Parts ~= r;
	}

	// Appends array parts to parts
	auto opOpAssign(string op : "~")(typeof(this) r)
	{
		for(int k = 0; k < r.Parts.length; k++)
		{
			Parts ~= r.Parts[k];
		}
		
	}

}