Can Metaprogramming Help Here?
February 23, 2021
Hi all,

I'm porting some C++ code that has a messy section implementing prime-number type IDs. I've had to smother it with test cases to make it reliable. I think the metaprogramming that D provides is the better solution; ideally I'd rather not reimplement that C++ mess.

A simplified example,

enum token_type {
  endOfFile = 2,

  unknown = 3,
  newline = 5,
  identifier = 7,
    userDefined = 13 * identifier,
    // Keyword
    var = 17 * identifier,
    uses = 19 * identifier,
    constructor = 23 * identifier,
    do_ = 29 * identifier,
    end_ = 31 * identifier,

  operator = 11,
    copyAssignment = 13 * operator

  // LAST ID = 13
}

It's effectively a tree, with the first child node being one more than the last child number of the parent.

Is it possible to produce this via metaprogramming? I'm thinking of a mixin. Do I need to write a function that returns D code that will produce the enum structure above? I'm unsure how I would get the LAST ID of the siblings at each level. Would I describe this with some kind of tree of function calls?

e.g. Pseudo-code

enum token_type = prime_ids(
  value("endOfFile"),
  value("unknown"),
  value("newline"),
  branch("identifier",
    value("userDefined"),
    value("var"),
    value("uses"),
    value("constructor"),
    value("do_"),
    value("end_")
  ),
  ...
);

What would be the direction I need to go in to achieve this?

Kind regards,
Mike
February 23, 2021
On Tue, Feb 23, 2021 at 10:24:50PM +0000, Mike Brown via Digitalmars-d-learn wrote:
> Hi all,
> 
> I'm porting some C++ code that has a messy section implementing prime-number type IDs. I've had to smother it with test cases to make it reliable. I think the metaprogramming that D provides is the better solution; ideally I'd rather not reimplement that C++ mess.

Try something like this:

-----------------------------snip-----------------------------
import std;

int[] firstNPrimes(int n) {
	// FIXME: replace this with actual primes computation
	return [ 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31 ];
}

string genEnum(string enumName, idents...)() {
	string code = "enum " ~ enumName ~ " {";
	auto primes = firstNPrimes(idents.length);
	foreach (i, ident; idents) {
		code ~= ident ~ " = " ~ primes[i].to!string ~ ", ";
	}
	code ~= "}";
	return code;
}

template PrimeEnum(idents...) {
	mixin(genEnum!("PrimeEnum", idents));
}

alias MyEnum = PrimeEnum!(
	"unknown", "newline", "identifier", "var", "user_defined",
);

void main() {
	writefln("%(%d\n%)", [
		MyEnum.unknown,
		MyEnum.newline,
		MyEnum.identifier,
		MyEnum.var,
		MyEnum.user_defined
	]);
}
-----------------------------snip-----------------------------


You can substitute the body of firstNPrimes with any standard prime-generation algorithm. As long as it's not too heavyweight, you should be able to get it to compile without the compiler soaking up unreasonable amounts of memory. :-D  If you find the compiler using up too much memory, try precomputing the list of primes beforehand and pasting it into firstNPrimes (so that the CTFE engine doesn't have to recompute it every time you compile).

Note that PrimeEnum can be used to generate any number of enums you wish to have prime values.  Or if you replace the call to firstNPrimes with something else, you can generate enums whose identifiers map to any integer sequence of your choosing.
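
As a rough sketch of what that substitution might look like, here is a trial-division version of firstNPrimes. The parameter type is changed to size_t here (an assumption, so that it matches idents.length), and nothing about it is tuned for large n; it is only meant to show that ordinary D code like this can run in CTFE.

```d
// Sketch only: trial division against the primes found so far.
// Takes size_t (an assumption) so it can be called with idents.length.
int[] firstNPrimes(size_t n) {
	int[] primes;
	for (int candidate = 2; primes.length < n; candidate++) {
		bool isPrime = true;
		foreach (p; primes) {
			if (p * p > candidate) break;       // no divisor can exist beyond sqrt
			if (candidate % p == 0) { isPrime = false; break; }
		}
		if (isPrime) primes ~= candidate;
	}
	return primes;
}

// A static assert forces compile-time evaluation, which also confirms
// that the function is CTFE-compatible.
static assert(firstNPrimes(5) == [2, 3, 5, 7, 11]);
```

If compile times become a problem, the precomputed list suggested above remains the simpler option.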


T

-- 
Bare foot: (n.) A device for locating thumb tacks on the floor.
February 24, 2021
On Tuesday, 23 February 2021 at 22:55:53 UTC, H. S. Teoh wrote:
> [...]
> Try something like this:
>
> [...]

Hi T,

Thank you for the reply. I'm struggling to extend this to get the nesting working.

I'm trying something like:

string entry(string i, string[] inherit = []) {
	return i;
}

alias token_type2 = PrimeEnum!(
	entry("unknown"),
	entry("newline"),
	entry("identifier"),
	entry("var", ["identifier"]),
	entry("userDefined", ["identifier"])
);

It's worth noting that multiple inherited bases are needed too.

But I can't get those function contexts to link up. Can I pass a function pointer as lazy into the PrimeEnum!() template?

Would it be easier to just parse the text at once into a single templating function?

Kind regards,
Mike Brown
February 26, 2021
On Wed, Feb 24, 2021 at 08:10:30PM +0000, Mike Brown via Digitalmars-d-learn wrote: [...]
> Thank you for the reply. I'm struggling to extend this to get the nesting working.
> 
> I'm trying something like:
> 
> string entry(string i, string[] inherit = []) {
> 	return i;
> }
> 
> alias token_type2 = PrimeEnum!(
> 	entry("unknown"),
> 	entry("newline"),
> 	entry("identifier"),
> 	entry("var", ["identifier"]),
> 	entry("userDefined", ["identifier"])
> );
> 
> It's worth noting that multiple inherited bases are needed too.
> 
> But I can't get those function contexts to link up. Can I pass a function pointer as lazy into the PrimeEnum!() template?
> 
> Would it be easier to just parse the text at once into a single templating function?
[...]

Ah, so sorry, I completely overlooked the nesting part.  PrimeEnum as I defined it in my first reply does not handle this at all, so it will need to be extended.

Since we're dealing with a tree structure here, I think the best way is to express the tree structure explicitly in a compile-time data structure. Thanks to CTFE, this works pretty much exactly the same as normal runtime data structures; the only difference is that they will be processed at compile-time.

Here's a rough sketch of how I'd do it:

	class Entry {
		string ident;
		Entry[] subentries;
		this(string _id, Entry[] _subs = []) {
			ident = _id;
			subentries = _subs;
		}
	}

	Entry[] makeIdentTrees() {
		return [
			new Entry("endOfFile"),
			new Entry("unknown"),
			new Entry("newline"),
			new Entry("identifier", [
				new Entry("userDefined"),
				new Entry("var"),
				new Entry("uses"),
				// ... you get the idea
			]),
			new Entry("operator", [
				new Entry("copyAssignment"),
				// ... etc.
			]),
		];
	}

You'd then write a recursive function that traverses this tree, using a compile-time array of prime numbers, and compute the enum values that way.  Format that into D code as a string, and use mixin to actually create the enum.  The function can be written just like any runtime D code, as long as it does not use any CTFE-incompatible language features.

	string genEnum(Entry[] entries) {
		string code;
		... // traverse tree and generate D code here
		return code;
	}

	// Create the enum
	mixin(genEnum(makeIdentTrees()));
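
To make the traversal step concrete, here is one way it might be filled in. This is a sketch under a few assumptions: the tree is shortened, the enum name is hard-coded, genMembers is a hypothetical helper, and a node's children start at the prime right after the node's own prime.

```d
class Entry {
	string ident;
	Entry[] subentries;
	this(string _id, Entry[] _subs = []) {
		ident = _id;
		subentries = _subs;
	}
}

// Shortened tree, for illustration only.
Entry[] makeIdentTrees() {
	return [
		new Entry("endOfFile"),
		new Entry("identifier", [
			new Entry("userDefined"),
			new Entry("var"),
		]),
		new Entry("operator", [
			new Entry("copyAssignment"),
		]),
	];
}

// Primes kept as strings so no number formatting is needed in CTFE.
static immutable string[] primes = ["2", "3", "5", "7", "11", "13", "17", "19"];

// Hypothetical helper: emits "name = product," for each entry, depth-first.
// `prefix` is the product of the ancestors' primes ("" at the top level);
// `start` is the index of the first prime this level may use.
string genMembers(Entry[] entries, string prefix, size_t start) {
	string code;
	foreach (i, e; entries) {
		string id = prefix.length ? prefix ~ "*" ~ primes[start + i]
		                          : primes[start + i];
		code ~= e.ident ~ " = " ~ id ~ ",\n";
		// Assumption: children start at the prime after this entry's own.
		code ~= genMembers(e.subentries, id, start + i + 1);
	}
	return code;
}

string genEnum(Entry[] entries) {
	return "enum token_type {\n" ~ genMembers(entries, "", 0) ~ "}";
}

// Create the enum
mixin(genEnum(makeIdentTrees()));

static assert(token_type.identifier == 3);
static assert(token_type.userDefined == 3 * 5);
static assert(token_type.copyAssignment == 5 * 7);
```

The class instances exist only during compile-time evaluation; only the resulting string reaches the mixin.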


T

-- 
A mathematician learns more and more about less and less, until he knows everything about nothing; whereas a philosopher learns less and less about more and more, until he knows nothing about everything.
February 26, 2021
On Fri, Feb 26, 2021 at 11:37:18AM -0800, H. S. Teoh via Digitalmars-d-learn wrote:
> On Wed, Feb 24, 2021 at 08:10:30PM +0000, Mike Brown via Digitalmars-d-learn wrote: [...]
> > Thank you for the reply. Im struggling extending this to get the nesting working.
[...]

Alright, here's an actual working example. Instead of using classes, I decided to use templates instead, but the underlying concept is the same:

-------------------------------------snip--------------------------------------
template branch(string _ident, _values...) {
	enum ident = _ident;
	alias values = _values;
}

// I used strings for easier concatenation to code, otherwise we have to use
// std.conv to convert it which is slow in CTFE.
static immutable string[] primes = [
	"2", "3", "5", "7", "11", "13", "17", "19", "23", "29", "31", "37",
	"41", // fill in more if you need to
];

string genPrimeId(size_t[] indices)
	in (indices.length > 0)
{
	string result = primes[indices[0]];
	foreach (i; indices[1 .. $]) {
		result ~= "*" ~ primes[i];
	}
	return result;
}

template primeIdsImpl(size_t[] indices, Args...)
	if (indices.length > 0 && Args.length > 0)
{
	static if (Args.length == 1) {
		static if (is(typeof(Args[0]) == string)) {
			enum primeIdsImpl = Args[0] ~ "=" ~ genPrimeId(indices) ~ ",\n";
		} else {
			enum primeIdsImpl = Args[0].ident ~ "=" ~ genPrimeId(indices) ~ ",\n" ~
				primeIdsImpl!(indices ~ [ indices[$-1] + 1 ],
					Args[0].values);
		}
	} else {
		enum primeIdsImpl = primeIdsImpl!(indices, Args[0]) ~
			primeIdsImpl!(indices[0 .. $-1] ~ [ indices[$-1] + 1 ],
					Args[1 .. $]);
	}
}

template primeIds(string enumName, Args...) if (Args.length > 0) {
	enum primeIds = "enum " ~ enumName ~ " {\n" ~
		primeIdsImpl!([0], Args) ~
		"}";
}

mixin(primeIds!("token_type",
	"endOfFile",
	"unknown",
	"newline",
	branch!("identifier",
		"userDefined",
		"var",
		"uses",
		"constructor",
		"do_",
		"end_",
	),
	branch!("operator",
		"copyAssignment",
	),
));

void main() {
	import std;
	writefln("%s", token_type.identifier);
	writefln("%d", token_type.identifier);
}
-------------------------------------snip--------------------------------------


You can change the mixin line to `pragma(msg, ...)` instead to see the generated code string.
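
A minimal illustration of that switch, using a stand-in code string (the `Color` enum here is made up for the example; in the code above the string would come from `primeIds!(...)`):

```d
// Stand-in for the generated code string.
enum code = "enum Color {\nred = 2,\ngreen = 3,\n}";

// Prints the string while compiling; declares nothing.
pragma(msg, code);

// Declares the enum from the very same string.
mixin(code);

static assert(Color.green == 3);
```

Keeping the string in a named enum like this makes it easy to flip between inspecting it and mixing it in.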

I noticed that the definitions of the first nested identifiers are different from your original post; I don't know if this is a misunderstanding on my side or an oversight on your part?  After identifier=7, the next prime should be 11, not 13, so userDefined should start with 11*identifier rather than 13*identifier.
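
One thing that may be worth checking when deciding between the two, assuming the IDs are meant to be tested by divisibility (the thread does not actually say so): with children starting at 11, userDefined becomes 7*11 = 77, which is also a multiple of operator = 11. The sketch below writes out a subset of the values the code above generates by hand; `isA` is a hypothetical helper, not something from the original C++.

```d
// Subset of the values the generator above produces, written out by hand.
enum token_type {
	endOfFile = 2,
	unknown = 3,
	newline = 5,
	identifier = 7,
	userDefined = 7 * 11,
	var = 7 * 13,
	operator = 11,
	copyAssignment = 11 * 13,
}

// Assumed usage (not stated in the thread): divisibility as an "is-a" test.
bool isA(token_type t, token_type base) {
	return t % base == 0;
}

static assert(isA(token_type.var, token_type.identifier));
static assert(!isA(token_type.var, token_type.operator));
// 77 = 7 * 11, so userDefined also passes the test against operator.
static assert(isA(token_type.userDefined, token_type.operator));
```

If that is how the IDs are used, starting the children at 13 (after every top-level prime) would avoid the overlap, which might explain the numbering in the original post.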


T

-- 
Shin: (n.) A device for finding furniture in the dark.
February 26, 2021
On Friday, 26 February 2021 at 20:42:50 UTC, H. S. Teoh wrote:
> On Fri, Feb 26, 2021 at 11:37:18AM -0800, H. S. Teoh via Digitalmars-d-learn wrote:
>> > [...]
> [...]
>
> Alright, here's an actual working example. Instead of using classes, I decided to use templates instead, but the underlying concept is the same:
>
> [...]

Hi T,

Thank you so much for that, I appreciate the time and hard work you've put in for this! I'm sure the code examples above will be workable for my needs.

Thanks again
Mike