join() in CTFE very low performance

Thread overview: realhet → monkyyy → realhet → realhet → monkyyy (all posts 2 days ago)

Hello,

I have an array of arrays of strings, a 2D table encapsulated in a struct.

The first few rows look like this:

enum TBL_niceExpressionTemplates =
(表([
	[q{/+Note: Name+/},q{/+Note: Example+/},q{/+Note: Pattern+/},q{/+Note: op+/},q{/+Note: Style+/},q{/+Note: Syntax+/},q{/+Note: Class+/},q{/+Note: Scripts @init: @text @node @draw @ui+/}],
	[q{null_},q{},q{/+Code:+/},q{""},q{dim},q{Whitespace},q{NiceExpression},q{}],
	[q{magnitude},q{(magnitude(a))},q{/+Code: (op(expr))+/},q{"magnitude"},q{dim},q{Symbol},q{NiceExpression},q{@text: put(operator); op(0); @node: put('|'); op(0); put('|'); }],
	[q{normalize},q{(normalize(a))},q{/+Code: (op(expr))+/},q{"normalize"},q{dim},q{Symbol},q{NiceExpression},q{@text: put(operator); op(0); @node: put('‖'); op(0); put('‖'); }],
...
]));

I have around 50 rows total, not much for a computer.

Then I use the 'table' and try to generate an actual static immutable array of runtime structs.

I use my own makeNiceExpressionTemplate(string[] args) function to convert those table rows into the structs used at runtime.

static immutable niceExpressionTemplates = TBL_niceExpressionTemplates.rows.map!makeNiceExpressionTemplate.array;

This method is perfectly fine; it must take under a millisecond, since I can't even see it in the --ftime-trace output.
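(A minimal self-contained sketch of this fast path, where 表, NiceExpressionTemplate and makeNiceExpressionTemplate are reduced, hypothetical stand-ins for the real definitions:)

import std.algorithm : map;
import std.array : array;

struct 表 { string[][] rows; }   // thin wrapper around the cell data

struct NiceExpressionTemplate { string name, example; }

NiceExpressionTemplate makeNiceExpressionTemplate(string[] args)
{ return NiceExpressionTemplate(args[0], args[1]); }

enum TBL_niceExpressionTemplates =
(表([
	[q{null_},q{}],
	[q{magnitude},q{(magnitude(a))}],
]));

// CTFE builds the Struct[] directly; no textual round trip is involved.
static immutable niceExpressionTemplates =
	TBL_niceExpressionTemplates.rows.map!makeNiceExpressionTemplate.array;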

But when I try to put this table together using a string mixin, it becomes extremely slow:

mixin(iq{static immutable niceExpressionTemplates = [$(TBL_niceExpressionTemplates.rows.map!((r)=>(iq{makeNiceExpressionTemplate($(r.text))}.text)).join(','))]; }.text);

It took 2.6 seconds!!!

It concatenates 50+ strings like this one -> makeNiceExpressionTemplate(["null_", "", "/+Code:+/", """", "dim", "Whitespace", "NiceExpression", ""])
into a single long string, puts an array declaration 'container' around it, and then gives the whole thing to mixin().

Then the weird thing happens:
Even though makeNiceExpressionTemplate() is always called with the same kind of parameter, a string array, the compiler generates all the code for it again and again,
exactly as many times as the join() template is executed on it.

I narrowed down the code as much as possible:

static immutable very_slow_operation = TBL_niceExpressionTemplates.rows.map!text.join(',');

It is a combination of text(), join(), and formatting string arrays to text.
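(For comparison, a hand-rolled version of the same concatenation can avoid the text()/format machinery entirely. This is only a sketch; joinRowsSimple is a hypothetical helper and, unlike text(), it does no escaping of the cell contents.)

string joinRowsSimple(string[][] rows, string sep)
{
	string result;
	foreach (i, row; rows)
	{
		if (i) result ~= sep;   // separator between rows
		result ~= `["`;
		foreach (j, cell; row)
		{
			if (j) result ~= `", "`;
			result ~= cell;     // no escaping, unlike text()
		}
		result ~= `"]`;
	}
	return result;
}

static immutable hand_rolled_operation =
	joinRowsSimple(TBL_niceExpressionTemplates.rows, ",");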

My only question is: why?
What is the exact thing I should avoid, and why does join() recompile its iterations from zero every time?

Thank you in advance!

2 days ago

On Saturday, 4 January 2025 at 13:56:47 UTC, realhet wrote:

> [full quote of realhet's post above elided]

I'd check that it is a string and not some sort of lazy wrapper type doing worst-of-both-worlds things; I usually use enum string[] for mixin-y things when possible. idk what style you're doing, and the q{} may have some extra logic.
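For instance (toy data, not the thread's real table), the difference shows up in the type of the enum:

import std.algorithm : map;
import std.array : array;

enum string[][] rows = [["null_", ""], ["magnitude", "(magnitude(a))"]];

// Untyped enum: holds a lazy MapResult wrapper, not a string[].
enum maybeLazy = rows.map!(r => r[0]);
pragma(msg, typeof(maybeLazy));        // prints MapResult!(...)

// Explicitly typed enum: the pipeline is forced to a concrete string[] here.
enum string[] definitelyEager = rows.map!(r => r[0]).array;
pragma(msg, typeof(definitelyEager));  // prints string[]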

2 days ago

On Saturday, 4 January 2025 at 15:34:27 UTC, monkyyy wrote:

> I'd check that it is a string and not some sort of lazy wrapper type doing worst-of-both-worlds things; I usually use enum string[] for mixin-y things when possible. idk what style you're doing, and the q{} may have some extra logic.

That style indeed makes no sense in text mode.
Here's what it looks like in graphic mode: https://youtu.be/8brvCoMaWyQ

At the end of the video I've tried out 4 versions.

The first one is super fast (according to --ftime-trace): it puts the table in the global scope, then the mixin just injects a simple transformation expression on it, so the string[][] table is transformed into a Struct[] at compile time.

But the big difference is that the latter 3 versions put data onto the string surface of the mixin. They use the universal std text() template function to do that, and that works extremely slowly in a compile-time context.

When I look at the --ftime-trace output, I see text() and formatValue() everywhere. It feels like the CT version of text() does everything with the limited but safe tools of the CT environment, something like ctRegex. It looks like they discover their parameter signatures from zero every time. They can't remember that they have already compiled text(string[]), formatValue(string), etc.
All of those CT things (text, format, regex) are awesome, but I guess I should avoid them while using string mixins.

Maybe those 'lazy wrappers' you mentioned are inside text()?
My wrapper struct named with a Chinese character is really simple: struct S { string[][] rows; } Its first member is the table rows with the cells.

q{} has always worked perfectly. I have no fear of that.

Also no problems with the new goodies, iq{} and $(): I'm testing them like crazy and I really like them.

Only the very complex stuff works weirdly in CT -> text, format...

2 days ago

On Saturday, 4 January 2025 at 19:54:19 UTC, realhet wrote:

> Only the very complex stuff works weirdly in CT -> text, format...

I think I've found the solution; it was so simple, that's why I wasn't able to see it :D

mixin template INJECTOR_TEMPLATE(表 table, string script)
{ mixin(script); }

The proper way to pass a large amount of data through the 'membrane' between compile time and run time is a mixin template,

NOT a string mixin combined with the safest and most platform-independent version of the text() function (for arrays and structs).

With mixin template arguments, the data always stays in binary form; no slow textual form is needed.
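For example, the instantiation could look something like this (the script body is just an assumption of how it might be used, not the actual code):

import std.algorithm : map;
import std.array : array;

mixin INJECTOR_TEMPLATE!(TBL_niceExpressionTemplates, q{
	// 'table' is the 表 value parameter of the mixin template, so the
	// string[][] data never has to be rendered to text and re-parsed.
	static immutable niceExpressionTemplates =
		table.rows.map!makeNiceExpressionTemplate.array;
});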

It seems like I like to learn the hard way. But it's so difficult to stop thinking in the good old C preprocessor way. The classic foot-shooting way of thinking is still strong in me :D

2 days ago

On Saturday, 4 January 2025 at 19:54:19 UTC, realhet wrote:

> It looks like they discover their parameter signatures from zero every time.
>
> Maybe those 'lazy wrappers' you mentioned are inside text()?

While it usually would be correct to assume I'm being informal, in this case I wasn't; from functional programming, "lazy" vs. "eager" are formal terms in English; idk what they are in Chinese.

That is the definition. I suggest adding in 'memoized'; for any computation you can have the following outcomes:

lazy: may run 0 or an infinite number of times, doesn't allocate

eager: will run once, will always allocate

memo: may run 0 times or once, may allocate

There are times when CT is worse when lazy, but the std is (and should stay) lazy by default, so I'd suggest a pattern of typing your enums, enum string foo = ..., to attempt to get an eager result.
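For example (a toy sketch, not the thread's code):

import std.algorithm : map;
import std.array : join;

enum string[] names = ["a", "b", "c"];

// Untyped enum: the value can end up as a lazy MapResult wrapper, so every
// consumer walks the pipeline again.
enum lazyCalls = names.map!(n => n ~ "();");

// Typed enum: a plain string is the only acceptable value, so the map/join
// pipeline is evaluated eagerly, once, at this declaration.
enum string eagerCalls = names.map!(n => n ~ "();").join("\n");

void a() {}
void b() {}
void c() {}

void run()
{
	mixin(eagerCalls);   // expands to a(); b(); c();
}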