September 23
Your algorithm is interesting, but I don't think it is fundamentally different from the one I sketched out.

Also, it uses expensive operations like array appends, and the filter function. The bit mask route should be much faster.

September 23
On 9/23/2025 4:52 AM, Dennis wrote:
> If we remove @nogc from a function's type/mangle in a next edition (which I'm all for) it can work.

Is there a discussion on this?
September 23
On 9/23/2025 2:43 AM, Richard (Rikki) Andrew Cattermole wrote:
> Instead of writing ``@nogc``, we write ``@nogc iff(foo, bar, abc)``.
> 
> From there it's a simple dependency resolver, either during the compilation of the three functions or on usage.

It's not so simple when there is recursion in the flow graph.

> 
> This is the basis for my solution to a problem Adam has proposed for PhobosV3, which essentially is the desire to whitelist attributes. Use ``@localnogc``, and if the function's calls are all ``@nogc``, then the function is allowed to be ``@nogc``.

I am reluctant to try to fix this issue with even more attributes.


> Am I right to assume that the reason you haven't presented this solution, is because you want attributes to be finalized when bar&abc complete?

I showed the solution to illustrate the requirement for iteration and convergence to resolve cycles in the flow graph. A DFA that cannot deal with recursion is just going to leave people frustrated, as the current @nogc inference engine does.
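The iterate-until-convergence idea can be sketched as a classic dataflow fixed point. The following is a minimal illustration (in Python for brevity, with a hypothetical call-graph representation; it is not dmd's actual inference engine):

```python
def infer_not_nogc(calls, direct_gc_use):
    """Fixed-point inference: a function is not @nogc if it uses the GC
    directly, or calls any function that is not @nogc. Recursive and
    mutually recursive call cycles are handled correctly because the
    iteration only stops once a full pass over the graph changes nothing.
    calls: dict mapping each function name to the list of its callees."""
    not_nogc = set(direct_gc_use)
    changed = True
    while changed:
        changed = False
        for fn, callees in calls.items():
            if fn not in not_nogc and any(c in not_nogc for c in callees):
                not_nogc.add(fn)
                changed = True
    return not_nogc

# A mutually recursive pair where only one path allocates; a single
# top-down pass cannot resolve the even <-> odd cycle, but the fixed
# point converges with all three functions marked not @nogc.
graph = {"even": ["odd"], "odd": ["even", "alloc"], "alloc": []}
```

The mutual recursion between `even` and `odd` is exactly the cycle case that a one-shot, non-iterating analysis gets stuck on.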

September 23

On Tuesday, 23 September 2025 at 17:46:20 UTC, Walter Bright wrote:

> Your algorithm is interesting, but I don't think it is fundamentally different from the one I sketched out.

I think both algorithms compute the same thing, but your algorithm represents the graph with an adjacency matrix and does an inefficient search. The algorithm should be O(V + E), with V = vertices = number of function declarations, and E = edges = number of function calls.

That is, compile time should scale linearly with the amount of source code being compiled.

Your algorithm does an O(V^2) traversal O(V) times, running in O(V^3) time. That's a pathological worst-case scenario, but I'm not convinced the linear algorithm's constant factor is so large that a cubic algorithm becomes more practical.
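The linear-time alternative can be sketched as a worklist over the reversed call graph (again a hypothetical Python illustration, not the actual proposal): each function enters the queue at most once and each edge is examined at most once, so the total work is O(V + E), cycles included.

```python
from collections import defaultdict, deque

def propagate_gc_use(calls, direct_gc_use):
    """Worklist propagation over the reversed call graph. Starting from
    the functions that use the GC directly, mark every transitive
    caller. Each function is enqueued at most once and each edge is
    relaxed at most once, giving O(V + E) overall, cycles or not.
    calls: dict mapping each function name to the list of its callees."""
    callers = defaultdict(list)
    for fn, callees in calls.items():
        for callee in callees:
            callers[callee].append(fn)   # reverse edge: callee -> caller
    uses_gc = set(direct_gc_use)
    work = deque(direct_gc_use)
    while work:
        fn = work.popleft()
        for caller in callers[fn]:
            if caller not in uses_gc:    # guard: each fn is enqueued once
                uses_gc.add(caller)
                work.append(caller)
    return uses_gc
```

Note that the a <-> b cycle below needs no special handling; the visited-set guard is what bounds the work, not the absence of cycles.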

> Also, it uses expensive operations like array appends, and the filter function. The bit mask route should be much faster.

I wrote the example as a simple illustration, not as optimized production code. The filter function is trivially replaced with an if statement if the optimizer doesn't already do that.

On a slight tangent, I'm not a fan of how dmd's performance culture seems to be hyper-focused on local constant-time code optimizations that worsen code quality (like bit-hacking and using global variables for reusable memory buffers) when there are a number of glaring issues related to time complexity, which are way more severe than a redundant allocation here and there could ever be.

Whenever I investigate what slows down compilation time of real code, it's usually because it triggers a bad case of one of the many quadratic or even exponential algorithms in dmd's code. I'm talking:

- O(n^2) MSCoff section insertion (https://github.com/dlang/dmd/pull/16780)
- O(n^2) Checking for missing/duplicated switch case statements (TODO)
- O(2^n) Nested function sibling detection (https://github.com/dlang/dmd/pull/14886)
- O(2^n) Return type isolated from parameters check (https://github.com/dlang/dmd/pull/12885)
- O(n*m) CTFE execution because AA literals aren't memoized (https://github.com/dlang/dmd/issues/20179)
- O(2^n) Function inlining (https://github.com/dlang/dmd/issues/20315)
- O(n^2) Loop optimizations (https://github.com/dlang/dmd/issues/20535)

A fast constant factor is nice, but let's not introduce more 'compilation time landmines' into dmd's code base, and prefer linear algorithms over quadratic ones.

September 23

On Tuesday, 23 September 2025 at 17:47:23 UTC, Walter Bright wrote:

> Is there a discussion on this?

I've briefly mentioned this at DConf in the past two years, but I haven't written a proposal or seen anyone else propose it in detail.

September 24
On 24/09/2025 5:54 AM, Walter Bright wrote:
> On 9/23/2025 2:43 AM, Richard (Rikki) Andrew Cattermole wrote:
>> Instead of writing ``@nogc``, we write ``@nogc iff(foo, bar, abc)``.
>>
>> From there it's a simple dependency resolver, either during the compilation of the three functions or on usage.
> 
> It's not so simple when there is recursion in the flow graph.

For @nogc and pure, it is that simple.

Other attributes can't be solved with this solution. They are too complex, or affect the calling convention of the function.

I don't have a generalized solution.

>> This is the basis for my solution to a problem Adam has proposed for PhobosV3, which essentially is the desire to whitelist attributes. Use ``@localnogc``, and if the function's calls are all ``@nogc``, then the function is allowed to be ``@nogc``.
> 
> I am reluctant to try to fix this issue with even more attributes.

If you don't, you're limited to a single compilation.

Which may be fine for a single module, but might not be cross-module.

>> Am I right to assume that the reason you haven't presented this solution, is because you want attributes to be finalized when bar&abc complete?
> 
> I showed the solution to illustrate the requirement for iteration and convergence to resolve cycles in the flow graph. A DFA that cannot deal with recursion is just going to leave people frustrated, as the current @nogc inference engine does. 

Indeed. I'd call it an all-right example; a better example might be:

```d
void* global;

void thing(T)(scope void* u, T* v) {
	global = v;
	thing(null, u);
}
```

You can't solve this without feeding outputs back into inputs.
Therefore, no shortcuts.
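The feedback loop in that example can be modeled abstractly (a hypothetical Python sketch, not dmd code; the `param_flows` representation is invented for illustration): `v` escapes directly because it is stored into a global, and the recursive call passes `u` into the `v` slot, so `u`'s status depends on an output of the very analysis being computed.

```python
def infer_escapes(param_flows, direct_escapes, params):
    """Fixed-point escape inference: a parameter escapes if it is stored
    somewhere global directly, or flows into a parameter (possibly of a
    recursive call to the same function) that escapes. param_flows maps
    each parameter to the parameters it is forwarded into."""
    escapes = set(direct_escapes)
    changed = True
    while changed:              # iterate until outputs stop feeding inputs
        changed = False
        for p in params:
            if p not in escapes and any(q in escapes
                                        for q in param_flows.get(p, ())):
                escapes.add(p)
                changed = True
    return escapes

# Modeling the D example above: v escapes directly (global = v), and the
# recursive call thing(null, u) forwards u into the v slot.
flows = {"u": ["v"], "v": []}
```

One pass is not enough in general: the conclusion "`v` escapes" must be fed back through the recursive call edge before "`u` escapes" can be derived.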

September 23
On 9/23/2025 1:01 PM, Dennis wrote:
> On Tuesday, 23 September 2025 at 17:46:20 UTC, Walter Bright wrote:
>> Your algorithm is interesting, but I don't think it is fundamentally different from the one I sketched out.
> 
> I think both algorithms compute the same thing, but your algorithm represents the graph with an adjacency matrix and does an inefficient search. The algorithm should be O(V + E), with V = vertices = number of function declarations, and E = edges = number of function calls.
> 
> That is, compile time should scale linearly with the amount of source code being compiled.
> 
> Your algorithm does an O(V^2) traversal O(V) times, running in O(V^3) time. That's a pathological worst-case scenario, but I'm not convinced the linear algorithm's constant factor is so large that a cubic algorithm becomes more practical.

I don't think it is anywhere near that bad. It does more than O(V) only when there are cycles.


> On a slight tangent, I'm not a fan of how dmd's performance culture seems to be hyper-focused on local constant-time code optimizations that worsen code quality (like bit-hacking and using global variables for reusable memory buffers)

The reusable memory buffer thing was benchmarked by myself long ago, and was a big win. I don't know what you mean by bit-hacking.

My general approach is to make it work first, not worrying too much about algorithms. I've discovered that D makes it fairly easy to change algorithms, while C/C++ code tends to be a lot more brittle.

Recoding for performance later takes advantage of there being a test suite in place for it.

> when there are a number of glaring issues related to time complexity, which are way more severe than a redundant allocation here and there could ever be.

They both matter.


> Whenever I investigate what slows down compilation time of real code, it's usually because it triggers a bad case of one of the many quadratic or even exponential algorithms in dmd's code. I'm talking:

I've discovered many such algorithms in my C++ compiler, and fixed them. I've been less diligent with D's source, mainly because I have too much on my plate. It's great that you're looking into this!

> - O(n^2) MSCoff section insertion (https://github.com/dlang/dmd/pull/16780)
> - O(n^2) Checking for missing/duplicated switch case statements (TODO)
> - O(2^n) Nested function sibling detection (https://github.com/dlang/dmd/pull/14886)
> - O(2^n) Return type isolated from parameters check (https://github.com/dlang/dmd/pull/12885)
> - O(n*m) CTFE execution because AA literals aren't memoized (https://github.com/dlang/dmd/issues/20179)
> - O(2^n) Function inlining (https://github.com/dlang/dmd/issues/20315)
> - O(n^2) Loop optimizations (https://github.com/dlang/dmd/issues/20535)

This is all valuable stuff!


> A fast constant factor is nice, but let's not introduce more 'compilation time landmines' into dmd's code base, and prefer linear algorithms over quadratic ones.

There's mathematical time, the O(whatever) thing, and there's the detail coding practice time. When I worked on the Warp preprocessor, I spent a lot of time trying different algorithms and data structures, and benchmarking. It paid off to not be too pre-judgmental about what time it would take.

Other things matter a lot, too, like cache-friendly data structures.

I appreciate your efforts here, it is very valuable work you are doing!
September 23
On 9/23/2025 1:05 PM, Dennis wrote:
> On Tuesday, 23 September 2025 at 17:47:23 UTC, Walter Bright wrote:
>> Is there a discussion on this?
> 
> I've briefly mentioned this at DConf in the past two years, but I haven't written a proposal or seen anyone else propose it in detail.

Ok. There are other things that affect this, too, like typing of pointers to functions.
September 23

On Tuesday, 23 September 2025 at 22:10:22 UTC, Walter Bright wrote:

> It does more than O(V) only when there are cycles.

Just step 2, constructing the bit vectors, is already O(V^2). Even without cycles, you can hit the O(V^3) case if you have a deep call stack.
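The acyclic deep-chain case can be made concrete with a quick count (a hypothetical Python illustration, not dmd code): for a straight call chain f0 -> f1 -> ... -> f(n-1), per-function reachability bit vectors store 0 + 1 + ... + (n-1) = n*(n-1)/2 entries in total, quadratic in V even though the graph has only V - 1 edges and no cycles at all.

```python
def chain_bitvector_bits(n):
    """For a call chain f0 -> f1 -> ... -> f(n-1), build each function's
    set of transitively reachable callees and count the total number of
    entries stored across all sets. The result is n*(n-1)/2, quadratic
    in the number of functions, despite the acyclic graph having only
    n-1 edges."""
    reachable = {n - 1: set()}           # the last function calls nothing
    for fn in range(n - 2, -1, -1):      # walk from callees up to callers
        reachable[fn] = {fn + 1} | reachable[fn + 1]
    return sum(len(s) for s in reachable.values())
```

So merely materializing the transitive-closure vectors is already superlinear on this input; the traversal over them only compounds it.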

On Tuesday, 23 September 2025 at 22:10:22 UTC, Walter Bright wrote:

> The reusable memory buffer thing was benchmarked by myself long ago, and was a big win.

The problem is, a trick is found to be useful in one spot at one time, and 15 years later it's still blanketed everywhere without question. Meanwhile, it's 2025, the TypeScript compiler got a 3x speed up from leveraging multi-threading, and dmd is still stuck in __gshared town, population 500+. Mutable global buffers are even being used in dmd's brand new ARM disassembler, which is a) not performance critical and b) if it were, the key to make it run fast on modern hardware is not __gshared char[5 + hw.sizeof * 3 + 1 + 1].

> I don't know what you mean by bit-hacking.

I meant things like using an int flags parameter with 2 bits or'd in instead of 2 bool parameters. But it's not the best example, I'm talking about the general pattern of micro-optimizations, not specific instances of the problem.

> My general approach is to make it work first, not worrying too much about algorithms.
>
> (...)
>
> It paid off to not be too pre-judgmental about what time it would take.

Which is why it puzzles me:

There's a simple, classic algorithm that runs in linear time, which is pre-judged to be slow because there's array concatenation in it.

Then there's a complex algorithm with a nested for-loop triple, which is pre-judged to be fast because it's going to use bit vectors, and it has extra logic to prune the search space, hopefully making it not as bad.

And you think the latter fits the philosophy "make it work first"? I mean, be my guest and implement it, I just think you're making it harder for yourself if you do. And I am going to be wary and test performance when something like that lands in dmd 😉

> I appreciate your efforts here, it is very valuable work you are doing!

Thanks!

September 23

On Tuesday, 23 September 2025 at 22:11:19 UTC, Walter Bright wrote:

> Ok. There are other things that affect this, too, like typing of pointers to functions.

Which is incidentally one of the many long-standing problems with function attributes being part of function types; see for example https://github.com/dlang/DIPs/pull/198