On Friday, 23 May 2025 at 22:11:58 UTC, Ali Çehreli wrote:
> On 5/23/25 6:10 AM, Dennis wrote:
>> This results in a complex system that's more annoying to deal with than the original problem of just maintaining a .d file and shader file in parallel, which is what I'm doing now for the time being.
>
> Approved! Sounds like engineering to me. :)
I totally accept your worries, and I have thought about this critical dilemma a lot in the past too.
This is the very first time I'm going to use an external 'hack' to extend the capabilities of LDC2.exe to suit my needs.
Here is why:
Maintaining two sets of code is too much for my brain. I notice the following bad pattern from time to time:
- I have an idea,
- I try to implement it quickly,
- then I end up with a bug (caused by my own human error),
- I fight with it for hours/days, and my client can't understand why it took so long. Both of us get frustrated. The bad outcome is that I'd rather not touch that code at all. Innovation stops in order to avoid frustration.
So I really need the help of the machine in these tasks that require 100% focus.
Then I, as a human, can experiment freely without worrying that I'll cause nasty hidden bugs.
If you change the internal behavior of the compiler in a way that breaks my 'hack', I will notice, because I'm sending a hashOf() as well. But so far so good. :D
Every year I spend a few weeks (maybe a month) catching up with the new features of D and using them in my framework. I'm happy to see the progress, and I don't care if it sometimes breaks my stuff: I can detect it and fix it. My recent favourite feature is $() without a doubt!
Adding to the difficulty of this redundant CPU/GPU code management: I'm transitioning from OpenGL to Vulkan. Vulkan checks nothing, so from now on I have to check everything. Alone I sure can't do this, I know my limits. But I'm sure that with D's metaprogramming I can. Vulkan doesn't even care about the human-readable identifiers inside a 'shader binary': no sampler names, no 'entity framework' at all, just integer indices. (And I think Vulkan is a piece of art. It doesn't try to make the user's job easier, its only aim is to give full control of the underlying hardware.)
I also can't avoid this difficult transition: I need Vulkan because I want to unlock true multithreading. I'm going to have a lot of PCIe traffic from cameras while I want my UI to run at 60 FPS. Vulkan is ideal for this, as I can access all components individually. OpenGL serializes all work, so it is easy to clog it with a 10 MB texture.
One more thing: I just took a look at the std430 alignment requirements in the Vulkan documentation, and I was like, no way I can do and guarantee all this manually.
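To illustrate the kind of check that could be handed to the compiler, here is a minimal sketch, not my actual framework code. The names std430Align and matchesStd430 are made up for this example, and it only knows a handful of types. It assumes that D's float[2], float[3] and float[4] stand in for GLSL's vec2, vec3 and vec4 (base alignments 8, 16 and 16 bytes), and it ignores arrays, matrices and nested structs entirely.

```d
import std.traits : Fields;

/// Base alignment in bytes for the few types this sketch understands.
/// (Hypothetical helper; a real table would need many more cases.)
template std430Align(T)
{
    static if (is(T == uint) || is(T == int) || is(T == float))
        enum size_t std430Align = 4;   // 32-bit scalars
    else static if (is(T == float[2]))
        enum size_t std430Align = 8;   // assumed to mirror vec2
    else static if (is(T == float[3]) || is(T == float[4]))
        enum size_t std430Align = 16;  // assumed to mirror vec3 / vec4
    else
        static assert(false, "no rule in this sketch for " ~ T.stringof);
}

/// True if every field of S sits at the offset the rules above predict.
bool matchesStd430(S)()
{
    size_t expected = 0;
    foreach (i, F; Fields!S)
    {
        enum a = std430Align!F;
        expected = (expected + a - 1) / a * a;   // round up to the alignment
        if (S.tupleof[i].offsetof != expected)
            return false;
        expected += F.sizeof;
    }
    return true;
}

struct Params { uint param0, param1; }      // offsets 0 and 4
struct Good   { float[3] dir; float len; }  // offsets 0 and 12
struct Bad    { float len; float[3] dir; }  // D puts dir at 4, the rule wants 16

static assert( matchesStd430!Params);
static assert( matchesStd430!Good);
static assert(!matchesStd430!Bad);
```

Because the check runs in CTFE, a mismatching struct stops the build instead of silently feeding garbage to the GPU.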
Anyways, I can't go back :D
So, as Dennis asked earlier, here is a small demo of my experiments with this 'hack':
This is how I declare embedded source code inside a D module:
First I declare fields for the uniform buffer:
    enum UBO_fields =
    q{
        uint param0,
             param1;
    };
    static struct UBO { mixin(UBO_fields); }
Then I declare some constants that are accessible from both the shader and the CPU program. I also specify the command line args for the external compiler:
    enum groupSize = 1024,
         bufSize = 64 << 20,
         shaderBinary = (碼!(iq{glslc -O}.text,iq{
             #version 430
             layout(local_size_x = $(groupSize)) in;
             layout(binding = 0) uniform UBO {$(UBO_fields)};
             layout(std430, binding = 1) buffer BUF { uint values[]; };
             void main() {
                 const uint id = gl_GlobalInvocationID.x;
                 values[id] = values[id] * param0 + param1 + 1000;
             }
         }.text));
(Note that I use Chinese identifier chars to mark machine related parts of my code.)
When the D compiler reaches the 碼 template it will do this:
    template 碼/+ExternalCode+/(string args, string src, string FILE=__FILE__, int LINE=__LINE__)
    {
        pragma(msg, "$DIDE_TEXTBLOCK_BEGIN$");
        pragma(msg, src /+This is large data to CTFE, so do not touch it here!!!+/);
        pragma(msg, "$DIDE_TEXTBLOCK_END$");
        enum hash = src.hashOf(args.hashOf).to!string(26) /+hashOf CTFE performance: 1.2ms / 1KB+/;
        pragma(msg, FILE, "(", LINE, ",1): $DIDE_EXTERNAL_COMPILATION_REQUEST: ", args.quoted, ",", hash.quoted);
        //The TEXTBLOCK will be attached to the end of the compilation request message as a string literal inside DIDE.
        enum 碼 = (cast(immutable(ubyte)[])(import(hash)));
        static if(碼.startsWith(cast(ubyte[])("ERROR:")))
        {
            pragma(msg, (cast(string)(碼)).splitter('\n').drop(1).join('\n'));
            static assert(false, "$DIDE_EXTERNAL_COMPILATION_"~(cast(string)(碼)).splitter('\n').front);
        }
    }
It sends the following things to stderr:
- A begin-of-data marker.
- The actual data: the source code as it is.
- An end-of-data marker.
- Finally, a command that looks like a standard error message, containing the following info:
  - the source code location of the template instantiation
  - the command line arguments as a quoted string (it's small, so CTFE performance is fine here)
  - the hash of the large source code
At this point inside another process my buildsystem processes the stderr messages:
- It detects the begin/end markers and collects the large source code inside those.
- It catches the message with the source location, the command line and the hash (in textual form).
So now the buildsystem has all the data to begin the external compilation process.
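The collecting step above can be sketched roughly like this. It is a simplified illustration, not the real DIDE code: Request and collectRequests are names invented for this example, the input is assumed to be already split into lines, and the quoted args/hash part is left unparsed because the exact escaping of my quoted helper isn't shown here.

```d
import std.algorithm.searching : findSplit;
import std.array : join;

/// One external compilation request recovered from the compiler's stderr.
struct Request
{
    string location; // e.g. "shader.d(12,1)"
    string payload;  // the quoted args and hash, left unparsed in this sketch
    string source;   // text collected between the begin/end markers
}

Request[] collectRequests(string[] lines)
{
    enum beginMarker = "$DIDE_TEXTBLOCK_BEGIN$";
    enum endMarker   = "$DIDE_TEXTBLOCK_END$";
    enum requestTag  = ": $DIDE_EXTERNAL_COMPILATION_REQUEST: ";

    Request[] result;
    string[] block;     // lines of the text block being collected
    string lastSource;  // most recently completed text block
    bool inBlock = false;

    foreach (line; lines)
    {
        if (line == beginMarker)     { inBlock = true; block = null; }
        else if (line == endMarker)  { inBlock = false; lastSource = block.join("\n"); }
        else if (inBlock)            block ~= line;
        else if (auto parts = line.findSplit(requestTag))
            result ~= Request(parts[0], parts[2], lastSource);
        // anything else is an ordinary compiler message and is ignored here
    }
    return result;
}

unittest
{
    auto requests = collectRequests([
        "$DIDE_TEXTBLOCK_BEGIN$",
        "#version 430",
        "void main() {}",
        "$DIDE_TEXTBLOCK_END$",
        `shader.d(12,1): $DIDE_EXTERNAL_COMPILATION_REQUEST: "glslc -O","abc123"`,
        "some unrelated compiler message",
    ]);
    assert(requests.length == 1);
    assert(requests[0].location == "shader.d(12,1)");
    assert(requests[0].payload == `"glslc -O","abc123"`);
    assert(requests[0].source == "#version 430\nvoid main() {}");
}
```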
In my example it will call "glslc.exe" and give it a file with the source code that was received earlier from stderr.
After the compilation finishes, it puts the resulting binary (or the compilation error messages) into an associative array, which acts as a cache. (This compilation is also incremental, just like that of the D modules.)
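The cache idea, reduced to its bare minimum, could look like the sketch below. CompileCache and fetch are hypothetical names, and the external compiler is replaced by a delegate so the example doesn't depend on glslc being installed. The only convention taken from the template above is that a failed compilation is stored as text starting with "ERROR:".

```d
/// Hash-keyed cache of external compilation results (illustrative only).
struct CompileCache
{
    immutable(ubyte)[][string] entries; // hash -> binary, or "ERROR:..." text
    size_t compilations;                // how many times the compiler really ran

    /// Returns the cached result for `hash`, running `compile` only on a miss.
    immutable(ubyte)[] fetch(string hash, immutable(ubyte)[] delegate() compile)
    {
        if (auto hit = hash in entries)
            return *hit;
        ++compilations;
        auto result = compile();
        entries[hash] = result;
        return result;
    }
}

unittest
{
    CompileCache cache;
    int calls = 0;

    // Stand-in for invoking the external compiler.
    immutable(ubyte)[] fakeCompile()
    {
        ++calls;
        return cast(immutable(ubyte)[]) "SPIRV";
    }

    auto first  = cache.fetch("abc123", &fakeCompile);
    auto second = cache.fetch("abc123", &fakeCompile); // served from the cache
    assert(first == second);
    assert(calls == 1);
    assert(cache.compilations == 1);
}
```

Since the key is the hash of the source and the args, an unchanged shader never reaches the external compiler again, which is what makes the step incremental.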
Now, in the D module, the compiler reaches the string import expression:
    enum 碼 = (cast(immutable(ubyte)[])(import(hash)));
The name of the file is the hash, and the path of the file is served by ProjectedFS. (Now that I have worked with it on Windows for a week, I can say it's a stable, reliable technology: I use it to virtually serve any file inside a directory.)
So LDC2.exe opens the import(file) while, in the buildsystem, "glslc.exe" is compiling the shader.
When "glslc.exe" finishes, the file content is sent to ProjectedFS.
Now, inside LDC2.exe, the fread() call returns and LDC2.exe has the shader binary.
Other ways to do this:
Second chance, if the 'hack' stops working because of a compiler change:
- Compile the modules normally, in an incremental way.
- Run the exe in a special 'mode', so it exports the shader sources generated by D's CTFE. (Difficult: what if there are multiple modules?)
- Compile the shaders.
- Zip the shaders and append them to the end of the exe file.
Third chance:
Move the shader compilation task (glslc, Vulkan SDK) to the client's machine. I don't want this.
So I have more options, but the best one for me was "extending the functionality of LDC2.exe with a non-standard thing". (I had my own macro preprocessor for Delphi around 15-20 years ago. It was awesome, but it also depended heavily on IDE integration, and I burned my hands with that, haha. But this time I'm optimistic.)
I can only choose the fastest way because I'm lonewolfing; this tool lets me keep control over 100 KLOC, which is kinda my limit. Once a team is involved in the development, these tools easily turn from helpful goodies into a big burden that no one wants to deal with, I know.
(Thanks for __ctfeWrite! I can't use it right now because I'm depending on LDC2, but I saved it in my stash. ;) )