The D Language Foundation's monthly meeting for August 2024 took place on Friday the 9th. It lasted about an hour and forty minutes.
The Attendees
The following people attended:
- Walter Bright
- Iain Buclaw
- Rikki Cattermole
- Jonathan M. Davis
- Timon Gehr
- Martin Kinkelin
- Dennis Korpel
- Mathias Lang
- Razvan Nitu
- Mike Parker
- Robert Schadek
- Quirin Schroll
- Adam Wilson
The Summary
Replacing D's escape analysis
Rikki said he'd spoken with Dennis a month ago about trying to simplify D's escape analysis, but nothing had come of it. At BeerConf, he'd brought it up again and Dennis said he'd been thinking about it. Rikki had also spoken to Walter about it, and Walter had said that DIP 1000 wasn't quite doing what we wanted it to do and was a bit too complex.
As such, Rikki wanted to discuss the possibility of replacing DIP 1000 as D's escape analysis solution. He thought the first step before making any solid decisions was to make sure it was fully under a preview switch. Dennis confirmed that it currently was.
Rikki said the next step was to think about replacing it and what that might look like. He asked for suggestions.
Dennis said that before deciding on how to replace it, we should first state what was wrong with the current design and the goals of a replacement. He said he had some issues with it. One was the lack of transitive scope. Another was that structs only had one lifetime even if they had multiple members. He'd been thinking of allowing struct fields to be annotated as `scope`, but he didn't have a concrete proposal yet.
Walter said the difficulty he'd encountered wasn't that DIP 1000 was complicated, but that the language was complicated. You had to fit it into each of the language's various constructs. How did reference types work? Or implicit class types? Or lazy arguments? Constructors? That was where the complexity came from.
He gave the example of implicit `this` arguments to member functions. He'd explained over and over again that if anyone wanted to understand how they worked with DIP 1000 in constructors or member functions, the thing to do was to write it out as if `this` were an explicit argument. Then you'd be able to see how it was supposed to work. But people found that endlessly confusing.
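As a rough illustration of that advice, the sketch below writes the same accessor twice: once as a member function, where `return` annotates the hidden `this` parameter, and once as a free function with the receiver spelled out. The names `Counter`, `get`, and `getExplicit` are made up for this example and do not come from the meeting.

```d
struct Counter
{
    int value;

    // Member form: `this` is a hidden parameter passed by reference.
    // The trailing `return` annotates that hidden parameter.
    ref int get() return @safe
    {
        return value;
    }
}

// The same function with the receiver written out as an explicit parameter.
// Here the `return ref` annotation is visible in the parameter list.
ref int getExplicit(return ref Counter self) @safe
{
    return self.value;
}

void main()
{
    Counter c;
    c.get() = 3;          // writes through the reference from the member form
    getExplicit(c) += 4;  // writes through the reference from the explicit form
    assert(c.value == 7);
    assert(&c.get() == &getExplicit(c)); // both refer to the same field
}
```

Reading the member function as if it were `getExplicit` makes it easier to see which parameter each attribute applies to.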
Any proposal to simplify it would also have to justify how it could possibly be simpler than DIP 1000 was now, as DIP 1000 was complicated because the language was complicated. It had to support every language construct.
Rikki said there were three different levels of escape analysis. The most basic level was "this is an output, and this contributes to the outputs". Then you had what the language was able to infer. Then you had what the programmer could add that could be proven. We didn't really have that scale, so now there was no escaping when things were just too broad.
He said he would also like the `this` reference and the nested encapsulation context to be explicit arguments that you could annotate when you needed to, e.g., to declare it couldn't be `null`.
Walter noted that Herb Sutter had put out a proposal for his revamped C++ language requiring `this` to be explicit. That would resolve confusion regarding implicit arguments. But there was also the case of implicit arguments when you had a nested function. Those were hidden arguments and had the same issue. He didn't see any straightforward solution because it was a complicated problem.
He reiterated that the complexity of DIP 1000 was due to the complexity of the language and not to the concept itself, which was very simple. If you wrote it out using pointers, everything was clear and simple. It was when you added things like `auto ref` that it started getting complex. He'd never liked `auto ref` and never used it because it was just confusing.
He said if Rikki could think of a better way to do it, he was all for it. DIP 1000 was his best shot at it.
Timon said that DIP 1000 arguably already incurred some of the complexity cost of being able to annotate different levels of indirection, but it didn't allow you to do that in general. There was probably a better trade-off there.
Walter said there were two kinds of indirection: pointers and references. That doubled the complexity of DIP 1000 right there. Timon agreed but said it meant that DIP 1000 was not what you got when you translated everything to pointers. DIP 1000 was a step up from that because it actually had two levels of indirection per pointer, but it was restricted in a way that wasn't particularly orthogonal. How you annotated either level of indirection depended on the construct.
Walter agreed and said that was because references had an implicit indirection and pointers did not. He asked what could be done about that.
Rikki asked everyone to let him know if they had any ideas.
Quirin said that he understood the aim of DIP 1000 to be that you could take the address of a local variable, like a static array or something, and the pointer would be scoped and unable to escape. So it might be the case that in a future version of the language where DIP 1000 was the default, there could be a compiler switch to disable it so that taking the address of a local variable would then be an error.
He said the issue he'd run into was that if you had a system function but didn't actually annotate it as `@system`, and then you had a `scope` annotation, the compiler would assume that you were doing the scope thing correctly. But if you weren't, you were screwed. It was very easy to do accidentally.
This was an issue with DIP 1000. You could shoot yourself in the foot in system code. Not in safe code if you were doing it correctly, but if you were a beginner and didn't annotate something `@safe` and then used, for example, `-preview=in`, which was implicitly scope, you could get into trouble.
So he thought having the option to disable that stuff but enable all the checks of `scope` and things like that in `@safe` code would be good.
Walter said `@system` turned off all the checks because sometimes you needed to do nasty things. And beginners shouldn't be writing `@system` code. Quirin said if you didn't use `@safe`, DIP 1000 made the language more dangerous. He thought this might be why some people had a problem with it.
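A minimal sketch of the kind of accident Quirin described might look like the following. The function `remember` is not annotated, so it is `@system` by default, and its `scope` parameter is a promise that is not checked there. The names `leaked`, `remember`, and `caller` are hypothetical.

```d
int* leaked; // module-level pointer that outlives any stack frame

// No safety attribute, so this is @system by default. The `scope` on `p`
// promises that `p` will not escape, but that promise is not verified
// in @system code.
void remember(scope int* p)
{
    leaked = p; // breaks the promise, yet is accepted
}

void caller()
{
    int local = 1;
    // A caller that trusts the `scope` annotation hands over the address
    // of a stack variable.
    remember(&local);
}

void main()
{
    caller();
    // `leaked` now points into a stack frame that no longer exists.
    // It is only compared here, never dereferenced.
    assert(leaked !is null);
}
```

Marking `remember` as `@safe` and compiling with `-preview=dip1000` is the combination under which the assignment to `leaked` would be expected to be rejected.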
Walter thought the biggest problem was that people didn't like to write annotations. The only reason they were necessary was for the case where you didn't have a function body. If you just took the address of a local, the compiler would say, "Okay, that's a scope pointer now". It would do that automatically. You didn't have to do anything extra for that. The difficulty was in the two places where you needed to add annotations: when there was no function body and in a virtual function. The compiler couldn't do it automatically in those cases.
Jonathan said part of the problem DIP 1000 was trying to solve was something he didn't care about. He was totally fine that taking the address of a local meant you had to avoid escaping it. It was nice to have extra checks for it, but all those annotations got very complicated very fast, and a lot of it was because of how complicated the language was.
For the most part, he wouldn't want DIP 1000 on at all except in very specific circumstances where he wanted some extra safety. Not having it was actually simpler. If you were only taking the addresses of locals in a small number of places, and therefore those functions were `@system`, then most of your code was safe and you were fine. But once you turned on DIP 1000, you ended up with `scope` inferred all over the place, and then figuring out what was going on became far, far more complicated.
He said it seemed like a lot of complication to try to make something safe, which most code shouldn't need to worry about anyway. If you were using the GC for everything, then you typically only had to take the address of things in a small number of places. All the complications around `scope` didn't really buy you anything. It just made it harder to figure out what was going on.
If it were a problem that needed a solution, he would love it if we could solve it in a simpler way. He had no clue how we might go about that, but if it were up to him he'd rather not have it at all because of the complexity that it brought.
Walter said the language had recently been changed to allow `ref` for local variables. That allowed for more safety without needing annotations. He thought it was a good thing. It improved the language by reducing the need for raw pointers. The next step would be to allow `ref` on struct fields. The semantics of that would have to be worked out, but the more you could improve the language to reduce the need for raw pointers, the more inherently safe it would become, and there would be fewer problems.
Adam said someone in Discord had suggested that we not build Phobos v3 with DIP 1000 turned on. He kind of agreed with that view. He'd told Walter before that he thought DIP 1000 had been a huge waste of time for minimal gain.
Rikki wanted to point out that without reference counting, there was basically no way we could do 100,000 requests per second. That was gated by Walter's work on owner escape analysis, and that in turn was gated on escape analysis. So he was blocked on this, and that was why he wanted to get escape analysis sorted.
Walter said the reason the ROI was so low was because it was rather rare that people had errant bugs in their programs because of errant pointers into the stack. Mathias asked why we were spending so much time on it in that case. Walter likened it to airplane crashes: they were rare, but they were disastrous when they happened. You couldn't be a memory-safe language and have that problem.
Mathias said that DIP 1000 made him want to use D less, not more, because of the sea of deprecations he got when he enabled it with vibe.d. It was just terrible. He was hoping it would never be turned on by default.
When it came to the DIP itself, he said that composition just didn't work. Any design that required him to annotate his class or struct with `scope` in the type definition was dead on arrival. He said a lot of people compared it to `const`, which was the wrong comparison. `const` was outside in, but `scope` was inside out. So if your outer layer was `const` and you composed a type with multiple layers, then all your layers were `const`. With `scope` it was the other way around. We had no way to represent the depth of scopeness in the language. It wasn't possible grammatically. It was just unworkable and unusable.
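The "outside in" half of that comparison can be checked directly. The sketch below, using the made-up types `Inner` and `Outer`, shows `const` on the outermost layer reaching every layer beneath it. It deliberately shows only the `const` side; it does not attempt to model how `scope` behaves.

```d
struct Inner
{
    int* ptr;
}

struct Outer
{
    Inner inner;
}

void main()
{
    int v = 1;
    // Only the outermost declaration is marked const...
    const Outer o = Outer(Inner(&v));

    // ...yet every layer reached through it is const as well.
    static assert(is(typeof(o.inner) == const(Inner)));
    static assert(is(typeof(o.inner.ptr) == const(int*)));
    static assert(is(typeof(*o.inner.ptr) == const(int)));

    assert(*o.inner.ptr == 1);
}
```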
I suggested we put a pin in the discussion here and schedule a meeting just to focus on DIP 1000. Everyone agreed.
(UPDATE: We had the meeting later and decided we needed to do two things to move forward: compile a list of failing DIP 1000 cases to see if they are resolvable or not; and consider how to do inference by default. I have no further updates at this time.)
Improve error messages as a SAOC project
Razvan said that Max Haughton had proposed improving compilation error messages a while back as a potential SAOC project. The goal was to implement an error-handling mechanism that was more sophisticated than the current approach of just printing errors as they happened. The details had yet to be hashed out, but the main idea was to implement an error message queue.
One of the problems with the current approach was that errors were sometimes gagged during template instantiation. What we wanted to do was to save them somewhere so that they could be printed when returning to the call site. This would be quite useful also for users of DMD-as-a-library.
With SAOC on the horizon, Razvan wanted to avoid the situation where the judges accepted an application for this project, and we later decided we didn't want to go this route for some reason.
Rikki suggested the queue should be thread-safe, as he needed it for Semantic 4. It had been on his TODO list to write exactly that, so the project had his support.
Dennis asked what wasn't thread-safe about the current mechanism with its global error count. Rikki said he hadn't looked into it, but in a multi-threaded scenario, any thread that threw would need to write the error out on the main thread. He didn't think the functionality was there.
Walter said he'd refactored error handling as an abstract class, so it could be overridden to do whatever we wanted. We could make it multi-threaded or whatever. The transition to using it was incomplete because gags were still in there, but one of the reasons he'd done it was to get rid of gags, and that would eliminate the global state. He told Razvan that anything like the proposed project should be built around instantiations of that class.
Razvan asked if that meant he had Walter's approval for the project. Walter said he didn't know what it was trying to accomplish so he couldn't say just yet.
Razvan gave the real-world example of calling `opDispatch` on a struct. Maybe the body had some errors and failed to instantiate. You had no way of knowing that at the call site. It would just look like `opDispatch` didn't exist on that struct. Right now, without knowing why it failed, there was no way to output a decent error message. The error was going to say that there was no field or member for that struct.
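A small reproduction of that situation, under the assumption that `__traits(compiles, ...)` is an acceptable stand-in for a call site, could look like this. `Wrapper` and `missingHelper` are hypothetical names, and `missingHelper` is intentionally left undeclared.

```d
struct Wrapper
{
    int payload;

    auto opDispatch(string name)()
    {
        // Deliberate mistake: `missingHelper` is not declared anywhere,
        // so every instantiation of this template fails.
        return missingHelper(payload);
    }
}

void main()
{
    Wrapper w;

    // The failure inside the template body is swallowed, so from the
    // outside it looks as though `Wrapper` simply has no such member.
    static assert(!__traits(compiles, w.anything()));

    // Instantiating opDispatch explicitly fails too. Doing this outside
    // of __traits(compiles) is one way to see the underlying error.
    static assert(!__traits(compiles, w.opDispatch!"anything"()));
}
```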
He said there were other examples. The project aimed to save the error messages instead of just tossing them in the dumpster so that an accurate error message could be output to the user back at the call site. When fixing some bugs in the past, he had needed to resort to all kinds of hacks to decide why something was failing.
Walter said he thought that was worth pursuing. But it would involve getting rid of the gagging entirely and replacing it with another abstract function or another error handler instantiation. Razvan said that wasn't necessarily true. When errors were gagged, you could save the state instead of printing them out.
Mathias thought it was a good idea and should go forward. Regarding instantiation errors, he said he saw them most often when there was an inference issue. For example, he'd do a `map`, but somewhere his delegate did an unsafe operation and he ended up with an error saying the overload couldn't be found. He wondered if there was a way it could print the error about the safety problem instead.
Razvan said that this project would save everything that had failed so that a decision could be made at the call site by searching through the queue. He didn't know if this could be solved in other ways.
Walter asked how you would know at the call site which error mattered. Razvan said it depended on the use case. Walter said if you printed them all out, then you'd end up with the C++ problem of hundreds of pages of error messages.
Razvan said the project would give you a tool to put out better error messages than we had now. It wasn't intended to just save all the error messages and print them all out. That wouldn't make sense. Maybe in time--and he suspected Walter wouldn't like this--we might have priority error messages.
Walter said no, normally it was the first error that mattered. If you just logged the first error message, you'd be most of the way to where you were trying to go. Razvan agreed that would be one strategy.
Jonathan said that once we had the list, there were different things we could do with it. There might be a flag that puts out five error messages instead of one, or maybe an algorithm to enable it to go more intelligently. If we decided it wasn't doing anything for us we could always get rid of it later. But just having the list of error messages would enable us to do more than we currently could without it, though it might be hard to figure out how to use it in some circumstances.
Walter said as an initial implementation, he'd suggest just logging the first error message and see how far that got us.
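To make the idea concrete, here is a minimal sketch of an error sink that queues messages instead of printing them, together with the "report the first one" strategy. It is a simplified, hypothetical interface written for this summary: `Diagnostic`, `DiagnosticSink`, and `QueueingSink` are invented names and are not DMD's actual error sink API. Thread safety, which Rikki asked for, is not addressed.

```d
import std.stdio : writeln;

struct Diagnostic
{
    string file;
    uint line;
    string message;
}

// Hypothetical abstract sink: implementations decide what happens to errors.
abstract class DiagnosticSink
{
    abstract void error(string file, uint line, string message);
}

// Stores errors instead of printing them, so the caller can decide later
// which ones (if any) to show.
final class QueueingSink : DiagnosticSink
{
    private Diagnostic[] queue;

    override void error(string file, uint line, string message)
    {
        queue ~= Diagnostic(file, line, message);
    }

    size_t length()
    {
        return queue.length;
    }

    // The strategy suggested as a starting point: only the first error.
    Diagnostic first()
    {
        return queue[0];
    }
}

void main()
{
    auto sink = new QueueingSink;

    // Pretend a speculative instantiation produced two errors.
    sink.error("app.d", 12, "undefined identifier `missingHelper`");
    sink.error("app.d", 20, "no property `anything` for `w` of type `Wrapper`");

    if (sink.length > 0)
    {
        auto d = sink.first;
        writeln(d.file, "(", d.line, "): Error: ", d.message);
    }
}
```

Other policies, such as printing the first few entries or ranking them, would only change what is done with the queue at the call site.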
Martin said it was okay as just another straightforward implementation of the abstract error sink. What worried him was if any extra context was needed, like different error categories or warning categories, or instantiation context, that kind of stuff. If we needed to extend the interface to accommodate that sort of thing, it might get hairy. Interface changes might come with a performance cost for compilers that weren't interested in the feature. That was something to be wary of.
He said another thing was that we already had a compiler switch to show gagged errors.
Third, there were circumstances in which some code only worked when a template instantiation was semantically analyzed a second time due to forward references or something. If we just went with a simple approach, an error on the first analysis of an instantiation could be invalidated on the second analysis. But even in that case, it might be nice to have the error to let you know about the forward reference.
Razvan agreed there could be some problems with this approach, but he didn't see any definite blockers. No one objected to moving forward with the project.
(UPDATE: Royal Simpson Pinto was accepted into SAOC 2024 to work on this project.)
Moving std.math to core.math
Martin said he'd been wanting to move `std.math` to `core.math` for years. It had come up in discussions with Walter quite a while ago in GitHub PRs, and he recalled Walter had agreed with it. It had come up again more recently in attempts to make the compiler test suite independent of Phobos. With DMD and the runtime in the same repository now, it would be nice for all of the make targets to be standalone with no dependency on Phobos just to run the compiler tests.
He'd experimented and found that most of the Phobos imports in the test cases were `std.math`. One common reason was the exponentiation operator, `^^`. There were also some tests that tested the math builtins.
Calls to the standard math functions were detected by the compiler at CTFE using the mangled function names. That was already a problem because when we changed an attribute in `std.math`, we needed to update the compiler as well due to the new mangled name. So we tested that all of that worked and that the CTFE math results complied with what we expected. So there was an implicit dependency on Phobos.
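The coupling between attributes and mangled names can be seen with two function pointer types that differ only in their attributes. This is a small sketch using made-up alias names; it compares type manglings, which are one component of a function's mangled symbol name.

```d
import std.stdio : writeln;

// Two function pointer types that differ only in their attributes.
alias Plain = double function(double) @safe;
alias Attributed = double function(double) @safe pure nothrow @nogc;

// Attributes are part of the function type, so they are part of the mangling.
static assert(Plain.mangleof != Attributed.mangleof);

void main()
{
    writeln(Plain.mangleof);
    writeln(Attributed.mangleof);
}
```

Any lookup keyed on a mangled name therefore has to be updated whenever an attribute on the corresponding function changes.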
Martin said he wanted approval before going ahead because it wouldn't be worth it to get going and then be shut down. He wanted to make sure everyone was on board with it and that there weren't any blockers to be aware of. Phobos would import and forward everything to `core.math`, which already existed in the runtime. It had something like five functions currently.
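As a hedged illustration of what forwarding can look like, the sketch below re-exports two functions that already live in `core.math` through local aliases. It is a stand-in for the idea only and is not how Phobos is actually organized.

```d
static import core.math;

// Forwarding layer: the names are re-exported, the implementations stay
// in the runtime module.
alias sqrt = core.math.sqrt;
alias fabs = core.math.fabs;

void main()
{
    assert(sqrt(16.0) == 4.0);
    assert(fabs(-2.5) == 2.5);
}
```

Code that uses the forwarding names keeps working, while tests that need no other Phobos functionality could import `core.math` directly.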
LDC already did some forwarding of math functions. `std.math` was one of the few Phobos modules in which LDC and GDC had some modifications, and that was just to be able to use intrinsics. Moving it into the runtime would be nicer as it would minimize or eliminate the need for their Phobos forks.
Walter said that `std.math` was kind of a grab bag of a lot of things. He suggested just moving things into DRuntime that should be `core.math` and forwarding to those, then changing the test suite to use `core.math`. He wanted to keep `std.math`. There was still a lot of room for math functions that didn't need to be in the compiler test suite, and they could remain there.
Jonathan said that in the past when we decided we really wanted something in DRuntime that had been in Phobos, but we really wanted people importing Phobos, we moved the thing to `core.internal`. For example, `std.traits` imported `core.internal.traits` to avoid duplicating traits used in DRuntime, and users could still get at it through `std.traits`.
In the general case, it was just a question of whether we wanted `core.internal` or something more public. He'd prefer going with `core.internal` where possible, but either way, he saw no problem with the basic idea.
Rikki said if we were talking about primitives that the compilers recognized and that were currently living in Phobos, then yeah, move them. Full stop, no questions asked. He asked if anyone had an objection to that. When no one did, he said that was the answer to Martin's question.
Martin said the thing he didn't like about that was that we were drawing a line. Where should it be drawn? It wasn't just about the builtins. The list of CTFE builtins might not be complete. There might be some functions that should be in there but weren't. But really, most of the functions in `std.math` were detected by the compiler.
As far as he knew, `std.math` was quite nicely isolated and didn't depend on anything else in Phobos. He would double-check, but he was certain it was good in that respect so that it could just be moved over. He really didn't want to split it up. If it was in the runtime, it was logical to include it directly from there, starting from some specific compiler version and keeping it in the Phobos API for a while for backward compatibility. So the final location would be in `core.math`.
He said we did the same thing for the lifetime helpers. `move` used to be in Phobos. That was a totally bollocks decision. How could such a primitive function be in the standard library instead of the runtime? But now it was in the runtime, unfortunately with slightly different semantics, and he'd been using it from there for ages.
Walter said the dividing line was simple: if you wanted to put it in the compiler test suite, it needed to go in the runtime. Martin said he would need to check, but he thought it would be most of the functions anyway.
Mathias thought we should get rid of the exponentiation operator, though that wouldn't solve Martin's problem. Martin said moving it to the runtime would get rid of the special case where you got the error trying to use it when you didn't import `std.math`. At least we'd have that. Walter agreed with Mathias that it should go. He thought it was an ugly wart in the language.
Adam said that Phobos 3 was a great opportunity for the change. It was a natural dividing line. We could keep Phobos 2 as it was and support it for a long time, but Martin could do whatever he wanted in Phobos 3. Adam had already been looking at `std.math` and thinking how much he dreaded porting it over. So if Martin came up with something else and told him how to make it work, he'd make it work.
Primary Type Syntax DIP
Quirin had joined us to discuss the current draft of his Primary Type Syntax DIP (that was the second draft; his most recent as I write is the fourth draft).
He assumed most of us had not read through the entire thing, as it was a long text. He thought most DIPs were really, really short and missed a lot of detail. He felt that anything that touched on it that entered your thoughts should be a part of a DIP.
The basic idea of the proposal was that we modify the grammar without any change to the semantics or anything like that. It aimed to ensure that any type that could be expressed by an error message, for example, could be expressed in code as well and you wouldn't get parsing errors. You might get a visibility error because something was private, but that was a semantic error, not a parsing error.
He said the easiest example was a function pointer that returned by reference. This could not be expressed in the current state of D. The DIP suggested we add a clause to the type grammar allowing `ref` in front of some basic types and some type suffixes. What had to follow obviously was a function or delegate type suffix, and this formed a type but not a basic type. The difference was meaningful because, for a declaration, you needed a basic type and not a type.
It also suggested that you could form a basic type from a type by putting parentheses around it. This was essentially the same as a primary expression, where if you had, e.g., an addition expression, you could put parentheses around it and then multiply it with something else. But you had to put the parentheses around it because it would otherwise have a different meaning.
So to declare a variable of a `ref`-returning function pointer type, you had to use parentheses:
(ref int function() @safe) fp = null;
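For comparison, here is a sketch of how such a variable can be declared without the proposed syntax, by naming the type indirectly through `typeof`. The names `storage`, `getStorage`, and `RefFP` are invented for this example.

```d
int storage = 1;

// An ordinary function that returns by reference.
ref int getStorage() @safe
{
    return storage;
}

// The ref-returning function pointer type is obtained indirectly,
// from the address of an existing function.
alias RefFP = typeof(&getStorage);

void main()
{
    RefFP fp = &getStorage;
    fp() = 42; // assigns through the returned reference
    assert(storage == 42);
}
```

The workaround needs an existing function with the right signature, which is the kind of indirection the DIP aims to make unnecessary.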
Rikki said that based on his knowledge of parsers, this could be difficult to recognize. The best way forward would be to implement it and see what happens. If it could be implemented without failing the test suite, it shouldn't be an issue and could go in.
Quirin said he had started implementing it for that reason. So far, it hadn't been a problem. He'd needed to modify something to do a further look ahead, but that was a niche case, and he had no idea why anyone would write such code. But he hadn't found any issues because the language usually tried to parse stuff as declarations first. When it didn't work, then it parsed as an expression. If it succeeded in parsing as a declaration, it just worked.
Walter said that there was a presentation at CppCon in 2017 titled 'Curiously Recurring C++ Bugs'. One of the problems they went into was things like this. Was it a function call or a declaration? C++ apparently had all sorts of weird errors around things like this. So when you were talking about adding more parentheses, there was a large risk of creating ambiguities that led to unexpected compiler behavior.
In adding more meaning to parentheses in the type constructor, we'd need to be very sure that it didn't lead to ambiguities in the grammar, where users could write code that looked like one thing, but it was actually another completely unintended thing. He didn't know if the proposal suffered from this problem, but he suggested caution in adding more grammar productions like this.
Quirin said there were two grammar productions. One was the primary type stuff, and the other was just allowing `ref` in front of some part so that you could declare a function pointer or delegate that returned by reference. He thought the latter one should be uncontentious. The only problem was that you could just put `ref` in front of something because it was a `ref` variable, or a parameter that was passed by reference, and it didn't apply to the function pointer type.
Walter said that with the function pointer type, you had two possibilities. One was that the function returned by reference, and the other was that it was a reference to a function.
Quirin said that was exactly like his second example where you had a function that returned a reference to a function pointer that returned its result by reference:
ref (ref int function() @safe) returnsFP() @safe => fp;
You needed the parentheses here to disambiguate.
Walter said D already had a syntax where you could add `ref` on the right after the parameter list, and that meant the function returned by reference. But D allowed `ref` in both places to mean the same thing, which was an ambiguity in the language.
Quirin said the problem was that each time someone asked about this on the forums, the answer was "you can't return a function pointer by reference". People complained about putting `ref` after the parameter list because it felt unnatural. His DIP was trying to make it work with `ref` in front. And if you needed parentheses to disambiguate, then you needed parentheses.
Walter wasn't saying Quirin was wrong. He just wanted to put up a warning flag that `ref` was currently allowed in both places. Changing that could break existing code and result in ambiguity errors in the grammar. That was his concern.
Quirin said he had an implementation for the proposal, and the implementation for `ref` worked as intended. He'd played around with it for quite a while and really tried to push some limits. He'd found no issues with it.
He said the same issue applied to linkage, as with a function pointer with `extern(C)` linkage. The issue there in his implementation was that it didn't apply the linkage to the type. He could parse it, but he couldn't apply it, and he didn't know why. But the whole of the thing worked perfectly. The example code he was showing wasn't fantasy code. It was compilable with his local compiler.
Walter asked Quirin to watch the video he'd mentioned. He said that maybe Quirin had solved the problem, but asked that he please review it for grammar and parentheses problems and make sure the proposal didn't suffer from them.
There were some questions about the details of the DIP that Quirin addressed, and Rikki suggested an alternative to consider if it didn't work out. He said it appeared that there was no real blocker here.
Walter said it was a laudable goal and he liked it. He just wanted to make sure we didn't get into that C++ problem of an ambiguous grammar that could be an expression or could be a type, then the compiler guessed wrong and caused hidden bugs.
Quirin said he had initially thought this would cause some weird niche problem somewhere and that he'd probably find one if he implemented it. Miraculously, it just worked. The implementation was there and anyone could play around with it. It was so much easier than reading a proposal and trying to work it out in your head.
Walter said it would be a pretty good thing to try it on the compiler test suite. Quirin agreed.
The 'making printf safe' DIP
Dennis was wondering about the DIP to make `printf` safe. It was mostly meant for DMD, which wanted to become safe. But DMD had the bootstrap compiler situation. Was the plan to wait five years until the bootstrap compiler was up to date, or could we have some shorter-term solutions to make DMD's error interface `@safe` compatible?
Walter asked why we needed such an old version for the bootstrap. In the old days, his bootstrap compiler was always the previous release. Why were we going back so far?
Martin said it was because we had the C++ platforms. If we newly conquered a platform using D, the most practical thing to do currently was to use GDC. The 2.076 version had the CXX front end with those backported packages. That was what he recommended to every LDC package maintainer. They were all concerned about the bootstrapping process. So he always pointed them to GDC for bootstrapping the first version. Then they were free to compile more recent versions.
He said the ideal situation was that we could still use that specific GDC release to compile the latest version. As far as he knew, that was the status quo. So we didn't have to do multiple jumps. Just compile that GCC version, which was still completely C++, and then you could compile all the existing D compilers using that GDC.
So whenever we had a new requirement for new features, then it was going to become a multi-step process. That wasn't a problem for us but would be for the package maintainers. If we were doing good, then we weren't putting too much pressure on them. Most of them did it in their spare time, making sure they had D compilers for their platforms. If we made the bootstrapping process more complicated for them, they wouldn't appreciate it.
Iain said that it was 2024 and people were still inventing new CPUs. He'd had Chinese guys inventing their own MIPS CPU having to drag out the old GDC version and port it to their CPU just to get LDC and DMD working on it. That was another modern chip that was up and coming. It was keeping those guys happy having a modern version of the D compiler rather than the C++ version so that they could jump to the latest. So that older bootstrap version was completely invaluable.
Walter said okay. It wasn't critical that the D compiler source code be made safe. It was just something he would like to do. But if it was going to cause a lot of downstream problems, then of course, what else could we do?
Iain said we'd have to make the documentation very loud and very explicit. GDC did pretty well at this, explaining what you had to do if you were starting from a given version of the compiler because certain versions of GDC were written with a specific C++ standard. To get to the latest, you had to go through these versions from whatever your starting point was. We should agree to do the same for DMD as well.
Rikki noted that Elias had done a new dockerization image of LDC which did the bootstrap from the LTS version of it up to the latest. He said we should be able to dump the compiler code base as C++, and then use that to bootstrap the same compiler version. He'd been thinking about that for a long time. It wasn't a problem today, but it would become a problem down the road.
Dennis asked if he meant exporting the compiler source as C++, and Rikki said yes. Martin said he very much disagreed. It wasn't like clang was transformable to C code so it could be bootstrapped with a C compiler.
Regarding the LTS version of LDC, he had dropped it because he didn't want to backport platform support in the compiler, in the runtime, in Phobos, into a very old version with many, many, many changes in between, just to get a bootstrap. That was stuff that Iain had already taken care of. That was extremely important work.
He said at some point we'd end up in a situation where we wouldn't be able to compile the latest with a very old compiler. There would be some steps needed in between. But any changes we made should be simple stuff. We could add @safe here or there, or use native bitfields, or whatever. We just had to make a very conscious decision to introduce new steps only when we really needed to.
Iain added that whenever we introduced a new feature to the compiler implementation, it shouldn't be anything fringe. It should be a well-established feature that was stable, that we knew was working, and that had been working happily for at least five years.
Martin suggested using cross-compilation when experimenting on new platforms, and the discussion veered off onto that for a while. Then Dennis brought us back to the original point.
He thought we all agreed that the bootstrap situation made it kind of complex to add new printf features. He wondered if there could be an alternative to, e.g., error("%s", expr.toChars()), where we used the printf format that included the length and had a function that could return a tuple of the length and pointer that was compatible with C varargs, e.g., error("%.*s", expr.toPrintfTuple().expand). This would be compatible with the old compiler. The new compiler could do its safety checks, but the old compiler would still work without them. This would allow us to make a printf-based error interface safe with new compilers while not breaking anything. We'd just have to ditch the magic format string rewriting in the DIP.
Martin said that sounded valuable. All we needed was to make sure it compiled with the older compilers. Because our test suite was using newer compilers as well, this would ensure we had test coverage for the implementation of the new thing.
Walter added that the goal of fixing printf here wasn't just to fix printf, but to get rid of the incentive to use C strings in the front end. Right now, half of the data structures used C strings and the other half used D strings. Fixing the printf issue would enable us to tilt the source code toward using D strings everywhere.
Dennis noted that toPrintfTuple could just convert a D string to a printf tuple. Walter thought it was a good idea. Mathias agreed and asked why we were still using printf strings in 2024. We had type information. Why were we even passing %s in std.format? Tango had a better format for it. C#, Java... they had all solved this problem differently. Why were we using it?
Walter said it was because writeln sucked. Mathias asked why we couldn't fix it. Walter said that right now that was on Adam. They had discussed it. The problem was that writeln was absurdly complicated. If you put it or writefln in a piece of code, you'd get a blizzard of template instantiations. That made it really difficult when you were looking at code dumps to try to isolate a problem. With printf it was really simple. It was just a function call: push a couple of arguments on the stack, call a function, done.
Another issue was that writeln itself was a bunch of templates. The error sink was an abstract interface. He thought that was an ideal use case for an abstract interface and it worked great. writeln was not an abstract interface. It was an overly complicated system.
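To illustrate the contrast Walter drew, here is a minimal sketch of a sink behind an abstract interface. The names MessageSink, CollectingSink, and checkName are hypothetical and deliberately simplified; this is not DMD's actual error sink declaration. The point is only that callers go through one virtual call with no template instantiations involved.

```d
/// Hypothetical, simplified sink; not DMD's actual declaration.
interface MessageSink
{
    void error(const(char)[] msg);
}

/// One possible implementation: collects messages, e.g. for tests.
final class CollectingSink : MessageSink
{
    string[] messages;

    void error(const(char)[] msg)
    {
        // Copy the message so the sink owns what it stores.
        messages ~= msg.idup;
    }
}

/// Callers depend only on the interface, not on how messages are handled.
void checkName(MessageSink sink, const(char)[] name)
{
    if (name.length == 0)
        sink.error("empty identifier");
}
```

Swapping in a sink that prints to stderr or discards messages would not require changing checkName.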
Dennis asked if a viable alternative to printf-based errors could be that we created a minimal template version of writeln for DMD, since DMD mostly only concatenated strings and occasionally formatted an integer. Walter said we could write our own printf, but the one in the C standard library was the most battle-tested, debugged, and optimized. Dennis emphasized that we only needed to concatenate strings. We didn't need things like battle-tested float conversion for that.
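A minimal formatter of the kind Dennis described might look something like the sketch below. It is an assumption-laden illustration, not a proposal from the meeting: the names put and message are invented here, and it handles only strings and integers, without Phobos.

```d
/// Hypothetical minimal formatter: appends a string to the buffer.
void put(ref char[] buf, const(char)[] s)
{
    buf ~= s;
}

/// Appends a signed integer in decimal.
void put(ref char[] buf, long value)
{
    char[20] tmp; // enough for the digits of any 64-bit magnitude
    size_t i = tmp.length;
    const negative = value < 0;
    // Work with the magnitude as unsigned so long.min does not overflow.
    ulong u = negative ? 0 - cast(ulong) value : cast(ulong) value;
    do
    {
        tmp[--i] = cast(char)('0' + u % 10);
        u /= 10;
    } while (u != 0);
    if (negative)
        buf ~= '-';
    buf ~= tmp[i .. $];
}

/// Concatenates its arguments; the only template is this thin variadic wrapper.
string message(Args...)(Args args)
{
    char[] buf;
    foreach (arg; args) // unrolled at compile time, one put call per argument
        put(buf, arg);
    return buf.idup;
}
```

For example, message("line ", 42, ": bad token") builds the string "line 42: bad token" with a single template instantiation per distinct argument list.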
Jonathan suggested we just wrap it. Dennis said that was also okay.
Martin said that DMD at the moment didn't depend on Phobos because doing so was a big can of worms. We could write our own stripped-down version of writeln that we needed. But then there were similar things in other parts of the code base, like path manipulations and stuff. All of that was stuff we already had in Phobos, yet had to implement from scratch using some dirty malloc stuff. That would be one of the first problems.
The second problem was using C varargs for error strings and such. This was one of those ABI issues that were hard to get right. They were a very platform-specific, special-case, complex part of the ABI. This introduced difficulty when conquering a new platform in trying to get the compiler to compile itself. If we could ditch C varargs and use proper D stuff, that would make it all easier.
Adam said that he had talked about simplifying writeln and the std.conv stuff, but he'd found that people protested when anyone suggested getting rid of any templates they liked. He was on board with what Walter said about writeln being problematic because it was a blizzard of templates. But he kept hearing from people that we shouldn't remove these templates.
Jonathan said that we couldn't be removing the templates for range-based stuff. For things like the write and std.conv families, the problem was that they were using templates to take your arbitrary type and convert it to a string. The alternative was to hand them a string, which meant you had to do the work upfront yourself. That might work internally in DMD, but not in Phobos.
Regardless, the implementation we had could be improved. It was quite slow from what he'd seen. So even if we opted to keep the blizzard of templates, we needed to redo it.
Walter reiterated that printf was much maligned, but it was the most debugged, optimized function in history. Maybe a writeln could be implemented that just forwarded calls safely to printf. It had its problems, which was why he had put forward the safe printf proposal.
He said Jonathan was absolutely correct that templates gave a lot of advantages to writeln. He wasn't arguing with that. But when trying to debug the compiler, dealing with writeln was a giant pain. That was why he always went back to printf. And he didn't want the compiler dependent on writeln, because then we'd be unable to bootstrap the compiler.
Jonathan agreed that we didn't want DMD dependent on Phobos. In that case, maybe just wrapping printf with something that took a string and converted it to a C string was the way to go. Walter said that was what the safe printf proposal did; it just had the compiler rewrite the printf expression to make it memory-safe. Jonathan said we could avoid calling printf directly with a wrapper function instead. Either way, the compiler's situation was different from the general case.
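The wrapper approach Jonathan mentioned could be sketched as below. The function names toCStringCopy and printLine are hypothetical, and this is just one possible shape: it copies the D string into a zero-terminated buffer using only the C runtime, so nothing here depends on Phobos.

```d
import core.stdc.stdio : printf;
import core.stdc.stdlib : free, malloc;
import core.stdc.string : memcpy;

/// Hypothetical helper: returns a malloc'ed, zero-terminated copy of `s`.
/// The caller is responsible for calling free on the result.
char* toCStringCopy(const(char)[] s)
{
    auto p = cast(char*) malloc(s.length + 1);
    if (p is null)
        return null;
    if (s.length)
        memcpy(p, s.ptr, s.length);
    p[s.length] = '\0'; // D slices carry no terminator, so add one here
    return p;
}

/// Hypothetical wrapper: callers pass a D string and never touch printf.
void printLine(const(char)[] s)
{
    auto p = toCStringCopy(s);
    if (p is null)
        return;
    scope (exit) free(p);
    printf("%s\n", p);
}
```

The allocation per call is the cost of this design; passing the length with "%.*s" avoids the copy but keeps the format string visible at the call site.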
He said we definitely needed to rewrite writeln to make it more efficient. It wasn't appropriate for the compiler, though, since it was doing all kinds of stuff the compiler didn't need.
We left the topic there and moved on to the next one.
void-initializing a ref variable
Dennis asked if everyone agreed that void initializing a ref variable should be an error. The DIP didn't specify it, and he didn't think there was any use case for it. Walter said that was an error. No one objected.
Scopes and auto ref
Dennis asked if everyone agreed that the keywords auto ref on variables must appear together and should not combine when the keywords were in different scopes, e.g., auto { ref int x = 3; }. Walter said yes, kill that with fire.
Quirin said he'd noticed that when looking at the grammar, auto and ref didn't always need to be next to each other. It was possible, for example, to write ref const auto foo in a parameter list. He suggested we should ban that. Walter said it should be deprecated.
Conclusion
Given that some of us would be traveling on the second Friday in September, just before DConf, we agreed to schedule our next monthly meeting on the first Friday, September 6th, at 15:00 UTC.
If you have something you'd like to discuss with us in one of our monthly meetings, feel free to contact me and let me know.