On Saturday, 23 December 2023 at 22:55:34 UTC, Bruce Carneal wrote:
> Is it really easy, trivial even, to button things up with either proposal or is one easier to use correctly than the other?
1027 makes it possible to do some cases correctly, but it is difficult to trust in the general case, since it makes no attempt at type safety and its format string cannot distinguish user-injected text from format string literals.
So, when you process a 1027 style format string and see a %, was that part of the string, or was it injected by the compiler to indicate a param placeholder? What if the user forgets to escape something, or passes the wrong syntax as a custom specifier? These are all unforced errors in the design of 1027, which led to its DIP being rejected by community review.
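To make the ambiguity concrete, here is a minimal sketch. 1027 was never merged, so the lowering is written out by hand, assuming the "one format string plus arguments" shape the DIP describes; the literal text and the variable `value` are made up for illustration.

```d
import std.exception : assertThrown;
import std.format : format, FormatException;

void main()
{
    string value = "abc";

    // Hand-written stand-in for a 1027-style lowering of a literal whose own
    // text contains "%s". The "%s" the user typed and the "%s" inserted as a
    // placeholder are the same characters once they share one string.
    enum lowered = "use %s for strings, got %s";

    // Two specifiers, one argument: the mistake only surfaces at run time.
    assertThrown!FormatException(format(lowered, value));

    // The user has to know to escape their own text by hand.
    enum escaped = "use %%s for strings, got %s";
    assert(format(escaped, value) == "use %s for strings, got abc");
}
```

Nothing in the type system helps the callee here; it only ever sees a `string`.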
On the other hand, 1036e corrects these flaws, while adding the possibility for CTFE manipulation, aggregation, and verification of all string literals passed.
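As a rough sketch of why that works, assuming the form the feature later shipped in (`core.interpolation`, DMD 2.108 and up): every literal piece arrives as its own type, so a callee can tell programmer-written text from injected values at compile time. The `Pieces` struct and `inspect` function are hypothetical names for this illustration, not part of the proposal or the sample repository.

```d
import core.interpolation;

// Hypothetical container for what the callee can observe.
struct Pieces
{
    string[] literals;     // text the programmer wrote in the literal
    string[] expressions;  // source text of each $(...) expression
    size_t values;         // number of injected runtime values
}

Pieces inspect(Args...)(InterpolationHeader, Args args, InterpolationFooter)
{
    Pieces p;
    foreach (arg; args) // unrolled at compile time over the sequence
    {
        alias T = typeof(arg);
        static if (is(T == InterpolatedLiteral!lit, string lit))
            p.literals ~= lit;
        else static if (is(T == InterpolatedExpression!code, string code))
            p.expressions ~= code;
        else
            p.values++; // anything else is a value the user injected
    }
    return p;
}

void main()
{
    string name = "100% %s"; // hostile-looking data stays plain data
    int qty = 3;

    auto p = inspect(i"Hello $(name), you have $(qty) items");
    assert(p.literals == ["Hello ", ", you have ", " items"]);
    assert(p.expressions == ["name", "qty"]);
    assert(p.values == 2);
}
```

Because the literals are template arguments, they are also available to CTFE, which is what makes the compile-time verification in the later examples possible.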
I encourage everyone to look at the sample repository here:
https://github.com/adamdruppe/interpolation-examples/
Several of the use cases selected for that specifically demonstrate how it gives the users the convenient syntax they expect from string interpolation, yet actually lowers to the correct semantics for each specialized problem domain.
Example #1, basics, shows that when a plain string is the right result, getting one is quite easy.
Example #2, formatting, shows how format strings can be attached and processed in library code, including compile-time verification associated with the data types passed.
Example #3, printf, shows how you can adapt the advanced usage D provides to be compatible with legacy functions in a zero-runtime-cost manner.
Example #4, internationalization, builds off the techniques shown in the previous examples to use the industry-standard GNU gettext library, coupled with automatic aggregation of translatable strings at compile time, to provide full context to non-developers to add new language packs at run time.
The next three examples are directly relevant to your question. They address common problems web developers face, where security holes are often introduced because strings are convenient but no longer appropriate for correctness.
Example #5, urls, shows how you can build off the previously demonstrated techniques, to make a directly-manipulable high-level object out of what looks to be a simple, familiar string. Since it works at a high level, aware of the surrounding context, it ensures each injected component is encoded appropriately for that context.
Example #6, sql, directly avoids the trap of SQL injection by separating code and data - delegating the recombination of them to the database engine to do it safely and correctly, yet appearing to the user to be a convenient mixture of the two! Notice how the usage example, at the top level of the repository, looks like string interpolation, yet the implementation, in the lib folder, actually binds the data to a prepared statement in a structured way, like the guides say you are supposed to!
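The general shape of that technique can be sketched without a database. This is not the code from the repository: `PreparedQuery` and `prepare` are hypothetical names, the `?1`, `?2` placeholder style is the one sqlite accepts, and the parameters are flattened to strings only to keep the sketch short (a real binding would keep their types). It assumes `core.interpolation` as shipped in DMD 2.108 and up.

```d
import core.interpolation;
import std.conv : to;

// Hypothetical result type: SQL text with placeholders, data kept apart.
struct PreparedQuery
{
    string sql;
    string[] params;
}

PreparedQuery prepare(Args...)(InterpolationHeader, Args args, InterpolationFooter)
{
    PreparedQuery q;
    foreach (arg; args)
    {
        alias T = typeof(arg);
        static if (is(T == InterpolatedLiteral!lit, string lit))
            q.sql ~= lit; // programmer-written SQL goes in verbatim
        else static if (is(T == InterpolatedExpression!code, string code))
        {
            // source text of the expression; not needed here
        }
        else
        {
            // injected value: never spliced into the SQL text,
            // only referenced by a numbered placeholder
            q.params ~= arg.to!string;
            q.sql ~= "?" ~ q.params.length.to!string;
        }
    }
    return q;
}

void main()
{
    string name = "Robert'); DROP TABLE students;--";
    int id = 7;

    auto q = prepare(i"SELECT * FROM students WHERE name = $(name) AND id = $(id)");
    assert(q.sql == "SELECT * FROM students WHERE name = ?1 AND id = ?2");
    assert(q.params == [name, "7"]);
}
```

The call site reads like string interpolation, but the hostile input never becomes part of the SQL text; it only travels as a bound parameter.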
Finally, example #7 directly avoids the trap of XSS holes by, again, separating HTML structure from added data and ensuring that correct encoding and valid data positioning are applied in all contexts. With CTFE validation, it prevents common mistakes that can manifest as bugs or exploitable holes in production, and by working at a high level, using object representations instead of raw strings, it ensures all semantic invariants are maintained from creation to consumption. It goes beyond just bringing web best practices to the D programming language: it also enables innovation by coupling these security guidelines and development best practices with D's unique features for static analysis and compile-time processing.
Similar examples could be written for shell scripting, JSON, and more, but I thought this was enough to make the point and demonstrate the relevant patterns.
By the end of this year, when this new feature is merged, D will cement its position as an innovating pioneer, learning the lessons from the past and applying their best libraries in a whole new way.