Fear of Compiler Magic
August 02

I hear people complain about compiler magic a lot. Yes, being able to do everything in-language is nice, but compiler magic is inevitable and also can be very useful. `assert` is my favourite example. Things like Python’s `print` are more dubious. You can always make a better print function on your own, right? Whereas one assert is never going to transcend another assert, even if the way it prints its assertion failure is slow, who cares? The program is already functionally dead anyway. Do we really want C’s ‘everyone assert for themself’ problem?
And besides that, isn’t ‘compiler magic’ at its logical conclusion generally applicable to *any* task performed by the compiler? Exception handling, code optimisation, inlining, template expansion, `new`, adding two numbers? Compiler magic.

August 02
On 8/2/24 05:22, IchorDev wrote:
> I hear people complain about compiler magic a lot.

Probably this is partially inspired from here:
https://dconf.org/2018/talks/alexandrescu.html

> Yes, being able to do everything in-language is nice, but compiler magic is inevitable and also can be very useful. `assert` is my favourite example.

int foo(){ enforce(0); } // error
int foo(){ assert(0); } // ok

"No compiler magic" would e.g. mean: `enforce` can similarly influence definite return analysis. It's not inevitable that this is impossible.

> Things like Python’s `print` are more dubious. You can always make a better print function on your own, right? Whereas one assert is never going to transcend another assert, even if the way it prints its assertion failure is slow, who cares? The program is already functionally dead anyway. Do we really want C’s ‘everyone assert for themself’ problem?

Well, this is what assert does. The question is how it achieves it, and whether the same tools are accessible to user code that perhaps does _something different than assert_.

> And besides that, isn’t ‘compiler magic’ at its logical conclusion generally applicable to *any* task performed by the compiler?

Not if the tasks are properly decomposed into orthogonal components that are also available to the user. Anyway, there is quite a bit of existing magic, and it does cause some issues.

August 02

On Friday, 2 August 2024 at 03:22:28 UTC, IchorDev wrote:

> And besides that, isn’t ‘compiler magic’ at its logical conclusion generally applicable to any task performed by the compiler? Exception handling, code optimisation, inlining, template expansion, new, adding two numbers? Compiler magic.

A good programming language consists of a small set of orthogonal features that combine into a powerful language. Magic features are situational. They add the maintenance burden of an orthogonal feature, but instead of multiplying the language's expressiveness, they only add a constant to it. So what happens in practice when you pile on magic features?

To keep technical debt of large code bases under control, refactoring needs to happen. Refactoring relies on performing code transformations that result in equivalent semantics. As a programmer, you want to have actions you can always do safely, based on general facts:

  • Comments don't affect code
  • Unreachable code can be removed
  • Function definitions can be moved around

However, proposals for magic features tend to add more and more exceptions. I have seen proposals that would make `version(all) assert(0);` not always equivalent to `assert(0);`, or `x + 1` different from `x + (1)`. And most recently: Make printf safe.

You would think it's safe to transform this:

printf("x = %s\n", x);
printf("x = %s\n", x);

Into this:

const(char)* fmt = "x = %s\n";
printf(fmt, x);
printf(fmt, x);

But with magic printf format string rewrites, that transformation turns correct code into memory corrupting code when x is an int.

This doesn't mean magic features are always a bad idea. The `__FILE__` and `__LINE__` tokens are somewhat magical, and they break all the aforementioned refactoring equivalences, since any code movement (including comments!) can alter line numbers, which potentially alters the meaning of the program.

In practice of course, `__FILE__` and `__LINE__` are not used for control flow, only for logging, so it's not a big problem. But hopefully you see why many people are very wary of magic features, lest the language become a minefield of gotchas like the printf example.
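
To make the line-number point concrete, here is a minimal sketch. The helper name `currentLine` is made up for illustration; it relies only on the fact that a `__LINE__` default argument is evaluated at the call site.

```D
// Hypothetical helper: the default argument __LINE__ is evaluated at the
// call site, so the function returns the caller's line number.
size_t currentLine(size_t line = __LINE__)
{
    return line;
}

void main()
{
    size_t first = currentLine();
    // Deleting this comment line changes the difference below from 2 to 1.
    size_t second = currentLine();
    assert(second - first == 2);
}
```

As long as the value only ends up in a log message, nobody notices; once it feeds into a comparison like the one above, moving a comment changes behaviour.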

p.s. It even used to be the case that `__LINE__ + 0` was not the same as `__LINE__`! (https://issues.dlang.org/show_bug.cgi?id=18919)

August 03
On 8/2/2024 1:34 AM, Timon Gehr wrote:
> Well, this is what assert does. The question is how it achieves it, and whether the same tools are accessible to user code that perhaps does _something different than assert_.

One reason for some of the builtin stuff is to not tempt people to write their own. Having standardized ways to do common tasks is a big win for making code understandable by others, which is advantageous in a team environment.

For example, the `debug` conditionals came about from my discussions with a veteran Microsoft programming manager. He complained that every project invented their own scheme for doing debug conditionals, making it unnecessarily difficult to share code.

Unittests and Ddoc are other successful examples.

Lisp is a language that enables building one's own programming language on top of it. It more or less requires it.

The result is every Lisp user invents their own language, incompatible with any other Lisp user, and so successful Lisp programs don't survive their creators. It's why Lisp has never really caught on, much to the bafflement of Lisp advocates.
August 03
On 8/2/2024 2:29 AM, Dennis wrote:
> You would think it's safe to transform this:
> ```D
> printf("x = %s\n", x);
> printf("x = %s\n", x);
> ```
> 
> Into this:
> ```D
> const(char)* fmt = "x = %s\n";
> printf(fmt, x);
> printf(fmt, x);
> ```
> 
> But with magic printf format string rewrites, that transformation turns correct code into memory corrupting code when x is an int.

The transformation won't compile if the call is marked @safe, and won't compile with the various proposals to increase the default safety-ness.

It is in the same box as:

```
int[] array;
x = array[5];
```

and rewriting as:

```
int[] array;
x = *(array.ptr + 5);
```
August 03

On Saturday, 3 August 2024 at 17:02:55 UTC, Walter Bright wrote:

> The transformation won't compile if the call is marked @safe, and won't compile with the various proposals to increase the default safety-ness.

This is an interesting aspect that I forgot to mention: Whenever I bring up such examples of problematic cases resulting from 'magic', the defense is often akin to: "Sure, it fails in that theoretical / pathological case, but when are you going to find that in REAL code?"

Which can be fair. Like I said, a bit of magic doesn't always have to be problematic. And while I can't tell you when or how these problem cases are going to crop up, do beware that there might just be plenty of opportunity, just from a statistical standpoint.

It's like Intel saying, in response to the Pentium FDIV bug: "Well, when are you going to divide 4195835 by 3145727 needing all the precision? Give me an example of when you would divide those specific numbers in a REAL application."

That's hard to answer upfront, but with enough users doing enough floating point math, eventually you get some great stories: When working on Quake, Michael Abrash spent hours tracking down a graphical glitch, until finding out with the help of a friend from Intel that it was the infamous hardware bug.

Going back to printf, it's possible hardly anyone will ever hit this problem. But when eventually there are thousands of calls to magic printf out there (or snprintf, which still can't be called in @safe code!), each one is a contender to be part of that one spectacular failure. That's why I used the word "minefield": There's no guarantee of things blowing up; it depends on the density of mines and the number of people crossing.

So you could try some sort of risk assessment for magic features, but I prefer to just avoid them as much as possible and look for alternatives.

> It is in the same box as:

That example is different, because the first program wasn't correct to begin with. Or if it were, then the refactoring would result in the second program also being correct. In my example, only the first program was correct.
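
A minimal sketch of the array case, for comparison. The function names `checked` and `unchecked` are made up for illustration. Assuming the array has at least six elements, both forms read the same value; the pointer form only drops the bounds check, and it is rejected in `@safe` code.

```D
// Bounds-checked indexing: allowed in @safe code.
@safe int checked(int[] array)
{
    return array[5];
}

// Raw pointer arithmetic: no bounds check, only allowed in @system code.
@system int unchecked(int[] array)
{
    return *(array.ptr + 5);
}

void main()
{
    int[] array = [10, 11, 12, 13, 14, 15];

    // With six elements, both versions are correct and agree.
    assert(checked(array) == 15);
    assert(unchecked(array) == 15);

    // The pointer version does not compile inside a @safe function literal.
    static assert(!__traits(compiles, () @safe {
        int[] a;
        return *(a.ptr + 5);
    }));
}
```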

August 04
On 8/3/24 18:54, Walter Bright wrote:
> On 8/2/2024 1:34 AM, Timon Gehr wrote:
>> Well, this is what assert does. The question is how it achieves it, and whether the same tools are accessible to user code that perhaps does _something different than assert_.
> 
> One reason for some of the builtin stuff is to not tempt people to write their own. Having standardized ways to do common tasks is a big win for making code understandable by others, which is advantageous in a team environment.
> ...

Sure, I am in favor of built-in assert. This thread is however about magic.

> For example, the `debug` conditionals came about from my discussions with a veteran Microsoft programming manager. He complained that every project invented their own scheme for doing debug conditionals, making it unnecessarily difficult to share code.
> 
> Unittests and Ddoc are other successful examples.
> ...

Off-topic, but yes.

> Lisp is a language that enables building one's own programming language on top of it. It more or less requires it.
> 
> The result is every Lisp user invents their own language, incompatible with any other Lisp user, and so successful Lisp programs don't survive their creators. It's why Lisp has never really caught on, much to the bafflement of Lisp advocates.

Sure.
August 04
On 8/3/24 19:02, Walter Bright wrote:
> On 8/2/2024 2:29 AM, Dennis wrote:
>> You would think it's safe to transform this:
>> ```D
>> printf("x = %s\n", x);
>> printf("x = %s\n", x);
>> ```
>>
>> Into this:
>> ```D
>> const(char)* fmt = "x = %s\n";
>> printf(fmt, x);
>> printf(fmt, x);
>> ```
>>
>> But with magic printf format string rewrites, that transformation turns correct code into memory corrupting code when x is an int.
> 
> The transformation won't compile if the call is marked @safe, and won't compile with the various proposals to increase the default safety-ness.
> ...

The simple fact is that the magic treatment of the string literal leads to some trouble. I.e., this is a good illustration of how magic instills fear.

> It is in the same box as:
> 
> ```
> int[] array;
> x = array[5];
> ```
> 
> and rewriting as:
> 
> ```
> int[] array;
> x = *(array.ptr + 5);
> ```

Not at all. You just orthogonally removed the range check. This is a completely unrelated case. Nothing surprising happens here.
August 05

On Friday, 2 August 2024 at 08:34:39 UTC, Timon Gehr wrote:

> On 8/2/24 05:22, IchorDev wrote:
>> I hear people complain about compiler magic a lot.
>
> Probably this is partially inspired from here:
> https://dconf.org/2018/talks/alexandrescu.html
>
>> Yes, being able to do everything in-language is nice, but compiler magic is inevitable and also can be very useful. `assert` is my favourite example.
>
> int foo(){ enforce(0); } // error
> int foo(){ assert(0); } // ok
>
> "No compiler magic" would e.g. mean: `enforce` can similarly influence definite return analysis. It's not inevitable that this is impossible.

This specific case could be solved with Enum Parameters: `enforce` could detect `0` (or `false`) specifically and have return type `noreturn`.
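
Enum Parameters are only a proposal, so the following sketch does not use them. It only illustrates the existing `noreturn` type on a user-defined function; the names `fail` and `positiveOrFail` are hypothetical.

```D
// Hypothetical user-defined failure helper. The return type `noreturn`
// tells the compiler that a call to it never returns normally.
noreturn fail(string msg)
{
    throw new Exception(msg);
}

int positiveOrFail(int x)
{
    // `noreturn` converts implicitly to any type, so the conditional
    // expression has type int even though one branch never yields a value.
    return x > 0 ? x : fail("x must be positive");
}

void main()
{
    assert(positiveOrFail(3) == 3);

    bool threw = false;
    try
    {
        positiveOrFail(-1);
    }
    catch (Exception e)
    {
        threw = true;
    }
    assert(threw);
}
```

What is still missing for `enforce` is a way to pick this return type based on the value of the argument, which is the part Enum Parameters would have to supply.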

August 05

On Sunday, 4 August 2024 at 17:41:52 UTC, Timon Gehr wrote:

> On 8/3/24 19:02, Walter Bright wrote:
>> On 8/2/2024 2:29 AM, Dennis wrote:
>>> You would think it's safe to transform this:
>>>
>>> printf("x = %s\n", x);
>>> printf("x = %s\n", x);
>>>
>>> Into this:
>>>
>>> const(char)* fmt = "x = %s\n";
>>> printf(fmt, x);
>>> printf(fmt, x);
>>>
>>> But with magic printf format string rewrites, that transformation turns correct code into memory corrupting code when x is an int.
>>
>> The transformation won't compile if the call is marked @safe, and won't compile with the various proposals to increase the default safety-ness.
>> ...
>
> The simple fact is that the magic treatment of the string literal leads to some trouble. I.e., this is a good illustration of how magic instills fear.

And it’s why I suggested using `__printf` instead. It can be an intrinsic (a keyword even), and be specified to require a compile-time constant string as its first argument, i.e. a string literal or something synthesized by CTFE, but nothing run-time.
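
This is not the suggested `__printf`, but Phobos already has a library-level analogue of the "format string must be a compile-time constant" rule: the template overload of `std.format.format`. A minimal sketch:

```D
import std.format : format;

void main()
{
    int x = 42;

    // Format string passed as a template argument: it has to be known at
    // compile time and is checked against the argument types.
    string a = format!"x = %s"(x);
    assert(a == "x = 42");

    // Moving the literal into a run-time variable forces a switch to the
    // run-time overload; the result stays the same.
    string fmt = "x = %s";
    string b = format(fmt, x);
    assert(b == "x = 42");

    // format!fmt(x) is rejected here, because fmt is not a compile-time
    // constant.
}
```

The refactoring from the printf example then either keeps its meaning or fails to compile; it cannot silently change behaviour.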
