December 06, 2022
On Tuesday, 6 December 2022 at 10:03:24 UTC, Arjan wrote:
> On Monday, 5 December 2022 at 23:58:58 UTC, Timon Gehr wrote:
>> On 12/5/22 20:57, H. S. Teoh wrote:
>> Default initialization does not even fix all initialization issues, it just makes them reproducible. Anyway, I think neither default initialization nor uninitialized variables are the right solution, but you kind of have to do it this way given how scoping works in C++ and in D.
>
> Now I'm curious, what, in your opinion, would be best for initialization?
> How is C++/D scoping limiting in this?

Compilers can do control flow analysis, so they can force you to initialize variables before you use them. This is the right solution to that problem.
December 06, 2022
On Monday, 5 December 2022 at 23:58:58 UTC, Timon Gehr wrote:
> On 12/5/22 20:57, H. S. Teoh wrote:
>> Similarly, D's initialized-by-default variables are often touted as a
>> big thing, but overall issues with uninitialized variables only
>> constitute about 1% of the total issues.
>
> Default initialization does not even fix all initialization issues, it just makes them reproducible. Anyway, I think neither default initialization nor uninitialized variables are the right solution, but you kind of have to do it this way given how scoping works in C++ and in D.

I wouldn't see lack of default initialization as a source of bugs, rather as an attack vector. The concern isn't that uninitialized data points to garbage and causes your program to do something wild or unexpected. The concern is that it might point to useful information.
December 06, 2022

On Tuesday, 6 December 2022 at 04:35:18 UTC, Siarhei Siamashka wrote:
> [...] end users. While D compilers don't offer any reasonable protection. Except for GDC, which supports -ftrapv option as an undocumented "Easter egg".

What about these modules?
https://dlang.org/phobos/std_checkedint.html
https://dlang.org/phobos/core_checkedint.html

December 07, 2022
On 12/6/22 19:24, Iain Buclaw wrote:
> On Monday, 5 December 2022 at 23:58:58 UTC, Timon Gehr wrote:
>> On 12/5/22 20:57, H. S. Teoh wrote:
>>> Similarly, D's initialized-by-default variables are often touted as a
>>> big thing, but overall issues with uninitialized variables only
>>> constitute about 1% of the total issues.
>>
>> Default initialization does not even fix all initialization issues, it just makes them reproducible. Anyway, I think neither default initialization nor uninitialized variables are the right solution, but you kind of have to do it this way given how scoping works in C++ and in D.
> 
> I wouldn't see lack of default initialization as a source of bugs, rather as an attack vector. The concern isn't that uninitialized data points to garbage and causes your program to do something wild or unexpected. The concern is that it might point to useful information.

True, that's a concern (default initialization does fix _some_ issues around initialization).
December 07, 2022
On 12/6/22 11:03, Arjan wrote:
> On Monday, 5 December 2022 at 23:58:58 UTC, Timon Gehr wrote:
>> On 12/5/22 20:57, H. S. Teoh wrote:
>> Default initialization does not even fix all initialization issues, it just makes them reproducible. Anyway, I think neither default initialization nor uninitialized variables are the right solution, but you kind of have to do it this way given how scoping works in C++ and in D.
> 
> Now I'm curious, what, in your opinion, would be best for initialization?

Ideally you just eliminate those cases where a programmer feels like they have to leave a variable uninitialized. The whole concept of "uninitialized variable" does not make a whole lot of sense from the perspective of a safe high-level programming language.

> How is C++/D scoping limiting in this?
> 

Variables are scoped within the innermost block that they are declared in. Languages like Python that don't have block-local scoping just don't have this particular problem (there's plenty of things to dislike about Python, but this is something it got right I think):

```python
# note there is no x declared here
if cond:
    x = f()
else:
    x = g()
print(x)
```
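For comparison, here is a rough D sketch of the same branch. The functions `f`, `g`, `pick` and `pickTernary` are placeholders made up for illustration, not anything from the thread. Because a variable declared inside a branch would be scoped to that branch, `x` has to be declared before the `if`, where it picks up `int.init` whether or not that value is wanted. In simple cases a conditional expression avoids the separate declaration.

```d
import std.stdio : writeln;

// Placeholder functions standing in for f() and g() from the Python snippet.
int f() { return 1; }
int g() { return 2; }

int pick(bool cond)
{
    int x;      // declared up front so it outlives both branches; holds int.init (0) here
    if (cond)
        x = f();
    else
        x = g();
    return x;
}

int pickTernary(bool cond)
{
    // Initialized at the point of declaration, so it can also be const.
    const int x = cond ? f() : g();
    return x;
}

void main()
{
    writeln(pick(true), " ", pick(false));
    writeln(pickTernary(true), " ", pickTernary(false));
}
```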

A particularly egregious case is the do-while loop:

```d
do{
    int x=4;
    if(condition){
        ...
        x++;
    }
    ...
}while(x<10); // error
```

Just... why? x)
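
The usual workaround can be sketched as follows, assuming the variable can simply be hoisted out of the loop body (the function `stepsUntil` is invented for this example). The declaration moves to the enclosing scope only so that the loop condition can see it, which is exactly the kind of "declare first, really initialize later" pattern discussed above.

```d
import std.stdio : writeln;

// Counts how many iterations run before the square of the step index reaches `limit`.
int stepsUntil(int limit)
{
    int steps = 0;
    int x;                  // hoisted out of the body only so the loop condition can see it
    do
    {
        x = steps * steps;  // the value the condition actually depends on
        steps++;
    } while (x < limit);    // compiles: x lives in the enclosing scope
    return steps;
}

void main()
{
    writeln(stepsUntil(10)); // prints 5
}
```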
December 07, 2022

On Tuesday, 6 December 2022 at 22:26:17 UTC, Sergey wrote:
> On Tuesday, 6 December 2022 at 04:35:18 UTC, Siarhei Siamashka wrote:
>> [...] end users. While D compilers don't offer any reasonable protection. Except for GDC, which supports -ftrapv option as an undocumented "Easter egg".
>
> What about these modules?
> https://dlang.org/phobos/std_checkedint.html
> https://dlang.org/phobos/core_checkedint.html

Imagine that you have several million lines of D code (a big popular browser) and you want to do something to safeguard against integer overflow bugs and security issues (or at least mitigate them). How would these modules help?

In my opinion, the std.checkedint module is completely useless and there are no practical scenarios in the real world where it can help in any meaningful way. I see no real alternative to -ftrapv or UBSan for integer overflow diagnostics in large software projects. But the core.checkedint module is surely useful once you already know the exact part of the code where the overflow happens and needs to be patched up.
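
For readers who haven't used these modules, a small sketch of what "patching up" a known spot might look like. The helper names `sumOverflows` and `saturatingSum` are made up for illustration. `adds` from core.checkedint reports overflow through a flag, and `checked!Saturate` from std.checkedint clamps instead of wrapping. Both have to be applied by hand at each call site, which is the scaling problem described above.

```d
import core.checkedint : adds;
import std.checkedint : checked, Saturate;
import std.stdio : writeln;

// Reports whether a + b overflows int, using the flag-based core.checkedint API.
bool sumOverflows(int a, int b)
{
    bool overflow = false;              // adds() sets the flag on overflow and never clears it
    const int sum = adds(a, b, overflow);
    return overflow;
}

// Adds with clamping to int.min / int.max via the Saturate hook of std.checkedint.
int saturatingSum(int a, int b)
{
    auto s = checked!Saturate(a);
    s += b;
    return s.get;
}

void main()
{
    writeln(sumOverflows(40, 2));
    writeln(sumOverflows(int.max - 1, 50));
    writeln(saturatingSum(int.max - 1, 50) == int.max);
}
```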

My reply would be incomplete without mentioning that D compilers do have some limited static analysis at compile time, intended to improve integer overflow safety: https://dlang.org/spec/type.html#vrp
But this analysis doesn't catch everything, and it also sometimes gets in the way unnecessarily by forcing type casts. So it's not good enough. Want to see it in action? Here's one example:

```d
import std.stdio;
void main() {
  long bigsum = 0;
  int min_a = int.max - 50, max_a = int.max - 1;
  int min_b = 50, max_b = 100;
  foreach (a ; min_a .. max_a + 1) {
    foreach (b ; min_b .. max_b + 1) {
      // Compiles fine, but overflows at runtime
      int a_plus_b = a + b;
      bigsum += a_plus_b;
    }
  }
  byte min_c = 1, max_c = 50;
  byte min_d = 1, max_d = 50;
  foreach (c ; min_c .. max_c + 1) {
    foreach (d ; min_d .. max_d + 1) {
      // Won't compile without this explicit cast
      byte c_plus_d = cast(byte)(c + d);
      bigsum += c_plus_d;
    }
  }
  writeln(bigsum);
}
```

```
$ gdc -ftrapv test.d && ./a.out
Aborted

$ gdc test.d && ./a.out
-5471788083929
```
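
To isolate what value range propagation does and doesn't accept, here is a smaller sketch (the function `lowByte` is invented for illustration). The compiler allows a narrowing assignment only when it can prove from the expression itself that the result fits. It does not track the runtime values of variables, which is why the `byte + byte` case above needs a cast.

```d
import std.stdio : writeln;

// Masking bounds the result to 0 .. 255, so the implicit narrowing to ubyte is accepted.
ubyte lowByte(int i)
{
    return i & 0xFF;
}

void main()
{
    // A plain int could hold anything, so narrowing it implicitly is rejected.
    static assert(!__traits(compiles, { int j; ubyte u = j; }));
    // With the mask, the provable range fits in a ubyte.
    static assert( __traits(compiles, { int j; ubyte u = j & 0xFF; }));
    // byte + byte is promoted to int, and the provable range (-256 .. 254) exceeds byte.
    static assert(!__traits(compiles, { byte a, b; byte s = a + b; }));
    static assert( __traits(compiles, { byte a, b; int s = a + b; }));

    writeln(lowByte(1000)); // 1000 is 0x3E8, so this prints 232
}
```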
December 08, 2022
Very good post!

On 12/5/2022 11:57 AM, H. S. Teoh wrote:
> Similarly, D's initialized-by-default variables are often touted as a
> big thing, but overall issues with uninitialized variables only
> constitute about 1% of the total issues.

True, I did not encounter this bug that often. But, and this is a big but, they cost me a *lot* of time to find, sometimes days. This is because when you'd close in on where the bug was, it would dance away. The way it exhibits is totally dependent on everything else in the program.

That's why it's a very serious problem.
December 08, 2022
On 12/5/2022 11:57 AM, H. S. Teoh wrote:
> Most interesting point here is that the largest category of bugs is
> use-after-free bugs, constituting 34% of the reported issues.  (Arguably
> we should include "object lifecycle/lifetime" in this category, but I
> think those refer to bugs in the JS implementation. In any case, it
> doesn't change the conclusion.)  This is strong evidence that memory
> management is a major source of bugs, and a strong argument for GC use
> in application code.

I'm a bit surprised at this, but maybe I shouldn't be. C++ doesn't have a good feature set to prevent use-after-free.


> D's bounds checks are often touted as a major feature to prevent issues
> with buffer overflow and out-of-bounds accesses.  Interestingly, "buffer
> overflow" and "out of bounds..." add up only to about 14% of the total
> issues.  Nothing to sneeze at, but nonetheless not as big an issue as
> use-after-free bugs.

The language here is C++, and C++ has touted that if you use the latest C++ features, you'll have fewer bounds problems. I suspect that is the cause of the reduction. With C code, the percent is a lot higher.

December 08, 2022
On 12/5/2022 11:57 AM, H. S. Teoh wrote:
> Most interestingly, "double free" only has 3 counts of the total, less
> than 1%, compared with "use after free", which constitutes the largest
> category of issues.  This seems to suggest that it's not memory
> management in general that's necessarily problematic, but it's keeping
> track of the *lifetime* of allocated memory.  One could say that this is
> proof that lifetime is a complex problem. But again it's a strong
> argument that the GC brings a major benefit: it relieves the programmer
> from having to worry about lifetime issues.  You can instantly be freed
> from 34% of security issues, if the above numbers are anything to go by.

This is a good interpretation of the results.

December 08, 2022
On 12/5/2022 3:58 PM, Timon Gehr wrote:
> Default initialization does not even fix all initialization issues, it just makes them reproducible.

As I mentioned in another post, making them reproducible is a huge deal. Leaving variables uninitialized is Heisenbug City. Testing finds the reproducible ones, not the Heisenbugs.
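
A minimal sketch of what "reproducible" means in practice (the struct `Sample` and the function `makeDefault` are made up for this example). Every type in D has a fixed `.init` value, so a forgotten assignment produces the same wrong value on every run. For floating point and `char` that value is deliberately an invalid one (NaN and 0xFF). Leaving a variable uninitialized requires an explicit `= void`.

```d
import std.math : isNaN;
import std.stdio : writeln;

struct Sample
{
    int count;     // int.init is 0
    double ratio;  // double.init is NaN
    char tag;      // char.init is 0xFF
    int* link;     // pointers default to null
}

// Returns a value whose fields were never assigned explicitly.
Sample makeDefault()
{
    Sample s;
    return s;
}

void main()
{
    const s = makeDefault();
    assert(s.count == 0);
    assert(isNaN(s.ratio));
    assert(s.tag == 0xFF);
    assert(s.link is null);

    int raw = void; // explicit opt-out: reading raw before assigning it would be undefined
    raw = 42;
    writeln(s.count, " ", raw);
}
```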