Thread overview
Interesting memory safety topic
Feb 12, 2019
Eduard Staniloiu
Feb 12, 2019
H. S. Teoh
Feb 13, 2019
JN
Feb 13, 2019
Paulo Pinto
Feb 13, 2019
JN
Feb 13, 2019
Paulo Pinto
Feb 13, 2019
Kagamin
February 12, 2019
Something that caught my attention on Reddit’s r/cpp

“Microsoft: 70 percent of all security bugs are memory safety issues”

Reddit r/cpp thread about this

https://www.reddit.com/r/cpp/comments/aprgkf/microsoft_70_percent_of_all_security_bugs_are/?st=JS27GQSQ&sh=3fc6d57c

Cheers,
Edi
February 12, 2019
On Tue, Feb 12, 2019 at 08:25:24PM +0000, Eduard Staniloiu via Digitalmars-d wrote:
> Something that caught my attention on Reddit’s r/cpp
> 
> “Microsoft: 70 percent of all security bugs are memory safety issues”
[...]

Walter is right on the money about memory safety becoming an increasingly important problem.

Based on my experience working with C/C++ codebases, I'd say that one of the key causes of memory safety problems is the decay of arrays into pointers in C/C++, which Walter has rightly said will ultimately be the downfall of C/C++.  Pairing the pointer with a length, as in D arrays, is a major step in avoiding this problem.  The "extra baggage" of an extra length field is well worth the cost -- besides, in most cases in C/C++, you already need to pass the length with the pointer *anyway*, so why not have the language handle it for you correctly, rather than rely on fallible humans to do the job manually and, as the history of security problems proves, very poorly?
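
To make the pointer-plus-length idea concrete, here is a minimal C++ sketch. The names `Slice`, `make_slice`, `sum`, and `slice_demo` are hypothetical and exist only for this illustration; they are not from the thread or from any library, and the sketch only approximates what a D slice does (D builds this into the language).

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical illustration type: a pointer paired with its length,
// roughly the information a D slice carries around.
template <typename T>
struct Slice {
    T* ptr;
    std::size_t length;

    // Bounds-checked element access: throws instead of reading past the end.
    T& at(std::size_t i) const {
        if (i >= length) throw std::out_of_range("Slice index out of range");
        return ptr[i];
    }

    // Sub-slice [lo, hi) sharing the same storage; nothing is copied.
    Slice sub(std::size_t lo, std::size_t hi) const {
        if (lo > hi || hi > length) throw std::out_of_range("Slice bounds out of range");
        return Slice{ptr + lo, hi - lo};
    }
};

// Capture the array length at the call site, before the array decays
// to a bare pointer and the length is lost.
template <typename T, std::size_t N>
Slice<T> make_slice(T (&arr)[N]) {
    return Slice<T>{arr, N};
}

// The callee gets the length together with the pointer, so there is no
// separate size argument for the caller to get wrong.
inline int sum(Slice<const int> s) {
    int total = 0;
    for (std::size_t i = 0; i < s.length; ++i) total += s.at(i);
    return total;
}

// Small self-check; returns true if the out-of-bounds access was caught.
inline bool slice_demo() {
    const int data[] = {1, 2, 3, 4};
    Slice<const int> s = make_slice(data);
    assert(sum(s) == 10);
    assert(sum(s.sub(1, 3)) == 5);  // elements 2 and 3
    try {
        s.at(4);                    // one past the end
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

Calling `slice_demo()` from `main` exercises the checks. The point of the sketch is only that the length travels with the pointer, so an out-of-bounds index becomes a detectable error rather than a silent overrun.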

The second biggest cause of memory safety problems IMO is not using a GC. I.e., manual memory management.

Memory management is a very complicated task, and humans simply aren't good at doing it.  I've been there and done that -- it *is* possible to write memory-safe code with manual memory management, but it takes a lot of time, a lot of effort, and a lot of experience, and *one* small slip-up (among the millions conscientiously avoided by careful coding) can cost you dearly.

It also constantly distracts the programmer from focusing on the problem domain: everywhere you look in a non-trivial program, you need to address memory management, and this becomes a tax that you pay at every turn. APIs are uglified because you have to address memory management somehow.  Libraries become gratuitously incompatible because they were written with different memory management schemes in mind.  Your code and design suffer because you're forced to direct so much mental effort towards micro-managing your memory, rather than focusing on solving the problem domain.

And the incentives are all wrong: because you have to pay memory management tax at every turn, and because manual memory management is so onerous, you end up preferring solutions that simplify or reduce memory management, rather than solutions that better fit the problem domain. For example, using strlen and copying on append / substring everywhere, rather than a more efficient method like slicing, because keeping track of when to free those slices will complicate your code so much (plus, it would be incompatible with the pervasive char* interfaces of all those libraries you depend on), that it's simply not worth the effort.  So APIs end up being poorly designed in order to simplify memory management, e.g., store an error message in a global (with the associated messiness of subsequent calls overwriting previous error messages, etc.), rather than allocating a message string, because doing the latter would require facing tricky issues of ownership and who's responsible for cleaning up.  Poorer algorithms end up being chosen because they're quick and easy, memory management wise, whereas better algorithms would make the memory management involved so complicated that it would be a monumental effort to pull off.
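
As a rough illustration of the copy-versus-slice trade-off described above, here is a small C++17 sketch. The function names `first_word_copy`, `first_word_slice`, and `slicing_demo` are made up for this example. `std::string_view` stands in for a slice here, with one important caveat: unlike a GC-backed D slice, it does not keep the underlying buffer alive, which is exactly the lifetime tracking the post says pushes people towards copying.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Copying approach: every call allocates and copies a new std::string.
// Simple to reason about, because the result owns its own memory.
inline std::string first_word_copy(const std::string& text) {
    std::size_t end = text.find(' ');
    return text.substr(0, end);  // end == npos means "up to the end"
}

// Slicing approach: returns a view (pointer + length) into the caller's
// buffer. Nothing is allocated, but the view must not outlive that buffer.
inline std::string_view first_word_slice(std::string_view text) {
    std::size_t end = text.find(' ');
    return text.substr(0, end);
}

// Small self-check showing that the slice aliases the original storage.
inline void slicing_demo() {
    std::string line = "hello world";
    std::string_view word = first_word_slice(line);
    assert(word == "hello");
    assert(word.data() == line.data());  // same storage, nothing copied
    assert(first_word_copy(line) == "hello");
}
```

Both functions give the same answer; the difference is who owns the bytes. With manual memory management the caller of `first_word_slice` has to guarantee that `line` stays alive for as long as `word` is used, and that obligation is what makes the copying version tempting.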

And in spite of all this effort and these compromises, memory safety problems continue to plague C/C++ codebases on a regular basis.


Pairing length with a pointer to make an array/slice, and having a GC, are big advances in increasing memory safety of software.  They address what I consider to be two of the top causes of memory safety problems. Unfortunately, many folks with C/C++ background seem to be allergic to the GC, and will undoubtedly hate me for saying that not using a GC is one of the leading causes of their memory safety problems. But the historical facts speak for themselves.


T

-- 
Gone Chopin. Bach in a minuet.
February 13, 2019
On Tuesday, 12 February 2019 at 21:31:35 UTC, H. S. Teoh wrote:
> The second biggest cause of memory safety problems IMO is not using a GC. I.e., manual memory management.
>
> Unfortunately, many folks with C/C++ background seem to be allergic to the GC, and will undoubtedly hate me for saying that not using a GC is one of the leading causes of their memory safety problems. But the historical facts speak for themselves.
>
>
> T

Seems like the tide has turned. While folks with C/C++ are still allergic to a GC, they are very open to Rust-like static analysis solutions which seem to be working.
February 13, 2019
On Wednesday, 13 February 2019 at 07:30:35 UTC, JN wrote:
> On Tuesday, 12 February 2019 at 21:31:35 UTC, H. S. Teoh wrote:
>> The second biggest cause of memory safety problems IMO is not using a GC. I.e., manual memory management.
>>
>> Unfortunately, many folks with C/C++ background seem to be allergic to the GC, and will undoubtedly hate me for saying that not using a GC is one of the leading causes of their memory safety problems. But the historical facts speak for themselves.
>>
>>
>> T
>
> Seems like the tide has turned. While folks with C/C++ are still allergic to a GC, they are very open to Rust-like static analysis solutions which seem to be working.

Not all of them, sadly. There is a talk from Herb Sutter at CppCon where only 1% of the audience claimed to use any kind of static analysis in their daily workflows.

However, Microsoft is the one pushing for lifetime analysis tooling in VC++ and clang.

--
Paulo
February 13, 2019
On Tuesday, 12 February 2019 at 20:25:24 UTC, Eduard Staniloiu wrote:
> Something that caught my attention on Reddit’s r/cpp
>
> “Microsoft: 70 percent of all security bugs are memory safety issues”
>
> Reddit r/cpp thread about this
>
> https://www.reddit.com/r/cpp/comments/aprgkf/microsoft_70_percent_of_all_security_bugs_are/?st=JS27GQSQ&sh=3fc6d57c
>
> Cheers,
> Edi

Session slides available here:

https://github.com/Microsoft/MSRC-Security-Research/blob/master/presentations/2019_02_BlueHatIL/2019_01%20-%20BlueHatIL%20-%20Trends%2C%20challenge%2C%20and%20shifts%20in%20software%20vulnerability%20mitigation.pdf

TL;DR: The key points come from Microsoft's internal development: focus on C# and Rust, and push tooling that enforces the C++ Core Guidelines in the areas where C++ is still unavoidable.
February 13, 2019
https://medium.com/@shnatsel/how-rusts-standard-library-was-vulnerable-for-years-and-nobody-noticed-aebf0503c3d6
>When I asked the maintainers to file a CVE, they said that if they filed one for every such bug they fixed they’d never get any actual work done.

Nice quote.
February 13, 2019
On Wednesday, 13 February 2019 at 07:45:15 UTC, Paulo Pinto wrote:
> Not all of them sadly. There is a talk from Herb Sutter at CppCon where only 1% of the audience claimed to use any kind of static analysis during their daily workflows.
> --
> Paulo

That's why you make that stuff opt-out rather than opt-in, and as easy as possible to integrate with an IDE. Programmers are lazy by nature. I would never use DScanner or Dfmt if I had to invoke them manually. But I configured code-d in VSCode to run Dfmt on each file save, so I don't have to think about it. The same goes for DScanner: it has some defaults, and it validates my code every time I save the source file.