H. S. Teoh
On Fri, Dec 22, 2023 at 09:40:03PM +0000, bomat via Digitalmars-d-learn wrote:
> On Friday, 22 December 2023 at 16:51:11 UTC, bachmeier wrote:
> > Given how fast computers are today, the folks that focus on memory and optimizing for performance might want to apply for jobs as flooring inspectors, because they're often solving problems from the 1990s.
>
> *Generally* speaking, I disagree. Think of the case of GTA V where several *minutes* of loading time were burned just because they botched the implementation of a JSON parser.
IMNSHO, if I had very large data files to load, I wouldn't use JSON. Precompile the data into a more compact binary form that's ready to use as-is, and just mmap() it at runtime.
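In D, the idea might look something like this (a rough, untested sketch; Record and "data.bin" are made up for illustration):

    import std.mmfile : MmFile;

    // Hypothetical fixed-layout record emitted by the precompile step.
    struct Record
    {
        uint id;
        float x, y, z;
    }

    void main()
    {
        // Map the file into memory; the OS pages it in lazily, so
        // "loading" costs essentially nothing up front.
        auto mmf = new MmFile("data.bin");
        auto records = cast(const(Record)[]) mmf[];
        // Use records directly -- no parsing step at all. (Real code
        // would also validate length, alignment, and endianness.)
    }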
> Of course, this was unrelated to memory management. But it goes to show that today's hardware being super fast doesn't absolve you from knowing what you're doing... or at least question your implementation once you notice that it's slow.
My favorite example in this area is the poor selection of algorithms: a very common mistake is choosing an O(n²) algorithm because it's easier to implement than the equivalent O(n) algorithm, and the difference isn't noticeable on small inputs. But on large inputs it slows to an unusable crawl. "But I wrote it in C, why isn't it fast?!" Because O(n²) is O(n²), and that's independent of language. Given large enough input, an O(n) Java program will beat the heck out of an O(n²) C program.
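To make it concrete, here's a contrived D sketch of the same job done both ways, duplicate detection:

    import std.algorithm.searching : canFind;

    // O(n^2): for each element, linearly rescan everything before it.
    bool hasDupSlow(int[] a)
    {
        foreach (i, x; a)
            if (a[0 .. i].canFind(x))
                return true;
        return false;
    }

    // O(n): one pass with an associative array.
    bool hasDupFast(int[] a)
    {
        bool[int] seen;
        foreach (x; a)
        {
            if (x in seen) return true;
            seen[x] = true;
        }
        return false;
    }

On a million elements the first version is hopeless no matter what language it's written in or how hard the compiler optimizes.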
> But that is true for any language, obviously.
>
> I think there is a big danger of people programming in C/C++ and thinking that it *must* perform well just because it's C/C++. The C++ codebase I have to maintain in my day job is a prime example of that as well.
"Elegant or ugly code as well as fine or rude sentences have something in common: they don't depend on the language." -- Luca De Vitis
:-)
> > I say this as I'm in the midst of porting C code to D. The biggest change by far is deleting line after line of manual memory management. Changing anything in that codebase would be miserable.
>
> I actually hate C with a passion.
Me too. :-D
> I have to be fair though: What you describe doesn't sound like a problem of the codebase being C, but the codebase being crap. :)
Yeah, I've seen my fair share of crap C and C++ codebases. C code that makes you do a double take and stare real hard at the screen to ascertain whether it's actually C and not some jokelang or esolang purposely designed to be unreadable/unmaintainable. (Or maybe it would qualify as an IOCCC entry. :-D) And C++ code that looks like ... I dunno what. When business logic is being executed inside of a dtor, you *know* that your codebase has Problems(tm), real big ones at that.
> If you have to delete "line after line" of manual memory management, I assume you're dealing with micro-allocations on the heap - which are performance poison in any language.
Depends on what you're dealing with. Some micro-allocations are totally avoidable, but if you're manipulating a complex object graph composed of nodes of diverse types, they're hard to avoid, at least not without uglifying your APIs significantly and introducing long-term maintainability issues. One of my favorite GC "lightbulb" moments was when I realized that having a GC allowed me to simplify my internal APIs significantly, resulting in much cleaner code that's easy to debug and easy to maintain. The equivalent code in the original C++ codebase would have required a disproportionate amount of effort just to navigate the complex allocation requirements.
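A contrived sketch of what I mean (not from the actual codebase): with a GC, handing out references into a node graph requires zero ownership bookkeeping in the API:

    class Node
    {
        string name;
        Node[] children;  // tree/DAG; nodes freely shared between parents
        this(string name) { this.name = name; }
    }

    // Just return the reference: no ownership transfer, no refcount
    // twiddling, no "who frees this?" encoded in the signature.
    Node findByName(Node root, string name)
    {
        if (root.name == name) return root;
        foreach (c; root.children)
            if (auto n = findByName(c, name)) return n;
        return null;
    }

The equivalent C++ signature immediately raises the questions: raw pointer or smart pointer? Who keeps the graph alive while the caller holds the node? None of that exists here.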
These days my motto is: use the GC by default; when it becomes a problem, switch to a more manual memory management scheme, but *only where the bottleneck is* (as proven by an actual profiler, not where you "know" (i.e., imagine) it is). A lot of C/C++ folk (and I speak from my own experience as one of them) spend far too much time and energy optimizing things that don't need to be optimized, because they are nowhere near the bottleneck, resulting in lots of sunk cost and added maintenance burden with no meaningful benefit.
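For example (hypothetical code, not from any real project): once the profiler fingers one hot formatting routine, fix just that one spot by reusing a buffer, and leave the GC alone everywhere else:

    import std.array : Appender;
    import std.format : formattedWrite;

    struct HotPath
    {
        private Appender!(char[]) buf;  // capacity reused across calls

        // The one profiled hot spot: once the buffer reaches its
        // working size, calls stop allocating altogether.
        const(char)[] line(int id, double val)
        {
            buf.clear();
            buf.formattedWrite("id=%s val=%s", id, val);
            return buf[];
        }
    }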
[...]
> Of course, this directly leads to the favorite argument of C defenders, which I absolutely hate: "Why, it's not a problem if you're doing it *right*."
>
> By this logic, you have to do all these terrible mistakes while learning your terrible language, and then you'll be a good programmer and can actually be trusted with writing production software - after like, what, 20 years of shooting yourself in the foot and learning everything the hard way? :) And even then, the slightest slipup will give you dramatic vulnerabilities. Such a great concept.
Year after year I see reports of security vulnerabilities, the most common of which are buffer overflows, use-after-free, and double-free, all of which are caused directly by using a language that forces you to manage memory manually. If C were only 10 years old, I might concede that C coders are just inexperienced, and that given enough time to learn from field experience the situation would improve. But after 50 years, the stream of memory-related security vulnerabilities still hasn't ebbed. I think it's beyond dispute that even the best C coders make mistakes -- because memory management is HARD, and using a language that gives you no help whatsoever in this department is just inviting trouble. I've personally seen the best C coders commit blunders, and in C, all it takes is *one* blunder among millions of lines of code that manage memory, and you have a glaring security hole.
It's high time people stepped back to think hard about why this is happening, and why 50 years of industry experience and hard-earned best practices have not improved things.
And also to think hard about why people eschew the GC when it could single-handedly remove this entire category of bugs from their programs in one fell swoop.
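In case the mechanism isn't obvious, a deliberately broken sketch (malloc/free style written in D for contrast):

    import core.stdc.stdlib : malloc, free;

    struct Widget { int refcount; }

    void manualStyle()
    {
        auto w = cast(Widget*) malloc(Widget.sizeof);
        // ... years of maintenance later ...
        free(w);
        w.refcount++;  // use-after-free: instant CVE material
    }

    void gcStyle()
    {
        auto w = new Widget;  // GC-managed
        w.refcount++;
        // No free() anywhere: the GC reclaims w only once nothing
        // references it, so use-after-free and double-free can't happen.
    }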
(Now, just below memory-related security bugs are data sanitization bugs. Unfortunately, the choice of language isn't going to help you very much there...)
T
--
In theory, software is implemented according to the design that has been carefully worked out beforehand. In practice, design documents are written after the fact to describe the sorry mess that has gone on before.