February 11, 2022
On Friday, 11 February 2022 at 18:13:34 UTC, H. S. Teoh wrote:
>
> I'm skeptical of any LoC metric.
>
>
> T

This reminds me of what Walter said before! It is actually so simple that I don't understand what's so hard about it!

```
int val = 200; // This is a line of code


// This is a comment
/* This is a comment
  This counts as a comment too!
*/

void function_test() {
  int v = 10;
}
```

The code above has:

- Lines of code: 4
- Empty lines: 3
- Comments: 2

Don't we all agree that this is how we should count it?
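
If it helps, here is a rough sketch of a counter that applies exactly these rules (my own toy code, nothing taken from dmd; the rule that a `/* ... */` block counts as a single comment is just my reading of the numbers above):

```
import std.algorithm.searching : canFind, startsWith;
import std.stdio : writefln;
import std.string : splitLines, strip;

struct Counts { size_t code, blank, comments; }

// Applies the rules above: a // line counts as one comment, a /* ... */
// block counts as one comment no matter how many lines it spans, and
// every other non-empty line is code (trailing comments don't matter).
Counts countLines(string source)
{
    Counts c;
    bool inBlock = false;
    foreach (line; source.splitLines)
    {
        auto t = line.strip;
        if (inBlock)
        {
            if (t.canFind("*/"))
                inBlock = false;        // block comment ends on this line
        }
        else if (t.length == 0)
            ++c.blank;
        else if (t.startsWith("//"))
            ++c.comments;
        else if (t.startsWith("/*"))
        {
            ++c.comments;               // the whole block counts once
            inBlock = !t.canFind("*/");
        }
        else
            ++c.code;
    }
    return c;
}

void main()
{
    enum sample = q"EOS
int val = 200; // This is a line of code


// This is a comment
/* This is a comment
  This counts as a comment too!
*/

void function_test() {
  int v = 10;
}
EOS";
    auto c = countLines(sample);
    writefln("code=%s blank=%s comments=%s", c.code, c.blank, c.comments);
    // prints: code=4 blank=3 comments=2
}
```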
February 11, 2022

On Friday, 11 February 2022 at 17:36:03 UTC, Patrick Schluter wrote:

> If one wants to get really historic, that is also what Turbo Pascal did up to version 3.0. With Turbo Pascal 4.0 they went back to a more classic object file/linker approach.

Mmm, hard to say about various compilers; I never had the money when I was younger to pay for those compilers/toolsets, and now most of the currently popular ones are free. (I might have a Turbo compiler that came with a C++ programming book, but I never touched it.)

No doubt many earlier commercial compilers didn't support multiple architectures and probably targeted only x86. But it's been a long time since the 16-bit MS-DOS age, when that was more common.

Though if optimizations are dropped, you can probably have a very lean toolset, maybe even lean enough to build an entire distro from sources on a CD. The last time I tried to build libc it took a very long time, though; not recommended.

February 11, 2022
On Fri, Feb 11, 2022 at 08:00:14PM +0000, rempas via Digitalmars-d wrote: [...]
> Thank you for the information! It seems pretty impressive to me that DMD has only 175K LoC in its code base, given how huge D is! Even without the recent commits (and how many could those be?), this seems too little to me. In that case, we can talk about rewriting it, but again, that's up to the developers to decide.

https://www.joelonsoftware.com/2000/04/06/things-you-should-never-do-part-i/


T

-- 
"I'm running Windows '98." "Yes." "My computer isn't working now." "Yes, you already said that." -- User-Friendly
February 11, 2022
On Fri, Feb 11, 2022 at 08:23:10PM +0000, rempas via Digitalmars-d wrote:
> On Friday, 11 February 2022 at 18:13:34 UTC, H. S. Teoh wrote:
> > I'm skeptical of any LoC metric.
[...]
> This reminds me of what Walter said before! It is actually so simple that I don't understand what's so hard about it!
[...]

It's not that it's *hard*.  It's pretty straightforward, and everybody knows what it means.

The problem is the mostly-unfounded *interpretations* that people put on it.

In the bad ole days, LoC used to be a metric used by employers to measure their programmers' productivity. (I *hope* they don't do that anymore, but you never know...)  Which is completely ridiculous, because the amount of code you write has very little correlation with the amount of effort you put into it. It's trivial to write 1000 lines of sloppy boilerplate code that accomplishes little; it's a lot harder to condense that into 50 lines of code that does the same thing 10x faster and with 10% of the memory requirements.

One of the hardest bug fixes I've done at my job involved a 1-line fix for a subtle race condition that took 3+ months to track down and identify.  I guess they should fire me for non-productivity, because by the LoC metric I've done almost zero work in that time. Good luck with the race condition, though; adding another 1000 LoC to the code ain't getting rid of the race, it'd only obscure it even further and make it just about impossible to find and fix.

And some of my best bug fixes involve *deleting* poorly-written redundant code and writing a much shorter replacement. I guess they should *really* fire me for that, because by the LoC metric I've not only been unproductive, but *counter*productive. :-P

By the above, it should be clear that the assumption that LoC is a good measure of complexity is an unfounded one.  If project A has 10000 LoC and project B has 10000 LoC, does it mean they are of equal complexity? Hardly. Project A could be mostly boilerplate, copy-pasta, redundant code, poorly-implemented poorly-chosen O(n^2) algorithms, which has 10000 LoC simply because there's so much useless redundancy. Project B could be a collection of fine-tuned, hand-optimized professional algorithms that could do a LOT under the hood, and it has 10000 LoC because it actually has a large number of algorithms implemented, and was able to fit them all into 10000 LoC because each individual piece was written to be as concise as needed to express the algorithm and no more.  In terms of actual complexity, project A might as well be kindergarten-level compared to project B's PhD sophistication.  What does their respective LoC tell us about their complexity?  Basically nothing.

And don't even get me started on code quality vs. LoC. An IOCCC entry can easily fit an entire flight simulator into a single page of code, for example. Don't expect anybody to be able to read it, though (not even the author :-D).  A more properly-written flight simulator would occupy a lot more than a single page of code, but in terms of complexity, they'd be about the same, give or take.  But by the LoC metric, the two ought to be so far apart they should be completely unrelated to each other.  Again, the value of LoC as a metric here is practically nil.


--T
February 11, 2022
On 2/11/2022 9:20 AM, max haughton wrote:
> If all the libraries rely on hooking something you will silently break all but one, whereas the process of overriding a runtime hook can be made into an atomic operation that can fail in a reasonable manner if wielded incorrectly.

Sorry, I don't follow that. I don't know what atomic ops have to do with it.


> Doing things based on the order at link-time is simply not good practice in the general case. It's OK if you control all the things in the stack and want to (say) override malloc, but controlling what happens on an assertion is exactly the kind of thing that resolution at link-time can make into a real nightmare to do cleanly (and mutably, you might want to catch assertions differently when acting as a web server than when loading data).

All link operations conform to the ordering I described. I can't think of a way that is simpler, cleaner, or easier to understand. Hooking certainly ain't.


> Also linking (especially around shared libraries) doesn't work in exactly the same way on all platforms, so basically minimizing the entropy of a given link (minimize possible outcomes, so minimal magic) can be a real win when it comes to making a program that builds and runs reliably on different platforms. At Symmetry we have had real issues with shared libraries, for reasons more complicated than mentioned here granted, so we actually cannot ship anything with dmd even if we wanted to.

DLLs (shared libraries) are a different story, because they are all-or-nothing. In fact, they aren't actually libraries at all in the programming sense. They aren't linked in, either; there's no linking involved when accessing a DLL.
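
For concreteness, this is roughly what the runtime-hook approach being debated could look like. It is a hypothetical sketch with made-up names, not an existing druntime API: installing a handler is an explicit call that can fail in a reasonable manner instead of being decided by link order.

```
// Hypothetical module, for illustration only — not part of druntime.
module hookdemo;

alias AssertHook = void function(string file, size_t line, string msg) nothrow;

private __gshared AssertHook currentHook = null;
private __gshared Object hookLock;

shared static this() { hookLock = new Object; }

/// Install a hook exactly once. Fails loudly (returns false) instead of
/// silently clobbering a hook that some other library already installed.
bool tryInstallHook(AssertHook hook)
{
    synchronized (hookLock)
    {
        if (currentHook !is null)
            return false;
        currentHook = hook;
        return true;
    }
}

/// Swap hooks deliberately at runtime (e.g. handle assertions differently
/// while serving requests than while loading data), returning the old hook.
AssertHook swapHook(AssertHook hook)
{
    synchronized (hookLock)
    {
        auto old = currentHook;
        currentHook = hook;
        return old;
    }
}
```

A library would then call tryInstallHook in its own module constructor and report an error if it returns false, rather than quietly winning or losing a link-order race.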
February 11, 2022
On 2/11/2022 1:42 AM, Dennis wrote:
> On Friday, 11 February 2022 at 06:33:20 UTC, Walter Bright wrote:
>> Now suppose X.obj and Y.obj both define foo. Link with:
>>
>>     link X.obj Y.obj A.lib B.lib C.lib
>>
>> You get a message:
>>
>>     Multiple definition of "foo", found in X.obj and Y.obj
> 
> Unless your compiler places all functions in COMDATs of course.
> 
> https://github.com/dlang/dmd/blob/a176f0359a07fa5a252518b512f3b085a43a77d8/src/dmd/backend/backconfig.d#L303 
> 
> https://issues.dlang.org/show_bug.cgi?id=15342
> 

Yes, common blocks (which COMDATs are a form of) are all treated as identical and one is selected, but only from among those already pulled in by the linker. If the linker finds a COMDAT that resolves an undefined symbol, it does not go looking further for another one.

COMDATs came about because C++ has a proclivity to spew identical functions into multiple object files. D does, too.
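
For anyone who wants to see the quoted scenario concretely, here is a minimal reproduction sketch (hypothetical file names; whether the link actually fails or silently picks one copy depends on whether your compiler emits the functions as COMDATs, which is exactly the point above):

```
// x.d — one definition of foo; extern(C) keeps the symbol name plain "foo"
extern(C) int foo() { return 1; }

// y.d — a second, conflicting definition of the same symbol
extern(C) int foo() { return 2; }

// main.d — uses foo, so the linker must resolve it from somewhere
extern(C) int foo();
void main() { foo(); }

// Build along the lines of the quoted example (object suffix varies by
// platform): compile x.d, y.d, main.d separately, then link
//   main.obj x.obj y.obj ...
// With plain (non-COMDAT) definitions you would expect something like:
//   Multiple definition of "foo", found in x.obj and y.obj
// With both definitions emitted as COMDATs, they are treated as
// interchangeable and one is silently selected instead.
```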
February 12, 2022
On Friday, 11 February 2022 at 22:08:57 UTC, H. S. Teoh wrote:
>
> [It's not that it's *hard*... practically nil.]
>
>
> --T

I hear you loud and clear! It's very funny how "professionals" and their companies work worse than most hobbyist programmers. This is why I don't want to become a "professional" and work for a company, and why I FUCKING HATE when everyone talks about programming based on what's popular and what you should learn to get a "job". Fuck this shit! I remember someone telling me the same thing when we were discussing Qt and I said how bloated it is; the guy said that this is probably due to this reason (since even though Qt offers free licenses, a company is behind it).

I haven't written much of anything, but even in the few things I have tried, I would always see how much I could do with so few lines of code, and I would always wonder how some projects take hundreds of thousands of lines of code or even millions! Like, wtf are they doing? Even software that is minimal (see suckless) still does about 80% of what the other "big and complete" software does with about 10% of the codebase, so bloatware is a thing no matter how you see it! These numbers just don't add up!
February 11, 2022
On 2/11/2022 4:34 AM, rempas wrote:
> That's nice to hear! However, does DMD generate object files directly

It generates object files directly. No "asm" step. The intermediate code is converted directly to machine code.


> Do you think that there are any very bad places in DMD's backend? Has anyone in the team thought about re-writing the backend (or parts of it) from the beginning?

It has evolved over time, but the basic design has held up very well. The main difficulty is the very complex nature of the x86 CPU, which leads to endless special cases.

February 11, 2022
The backend is currently 127,748 lines of code, including the optimizer.
February 11, 2022
None of the C or C++ code is part of dmd; it is there to interface with the C backends of gdc and ldc. dmd is 100% D.