January 24, 2022
On Mon, Jan 24, 2022 at 09:41:28PM +0000, jmh530 via Digitalmars-d wrote:
> On Monday, 24 January 2022 at 20:09:46 UTC, H. S. Teoh wrote:
[...]
> > Makes me curious: how feasible is it to have functions with identical bodies merge together as well? This would help reduce template bloat when you have e.g. a templated type where a bunch of functions are identical because they either don't depend on the type (e.g. a container method that doesn't care about what type the payload is), or else depend on type traits that are common across many types (e.g., int.sizeof which is shared with uint.sizeof, float.sizeof, etc.).
[...]
> This reminds me of generics. In languages with generics, you can get one copy of a function over many different types. My (possibly incorrect) recollection is that it works like having the compiler cast the generic MyClass!T to MyClass. Ideally from D's perspective there would be a way to do this to minimize any overhead that you would get in Java from being forced to use classes.

IMNSHO, Java generics are weaksauce because they are unable to take advantage of compile-time type information. Basically, once you insert an object of type T into the container, all information about T is erased at runtime; it's just a container of Object. So you couldn't, for example, optimize your container based on the size/alignment of its contents, or use a more compact storage method by inspecting the size of T. The container code can only perform operations that don't introduce a runtime dependency on the specifics of T.
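To make that concrete, here's a sketch (the container is hypothetical) of the kind of size-based specialization that's trivial with full compile-time type information but impossible under Java-style erasure:

```d
// Hypothetical container whose storage strategy depends on T.sizeof.
// Under Java-style erasure, T is opaque at this point, so no such
// decision could ever be made.
struct FixedBuf(T)
{
    static if (T.sizeof <= 4)
        T[16] data;  // small payloads: more inline slots
    else
        T[4] data;   // large payloads: fewer slots, similar footprint
}

static assert(FixedBuf!int.init.data.length == 16);
static assert(FixedBuf!double.init.data.length == 4);
```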

D's templates are much more powerful, but that power does come at the price of (sometimes great) template bloat: you get a new copy of the code for every T the template is instantiated with.

The ideal best of both worlds would be for the D compiler to somehow selectively type-erase the implementation of a template: the parts that can be factored out as generic code that works for all T would get a single instantiation, while the type-dependent (non-type-erased) parts would remain as separate instantiations. Merging functions that are binary-identical despite being, at the language level, distinct template instantiations would be a good step in this direction.
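A tiny illustration of the kind of mergeable code meant here (sketch):

```d
struct List(T)
{
    private T[] items;

    // This body never touches T itself: List!int.length, List!float.length,
    // List!string.length, etc. all compile to byte-identical machine code,
    // and could in principle be collapsed into a single symbol.
    size_t length() const { return items.length; }
}
```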


> Merging functions with identical bodies is basically the problem that
> inout is trying to solve. A way to express inout via the language
> would probably be good, though inout is rather complicated.  For
> instance, you have an Inout(T) with the property that if you have a
> function that takes an Inout(T), then calling it with Inout!int,
> Inout!(const(int)), and Inout!(immutable(int)) would all have only one
> version of a function at runtime. You would still need to be able to
> enforce that Inout(T) has the same unique behaviors as inout, such as
> that you can't modify the variable in the body of the function.

inout is a hack and a crock. I think it's the wrong approach. First of all, inout as currently implemented is incomplete: there are a lot of things you cannot express wrt. inout that you might reasonably want to. For example, inout applied to delegates: it quickly becomes ambiguous what the inout is supposed to refer to: the return type of the delegate, or its parameter, or the outer function's return type, or the outer function's parameter. Inside the function body, if you need to hold references to the delegate and/or its parameters, it quickly becomes a total mess (and often just plain doesn't compile, because the compiler doesn't understand what you're trying to do and the language doesn't let you express what you want to do).
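A sketch of the kind of declaration where the ambiguity shows up (the function is hypothetical):

```d
// Which inout binds to which? Does the delegate's parameter vary with its
// own return type, or with apply's parameter and return? There is no way
// to say "these two vary together, but independently of those two".
inout(int)* apply(inout(int)* delegate(inout(int)*) dg, inout(int)* p);
```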

Secondly, inout applies only to const/immutable. In generic code, I frequently find myself wishing for inout(nothrow), inout(pure), etc., but the language currently does not support such things. And it's also questionable whether attribute soup, plus attributes with complete sub-grammars, is really the right direction to go: your function declarations quickly drown in attribute sub-grammar, and readability goes out the window.

Third, inout kinda-sorta behaves like a template except that it isn't one, and it kinda-sorta behaves like Java generics, but without half of the expressiveness. It's a special case of templates with the optimization of identical bodies being merged, but artificially restricted to a single function and to alternation between const/mutable/immutable, and arbitrarily *not* a template so it behaves differently from the template part of the language.  A misfit stuck in a very narrow niche that fits in neither with templates nor with generics.

IMO the right approach is to just replace inout with templates, let the compiler merge identical function bodies and eliminate template bloat, and let the compiler infer the attribute soup for you so that you don't have to deal with it directly.
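For instance, today's inout function next to its template replacement (sketch):

```d
// inout today: one function body, callable with mutable, const,
// and immutable arguments.
inout(int)* firstInout(inout(int)[] a) { return a.ptr; }

// Template equivalent: instantiations for int, const(int), and
// immutable(int) have byte-identical bodies that a smarter
// compiler/linker could merge, while attributes (pure, nothrow,
// @safe, ...) are inferred per instantiation.
T* first(T)(T[] a) { return a.ptr; }
```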


> This would also be useful for reducing template bloat with allocators. For instance, if you have a templated function that takes an allocator as a parameter, then for every different allocator you use with it, you get an extra copy of the function, even if the body of the function may be the same across the different allocators used (since it may be jumping to the allocator to call that code).

Exactly.


T

-- 
Skill without imagination is craftsmanship and gives us many useful objects such as wickerwork picnic baskets.  Imagination without skill gives us modern art. -- Tom Stoppard
January 24, 2022
On Monday, 24 January 2022 at 20:09:46 UTC, H. S. Teoh wrote:
>
> Makes me curious: how feasible is it to have functions with identical bodies merge together as well?

Cursory research suggests that the main obstacle to this is function pointers. Specifically, if two different functions with identical bodies both have their addresses taken, and the resulting pointers are compared with ==, the result must be `false`. Unless the compiler (or linker, in the case of LTO) can prove that a comparison like this never happens, the optimization is invalid.
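In D terms, the constraint looks like this:

```d
int f(int x) { return x + 1; }
int g(int x) { return x + 1; }  // byte-identical body

void main()
{
    // Distinct functions are required to have distinct addresses,
    // so naively merging f and g would make this assertion fail.
    assert(&f != &g);
}
```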

An easier optimization is to "factor out" the bodies of identical functions into a single new function, and have the original functions simply forward to the new one. For example, when presented with

    int f(int x, int y) { return x^^2 - y + 42; }
    int g(int x, int y) { return x^^2 - y + 42; }

...the compiler (or linker) could emit something like

    int f(int x, int y) { return __generated(x, y); }
    int g(int x, int y) { return __generated(x, y); }

    int __generated(int x, int y) { return x^^2 - y + 42; }

With tail-call optimization, the overhead of the additional function call is extremely small.
January 25, 2022
On Monday, 24 January 2022 at 10:06:49 UTC, rempas wrote:
> ..
> What do you guys think?

Well, if const were default, you could ask yourself, is there any real reason to use mutable :-)

If I see a function: void getuser(string pwd){} .. I get kinda concerned.

If I write a function: void getuser(string pwd){} .. I also get kinda concerned.

After hours of debugging, you eventually realise that the root of your problem was that pwd was not const.

Constraints on parameters help to ensure correctness. That is the value of const.

I wish @safe and const were default actually.

January 25, 2022
On 24.01.22 18:47, Ali Çehreli wrote:
> 
> However, const on the function API is also for communication: It tells the caller what parameters are not going to be mutated by the function. But I've become one of the people who advocate 'in' over 'const' especially when compiled with -preview=in:
> 
> https://dlang.org/spec/function.html#in-params
> 
> Sweet! 'in' even enables passing rvalues by reference! :)

Actually I am very disappointed that passing rvalues by ref is now tied to transitive const. Makes no sense.
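For reference, what the feature looks like in use (sketch; `Matrix` is a stand-in type, and this requires compiling with -preview=in):

```d
struct Matrix { double[16] m; }

// With -preview=in, `in` means const scope, and the compiler is free to
// pass large arguments by reference, including rvalues like the one below.
void draw(in Matrix mat)
{
    // read-only access to mat
}

void main()
{
    draw(Matrix.init);  // rvalue accepted; no explicit temporary needed
}
```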
January 25, 2022

On Monday, 24 January 2022 at 18:48:48 UTC, H. S. Teoh wrote:

>

I used to be a hardcore C programmer. So hardcore that I won an award in the IOCCC once (well OK, that's not something to be proud of :-D). Correctly answered a C question on an interview technical exam that even my interviewer got wrong. Was totally into the philosophy of "the programmer knows better, compiler please step aside and stop trying to restrict me". Believed my code was perfect, and could not possibly have any bugs because I mulled over every line and polished every character. Didn't believe in test suites because I hand-tested every function when I wrote it so there can't have been any bugs left. And besides, test suites are too cumbersome to use. Used to pride myself on my programs never crashing. (And the times they did I blamed on incidental factors, like I was too distracted because some idiot was WRONG on the internet, gosh the injustice!)

Then I discovered D. And in particular, D's unittest blocks. Was very resistant at first (why would I need to test perfect code), but they were just so darned convenient (unlike unittest frameworks in other languages) that they just kept staring at me with puppy eyes until I felt too ashamed not to use them. Then the unittests started catching bugs. COPIOUS bugs. All kinds of boundary cases, careless typos, logic flaws, etc., in my "perfect" code. And EVERY SINGLE TIME I modified a function, another unittest started failing on a previously-tested case (which I'd disregarded as having nothing to do with my change and therefore not worth retesting).
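For anyone who hasn't tried them: a unittest block sits right next to the code it checks and runs with a simple compiler switch (illustrative example):

```d
int clamp(int x, int lo, int hi)
{
    return x < lo ? lo : x > hi ? hi : x;
}

// Compiled and run with `dmd -unittest`; lives beside the function itself.
unittest
{
    assert(clamp(5, 0, 10) == 5);
    assert(clamp(-3, 0, 10) == 0);   // lower boundary
    assert(clamp(42, 0, 10) == 10);  // upper boundary
}
```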

Then this awful realization started dawning on me... my code was NOT perfect. In fact, it was anything BUT perfect. My "perfect" logic that flowed from my "perfect" envisioning of the perfect algorithm was actually full of flaws, logic errors, boundary cases I hadn't thought of, typos, and just plain ole stupid mistakes. And worst of all, I was the one making these careless mistakes, practically EVERY SINGLE TIME I wrote any code. What I thought was perfect code was in fact riddled with hidden bugs in almost every line. Usually in lines that I'd written tens of thousands of times throughout my career, lines I knew so well that I thought I could write them perfectly even in my dreams. But it was precisely because of my confidence that these "trivial" lines of code were correct that I neglected to scrutinize them, and bugs invariably crept in.

Then I observed top C coders in my company make these very same mistakes, OVER AND OVER AGAIN. These were not inexperienced C greenhorns who didn't know what they were doing; these were top C hackers who have been at it for many decades. Yet they were repeating the same age-old mistakes over and over again. I began to realize that these were not merely newbie mistakes that would go away with experience and expertise. These mistakes keep getting made because HUMANS MAKE MISTAKES. And because C's philosophy is to trust the programmer, these mistakes slip into the code unchecked, causing one disaster after another. Buffer overflow here, security exploit there, careless typos that cause the customer's production server to blow up at a critical time. Memory leaks and file descriptor leaks that brought a top-of-the-line server to its knees after months of running "perfectly". And the time and money spent in finding and fixing these bugs were adding up to a huge mountain of technical debt.

Today, my "trust the programmer" philosophy has been shattered. I want the compiler to tell me when I'm doing something that looks suspiciously like a mistake. I want the language to be safe by default, so that I have to go out of my way to commit a mistake. I want the compiler to stop me from doing stupid things that I'd done a hundred times before but continue to do, BECAUSE HUMANS ARE FALLIBLE.

Of course, I don't want to write in a straitjacket like Java makes you do -- there has to be an escape hatch for when I do know what I'm doing. But the default should be the compiler stopping me from doing stupid things. If I really meant to cast that pointer, I want to have to write a verbose, ugly-looking "cast(NewType*) ptr" instead of just having a void* implicitly convert to whatever pointer type I happen to have on hand -- writing out this verbose construct forces me to stop and think twice about what I'm doing, and hopefully catch any wrong assumptions before they slip into the code. I want the compiler to tell me "hey, you said that data was const, and now you're trying to modify it!", which would cause me to remember "oh yeah, I did decide 2 months ago that this data should not be changed, and that other piece of code in this other module is relying on this -- why am I trying to modify it now?!".
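Concretely, the two styles side by side (sketch):

```d
void main()
{
    void* raw = new int;

    // int* p = raw;          // C would allow this silently; D refuses.
    int* p = cast(int*) raw;  // explicit, greppable, forces a second look
    *p = 42;
}
```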

As Walter often says, programming by convention doesn't work. Decades of catastrophic failures in C code have more than proven this. Humans are fallible, and cannot be relied on for program correctness. We're good at certain things -- leaps of intuition and clever out-of-the-box solutions for hard problems. But for other things, like keeping bugs out of our code, we need help. We need things to be statically verifiable by the compiler, to prove that our assumptions indeed hold (and that somebody -- namely ourselves, 3 months after writing that code -- didn't violate those assumptions and introduce a bug during a last-minute code change before the last release deadline). Weak sauce like C++'s const, which can freely be cast away with no consequences anytime you feel like it, will not do. You need something strong like D's const to keep human error in check. Something that the compiler can automatically check and provide real guarantees for.

T

I hear you very clearly! Thanks! No seriously... THANKS A LOT!!! I should constantly remind myself that a wise man always learns from the mistakes that were made before him, and always advances! Well, I also found out about "negation" overflow thanks to LDC2, so yeah, we must have the compiler protect us, so "const" will be implemented! Thanks a lot for your time!

January 25, 2022

On Monday, 24 January 2022 at 19:15:42 UTC, Steven Schveighoffer wrote:

>

I know exactly what you were doing:

void mul_num(T)(T num) {
   num *= 2;
}

That will fail to compile if you just do mul_num(number), because templates infer the type from the argument (e.g. T becomes const(int)).

Unfortunately, D doesn't have a "tail-const" modifier, which would work really well here.
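(For completeness, one common workaround is to make an explicit mutable copy inside the template, using std.traits.Unqual:)

```d
import std.traits : Unqual;

void mul_num(T)(T num)
{
    Unqual!T n = num;  // strip top-level const/immutable for a local copy
    n *= 2;            // fine: n is mutable even when T is const(int)
}

void main()
{
    const int number = 21;
    mul_num(number);   // now compiles
}
```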

-Steve

Yeah right, now I remember! I did it about 2 months ago in a template function that converts types, and it didn't compile. Now I know why, thank you!

January 25, 2022

On Monday, 24 January 2022 at 20:02:11 UTC, Walter Bright wrote:

>

You can see this if you run an object file disassembler over the compiler output, the immutable data goes in read-only sections.

That's actually really amazing and, IMO, it is how it should be. Real immutability makes sense, rather than just the compiler not letting you mutate the value.
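A quick way to see it for yourself (sketch; section names vary by platform):

```d
// Module-level immutable data with a static initializer; typically
// emitted into a read-only section such as .rodata.
immutable int[3] table = [1, 2, 3];

void main()
{
    auto p = cast(int*) table.ptr;
    // *p = 99;  // casting away immutable is undefined behavior; on most
    //           // platforms this write hits a read-only page and crashes
}
```

Running e.g. `objdump -t` over the resulting object file typically shows `table` placed in `.rodata` (or the platform's equivalent).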

January 25, 2022

On Monday, 24 January 2022 at 20:26:03 UTC, Patrick Schluter wrote:

>

No, at the API level it is also a documentation aid. When a parameter is marked as const, the code reader can be sure that the function has no side effects on that parameter. This reduces the mental burden when one tries to read code.

We could also have library reference documentation that explains that, in the worst case. But yeah, "const" works better for this.

January 25, 2022

On Monday, 24 January 2022 at 20:31:36 UTC, Patrick Schluter wrote:

>

I'm an extremely minimalistic programmer; I avoid unnecessary parentheses and brackets whenever I can.

Makes sense when your code is finalized and (properly) checked, but bugs can be introduced if you haven't finished your code. I find myself doing something like the following:

if (val)             { one_line_statement(); }
else if (other_val)  { one_line_statement(); }

I think it doesn't take much space and at the same time, it looks very readable.

January 25, 2022
On Monday, 24 January 2022 at 10:23:14 UTC, rempas wrote:
> On Monday, 24 January 2022 at 10:13:02 UTC, rikki cattermole wrote:
>> If you are working with raw pointers, you need a way to express read only memory.
>>
>> Because the CPU does have it, and if you tried to write to it, bye bye process.
>
> We would try to avoid working with pointers directly in read-only memory but in any case, I still think that the programmer should know what they are doing.
>

Sure thing, but that's not realistic if you work professionally, as you might be handed code that 15 other people worked on for a decade before you.

You will not be able to reason about all that memory properly; but if certain variables are marked const, that tells future readers that this is read-only memory.