H. S. Teoh
Posted in reply to Ali Çehreli
On Tue, Jun 27, 2023 at 02:53:59PM -0700, Ali Çehreli via Digitalmars-d wrote: [...]
> First, an aside: You may remember my minor complaint about 'private' during a DConf presentation years ago. Today, I feel even stronger that disallowing access to parts of software "just because" of good design is a mistake. I've seen multiple examples of this in professional life where a developer uses 'private' only because it is "of course" better to do so. (The Turkish word "işgüzar" and the German word "verschlimmbessern" describe the situation pretty well for me but the English language lacks such a word.)
I can't resist me a Walter quote here:
I've been around long enough to have seen an endless parade of
magic new techniques du jour, most of which purport to remove
the necessity of thought about your programming problem. In the
end they wind up contributing one or two pieces to the
collective wisdom, and fade away in the rearview mirror. --
Walter Bright
When you start doing something with the code because that's what everybody else does, or because it's what everyone else says is "the Right Thing(tm)", then it's just cargo-culting, which inevitably leads to problems down the road.
> To give an example from D's ecosystem, the D runtime's garbage collector statistics object used to be 'private'. (I think there is an interface for it now.) What an inconvenience it was to copy/paste that type's definition from the runtime to user code, get the compiled symbol of the object from the library, and pointer cast it to be able to access the members! A 'static assert' attempts to protect the project from changes to that type...
Thing is, things like these usually come from temporary hacks in the code that the original coder didn't want to set in stone, but that end up staying put because of inertia and becoming de facto set in stone.
> The idea of 'private' should be to just give the developer freedom to change the implementation in the future. It should not impede use cases that people come up with. That can be achieved practically with an underscore: Make everything 'public' and name your implementation details with an underscore. People who need them will surely know they are implementation details that can change in the future but they will be happy: They will get things done.
IOW, empower the user instead of straitjacketing them. My favorite programming modus operandi. Along the same lines as my philosophy of "everything should be a library, main() is just a convenient (thin) interface to access the library API".
[...]
> The main topic here is about the harm caused by rich types surrounding algorithms. Let's say I am interested in using an open source algorithm that works with a memory area. (Not related to D.) We all know that a memory area can be described by a fat pointer like D's slices. So, that is what the algorithm should take.
>
> Unfortunately, the poor little algorithm is not free to be used: It is written to work with a custom type of that library; let's call it MySlice, which is produced by MyMemoryMappedFile, which is produced by MyFile, which is initialized only by types like MyFilePath. (I may have gotten the relationships wrong there.)
That's a sign of poorly-factored code. The logically-separate parts of the code are not properly separated out, causing them to be dependent on each other where they technically should not be. Doing this right is actually a lot harder than it looks; it often requires significant amounts of refactoring after your initial implementation, because until you write the thing out in code, it isn't always clear which parts are actually dependent and which parts can be separated.
Idioms like pipeline programming with ranges help to identify independent pieces of the logic, and abstractions like the range API help you actually separate out the pieces in a clean way. Without a unifying common API like ranges, it's pretty tough to write code in composable pieces that can be freely mixed-and-matched with each other.
https://wiki.dlang.org/Component_programming_with_ranges
Well, obviously you already know about this article, but one of my motivations for writing that article was precisely what you describe above.
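To make the range idea concrete, here is a minimal sketch (not taken from the article; `squaresOfEvens` is a made-up name for illustration) of a component that is written against the range API only, so it neither knows nor cares where its input comes from:

```d
import std.algorithm : filter, map, sum;
import std.range : iota, isInputRange, ElementType;

// Hypothetical component: accepts any input range whose elements convert
// to int. It is not tied to a particular container, file, or buffer type.
auto squaresOfEvens(R)(R numbers)
    if (isInputRange!R && is(ElementType!R : int))
{
    return numbers
        .filter!(n => n % 2 == 0)   // keep the even values
        .map!(n => n * n);          // square each one, lazily
}

void main()
{
    // The same component fed from an array the caller already owns...
    int[] fromArray = [1, 2, 3, 4, 5, 6];
    assert(squaresOfEvens(fromArray).sum == 4 + 16 + 36);

    // ...and from a lazily generated sequence, with no adapter code.
    assert(squaresOfEvens(iota(1, 7)).sum == 56);
}
```

The only contract between the pieces is the range API itself, which is what lets them be mixed and matched freely.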
> But my data is already in a memory area that I own! How can I call that algorithm? Should I write it to a file first and then use those rich types to access the algorithm? That should not be necessary...
>
> Of course I understand the benefits of all those types but the core algorithm should be as free as possible. So, this is simply wrong. I think us, software developers, have been on the wrong path. Our task should primarily be about getting things done first.
Over the years, I've been dreaming about the ideal situation where there would be libraries of algorithms that are not tied to a specific implementation (i.e., bound to concrete types and parameter values), but are written in a form that encapsulates only their core logic. You'd then pull in the algorithm by specifying which concrete type(s) to bind its various parts to, and it'd Just Work(tm). That's the way things should have been from the beginning.
But the situation today is far from that ideal: you have libraries that solve some particular programming problem X, but to use the library's solution you also need to use Y, Z, and W that the author of that library happened to choose. For instance, the FreeType library implements rasterization algorithms, but you can't access those algorithms directly. You have to use the library API, which abstracts away file handling, memory management, image type, etc. In order to cater to different user needs, an entire complicated API is invented to allow the user to specify certain parameters the authors deem tweakable, while an elaborate scheme is designed to hide the rest of the information away. You can't effectively use the rasterization algorithm without also using all of these other peripheral types; and when you need to interface FreeType with another library that uses other, different concrete types, you end up having to write lots of shunt code whose sole purpose is to bridge between incompatible types that actually do equivalent things.
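As a rough illustration of the "core logic only" ideal (a toy sketch, nothing to do with how FreeType is actually structured; `drawLine` and `plot` are hypothetical names), here is a standard Bresenham line walk that has no notion of an image type at all. The caller binds the output side at the point of use:

```d
import std.math : abs;

// Core logic only: walk the line from (x0, y0) to (x1, y1) and hand each
// point to `plot`. What a "pixel" or an "image" is, is the caller's business.
void drawLine(alias plot)(int x0, int y0, int x1, int y1)
{
    immutable dx = abs(x1 - x0), dy = -abs(y1 - y0);
    immutable sx = x0 < x1 ? 1 : -1, sy = y0 < y1 ? 1 : -1;
    int err = dx + dy;
    for (;;)
    {
        plot(x0, y0);
        if (x0 == x1 && y0 == y1) break;
        immutable e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }
        if (e2 <= dx) { err += dx; y0 += sy; }
    }
}

void main()
{
    // Binding 1: a plain in-memory character grid owned by the caller.
    char[4][4] grid;
    foreach (ref row; grid) row[] = '.';
    drawLine!((x, y) { grid[y][x] = '#'; })(0, 0, 3, 3);
    assert(grid[0][0] == '#' && grid[3][3] == '#' && grid[0][3] == '.');

    // Binding 2: no image at all, just count the points visited.
    int count;
    drawLine!((x, y) { count++; })(0, 0, 5, 2);
    assert(count == 6);
}
```

Neither binding needed any shunt code, because the algorithm never committed to a concrete output type in the first place.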
> I could work with those types if they had virtual interfaces. But no. They are un-subtypable C++ 'class'es.
>
> I think it could also work if the algorithm was templatized; but again, no...
[...]
In cases like this, I often get really tempted to copy-n-paste the code and templatize it myself. :-D Of course, that's usually impractical, so the next best thing is to use D's compile-time introspection capabilities to autogenerate boilerplate shunt code to work around API infelicities in the target library, and export a nicer API on the D side. :-D Not always possible, of course, like in your case, where you'd have to either copy-n-paste code and do un-@safe casts, or live with infelicities like writing stuff to a file and opening it via the official API.
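For the simple cases, the kind of autogenerated shunt code I mean looks something like the following sketch. The two struct types and the `convert` helper are invented for illustration; a real target library would need more care (nested types, ownership, and so on):

```d
import std.traits : FieldNameTuple;

// Two hypothetical, unrelated libraries describing the same thing with
// different concrete types.
struct LibAPoint { int x; int y; }
struct LibBPoint { long y; long x; double weight = 1.0; }

// Generic shunt: copy every field of `src` into the same-named field of a
// fresh `To`. Fields that exist only in `To` keep their default values.
To convert(To, From)(From src)
{
    To dst;
    static foreach (name; FieldNameTuple!From)
    {
        // Fail at compile time if the target has no matching field.
        static assert(__traits(hasMember, To, name),
            To.stringof ~ " has no field named " ~ name);
        __traits(getMember, dst, name) = __traits(getMember, src, name);
    }
    return dst;
}

void main()
{
    auto b = convert!LibBPoint(LibAPoint(3, 4));
    assert(b.x == 3 && b.y == 4 && b.weight == 1.0);
}
```

If either library adds or renames a field, the shunt either keeps working or breaks at compile time instead of silently copying garbage.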
(I had to do something similar once in my day job, interfacing with a grossly over-engineered C++ framework that nobody fully understood nor wanted anything to do with if they could help it -- I ended up having to write a hack where a single function call involved 7 layers of abstraction, one of which involved writing a struct to a temporary file on one side of an RPC call and having the other side (a daemon process) read from the file and cast it back to the struct. The result was the stuff of nightmares that, to everyone's great relief, was phased out a couple of releases later. We relished every moment of typing `\rm -rf` on that entire old codebase after its replacement became fully functional.)
T
--
2+2=4. 2*2=4. 2^2=4. Therefore, +, *, and ^ are the same operation.