H. S. Teoh
Posted in reply to bachmeier
| On Fri, Jun 30, 2023 at 02:41:00PM +0000, bachmeier via Digitalmars-d wrote:
> On Friday, 30 June 2023 at 11:07:33 UTC, Atila Neves wrote:
>
> > API design is indeed hard. Which makes it all the more imperative to not accidentally design one with implementation details that users downstream start depending on. That is: API design needs to be a conscious opt-in decision and not "I guess I didn't think about the consequences of leaving the door to my flat open all the time and now there are people camping in my living room".
>
> Private is more like locking everyone else's doors for their own safety. In the cases that it keeps an intruder out, it was helpful to them. When grandma had to sleep on the sidewalk, not so much. Many times library authors have prevented me from doing my work because of arbitrarily preventing access to implementation details. I should have the option to override those decisions. If something blows up, or if my code gets broken in the future, it's my fault, because I was the one that made that decision.
The thing is, both of the above are true.
Private does have its uses: to hide implementation details from unrelated parts of the code so that, especially in a large project with many contributors, you don't end up with accidental dependencies between parts of the code that really shouldn't depend on each other. Hairball dependencies among unrelated modules are a major cause of unmaintainability in large projects, and preventing them goes a long way towards reducing long-term maintenance costs.
The other side to this, however, is that deciding what should be private and what shouldn't is a hard problem, and most people either can't figure it out, or can't be bothered to put in the effort to get it right, so they slap private on everything, making it hard to reuse their code outside of the narrow confines of how they initially envisioned it. So you end up with an API that covers the most common use cases but not others, which causes a lot of frustration when downstream code wants to do something the API doesn't allow, and its authors have to resort to copy-pasta or breaking private. (See: API design is hard.)
Most people design APIs around how they envision the module would be (or ought to be) used, at a relatively high level of abstraction, without regard to the core algorithms that would be used to implement it: what we may call a "use-centric API". Contrary to popular belief, this is actually a mistake. It frequently leads to the situation where a useful algorithm that might benefit other parts of the code gets locked behind the private implementation of the module, because it doesn't directly map to the external API. This in turn promotes code duplication: if my module also needs some variant of the same algorithm, I have to copy-n-paste it or re-implement it from scratch in my own module -- usually also behind `private`, so the next person that comes along will need to do it again. It actually *reduces* code reuse. It also fosters the desire to break private: I realize that the algorithm is already implemented, so I wish I could break private in order to avoid rewriting it myself.
A better approach is an algorithm-centric API design: in the course of implementing a module (or library), identify the core algorithms that solve the main problems that the module/library is trying to solve, and design the API around exposing these algorithms to user code. Then on top of that, add some syntactic sugar that maps them to the high-level usage of the module (the use-centric API). There may still be private parts, but these are confined to internal details of the algorithms that outside code truly doesn't need to know, not a blanket default that may unintentionally exclude certain unusual, but valid, use cases.
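To make this concrete, here is a minimal sketch of what an algorithm-centric layout might look like. It is written in Python rather than D purely for brevity, with a leading underscore standing in for `private`. The spell-suggestion module and every name in it (`edit_distance`, `suggest`, `_next_row`) are hypothetical, made up for illustration only.

```python
"""Hypothetical spell-suggestion module, laid out algorithm-first."""


def _next_row(prev_row, ch, target):
    # Internal detail: computes one row of the dynamic-programming table.
    # Outside code has no use for this, so it stays private by convention.
    row = [prev_row[0] + 1]
    for j, tch in enumerate(target, start=1):
        cost = 0 if ch == tch else 1
        row.append(min(prev_row[j] + 1,           # delete ch from source
                       row[j - 1] + 1,            # insert tch into source
                       prev_row[j - 1] + cost))   # substitute (or match)
    return row


def edit_distance(source, target):
    """Core algorithm, deliberately public: Levenshtein distance."""
    row = list(range(len(target) + 1))
    for ch in source:
        row = _next_row(row, ch, target)
    return row[-1]


def suggest(word, dictionary, limit=3):
    """Syntactic sugar for the common case: the closest dictionary words.

    Ties in distance are broken alphabetically so the result is stable.
    """
    ranked = sorted(dictionary, key=lambda w: (edit_distance(word, w), w))
    return ranked[:limit]
```

Most callers would only ever touch `suggest`. But a different module that needs, say, fuzzy de-duplication of names can call `edit_distance` directly instead of copy-pasting it, and only the row-update helper remains hidden.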
There is an important philosophical difference between these two approaches. The first approach tends towards the philosophy of "you have problem X, no problem, hand it over to us (the library), we'll perform the magic to solve it, and we'll give you back the result Y". The method of solution is opaque and hidden from user code. IOW, the hood is welded shut; your only recourse in case of problems is to take it back to the dealer (the library author). The second approach has the philosophy "you have problem X, we (the library) will give you tools A, B, C, that you can use to solve problem X. In addition, we provide you special combo D (syntactic sugar functions) that will solve X the usual way without you having to figure out how to combine A, B, and C in the right way." The hood is open and you may fiddle with the things inside if you know what you're doing. But most of the time you won't need to -- the syntactic sugar functions handle the most common use cases for you.
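As a rough illustration of the "tools A, B, C plus combo D" idea, consider the following sketch. It is in Python rather than D, and the word-frequency library and all of its function names are hypothetical, chosen only to show the shape of such an API.

```python
import re
from collections import Counter


def tokenize(text):
    """Tool A: split text into lowercase words."""
    return re.findall(r"[a-z']+", text.lower())


def count(items):
    """Tool B: count occurrences of each (hashable) item."""
    return Counter(items)


def top(counter, n):
    """Tool C: the n most frequent items, ties broken by item order."""
    return sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


def most_common_words(text, n=3):
    """Combo D: the usual way of combining A, B and C."""
    return top(count(tokenize(text)), n)


# A use case the library author did not anticipate: frequent word pairs.
# Because A, B and C are public, user code can recombine them directly.
def most_common_bigrams(text, n=3):
    words = tokenize(text)
    return top(count(zip(words, words[1:])), n)
```

Here `most_common_bigrams` lives in user code, not in the library. Had only `most_common_words` been public, the user would have had to re-implement tokenizing and ranking, or go back to the library author and ask for a new feature.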
The first approach empowers the library writer, the second approach empowers the user. My argument is that the second approach is superior. No abstraction is perfect (otherwise it wouldn't be an abstraction!); there will always be cases where the user needs to go under the hood and do something the library author didn't envision initially. Give the user the tools to do so without breaking encapsulation, instead of forcing them to come back to you for help.
T
--
Claiming that your operating system is the best in the world because more people use it is like saying McDonalds makes the best food in the world. -- Carl B. Constantine
|