Algorithms should be free from rich types

June 27, 2023

Posted by Ali Çehreli

My mind is not fully clear on this topic yet but some related things have been brewing in me for years.

First, an aside: You may remember my minor complaint about 'private' during a DConf presentation years ago. Today, I feel even more strongly that disallowing access to parts of software "just because" of good design is a mistake. I've seen multiple examples of this in professional life where a developer uses 'private' only because it is "of course" better to do so. (The Turkish word "işgüzar" and the German word "verschlimmbessern" describe the situation pretty well for me but the English language lacks such a word.)

To give an example from D's ecosystem, the D runtime's garbage collector statistics object used to be 'private'. (I think there is an interface for it now.) What an inconvenience it was to copy/paste that type's definition from the runtime to user code, get the compiled symbol of the object from the library, and pointer cast it to be able to access the members! A 'static assert' attempts to protect the project from changes to that type...

The idea of 'private' should be to just give the developer freedom to change the implementation in the future. It should not impede use cases that people come up with. That can be achieved practically with an underscore: Make everything 'public' and name your implementation details with an underscore. People who need them will surely know they are implementation details that can change in the future but they will be happy: They will get things done.
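A minimal sketch of that underscore convention in D. The `GcStats` type and its fields are hypothetical stand-ins, not the runtime's actual statistics object:

```d
// Hypothetical library type: everything is public, but names starting with
// an underscore are implementation details that may change at any time.
struct GcStats
{
    // Stable, documented API.
    size_t usedBytes() const { return _used; }

    size_t _used;        // implementation detail
    size_t _freeListLen; // implementation detail
}

void main()
{
    auto s = GcStats(1024, 3);
    assert(s.usedBytes == 1024);

    // A user with an unusual need can still reach the detail, knowing it
    // is not covered by any stability promise.
    assert(s._freeListLen == 3);
}
```

Nothing has to be copied, re-declared, or pointer cast; the underscore alone carries the "may change" warning.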

Ok, that rant is over.

The main topic here is about the harm caused by rich types surrounding algorithms. Let's say I am interested in using an open source algorithm that works with a memory area. (Not related to D.) We all know that a memory area can be described by a fat pointer like D's slices. So, that is what the algorithm should take.

Unfortunately, the poor little algorithm is not free to be used: It is written to work with a custom type of that library; let's call it MySlice, which is produced by MyMemoryMappedFile, which is produced by MyFile, which is initialized only by types like MyFilePath. (I may have gotten the relationships wrong there.)

But my data is already in a memory area that I own! How can I call that algorithm? Should I write it to a file first and then use those rich types to access the algorithm? That should not be necessary...

Of course I understand the benefits of all those types but the core algorithm should be as free as possible. So, this is simply wrong. I think we software developers have been on the wrong path. Our task should primarily be about getting things done first.

I could work with those types if they had virtual interfaces. But no. They are un-subtypable C++ 'class'es.

I think it could also work if the algorithm was templatized; but again, no...
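One way to picture the alternative, sketched in D for brevity even though the library in question is C++. The additive `checksum` merely stands in for the real algorithm, and `MySlice` here is a hypothetical reduction of the rich type described above:

```d
import std.stdio : writeln;

// Core algorithm: depends on nothing but a plain slice (pointer + length).
uint checksum(const(ubyte)[] data) pure nothrow @nogc @safe
{
    uint sum = 0;
    foreach (b; data)
        sum += b;
    return sum;
}

// Hypothetical rich type, standing in for something like MySlice.
struct MySlice
{
    private const(ubyte)[] bytes_;
    const(ubyte)[] bytes() const pure nothrow @nogc @safe { return bytes_; }
}

// Thin convenience overload: the rich type is layered on top of the core
// algorithm instead of being the only way in.
uint checksum(MySlice s) pure nothrow @nogc @safe
{
    return checksum(s.bytes);
}

void main()
{
    ubyte[] mine = [1, 2, 3, 4];           // memory the caller already owns
    assert(checksum(mine) == 10);          // no file, no MySlice required
    assert(checksum(MySlice(mine)) == 10); // rich-type users lose nothing
    writeln("ok");
}
```

The rich types keep all their benefits for callers who want them, while data that is already in memory goes straight to the algorithm.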

Hey! Thank you! I feel better already. :)

Ali

June 27, 2023

Re: Algorithms should be free from rich types

Posted by H. S. Teoh
in reply to Ali Çehreli

On Tue, Jun 27, 2023 at 02:53:59PM -0700, Ali Çehreli via Digitalmars-d wrote: [...]
> First, an aside: You may remember my minor complaint about 'private' during a DConf presentation years ago. Today, I feel even more strongly that disallowing access to parts of software "just because" of good design is a mistake. I've seen multiple examples of this in professional life where a developer uses 'private' only because it is "of course" better to do so. (The Turkish word "işgüzar" and the German word "verschlimmbessern" describe the situation pretty well for me but the English language lacks such a word.)

I can't resist me a Walter quote here:

	I've been around long enough to have seen an endless parade of
	magic new techniques du jour, most of which purport to remove
	the necessity of thought about your programming problem.  In the
	end they wind up contributing one or two pieces to the
	collective wisdom, and fade away in the rearview mirror. --
	Walter Bright

When you start doing something with the code because that's what everybody else does, or because it's what everyone else says is "the Right Thing(tm)", then it's just cargo-culting, which inevitably leads to problems down the road.

> To give an example from D's ecosystem, the D runtime's garbage collector statistics object used to be 'private'. (I think there is an interface for it now.) What an inconvenience it was to copy/paste that type's definition from the runtime to user code, get the compiled symbol of the object from the library, and pointer cast it to be able to access the members! A 'static assert' attempts to protect the project from changes to that type...

Thing is, things like these usually come from temporary hacks in the code that the original coder didn't want to set in stone, but that end up staying put because of inertia and becoming de facto set in stone.

> The idea of 'private' should be to just give the developer freedom to change the implementation in the future. It should not impede use cases that people come up with. That can be achieved practically with an underscore: Make everything 'public' and name your implementation details with an underscore.  People who need them will surely know they are implementation details that can change in the future but they will be happy: They will get things done.

IOW, empower the user instead of straitjacketing them. My favorite programming modus operandi. Along the same lines as my philosophy of "everything should be a library, main() is just a convenient (thin) interface to access the library API".

[...]
> The main topic here is about the harm caused by rich types surrounding algorithms. Let's say I am interested in using an open source algorithm that works with a memory area. (Not related to D.) We all know that a memory area can be described by a fat pointer like D's slices. So, that is what the algorithm should take.
>
> Unfortunately, the poor little algorithm is not free to be used: It is written to work with a custom type of that library; let's call it MySlice, which is produced by MyMemoryMappedFile, which is produced by MyFile, which is initialized only by types like MyFilePath. (I may have gotten the relationships wrong there.)

That's a sign of poorly-factored code. The logically-separate parts of the code are not properly separated out, causing them to be dependent on each other where they technically should not be.  Doing this right is actually a lot harder than it looks; it often requires significant amounts of refactoring after your initial implementation, because until you write the thing out in code, it isn't always clear which parts are actually dependent and which parts can be separated.

Idioms like pipeline programming with ranges help to identify independent pieces of the logic, and abstractions like the range API help you actually separate out the pieces in a clean way. Without a unifying common API like ranges, it's pretty tough to write code in composable pieces that can be freely mixed-and-matched with each other.

	https://wiki.dlang.org/Component_programming_with_ranges

Well, obviously you already know about this article, but one of my motivations for writing that article was precisely what you describe above.
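For readers who have not seen the style, here is a small sketch of a range-based component. The name `squaresOfEvens` is made up for illustration; `filter`, `map`, `sum`, `iota`, `isInputRange` and `ElementType` are the usual Phobos facilities:

```d
import std.algorithm : filter, map, sum;
import std.range : ElementType, iota, isInputRange;

// A reusable component: it accepts any input range whose elements convert
// to int, whether the data comes from an array, a file, or a generator.
auto squaresOfEvens(R)(R r)
if (isInputRange!R && is(ElementType!R : int))
{
    return r.filter!(x => x % 2 == 0).map!(x => x * x);
}

void main()
{
    int[] owned = [1, 2, 3, 4];
    assert(squaresOfEvens(owned).sum == 20);      // 4 + 16

    // The same component over a lazily generated range, no array needed.
    assert(squaresOfEvens(iota(1, 7)).sum == 56); // 4 + 16 + 36
}
```

The component never names a concrete container, so the caller decides where the data lives.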

> But my data is already in a memory area that I own! How can I call that algorithm? Should I write it to a file first and then use those rich types to access the algorithm? That should not be necessary...
> 
> Of course I understand the benefits of all those types but the core algorithm should be as free as possible. So, this is simply wrong. I think we software developers have been on the wrong path. Our task should primarily be about getting things done first.

Over the years, I've been dreaming about the ideal situation where there would be libraries of algorithms that are not tied to a specific implementation (i.e., bound to concrete types and parameter values), but are written in a form that encapsulates only its core logic.  You'd then pull in the algorithm by specifying which concrete type(s) to bind its various parts to, and it'd Just Work(tm).  That's the way things should have been from the beginning.

But the situation today is far from that ideal: you have libraries that solve some particular programming problem X, but to use the library's solution you also need to use Y, Z, and W that the author of that library happened to choose. For instance, the FreeType library implements rasterization algorithms, but you can't access those algorithms directly. You have to use the library API, which abstracts away file handling, memory management, image type, etc. In order to cater to different user needs, an entire complicated API is invented to allow the user to specify certain parameters the authors deem tweakable, while an elaborate scheme is designed to hide the rest of the information away. You can't effectively use the rasterization algorithm without also using all of these other peripheral types; and when you need to interface FreeType with another library that uses other, different concrete types, you end up having to write lots of shunt code whose sole purpose is to bridge between incompatible types that actually do equivalent things.

> I could work with those types if they had virtual interfaces. But no. They are un-subtypable C++ 'class'es.
> 
> I think it could also work if the algorithm was templatized; but again, no...
[...]

In cases like this, I often get really tempted to copy-n-paste the code and templatize it myself. :-D  Of course, in practice that's usually impractical, so the next best thing is to use D's compile-time introspection capabilities to autogenerate boilerplate shunt code to work around API infelicities in the target library, and export a nicer API on the D side. :-D  Not always possible, of course, like in your case, where you'd have to either copy-n-paste code and do un-@safe casts, or live with infelicities like writing stuff to a file and opening it via the official API.
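As a rough illustration of autogenerated shunt code, the sketch below copies fields by name between two unrelated but structurally equivalent types. `shunt`, `LibAPoint` and `LibBPoint` are hypothetical names invented for this example; `FieldNameTuple` comes from `std.traits`:

```d
import std.traits : FieldNameTuple;

// Generate the field-by-field bridging code at compile time instead of
// writing it by hand for every pair of equivalent types.
To shunt(To, From)(From from)
{
    To to;
    static foreach (name; FieldNameTuple!To)
    {
        static assert(__traits(hasMember, From, name),
            From.stringof ~ " has no field named " ~ name);
        __traits(getMember, to, name) = __traits(getMember, from, name);
    }
    return to;
}

struct LibAPoint { int x; int y; }   // hypothetical type from library A
struct LibBPoint { long x; long y; } // hypothetical type from library B

void main()
{
    auto b = shunt!LibBPoint(LibAPoint(3, 4));
    assert(b.x == 3 && b.y == 4);
}
```

This only helps when the fields are reachable and convertible; it does nothing for a type whose sole constructor goes through a file.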

(I had to do something similar once in my day job, interfacing with a grossly over-engineered C++ framework that nobody fully understood nor wanted anything to do with if they could help it -- I ended up having to write a hack where a single function call involved 7 layers of abstraction, one of which involved writing a struct to a temporary file on one side of an RPC call and having the other side (a daemon process) read from the file and cast it back to the struct.  The result was the stuff of nightmares that, to everyone's great relief, was phased out a couple of releases later. We relished every moment of typing `\rm -rf` on that entire old codebase after its replacement became fully functional.)

T

-- 
2+2=4. 2*2=4. 2^2=4. Therefore, +, *, and ^ are the same operation.

June 28, 2023

Re: Algorithms should be free from rich types

Posted by FeepingCreature
in reply to Ali Çehreli

On Tuesday, 27 June 2023 at 21:53:59 UTC, Ali Çehreli wrote:

> My mind is not fully clear on this topic yet but some related things have been brewing in me for years.

I like this approach:

class C {
    private int i;
}
...
void main() @system {
    auto c = new C;
    c.private.i = 5;
}

June 28, 2023

Re: Algorithms should be free from rich types

Posted by Max Samukha
in reply to Ali Çehreli

On Tuesday, 27 June 2023 at 21:53:59 UTC, Ali Çehreli wrote:
> My mind is not fully clear on this topic yet but some related things have been brewing in me for years.
>
> Unfortunately, the poor little algorithm is not free to be used: It is written to work with a custom type of that library; let's call it MySlice, which is produced by MyMemoryMappedFile, which is produced by MyFile, which is initialized only by types like MyFilePath. (I may have gotten the relationships wrong there.)
>
> But my data is already in a memory area that I own! How can I call that algorithm? Should I write it to a file first and then use those rich types to access the algorithm? That should not be necessary...

That's some poorly designed library (Phobos?). A decently designed one would at least allow you to construct a MySlice instance from a (pointer, length) pair.
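A sketch of what such an escape hatch could look like, written in D with a hypothetical `MySlice` (the actual library's type is not shown in this thread):

```d
// Hypothetical rich type that still offers a way in from raw memory.
struct MySlice
{
    private const(ubyte)[] data_;

    // Escape hatch: wrap memory the caller already owns. Marked @system
    // because the caller vouches for the pointer and the length.
    this(const(ubyte)* ptr, size_t length) @system
    {
        data_ = ptr[0 .. length];
    }

    size_t length() const { return data_.length; }
    ubyte opIndex(size_t i) const { return data_[i]; }
}

void main()
{
    ubyte[4] buffer = [10, 20, 30, 40];
    auto s = MySlice(buffer.ptr, buffer.length);
    assert(s.length == 4);
    assert(s[2] == 30);
}
```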

June 28, 2023

Re: Algorithms should be free from rich types

Posted by Ali Çehreli
in reply to FeepingCreature

On 6/28/23 01:00, FeepingCreature wrote:

>      auto c = new C;
>      c.private.i = 5;

I love it. And I actually tried but no, D does not have this yet. :D

Ali

June 28, 2023

Re: Algorithms should be free from rich types

Posted by Ali Çehreli
in reply to Max Samukha

On 6/28/23 02:25, Max Samukha wrote:

> That's some poorly designed library (Phobos?).

Not in the D world at all.

Ironically, I think the library's design is actually pretty good. And that's why I was motivated to write in the first place: Everything was done according to industry best practices but in the end all of that reduces the usability of the library.

Ali

June 29, 2023

Re: Algorithms should be free from rich types

Posted by Richard (Rikki) Andrew Cattermole
in reply to Ali Çehreli

Oh how you dare me.

--- app.d
module app;
import foo;
void main()
{
    Foo foo = new Foo;
    foo.privateGet!"i" = 2;
    foo.say();
}

ref privateGet(string name, From)(ref From from) {
    static foreach(I; 0 .. from.tupleof.length) {
        {
            enum Name = __traits(identifier, from.tupleof[I]);

            static if (Name == name) {
                return from.tupleof[I];
            }
        }
    }

    assert(0);
}

--- foo.d
module foo;
class Foo {
    void say() {
        import std.stdio;
        writeln(i);
    }

private:
    int i;
    bool b;
}

June 28, 2023

Re: Algorithms should be free from rich types

Posted by Hipreme
in reply to Ali Çehreli

On Wednesday, 28 June 2023 at 17:00:44 UTC, Ali Çehreli wrote:
> On 6/28/23 02:25, Max Samukha wrote:
>
> > That's some poorly designed library (Phobos?).
>
> Not in the D world at all.
>
> Ironically, I think the library's design is actually pretty good. And that's why I was motivated to write in the first place: Everything was done according to industry best practices but in the end all of that reduces the usability of the library.
>
> Ali

I have had a problem with `private` ever since I used the LibGDX particle system. I wasn't able to extend it to add collision. Why? Because the particles were `private`. Since then, I have never used `private` without a very, very good reason; the only place I use it right now is for intermediate steps inside a larger process. People in industry know nothing about how to use `protected`. `protected`, IMO, should be the industry standard.

I have worked on a codebase that has been under refactoring for at least 3 years, and so many of the changes end up dropping `private` after some time. Why is that? Because, most of the time, programmers should not fear themselves.
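To make the `protected` point concrete, here is a small D sketch. `ParticleSystem` and `CollidingParticleSystem` are hypothetical classes and not LibGDX's actual API:

```d
// Hypothetical library class: state is protected, so users can extend it.
class ParticleSystem
{
    protected float[] heights;

    this(float[] initial) { heights = initial.dup; }

    void update(float dt)
    {
        foreach (ref h; heights)
            h -= 9.8f * dt; // crude fall at constant speed
    }
}

// Extension written by a user of the library: stop particles at the floor.
class CollidingParticleSystem : ParticleSystem
{
    this(float[] initial) { super(initial); }

    override void update(float dt)
    {
        super.update(dt);
        foreach (ref h; heights)
            if (h < 0)
                h = 0;
    }

    float height(size_t i) const { return heights[i]; }
}

void main()
{
    auto ps = new CollidingParticleSystem([0.5f, 100.0f]);
    ps.update(1.0f);
    assert(ps.height(0) == 0); // hit the floor and stayed there
    assert(ps.height(1) > 90); // still falling
}
```

Had `heights` been `private` in another module, the subclass could not have added collision without modifying the library.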

June 28, 2023

Re: Algorithms should be free from rich types

Posted by bachmeier
in reply to Hipreme

On Wednesday, 28 June 2023 at 17:12:17 UTC, Hipreme wrote:

> I have had a problem with `private` ever since I used the LibGDX particle system. I wasn't able to extend it to add collision. Why? Because the particles were `private`. Since then, I have never used `private` without a very, very good reason; the only place I use it right now is for intermediate steps inside a larger process. People in industry know nothing about how to use `protected`. `protected`, IMO, should be the industry standard.
>
> I have worked on a codebase that has been under refactoring for at least 3 years, and so many of the changes end up dropping `private` after some time. Why is that? Because, most of the time, programmers should not fear themselves.

Rich Hickey:

At some point though, someone is going to need to have access to the data. And if you have a notion of “private”, you need corresponding notions of privilege and trust. And that adds a whole ton of complexity and little value, creates rigidity in a system, and often forces things to live in places they shouldn’t.

If people don’t have the sensibilities to desire to program to abstractions and to be wary of marrying implementation details, then they are never going to be good programmers.

June 28, 2023

Re: Algorithms should be free from rich types

Posted by bachmeier
in reply to FeepingCreature

On Wednesday, 28 June 2023 at 08:00:23 UTC, FeepingCreature wrote:

> I like this approach:
>
> class C {
>     private int i;
> }
> ...
> void main() @system {
>     auto c = new C;
>     c.private.i = 5;
> }

This would be a good change to the language.
