How to fake pure
Apr 08
Dom DiSc
April 08

Hi.

I want to create my own allocator, but using malloc is not pure (it internally has to have some "global state"). But the GC also has this "global state" and is still considered "pure".
So the impurity of the GC's allocator has been hidden internally.

How can I do this?
Adding @trusted doesn't do the trick, and declaring it extern(C) doesn't help either.
Do I have to compile the file containing the allocator with some special switches turned on or off, or something like that?

April 08
On Tuesday, April 8, 2025 5:28:57 AM MDT Dom DiSc via Digitalmars-d-learn wrote:
> Hi.
>
> I want to create my own allocator, but using malloc is not pure
> (it internally has to have some "global state"). But the GC also
> has this "global state" and still is considered "pure".
> So internally the impurity of the allocators has been hidden.
>
> How can I do this?
> adding @trusted doesn't do the trick, declaring it extern(C) also
> doesn't help.
> Do I have to compile the file containing the allocator with some
> special switches turned on or off or something?

You basically have to lie to the compiler, either by casting the function pointer to pure or by using an extern(C) shim which lies (since C functions have no name mangling, it's possible for the attributes of the declaration to not match the attributes of the definition).
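
A minimal sketch of both approaches, for illustration only. The helper `assumePure` and the names `castAlloc`, `castFree`, `shimMalloc` and `shimFree` are hypothetical and not part of any library; the helper follows the idiom shown in the std.traits documentation for `SetFunctionAttributes`. Neither variant does anything about errno, so this is not a drop-in replacement for what druntime does.

```d
import core.stdc.stdlib : malloc, free;
import std.traits : FunctionAttribute, SetFunctionAttributes,
                    functionAttributes, functionLinkage,
                    isDelegate, isFunctionPointer;

// Hypothetical helper: returns the same function pointer, retyped so that
// the compiler believes the target is pure. This is the "lie".
auto assumePure(T)(T fn) @trusted
if (isFunctionPointer!T || isDelegate!T)
{
    enum attrs = functionAttributes!T | FunctionAttribute.pure_;
    return cast(SetFunctionAttributes!(T, functionLinkage!T, attrs)) fn;
}

// Approach 1: cast the function pointer to pure and call through it.
void* castAlloc(size_t n) pure nothrow @nogc @trusted
{
    return assumePure(&malloc)(n);
}

void castFree(void* p) pure nothrow @nogc @trusted
{
    assumePure(&free)(p);
}

// Approach 2: an extern(C) shim. The declarations claim to be pure, but the
// linker resolves them to the C library's malloc and free, because the
// symbol names carry no attribute information.
extern (C) pure nothrow @nogc @system
{
    pragma(mangle, "malloc") void* shimMalloc(size_t size);
    pragma(mangle, "free") void shimFree(void* ptr);
}
```

In both cases the compiler has no way to check the claim, so any mistake in reasoning about purity is entirely on the programmer.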

That being said, core.memory has pureMalloc and pureFree, which do that for you already, including mucking around with errno to ensure that it's not changed even though malloc and free can change it.
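
As a rough illustration of using those two functions, the sketch below allocates and releases an array of ints from pure code. `makeInts` and `freeInts` are hypothetical names, and the code assumes that `n * int.sizeof` does not overflow.

```d
import core.memory : pureMalloc, pureFree;

// Allocates n zero-initialized ints outside the GC heap.
// Returns null if the allocation fails.
int[] makeInts(size_t n) pure nothrow @nogc @trusted
{
    auto p = cast(int*) pureMalloc(n * int.sizeof);
    if (p is null)
        return null;
    p[0 .. n] = 0;
    return p[0 .. n];
}

// Releases memory obtained from makeInts. The caller must guarantee that
// no other reference to the array is used afterwards, hence @system.
void freeInts(int[] a) pure nothrow @nogc @system
{
    pureFree(a.ptr);
}
```

A caller would pair every `makeInts` with exactly one `freeInts`, as with plain malloc and free.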

That being said, personally, I think that it's a mistake to try to force pure like this (and I don't think that pureMalloc or pureFree should ever have been added). We already have a variety of compiler bugs related to doing stuff based on pure (e.g. incorrectly implicitly converting to immutable, because the compiler decides that the result has to be unique, but it got the logic wrong, and it's not unique). And it seems to be sufficiently difficult to reason about exactly what is implied by pure and how the compiler will react because of it that determining whether what you're doing is "logically" pure is very error-prone in general. In general, when we try to make the compiler do more because of pure or treat more stuff like pure, we end up with bugs, because it's pretty much a house of cards.
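
To make the immutable point concrete, here is a small sketch of the conversion in its intended, legitimate form. `makeZeros` is a hypothetical name. Because the function is pure and its parameter carries no mutable indirection, the compiler treats the result as unique and allows it to become immutable without a cast.

```d
// Pure factory: the returned array cannot alias anything the caller
// already holds, so the compiler considers it unique.
int[] makeZeros(size_t n) pure @safe nothrow
{
    return new int[](n);
}

void demo() @safe
{
    // Implicit conversion from int[] to immutable(int[]), no cast needed.
    immutable int[] frozen = makeZeros(3);
    assert(frozen.length == 3);
}
```

If a function is only faked as pure and actually returns memory that is still reachable elsewhere, the same conversion would presumably produce "immutable" data that can still be mutated, which is the kind of problem that is hard to track down.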

Of course, I'm also increasingly of the opinion that pure was a mistake in general, because it does almost nothing in practice but routinely doesn't work with straightforward code - and it's definitely one of those attributes which gets in the way when code needs to be refactored, since it's easy to end up in a situation where you can't have a bunch of code be pure any longer just because of one change that you need to make deep in the call stack somewhere, and it can be very difficult and time-consuming to refactor in such situations. So, by using pure, you're typically getting no actual benefits, but you're often making your life harder down the line.

So, my advice is to only use pure when you actually need the guarantees that it provides, and _maybe_ if your function is really simple, and you can be absolutely sure that you're never going to need to make significant changes to it, you could make it pure to enable its use in pure code, but in general, I think that using pure is just a mistake that's going to cause problems down the line - particularly with larger code bases. And that's without considering the issues of lying to the compiler about what's pure in order to get code to be treated as pure when the compiler doesn't think that it is. If you lie to the compiler and get it wrong, you can get issues that will be pretty hard to catch or debug, whereas if you just don't bother with pure, you avoid all of the associated problems without actually losing anything in almost all cases.

- Jonathan M Davis



April 08

On Tuesday, 8 April 2025 at 14:00:56 UTC, Jonathan M Davis wrote:
> You basically have to lie to the compiler and cast the function pointer to pure [...].
>
> That being said, core.memory has pureMalloc and pureFree which do that for you already, including mucking around with errno to ensure that it's not changed even though malloc and free can change it.

Oh, cool. Thank you very much.

> That being said, personally, I think that it's a mistake to try to force pure like this [...]. We already have a variety of compiler bugs related to doing stuff based on pure.
>
> - Jonathan M Davis

Yeah, maybe you're right, but I use pure more for myself - it tells me that a function doesn't depend on global state (besides allocation). And indeed, I use it only on small functions where I feel that I understand what they're doing...

April 08
On Tuesday, 8 April 2025 at 14:00:56 UTC, Jonathan M Davis wrote:
> Of course, I'm also increasingly of the opinion that pure was a mistake in general, because it does almost nothing in practice but routinely doesn't work with straightforward code - and it's definitely one of those attributes which gets in the way when code needs to be refactored

And you really need pureMalloc to be applicable; however, it's a lie, since the GC and malloc obviously have global state.
April 08
On Tuesday, April 8, 2025 9:07:45 AM MDT Guillaume Piolat via Digitalmars-d-learn wrote:
> On Tuesday, 8 April 2025 at 14:00:56 UTC, Jonathan M Davis wrote:
> > Of course, I'm also increasingly of the opinion that pure was a mistake in general, because it does almost nothing in practice but routinely doesn't work with straightforward code - and it's definitely one of those attributes which gets in the way when code needs to be refactored
>
> And you really need pureMalloc to be applicable; however, it's a lie, since the GC and malloc obviously have global state.

It at least makes more sense with the GC, because the freeing portion only occurs when it's guaranteed that the program no longer has access to the data, so the global state isn't really affected other than the bookkeeping that the GC does (which isn't normally accessed by the program and isn't pure if it is) or whether there's going to be memory available to allocate when new is called next (which doesn't really matter for the global state either, since failing to allocate memory is an Error, killing the program). On the other hand, pureFree has no such guarantee, since it's up to the programmer to make sure that it's not used when other references to the data still exist, and they can easily screw that up.

But yeah, having memory allocations (from the GC or otherwise) work with pure code is on some level a lie about global state. And it ultimately adds to the complication of what pure really means and what can be done based on it.

- Jonathan M Davis



April 08
On 4/8/25 17:07, Guillaume Piolat wrote:
> On Tuesday, 8 April 2025 at 14:00:56 UTC, Jonathan M Davis wrote:
>> Of course, I'm also increasingly of the opinion that pure was a mistake in general, because it does almost nothing in practice but routinely doesn't work with straightforward code - and it's definitely one of those attributes which gets in the way when code needs to be refactored
>
> And you really need pureMalloc to be applicable; however, it's a lie, since the GC and malloc obviously have global state.

It's not really a lie. This is just a confusion of levels of abstraction. Purely functional languages use memory allocators too, and they can also run out of memory.

The underlying problem here is that `pure` in D does not have a formal semantic meaning and there are some highly questionable choices such as dependencies on memory addresses and FPU state being considered `pure`.

In any case, casting a memory allocator to `pure` should be fine. Any reasonable definition of `pure` we can come up with in the future would be compatible with that.
April 08
On Tuesday, 8 April 2025 at 15:54:52 UTC, Timon Gehr wrote:
> In any case, casting a memory allocator to `pure` should be fine. Any reasonable definition of `pure` we can come up with in the future would be compatible with that.

Yes, this is also what I think. Of course an allocator cannot be strongly pure, but it should keep the mental burden of memory management away from the rest of a program. So I always thought of "D pure" as something like "pure except for allocation".

Other exceptions from pure are much harder to defend and should be avoided if possible - e.g. reliance on the memory address of objects is not something I would consider "pure" - the allocator will always give me different values, so this is a clear violation of purity.
Garbage like

int foo() @safe pure { int x; return cast(int)&x; }

should not compile. But it does :-(
Maybe one could say, yes this is safe (also questionable), but it is NOT pure. Never ever.

A pure function called with the same parameters should always produce the same output, except perhaps when it aborts for external reasons (which for me includes an insufficient amount of available memory and interrupts by the OS... and of course user interference).
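
The distinctions used in this thread can be sketched roughly as follows. All three function names are hypothetical, and "strongly pure" and "weakly pure" are used here in their informal community sense rather than as formal definitions.

```d
// Strongly pure: takes only values, touches no global state, and the same
// argument always gives the same result.
int sumTo(int n) pure @safe nothrow @nogc
{
    int total = 0;
    foreach (i; 1 .. n + 1)
        total += i;
    return total;
}

// Weakly pure: may mutate data reachable through its parameters, but still
// touches no global state.
void fill(int[] buf, int value) pure @safe nothrow @nogc
{
    buf[] = value;
}

// "Pure except for allocation": the contents of the result depend only on
// n, but every call returns freshly allocated memory at a new address.
int[] firstN(int n) pure @safe nothrow
{
    auto result = new int[](n);
    foreach (i, ref x; result)
        x = cast(int) i;
    return result;
}
```

Comparing the contents of two `firstN(3)` results gives equality, while comparing their addresses does not, which is exactly why depending on addresses sits badly with the "same parameters, same output" view.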