Rebooting the __metada/__mutable discussion

April 06, 2022

Posted by RazvanN

Hello everyone,

Lately, I've been looking at the __metadata/__mutable discussions. For reference, here is the material I have so far:

Forum discussion: https://forum.dlang.org/thread/3f1f9263-88d8-fbf7-a8c5-b3a2a5224ce0@erdani.org?page=1
Timon's original DIP: https://github.com/RazvanN7/DIPs/blob/Mutable_Dip/DIPs/timon_dip.md
My updated DIP which is based on Timon's: https://github.com/RazvanN7/DIPs/blob/Mutable_Dip/DIPs/DIP1xxx-rn.md

We need this to be able to implement generic reference counting. Our main
problem is how to reference count immutable/const objects. Timon's original
proposal tried to implement __metadata in a way that does not affect
purity-based optimizations. I think that this could be done:

A strongly pure function (one that returns a type without indirections) will return the same result when applied to the same immutable arguments.

This is preserved by rewriting

auto a = foo(arg);   // foo -> strongly pure
auto b = foo(arg);

to

auto a = foo(arg);
auto b = a;

This is taken from Timon's DIP.
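As a minimal sketch of that rewrite in today's D (no __metadata involved; sumOfSquares is a made-up name for illustration), a function like the following is strongly pure because its only parameter refers to immutable data and its result has no indirections, so a second call with the same argument could be replaced by the first result:

```d
// Strongly pure: the parameter only reaches immutable data and the
// return type (int) has no indirections.
int sumOfSquares(immutable(int)[] values) pure nothrow @nogc @safe
{
    int total = 0;
    foreach (v; values)
        total += v * v;
    return total;
}

unittest
{
    immutable(int)[] data = [1, 2, 3];
    auto a = sumOfSquares(data);
    // A compiler is allowed to treat this as `auto b = a;`.
    auto b = sumOfSquares(data);
    assert(a == b);
    assert(a == 14);
}
```

The unittest block can be run with something like `dmd -main -unittest -run file.d`.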

The set of references returned from strongly pure functions can be safely converted to immutable or shared.

This is not affected by the introduction of __metadata.

A strongly pure function whose result is not used may be safely elided.

struct S{
    private int __metadata x;
}

void foo(immutable ref S s)pure{
    s.x += 1;
}

void main(){
    immutable S s;
    foo(s); // there is no reason for this call to happen
    assert(s.x==1); // can't rely on this, it might also be 0
}

Essentially, if foo is strongly pure, then the compiler can optimize away
the call to it and your reference count is blown away. If we look at it this
way, the problem seems to be unsolvable. However, __metadata is meant to
be used solely by library developers, and even they should take extra care.
As such, I would propose that __metadata can only be accessed from inside
the aggregate that defines it (private here means Java private) and that methods
that access __metadata directly also need to be private. I think that this makes sense, since the reference count is updated only when you call the copy constructor or the assignment operator. These methods should be public, and they can call incRef and decRef, which must be private. This way, it becomes impossible to access a __metadata field without going through the object's methods. This makes sense, since the object is the only one that should manage the __metadata.
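To make the shape of this proposal concrete, here is a rough sketch in current D. It is only an illustration under assumptions: __metadata does not exist, so the counter is an ordinary field and the type only works for mutable instances; the names RefCounted, incRef, decRef, get and refs are placeholders; and D's private is module-level, not the per-aggregate visibility proposed above. The sketch uses the copy constructor and the destructor as the public entry points.

```d
import core.stdc.stdlib : malloc, free;

struct RefCounted
{
    private int* payload;   // user data (a single int, for illustration)
    private size_t* count;  // the field the proposal would mark __metadata

    this(int initial) @trusted
    {
        payload = cast(int*) malloc(int.sizeof);
        count = cast(size_t*) malloc(size_t.sizeof);
        assert(payload !is null && count !is null);
        *payload = initial;
        *count = 1;
    }

    // Public entry point: copying shares the payload and bumps the count.
    this(ref RefCounted rhs) @trusted
    {
        payload = rhs.payload;
        count = rhs.count;
        incRef();
    }

    // Public entry point: destruction drops one reference.
    ~this() @trusted
    {
        decRef();
    }

    int get() const { return *payload; }
    size_t refs() const { return count is null ? 0 : *count; }

    // Only these private helpers touch the counter directly.
    private void incRef()
    {
        if (count !is null)
            ++*count;
    }

    private void decRef()
    {
        if (count is null)
            return;
        if (--*count == 0)
        {
            free(payload);
            free(count);
        }
        payload = null;
        count = null;
    }
}

unittest
{
    auto a = RefCounted(42);
    assert(a.refs == 1);
    {
        auto b = a;            // copy constructor runs
        assert(a.refs == 2);
    }                          // b's destructor runs here
    assert(a.refs == 1);
}
```

Making the same code work for `immutable RefCounted` is exactly what cannot be done today without violating the type system, which is the gap the DIP is trying to close.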

Now, if we do it like this, then foo will not have any means of accessing x
apart from assigning s, passing s to a function or copy constructing from s.
However, whatever happens to s, once the execution of foo is over, the reference
count at the call site is going to be the same as when foo was called. Why?
Because there is no way to escape a reference to s from a strongly pure function (other than returning it, but that case is taken care of by the first point above).

If we have two subsequent pure function invocations foo(args1...) and bar(args2...) where data transitively reachable from args1 and args2 only overlaps in immutable and const data (this includes data accessed through __mutable fields of const and immutable objects), the two invocations may safely swap their order (of course, this only applies if none of the two functions takes arguments)

This should be unaffected.

A strongly pure function invocation can always exchange order with an adjacent impure function invocation.

This should be unaffected.

What do you think? Am I missing anything? If you think this could fly, I could update the DIP and submit it.

Best regards,
RazvanN

April 06, 2022

Re: Rebooting the __metada/__mutable discussion

Posted by IGotD-
in reply to RazvanN

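On Wednesday, 6 April 2022 at 09:41:52 UTC, RazvanN wrote:
> We need this to be able to implement generic reference counting. So our main
> problem is how do we reference count immutable/const objects. Timon's original
> proposal tried to implement __metadata in a way that does not affect purity
> based optimizations. I think that this could be done: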

Immutable and mutable variables don't mix well at all. With immutable, there is a high probability that the data ends up in read-only memory, so any poking around there will cause the program to crash. Also, reference counting with immutable objects doesn't make any sense.

When it comes to changing data on const objects, such as for reference counting, I think that this is low-level code that works under the hood of the actual language. This means that you will cast away whatever you need in order to get the job done.

April 07, 2022

Re: Rebooting the __metada/__mutable discussion

Posted by rikki cattermole
in reply to IGotD-

On 06/04/2022 11:43 PM, IGotD- wrote:
> On Wednesday, 6 April 2022 at 09:41:52 UTC, RazvanN wrote:
>>
>> We need this to be able to implement generic reference counting. So our main
>> problem is how do we reference count immutable/const objects. Timon's original
>> proposal tried to implement __metadata in a way that does not affect purity
>> based optimizations. I think that this could be done:
> 
> Immutable and mutable variables don't mix well at all. With immutable there is a high probability that it ends up in read only memory so any poking around there will cause the program to crash. Also reference counting with immutable objects doesn't make any sense.
> 
> When it comes to changing data on const objects such as reference counting, I think that this is low level code that works under the hood of the actual language. This means that you will cast away whatever you need in order to get the job done.

There are alternatives: https://forum.dlang.org/post/t2eo0t$mg7$1@digitalmars.com

It is my opinion that a purpose-built escape hatch will be better than a generic escape hatch: there is far less code that you need to review.

April 06, 2022

Re: Rebooting the __metada/__mutable discussion

Posted by Timon Gehr
in reply to IGotD-

On 06.04.22 13:43, IGotD- wrote:
> On Wednesday, 6 April 2022 at 09:41:52 UTC, RazvanN wrote:
>>
>> We need this to be able to implement generic reference counting. So our main
>> problem is how do we reference count immutable/const objects. Timon's original
>> proposal tried to implement __metadata in a way that does not affect purity
>> based optimizations. I think that this could be done:
> 
> Immutable and mutable variables don't mix well at all. With immutable there is a high probability that it ends up in read only memory so any poking around there will cause the program to crash.

The decision whether or not to place `immutable`-qualified data in read-only memory is made at a point where the memory layout of that type is known. Just don't do it if there are any `__mutable` fields.

> Also reference counting with immutable objects doesn't make any sense.
> ...

That's not so clear.

> When it comes to changing data on const objects such as reference counting, I think that this is low level code that works under the hood of the actual language.

Yes, this is all in the realm of @system code.

> This means that you will cast away whatever you need in order to get the job done.
> 

That's disallowed by existing language rules, hence the need to make some modifications to the specification.

April 06, 2022

Re: Rebooting the __metada/__mutable discussion

Posted by Timon Gehr
in reply to RazvanN

On 06.04.22 11:41, RazvanN wrote:
> As such, I would propose that `__metadata` can only be accessed from inside
> the aggregate that defines it (`private` here means `Java private`) and methods
> that access __metadata directly also need to be private.

Those rules are pretty arbitrary and don't buy much. However, accessing `__metadata` should be `@system`.

> 
> What do you think? Am I missing anything?

E.g., how to deallocate memory in a pure destructor? (Solved by `__mutable` functions in my original draft.)

What if the destructor is both pure and immutable?

More generally, there needs to be a story for how to support custom allocators for immutable memory.

April 07, 2022

Re: Rebooting the __metada/__mutable discussion

Posted by RazvanN
in reply to Timon Gehr

On Wednesday, 6 April 2022 at 18:03:25 UTC, Timon Gehr wrote:
> On 06.04.22 11:41, RazvanN wrote:
>> As such, I would propose that `__metadata` can only be accessed from inside
>> the aggregate that defines it (`private` here means `Java private`) and methods
>> that access __metadata directly also need to be private.
>
> Those rules are pretty arbitrary and don't buy much. However, accessing `__metadata` should be `@system`.
>
They buy the fact that __metadata fields cannot be accessed from outside
of the object that implements the reference count. This offers the guarantee
that a strongly pure function will not alter a __metadata field and hence it
can be the subject of any purity-based optimization (with the exception of
functions that do deallocations).

>> 
>> What do you think? Am I missing anything?
>
> E.g., how to deallocate memory in a pure destructor? (Solved by `__mutable` functions in my original draft.)
>
> What if the destructor is both pure and immutable?
>

Destructors suffer from the same issue as postblits did with regard to qualifiers. I would be surprised if you could ever call an immutable destructor. The fact that you cannot overload the destructor makes everything worse. So I would argue that talking about immutable pure destructors in this context is like talking about a broken glass when you're in a house on fire.

However, I do get your point, and in order to support immutable allocators we do need to take into consideration how allocation and deallocation are done.

I think that the problem here is that we are conflating a high-level concept (purity) with low-level operations (allocation/deallocation). The latter category is essentially not pure because it modifies operating system data structures, so I would say that we could view them as system code, not in the sense that they are unsafe (although they could be), but because they require OS assistance. Now, functions of this sort should be at most weakly pure (and this is what __mutable functions essentially did) even though their signatures look as if they were strongly pure. One solution would be to consider @system/@trusted pure functions as weakly pure no matter what their signature looks like. Constructors/destructors should also be considered at most weakly pure. If your function is @safe and pure, then the signature should be analyzed to decide if it is strongly or weakly pure. Also, any function that you are calling from another language should be at most @trusted. This way, @trusted would act as an optimization blocker.
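For readers less familiar with the distinction, here is a small sketch in current D (bump and addTwo are made-up names). A weakly pure function may mutate data reachable through its parameters, while a strongly pure one has no mutable indirections in its signature and can still be built out of weakly pure helpers:

```d
// Weakly pure: it has no access to global mutable state, but it can
// mutate the caller's variable through the `ref` parameter.
void bump(ref int counter) pure nothrow @nogc @safe
{
    ++counter;
}

// Strongly pure: the parameter is passed by value and has no indirections,
// so the result depends only on `x`. Calling the weakly pure helper on a
// local variable does not change that.
int addTwo(int x) pure nothrow @nogc @safe
{
    int result = x;
    bump(result);
    bump(result);
    return result;
}

unittest
{
    assert(addTwo(5) == 7);

    int n = 0;
    bump(n);
    assert(n == 1);
}
```

In the language as it stands, this classification comes from the signature alone; the suggestion above would additionally make it depend on the @safe/@trusted/@system attribute.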

> More generally, there needs to be a story for how to support custom allocators for immutable memory.

April 08, 2022

Re: Rebooting the __metada/__mutable discussion

Posted by rikki cattermole
in reply to RazvanN

On 07/04/2022 8:51 PM, RazvanN wrote:
> I think that the problem here is that we are conflating a high-level concept (purity) with low level operations (allocation/deallocation). The latter category is essentially not pure because it modifies operating system data structures, so I would say that we could view them as system code, not in the sense that they are unsafe (although they could be), but because they require OS assistance.

While developing my own allocators, I've had to argue with myself over whether memory mapping should be viewed as pure or not.

Ultimately the argument I came up with is that we have already defined memory mapping in the form of `new` as being pure. I disagree that it should be defined as such, but it has been.

As such I have classified memory mapping as being a set of "deity" functions: things that modify the execution environment but not the logic being executed, and that can therefore be considered pure.

April 08, 2022

Re: Rebooting the __metada/__mutable discussion

Posted by Timon Gehr
in reply to RazvanN

On 07.04.22 10:51, RazvanN wrote:
> On Wednesday, 6 April 2022 at 18:03:25 UTC, Timon Gehr wrote:
>> On 06.04.22 11:41, RazvanN wrote:
>>> As such, I would propose that `__metadata` can only be accessed from inside
>>> the aggregate that defines it (`private` here means `Java private`) and methods
>>> that access __metadata directly also need to be private.
>>
>> Those rules are pretty arbitrary and don't buy much. However, accessing `__metadata` should be `@system`.
>>
> They buy the fact that __metadata fields cannot be accessed from outside
> of the object that implements the reference count.

Not really, e.g., just expose a pointer to it. I really don't see why we need special-case visibility rules just for this.

Note that reference counts are not the only use case. You can also do, e.g., lazy initialization.
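A sketch of what lazy initialization looks like in current D (Polygon and its fields are hypothetical names). The cache fields are ordinary mutable fields, so the accessor cannot be const; with something like `__metadata` on the cache fields, the same method could presumably be made callable on const and immutable instances:

```d
struct Polygon
{
    immutable(double)[] sides;

    // Cache fields: these are what one would want to mark __metadata.
    private double cachedPerimeter;
    private bool cached;
    private int computations; // counts how often the real work was done

    // Not const: it writes the cache on first use.
    double perimeter() pure nothrow @nogc @safe
    {
        if (!cached)
        {
            double total = 0;
            foreach (s; sides)
                total += s;
            cachedPerimeter = total;
            cached = true;
            ++computations;
        }
        return cachedPerimeter;
    }
}

unittest
{
    auto p = Polygon([3.0, 4.0, 5.0]);
    assert(p.perimeter() == 12.0);
    assert(p.perimeter() == 12.0); // served from the cache
    assert(p.computations == 1);

    // const Polygon c = p;
    // c.perimeter(); // does not compile today: perimeter() mutates the object
}
```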

> This offers the guarantee
> that a strongly pure function will not alter a __metadata field

Not really.

> and hence it
> can be the subject of any purity-based optimization

That's backwards. The purity-based optimizations determine what you are allowed to do to `__metadata` fields. Because you shouldn't do arbitrary stuff there it should be `@system`, as it's up to the programmer to uphold guarantees here, not the compiler. It's a low-level feature.

> (with the exception of functions that do deallocations).
> ...

Note that this means you have to mark functions that are used from destructors specially.

>>>
>>> What do you think? Am I missing anything?
>>
>> E.g., how to deallocate memory in a pure destructor? (Solved by `__mutable` functions in my original draft.)
>>
>> What if the destructor is both pure and immutable?
>>
> 
> Destructors suffer from the same issue as postblits did with regards to qualifiers. I would be surprised if you could ever call an immutable destructor.

Well, what I meant is what happens if you destruct an immutable object. There need to be some language rules that allow you to have memory that is only temporarily immutable, otherwise your reference counting scheme will never work for immutable memory anyway.

> Combined with the fact that you cannot overload the destructor makes everything worse. So I would argue that talking about immutable pure destructors in this context is like talking about a broken glass when you're in a house on fire.
> 
> However, I do get your point and in order to support immutable allocators we do need to take into consideration how allocation and deallocation is done.
> 
> I think that the problem here is that we are conflating a high-level concept (purity) with low level operations (allocation/deallocation). 

It's not really a conflation, it's that you implement the high-level concept in terms of low-level operations. This is not problematic, this is how it always works.

> The latter category is essentially not pure because it modifies operating system data structures, so I would say that we could view them as system code, not in the sense that they are unsafe (although they could be), but because they require OS assistance.

They are unsafe because you need them to behave in a certain way for it to be possible to consider them `pure`.

> Now, this sort of functions should be at most weakly pure (and this is what mutable functions essentially did) even though their signature looks as if they were strongly pure.

I don't think this has much to do with weakly vs strongly pure. It's a different category.

> One solution would be to consider @system/@trusted pure functions as weakly pure no matter what their signature looks like. Constructors/destructors should also be considered at most weakly pure. If your function is @safe and pure then the signature should be analyzed to decide if it is strongly or weakly pure. Also, any function that you are calling from another language should be at most trusted. This way, @trusted would act as an optimization blocker.
> ...

There is no good reason whatsoever why @trusted or @system should block optimizations. Orthogonal language design, please. :(

April 08, 2022

Re: Rebooting the __metada/__mutable discussion

Posted by Timon Gehr
in reply to rikki cattermole

On 07.04.22 19:40, rikki cattermole wrote:
> 
> On 07/04/2022 8:51 PM, RazvanN wrote:
>> I think that the problem here is that we are conflating a high-level concept (purity) with low level operations (allocation/deallocation). The latter category is essentially not pure because it modifies operating system data structures, so I would say that we could view them as system code, not in the sense that they are unsafe (although they could be), but because they require OS assistance.
> 
> I've had to argue with myself on if memory mapping should be viewed as pure or not. During development of my own allocators.
> 
> Ultimately the argument I came up with is that we have already defined memory mapping in the form of new as being pure. I disagree that it should be defined as such, but it has been.
> ...

That's just the wrong level of abstraction. Clearly creating new values should be pure, you can create arbitrary tree structures in basically all purely functional programming languages. It's usually the very core of how such code operates... The justification for `pure` in D is to be able to somewhat compete with such languages.

Whether allocation/memory mapping itself should be `pure` is debatable. It certainly can't be `pure` if allocation failure is not fatal or it returns uninitialized memory that can be accessed.

However, `new` (memory mapping + initialization + construction) is not setting any precedent for that, it's more high level.
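A small sketch of what this looks like in current D (Node, build and countNodes are made-up names): a pure function can allocate an arbitrary tree with `new`, and because it takes no mutable indirections, its result is unique and converts implicitly to immutable.

```d
final class Node
{
    int value;
    Node left, right;

    this(int value, Node left, Node right) pure nothrow @nogc @safe
    {
        this.value = value;
        this.left = left;
        this.right = right;
    }
}

// Pure even though it allocates: `new` here means allocation plus
// initialization plus construction, and no allocator state is observable.
Node build(int depth) pure nothrow @safe
{
    if (depth == 0)
        return null;
    return new Node(depth, build(depth - 1), build(depth - 1));
}

size_t countNodes(const Node node) pure nothrow @nogc @safe
{
    return node is null ? 0 : 1 + countNodes(node.left) + countNodes(node.right);
}

unittest
{
    // build takes only an int, so the returned tree cannot alias anything
    // else and may be treated as immutable.
    immutable Node tree = build(3);
    assert(countNodes(tree) == 7);
}
```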

> As such I have classified memory mapping as being a set of "deity" functions. Things that modify the execution environment but not the logic being executed therefore can be considered pure.

There is no issue with allocation plus construction as long as none of the allocator state can leak into the observable behavior. (E.g., peeking into the bits of pointers should be impure.) Deallocation needs to be treated specially in any case though.

April 08, 2022

Re: Rebooting the __metada/__mutable discussion

Posted by Zach Tollen
in reply to RazvanN

On Wednesday, 6 April 2022 at 09:41:52 UTC, RazvanN wrote:

> What do you think? Am I missing anything? If you think this could fly, I could update the DIP and submit it.

It strikes me that this DIP is eerily similar to the @system variables described in DIP1035, which is in the final phase of the review process. It seems perfectly possible to merge the characteristics of __metadata as described here with the characteristics of @system variables in that DIP.

The effect would be that basically, you can do anything with @system variables — except access them in @safe code, which is precisely the point. You can violate the type system. You have total freedom. And that's that.

I suppose the downside is that one might not desire to have all that freedom. But I'm trying to think of any situation where you would want a @system variable to be immutable. The whole point of both @system and __metadata in the two DIPs is that @system/__metadata data is mutable. (With DIP1035, mutability is essential to what makes a given piece of data dangerous enough to require being marked @system. And the whole point of __metadata is to force mutability, bypassing the type system.)

-- Zach
