July 12, 2016
On Tue, Jul 12, 2016 at 01:33:00AM -0400, Andrei Alexandrescu via Digitalmars-d wrote:
> On 07/12/2016 01:15 AM, Shachar Shemesh wrote:
[...]
> > Casting away immutability is UB in D.
> 
> I understand. There is an essential detail that sadly puts an anticlimactic end to the telenovela. The unsafe cast happens at allocator level.
[...]

What's an "unsafe cast"?  I think we're mixing up terminology here, which is not helping this discussion.

Is casting away immutable merely *unsafe*, or is it UB?

Because if it's UB (as understood by the rest of the world), then your statement essentially amounts to saying that allocators are UB. Which in turn means that optimizing compilers are free to assume that allocators are impossible (since they are UB and the compiler is therefore free to do whatever it wants there, such as assume that it cannot ever happen), and, in all likelihood, output garbage in the executable as a result.

If you don't mean UB in this sense of the term, then you (well, the D language spec) need to define what exactly is supposed to happen when immutable is cast away. Exactly when is such a cast UB, and when is it *not* UB?  (I'm assuming that casting away immutable in the general case is UB, e.g., if the compiler puts such memory in ROM. But since this isn't always the case, e.g., you're allocating a block of mutable memory from RAM but designating it as immutable for the purposes of the type system, then the spec needs to specify exactly when such casts will not result in UB, to allow room for allocators to be implementable. If all the spec says is a blanket statement that such casts are UB, then by definition of UB all such allocator code is invalid, and an optimizing compiler is free to "optimize" it away (with disastrous results).)

Or perhaps what you *really* mean is that casting away immutable is *implementation-defined*, not UB. The two are not the same thing. (But even then, you may still run into trouble with implementations that define a behaviour that doesn't match what, e.g., the allocator code assumes. These things need to be explicitly stated in the spec so that implementors won't do something outside of what you intended -- whether deliberate or as a misunderstanding of what the intent of the spec was.)


T

-- 
There is no gravity. The earth sucks.
July 12, 2016
On 07/12/2016 10:26 AM, H. S. Teoh via Digitalmars-d wrote:
> On Tue, Jul 12, 2016 at 01:33:00AM -0400, Andrei Alexandrescu via Digitalmars-d wrote:
>> On 07/12/2016 01:15 AM, Shachar Shemesh wrote:
> [...]
>>> Casting away immutability is UB in D.
>>
>> I understand. There is an essential detail that sadly puts an
>> anticlimactic end to the telenovela. The unsafe cast happens at
>> allocator level.
> [...]
>
> What's an "unsafe cast"?  I think we're mixing up terminology here,
> which is not helping this discussion.
>
> Is casting away immutable merely *unsafe*, or is it UB?

The subtlety here is that immutable is not being cast away. All data that is typed as immutable stays immutable. -- Andrei
July 12, 2016
On 07/12/2016 10:40 AM, Andrei Alexandrescu wrote:
> On 07/12/2016 10:26 AM, H. S. Teoh via Digitalmars-d wrote:
>> On Tue, Jul 12, 2016 at 01:33:00AM -0400, Andrei Alexandrescu via
>> Digitalmars-d wrote:
>>> On 07/12/2016 01:15 AM, Shachar Shemesh wrote:
>> [...]
>>>> Casting away immutability is UB in D.
>>>
>>> I understand. There is an essential detail that sadly puts an
>>> anticlimactic end to the telenovela. The unsafe cast happens at
>>> allocator level.
>> [...]
>>
>> What's an "unsafe cast"?  I think we're mixing up terminology here,
>> which is not helping this discussion.
>>
>> Is casting away immutable merely *unsafe*, or is it UB?
>
> The subtlety here is that immutable is not being cast away. All data
> that is typed as immutable stays immutable. -- Andrei

To clarify, this is the code in question:

https://github.com/dlang/phobos/blob/master/std/experimental/allocator/building_blocks/affix_allocator.d#L210

The parameter is an array of a generic type T (which may be immutable). The immutability of data is not compromised; the pointer to the beginning of the array is only used as a pivot to access data that sits before the array. That data has never been typed as immutable, so the code is typed correctly (assuming the data had been allocated with this allocator), even though the compiler cannot prove it.

The language definition must allow this to work. But this is not a matter of changing the definition of immutable. It's a simple matter of data layout (e.g. if you add a positive number to a pointer and later you subtract it, you get the same pointer etc). We already have that down, even if the spec language could be better.
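To make the layout argument concrete, here is a minimal sketch. It is not the Phobos implementation: `makeWithPrefix` and `prefixOf` are hypothetical names, and a single `uint` prefix in front of an `int` payload is an assumption chosen to keep alignment trivial. Only the payload is ever typed as immutable; the prefix is reached by stepping back from the payload pointer.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical helpers for illustration only.
// Layout of one block: [ uint prefix | n ints of payload ]
immutable(int)[] makeWithPrefix(size_t n)
{
    void* block = malloc(uint.sizeof + n * int.sizeof);
    assert(block !is null);

    // The prefix is plain mutable memory and is never typed as immutable.
    *cast(uint*) block = 0;

    // The payload starts right after the prefix.
    int* payload = cast(int*) (cast(uint*) block + 1);
    foreach (i; 0 .. n)
        payload[i] = cast(int) i;

    // Only the payload is handed out as immutable.
    return cast(immutable(int)[]) payload[0 .. n];
}

// Use the payload pointer as a pivot to reach the prefix in front of it.
// The cast changes the pointer type, but the memory it ends up addressing
// is the prefix, not the immutable payload.
ref uint prefixOf(immutable(int)[] a)
{
    return (cast(uint*) a.ptr)[-1];
}
```

Whether the language spec actually blesses the cast inside `prefixOf` is exactly what is being debated here; the sketch only shows that the bytes written through it were never part of the immutable data. The block would be released by passing `cast(uint*) a.ptr - 1` to `free`.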


Andrei

July 12, 2016
On 12/07/16 17:26, Andrei Alexandrescu wrote:

> Thanks. I must have misunderstood - I was looking for something that's
> not @safe.
>

No, I was referring to his statement that features in D tend to create complexity and unexpected/non-intuitive behavior when combined. Sorry if I was not clear.

>
> AffixAllocator is not casting away immutability - that's the beauty of
> it. But I'm all for making the language more precise to allow the kind
> of work AffixAllocator does portably. Would love some help from you there!

This is a bit academic, but I don't understand how you can get an immutable/const pointer to memory, and then get a pointer to a mutable uint out of it without casting away the constness. Yes, you are subtracting the pointer so it points to outside the original memory, but technically speaking (which is all the compiler really sees, at this point), you are casting "immutable MyClass*" to "uint*". It is only semantically that you know this is fine.

Saw your reference to the code in a different comment. I believe (and I might be wrong) the cast in question to be here:
https://github.com/dlang/phobos/blob/master/std/experimental/allocator/building_blocks/affix_allocator.d#L213

The input might be CI. The output is mutable.

If such a cast is UB, then the compiler is free to say "this is nonsense, I'm not going to do it". If we use C++'s UB definition, the compiler can say "I hereby assume that the buffer is actually mutable". This is, potentially, completely different code generation.

>>
>> The C++ definition is quite solid. Casting away constness is UB IFF the
>> buffer was originally const.
>
> Yeah, we might relax that in D as well, albeit for different reasons.

I would love to hear what D's reasons are, and in what way they differ from C++'s. To clarify, the previous sentence was meant to be read with no cynicism intended.

>
>> In this case, your allocator does two UBs. One when allocating (casting
>> a mutable byte range to immutable reference), and another when
>> deallocating. Both are defined as undefined by D, which means the
>> compiler is free to wreak havoc in both without you having the right to
>> complain.
>>
>> Which leads me to the conclusion that you cannot write an allocator in
>> D. I doubt that's a conclusion you'd stand behind.
>
> Again, your help with improving the language definition would be very
> welcome. Obviously we do want to have AffixAllocator and other
> allocators work properly.

I was thinking about intrusive reference counting, which is the classic case I'd use "mutable" in C++. In that case, the value we're mutating actually is a member of the struct that was passed as const (I think immutable isn't an issue in that use case).

I think a great first step is to relax the UB around casting away CI modifiers where we know, semantically, that the underlying memory is actually mutable.

Shachar
July 12, 2016
On 07/12/2016 10:53 AM, Shachar Shemesh wrote:
> On 12/07/16 17:26, Andrei Alexandrescu wrote:
>
>> Thanks. I must have misunderstood - I was looking for something that's
>> not @safe.
>>
>
> No, I was referring to his statement that features in D tend to create
> complexity and unexpected/non-intuitive behavior when combined. Sorry if
> I was not clear.

I agree we have a few of those, though I don't think your example is particularly egregious.

>> AffixAllocator is not casting away immutability - that's the beauty of
>> it. But I'm all for making the language more precise to allow the kind
>> of work AffixAllocator does portably. Would love some help from you
>> there!
>
> Saw your reference to the code in a different comment. I believe (and I
> might be wrong) the cast in question to be here:
> https://github.com/dlang/phobos/blob/master/std/experimental/allocator/building_blocks/affix_allocator.d#L213

That's the code.

> The input might be CI. The output is mutable.

No. It's important to understand that the "input" and "output" are distinct because of the [-1]. That gets into memory that was never typed as immutable.

This is a layout matter. It says that if you happen to know some immutable data sits 8 bytes to the right of some mutable data, doing the appropriate pointer arithmetic on the immutable data takes you correctly to the mutable data. I'm all for adding language to the spec to clarify that.

> If such cast is UB, then the compiler is free to say "this is nonsense,
> I'm not going to do it".

It's not UB.

> If we use the C++'s UB definition, the compiler
> can say "I hereby assume that the buffer is actually mutable". This is,
> potentially, completely different code generation.

I understand.

> I was thinking about intrusive reference counting, which is the classic
> case I'd use "mutable" in C++. In that case, the value we're mutating
> actually is a member of the struct that was passed as const (I think
> immutable isn't an issue in that use case).

You'll need to use AffixAllocator to do intrusive RC.

> I think a great first step is to relax the UB around casting away CI
> modifiers where we know, semantically, that the underlying memory is
> actually mutable.

That would not be the right thing to do.


Andrei


July 12, 2016
On 7/12/16 10:40 AM, Andrei Alexandrescu wrote:
> On 07/12/2016 10:26 AM, H. S. Teoh via Digitalmars-d wrote:
>> On Tue, Jul 12, 2016 at 01:33:00AM -0400, Andrei Alexandrescu via
>> Digitalmars-d wrote:
>>> On 07/12/2016 01:15 AM, Shachar Shemesh wrote:
>> [...]
>>>> Casting away immutability is UB in D.
>>>
>>> I understand. There is an essential detail that sadly puts an
>>> anticlimactic end to the telenovela. The unsafe cast happens at
>>> allocator level.
>> [...]
>>
>> What's an "unsafe cast"?  I think we're mixing up terminology here,
>> which is not helping this discussion.
>>
>> Is casting away immutable merely *unsafe*, or is it UB?
>
> The subtlety here is that immutable is not being cast away. All data
> that is typed as immutable stays immutable. -- Andrei

A related question: are we planning on making such access pure (or even allowing the compiler to infer purity)? If so, we may have issues...

-Steve
July 12, 2016
On 07/12/2016 11:07 AM, Steven Schveighoffer wrote:
> A related question: are we planning on making such access pure (or even
> allowing compiler to infer purity)? If so, we may have issues...

Was that the link you posted? What's a summary of the issues and what do you think would be a proper way to address them? Thanks! -- Andrei
July 12, 2016
On 7/12/16 12:01 PM, Andrei Alexandrescu wrote:
> On 07/12/2016 11:07 AM, Steven Schveighoffer wrote:
>> A related question: are we planning on making such access pure (or even
>> allowing compiler to infer purity)? If so, we may have issues...
>
> Was that the link you posted? What's a summary of the issues and what do
> you think would be a proper way to address them? Thanks! -- Andrei

No, the link I posted was a poor proposal by a (much?) younger me to do a similar thing to the affix allocator (but on the language level). Apparently, I didn't have enough cred back then :)

The issue I'm referring to is the compiler eliding calls.

For example, let's say you have a reference counted type:

struct RC(T)
{
   T* value;
}

And T is immutable. So far so good.

Now, we do this:

struct RC(T)
{
   alias MyAllocator = ...; // some form of AffixAllocator
   void incRef() { MyAllocator.prefix(value)++; }
}

Seems innocuous enough. However, suppose the compiler interprets incRef to be pure, and notices: "hey, all the parameters to this function are immutable, and it returns void! I don't have to call this, win-win!"

Then we have a problem. I raised similar points when C free was made pure (but to no avail).

It's not necessarily an unfixable problem, but we may need some language help to guarantee these aren't elided.
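A rough, self-contained sketch of the shape of the problem follows. `Block` and this free-function `incRef` are hypothetical stand-ins, not the RC type above or any Phobos API. The function is marked `pure`, takes only immutable data and returns `void`, yet it changes a counter stored just in front of its argument. Whether any particular compiler really drops such a call is an assumption I have not verified.

```d
// Hypothetical layout: a mutable counter directly followed by the payload.
struct Block
{
    uint count;   // never typed as immutable
    int  payload; // handed out as immutable
}

// Marked pure, with only immutable parameters and a void return.
// From the signature alone, nothing observable seems to happen here.
void incRef(immutable(int)* payload) pure nothrow @nogc
{
    // Hidden effect: bump the counter that sits just before the payload.
    ++(cast(uint*) payload)[-1];
}

// Convenience wrapper showing a call site the optimizer might reason about.
void share(ref Block b)
{
    auto p = cast(immutable(int)*) &b.payload;
    incRef(p); // an optimizer trusting the signature could consider this call dead
}
```

If calls like the one in `share` may be elided, the count silently goes wrong, which is why some language-level guarantee seems needed.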

-Steve
July 12, 2016
On 7/12/16 12:12 PM, Steven Schveighoffer wrote:
> On 7/12/16 12:01 PM, Andrei Alexandrescu wrote:
>> On 07/12/2016 11:07 AM, Steven Schveighoffer wrote:
>>> A related question: are we planning on making such access pure (or even
>>> allowing compiler to infer purity)? If so, we may have issues...
>>
>> Was that the link you posted? What's a summary of the issues and what do
>> you think would be a proper way to address them? Thanks! -- Andrei
>
> No, the link I posted was a poor proposal by a (much?) younger me to do
> a similar thing to the affix allocator (but on the language level).
> Apparently, I didn't have enough cred back then :)

In all likelihood it must have been me who didn't get the implications. -- Andrei

July 12, 2016
On Tuesday, 12 July 2016 at 14:17:30 UTC, Andrei Alexandrescu wrote:
> Indeed I'm not the sharpest tool in the shed, and since it's already been established I'm the idiot and you're the wise man (congratulations - surely enough the great work to substantiate that is very soon to follow) in the proverb, I hope you'll allow me one more pedestrian question.
>

Proverbs are not meant to be interpreted literally. If I thought you were an actual idiot, I wouldn't waste my time arguing with you.

> So I've been looking through this thread for the five examples of what you're talking about (which to my mind is "@safe is just a convention") and the closest I could find is your post on http://forum.dlang.org/post/iysrtqzytdnrxsqtfwvk@forum.dlang.org.
>
> So there you discuss the inconsistency of "alias" which as far as I understand has nothing to do with safety. Then we have:
>
> enum E { A = 1, B = 2 }
> E bazinga = A | B;
> final switch (bazinga) { case A: ... case B: ... } // Enjoy !
>
> which I pasted with minor changes here: https://dpaste.dzfl.pl/b4f84374c3ae. I'm unclear how that interacts with @safe. It could, if the language would allow executing unsafe code after the switch. But it doesn't. Could you please clarify? And could you please point to the other examples?
>
>
> Thanks,
>
> Andrei

My point has nothing to do with safety, and this is why the various examples have nothing to do with safety. Safety was one example. The enum/final switch thing was another. The alias thing was one more. These are issues where each individual decision, in isolation, is actually very reasonable, but where the decisions fail as a whole because they are either mutually exclusive (worst case scenario) or simply introduce needless complexity (best case scenario), both of which are undesirable.

The thread was about complexity in the language. My point is that the current way things are done introduces a lot of accidental complexity, which is overall undesirable. This negatively impacts various aspects of the language, including, but not limited to, @safe.

The problem I'm pointing at is that problems are considered in isolation, with disregard for the big picture. Ironically, this is exactly what is happening here, by debating every example to death rather than addressing the point.

Maybe I'm expressing myself badly, but I have discussed this with many other D community members and I seem to be able to reach them, so it must not be THAT bad.