Thread overview
Lowerings to strongly pure _d_arrayctor trigger warnings or risk being incorrectly removed
Oct 12, 2021
Teodor Dutu
Oct 12, 2021
Paul Backus
Oct 12, 2021
Teodor Dutu
Oct 12, 2021
Paul Backus
Oct 12, 2021
Teodor Dutu
Oct 12, 2021
Paul Backus
Oct 12, 2021
Johan
Oct 14, 2021
RazvanN
October 12, 2021

Hi,

I've been working on this PR for a while now and after seeing it fail some tests in phobos (for example, this one), my mentors for SAoC 2021, Razvan Nitu and Eduard Staniloiu, and I figured out that, when lowered using const or immutable arguments, _d_arrayctor becomes a strongly pure function. For instance, in the code snippet below

struct S {};
const S[2] b;
const S[2] a = b;

the line const S[2] a = b is lowered to _d_arrayctor(a, b), which is the intended behaviour.

However, since, in this case, _d_arrayctor is strongly pure and since its return value is ignored, the compiler issues the warning in the failed test above. In addition, its strong purity might also cause calls to _d_arrayctor to be removed by the compiler as part of the optimisation phase.
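To make the warning concrete, here is a minimal sketch rather than the real hook. `elemCount` is a hypothetical stand-in whose parameters are all immutable, which makes it strongly pure; the exact wording of the diagnostic is an assumption and may differ between compiler versions.

```d
// Hypothetical stand-in for a lowering hook; NOT druntime's _d_arrayctor.
// All parameters are immutable and the function is pure and nothrow,
// so the compiler may treat a call as having no side effects.
size_t elemCount(immutable(int)[] to, immutable(int)[] from) pure nothrow @nogc @safe
{
    return from.length;
}

unittest
{
    immutable int[2] b = [1, 2];
    immutable int[2] a = b;

    // elemCount(a[], b[]);          // discarded result: dmd -w is expected to complain
    //                               // that the call has no side effects
    cast(void) elemCount(a[], b[]);  // an explicit discard is accepted
    assert(elemCount(a[], b[]) == 2);
}
```

Compiling with `dmd -main -unittest` runs the block; uncommenting the bare call is the quickest way to see the diagnostic the thread is about.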

In order to avoid such unwanted events, the only solution we could come up with was to force _d_arrayctor to be weakly pure instead. We achieved this by adding a third pointer-type parameter, as implemented in this PR. But this is merely a stop-gap solution, because it acts against the language, by denying one of its properties: purity.
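A sketch of that stop-gap, under the assumption that one parameter with a mutable indirection is enough to downgrade a function to weakly pure. `elemCountWeak` and its `unused` parameter are hypothetical names for illustration and not the signature from the PR.

```d
// Hypothetical: same body as a strongly pure helper, plus one mutable
// pointer parameter. The pointer is never read or written; it exists
// only so the function counts as weakly pure.
size_t elemCountWeak(immutable(int)[] to, immutable(int)[] from,
                     char* unused = null) pure nothrow @nogc @safe
{
    return from.length;
}

unittest
{
    immutable int[2] b = [1, 2];
    immutable int[2] a = b;

    elemCountWeak(a[], b[]);   // result discarded; no cast(void) needed here
    assert(elemCountWeak(a[], b[]) == 2);
}
```

The default argument keeps call sites unchanged, which is why the workaround is cheap, but the function signature now understates how pure the operation really is.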

We also tried changing the type of either the to or the from parameter to a mutable void[], but in this case the compiler was unable to instantiate the function's template correctly. So this solution didn't work.

Have you faced this issue before? What were your solutions?

Thanks,
Teodor

October 12, 2021

On Tuesday, 12 October 2021 at 16:27:51 UTC, Teodor Dutu wrote:

> Hi,
>
> I've been working on this PR for a while now and after seeing it fail some tests in phobos (for example, this one), my mentors for SAoC 2021, Razvan Nitu and Eduard Staniloiu, and I figured out that, when lowered using const or immutable arguments, _d_arrayctor becomes a strongly pure function. For instance, in the code snippet below
>
> struct S {};
> const S[2] b;
> const S[2] a = b;
>
> the line const S[2] a = b is lowered to _d_arrayctor(a, b), which is the intended behaviour.

I think the fundamental problem is that what _d_arrayctor is doing here is undefined behavior, according to the language spec:

> Note that casting away a const qualifier and then mutating is undefined behavior, too, even when the referenced data is mutable.

Source: https://dlang.org/spec/const3.html#removing_with_cast

The spec makes a special exception for struct and class constructors, which allows them to write to non-mutable memory once without invoking UB, but there is no corresponding exception for variables that are not part of a struct or class.
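A small sketch of the distinction being drawn here. `Celsius` is a hypothetical type; the commented-out lines show the cast-and-mutate pattern that the quoted spec passage calls undefined behavior, so they are deliberately not executed.

```d
// Hypothetical example type, used only to illustrate the constructor exception.
struct Celsius
{
    immutable int degrees;

    this(int d)
    {
        degrees = d;   // allowed: a constructor may initialize an immutable field once
    }
}

unittest
{
    auto c = Celsius(21);
    assert(c.degrees == 21);

    immutable int fixed = 5;
    // int* p = cast(int*) &fixed;   // compiles, but casts away immutable
    // *p = 6;                       // mutating through p is UB per the spec
    assert(fixed == 5);
}
```

A free-standing variable such as an array being initialized by library code has no equivalent of the constructor path, which is the gap this post points at.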

In order to make progress on this, it will be necessary to change both the language spec and the compiler to allow initialization of non-mutable memory by library code. These changes will, presumably, render the optimization you are currently fighting against invalid.

October 12, 2021

On Tuesday, 12 October 2021 at 16:27:51 UTC, Teodor Dutu wrote:

> Hi,
>
> I've been working on this PR for a while now and after seeing it fail some tests in phobos (for example, this one), my mentors for SAoC 2021, Razvan Nitu and Eduard Staniloiu, and I figured out that, when lowered using const or immutable arguments, _d_arrayctor becomes a strongly pure function.

Why would you want to instantiate _d_arrayctor multiple times for different constness of its arguments?
You can choose the exact details of the lowering, so would your problem not be solved by simply lowering to calls without const/immutable parameter types?

-Johan

October 12, 2021

On Tuesday, 12 October 2021 at 16:46:00 UTC, Paul Backus wrote:

> On Tuesday, 12 October 2021 at 16:27:51 UTC, Teodor Dutu wrote:
>
>> struct S {};
>> const S[2] b;
>> const S[2] a = b;
>
> I think the fundamental problem is that what _d_arrayctor is doing here is undefined behavior, according to the language spec:
>
>> Note that casting away a const qualifier and then mutating is undefined behavior, too, even when the referenced data is mutable.
>
> Source: https://dlang.org/spec/const3.html#removing_with_cast
>
> The spec makes a special exception for struct and class constructors, which allows them to write to non-mutable memory once without invoking UB, but there is no corresponding exception for variables that are not part of a struct or class.

I have read the documentation at the links you left, but I can't see why the usage of _d_arrayctor would be undefined behaviour. This lowering only occurs when initialising static arrays or slices, whereas the removal of the immutable or const qualifiers that you mentioned is showcased like this:

immutable int* p = ...;
int* q = cast(int*)p;

And this has nothing to do with array initialisations, so I think I'm in the clear with using _d_arrayctor.

Do you think I'm missing anything?

October 12, 2021

On Tuesday, 12 October 2021 at 18:04:21 UTC, Teodor Dutu wrote:

> I have read the documentation at the links you left, but I can't see why the usage of _d_arrayctor would be undefined behaviour.

_d_arrayctor casts away immutable internally--either here, directly, or here indirectly via copyEmplace (which casts it away here). It then uses memcpy to mutate the memory that was originally typed as immutable. This is undefined behavior, according to the language spec.
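For readers who have not looked at the hook, the pattern described above has roughly this shape. This is a simplified, hypothetical helper (`rawCopy`), not druntime's code; the test only exercises it with a mutable destination so that nothing undefined actually runs.

```d
import core.stdc.string : memcpy;

// Hypothetical helper illustrating the pattern; NOT druntime's implementation.
void rawCopy(T)(ref T dst, ref const T src) @trusted nothrow @nogc
{
    // Casting to void* discards whatever qualifier T carries. If T were
    // immutable(int), this line would be the "cast away immutable" step,
    // and the memcpy would be the mutation the spec objects to.
    memcpy(cast(void*) &dst, &src, T.sizeof);
}

unittest
{
    int a = 0;
    int b = 42;
    rawCopy(a, b);   // mutable destination: well-defined
    assert(a == 42);
}
```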

> This lowering only occurs when initialising static arrays or slices

The spec does not currently make an exception for initializing static arrays or slices, so the fact that it only occurs in that context does not make any difference to whether the spec considers it UB.

October 12, 2021

On Tuesday, 12 October 2021 at 18:27:53 UTC, Paul Backus wrote:

> On Tuesday, 12 October 2021 at 18:04:21 UTC, Teodor Dutu wrote:
>
>> I have read the documentation at the links you left, but I can't see why the usage of _d_arrayctor would be undefined behaviour.
>
> _d_arrayctor casts away immutable internally--either here, directly, or here indirectly via copyEmplace (which casts it away here). It then uses memcpy to mutate the memory that was originally typed as immutable. This is undefined behavior, according to the language spec.
>
>> This lowering only occurs when initialising static arrays or slices
>
> The spec does not currently make an exception for initializing static arrays or slices, so the fact that it only occurs in that context does not make any difference to whether the spec considers it UB.

I see your point now. But this behaviour is not new. The current implementation uses this hook, which suffers from the same shortcomings. Furthermore, the hook uses TypeInfo, which makes it slower than what I'm trying to achieve with the templated approach.

And secondly, bear in mind that the lowering only occurs when the lhs is being constructed, which means that two things can happen:

I don't see how the second scenario can be avoided, with or without _d_arrayctor.

October 12, 2021

On Tuesday, 12 October 2021 at 19:31:34 UTC, Teodor Dutu wrote:

> On Tuesday, 12 October 2021 at 18:27:53 UTC, Paul Backus wrote:
>
>> On Tuesday, 12 October 2021 at 18:04:21 UTC, Teodor Dutu wrote:
>>
>>> I have read the documentation at the links you left, but I can't see why the usage of _d_arrayctor would be undefined behaviour.
>>
>> _d_arrayctor casts away immutable internally--either here, directly, or here indirectly via copyEmplace (which casts it away here). It then uses memcpy to mutate the memory that was originally typed as immutable. This is undefined behavior, according to the language spec.
>>
>>> This lowering only occurs when initialising static arrays or slices
>>
>> The spec does not currently make an exception for initializing static arrays or slices, so the fact that it only occurs in that context does not make any difference to whether the spec considers it UB.
>
> I see your point now. But this behaviour is not new. The current implementation uses this hook, which suffers from the same shortcomings. Furthermore, the hook uses TypeInfo, which makes it slower than what I'm trying to achieve with the templated approach.
>
> And secondly, bear in mind that the lowering only occurs when the lhs is being constructed, which means that two things can happen:
>
> I don't see how the second scenario can be avoided, with or without _d_arrayctor.

Yes, I understand that the compiler has to initialize the array somehow, and that in the general case this requires writing to memory typed as immutable.

My point is not that _d_arrayctor should not do its job, but that the current language spec makes it impossible to implement _d_arrayctor in D (as opposed to "in compiler magic") without invoking UB.

This means that if you want to implement _d_arrayctor in D, you will have to start by changing the language spec.

If you don't, any workaround you come up with is still going to involve UB, which means it will be at risk of falling apart as soon as it is exposed to a more powerful optimizer.

October 14, 2021

On Tuesday, 12 October 2021 at 16:46:10 UTC, Johan wrote:

> On Tuesday, 12 October 2021 at 16:27:51 UTC, Teodor Dutu wrote:
>
>> Hi,
>>
>> I've been working on this PR for a while now and after seeing it fail some tests in phobos (for example, this one), my mentors for SAoC 2021, Razvan Nitu and Eduard Staniloiu, and I figured out that, when lowered using const or immutable arguments, _d_arrayctor becomes a strongly pure function.
>
> Why would you want to instantiate _d_arrayctor multiple times for different constness of its arguments?
> You can choose the exact details of the lowering, so would your problem not be solved by simply lowering to calls without const/immutable parameter types?
>
> -Johan

Because you want to call the correct copy constructor. Consider:

struct S
{
     this(ref S rhs) immutable {}
}

void main()
{
   S[2] a;
   immutable S[2] b = a;
}
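A single-element sketch of the same point, assuming the usual copy constructor overloading by qualifier. `Tracked` and its `tag` field are hypothetical and exist only to make the chosen overload observable.

```d
// Hypothetical type: each copy constructor records which overload ran.
struct Tracked
{
    int tag;

    // mutable source, mutable destination
    this(ref return scope Tracked rhs) { tag = 1; }

    // mutable source, immutable destination
    this(ref return scope Tracked rhs) immutable { tag = 2; }
}

unittest
{
    Tracked a;
    Tracked m = a;             // selects the mutable overload
    immutable Tracked i = a;   // selects the immutable overload

    assert(m.tag == 1);
    assert(i.tag == 2);
}
```

If the array lowering stripped the qualifiers before calling the hook, every element would go through the first overload, which is not what the declaration of the destination asks for.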