Slice expressions - exact evaluation order, dollar - D Programming Language Discussion Forum

June 17, 2016

Posted by kinke

The following snippet is interesting:

<<<
__gshared int step = 0;
__gshared int[] globalArray;

ref int[] getBase()
{
    assert(step == 0);
    ++step;
    return globalArray;
}

int getLowerBound(size_t dollar)
{
    assert(step == 1);
    ++step;
    assert(dollar == 0);
    globalArray = [ 666 ];
    return 1;
}

int getUpperBound(size_t dollar)
{
    assert(step == 2);
    ++step;
    assert(dollar == 1);
    globalArray = [ 1, 2, 3 ];
    return 3;
}

// LDC issue #1433
void main()
{
    auto r = getBase()[getLowerBound($) .. getUpperBound($)];
    assert(r == [ 2, 3 ]);
}
>>>

Firstly, it fails with DMD 2.071 because $ in the upper bound expression is 0, i.e., it doesn't reflect the updated length (1) after the lower bound expression has been evaluated. With LDC, it does.
Secondly, DMD 2.071 throws a RangeError, most likely because it uses the initial length for the bounds check too.

Most interesting IMO, though, is the question of when the slicee's pointer is to be loaded. This is only relevant if the base is an lvalue and may therefore be modified while the bound expressions are evaluated. Should the returned slice be based on the slicee's buffer before or after evaluating the bound expressions?
This was triggered by https://github.com/ldc-developers/ldc/issues/1433, as LDC loads the pointer before evaluating the bounds.

June 25, 2016

Re: Slice expressions - exact evaluation order, dollar

Posted by kinke in reply to kinke

Ping. Let's clearly define these hairy evaluation order details and add corresponding tests; that'd be another advantage over C++.

June 26, 2016

Re: Slice expressions - exact evaluation order, dollar

Posted by Timon Gehr in reply to kinke

On 17.06.2016 21:59, kinke wrote:
>
> Most interesting IMO though is the question when the slicee's pointer is
> to be loaded. This is only relevant if the base is an lvalue and may
> therefore be modified when evaluating the bound expressions. Should the
> returned slice be based on the slicee's buffer before or after
> evaluating the bounds expressions?
> This has been triggered by
> https://github.com/ldc-developers/ldc/issues/1433 as LDC loads the
> pointer before evaluating the bounds.

Evaluation order should be strictly left-to-right. DMD and GDC get it wrong here.

June 26, 2016

Re: Slice expressions - exact evaluation order, dollar

Posted by Iain Buclaw in reply to Timon Gehr

On 26 June 2016 at 03:30, Timon Gehr via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> On 17.06.2016 21:59, kinke wrote:
>>
>>
>> Most interesting IMO though is the question when the slicee's pointer is
>> to be loaded. This is only relevant if the base is an lvalue and may
>> therefore be modified when evaluating the bound expressions. Should the
>> returned slice be based on the slicee's buffer before or after
>> evaluating the bounds expressions?
>> This has been triggered by
>> https://github.com/ldc-developers/ldc/issues/1433 as LDC loads the
>> pointer before evaluating the bounds.
>
>
> Evaluation order should be strictly left-to-right. DMD and GDC get it wrong here.
>

It is evaluated left-to-right. getBase() -> getLowerBound() -> getUpperBound().

June 26, 2016

Re: Slice expressions - exact evaluation order, dollar

Posted by Iain Buclaw

On 26 June 2016 at 09:36, Iain Buclaw <ibuclaw@gdcproject.org> wrote:

> On 26 June 2016 at 03:30, Timon Gehr via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> > On 17.06.2016 21:59, kinke wrote:
> >>
> >>
> >> Most interesting IMO though is the question when the slicee's pointer is
> >> to be loaded. This is only relevant if the base is an lvalue and may
> >> therefore be modified when evaluating the bound expressions. Should the
> >> returned slice be based on the slicee's buffer before or after
> >> evaluating the bounds expressions?
> >> This has been triggered by
> >> https://github.com/ldc-developers/ldc/issues/1433 as LDC loads the
> >> pointer before evaluating the bounds.
> >
> >
> > Evaluation order should be strictly left-to-right. DMD and GDC get it wrong here.
> >
>
> It is evaluated left-to-right. getBase() -> getLowerBound() ->
> getUpperBound().
>

Ah, I see what you mean. I think you may be using an old GDC version. Previously, I cached the result of getBase().

Old codegen:

_base = *(getBase());
_lwr = getLowerBound(_base.length);
_upr = getUpperBound(_base.length);
r = {.length=(_upr - _lwr), .ptr=_base.ptr + _lwr * 4};

---
Now when creating temporaries of references, the reference is stabilized instead.

New codegen:

*(_ptr = getBase());
_lwr = getLowerBound(_ptr.length);
_upr = getUpperBound(_ptr.length);
r = {.length=(_upr - _lwr), .ptr=_ptr.ptr + _lwr * 4};
---

I suggest you fix LDC if it doesn't already do this. :-)
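
For readers who want to try the difference between the two lowerings without reading compiler output, here is a minimal hand-written D sketch. It is not GDC's actual code generation; the names `data`, `seen`, `lower`, `upper`, `sliceFromSnapshot` and `sliceFromReference` are made up for illustration, and returning `null` merely stands in for the RangeError. The first function copies the slice before evaluating the bounds (like the "old codegen" above); the second keeps only the reference and re-reads it at every use (like the "new codegen").

```d
import std.stdio : writeln;

int[] data;     // the slicee; rebound by the bound functions below
size_t[] seen;  // records every `$` value handed to a bound function

size_t lower(size_t dollar) { seen ~= dollar; data = [666];     return 1; }
size_t upper(size_t dollar) { seen ~= dollar; data = [1, 2, 3]; return 3; }

// Like the "old codegen": copy ptr and length once, before the bounds.
int[] sliceFromSnapshot(ref int[] base)
{
    int[] snap = base;
    const lo = lower(snap.length);
    const hi = upper(snap.length);
    if (lo > hi || hi > snap.length)
        return null;                 // stand-in for the RangeError
    return snap.ptr[lo .. hi];
}

// Like the "new codegen": keep the reference and re-read it at every use.
int[] sliceFromReference(ref int[] base)
{
    const lo = lower(base.length);
    const hi = upper(base.length);
    if (lo > hi || hi > base.length)
        return null;
    return base.ptr[lo .. hi];
}

void reset() { data = null; seen = null; }

void main()
{
    reset();
    auto a = sliceFromSnapshot(data);
    writeln(seen, " ", a);   // prints "[0, 0] []"

    reset();
    auto b = sliceFromReference(data);
    writeln(seen, " ", b);   // prints "[0, 1] [2, 3]"
}
```

With the snapshot, both bound functions see `$ == 0` and the bounds check fails; with the reference, the upper bound sees `$ == 1` and the result is `[2, 3]`, which is what the test case at the top of the thread asserts.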

June 26, 2016

Re: Slice expressions - exact evaluation order, dollar

Posted by kinke in reply to Iain Buclaw

On Sunday, 26 June 2016 at 08:08:58 UTC, Iain Buclaw wrote:
> Now when creating temporaries of references, the reference is stabilized instead.
>
> New codegen:
>
> *(_ptr = getBase());
> _lwr = getLowerBound(_ptr.length);
> _upr = getUpperBound(_ptr.length);
> r = {.length=(_upr - _lwr), .ptr=_ptr.ptr + _lwr * 4};
> ---
>
> I suggest you fix LDC if it doesn't already do this. :-)

Thx for the replies - so my testcase works for GDC already? Since what GDC is doing is what I came up with independently for LDC (PR #1566), I'd say DMD needs to follow suit.

June 26, 2016

Re: Slice expressions - exact evaluation order, dollar

Posted by Timon Gehr in reply to Iain Buclaw

On 26.06.2016 10:08, Iain Buclaw via Digitalmars-d wrote:
>
>>> Evaluation order should be strictly left-to-right. DMD and GDC get it wrong here.
>>
>> It is evaluated left-to-right. getBase() -> getLowerBound() -> getUpperBound().
>
>
> Ah, I see what you mean.  I think you may be using an old GDC version.
> Before I used to cache the result of getBase().
>
> Old codegen:
>
> _base = *(getBase());
> _lwr = getLowerBound(_base.length);
> _upr = getUpperBound(_base.length);
> r = {.length=(_upr - _lwr), .ptr=_base.ptr + _lwr * 4};
>
> ---

This seems to be what I'd expect. It's also what CTFE does.
CTFE and run time behaviour should be identical. (So either one of them needs to be fixed.)


> Now when creating temporaries of references, the reference is stabilized
> instead.
>
> New codegen:
>
> *(_ptr = getBase());
> _lwr = getLowerBound(_ptr.length);
> _upr = getUpperBound(_ptr.length);
> r = {.length=(_upr - _lwr), .ptr=_ptr.ptr + _lwr * 4};
> ---
>
> I suggest you fix LDC if it doesn't already do this. :-)


I'm not convinced this is a good idea. It makes (()=>base)()[lwr()..upr()] behave differently from base[lwr()..upr()].

June 26, 2016

Re: Slice expressions - exact evaluation order, dollar

Posted by Iain Buclaw in reply to Timon Gehr

On 26 June 2016 at 14:33, Timon Gehr via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> On 26.06.2016 10:08, Iain Buclaw via Digitalmars-d wrote:
>>
>> Old codegen:
>>
>> _base = *(getBase());
>> _lwr = getLowerBound(_base.length);
>> _upr = getUpperBound(_base.length);
>> r = {.length=(_upr - _lwr), .ptr=_base.ptr + _lwr * 4};
>>
>> ---
>
>
> This seems to be what I'd expect. It's also what CTFE does.
> CTFE and run time behaviour should be identical. (So either one of them
> needs to be fixed.)
>
>

Very likely CTFE.  Anyway, this isn't the only thing where CTFE and Runtime do things differently.

>> Now when creating temporaries of references, the reference is stabilized instead.
>>
>> New codegen:
>>
>> *(_ptr = getBase());
>> _lwr = getLowerBound(_ptr.length);
>> _upr = getUpperBound(_ptr.length);
>> r = {.length=(_upr - _lwr), .ptr=_ptr.ptr + _lwr * 4};
>> ---
>>
>> I suggest you fix LDC if it doesn't already do this. :-)
>
>
>
> I'm not convinced this is a good idea. It makes (()=>base)()[lwr()..upr()]
> behave differently from base[lwr()..upr()].

No, sorry, I'm afraid you are wrong there. They should both behave exactly the same.

I may need to step aside and explain what changed in GDC, as it had nothing to do with this LDC bug.

==> Step

What made this subtle change was in relation to fixing bugs 42 and 228 in GDC, which involved turning on the TREE_ADDRESSABLE(type) bit in our codegen trees, which in turn makes NRVO work consistently regardless of optimization flags used - no more optimizer being confused by us "faking it".

How is the above jargon related? Well, one of the problems faced was that it must be ensured that lvalues continue being lvalues when considering creating a temporary in the codegen pass.  Lvalue references must have the reference stabilized, not the value that is being dereferenced.  This also came with an added assurance that GDC will now *never* create a temporary of a decl with a cpctor or dtor, else it'll die with an internal compiler error trying. :-)

<== Step

(() => base)[lwr()..up()] will make a temporary of (() => base), but
guarantees that references are stabilized first.

base[lwr()..upr()] will create no temporary if base has no side
effects.  And so if lwr() modifies base, then upr() will get the
updated copy.

June 27, 2016

Re: Slice expressions - exact evaluation order, dollar

Posted by Timon Gehr in reply to Iain Buclaw

On 26.06.2016 20:08, Iain Buclaw via Digitalmars-d wrote:
> On 26 June 2016 at 14:33, Timon Gehr via Digitalmars-d
> <digitalmars-d@puremagic.com> wrote:
>> On 26.06.2016 10:08, Iain Buclaw via Digitalmars-d wrote:
>>>
>>> Old codegen:
>>>
>>> _base = *(getBase());
>>> _lwr = getLowerBound(_base.length);
>>> _upr = getUpperBound(_base.length);
>>> r = {.length=(_upr - _lwr), .ptr=_base.ptr + _lwr * 4};
>>>
>>> ---
>>
>>
>> This seems to be what I'd expect. It's also what CTFE does.
>> CTFE and run time behaviour should be identical. (So either one of them
>> needs to be fixed.)
>>
>>
>
> Very likely CTFE.  Anyway, this isn't the only thing where CTFE and
> Runtime do things differently.
> ...

All arbitrary differences should be eradicated.

>>> Now when creating temporaries of references, the reference is stabilized
>>> instead.
>>>
>>> New codegen:
>>>
>>> *(_ptr = getBase());
>>> _lwr = getLowerBound(_ptr.length);
>>> _upr = getUpperBound(_ptr.length);
>>> r = {.length=(_upr - _lwr), .ptr=_ptr.ptr + _lwr * 4};
>>> ---
>>>
>>> I suggest you fix LDC if it doesn't already do this. :-)
>>
>>
>>
>> I'm not convinced this is a good idea. It makes (()=>base)()[lwr()..upr()]
>> behave differently from base[lwr()..upr()].
>
> No, sorry, I'm afraid you are wrong there. They should both behave
> exactly the same.
> ...

I don't see how that is possible, unless I misunderstood your previous explanation. As far as I understand, for the first expression, code gen will generate a reference to a temporary copy of base, and for the second expression, it will generate a reference to base directly. If lwr() or upr() then update the ptr and/or the length of base, those changes will be seen for the second slice expression, but not for the first.


> I may need to step aside and explain what changed in GDC, as it had
> nothing to do with this LDC bug.
>
> ==> Step
>
> What made this subtle change was in relation to fixing bug 42 and 228
> in GDC, which involved turning on TREE_ADDRESSABLE(type) bit in our
> codegen trees, which in turn makes NRVO work consistently regardless
> of optimization flags used - no more optimizer being confused by us
> "faking it".
>
> How is the above jargon related? Well, one of the problems faced was
> that it must be ensured that lvalues continue being lvalues when
> considering creating a temporary in the codegen pass.  Lvalue
> references must have the reference stabilized, not the value that is
> being dereferenced.  This also came with an added assurance that GDC
> will now *never* create a temporary of a decl with a cpctor or dtor,
> else it'll die with an internal compiler error trying. :-)
> ...

What is the justification why the base should be evaluated as an lvalue?

> <== Step
>
> (() => base)[lwr()..up()] will make a temporary of (() => base), but
> guarantees that references are stabilized first.
>

(I assume you meant (() => base)()[lwr()..upr()].)

The lambda returns by value, so you will stabilize the reference to a temporary copy of base? (Unless I misunderstand your terminology.)

> base[lwr()..upr()] will create no temporary if base has no side
> effects.  And so if lwr() modifies base, then upr() will get the
> updated copy.
>

Yes, it is clear that upr() should see modifications to memory that lwr() makes. The point is that the slice expression itself does or does not see the updates based on whether I wrap base in a lambda or not.
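
The by-value point can be checked in isolation, independently of how any compiler lowers a slice expression. The following is a small sketch; `baseByRef` is a hypothetical helper added only to contrast with the lambda, and the reassignment of `base` stands in for whatever a side-effecting `lwr()` or `upr()` might do.

```d
int[] base;

// Hypothetical helper: hands out the variable itself, not a copy.
ref int[] baseByRef() { return base; }

void main()
{
    base = [1, 2, 3];

    int[] viaLambda = (() => base)();   // lambda returns by value: a copy of ptr + length
    int[]* viaRef = &baseByRef();       // address of the variable `base`

    base = [7, 8, 9, 10];               // rebind base after both were obtained

    assert(viaLambda == [1, 2, 3]);     // the copy still describes the old buffer
    assert(*viaRef == [7, 8, 9, 10]);   // the reference sees the rebinding
    assert(viaRef is &base);
}
```

So a slice built from the lambda's result cannot observe a later rebinding of `base`, whereas one built through a stabilized reference can. That is the asymmetry being questioned here.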

July 12, 2016

Re: Slice expressions - exact evaluation order, dollar

Posted by Iain Buclaw in reply to Timon Gehr

On 27 June 2016 at 04:38, Timon Gehr via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> On 26.06.2016 20:08, Iain Buclaw via Digitalmars-d wrote:
>>
>> On 26 June 2016 at 14:33, Timon Gehr via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>>>
>>> On 26.06.2016 10:08, Iain Buclaw via Digitalmars-d wrote:
>>>>
>>>>
>>>> Old codegen:
>>>>
>>>> _base = *(getBase());
>>>> _lwr = getLowerBound(_base.length);
>>>> _upr = getUpperBound(_base.length);
>>>> r = {.length=(_upr - _lwr), .ptr=_base.ptr + _lwr * 4};
>>>>
>>>> ---
>>>
>>>
>>>
>>> This seems to be what I'd expect. It's also what CTFE does.
>>> CTFE and run time behaviour should be identical. (So either one of them
>>> needs to be fixed.)
>>>
>>>
>>
>> Very likely CTFE.  Anyway, this isn't the only thing where CTFE and
>> Runtime do things differently.
>> ...
>
>
> All arbitrary differences should be eradicated.
>
>>>> Now when creating temporaries of references, the reference is stabilized instead.
>>>>
>>>> New codegen:
>>>>
>>>> *(_ptr = getBase());
>>>> _lwr = getLowerBound(_ptr.length);
>>>> _upr = getUpperBound(_ptr.length);
>>>> r = {.length=(_upr - _lwr), .ptr=_ptr.ptr + _lwr * 4};
>>>> ---
>>>>
>>>> I suggest you fix LDC if it doesn't already do this. :-)
>>>
>>>
>>>
>>>
>>> I'm not convinced this is a good idea. It makes
>>> (()=>base)()[lwr()..upr()]
>>> behave differently from base[lwr()..upr()].
>>
>>
>> No, sorry, I'm afraid you are wrong there. They should both behave
>> exactly the same.
>> ...
>
>
> I don't see how that is possible, unless I misunderstood your previous explanation. As far as I understand, for the first expression, code gen will generate a reference to a temporary copy of base, and for the second expression, it will generate a reference to base directly. If lwr() or upr() then update the ptr and/or the length of base, those changes will be seen for the second slice expression, but not for the first.
>
>
>> I may need to step aside and explain what changed in GDC, as it had nothing to do with this LDC bug.
>>
>> ==> Step
>>
>> What made this subtle change was in relation to fixing bug 42 and 228 in GDC, which involved turning on TREE_ADDRESSABLE(type) bit in our codegen trees, which in turn makes NRVO work consistently regardless of optimization flags used - no more optimizer being confused by us "faking it".
>>
>> How is the above jargon related? Well, one of the problems faced was
>> that it must be ensured that lvalues continue being lvalues when
>> considering creating a temporary in the codegen pass.  Lvalue
>> references must have the reference stabilized, not the value that is
>> being dereferenced.  This also came with an added assurance that GDC
>> will now *never* create a temporary of a decl with a cpctor or dtor,
>> else it'll die with an internal compiler error trying. :-)
>> ...
>
>
> What is the justification why the base should be evaluated as an lvalue?
>

Because changes made to a temporary get lost as they never bind back to the original reference.

Regardless, creating a temporary of a struct with a cpctor violates the semantics of the type - it's the job of the frontend to generate all the code for lifetime management for us.

(Sorry for the belated response, I have been distracted).
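
Both arguments above can be illustrated at the source level with a hedged sketch. It assumes that "cpctor" refers to a struct postblit (`this(this)`); the `Tracked` struct, its `copies` counter, and the helper functions are invented for the example and say nothing about GDC internals.

```d
struct Tracked
{
    int value;
    static int copies;          // counts postblit calls
    this(this) { ++copies; }    // postblit: runs whenever a Tracked is copied
}

Tracked original;

ref Tracked getOriginal() { return original; }

// Value stabilized: the update lands on a temporary copy and is lost.
void bumpViaCopy()
{
    Tracked tmp = getOriginal();   // invokes the postblit
    tmp.value += 1;
}

// Reference stabilized: the update reaches the original, no copy is made.
void bumpViaRef()
{
    Tracked* p = &getOriginal();
    p.value += 1;
}

void main()
{
    bumpViaCopy();
    assert(original.value == 0 && Tracked.copies == 1);

    bumpViaRef();
    assert(original.value == 1 && Tracked.copies == 1);
}
```

The copy path silently drops the update and runs user code (the postblit) that the programmer never wrote a call for, which is the kind of extra copy a backend would not want to introduce on its own.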


Copyright © 1999-2021 by the D Language Foundation