Dlist and dip1000 challenge - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » Dlist and dip1000 challenge

Thread overview

Dlist and dip1000 challenge
Oct 23, 2018 Steven Schveighoffer
Oct 23, 2018 Nicholas Wilson
Oct 23, 2018 Steven Schveighoffer
Oct 23, 2018 Stanislav Blinov
Oct 23, 2018 Steven Schveighoffer
Oct 23, 2018 Stanislav Blinov
Oct 23, 2018 Steven Schveighoffer
Oct 23, 2018 Stanislav Blinov
Oct 23, 2018 Walter Bright
Oct 23, 2018 rikki cattermole
Oct 23, 2018 Nicholas Wilson
Oct 23, 2018 Brad Roberts
Oct 23, 2018 Nicholas Wilson
Oct 24, 2018 Walter Bright
Oct 24, 2018 Nicholas Wilson
Oct 24, 2018 Walter Bright
Oct 25, 2018 Nicholas Wilson
Oct 24, 2018 Neia Neutuladh
Oct 24, 2018 Walter Bright
Oct 24, 2018 12345swordy
Oct 24, 2018 welkam
Oct 25, 2018 Nicholas Wilson
Oct 24, 2018 Neia Neutuladh
Oct 25, 2018 Nicholas Wilson
Oct 25, 2018 Neia Neutuladh
Oct 25, 2018 12345swordy
Oct 25, 2018 sclytrack
Oct 25, 2018 Nicholas Wilson
Oct 25, 2018 Nicholas Wilson
Oct 25, 2018 Bastiaan Veelo
Oct 25, 2018 Nicholas Wilson
Oct 24, 2018 Walter Bright
Oct 24, 2018 Nicholas Wilson
Oct 25, 2018 Steven Schveighoffer
Oct 25, 2018 Stanislav Blinov
Oct 25, 2018 Walter Bright
Oct 25, 2018 Steven Schveighoffer
Oct 26, 2018 Walter Bright
Oct 26, 2018 Steven Schveighoffer
Oct 26, 2018 jmh530
Oct 26, 2018 Walter Bright

October 23, 2018

Dlist and dip1000 challenge

Posted by Steven Schveighoffer

Steven Schveighoffer

I've implemented a mergesort for both slist and dlist in phobos. I found it easier to start with slist, since the algorithm and operation is basically the same, dlist will simply have a post-sort operation that reconnects all the prev pointers.

I went to apply this to dlist, and found a disturbing thing -- all accesses to data in dlist are done via *casting*. The idea behind it is, we only store non-templated nodes inside a list, and those nodes only have prev and next pointers. Then you cast to the correct "PayNode" type, giving you the actual data. Things are allocated properly, so it works, but holy crap, what a mess. For those who are interested, here is the original PR that introduced this: https://github.com/dlang/phobos/pull/2457.

I decided, I'll make a PR to undo all that (the savings from doing this can't be worth it, the only thing that's non-templated is the range popFront and popBack), but I've now run into an issue with dip1000. Apparently, some of the casting confuses the compiler sufficiently enough that it can't track lifetimes, and it just rubber-stamps it (at least, that's my theory).

I've removed the "BaseNode" type, and just use "PayNode" everywhere. All of a sudden, things become un-@safe.

So I started slapping @safe tags on to see where the issue is. It starts with something I've known is an issue for a long time, and that is having a scoped array of strings (or other reference types). You get this with a type-safe variadic function.

The compiler treats all the array items as if they were scope, meaning you can't assign a string, either allocated from the heap or a literal, to things that aren't scope, which doesn't make any sense. Sure the array is scope, but not what the elements point at!

In any case, I created a mock-up of the parts that are not working, can anyone find either an issue we can file for dip1000 or a way to solve this properly? I found I can mark stuff trusted, but I *really* don't like that. It's not much worse than casting everything, however.

Here is the mockup code:
https://run.dlang.io/is/6xDFnr

I've marked main @safe, and the problematic function @safe (if you don't mark the problematic function @safe, it just complains that main can't do anything).

Note that in reality everything here is perfectly safe, I'm not escaping anything.

---------

Begin rant

So, here is one other thing I want to say. This took me HOURS to find, and narrow down. Not because I don't understand the concepts behind dip1000, but because the compiler has fully inserted so many hidden scopes, I don't know what the actual code it's compiling is. One big problem I think with dip1000 is simply that it's nearly impossible to understand where the issues are. Like I said at the end of the post above, the result of allowing compiler inference of dip1000 is that your whole program is simply marked unsafe, and you have absolutely no idea where it is. You can't even guess, because scope just shows up where you never typed it. Given that you NEED this functionality on templates, it's going to result, IMO, in people either not using dip1000, or giving up and adding @trusted: to the top of their file. This is going to be horrible if we can't find a way to either require scope instead of inferring it in some cases, or create a way to diagnose where the blasted problem actually is. Maybe something to say "I expected this call to be @safe, why isn't it".

End rant.

-Steve

October 23, 2018

Re: Dlist and dip1000 challenge

Posted by Nicholas Wilson
in reply to Steven Schveighoffer

Nicholas Wilson

Posted in reply to Steven Schveighoffer

On Tuesday, 23 October 2018 at 15:10:24 UTC, Steven Schveighoffer wrote:
> I've implemented a mergesort for both slist and dlist in phobos. I found it easier to start with slist, since the algorithm and operation is basically the same, dlist will simply have a post-sort operation that reconnects all the prev pointers.
>
> I went to apply this to dlist, and found a disturbing thing -- all accesses to data in dlist are done via *casting*. The idea behind it is, we only store non-templated nodes inside a list, and those nodes only have prev and next pointers. Then you cast to the correct "PayNode" type, giving you the actual data. Things are allocated properly, so it works, but holy crap, what a mess. For those who are interested, here is the original PR that introduced this: https://github.com/dlang/phobos/pull/2457.
>
> I decided, I'll make a PR to undo all that (the savings from doing this can't be worth it, the only thing that's non-templated is the range popFront and popBack), but I've now run into an issue with dip1000. Apparently, some of the casting confuses the compiler sufficiently enough that it can't track lifetimes, and it just rubber-stamps it (at least, that's my theory).
>
> I've removed the "BaseNode" type, and just use "PayNode" everywhere. All of a sudden, things become un-@safe.
>
> So I started slapping @safe tags on to see where the issue is. It starts with something I've known is an issue for a long time, and that is having a scoped array of strings (or other reference types). You get this with a type-safe variadic function.
>
> The compiler treats all the array items as if they were scope, meaning you can't assign a string, either allocated from the heap or a literal, to things that aren't scope, which doesn't make any sense. Sure the array is scope, but not what the elements point at!
>
> In any case, I created a mock-up of the parts that are not working, can anyone find either an issue we can file for dip1000 or a way to solve this properly? I found I can mark stuff trusted, but I *really* don't like that. It's not much worse than casting everything, however.
>
> Here is the mockup code:
> https://run.dlang.io/is/6xDFnr
>
> I've marked main @safe, and the problematic function @safe (if you don't mark the problematic function @safe, it just complains that main can't do anything).
>
> Note that in reality everything here is perfectly safe, I'm not escaping anything.
>

My hunch (having looked at the source) would be that this is because a "value" to the compiler starts out not scope and must be proved to be scope. In

tail.next = allocate(stuff.front, tail);

there is an inductive step the compiler would need to make to infer tail as scope and that goes against the requirement to prove the scopeness of tail (with @safe the goal is to not have false positives, false negatives, like this, are annoying but fine). Please do add a bugziliqa for this even if it proves to be intractable to solve. We can at least document what the compiler can't infer.

> So, here is one other thing I want to say. This took me HOURS to find, and narrow down. Not because I don't understand the concepts behind dip1000, but because the compiler has fully inserted so many hidden scopes, I don't know what the actual code it's compiling is. One big problem I think with dip1000 is simply that it's nearly impossible to understand where the issues are. Like I said at the end of the post above, the result of allowing compiler inference of dip1000 is that your whole program is simply marked unsafe, and you have absolutely no idea where it is. You can't even guess, because scope just shows up where you never typed it. Given that you NEED this functionality on templates, it's going to result, IMO, in people either not using dip1000, or giving up and adding @trusted: to the top of their file. This is going to be horrible if we can't find a way to either require scope instead of inferring it in some cases, or create a way to diagnose where the blasted problem actually is. Maybe something to say "I expected this call to be @safe, why isn't it".
>

I totally agree.

Plug for https://github.com/dlang/dlang.org/pull/2453
We really need to better document this stuff.

October 23, 2018

Re: Dlist and dip1000 challenge

Posted by Steven Schveighoffer
in reply to Nicholas Wilson

Steven Schveighoffer

Posted in reply to Nicholas Wilson

On 10/23/18 1:18 PM, Nicholas Wilson wrote:
> My hunch (having looked at the source) would be that this is because a "value" to the compiler starts out not scope and must be proved to be scope. In
> 
> tail.next = allocate(stuff.front, tail);
> 
> there is an inductive step the compiler would need to make to infer tail as scope and that goes against the requirement to prove the scopeness of tail (with @safe the goal is to not have false positives, false negatives, like this, are annoying but fine). Please do add a bugziliqa for this even if it proves to be intractable to solve. We can at least document what the compiler can't infer.

Except that there should be no pointers to scope data here -- the only thing that is scope really is the string[] pointer. When you extract a string from that, it shouldn't be scope.

If it is being inferred scope, how does one have a scope array of non-scope pointers?

> I totally agree.
> 
> Plug for https://github.com/dlang/dlang.org/pull/2453
> We really need to better document this stuff.

Well, documentation is good, but if there is a bug or design flaw for dip1000 that's what we need to focus on.

Bottom line: I should be able to do what is obviously, trivially @safe with dip1000. If I can't do that, then there is something wrong with it.

-Steve

October 23, 2018

Re: Dlist and dip1000 challenge

Posted by Stanislav Blinov
in reply to Steven Schveighoffer

Stanislav Blinov

Posted in reply to Steven Schveighoffer

On Tuesday, 23 October 2018 at 17:48:23 UTC, Steven Schveighoffer wrote:

> Except that there should be no pointers to scope data here -- the only thing that is scope really is the string[] pointer. When you extract a string from that, it shouldn't be scope.
>
> If it is being inferred scope, how does one have a scope array of non-scope pointers?
>

I *think* there's a bug with the implementation of DIP1000:

https://github.com/dlang/DIPs/blob/master/DIPs/DIP1000.md#aggregates

- Lifetimes of built-in dynamically-sized slices T[] are analyzed as structs with two fields, one of type T* and the other of type size_t.

- For struct members of aggregate type, decomposition may continue transitively.

So, if "decomposition may continue transitively", it looks as if there's no tail-scope for arrays, but then shouldn't it follow that for

void foo(scope string[] args...);

it should transitively slap `scope` into the immutable(char)* of each argument?

However,

string[] escape;

// compiles, i.e. the behavior you essentially want
void foo(scope string[] args...) {
    foreach (s; args) escape ~= s;
}

// doesn't compile, i.e. the behavior you essentially get
void bar(scope string arg) {
    escape ~= s;
}

...and yet, in your code example the compiler behaves as if each string was indeed `scope`.

October 23, 2018

Re: Dlist and dip1000 challenge

Posted by Steven Schveighoffer
in reply to Stanislav Blinov

Steven Schveighoffer

Posted in reply to Stanislav Blinov

On 10/23/18 2:43 PM, Stanislav Blinov wrote:
> On Tuesday, 23 October 2018 at 17:48:23 UTC, Steven Schveighoffer wrote:
> 
>> Except that there should be no pointers to scope data here -- the only thing that is scope really is the string[] pointer. When you extract a string from that, it shouldn't be scope.
>>
>> If it is being inferred scope, how does one have a scope array of non-scope pointers?
>>
> 
> I *think* there's a bug with the implementation of DIP1000:
> 
> https://github.com/dlang/DIPs/blob/master/DIPs/DIP1000.md#aggregates
> 
> - Lifetimes of built-in dynamically-sized slices T[] are analyzed as structs with two fields, one of type T* and the other of type size_t.
> 
> - For struct members of aggregate type, decomposition may continue transitively.
> 
> So, if "decomposition may continue transitively", it looks as if there's no tail-scope for arrays, but then shouldn't it follow that for
> 
> void foo(scope string[] args...);
> 
> it should transitively slap `scope` into the immutable(char)* of each argument?

This is where I think there is going to be a problem. scope shouldn't be transitive, you can easily have a pointer allocated on the stack point to heap data. What is the type that results from taking the address of the pointer? It *should* be a pointer to a scoped pointer that points to NON-SCOPE data.

It may be an issue of expressability.

> 
> However,
> 
> string[] escape;
> 
> // compiles, i.e. the behavior you essentially want
> void foo(scope string[] args...) {
>      foreach (s; args) escape ~= s;
> }
> 
> // doesn't compile, i.e. the behavior you essentially get
> void bar(scope string arg) {
>      escape ~= s;
> }
> 
> ....and yet, in your code example the compiler behaves as if each string was indeed `scope`.

Yes, I believe the compiler inferring scope can do some things that syntax cannot. I think that might be where the issue is, but it's really hard to figure out.

-Steve

October 23, 2018

Re: Dlist and dip1000 challenge

Posted by Stanislav Blinov
in reply to Steven Schveighoffer

Stanislav Blinov

Posted in reply to Steven Schveighoffer

On Tuesday, 23 October 2018 at 19:03:25 UTC, Steven Schveighoffer wrote:

>> So, if "decomposition may continue transitively", it looks as if there's no tail-scope for arrays, but then shouldn't it follow that for
>> 
>> void foo(scope string[] args...);
>> 
>> it should transitively slap `scope` into the immutable(char)* of each argument?
>
> This is where I think there is going to be a problem. scope shouldn't be transitive, you can easily have a pointer allocated on the stack point to heap data.

Yeah, but OTOH, so can you have an array of pointers to the stack :\

> It may be an issue of expressability.

void tailScope(T scope [] x); // yuck...

>> 
>> ....and yet, in your code example the compiler behaves as if each string was indeed `scope`.
>
> Yes, I believe the compiler inferring scope can do some things that syntax cannot. I think that might be where the issue is, but it's really hard to figure out.

This is weird. 'foreach' compiles, manual iteration doesn't. Change the 'manualIter' enum below to compare.

/* Simplified versions of range primitives, for easy reference.
   Note, these don't handle decoding narrow strings!
*/

@property bool empty(T)(T[] a) @safe pure nothrow @nogc {
    return a.length == 0;
}

void popFront(T)(ref T[] a) @safe pure nothrow @nogc {
    assert(!a.empty);
    a = a[1 .. $];
}

/* version in std.range doesn't have a 'return' argument, but that doesn't seem to have an effect. */
@property ref T front(T)(return T[] a) @safe pure nothrow @nogc {
    assert(a.length);
    return a[0];
}

struct Array(T) {
    T[] data;

    void append(Stuff)(Stuff stuff) {
        appendImpl(stuff);
    }

    void appendImpl(T[] x...) {
        appendImpl2(x);
    }

    void appendImpl2(Stuff)(ref scope Stuff stuff) {

        // Change this to true to get different behavior
        enum manualIter = false;

        static if (manualIter) {
            while (!stuff.empty) {
                data ~= stuff.front;
                stuff.popFront;
            }
        } else {
            foreach (ref e; stuff) data ~= e;
        }
    }
}

void main() {
    Array!string array;
    array.append("hello");
}

October 23, 2018

Re: Dlist and dip1000 challenge

Posted by Steven Schveighoffer
in reply to Stanislav Blinov

Steven Schveighoffer

Posted in reply to Stanislav Blinov

On 10/23/18 4:10 PM, Stanislav Blinov wrote:
> On Tuesday, 23 October 2018 at 19:03:25 UTC, Steven Schveighoffer wrote:
> 
>>> So, if "decomposition may continue transitively", it looks as if there's no tail-scope for arrays, but then shouldn't it follow that for
>>>
>>> void foo(scope string[] args...);
>>>
>>> it should transitively slap `scope` into the immutable(char)* of each argument?
>>
>> This is where I think there is going to be a problem. scope shouldn't be transitive, you can easily have a pointer allocated on the stack point to heap data.
> 
> Yeah, but OTOH, so can you have an array of pointers to the stack :\

This is where dip1000 actually makes sense -- the lifetime of that array should not outlive the stack. So, with dip1000 you *shouldn't* be able to have a heap array of pointers to the stack. Unless you circumvent via some specially written type.

-Steve

October 23, 2018

Re: Dlist and dip1000 challenge

Posted by Stanislav Blinov
in reply to Steven Schveighoffer

Stanislav Blinov

Posted in reply to Steven Schveighoffer

On Tuesday, 23 October 2018 at 20:46:26 UTC, Steven Schveighoffer wrote:

>>> This is where I think there is going to be a problem. scope shouldn't be transitive, you can easily have a pointer allocated on the stack point to heap data.
>> 
>> Yeah, but OTOH, so can you have an array of pointers to the stack :\
>
> This is where dip1000 actually makes sense -- the lifetime of that array should not outlive the stack. So, with dip1000 you *shouldn't* be able to have a heap array of pointers to the stack.

Indeed, but that's for heap arrays. This will (and should) work:

void sink(scope immutable(char)*[] arr) @safe {}
void test() @safe { // even though putting @nogc here will still result in 'may cause GC allocation'...
    immutable char c;
    sink([&c]); // ...this is actually fine
}

Now for the meat. It looks like it just won't let you @safe-ly store away `scope` strings in that List:

string[] hatch;

void escape(string s) @safe { hatch ~= s; }

// templated to force attribute inference
void testInferred()(scope string[] stuff) {
    static string[] hatch;

    while (!stuff.empty) {

        // this blatantly and silently strips @safe from `testInferred`
        // without compile errors
        escape(stuff.front);

        // however, this will result in compilation error:
        //hatch ~= stuff.front;

        stuff.popFront;
    }
}

// fails to compile outright
void testSafe(scope string[] stuff) @safe {

    while (!stuff.empty) {

        // in @safe scope can't escape like this at all
        escape(stuff.front);

        stuff.popFront;
    }
}

void main() {
    testInferred(["hello"]); // compiles

    // fails to compile, testInferred is @system!
    () @safe {
        testInferred(["hello"]);
    } ();
}

...so transitivity strikes yet again, albeit from the shadows this time.

October 23, 2018

Re: Dlist and dip1000 challenge

Posted by Walter Bright
in reply to Steven Schveighoffer

Walter Bright

Posted in reply to Steven Schveighoffer

On 10/23/2018 8:10 AM, Steven Schveighoffer wrote:
> So, here is one other thing I want to say. This took me HOURS to find, and narrow down. Not because I don't understand the concepts behind dip1000, but because the compiler has fully inserted so many hidden scopes, I don't know what the actual code it's compiling is. One big problem I think with dip1000 is simply that it's nearly impossible to understand where the issues are. Like I said at the end of the post above, the result of allowing compiler inference of dip1000 is that your whole program is simply marked unsafe, and you have absolutely no idea where it is. You can't even guess, because scope just shows up where you never typed it. Given that you NEED this functionality on templates, it's going to result, IMO, in people either not using dip1000, or giving up and adding @trusted: to the top of their file. This is going to be horrible if we can't find a way to either require scope instead of inferring it in some cases, or create a way to diagnose where the blasted problem actually is. Maybe something to say "I expected this call to be @safe, why isn't it".

My improvements to DIP1000 are completely dead in the water due to lack of interest. It's impossible to make Phobos DIP1000 compatible if nobody is willing to approve the improvements.

https://github.com/dlang/dmd/pull/8504

October 24, 2018

Re: Dlist and dip1000 challenge

Posted by rikki cattermole
in reply to Walter Bright

rikki cattermole

Posted in reply to Walter Bright

On 24/10/2018 11:10 AM, Walter Bright wrote:
> On 10/23/2018 8:10 AM, Steven Schveighoffer wrote:
>> So, here is one other thing I want to say. This took me HOURS to find, and narrow down. Not because I don't understand the concepts behind dip1000, but because the compiler has fully inserted so many hidden scopes, I don't know what the actual code it's compiling is. One big problem I think with dip1000 is simply that it's nearly impossible to understand where the issues are. Like I said at the end of the post above, the result of allowing compiler inference of dip1000 is that your whole program is simply marked unsafe, and you have absolutely no idea where it is. You can't even guess, because scope just shows up where you never typed it. Given that you NEED this functionality on templates, it's going to result, IMO, in people either not using dip1000, or giving up and adding @trusted: to the top of their file. This is going to be horrible if we can't find a way to either require scope instead of inferring it in some cases, or create a way to diagnose where the blasted problem actually is. Maybe something to say "I expected this call to be @safe, why isn't it".
> 
> My improvements to DIP1000 are completely dead in the water due to lack of interest. It's impossible to make Phobos DIP1000 compatible if nobody is willing to approve the improvements.
> 
> https://github.com/dlang/dmd/pull/8504

Did the spec get the update that was requested over DIP1000?

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation