March 11, 2019
On Monday, 11 March 2019 at 17:00:35 UTC, bitwise wrote:
> On Sunday, 10 March 2019 at 18:46:14 UTC, Atila Neves wrote:
>> On Sunday, 10 March 2019 at 17:36:09 UTC, bitwise wrote:
>>> On Sunday, 10 March 2019 at 16:10:10 UTC, Atila Neves wrote:
>>>> On Sunday, 10 March 2019 at 15:12:03 UTC, bitwise wrote:
>>>>> [...]
>>>>
>>>> Like `const`, `scope` on a member function means that the `this` reference is `scope`.
>>>
>>> Interesting. I'll have to look up some examples.
>>>
>>>>> [...]
>>>>
>>>> https://github.com/atilaneves/automem/blob/2a97acba94d6fe0bf9ba07fec99e86e46aa0f2a1/source/automem/vector.d#L134
>>>> https://github.com/atilaneves/automem/blob/2a97acba94d6fe0bf9ba07fec99e86e46aa0f2a1/source/automem/vector.d#L159
>>>> https://github.com/atilaneves/automem/blob/2a97acba94d6fe0bf9ba07fec99e86e46aa0f2a1/source/automem/vector.d#L145
>>>>
>>>> static assert(isInputRange!(Vector!int));
>>>>
>>>>> [...]
>>>>
>>>> I would think so, especially since I wrote Vector to be @safe with dip1000. Look at this unit test for not being able to escape the payload for example:
>>>>
>>>> https://github.com/atilaneves/automem/blob/master/tests/ut/vector.d#L229
>>>
>>> Ok, but this container copies the entire collection in postblit:
>>> https://github.com/atilaneves/automem/blob/2a97acba94d6fe0bf9ba07fec99e86e46aa0f2a1/source/automem/vector.d#L104
>>
>> Well, yes. As it says in the module ddoc, it's my version of C++'s std::vector. If you don't want it to copy, pass by ref.
>>
>>> So if you try to treat it like a range, you'll take a performance hit every time you trigger the postblit, which seems likely.
>>
>> You can pass it by ref like I mention above, or you could slice it and pass the slice instead. Without DIP1000 passing the slice wouldn't be @safe but with it, it is.
>>
>> Again, just like std::vector, which is pretty much always passed by const ref. But safer and with a default allocator.
>>
>>> Isn't it reasonable to assume a range should act like a reference to data, rather than a copy of it?
>>
>> It depends.
>
> This container seems good for short-lived usage.

Why?

> If I wanted to pass a bunch of filenames to a function for processing, I would consider this container Ok, because it would allocate once, then be consumed as a range without having to copy anything.

Or you don't consume it all and slice it (as I mentioned in my last post):

@safe unittest {
    scope v = vector(1, 2, 3);
    static void fun(R)(R range) {
        import std.array: array;
        assert(range.array == [1, 2, 3]);
    }
    fun(v[]);
    // wasn't consumed
    assert(v[] == [1, 2, 3]);
}

Again, it's @safe because the slice is scoped.

> For any kind of persistent storage though, I would consider the performance costs of using this container's range functionality too high.

There aren't any performance costs. You pass a fat pointer just like you would with a GC array.

> I do think the idea of making the container itself the range is interesting, but what ruins it for me is the fact that ranges are consumable.

Which is why Vector has both opSlice and copies by default. It's only a range itself if the element type is mutable, otherwise it doesn't even define popFront.

> So I'm wondering if it would work well to use an integer index for the range implementation

That's what slices are.


March 11, 2019
On Mon, Mar 11, 2019 at 06:29:23PM +0000, Atila Neves via Digitalmars-d wrote:
> On Monday, 11 March 2019 at 17:00:35 UTC, bitwise wrote:
[...]
> > I do think the idea of making the container itself the range is interesting, but what ruins it for me is the fact that ranges are consumable.
> 
> Which is why Vector has both opSlice and copies by default. It's only a range itself if the element type is mutable, otherwise it doesn't even define popFront.
[...]

Generally, it's a bad idea to conflate a container with a range over the container, precisely for this reason. In this respect, built-in arrays are a bad example, because we think of them as containers, yet they are at the same time ranges. (Though strictly speaking, "arrays" are just slices of the underlying containers that are managed by druntime and not really directly manipulatable. But nobody thinks of them in this way; the concept of "array" is much easier to grasp and reason with than "slice over druntime-managed container".)

The recommended approach is to have the container be distinct from a range over its elements, and provide such a range via a member function, commonly chosen to be opSlice.


T

-- 
Real men don't take backups. They put their source on a public FTP-server and let the world mirror it. -- Linus Torvalds
March 11, 2019
On Monday, 11 March 2019 at 18:29:23 UTC, Atila Neves wrote:
> On Monday, 11 March 2019 at 17:00:35 UTC, bitwise wrote:
>> On Sunday, 10 March 2019 at 18:46:14 UTC, Atila Neves wrote:
>>> On Sunday, 10 March 2019 at 17:36:09 UTC, bitwise wrote:
>>>> On Sunday, 10 March 2019 at 16:10:10 UTC, Atila Neves wrote:
>>>>> On Sunday, 10 March 2019 at 15:12:03 UTC, bitwise wrote:
>>>>>> [...]
>>>>>
>>>>> Like `const`, `scope` on a member function means that the `this` reference is `scope`.
>>>>
>>>> Interesting. I'll have to look up some examples.
>>>>
>>>>>> [...]
>>>>>
>>>>> https://github.com/atilaneves/automem/blob/2a97acba94d6fe0bf9ba07fec99e86e46aa0f2a1/source/automem/vector.d#L134
>>>>> https://github.com/atilaneves/automem/blob/2a97acba94d6fe0bf9ba07fec99e86e46aa0f2a1/source/automem/vector.d#L159
>>>>> https://github.com/atilaneves/automem/blob/2a97acba94d6fe0bf9ba07fec99e86e46aa0f2a1/source/automem/vector.d#L145
>>>>>
>>>>> static assert(isInputRange!(Vector!int));
>>>>>
>>>>>> [...]
>>>>>
>>>>> I would think so, especially since I wrote Vector to be @safe with dip1000. Look at this unit test for not being able to escape the payload for example:
>>>>>
>>>>> https://github.com/atilaneves/automem/blob/master/tests/ut/vector.d#L229
>>>>
>>>> Ok, but this container copies the entire collection in postblit:
>>>> https://github.com/atilaneves/automem/blob/2a97acba94d6fe0bf9ba07fec99e86e46aa0f2a1/source/automem/vector.d#L104
>>>
>>> Well, yes. As it says in the module ddoc, it's my version of C++'s std::vector. If you don't want it to copy, pass by ref.
>>>
>>>> So if you try to treat it like a range, you'll take a performance hit every time you trigger the postblit, which seems likely.
>>>
>>> You can pass it by ref like I mention above, or you could slice it and pass the slice instead. Without DIP1000 passing the slice wouldn't be @safe but with it, it is.
>>>
>>> Again, just like std::vector, which is pretty much always passed by const ref. But safer and with a default allocator.
>>>
>>>> Isn't it reasonable to assume a range should act like a reference to data, rather than a copy of it?
>>>
>>> It depends.
>>
>> This container seems good for short-lived usage.
>
> Why?
>
>> If I wanted to pass a bunch of filenames to a function for processing, I would consider this container Ok, because it would allocate once, then be consumed as a range without having to copy anything.
>
> Or you don't consume it all and slice it (as I mentioned in my last post):
>
> @safe unittest {
>     scope v = vector(1, 2, 3);
>     static void fun(R)(R range) {
>         import std.array: array;
>         assert(range.array == [1, 2, 3]);
>     }
>     fun(v[]);
>     // wasn't consumed
>     assert(v[] == [1, 2, 3]);
> }
>
> Again, it's @safe because the slice is scoped.
>
>> For any kind of persistent storage though, I would consider the performance costs of using this container's range functionality too high.
>
> There aren't any performance costs. You pass a fat pointer just like you would with a GC array.
>
>> I do think the idea of making the container itself the range is interesting, but what ruins it for me is the fact that ranges are consumable.
>
> Which is why Vector has both opSlice and copies by default. It's only a range itself if the element type is mutable, otherwise it doesn't even define popFront.
>
>> So I'm wondering if it would work well to use an integer index for the range implementation
>
> That's what slices are.

At first, I missed the use of scope here:
> auto opSlice(this This)() scope return

My reasoning was that using automem/vector for persistent storage required it to be iterated over, which couldn't be done without consuming the container, or copying it. I see now that you can iterate by returning a scoped slice of the internal data store, but this could be unsafe too. If you called reserve on the container, the returned slice could end up dangling if a non-gc allocator was used, even if the returned slice was scoped.

I think that if opSlice returned a scoped range with a pointer back to the original container, then it would be easy for the range object to detect the state of the container, even including if it had been moved from and left in it's initial state. As long as the range couldn't leave the stack-frame of the container, I think this would be totally safe.


March 11, 2019
On Monday, 11 March 2019 at 19:40:06 UTC, bitwise wrote:
> On Monday, 11 March 2019 at 18:29:23 UTC, Atila Neves wrote:
>> On Monday, 11 March 2019 at 17:00:35 UTC, bitwise wrote:
>>> On Sunday, 10 March 2019 at 18:46:14 UTC, Atila Neves wrote:
>>>> On Sunday, 10 March 2019 at 17:36:09 UTC, bitwise wrote:
>>>>> On Sunday, 10 March 2019 at 16:10:10 UTC, Atila Neves wrote:
>>>>>> On Sunday, 10 March 2019 at 15:12:03 UTC, bitwise wrote:
>>>>>>> [...]
>>>>>>
>
> <snip>

> I see now that you can iterate by returning a scoped slice of the internal data store, but this could be unsafe too. If you called reserve on the container, the returned slice could end up dangling if a non-gc allocator was used, even if the returned slice was scoped.

Unless there's a bug in my implementation or in dip1000, no, it can't be unsafe.

> I think that if opSlice returned a scoped range with a pointer back to the original container,

It does.

> As long as the range couldn't leave the stack-frame of the container, I think this would be totally safe.

It can't; that's the whole point of dip1000. Again, that's why this test passes:

https://github.com/atilaneves/automem/blob/e34b2b0c1510efd91064a9cbf83cbf43856c1a5c/tests/ut/vector.d#L229

March 11, 2019
On Monday, 11 March 2019 at 19:23:19 UTC, H. S. Teoh wrote:
> On Mon, Mar 11, 2019 at 06:29:23PM +0000, Atila Neves via Digitalmars-d wrote:
>> [...]
> [...]
>> [...]
> [...]
>
> Generally, it's a bad idea to conflate a container with a range over the container, precisely for this reason. In this respect, built-in arrays are a bad example, because we think of them as containers, yet they are at the same time ranges. (Though strictly speaking, "arrays" are just slices of the underlying containers that are managed by druntime and not really directly manipulatable. But nobody thinks of them in this way; the concept of "array" is much easier to grasp and reason with than "slice over druntime-managed container".)
>
> [...]

Maybe so. I'll have to think about it and maybe change the API so it's no longer a range.
March 11, 2019
On Monday, 11 March 2019 at 20:02:05 UTC, Atila Neves wrote:
> On Monday, 11 March 2019 at 19:40:06 UTC, bitwise wrote:
>> On Monday, 11 March 2019 at 18:29:23 UTC, Atila Neves wrote:
>>> On Monday, 11 March 2019 at 17:00:35 UTC, bitwise wrote:
>>>> On Sunday, 10 March 2019 at 18:46:14 UTC, Atila Neves wrote:
>>>>> On Sunday, 10 March 2019 at 17:36:09 UTC, bitwise wrote:
>>>>>> On Sunday, 10 March 2019 at 16:10:10 UTC, Atila Neves wrote:
>>>>>>> On Sunday, 10 March 2019 at 15:12:03 UTC, bitwise wrote:
>>>>>>>> [...]
>>>>>>>
>>
>> <snip>
>
>> I see now that you can iterate by returning a scoped slice of the internal data store, but this could be unsafe too. If you called reserve on the container, the returned slice could end up dangling if a non-gc allocator was used, even if the returned slice was scoped.
>
> Unless there's a bug in my implementation or in dip1000, no, it can't be unsafe.

So if I use opSlice() to retrieve a slice of the internal array, then call reserve on the vector causing it to deallocate, then try to access the slice I just retrieved, what would you call that?

>> I think that if opSlice returned a scoped range with a pointer back to the original container,
>
> It does.

https://github.com/atilaneves/automem/blob/e34b2b0c1510efd91064a9cbf83cbf43856c1a5c/source/automem/vector.d#L284

I mean a custom range object that contains a pointer back to the original container. This code returns a pointer directly to memory inside the container. It's not wrapped in a range object, and it doesn't have any pointer back to the vector, which it would need because the internal memory of the vector could be deallocated, but the actual vector object itself could not (relative to the scoped range). At most, the container could be moved from and considered empty by the range it returned.

>> As long as the range couldn't leave the stack-frame of the container, I think this would be totally safe.
>
> It can't; that's the whole point of dip1000. Again, that's why this test passes:
>
> https://github.com/atilaneves/automem/blob/e34b2b0c1510efd91064a9cbf83cbf43856c1a5c/tests/ut/vector.d#L229

I was referring to my suggested implementation of a range, not the slice returned by automem/vector. automem/vector's opSlice() could potentially allow you to dereference an invalid pointer. I don't know if detecting that is within the scope of dip1000, but I would still consider it a hole in the implementation.


March 12, 2019
On Monday, 11 March 2019 at 20:02:05 UTC, Atila Neves wrote:
>>
>> <snip>
>
>> I see now that you can iterate by returning a scoped slice of the internal data store, but this could be unsafe too. If you called reserve on the container, the returned slice could end up dangling if a non-gc allocator was used, even if the returned slice was scoped.
>
> Unless there's a bug in my implementation or in dip1000, no, it can't be unsafe.

I think that this is what @bitwise was talking about:
https://github.com/atilaneves/automem/issues/25
March 12, 2019
On Tuesday, 12 March 2019 at 18:06:39 UTC, Petar Kirov [ZombineDev] wrote:
> On Monday, 11 March 2019 at 20:02:05 UTC, Atila Neves wrote:
>>>
>>> <snip>
>>
>>> I see now that you can iterate by returning a scoped slice of the internal data store, but this could be unsafe too. If you called reserve on the container, the returned slice could end up dangling if a non-gc allocator was used, even if the returned slice was scoped.
>>
>> Unless there's a bug in my implementation or in dip1000, no, it can't be unsafe.
>
> I think that this is what @bitwise was talking about:
> https://github.com/atilaneves/automem/issues/25

I hadn't seen this, but yes, thanks.

   Bit
March 13, 2019
On Tuesday, 12 March 2019 at 21:06:17 UTC, bitwise wrote:
> On Tuesday, 12 March 2019 at 18:06:39 UTC, Petar Kirov [ZombineDev] wrote:
>> On Monday, 11 March 2019 at 20:02:05 UTC, Atila Neves wrote:
>>>>
>>>> <snip>
>>>
>>>> I see now that you can iterate by returning a scoped slice of the internal data store, but this could be unsafe too. If you called reserve on the container, the returned slice could end up dangling if a non-gc allocator was used, even if the returned slice was scoped.
>>>
>>> Unless there's a bug in my implementation or in dip1000, no, it can't be unsafe.
>>
>> I think that this is what @bitwise was talking about:
>> https://github.com/atilaneves/automem/issues/25
>
> I hadn't seen this, but yes, thanks.
>
>    Bit

After thinking about what @bitwise said, I was actually going to post back similar code.

The code in the github issue is problematic, but it's a far cry from how such errors are usually introduced in C++. Normally it's because another thread has a reference to the now dangling pointer, or calling a function with the slice means it gets escaped somewhere.

Another way would be to write this:


void oops(ref Vector!int vec, int[] slice);


Which is also convoluted. Neither case would pass code review.

Either way, thanks for pointing out the possible issue. I'm going to have to think long and hard about whether it's possible to fix it with the language we have now.
March 13, 2019
On Wednesday, 13 March 2019 at 08:49:37 UTC, Atila Neves wrote:
> Either way, thanks for pointing out the possible issue. I'm going to have to think long and hard about whether it's possible to fix it with the language we have now.

One possibility might be to implement at run time what rust does at compile time, eg make it impossible to mutate vector as long as one or more readable view to its content exists.

So for instance, make opSlice and opIndex return a struct that bumps a reference counter up inside the vector; the counter is bumped down once that struct is destructed. As long as the number of "live" slices is non-zero, the vector isn't allowed to re-allocate.

> The code in the github issue is problematic, but it's a far cry from how such errors are usually introduced in C++. Normally it's because another thread has a reference to the now dangling pointer, or calling a function with the slice means it gets escaped somewhere.

> Which is also convoluted. Neither case would pass code review.

I don't think that's the point.

It's supposed to be *impossible* to get a memory corruption in @safe code. Not "convoluted and wouldn't pass code review", impossible. When you write a @trusted function, getting a memory corruption with that function is supposed to be impossible as well.

Other people have gone over why memory safety is necessary before. Not all code goes through code review, sometimes review miss errors even for critical applications, etc.