Thread overview
Slicing betterC
Sep 06, 2018
Oleksii
Sep 06, 2018
Oleksii
Sep 06, 2018
Adam D. Ruppe
Sep 06, 2018
Jonathan M Davis
September 06, 2018
Hi folks,

Could you please share your wisdom with me? I wonder why the following code:
```
import core.stdc.stdlib;

Foo[] pool;
Foo[] foos;

auto buff = cast(Foo*) malloc(Foo.sizeof * 10);
pool = buff[0 .. 10];
foos = pool[0 .. 0 ];

// Now let's allocate a Foo:
Foo* allocatedFoo;
if (foos.length < foos.capacity) {    // <= Error: TypeInfo cannot be used with -betterC
  allocatedFoo = foos[0 .. $ + 1];    // <= Error: TypeInfo cannot be used with -betterC
}
```
fails to compile because of `foos.capacity` and `foos[0 .. $ + 1]`. Why do these two innocent-looking expressions require TypeInfo? Aren't slices basically fat pointers with an internal structure that looks like this:
```
struct Slice(T) {
  size_t capacity;
  size_t size;
  T*     memory;
}
```
?

It's weird that `TypeInfo` (a run-time, reflection-specific thing) is required in this particular case. Shouldn't static type checking be enough for all that?

Thanks in advance,
--
Oleksii
September 06, 2018
On Thursday, 6 September 2018 at 17:10:49 UTC, Oleksii wrote:

>   allocatedFoo = foos[0 .. $ + 1];    // <= Error: TypeInfo

This line was meant to be `allocatedFoo = foos[$]`. Sorry about that.
September 06, 2018
On Thursday, 6 September 2018 at 17:10:49 UTC, Oleksii wrote:
> struct Slice(T) {
>   size_t capacity;
>   size_t size;
>   T*     memory;
> }

There's no capacity in the slice. It is stored as part of the GC block, which the runtime looks up with the help of RTTI, thus the TypeInfo reference.

Slices *just* know their size and their memory pointer. They don't know how they were allocated and don't know what's beyond their bounds or how to grow their bounds. This needs to be managed elsewhere.

If you malloc a slice in regular D, the capacity will be returned as 0 - the GC doesn't know anything about it. Any attempt to append to it will allocate a whole new block.
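The behaviour described above can be checked with a small sketch in regular D (not -betterC, since it relies on the GC). The function name `appendRelocates` is made up for illustration; it is not part of druntime or Phobos.

```d
import core.stdc.stdlib : malloc, free;

/// Hypothetical helper: returns true if appending to a malloc-backed
/// slice moved the data into a new, GC-owned block.
bool appendRelocates()
{
    auto raw = cast(int*) malloc(int.sizeof * 10);
    if (raw is null)
        return false;
    scope (exit) free(raw);      // the malloc'd block is still ours to free

    int[] s = raw[0 .. 4];       // slice over non-GC memory
    s[] = 0;                     // initialize the four elements
    assert(s.capacity == 0);     // the GC knows nothing about this block

    auto before = s.ptr;
    s ~= 42;                     // no spare capacity: the GC allocates and copies
    return s.ptr !is before && s.length == 5;
}
```

After the append, `s` points at GC memory while `raw` still points at the original malloc'd block, so the two have to be managed separately.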

In -betterC, there is no GC to look up at all, and thus it has nowhere to look. You'll have to make your own struct that stores capacity if you need it.

I like to do something like

struct MyArray(T) {
      T* rawPointer;
      int capacity;
      int currentLength;

      // most user interaction will occur through this
      T[] opSlice() { return rawPointer[0 .. currentLength]; }

      // fill in other operators as needed
}
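
As a rough illustration of how that struct might be filled in for the pool from the original question, here is a minimal sketch that tracks capacity itself and never touches `.capacity` or `~=`. The names `Pool`, `create`, `allocate` and `release` are assumptions made for this example, and `Foo` is a stand-in type. It uses only malloc/free and pointer slicing, so it is expected to build with -betterC as well.

```d
import core.stdc.stdlib : malloc, free;

struct Foo { int value; }        // stand-in for the Foo in the question

/// Hypothetical fixed-capacity pool backed by malloc.
struct Pool(T)
{
    T* rawPointer;
    size_t capacity;
    size_t currentLength;

    /// Reserves room for n elements; capacity stays 0 if malloc fails.
    static Pool create(size_t n)
    {
        Pool p;
        p.rawPointer = cast(T*) malloc(T.sizeof * n);
        p.capacity = p.rawPointer is null ? 0 : n;
        return p;
    }

    /// Hands out the next unused (uninitialized) slot, or null when full.
    T* allocate()
    {
        if (currentLength >= capacity)
            return null;
        return &rawPointer[currentLength++];
    }

    /// View of the slots handed out so far.
    T[] opSlice() { return rawPointer[0 .. currentLength]; }

    /// Frees the backing memory and resets the bookkeeping.
    void release()
    {
        free(rawPointer);
        rawPointer = null;
        capacity = currentLength = 0;
    }
}
```

With this, `auto pool = Pool!Foo.create(10);` followed by `Foo* f = pool.allocate();` plays the role of the `foos.length < foos.capacity` check, and `pool[]` gives an ordinary `Foo[]` over the allocated part.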

September 06, 2018
On Thursday, September 6, 2018 11:34:18 AM MDT Adam D. Ruppe via Digitalmars-d-learn wrote:
> On Thursday, 6 September 2018 at 17:10:49 UTC, Oleksii wrote:
> > struct Slice(T) {
> >
> >   size_t capacity;
> >   size_t size;
> >   T*     memory;
> >
> > }
>
> There's no capacity in the slice. It is stored as part of the GC block, which the runtime looks up with the help of RTTI, thus the TypeInfo reference.
>
> Slices *just* know their size and their memory pointer. They don't know how they were allocated and don't know what's beyond their bounds or how to grow their bounds. This needs to be managed elsewhere.
>
> If you malloc a slice in regular D, the capacity will be returned as 0 - the GC doesn't know anything about it. Any attempt to append to it will allocate a whole new block.
>
> In -betterC, there is no GC to look up at all, and thus it has nowhere to look. You'll have to make your own struct that stores capacity if you need it.
>
> I like to do something like
>
> struct MyArray(T) {
>        T* rawPointer;
>        int capacity;
>        int currentLength;
>
>        // most user interaction will occur through this
>        T[] opSlice() { return rawPointer[0 .. currentLength]; }
>
>        // fill in other operators as needed
> }

To try to make this even clearer, a dynamic array looks basically like this under the hood:

struct DynamicArray(T)
{
    size_t length;
    T* ptr;
}

IIRC, it actually uses void*, unfortunately, but that struct is basically what you get. Notice that _all_ of the information that's there is the pointer and the length. That's it. If you understand the semantics of what happens when passing that struct around, you'll understand the semantics of passing around dynamic arrays. And all of the operations that would have anything to do with memory management involve the GC: capacity, ~, ~=, etc. all require the GC.

If you're not using -betterC, the fact that the dynamic array was allocated with malloc is pretty irrelevant, since all of those operations will function exactly the same as if the dynamic array were allocated by the GC. It's just that because the dynamic array is not GC-allocated, its capacity is guaranteed to be 0, and therefore any operation that would increase the array's length requires reallocating the dynamic array with the GC. If it were already GC-allocated, its capacity might have been greater than its length, in which case reallocation would not be required.
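
A short sketch of the "pointer plus length, nothing else" point, for anyone who wants to check it. The function name `sliceSharesMemory` is invented for this example.

```d
// A T[] occupies exactly two machine words: a length and a pointer.
static assert((int[]).sizeof == size_t.sizeof + (int*).sizeof);

/// Hypothetical check: slicing copies the {length, ptr} pair, not the data.
bool sliceSharesMemory()
{
    int[4] storage = [1, 2, 3, 4];
    int[] a = storage[];         // {length: 4, ptr: &storage[0]}
    int[] b = a[1 .. 3];         // {length: 2, ptr: a.ptr + 1}, same memory

    b[0] = 20;                   // writes through to storage[1]
    return a[1] == 20 && b.length == 2 && b.ptr == a.ptr + 1;
}
```

Nothing in `a` or `b` records where the memory came from or how much room lies beyond the end, which is why `capacity` has to ask the GC.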

If you haven't read it already, I would suggest reading this article:

https://dlang.org/articles/d-array-article.html

It does not use the official terminology, but in spite of that, it should really help clarify things for you. The article refers to T[] as being a slice (which is accurate, since it is a slice of memory), but it incorrectly refers to the memory buffer itself as being the dynamic array, whereas the language spec considers the T[] (the struct shown above) to be the dynamic array. The language does not have a specific name for that memory buffer, and it considers a T[] to be a dynamic array regardless of what memory it refers to. So, you should keep that in mind when reading the article, but the concepts that it teaches are very much correct and should help a great deal in understanding how dynamic arrays work in D.

- Jonathan M Davis