Re: Memory allocation in D (noob question)
December 01, 2007
It probably is a noob question,
but aren't array lengths just hidden size_t values
that are passed around?
Why do we need to allocate space for them, too?

void foo()
{
  size_t length;
  char* ptr; //allocated memory of 2^n
  //.. the same as..?
  char[] data;
}
December 01, 2007
"mandel" <oh@no.es> wrote in message news:fiqu9l$18v$1@digitalmars.com...
> It probably is a noob question,
> but aren't array lengths just hidden size_t values
> that are passed around?
> Why do we need to allocate space for them, too?
>
> void foo()
> {
>  size_t length;
>  char* ptr; //allocated memory of 2^n
>  //.. the same as..?
>  char[] data;
> }

What?

I mean, yes, a size_t and a pointer will be the same size as an array reference, but the point of an array reference is that, well, it's an array reference.  And you can do all kinds of things with them that you can't with pointers.

What are you getting at?
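
To make the size comparison above concrete, here is a minimal sketch. `ManualSlice` and `toManual` are hypothetical names invented for this illustration; the field order mirrors the length-then-pointer layout discussed in this thread and is an assumption about the ABI, not something the language promises to user code.

```d
// Hypothetical hand-rolled equivalent of a D array reference.
struct ManualSlice
{
    size_t length; // number of elements
    char*  ptr;    // address of the first element
}

// The built-in reference occupies two machine words, the same as the struct.
static assert((char[]).sizeof == ManualSlice.sizeof);

// Copies the two fields out of a real array reference.
// No element data is copied; both still refer to the same memory.
ManualSlice toManual(char[] data)
{
    return ManualSlice(data.length, data.ptr);
}

void main()
{
    char[] data = new char[16];
    ManualSlice m = toManual(data);
    assert(m.length == 16);
    assert(m.ptr is data.ptr);
}
```

The difference is in what the compiler does with the two: indexing, slicing, and appending work on `char[]`, while the struct is just two loose fields.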


December 01, 2007
On 12/1/07, mandel <oh@no.es> wrote:
> Why do we need to allocate space for them, too?

Raw pointers are discouraged in modern languages such as D. They are the source of too many bugs. Use them if you need to do down-and-dirty, under-the-hood stuff, but for general use, forget pointers. Use arrays. They're safer.
December 01, 2007
mandel wrote:
> It probably is a noob question,
> but aren't array lengths just hidden size_t values
> that are passed around?
> Why do we need to allocate space for them, too?
> 
> void foo()
> {
>   size_t length;
>   char* ptr; //allocated memory of 2^n
>   //.. the same as..?
>   char[] data;
> }

The extra space allocated isn't for the length (in fact, it's just a byte I think); it's to make checking for array bounds errors possible (since there's a byte of space that, if accessed, indicates an overflow). It might be used for something else, too.
December 01, 2007
Robert Fraser Wrote:

> mandel wrote:
> > It probably is a noob question,
> > but aren't array lengths just hidden size_t values
> > that are passed around?
> > Why do we need to allocate space for them, too?
> > 
> > void foo()
> > {
> >   size_t length;
> >   char* ptr; //allocated memory of 2^n
> >   //.. the same as..?
> >   char[] data;
> > }
> 
> The extra space allocated isn't for the length (in fact, it's just a byte I think); it's to make checking for array bounds errors possible (since there's a byte of space that, if accessed, indicates an overflow). It might be used for something else, too.

Thanks, that answers my question.
But I can't think how it could be used for array bounds errors checking right now.
Well, I guess there is some newsgroup post about this somewhere.
But the page allocation overhead looks ugly for a language like D.

Anyway, good to have D arrays. Working with pointers in C often held surprises whenever attention lapsed. :>
December 01, 2007
mandel wrote:
> Robert Fraser Wrote:
> 
>> mandel wrote:
>>> It probably is a noob question,
>>> but aren't array lengths just hidden size_t values
>>> that are passed around?
>>> Why do we need to allocate space for them, too?
>>>
>>> void foo()
>>> {
>>>   size_t length;
>>>   char* ptr; //allocated memory of 2^n
>>>   //.. the same as..?
>>>   char[] data;
>>> }
>> The extra space allocated isn't for the length (in fact, it's just a byte I think); it's to make checking for array bounds errors possible (since there's a byte of space that, if accessed, indicates an overflow). It might be used for something else, too.
> 
> Thanks, that answers my question.
> But I can't think how it could be used for array bounds errors checking right now.
> Well, I guess there is some newsgroup post about this somewhere.
> But the page allocation overhead looks ugly for a language like D.
> 
> Anyway, good to have D arrays. Working with pointers in C often
> held surprises whenever attention lapsed. :>

An observation...
In my experience, most pointer bugs are actually uninitialised variables.
An uninitialised pointer is a truly horrible thing. But since D initialises variables, pointers in D aren't nearly as bad as in C.
December 03, 2007
"mandel" wrote in message
> Robert Fraser Wrote:
>
>> mandel wrote:
>> > It probably is a noob question,
>> > but aren't array lengths just hidden size_t values
>> > that are passed around?
>> > Why do we need to allocate space for them, too?
>> >
>> > void foo()
>> > {
>> >   size_t length;
>> >   char* ptr; //allocated memory of 2^n
>> >   //.. the same as..?
>> >   char[] data;
>> > }
>>
>> The extra space allocated isn't for the length (in fact, it's just a byte I think); it's to make checking for array bounds errors possible (since there's a byte of space that, if accessed, indicates an overflow). It might be used for something else, too.
>
> Thanks, that answers my question.
> But I can't think how it could be used for array bounds errors checking
> right now.
> Well, I guess there is some newsgroup post about this somewhere.
> But the page allocation overhead looks ugly for a language like D.

Think of it this way:

int[] array1 = new int[5];
int[] array2 = new int[5];

Imagine that array1 and array2 are now sequential in memory *AND* there is no extra byte separating them.

Now I create the valid array slices:

int[] array3 = array1[$..$];
int[] array4 = array2[0..0];

Note that both of these arrays are bit-for-bit identical (both have 0 length and the same ptr value).  Which one points to which piece of memory?  How is the GC to decide which memory gets collected?

These are the types of problems that the extra byte helps with.

I personally think there exists a way to fix this efficiently without adding the extra byte, but I can't think of one :)

Oh, and also, the size_t length is not stored in the allocated memory.  It's stored in the array structure, usually on the stack or inside a class instance.

I hope this helps your understanding of the issue.

-Steve
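
A minimal sketch of the two empty slices described above, checking only what can be stated with certainty: where each slice's pointer lands relative to its own array. `tailSlice` and `headSlice` are hypothetical helper names. Whether two separately allocated arrays ever sit back to back depends on the allocator, so that part is not asserted.

```d
// Empty slice positioned just past the last element of the array.
int[] tailSlice(int[] a)
{
    return a[$ .. $];
}

// Empty slice positioned at the first element of the array.
int[] headSlice(int[] a)
{
    return a[0 .. 0];
}

void main()
{
    int[] array1 = new int[5];
    int[] array2 = new int[5];

    int[] array3 = tailSlice(array1);
    int[] array4 = headSlice(array2);

    // Both slices are empty ...
    assert(array3.length == 0 && array4.length == 0);
    // ... but array3 points one element past the end of array1's data,
    assert(array3.ptr is array1.ptr + array1.length);
    // while array4 points at the very start of array2's data.
    assert(array4.ptr is array2.ptr);
}
```

If the two blocks were adjacent with no padding, `array1.ptr + array1.length` and `array2.ptr` would be the same address, which is the ambiguity the post describes.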


December 03, 2007
Steven Schveighoffer wrote:
> "mandel" wrote in message
>> Robert Fraser Wrote:
>>
>>> mandel wrote:
>>>> It probably is a noob question,
>>>> but aren't array lengths just hidden size_t values
>>>> that are passed around?
>>>> Why do we need to allocate space for them, too?
>>>>
>>>> void foo()
>>>> {
>>>>   size_t length;
>>>>   char* ptr; //allocated memory of 2^n
>>>>   //.. the same as..?
>>>>   char[] data;
>>>> }
>>> The extra space allocated isn't for the length (in fact, it's just a
>>> byte I think); it's to make checking for array bounds errors possible
>>> (since there's a byte of space that, if accessed, indicates an
>>> overflow). It might be used for something else, too.
>> Thanks, that answers my question.
>> But I can't think how it could be used for array bounds errors checking right now.
>> Well, I guess there is some newsgroup post about this somewhere.
>> But the page allocation overhead looks ugly for a language like D.
> 
> Think of it this way:
> 
> int[] array1 = new int[5];
> int[] array2 = new int[5];
> 
> Imagine that array1 and array2 are now sequential in memory *AND* there is no extra byte separating them.
> 
> Now I create the valid array slices:
> 
> int[] array3 = array1[$..$];
> int[] array4 = array2[0..0];
> 
> Note that both of these arrays are bit-for-bit identical (both have 0 length and the same ptr value).  Which one points to which piece of memory?  How is the GC to decide which memory gets collected?
> 
> These are the types of problems that the extra byte helps with.
> 
> I personally think there exists a way to fix this efficiently without adding the extra byte, but I can't think of one :)
> 
> Oh, and also, the size_t length is not stored in the allocated memory.  It's stored in the array structure, usually on the stack or inside a class instance.

This is true in D 1.0.  However, there has been talk that arrays in D 2.0 would change from:

struct Array
{
    size_t length;
    byte*  ptr;
}

to:

struct Array
{
    byte*  ptr;
    byte*  end;
}

With that layout, every array reference would point both into its own block and, through its end pointer, at the start of the block immediately following it in memory, if no padding is done.


Sean
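
A minimal sketch of the pointer-pair layout, to show why the end pointer matters. `PtrEndArray` is a hypothetical type written for this illustration (templated on the element type rather than using `byte*`); it is not the actual D 2.0 runtime structure, which the post only describes as something under discussion.

```d
// Hypothetical user-level model of the proposed (ptr, end) representation.
struct PtrEndArray(T)
{
    T* ptr; // first element
    T* end; // one past the last element

    // The length is no longer stored; it is derived from the two pointers.
    size_t length() const
    {
        return cast(size_t)(end - ptr);
    }
}

void main()
{
    int[] a = new int[5];
    auto r = PtrEndArray!int(a.ptr, a.ptr + a.length);
    assert(r.length == 5);

    // Even a non-empty reference carries an address outside its own data:
    // r.end is the first address past the five ints.
    assert(r.end is a.ptr + 5);

    // An empty reference at the tail consists of that address twice.
    auto tail = PtrEndArray!int(r.end, r.end);
    assert(tail.length == 0);
}
```

With a stored length, only empty tail slices point past the data; with a stored end pointer, every reference does, so a conservative collector would see a pointer to whatever follows unless the block is padded.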
December 03, 2007
Steven Schveighoffer wrote:
[..]
> Now I create the valid array slices:
> 
> int[] array3 = array1[$..$];
> int[] array4 = array2[0..0];
> 
> Note that both of these arrays are bit-for-bit identical (both have 0 length and the same ptr value).  Which one points to which piece of memory?  How is the GC to decide which memory gets collected?
I see the problem.
The first possible solution that comes to mind is to
make array1[0..0] and array1[$..$] equal.
array1[$..$] could point to the beginning of the array.
Since the slice length is zero, it shouldn't matter, should it?

On second thought, why not ignore empty slices altogether by telling the GC that their pointers don't hold any data?

Anyway, I guess there are some things I missed. ;-)


[..]
> I hope this helps your understanding of the issue.
Yes, it does. :)

December 04, 2007
mandel wrote:
> Steven Schveighoffer wrote:
> [..]
>> Now I create the valid array slices:
>>
>> int[] array3 = array1[$..$];
>> int[] array4 = array2[0..0];
>>
>> Note that both of these arrays are bit-for-bit identical (both have 0
>> length and the same ptr value).  Which one points to which piece of
>> memory?  How is the GC to decide which memory gets collected?
> I see the problem.
> The first possible solution that comes to mind is to
> make array1[0..0] and array1[$..$] equal.
> array1[$..$] could point to the beginning of the array.
> Since the slice length is zero, it shouldn't matter, should it?

Appending to an array slice (empty or not) that starts at the beginning of an allocated block appends in place rather than allocating a new array. This is the reason

while(x)
  a ~= b;

can be reasonably efficient.

So appending to the [$..$] array would (without padding) mean that you corrupt the following array.

The upcoming D2 T[new] (hopefully T[*] :) ) array type will probably make that a non-issue though.

> On second thought, why not ignore empty slices altogether by telling
> the GC that their pointers don't hold any data?

Except for the fact that having an empty slice at the start of an allocated block is needed for appending to a preallocated block in current D, the reason is that the current GC doesn't have that fine-grained information. It currently only knows "this block might contain pointers" and "this block doesn't contain pointers", and in the former case, treats everything properly aligned as potential pointers.

-- 
Oskar
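
A minimal sketch of in-place appending into a preallocated block. It assumes a current D2 runtime, where `reserve` is available on dynamic arrays; that differs from the D1 runtime this thread discusses, so treat it as an illustration of the idea rather than of the 2007 behaviour. `appendsInPlace` is a hypothetical helper name.

```d
// Returns true if `extra` appends after a reserve leave the data where it was.
bool appendsInPlace(size_t extra)
{
    int[] a = [0];

    // Ask the runtime for room for all upcoming elements up front.
    // This call may itself move the array, so take the address afterwards.
    a.reserve(a.length + extra);
    auto first = &a[0];

    foreach (i; 0 .. extra)
        a ~= cast(int) i; // fits in the reserved capacity, so no reallocation

    return first is &a[0];
}

void main()
{
    assert(appendsInPlace(100));
}
```

This is the property that makes a `while (x) a ~= b;` loop reasonably cheap: most appends only bump the length instead of copying the whole array.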