View mode: basic / threaded / horizontal-split · Log in · Help
December 11, 2012
Array Slices and Interior Pointers
http://xtzgzorex.wordpress.com/2012/12/11/array-slices-and-interior-pointers/

Destroy.

-- 
Alex Rønne Petersen
alex@lycus.org
http://lycus.org
December 11, 2012
Re: Array Slices and Interior Pointers
On 12/11/2012 01:04 AM, Alex Rønne Petersen wrote:
> http://xtzgzorex.wordpress.com/2012/12/11/array-slices-and-interior-pointers/
>
>
> Destroy.
>

Why does the internal representation have to be the same for a managed 
port and native D? Also, how does the second representation work 
exactly? Not all slices extend to the end of the memory block.

I don't really feel strongly about the memory requirements for slices, 
but 12 / 24 bytes is starting to feel a little bulky. I am not 
intimately familiar with druntime, but OTOH and AFAICS, the additional 
pointer should also allow faster retrieval of the slice's capacity. 
(though the compiler should IMHO implement specific optimizations for ~= 
in loops anyway.)
December 11, 2012
Re: Array Slices and Interior Pointers
On 11-12-2012 02:49, Timon Gehr wrote:
> On 12/11/2012 01:04 AM, Alex Rønne Petersen wrote:
>> http://xtzgzorex.wordpress.com/2012/12/11/array-slices-and-interior-pointers/
>>
>>
>>
>> Destroy.
>>
>
> Why does the internal representation have to be the same for a managed
> port and native D? Also, how does the second representation work
> exactly? Not all slices extend to the end of the memory block.

They don't have to be. Ideally it shouldn't even have to matter because 
D code shouldn't make assumptions about it.

And good point. That makes the second variation not useful for VMs that 
don't natively support slicing arrays, so I'll scratch that as a useful 
representation. A representation for a VM would then probably need to be 
{length, base, offset} (which could also work for a native D).

>
> I don't really feel strongly about the memory requirements for slices,
> but 12 / 24 bytes is starting to feel a little bulky. I am not
> intimately familiar with druntime, but OTOH and AFAICS, the additional
> pointer should also allow faster retrieval of the slice's capacity.
> (though the compiler should IMHO implement specific optimizations for ~=
> in loops anyway.)

Some optimizations can probably be done when the base pointer is known.

-- 
Alex Rønne Petersen
alex@lycus.org
http://lycus.org
December 11, 2012
Re: Array Slices and Interior Pointers
On 11.12.2012 01:04, Alex Rønne Petersen wrote:
> http://xtzgzorex.wordpress.com/2012/12/11/array-slices-and-interior-pointers/
>
>
> Destroy.
>

I don't think there is a noticeable difference in detecting whether a 
pointer is pointing to the beginning of a GC memory block or somewhere 
inside it.

> This is clearly a huge problem for type-precise garbage collection.

I don't see problems here. If a memory block is referenced, all of it 
contents remains in memory, so they are scanned with their full type 
info. Or do you want to chop off unreferenced parts of the memory block?

From your post, it seems these are restrictions imposed by the .NET GC, 
not by slices in general. If you take a pointer to a field inside a 
struct, you will again get interior pointer. Do you want "fat pointers" 
for this as well?
December 11, 2012
Re: Array Slices and Interior Pointers
On Tuesday, 11 December 2012 at 00:04:57 UTC, Alex Rønne Petersen 
wrote:
> http://xtzgzorex.wordpress.com/2012/12/11/array-slices-and-interior-pointers/
>
> Destroy.

Instead of changing slices, shouldn't all pointers be modified if 
you want to do this kind of things (a pointer would have two 
parts a reference to the "head" and the real reference)?

BR,
renoX
December 11, 2012
Re: Array Slices and Interior Pointers
On 11-12-2012 11:36, renoX wrote:
> On Tuesday, 11 December 2012 at 00:04:57 UTC, Alex Rønne Petersen wrote:
>> http://xtzgzorex.wordpress.com/2012/12/11/array-slices-and-interior-pointers/
>>
>>
>> Destroy.
>
> Instead of changing slices, shouldn't all pointers be modified if you
> want to do this kind of things (a pointer would have two parts a
> reference to the "head" and the real reference)?
>
> BR,
> renoX

Interior pointers are not generally as useful for other things in the 
language as they are for slices, so I don't think any change is 
necessarily needed there.

-- 
Alex Rønne Petersen
alex@lycus.org
http://lycus.org
December 11, 2012
Re: Array Slices and Interior Pointers
On 11-12-2012 08:29, Rainer Schuetze wrote:
>
> On 11.12.2012 01:04, Alex Rønne Petersen wrote:
>> http://xtzgzorex.wordpress.com/2012/12/11/array-slices-and-interior-pointers/
>>
>>
>>
>> Destroy.
>>
>
> I don't think there is a noticeable difference in detecting whether a
> pointer is pointing to the beginning of a GC memory block or somewhere
> inside it.

From what I could find in e.g. the Boehm GC, there seems to be 
significant work done to catch interior pointers in addition to base 
pointers (grep for GC_all_interior_pointers and related symbols).

>
>  > This is clearly a huge problem for type-precise garbage collection.
>
> I don't see problems here. If a memory block is referenced, all of it
> contents remains in memory, so they are scanned with their full type
> info. Or do you want to chop off unreferenced parts of the memory block?

No, the problem I was getting at is:

Suppose we have a field int* p; somewhere in the GC heap. With the 
current state of affairs, we have to consider that this field can hold a 
value that is either:

a) null (we don't care)
b) a pointer into C memory (we don't care)
c) a base pointer into the GC heap (unlikely but possible if "new int" 
was used somewhere)
d) an interior pointer into the GC heap (much more likely; a pointer to 
a field of another object)

So we have to look at the pointer and first figure out what kind of 
memory block it is /actually/ pointing to before we have any kind of 
type info available (just the knowledge that it's of type int* is not 
particularly useful by itself other than knowing that it could be a 
pointer at all).

With my scheme, the possibilities would be:

a) null (we don't care)
b) a pointer into C memory (we don't care)
c) a base pointer into the GC heap where the memory block is of type int*

Notice how we did not have to do any significant work to figure out what 
we're dealing with; we immediately know what kind of typed memory the 
pointer is pointing to.

This becomes more of an advantage with aggregates. Suppose we have:

struct A
{
    // ... more fields ...
}

And we have a field A* p; somewhere in the GC heap. We can now look at 
it and immediately tell whether it's a case of a, b, or c above and can 
trivially continue scanning into the pointed-to memory (if needed).

So the TL;DR is: We avoid extra work to figure out the actual type of 
the memory something is pointing to by simply making such cases illegal.

Whether that is practical, I do not know, and I don't plan to push for 
it anytime soon at least. But it has to be done for D to ever run on the 
CLI.

>
>  From your post, it seems these are restrictions imposed by the .NET GC,
> not by slices in general. If you take a pointer to a field inside a
> struct, you will again get interior pointer. Do you want "fat pointers"
> for this as well?

Sure, there's nothing wrong with slices if we assume all GCs that'll be 
running in a D implementation support interior pointers. But if we make 
this assumption, D can never run on the CLI.

Interior pointers are OK in the stack and registers, so taking pointers 
to fields inside aggregates should be fine so long as they are not 
stored in the heap.

-- 
Alex Rønne Petersen
alex@lycus.org
http://lycus.org
December 11, 2012
Re: Array Slices and Interior Pointers
On Tue, 11 Dec 2012 11:25:44 -0600, Alex Rønne Petersen <alex@lycus.org>  
wrote:
> On 11-12-2012 08:29, Rainer Schuetze wrote:
>> On 11.12.2012 01:04, Alex Rønne Petersen wrote:
>>> http://xtzgzorex.wordpress.com/2012/12/11/array-slices-and-interior-pointers/
>>>
>>> Destroy.
>>>

Done.

[snip]

> From what I could find in e.g. the Boehm GC, there seems to be  
> significant work done to catch interior pointers in addition to base  
> pointers (grep for GC_all_interior_pointers and related symbols).

*Ahem* Arguments regarding performance require A) hard numbers and B) are  
implementation specific.

[snip]

> Suppose we have a field int* p;

p _isn't_ a slice, so you're 'fixes' don't apply.

[snip]

> So we have to look at the pointer and first figure out what kind of  
> memory block it is /actually/ pointing to before we have any kind of  
> type info available (just the knowledge that it's of type int* is not  
> particularly useful by itself other than knowing that it could be a  
> pointer at all).

How is p >> 12 slow or difficult? (Assuming log2(PageSize) == 12)

> So the TL;DR is: We avoid extra work to figure out the actual type of  
> the memory something is pointing to by simply making such cases illegal.

At the cost of extra work and more memory everywhere arrays are used.

> Whether that is practical, I do not know, and I don't plan to push for  
> it anytime soon at least. But it has to be done for D to ever run on the  
> CLI.

The issue with the CLI has nothing to do with this. The problem is that D  
arrays are D slices (i.e. we don't have T[new]) and D code is written to  
be slice compatible. Whereas the .Net libraries are, for the most part,  
slice incompatible. So slice-based code, in D or .Net, has to constantly  
convert back to arrays, which is a major performance sink.

[snip]

> But if we make this assumption, D can never run on the CLI.

False, see http://dnet.codeplex.com/.

> Interior pointers are OK in the stack and registers, so taking pointers  
> to fields inside aggregates should be fine so long as they are not  
> stored in the heap.

So what about unions?
December 11, 2012
Re: Array Slices and Interior Pointers
On 11-12-2012 19:11, Robert Jacques wrote:
> On Tue, 11 Dec 2012 11:25:44 -0600, Alex Rønne Petersen <alex@lycus.org>
> wrote:
>> On 11-12-2012 08:29, Rainer Schuetze wrote:
>>> On 11.12.2012 01:04, Alex Rønne Petersen wrote:
>>>> http://xtzgzorex.wordpress.com/2012/12/11/array-slices-and-interior-pointers/
>>>>
>>>>
>>>> Destroy.
>>>>
>
> Done.
>
> [snip]
>
>> From what I could find in e.g. the Boehm GC, there seems to be
>> significant work done to catch interior pointers in addition to base
>> pointers (grep for GC_all_interior_pointers and related symbols).
>
> *Ahem* Arguments regarding performance require A) hard numbers and B)
> are implementation specific.

Yes, but I'm not really here to convince anyone about whether interior 
pointers are needed or not. I was looking for input on the soundness of 
my proposal in order to /avoid/ interior pointers. Just that.

That is to say: I don't care enough about arguing this particular point 
to actually construct a benchmark. If you don't think interior pointers 
are a problem, that is fine, but then your input on the proposal 
probably isn't very useful, because even if I took back my argument 
about performance, interior pointers are still a very real problem for D 
integration into the CLI.

>
> [snip]
>
>> Suppose we have a field int* p;
>
> p _isn't_ a slice, so you're 'fixes' don't apply.

I was replying to Rainer's question about interior pointers and type 
precision in general. You're taking my reply way out of context.

>
> [snip]
>
>> So we have to look at the pointer and first figure out what kind of
>> memory block it is /actually/ pointing to before we have any kind of
>> type info available (just the knowledge that it's of type int* is not
>> particularly useful by itself other than knowing that it could be a
>> pointer at all).
>
> How is p >> 12 slow or difficult? (Assuming log2(PageSize) == 12)

Doesn't look slow and difficult to me. But it depends on the GC 
implementation, as you said. :)

>
>> So the TL;DR is: We avoid extra work to figure out the actual type of
>> the memory something is pointing to by simply making such cases illegal.
>
> At the cost of extra work and more memory everywhere arrays are used.

Yes.

>
>> Whether that is practical, I do not know, and I don't plan to push for
>> it anytime soon at least. But it has to be done for D to ever run on
>> the CLI.
>
> The issue with the CLI has nothing to do with this. The problem is that
> D arrays are D slices (i.e. we don't have T[new]) and D code is written
> to be slice compatible. Whereas the .Net libraries are, for the most
> part, slice incompatible. So slice-based code, in D or .Net, has to
> constantly convert back to arrays, which is a major performance sink.

I'm sorry, but you are wrong.

Interior pointers are /not/ permitted in the CLI. See Ecma 335, I.8.2.1.1.

D as it exists today cannot work in the CLI if it requires interior 
pointers for such a fundamental language feature no matter how you look 
at it.

>
> [snip]
>
>> But if we make this assumption, D can never run on the CLI.
>
> False, see http://dnet.codeplex.com/.

No, not false. This project is stalled because of slices. And 
regardless, the CLI spec clearly does not allow interior pointers.

>
>> Interior pointers are OK in the stack and registers, so taking
>> pointers to fields inside aggregates should be fine so long as they
>> are not stored in the heap.
>
> So what about unions?

Emit a type info bit saying "scan conservatively". Unions are the 
exception rather than the rule. As far as the CLI goes, unions cannot 
work at all, obviously.

-- 
Alex Rønne Petersen
alex@lycus.org
http://lycus.org
December 11, 2012
Re: Array Slices and Interior Pointers
12/11/2012 4:04 AM, Alex Rønne Petersen пишет:
> http://xtzgzorex.wordpress.com/2012/12/11/array-slices-and-interior-pointers/
>
>
> Destroy.
>

Aside from the fact that I can use slices without GC just fine? :)
The base pointers would then be either counted, released manually or 
implicitly as part of stack unwinding.

I personally think that managed VMs are going to have to emulate slices 
and pointers as an array object + one or pair of offsets. In fact it 
could be implemented as an abstract object with implementation depending 
on where you did get that pointer from.

-- 
Dmitry Olshansky
« First   ‹ Prev
1 2
Top | Discussion index | About this forum | D home