Obtaining aligned addresses and aligned spaces for objects
Jun 29, 2011
Ali Çehreli
June 29, 2011
The address that is passed to emplace() must be properly aligned for that type. Am I correct in thinking that the following functions are a must? If so, do you think that they are correct? If so, are there equivalents in Phobos?

/**
 * Return an aligned address for the type at or after the
 * candidate address.
 */
T * alignedAddress(T)(T * candidate)
{
    return cast(T*)((cast(size_t)candidate + T.alignof - 1)
                    / T.alignof * T.alignof);
}

/**
 * A convenient overload for void*
 */
void * alignedAddress(T)(void * candidate)
{
    return alignedAddress(cast(T*)candidate);
}

/**
 * Return the size of the space that the type occupies from
 * one aligned address to the next.
 */
size_t alignedSpace(T)()
{
    static if (is (T == class)) {
        size_t size = __traits(classInstanceSize, T);
    } else {
        size_t size = T.sizeof;
    }

    return cast(size_t)alignedAddress(cast(T*)size);
}

Here is the rest of the program that uses those:

import std.stdio;
import core.memory;
import std.conv;

struct S
{
    string s;

    string toString()
    {
        return s;
    }
}

void main()
{
    /* Inputs to use to make objects */
    string[] inputs = [ "hello", "world" ];

    /* Raw memory */
    S * chunk = cast(S*)GC.calloc(inputs.length * alignedSpace!S());

    /* Make objects */
    foreach (i, str; inputs) {
        S * objectAddress = alignedAddress(chunk + i);
        emplace(objectAddress, str);
    }

    /* Convert from pointer to slice. Nice feature! :) */
    S[] objects = chunk[0 .. inputs.length];

    /* Use the objects */
    writeln(objects);
}

The output:

[hello, world]
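
A side note on the rounding used by alignedAddress() above: the arithmetic can be checked in isolation on plain integers. The sketch below is only an illustration; roundUp and roundUpMask are hypothetical helper names (not Phobos functions), and the mask form assumes the alignment is a power of two.

```d
import std.stdio;

/* Hypothetical helper: round n up to the next multiple of alignment
 * with integer division, the same arithmetic as alignedAddress(). */
size_t roundUp(size_t n, size_t alignment)
{
    return (n + alignment - 1) / alignment * alignment;
}

/* Hypothetical helper: equivalent bit-mask form. Valid only when
 * alignment is a non-zero power of two. */
size_t roundUpMask(size_t n, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (n + alignment - 1) & ~(alignment - 1);
}

void main()
{
    /* Both forms must agree for a power-of-two alignment. */
    foreach (n; [ 0, 1, 7, 8, 9, 17, 24 ]) {
        assert(roundUp(n, 8) == roundUpMask(n, 8));
        writefln("%2s -> %2s", n, roundUp(n, 8));
    }
}
```

With an alignment of 8, a size of 17 rounds up to 24, and a value that is already a multiple of 8 is left unchanged.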

On a related note, why doesn't __traits(classInstanceSize, T) consider the padding bytes? If I'm not mistaken struct sizes do include the padding bytes. The following class has the very odd size of 17! Is that by design?

class C
{
    char c;
}

void main()
{
    assert(__traits(classInstanceSize, C) == 17);
}

Ali
June 29, 2011
On Wed, 29 Jun 2011 01:42:37 -0400, Ali Çehreli <acehreli@yahoo.com> wrote:
> On a related note, why doesn't __traits(classInstanceSize, T) consider
> the padding bytes? If I'm not mistaken struct sizes do include the
> padding bytes. The following class has the very odd size of 17! Is that
> by design?

Probably it's because classes are intended to occupy their own memory block, so the pad doesn't matter, you can't put two in the same block anyways.

If you want to do it, I think it should be possible.  But I don't think it would work great if you put two classes in the same block without significant compiler/GC changes.

That being said, given that all GC blocks are at least 16-byte aligned, the same should hold true for any class instance.

-Steve
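
A small sketch of the size difference discussed above, assuming that a class instance starts with two hidden pointer-sized fields (the vtable pointer and the monitor). The Layout struct is a hypothetical stand-in that mirrors that assumed layout; the printed class size depends on the compiler and platform, so it is printed rather than asserted.

```d
import std.stdio;

class C
{
    char c;
}

/* Hypothetical struct mirroring the assumed layout of a C instance:
 * vtable pointer, monitor pointer, then the char member. */
struct Layout
{
    void* vptr;
    void* monitor;
    char c;
}

void main()
{
    /* Struct sizes are padded to a multiple of the struct's alignment. */
    static assert(Layout.sizeof % Layout.alignof == 0);
    static assert(Layout.sizeof == 3 * (void*).sizeof);

    /* The class instance size is reported as-is by the compiler. */
    writeln("class instance size: ", __traits(classInstanceSize, C));
    writeln("struct size        : ", Layout.sizeof);
}
```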
June 29, 2011
On Wed, 29 Jun 2011 10:47:39 -0400, Steven Schveighoffer wrote:

> On Wed, 29 Jun 2011 01:42:37 -0400, Ali Çehreli <acehreli@yahoo.com> wrote:
>> On a related note, why doesn't __traits(classInstanceSize, T) consider the padding bytes? If I'm not mistaken struct sizes do include the padding bytes. The following class has the very odd size of 17! Is that by design?
> 
> Probably it's because classes are intended to occupy their own memory block, so the pad doesn't matter, you can't put two in the same block anyways.

You also say below that a block is 16 bytes, but the following program seems to be able to put objects at 24-byte intervals.

> 
> If you want to do it, I think it should be possible.

Actually I don't care. :) I am just trying to find out the right way of placing objects in memory. I am just trying to cover all aspects of it.

> But I don't think
> it would work great if you put two classes in the same block without
> significant compiler/GC changes.
> 
> That being said, given that all GC blocks are at least 16-byte aligned, the same should hold true for any class instance.

I am able to place class objects at 24-byte intervals with the following program, which indicates that some objects straddle block boundaries. Is that fine? Is the following method sufficient? Am I making things unnecessarily complicated?

import std.stdio;
import core.memory;
import std.conv;

class C
{
    char c;

    this(char c)
    {
        this.c = c;
    }
}

T * alignedAddress(T)(T * candidate)
{
    return cast(T*)((cast(size_t)candidate + T.alignof - 1)
                    / T.alignof * T.alignof);
}

void * alignedAddress(T)(void * candidate)
{
    return alignedAddress(cast(T*)candidate);
}

size_t alignedSpace(T)()
{
    static if (is (T == class)) {
        size_t size = __traits(classInstanceSize, T);
    } else {
        size_t size = T.sizeof;
    }

    return cast(size_t)alignedAddress(cast(T*)size);
}

void main()
{
    size_t size = __traits(classInstanceSize, C);
    size_t alignedSize = alignedSpace!C();

    writeln("instance size: ", size);
    writeln("alignment    : ", C.alignof);
    writeln("aligned size : ", alignedSize);

    /* Inputs to use to make objects */
    string inputs = "hello world";

    /* Raw memory */
    void * chunk = GC.calloc(inputs.length * alignedSize);

    C[] variables;

    /* Make objects */
    foreach (i, c; inputs) {
        void * candidateAddress = chunk + (i * alignedSize);

        /* I don't think anymore that I need to call alignedAddress()
         * below. Still... */
        void * objectAddress = alignedAddress!C(candidateAddress);

        writefln("candidate: %s   addr: %s",
                 candidateAddress, objectAddress);

        variables ~= emplace!C(objectAddress[0..size], c);
        /*
         * It might make more sense to pass
         * objectAddress[0..alignedSize] above, but I can't. (Please
         * see http://d.puremagic.com/issues/show_bug.cgi?id=6204 )
         */
    }

    /* Use the objects */
    foreach (v; variables) {
        writeln(v.c, '.');
    }
}

Here is an output:

instance size: 17
alignment    : 8
aligned size : 24
candidate: 7F11467A2E00   addr: 7F11467A2E00
candidate: 7F11467A2E18   addr: 7F11467A2E18
candidate: 7F11467A2E30   addr: 7F11467A2E30
candidate: 7F11467A2E48   addr: 7F11467A2E48
candidate: 7F11467A2E60   addr: 7F11467A2E60
candidate: 7F11467A2E78   addr: 7F11467A2E78
candidate: 7F11467A2E90   addr: 7F11467A2E90
candidate: 7F11467A2EA8   addr: 7F11467A2EA8
candidate: 7F11467A2EC0   addr: 7F11467A2EC0
candidate: 7F11467A2ED8   addr: 7F11467A2ED8
candidate: 7F11467A2EF0   addr: 7F11467A2EF0
h.
e.
l.
l.
o.
 .
w.
o.
r.
l.
d.

> 
> -Steve

Ali
June 29, 2011
On Wed, 29 Jun 2011 13:29:07 -0400, Ali Çehreli <acehreli@yahoo.com> wrote:

> On Wed, 29 Jun 2011 10:47:39 -0400, Steven Schveighoffer wrote:
>
>> On Wed, 29 Jun 2011 01:42:37 -0400, Ali Çehreli <acehreli@yahoo.com>
>> wrote:
>>> On a related note, why doesn't __traits(classInstanceSize, T) consider
>>> the padding bytes? If I'm not mistaken struct sizes do include the
>>> padding bytes. The following class has the very odd size of 17! Is that
>>> by design?
>>
>> Probably it's because classes are intended to occupy their own memory
>> block, so the pad doesn't matter, you can't put two in the same block
>> anyways.
>
> You also say a block is 16 bytes below but the following program seems to
> be able to put objects at 24 byte intervals.

Yes, it depends on the contents of the class.  All I was saying is that the GC already aligns all blocks to 16-byte boundaries, so the language simply doesn't worry about class alignment and padding.

>> But I don't think
>> it would work great if you put two classes in the same block without
>> significant compiler/GC changes.
>>
>> That being said, given that all GC blocks are at least 16-byte aligned,
>> the same should hold true for any class instance.
>
> I am able to place class objects at 24 byte intervals with the following
> program, which indicates that some objects straddle block boundaries. Is
> that fine? Is the following method sufficient. Am I doing things
> unnecessarily complicated?

A "block" is not a 16-byte space, it's simply a chunk of memory.  You are not straddling block boundaries, because you have everything in one block.
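
A minimal sketch of that point, using GC.addrOf and GC.sizeOf from core.memory. The helper sameBlock is a hypothetical name; the 24-byte stride and the count of 11 are taken from the program quoted above. The 16-byte check reflects the alignment described in this thread and is an assumption about the current GC implementation.

```d
import core.memory;
import std.stdio;

/* Hypothetical helper: true if p points into the GC block that
 * starts at (or contains) base. */
bool sameBlock(void* base, void* p)
{
    return GC.addrOf(p) == GC.addrOf(base);
}

void main()
{
    enum stride = 24;   /* aligned size from the program above */
    enum count = 11;    /* length of "hello world" */

    void* chunk = GC.calloc(count * stride);

    /* Assumption from the thread: GC blocks start on 16-byte boundaries. */
    assert(cast(size_t)chunk % 16 == 0);

    /* Every object address lies inside the one allocated block. */
    foreach (i; 0 .. count) {
        assert(sameBlock(chunk, chunk + i * stride));
    }

    writeln("requested: ", count * stride, "  block size: ", GC.sizeOf(chunk));
}
```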

Note that your test is not very taxing on alignment -- char does not need to be aligned.  I would test with reals and also try and test the hidden elements of classes -- the vtable (i.e. virtual functions), interfaces, and the monitor object (i.e. try using synchronized) to ensure everything is working right.
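
One way to sketch such a test: for a class type, .alignof reports the alignment of the reference, not of the instance, so a class with a real member may need a larger stride than alignedSpace() computes. The example below uses std.traits.classInstanceAlignment; the class R and the helper roundUp are hypothetical, and the printed values depend on the platform.

```d
import std.stdio;
import std.traits : classInstanceAlignment;

/* Hypothetical test class with a member that has strict alignment. */
class R
{
    real r;
    char c;
}

/* Hypothetical helper: round n up to a multiple of alignment. */
size_t roundUp(size_t n, size_t alignment)
{
    return (n + alignment - 1) / alignment * alignment;
}

void main()
{
    enum size = __traits(classInstanceSize, R);
    enum instanceAlign = classInstanceAlignment!R;

    /* The instance must be at least as aligned as its real member. */
    static assert(instanceAlign >= real.alignof);

    writeln("R.alignof (reference)          : ", R.alignof);
    writeln("instance alignment             : ", instanceAlign);
    writeln("instance size                  : ", size);
    writeln("stride using R.alignof         : ", roundUp(size, R.alignof));
    writeln("stride using instance alignment: ", roundUp(size, instanceAlign));
}
```

If the two strides differ on a given platform, spacing objects by the first one would leave some instances misaligned for their real member.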

Note that having data be misaligned may not result in failure, it may just result in slower execution.

Also note that you must manually destroy these classes via clear -- the GC only destroys the block, and assumes one class instance per block.  And in fact, you are not setting the finalize flag on the block on creation, so it wouldn't call any destructors.
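
A minimal sketch of manual destruction for an emplaced class instance. In current D releases the function is named destroy (it was called clear at the time of this thread). The class Tracked and the helper make are hypothetical names used only for this illustration.

```d
import core.memory;
import std.conv : emplace;
import std.stdio;

/* Hypothetical class that counts live instances. */
class Tracked
{
    static int alive;   /* thread-local counter */

    this()  { ++alive; }
    ~this() { --alive; }
}

/* Hypothetical helper: construct a Tracked in raw GC memory.
 * No FINALIZE attribute is set, so the GC will not run ~this(). */
Tracked make()
{
    enum size = __traits(classInstanceSize, Tracked);
    void* chunk = GC.calloc(size);
    return emplace!Tracked(chunk[0 .. size]);
}

void main()
{
    Tracked t = make();
    assert(Tracked.alive == 1);

    /* The destructor runs only because it is requested explicitly. */
    destroy(t);
    assert(Tracked.alive == 0);

    writeln("live instances after destroy: ", Tracked.alive);
}
```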

-Steve
June 29, 2011
On Wed, 29 Jun 2011 13:43:51 -0400, Steven Schveighoffer wrote:

>> You also say a block is 16 bytes below but the following program seems to be able to put objects at 24 byte intervals.
> 
> Yes, it depends on the contents of the class.  All I was saying is that the GC already aligns all blocks to 16-byte boundaries, so the language simply doesn't worry about class alignment and padding.

Right. I was so preoccupied with the padded space of objects that I misread what you said. :)

> Note that your test is not very taxing on alignment -- char does not need to be aligned.  I would test with reals and also try and test the hidden elements of classes -- the vtable (i.e. virtual functions), interfaces, and the monitor object (i.e. try using synchronized) to ensure everything is working right.
> 
> Note that having data be misaligned may not result in failure, it may just result in slower execution.

Related to that, and if I remember correctly, on older hardware accessing an int through an odd address used to fail; I think the CPU rejected it. That doesn't seem to be the case on my current system. No complaints:

import std.stdio;

void main()
{
    enum offset = 1;
    ubyte[int.sizeof + offset] chunk;

    int * p = cast(int*)(chunk.ptr + offset);
    *p = 0x12345678;

    foreach (b; chunk) {
        writef("%02x ", b);
    }
    writeln();

    assert(*p == 0x12345678);
}

The output:

00 78 56 34 12

> Also note that you must manually destroy these classes via clear -- the GC only destroys the block, and assumes one class instance per block. And in fact, you are not setting the finalize flag on the block on creation, so it wouldn't call any destructors.
> 
> -Steve

Thank you very much for all the heads up. Luckily we don't need to deal with these issues in daily programming.

Ali