Fixed-size OutBuffer that doesn't call .resize() automatically? (for database page buffers)
Aug 15, 2022
Gavin Ray
August 15, 2022

I'm learning about databases by implementing one from scratch, and decided I'd do it in D since it has the highest-level syntax for low-level code & seemed a natural fit.

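Currently I am trying to implement the "slotted page" structure and storage (serialization/de-serialization from binary data on-disk).

"Slotted pages" are a data structure that stores a page header, an array of tuple slots (each recording a tuple's offset and size), and the tuples themselves.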

I tried to implement this using OutBuffer, but found that the buffer resized itself to greater than PAGE_SIZE bytes automatically =(

Is there an alternative to OutBuffer or a method you can call to set a byte limit on the resizing?

RUNNABLE DEMO: https://ldc.godbolt.org/z/ev78xos5b

import std.outbuffer : OutBuffer;
import std.stdio : writeln;

enum PAGE_SIZE = 4096;
enum HEADER_SIZE = (uint.sizeof) * 6;
enum TUPLE_SLOT_SIZE = (uint.sizeof) * 2;

struct TupleSlot
{
    uint offset;
    uint size;
}

struct Tuple
{
    uint size;
    ubyte[] data;
}

struct PageHeader
{
    uint logStorageNumber = 0;
    uint pageId = 0;
    uint prevPageId = 0;
    uint nextPageId = 0;
    uint freeSpacePointer = PAGE_SIZE;
    uint tupleCount = 0;
    TupleSlot[] slots;
}

struct SlottedPage
{
    PageHeader header;
    Tuple[] tuples;
    ubyte[PAGE_SIZE] buffer;

    void insertTuple(Tuple tuple)
    {
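        // Space the tuple takes on the page: its uint length prefix plus its payload.
        // (tuple.sizeof is the size of the Tuple struct itself, not of the data it holds.)
        immutable uint stored = cast(uint)(uint.sizeof + tuple.data.length);
        tuples ~= tuple;
        header.slots ~= TupleSlot(header.freeSpacePointer, stored);
        header.tupleCount++;
        header.freeSpacePointer -= stored;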
    }

    void serialize(OutBuffer buf)
    {
        with (header)
        {
            buf.write(pageId);
            buf.write(logStorageNumber);
            buf.write(prevPageId);
            buf.write(nextPageId);
            buf.write(freeSpacePointer);
            buf.write(tupleCount);
        }

        foreach (TupleSlot slot; header.slots)
        {
            buf.write(slot.offset);
            buf.write(slot.size);
        }

        buf.fill0(header.freeSpacePointer);

        foreach (Tuple tuple; tuples)
        {
            buf.write(tuple.size);
            buf.write(tuple.data);
        }
    }
}

void main()
{
    OutBuffer buffer = new OutBuffer();
    buffer.reserve(4096);

    auto page = new SlottedPage(PageHeader());
    foreach (i; 0 .. 10)
    {
        OutBuffer buf = new OutBuffer();
        buf.write(i);
        page.insertTuple(Tuple(cast(uint) buf.data.length, buf.data));
    }

    page.serialize(buffer);
    // Writes 8206
    writeln("Buffer size is: ", buffer.data.length);
}
August 15, 2022

On Monday, 15 August 2022 at 20:51:07 UTC, Gavin Ray wrote:

> Is there an alternative to OutBuffer or a method you can call to set a byte limit on the resizing?

Are you looking for a circular buffer?

https://code.dlang.org/packages/ringbuffer

August 19, 2022

On Monday, 15 August 2022 at 22:47:21 UTC, frame wrote:

> On Monday, 15 August 2022 at 20:51:07 UTC, Gavin Ray wrote:
>
>> Is there an alternative to OutBuffer or a method you can call to set a byte limit on the resizing?
>
> Are you looking for a circular buffer?
>
> https://code.dlang.org/packages/ringbuffer

Thanks for this suggestion, I hadn't considered it.

I discovered 3 ways of doing it:

  1. Calling .toBytes() on an OutBuffer will discard the extra bytes allocated past what was reserved and used. But this will still allocate the memory in the first place I guess (will the compiler optimize this away?)

  2. Copy the OutBuffer class into a new FixedSizeOutBuffer(T) and alter its behavior

  3. Use ubyte[PAGE_SIZE] and manually write like below:

ubyte[] toBytes(uint value)
{
    static ubyte[4] bytes = new ubyte[4];
    bytes[0] = cast(ubyte)(value >> 24);
    bytes[1] = cast(ubyte)(value >> 16);
    bytes[2] = cast(ubyte)(value >> 8);
    bytes[3] = cast(ubyte)(value);
    return bytes;
}

void serialize(out ubyte[PAGE_SIZE] outbuf)
{
    ubyte[] buf;
    reserve(buf, PAGE_SIZE);

    buf ~= toBytes(header.pageId);
    buf ~= toBytes(header.logStorageNumber);
    buf ~= toBytes(header.prevPageId);
    buf ~= toBytes(header.nextPageId);
    buf ~= toBytes(header.freeSpacePointer);
    buf ~= toBytes(header.tupleCount);

    foreach (idx, ref slot; header.slots)
    {
        buf ~= toBytes(slot.offset);
        buf ~= toBytes(slot.size);
    }

    // Skip over free space
    ubyte[] padding = new ubyte[header.freeSpacePointer];
    padding[] = 0;
    buf ~= padding;

    foreach (idx, ref tuple; tuples)
    {
        buf ~= toBytes(tuple.size);
        buf ~= tuple.data;
    }

    move(buf.ptr, outbuf.ptr);
}
August 19, 2022

On Friday, 19 August 2022 at 16:19:04 UTC, Gavin Ray wrote:

> 1. Calling .toBytes() on an OutBuffer will discard the extra bytes allocated past what was reserved and used. But this will still allocate the memory in the first place I guess (will the compiler optimize this away?)

It does allocate when it needs to. It grows but never shrinks again.
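To make the difference between the two lengths concrete, here is a minimal sketch. It assumes the current Phobos std.outbuffer, where data is the public backing array and toBytes() returns only the bytes written so far. How much reserve over-allocates is an implementation detail, so the sketch only checks a lower bound on the capacity.

```d
import std.outbuffer : OutBuffer;
import std.stdio : writeln;

void main()
{
    auto buf = new OutBuffer();
    buf.reserve(4096);         // ask for room for at least 4096 more bytes

    buf.write(cast(uint) 42);  // only 4 bytes are actually written

    // `data` is the whole backing array, so its length is the capacity.
    writeln("capacity: ", buf.data.length);
    // `toBytes()` is the slice of `data` up to the write offset.
    writeln("written:  ", buf.toBytes().length);

    assert(buf.data.length >= 4096);
    assert(buf.toBytes().length == uint.sizeof);
}
```

So the number printed in the original demo is probably the capacity of the backing array rather than the number of bytes serialized into it.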

> 2. Copy the OutBuffer class into a new FixedSizeOutBuffer(T) and alter its behavior
>
> 3. Use ubyte[PAGE_SIZE] and manually write like below:

> static ubyte[4] bytes = new ubyte[4];

This still looks unnecessary: you are allocating thread-local memory. Just use a plain static array.
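One way to read that advice, as a sketch rather than the only option: return the four bytes by value as a ubyte[4], so there is neither static storage nor a heap allocation. The function keeps the name toBytes from the post above. The comparison against std.bitmanip.nativeToBigEndian is only there to show that Phobos already offers the same conversion.

```d
import std.bitmanip : nativeToBigEndian;

// Returns the four bytes by value, most significant byte first.
ubyte[4] toBytes(uint value)
{
    ubyte[4] bytes; // plain local static array: no `static`, no `new`
    bytes[0] = cast(ubyte)(value >> 24);
    bytes[1] = cast(ubyte)(value >> 16);
    bytes[2] = cast(ubyte)(value >> 8);
    bytes[3] = cast(ubyte) value;
    return bytes;
}

void main()
{
    ubyte[] buf;

    // Keep the result in a local, then append a slice of it.
    auto be = toBytes(0x01020304);
    buf ~= be[];

    assert(buf == [1, 2, 3, 4]);
    // Phobos has the same conversion built in.
    assert(toBytes(0x01020304) == nativeToBigEndian(cast(uint) 0x01020304));
}
```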

> foreach (idx, ref slot; header.slots)

No need to take slot by ref here. It may prevent compiler optimizations.

> // Skip over free space
> ubyte[] padding = new ubyte[header.freeSpacePointer];
> padding[] = 0;
> buf ~= padding;

Unnecessary, the initial value of ubyte is 0 and allocation is done automatically. Just set the slice length:

buf.length += header.freeSpacePointer;
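A small sketch of why growing the length is enough: the elements added to a ubyte[] are default-initialized, and ubyte.init is 0. The padTo helper is hypothetical and not from the thread. It covers the case where freeSpacePointer is meant as an absolute offset into the page rather than a byte count, which is an assumption about the intended layout.

```d
// Hypothetical helper: zero-pad `buf` until it is exactly `offset` bytes long.
void padTo(ref ubyte[] buf, size_t offset)
{
    assert(offset >= buf.length, "buffer is already past that offset");
    buf.length = offset; // new elements are ubyte.init, i.e. 0
}

void main()
{
    ubyte[] buf = [1, 2, 3, 4];

    buf.length += 4; // relative growth, as in the line above
    assert(buf == [1, 2, 3, 4, 0, 0, 0, 0]);

    padTo(buf, 16);  // absolute growth up to a given offset
    assert(buf.length == 16);
    foreach (b; buf[4 .. $])
        assert(b == 0);
}
```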

> foreach (idx, ref tuple; tuples)

Again, no need for ref.

> move(buf.ptr, outbuf.ptr);

Not sure what that is. A simple

outbuf[0 .. buf.length] = buf; // or whatever position to write in outbuf

should do it too.
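Putting the slice-copy idea together with the original question, a fixed-size writer could look roughly like the sketch below. PageWriter, put and seek are hypothetical names, not part of Phobos. The sketch assumes big-endian integers as in toBytes above, and it throws on overflow instead of growing.

```d
import std.bitmanip : nativeToBigEndian;
import std.exception : assertThrown, enforce;

enum PAGE_SIZE = 4096;

// Hypothetical fixed-capacity writer: copies into a caller-owned page and
// refuses to write past its end instead of resizing.
struct PageWriter
{
    ubyte[] page;  // slice over the fixed-size page
    size_t offset; // next write position

    void put(const(ubyte)[] bytes)
    {
        enforce(bytes.length <= page.length - offset, "page overflow");
        page[offset .. offset + bytes.length] = bytes[];
        offset += bytes.length;
    }

    void put(uint value)
    {
        ubyte[4] be = nativeToBigEndian(value);
        put(be[]);
    }

    // Jump forward to an absolute position; skipped bytes are left untouched.
    void seek(size_t pos)
    {
        enforce(pos >= offset && pos <= page.length, "bad seek position");
        offset = pos;
    }
}

void main()
{
    ubyte[PAGE_SIZE] page; // zero-initialized, never resized
    auto w = PageWriter(page[]);

    w.put(cast(uint) 1);          // e.g. a header field
    w.seek(PAGE_SIZE - 4);        // skip over the free space
    w.put(cast(uint) 0xDEADBEEF); // tuple bytes at the end of the page

    assert(w.offset == PAGE_SIZE);
    assert(page[0 .. 4] == [0, 0, 0, 1]);
    assert(page[$ - 4 .. $] == [0xDE, 0xAD, 0xBE, 0xEF]);

    // One more byte does not fit, so put throws instead of growing.
    ubyte[1] extra;
    assertThrown(w.put(extra[]));
}
```

Throwing is just one possible policy. Returning a bool or asserting would also work, depending on how the caller wants to detect a full page.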

August 22, 2022

Ahh, thanks a ton for these pointers, much appreciated!