float[] → Vertex[] – decreases performance by 1000%
July 24, 2012
I am writing a game engine. I was using a float[] array to store my vertices, which worked well, but I have to send more and more UV coordinates (and other information) that needn't be stored as `float`s, so I moved from a float array to a Vertex array:
https://github.com/Dav1dde/BraLa/blob/master/brala/dine/builder/tessellator.d#L30 


align(1) struct Vertex {
    float x;
    float y;
    float z;
    float nx;
    float ny;
    float nz;
    float u_terrain;
    float v_terrain;
    float u_biome;
    float v_biome;
}

Everything is still a float, so it's easier. Nothing wrong with that, right? Well, this change decreases my performance by 1000%. My frame time rises from ~12 ms per frame to ~120 ms per frame. I tried to find the bottleneck with `perf`, but got no results (the time is not spent in the game/engine).

The commit:
https://github.com/Dav1dde/BraLa/commit/02a37a0e46f195f5a46404747d659d26490e6c32

I hope you can see something wrong. I have no idea!
July 24, 2012
David:

> align(1) struct Vertex {
>     float x;
>     float y;
>     float z;
>     float nx;
>     float ny;
>     float nz;
>     float u_terrain;
>     float v_terrain;
>     float u_biome;
>     float v_biome;
> }
>
> Everything is still a float, so it's easier. Nothing wrong with that, right? Well, this change decreases my performance by 1000%.

Aligning floats to 1 byte doesn't seem like a good idea. Try removing the align(1).

Bye,
bearophile
July 24, 2012
On 24.07.2012 20:57, bearophile wrote:
> David:
>> Everything is still a float, so it's easier. Nothing wrong with that,
>> right? Well, this change decreases my performance by 1000%.
>
> Aligning floats to 1 byte doesn't seem like a good idea. Try removing the
> align(1).
>
> Bye,
> bearophile

This makes no difference.
July 24, 2012
On Tue, Jul 24, 2012 at 08:57:08PM +0200, bearophile wrote:
> David:
> 
> >align(1) struct Vertex {
> >    float x;
> >    float y;
> >    float z;
> >    float nx;
> >    float ny;
> >    float nz;
> >    float u_terrain;
> >    float v_terrain;
> >    float u_biome;
> >    float v_biome;
> >}
> >
> >Everything is still a float, so it's easier. Nothing wrong with that, right? Well, this change decreases my performance by 1000%.
> 
> Aligning floats to 1 byte doesn't seem like a good idea. Try removing
> the align(1).
[...]

I agree. I don't know how the CPU handles misaligned floats, but from what I understand, it will do two loads to fetch the two word-aligned parts of the float, and then assemble it together. This may be what's causing the slowdown.


T

-- 
Little children, little troubles.
July 24, 2012
On 24/07/2012 20:08, David wrote:
> On 24.07.2012 20:57, bearophile wrote:
>> David:
>>> Everything is still a float, so it's easier. Nothing wrong with that,
>>> right? Well, this change decreases my performance by 1000%.
>>
>> Aligning floats to 1 byte doesn't seem like a good idea. Try removing the
>> align(1).
>>
>> Bye,
>> bearophile
>
> This makes no difference.

Could be that your structs are getting default initialised so you will be getting a constructor called for every instance of a Vertex.

This will be a lot slower than a float array.
Try void initialising your Vertex arrays.

http://dlang.org/declaration.html

See the bit Void Initializations near the bottom.

Also make sure that you are passing fixed size arrays by reference.
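
As a minimal sketch of the two suggestions above (void initialization and passing fixed-size arrays by reference), assuming a Vertex struct with the same ten float fields as in the original post. The function name `fill` and the array size 1024 are made up for illustration and are not taken from the BraLa code:

```d
import std.array : uninitializedArray;
import std.math : isNaN;

struct Vertex {
    float x, y, z;
    float nx, ny, nz;
    float u_terrain, v_terrain;
    float u_biome, v_biome;
}

// Fixed-size arrays are value types in D, so `ref` avoids copying
// the whole 40 KiB buffer on every call. (`fill` is a hypothetical helper.)
void fill(ref Vertex[1024] buf) {
    foreach (i, ref v; buf) {
        v = Vertex(cast(float) i, 0.0f, 0.0f,  // position
                   0.0f, 1.0f, 0.0f,           // normal
                   0.0f, 0.0f,                 // terrain UV
                   0.0f, 0.0f);                // biome UV
    }
}

void main() {
    Vertex a;                  // default-initialized: every float is NaN
    assert(isNaN(a.x));

    Vertex[1024] buf = void;   // void-initialized: contents are garbage until written
    fill(buf);
    assert(buf[3].x == 3.0f);

    // Heap equivalent: allocates without filling each element with Vertex.init.
    auto heap = uninitializedArray!(Vertex[])(1024);
    assert(heap.length == 1024);
}
```

Void initialization only saves the initial fill, so it can only matter if many vertices are created per frame.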

-- 
My enormous talent is exceeded only by my outrageous laziness.
http://www.ssTk.co.uk


July 24, 2012
On Tue, Jul 24, 2012 at 09:08:10PM +0200, David wrote:
> On 24.07.2012 20:57, bearophile wrote:
> >David:
> >>Everything is still a float, so it's easier. Nothing wrong with that, right? Well, this change decreases my performance by 1000%.
> >
> >Aligning floats to 1 byte doesn't seem like a good idea. Try removing the
> >align(1).
> >
> >Bye,
> >bearophile
> 
> This makes no difference.

Hmm. Could this be a GC-related issue?


T

-- 
No! I'm not in denial!
July 24, 2012
> I agree. I don't know how the CPU handles misaligned floats, but from
> what I understand, it will do two loads to fetch the two word-aligned
> parts of the float, and then assemble it together. This may be what's
> causing the slowdown.
>
>
> T
>

Removing the `align(1)` changes nothing, not even 1 ms slower or faster, unfortunately.
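
One possible explanation for seeing no difference, sketched under the assumption that the struct holds only the ten 4-byte floats shown in the original post: the fields already sit at 4-byte offsets with no padding, so the size and field offsets are the same with or without `align(1)`. The names `PackedVertex` and `PlainVertex` are invented for this comparison:

```d
// Same ten float fields, once with align(1) and once without.
align(1) struct PackedVertex {
    float x, y, z;
    float nx, ny, nz;
    float u_terrain, v_terrain;
    float u_biome, v_biome;
}

struct PlainVertex {
    float x, y, z;
    float nx, ny, nz;
    float u_terrain, v_terrain;
    float u_biome, v_biome;
}

// Ten 4-byte floats pack into 40 bytes either way; there is no padding to remove.
static assert(PackedVertex.sizeof == 40);
static assert(PlainVertex.sizeof == 40);

// The last field starts at byte 36 in both layouts.
static assert(PackedVertex.v_biome.offsetof == 36);
static assert(PlainVertex.v_biome.offsetof == 36);

void main() {
    // Nothing to do at run time: the checks above are evaluated by the compiler.
}
```

If the buffer itself starts at a 4-byte-aligned address, as `malloc` results do, every float in the array is then naturally aligned in both variants.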
July 24, 2012
> Could be that your structs are getting default initialised so you will
> be getting a constructor called for every instance of a Vertex.
>
> This will be a lot slower than a float array.
> Try void initialising your Vertex arrays.
>
> http://dlang.org/declaration.html
>
> See the bit Void Initializations near the bottom.
>
> Also make sure that you are passing fixed size arrays by reference.
>

No. The vertices are just created once (with a call to the default ctor) and immediately added to the Vertex*, but they are never instantiated.
July 24, 2012
> Hmm. Could this be a GC-related issue?

Actually, this could be. They are stored inside a Vertex* array which is allocated with `malloc`. Maybe the GC scans all of the created vertex structs? Could this be?
July 24, 2012
On 24.07.2012 21:46, David wrote:
>> Hmm. Could this be a GC-related issue?
>
> Actually, this could be. They are stored inside a Vertex* array which is
> allocated with `malloc`. Maybe the GC scans all of the created vertex
> structs? Could this be?

    import core.memory;
    GC.disable();

Calling this right at the start of main didn't help, so I guess it's not the GC.
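
For reference, a small sketch of the setup being discussed, assuming a buffer of 1024 vertices (the count and variable names are illustrative, not taken from the engine). Memory obtained from C's `malloc` is not scanned by the D garbage collector unless it is registered with `GC.addRange`, which fits the observation that disabling the GC changed nothing:

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;

struct Vertex {
    float x, y, z;
    float nx, ny, nz;
    float u_terrain, v_terrain;
    float u_biome, v_biome;
}

void main() {
    // Pause automatic collections for the duration of main, as in the post.
    GC.disable();
    scope(exit) GC.enable();

    enum count = 1024; // illustrative size
    auto p = cast(Vertex*) malloc(count * Vertex.sizeof);
    assert(p !is null);
    scope(exit) free(p);

    // Slice over the C heap: bounds-checked access, no GC allocation.
    Vertex[] verts = p[0 .. count];
    verts[0] = Vertex(0, 0, 0,  0, 1, 0,  0, 0,  0, 0);
    assert(verts[0].ny == 1.0f);

    // GC.addRange(p, count * Vertex.sizeof) would only be needed if Vertex
    // held pointers into GC-managed memory; it holds plain floats, so the
    // collector never needs to look at this block.
}
```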