February 05, 2018
I'm using vector operations in my graphics engine for rendering, since low-level raster operations on the GPU are well hidden under layers of API. I'm planning to port the blitter algorithms to DCompute once it becomes more mature, and to create the CPUblit library for general use (it will contain blitter and alpha-blending functions as well as basic drawing ones).

However, since I have to write most of the functions in assembly, I have to write every function multiple times in a hard-to-read format. After spending some time working with vectors, I came up with some suggestions:
- 32-bit and 64-bit vectors have to be supported at the level of loading from memory. The former is very useful in computer graphics.
- There are a lot of low-level operations in SSEn that are either also present in e.g. NEON or can be emulated through a simple function, like unpacking (which I often use for integer promotion).
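
To illustrate both points, here is a minimal sketch in C with SSE2 intrinsics (C is used only for illustration; this is not code from my engine). It assumes a pixel format of four 8-bit channels, so one pixel is exactly a 32-bit load and two pixels are a 64-bit load. The helper names `promote_pixel_u8_to_u16` and `promote_two_pixels_u8_to_u16` are hypothetical. The scalar branch shows how the unpack-based promotion could be emulated on a target without SSE2.

```c
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: promote the four 8-bit channels of ONE pixel
   (a 32-bit load) to four 16-bit lanes. */
static void promote_pixel_u8_to_u16(const uint8_t *src, uint16_t out[4])
{
#if defined(__SSE2__)
    int32_t bits;
    memcpy(&bits, src, sizeof bits);          /* alignment-safe 32-bit read */
    __m128i px = _mm_cvtsi32_si128(bits);     /* 32 bits into the lowest lane, rest zeroed */
    /* Interleaving with zero bytes turns each u8 into a zero-extended u16. */
    __m128i wide = _mm_unpacklo_epi8(px, _mm_setzero_si128());
    _mm_storel_epi64((__m128i *)out, wide);   /* store the low 64 bits: 4 x u16 */
#else
    /* Emulation of the unpack-with-zero promotion with a simple loop. */
    for (int i = 0; i < 4; ++i)
        out[i] = src[i];
#endif
}

/* Hypothetical helper: the same promotion for TWO pixels (a 64-bit load),
   producing eight 16-bit lanes. */
static void promote_two_pixels_u8_to_u16(const uint8_t *src, uint16_t out[8])
{
#if defined(__SSE2__)
    __m128i px = _mm_loadl_epi64((const __m128i *)src);  /* 64-bit load */
    __m128i wide = _mm_unpacklo_epi8(px, _mm_setzero_si128());
    _mm_storeu_si128((__m128i *)out, wide);              /* 8 x u16 */
#else
    for (int i = 0; i < 8; ++i)
        out[i] = src[i];
#endif
}
```

With built-in 32-bit and 64-bit vector loads and a portable unpack, both branches of each function could collapse into a single implementation instead of one version per instruction set.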