July 07, 2015
On Tuesday, 7 July 2015 at 14:02:53 UTC, David Nadlinger wrote:
> Only in C/C++. In D, they are defined to overflow according to two's complement.
>
>  — David

Thanks for the correction.
July 08, 2015
On Tue, 07 Jul 2015 12:10:07 +0000, ponce wrote:

> Signed indices optimize better in loops because signed overflow is undefined behaviour.

one of the reasons i dropped C. luckily, in D it's defined.

July 08, 2015
On Tue, 07 Jul 2015 15:51:13 +1200, Rikki Cattermole wrote:

> Canvas is definitely and 100% out of scope, however. It's a great idea and something we need long term, just not something I can do right now. I also need to add that to the specification document as follow-on work that is out of scope.

so what primitives do you want to include? will there be line joints of various types? wide lines (more than 1px wide)? will it allow antialiasing, and if yes, will it be at least of the quality of Anti-Grain Geometry? etc, etc.

do you already see why i believe that canvas is out of scope? it's simply too big a task, and it should be done separately, using the image lib as a foundation.

July 08, 2015
On Tue, 07 Jul 2015 18:54:45 +1000, Manu via Digitalmars-d wrote:

>> std.zlib needs to stop being crap. i.e. it should be thrown away and completely rewritten. i did that several times, including a pure D inflate and compress/decompress that work on byte streams. never used std.zlib as-is.
> 
> What's wrong with zlib? Surely it's possible to produce a D interface
> for zlib that's nice?
> Thing is, zlib is possibly the single most popular library in the world,
> and it gets way more testing, and improvements + security fixes from
> time to time. Why wouldn't you want to link to the real thing?

nothing wrong with *zlib*, it's *std.zlib* that sux. but not the `etc.c.zlib` interface, which i used in some of my stream codecs.

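for the curious, using the raw `etc.c.zlib` binding directly looks roughly like this (a from-memory sketch, untested, and the function name is mine):

```d
import etc.c.zlib;

// decompress a zlib stream into a pre-allocated buffer.
// returns the number of bytes written, or -1 on error.
ptrdiff_t inflateInto (const(ubyte)[] src, ubyte[] dst) {
  z_stream zs;
  if (inflateInit(&zs) != Z_OK) return -1;
  scope(exit) inflateEnd(&zs);
  zs.next_in = cast(ubyte*)src.ptr;
  zs.avail_in = cast(uint)src.length;
  zs.next_out = dst.ptr;
  zs.avail_out = cast(uint)dst.length;
  // one-shot: the whole input is available, so Z_FINISH is fine here
  return (inflate(&zs, Z_FINISH) == Z_STREAM_END) ? cast(ptrdiff_t)zs.total_out : -1;
}
```
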
July 08, 2015
On Tue, 07 Jul 2015 15:54:15 +1200, Rikki Cattermole wrote:

> In that case, ketmar is now in charge of writing a new std.compression module for Phobos! Since he already knows these algos well.

i don't think that i'm ready to code for phobos. coding style can be fixed with dfmt (yet it will require backporting patches from other people, grrrr), but no tool can fix the lack of documentation and tests. and i really HAET writing those. the same goes for writing API parts that should be there "for completeness" but have no use for me.

so while i can build a draft, some other dedicated person must do all the other work. and if there is such a dedicated person, that person can do it without my draft shitcode, for sure.

p.s. writing an xpath and css selectors engine for Adam's "dom.d" is much more fun, as i can stop caring about "api completeness", "comprehensive documentation" and so on. if my engine can do more than the current "dom.d" engine does, Adam will be happy to merge it, i believe. ;-)

July 08, 2015
On Monday, 6 July 2015 at 13:48:53 UTC, Rikki Cattermole wrote:
>
> Please destroy!
>

You asked for it! :)

As a reference to a library that is used to handle images on a professional level (VFX industry), I'd encourage you to look at the feature set and interfaces of OpenImageIO. Sure, it's a big library and some of it is definitely out of scope for what you try to accomplish (image tile caching and texture sampling, obviously).

Yet, there are some features I specifically want to mention here to challenge the scope of your design:

- arbitrary channel layouts in images: this is a big one. You mention 3D engines as a targeted use case in the specification. 3D rendering is one of the worst offenders when it comes to crazy channel layouts in textures (which are obviously stored as image files). If you have a data texture that requires 2 channels (e.g. uv offsets for texture lookups in shaders or some crazy data tables), its memory layout should also only ever have two channels. Don't expand it to RGB transparently or anything else braindead. Don't change the data type of the pixel values wildly without being asked to do so. The developer most likely has chosen a 16 bit signed integer per channel (or whatever else) for a good reason. Some high end file formats like OpenEXR even allow users to store completely arbitrary channels as well, often with a different per-channel data format (leading to layouts like RGBAZ with an additional mask channel on top). But support for that really bloats image library interfaces. I'd stick with a sane variant of the uncompressed texture formats that the OpenGL specification lists as the target set of supported in-memory image formats. That mostly matches current GPU hardware support and probably will for some time to come.

- padding and memory alignment: depending on the platform, image format and task at hand, you may want the in-memory layout of your image to be padded in various ways. For example, you would want your scanlines and pixel values aligned to certain offsets to make use of SIMD instructions, which often carry alignment restrictions with them. This is one of the reasons why RGB images are sometimes expanded to have a dummy channel between the triplets. Also, aligning the start of each scanline may be important, which introduces a "pitch" between them that is greater than just the storage size of each scanline by itself. Again, this may help speed up image processing (see the sketch after this list).

- subimages: this one may seem obscure, but it happens in a number of common file formats (gif, mng, DDS, probably TIFF and others). Subimages can be, for instance, individual animation frames or precomputed mipmaps. This means that they may have metadata attached to them (e.g. framerate or delay to next frame) or they may come in totally different dimensions (mipmap levels).

- window regions: now this is not quite your average image format feature, but relevant for some use cases. The gist of it is that the image file may define a coordinate system for a whole image frame but only contain actual data within certain regions that do not cover the whole frame. These regions may even extend beyond the defined image frame (used e.g. for VFX image postprocessing to have properly defined pixel values to filter into the visible part of the final frame). Again, the OpenEXR documentation explains this feature nicely. Again, I think this likely is out of scope for this library.

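To make the padding point concrete, here is a minimal sketch of a padded scanline pitch computation (the names are mine, not from any proposed API):

```d
// Round each scanline up to the next multiple of `alignment` bytes
// (e.g. 16 for SSE-friendly rows). `alignment` must be a power of two.
size_t scanlinePitch(size_t width, size_t bytesPerPixel, size_t alignment = 16)
{
    const raw = width * bytesPerPixel;
    return (raw + alignment - 1) & ~(alignment - 1);
}

// The padded image buffer is then pitch * height bytes, and pixel (x, y)
// starts at byte offset y * pitch + x * bytesPerPixel.
```
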
My first point also leads me to this criticism:

- I do not see a way to discover the actual data format of a PNG file through your loader. Is it 8 bit palette-based, 8 bits per pixel or 16 bits per pixel? Especially the latter should not be transparently converted to 8 bits per pixel if encountered, because it is a lossy transformation. As I see it right now you have to know the pixel format up front to instantiate the loader. I consider that bad design. You can only have true knowledge of the file contents after the image header has been parsed. The same is generally true of most actually useful image formats out there. (See the sketch after this list for the kind of probing interface I mean.)

- Could support for image data alignment be added by defining a new ImageStorage subclass? The actual in-memory data is not exposed to direct access, is it? Access to the raw image data would be preferable for those cases where you know exactly what you are doing. Going through per-pixel access functions for large image regions is going to be dreadfully slow in comparison to what can be achieved with proper processing/filtering code.

- Also, uploading textures to the GPU requires passing raw memory blocks and a format description of sorts to the 3D API. Being required to slowly copy the image data in question into a temporary buffer for this process is not an adequate solution.

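To illustrate the PNG point above, this is the kind of probe-then-decide flow I mean; every name here is hypothetical, not something your library provides:

```d
enum ChannelType { uint8, uint16, float32 }

struct ImageInfo
{
    size_t width, height;
    size_t channelCount;
    ChannelType channelType;
    bool paletted;
}

// Hypothetical: parses only the header, touching no pixel data.
ImageInfo probePNG(const(ubyte)[] fileData);

void loadWisely(const(ubyte)[] fileData)
{
    auto info = probePNG(fileData);
    if (info.channelType == ChannelType.uint16)
    {
        // instantiate a 16-bit loader; converting to 8 bits would be lossy
    }
    else
    {
        // 8 bits per channel is enough here
    }
}
```
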
Let me know what you think!
July 09, 2015
On 9/07/2015 6:07 a.m., "Gregor Mückl" <gregormueckl@gmx.de> wrote:
> On Monday, 6 July 2015 at 13:48:53 UTC, Rikki Cattermole wrote:
>>
>> Please destroy!
>>
>
> You asked for it! :)
>
> As a reference to a library that is used to handle images on a
> professional level (VFX industry), I'd encourage you to look at the
> feature set and interfaces of OpenImageIO. Sure, it's a big library and
> some of it is definitely out of scope for what you try to accomplish
> (image tile caching and texture sampling, obviously).
>
> Yet, there are some features I specifically want to mention here to
> challenge the scope of your design:
>
> - arbitrary channel layouts in images: this is a big one. You mention 3D
> engines as a targeted use case in the specification. 3D rendering is one
> of the worst offenders when it comes to crazy channel layouts in
> textures (which are obviously stored as image files). If you have a data
> texture that requires 2 channels (e.g. uv offsets for texture lookups in
> shaders or some crazy data tables), its memory layout should also only
> ever have two channels. Don't expand it to RGB transparently or anything
> else braindead. Don't change the data type of the pixel values wildly
> without being asked to do so. The developer most likely has chosen a 16
> bit signed integer per channel (or whatever else) for a good reason.
> Some high end file formats like OpenEXR even allow users to store
> completely arbitrary channels as well, often with a different
> per-channel data format (leading to layouts like RGBAZ with an
> additional mask channel on top). But support for that really bloats
> image library interfaces. I'd stick with a sane variant of the
> uncompressed texture formats that the OpenGL specification lists as the
> target set of supported in-memory image formats. That mostly matches
> current GPU hardware support and probably will for some time to come.

As long as the color implementation matches isColor from std.experimental.color, then it's a color. I do not handle that :)
The rest of how it maps in memory is defined by the image storage types. Any image loader/exporter can use any as long as you specify it via a template argument *currently*.

> - padding and memory alignment: depending on the platform, image format
> and task at hand you may want the in-memory layout of your image to be
> padded in various ways. For example, you would want your scanlines and
> pixel values aligned to certain offsets to make use of SIMD instructions
> which often carry alignment restrictions with them. This is one of the
> reasons why RGB images are sometimes expanded to have a dummy channel
> between the triplets. Also, aligning the start of each scanline may be
> important, which introduces a "pitch" between them that is greater than
> just the storage size of each scanline by itself. Again, this may help
> speed up image processing.

Image storage type implementation, not my problem. They can be added later to support padding etc.

> - subimages: this one may seem obscure, but it happens in a number
> of common file formats (gif, mng, DDS, probably TIFF and others).
> Subimages can be - for instance - individual animation frames or
> precomputed mipmaps. This means that they may have metadata attached to
> them (e.g. framerate or delay to next frame) or they may come in totally
> different dimensions (mipmap levels).

Ahhh, this. I can add it fairly easily: a struct that is a sub-image of another. Any extra data would have to be alias this'd.

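Roughly what I have in mind, as a sketch (the getPixel name is hypothetical):

```d
// A sub-image view: forwards pixel access to the parent with an offset,
// and carries per-frame metadata (e.g. a gif frame delay) alongside.
struct SubImage(Image)
{
    Image* parent;
    size_t offsetX, offsetY;
    size_t width, height;
    size_t delayMsecs; // metadata; anything further would be alias this'd in

    auto getPixel(size_t x, size_t y)
    {
        return parent.getPixel(x + offsetX, y + offsetY);
    }

    // everything not overridden falls through to the parent image
    ref Image base() { return *parent; }
    alias base this;
}
```
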
> - window regions: now this is not quite your average image format feature,
> but relevant for some use cases. The gist of it is that the image file
> may define a coordinate system for a whole image frame but only contain
> actual data within certain regions that do not cover the whole frame.
> These regions may even extend beyond the defined image frame (used e.g.
> for VFX image postprocessing to have properly defined pixel values to
> filter into the visible part of the final frame). Again, the OpenEXR
> documentation explains this feature nicely. Again, I think this likely
> is out of scope for this library.

Ugh, based upon what you said, that is out of scope of the image loader/exporters that I'm writing. Also, right now it is only using unsigned integers for coordinates. I'm guessing they would need to go negative then, if data can lie outside the bounds.
Slightly too specialized for what we need in the general case.

> My first point also leads me to this criticism:
>
> - I do not see a way to discover the actual data format of a PNG file
> through your loader. Is it 8 bit palette-based, 8 bit per pixel or 16
> bits per pixel? Especially the latter should not be transparently
> converted to 8 bits per pixel if encountered because it is a lossy
> transformation. As I see it right now you have to know the pixel format
> up front to instantiate the loader. I consider that bad design. You can
> only have true knowledge of the file contents after the image header
> has been parsed. The same is generally true of most actually useful image
> formats out there.

The reasoning is because this is what I know I can work with. You specify what you want to use, it'll auto convert after that. It makes user code a bit simpler.

However, if you can convince Manu Evans to add some sort of union color type that can hold many different sizes for e.g. RGB, then I'm all for it. Although I would consider this to be a _bad_ feature.

> - Could support for image data alignment be added by defining a new
> ImageStorage subclass? The actual in-memory data is not exposed to
> direct access, is it? Access to the raw image data would be preferable
> for those cases where you know exactly what you are doing. Going through
> per-pixel access functions for large image regions is going to be
> dreadfully slow in comparison to what can be achieved with proper
> processing/filtering code.

I ugh... had this feature once. I removed it because, if you already know the implementation, why not just access it directly?
But if there is a genuine need to get access to it as e.g. void*, then I can do it again.

> - Also, uploading textures to the GPU requires passing raw memory blocks
> and a format description of sorts to the 3D API. Being required to
> slowly copy the image data in question into a temporary buffer for this
> process is not an adequate solution.

Again, as per my previous answer, it was possible, no matter what the image storage type was. But it was hairy and could and would cause bugs in the future. You're probably better off knowing the type and getting access directly to it that way.


Some very good points that I believe definitely needed to be touched upon were had.

I've had a read of OpenImageIO documentation and all I can say is irkkk.
Most of what is in there, with e.g. tiles and reflection-style methods, is outright out of scope, as it is a bit too specialized for my liking. If somebody wants to add it, take a look at the offset support. It was written as an 'extension', like ForwardRange is for ranges.

The purpose of this library is to work more for GUIs and games than anything else. It is meant more for the average programmer than for imaging specialists. That's kinda why I wrote the specification document: so that in the future, if somebody who does know and use these kinds of libraries comes in saying it's awful, they will have to understand that it was out of scope and was done on purpose.
July 09, 2015
On Thursday, 9 July 2015 at 04:09:11 UTC, Rikki Cattermole wrote:
> On 9/07/2015 6:07 a.m., "Gregor Mückl" <gregormueckl@gmx.de> wrote:
>> On Monday, 6 July 2015 at 13:48:53 UTC, Rikki Cattermole wrote:
>>>
>>> Please destroy!
>>>
>>
>> You asked for it! :)
>>
>> As a reference to a library that is used to handle images on a
>> professional level (VFX industry), I'd encourage you to look at the
>> feature set and interfaces of OpenImageIO. Sure, it's a big library and
>> some of it is definitely out of scope for what you try to accomplish
>> (image tile caching and texture sampling, obviously).
>>
>> Yet, there are some features I specifically want to mention here to
>> challenge the scope of your design:
>>
>> - arbitrary channel layouts in images: this is a big one. You mention 3D
>> engines as a targeted use case in the specification. 3D rendering is one
>> of the worst offenders when it comes to crazy channel layouts in
>> textures (which are obviously stored as image files). If you have a data
>> texture that requires 2 channels (e.g. uv offsets for texture lookups in
>> shaders or some crazy data tables), its memory layout should also only
>> ever have two channels. Don't expand it to RGB transparently or anything
>> else braindead. Don't change the data type of the pixel values wildly
>> without being asked to do so. The developer most likely has chosen a 16
>> bit signed integer per channel (or whatever else) for a good reason.
>> Some high end file formats like OpenEXR even allow users to store
>> completely arbitrary channels as well, often with a different
>> per-channel data format (leading to layouts like RGBAZ with an
>> additional mask channel on top). But support for that really bloats
>> image library interfaces. I'd stick with a sane variant of the
>> uncompressed texture formats that the OpenGL specification lists as the
>> target set of supported in-memory image formats. That mostly matches
>> current GPU hardware support and probably will for some time to come.
>
> As long as the color implementation matches isColor from std.experimental.color, then it's a color. I do not handle that :)
> The rest of how it maps in memory is defined by the image storage types. Any image loader/exporter can use any as long as you specify it via a template argument *currently*.
>

Hm... in that case you introduce transparent mappings between user-facing types and the internal mapping which may be lossy in various ways. This works, but the internal type should be discoverable somehow. This leads down a similar road to OpenGL texture formats: they have internal storage formats and there's the host formats to/from which the data is converted when passing back and forth. This adds a lot of complexity and potential for surprises, unfortunately. I'm not entirely sure what to think here.

>> - window regions: now this is not quite your average image format feature,
>> but relevant for some use cases. The gist of it is that the image file
>> may define a coordinate system for a whole image frame but only contain
>> actual data within certain regions that do not cover the whole frame.
>> These regions may even extend beyond the defined image frame (used e.g.
>> for VFX image postprocessing to have properly defined pixel values to
>> filter into the visible part of the final frame). Again, the OpenEXR
>> documentation explains this feature nicely. Again, I think this likely
>> is out of scope for this library.
>
> Ugh, based upon what you said, that is out of scope of the image loader/exporters that I'm writing. Also, right now it is only using unsigned integers for coordinates. I'm guessing they would need to go negative then, if data can lie outside the bounds.
> Slightly too specialized for what we need in the general case.
>

Yes, this is a slightly special use case. I can think of quite a lot of cases where you would want border regions of some kind for what you are doing, but they are all related to rendering and image processing.

>> My first point also leads me to this criticism:
>>
>> - I do not see a way to discover the actual data format of a PNG file
>> through your loader. Is it 8 bit palette-based, 8 bit per pixel or 16
>> bits per pixel? Especially the latter should not be transparently
>> converted to 8 bits per pixel if encountered because it is a lossy
>> transformation. As I see it right now you have to know the pixel format
>> up front to instantiate the loader. I consider that bad design. You can
>> only have true knowledge of the file contents after the image header
> has been parsed. The same is generally true of most actually useful image
>> formats out there.
>
> The reasoning is because this is what I know I can work with. You specify what you want to use, it'll auto convert after that. It makes user code a bit simpler.
>

I can understand your reasoning and this is why libraries like FreeImage make it very simple to get the image data converted to the format you want from an arbitrary input. What I'd like to see is more of an extension of the current mechanism: make it possible to query the data format of the image file. That way, the application can make a wiser decision on the format in which it wants to receive the data, but it always is able to get the data in a format it understands. The format description for the file format would have to be quite complex to cover all possibilities, though. The best that I can come up with is a list of tuples of channel names (as strings) and data type (as enums). Processing those isn't fun, though.

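As a sketch, such a description could look like this (all names hypothetical):

```d
enum ChannelFormat { uint8, uint16, int16, float16, float32 }

struct ChannelInfo
{
    string name;          // "R", "G", "B", "A", "Z", "mask", ...
    ChannelFormat format; // per-channel data type
}

// e.g. an OpenEXR-style RGBAZ layout with an extra mask channel:
immutable ChannelInfo[] layout = [
    ChannelInfo("R", ChannelFormat.float16),
    ChannelInfo("G", ChannelFormat.float16),
    ChannelInfo("B", ChannelFormat.float16),
    ChannelInfo("A", ChannelFormat.float16),
    ChannelInfo("Z", ChannelFormat.float32),
    ChannelInfo("mask", ChannelFormat.uint8),
];
```
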
>> - Could support for image data alignment be added by defining a new
>> ImageStorage subclass? The actual in-memory data is not exposed to
>> direct access, is it? Access to the raw image data would be preferable
>> for those cases where you know exactly what you are doing. Going through
>> per-pixel access functions for large image regions is going to be
>> dreadfully slow in comparison to what can be achieved with proper
>> processing/filtering code.
>
> I ugh... had this feature once. I removed it because, if you already know the implementation, why not just access it directly?
> But if there is a genuine need to get access to it as e.g. void*, then I can do it again.
>
>> - Also, uploading textures to the GPU requires passing raw memory blocks
>> and a format description of sorts to the 3D API. Being required to
>> slowly copy the image data in question into a temporary buffer for this
>> process is not an adequate solution.
>
> Again, as per my previous answer, it was possible, no matter what the image storage type was. But it was hairy and could and would cause bugs in the future. You're probably better off knowing the type and getting access directly to it that way.
>

This is where the abstraction of ImageStorage with several possible implementations becomes iffy. The user is at the loader's mercy to hopefully hand over the right implementation type. I'm not sure I like that idea. This seems inconsistent with making the pixel format the user's choice. Why should the user have a choice over one thing and not the other?



>
> Some very good points that I believe definitely needed to be touched upon were had.
>
> I've had a read of OpenImageIO documentation and all I can say is irkkk.
> Most of what is in there, with e.g. tiles and reflection-style methods, is outright out of scope, as it is a bit too specialized for my liking. If somebody wants to add it, take a look at the offset support. It was written as an 'extension', like ForwardRange is for ranges.
>

I mentioned OpenImageIO as this library is full-featured and very complete in a lot of areas. It shows what it takes to be as flexible as possible regarding the image data that is processed. Take it as a catalog of things to consider, but not as a template.

> The purpose of this library is to work more for GUIs and games than anything else. It is meant more for the average programmer than for imaging specialists. That's kinda why I wrote the specification document: so that in the future, if somebody who does know and use these kinds of libraries comes in saying it's awful, they will have to understand that it was out of scope and was done on purpose.

Having a specification is a good thing and this is why I entered the discussion. Although your specification is still a bit vague in my opinion, the general direction is good. The limitation of the scope looks fine to me and I'm not arguing against that. My point is rather that your design can still be improved to match that scope better.
July 09, 2015
On 10/07/2015 2:07 a.m., "Gregor Mückl" <gregormueckl@gmx.de> wrote:
> On Thursday, 9 July 2015 at 04:09:11 UTC, Rikki Cattermole wrote:
>> On 9/07/2015 6:07 a.m., "Gregor Mückl" <gregormueckl@gmx.de> wrote:
>>> On Monday, 6 July 2015 at 13:48:53 UTC, Rikki Cattermole wrote:
>>>>
>>>> Please destroy!
>>>>
>>>
>>> You asked for it! :)
>>>
>>> As a reference to a library that is used to handle images on a
>>> professional level (VFX industry), I'd encourage you to look at the
>>> feature set and interfaces of OpenImageIO. Sure, it's a big library and
>>> some of it is definitely out of scope for what you try to accomplish
>>> (image tile caching and texture sampling, obviously).
>>>
>>> Yet, there are some features I specifically want to mention here to
>>> challenge the scope of your design:
>>>
>>> - arbitrary channel layouts in images: this is a big one. You mention 3D
>>> engines as a targeted use case in the specification. 3D rendering is one
>>> of the worst offenders when it comes to crazy channel layouts in
>>> textures (which are obviously stored as image files). If you have a data
>>> texture that requires 2 channels (e.g. uv offsets for texture lookups in
>>> shaders or some crazy data tables), its memory layout should also only
>>> ever have two channels. Don't expand it to RGB transparently or anything
>>> else braindead. Don't change the data type of the pixel values wildly
>>> without being asked to do so. The developer most likely has chosen a 16
>>> bit signed integer per channel (or whatever else) for a good reason.
>>> Some high end file formats like OpenEXR even allow users to store
>>> completely arbitrary channels as well, often with a different
>>> per-channel data format (leading to layouts like RGBAZ with an
>>> additional mask channel on top). But support for that really bloats
>>> image library interfaces. I'd stick with a sane variant of the
>>> uncompressed texture formats that the OpenGL specification lists as the
>>> target set of supported in-memory image formats. That mostly matches
>>> current GPU hardware support and probably will for some time to come.
>>
>> As long as the color implementation matches isColor from
>> std.experimental.color, then it's a color. I do not handle that :)
>> The rest of how it maps in memory is defined by the image storage
>> types. Any image loader/exporter can use any as long as you specify it
>> via a template argument *currently*.
>>
>
> Hm... in that case you introduce transparent mappings between
> user-facing types and the internal mapping which may be lossy in various
> ways. This works, but the internal type should be discoverable somehow.
> This leads down a similar road to OpenGL texture formats: they have
> internal storage formats and there's the host formats to/from which the
> data is converted when passing back and forth. This adds a lot of
> complexity and potential for surprises, unfortunately. I'm not entirely
> sure what to think here.

The internal color of an image storage type is well known at compile time.
Now, SwappableImage, which wraps another image type, definitely muddies the water a lot, since it auto-converts from the original format. Which could be, you know, messy.
It's actually the main reason I asked Manu for gain/loss precision functions, for detecting if precision is being changed. Mostly for logging purposes.

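Something along these lines, as a sketch; it assumes a hypothetical bitsPerChannel member on color types, which std.experimental.color may not actually expose:

```d
// True when converting From -> To would drop precision.
enum precisionLoss(From, To) = From.bitsPerChannel > To.bitsPerChannel;

void convertPixel(From, To)(From src, ref To dst)
{
    static if (precisionLoss!(From, To))
    {
        // flag the lossy conversion at compile time (or log it at runtime)
        pragma(msg, "lossy conversion: " ~ From.stringof ~ " -> " ~ To.stringof);
    }
    // ... actual conversion goes here ...
}
```
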
>>> - window regions: now this is not quite your average image format feature,
>>> but relevant for some use cases. The gist of it is that the image file
>>> may define a coordinate system for a whole image frame but only contain
>>> actual data within certain regions that do not cover the whole frame.
>>> These regions may even extend beyond the defined image frame (used e.g.
>>> for VFX image postprocessing to have properly defined pixel values to
>>> filter into the visible part of the final frame). Again, the OpenEXR
>>> documentation explains this feature nicely. Again, I think this likely
>>> is out of scope for this library.
>>
>> Ugh, based upon what you said, that is out of scope of the image
>> loader/exporters that I'm writing. Also, right now it is only using
>> unsigned integers for coordinates. I'm guessing they would need to go
>> negative then, if data can lie outside the bounds.
>> Slightly too specialized for what we need in the general case.
>>
>
> Yes, this is a slightly special use case. I can think of quite a lot of
> cases where you would want border regions of some kind for what you are
> doing, but they are all related to rendering and image processing.

You have convinced me that I need to add a subimage struct which is basically SwappableImage, just with an offset/size different from the original.

>>> My first point also leads me to this criticism:
>>>
>>> - I do not see a way to discover the actual data format of a PNG file
>>> through your loader. Is it 8 bit palette-based, 8 bit per pixel or 16
>>> bits per pixel? Especially the latter should not be transparently
>>> converted to 8 bits per pixel if encountered because it is a lossy
>>> transformation. As I see it right now you have to know the pixel format
>>> up front to instantiate the loader. I consider that bad design. You can
>>> only have true knowledge of the file contents after the image header
>>> has been parsed. The same is generally true of most actually useful image
>>> formats out there.
>>
>> The reasoning is because this is what I know I can work with. You
>> specify what you want to use, it'll auto convert after that. It makes
>> user code a bit simpler.
>>
>
> I can understand your reasoning and this is why libraries like FreeImage
> make it very simple to get the image data converted to the format you
> want from an arbitrary input. What I'd like to see is more of an
> extension of the current mechanism: make it possible to query the data
> format of the image file. That way, the application can make a wiser
> decision on the format in which it wants to receive the data, but it
> always is able to get the data in a format it understands. The format
> description for the file format would have to be quite complex to cover
> all possibilities, though. The best that I can come up with is a list of
> tuples of channel names (as strings) and data type (as enums).
> Processing those isn't fun, though.

The problem here is simple. You must know what color type you are going to be working with. There is no guessing. If you want to change it to match the file loader better, you'll have to load it twice, and then you have to understand the file format internals a bit more.
This is kinda where it gets messy.

But would it be better if you could just parse the headers, so it doesn't initialize the image data? I doubt it would be all that hard; it's just disabling a series of features.

>>> - Could support for image data alignment be added by defining a new
>>> ImageStorage subclass? The actual in-memory data is not exposed to
>>> direct access, is it? Access to the raw image data would be preferable
>>> for those cases where you know exactly what you are doing. Going through
>>> per-pixel access functions for large image regions is going to be
>>> dreadfully slow in comparison to what can be achieved with proper
>>> processing/filtering code.
>>
>> I ugh... had this feature once. I removed it because, if you already
>> know the implementation, why not just access it directly?
>> But if there is a genuine need to get access to it as e.g. void*, then
>> I can do it again.
>>
>>> - Also, uploading textures to the GPU requires passing raw memory blocks
>>> and a format description of sorts to the 3D API. Being required to
>>> slowly copy the image data in question into a temporary buffer for this
>>> process is not an adequate solution.
>>
>> Again, as per my previous answer, it was possible, no matter what the
>> image storage type was. But it was hairy and could and would cause bugs
>> in the future. You're probably better off knowing the type and getting
>> access directly to it that way.
>>
>
> This is where the abstraction of ImageStorage with several possible
> implementations becomes iffy. The user is at the loader's mercy to
> hopefully hand over the right implementation type. I'm not sure I like
> that idea. This seems inconsistent with making the pixel format the
> user's choice.  Why should the user have choice over one thing and not
> the other?

If the image loader uses another image storage type then it is misbehaving. There is no excuse for it.

Anyway, the main thing to understand about this is that if the image loader does not do the initializing, then it would have to resize, and since not all image storage types have to support resizing...

>>
>> Some very good points that I believe definitely needed to be touched
>> upon were had.
>>
>> I've had a read of OpenImageIO documentation and all I can say is irkkk.
>> Most of what is in there, with e.g. tiles and reflection-style methods,
>> is outright out of scope, as it is a bit too specialized for my
>> liking. If somebody wants to add it, take a look at the offset
>> support. It was written as an 'extension' like ForwardRange is for
>> ranges.
>>
>
> I mentioned OpenImageIO as this library is full-featured and very
> complete in a lot of areas. It shows what it takes to be as flexible as
> possible regarding the image data that is processed. Take it as a
> catalog of things to consider, but not as a template.
>
>> The purpose of this library is to work more for GUIs and games than
>> anything else. It is meant more for the average programmer than for
>> imaging specialists. That's kinda why I wrote the specification
>> document: so that in the future, if somebody who does know and use
>> these kinds of libraries comes in saying it's awful, they will have
>> to understand that it was out of scope and was done on purpose.
>
> Having a specification is a good thing and this is why I entered the
> discussion. Although your specification is still a bit vague in my
> opinion, the general direction is good. The limitation of the scope
> looks fine to me and I'm not arguing against that. My point is rather
> that your design can still be improved to match that scope better.

Yeah indeed. Any tips for specification document improvement?
I would love to make it standard for Phobos additions like this.
July 09, 2015
On Thursday, 9 July 2015 at 15:05:12 UTC, Rikki Cattermole wrote:
> On 10/07/2015 2:07 a.m., "Gregor Mückl" <gregormueckl@gmx.de> wrote:
>> On Thursday, 9 July 2015 at 04:09:11 UTC, Rikki Cattermole wrote:
>>> On 9/07/2015 6:07 a.m., "Gregor Mückl" <gregormueckl@gmx.de> wrote:
>>>> [...]
>>>
>>> As long as the color implementation matches isColor from
>>> std.experimental.color, then it's a color. I do not handle that :)
>>> The rest of how it maps in memory is defined by the image storage
>>> types. Any image loader/exporter can use any as long as you specify it
>>> via a template argument *currently*.
>>>
>>
>> Hm... in that case you introduce transparent mappings between
>> user-facing types and the internal mapping which may be lossy in various
>> ways. This works, but the internal type should be discoverable somehow.
>> This leads down a similar road to OpenGL texture formats: they have
>> internal storage formats and there's the host formats to/from which the
>> data is converted when passing back and forth. This adds a lot of
>> complexity and potential for surprises, unfortunately. I'm not entirely
>> sure what to think here.
>
> The internal color of an image storage type is well known at compile time.
> Now, SwappableImage, which wraps another image type, definitely muddies the water a lot, since it auto-converts from the original format. Which could be, you know, messy.
> It's actually the main reason I asked Manu for gain/loss precision functions, for detecting if precision is being changed. Mostly for logging purposes.
>
>>>> [...]
>>>
>>> Ugh, based upon what you said, that is out of scope of the image
>>> loader/exporters that I'm writing. Also, right now it is only using
>>> unsigned integers for coordinates. I'm guessing they would need to go
>>> negative then, if data can lie outside the bounds.
>>> Slightly too specialized for what we need in the general case.
>>>
>>
>> Yes, this is a slightly special use case. I can think of quite a lot of
>> cases where you would want border regions of some kind for what you are
>> doing, but they are all related to rendering and image processing.
>
> You have convinced me that I need to add a subimage struct which is basically SwappableImage, just with an offset/size different from the original.
>
>>>> [...]
>>>
>>> The reasoning is because this is what I know I can work with. You
>>> specify what you want to use, it'll auto convert after that. It makes
>>> user code a bit simpler.
>>>
>>
>> I can understand your reasoning and this is why libraries like FreeImage
>> make it very simple to get the image data converted to the format you
>> want from an arbitrary input. What I'd like to see is more of an
>> extension of the current mechanism: make it possible to query the data
>> format of the image file. That way, the application can make a wiser
>> decision on the format in which it wants to receive the data, but it
>> always is able to get the data in a format it understands. The format
>> description for the file format would have to be quite complex to cover
>> all possibilities, though. The best that I can come up with is a list of
>> tuples of channel names (as strings) and data type (as enums).
>> Processing those isn't fun, though.
>
> The problem here is simple. You must know what color type you are going to be working with. There is no guessing. If you want to change it to match the file loader better, you'll have to load it twice, and then you have to understand the file format internals a bit more.
> This is kinda where it gets messy.
>
> But would it be better if you could just parse the headers, so it doesn't initialize the image data? I doubt it would be all that hard; it's just disabling a series of features.
>
>>>> [...]
>>>
>>> I ugh... had this feature once. I removed it because, if you already
>>> know the implementation, why not just access it directly?
>>> But if there is a genuine need to get access to it as e.g. void*, then
>>> I can do it again.
>>>
>>>> [...]
>>>
>>> Again, as per my previous answer, it was possible, no matter what the
>>> image storage type was. But it was hairy and could and would cause bugs
>>> in the future. You're probably better off knowing the type and getting
>>> access directly to it that way.
>>>
>>
>> This is where the abstraction of ImageStorage with several possible
>> implementations becomes iffy. The user is at the loader's mercy to
>> hopefully hand over the right implementation type. I'm not sure I like
>> that idea. This seems inconsistent with making the pixel format the
>> user's choice.  Why should the user have choice over one thing and not
>> the other?
>
> If the image loader uses another image storage type then it is misbehaving. There is no excuse for it.
>
> Anyway, the main thing to understand about this is that if the image loader does not do the initializing, then it would have to resize, and since not all image storage types have to support resizing...
>
>>>
>>> Some very good points that I believe definitely needed to be touched
>>> upon were had.
>>>
>>> I've had a read of OpenImageIO documentation and all I can say is irkkk.
>>> Most of what is in there, with e.g. tiles and reflection-style methods,
>>> is outright out of scope, as it is a bit too specialized for my
>>> liking. If somebody wants to add it, take a look at the offset
>>> support. It was written as an 'extension' like ForwardRange is for
>>> ranges.
>>>
>>
>> I mentioned OpenImageIO as this library is full-featured and very
>> complete in a lot of areas. It shows what it takes to be as flexible as
>> possible regarding the image data that is processed. Take it as a
>> catalog of things to consider, but not as a template.
>>
>>> The purpose of this library is to work more for GUIs and games than
>>> anything else. It is meant more for the average programmer than for
>>> imaging specialists. That's kinda why I wrote the specification
>>> document: so that in the future, if somebody who does know and use
>>> these kinds of libraries comes in saying it's awful, they will have
>>> to understand that it was out of scope and was done on purpose.
>>
>> Having a specification is a good thing and this is why I entered the
>> discussion. Although your specification is still a bit vague in my
>> opinion, the general direction is good. The limitation of the scope
>> looks fine to me and I'm not arguing against that. My point is rather
>> that your design can still be improved to match that scope better.
>
> Yeah indeed. Any tips for specification document improvement?
> I would love to make it standard for Phobos additions like this.

Like Gregor, I think it's unreasonable to do any automatic conversions at all without being asked to; doing so would greatly reduce the usability of this library.

We need to solve the problem of getting from a file format on disk into a color format in memory. Say I have an image that I have already stored and preprocessed in a format I like, and I want to get it as quickly as possible into a GPU buffer. Similarly, there are many use cases for an image library that do not touch individual pixels at all, so doing any sort of conversion at load time is basically telling those people to look elsewhere, if they care about efficiency.


The most efficient way is a low-level 2-step interface:
1. Open the file and read the headers (open from disk, from a memory buffer, or a byte range)
- At this point, users know the color format and image dimensions, so they can allocate their buffers, check what formats the GPU supports, or otherwise assess whether conversion is needed.
2. Decode into a user-supplied buffer, potentially with color format conversion if requested. This is important.

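A sketch of what that two-step interface could look like; every name here is hypothetical, just to pin down the shape:

```d
enum ColorFormat { rgb8, rgba8, rgb16, rgba16 /* ... */ }

struct ImageHeader
{
    size_t width, height;
    ColorFormat format; // the file's native color format
}

struct Decoder
{
    // step 1: read the headers only; no pixel data is touched yet
    static Decoder open(const(ubyte)[] fileData);
    ImageHeader header() const;

    // step 2: decode into caller-owned memory, converting only if the
    // requested format differs from the native one
    void decodeInto(ubyte[] buffer, ColorFormat requested);
}
```
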
At this point, we have a buffer with known dimensions and color format.
Some very useful manipulations can be achieved without knowing anything about the color format, except for the bit size of a pixel. Examples are flipping, rotations by a multiple of PI/2, cropping, etc.

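For instance, a vertical flip needs only the row pitch, not the color format. A minimal sketch:

```d
import std.algorithm.mutation : swapRanges;

// Flip an image upside down in place, given its raw bytes and row pitch.
void flipVertical(ubyte[] pixels, size_t pitch, size_t height)
{
    foreach (y; 0 .. height / 2)
    {
        auto top    = pixels[y * pitch .. (y + 1) * pitch];
        auto bottom = pixels[(height - 1 - y) * pitch .. (height - y) * pitch];
        swapRanges(top, bottom);
    }
}
```
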
On top of this, one can create all sorts of easy-to-use functions for all the remaining use cases, but this sort of low-level access is really important for any globally useful library. Some users just cannot afford any sort of extra unnecessary copying and/or conversions.

I also think we should be able to load all the meta information on demand. This is extremely valuable, but the use-cases are so diverse that it doesn't make sense to implement more than just discovering all this meta-data and letting users do with it what they will.

The most important thing is to get the interface right and lightweight.
If we can get away with no dependencies then it's even better.