Thread overview
OpenGL: C and D same code - different results
Feb 06, 2007  dyh
Feb 06, 2007  Bill Baxter
Feb 06, 2007  Dave
Feb 07, 2007  Lionello Lunesu
Feb 07, 2007  Bill Baxter
Feb 08, 2007  Lionello Lunesu
Feb 08, 2007  Wolfgang Draxinger
Feb 08, 2007  Wolfgang Draxinger
Feb 08, 2007  Mike Parker
Feb 08, 2007  Wolfgang Draxinger
Feb 08, 2007  Mike Parker
Feb 09, 2007  Wolfgang Draxinger
Feb 12, 2007  dyh
February 06, 2007
Recently I've tried a few OpenGL examples ported from C to D (on win32) and have run into strange problems.

The translated example was built against the same C libs as the original, using the same headers (translated from the Platform SDK). The resulting binary uses the same DLLs as the original. But there are some differences:

1. Behavior
In the original, calls to glIndexi() have *no* effect. In the translation they do.
In the original, calls to glColor3f() have an effect. In the translation they do *not*.
In the original, the initial color is white. In the translation it is kind of brown.

2. Performance
The original example performs noticeably faster than the translated one.

This holds no matter what compiler switches I try (-O, -inline, -release, etc.). The example is extremely simple, and I don't see how there could be any difference in performance. There is no GC use: no memory allocations or array slices at all. In fact there is no array usage, and nothing at all except OpenGL API calls. And it is literally the same library in both the original and the translation.

Here is the code (~300 lines):
original (C): http://paste.dprogramming.com/dpfwrsgw.php
translation (D): http://paste.dprogramming.com/dpu768pr.php

Any tips from OpenGL experts?

February 06, 2007
dyh wrote:
> recently I've tried few OpenGL examples ported from C to D (on win32)
> and have run into strange problems.
> 
> Translated example was built against same C libs as original, using same
> (translated from platform sdk) headers. Resulted binary is using same dlls
> as original. But there are some differences:
> 
> 1. Behavior
> In original calls glIndexi() has *no* effect. In translation has.
> In original calls glColor3f() has effect. In translation has *not*.
> In original initial color is white. In translation it is kind of brown.
> 
> 2. Performance
> original example performs noticeably faster than translated one.
> 
> No matter what compiler switches I've tried (-O, -inline, -release, etc).
> Example is extremely simple, and i do not see any possibilities to have any
> difference in performance. There is no GC used - there are no memory
> allocations array slice at all. Actually there are no array usage. In fact
> there is nothing at all except opengl api calls. And it is literally same
> library in both example and translation.
> 
> Here is code (~300 lines) original (C) http://paste.dprogramming.com/dpfwrsgw.php
> translation (D) http://paste.dprogramming.com/dpu768pr.php.
> 
> Any tips from opengl experts?

I don't know why, but it seems pretty clear from the results you're seeing that the original version didn't actually get the PFD_TYPE_COLORINDEX visual it was asking for, whereas the D version does.  That would explain the performance difference too because a color index visual is probably going to be slow on most modern hardware.

Recent hardware may not even support color index buffers, so it may mean you're getting a fallback software renderer in the D case.
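
If you want to check, something like this (a rough D sketch; it assumes the Win32/WGL declarations from your translated headers are in scope) will tell you what you actually got:

import std.stdio;

void dumpPixelFormat(HDC hdc)
{
    PIXELFORMATDESCRIPTOR pfd;
    int fmt = GetPixelFormat(hdc);
    DescribePixelFormat(hdc, fmt, cast(uint) PIXELFORMATDESCRIPTOR.sizeof, &pfd);

    // RGBA vs. color index tells you which visual the driver really gave you.
    writefln("pixel type: %s",
             pfd.iPixelType == PFD_TYPE_RGBA ? "RGBA" : "color index");

    // PFD_GENERIC_FORMAT without PFD_GENERIC_ACCELERATED means Microsoft's
    // software renderer, which would also explain the speed difference.
    writefln("software fallback: %s",
             (pfd.dwFlags & PFD_GENERIC_FORMAT) &&
             !(pfd.dwFlags & PFD_GENERIC_ACCELERATED) ? "yes" : "no");
}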

Is there some reason why you really need to use a color index visual? You'd be much better off with a true color visual.

--bb
February 06, 2007
The depth and breadth of experience and knowledge in this NG continue to amaze me...

Bill Baxter wrote:
> dyh wrote:
>> recently I've tried few OpenGL examples ported from C to D (on win32)
>> and have run into strange problems.
>>
>> Translated example was built against same C libs as original, using same
>> (translated from platform sdk) headers. Resulted binary is using same dlls
>> as original. But there are some differences:
>>
>> 1. Behavior
>> In original calls glIndexi() has *no* effect. In translation has.
>> In original calls glColor3f() has effect. In translation has *not*.
>> In original initial color is white. In translation it is kind of brown.
>>
>> 2. Performance
>> original example performs noticeably faster than translated one.
>>
>> No matter what compiler switches I've tried (-O, -inline, -release, etc).
>> Example is extremely simple, and i do not see any possibilities to have any
>> difference in performance. There is no GC used - there are no memory
>> allocations array slice at all. Actually there are no array usage. In fact
>> there is nothing at all except opengl api calls. And it is literally same
>> library in both example and translation.
>>
>> Here is code (~300 lines) original (C) http://paste.dprogramming.com/dpfwrsgw.php
>> translation (D) http://paste.dprogramming.com/dpu768pr.php.
>>
>> Any tips from opengl experts?
> 
> I don't know why, but it seems pretty clear from the results you're seeing that the original version didn't actually get the PFD_TYPE_COLORINDEX visual it was asking for, whereas the D version does.  That would explain the performance difference too because a color index visual is probably going to be slow on most modern hardware.
> 
> Recent hardware may not even support color index buffers, so it may mean you're getting a fallback software renderer in the D case.
> 
> Is there some reason why you really need to use a color index visual? You'd be much better off with a true color visual.
> 
> --bb
February 07, 2007
dyh wrote:
> recently I've tried few OpenGL examples ported from C to D (on win32)
> and have run into strange problems.
> 
> Translated example was built against same C libs as original, using same
> (translated from platform sdk) headers. Resulted binary is using same dlls
> as original. But there are some differences:
> 
> 1. Behavior
> In original calls glIndexi() has *no* effect. In translation has.
> In original calls glColor3f() has effect. In translation has *not*.
> In original initial color is white. In translation it is kind of brown.
> 
> 2. Performance
> original example performs noticeably faster than translated one.
> 
> No matter what compiler switches I've tried (-O, -inline, -release, etc).
> Example is extremely simple, and i do not see any possibilities to have any
> difference in performance. There is no GC used - there are no memory
> allocations array slice at all. Actually there are no array usage. In fact
> there is nothing at all except opengl api calls. And it is literally same
> library in both example and translation.
> 
> Here is code (~300 lines) original (C) http://paste.dprogramming.com/dpfwrsgw.php
> translation (D) http://paste.dprogramming.com/dpu768pr.php.
> 
> Any tips from opengl experts?

No tips, since I'm pretty new at this myself, but I've just finished translating nVidia's glsl_pseudo_instancing* sample from C++ to D, and believe it or not, D's version is actually faster (albeit by a mere 1%).

That really surprised me, since the MSVC 2005 compiler is really good at FPU stuff, plus that whole-program-optimization thing they have.

So I'm confident there must be something else going on in your case. A literal translation to D should not cause big differences. Well, no negative differences anyway ;)

By the way, the C++ version had many uninitialized variables. Thanks, Walter, for making floats default to NaN. It's really easy to track down the source of an uninitialized variable this way!
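
For anyone who hasn't seen it in action, a throwaway example of what I mean:

import std.stdio;

void main()
{
    float scale;               // no initializer: defaults to float.nan in D
    float x = 2.0f * scale;    // the NaN propagates through the arithmetic
    writefln("%s", x);         // prints "nan", so the missing init is obvious;
                               // in C++ you'd silently get stack garbage
}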

L.

(*If anybody wants that app, just say so.)
February 07, 2007
Lionello Lunesu wrote:
> No tips, since I'm pretty new at this myself, but I've just finished translating nVidia's glsl_pseudo_instancing* sample from C++ to D, and believe it or not, D's version is actually faster (albeit by a mere 1%).
> 
> That really surprised me, since the MSVC 2005 compiler is really good at FPU stuff, plus that whole-program-optimization thing they have.
> 
> So I'm confident there must be something else going on in your case. A literal translation to D should not cause big differences. Well, no negative differences anyway ;)
> 
> By the way, the C++ version had many uninitialized variables. Thanks, Walter, for making floats default to NaN. It's really easy to track the source of a uninited variable this way!
> 
> L.
> 
> (*If anybody wants that app, just say it)


I'd be interested in seeing the code.

--bb
February 08, 2007
Bill Baxter wrote:
> Lionello Lunesu wrote:
>> No tips, since I'm pretty new at this myself, but I've just finished translating nVidia's glsl_pseudo_instancing* sample from C++ to D, and believe it or not, D's version is actually faster (albeit by a mere 1%).
>>
>> That really surprised me, since the MSVC 2005 compiler is really good at FPU stuff, plus that whole-program-optimization thing they have.
>>
>> So I'm confident there must be something else going on in your case. A literal translation to D should not cause big differences. Well, no negative differences anyway ;)
>>
>> By the way, the C++ version had many uninitialized variables. Thanks, Walter, for making floats default to NaN. It's really easy to track the source of a uninited variable this way!
>>
>> L.
>>
>> (*If anybody wants that app, just say it)
> 
> 
> I'd be interested in seeing the code.
> 
> --bb

Thought I'd check the license.txt:

"... Developer agrees not distribute the Materials or any derivative works created therewith without the express written permission of an authorized NVIDIA officer or employee. ..."

:(

I've written them an e-mail...

L.
February 08, 2007
Lionello Lunesu wrote:

> dyh wrote:
>> recently I've tried few OpenGL examples ported from C to D (on win32) and have run into strange problems.
>> 
>> Translated example was built against same C libs as original, using same (translated from platform sdk) headers. Resulted binary is using same dlls as original. But there are some differences:
>> 
>> 1. Behavior
>> In original calls glIndexi() has *no* effect. In translation
>> has. In original calls glColor3f() has effect. In translation
>> has *not*. In original initial color is white. In translation
>> it is kind of brown.
>>
>> 2. Performance
>> original example performs noticeably faster than translated
>> one.
>> 
>> No matter what compiler switches I've tried (-O, -inline, -release, etc). Example is extremely simple, and i do not see any possibilities to have any difference in performance. There is no GC used - there are no memory allocations array slice at all. Actually there are no array usage. In fact there is nothing at all except opengl api calls. And it is literally same library in both example and translation.
>> 
>> Here is code (~300 lines)
>> original (C) http://paste.dprogramming.com/dpfwrsgw.php
>> translation (D) http://paste.dprogramming.com/dpu768pr.php.
>> 
>> Any tips from opengl experts?

Yes, me. The problem is that you're using indexed mode. For some reason the non-D example does not get an index colour mode, but an RGB mode.

The OpenGL bindings for D that I've seen so far circumvent the usual linkage to the DLL, which normally happens by specifying the DLL in the executable header. Instead the D bindings use LoadLibrary and GetProcAddress. Maybe this is what makes the D version actually get the index mode.

That also explains why glColor3f has an effect in the C99 example: in index mode glColor doesn't work, period. Instead you must set a palette for your drawable, which is then accessed by the index values. Since you don't set a palette, and glColor doesn't work, you will only get white shapes in index mode.

The bad performance is caused by the simple fact that index mode is no longer supported by modern hardware and must be emulated by the OpenGL software renderer, which is by nature very slow. In fact OpenGL 2.0 no longer has an indexed colour mode. Just don't use it; always do things in RGB(A). If you really need an indexed image, first render in RGB(A) and dither afterwards.
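
Something like this on the Win32 side (a rough sketch; assumes the usual Win32/WGL declarations are in scope, and the field values are just typical choices):

bool setupRGBA(HDC hdc)
{
    PIXELFORMATDESCRIPTOR pfd;
    pfd.nSize      = cast(ushort) PIXELFORMATDESCRIPTOR.sizeof;
    pfd.nVersion   = 1;
    pfd.dwFlags    = PFD_DRAW_TO_WINDOW | PFD_SUPPORT_OPENGL | PFD_DOUBLEBUFFER;
    pfd.iPixelType = PFD_TYPE_RGBA;        // instead of PFD_TYPE_COLORINDEX
    pfd.cColorBits = 32;
    pfd.cDepthBits = 24;

    int fmt = ChoosePixelFormat(hdc, &pfd);
    if (fmt == 0 || !SetPixelFormat(hdc, fmt, &pfd))
        return false;

    // After wglCreateContext/wglMakeCurrent, glColor3f() now actually selects
    // the drawing colour, because RGBA mode is what glColor is meant for.
    return true;
}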

Wolfgang Draxinger
-- 
E-Mail address works, Jabber: hexarith@jabber.org, ICQ: 134682867

February 08, 2007
Wolfgang Draxinger wrote:

> The OpenGL bindings for D, that I've seen so far circumvent the normally used linkage to the DLL, which is normally happening by specifying the DLL in the Executable header. Instead the D bindings use LoadLibrary and GetProcAddress. Maybe this makes the D version to actually get the index mode.

Got the explanation for that one, too: modern drivers intercept the linkage on executable load to insert some of their own juice into the code, mainly to make the use of features like antialiasing and pbuffers more efficient. Naturally, getting the function pointers via LoadLibrary and GetProcAddress circumvents this and gives you only the vanilla opengl32.dll, which will happily serve you a software-emulated index colour mode.

So instead of loader hacks, an OpenGL binding for D should just have a pragma to link against opengl32.lib (on Windows) or libGL.so (on *nix) and provide the identifiers.
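
For example (only a fragment, of course, and the exact declarations depend on your bindings):

pragma(lib, "opengl32.lib");   // DMD/OPTLINK on Windows

extern (Windows)
{
    // a couple of core entry points, just as an illustration
    void glClearColor(float red, float green, float blue, float alpha);
    void glClear(uint mask);
    void glColor3f(float red, float green, float blue);
    // ... and so on for the rest of the GL 1.1 API ...
}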

Extension loading should be done via wglGetProcAddress or glXGetProcAddress _after_ a valid OpenGL context has been acquired anyway. Sooner or later I will fork/extend GLEW into GLEW'D (i.e. a GLEW that generates D code instead of C).
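
Roughly like this, i.e. only once a context is current (glActiveTextureARB is just a familiar example; wglGetProcAddress is assumed to be declared by your bindings):

alias PFNGLACTIVETEXTUREARB = extern (Windows) void function(uint texture);
PFNGLACTIVETEXTUREARB glActiveTextureARB;

bool loadMultitexture()
{
    // wglGetProcAddress returns null unless a context is current on this
    // thread, which is exactly why this has to run after wglMakeCurrent.
    glActiveTextureARB = cast(PFNGLACTIVETEXTUREARB)
                         wglGetProcAddress("glActiveTextureARB");
    return glActiveTextureARB !is null;
}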

Wolfgang Draxinger
-- 
E-Mail address works, Jabber: hexarith@jabber.org, ICQ: 134682867

February 08, 2007
Wolfgang Draxinger wrote:
> Wolfgang Draxinger wrote:
> 
>> The OpenGL bindings for D, that I've seen so far circumvent the
>> normally used linkage to the DLL, which is normally happening
>> by specifying the DLL in the Executable header. Instead the D
>> bindings use LoadLibrary and GetProcAddress. Maybe this makes
>> the D version to actually get the index mode.
> 
> Got the explanation for that one, too: Modern drivers intercept
> the linkage on executable load to insert some of their own juice
> into the code. Mainly to make usage of features like
> antialiasing and PBuffers more efficient. Naturally getting the
> functions pointers via LoadLibrary and GetProcAddress will
> circumvent this and give you only the vanially opengl32.dll
> which will happyly serve you a software emulated index colour
> mode.


While it's true that statically linking to the import library on Windows does some jiggity foo to get the current driver implementation loaded, going through LoadLibrary does not change this. Every game out there based on the Quake 2 & 3 engines loads dynamically. All of the games using GarageGames' Torque Game Engine do it. So do the Java games out there using LWJGL (like Tribal Trouble, Bang! Howdy, and the games from PuppyGames.net). Likely several other games I'm not aware of do the same. You can test this with DerelictGL: I've used it several times in testing and always get a hardware-accelerated 32-bit color mode.
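
Roughly the mechanism (much simplified compared to what DerelictGL actually does; the import module at the top is just whichever Win32 bindings you have at hand):

import core.sys.windows.windows;   // or whatever Win32 bindings you use

alias DA_glClear = extern (Windows) void function(uint mask);
DA_glClear glClear;

bool loadGL()
{
    HMODULE lib = LoadLibraryA("opengl32.dll");
    if (lib is null)
        return false;

    // Core 1.1 entry points come straight from the DLL's export table;
    // the driver's ICD still gets hooked in when the context is created,
    // so LoadLibrary by itself doesn't lock you out of acceleration.
    glClear = cast(DA_glClear) GetProcAddress(lib, "glClear");
    return glClear !is null;
}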

Antialiasing is set up during context creation via WGL extensions on Windows. Pbuffers go through extensions as well.

> 
> Extension loading should be done by wglGetProcAddress or
> glxGetProcAddress _after_ a valid OpenGL context has been
> aquired anyway. Sooner or later I will fork/extend GLEW to
> GLEW'D (aka a GLEW that creates D code instead of C).

Yes, this is an issue on Windows. If the context is not created, you cannot properly load extensions. When you change contexts, there is no guarantee that previously loaded extensions will be valid. That's why extension loading is separated from DLL loading in DerelictGL. It also has a mechanism to reload extensions when you want to switch contexts.

So I think the OP's problem lies elsewhere. Besides, AFAIK, DerelictGL is the only binding that goes through LoadLibrary; the other OpenGL bindings I've seen all link statically to the import library.
February 08, 2007
Mike Parker wrote:

> While it's true that when statically linking to the import library on Windows will do some jiggity foo to get the current driver implementation loaded, going through LoadLibrary does not affect this. Every game out there based on the Quake 2 & 3 engines loads dynamically.

I think the main reason for this is that a few years ago there were different versions of opengl32.dll around (the SGI version, the Win9x version and the WinNT one; the Win9x and WinNT DLLs are almost identical, the only difference being a value in the version resource entry). Not loading via LoadLibrary could have caused trouble, since some compilers created the link based on ordinal numbers, which were not identical to those of the library installed on the developer's system.

> All of the games using the GarageGame's Torque Game Engine do it. So do the Java games out there using LWJGL (like Tribal Trouble, Bang! Howdy, and the games from PuppyGames.net).

> Likely several other games I'm not aware of do the same.

My engine does this too, but for another reason: instead of PE or ELF binaries, the modules are contained in a custom, platform-independent format. Except for OS-specific code, those modules can be loaded and run on any OS, as long as it runs on the architecture the code was compiled for. So the core game code has to be compiled only once per architecture. And since only two architectures remain on which games are played (x86 and x86_64), this means I have to compile most of the code only twice.

Of course the normal executable loader knows nothing about such a custom module format, so there is a small wrapper that bootstraps it. All system libraries must then of course be loaded through dlopen/LoadLibrary.

I think the reason id's engines also use LoadLibrary is similar: they all contain some VM that needs access to OpenGL => LoadLibrary. In Java there is no other option.

> You can test this with DerelictGL. I've
> used it several times in testing and always get a
> hardware-accelerated 32-bit color mode.

Well, you will have no problem getting a hardware-accelerated mode, since all OpenGL implementations can do this. But if you request a mode that the driver can't deal with, you'll get software emulation. Indexed colour mode is such a mode. Some drivers intercept this and give the application a "mode not supported" error instead.

> Antialiasing is set up during context creation via wgl extensions on Windows. pbuffers are through extensions also.

Yes, this is true of course, but some drivers use code injection as a workaround, to intercept program calls that might cause buggy behaviour when sharing contexts. I remember a nasty stencil buffer bug on R300 cards that was worked around with that method. Unfortunately the engines you mentioned circumvent this, rendering viewports larger than 1600x1200 on an R300 unusable in the first version of the "fixed" drivers. Later versions injected some code into the engine's executable itself to fix it.

Wolfgang Draxinger
-- 
E-Mail address works, Jabber: hexarith@jabber.org, ICQ: 134682867
