Yeah, those are quite curious. It's almost as if OGLES2/Nova were doing some heavy initialization during those "drops", because the blended versions are fast. Hopefully Daniel or Hans can check the benchmark at some point.
@Capehill I did another test, this time using version 1.15 of ogles2.library (the previous one was 1.14). There were some speed improvements, so maybe that makes the difference. See the new benchmark:
INFO: Starting to test renderer called [compositing], flags 0xE
INFO: Points [mode: None]...100 frames drawn in 0.028 seconds => 3545.2 frames per second
INFO: Points [mode: Blend]...100 frames drawn in 0.026 seconds => 3849.0 frames per second
INFO: Points [mode: Add]...100 frames drawn in 0.026 seconds => 3849.9 frames per second
INFO: Points [mode: Mod]...100 frames drawn in 0.026 seconds => 3852.5 frames per second
INFO: Lines [mode: None]...100 frames drawn in 0.076 seconds => 1310.3 frames per second
INFO: Lines [mode: Blend]...100 frames drawn in 1.370 seconds => 73.0 frames per second
INFO: Lines [mode: Add]...100 frames drawn in 1.323 seconds => 75.6 frames per second
INFO: Lines [mode: Mod]...100 frames drawn in 1.352 seconds => 74.0 frames per second
INFO: FillRects [mode: None]...100 frames drawn in 0.160 seconds => 626.6 frames per second
INFO: FillRects [mode: Blend]...100 frames drawn in 0.288 seconds => 346.8 frames per second
INFO: FillRects [mode: Add]...100 frames drawn in 0.284 seconds => 352.1 frames per second
INFO: FillRects [mode: Mod]...100 frames drawn in 0.284 seconds => 352.3 frames per second
INFO: RenderCopy [mode: None]...100 frames drawn in 0.022 seconds => 4572.3 frames per second
INFO: RenderCopy [mode: Blend]...100 frames drawn in 0.024 seconds => 4167.4 frames per second
INFO: RenderCopy [mode: Add]...100 frames drawn in 0.026 seconds => 3788.6 frames per second
INFO: [prepareTexture]Failed to set texture blend mode
INFO: RenderCopyEx [mode: None]...100 frames drawn in 0.022 seconds => 4600.9 frames per second
INFO: RenderCopyEx [mode: Blend]...100 frames drawn in 0.024 seconds => 4175.4 frames per second
INFO: RenderCopyEx [mode: Add]...100 frames drawn in 0.026 seconds => 3860.3 frames per second
INFO: [prepareTexture]Failed to set texture blend mode
INFO: Color modulation [mode: None]...100 frames drawn in 0.105 seconds => 951.7 frames per second
INFO: Color modulation [mode: Blend]...100 frames drawn in 0.105 seconds => 947.9 frames per second
INFO: Color modulation [mode: Add]...100 frames drawn in 0.106 seconds => 946.9 frames per second
INFO: [prepareTexture]Failed to set texture blend mode
INFO: Alpha modulation [mode: None]...100 frames drawn in 0.022 seconds => 4564.3 frames per second
INFO: Alpha modulation [mode: Blend]...100 frames drawn in 0.026 seconds => 3780.1 frames per second
INFO: Alpha modulation [mode: Add]...100 frames drawn in 0.026 seconds => 3782.9 frames per second
INFO: [prepareTexture]Failed to set texture blend mode
INFO: UpdateTexture [mode: None]...100 frames drawn in 0.030 seconds => 3286.6 frames per second, 3418.0 operations per second
INFO: UpdateTexture [mode: Blend]...100 frames drawn in 0.035 seconds => 2856.7 frames per second, 2970.9 operations per second
INFO: UpdateTexture [mode: Add]...100 frames drawn in 0.035 seconds => 2852.1 frames per second, 2966.2 operations per second
INFO: [prepareTexture]Failed to set texture blend mode
INFO: ReadPixels [mode: None]...0 frames drawn in 1.521 seconds => 0.0 frames per second, 65.7 operations per second
INFO: ReadPixels [mode: Blend]...0 frames drawn in 1.661 seconds => 0.0 frames per second, 60.2 operations per second
INFO: ReadPixels [mode: Add]...0 frames drawn in 1.516 seconds => 0.0 frames per second, 66.0 operations per second
INFO: [prepareTexture]Failed to set texture blend mode
INFO: Starting to test renderer called [opengl], flags 0x2
INFO: Points [mode: None]...100 frames drawn in 0.038 seconds => 2602.2 frames per second
INFO: Points [mode: Blend]...100 frames drawn in 0.039 seconds => 2590.5 frames per second
INFO: Points [mode: Add]...100 frames drawn in 0.039 seconds => 2589.9 frames per second
INFO: Points [mode: Mod]...100 frames drawn in 0.039 seconds => 2594.1 frames per second
INFO: Lines [mode: None]...100 frames drawn in 0.046 seconds => 2159.4 frames per second
INFO: Lines [mode: Blend]...100 frames drawn in 0.054 seconds => 1847.4 frames per second
INFO: Lines [mode: Add]...100 frames drawn in 0.054 seconds => 1858.0 frames per second
INFO: Lines [mode: Mod]...100 frames drawn in 0.054 seconds => 1853.4 frames per second
INFO: FillRects [mode: None]...100 frames drawn in 0.207 seconds => 484.0 frames per second
INFO: FillRects [mode: Blend]...100 frames drawn in 0.226 seconds => 441.6 frames per second
INFO: FillRects [mode: Add]...100 frames drawn in 0.227 seconds => 440.6 frames per second
INFO: FillRects [mode: Mod]...100 frames drawn in 0.227 seconds => 440.7 frames per second
INFO: RenderCopy [mode: None]...100 frames drawn in 0.047 seconds => 2108.3 frames per second
INFO: RenderCopy [mode: Blend]...100 frames drawn in 0.047 seconds => 2124.4 frames per second
INFO: RenderCopy [mode: Add]...100 frames drawn in 0.047 seconds => 2121.4 frames per second
INFO: RenderCopy [mode: Mod]...100 frames drawn in 0.047 seconds => 2120.9 frames per second
INFO: RenderCopyEx [mode: None]...100 frames drawn in 0.052 seconds => 1937.9 frames per second
INFO: RenderCopyEx [mode: Blend]...100 frames drawn in 0.046 seconds => 2152.3 frames per second
INFO: RenderCopyEx [mode: Add]...100 frames drawn in 0.047 seconds => 2134.1 frames per second
INFO: RenderCopyEx [mode: Mod]...100 frames drawn in 0.047 seconds => 2127.8 frames per second
INFO: Color modulation [mode: None]...100 frames drawn in 0.041 seconds => 2442.2 frames per second
INFO: Color modulation [mode: Blend]...100 frames drawn in 0.041 seconds => 2458.9 frames per second
INFO: Color modulation [mode: Add]...100 frames drawn in 0.041 seconds => 2463.7 frames per second
INFO: Color modulation [mode: Mod]...100 frames drawn in 0.041 seconds => 2443.6 frames per second
INFO: Alpha modulation [mode: None]...100 frames drawn in 0.047 seconds => 2115.6 frames per second
INFO: Alpha modulation [mode: Blend]...100 frames drawn in 0.047 seconds => 2114.7 frames per second
INFO: Alpha modulation [mode: Add]...100 frames drawn in 0.047 seconds => 2117.6 frames per second
INFO: Alpha modulation [mode: Mod]...100 frames drawn in 0.048 seconds => 2103.0 frames per second
INFO: UpdateTexture [mode: None]...100 frames drawn in 0.056 seconds => 1777.4 frames per second, 1848.5 operations per second
INFO: UpdateTexture [mode: Blend]...100 frames drawn in 0.060 seconds => 1677.9 frames per second, 1745.0 operations per second
INFO: UpdateTexture [mode: Add]...100 frames drawn in 0.056 seconds => 1788.0 frames per second, 1859.6 operations per second
INFO: UpdateTexture [mode: Mod]...100 frames drawn in 0.061 seconds => 1649.5 frames per second, 1715.5 operations per second
INFO: ReadPixels [mode: None]...0 frames drawn in 1.853 seconds => 0.0 frames per second, 54.0 operations per second
INFO: ReadPixels [mode: Blend]...0 frames drawn in 1.806 seconds => 0.0 frames per second, 55.4 operations per second
INFO: ReadPixels [mode: Add]...0 frames drawn in 1.810 seconds => 0.0 frames per second, 55.3 operations per second
INFO: ReadPixels [mode: Mod]...0 frames drawn in 1.812 seconds => 0.0 frames per second, 55.2 operations per second
INFO: Starting to test renderer called [opengles2], flags 0xA
INFO: Points [mode: None]...100 frames drawn in 0.081 seconds => 1233.0 frames per second
INFO: Points [mode: Blend]...100 frames drawn in 0.022 seconds => 4614.7 frames per second
INFO: Points [mode: Add]...100 frames drawn in 0.021 seconds => 4676.0 frames per second
INFO: Points [mode: Mod]...100 frames drawn in 0.022 seconds => 4612.8 frames per second
INFO: Lines [mode: None]...100 frames drawn in 0.032 seconds => 3097.9 frames per second
INFO: Lines [mode: Blend]...100 frames drawn in 0.041 seconds => 2442.2 frames per second
INFO: Lines [mode: Add]...100 frames drawn in 0.037 seconds => 2736.3 frames per second
INFO: Lines [mode: Mod]...100 frames drawn in 0.036 seconds => 2759.5 frames per second
INFO: FillRects [mode: None]...100 frames drawn in 0.118 seconds => 844.9 frames per second
INFO: FillRects [mode: Blend]...100 frames drawn in 0.119 seconds => 843.2 frames per second
INFO: FillRects [mode: Add]...100 frames drawn in 0.119 seconds => 841.0 frames per second
INFO: FillRects [mode: Mod]...100 frames drawn in 0.119 seconds => 841.4 frames per second
INFO: RenderCopy [mode: None]...100 frames drawn in 0.029 seconds => 3392.9 frames per second
INFO: RenderCopy [mode: Blend]...100 frames drawn in 0.021 seconds => 4657.0 frames per second
INFO: RenderCopy [mode: Add]...100 frames drawn in 0.021 seconds => 4673.3 frames per second
INFO: RenderCopy [mode: Mod]...100 frames drawn in 0.022 seconds => 4636.3 frames per second
INFO: RenderCopyEx [mode: None]...100 frames drawn in 0.029 seconds => 3484.2 frames per second
INFO: RenderCopyEx [mode: Blend]...100 frames drawn in 0.030 seconds => 3337.1 frames per second
INFO: RenderCopyEx [mode: Add]...100 frames drawn in 0.029 seconds => 3441.9 frames per second
INFO: RenderCopyEx [mode: Mod]...100 frames drawn in 0.033 seconds => 3019.6 frames per second
INFO: Color modulation [mode: None]...100 frames drawn in 0.023 seconds => 4309.6 frames per second
INFO: Color modulation [mode: Blend]...100 frames drawn in 0.023 seconds => 4264.2 frames per second
INFO: Color modulation [mode: Add]...100 frames drawn in 0.023 seconds => 4321.9 frames per second
INFO: Color modulation [mode: Mod]...100 frames drawn in 0.023 seconds => 4308.1 frames per second
INFO: Alpha modulation [mode: None]...100 frames drawn in 0.029 seconds => 3460.8 frames per second
INFO: Alpha modulation [mode: Blend]...100 frames drawn in 0.030 seconds => 3357.6 frames per second
INFO: Alpha modulation [mode: Add]...100 frames drawn in 0.029 seconds => 3446.1 frames per second
INFO: Alpha modulation [mode: Mod]...100 frames drawn in 0.030 seconds => 3353.6 frames per second
INFO: UpdateTexture [mode: None]...100 frames drawn in 0.031 seconds => 3218.2 frames per second, 3347.0 operations per second
INFO: UpdateTexture [mode: Blend]...100 frames drawn in 0.032 seconds => 3144.2 frames per second, 3269.9 operations per second
INFO: UpdateTexture [mode: Add]...100 frames drawn in 0.031 seconds => 3211.8 frames per second, 3340.3 operations per second
INFO: UpdateTexture [mode: Mod]...100 frames drawn in 0.032 seconds => 3124.0 frames per second, 3249.0 operations per second
INFO: ReadPixels [mode: None]...0 frames drawn in 2.090 seconds => 0.0 frames per second, 47.9 operations per second
INFO: ReadPixels [mode: Blend]...0 frames drawn in 1.921 seconds => 0.0 frames per second, 52.1 operations per second
INFO: ReadPixels [mode: Add]...0 frames drawn in 2.015 seconds => 0.0 frames per second, 49.6 operations per second
INFO: ReadPixels [mode: Mod]...0 frames drawn in 2.062 seconds => 0.0 frames per second, 48.5 operations per second
INFO: Starting to test renderer called [software], flags 0x9
INFO: Points [mode: None]...100 frames drawn in 0.362 seconds => 276.4 frames per second
INFO: Points [mode: Blend]...100 frames drawn in 0.354 seconds => 282.3 frames per second
INFO: Points [mode: Add]...100 frames drawn in 0.354 seconds => 282.5 frames per second
INFO: Points [mode: Mod]...100 frames drawn in 0.354 seconds => 282.5 frames per second
INFO: Lines [mode: None]...100 frames drawn in 0.400 seconds => 249.9 frames per second
INFO: Lines [mode: Blend]...100 frames drawn in 0.502 seconds => 199.1 frames per second
INFO: Lines [mode: Add]...100 frames drawn in 0.502 seconds => 199.2 frames per second
INFO: Lines [mode: Mod]...100 frames drawn in 0.501 seconds => 199.5 frames per second
INFO: FillRects [mode: None]...100 frames drawn in 0.657 seconds => 152.3 frames per second
INFO: FillRects [mode: Blend]...100 frames drawn in 3.575 seconds => 28.0 frames per second
INFO: FillRects [mode: Add]...100 frames drawn in 3.000 seconds => 33.3 frames per second
INFO: FillRects [mode: Mod]...100 frames drawn in 3.105 seconds => 32.2 frames per second
INFO: RenderCopy [mode: None]...100 frames drawn in 0.454 seconds => 220.4 frames per second
INFO: RenderCopy [mode: Blend]...100 frames drawn in 0.701 seconds => 142.7 frames per second
INFO: RenderCopy [mode: Add]...100 frames drawn in 0.642 seconds => 155.7 frames per second
INFO: RenderCopy [mode: Mod]...100 frames drawn in 0.643 seconds => 155.4 frames per second
INFO: RenderCopyEx [mode: None]...100 frames drawn in 1.305 seconds => 76.6 frames per second
INFO: RenderCopyEx [mode: Blend]...100 frames drawn in 1.280 seconds => 78.1 frames per second
INFO: RenderCopyEx [mode: Add]...100 frames drawn in 1.741 seconds => 57.4 frames per second
INFO: RenderCopyEx [mode: Mod]...100 frames drawn in 1.758 seconds => 56.9 frames per second
INFO: Color modulation [mode: None]...100 frames drawn in 0.412 seconds => 242.6 frames per second
INFO: Color modulation [mode: Blend]...100 frames drawn in 0.423 seconds => 236.2 frames per second
INFO: Color modulation [mode: Add]...100 frames drawn in 0.422 seconds => 236.8 frames per second
INFO: Color modulation [mode: Mod]...100 frames drawn in 0.421 seconds => 237.7 frames per second
INFO: Alpha modulation [mode: None]...100 frames drawn in 0.544 seconds => 183.8 frames per second
INFO: Alpha modulation [mode: Blend]...100 frames drawn in 0.773 seconds => 129.4 frames per second
INFO: Alpha modulation [mode: Add]...100 frames drawn in 0.728 seconds => 137.3 frames per second
INFO: Alpha modulation [mode: Mod]...100 frames drawn in 0.663 seconds => 150.7 frames per second
INFO: UpdateTexture [mode: None]...100 frames drawn in 0.581 seconds => 172.1 frames per second, 179.0 operations per second
INFO: UpdateTexture [mode: Blend]...100 frames drawn in 0.778 seconds => 128.6 frames per second, 133.7 operations per second
INFO: UpdateTexture [mode: Add]...100 frames drawn in 0.725 seconds => 138.0 frames per second, 143.5 operations per second
INFO: UpdateTexture [mode: Mod]...100 frames drawn in 0.697 seconds => 143.5 frames per second, 149.2 operations per second
INFO: ReadPixels [mode: None]...0 frames drawn in 0.029 seconds => 0.0 frames per second, 3424.7 operations per second
INFO: ReadPixels [mode: Blend]...0 frames drawn in 0.029 seconds => 0.0 frames per second, 3496.3 operations per second
INFO: ReadPixels [mode: Add]...0 frames drawn in 0.029 seconds => 0.0 frames per second, 3508.6 operations per second
INFO: ReadPixels [mode: Mod]...0 frames drawn in 0.028 seconds => 0.0 frames per second, 3582.3 operations per second
INFO: Bye bye
@Capehill In the new one everything looks better (everything is faster); only the first test, Points [mode: None], still seems slower. I will retest a few times in a row and switch between versions 1.14 and 1.15, just to be sure that the results are consistent and that the newer version of the library really makes it faster.
Looks better. Perhaps the slowdown is somehow related to a shader program switch or something similar. To be honest, I'm not familiar with what the OGLES2 renderer exactly does, since I haven't been able to test it myself anyway.
I'm sorry, I don't have Nova for now; I did too much shopping and now I'm out of budget :( I will do it ASAP, once I have some extra money to get the Enhancer pack.
@Capehill The second run gave me 1372 on Points [mode: None], and the third run gave the same 1372. Then I rebooted, ran it again, and got 1233 on the same test (the same value as before, after a reboot); the second run gave me 1372 again.
Then, for the sake of testing, I tried the previous 1.14 version of ogles2.library, and indeed it was much slower in that test. It seems that the way initialisation is done is much better in 1.15.
So, from that we can say that:
a) version 1.15 of ogles2.library is indeed faster, especially in the first test (probably where the initialisation is done);
b) initialisation lowers the first value only a bit (from 1372 to 1233), not that much. So there is probably something else that makes it slower.
I need to mention something... I have a special version of DOSBox that uses SDL2 instead of SDL, but I'm facing something strange here: it freezes the machine totally. I don't know if it is an SDL2 issue, a warp3dSI issue, or both, because this was not happening when I was not using w3dsi.
I can provide the binary, but the sources were ported by Michael Trebilcock, who did good work: on my configuration DOSBox is as fast as under Linux, if we don't count the freezing. As for the serial output, there is nothing; I just get the freeze.
@Capehill I probably found the issue: it freezes after a shell message, "using driver compositing for render". This message appears only with some programs, probably games that use a specific VGA mode.
Kas1e just informed me about this thread. Good to see somebody doing something with ogles2.
So far I have no explanation for that performance drop with "points mode: none". Note, however, that I haven't actually tried your test myself yet.
I quickly took a peek into the SDL2 source though and immediately noticed one potentially huge useless performance eater:
in the file "SDL_shaders_gles2.c", line 51: "gl_PointSize = 1.0;". For god's sake: remove it! It's completely useless and quite a performance killer; removing it can probably speed up the vertex shader by a factor of 1.5.
Some other notes (and again, remember I didn't check it out myself yet):
Depending on the actual number of vertices etc. you are drawing per frame, you probably don't really measure the actual potential 3D performance. I mean, if you render really fat stuff, it should quickly perform much faster than "only" around 2x as fast as the others.
Btw.: I also have no explanation why 1.15 would perform significantly faster than 1.14. Yes, the new version contains a huge speedup, but that's focused on one single function that actually shouldn't be called often, usually only on startup: glCompile. Such calls are muuuuuuch faster now indeed.
Sure, this isn't a 3D benchmark. Anything SDL_renderer is 2D.
Could you clarify this gl_PointSize thing a bit more? I checked a few sources and it looks like we are in undefined territory if we leave that out. I have made a special test package where the definition was removed: http://capehill.kapsi.fi/sdl2/
@Capehill Technically everything is "3D" in case of ogl-whatever.
Yes, regarding the point-size stuff:
what I meant (and what I actually told kas1e before posting here, maybe I should have copy'n'pasted the mail :P ) is to remove it from the vertex shader you *don't* use for point-rendering, of course.
So you should provide two vertex shaders, one for rendering if point-drawing is requested (that would be the one containing the gl_PointSize line) and one for everything else (triangles, tri-strips, etc.), without that line.
So the test package is indeed doing undefined stuff when you render points now, sorry (maybe it even works, but it's undefined behaviour nevertheless). However, it should be good enough to see if it truly makes a measurable difference for other types of geometry.
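The two-shader idea could be sketched roughly as follows. This is only an illustration: the uniform and attribute names (`u_projection`, `a_position`), the `PrimitiveKind` enum and both helper functions are assumptions made for the example and are not taken from the SDL2 sources.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical split of one catch-all vertex shader into two variants.
   The uniform and attribute names are illustrative assumptions. */

/* For triangles, strips and so on: no gl_PointSize write. */
static const char *const VERTEX_SRC_DEFAULT =
    "uniform mat4 u_projection;\n"
    "attribute vec2 a_position;\n"
    "void main()\n"
    "{\n"
    "    gl_Position = u_projection * vec4(a_position, 0.0, 1.0);\n"
    "}\n";

/* Only for point draws: this variant writes gl_PointSize. */
static const char *const VERTEX_SRC_POINTS =
    "uniform mat4 u_projection;\n"
    "attribute vec2 a_position;\n"
    "void main()\n"
    "{\n"
    "    gl_Position = u_projection * vec4(a_position, 0.0, 1.0);\n"
    "    gl_PointSize = 1.0;\n"
    "}\n";

typedef enum { PRIM_POINTS, PRIM_OTHER } PrimitiveKind;

/* Pick the shader source that matches the primitive being drawn. */
static const char *select_vertex_source(PrimitiveKind kind)
{
    assert(kind == PRIM_POINTS || kind == PRIM_OTHER);
    return (kind == PRIM_POINTS) ? VERTEX_SRC_POINTS : VERTEX_SRC_DEFAULT;
}

/* Small self-check helper: does this source write gl_PointSize? */
static int writes_point_size(const char *source)
{
    return strstr(source, "gl_PointSize") != NULL;
}
```

With this split, only the point path pays for the gl_PointSize write, and point rendering stays within defined behaviour.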
I.e. it is 3-4 fps slower in all modes, but even a bit faster in 1600x1200.
That is probably because of the shell output window (the older ioquake has none).
And the SDL2-based one has a nice difference compared to the old one: in full-screen modes I didn't get "tearing" (like broken vsync), which happens from time to time when I type "timedemo".
I took another look at the SDL2 sources and I suppose I found the reason for the "points mode:none" drop with ogles2. It looks as if you're measuring the initial shader-compilation here simply because it's the first render-test.
GLES2_RenderDrawPoints calls GLES2_SetDrawingState, which in turn calls GLES2_SelectProgram, which checks the shader-program cache. Since it won't find anything there at this time it will compile and compose the required shader-prog then.
This process has been massively sped up in ogles2 1.15 (and the net time wasted is very low indeed), but in such a tiny benchmark it has a huge percentage influence on the final number.
With the other one that drops, "RenderCopy mode:none", it's most likely just the same: in the first frame of that respective render-test pretty much the same things happen. This time it's the very first draw-call that uses a texture, and therefore it's the very first time that a fragment shader with texture-support is being requested, which has to be compiled and linked to a program first.
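The cache behaviour described above can be illustrated with a small stand-alone sketch. It does not use any GL calls: `ProgramCache`, `compile_and_link` and `select_program` are hypothetical names, and the "program handle" is just a fake integer. The point is only that the expensive step runs once per new shader combination, i.e. on the first draw that needs it.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_CAPACITY 8

typedef struct {
    int vertex_id;    /* which vertex shader variant */
    int fragment_id;  /* which fragment shader variant */
    int program;      /* stand-in for a linked program handle */
} CacheEntry;

typedef struct {
    CacheEntry entries[CACHE_CAPACITY];
    size_t count;
    int compile_count;  /* how many expensive compile+link steps ran */
} ProgramCache;

/* Stand-in for the expensive compile + link step (no real GL here). */
static int compile_and_link(ProgramCache *cache, int vertex_id, int fragment_id)
{
    (void)vertex_id;
    (void)fragment_id;
    cache->compile_count++;
    return 100 + cache->compile_count;  /* fake handle */
}

/* Return the cached program for this shader pair, building it on a miss. */
static int select_program(ProgramCache *cache, int vertex_id, int fragment_id)
{
    for (size_t i = 0; i < cache->count; i++) {
        const CacheEntry *e = &cache->entries[i];
        if (e->vertex_id == vertex_id && e->fragment_id == fragment_id) {
            return e->program;  /* cache hit: cheap */
        }
    }
    assert(cache->count < CACHE_CAPACITY);
    CacheEntry *slot = &cache->entries[cache->count++];
    slot->vertex_id = vertex_id;
    slot->fragment_id = fragment_id;
    slot->program = compile_and_link(cache, vertex_id, fragment_id);
    return slot->program;  /* cache miss: paid once */
}
```

In this model the first untextured draw and the first textured draw each trigger one compile, which would match the two tests that show a drop in the opengles2 log.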
So, if you change the benchmark prog to not consider the first frame rendered, the numbers should look as expected.
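A minimal sketch of such a change, assuming the benchmark records one duration per frame (the function names and the per-frame array are hypothetical and not part of the actual test program):

```c
#include <assert.h>

/* Frames per second over all recorded frames. */
static double fps_all(const double *frame_seconds, int frames)
{
    assert(frames > 0);
    double total = 0.0;
    for (int i = 0; i < frames; i++) {
        total += frame_seconds[i];
    }
    assert(total > 0.0);
    return frames / total;
}

/* Frames per second ignoring the first `warmup` frames, so that one-off
   costs such as shader compilation do not distort the result. */
static double fps_skip_warmup(const double *frame_seconds, int frames, int warmup)
{
    assert(warmup >= 0 && warmup < frames);
    double total = 0.0;
    for (int i = warmup; i < frames; i++) {
        total += frame_seconds[i];
    }
    assert(total > 0.0);
    return (frames - warmup) / total;
}
```

As an illustration with made-up per-frame timings that resemble the opengles2 log (100 frames in about 0.081 s for Points mode None versus about 0.022 s for the blended modes): if 99 frames take 0.00022 s each and the first one takes 0.05922 s, `fps_all` reports about 1235 frames per second, while `fps_skip_warmup` with `warmup = 1` reports about 4545.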
p.s.: something else I noticed. In GLES2_RenderDrawPoints and GLES2_RenderDrawLines the input-SDL_FPoints aren't used directly but a temporary array is being created, all points are copied and offset by 0.5, then this temporary coord-array is being used for rendering. If you extend the ogles2-render-path to use more appropriate vertex-shaders for the respective type of primitives instead of one-for-all, then this should be removed and the SDL_FPoint-input-array should be used directly. The 0.5-offsetting can be done in the dedicated points/lines vertex shader then (alternatively you could patch the projection-matrix for those types of primitives).
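The projection-matrix alternative mentioned above can be sketched like this. It assumes a column-major 4x4 orthographic matrix and pixel coordinates with the origin at the top left and y pointing down; both function names are hypothetical. Folding the 0.5 offset into the translation terms gives the same clip-space result as offsetting every point on the CPU first.

```c
#include <assert.h>

/* Fill a column-major 4x4 orthographic projection that maps pixel
   coordinates (origin top-left, y down) to clip space, with a constant
   pixel offset folded into the translation terms. */
static void ortho_with_offset(float m[16], float width, float height, float offset)
{
    assert(width > 0.0f && height > 0.0f);
    for (int i = 0; i < 16; i++) {
        m[i] = 0.0f;
    }
    m[0]  =  2.0f / width;
    m[5]  = -2.0f / height;
    m[10] =  1.0f;
    m[12] = -1.0f + offset * 2.0f / width;   /* x translation incl. offset */
    m[13] =  1.0f - offset * 2.0f / height;  /* y translation incl. offset */
    m[15] =  1.0f;
}

/* Apply the matrix to a 2D point (z = 0, w = 1), returning clip-space x/y. */
static void project_point(const float m[16], float x, float y,
                          float *out_x, float *out_y)
{
    *out_x = m[0] * x + m[4] * y + m[12];
    *out_y = m[1] * x + m[5] * y + m[13];
}
```

With `offset = 0.5f` for points and lines and `offset = 0.0f` for everything else, the input array could be passed through unchanged and no temporary copy would be needed.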
p.p.s: and some more regarding dedicated vertex shaders: the current shader always does *lots* of computations that are *only* used by RenderCopyEx. Everything related to rotation is useless for all other draw-commands. Computations on center and angle, involving matrix setup, deg-to-rad-conversion, cos/sin, some adds, all just wasted time. So if / when you modify the shader-management subsystem of SDL2 to use different kinds of vertex shaders you should also take this into account, it won't hurt for sure.
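One way to think about dedicated vertex shaders is as a feature mask per draw command, so that each command only gets the computations it needs. The sketch below is an assumption about how such a mapping might look; the enum names and flags are hypothetical and do not exist in SDL2.

```c
#include <assert.h>

/* Hypothetical draw commands of a 2D renderer. */
typedef enum {
    DRAW_POINTS,
    DRAW_LINES,
    DRAW_FILL_RECTS,
    DRAW_COPY,
    DRAW_COPY_EX
} DrawCommand;

/* Hypothetical vertex-shader features a variant may need. */
enum {
    VS_POINT_SIZE  = 1 << 0,  /* writes gl_PointSize */
    VS_HALF_OFFSET = 1 << 1,  /* adds the 0.5 pixel offset */
    VS_ROTATION    = 1 << 2   /* angle/center rotation math */
};

/* Which features the vertex shader for a given draw command requires. */
static unsigned vertex_features_for(DrawCommand cmd)
{
    switch (cmd) {
    case DRAW_POINTS:
        return VS_POINT_SIZE | VS_HALF_OFFSET;
    case DRAW_LINES:
        return VS_HALF_OFFSET;
    case DRAW_COPY_EX:
        return VS_ROTATION;
    case DRAW_FILL_RECTS:
    case DRAW_COPY:
        return 0u;
    }
    assert(!"unknown draw command");
    return 0u;
}
```

Under this scheme only the rotated copy would pay for the deg-to-rad conversion, cos/sin and matrix setup; the mask could also serve as part of the key for the shader-program cache.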
Edited by Daytona675x on 2017/6/3 6:55:19