Re: Porting apitrace
Quite a regular
Joined:
2007/7/14 20:30
From Lothric
Posts: 713
@kas1e

Quote:

Another idea you might find worth adding: a profiler-only flag on the command line, so that all the functions are still patched and counted as usual, but nothing is written to the log except the final profiling information.


The profiling is not filtered by the pause feature at the moment, so you should get the total number of calls at the end. I'm pondering whether it should be. So even if you hit pause, you should still get the profiling results.

Regarding filters, I need to check, can't answer right now.


Re: Porting apitrace
Home away from home
Joined:
2007/9/11 11:31
From Russia
Posts: 5324
@Capehill
I tested the latest commit, and here are the results:

1). Everything works (compiles, builds, runs, no crashes, etc.).
2). The newly added functions all trace fine too: glClear and glUseProgram for ogles2, and W3DN_Clear, W3DN_CompileShader, W3DN_BindTexture, W3DN_Submit and W3DN_SetShaderPipeline for Warp3DNova.
3). Profiling results are now logged both when you quit the GL app and when you quit glSnoop.
4). The new "Profile" option preserves the speed, yeah! I can now play almost without losing the original speed, and profile the whole app at any point I need.


Pretty cool :)


Though, I found a few more small issues:

1). If you run glSnoop without the "PROFILE" key, the output says "Profiling mode: [disabled]", although profiling is not disabled: it is enabled, just together with tracing. So IMHO it should be reported as enabled, since it is on by default together with tracing. Only when we use profiling alone should it say "Profiling mode: [enabled]" as it does now, and maybe also "Tracing mode: [disabled]".

2). If you run glSnoop with "?" just to see the help, then hit Ctrl+C (to exit back to the command line without running glSnoop) and then press Enter, you get:

*** BREAK
***Command 'glSnoop' returned with unfreed signals 60000000!

And the next run will say "glSnoop already running".

3). If you run "glsnoop GUI PROFILE", hit "pause" and then "trace", full tracing is enabled again, not only PROFILE.


About 1), I added that info to the ticket. Are tickets needed for 2) and 3)?

_________________
Join us to improve dopus5!
zerohero's mirror of os4/os3 crosscompiler suites

Re: Porting apitrace
Home away from home
Joined:
2007/1/26 21:48
From New Zealand
Posts: 2178
Hey guys, great to see glsnoop progressing and becoming an increasingly powerful graphics debugging tool.

A few quick suggestions for the profiler:
- It would be useful to print the draw calls/s and draw calls/frame values instead of having to calculate those manually
- It might also be worth adding a column for the percentage of total time. It's easy to make the mistake of thinking "wow!" it's spending 80% of its time in DrawElements(), when it's actually 80% of 58.9% spent in the gles2 lib + Nova.

EDIT: Oh, and number of vertices/s and number of primitives/s (triangles/s, points/s, lines/s, etc.) would be useful too.

Hans


Edited by Hans on 2019/7/22 5:44:56
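
The nested-percentage trap described above can be checked with a line of arithmetic. This is only a minimal sketch in Python: the function name is made up for illustration, and the 80% figure is the hypothetical example from the post, not a measurement.

```python
def share_of_total(relative_pct, recorded_pct):
    """Convert a share of the recorded time into a share of the total run time.

    relative_pct: percentage of the recorded (profiled) time spent in one function
    recorded_pct: percentage of the total run time that the recording covers
    """
    return relative_pct * recorded_pct / 100.0


# 80% of the 58.9% that was spent in the gles2 lib + Nova
print(f"{share_of_total(80.0, 58.9):.2f}% of total time")
```

Printing both values side by side, as suggested, removes the need to do this conversion by hand.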
_________________
http://hdrlab.org.nz/ - Amiga OS 4 projects, programming articles and more.
https://keasigmadelta.com/ - more of my work

Re: Porting apitrace
Home away from home
Joined:
2007/9/11 11:31
From Russia
Posts: 5324
@Hans, Capehill

Quote:

It might also be worth adding a column for the percentage of total time. It's easy to make the mistake of thinking "wow!" it's spending 80% of its time in DrawElements(), when it's actually 80% of 58.9% spent in the gles2 lib + Nova.


Yeah, for now it's easy to make that mistake, as it isn't clear that the value is not a percentage of the whole run, but a percentage of a part of it.

It would probably also be worth adding some more info to the profiling summary at the end, like:

"draw commands per frame: XXX", where XXX is the final number of glDrawArrays + glDrawElements + any other glDrawXXX calls, divided by the number of aglSwapBuffers calls.


Edited by kas1e on 2019/7/22 5:00:31
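
The "draw commands per frame" value proposed in this post could be computed roughly as follows. This is a sketch only: the dictionary layout and the function name are assumptions for illustration, the counts are made up, and glSnoop itself may do it differently.

```python
def draw_calls_per_frame(call_counts, swap_function="aglSwapBuffers"):
    """Sum all glDraw* call counts and divide by the number of buffer swaps."""
    draws = sum(count for name, count in call_counts.items()
                if name.startswith("glDraw"))
    frames = call_counts.get(swap_function, 0)
    # Avoid dividing by zero when no frame was ever presented.
    return draws / frames if frames else 0.0


# Made-up counts, only to show the calculation
counts = {"glDrawArrays": 900, "glDrawElements": 100,
          "glClear": 10, "aglSwapBuffers": 10}
print(draw_calls_per_frame(counts))
```

Matching on the "glDraw" prefix picks up any other glDrawXXX function without listing each one.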

Re: Porting apitrace
Quite a regular
Joined:
2007/7/14 20:30
From Lothric
Posts: 713
@kas1e

1) tracing vs. profiling: need to improve that bit
2) unfreed signals should be fixed
3) I thought about ghosting the buttons but haven't done anything yet

I'll try to fix the remaining things; if you don't see any progress, create all the needed tickets.

@Hans

- added draw calls / frame. TODO: add also draw calls / second
- added relative %
- TODO: number of primitives



Re: Porting apitrace
Home away from home
Joined:
2007/9/11 11:31
From Russia
Posts: 5324
@Capehill
Thanks! I tested the latest build, and here is an example of how it all looks now (STK, playing one track):

OpenGL ES 2.0 profiling results for Shell Process 'supertuxkart_gl4es_1915':
--------------------------------------------------------
Total recorded duration 47195.816137 ms, 46.94 % of total context life-time 100551.372218 ms
Draw calls (glDraw*) per frame 565.925512
Frames (buffer swaps) per second 16.110000
-> DrawElements call count 62262, duration 23267.922445 milliseconds, 49.30 % of recorded time (23.14 % of total context life-time)
-> DrawArrays call count 849444, duration 19443.886474 milliseconds, 41.20 % of recorded time (19.34 % of total context life-time)
-> Clear call count 6697, duration 1674.416850 milliseconds, 3.55 % of recorded time (1.67 % of total context life-time)
-> CompileShader call count 36, duration 1211.990055 milliseconds, 2.57 % of recorded time (1.21 % of total context life-time)
-> VertexAttribPointer call count 2792074, duration 781.801620 milliseconds, 1.66 % of recorded time (0.78 % of total context life-time)
-> TexImage2D call count 1256, duration 371.775033 milliseconds, 0.79 % of recorded time (0.37 % of total context life-time)
-> UseProgram call count 40110, duration 316.792758 milliseconds, 0.67 % of recorded time (0.32 % of total context life-time)
-> BindTexture call count 46597, duration 67.699122 milliseconds, 0.14 % of recorded time (0.07 % of total context life-time)
-> SwapBuffers call count 1611, duration 50.966556 milliseconds, 0.11 % of recorded time (0.05 % of total context life-time)
-> EnableVertexAttribArray call count 27326, duration 4.514256 milliseconds, 0.01 % of recorded time (0.00 % of total context life-time)
-> GenTextures call count 142, duration 2.333761 milliseconds, 0.00 % of recorded time (0.00 % of total context life-time)
-> DeleteTextures call count 142, duration 1.032843 milliseconds, 0.00 % of recorded time (0.00 % of total context life-time)
-> TexParameteri call count 426, duration 0.429242 milliseconds, 0.00 % of recorded time (0.00 % of total context life-time)
-> ShaderSource call count 36, duration 0.255123 milliseconds, 0.00 % of recorded time (0.00 % of total context life-time)
--------------------------------------------------------

Warp3D Nova profiling results for Shell Process 'supertuxkart_gl4es_1915':
--------------------------------------------------------
Total recorded duration 29230.469784 ms, 29.50 % of total context life-time 99092.776356 ms
-> BufferUnlock call count 988562, duration 11852.596824 milliseconds, 40.55 % of recorded time (11.96 % of total context life-time)
-> DrawArrays call count 849444, duration 5851.102619 milliseconds, 20.02 % of recorded time (5.90 % of total context life-time)
-> DrawElements call count 62262, duration 4598.397843 milliseconds, 15.73 % of recorded time (4.64 % of total context life-time)
-> Submit call count 102350, duration 2036.381602 milliseconds, 6.97 % of recorded time (2.06 % of total context life-time)
-> Clear call count 6697, duration 1583.779083 milliseconds, 5.42 % of recorded time (1.60 % of total context life-time)
-> CompileShader call count 36, duration 1111.113967 milliseconds, 3.80 % of recorded time (1.12 % of total context life-time)
-> VBOSetArray call count 2927999, duration 906.110158 milliseconds, 3.10 % of recorded time (0.91 % of total context life-time)
-> VBOLock call count 935958, duration 849.793359 milliseconds, 2.91 % of recorded time (0.86 % of total context life-time)
-> CreateVertexBufferObject call count 858, duration 203.448530 milliseconds, 0.70 % of recorded time (0.21 % of total context life-time)
-> BindVertexAttribArray call count 2897608, duration 144.847536 milliseconds, 0.50 % of recorded time (0.15 % of total context life-time)
-> DestroyVertexBufferObject call count 858, duration 75.912419 milliseconds, 0.26 % of recorded time (0.08 % of total context life-time)
-> BindTexture call count 51488, duration 8.746882 milliseconds, 0.03 % of recorded time (0.01 % of total context life-time)
-> SetShaderPipeline call count 40111, duration 4.861852 milliseconds, 0.02 % of recorded time (0.00 % of total context life-time)
-> Destroy call count 1, duration 3.291454 milliseconds, 0.01 % of recorded time (0.00 % of total context life-time)
-> FBBindBuffer call count 2, duration 0.064563 milliseconds, 0.00 % of recorded time (0.00 % of total context life-time)
-> DestroyFrameBuffer call count 2, duration 0.017965 milliseconds, 0.00 % of recorded time (0.00 % of total context life-time)
-> CreateFrameBuffer call count 1, duration 0.002727 milliseconds, 0.00 % of recorded time (0.00 % of total context life-time)
-> SetRenderTarget call count 2, duration 0.000401 milliseconds, 0.00 % of recorded time (0.00 % of total context life-time)
--------------------------------------------------------


Re: Porting apitrace
Home away from home
Joined:
2007/9/11 11:31
From Russia
Posts: 5324
@Capehill
Do you think it's worth making a little .guide? (Open the image in another tab to see it full size):

Resized Image

I have almost finished an initial version and can upload it, so you can add it and improve it when and if needed (and fix the broken English :) ). If you find it interesting or good to have, of course.

PS: Looking at the new logs, maybe it's worth replacing the word "milliseconds" with "ms" everywhere, to shorten the logs and make them look better?


Re: Porting apitrace
Quite a regular
Joined:
2007/7/14 20:30
From Lothric
Posts: 713
@kas1e

Looking good. Do you think version number should follow glSnoop's? I guess it's about 0.2 now.

By the way, I tried to make summary tables more clear and less repetitive.


Re: Porting apitrace
Home away from home
Joined:
2007/9/11 11:31
From Russia
Posts: 5324
@Capehill
Quote:

Looking good. Do you think version number should follow glSnoop's? I guess it's about 0.2 now.


Yeah, sure, it should follow; I'll make it 0.2 as well, of course.

Quote:

By the way, I tried to make summary tables more clear and less repetitive.


Oh yeah, I like it very much now, it's much better and cleaner:

OpenGL ES 2.0 profiling results for Shell Process 'neverball':
--------------------------------------------------------
Total recorded duration 4934.309821 ms, 87.04 % of total context life-time 5669.193327 ms
Draw calls (glDraw*) per frame 284.040000
Draw calls per second 1252.584269
Frames (buffer swaps) per second 4.409887
                      function | call count |        duration (ms) |   % of recorded time |   % of total context life-time
                 CompileShader |         52 |          4405.388419 |                89.28 |                          77.71
                  DrawElements |       1550 |           247.808558 |                 5.02 |                           4.37
                    DrawArrays |       5551 |           233.863456 |                 4.74 |                           4.13
                    TexImage2D |         85 |            28.640093 |                 0.58 |                           0.51
                    UseProgram |        801 |             7.178811 |                 0.15 |                           0.13
                         Clear |         26 |             6.094438 |                 0.12 |                           0.11
                   BindTexture |       1735 |             1.420059 |                 0.03 |                           0.03
                   GenTextures |         85 |             1.025384 |                 0.02 |                           0.02
                   SwapBuffers |         25 |             0.880780 |                 0.02 |                           0.02
                DeleteTextures |         85 |             0.581345 |                 0.01 |                           0.01
           VertexAttribPointer |       2826 |             0.572042 |                 0.01 |                           0.01
                  ShaderSource |         52 |             0.476320 |                 0.01 |                           0.01
                 ActiveTexture |        698 |             0.178089 |                 0.00 |                           0.00
                 TexParameteri |        288 |             0.143121 |                 0.00 |                           0.00
       EnableVertexAttribArray |        428 |             0.058908 |                 0.00 |                           0.00
Primitive statistics (raw vertex counts):
  Triangles 1448700, per second 255544.124814, per draw call 204.013519
  Triangle strips 26004, per second 4586.987935, per draw call 3.662019
  Points 15200, per second 2681.211222, per draw call 2.140544
--------------------------------------------------------


Warp3D Nova profiling results for Shell Process 'neverball':
--------------------------------------------------------
Total recorded duration 4606.608453 ms, 83.94 % of total context life-time 5488.296868 ms
Draw calls per second 1293.869947
                      function | call count |        duration (ms) |   % of recorded time |   % of total context life-time
                 CompileShader |         52 |          4205.243133 |                91.29 |                          76.62
                  BufferUnlock |      12342 |           155.239082 |                 3.37 |                           2.83
                    DrawArrays |       5551 |            85.741348 |                 1.86 |                           1.56
                  DrawElements |       1550 |            68.078117 |                 1.48 |                           1.24
                        Submit |       2377 |            35.352729 |                 0.77 |                           0.64
      CreateVertexBufferObject |        303 |            20.089786 |                 0.44 |                           0.37
                       VBOLock |       5295 |            11.485503 |                 0.25 |                           0.21
     DestroyVertexBufferObject |        303 |             9.394314 |                 0.20 |                           0.17
                   VBOSetArray |      21848 |             6.598468 |                 0.14 |                           0.12
                         Clear |         26 |             5.671452 |                 0.12 |                           0.10
                       Destroy |          |             1.417652 |                 0.03 |                           0.03
                      WaitDone |        383 |             1.239443 |                 0.03 |                           0.02
         BindVertexAttribArray |      15813 |             0.652805 |                 0.01 |                           0.01
                   BindTexture |       1959 |             0.253238 |                 0.01 |                           0.00
             SetShaderPipeline |        802 |             0.084333 |                 0.00 |                           0.00
                  FBBindBuffer |          |             0.036853 |                 0.00 |                           0.00
            DestroyFrameBuffer |          |             0.016081 |                 0.00 |                           0.00
                      WaitIdle |          |             0.010827 |                 0.00 |                           0.00
             CreateFrameBuffer |          |             0.002647 |                 0.00 |                           0.00
               SetRenderTarget |          |             0.000642 |                 0.00 |                           0.00
Primitive statistics (raw vertex counts):
  Triangles 1448700, per second 263966.961339, per draw call 204.013519
  Triangle strips 26004, per second 4738.176891, per draw call 3.662019
  Points 15200, per second 2769.585016, per draw call 2.140544
--------------------------------------------------------


I also see you fixed how the output reports profiling, by simply saying whether tracing is disabled or enabled. That way is good too for sure, so we can close another ticket then, thanks!

By the way, don't mind that neverball shows CompileShader() taking almost all the time. That's OK: I just ran neverball (so it compiled all the shaders) and then exited immediately, so mostly CompileShader() and the menu drawing happened.

PS: The new "primitive statistics" look nice too :)


Edited by kas1e on 2019/7/22 18:39:54

Re: Porting apitrace
Home away from home
Joined:
2007/9/11 11:31
From Russia
Posts: 5324
@Capehill
Is it possible to also add "pause" support for profiling? The scenario I'm facing now: in foobillard++, I found a slow part (while in a game, you press a gadget on the right to bring up another transparent menu), and that menu doesn't come up very fast.

So it would be good if you could run "glsnoop GUI PROFILE", hit pause, go to the problematic place, tick "trace", do the things you need to profile, then hit pause again and/or exit, so that you get profiling info only for the problematic part (to see which functions it uses, which primitives, etc.).

If that's possible, of course.
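
The requested behaviour amounts to gating the accumulation of statistics on a pause flag. A minimal sketch in Python, assuming a hypothetical PausableProfiler class (none of these names come from glSnoop, and the durations are made up):

```python
class PausableProfiler:
    """Counts calls and durations, but only while not paused."""

    def __init__(self, paused=False):
        self.paused = paused
        self.stats = {}  # function name -> [call count, total ms]

    def record(self, name, duration_ms):
        if self.paused:
            return  # the call still happens, but nothing is accumulated
        entry = self.stats.setdefault(name, [0, 0.0])
        entry[0] += 1
        entry[1] += duration_ms


profiler = PausableProfiler(paused=True)   # start paused
profiler.record("DrawArrays", 0.4)         # ignored
profiler.paused = False                    # reached the slow menu
profiler.record("DrawElements", 1.5)
profiler.record("DrawElements", 2.5)
profiler.paused = True                     # done with the problematic part
profiler.record("Clear", 0.1)              # ignored again
print(profiler.stats)
```

Only the calls made between the two pause toggles end up in the summary, which is the "profile just the problematic part" scenario described above.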


Re: Porting apitrace
Home away from home
Joined:
2007/1/26 21:48
From New Zealand
Posts: 2178
Quote:
Primitive statistics (raw vertex counts):
Triangles 1448700, per second 263966.961339, per draw call 204.013519
Triangle strips 26004, per second 4738.176891, per draw call 3.662019
Points 15200, per second 2769.585016, per draw call 2.140544


Hmmm. Statistics for the different primitives are probably not that useful, and even a bit misleading. Rendering is a mix of primitive types, so none of those individual numbers gives a true idea of performance. Triangle strips can be made up of many triangles, and it's the number of individual triangles/primitives that matters.

Maybe sticking to the total number of vertices would be easiest (incl. vertices/s and vertices/call).

Hans
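
To illustrate why raw vertex counts per primitive type are not directly comparable, here is a small sketch that converts the vertex count of a single draw call into the number of primitives it produces. The mode names are plain strings standing in for the GL constants, and the function name is made up; the conversion rules themselves are the standard OpenGL ES ones.

```python
def primitive_count(mode, vertex_count):
    """Number of primitives produced by ONE draw call with `vertex_count` vertices."""
    if mode == "TRIANGLES":
        return vertex_count // 3               # every 3 vertices form a triangle
    if mode in ("TRIANGLE_STRIP", "TRIANGLE_FAN"):
        return max(vertex_count - 2, 0)        # each vertex after the first two adds one
    if mode == "LINES":
        return vertex_count // 2               # every 2 vertices form a line
    if mode == "LINE_STRIP":
        return max(vertex_count - 1, 0)        # each vertex after the first adds one
    if mode == "POINTS":
        return vertex_count
    raise ValueError(f"unknown mode: {mode}")


# The same 300 vertices give very different triangle counts
print(primitive_count("TRIANGLES", 300), primitive_count("TRIANGLE_STRIP", 300))
```

Because the strip and fan rules apply per draw call, the conversion has to be done when each call is recorded; it cannot be recovered afterwards from a vertex total summed over many calls.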



Re: Porting apitrace
Home away from home
Joined:
2007/9/11 11:31
From Russia
Posts: 5324
@Hans
On the other hand, the info about the different primitives can also help in understanding why something is slow. For example, if you see a loooot of points or lines, the rendering of which is not accelerated (I don't remember which, if any currently).

I remember one of them was slow for sure; it was either points or lines. And if the profiling info shows, for example, that everything is points and lines, then the game itself is badly done.


Re: Porting apitrace
Home away from home
Joined:
2007/9/11 11:31
From Russia
Posts: 5324
@Capehill
Can you also explain a bit how all the patching is currently done (both for ogles2 and for warp3dnova)? I need that for the guide, and in general I'm a bit lost: it was SetMethod that failed, but I still see SetMethod in the sources. So it's SetMethod for warp3dnova and something else for ogles2, right?


Re: Porting apitrace
Quite a regular
Joined:
2007/7/14 20:30
From Lothric
Posts: 713
@kas1e

OGLES2: glSnoop patches IExec GetInterface and DropInterface using SetMethod, in order to detect applications that ask for the IOGLES2 interface. Then the interface-specific OGLES2 functions are also patched using SetMethod. https://github.com/capehill/glsnoop/bl ... ster/ogles2_module.c#L278

NOVA: W3DN_CreateContext is patched using SetMethod, then the function pointers in the Nova context are simply replaced with wrapper functions. https://github.com/capehill/glsnoop/bl ... warp3dnova_module.c#L1065
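
For the guide, the "replace the function pointers with wrapper functions" idea can be illustrated with a small analogy. This is Python, not glSnoop's actual C code, and it does not use SetMethod; FakeContext and install_wrapper are hypothetical names invented for this sketch.

```python
import time


class FakeContext:
    """Hypothetical stand-in for a driver context holding function pointers."""

    def __init__(self):
        self.DrawArrays = lambda first, count: count  # pretend driver call


def install_wrapper(context, name, stats):
    """Replace context.<name> with a wrapper that counts and times calls."""
    original = getattr(context, name)

    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return original(*args, **kwargs)  # forward to the real function
        finally:
            entry = stats.setdefault(name, {"calls": 0, "seconds": 0.0})
            entry["calls"] += 1
            entry["seconds"] += time.perf_counter() - start

    setattr(context, name, wrapper)
    return original  # keep it so the patch can be removed later


stats = {}
context = FakeContext()
original = install_wrapper(context, "DrawArrays", stats)
context.DrawArrays(0, 3)        # goes through the wrapper
context.DrawArrays = original   # unpatch
context.DrawArrays(0, 3)        # no longer counted
print(stats["DrawArrays"]["calls"])
```

The application keeps calling the same slot and never notices the wrapper; restoring the saved original removes the patch.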



Re: Porting apitrace
Home away from home
Joined:
2007/9/11 11:31
From Russia
Posts: 5324
@Capehill
Thanks!

In the meantime I tested the latest version, and yeah, it's all better:

Warp3D Nova profiling results for Shell Process 'foobillardplus':
--------------------------------------------------------
Function calls used 9174.493283 ms, 41.25 % of context life-time 22242.865060 ms
Draw calls per second 3248.927183
                      function | call count |     errors |        duration (ms) |   % of combined time |                  % of CPU time
                  BufferUnlock |     129968 |          |          2983.037294 |                32.51 |                          13.41
                 CompileShader |         32 |          |          2559.065445 |                27.89 |                          11.51
                  DrawElements |      26439 |          |          1327.010547 |                14.46 |                           5.97
                    DrawArrays |      45825 |          |          1043.385211 |                11.37 |                           4.69
                        Submit |      39455 |          |           474.698360 |                 5.17 |                           2.13
                         Clear |        729 |          |           447.556402 |                 4.88 |                           2.01
                       VBOLock |      72984 |          |           119.717648 |                 1.30 |                           0.54
      CreateVertexBufferObject |        473 |          |            82.958455 |                 0.90 |                           0.37
                   VBOSetArray |     196763 |          |            72.261940 |                 0.79 |                           0.32
     DestroyVertexBufferObject |        473 |          |            30.435898 |                 0.33 |                           0.14
                      WaitDone |       4057 |          |            17.755023 |                 0.19 |                           0.08
         BindVertexAttribArray |     193891 |          |             8.908169 |                 0.10 |                           0.04
                   BindTexture |      27204 |          |             4.622288 |                 0.05 |                           0.02
                       Destroy |          |          |             1.957252 |                 0.02 |                           0.01
             SetShaderPipeline |       9544 |          |             0.951237 |                 0.01 |                           0.00
                  FBBindBuffer |          |          |             0.125998 |                 0.00 |                           0.00
                      WaitIdle |          |          |             0.023219 |                 0.00 |                           0.00
            DestroyFrameBuffer |          |          |             0.019650 |                 0.00 |                           0.00
             CreateFrameBuffer |          |          |             0.002807 |                 0.00 |                           0.00
               SetRenderTarget |          |          |             0.000441 |                 0.00 |                           0.00
Primitive statistics (in vertices):
  Total vertices 37953486, per second 1706356.033052, per draw call 525.205995
  Triangles 37650708, per second 1692743.395020, per draw call 521.016108
  Triangle strips 173264, per second 7789.800170, per draw call 2.397653
  Triangle fans 121804, per second 5476.202904, per draw call 1.685542
  Lines 7710, per second 346.634958, per draw call 0.106692
--------------------------------------------------------


OpenGL ES 2.0 profiling results for Shell Process 'foobillardplus':
--------------------------------------------------------
Function calls used 13195.615431 ms, 55.10 % of context life-time 23949.303124 ms
Draw calls (glDraw*) per frame 198.527473
Draw calls per second 3017.434309
Frames (buffer swaps) per second 15.199077
                      function | call count |     errors |        duration (ms) |   % of combined time |   % of CPU time (incl. driver)
                  DrawElements |      26439 |          |          7637.273208 |                57.88 |                          31.89
                 CompileShader |         32 |          |          2700.476481 |                20.46 |                          11.28
                    DrawArrays |      45825 |          |          1912.058066 |                14.49 |                           7.98
                         Clear |        729 |          |           454.487067 |                 3.44 |                           1.90
                    TexImage2D |       2045 |        672 |           338.288567 |                 2.56 |                           1.41
                    UseProgram |       9543 |          |            66.933833 |                 0.51 |                           0.28
                   SwapBuffers |        364 |          |            26.043911 |                 0.20 |                           0.11
                   BindTexture |      31254 |          |            24.726150 |                 0.19 |                           0.10
                 TexSubImage2D |         85 |          |            15.896619 |                 0.12 |                           0.07
           VertexAttribPointer |      96897 |          |            14.332839 |                 0.11 |                           0.06
                   GenTextures |        167 |          |             2.407266 |                 0.02 |                           0.01
                DeleteTextures |        167 |          |             1.297871 |                 0.01 |                           0.01
       EnableVertexAttribArray |       6245 |          |             0.618519 |                 0.00 |                           0.00
                 TexParameteri |        333 |          |             0.470746 |                 0.00 |                           0.00
                  ShaderSource |         32 |          |             0.280306 |                 0.00 |                           0.00
                 ActiveTexture |        174 |          |             0.023980 |                 0.00 |                           0.00
Primitive statistics (in vertices):
  Total vertices 37953486, per second 1584774.587849, per draw call 525.205995
  Triangles 37650708, per second 1572131.878819, per draw call 521.016108
  Triangle strips 173264, per second 7234.760575, per draw call 2.397653
  Triangle fans 121804, per second 5086.011965, per draw call 1.685542
  Lines 7710, per second 321.936490, per draw call 0.106692
--------------------------------------------------------


Though I'm still a bit confused by all those percentages.

Does the field "% of CPU time (incl. driver)" in the ogles2 table mean "% of CPU time (incl. Warp3D Nova times)"?


And what is the difference between "combined time" and "CPU time"? I.e., what is combined? :)

I don't know, but maybe we need to subtract those values for ogles2 (i.e. take the ogles2 profiling results and subtract the warp3dnova ones, so as to have an ogles2-only profile).

For instance, in the example above, BufferUnlock for warp3dnova takes 13.41% of CPU time, so the ogles2 figure would be 31.89 - 13.41 = 18.48, which would then be the true value.

But that's only IMHO of course, for better understanding.


Re: Porting apitrace
Quite a regular
Joined:
2007/7/14 20:30
From Lothric
Posts: 713
@kas1e

Quote:

Does the field "% of CPU time (incl. driver)" in the ogles2 table mean "% of CPU time (incl. Warp3D Nova times)"?


Yes, driver == Nova.

Quote:

And what is the difference between "combined time" and "CPU time"? I.e., what is combined? :)


I have trouble formulating it, but by "combined time" I mean the whole column: if you sum the rows, you should get 100%. BufferUnlock took 1/3 (33%) of the time accumulated by all known Nova functions. Maybe I should replace the column header with the time value (9174 ms)?

If you sum the "CPU time" column, you should get 41 % (9174/22242).
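
The two columns can be reproduced from the Nova numbers for foobillardplus posted earlier in the thread. The helper name below is made up; the three durations are copied from that log.

```python
def percent(part, whole):
    """Share of `whole` taken by `part`, in percent."""
    return 100.0 * part / whole


combined_ms = 9174.493283       # "Function calls used" (sum of all wrapped Nova calls)
context_ms = 22242.865060       # context life-time
buffer_unlock_ms = 2983.037294  # BufferUnlock row

print(f"{percent(buffer_unlock_ms, combined_ms):.2f}")  # % of combined time: 32.51
print(f"{percent(buffer_unlock_ms, context_ms):.2f}")   # % of CPU time: 13.41
print(f"{percent(combined_ms, context_ms):.2f}")        # sum of the CPU time column: 41.25
```

So the "combined" column is relative to the time spent inside the wrapped functions, while the "CPU time" column is relative to the whole context life-time.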

Quote:

I don't know, but maybe we need to subtract those values for ogles2 (i.e. take the ogles2 profiling results and subtract the warp3dnova ones, so as to have an ogles2-only profile).


It sounds difficult. There isn't a 1:1 relationship between OGLES2 and Nova, and the two contexts' data are not aware of each other. Maybe it's technically possible by using some temp counter initiated by the OGLES2 wrapper, written by the Nova wrapper and read again by the OGLES2 wrapper, but as long as not every function is wrapped, it wouldn't be correct, I guess.
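
The temp-counter idea could look roughly like this. It is a sketch under stated assumptions, not glSnoop code: NestedTimer and its methods are hypothetical names, the clock is injectable only so the example is deterministic, and nested or concurrent outer calls are not handled.

```python
import time


class NestedTimer:
    """Outer (OGLES2-like) wrappers subtract time measured by inner (Nova-like) wrappers."""

    def __init__(self, clock=time.perf_counter):
        self.clock = clock
        self.driver_time = 0.0  # temp counter shared by both layers
        self.stats = {}         # name -> [inclusive time, exclusive time]

    def wrap_driver(self, func):
        def wrapped(*args):
            start = self.clock()
            try:
                return func(*args)
            finally:
                self.driver_time += self.clock() - start  # written by the inner wrapper
        return wrapped

    def wrap_gl(self, name, func):
        def wrapped(*args):
            self.driver_time = 0.0  # initiated by the outer wrapper
            start = self.clock()
            try:
                return func(*args)
            finally:
                inclusive = self.clock() - start
                entry = self.stats.setdefault(name, [0.0, 0.0])
                entry[0] += inclusive
                entry[1] += inclusive - self.driver_time  # read back by the outer wrapper
        return wrapped


# Fake clock: outer call starts at 0, driver runs from 1 to 4, outer call ends at 5
ticks = iter([0.0, 1.0, 4.0, 5.0])
timer = NestedTimer(clock=lambda: next(ticks))
buffer_unlock = timer.wrap_driver(lambda: None)
draw_elements = timer.wrap_gl("DrawElements", lambda: buffer_unlock())
draw_elements()
print(timer.stats)
```

As noted in the post, the exclusive figure is only right if every driver function reached from the outer call is wrapped; time spent in an unwrapped one stays attributed to the outer layer.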


Re: Porting apitrace
Home away from home
Joined:
2007/9/11 11:31
From Russia
Posts: 5324
@Capehill
I tested the latest version; the output is getting better and better :)

Warp3D Nova profiling results for Shell Process 'neverball':
  Function 
calls used 8545.723744 ms61.75 of context life-time 13838.729117 ms
  Draw calls per second 6868.470723
                      
function | call count |     errors |        duration (ms) |            % of 8545.723744 ms |                  % of CPU time
                 CompileShader |         54 |          |          4215.599711 |                          49.33 |                          30.46
                  DrawElements |      31774 |          |          1338.987729 |                          15.67 |                           9.68
                    DrawArrays |      63275 |          |          1178.443879 |                          13.79 |                           8.52
                  BufferUnlock |     150801 |          |           934.965112 |                          10.94 |                           6.76
                        Submit |      37706 |          |           508.096283 |                           5.95 |                           3.67
                         Clear |        458 |          |            93.854152 |                           1.10 |                           0.68
                   VBOSetArray |     303160 |          |            87.031560 |                           1.02 |                           0.63
                       VBOLock |      70067 |          |            85.922685 |                           1.01 |                           0.62
      CreateVertexBufferObject |        399 |          |            42.516261 |                           0.50 |                           0.31
                      WaitDone |       4700 |          |            27.078478 |                           0.32 |                           0.20
     DestroyVertexBufferObject |        399 |          |            14.160926 |                           0.17 |                           0.10
         BindVertexAttribArray |     233413 |          |            11.594298 |                           0.14 |                           0.08
                   BindTexture |      32813 |          |             4.229619 |                           0.05 |                           0.03
                       Destroy |          |          |             2.163733 |                           0.03 |                           0.02
             SetShaderPipeline |      10211 |          |             1.006978 |                           0.01 |                           0.01
                  FBBindBuffer |          |          |             0.036572 |                           0.00 |                           0.00
            DestroyFrameBuffer |          |          |             0.019168 |                           0.00 |                           0.00
                      WaitIdle |          |          |             0.012993 |                           0.00 |                           0.00
             CreateFrameBuffer |          |          |             0.002847 |                           0.00 |                           0.00
               SetRenderTarget |          |          |             0.000762 |                           0.00 |                           0.00
  Primitive statistics:
    Total vertices 18135608, per second 1310522.915412, per draw call 190.802723, consisting of:
    - Triangles 17629044, per second 1273917.375078, per draw call 185.473219
    - Triangle strips 395172, per second 28556.084887, per draw call 4.157561
    - Points 111392, per second 8049.455447, per draw call 1.171943


OpenGL ES 2.0 profiling results for Shell Process 'neverball':
  Function calls used 9891.772587 ms, 66.09 % of context life-time 14966.655732 ms
  Draw calls (glDraw*) per frame 207.984683. Draw calls per second 6350.844670
  Frames (buffer swaps) per second 30.535156
                      function | call count |     errors |        duration (ms) |            % of 9891.772587 ms |   % of CPU time (incl. driver)
                 CompileShader |         54 |          |          4427.655692 |                          44.76 |                          29.58
                  DrawElements |      31774 |          |          2852.008381 |                          28.83 |                          19.06
                    DrawArrays |      63275 |          |          2278.488471 |                          23.03 |                          15.22
                    TexImage2D |        284 |          |           103.638730 |                           1.05 |                           0.69
                         Clear |        458 |          |            97.271364 |                           0.98 |                           0.65
                    UseProgram |      10210 |          |            79.046758 |                           0.80 |                           0.53
                   BindTexture |      27422 |          |            21.728756 |                           0.22 |                           0.15
                   SwapBuffers |        457 |          |            12.847857 |                           0.13 |                           0.09
           VertexAttribPointer |      93175 |          |            11.570117 |                           0.12 |                           0.08
                   GenTextures |        284 |          |             3.310101 |                           0.03 |                           0.02
                DeleteTextures |        284 |          |             2.304969 |                           0.02 |                           0.02
                 ActiveTexture |       7164 |          |             0.543369 |                           0.01 |                           0.00
       EnableVertexAttribArray |       5100 |          |             0.520752 |                           0.01 |                           0.00
                  ShaderSource |         54 |          |             0.446004 |                           0.00 |                           0.00
                 TexParameteri |        921 |          |             0.391266 |                           0.00 |                           0.00
  Primitive statistics:
    Total vertices 18135608, per second 1211758.455109, per draw call 190.802723, consisting of:
    - Triangles 17629044, per second 1177911.604756, per draw call 185.473219
    - Triangle strips 395172, per second 26404.023081, per draw call 4.157561
    - Points 111392, per second 7442.827273, per draw call 1.17194
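
The summary lines of the OpenGL ES 2.0 block can be cross-checked against the table rows. Below is a minimal Python sketch of that arithmetic. It assumes that "Draw calls (glDraw*)" means DrawElements plus DrawArrays and that one frame equals one SwapBuffers call; the variable names are made up for illustration and are not taken from glSnoop.

```python
# Values copied from the OpenGL ES 2.0 profiling output above.
context_lifetime_ms = 14966.655732
function_calls_ms = 9891.772587
total_vertices = 18135608
calls = {"DrawElements": 31774, "DrawArrays": 63275, "SwapBuffers": 457}

# Assumption: "Draw calls (glDraw*)" = DrawElements + DrawArrays,
# and one frame corresponds to one SwapBuffers call.
draw_calls = calls["DrawElements"] + calls["DrawArrays"]
frames = calls["SwapBuffers"]
lifetime_s = context_lifetime_ms / 1000.0

share_of_lifetime = 100.0 * function_calls_ms / context_lifetime_ms
draw_calls_per_frame = draw_calls / frames
draw_calls_per_second = draw_calls / lifetime_s
frames_per_second = frames / lifetime_s
vertices_per_draw_call = total_vertices / draw_calls

print(f"Function calls: {share_of_lifetime:.2f} % of context life-time")
print(f"Draw calls per frame: {draw_calls_per_frame:.6f}")
print(f"Draw calls per second: {draw_calls_per_second:.6f}")
print(f"Frames per second: {frames_per_second:.6f}")
print(f"Vertices per draw call: {vertices_per_draw_call:.6f}")
```

The per-frame and per-draw-call values agree with the log. The per-second values differ slightly in the later digits, presumably because the tool samples the elapsed time at a slightly different moment than the printed context life-time.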


And profiling stop/pause works too. Though on the first run I got a bit stuck when running "glsnoop GUI PROFILE": the tracing buttons "trace/pause" were ghosted (that's OK), but profiling has "start/finish", and that "finish" made me think for a few seconds before I got that, while it's running, I need to press "finish", then find the place I want to check, and press "start" again. And then it all works, yeeah! :))

If you don't mind, I can upload glsnoop.guide tomorrow, and you can correct it as you see fit. The 0.2 release is already fat enough with new features and co :)

_________________
Join us to improve dopus5!
zerohero's mirror of os4/os3 crosscompiler suites

Re: Porting apitrace
Home away from home
Joined:
2007/9/11 11:31
From Russia
Posts: 5324
@Capehill
Here is the initial version of the guide:
http://kas1e.mikendezign.com/aos4/glsnoop/glsnoop.guide

Some parts are just taken from your simple readme (to be written properly later, with all the explanations and stuff). Some parts, like GUI mode and Q/A, are "N/A" at the moment, but can be filled in for the 0.2 release. Of course, rewrite it as you think it needs to be, or remove things you think aren't needed there. In the meantime I can add some bits: at least I'm thinking about putting an example of profiling into the profile section, describing all the fields, and the same for the tracing section, with example output and a little description too, but that can go into 0.3.

Btw, "duration (ms)" in the ogles2 profiling also means "duration (ms) including Warp3DNova times".

Maybe the output there needs to be changed somehow to point out that all % and ms values include Warp3DNova times, not only CPU time.


Re: Porting apitrace
Quite a regular
Joined:
2007/7/14 20:30
From Lothric
Posts: 713
@kas1e

Cool, thanks. I will add it to repo soon. Maybe a link to Sashimi tool can be added for completeness.

Yes, you are correct, OGLES2 times include Nova times. Maybe it should be removed from table, making columns slimmer. It could be added on a separate line for clarity.

There could also be "calls/s" or "call average duration" column. I think MiniGL profiler has something like that.
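
A rough sketch of how such "calls/s" and "call average duration" columns could be derived from the numbers already in the table. This is an illustration only: the function name `extra_columns` and the output layout are hypothetical, and the row values are copied from the OpenGL ES 2.0 output earlier in the thread.

```python
# (function, call count, duration in ms) copied from the OpenGL ES 2.0 table.
rows = [
    ("CompileShader", 54, 4427.655692),
    ("DrawElements", 31774, 2852.008381),
    ("DrawArrays", 63275, 2278.488471),
    ("SwapBuffers", 457, 12.847857),
]
context_lifetime_ms = 14966.655732


def extra_columns(calls, duration_ms, lifetime_ms):
    """Return (calls per second, average call duration in ms).

    Hypothetical helper, not part of glSnoop.
    """
    calls_per_second = calls / (lifetime_ms / 1000.0)
    # Guard against rows with a zero call count.
    average_ms = duration_ms / calls if calls else 0.0
    return calls_per_second, average_ms


print(f"{'function':>30} | {'call count':>10} | {'calls/s':>12} | {'avg (ms)':>12}")
for name, calls, duration_ms in rows:
    cps, avg = extra_columns(calls, duration_ms, context_lifetime_ms)
    print(f"{name:>30} | {calls:10d} | {cps:12.2f} | {avg:12.6f}")
```

With these columns CompileShader would stand out: few calls, but each one is expensive, whereas the draw calls are cheap individually and add up through sheer count.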

Then, I guess 0.2 would be ready.

(BTW: I have occasional WinFrame DSIs when I scroll around in console, for example after Sashimi logging or some long compilation, should report to OS4 devs. I think the problem is triggered by lots of content on console)


Re: Porting apitrace
Home away from home
Joined:
2007/9/11 11:31
From Russia
Posts: 5324
@Capehill
Quote:

(BTW: I have occasional WinFrame DSIs when I scroll around in console, for example after Sashimi logging or some long compilation, should report to OS4 devs. I think the problem is triggered by lots of content on console)


Yeah, it's quite rare, but I get some WinFrame DSIs too, though not with Sashimi and glsnoop, but when there is heavy output coming from something (like when you compile some big stuff).

I reported it to Tony in the past, and I also reported to him that the current console triggers a lot of MemGuard hits, to which he said that he had started to rewrite it all, so I didn't bother reporting further.

Quote:

Yes, you are correct, OGLES2 times include Nova times. Maybe it should be removed from table, making columns slimmer. It could be added on a separate line for clarity.


Probably no need to remove it, because it's quite useful to see exactly how much time Nova and ogles2 take when doing an operation. Or do you mean removing the "(incl. driver time)" words from the last field and putting them at the top of the table, as in: all 3 last fields incl. driver time?

Maybe there should just be a 3rd table, of ogles2 without the Warp3DNova times, but that will probably mess things up. Having 3 tables will surely be messier than just clarifying things as you say.
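
Instead of a third table, the OGLES2-only share could be estimated on one extra line by subtracting the Nova total from the OGLES2 total. A minimal sketch of that idea, assuming the Warp3D Nova rows posted earlier are the complete table and that all Nova time was spent inside profiled OGLES2 calls (neither is confirmed here, so the result is only an estimate):

```python
# Durations (ms) copied from the Warp3D Nova rows posted earlier, top to bottom.
nova_durations_ms = [
    4215.599711, 1338.987729, 1178.443879, 934.965112, 508.096283,
    93.854152, 87.031560, 85.922685, 42.516261, 27.078478,
    14.160926, 11.594298, 4.229619, 2.163733, 1.006978,
    0.036572, 0.019168, 0.012993, 0.002847, 0.000762,
]
ogles2_total_ms = 9891.772587       # "Function calls used ..." (includes Nova)
context_lifetime_ms = 14966.655732  # OGLES2 context life-time

nova_total_ms = sum(nova_durations_ms)
# Estimate only: assumes every Nova call happened inside a profiled OGLES2 call.
ogles2_only_ms = ogles2_total_ms - nova_total_ms
ogles2_only_share = 100.0 * ogles2_only_ms / context_lifetime_ms

print(f"Warp3D Nova total: {nova_total_ms:.6f} ms")
print(f"OGLES2 without Nova (estimate): {ogles2_only_ms:.6f} ms, "
      f"{ogles2_only_share:.2f} % of context life-time")
```

For this neverball run the estimate comes out at roughly 1346 ms spent in ogles2 itself, the rest being Nova.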


