Synthetic content approach for benchmarking mobile 3D graphics: Work-in-progress
Kari J. Kangas, Mika Qvist, Kari Pulli
Nokia
Outline
- Introduction and motivation
- Related work
- Synthetic benchmark content approach
- Tracing, analyzing, synthesizing
- Preliminary results
- Future work
- Summary
Introduction
- Phones already support 2D/3D vector graphics
  - OpenGL ES, Java M3G, SVG, (OpenVG)
- Vector graphics HW is coming
  - Technologies are adapted from desktop PCs very quickly
- Vector graphics performance
  - Essential part of interactive vector graphics user experience
  - Very complex issue compared to traditional bitmap graphics; performance is highly content dependent
Why OpenGL ES benchmarking?
- Performance optimization
  - Find performance bugs
  - Monitor the progress of optimization work
- Understand 3D graphics platforms
  - How various content affects performance
  - Performance estimates for content developers
- Benchmark data is needed as early as possible
Why is benchmarking challenging?
- Immature platforms
  - Fragile SW environment, no GUI
  - Binary breaks
  - Source code is needed
- Lack of versatile benchmark suites
  - Content ranges from simple UI controls to M3G to native OpenGL ES
  - We need speculative content
Synthetic benchmark content approach
- Key question: is synthetic content similar enough to the real content from the performance point of view?
- Analyze existing OpenGL ES content
  - Tracer, trace player, analyzer
- Create synthetic benchmark content
  - Synthetic content tool
- Ensure the synthetic content matches the original content from the performance point of view
- Use synthetic content for benchmarking
Related work
- Workload characterization
  - Dunwoody & Linton [1990], Mitra and Chiuen [1999], and Antochi et al. [2004]
  - Analyzing workload features
- Render-time estimation
  - Funkhouser and Sequin [1993]
- Benchmark suites
  - SPMark04, 3DMarkMobile06
Tracing OpenGL ES content
- OpenGL ES tracer
  - Store OpenGL ES calls & parameters
- OpenGL ES impl.
  - Render graphics
- Applications used as is
  - No changes, no source code
- Real-time tracing
  - Full trace vs. sample frames
[Diagram: OpenGL ES application, OpenGL ES tracer, OpenGL ES implementation, call trace]
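The tracing idea on this slide (store each call and its parameters, then let the implementation render) can be sketched as a thin forwarding wrapper. This is only an illustration under assumptions: `CallTracer`, `FakeImplementation`, and `demo_trace` are hypothetical names, and the stand-in implementation renders nothing. A real tracer would sit between an unmodified application binary and the native OpenGL ES library.

```python
from typing import Any, Callable, List, Tuple

# One trace entry: (function name, positional parameters).
TraceEntry = Tuple[str, Tuple[Any, ...]]


class CallTracer:
    """Records every call and its parameters, then forwards it unchanged."""

    def __init__(self, implementation: Any) -> None:
        self._impl = implementation
        self.trace: List[TraceEntry] = []

    def __getattr__(self, name: str) -> Callable[..., Any]:
        # Only reached for names not defined on the tracer itself,
        # i.e. the API entry points of the wrapped implementation.
        target = getattr(self._impl, name)

        def traced(*args: Any) -> Any:
            self.trace.append((name, args))  # store call and parameters
            return target(*args)             # let the implementation render

        return traced


class FakeImplementation:
    """Hypothetical stand-in for a real OpenGL ES implementation."""

    def __init__(self) -> None:
        self.rendered: List[str] = []

    def glClear(self, mask: int) -> None:
        self.rendered.append("clear")

    def glDrawArrays(self, mode: int, first: int, count: int) -> None:
        self.rendered.append("draw")


def demo_trace() -> List[TraceEntry]:
    gl = CallTracer(FakeImplementation())
    gl.glClear(0x4000)             # GL_COLOR_BUFFER_BIT
    gl.glDrawArrays(0x0004, 0, 3)  # GL_TRIANGLES, three vertices
    return gl.trace
```

Because the wrapper forwards every call untouched, the application needs no changes, which mirrors the "applications used as is" point above.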
Analyzing OpenGL ES trace
- OpenGL ES trace player
  - Replay the graphics in controlled env.
- OpenGL ES analyzer
  - Extract content features
- Content features
  - Condensed representation
  - Off-line analysis
[Diagram: call trace, trace player, analyzer, content features]
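As a rough illustration of what "extract content features" could mean, the sketch below condenses a call trace into a few per-frame counters. The feature names (`draw_calls`, `vertices_in`, `triangles_in`, `texture_binds`) are assumptions made for this example and are not the analyzer's actual feature set. Frames are assumed to end at `eglSwapBuffers`.

```python
from typing import Any, Dict, List, Tuple

TraceEntry = Tuple[str, Tuple[Any, ...]]

# Primitive mode constants as defined by OpenGL ES.
GL_TRIANGLES = 0x0004
GL_TRIANGLE_STRIP = 0x0005
GL_TRIANGLE_FAN = 0x0006


def triangles_in_draw(mode: int, count: int) -> int:
    """Number of triangles submitted by one draw call with `count` vertices."""
    if mode == GL_TRIANGLES:
        return count // 3
    if mode in (GL_TRIANGLE_STRIP, GL_TRIANGLE_FAN):
        return max(count - 2, 0)
    return 0  # points and lines contribute no triangles


def new_frame() -> Dict[str, int]:
    return {"draw_calls": 0, "vertices_in": 0, "triangles_in": 0, "texture_binds": 0}


def extract_features(trace: List[TraceEntry]) -> List[Dict[str, int]]:
    """Condense a call trace into one small feature record per frame.

    Calls after the last eglSwapBuffers belong to an unfinished frame
    and are ignored.
    """
    frames: List[Dict[str, int]] = []
    current = new_frame()
    for name, args in trace:
        if name in ("glDrawArrays", "glDrawElements"):
            # glDrawArrays(mode, first, count); glDrawElements(mode, count, type, indices)
            mode = args[0]
            count = args[2] if name == "glDrawArrays" else args[1]
            current["draw_calls"] += 1
            current["vertices_in"] += count
            current["triangles_in"] += triangles_in_draw(mode, count)
        elif name == "glBindTexture":
            current["texture_binds"] += 1
        elif name == "eglSwapBuffers":
            frames.append(current)
            current = new_frame()
    return frames
```

The result is far smaller than the trace itself, which is the point of a condensed representation: it can be analyzed off-line and compared across applications.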
Content features: an example
[Table: example data from OpenGL analyzer. Columns: FNUM, TXF, TXBW, TXIBW, TXCBW, TEXA, TEXAM, TEXC, TEXCM, TRIR, TRIBFC, TRIAA, TRIRP, PRIC, VERIN, TRIIN, TPP, ODE, VSS]
Synthetic content tool
- OpenGL ES application
  - Win32, WinCE/PocketPC, Symbian
- Benchmark
  - OpenGL ES frame, drawn repeatedly
- Benchmark suite
  - Collection of benchmarks
- Extensible framework
  - Diverse benchmark actions
  - Action composition
  - Support for animation
[Diagram: content features, Synthetic Content Tool, benchmark suite]
Benchmarks: old SCT

  ["Quake1 frame 11": FILLRATE]
  Surface : WINDOW
  FrameBufferFormat : 16/5/6/5/0/-
  PrimitiveType : TRIANGLES
  TriangleSize : 8*8
  TriangleCount : 1008
  Overdraw : 3
  VertexType : FIXED
  ColorType : OFF
  TexCoordType : FIXED
  InterleavedArrays : LOOSE
  ShadeModel : FLAT
  Blending : OFF
  DepthTest : ON
  AlphaTest : OFF
  ColorMask : 0xF
  DepthMask : ON
  LogicOp : OFF
  Fog : OFF
  PerspectiveCorrectionHint : FASTEST
  Transformation : PERSPECTIVE
  Texture0 : ON
  Texture0Count : 4
  Texture0Size : 128*128
  Texture0Type : RGB565
  Texture0MinFilter : GL_LINEAR
  Texture0MagFilter : GL_LINEAR
  Texture0EnvMode : GL_REPLACE
  Texture0Rotate : 0
  Texture0Scale : 1
  Texture1 : OFF

  ["Lighting": TRANSFORMATION]
  Surface : WINDOW
  FrameBufferFormat : 16/5/6/5/0/-
  TriangleSize : 2*1
  TriangleCount : 8192
  Overdraw : 4
  InterleavedArrays : LOOSE
  ColorType : OFF
  LightCount : 0, 1, 2
  TransPerTriangle : 1.0
  SharingDistance : 0
  VertexType : FIXED
  Transformation : COMPLEX
  Fog : OFF
  Blending : OFF
  BackfacingTrianglePercent : 100
  Normalization : OFF

Monolithic benchmarks with lots of parameters, no reuse
Benchmark: new SCT

  Surface : PBUFFER
  …
  ClearColor : {0,0,0,1}
  …
  ClearColor : ON
  ClearDepth : ON
  Dummy : 1 # ignore
  Projection : ORTHO
  …
  ShadeModel : GL_FLAT
  …
  Operation : FILLZ
  …
  Iterations : 1000
  Iterations : 1
  PrimitiveType : GL_TRIANGLE_STRIP
  TriangleWidth : 176
  TriangleHeight : 208
  TriangleCount : 2
  VertexType : GL_FIXED
  ColorType : GL_UNSIGNED_BYTE
  TexCoordType : GL_FIXED
  TexCoordUnits : 2
  NormalType : GL_FIXED
  InterleavedArrays : OFF
  UseVBO : OFF

  ["Max. screen clear rate benchmark":BENCHMARK]
  InitActions : DispSetup+ClearSetup+Camera
  BenchmarkActions : ClearScreen+EndFrame

  ["Marketing benchmark":BENCHMARK]
  InitActions : DispSetup+FlatSetup+Camera
  BenchmarkActions : Mesh+EndFrame

  ["Composite benchmark":BENCHMARK]
  InitActions : DispSetup+FlatSetup+Camera
  BenchmarkActions : Mem+Cpu+Mesh+EndFrame

Reusable benchmark actions, support for composition, …
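The composition syntax above (`InitActions` and `BenchmarkActions` written as `+`-separated action names) could be interpreted roughly as follows. This is a minimal sketch, not the tool's implementation: the registry, `compose`, and `run_benchmark` are hypothetical, and each action here only appends its name to a log instead of issuing OpenGL ES calls.

```python
from typing import Callable, Dict, List

# An action receives a shared log; a real action would issue rendering calls.
Action = Callable[[List[str]], None]


def make_action(label: str) -> Action:
    def action(log: List[str]) -> None:
        log.append(label)
    return action


# Hypothetical registry keyed by the action names used on the slide.
ACTIONS: Dict[str, Action] = {
    name: make_action(name)
    for name in ("DispSetup", "FlatSetup", "Camera", "Mem", "Cpu", "Mesh", "EndFrame")
}


def compose(spec: str, registry: Dict[str, Action]) -> List[Action]:
    """Turn 'A+B+C' into the ordered list of registered actions."""
    names = [part.strip() for part in spec.split("+") if part.strip()]
    unknown = [name for name in names if name not in registry]
    if unknown:
        raise KeyError(f"unknown actions: {unknown}")
    return [registry[name] for name in names]


def run_benchmark(init_spec: str, frame_spec: str, frames: int,
                  registry: Dict[str, Action] = ACTIONS) -> List[str]:
    """Run the init actions once, then the benchmark actions once per frame."""
    log: List[str] = []
    for action in compose(init_spec, registry):
        action(log)
    frame_actions = compose(frame_spec, registry)
    for _ in range(frames):
        for action in frame_actions:
            action(log)
    return log
```

With this shape, a new benchmark is only a new pair of specification strings over existing actions, which is the reuse the old monolithic format lacked.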
Comparing real and synthetic content
- Rendering time per frame
  - Trace vs. synthetic content
  - Different platforms
- Compare
  - Different platforms
- Modify SCT to improve the match
  - New parameters
  - Improved actions
[Diagram: call trace, trace player (OGLES1, OGLES2), rendering time; content features, synthetic content (OGLES1, OGLES2), rendering time]
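One simple way to quantify the comparison is the relative difference between per-frame rendering times of the replayed trace and of the synthetic content. The slides do not define a metric, so the mean absolute relative error and the 10 % default tolerance below are assumptions chosen for illustration.

```python
from statistics import mean
from typing import Sequence


def fps_to_frame_time_ms(fps: float) -> float:
    """Convert a frame rate to the average rendering time per frame."""
    return 1000.0 / fps


def relative_error(real_ms: float, synthetic_ms: float) -> float:
    """Signed error of the synthetic frame time relative to the real one."""
    return (synthetic_ms - real_ms) / real_ms


def mean_abs_relative_error(real_ms: Sequence[float],
                            synthetic_ms: Sequence[float]) -> float:
    """Average magnitude of the per-frame relative error."""
    if not real_ms or len(real_ms) != len(synthetic_ms):
        raise ValueError("need two non-empty sequences of equal length")
    return mean(abs(relative_error(r, s)) for r, s in zip(real_ms, synthetic_ms))


def matches(real_ms: Sequence[float], synthetic_ms: Sequence[float],
            tolerance: float = 0.10) -> bool:
    """True if the synthetic content is within the (assumed) tolerance."""
    return mean_abs_relative_error(real_ms, synthetic_ms) <= tolerance
```

The same function can be run once per platform. If the error is small on one platform but large on another, the synthetic content is missing a feature that matters on the second, which is the cue to add parameters or improve actions.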
Example
- Early OpenGL SCT proto: trace from Quake 2 demo 2, 600 MHz P3, 256 MB RAM (SW OpenGL)
Example
- Early OpenGL SCT proto: trace from Quake 2 demo 2, 3 GHz Xeon, 2 GB RAM, nVidia QuadroFX 500
- SCT proto was VERY primitive (single triangle rendered repeatedly, etc.)
Preliminary results
- Real content vs. synthetic content
  - Real OpenGL ES content analyzed by hand, very rough content features
  - ~20 FPS real vs. 24 FPS synthetic on mobile 3D hardware
- Understanding 3D performance
  - Understanding content
  - Creating "realistic" synthetic content
Future work
- OpenGL ES tracer, trace player, analyzer
  - OpenGL ES content analysis
- Creating accurate synthetic content
  - Mapping content features to SCT input
  - Designing good benchmark actions, action compositions
- Different types of workloads
  - CPU, memory, audio playback, game physics engine
Summary
- Outline of the synthetic content approach
- Preliminary results indicate that it should be possible to match synthetic content performance to the real content: work-in-progress
Thank you! Questions?