Lu Hao: Profiling-Based Hardware/Software Co-Exploration for the Design of Video Coding Architectures, by Heiko Hübert and Benno Stabernack.

1 Lu Hao Profiling-Based Hardware/Software Co-Exploration for the Design of Video Coding Architectures Heiko Hübert and Benno Stabernack

2 Contents 1. Background 2. MEMTRACE profiler 3. Software/Hardware Optimization 4. Conclusion

3 Background -- profiling
  - Profiling is used to understand the run-time behavior of applications.

4 Efficient profiling approaches
  - Software profiling
    - Sampling, instrumentation
    - Flexible, but with high overhead
  - Hardware profiling
    - Performance counters
    - Inexpensive, but more rigid and may not be universally available
  - Hybrid
    - Combinations of the above
    - Hold great potential since they combine the advantages of both while avoiding their individual drawbacks
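As a rough illustration of software profiling by instrumentation, the sketch below wraps a function so that every call updates a call counter and an elapsed-time total. The names `stats`, `instrumented`, and `sum_abs_diff` are hypothetical and are not taken from the presentation. The wrapper itself adds work to every call, which is the overhead the slide refers to.

```python
import time
from collections import defaultdict
from functools import wraps

# Per-function statistics gathered by the instrumentation (hypothetical layout).
stats = defaultdict(lambda: {"calls": 0, "seconds": 0.0})


def instrumented(func):
    """Wrap func so that every call updates its call count and elapsed time."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            entry = stats[func.__name__]
            entry["calls"] += 1
            entry["seconds"] += time.perf_counter() - start
    return wrapper


@instrumented
def sum_abs_diff(a, b):
    """Toy kernel standing in for a video coding function."""
    return sum(abs(x - y) for x, y in zip(a, b))


for _ in range(3):
    sum_abs_diff([1, 2, 3], [3, 2, 1])
```

After the loop, `stats["sum_abs_diff"]` holds the call count and the accumulated wall-clock time for the toy kernel.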

5 An example of hardware profiling  PC – Performance Counter

6 Background – system analysis
  - Why do we need profiling?
  - To find an efficient solution, the system must be adapted to the application.
  - Video coding

7 Contents 1. Background 2. MEMTRACE profiler 3. Software/Hardware Optimization 4. Conclusion

8 MEMTRACE profiler
  - MEMTRACE delivers cycle-accurate profiling results at the C function level.
  - The results include clock cycles, various memory access statistics, and optionally an energy consumption estimate for reduced instruction set computer (RISC)-based processors.
  - The focus is on memory access analysis, because for data-intensive applications this aspect has a high potential for increasing system efficiency.

9 MEMTRACE profiling toolflow

10 MEMTRACE -- Initialization

11 MEMTRACE – Performance Analysis

12 MEMTRACE – Post Processing

13 MEMTRACE backend

14 MEMTRACE -- Profiling data acquisition

15
  - init()
    - Initializes the profiler
    - Creates a list of all functions and global variables
  - nextInstruction()
    - Checks whether program execution has changed from one function to another
    - If so, the cycle count of the previous function is recalculated and the call count of the new function is incremented
  - memoryAccess()
    - Determines whether a load or a store access was performed and which bit width (8, 16, or 32 bits) was used

16 MEMTRACE -- Profiling data acquisition
  - busActivity()
    - Identifies the bus status (idle cycle, core access, or DMA access) and increments the appropriate counter of the current function
  - cacheMiss()
    - Called each time a cache miss occurs
  - finish()
    - Called when the ISS (instruction set simulator) terminates the simulation
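The callbacks listed on the two data acquisition slides can be pictured with the following minimal sketch. It is not MEMTRACE's actual implementation: the class `ProfilerSketch`, the Python-style method names, the `(name, start, end)` address ranges, and the explicit `pc` and `cycle` arguments are all assumptions made for illustration. In particular, this sketch counts every entry into a function's address range, including a return into the caller, as a call.

```python
from collections import defaultdict


class FunctionStats:
    """Counters kept per profiled function (field names are illustrative)."""

    def __init__(self):
        self.calls = 0
        self.cycles = 0
        self.loads = defaultdict(int)    # bit width -> number of loads
        self.stores = defaultdict(int)   # bit width -> number of stores
        self.bus = defaultdict(int)      # bus status -> number of cycles
        self.cache_misses = 0


class ProfilerSketch:
    def init(self, functions):
        """functions: list of (name, start_address, end_address) tuples."""
        self.functions = functions
        self.stats = {name: FunctionStats() for name, _, _ in functions}
        self.current = None      # name of the function being executed
        self.entry_cycle = 0     # cycle at which it was entered

    def _function_at(self, pc):
        for name, start, end in self.functions:
            if start <= pc < end:
                return name
        return None

    def next_instruction(self, pc, cycle):
        name = self._function_at(pc)
        if name == self.current:
            return
        # Execution moved to another function: close out the previous one.
        if self.current is not None:
            self.stats[self.current].cycles += cycle - self.entry_cycle
        # Simplification: a return into the caller is counted like a call.
        if name is not None:
            self.stats[name].calls += 1
        self.current = name
        self.entry_cycle = cycle

    def memory_access(self, is_store, bit_width):
        if self.current is None:
            return
        s = self.stats[self.current]
        (s.stores if is_store else s.loads)[bit_width] += 1

    def bus_activity(self, status):
        if self.current is not None:
            self.stats[self.current].bus[status] += 1

    def cache_miss(self):
        if self.current is not None:
            self.stats[self.current].cache_misses += 1

    def finish(self, cycle):
        # Attribute the remaining cycles to the function that was running.
        if self.current is not None:
            self.stats[self.current].cycles += cycle - self.entry_cycle
            self.current = None
        return self.stats


# Tiny hand-written trace standing in for the simulator's callbacks.
profiler = ProfilerSketch()
profiler.init([("main", 0, 16), ("dct", 16, 32)])
profiler.next_instruction(0, 0)      # start in main
profiler.next_instruction(16, 2)     # main -> dct
profiler.memory_access(False, 32)    # 32-bit load inside dct
profiler.cache_miss()
profiler.bus_activity("core")
profiler.next_instruction(8, 12)     # dct -> back in main
results = profiler.finish(15)
```

In this trace, `dct` is charged the cycles between its entry and the return to `main`, and the load, cache miss, and bus access are attributed to it because it was the current function when they occurred.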

17 Processor model generator

18 Interconnection

19
  - What can we do with the results of the MEMTRACE profiler?

20 Contents 1. Background 2. MEMTRACE profiler 3. Software/Hardware Optimization 4. Conclusion

21
  - System partitioning
    - Computationally intensive functions are well suited for hardware acceleration in a coprocessor
    - Control-intensive functions are better suited for software implementation on ASIPs (application-specific instruction-set processors)
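One way to picture this partitioning rule is a simple classifier over per-function profile data. Everything in the sketch is an assumption for illustration: the `partition` helper, the `branches` and `instructions` fields, and both thresholds are hypothetical and are not part of the presentation or of MEMTRACE.

```python
def partition(profile, cycle_share_min=0.2, branch_ratio_max=0.1):
    """Suggest a target for each function from a per-function profile.

    profile maps a function name to a dict with the keys "cycles",
    "instructions" and "branches" (hypothetical fields). A function that
    takes a large share of all cycles and contains few branches is treated
    as computationally intensive; everything else stays in software.
    """
    total_cycles = sum(f["cycles"] for f in profile.values())
    plan = {}
    for name, f in profile.items():
        share = f["cycles"] / total_cycles if total_cycles else 0.0
        branch_ratio = (
            f["branches"] / f["instructions"] if f["instructions"] else 0.0
        )
        if share >= cycle_share_min and branch_ratio <= branch_ratio_max:
            plan[name] = "coprocessor"
        else:
            plan[name] = "software"
    return plan


# Made-up numbers, chosen only to exercise both branches of the rule.
example_profile = {
    "idct": {"cycles": 600, "instructions": 500, "branches": 20},
    "entropy_decode": {"cycles": 300, "instructions": 250, "branches": 80},
    "parse_header": {"cycles": 100, "instructions": 90, "branches": 30},
}
plan = partition(example_profile)
```

With these made-up numbers only `idct` is suggested for the coprocessor; `entropy_decode` consumes many cycles but is branch-heavy, so it stays in software.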

22 Software Optimization
  - Loop unrolling
  - For computationally intensive parts, arithmetic optimizations or SIMD instructions can be applied if such instructions are available in the processor
  - Video applications
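The loop unrolling transformation can be shown on a sum of absolute differences, a typical video coding kernel. This is only a sketch of the transformation: the function names are hypothetical, and the performance benefit would be expected in compiled code on the target processor, not in Python.

```python
def sad_rolled(a, b):
    """Sum of absolute differences, one element per loop iteration."""
    total = 0
    for i in range(len(a)):
        total += abs(a[i] - b[i])
    return total


def sad_unrolled4(a, b):
    """Same computation with the loop body unrolled four times."""
    n = len(a)
    limit = n - n % 4          # largest multiple of 4 not exceeding n
    total = 0
    i = 0
    while i < limit:
        total += abs(a[i] - b[i])
        total += abs(a[i + 1] - b[i + 1])
        total += abs(a[i + 2] - b[i + 2])
        total += abs(a[i + 3] - b[i + 3])
        i += 4
    while i < n:               # remainder loop for the last n % 4 elements
        total += abs(a[i] - b[i])
        i += 1
    return total


block_a = list(range(10))
block_b = [9] * 10
```

Both versions return the same result for any input length; the unrolled one executes fewer loop-control steps per element, which is what the optimization aims for.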

23 Hardware Optimization
  - Memory subsystem optimizations
    - External memory
    - Cache (cache misses): the data areas with the most cache misses and the smallest size should be stored in on-chip memory
    - SRAM
  - Instruction set architecture optimizations
    - Frequently used instructions should be considered as targets for optimization during the processor architecture development
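The rule of placing data areas with many cache misses and a small size in on-chip memory can be sketched as a greedy selection by misses per byte. The helper `select_for_onchip`, the field names, and the example numbers are hypothetical; the presentation does not state which selection procedure is used.

```python
def select_for_onchip(areas, capacity_bytes):
    """Pick data areas for on-chip memory, densest in cache misses first.

    areas is a list of dicts with the keys "name", "size" (bytes, > 0) and
    "misses" (cache misses observed for that area). Areas are ranked by
    misses per byte and added while they still fit into the capacity.
    """
    ranked = sorted(areas, key=lambda a: a["misses"] / a["size"], reverse=True)
    chosen = []
    used = 0
    for area in ranked:
        if used + area["size"] <= capacity_bytes:
            chosen.append(area["name"])
            used += area["size"]
    return chosen


# Made-up profile of three data areas and a 2 KiB on-chip SRAM.
example_areas = [
    {"name": "frame_buffer", "size": 101376, "misses": 90000},
    {"name": "coeff_table", "size": 512, "misses": 4000},
    {"name": "mv_cache", "size": 1024, "misses": 3000},
]
onchip = select_for_onchip(example_areas, capacity_bytes=2048)
```

Here the large frame buffer has the most misses in absolute terms but the fewest per byte, so the two small tables are selected instead.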

24 Conclusion
  - Profiling and system analysis
  - MEMTRACE architecture
    - Initialization
    - Performance analysis
    - Post processing
  - Hardware/software optimization
    - Software
    - Hardware

25 Lu Hao And questions?

26 References
  [1] H. Hübert and B. Stabernack, "Profiling-based hardware/software co-exploration for the design of video coding architectures," IEEE Transactions on Circuits and Systems for Video Technology, 2009, pp. 1680-1691.
  [2] ST Microelectronics, Nomadik STn8820 Mobile Multimedia Application Processor (2008, Feb.). Data brief. [Online]. Available: www.st.com
  [3] Broadcom, BCM2820 Low Power, High Performance Application Processor (2006, Sep.). Product brief. [Online]. Available: www.broadcom.com
  [4] G. de Micheli and L. Benini, Network on Chips. San Francisco, CA: Morgan Kaufmann, 2006.
  [5] H. Hübert, "MEMTRACE: A memory, performance and energy profiler targeting RISC-based embedded systems for data-intensive applications," Ph.D. dissertation, Dept. Elect. Eng. Comput. Sci., Tech. Univ. Berlin, Germany, 2009. [Online]. Available: http://opus.kobv.de/tuberlin/volltexte/2009/2261

