Published by Andra Norman. Modified over 9 years ago.
1
“Low-Power, Real-Time Object-Recognition Processors for Mobile Vision Systems”, IEEE Micro, 2012.
Jinwook Oh; Gyeonghoon Kim; Injoon Hong; Junyoung Park; Seungjin Lee; Joo-Young Kim; Jeong-Ho Woo; Hoi-Jun Yoo
Presenter: Juseong Lee, 2013021037
2
Outline
Introduction
Background
Main Idea
Implementation
Conclusion
Evaluation
Object Recognition by Juseong Lee
4
Introduction
[Image. Source: MBN News]
5
Introduction
Object recognition system
– Requires real-time operation
  High performance
  Low power in mobile systems
How can it be implemented?
– Find a suitable algorithm
  SIFT algorithm
– Hardware optimization
  Algorithm optimization
  Build a dedicated processor
– Parallel computation
  Multi-threading
  NoC
SIFT: Scale-Invariant Feature Transform
NoC: Network on Chip
[Image. Source: Volvo]
7
Background Knowledge
What is the SIFT algorithm?
– Scale-Invariant Feature Transform
– The most popular candidate for extracting interest points from an object and describing them
– Robust against changes in translation, scaling, and rotation
[Figure: image matching by SIFT]
8
Background Knowledge
What’s the problem in SIFT-based object recognition?
– Consumes a lot of power
  Owing to the heavy computation required in descriptor generation and matching
– Today’s high-resolution image sensors and tight power budgets
  Make real-time SIFT implementation on mobile devices even harder
  Scarce-resources problem
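To make the matching cost concrete, the sketch below performs brute-force nearest-neighbour matching between two descriptor sets and counts the multiply-accumulate operations it needs. It assumes 128-dimensional descriptors (the standard SIFT descriptor length) and squared Euclidean distance. The function name `match_descriptors` and the random toy data are hypothetical and are not taken from the paper.

```python
import random

def match_descriptors(query, database):
    """Brute-force nearest-neighbour matching (illustrative sketch only).

    Returns (matches, mac_count): for each query descriptor, the index of the
    closest database descriptor, plus the multiply-accumulates performed.
    """
    matches = []
    mac_count = 0
    for q in query:
        best_index, best_dist = -1, float("inf")
        for i, d in enumerate(database):
            # Squared Euclidean distance: one multiply-accumulate per dimension.
            dist = sum((a - b) ** 2 for a, b in zip(q, d))
            mac_count += len(q)
            if dist < best_dist:
                best_index, best_dist = i, dist
        matches.append(best_index)
    return matches, mac_count

DIM = 128  # assumed descriptor length (standard SIFT descriptor size)
rng = random.Random(0)
database = [[rng.random() for _ in range(DIM)] for _ in range(50)]
# Queries are slightly noisy copies of database entries 3, 17 and 42.
query = [[v + rng.uniform(-0.01, 0.01) for v in database[i]] for i in (3, 17, 42)]

matches, macs = match_descriptors(query, database)
print(matches, macs)
```

The cost grows as (number of query descriptors) x (number of database descriptors) x 128, which is why descriptor matching weighs so heavily on a tight power budget.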
10
Main Idea
How can we solve the problem?
– Make an object-recognition processor
Using an attention-based recognition algorithm
– For energy efficiency
A heterogeneous multicore architecture
– For data and thread parallelism
Network-on-Chip (NoC) communication
– For high bandwidth
The processor determines the Region of Interest (ROI) part of the image
– For minimizing unnecessary computations
Heterogeneous multicore architecture
– Provides several types of parallelism
– Achieves high throughput
– Low power consumption
The high-bandwidth NoC serves as the communications backbone
11
Why find ROI?
A plain image-processing algorithm has no regard for throughput
Image size: 480 x 360
172,800 computations (one per pixel)!
Example) Edge detection
Objects have features!
You can select only part of the image to reduce computation!
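The arithmetic on this slide can be sketched as follows. The 480 x 360 frame size comes from the slide. The 40 x 40 tile size, the toy frame, and the variance test used to pick tiles are assumptions chosen only for illustration; they are not the attention algorithm used by the processor.

```python
WIDTH, HEIGHT = 480, 360      # frame size from the slide
TILE = 40                     # hypothetical tile size (divides both dimensions)

def make_frame():
    """Toy frame: flat background with one textured 80 x 80 'object'."""
    frame = [[0] * WIDTH for _ in range(HEIGHT)]
    for y in range(120, 200):
        for x in range(200, 280):
            frame[y][x] = (x * 7 + y * 13) % 256
    return frame

def roi_tiles(frame, threshold=1.0):
    """Return (row, col) of every tile whose pixel variance exceeds threshold."""
    selected = []
    for ty in range(0, HEIGHT, TILE):
        for tx in range(0, WIDTH, TILE):
            pixels = [frame[y][x]
                      for y in range(ty, ty + TILE)
                      for x in range(tx, tx + TILE)]
            mean = sum(pixels) / len(pixels)
            variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
            if variance > threshold:
                selected.append((ty // TILE, tx // TILE))
    return selected

frame = make_frame()
tiles = roi_tiles(frame)
full_cost = WIDTH * HEIGHT             # one computation per pixel
roi_cost = len(tiles) * TILE * TILE    # only the pixels inside ROI tiles
print(full_cost, roi_cost, len(tiles))
```

Note that this toy selection step itself scans every pixel. The saving is only real when the attention stage is much cheaper per pixel than the processing it lets you skip.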
12
Main Idea – BONE-V
[Figure panels: using the conventional method; using the main idea]
13
Main Idea – Algorithm
Attention-based object recognition
14
Main Idea – Architecture
Pixel-level parallelism
Very long instruction word (VLIW)
3-stage task-level pipeline
1.5x↓ power consumption
5-stage fine-grained pipeline
3.45x↑ pipeline throughput
15
SMT-enabled heterogeneous multicore processor
Throughput-optimized SFEC
– Finds ROI tiles for energy efficiency
– Memory locality with high bandwidth utilization
Latency-optimized FMP
– ROI tiles and the NoC help reduce latency
Power-optimized MLE
– Dynamically changes the cores’ thread allocation, operating voltage, and frequency
BONE-V5:
SFEC: SMT-enabled Feature Extraction Cluster
FMP: Feature Matching Processor
MLE: Machine Learning Engine
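Why changing voltage and frequency together saves power can be seen from the standard CMOS dynamic-power approximation P ≈ α·C·V²·f. The sketch below applies that formula. The capacitance, activity factor, and the two operating points are hypothetical numbers, not BONE-V5 measurements.

```python
def dynamic_power(c_eff, v_dd, freq_hz, activity=1.0):
    """Approximate CMOS dynamic power in watts: P = activity * C * V^2 * f."""
    return activity * c_eff * v_dd ** 2 * freq_hz

C_EFF = 1e-9  # hypothetical effective switched capacitance (farads)

# Hypothetical operating points: (supply voltage in volts, clock in hertz).
high = dynamic_power(C_EFF, 1.2, 200e6)
low = dynamic_power(C_EFF, 0.9, 100e6)

print(f"high: {high:.3f} W, low: {low:.3f} W, ratio: {low / high:.3f}")
```

Halving the frequency alone would halve the power but leave the energy per operation unchanged. Lowering the voltage as well scales the energy per operation by (0.9 / 1.2)² ≈ 0.56 in this toy setting.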
17
Implementation
18
Implementation - Comparison
19
Implementation - Comparison
21
Conclusion
An energy-efficient system is important for improving performance
The algorithm and the architecture have to be optimized together
BONE-V multicore processors can be applied to real-time object recognition systems
Future BONE-V processors will further lower power consumption
23
Evaluation
Table 3 should include results comparing BONE-V with other recognition processors
In hardware optimization, not only the overall algorithm but also particular algorithm blocks need to be optimized
– CORDIC-based gradient and magnitude computation
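As an illustration of the block named above, the sketch below uses CORDIC in vectoring mode to obtain gradient magnitude and orientation from the horizontal and vertical differences (gx, gy) with shift-and-add style steps. It is a floating-point model written for readability. The iteration count, the function name, and the quadrant handling are assumptions, not the paper's hardware design.

```python
import math

ITERATIONS = 16  # assumed iteration count; more iterations give more precision

# Precomputed rotation angles atan(2^-i) and the constant CORDIC gain.
ANGLES = [math.atan(2.0 ** -i) for i in range(ITERATIONS)]
GAIN = 1.0
for i in range(ITERATIONS):
    GAIN *= math.sqrt(1.0 + 2.0 ** (-2 * i))

def cordic_magnitude_angle(gx, gy):
    """Return (magnitude, angle in radians) of the vector (gx, gy)."""
    x, y, angle = float(gx), float(gy), 0.0
    # Pre-rotate by 180 degrees into the right half-plane, where CORDIC converges.
    if x < 0:
        x, y = -x, -y
        angle = math.pi if gy >= 0 else -math.pi
    for i in range(ITERATIONS):
        shift = 2.0 ** -i          # a right shift by i bits in fixed-point hardware
        if y > 0:                  # rotate clockwise to drive y toward zero
            x, y = x + y * shift, y - x * shift
            angle += ANGLES[i]
        else:                      # rotate counter-clockwise
            x, y = x - y * shift, y + x * shift
            angle -= ANGLES[i]
    # x now holds GAIN * magnitude; divide the constant gain back out.
    return x / GAIN, angle

mag, ang = cordic_magnitude_angle(3, 4)
print(round(mag, 3), round(math.degrees(ang), 2))
```

In hardware the multiplications by `shift` become bit shifts and the division by `GAIN` is a single constant scaling, so no general multiplier, square root, or arctangent unit is needed. The angle returned for a zero vector is meaningless in this sketch.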
24
Thank you for listening!
Juseong_lee@korea.ac.kr