The World Leader in High Performance Signal Processing Solutions SMP Implementing on Blackfin BF561 Graf Yang ( 杨明明 ) Oct 18, 2008.

1 The World Leader in High Performance Signal Processing Solutions SMP Implementing on Blackfin BF561 Graf Yang ( 杨明明 ) Oct 18, 2008

2 Agenda
  - BF561 architecture
  - Cache coherency solution
  - Interrupt dispatch
  - SMP status and applications
  - SMP performance
  - Limitations

3 BF561 architecture

4 BF561 architecture (cont.)
  Block diagram

5 BF561 architecture (cont.)
  Memory architecture
  L1: runs at core speed
    - L1 scratchpad SRAM: 4 KB
    - L1 instruction cache: 16 KB
    - L1 instruction SRAM: 16 KB
    - L1 data cache: 32 KB
    - L1 data SRAM: 32 KB
  L2: runs at 1/2 core speed
    - Data or instruction SRAM: 128 KB
    - Shared by CoreA/B
    - Cached (disabled in SMP)

6 BF561 architecture (cont.)
  Comparison with x86

7 BF561 architecture (cont.)
  How to boot CoreB

8 Cache coherency solution
  Why cache coherence is needed
    - Both cores must see consistent values for jiffies, spin locks, semaphores, mutexes, ...

9 Cache coherency solution (cont.)
  Cache policy
    - Main memory: write-through
    - Shared on-chip SRAM (L2 SRAM): not cacheable
  Global lock: protects atomic data
    - A special spin lock that resides in the shared on-chip SRAM (L2 SRAM)
    - Operating functions: _get_core_lock/_put_core_lock
    - Parameter: address of the atomic data
  Spin lock: based on the global lock
    - Invalidates the entire data cache if the same lock has been taken by another CPU
  Atomic ops: based on the global lock
    - The global lock protects the atomic operations
  Memory barrier
    - Invalidates the entire data cache
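The scheme on this slide can be sketched in portable C11 as a host-side simulation. This is not the Blackfin kernel code: the `sim_` names, the `last_owner` field, and the invalidation counter are assumptions made for illustration. A C11 `atomic_flag` stands in for the lock word in uncached L2 SRAM, and the address parameter is accepted only to mirror the interface described above.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_CPUS 2

/* Stand-in for the global lock word kept in uncached L2 SRAM. */
static atomic_flag sim_core_lock = ATOMIC_FLAG_INIT;

/* Counts simulated whole-D-cache invalidations (hypothetical hook). */
static unsigned long sim_dcache_invalidations;

static void sim_invalidate_entire_dcache(void)
{
    sim_dcache_invalidations++;
}

/* The address of the protected data is taken as a parameter, as on the
 * slide; this simulation does not use it. */
static void sim_get_core_lock(void *atomic_data)
{
    (void)atomic_data;
    while (atomic_flag_test_and_set_explicit(&sim_core_lock,
                                             memory_order_acquire))
        ; /* spin until the other core releases the global lock */
}

static void sim_put_core_lock(void *atomic_data)
{
    (void)atomic_data;
    atomic_flag_clear_explicit(&sim_core_lock, memory_order_release);
}

/* Atomic add built on the global lock. */
static int sim_atomic_add(int *v, int delta)
{
    int result;

    sim_get_core_lock(v);
    *v += delta;
    result = *v;
    sim_put_core_lock(v);
    return result;
}

struct sim_spinlock {
    int locked;     /* 1 while held; only accessed under the global lock */
    int last_owner; /* CPU that last held the lock, -1 if never taken */
};

static void sim_spin_lock(struct sim_spinlock *l, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    for (;;) {
        sim_get_core_lock(l);
        if (!l->locked) {
            /* Data guarded by this lock may be stale in our cache only
             * if the other CPU held the lock since we last did. */
            int stale = (l->last_owner != -1 && l->last_owner != cpu);

            l->locked = 1;
            l->last_owner = cpu;
            sim_put_core_lock(l);
            if (stale)
                sim_invalidate_entire_dcache();
            return;
        }
        sim_put_core_lock(l);
    }
}

static void sim_spin_unlock(struct sim_spinlock *l)
{
    sim_get_core_lock(l);
    l->locked = 0;
    sim_put_core_lock(l);
}
```

In this sketch a CPU that re-takes a lock it was the last to hold skips the invalidation, which is one plausible reading of "if the same lock has been taken by another CPU".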

10 Interrupt dispatch
  - Peripheral interrupts trigger both cores
  - Two kinds of IRQ handlers

11 Interrupt dispatch (cont.)
  Time monotonicity problem
    - Using the two core timers makes time non-monotonic
    - Using a gptimer with 'handle_simple_irq' causes CoreB to stick
  Solution
    - Use general-purpose timer0 instead of the core timers
    - Use handle_percpu_irq() instead of handle_simple_irq()

12 Interrupt dispatch (cont.)
  Inter-processor interrupt: SICB_SYSCR
    - Writing 1 to CA_supplement_int0 triggers an interrupt to CoreA
    - Writing 1 to CB_supplement_int0 triggers an interrupt to CoreB
    - The interrupt handler writes 1 to the relevant bit to clear the interrupt request
  Inter-processor interrupt implementation
    - Per-CPU message queue
    - Per-CPU spin lock
    - Per-CPU interrupt
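The message-queue-plus-interrupt pattern above can be sketched as a single-threaded simulation in C. Everything here is an assumption for illustration: `sim_syscr` is a plain variable standing in for the register, its bit positions are not the real SICB_SYSCR layout, and the function names are hypothetical. The per-CPU spin lock that would guard each queue is only marked in comments because nothing runs concurrently in this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS   2
#define QUEUE_LEN 8

/* Simulated register: bit n set means a supplemental interrupt is
 * pending for CPU n (illustrative layout only). */
static uint32_t sim_syscr;

struct ipi_queue {
    int msgs[QUEUE_LEN];
    unsigned head; /* index of the next message to read */
    unsigned tail; /* index of the next free slot */
};

static struct ipi_queue sim_queues[NR_CPUS]; /* one queue per CPU */
static int sim_handled_sum[NR_CPUS];         /* what each CPU consumed */

/* Sender side: queue a message for the target CPU, then raise its
 * supplemental interrupt. Returns 0 on success, -1 if the queue is full. */
static int sim_ipi_send(int cpu, int msg)
{
    struct ipi_queue *q;

    assert(cpu >= 0 && cpu < NR_CPUS);
    q = &sim_queues[cpu];
    /* a real implementation would take the per-CPU spin lock here */
    if (q->tail - q->head == QUEUE_LEN)
        return -1;
    q->msgs[q->tail % QUEUE_LEN] = msg;
    q->tail++;
    /* ...and release it here */
    sim_syscr |= (uint32_t)1 << cpu;
    return 0;
}

/* Receiver side: acknowledge the interrupt, then drain the queue.
 * Returns the number of messages handled. */
static int sim_ipi_handle(int cpu)
{
    struct ipi_queue *q;
    int n = 0;

    assert(cpu >= 0 && cpu < NR_CPUS);
    q = &sim_queues[cpu];
    if (!(sim_syscr & ((uint32_t)1 << cpu)))
        return 0; /* nothing pending for this CPU */

    /* On the hardware the handler writes 1 to the bit to clear the
     * request; the simulated register just drops the bit. */
    sim_syscr &= ~((uint32_t)1 << cpu);

    while (q->head != q->tail) {
        sim_handled_sum[cpu] += q->msgs[q->head % QUEUE_LEN];
        q->head++;
        n++;
    }
    return n;
}
```

Clearing the pending bit before draining means a message that arrives during the drain raises the interrupt again instead of being lost.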

13 SMP status and application (cont.)
  SMP status
    - 2008R1.5: svn://sources.blackfin.uclinux.org/svn/uclinux-dist/branches/2008R1/bfin_patch/smp_patch/
    - Trunk: svn://sources.blackfin.uclinux.org/svn/uclinux-dist/trunk/bfin_patch/smp_patch/
  Application: multi-task
    - Video encoder/decoder: codec1 on CoreB, codec2 on CoreA
    - VoIP: codec on CoreB, network stack on CoreA
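The slide does not say how each task is placed on a core. On a Linux SMP kernel, one generic way is the CPU affinity API, sketched below for a glibc-style environment. Whether the Blackfin toolchain's C library exposes these calls is an assumption, and the helper names are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Return the lowest CPU number the calling thread may run on,
 * or -1 if the affinity mask cannot be read. */
static int first_allowed_cpu(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}

/* Restrict the calling thread to a single CPU, e.g. 0 for CoreA or
 * 1 for CoreB. Returns 0 on success, -1 on error (errno is set). */
static int pin_self_to_cpu(int cpu)
{
    cpu_set_t set;

    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}
```

A codec task would call `pin_self_to_cpu(1)` at start-up and the network task `pin_self_to_cpu(0)`, assuming the kernel numbers CoreA as CPU 0 and CoreB as CPU 1.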

14 SMP performance
  Whetstone test result
    - Test software: Whetstone
    - Test hardware: BF561, core clock 600 MHz, system clock 100 MHz
    - Test environment 1: UP
    - Test environment 2: SMP
  Performance analysis
    - Entire data cache invalidated 79130 times during the Whetstone test

15 Limitations
  - Store routines to L1 I-SRAM
  - Store shared data to L1 D-SRAM
  - User multi-threads running on different cores

16 Questions?

17 The World Leader in High Performance Signal Processing Solutions The End Thank you!

