The World Leader in High Performance Signal Processing Solutions Heterogeneous Multicore for blackfin implementation Open Platform Solutions Steven Miao.



Outline
- Modern SoC architecture
- Generic AMP/IPC framework overview
- Intercore communication protocol
- Userspace API for intercore communication (MCAPI)
- Q & A

The emergence of multicore
- With ever-increasing demands on devices, multicore solutions are becoming more prevalent.
- Performance
- Power management
- Cost

Today's SoC architecture

Ingredients for multicore development
- SoC architecture
  - Homogeneous cores
  - Heterogeneous cores
- Operating system, multicore processing
  - SMP
  - AMP
  - Hybrid
- Application
  - A collection of threads or tasks
- Inter-core communication or message-passing mechanism

Different types of multicore
- Homogeneous multicore: all cores on the chip have the same instruction set and data structures.
- Heterogeneous multicore: at least one core differs from the others, with a different instruction set or data structures.

Multicore processing
- AMP (asymmetric multiprocessing): a separate OS, or a separate instance of the same OS, runs on each CPU.
- SMP (symmetric multiprocessing): a single OS instance manages all cores simultaneously, and applications can float to any of them.

Embedded heterogeneous multicore usage
- Performance: increase system performance through multicore hardware and software parallelism.
- Migration: avoid costly porting and software rewrite efforts.
- Usability: enhance an existing real-time device by adding user-interface features on a general-purpose operating system.
- Reliability/security: improve system robustness by protecting and isolating operating environments from each other.
- Even safety-critical software

Programming model on heterogeneous multicore
- Master and slave model
- Typically the GP core runs an operating system such as Linux or Microsoft Windows Embedded CE.
- The DSP acts as a slave that offloads the GP core or accelerates application-specific tasks such as real-time processing, audio/video codecs, signal processing, and power management. It runs an OS such as VDSP, uC/OS, or Nucleus.
- We need a generic AMP/IPC mechanism.

A generic AMP/IPC framework
- Some existing solutions
  - SysLink from TI (syslink, dsplink, dspbridge)
  - QMI from Qualcomm (Qualcomm MSM/modem interface)
- Generic IPC framework features for embedded systems
  - Control (power on, boot, power off)
  - Communication
  - Synchronization
  - Resource management
  - Minimal footprint
  - Source-level portability
  - Scalability

Hardware requirements
- Each core can interrupt the other (required)
- Mailbox register for buffer transfer (optional)
- Shared memory between cores (required)
  - 16-bit and 32-bit shared memory reads must be indivisible
  - It must be possible to either disable or flush the data cache on the shared memory region
  - DMA to shared memory is available from each core
- Same memory addressing on each core (desirable)
  - Address N in shared memory on one core is address N in shared memory on the other
- System-wide memory protection (optional)
  - Each core can protect areas of shared memory
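Identical addressing is only listed as desirable. Where the two cores map the shared region at different bases, one common workaround is to exchange offsets instead of raw addresses. The sketch below illustrates that idea only; shm_window_t, shm_to_offset, and shm_from_offset are hypothetical names, and the slides do not say that ICC does this.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of how one core maps the shared region. */
typedef struct {
    uintptr_t base; /* address of the region as seen by this core */
    uint32_t  size; /* size of the region in bytes                */
} shm_window_t;

/* Convert a local address into an offset that is valid on either core.
 * Returns 0 on success, -1 if addr lies outside the shared region. */
static int shm_to_offset(const shm_window_t *w, uintptr_t addr, uint32_t *off)
{
    assert(w != 0 && off != 0);
    if (addr < w->base || addr - w->base >= w->size)
        return -1;
    *off = (uint32_t)(addr - w->base);
    return 0;
}

/* Convert an offset received from the peer back into a local address. */
static uintptr_t shm_from_offset(const shm_window_t *w, uint32_t off)
{
    assert(w != 0 && off < w->size);
    return w->base + off;
}
```

When both cores see the region at the same base, as the slide recommends, the two conversions cancel out and raw addresses can be passed directly.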

ICC protocol of ADI
- Shared-memory-based inter-core communication protocol
- Message passing
- DSP control
- Resource management
- Designed for heterogeneous multicore systems
- First implementation on BF561

ADSP-BF561 block diagram

Message format

    typedef struct {
        sm_endpoint_t dst_ep, src_ep;
        sm_uint32_t type;
        sm_uint32_t length;   /* data1 */
        sm_address_t payload; /* data0 */
    } sm_msg_t;
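As a rough illustration of how such a message might be filled in, here is a minimal sketch. The widths chosen for sm_endpoint_t, sm_uint32_t, and sm_address_t are assumptions (the slide does not define them), and sm_msg_init is a hypothetical helper, not part of the ICC code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed typedefs: the slide does not give the widths of these types. */
typedef uint32_t  sm_uint32_t;
typedef uint32_t  sm_endpoint_t;
typedef uintptr_t sm_address_t;

/* Message layout as shown on the slide. */
typedef struct {
    sm_endpoint_t dst_ep, src_ep;
    sm_uint32_t type;
    sm_uint32_t length;   /* data1 */
    sm_address_t payload; /* data0 */
} sm_msg_t;

/* Hypothetical helper: describe a buffer to be sent from src to dst. */
static void sm_msg_init(sm_msg_t *msg, sm_endpoint_t src, sm_endpoint_t dst,
                        sm_uint32_t type, const void *buf, sm_uint32_t len)
{
    assert(msg != NULL);
    memset(msg, 0, sizeof(*msg));
    msg->src_ep  = src;
    msg->dst_ep  = dst;
    msg->type    = type;
    msg->length  = len;
    msg->payload = (sm_address_t)buf; /* the message carries a pointer, not a copy */
}
```

Because only the address travels in the message, the buffer itself has to live in memory that both cores can reach.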

All protocol types

    Protocol type        Value   Protocol name
    SP_CORE_CONTROL      1       Core control protocol
    SP_TASK_MANAGER      2       Task manager protocol
    SP_RES_MANAGER       3       Resource manager protocol
    SP_PACKET            4       Connectionless packet transfer protocol
    SP_SESSION_PACKET    5       Connection packet transfer protocol
    SP_SCALAR            6       Connectionless scalar transfer protocol
    SP_SESSION_SCALAR    7       Connection scalar transfer protocol
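The table maps directly onto a C enumeration. The following sketch uses the names and values from the table; the enum tag sm_protocol and the helpers sm_protocol_name and sm_protocol_is_session are hypothetical additions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Names and values taken from the protocol table. */
enum sm_protocol {
    SP_CORE_CONTROL   = 1,
    SP_TASK_MANAGER   = 2,
    SP_RES_MANAGER    = 3,
    SP_PACKET         = 4,
    SP_SESSION_PACKET = 5,
    SP_SCALAR         = 6,
    SP_SESSION_SCALAR = 7
};

/* Hypothetical helper: protocol name for a type value, NULL if unknown. */
static const char *sm_protocol_name(unsigned type)
{
    static const char *const names[] = {
        NULL, /* value 0 does not appear in the table */
        "Core control protocol",
        "Task manager protocol",
        "Resource manager protocol",
        "Connectionless packet transfer protocol",
        "Connection packet transfer protocol",
        "Connectionless scalar transfer protocol",
        "Connection scalar transfer protocol"
    };
    assert(sizeof(names) / sizeof(names[0]) == (size_t)SP_SESSION_SCALAR + 1);
    if (type < SP_CORE_CONTROL || type > SP_SESSION_SCALAR)
        return NULL;
    return names[type];
}

/* Hypothetical helper: true for the two connection (session) protocols. */
static int sm_protocol_is_session(unsigned type)
{
    return type == SP_SESSION_PACKET || type == SP_SESSION_SCALAR;
}
```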

ICC message passing
(Figure: two groups of endpoints, ep1, ep2, ep3, ..., connected through a msg queue.)
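The slide only shows endpoints exchanging messages through a queue. One plausible way to build such a queue in shared memory is a single-producer, single-consumer ring, sketched below. This is an assumption about the general technique, not the actual ICC layout; SM_QUEUE_SLOTS, sm_queue_t, sm_queue_put, and sm_queue_get are hypothetical names, and all fields are assumed to be 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

#define SM_QUEUE_SLOTS 8u /* hypothetical capacity */

/* Simplified message with assumed 32-bit fields. */
typedef struct {
    uint32_t dst_ep, src_ep, type, length, payload;
} sm_msg_t;

/* One direction of traffic: one core only puts, the other only gets. */
typedef struct {
    volatile uint32_t sent;     /* advanced only by the sender   */
    volatile uint32_t received; /* advanced only by the receiver */
    sm_msg_t slots[SM_QUEUE_SLOTS];
} sm_queue_t;

/* Returns 0 on success, -1 if the queue is full. */
static int sm_queue_put(sm_queue_t *q, const sm_msg_t *msg)
{
    assert(q != 0 && msg != 0);
    if (q->sent - q->received == SM_QUEUE_SLOTS)
        return -1;
    q->slots[q->sent % SM_QUEUE_SLOTS] = *msg;
    /* Real hardware would flush the data cache and raise the
     * inter-core interrupt around this point. */
    q->sent = q->sent + 1;
    return 0;
}

/* Returns 0 on success, -1 if the queue is empty. */
static int sm_queue_get(sm_queue_t *q, sm_msg_t *out)
{
    assert(q != 0 && out != 0);
    if (q->sent == q->received)
        return -1;
    *out = q->slots[q->received % SM_QUEUE_SLOTS];
    q->received = q->received + 1;
    return 0;
}
```

Each counter has exactly one writer, so the design leans on the indivisible 32-bit shared-memory accesses listed under the hardware requirements rather than on a lock.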

User interface
- Linux user interface, device node /dev/icc
    ioctl(fd, CMD_DSP_START, NULL);
    ioctl(fd, CMD_DSP_STOP, NULL);
- DSP user interface
  - DSP initialization
  - Application entry
    Entry(icc_task_init)
    Entry(icc_task_exit)
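On the Linux side, the two ioctl calls could be wrapped roughly as follows. This is a sketch under stated assumptions: the numeric values given to CMD_DSP_START and CMD_DSP_STOP are placeholders (the real ones come from the ICC driver header, which the slides do not show), and icc_dsp_ctl is a hypothetical helper.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Placeholder values only; the real definitions live in the ICC driver
 * header, which is not shown in the slides. */
#ifndef CMD_DSP_START
#define CMD_DSP_START 1UL
#define CMD_DSP_STOP  2UL
#endif

/* Hypothetical helper: open the device node, issue one command, close.
 * Returns 0 on success, -1 if the node cannot be opened or ioctl fails. */
static int icc_dsp_ctl(const char *node, unsigned long cmd)
{
    int fd, ret;

    assert(node != NULL);
    fd = open(node, O_RDWR);
    if (fd < 0)
        return -1;
    ret = ioctl(fd, cmd, NULL);
    close(fd);
    return ret < 0 ? -1 : 0;
}
```

Under these assumptions, starting and stopping the DSP would look like icc_dsp_ctl("/dev/icc", CMD_DSP_START) followed later by icc_dsp_ctl("/dev/icc", CMD_DSP_STOP).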

DSP task on core B
- Link coreB application

    $(LD) -static $(LDFLAGS) -o -T coreb_task.lds --just-symbol $(ICC_CORE) $< $(LIBMCAPI_COREB) -Ttext $(TASK_LOAD_BASE)

  coreb_task.lds:

    .text_l1 :
    {
        /* basiccrt.o(.text) */
        *(.l1.text)
    } >MEM_L1_CODE =0
    .text :
    {
        /* *(EXCLUDE_FILE (*basiccrt.o ) .text) */
        *(.text .stub .text.* .gnu.linkonce.t.*)
        KEEP (*(.text.*personality*))
        /* .gnu.warning sections are handled specially by elf32.em. */
        *(.gnu.warning)
    } >MEM_ICC =0

Run DSP task on core B
- The task is loaded dynamically by icc_loader
- The loader parses the executable image file format (bfin ELF, VDSP ELF)
- It finds the task entries
- It loads and controls the DSP bare-metal application via the task manager protocol
- Core B starts running the task from the task init entry
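The first step of any such loader is to check that the image really is an ELF file. The sketch below shows only that step, using the standard Linux <elf.h> definitions; it is not the icc_loader source. elf32_entry is a hypothetical helper, it assumes the host and the image share the same byte order, and it does not look up the task entry symbols, which a real loader would find through the symbol table.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: validate a 32-bit ELF header held in buf and
 * return its entry address. Returns 0 on success, -1 otherwise. */
static int elf32_entry(const void *buf, size_t len, Elf32_Addr *entry)
{
    Elf32_Ehdr eh;

    assert(buf != NULL && entry != NULL);
    if (len < sizeof(eh))
        return -1;                          /* too short for a header */
    memcpy(&eh, buf, sizeof(eh));           /* avoids unaligned access */
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0)
        return -1;                          /* missing "\177ELF" magic */
    if (eh.e_ident[EI_CLASS] != ELFCLASS32)
        return -1;                          /* not a 32-bit image */
    *entry = eh.e_entry;
    return 0;
}
```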

Why MCAPI?
- A unified communications API to handle this multitude of hardware and software
- Standardized
- Scalable
- Portable

Goals of MCAPI
- Generic API for any target and OS
- Fast, lightweight, and scalable
- Multiple channels
- Flexible
- Portable

Key concepts in MCAPI
- Endpoints: the start and end points of the communication
- Channels
- Messages: connectionless, buffer communication
- Packet channels: connected, unidirectional, buffer communication
- Scalar channels: connected, unidirectional, scalar (8-, 16-, 32-, or 64-bit) communication

MCAPI topology
- Nodes exchange data via endpoints
- A node is implementation defined, for example a core
- An endpoint is created on a node to send/receive data
- MCAPI has no knowledge of the underlying transport mechanism (shared memory, Ethernet, link port, etc.)

The big picture
(Figure: hybrid mixed-OS system. Labels: core 0, core 1, core 2; Linux SMP; RTOS; ICC; resource management; MCAPI; end applications.)

Resources
- ICC protocol on wiki: core_communication_introduction
- MCAPI home page
- Implementation on BF561
  - uClinux-dist/lib/libmcapi
  - uClinux-dist/user/blkfin-app/icc_utils
  - uClinux-dist/linux-2.6.x/drivers/staging/icc

Thanks
Q&A