Texas Instruments ExpressDSP Algorithm Standard Prof. Brian L. Evans Dept. of Electrical and Computer Engineering The University of Texas at Austin August.

Slides:



Advertisements
Similar presentations
purpose Search : automation methods for device driver development in IP-based embedded systems in order to achieve high reliability, productivity, reusability.
Advertisements

Ch. 7 Local Variables and Parameter Passing From the text by Valvano: Introduction to Embedded Systems: Interfacing to the Freescale 9S12.
Part IV: Memory Management
Copyright © 2000, Daniel W. Lewis. All Rights Reserved. CHAPTER 10 SHARED MEMORY.
Coding Standard: General Rules 1.Always be consistent with existing code. 2.Adopt naming conventions consistent with selected framework. 3.Use the same.
Prof. Necula CS 164 Lecture 141 Run-time Environments Lecture 8.
User-Level Memory Management in Linux Programming
Pointer applications. Arrays and pointers Name of an array is a pointer constant to the first element whose value cannot be changed Address and name refer.
Memory and Files Dr. Andrew Wallace PhD BEng(hons) EurIng
Tutorial 6 & 7 Symbol Table
Home: Phones OFF Please Unix Kernel Parminder Singh Kang Home:
Run time vs. Compile time
OS Spring’03 Introduction Operating Systems Spring 2003.
Run-time Environment and Program Organization
1 Run time vs. Compile time The compiler must generate code to handle issues that arise at run time Representation of various data types Procedure linkage.
. Memory Management. Memory Organization u During run time, variables can be stored in one of three “pools”  Stack  Static heap  Dynamic heap.
Introduction to Embedded Systems
System Calls 1.
ECE 526 – Network Processing Systems Design Network Processor Architecture and Scalability Chapter 13,14: D. E. Comer.
ISBN Chapter 10 Implementing Subprograms.
Computer Architecture and Operating Systems CS 3230: Operating System Section Lecture OS-7 Memory Management (1) Department of Computer Science and Software.
ELG6163 Presentation Geoff Green March 20, 2006 TI Standard for Writing Algorithms.
Chapter 10 Implementing Subprograms. Copyright © 2012 Addison-Wesley. All rights reserved.1-2 Chapter 10 Topics The General Semantics of Calls and Returns.
Chapter 4 Storage Management (Memory Management).
CMPSC 16 Problem Solving with Computers I Spring 2014 Instructor: Tevfik Bultan Lecture 12: Pointers continued, C strings.
Chapter 6: User-Defined Functions
Lecture 2 Foundations and Definitions Processes/Threads.
Copyright © 2005 Elsevier Chapter 8 :: Subroutines and Control Abstraction Programming Language Pragmatics Michael L. Scott.
Ethernet Driver Changes for NET+OS V5.1. Design Changes Resides in bsp\devices\ethernet directory. Source code broken into more C files. Native driver.
TMS320 DSP Algorithm Standard: Overview & Rationalization.
C++ History C++ was designed at AT&T Bell Labs by Bjarne Stroustrup in the early 80's Based on the ‘C’ programming language C++ language standardised in.
Storage Management - Chap 10 MANAGING A STORAGE HIERARCHY on-chip --> main memory --> 750ps - 8ns ns. 128kb - 16mb 2gb -1 tb. RATIO 1 10 hard disk.
Topic 2d High-Level languages and Systems Software
Static Program Analyses of DSP Software Systems Ramakrishnan Venkitaraman and Gopal Gupta.
© Janice Regan, CMPT 300, May CMPT 300 Introduction to Operating Systems Memory: Relocation.
UDI Tutorial & Driver Walk-Through Part 1 Kurt Gollhardt SCO Core OS Architect
ISBN Chapter 10 Implementing Subprograms.
RUN-Time Organization Compiler phase— Before writing a code generator, we must decide how to marshal the resources of the target machine (instructions,
1 Computer Systems II Introduction to Processes. 2 First Two Major Computer System Evolution Steps Led to the idea of multiprogramming (multiple concurrent.
Slides created by: Professor Ian G. Harris Hello World #include main() { printf(“Hello, world.\n”); }  #include is a compiler directive to include (concatenate)
12/22/ Thread Model for Realizing Concurrency B. Ramamurthy.
CSC 8505 Compiler Construction Runtime Environments.
Processes and Virtual Memory
More About Data Types & Functions. General Program Structure #include statements for I/O, etc. #include's for class headers – function prototype statements.
ICOM 4035 – Data Structures Dr. Manuel Rodríguez Martínez Electrical and Computer Engineering Department Lecture 2 – August 23, 2001.
© 2008, Renesas Technology America, Inc., All Rights Reserved 1 Introduction Purpose  This training course demonstrates the Project Generator function.
Advanced BioPSE NCRR Programming Conventions and Standards J. Davison de St. Germain Chief Software Engineer SCI Institute December 2003 J.
LECTURE 3 Translation. PROCESS MEMORY There are four general areas of memory in a process. The text area contains the instructions for the application.
Introduction to Operating Systems Concepts
Lecture 3 Translation.
Memory Management.
Processes and threads.
ENERGY 211 / CME 211 Lecture 25 November 17, 2008.
Software Development with uMPS
C++ History C++ was designed at AT&T Bell Labs by Bjarne Stroustrup in the early 80's Based on the ‘C’ programming language C++ language standardised in.
Code Generation.
Application Binary Interface (ABI)
More About Data Types & Functions
Module 2: Computer-System Structures
Chapter 8: Memory management
Outline Module 1 and 2 dealt with processes, scheduling and synchronization Next two modules will deal with memory and storage Processes require data to.
Lecture Topics: 11/1 General Operating System Concepts Processes
PZ09A - Activation records
Activation records Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section
Foundations and Definitions
Computer Organization & Compilation Process
Module 2: Computer-System Structures
Module 2: Computer-System Structures
COMP755 Advanced Operating Systems
Activation records Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section
Presentation transcript:

Texas Instruments ExpressDSP Algorithm Standard Prof. Brian L. Evans Dept. of Electrical and Computer Engineering The University of Texas at Austin August 7, 2001

Outline Introduction ExpressDSP requirements and rules Source code Algorithm performance characterization C6000-specific guidelines Direct memory access (DMA) resource rules Conclusion Reference:

Introduction: Algorithm Model Transformation of inputs into outputs Interoperate (play well) with other algorithms Decoupled from scheduling and resource allocation (e.g. dynamic memory allocation) object code (function) … inputs outputs algorithm performance … source code (function)

Introduction: Abstraction Maurice Wilkes: “There is no problem in computer programming that cannot be solved by an added level of indirection.” Achieve device independence by decoupling algorithms and physical peripherals Achieve algorithm interchangeability by using function pointers Maurice Wilkes is a hardware designer and the father of microcoding. He is from England and is 88 years old. He is one of the few foreign members of the US National Academy of Engineering.

Introduction: Abstraction Jim Gray: “There is no performance problem that cannot be solved by eliminating a level of indirection.” Jim Gray is software designer who is an expert in transaction processing and databases. He is in his early 50s and works at Microsoft Research in the San Francisco Bay Area. Jim is a member of the US National Academy of Engineering.

ExpressDSP Requirements Algorithms from multiple vendors can be integrated into a single system Algorithms are framework agnostic Algorithms can be deployed in statically scheduled and dynamically scheduled run-time environments Algorithms can be distributed in binary form (e.g. Hyperception’s Hypersignal approach) Algorithm integration does not require compilation but may require reconfiguration and relinking

ExpressDSP Rules Source code (levels 1 and 2) Programming guidelines: C callable, reentrant, etc. Component model: modules, naming conventions, etc. Algorithm performance characterization (level 2) Data and program memory Interrupt latency (delay upon invocation) Execution time (throughput) DMA resource rules (level 2) DSP-specific (level 3)

Programming Guidelines (level 1) R1 Algorithms must follow run-time conventions imposed by TI’s implementation of the C language Only externally visible functions must be C callable All functions must be careful in use of the stack R2 Algorithms must be reentrant Only modify data on stack or in instance “object” Treat global and static variables as read-only data Do not employ self-modifying code Disable interrupts (disabling all thread scheduling) around sections of code that violate these rules

Reentrant Example /* previous input values */ int iz0 = 0, iz1 = 0; void PRE_filter(int piInput[], int iLength) { int i; for (i = 0; i < iLength; i++) { int tmp = piInput[i] – iz0 + (13*iz1 + 16) / 32; iz1 = iz0; iz0 = piInput[i]; piInput[i] = tmp; } void PRE_filter(int piInput[], int iLength, int piPrevInput[]) { int i; for (i = 0; i < iLength; i++) { int iTmp = piInput[i] – piPrevInput[0] + (13*piPrevInput[1] + 16) / 32; piPrevInput[1] = piPrevInput[0]; piPrevInput[0] = piInput[i]; piInput[i] = iTmp; } Not reentrantReentrant state

Component Model (level 2) R7 All header files must support multiple inclusions within a single source file R8 All external definitions must be API identifiers or API and vendor prefixed (e.g. H263_UT) R11 All modules must supply an initialization (e.g. FIR_init) and finalization (e.g. FIR_exit) method Init function: initialize global data Exit function: run-time debugging assertions for sanity checks (usually empty in production code)

Component Model (level 2) ConventionDescriptionExample Variables and functions Begin with lower case (after prefix) LOG_printf() ConstantsAll uppercaseG729_FRAMELEN Data typesTitle case after prefixLOG_Obj or int Structure fieldsLowercasebuffer MacrosSame as constants or functions as appropriate LOG_getbuf() R10 Naming Conventions

FIR Module typedef struct FIR_Params { int iFrameLen; int *piCoeff; } FIR_Params; const FIR_Params FIR_PARAMS = { 64, NULL }; typedef struct FIR_Obj { int iHist[16]; int iFrameLen; int *piCoeff; } FIR_Obj; void FIR_init(void) { } void FIR_exit(void) { } typedef struct FIR_Obj* FIR_Handle; FIR_Handle FIR_create( FIR_Obj *pFir, const FIR_Params *pParams) { if (pFir) { if (pParams == NULL) { pParams = &FIR_PARAMS; } pFir->iFrameLen = pParams->iFrameLen; pFir->piCoeff = pParams->piCoeff; memset(pFir->iHist, 0, sizeof(pFir->iHist)); } return(pFir); } void FIR_delete(FIR_Handle fir) { }

Algorithm Performance (level 2) Algorithms declare memory usage in bytes R19 Worst-case heap data memory (inc. alignment) R20 Worst-case stack space data memory R21 Worst-case static data memory R22 Program memory Algorithms declare R23 Worst-case interrupt latency R24 Typical period and worst case execution time Include this information in the comment header

Example: Worst-case Heap Usage SizeAlignSizeAlignSizeAlign Scratch Persistent DARAMSARAMExternal Example requires bit words of single-access on-chip memory, bit words of external persistent memory. Entries in table may be functions of algorithm parameters.

C6000-specific Rules R25 All C6000 algorithms must be supplied in little endian format. Guideline 13 All C6000 algorithms should be supplied in both little and big endian formats R26 All C6000 algorithms must access all static and global data as far data. R27 C6000 algorithms must never assume placement in on-chip memory; they must properly operate with program memory operated in cache mode.

C6000 Register Usage RegisterUsageType A0-A9General purpose Scratch A10-A14General purpose Preserve A15Frame pointer Preserve A16-A31 (C64x) General purpose Scratch RegisterUsageType B0-B9General purpose Scratch B10-B13General purpose Preserve B14Data page pointer Preserve B15Stack pointer Preserve B16-B31 (C64x) General purpose Scratch Preserve: function must restore its value on exit

DMA Resource Rules DMA is used to move large blocks of memory on and off chip Algorithms cannot access DMA registers directly Algorithms cannot assume a particular physical DMA channel Algorithms must access the DMA resources through a handle using the specific asynchronous copy (ACPY) APIs. IDMA R1 All data transfer must be completed before return to caller

DMA Resource Rules IDMA R2 All algorithms using the DMA resource must implement the IDMA interface IDMA R3 Each of the IDMA methods implemented must be independently relocateable. IDMA R4 All algorithms must state the maximum number of concurrent DMA transfers for each logical channel IDMA R5 All algorithms must characterize the average and maximum size of the data transfers per logical channel for each operation.

Top 10 XDAIS Rules 1. All global and static variables must be constants 2. Must use C calling conventions 3. All include (.h) files must have an #ifdef structure to prevent it from being evaluated twice 4. All visible (external) variables, functions, and constants must be prefixed by PACKAGE_ORG, e.g. H263_UT. 5. Never use absolute (hardcoded) addresses (allows data to be relocated and properly aligned)

Top 10 XDAIS Rules 6. Must use IDMA instead of regular DMA (i.e. DAT_copy) 7. Never allocate or deallocate dynamic memory (no use of malloc, calloc, or free) 8. Disable interrupts (disables all thread scheduling) around sections of code that are not reentrant 9. Always check to see if a DSP/BIOS, chipset library, or I/O function is allowed (most are not) 10. Compile in far mode, e.g. in Code Composer

Issues for Future Standards Version control Licensing, encryption, and IP protection Installation and verification (digital signatures) Documentation and online help