Memory Considerations



Memory Architecture: The Basics
Core: runs at >600 MHz.
L1 instruction memory and L1 data memory (on-chip): >600 MHz, single cycle to access, 10's of Kbytes.
Unified L2 memory (on-chip): >300 MHz, several cycles to access, 100's of Kbytes.
Unified L3 external memory (off-chip): <133 MHz, several system cycles to access, 100's of Mbytes.
[Diagram: the core, the three memory levels, and a DMA block.]
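
The hierarchy above can be turned into a rough placement rule: put a buffer in the fastest level it fits in. The following is a minimal sketch, not taken from the slides. The clock values are the boundary figures quoted on the slide, while the cycle counts and capacities are hypothetical placeholders standing in for "several cycles" and "10's of Kbytes"; real values depend on the device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryLevel:
    name: str
    clock_mhz: float        # clock of the domain the memory is accessed in
    cycles_per_access: int  # ASSUMED latency, counted in that clock domain
    capacity_bytes: int     # ASSUMED capacity


# Illustrative placeholders only, ordered fastest to slowest.
LEVELS = [
    MemoryLevel("L1", 600.0, 1, 32 * 1024),
    MemoryLevel("L2", 300.0, 4, 128 * 1024),
    MemoryLevel("L3 (external)", 133.0, 8, 64 * 1024 * 1024),
]


def access_time_ns(level: MemoryLevel) -> float:
    """Time for one access: cycles divided by clock frequency."""
    return level.cycles_per_access * 1000.0 / level.clock_mhz


def fastest_level_that_fits(size_bytes: int) -> MemoryLevel:
    """Return the first (fastest) level whose capacity holds the buffer."""
    for level in LEVELS:
        if size_bytes <= level.capacity_bytes:
            return level
    raise ValueError("buffer does not fit in any memory level")
```

With these assumed numbers, a 16 Kbyte buffer lands in L1 and a 100 Kbyte buffer in L2; anything larger has to live in external memory and be brought in by DMA.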

Partitioning Data: Internal Memory Sub-Banks
Multiple accesses can be made to different internal sub-banks in the same cycle.
Un-optimized: DMA and core conflict when accessing sub-banks.
Optimized: core and DMA operate in harmony.
Know your memory bank architecture!
[Diagram: Buffer0, Buffer1, Buffer2 and Coefficients placed in internal sub-banks, with core fetch and DMA accesses shown for the un-optimized and the optimized layout.]
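
The effect of sub-bank placement can be illustrated with a toy model that counts the cycles in which the core and the DMA touch the same sub-bank. This is a sketch under stated assumptions: the access pattern, the two layouts, and the names `count_conflicts`, `unoptimized`, and `optimized` are hypothetical, and real hardware arbitration is more involved than a simple per-cycle comparison.

```python
def count_conflicts(core_accesses, dma_accesses, bank_of):
    """Count cycles in which core and DMA access the same sub-bank.

    core_accesses / dma_accesses: buffer names, one per cycle.
    bank_of: mapping from buffer name to sub-bank index.
    """
    return sum(
        1
        for core_buf, dma_buf in zip(core_accesses, dma_accesses)
        if bank_of[core_buf] == bank_of[dma_buf]
    )


# ASSUMED pattern: the core alternates between a data buffer and the
# coefficients while the DMA fills the next buffer.
core = ["buffer0", "coefficients"] * 4
dma = ["buffer1"] * 8

# ASSUMED layouts: everything in one sub-bank vs. one buffer per sub-bank.
unoptimized = {"buffer0": 0, "buffer1": 0, "buffer2": 0, "coefficients": 0}
optimized = {"buffer0": 0, "buffer1": 1, "buffer2": 2, "coefficients": 3}

print(count_conflicts(core, dma, unoptimized))  # every cycle collides
print(count_conflicts(core, dma, optimized))    # no collisions
```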

Partitioning Code and Data: External Memory Banks
Multiple rows can be "active" at the same time: one row in each bank.
The SDRAM has four 16 MByte banks internally.
When code and video buffers share a bank, row activation cycles are taken almost every access.
When they are spread across the banks, row activation cycles are spread across hundreds of accesses.
[Diagram: Code, Video Frame 0, Video Frame 1 and Ref Frame placed in the SDRAM banks, with instruction fetch and DMA sharing the external bus, shown for both layouts.]
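
The row-activation behaviour can be sketched with a small model in which each bank keeps exactly one row open, as the slide describes. The access streams, the row numbers, and the function names below are hypothetical and chosen only to show the contrast between sharing a bank and splitting across banks.

```python
def count_row_activations(accesses):
    """Count row activations for a sequence of (bank, row) accesses.

    Each bank holds one open row; touching a different row in the
    same bank forces a new activation.
    """
    open_row = {}
    activations = 0
    for bank, row in accesses:
        if open_row.get(bank) != row:
            activations += 1
            open_row[bank] = row
    return activations


def interleave(*streams):
    """Alternate accesses from several streams, one from each in turn."""
    return [access for group in zip(*streams) for access in group]


N = 100  # accesses per stream (ASSUMED)

# Instruction fetches and DMA frame accesses hitting the same bank.
same_bank = interleave([(0, 0)] * N, [(0, 5)] * N)

# The same two streams placed in different banks.
split_banks = interleave([(0, 0)] * N, [(1, 5)] * N)

print(count_row_activations(same_bank))    # one activation per access
print(count_row_activations(split_banks))  # one activation per stream
```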

Managing External Memory
External memory physics must be understood.
Without direction control (Write, Read, Write, Read): extra cycles are spent on bus turn-arounds.
With programmable direction control (2 Writes, then 2 Reads): improved performance.
Group transfers in the same direction to reduce the number of turn-arounds.
[Diagram: core and DMA accessing external memory over the external bus.]
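
Counting turn-arounds for the two orderings on this slide makes the benefit concrete. This is a minimal sketch; `count_turnarounds` is an illustrative helper, and the cost of each turn-around in bus cycles is device-specific and deliberately left out.

```python
def count_turnarounds(ops):
    """Count direction changes in a sequence of 'W' (write) / 'R' (read)."""
    return sum(1 for prev, cur in zip(ops, ops[1:]) if prev != cur)


interleaved = ["W", "R", "W", "R"]  # no direction control
grouped = ["W", "W", "R", "R"]      # same transfers, grouped by direction

print(count_turnarounds(interleaved))  # 3
print(count_turnarounds(grouped))      # 1
```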

Managing External Memory Accesses
No direction control:
Transfer directly to L3 memory from the video peripheral.
Bring image blocks into L1 memory.
Turn-arounds occur every few transfers.
"Programmed" direction control:
Stage each video line in L2 memory.
Transfer the complete line into L3 memory.
Bring image blocks into L1 memory.
Turn-arounds are spread across hundreds of accesses.
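
A rough estimate shows why staging whole lines helps. Assume the same number of transfers in each direction, alternating in bursts of a fixed size. The frame size, the burst sizes, and the strictly alternating pattern are all assumptions made for illustration, not figures from the presentation.

```python
def estimate_turnarounds(transfers_per_direction, burst_size):
    """Estimate bus turn-arounds when writes and reads alternate in bursts.

    With B bursts in each direction the direction flips 2*B - 1 times.
    """
    bursts = -(-transfers_per_direction // burst_size)  # ceiling division
    return 2 * bursts - 1


LINE_LENGTH = 720  # transfers per video line (ASSUMED)
LINES = 480        # lines per frame (ASSUMED)
total = LINE_LENGTH * LINES

# Direct transfer: direction flips every few transfers (ASSUMED burst of 4).
print(estimate_turnarounds(total, 4))

# Line staged in L2 first: direction flips once per complete line.
print(estimate_turnarounds(total, LINE_LENGTH))
```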

Setting the Clocks Properly
Try to maximize CCLK and SCLK.
On Blackfin processors, CCLK can run up to 600 MHz (depending on the device).
SCLK is the speed of the internal system buses (maximum 133 MHz).
The SDRAM refresh rate should be set to its optimum value. It is based on SCLK.
If the refresh rate is too low, you lose data because the memory contents "decay".
If the refresh rate is too high, performance suffers because the core must wait for the refresh cycles to complete before an access is made.
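
Because the refresh rate is based on SCLK, the refresh interval has to be recomputed whenever SCLK changes. The sketch below shows the basic arithmetic only. The 64 ms refresh period and 8192 rows are common SDRAM datasheet values used here as assumptions, and the exact register value for a given processor typically also accounts for timing margins, so the device's hardware reference remains the authority.

```python
def refresh_interval_cycles(sclk_hz, refresh_period_ms, rows):
    """SCLK cycles between refresh commands so every row is refreshed in time.

    Rounds down, so refreshes are never issued later than required.
    """
    return (sclk_hz * refresh_period_ms) // (1000 * rows)


REFRESH_PERIOD_MS = 64  # ASSUMED: typical SDRAM datasheet value
ROWS = 8192             # ASSUMED: typical SDRAM datasheet value

print(refresh_interval_cycles(133_000_000, REFRESH_PERIOD_MS, ROWS))  # 1039
print(refresh_interval_cycles(100_000_000, REFRESH_PERIOD_MS, ROWS))  # 781
```

Keeping the 133 MHz interval after dropping SCLK to 100 MHz would refresh too rarely and risk data loss; keeping the 100 MHz interval at 133 MHz would refresh more often than needed and cost performance.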