The Memory-Processor Gap

- Processor speed increases by 60% per year, while memory access speed increases by only 10% per year
- Most techniques focus on latency tolerance, which exposes bandwidth limitations
- The most obvious solution is to reduce traffic to off-chip memory
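To make the growth rates above concrete, here is a minimal sketch of how the gap compounds. It uses only the 60% and 10% figures from the slide; the function names and the normalisation of the gap to 1.0 at year 0 are assumptions made for illustration. At these rates the gap widens by a factor of 1.6 / 1.1, roughly 45% per year, so it doubles in a little under two years.

```python
import math

CPU_GROWTH = 0.60  # processor speed improvement per year (from the slide)
MEM_GROWTH = 0.10  # memory access speed improvement per year (from the slide)


def gap_after(years: float) -> float:
    """Processor/memory speed ratio after `years`, normalised to 1.0 at year 0."""
    return ((1 + CPU_GROWTH) / (1 + MEM_GROWTH)) ** years


def doubling_time() -> float:
    """Years needed for the gap to double at the stated rates."""
    return math.log(2) / math.log((1 + CPU_GROWTH) / (1 + MEM_GROWTH))


if __name__ == "__main__":
    for y in (1, 5, 10):
        print(f"after {y:2d} years the gap is {gap_after(y):.1f}x")
    print(f"gap doubles every {doubling_time():.2f} years")
```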

IRAM Architecture

FlexRAM Design

DIVA Project

Active Pages

Experimental Plans
- On-chip DRAM cache
  - Compatible with current architecture
  - Allows bigger caches in the same die area
  - Narrows the processor-memory gap by reducing traffic to off-chip memory
- FPGA-based chip
  - Uses standard FPGA chips
  - Allows good memory and logic locality
  - Eliminates the problem of differing fabrication processes for memory and logic
  - Integrates memory and logic onto a single chip easily
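The claim that a bigger on-chip cache reduces off-chip traffic can be illustrated with a toy simulation. This is a sketch under stated assumptions, not the experiment planned in the slides: the cache is modelled as fully associative with LRU replacement, and the access trace (repeated sweeps over a fixed working set) is hypothetical. Each miss stands for one fetch from off-chip memory.

```python
from collections import OrderedDict


def count_misses(trace, capacity):
    """Count misses of a fully associative LRU cache holding `capacity` lines."""
    cache = OrderedDict()
    misses = 0
    for line in trace:
        if line in cache:
            cache.move_to_end(line)        # hit: mark as most recently used
        else:
            misses += 1                    # miss: one off-chip fetch
            cache[line] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used line
    return misses


def synthetic_trace(working_set=64, passes=20):
    """Hypothetical trace: repeated sweeps over a fixed working set of lines."""
    return [line for _ in range(passes) for line in range(working_set)]


if __name__ == "__main__":
    trace = synthetic_trace()
    for capacity in (16, 32, 64, 128):
        misses = count_misses(trace, capacity)
        print(f"capacity {capacity:3d} lines: {misses:4d} off-chip fetches "
              f"({misses / len(trace):.0%} of accesses)")
```

With this trace, a cache smaller than the working set misses on every access, while a cache that holds the whole working set misses only on the first touch of each line. Real workloads degrade more gradually, but the direction is the same: more on-chip capacity means fewer off-chip fetches.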

Conclusions
- Memory-logic integration is possibly the best solution
- Much more research still needs to be done in this area
- Several options remain relatively unexplored
- Research must continue in this area, as the memory-processor gap widens every year