Kwangil Choi, Hyunok Oh, Hanyang University



• Introduction
  ◦ Non-volatile Memory (NVM)
  ◦ Synchronous dataflow (SDF)
• Problem Definition
• Answer Set Programming
• Experiment
• Conclusion

• Non-Volatile Memory (NVM)
  ◦ Can replace DRAM as main memory
• Types
  ◦ Phase-change RAM (PRAM)
  ◦ Spin-transfer torque magnetoresistive RAM (STT-MRAM)
  ◦ Ferroelectric RAM (FRAM)
• Pros
  ◦ High density
  ◦ Low static energy consumption
• Cons
  ◦ High write energy consumption
  ◦ Poor write performance

[Figure: STT-MRAM cell. A magnetic tunnel junction (MTJ: free layer, tunnel oxide, fixed layer) carries the current and sits on an access transistor (gate, source, drain). Labeled parts of the MTJ stack: synthetic antiferromagnetic (SyAF) structure, bottom electrode (substrate), tunnel barrier, free layer, pinned layer, seed layer, capping layer, spacing layer, top AF layer, top electrode, bottom AF layer, buffer layer.]

Reducing the retention time benefits cell density, leakage power, dynamic power consumption, and performance.

Parameter          STT1       STT2      STT3
Cell size (F^2)
T_retention        26.5 μs    3.24 s    4.27 yr
Lat_R (ns)
Lat_W (ns)
Dyn_R (nJ)
Dyn_W (nJ)
P_leak (mW)

[Figure: a processor with three memories: STT1 memory (retention 26.5 μs), STT2 memory (3.24 s), and STT3 memory (4.27 yr).]

• Synchronous dataflow (SDF)
  ◦ Represents streaming applications, such as multimedia, that require frequent memory access
  ◦ Node (actor): a functional algorithm
  ◦ Edge: communication between two actors
  ◦ Producing / consuming rate: the number of samples produced and consumed per firing
  ◦ Rates are fixed

[Figure: example SDF graph with actors A and B joined by an edge annotated with rates 3 and 2.]
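Because the rates are fixed, the number of firings each actor needs per iteration follows from the balance equation q_src * produce = q_dst * consume. The sketch below solves it for a single edge; the function name `repetitions` is illustrative, and the rates 3 and 2 are taken from the example graph under the assumption that A produces 3 samples and B consumes 2.

```python
from math import gcd


def repetitions(produce: int, consume: int) -> tuple[int, int]:
    """Smallest (q_src, q_dst) satisfying q_src * produce == q_dst * consume."""
    g = gcd(produce, consume)
    return consume // g, produce // g


# Assumed reading of the example: A produces 3 samples, B consumes 2.
q_a, q_b = repetitions(3, 2)
print(q_a, q_b)  # 2 3: A fires twice and B three times per iteration
```

With these counts, 6 samples are produced and 6 consumed in every iteration, so the buffer on the edge stays bounded.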

[Figure: token lifetimes on a timeline with 1 ms divisions. With T_retention = 1 ms, 16 refresh operations are needed; with T_retention = 4 ms, no refresh operation is needed.]
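The effect of retention time on refresh count can be sketched as follows. This assumes one refresh is issued each time a full retention period elapses while a token is still live; the token lifetimes are hypothetical values chosen only so that the totals match the 16 and 0 shown in the figure.

```python
def refresh_count(lifetime_us: int, retention_us: int) -> int:
    """Refreshes needed to keep one token alive (assumed floor model)."""
    return lifetime_us // retention_us


# Hypothetical workload: eight tokens, each live for 2.5 ms.
lifetimes_us = [2500] * 8

total_1ms = sum(refresh_count(lt, 1000) for lt in lifetimes_us)
total_4ms = sum(refresh_count(lt, 4000) for lt in lifetimes_us)
print(total_1ms, total_4ms)  # 16 0
```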

Problem Definition

• Input
  ◦ Target architecture: a system with multiple relaxed-retention-time STT-MRAM modules. Note that the memory refreshes the memory cells that contain valid data.
  ◦ Characteristics of STT-MRAM: retention time, read/write energy, and refresh energy.
  ◦ Application: specified as an SDF model. A schedule and the execution time of each node are given.
• Goal
  ◦ Minimize the total energy consumption of the memory system for the application.
• Output
  ◦ The mapping of buffers to STT-MRAM modules.

[Figure: problem overview. Given an SDF graph with its schedule and execution times, and a system with multiple-retention-time memories, map each buffer to a memory so as to minimize energy consumption.]

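1. Construct a schedule: AACBDDFEEEE
2. Build the lifetime chart
3. Determine the buffer mapping

[Figure: lifetime charts for two candidate mappings of buffers to STT-Short and STT-Long memories. One chart is labeled "Energy consumption =" with no value; the other is labeled "Energy consumption = 15033".]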
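Step 2 can be sketched as a walk over the schedule that records when each token is written and when it is released. The slide's graph is not recoverable, so the edges and unit execution times below are hypothetical, chosen only so that the schedule AACBDDFEEEE is valid. The sketch also assumes a single processor, FIFO buffers, and that a token lives from the end of its producer's firing to the end of its consumer's firing.

```python
from collections import deque


def token_lifetimes(schedule, exec_time, edges):
    """Return {edge: [lifetime of each token]} for a sequential schedule.

    edges maps an edge name to (src, dst, produce_rate, consume_rate).
    """
    fifo = {name: deque() for name in edges}
    lifetimes = {name: [] for name in edges}
    now = 0
    for actor in schedule:
        end = now + exec_time[actor]
        # Consume input tokens first; each is released when the firing ends.
        for name, (_src, dst, _prod, cons) in edges.items():
            if dst == actor:
                for _ in range(cons):
                    born = fifo[name].popleft()
                    lifetimes[name].append(end - born)
        # Output tokens become live when the firing ends.
        for name, (src, _dst, prod, _cons) in edges.items():
            if src == actor:
                fifo[name].extend([end] * prod)
        now = end
    return lifetimes


# Hypothetical graph and timings (not from the slides).
edges = {
    "AB": ("A", "B", 1, 2),
    "AC": ("A", "C", 1, 2),
    "BD": ("B", "D", 2, 1),
    "CF": ("C", "F", 1, 1),
    "DE": ("D", "E", 2, 1),
}
exec_time = {actor: 1 for actor in "ABCDEF"}
chart = token_lifetimes("AACBDDFEEEE", exec_time, edges)
print(chart["DE"])  # [3, 4, 4, 5]
```

The per-token lifetimes are the input to step 3: a buffer whose tokens are short-lived is a candidate for a short-retention memory, and the others for a long-retention one.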

• Write energy
• Refresh energy
• Total energy = write energy + refresh energy

Symbol      Meaning
lt(t_j)     The lifetime of token t_j belonging to buffer b_i
map(b_i)    The memory to which buffer b_i is mapped
rt(m)       The retention time of memory m
b_i         The buffer size on edge i
E_ref(m)    The refresh energy for a token in memory m
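A minimal sketch of how these quantities might combine, not the authors' exact formulas: it assumes each token is written once, that a token with lifetime lt in memory m needs floor(lt / rt(m)) refreshes, and that the mapping is found by exhaustive search. The memory parameters and buffer lifetimes are hypothetical.

```python
from itertools import product


def buffer_energy(lifetimes, mem):
    """Write energy plus refresh energy of one buffer in one memory."""
    write = len(lifetimes) * mem["e_write"]
    refresh = sum(lt // mem["rt"] for lt in lifetimes) * mem["e_ref"]
    return write + refresh


def best_mapping(buffers, memories):
    """Try every buffer-to-memory assignment and keep the cheapest one."""
    names = list(buffers)
    best = None
    for choice in product(memories, repeat=len(names)):
        total = sum(
            buffer_energy(buffers[b], memories[m]) for b, m in zip(names, choice)
        )
        if best is None or total < best[0]:
            best = (total, dict(zip(names, choice)))
    return best


# Hypothetical memories: short retention is cheap to write, long is costly.
memories = {
    "short": {"rt": 2, "e_write": 1, "e_ref": 1},
    "long": {"rt": 100, "e_write": 3, "e_ref": 3},
}
# Hypothetical token lifetimes per buffer.
buffers = {"b1": [1, 1, 1], "b2": [10, 10]}

total, mapping = best_mapping(buffers, memories)
print(total, mapping)  # 9 {'b1': 'short', 'b2': 'long'}
```

Exhaustive search grows exponentially with the number of buffers, which is one reason to hand the problem to a solver instead.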

Answer Set Programming

• Declarative approach for NP problems
• Problem: expressed as a conjunction (AND) of logic predicates
• Solutions: answer sets
• The formulation is easy to understand
• Fast ASP solvers have been introduced

SDF graph (actors A and B):

    node(1..2).
    edge(1,1,2,3,3,3).
    edge(2,2,1,2,2,5).
    repetition(1,1).
    repetition(2,1).

ASP formulation:

    lifetime(E,Inv,Duration) :- fire(A,S), fire(B,F), edge(E,A,B,P,C,I),
        numFiredBefore(A,S,SN), numFiredBefore(B,F,FN),
        Inv=FN*C-C+1..FN*C, SN*P-P 0, Inv>0, Inv<=R*P,
        repetition(A,R), S<F, extime(A,ATime).

    buffer_energy(E,Write*P*Rep+Refresh*Energy) :-
        Energy = [lifetime(E,T,Duration)=Duration/Retention+1],
        edge(E,A,_,P,_,I), retention(Type,Retention), memory_type(M,Type),
        map(E,M), refresh_energy(Type,Refresh), write_energy(Type,Write),
        repetition(A,Rep).

    1 { memory_type(M,T) : retention(T,_) } 1 :- memory(M).
    1 { map(E,M) : memory(M) } 1 :- edge(E,_,_,_,_,_).

    sample(E,S-C,T) :- fire(B,T), edge(E,_,B,P,C,I), sample(E,S,T-1), S>=C, time(T).

    #minimize [buffer_energy(E,Energy)=Energy : edge(E,_,_,_,_,_)].

Result:

    Answer: 1
    memory_type(2,1) memory_type(1,1)
    map(8,2) map(7,2) map(6,2) map(5,2) map(4,2) map(3,2) map(2,2) map(1,2)
    buffer_energy(8,3123) buffer_energy(7,11196) buffer_energy(6,12438) buffer_energy(5,4850)
    buffer_energy(4,11196) buffer_energy(3,11196) buffer_energy(2,11196) buffer_energy(1,19269)
    Optimization:
    Answer: 2
    memory_type(2,2) memory_type(1,1)
    map(8,2) map(7,2) map(6,2) map(5,2) map(4,2) map(3,2) map(2,2) map(1,2)
    buffer_energy(8,5004) buffer_energy(7,5994) buffer_energy(6,5994) buffer_energy(5,6660)
    buffer_energy(4,5994) buffer_energy(3,5994) buffer_energy(2,5994) buffer_energy(1,5994)
    Optimization:
    Answer: 3
    memory_type(2,2) memory_type(1,1)
    map(8,2) map(7,2) map(6,2) map(5,1) map(4,2) map(3,2) map(2,2) map(1,2)
    buffer_energy(8,5004) buffer_energy(7,5994) buffer_energy(6,5994) buffer_energy(5,4850)
    buffer_energy(4,5994) buffer_energy(3,5994) buffer_energy(2,5994) buffer_energy(1,5994)
    Optimization:
    OPTIMUM FOUND
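The solver output lists a per-buffer energy for each answer but the values after "Optimization:" are not shown. Assuming the objective is the plain sum of the `buffer_energy` values, as the `#minimize` statement suggests, the totals can be recomputed from the listed atoms. The sketch below does only that arithmetic; the dictionary layout is illustrative.

```python
# buffer_energy values for edges 1..8, copied from the three answers above.
answers = {
    1: [19269, 11196, 11196, 11196, 4850, 12438, 11196, 3123],
    2: [5994, 5994, 5994, 5994, 6660, 5994, 5994, 5004],
    3: [5994, 5994, 5994, 5994, 4850, 5994, 5994, 5004],
}

# Assumed objective: sum of buffer energies over all edges.
totals = {answer: sum(energies) for answer, energies in answers.items()}
print(totals)  # {1: 84464, 2: 47628, 3: 45818}

# Each successive answer should be strictly cheaper than the previous one.
assert totals[1] > totals[2] > totals[3]
```

Under that assumption, the step from answer 2 to answer 3 comes entirely from moving buffer 5 to memory 1 (`map(5,1)`), which lowers its energy from 6660 to 4850.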

Experiments

• Synthetic examples
  ◦ 7 randomly generated examples
  ◦ 3 to 7 actors
• Real-life applications
  ◦ Part of CELP
  ◦ H.263 encoder / decoder
  ◦ MP3 decoder
• Each node n has the execution time
  ◦ T(n) = k * T_i(n)
  ◦ where k is the scale factor, T_i(n) the initial execution time, and T(n) the execution time of node n.

CPU          Intel i5
RAM          8 GB
OS           Ubuntu Linux
ASP solver   Clingo 3.0
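Scaling every execution time by k under a fixed schedule scales every token lifetime by k as well, which changes which memories can hold a token without refresh. The sketch below illustrates this with the three retention times from the introduction and a hypothetical base lifetime of 10 μs. Picking the shortest refresh-free memory is a simplification for illustration; it is not the energy-optimal rule, which also depends on write and refresh energy.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

# Retention times from the STT1/STT2/STT3 table, in seconds.
RETENTION_S = {"STT1": 26.5e-6, "STT2": 3.24, "STT3": 4.27 * SECONDS_PER_YEAR}


def scaled_time(k, t_initial_s):
    """T(n) = k * T_i(n)."""
    return k * t_initial_s


def shortest_refresh_free(lifetime_s):
    """Shortest-retention memory that holds the token without any refresh."""
    for name, retention in sorted(RETENTION_S.items(), key=lambda kv: kv[1]):
        if lifetime_s <= retention:
            return name
    return None


BASE_LIFETIME_S = 10e-6  # hypothetical token lifetime at scale factor 1
choices = {
    k: shortest_refresh_free(scaled_time(k, BASE_LIFETIME_S))
    for k in (1, 10, 1000, 50000)
}
print(choices)  # {1: 'STT1', 10: 'STT2', 1000: 'STT2', 50000: 'STT2'}
```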

[Figure: results in four panels: scale factor = 1, 10, 1000, and 50000.]

[Figure: plot with axes labeled "scale factor" and "ratio".]

[Figure: results in two panels: scale factor = 1 and scale factor = 10.]

• The buffer mapping algorithm for a system with multiple-retention-time STT-MRAM memories can reduce energy consumption by 30 to 70%.
• The STT-MRAM memory to which a buffer is mapped depends on the variable lifetime.