Computer Engineering of Wave Machines for Seismic Modeling and Seismic Migration. R. Phillip Bording, MUN, March 10, 2005.

Presentation transcript:

Computer Engineering of Wave Machines for Seismic Modeling and Seismic Migration. R. Phillip Bording, March 10, 2005. Husky Energy Chair in Oil and Gas Research, Memorial University of Newfoundland.

Cache Memory: Three-Level Architecture. An address pointer feeds cache control logic in front of main memory (multi-gigabytes, large and slow), L3 cache memory (16 megabytes), L2 cache memory (128 kilobytes), and L1 cache memory (32 kilobytes). Diagram multipliers: 160X, 16X, 8X, 2X. 2 gigahertz clock. Featuring really non-deterministic execution.

Problem Solving: 3D Example of Array Addressing. Address = (k-1)*Lx*Ly + (j-1)*Lx + (i-1) + base. Grid points: (i,j,k), (i-1,j,k), (i+1,j,k).
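The addressing formula on this slide can be sketched in Python. This is a minimal illustration, assuming 1-based indices with i varying fastest, as the formula implies; the function name `grid_address` and the sample grid extents are hypothetical.

```python
def grid_address(i, j, k, lx, ly, base=0):
    """Linear address of grid point (i, j, k); indices are 1-based, i varies fastest."""
    return (k - 1) * lx * ly + (j - 1) * lx + (i - 1) + base


lx, ly = 4, 3  # hypothetical grid extents in x and y
here = grid_address(2, 2, 2, lx, ly)

# Neighbours in i are adjacent in memory; neighbours in j and k sit
# Lx and Lx*Ly elements away, respectively.
step_i = grid_address(3, 2, 2, lx, ly) - here
step_j = grid_address(2, 3, 2, lx, ly) - here
step_k = grid_address(2, 2, 3, lx, ly) - here
print(here, step_i, step_j, step_k)
```

The three step sizes are the strides that the cache access stream slides refer to as +/-1, +/-N, and +/-N*N.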

Cache Memory Access Streams. 1D streams: 100%. 1D +/-1: 100%. 2D +/-1: 100%. 2D +/-N: 80%. 2D +/-1 and +/-N: 26%.

Cache Memory Access Streams. 3D +/-1: 100%. 3D +/-N: 80%. 3D +/-N*N: 28%. 3D all: 7%.
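One way to see why the combined access patterns fare worst is to count how many distinct cache lines a single grid point update touches. The sketch below does only that; it does not reproduce the percentages on the slides, which are the author's figures. The 16-word cache line, the grid extent of 1000, and the function name `lines_touched` are assumptions for illustration.

```python
def lines_touched(center, offsets, words_per_line=16):
    """Count the distinct cache lines hit by center plus each offset."""
    return len({(center + off) // words_per_line for off in offsets})


n = 1000                       # hypothetical grid extent per axis
center = 5 * n * n + 5 * n     # an interior point that falls mid cache line

unit_stride = lines_touched(center, [0, -1, 1])
row_stride = lines_touched(center, [0, -n, n])
full_stencil = lines_touched(center, [0, -1, 1, -n, n, -n * n, n * n])
print(unit_stride, row_stride, full_stencil)
```

Under these assumptions the +/-1 neighbours share one line, while the full 3D pattern opens five separate lines for every update.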

IEEE 754 Floating Point (three slides, figures only)

Seismic Modeling and the Inverse Problem

12 streamers x 5.1 kilometers long. Data collected for 70 continuous days over 2300 square km.

3D Seismic Modeling
1. Large scale 3D: ~200+ wavelengths.
2. Acoustic and elastic wave equations.
3. The inhomogeneous Earth has widely varying parameters.
4. Complexity limits use of 3D elastic modeling.
5. Problem scale: Nx = Ny = Nz ~ 1000; Ntime ~ 10,000; work per grid point ~ 100; number of seismic shots per survey ~ 100,000. A single survey simulation is 10^20 operations.
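The 10^20 figure follows directly from the problem scale in item 5. A minimal check, using only the numbers on the slide (the function name `survey_operations` is hypothetical):

```python
def survey_operations(nx, ny, nz, ntime, work_per_point, shots):
    """Total operations: grid points x time steps x work per point x shots."""
    return nx * ny * nz * ntime * work_per_point * shots


total = survey_operations(1000, 1000, 1000, 10_000, 100, 100_000)
print(total == 10**20)
```

Each shot on its own is 10^9 grid points x 10^4 time steps x 100 operations, or 10^15 operations.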

The Babbage Difference Engine, circa 1853

Wave Equation Difference Engine (WEDE) for Seismic Modeling. Four processors, acoustic wave equation. My PhD thesis project at the University of Tulsa.

Wave Equation Difference Engine. Finite differences. Elastic or acoustic wave equations. Regular grids. Sponge or one-way wave equation boundary conditions. Any source/receiver geometry. Explicit, 4th order in time and 8th order in space?

Wave Equation Difference Engine. No cache memory. Deterministic execution. Not MIMD, SIMD, or data flow. Data movement and control match the algorithm. Each grid point has a control word.
Three levels of parallelism (amount of parallelism):
Instruction trees, ~
Multiple instructions with selection, ~2-3
Multiple grid points, ~hundreds of thousands

Acoustic, Constant Density. Density is so constant it does not appear in the equation. C is the P-wave velocity. The source energy is in src. Psi is the wave field.
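The equation itself did not survive in the transcript. For reference, the textbook constant-density acoustic wave equation, written with the symbols this slide defines, has the form below. The exact sign and scaling of the source term on the original slide are assumptions.

```latex
\frac{\partial^2 \psi}{\partial t^2}
  = C^2 \left(
      \frac{\partial^2 \psi}{\partial x^2}
    + \frac{\partial^2 \psi}{\partial y^2}
    + \frac{\partial^2 \psi}{\partial z^2}
    \right) + \mathrm{src}
```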

Wave Equation Difference Engine Machine Performance. 100 operations in pipeline. 1,000,000 grid point processors. 100 megahertz clock. 10^16 operations per second.
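The peak rate is the product of the three figures on this slide. The sketch below checks that and, as an assumption, divides the per-shot work from the 3D Seismic Modeling slide (1000^3 grid points, 10,000 time steps, 100 operations per point) by the peak rate to estimate the time per shot. The function name is hypothetical.

```python
def peak_ops_per_second(ops_in_pipeline, processors, clock_hz):
    """Peak rate when every pipeline stage of every processor is busy each cycle."""
    return ops_in_pipeline * processors * clock_hz


peak = peak_ops_per_second(100, 1_000_000, 100_000_000)

# Work for one shot, taken from the problem scale stated earlier in the talk.
ops_per_shot = 1000**3 * 10_000 * 100
seconds_per_shot = ops_per_shot / peak
print(peak == 10**16, seconds_per_shot)
```

This assumes perfect pipeline utilization, which is the deterministic execution the design aims for.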

Application Specific Parallel Computing. Choose carefully an application which is BIG. Find an algorithm which is suitable: good data locality, regular structure in data movement, high memory data transfers. Map the algorithm into hardware.

Application Specific Parallel Computing: What it is not! Not suitable for just any algorithm. Not general purpose; we will have an efficient but specific memory subsystem. Does not match the alphabet soup: SIMD, MIMD, NUMA, etc.

What do ASP machines need? VLSI design team, fabless and good? Clever architect for the problem. A very good memory design!

What do ASP machines do away with? Language compilers. Outdated junk in the processor design, x86! Cache memories! Non-deterministic execution!

Multiple Bank Memory Systems. Starting address, +N, +2N, +3N, mod 4. Memory banks: as many as are needed!
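A small sketch of how strided addresses spread across interleaved banks. It assumes the bank is selected by address mod the number of banks, which is how the "Mod 4" label reads; the function name `bank_sequence` and the sample strides are hypothetical.

```python
def bank_sequence(start, stride, count, num_banks=4):
    """Bank index hit by each of `count` accesses at start, start+stride, ..."""
    return [(start + n * stride) % num_banks for n in range(count)]


unit = bank_sequence(0, 1, 4)         # consecutive addresses
aligned = bank_sequence(0, 1000, 4)   # stride is a multiple of the bank count
skewed = bank_sequence(0, 1001, 4)    # stride is coprime to the bank count
print(unit, aligned, skewed)
```

With a stride that is a multiple of the bank count, every access lands in the same bank and the accesses serialize. Choosing the number of banks to suit the algorithm's strides avoids this.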

Pipelined Instruction Trees. Each higher level offers parallel operations. The pipeline assumes all registers are loaded every cycle. Hardwired? Actually, today the instruction trees could be reconfigurable using reprogrammable cells! Example: r = a+b-x*y

Pipelined Instruction Trees. (Diagram: instruction trees over the inputs a, b, d, y and a, b, x, y, built from +, -, and * nodes.) Multiple trees offer the second level of parallelism.
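The example r = a+b-x*y can be sketched as a two-level pipelined tree: the add and the multiply form the first level, and the subtract forms the second. This is a software illustration only, not the machine's actual design. The function name `pipelined_tree` and the single pipeline register are assumptions.

```python
def pipelined_tree(operand_stream):
    """Stream (a, b, x, y) tuples through a two-level tree, one tuple per cycle."""
    stage1 = None   # pipeline register between level 1 and level 2
    results = []
    # One extra cycle (None) drains the last value out of the pipeline.
    for operands in list(operand_stream) + [None]:
        # Level 2 consumes what level 1 produced on the previous cycle.
        if stage1 is not None:
            total, product = stage1
            results.append(total - product)
        # Level 1: the add and the multiply are independent, so hardware
        # could perform both in the same cycle.
        if operands is not None:
            a, b, x, y = operands
            stage1 = (a + b, x * y)
        else:
            stage1 = None
    return results


print(pipelined_tree([(1, 2, 3, 4), (5, 6, 7, 8)]))
```

Once the pipeline is full, one result emerges per cycle even though each result takes two cycles to compute.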

Three Levels of Parallelism
1. Instruction trees, multiple levels
2. Multiple results
3. Multiple grid point processors

Wave Machine

Imaging Machine

Wave Equation
a) 8th or 10th order in space
b) 4th order in time, tricky but possible
c) Sponge boundary conditions, slowly varying weights along sides
d) Nominal flat topography; new schemes are building in topography
e) Any seismic source location, any geophone location

Elastic Wave Equation
a) Grid point work is about 100 operations
b) About 20,000 time steps per shot
c) 200 wavelengths gives about 160,000 geophone locations
d) Traces have 4096 samples at 2 milliseconds; could be 1 ms

Elastic Wave Equation. Shots are placed at twice the receiver spacing. The number of shots equals 40,000. Model frequency is velocity dependent; assume something on the order of 60 hertz.
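The receiver and shot counts on the two Elastic Wave Equation slides are consistent with a square surface grid. The sketch below assumes two receivers per wavelength along each side, an assumption chosen because it reproduces the slide's 160,000 geophone locations; the function name and parameter names are hypothetical.

```python
def survey_geometry(wavelengths=200, receivers_per_wavelength=2, shot_spacing_factor=2):
    """Return (receiver count, shot count) for a square surface grid."""
    receivers_per_side = wavelengths * receivers_per_wavelength
    # Shots sit at a multiple of the receiver spacing, so fewer fit per side.
    shots_per_side = receivers_per_side // shot_spacing_factor
    return receivers_per_side**2, shots_per_side**2


receivers, shots = survey_geometry()
trace_seconds = 4096 * 0.002   # 4096 samples at 2 ms per sample
print(receivers, shots, trace_seconds)
```

Doubling the shot spacing in both directions is what takes the count from 160,000 receivers down to 40,000 shots.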

Economics. Up-front fixed cost: $5 to $10 million. Each ASP chip is $5 to $10. A petaflop for $5 or $10 million.

Economics. A seismic shot takes 0.1 seconds. A 5-year life is 50,000 models. A realistic 3D elastic seismic model would cost $200.

Comparison. 10 clusters, ~$10 million: 10 models per year. One Waves in Linear Motion Analyzer (WILMA), ~$10 million: 10,000 models per year.

Comparison. Waves in Linear Motion Analyzer: 1000X faster for the same money!
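The cost per model and the speedup on the Economics and Comparison slides can be checked with simple arithmetic. The sketch assumes the upper cost figure of $10 million and counts only the up-front cost; the function name `cost_per_model` is hypothetical.

```python
def cost_per_model(machine_cost, models_per_year, years):
    """Up-front machine cost spread over every model run in its lifetime."""
    return machine_cost / (models_per_year * years)


wilma_models_per_year = 10_000
cluster_models_per_year = 10

lifetime_models = wilma_models_per_year * 5
wilma_cost = cost_per_model(10_000_000, wilma_models_per_year, 5)
speedup = wilma_models_per_year / cluster_models_per_year
print(lifetime_models, wilma_cost, speedup)
```

The same formula applied to the clusters gives $200,000 per model, which is the 1000X difference in another form.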

Summary. 1000 megawatts is a good-sized power station. Good memory design is worth the money! Removing the obstacles to efficient computing gives sustainable performance.

Summary. Slower is better. Less power is better. High efficiency is better.

Conclusions. Deterministic computing is important for performance. Application-specific computing is a good fit for the wave equation, and very cost effective.

Thanks. SEG Continuing Education. Memorial University of Newfoundland.

Hamming: “The purpose of computing is insight, not numbers.”