Out-of-core Streamline Generation Using Flow-guided File Layout Chun-Ming Chen 788 Project 1

Presentation transcript:

Out-of-core Streamline Generation Using Flow-guided File Layout Chun-Ming Chen 788 Project 1

Background
Visualize flow fields with streamlines
Scientific data is huge
– Traditional: compute on clusters
– Drawbacks: high equipment cost, inter-node communication

Background
Nowadays: multi-core CPU on a single machine
May not have enough memory capacity
Out-of-core computation is needed
– Out-of-core: data cannot be fully loaded into main memory

Goal
Compute streamlines on a lower-cost multi-core machine with limited memory, given arbitrary seeds

Demand Paging Algorithm
Preparation stage:
– Break flow fields into blocks
Streamline generation stage:
– Only load needed blocks during computation
– Release the least recently used (LRU) block when memory is full
Diagram labels: Load data from disk; Compute; Release data (LRU); Store data in memory pool
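
The demand-paging stage can be sketched as a small LRU block cache. This is only an illustration, not the project's code: the class name `BlockCache`, the `load_block` callable and the `loads` counter are all hypothetical, and capacity is counted in blocks rather than bytes for simplicity.

```python
from collections import OrderedDict


class BlockCache:
    """Memory pool holding at most `capacity` blocks; evicts the LRU block."""

    def __init__(self, capacity, load_block):
        self.capacity = capacity
        self.load_block = load_block  # hypothetical callable: block_id -> data
        self.pool = OrderedDict()     # insertion order doubles as recency order
        self.loads = 0                # number of disk loads (cache misses)

    def get(self, block_id):
        if block_id in self.pool:
            # Hit: mark the block as most recently used.
            self.pool.move_to_end(block_id)
            return self.pool[block_id]
        if len(self.pool) >= self.capacity:
            # Pool is full: release the least recently used block.
            self.pool.popitem(last=False)
        data = self.load_block(block_id)  # load from disk only on demand
        self.loads += 1
        self.pool[block_id] = data
        return data
```

With a capacity of two blocks, requesting blocks 1, 2, 1, 3 in that order releases block 2, because block 1 was touched more recently.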

Multi-core streamline computation
Diagram labels:
– Job queue: seeds for block 1, seeds for block 2, seeds for block 3, seeds for block 4
– Threaded computation (one job per thread)
– New seeds generated from block 1 go back into the job queue
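
One way to read the diagram is as a shared job queue in which each job holds the seeds for one block, and tracing a job may produce new seeds for other blocks. The sketch below makes that reading concrete under stated assumptions: `run_jobs` and `trace_in_block` are hypothetical names, and the actual streamline integration is left to the caller.

```python
import queue
import threading


def run_jobs(initial_seeds, trace_in_block, n_threads=4):
    """Process per-block seed jobs with a pool of worker threads.

    initial_seeds: dict mapping block_id -> list of seeds.
    trace_in_block: hypothetical callable (block_id, seeds) -> dict mapping
        neighbouring block_id -> seeds that left the current block.
    """
    jobs = queue.Queue()
    for block_id, seeds in initial_seeds.items():
        jobs.put((block_id, seeds))

    def worker():
        while True:
            item = jobs.get()
            if item is None:          # sentinel: no more work
                jobs.task_done()
                return
            block_id, seeds = item
            try:
                spawned = trace_in_block(block_id, seeds)
                for next_block, new_seeds in spawned.items():
                    jobs.put((next_block, new_seeds))  # re-queue new seeds
            finally:
                jobs.task_done()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    jobs.join()                       # waits for spawned jobs as well
    for _ in threads:
        jobs.put(None)
    for t in threads:
        t.join()
```

New jobs are queued before the parent job is marked done, so `jobs.join()` cannot return while spawned seeds are still pending.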

Problem of Out-of-core Computation
Earlier tests: 1 GB data
– Environment: 8-core Intel machine, memory usage limited to 25 MB
– Time generating streamlines: s
– Time loading flow field: s
I/O is the bottleneck

More tests
Read all blocks in a 6 GB data set
Unit block size: float 16x16x16 (49152 bytes)
Total: 131,072 blocks
– Random access: sec
– Sequential read: sec
– Reverse-sequential read: sec
Sequential read can be 20 times faster
Reason: disk prefetching
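
The block size is consistent with three 4-byte float components per cell (16 x 16 x 16 x 3 x 4 = 49152 bytes), and 131,072 such blocks come to exactly 6 GiB. The three-component reading is an inference from the byte count, not something the slide states. A minimal harness for comparing read orders might look like the following; `read_blocks` and `access_orders` are hypothetical helpers, and on a small or recently written file the operating system's page cache will hide most of the difference.

```python
import random
import time

# Assumption: 16^3 cells, 3 float32 components each (matches 49152 bytes).
BLOCK_BYTES = 16 * 16 * 16 * 3 * 4


def access_orders(n_blocks, seed=0):
    """Return the three block orders compared on the slide."""
    sequential = list(range(n_blocks))
    shuffled = sequential[:]
    random.Random(seed).shuffle(shuffled)
    return {
        "sequential": sequential,
        "reverse": sequential[::-1],
        "random": shuffled,
    }


def read_blocks(path, order, block_bytes=BLOCK_BYTES):
    """Read blocks in the given order; return (seconds, bytes_read)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        for i in order:
            f.seek(i * block_bytes)
            total += len(f.read(block_bytes))
    return time.perf_counter() - start, total
```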

File Layout
Rearrange data to increase sequential reads
Hilbert curve layout:
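
To illustrate how a Hilbert curve orders blocks, here is the standard 2D coordinate-to-index conversion. It is a simplification: the flow-field blocks in the slides are 3D, and the function names `hilbert_index` and `hilbert_layout` are hypothetical. The property that matters for file layout is that blocks written consecutively are spatial neighbours.

```python
def hilbert_index(n, x, y):
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its
    position along the 2D Hilbert curve."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def hilbert_layout(n):
    """Return all cells of the grid in the order they would be written."""
    cells = [(x, y) for x in range(n) for y in range(n)]
    return sorted(cells, key=lambda c: hilbert_index(n, c[0], c[1]))
```

In the resulting order every consecutive pair of cells differs by one step in x or y, which is what lets a particle crossing a block boundary often find the next block nearby in the file.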

Result of Scheduling for Hilbert Curve Layout
Scheduler: only read forward
Test: 1 GB data
– Environment: 8-core Intel machine, memory usage limited to 25 MB
Old test:
– Time generating streamlines: s
– Time loading flow field: s
Hilbert layout:
– Time generating streamlines: s
– Time loading flow field: s
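
The slide only says that the scheduler reads forward. One plausible interpretation, sketched here as an assumption rather than the project's actual scheduler, is to serve pending block requests in ascending file position starting from the current read position, and to wrap to the start of the file only when nothing is left ahead. The name `forward_order` and the wrap-around behaviour are hypothetical.

```python
import bisect


def forward_order(pending_positions, head=0):
    """Order pending block positions so the file is swept forward from
    `head`, wrapping to the beginning once nothing remains ahead."""
    positions = sorted(set(pending_positions))
    split = bisect.bisect_left(positions, head)
    return positions[split:] + positions[:split]
```

For example, with the read position at 4 and pending blocks at 5, 1, 9 and 3, the blocks at 5 and 9 are read first and the blocks at 1 and 3 follow on the next sweep.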

Layout by Flow Direction

Next and Conclusion
Next:
– Better layout?
– Rearrange data based on flow direction
– NP-hard problem
Conclusion:
– To analyze large scientific data on a single machine, out-of-core computation is required now and in the future
– A good file layout is important for out-of-core computation