EMC Presentation – April 2005
Northeastern University
– I/O storage modeling and performance – David Kaeli
– Soft error modeling and mitigation – Mehdi B. Tahoori

I/O Storage Research at Northeastern University
David Kaeli and Yijian Wang
Department of Electrical and Computer Engineering, Northeastern University, Boston, MA

Outline
– Motivation to study file-based I/O
– Profile-driven partitioning for parallel file I/O
– I/O Qualification Laboratory – NU
– Areas for future work

Important File-based I/O Workloads
Many subsurface sensing and imaging workloads involve file-based I/O:
– Cellular biology – in-vitro fertilization, with NU biologists
– Medical imaging – cancer therapy, with MGH
– Underwater mapping – multi-sensor fusion, with the Woods Hole Oceanographic Institution
– Ground-penetrating radar – toxic waste tracking, with Idaho National Labs

The Impact of Profile-guided Parallelization on SSI Applications
– Reduced the runtime of a single-body Steepest Descent Fast Multipole Method (SDFMM) application by 74% on a 32-node Beowulf cluster (hot-path parallelization, data restructuring)
– Reduced the runtime of a Monte Carlo scattered-light simulation by 98% on a 16-node Silicon Graphics Origin 2000 (Matlab-to-C compilation, hot-path parallelization)
– Obtained superlinear speedup of the Ellipsoid Algorithm on a 16-node IBM SP2 (Matlab-to-C compilation, hot-path parallelization)
[Figure: ground-penetrating radar scene – soil, air, mine]

Limits of Parallelization
– For compute-bound workloads, Beowulf clusters can be used effectively to overcome computational barriers
– Middleware (e.g., MPI and MPI-IO) can significantly reduce the programming effort on parallel systems
– Multiple clusters can be combined using Grid middleware (the Globus Toolkit)
– For file-based, I/O-bound workloads, Beowulf clusters and Grid systems are presently ill-suited to exploit the potential parallelism present on these systems

Outline
– Motivation to study file-based I/O
– Profile-driven partitioning for parallel file I/O
– I/O Qualification Laboratory – NU
– Areas for future work

Parallel I/O Acceleration
The I/O bottleneck:
– The growing gap between the speed of processors and networks and that of the underlying I/O devices
– Many imaging and scientific applications access disks very frequently
I/O-intensive applications:
– Out-of-core applications: work on large datasets that cannot fit in main memory
– File-intensive applications: access file-based datasets frequently, with a large number of file operations

Introduction: Storage Architectures
– Direct Attached Storage (DAS): the storage device is directly attached to the computer
– Network Attached Storage (NAS): the storage subsystem is attached to a network of servers, and file requests are passed through a parallel filesystem to the centralized storage device
– Storage Area Network (SAN): a dedicated network providing any-to-any connectivity between processors and disks

I/O Partitioning
[Figure: an I/O-intensive application's data striped across multiple processes (i.e., MPI-IO) and partitioned across multiple disks (i.e., RAID)]

I/O Partitioning
– I/O is parallelized at both the application level (using MPI and MPI-IO) and the disk level (using file partitioning)
– Ideally, every process accesses only files on its local disk (though this is typically not possible due to data sharing)
– How do we recognize the access patterns? A profile-guided approach (see the MPI-IO sketch below)
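To make the application-level half concrete, here is a minimal MPI-IO sketch (illustrative, not taken from the original work) in which each process writes one contiguous chunk of a shared file at an offset derived from its rank; the file name and chunk size are assumptions made for the example:

```c
/* Minimal MPI-IO sketch: each rank writes one contiguous 1 MB chunk of a
 * shared file at a rank-derived offset. File name and chunk size are
 * assumptions for the example. */
#include <mpi.h>
#include <stdlib.h>

#define CHUNK (1 << 20)   /* 1 MB per process */

int main(int argc, char **argv)
{
    int rank;
    MPI_File fh;
    char *buf;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    buf = calloc(CHUNK, 1);            /* application data would go here */

    MPI_File_open(MPI_COMM_WORLD, "data.out",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Each process targets its own non-overlapping region of the file. */
    MPI_File_write_at(fh, (MPI_Offset)rank * CHUNK, buf, CHUNK,
                      MPI_BYTE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    free(buf);
    MPI_Finalize();
    return 0;
}
```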

Profile Generation
1. Run the application
2. Capture I/O execution profiles
3. Apply our partitioning algorithm
4. Rerun the tuned application

I/O Traces and Partitioning
For every process, and for every contiguous file access, we capture the following I/O profile information:
– Process ID
– File ID
– Address
– Chunk size
– I/O operation (read/write)
– Timestamp
We generate a partition for every process. Optimal partitioning is NP-complete, so we developed a greedy algorithm. We have found that we can use partial profiles to guide partitioning.
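As one way to capture such records (an illustration, not the authors' actual tracer), MPI-IO calls can be interposed through the standard PMPI profiling interface; the record layout below simply mirrors the fields listed above, and the logging details are assumptions:

```c
/* Sketch of an I/O trace record plus a PMPI interposition wrapper.
 * The struct mirrors the profile fields on the slide; how records are
 * buffered and flushed is left out and would be tracer-specific. */
#include <mpi.h>

typedef struct {
    int       pid;        /* MPI rank of the issuing process */
    int       file_id;    /* internal ID assigned at MPI_File_open */
    long long address;    /* file offset of the contiguous access */
    long long chunk_size; /* bytes transferred */
    char      op;         /* 'r' for read, 'w' for write */
    double    timestamp;  /* MPI_Wtime() at the call */
} io_trace_rec;

/* Intercept MPI_File_write_at, log it, then forward to the real call. */
int MPI_File_write_at(MPI_File fh, MPI_Offset offset, const void *buf,
                      int count, MPI_Datatype type, MPI_Status *status)
{
    io_trace_rec rec;
    int type_size;

    MPI_Comm_rank(MPI_COMM_WORLD, &rec.pid);
    MPI_Type_size(type, &type_size);
    rec.file_id    = 0;                 /* a real tracer maps fh -> an ID */
    rec.address    = (long long)offset;
    rec.chunk_size = (long long)count * type_size;
    rec.op         = 'w';
    rec.timestamp  = MPI_Wtime();
    /* ... append rec to a per-process trace buffer here ... */

    return PMPI_File_write_at(fh, offset, buf, count, type, status);
}
```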

EMC PresentationApril for each IO process, create a partition; for each contiguous data chunk { total up the # of read/write accesses on a process-ID basis; if the chunk is accessed by only one process assign the chunk to the associated partition; if the chunk is read (but never written) by multiple processes duplicate the chunk in all partitions where read; if the chunk is written by one process, but later read by multiple { assign the chunk to all partitions where read and broadcast the updates on writes; else assign the chunk to a shared partition; } For each partition sort chunks based on the earliest timestamp for each chunk; Greedy File Partitioning Algorithm

Parallel I/O Workloads
– NAS Parallel Benchmarks (NPB 2.4)/BT – computational fluid dynamics; generates a file (~1.6 GB) dynamically and then reads it back; writes/reads sequentially in chunk sizes of 2040 bytes
– SPEChpc96/seismic – seismic processing; generates a file (~1.5 GB) dynamically and then reads it back; writes sequential chunks of 96 KB and reads sequential chunks of 2 KB
– Tile-IO (Parallel Benchmarking Consortium) – tiled access to a two-dimensional matrix (~1 GB) with overlap; writes/reads sequential chunks of 32 KB, with 2 KB of overlap
– Perf (the parallel I/O test program within MPICH) – writes a 1 MB chunk at a location determined by rank, with no overlap
– Mandelbrot – an image-processing application that includes visualization; chunk size depends on the number of processes

Beowulf Cluster
[Diagram: Pentium II 350 MHz nodes with local PCI-IDE disks and RAID nodes, connected by a 10/100 Mb Ethernet switch]

Hardware Specifics
– DAS configuration: Linux box, Western Digital WD800BB (IDE), 80 GB, 7200 RPM
– Beowulf cluster (base configuration):
  – Fast Ethernet, 100 Mbits/sec
  – Network-attached RAID: Morstor TF200 with six 9-GB Seagate SCSI disks, 7200 RPM, RAID-5
  – Locally attached IDE disks: IBM UltraATA, 5400 RPM
– Fibre Channel disks: Seagate Cheetah X15 ST FC, 15000 RPM

Write/Read Bandwidth
[Charts: NPB 2.4/BT and SPEChpc96/seismic]

Write/Read Bandwidth
[Charts: MPI-Tile-IO, Perf, and Mandelbrot]

Profile Training Sensitivity Analysis
– We have found that I/O access patterns are independent of file-based data values
– When we increase the problem size or reduce the number of processes, either:
  – the number of I/Os increases, but the access patterns and chunk size remain the same (SPEChpc96, Mandelbrot), or
  – the number of I/Os and the I/O access patterns remain the same, but the chunk size increases (NPB/BT, Tile-IO, Perf)
– Re-profiling can therefore be avoided

Execution-driven Parallel I/O Modeling
– There is a growing need to process large, complex datasets in high-performance parallel computing applications
– Efficient implementation of storage architectures can significantly improve system performance
– We provide an accurate simulation environment for users to test and evaluate different storage architectures and applications

Execution-driven I/O Modeling
– Target applications: parallel scientific programs (MPI)
– Target machine and host machine: Beowulf clusters
– DiskSim is used as the underlying disk-drive simulator
– Direct execution models the CPU and network communication
– We execute the real parallel I/O accesses and, at the same time, calculate the simulated I/O response time (sketched below)
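The essence of the direct-execution approach can be sketched as follows: each I/O call is really performed so the application sees correct data, but the time charged to a virtual clock comes from the disk model rather than from the wall clock. Everything here, including the disksim_service_time() hook, is a hypothetical stand-in for the actual DiskSim integration:

```c
/* Hypothetical direct-execution wrapper: do the real read, but advance a
 * simulated clock by the disk model's predicted service time instead of
 * the measured elapsed time. disksim_service_time() is a stand-in name,
 * not the real DiskSim interface. */
#include <unistd.h>
#include <sys/types.h>

static double sim_clock;   /* simulated time, in seconds */

extern double disksim_service_time(int disk, off_t offset,
                                   size_t bytes, int is_write);

ssize_t sim_read(int fd, void *buf, size_t count, off_t offset)
{
    ssize_t n = pread(fd, buf, count, offset);  /* real I/O, real data */
    if (n > 0)
        sim_clock += disksim_service_time(fd, offset, (size_t)n, 0);
    return n;
}
```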

Validation – Synthetic I/O Workload on DAS
[Charts: simulated vs. measured results]

Simulation Framework – NAS
[Diagram: LAN/WAN, network file system, logical file access addresses, filesystem metadata, local I/O traces, I/O requests, RAID controller, DiskSim]

Simulation Framework – SAN direct
[Diagram: filesystem I/O traces over the LAN/WAN feeding multiple DiskSim instances]
– A variant of SAN in which disks are distributed across the network and each server is directly connected to a single device
– File partitioning: utilize I/O profiling and data-partitioning heuristics to distribute portions of files to disks close to the processing nodes

Hardware Specifications
[Table not reproduced in the transcript]

Publications
1. “Profile-guided File Partitioning on Beowulf Clusters,” Journal of Cluster Computing, Special Issue on Parallel I/O, to appear.
2. “Execution-Driven Simulation of Network Storage Systems,” Proceedings of the 12th ACM/IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), October 2004.
3. “Profile-Guided I/O Partitioning,” Proceedings of the 17th ACM International Symposium on Supercomputing, June 2003.
4. “Source Level Transformations to Apply I/O Data Partitioning,” Proceedings of the IEEE Workshop on Storage Network Architecture and Parallel I/O, October 2003.
5. “Profile-Based Characterization and Tuning for Subsurface Sensing and Imaging Applications,” International Journal of Systems, Science and Technology, September 2002.

Summary of Cluster-based Work
– Many imaging applications are dominated by file-based I/O
– Parallel systems can only be utilized effectively if I/O is also parallelized
– We developed a profile-guided approach to I/O data partitioning
  – Impacting clinical trials at MGH
  – Reduced overall execution time by 27-82% over MPI-IO
– Our execution-driven I/O model is highly accurate and provides significant modeling flexibility

Outline
– Motivation to study file-based I/O
– Profile-driven partitioning for parallel file I/O
– I/O Qualification Laboratory – NU
– Areas for future work

I/O Qualification Laboratory
– Working with the Enterprise Strategy Group
– Develop a state-of-the-art facility to provide independent performance qualification of enterprise storage (ES) systems
– Provide a quarterly report to the ES customer base on the status of current ES offerings
– Work with leading ES vendors to provide them with custom early performance evaluation of their beta products

I/O Qualification Laboratory
– Contacted by IOIntegrity and SANGATE for product qualification
– Developed potential partnerships with leaders in the ES field
– Initial proposals already reviewed by IBM, Hitachi, and other ES vendors
– Looking for initial endorsement from industry

I/O Qualification Laboratory – NU
– Track record with industry (EMC, IBM, Sun)
– Experience with benchmarking and I/O characterization
– Interesting set of applications (medical, environmental, etc.)
– Great opportunity to work within the cooperative education model

Outline
– Motivation to study file-based I/O
– Profile-driven partitioning for parallel file I/O
– I/O Qualification Laboratory – NU
– Areas for future work

Areas for Future Work
– Designing a peer-to-peer storage system on a Grid by partitioning datasets across geographically distributed storage devices
[Diagram: a RAID-backed head node; joulian.hpcl.neu.edu (31 sub-nodes, 1 Gbit/s) and keys.ece.neu.edu (8 sub-nodes, 100 Mbit/s) connected over the Internet]

Areas for Future Work
– Reduce simulation time by identifying characteristic “phases” in I/O workloads
– Apply machine-learning algorithms to identify clusters of representative I/O behavior
– Utilize K-Means and multinomial clustering to obtain high fidelity in simulation runs that use sampled I/O behavior (a K-Means sketch follows)
“A Multinomial Clustering Model for Fast Simulation of Architecture Designs,” submitted to the 2005 ACM KDD Conference.
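To illustrate the phase-clustering idea (not the authors' implementation), each fixed-size window of an I/O trace could be summarized as a small feature vector, such as read fraction, mean chunk size, and mean inter-arrival time, and the windows clustered with K-Means so that only one representative window per cluster needs detailed simulation. A minimal Lloyd-iteration sketch follows; the feature choice, K, and iteration count are arbitrary assumptions:

```c
/* Minimal K-Means over per-window I/O feature vectors. Feature choice,
 * K, and the fixed iteration count are assumptions for illustration.
 * Requires n >= K so the first K windows can seed the centroids. */
#include <float.h>
#include <string.h>

#define DIM   3    /* features per trace window */
#define K     4    /* number of clusters        */
#define ITERS 20   /* Lloyd iterations          */

static double dist2(const double *a, const double *b)
{
    double s = 0.0;
    for (int d = 0; d < DIM; d++)
        s += (a[d] - b[d]) * (a[d] - b[d]);
    return s;
}

/* x: n windows of DIM features each; label[i] gets window i's cluster. */
void kmeans(const double x[][DIM], int n, int label[])
{
    double c[K][DIM];
    for (int k = 0; k < K; k++)          /* seed centroids with first K */
        memcpy(c[k], x[k], sizeof c[k]);

    for (int it = 0; it < ITERS; it++) {
        for (int i = 0; i < n; i++) {    /* assignment step */
            double best = DBL_MAX;
            for (int k = 0; k < K; k++) {
                double d = dist2(x[i], c[k]);
                if (d < best) { best = d; label[i] = k; }
            }
        }
        double sum[K][DIM] = {{0}};      /* update step */
        int cnt[K] = {0};
        for (int i = 0; i < n; i++) {
            cnt[label[i]]++;
            for (int d = 0; d < DIM; d++)
                sum[label[i]][d] += x[i][d];
        }
        for (int k = 0; k < K; k++)
            if (cnt[k] > 0)
                for (int d = 0; d < DIM; d++)
                    c[k][d] = sum[k][d] / cnt[k];
    }
}
```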