ParaMEDIC: Parallel Metadata Environment for Distributed I/O and Computing P. Balaji, Argonne National Laboratory; W. Feng and J. Archuleta, Virginia Tech.

Presentation transcript:

ParaMEDIC: Parallel Metadata Environment for Distributed I/O and Computing. P. Balaji, Argonne National Laboratory; W. Feng and J. Archuleta, Virginia Tech; H. Lin, North Carolina State University. SC|07 Storage Challenge.

Overview Biological Problems of Significance –Discover missing genes via sequence-similarity computations (i.e., mpiBLAST) –Generate a complete genome sequence-similarity tree to speed up future sequence searches Our Contributions –Worldwide Supercomputer Compute: ~12,000 cores across six U.S. supercomputing centers Storage: 0.5 petabytes at the Tokyo Institute of Technology –ParaMEDIC: Parallel Metadata Environment for Distributed I/O and Computing Decouples computation and I/O and drastically reduces I/O overhead Delivers 90% storage bandwidth utilization –A 100x improvement over (vanilla) mpiBLAST

Outline Motivation Problem Statement Approach Results Conclusion

Importance of Sequence Search Motivation: why sequence search is so important …

Challenges in Sequence Search Observations –Overall size of genomic databases doubles every 12 months –Processing horsepower doubles only every 18–24 months Consequence –The rate at which genomic databases are growing is outstripping our ability to compute (i.e., sequence search) on them.
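
This widening gap can be made concrete with a quick back-of-the-envelope sketch. The snippet below is illustrative only: it assumes the doubling periods above (12 months for database size, 18 months for processing power, the optimistic end of the 18–24 month range) and arbitrary starting values of 1.0 for both.

    # Back-of-the-envelope sketch of the data-vs-compute gap (illustrative only).
    # Assumes databases double every 12 months and processing power every 18.
    def growth(years, doubling_period_months):
        return 2.0 ** (12.0 * years / doubling_period_months)

    for years in (1, 2, 5, 10):
        db = growth(years, 12)    # relative database size
        cpu = growth(years, 18)   # relative processing horsepower
        print(f"after {years:2d} years: database x{db:.1f}, "
              f"compute x{cpu:.1f}, gap x{db / cpu:.1f}")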

Problem Statement #1 The Case of the Missing Genes –Problem Most current genes have been detected by a gene-finder program, which can miss real genes –Approach Every possible location along a genome should be checked for the presence of genes –Solution All-to-all sequence search of all 567 microbial genomes that have been completed to date … but requires more resources than can traditionally be found at a single supercomputer center 2.63 x 10^14 sequence searches!
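
For a sense of scale: only the genome count (567) is taken from the slide, and the per-genome query counts that drive the full figure are not listed here, so the snippet below just counts genome-vs-genome comparisons.

    # Scale of the all-to-all comparison (sketch). Only the genome count (567)
    # comes from the slide; per-genome query counts are not given here.
    n_genomes = 567
    genome_pairs = n_genomes * n_genomes   # every genome searched against every genome
    print(f"genome-vs-genome searches: {genome_pairs:,}")   # 321,489
    # The sequence-level work multiplies each pair by the number of candidate
    # query fragments per genome, which is what pushes the total toward the
    # ~2.63 x 10^14 sequence searches quoted above.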

Problem Statement #2 The Search for a Genome Similarity Tree –Problem Genome databases are stored as an unstructured collection of sequences in a flat ASCII file –Approach Completely correlate all sequences by matching each sequence with every other sequence –Solution Use results from all-to-all sequence search to create genome similarity tree … but requires more resources than can traditionally be found at a single supercomputer center –Level 1: 250 matches; Level 2: 250^2 = 62,500 matches; Level 3: 250^3 = 15,625,000 matches …
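
The level counts follow from a branching factor of 250: each level multiplies the previous one by 250 (the factor implied by the slide's own numbers). A one-line check:

    # Matches at level k of the similarity tree, with a branching factor of 250.
    for k in (1, 2, 3):
        print(f"level {k}: {250 ** k:,} matches")   # 250; 62,500; 15,625,000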

Approach: Hardware Infrastructure Worldwide Supercomputer –Six U.S. supercomputing institutions (~12,000 processors) and one Japanese storage institution (0.5 petabytes), ~10,000 kilometers away

Approach: ParaMEDIC Architecture ParaMEDIC: Parallel Metadata Environment for Distributed I/O and Computing [Architecture diagram: ParaMEDIC API (PMAPI), ParaMEDIC data tools, data encryption, data integrity]
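
To make the decoupling idea concrete, here is a minimal, self-contained sketch. It is not the real ParaMEDIC API: every function and variable name is invented for illustration. The compute site reduces its bulky output to small metadata, only the metadata crosses the wide-area link, and the storage site regenerates the full output next to the disks and writes it.

    # Illustrative sketch of ParaMEDIC-style compute/I-O decoupling.
    # All names are invented; this is not the ParaMEDIC API.

    def compute_site(queries, database):
        """Run the expensive search and emit only compact metadata."""
        matches = [(qi, di) for qi, q in enumerate(queries)
                            for di, d in enumerate(database) if q in d]
        # The metadata (index pairs) is far smaller than the formatted
        # alignment output that mpiBLAST would normally write.
        return matches

    def storage_site(metadata, queries, database, path="results.txt"):
        """Regenerate the full (bulky) output near the storage and write it."""
        with open(path, "w") as out:
            for qi, di in metadata:
                out.write(f"query {qi} ({queries[qi]}) matched "
                          f"database sequence {di} ({database[di]})\n")

    queries = ["ACGT", "TTGA"]
    database = ["GGACGTTC", "AACCGG", "CTTGAG"]
    meta = compute_site(queries, database)      # runs at the compute centers
    storage_site(meta, queries, database)       # runs at the remote storage site

Shrinking the wide-area traffic to metadata in this way is what the slides credit for keeping the remote storage bandwidth highly utilized (the 90% figure in the overview).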

Approach: ParaMEDIC Framework [Figure: the ParaMEDIC framework]

Preliminary Results: ANL-VT Supercomputer

Preliminary Results: TeraGrid Supercomputer

Storage Challenge: Compute Resources 2200-processor System X cluster (Virginia Tech) 2048-processor BG/L supercomputer (Argonne) 5832-processor SiCortex supercomputer (Argonne) 700-processor Intel Jazz cluster (Argonne) Processors on TeraGrid (U. Chicago & SDSC) 512-processor Oliver cluster (CCT at LSU) A few hundred processors on Open Science Grid (RENCI) 128 processors on the Breadboard cluster (Argonne) Total: ~12,000 processors

Storage Challenge: Storage Resources Clients –10 quad-core SunFire X4200 –Two 16-core SunFire X4500 systems Object Storage Servers (OSS) –20 SunFire X4500 Object Storage Targets (OST) –140 across the SunFire X4500 servers (each OSS hosts 7 OSTs) RAID configuration for each OST –RAID5 with 6 drives Network: Gigabit Ethernet Kernel: Linux 2.6 Lustre Version: 1.6.2
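
The OST count and usable-capacity fraction follow from the figures above; the sketch below just checks that arithmetic. The per-drive capacity is a placeholder assumption, since the slide does not give it.

    # Sanity check of the Lustre layout described above.
    n_oss, osts_per_oss = 20, 7
    print("total OSTs:", n_oss * osts_per_oss)          # 140, matching the slide

    # RAID5 over 6 drives keeps one drive's worth of parity, so 5/6 is usable.
    drives_per_ost = 6
    print(f"usable fraction per OST: {(drives_per_ost - 1) / drives_per_ost:.0%}")

    # Per-drive size is NOT on the slide; 0.5 TB is only a placeholder to show
    # how the aggregate raw capacity would be computed.
    drive_tb = 0.5
    print(f"raw capacity: {n_oss * osts_per_oss * drives_per_ost * drive_tb / 1000:.2f} PB")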

Storage Utilization with Lustre

Storage Utilization Breakdown with Lustre

Storage Utilization (Local Disks)

Storage Utilization Breakdown (Local Disks)

Conclusion: Biology Biological Problems Addressed –Discovering missing genes via sequence-similarity computations 2.63 x 10^14 sequence searches! –Generating a complete genome sequence-similarity tree to speed up future sequence searches Status –Missing Genes: now possible! Ongoing work with biologists –Complete Similarity Tree: a large percentage of chromosomes do not match any other chromosome

Conclusion: Computer Science Contributions –Worldwide supercomputer consisting of ~12,000 processors and 0.5 petabytes of storage Output: 1 PB uncompressed, 0.3 PB compressed –ParaMEDIC: Parallel Metadata Environment for Distributed I/O and Computing Decouples computation and I/O and drastically reduces I/O overhead

Acknowledgments Computational Resources: K. Shinpaugh, L. Scharf, G. Zelenka (Virginia Tech); I. Foster, M. Papka (U. Chicago); E. Lusk and R. Stevens (Argonne National Laboratory); M. Rynge, J. McGee, D. Reed (RENCI); S. Jha and H. Liu (CCT at LSU). Storage Resources: S. Matsuoka (Tokyo Inst. of Technology); S. Ihara, T. Kujiraoka (Sun Microsystems, Japan); S. Vail, S. Cochrane (Sun Microsystems, USA).