Huge-Memory Systems for Data-Intensive Science
Richard P. Mount, SLAC
CHEP 2004, September 29, 2004

Outline
The Science Case
– DOE Office of Science Data Management Workshop
Technical Issues
– Characterizing scientific data
– Technology issues in data access
Proposal and Currently Funded Plans
– The “solution” and the strategy
– Development Machine
– Leadership-Class Machine

Characterizing Scientific Data
My petabyte is harder to analyze than your petabyte
– Images (or meshes) are bulky but simply structured and usually have simple access patterns
– Features are perhaps 1000 times less bulky, but often have complex structures and hard-to-predict access patterns

Characterizing Scientific Data
This proposal aims at revolutionizing the query and analysis of scientific databases with complex structure. Generally this applies to feature databases (terabytes to petabytes) rather than bulk data (petabytes to exabytes).

Technology Issues in Data Access
– Latency
– Speed/Bandwidth
– (Cost)
– (Reliability)

Latency and Speed – Random Access

Storage Issues
Disks:
– Random-access performance is lousy, unless objects are megabytes or more
– independent of cost
– deteriorating with time at the rate at which disk capacity increases
(Define random-access performance as the time taken to randomly access the entire contents of a disk.)
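A minimal worked example of this definition, a sketch assuming illustrative 2004-era drive parameters (roughly 12 ms average positioning time, 60 MB/s transfer); the function name and the numbers are illustrative, not from the talk:

```python
# Time to randomly access the entire contents of a disk in small objects.
# Drive parameters are assumed illustrative values, not figures from the slides.

def random_access_time_days(capacity_gb, object_kb, seek_ms=12.0, transfer_mb_s=60.0):
    objects = capacity_gb * 1e6 / object_kb                       # objects on the disk
    per_object_s = seek_ms / 1e3 + (object_kb / 1e3) / transfer_mb_s
    return objects * per_object_s / 86400                          # seconds -> days

# Reading a 200 GB drive in 10 kB objects takes days, and doubling the capacity
# doubles the time: random-access performance, by this definition, deteriorates
# at the rate at which disk capacity increases.
print(random_access_time_days(200, 10))   # ~2.8 days
print(random_access_time_days(400, 10))   # ~5.6 days
```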

The “Solution”
– Disk storage is lousy and getting worse
– Use memory instead of disk (“Let them eat cake”)
– Obvious problem:
  – Factor of ~100 in cost
– Optimization:
  – Brace ourselves to spend (some) more money
  – Architecturally decouple data-cache memory from high-performance, close-to-the-processor memory
  – Lessen performance-driven replication of disk-resident data
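A back-of-the-envelope check on that factor of ~100, a sketch using the $635 per 2 GB DIMM price quoted later in the talk and an assumed, fully burdened disk-storage cost for 2004 (the disk figure is an assumption, not from the slides):

```python
# Rough DRAM-vs-disk cost comparison. The DIMM price comes from the
# "Design Choices" slide; the per-GB cost of usable disk storage
# (disks plus servers, RAID, etc.) is an assumed illustrative value.
dram_per_gb = 635 / 2    # ~$318 per GB (2 GB ECC DIMM at $635)
disk_per_gb = 3.0        # assumption: ~$3 per usable GB of server disk storage

print(dram_per_gb / disk_per_gb)   # ~106, roughly the factor ~100 quoted above
```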

The Proposal
A Leadership-Class Facility for Data-Intensive Science (and the Currently Funded Plan)

The Strategy
– There is significant commercial interest in an architecture including data-cache memory
– But: from interest to delivery will take 3-4 years
– And: applications will take time to adapt (not just codes, but their whole approach to computing) to exploit the new architecture
– Hence: two phases
1. Development phase (years 1, 2, 3)
  – Commodity hardware taken to its limits
  – BaBar as principal user, adapting existing data-access software to exploit the configuration
  – BaBar/SLAC contribution to hardware and manpower
  – Publicize results
  – Encourage other users
  – Begin collaboration with industry to design the leadership-class machine
2. “Leadership-Class” Facility (years 3, 4, 5)
  – New architecture
  – Strong industrial collaboration
  – Facility open to all

Development Machine Design Principles
– Attractive to scientists
  – Big enough data-cache capacity to promise revolutionary benefits
  – 1000 or more processors
– Processor to (any) data-cache memory latency < 100 μs
– Aggregate bandwidth to data-cache memory > 10 times that to a similar-sized disk cache
– Data-cache memory should be 3% to 10% of the working set (approximately 10 to 30 terabytes for BaBar)
– Cost-effective, but acceptably reliable
  – Constructed from carefully selected commodity components
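A small sketch that restates these targets as a check on a candidate configuration; the thresholds come from this slide, while the example configuration values and the implied ~300 TB BaBar working set (10-30 TB at 3%-10%) are assumptions for illustration:

```python
# Does a candidate machine meet the stated design targets?
# Thresholds are from the slide; the example arguments are assumed values.

def meets_targets(latency_us, mem_bw_over_disk_bw, cache_tb, working_set_tb):
    return (latency_us < 100                        # processor to any data-cache memory
            and mem_bw_over_disk_bw > 10            # aggregate bandwidth vs similar-sized disk cache
            and 0.03 <= cache_tb / working_set_tb <= 0.10)   # 3%-10% of the working set

# ~300 TB working set is what the 10-30 TB BaBar figure implies.
print(meets_targets(latency_us=80, mem_bw_over_disk_bw=20,
                    cache_tb=10.4, working_set_tb=300))      # True for this example
```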

Development Machine Design Choices
– Intel/AMD server mainboards with 4 or more ECC DIMM slots per processor
– 2 GB DIMMs ($635 each); 4 GB DIMMs ($7,000 each) are too expensive this year
– 64-bit operating system and processor
  – Favors Solaris and AMD Opteron
– Large (500+ port) switch fabric
  – Large IP switches are most cost-effective
– Use of the ($10M+) BaBar disk/tape infrastructure, augmented for any non-BaBar use
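The arithmetic behind these choices, a sketch using only numbers from the slides (the DIMM prices above, and the 650 nodes with 16 GB each from the Year-1 deployment slide that follows); the totals are derived here, not quoted in the talk:

```python
# Cost of filling one 16 GB node with 2 GB versus 4 GB DIMMs, and the
# data-cache capacity and memory cost of the proposed 650-node machine.
nodes, gb_per_node = 650, 16

cost_2gb_dimms = (gb_per_node // 2) * 635     # 8 DIMMs  -> $5,080 per node
cost_4gb_dimms = (gb_per_node // 4) * 7000    # 4 DIMMs  -> $28,000 per node

total_cache_tb = nodes * gb_per_node / 1024   # ~10.2 TB, inside the 10-30 TB target
total_dimm_cost = nodes * cost_2gb_dimms      # ~$3.3M of memory alone
print(cost_2gb_dimms, cost_4gb_dimms, round(total_cache_tb, 1), total_dimm_cost)
```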

Development Machine Deployment – Proposed Year 1
– Memory interconnect switch fabric
– 650 nodes, each 2 CPU, 16 GB memory
– Storage interconnect switch fabric
– > 100 disk servers (provided by BaBar)
– Switch fabrics: Cisco/Extreme/Foundry

Development Machine Deployment – Currently Funded
– Data-server nodes: Sun V20z, each 2 Opteron CPUs and 16 GB memory; up to 2 TB total memory; Solaris
– Cisco switch interconnect
– Clients: up to 2000 nodes, each 2 CPU, 2 GB memory; Linux

BaBar/HEP Object-Serving Software
AMS and XrootD (Andy Hanushevsky/SLAC):
– Optimized for read-only access
– Make 100s of servers transparent to user code
– Load balancing
– Automatic staging from tape
– Failure recovery
Can allow BaBar to start getting benefit from a new data-access architecture within months, without changes to user code.
Minimizes the impact of hundreds of separate address spaces in the data-cache memory.
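A conceptual sketch only (not XrootD's actual protocol or API) of the idea that makes hundreds of servers transparent: the client asks a redirection layer where a file lives, is steered to one holder of the data, and fails over to another replica on error. All names here are hypothetical:

```python
import random

class Redirector:
    """Conceptual stand-in for a redirection layer that knows which servers hold which files."""
    def __init__(self, catalogue):
        self.catalogue = catalogue      # {host: set of file paths} -- illustrative only

    def locate(self, path):
        holders = [h for h, files in self.catalogue.items() if path in files]
        if not holders:
            raise FileNotFoundError(path)
        return random.choice(holders)   # stand-in for real load-based server selection

def open_via_redirector(redirector, path, open_on_host, retries=2):
    """Hide the server farm behind one call; on failure, fall back to another replica."""
    for _ in range(retries + 1):
        try:
            return open_on_host(redirector.locate(path), path)   # caller-supplied transport
        except ConnectionError:
            continue                    # failure recovery: pick another holder and retry
    raise IOError(f"no usable replica of {path}")
```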

“Leadership-Class” Facility Design Principles
– All data-cache memory should be directly addressable by all processors
– Optimize for read-only access to data-cache memory
– Choose commercial processor nodes optimized for throughput
– Use the (then) standard high-performance memory within nodes
– Data-cache memory design optimized for reliable bulk storage
  – 5 μs latency is low enough
  – No reason to be on the processor motherboard
– The operating system should allow transparent access to data-cache memory, but should also distinguish between high-performance memory and data-cache memory

“Leadership-Class” Facility Design Directions
– ~256 terabytes of data-cache memory and ~100 teraops/s by 2008
– Expandable by a factor of 2 in each of 2009, 2010, and 2011
– Well aligned with mainstream technologies, but:
  – Operating system enhancements
  – Memory controller enhancements (read-only and coarse-grained locking where appropriate)
– Industry partnership essential
– Excellent network access essential
  – (SLAC is frequently the largest single user of both ESNet and Internet2)
– Detailed design proposal to DOE in 2006

“Leadership-Class” Facility (schematic)
– SMP systems: 100 teraops/s, > 10,000 threads
– Memory interconnect fabric
– ~256 terabytes of data-cache memory
– Disk and mass-storage hierarchy

Summary
– Data-intensive science increasingly requires low-latency access to terabytes or petabytes
– Memory is one key:
  – Commodity DRAM today (increasing total cost by ~2x)
  – Storage-class memory (whatever that will be) in the future
– Revolutions in scientific data analysis will be another key:
  – Current HEP approaches to data analysis assume that random access is prohibitively expensive
  – As a result, permitting random access brings much-less-than-revolutionary immediate benefit
– Use the impressive motive force of a major HEP collaboration with huge data-analysis needs to drive the development of techniques for revolutionary exploitation of an above-threshold machine.