SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO

Status Update
TeraGrid Science Advisory Board Meeting
July 19, 2010
Dr. Mike Norman, PI
Dr. Allan Snavely, Co-PI

SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO

Gordon Objective
Deploy a computational resource for the national community that is specifically designed for data-intensive computing:
- 245 TFLOPS, 1,024-node cluster based on the Intel Sandy Bridge processor
- 2 TB, large shared-memory "supernodes," each composed of 32 compute nodes
- High-performance I/O subsystem based on enterprise-class SSDs
- Low-latency, high-speed interconnect via a dual-rail QDR InfiniBand 3D torus network

SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO

The Gordon Sweet Spot
- Data Mining
  - De novo genome assembly from sequencer reads, and analysis of galaxies from cosmological simulations and observations.
  - Federations of databases, and interaction network analysis for drug discovery, social science, biology, epidemiology, etc.
- Predictive Science
  - Solution of inverse problems in oceanography, atmospheric science, and seismology.
  - Modestly scalable codes in quantum chemistry and structural engineering.
Key enabling features: large shared memory; low-latency, fast interconnect; fast I/O system.

SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO

Gordon Aggregate Capabilities
- Speed: 245 TFLOPS
- Memory (RAM): 64 TB
- Memory (SSD): 256 TB
- Memory (RAM+SSD): 320 TB
- Ratio (MEM/SPEED): 1.31 bytes/FLOP
- IO rate to SSDs: 35 million IOPS
- Network bandwidth: 16 GB/s full duplex
- Network latency: 1 μs
- Disk storage (external/PFS): 4 PB
- IO bandwidth to PFS: >100 GB/sec
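The memory-to-speed ratio follows directly from the other rows; a quick sanity check in Python, assuming decimal units (1 TB = 10^12 bytes):

```python
# Sanity check of the Ratio (MEM/SPEED) row above, assuming decimal units.
ram_tb = 64          # Memory (RAM), TB
ssd_tb = 256         # Memory (SSD), TB
speed_tflops = 245   # Speed, TFLOPS

bytes_total = (ram_tb + ssd_tb) * 1e12   # RAM+SSD treated as one memory pool
flops = speed_tflops * 1e12
print(f"{bytes_total / flops:.2f} bytes/FLOP")  # -> 1.31 bytes/FLOP
```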

SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO

Gordon Supernode Architecture
- 32 Appro GreenBlade compute nodes per supernode: dual-processor Intel Sandy Bridge, 240 GFLOPS and 64 GB of memory per node (core count TBD)
- 2 Appro I/O nodes per supernode: Intel SSD drives, 4 TB each, 560,000 IOPS
- ScaleMP vSMP virtual shared memory: 2 TB RAM aggregate (64 GB x 32) and 8 TB SSD aggregate (256 GB x 32)
[Diagram: 240 GF compute nodes with 64 GB RAM each, plus a 4 TB SSD I/O node, unified by vSMP memory virtualization]
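These per-supernode figures roll up to the aggregate numbers quoted on the preceding slides; an illustrative check in Python (the 1,024-node system size is taken from the Gordon Objective slide):

```python
# Illustrative roll-up of the supernode figures quoted above.
nodes_per_supernode = 32
gflops_per_node = 240          # per compute node
ram_gb_per_node = 64
io_nodes_per_supernode = 2
ssd_tb_per_io_node = 4
total_nodes = 1024             # from the Gordon Objective slide

supernode_tflops = nodes_per_supernode * gflops_per_node / 1e3   # 7.68 TFLOPS
supernode_ram_tb = nodes_per_supernode * ram_gb_per_node / 1e3   # ~2 TB vSMP aggregate
supernode_ssd_tb = io_nodes_per_supernode * ssd_tb_per_io_node   # 8 TB flash

supernodes = total_nodes // nodes_per_supernode                  # 32 supernodes
print(f"system peak ~ {supernodes * supernode_tflops:.0f} TFLOPS")  # ~246, i.e. the quoted 245 TFLOPS
```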

SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO

Before Gordon There Is Dash
Dash has been deployed as a risk mitigator for Gordon. It is an Appro cluster that embodies the core architectural features of Gordon and provides a platform for testing, evaluation, and porting/optimizing applications:
- 64 nodes, dual-socket, 4-core Nehalem
- 48 GB memory per node
- 4 TB of Intel SLC flash (X25-E)
- InfiniBand interconnect
- vSMP Foundation supernodes
Using Dash for:
- SSD testing (vendors, controllers, RAID, file systems)
- 16-way vSMP acceptance
- 32-way vSMP acceptance testing
- Early user testing
- Development of processes and procedures for systems administration, security, and networking

SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO

Dash TeraGrid Resource
Two 16-node virtual clusters:
- SSD-only: 16 nodes; Nehalem, dual-socket, 8-core; 48 GB memory; 1 TB of SSD (16 drives). The SSDs are local to the nodes (usage pattern sketched below). Standard queues available.
- vSMP + SSD: 16 nodes; Nehalem, dual-socket, 8-core; 48 GB memory; 960 GB of SSD (15 drives). The SSDs are local to the nodes but presented as a single file system via vSMP, and the supernode is treated as a single shared resource.
GPFS-WAN is also available. An additional 32 nodes will be brought online in early August, after the 32-way vSMP acceptance testing is complete.
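On the SSD-only cluster the flash appears as node-local storage, so the typical pattern is to stage a working set onto the local SSD and run IOPS-heavy work against that copy. A minimal sketch of that pattern follows; the mount point and environment variable are hypothetical stand-ins, and the real path would come from the Dash User Guide:

```python
import os
import shutil

# Hypothetical node-local SSD mount point; the actual path on Dash would be
# documented in the user guide or exported by the batch environment.
SSD_SCRATCH = os.environ.get("SSD_SCRATCH", "/scratch/ssd")

def stage_in(src: str) -> str:
    """Copy a file from the parallel file system onto the node-local flash."""
    dst = os.path.join(SSD_SCRATCH, os.path.basename(src))
    shutil.copyfile(src, dst)
    return dst

# Usage (paths illustrative): random-access queries then hit flash, not disk.
#   local_db = stage_in("/gpfs-wan/myproject/reads.db")
#   run_queries(local_db)
```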

SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO

Gordon Timeline

SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO

Dash Early User Success Stories
- Palomar Transient Factory (astrophysics): large, random queries, with 100 new transients arriving every minute. Query performance increased by up to 161%.
- NIH Biological Networks Pathway Analysis: queries on graph data generate a large amount of random IO, requiring significant IOPS. Dash vSMP speedup: 186%.
- Protein Data Bank, Alignment Database: predictive science with queries on pair-wise comparisons and alignments of protein structures. Dash speedup: 69%.
- Winner of the Supercomputing Conference (SC) 2009 HPC Storage Challenge; two publications accepted for SC 2010, finalists for Best Paper and Best Student Paper.
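The common thread in these workloads is random, small-block IO, which is limited by IOPS rather than bandwidth. A crude random-read probe like the sketch below shows the kind of measurement involved (illustrative only; the OS page cache will inflate results unless the file is much larger than RAM or caching is bypassed):

```python
import os
import random
import time

def random_read_iops(path: str, reads: int = 10_000, block: int = 4096) -> float:
    """Time `reads` random 4 KB reads from `path` and return the achieved IOPS."""
    size = os.path.getsize(path)
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter()
        for _ in range(reads):
            os.pread(fd, block, random.randrange(0, size - block))
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return reads / elapsed

# Compare the same file on node-local flash vs. spinning disk (paths illustrative):
#   print(random_read_iops("/scratch/ssd/test.dat"))
#   print(random_read_iops("/gpfs-wan/test.dat"))
```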

SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO

Dash/Gordon Documentation
- Dash User Guide (SDSC site --> User Support --> Resources --> Dash)
- TeraGrid Resource Catalog (TeraGrid site --> User Support --> Resources --> Compute & Viz Resources): Gordon is mentioned under Dash's listing as a future resource and will have its own entry as the production date nears.
- TeraGrid Knowledge Base, two articles (TeraGrid site --> Help & Support --> KB --> search on "Dash" or "Gordon"):
  - "On the TeraGrid, what is Dash?"
  - "On the TeraGrid, what is Gordon?"

SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO

Dash Allocations
- Tim Axelrod (University of Arizona, LSST.org); Astronomical Sciences. Scope of work: determine if and how the new paradigm of IO-oriented, data-intensive supercomputing used by Dash/Flash Gordon can be used by LSST. Status: LSST has requested a startup account on Dash, approved by TeraGrid (30,000 SUs).
- Mark Miller (UCSD); Molecular Biosciences. Scope of work: improve BFAST code performance using Dash features; specifically, SSDs will be used to accelerate file I/O operations. Status: startup request approved on Dash (30,000 SUs).
- Sameer Shende (University of Oregon); Performance Evaluation and Benchmarking. Scope of work: performance evaluation using the TAU Performance System (R); a vSMP node will be used to analyze and visualize performance data. Status: startup request approved on Dash (20,000 SUs).
- John Helly (University of California, San Diego); Atmospheric Sciences. Scope of work: data transposition development for exascale data in memory. Status: startup request approved on Dash (30,000 SUs).

SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO

- Invited Speaker: M.L. Norman, "Accelerating Data-Intensive Science with Gordon and Dash"
- Presentation: "DASH-IO: an Empirical Study of Flash-based IO for HPC"; Jiahua He, Jeffrey Bennett, and Allan Snavely, SDSC
- Birds of a Feather (2): "NSF's Track 2D and RVDAS Resources"; Richard Moore, chair
- Tutorial: "Using vSMP and Flash Technologies for Data Intensive Applications"
  Presenters: Mahidhar Tatineni, Jerry Greenberg, and Arun Jagatheesan, San Diego Supercomputer Center (SDSC), University of California, San Diego (UCSD)
  Abstract: Virtual shared-memory (vSMP) and flash memory technologies have the potential to improve the performance of data-intensive applications. Dash is a new TeraGrid resource at SDSC that showcases both of these technologies. This tutorial will be a basic introduction to using vSMP and flash technologies and how to access Dash via the TeraGrid. Hands-on material will be used to demonstrate the use and performance benefits. The agenda for this half-day tutorial includes: Dash architecture; the Dash user environment; hands-on examples of using a vSMP node; hands-on examples illustrating flash memory use; and a Q&A session, including hands-on preliminary work with attendee codes on vSMP nodes and flash IO nodes.

SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO

Gordon/Dash Education, Outreach and Training Activities
Supercomputing Conference 2010, New Orleans, LA, November 13-19, 2010
- "Understanding the Impact of Emerging Non-volatile Memories on High-performance, IO-Intensive Computing": nominated for both Best Paper and Best Student Paper.
  Presenter: Adrian Caulfield
  Authors: Adrian Caulfield, J. Coburn, T. Mollov, A. De, A. Akel, J. He, A. Jagatheesan, R. Gupta, A. Snavely, S. Swanson
- "DASH: a Recipe for a Flash-based Data Intensive Supercomputer": focuses on the use of commodity hardware to achieve a significant cost/performance ratio for data-intensive supercomputing.
  Presenter: Jiahua He
  Authors: Jiahua He, A. Jagatheesan, S. Gupta, J. Bennett, A. Snavely

SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO

GRAND CHALLENGES IN DATA-INTENSIVE SCIENCES
October 26-29, 2010, San Diego Supercomputer Center, UC San Diego
Confirmed conference topics and speakers:
- Needs and Opportunities in Observational Astronomy - Alex Szalay, JHU
- Transient Sky Surveys - Peter Nugent, LBNL
- Large Data-Intensive Graph Problems - John Gilbert, UCSB
- Algorithms for Massive Data Sets - Michael Mahoney, Stanford U.
- Needs and Opportunities in Seismic Modeling and Earthquake Preparedness - Tom Jordan, USC
- Needs and Opportunities in Fluid Dynamics Modeling and Flow Field Data Analysis - Parviz Moin, Stanford U.
- Needs and Emerging Opportunities in Neuroscience - Mark Ellisman, UCSD
- Data-Driven Science in the Globally Networked World - Larry Smarr, UCSD