“End-to-end Optical Fiber Cyberinfrastructure for Data-Intensive Research: Implications for Your Campus” Featured Speaker, EDUCAUSE 2010, Anaheim Convention Center.

Presentation transcript:

“End-to-end Optical Fiber Cyberinfrastructure for Data-Intensive Research: Implications for Your Campus” Featured Speaker EDUCAUSE 2010 Anaheim Convention Center Anaheim, CA October 13, 2010 Dr. Larry Smarr Director, California Institute for Telecommunications and Information Technology Harry E. Gruber Professor, Dept. of Computer Science and Engineering Jacobs School of Engineering, UCSD Follow me on Twitter: lsmarr

Abstract: Most campuses today provide only shared Internet connectivity to end users' labs, in spite of the existence of national-scale optical fiber networks capable of carrying multiple dedicated 10Gbps wavelengths. This "last mile gap" requires campus CIOs to plan for a more ubiquitous fiber infrastructure on campus and to rethink the centralization of storage and computing. Such a set of high-bandwidth campus "on-ramps" will also be required if remote clouds are to be useful for storing the gigabyte- to terabyte-sized data objects routinely produced by modern scientific instruments. I will review experiments at UCSD that give a preview of how to build a 21st-century data-intensive research campus.

The Data Intensive Era Requires High Performance Cyberinfrastructure
Growth of Digital Data is Exponential ("Data Tsunami"), Driven by Advances in Digital Detectors, Networking, and Storage Technologies
Shared Internet Optimized for Megabyte-Size Objects; Need New Cyberinfrastructure for Gigabyte Objects
Making Sense of it All is the New Imperative: Data Analysis Workflows, Data Mining, Visual Analytics, Multiple-Database Queries, Data-Driven Applications
Source: SDSC
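To make the megabyte-versus-gigabyte contrast concrete, here is a small illustrative calculation; the shared-Internet rate below is an assumed effective throughput, not a figure from the talk.

```python
# Illustrative only: compare moving data objects over a congested shared
# Internet path (assumed ~100 Mbps effective) versus a dedicated 10 Gbps
# lightpath. Both rates are assumptions made for the sake of the comparison.

def transfer_time_seconds(size_bytes, rate_bits_per_sec):
    """Idealized transfer time, ignoring protocol overhead."""
    return size_bytes * 8 / rate_bits_per_sec

SHARED_INTERNET_BPS = 100e6     # assumed effective shared-Internet throughput
LIGHTPATH_BPS = 10e9            # dedicated 10 Gbps wavelength

for label, size in [("1 MB", 1e6), ("1 GB", 1e9), ("1 TB", 1e12)]:
    shared = transfer_time_seconds(size, SHARED_INTERNET_BPS)
    dedicated = transfer_time_seconds(size, LIGHTPATH_BPS)
    print(f"{label}: shared ~{shared:,.1f} s, lightpath ~{dedicated:,.2f} s")
```

At these assumed rates a megabyte moves in well under a second either way, but a terabyte-scale object occupies the shared path for most of a day, while a dedicated 10 Gbps lightpath delivers it in roughly 13 minutes.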

What Are the Components of High Performance Cyberinfrastructure?
High Performance Optical Networks
Data-Intensive Visualization and Analysis
End-to-End Wide Area CI
Data-Intensive Research CI

High Performance Optical Networks

In Japan, FTTH Has Become the Dominant Broadband--Subscribers to "Slow" 40 Mbps ADSL Are Decreasing (chart spans Dec 2000 to March 2009)
Japan's Households Can Get 50 Mbps DSL and 100 Mbps to 1 Gbps FTTH Services at Competitive Prices
Source: Japan's Ministry of Internal Affairs and Communications; http://tilgin.wordpress.com/2009/12/17/japan-the-land-of-fiber/

Australia--The Broadband Nation: Universal Coverage with Fiber, Wireless, and Satellite
Connect 93% of All Australian Premises with Fiber: 100 Mbps to Start, Upgrading to Gigabit
7% with Next-Generation Wireless and Satellite: 12 Mbps to Start
Provide Equal Wholesale Access to Retailers Providing Advanced Digital Services to the Nation
Driven by Consumer Internet, Telephone, Video "Triple Play", eHealth, eCommerce…
"NBN is Australia's largest nation building project in our history." - Minister Stephen Conroy
www.nbnco.com.au

Globally, Fiber to the Premises is Growing Rapidly, Mostly in Asia
FTTP Connections Growing at ~30%/year; 130 Million Households with FTTH in 2013
Source: Heavy Reading (www.heavyreading.com), the market research division of Light Reading (www.lightreading.com)

The Global Lambda Integrated Facility: Creating a Planetary-Scale High Bandwidth Collaboratory
Research Innovation Labs Linked by 10G GLIF
www.glif.is -- Created in Reykjavik, Iceland, 2003
Visualization courtesy of Bob Patterson, NCSA.

Academic Research "OptIPlatform" Cyberinfrastructure: A 10Gbps "End-to-End" Lightpath Cloud
(Diagram elements: HD/4K video cams, HD/4K telepresence, instruments, HPC, end-user OptIPortal, 10G lightpaths, National LambdaRail, campus optical switch, data repositories & clusters, HD/4K video images)

Data-Intensive Visualization and Analysis

The OptIPuter Project: Creating High Resolution Portals Over Dedicated Optical Channels to Global Science Data
Scalable Adaptive Graphics Environment (SAGE); Picture Source: Mark Ellisman, David Lee, Jason Leigh
Calit2 (UCSD, UCI), SDSC, and UIC Leads; Larry Smarr PI
Univ. Partners: NCSA, USC, SDSU, NW, TA&M, UvA, SARA, KISTI, AIST
Industry: IBM, Sun, Telcordia, Chiaro, Calient, Glimmerglass, Lucent

On-Line Resources Help You Build Your Own OptIPortal: www.optiputer.net, http://wiki.optiputer.net/optiportal, www.evl.uic.edu/cavern/sage/, http://vis.ucsd.edu/~cglx/
OptIPortals Are Built From Commodity PC Clusters and LCDs To Create a 10Gbps Scalable Termination Device

1/3 Billion Pixel OptIPortal Used to Study NASA Earth Satellite Images of October 2007 Wildfires Source: Falko Kuester, Calit2@UCSD

Nearly Seamless AESOP OptIPortal: 46" NEC Ultra-Narrow Bezel 720p LCD Monitors
Source: Tom DeFanti, Calit2@UCSD

3D Stereo Head Tracked OptIPortal: NexCAVE Array of JVC HDTV 3D LCD Screens KAUST NexCAVE = 22.5MPixels www.calit2.net/newsroom/article.php?id=1584 Source: Tom DeFanti, Calit2@UCSD

Green Initiative: Can Optical Fiber Replace Airline Travel for Continuing Collaborations? Source: Maxine Brown, OptIPuter Project Manager

Multi-User Global Workspace: San Diego, Chicago, Saudi Arabia Source: Tom DeFanti, KAUST Project, Calit2

CineGrid 4K Remote Microscopy USC to Calit2 Photo: Alan Decker December 8, 2009 Richard Weinberg, USC

First Tri-Continental Premiere of a Streamed 4K Feature Film With Global HD Discussion
4K Film Director: Beto Souza
Sites: Keio Univ., Japan; Calit2@UCSD; São Paulo, Brazil Auditorium
4K Transmission Over 10Gbps -- 4 HD Projections from One 4K Projector
Source: Sheldon Brown, CRCA, Calit2

End-to-end WAN HPCI

Project StarGate Goals: Combining Supercomputers and Supernetworks
Create an "End-to-End" 10Gbps Workflow
Explore Use of OptIPortals as Petascale Supercomputer "Scalable Workstations"
Exploit Dynamic 10Gbps Circuits on ESnet
Connect Hardware Resources at ORNL, ANL, SDSC
Show that Data Need Not be Trapped by the Network "Event Horizon"
OptIPortal@SDSC: Rick Wagner, Mike Norman
Source: Michael Norman, SDSC, UCSD
ANL * Calit2 * LBNL * NICS * ORNL * SDSC

Using Supernetworks to Couple End User's OptIPortal to Remote Supercomputers and Visualization Servers
Simulation -- NICS/ORNL, NSF TeraGrid Kraken (Cray XT5): 8,256 Compute Nodes, 99,072 Compute Cores, 129 TB RAM
Rendering -- Argonne NL, DOE Eureka: 100 Dual Quad Core Xeon Servers, 200 NVIDIA Quadro FX GPUs in 50 Quadro Plex S4 1U Enclosures, 3.2 TB RAM
Network -- ESnet 10 Gb/s Fiber Optic Network
Visualization -- Calit2/SDSC OptIPortal1: 20 30" (2560 x 1600 pixel) LCD Panels, 10 NVIDIA Quadro FX 4600 Graphics Cards, >80 Megapixels, 10 Gb/s Network Throughout
Source: Mike Norman, SDSC
*ANL * Calit2 * LBNL * NICS * ORNL * SDSC
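As a rough illustration of why a dedicated 10 Gb/s lightpath is the right scale for driving a wall of this size, the sketch below estimates the uncompressed refresh rate an 80-megapixel OptIPortal could sustain over a single 10 Gb/s circuit; this is illustrative arithmetic only, not a StarGate measurement.

```python
# Illustrative arithmetic: how fast could an 80-megapixel OptIPortal be
# refreshed with uncompressed pixels over a single 10 Gb/s lightpath?

megapixels = 80e6        # > 80 megapixels across the tiled wall
bits_per_pixel = 24      # uncompressed RGB
frame_bits = megapixels * bits_per_pixel      # ~1.9 Gbit per frame
link_bps = 10e9                               # one 10 Gb/s circuit

print(f"Uncompressed frame size: {frame_bits / 8 / 1e6:.0f} MB")
print(f"Max uncompressed refresh rate: {link_bps / frame_bits:.1f} frames/s")
```

This is one reason the StarGate design renders on Eureka, next to the simulation output, and streams imagery across ESnet to the OptIPortal rather than moving the raw simulation data.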

Wavelengths and the Appropriate Cloud Middleware Make Wide Area Clouds Practical
Terasort on Open Cloud Testbed: Sorting 10 Billion Records (1.2 TB) at 4 Sites (120 Nodes), Sustaining >5 Gbps with Only a 5% Distance Penalty
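The slide's numbers can be sanity-checked with a quick calculation; this sketch assumes ideal streaming with no protocol overhead.

```python
# Sanity check of the Terasort slide's numbers (idealized, no overhead).

dataset_bytes = 1.2e12    # 10 billion records, 1.2 TB total
sustained_bps = 5e9       # > 5 Gbps sustained across the 4 sites
distance_penalty = 0.05   # ~5% slower than an equivalent local-area run

wide_area_s = dataset_bytes * 8 / sustained_bps
local_area_s = wide_area_s / (1 + distance_penalty)
print(f"1.2 TB at 5 Gbps: ~{wide_area_s / 60:.0f} min wide-area, "
      f"~{local_area_s / 60:.0f} min implied local-area")
```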

Open Cloud OptIPuter Testbed--Manage and Compute Large Datasets Over 10Gbps Lambdas
Networks: NLR C-Wave, MREN, CENIC, Dragon
Open Source SW: Hadoop, Sector/Sphere, Nebula, Thrift, GPB, Eucalyptus, Benchmarks
Scale: 9 Racks, 500 Nodes, 1000+ Cores, 10+ Gb/s; Now Upgrading Portions to 100 Gb/s in 2010/2011
Source: Robert Grossman, UChicago

Sector Won the SC 08 and SC 09 Bandwidth Challenge 2009: Sector/Sphere Sustained Over 100 Gbps Cloud Computation Across 4 Geographically Distributed Data Centers 2008: Sector/Sphere Used for a Variety of Scientific Computing Applications on Open Cloud Testbed. Source: Robert Grossman, UChicago

California and Washington Universities Are Testing a 10Gbps Connected Commercial Data Cloud
Amazon Experiment for Big Data: Only Available Through CENIC & Pacific NW GigaPOP Private 10Gbps Peering Paths
Includes Amazon EC2 Computing & S3 Storage Services
Early Experiments Underway: Robert Grossman, Open Cloud Consortium; Phil Papadopoulos, Calit2/SDSC Rocks
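As a hedged illustration of the kind of early experiment described above, the sketch below uses the boto Python library of that era to push a large instrument file into S3; the bucket and object names are hypothetical placeholders.

```python
# A minimal sketch (hypothetical bucket and object names; AWS credentials
# come from the usual boto configuration) of pushing a large instrument
# output file into Amazon S3 over the dedicated peering described above.
import boto

conn = boto.connect_s3()                                  # S3 connection
bucket = conn.create_bucket("campus-instrument-data")     # hypothetical bucket
key = bucket.new_key("sequencer/run-2010-10-13.tar")      # hypothetical object
key.set_contents_from_filename("/data/run-2010-10-13.tar")
print("uploaded " + key.name)
```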

Hybrid Cloud Computing with modENCODE Data
Computations in Bionimbus Can Span the Community Cloud & the Amazon Public Cloud to Form a Hybrid Cloud
Sector Was Used to Support the Data Transfer Between Two Virtual Machines: One VM at UIC and One an Amazon EC2 Instance
Graph Illustrates How the Throughput Between Two Virtual Machines in a Wide Area Cloud Depends upon the File Size
Biological Data (Bionimbus); Source: Robert Grossman, UChicago

Moving into the Clouds: Rocks and EC2
We Can Build Physical Hosting Clusters & Multiple, Isolated Virtual Clusters
Can I Use Rocks to Author "Images" Compatible with EC2? (We Use Xen, They Use Xen)
Can I Automatically Integrate EC2 Virtual Machines into My Local Cluster (Cluster Extension)? Submit Locally; My Own Private + Public Cloud
What This Will Mean: All Your Existing Software Runs Seamlessly Among Local and Remote Nodes; User Home Directories Are Mounted; Queue Systems Work Unmodified; MPI Works
Source: Phil Papadopoulos, SDSC/Calit2
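A minimal sketch of the cluster-extension idea follows, using the boto library; the AMI ID, key pair, and security group are placeholders, and the actual Rocks EC2 roll automates authoring the image and wiring the VM into the cluster.

```python
# Sketch of "cluster extension": launch an EC2 VM from an image authored to
# match the local cluster, wait for it to come up, then hand it to the local
# scheduler. AMI ID, key pair, and security group below are placeholders.
import time
import boto

conn = boto.connect_ec2()        # AWS credentials from environment/config

reservation = conn.run_instances(
    "ami-00000000",              # placeholder cluster-compatible image
    instance_type="m1.large",
    key_name="cluster-key",
    security_groups=["cluster-extension"],
)
instance = reservation.instances[0]

while instance.state != "running":   # poll until the VM is up
    time.sleep(10)
    instance.update()

print("remote node ready at " + instance.public_dns_name)
```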

Proof of Concept Using Condor and Amazon EC2: Adaptive Poisson-Boltzmann Solver (APBS)
APBS Rocks Roll (NBCR) + EC2 Roll + Condor Roll = Amazon VM
Cluster Extension into Amazon Using Condor: Local Cluster Plus NBCR VMs Running in the Amazon EC2 Cloud
Source: Phil Papadopoulos, SDSC/Calit2
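For concreteness, here is a hedged sketch of submitting an APBS job to the Condor pool from the local front end; with the cluster extended into EC2 as above, matching jobs can also land on the Amazon-hosted NBCR VMs. The paths, input file, and submit options are hypothetical, not the NBCR roll's actual configuration.

```python
# Hedged sketch: write a vanilla-universe Condor submit description for an
# APBS run and hand it to condor_submit. File names here are hypothetical.
import subprocess

submit_description = """\
universe                = vanilla
executable              = /opt/apbs/bin/apbs
arguments               = input.apbs
transfer_input_files    = input.apbs
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
output                  = apbs.out
error                   = apbs.err
log                     = apbs.log
queue
"""

with open("apbs.submit", "w") as f:
    f.write(submit_description)

subprocess.check_call(["condor_submit", "apbs.submit"])
```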

Data-Intensive Research Campus CI

"Blueprint for the Digital University"--Report of the UCSD Research Cyberinfrastructure Design Team, April 2009
Focus on Data-Intensive Cyberinfrastructure: No Data Bottlenecks--Design for Gigabit/s Data Flows
http://research.ucsd.edu/documents/rcidt/RCIDTReportFinal2009.pdf

Broad Campus Input to Build the Plan and Support for the Plan
Campus Survey of CI Needs (April 2008): 45 Responses (Individuals, Groups, Centers, Depts)
#1 Need Was Data Management: 80% Data Backup, 70% Store Large Quantities of Data, 64% Long Term Data Preservation, 50% Ability to Move and Share Data
Vice Chancellor of Research Took the Lead; Case Studies Developed from Leading Researchers
Broad Research CI Design Team Chaired by Mike Norman and Phil Papadopoulos
Faculty and Staff: Engineering, Oceans, Physics, Bio, Chem, Medicine, Theatre; SDSC, Calit2, Libraries, Campus Computing and Telecom

Current UCSD Optical Core: Bridging End-Users to CENIC L1, L2, L3 Services
Endpoints: >= 60 endpoints at 10 GigE, >= 32 packet switched, >= 32 switched wavelengths, >= 300 connected endpoints
Approximately 0.5 Tbit/s arrives at the "optical" center of campus
Switching is a hybrid of packet, lambda, and circuit -- OOO and packet switches (Lucent, Glimmerglass, Force10)
Source: Phil Papadopoulos, SDSC/Calit2 (Quartzite PI, OptIPuter co-PI); Quartzite Network MRI #CNS-0421555; OptIPuter #ANI-0225642

UCSD Planned Optical Networked Biomedical Researchers and Instruments
Connected Sites: Cellular & Molecular Medicine West, National Center for Microscopy & Imaging Research, Biomedical Research, Center for Molecular Genetics, Pharmaceutical Sciences Building, Cellular & Molecular Medicine East, CryoElectron Microscopy Facility, Radiology Imaging Lab, Bioengineering, Calit2@UCSD, San Diego Supercomputer Center
Connects at 10 Gbps: Microarrays, Genome Sequencers, Mass Spectrometry, Light and Electron Microscopes, Whole Body Imagers, Computing, Storage

UCSD Campus Investment in Fiber Enables Consolidation of Energy Efficient Computing & Storage
WAN 10Gb: CENIC, NLR, I2; N x 10Gb on campus
Resources interconnected: DataOasis (Central) Storage; Gordon – HPD System; Cluster Condo; Triton – Petascale Data Analysis; Scientific Instruments; Digital Data Collections; Campus Lab Cluster; OptIPortal Tile Display Wall
Source: Philip Papadopoulos, SDSC/Calit2

Moving to a Shared Campus Data Storage and Analysis Resource: Triton Resource @ SDSC
Large Memory PSDAF (x28): 256/512 GB/sys, 9 TB total, 128 GB/sec, ~9 TF
Shared Resource Cluster (x256): 24 GB/node, 6 TB total, 256 GB/sec, ~20 TF
Large Scale Storage: 2 PB, 40 – 80 GB/sec, 3000 – 6000 disks (Phase 0: 1/3 TB, 8 GB/s)
Serves UCSD Research Labs over the Campus Research Network
Source: Philip Papadopoulos, SDSC/Calit2

Rapid Evolution of 10GbE Port Prices Makes Campus-Scale 10Gbps CI Affordable
Port Pricing is Falling and Density is Rising Dramatically; Cost of 10GbE Approaching Cluster HPC Interconnects
Price points, 2005 to 2010: $80K/port Chiaro (60 max); $5K Force10 (40 max); ~$1000 (300+ max); $500 Arista (48 ports); $400 Arista (48 ports)
Source: Philip Papadopoulos, SDSC/Calit2

10G Switched Data Analysis Resource: Data Oasis (RFP Underway)
(Diagram: 10G links connect RCN, OptIPuter, Colo, CalRen, Triton, Dash, Gordon, and existing storage to the Oasis switch fabric)
Oasis Procurement (RFP): Minimum 40 GB/sec for Lustre; Nodes Must Be Able to Function as Lustre OSS (Linux) or NFS (Solaris); Connectivity to Network is 2 x 10GbE/Node; Likely Reserve Dollars for Inexpensive Replica Servers; 1500 – 2000 TB; >40 GB/s
Source: Philip Papadopoulos, SDSC/Calit2
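The RFP numbers above imply a minimum node count, sketched below; this is back-of-the-envelope only, ignoring protocol and filesystem overhead.

```python
# Back-of-the-envelope sizing from the RFP numbers above: how many storage
# nodes with 2 x 10GbE are needed to reach the 40 GB/sec minimum aggregate.

target_GBps = 40.0
per_node_bps = 2 * 10e9                    # 2 x 10GbE per node
per_node_GBps = per_node_bps / 8 / 1e9     # = 2.5 GB/sec at line rate

print(f"Minimum Lustre nodes: {target_GBps / per_node_GBps:.0f}")   # ~16
```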

High Performance Computing (HPC) vs. High Performance Data (HPD)
Key HW metric: HPC = Peak FLOPS; HPD = Peak IOPS
Architectural features: HPC = Many small-memory multicore nodes; HPD = Fewer large-memory vSMP nodes
Typical application: HPC = Numerical simulation; HPD = Database query, data mining
Concurrency: HPC = High concurrency; HPD = Low concurrency or serial
Data structures: HPC = Data easily partitioned (e.g. grid); HPD = Data not easily partitioned (e.g. graph)
Typical disk I/O patterns: HPC = Large block sequential; HPD = Small block random
Typical usage mode: HPC = Batch process; HPD = Interactive
Source: Mike Norman, SDSC

What is Gordon? A Data-Intensive Supercomputer Based on SSD Flash Memory and Virtual Shared Memory SW
Emphasizes MEM and IOPS over FLOPS
System Designed to Accelerate Access to Massive Data Bases Being Generated in All Fields of Science, Engineering, Medicine, and Social Science
The NSF's Most Recent Track 2 Award to the San Diego Supercomputer Center (SDSC); Coming Summer 2011
Source: Mike Norman, SDSC

Data Mining Applications Will Benefit from Gordon
De Novo Genome Assembly from Sequencer Reads & Analysis of Galaxies from Cosmological Simulations & Observations Will Benefit from Large Shared Memory
Federations of Databases & Interaction Network Analysis for Drug Discovery, Social Science, Biology, Epidemiology, Etc. Will Benefit from Low Latency I/O from Flash
Source: Mike Norman, SDSC

Grand Challenges in Data-Intensive Sciences, October 26-28, 2010, San Diego Supercomputer Center, UC San Diego
Confirmed conference topics and speakers:
Needs and Opportunities in Observational Astronomy - Alex Szalay, JHU
Transient Sky Surveys - Peter Nugent, LBNL
Large Data-Intensive Graph Problems - John Gilbert, UCSB
Algorithms for Massive Data Sets - Michael Mahoney, Stanford U.
Needs and Opportunities in Seismic Modeling and Earthquake Preparedness - Tom Jordan, USC
Needs and Opportunities in Fluid Dynamics Modeling and Flow Field Data Analysis - Parviz Moin, Stanford U.
Needs and Emerging Opportunities in Neuroscience - Mark Ellisman, UCSD
Data-Driven Science in the Globally Networked World - Larry Smarr, UCSD

You Can Download This Presentation at lsmarr.calit2.net