Health Sciences Driving UCSD Research Cyberinfrastructure Invited Talk UCSD Health Sciences Faculty Council UC San Diego April 3, 2012 Dr. Larry Smarr.

Presentation transcript:

Health Sciences Driving UCSD Research Cyberinfrastructure
Invited Talk, UCSD Health Sciences Faculty Council
UC San Diego, April 3, 2012
Dr. Larry Smarr
Director, California Institute for Telecommunications and Information Technology
Harry E. Gruber Professor, Dept. of Computer Science and Engineering
Jacobs School of Engineering, UCSD

UCSD Researcher Research Cyberinfrastructure Needs
UCSD Researchers Surveyed in 2008 to Determine Their Unmet CI Needs
Answer: DATA – Help!
–Data Infrastructure (Storage, Transmission, Curation)
–Data Expertise (Management, Analysis, Visualization, Curation)
Diverse Sources of Data
Source: Mike Norman, SDSC

Blueprint for a Digital University Report 2009

UCSD RCI Provider Organizations
Columns: RCI element, SDSC, UCSD Libraries, ACT, Calit2
–Co-Location: Lead
–Storage: Lead, Partner
–Curation: Partner, Lead
–Computing: Lead
–Networking: Partner, Lead, Partner
Source: Mike Norman, SDSC

From One to a Billion Data Points Defining Me: The Exponential Rise in Body Data in Just One Decade
Weight; Blood Variables; SNPs; Full Genome

First Stage of Metagenomic Sequencing of My Gut Microbiome at J. Craig Venter Institute
Gel Image of Extract from Smarr Sample; Next is Library Construction
Manny Torralba, Project Lead, Human Genomic Medicine, J. Craig Venter Institute, January 25, 2012
I Received a Disk Drive Today With GigaBytes

The Coming Digital Transformation of Health

Integrative Personal Omics Profiling Reveals Details of Clinical Onset of Viruses and Diabetes
Michael Snyder, Chair of Genomics, Stanford Univ.
Genome 140x Coverage
Blood Tests 20 Times in 14 Months
–tracked nearly 20,000 distinct transcripts coding for 12,000 genes
–measured the relative levels of more than 6,000 proteins and 1,000 metabolites in Snyder's blood
Cell 148, 1293–1307, March 16, 2012

iDASH Outcome of NIH Botstein-Smarr Report (1999) Source: Lucila Ohno-Machado, UCSD SOM

integrating Data for Analysis, Anonymization, and SHaring (iDASH), funded by NIH U54HL
Data Exported for Computation Elsewhere
–Users download data from iDASH
Computation Comes to the Data
–Users access data in iDASH
–Users upload algorithms into iDASH
iDASH Exportable Cyberinfrastructure
–Users download infrastructure
Private Cloud at SD Supercomputer Center
Medical Center Data Hosting
HIPAA certified facility
Source: Lucila Ohno-Machado, UCSD SOM

Complications associated with a new drug or device?
Semantic Integration; Information Query
UC Davis, UC Irvine, UCLA, UCSF, UCSD
Extraction, Transformation, Load (even with same vendor, the EMRs are configured differently)
Data + Ontologies + Tools
Source: Lucila Ohno-Machado, UCSD SOM

Personalized Care and Population Health
Genomics
–SNP-based therapy (cancer)
Phenomics
–Electronic Health Records
–Personal monitoring
–Blood pressure, glucose
–Behavior
–Adherence to medication, exercise
Public Health and Environment
–Air quality, food
–Surveillance
Source: DOE
Source: Lucila Ohno-Machado, UCSD SOM

NCMIR's Integrated Infrastructure of Shared Resources Source: Steve Peltier, NCMIR Local SOM Infrastructure Scientific Instruments End User Workstations Shared Infrastructure

Ideker Lab Workflow
SDSC/Triton Skaggs/Users Storage Leichtag/Sequencer Calit2/Storage
Source: Chris Misleh, Calit2/SOM

Next Generation Genome Sequencers Produce Large Data Sets Source: Chris Misleh, SOM

Moving to Shared Enterprise Data Storage & Analysis Resources: SDSC Triton Resource & Calit2 GreenLight
–SDSC Large Memory Nodes (x28): 256/512 GB/sys, 8 TB Total, 128 GB/sec, ~9 TF
–SDSC Shared Resource Cluster (x256): 24 GB/Node, 6 TB Total, 256 GB/sec, ~20 TF
–SDSC Data Oasis Large Scale Storage: 2 PB, 50 GB/sec, 3000 – 6000 disks; Phase 0: 1/3 PB, 8 GB/s
–Campus Research Network: N x 10Gb/s; UCSD Research Labs; Calit2 GreenLight
Source: Philip Papadopoulos, SDSC, UCSD
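The Triton figures on this slide can be sanity-checked with a little arithmetic. The following is a minimal sketch, not part of the talk: the function names are hypothetical, and the unit conventions (1 TB = 1024 GB for memory, 1 PB = 1,000,000 GB for disk) are assumptions.

```python
# Sanity check of the slide's Triton numbers.
# Unit conventions below are assumptions, not stated on the slide.

def aggregate_memory_tb(nodes: int, gb_per_node: int) -> float:
    """Total cluster memory in TB, assuming 1 TB = 1024 GB."""
    return nodes * gb_per_node / 1024


def full_scan_seconds(capacity_pb: float, bandwidth_gb_per_s: float) -> float:
    """Seconds to stream the whole store once, assuming 1 PB = 1,000,000 GB."""
    return capacity_pb * 1_000_000 / bandwidth_gb_per_s


# Shared Resource Cluster: 256 nodes at 24 GB each (slide quotes 6 TB total).
print(aggregate_memory_tb(256, 24))

# Data Oasis: 2 PB at 50 GB/sec, expressed in hours.
print(full_scan_seconds(2, 50) / 3600)
```

Under these assumptions the shared cluster total matches the 6 TB quoted on the slide, and reading all 2 PB once at the full 50 GB/sec would take roughly 11 hours.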

SOM Use of SDSC Triton Resource
–10 SOM PIs Received Substantial Allocations (100K CPU-hours or more)
–8 SOM PIs / Labs Currently Using Triton with Time Purchased from Grant Funds
–30+ Active Trial Accounts
–Supporting ~6 Next Generation Sequencing Projects with PIs from SOM, SIO, and 2 Outside Research Institutes (TSRI, LIAI)

Community Cyberinfrastructure for Advanced Microbial Ecology Research and Analysis

Calit2 Microbial Metagenomics Cluster: Next Generation Optically Linked Science Data Server
–512 Processors, ~5 Teraflops
–~200 Terabytes Storage (Sun X4500, 10GbE)
–1GbE and 10GbE Switched / Routed Core
–Users From 90 Countries
Source: Phil Papadopoulos, SDSC, Calit2

Creating CAMERA Advanced Cyberinfrastructure Service Oriented Architecture Source: CAMERA CTO Mark Ellisman

Access to Computing Resources Tailored by Users Requirements and Resources CAMERA Core HPC Resource Advanced HPC Platforms NSF/DOE TeraScale Resources Source: Jeff Grethe, CAMERA

NSF Funds a Data-Intensive Track 2 Supercomputer: SDSC's Gordon, Coming Summer 2011
Data-Intensive Supercomputer Based on SSD Flash Memory and Virtual Shared Memory SW
–Emphasizes MEM and IOPS over FLOPS
–Supernode has Virtual Shared Memory:
–2 TB RAM Aggregate
–8 TB SSD Aggregate
–Total Machine = 32 Supernodes
–4 PB Disk Parallel File System >100 GB/s I/O
System Designed to Accelerate Access to Massive Data Bases being Generated in Many Fields of Science, Engineering, Medicine, and Social Science
Source: Mike Norman, Allan Snavely SDSC
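The slide gives per-supernode capacities and the supernode count but not machine-wide totals. As a rough sketch (the class and function names are hypothetical, and the totals are derived by simple multiplication, so the figures published for the deployed machine may differ), they can be computed like this:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Supernode:
    """Per-supernode capacities as quoted on the slide."""
    ram_tb: int
    ssd_tb: int


def machine_totals(node: Supernode, count: int) -> dict:
    """Multiply per-supernode capacities by the number of supernodes."""
    return {"ram_tb": node.ram_tb * count, "ssd_tb": node.ssd_tb * count}


gordon_supernode = Supernode(ram_tb=2, ssd_tb=8)
print(machine_totals(gordon_supernode, count=32))
```

With the slide's numbers this gives 64 TB of RAM and 256 TB of flash across the 32 supernodes, which illustrates the emphasis on memory and IOPS rather than FLOPS.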

Rapid Evolution of 10GbE Port Prices Makes Campus-Scale 10Gbps CI Affordable
–$80K/port: Chiaro (60 Max)
–$5K: Force 10 (40 max)
–$500: Arista 48 ports
–~$1000: (300+ Max)
–$400: Arista 48 ports
Port Pricing is Falling; Density is Rising – Dramatically
Cost of 10GbE Approaching Cluster HPC Interconnects
Source: Philip Papadopoulos, SDSC/Calit2
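To put the price drop in perspective, the per-port prices on this slide can be compared step by step. This is an illustrative sketch only: the list uses the four vendor-attributed prices in the order they appear (the "~$1000 (300+ Max)" entry is left out because its vendor label did not survive in the transcript), and the names are hypothetical.

```python
# Per-port 10GbE prices in the order they appear on the slide (USD).
# The second Arista label is an assumption used only to tell the two entries apart.
PORT_PRICES_USD = [
    ("Chiaro", 80_000),
    ("Force 10", 5_000),
    ("Arista 48 ports", 500),
    ("Arista 48 ports (later)", 400),
]


def step_factors(prices):
    """Price ratio between each entry and the one that follows it."""
    return [prev / cur for (_, prev), (_, cur) in zip(prices, prices[1:])]


def overall_factor(prices):
    """Ratio of the first quoted price to the last."""
    return prices[0][1] / prices[-1][1]


print(step_factors(PORT_PRICES_USD))
print(overall_factor(PORT_PRICES_USD))
```

On these figures the per-port price falls by a factor of 200 from the first entry to the last, which is the basis for the claim that campus-scale 10Gbps infrastructure has become affordable.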

10G Switched Data Analysis Resource: SDSC's Data Oasis – Scaled Performance
Radical Change Enabled by Arista G Switch G Capable
Oasis Procurement (RFP)
–Phase0: > 8GB/s Sustained Today
–Phase I: > 50 GB/sec for Lustre (May 2011)
–Phase II: >100 GB/s (Feb 2012)
Diagram labels: OptIPuter; Co-Lo; UCSD RCI; CENIC/NLR; Trestles 100 TF; Dash; Gordon; Triton; Existing Commodity Storage 1/3 PB; 2000 TB, > 50 GB/s; 10Gbps
Source: Philip Papadopoulos, SDSC/Calit2

2012 RCI Initiatives
RCI is Preparing an Attractive Storage Offering for All UCSD Researchers to Encourage Adoption
–Wide and Deep
–On-Ramp to Digital Curation Efforts
SOM Possesses Many of the Most Data-Intensive Instruments on Campus (NGS, MassSpec, MRI)
–Effort to Connect Them to RCI Resources This Year
SDSC Working with DBMI to Define a HIPAA-compliant Cloud Computing Resource that Would Leverage or Extend RCI Resources
RCI Implementation Team Needs your Input and Collaboration ( Richard SDSC)
Source: Mike Norman, SDSC

Potential UCSD Optical Networked Biomedical Researchers and Instruments
Cellular & Molecular Medicine West National Center for Microscopy & Imaging Biomedical Research Center for Molecular Genetics Pharmaceutical Sciences Building Cellular & Molecular Medicine East CryoElectron Microscopy Facility Radiology Imaging Lab Bioengineering San Diego Supercomputer Center
Connects at 10 Gbps:
–Microarrays
–Genome Sequencers
–Mass Spectrometry
–Light and Electron Microscopes
–Whole Body Imagers
–Computing
–Storage
Developing Detailed Plan
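A quick way to see what a 10 Gbps connection means for an instrument is to estimate how long a dataset takes to move. The sketch below is illustrative only: the 1 TB dataset size and the `efficiency` parameter are assumptions, not figures from the talk, and real throughput depends on protocol overhead and end-system tuning.

```python
def transfer_seconds(size_bytes: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Ideal transfer time: bits to move divided by usable link rate.

    efficiency is a hypothetical fraction (0 to 1] of line rate actually achieved.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return size_bytes * 8 / (link_gbps * 1e9 * efficiency)


ONE_TB = 1e12  # assumed example dataset size, decimal terabyte

print(transfer_seconds(ONE_TB, link_gbps=10))  # dedicated 10 Gbps path
print(transfer_seconds(ONE_TB, link_gbps=1))   # 1 Gbps path for comparison
```

At full line rate a 1 TB dataset moves in about 13 minutes over 10 Gbps, compared with more than two hours over 1 Gbps, which is why connecting sequencers, microscopes, and imagers at 10 Gbps matters for shared storage and computing.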