“Quantifying The Dynamics of Your Superorganism Body Using Big Data Supercomputing” 2014-15 Distinguished Lecturer Series Computer Science and Engineering.

Slides:



Advertisements
Similar presentations
A Systems Approach to Personalized Medicine Talk and Discussion NASA Ames Mountain View, CA March 28, 2013 Dr. Larry Smarr Director, California Institute.
Advertisements

The Quantified Self: Personal Monitoring and the Control of Health Interview by Andre de Fusco Future in Review 2011 Laguna Beach, CA May 26, 2011 Dr.
Reading Out the State of the Body and How it Changes Under Therapy Guest Lecture Pharmacy Informatics 2013 University of California San Diego June 7, 2013.
Large Memory High Performance Computing Enables Comparison Across Human Gut Microbiome of Patients with Autoimmune Diseases and Healthy Subjects XSEDE.
Deep Self - Quantifying the State of Your Body Invited Talk NextMed / MMVR20 San Diego February 21, 2013 Dr. Larry Smarr Director, California Institute.
“Tracking Immune Biomarkers and the Human Gut Microbiome: Inflammation, Crohn's Disease, and Colon Cancer” USC Monthly Seminar Series Physical Sciences.
Exploring Our Inner Universe Using Supercomputers and Gene Sequencers Physics Department Colloquium UC San Diego October 24, 2013 Dr. Larry Smarr Director,
Discussion Janssen La Jolla Research and Development La Jolla, CA
“Integrating Healthcare Informatics, Imaging, and Systems Biology-A Personal Example” Plenary Lecture 2nd IEEE Conference on Healthcare Informatics, Imaging,
Leveraging Biomedical Big Data: Quantified Self & Beyond Invited Talk FutureMed Singularity University NASA Ames Campus February 5, 2013 Dr. Larry Smarr.
Personal Data Tracking and the Digital Transformation of Healthcare Invited Talk University of Illinois Silicon Valley Round Table Palo Alto, CA December.
“The Systems Biology Dynamics of the Human Immune System and Gut Microbiome” Invited Talk UCI Systems Biology Seminar Series Irvine, CA October 14, 2013.
“Predictive & Preventive Personalized Medicine” Invited Talk Right Care Initiative Annual Leadership Summit Collaborating to Prevent Heart Attacks, Strokes,
“Using Data Analytics to Discover the 100 Trillion Bacteria Living Within Each of Us” Invited Talk New Applications of Computer Analysis to Biomedical.
“Finding the Patterns in the Big Data From Human Microbiome Ecology” Invited Talk Exponential Medicine November 10, 2014 Dr. Larry Smarr Director, California.
The NIH Human Microbiome Project
“Personalized Medicine, Colorectal Cancer and Gut Bacteria”
The Microbiome and Metagenomics
“Quantifying Your Superorganism Body Using Big Data Supercomputing” Ken Kennedy Institute Distinguished Lecture Rice University Houston, TX November 12,
Building a Community Cyberinfrastructure to Support Marine Microbial Ecology Metagenomics Center for Earth Observations and Applications Advisory Committee.
“The Quantified Self Movement: The Technologies That Are Revolutionizing Health and Fitness” Panel Discussion MIT Enterprise Forum San Diego UC San Diego.
“Discovering the Other 90% of our Human Superorganism” Remote Video Lecture to The eResearch Australasia Conference 2014 Melbourne, Australia October 28,
“Inflammation, Gut Microbiome, Bacteriophages, and the Initiation of Colorectal Cancer” Seminar Lecture City of Hope Pasadena, CA October 20, 2014 Dr.
My N=1 Experience Pioneer Session: "N=1: Pioneers of Self-Tracking“ Panel at the Genomes, Environment, and Traits Conference Harvard Medical School Cambridge,
“Mapping the Human Gut Microbiome in Health and Disease Using Sequencing, Supercomputing, and Data Analysis” Invited Talk Delivered by Mehrdad Yazdani,
“The Quantified Self: From Idiosyncratic Hobby to an Emerging Growth Industry” Invited Lecture Science & Technology Discovery Series Technology Alliance.
"Towards Digitally Enabled Genomic Medicine" Distinguished Lecture Series Department of Computer Science and Engineering UC San Diego October 15, 2012.
“Measuring the Human Brain-Gut Microbiome-Immune System Dynamics: a Big Data Challenge” Plenary Talk 45 th Annual Meeting of the Behavior Genetics Association.
“The Digital Transformation of Healthcare”
“Big Data and Superorganism Genomics – Microbial Metagenomics Meets Human Genomics” NGS and the Future of Medicine Illumina Headquarters La Jolla, CA February.
H = -Σp i log 2 p i. SCOPI Each one of the many microbial communities has its own structure and ecosystem, depending on the body environment it exists.
“The Deeply Quantified Self: A Case Study” Future Technology Keynote Minimally Invasive Surgery Week 2015 Society of Laparoendoscopic Surgeons New York.
“Quantified Health and Disease” Lecture for the Osher Lifetime Learning Institute UCSD Extension Calit2’s Qualcomm Institute, UCSD La Jolla, CA February.
“Using Data Analytics to Discover the 100 Trillion Bacteria Living Within Each of Us” Invited Talk Ayasdi Menlo Park, CA December 5, 2014 Dr. Larry Smarr.
“Toward Novel Human Microbiome Surveillance Diagnostics to Support Public Health” Invited Talk Institute for Public Health University of California San.
“Tracking Large Variations in My Immune Biomarkers and My Gut Microbiome: Inflammation, Crohn's Disease, and Colon Cancer” IBD Conference Speaker Series.
“Quantified Self- On Being a Personal Genomic Observatory” Keynote in the “Humans as Genomic Observatories” Meeting Session in the Genomics Standards Consortium.
“The Human Microbiome and the Revolution in Digital Health” The Florida Institute for Human and Machine Cognition Pensacola Evening Lecture Series Pensacola,
“Using Supercomputing & Advanced Analytic Software to Discover Radical Changes in the Human Microbiome in Health and Disease” Invited Remote Presentation.
“Creating a High Performance Cyberinfrastructure to Support Analysis of Illumina Metagenomic Data” DNA Day Department of Computer Science and Engineering.
“Comparative Human Microbiome Analysis” Remote Video Talk to CICESE Big Data, Big Network Workshop Ensenada, Mexico October 10, 2013 Dr. Larry Smarr Director,
The Human Microbiome: PSC, IBD, and the Gut-Liver Axis
“Individual, Consumer-Driven Care of the Future -- Taking Wellness One Step Further” Closing Keynote Address The World Congress 2 nd Annual Leadership.
713 Lecture 15 Host metagenomics. Progression of techniques Culture based –Use phenotypes and genotypes to ID Non-culture based, focused on 16S rDNA –Clone.
“Inspired by Carl: Exploring the Microbial Dynamics Within” Invited Talk Looking in the Right Direction: Carl Woese and the New Biology University of Illinois,
“Living in a Microbial World” Global Health Program Council on Foreign Relations New York, NY April 10, 2014 Dr. Larry Smarr Director, California Institute.
“How Studying Astrophysics and Coral Reefs Enabled Me to Become an Empowered, Engaged Patient” Invited Talk FutureMed at the Hotel Del Coronado, CA November.
“Frontiers of Self-Tracking” Plenary Talk Quantified Self Conference 2012 Stanford University September 15, 2012 Dr. Larry Smarr Director, California Institute.
“Deciphering the Dynamic Coupling of the Human Immune System and the Gut Microbiome” Overview Data-Enabled Life Sciences Research (DELSA) DELSA Workshop.
“Observing the Dynamics of the Human Immune System Coupled to the Microbiome in Health and Disease” CASIS Workshop on Biomedical Research Aboard the ISS.
“Quantifying Your Superorganism Body Using Big Data Supercomputing” ACM International Workshop on Big Data in Life Sciences BigLS 2014 Newport Beach, CA.
“Assay Lab Within Your Body: Biometrics and Biomes” Invited Lecture TSensors Summit La Jolla, CA November 12, 2014 Dr. Larry Smarr Director, California.
“Discovering the Other 90% of our Human Superorganism” Remote Video Lecture to The eResearch Australasia Conference 2014 Melbourne, Australia October 28,
“Quantifying the Time Progression of the Interaction of the Human Immune System with the Gut Microbiome” Research Council Presentation UC San Diego Health.
“The UCSD Big Data Freeway System” Invited Short Talk Workshop on “Enriching Human Life and Society” UC San Diego February 6, 2014 Dr. Larry Smarr Director,
Lecture Science & Entertainment Exchange National Academy of Sciences Los Angeles June 13, 2013 Dr. Larry Smarr Director, California Institute for Telecommunications.
“Know Thyself: Quantifying Your Human Body and Its One Hundred Trillion Microbes” Understanding Cultures and Addressing Disparities in Society: Degrees.
“Using Genetic Sequencing to Unravel the Dynamics of Your Superorganism Body” Weekly Bioinformatics Seminar Series UC San Diego La Jolla, CA October 17,
“Adding Consumer-Generated and Microbiome Data to the Electronic Medical Record” Using Big Data to Advance Healthcare Panel National Health Policy Conference.
“Quantifying Your Dynamic Human Body (Including Its Microbiome), Will Move Us From a Sickcare System to a Healthcare System” Invited Presentation Microbiology.
Keynote Presentation Cavendish Global Health Impact Forum
“Connecting Body Time Series to Macro Body Changes”
“Analyzing the Human Gut Microbiome Dynamics in Health and Disease Using Supercomputers and Supernetworks” Invited Presentation ESnet CrossConnects Bioinformatics.
“Linking Phenotype Changes to Internal/External Longitudinal Time Series in a Single Human” Invited Presentation at EMBC ‘16 38th International Conference.
“Machine Learning in Healthcare Diagnostics”
Briefing for Dell Analytics Team Calit2’s Qualcomm Institute
Invited Presentation Machine Learning in Healthcare
Genomes and Their Evolution
Determinants of Health in U.S.
H = -Σpi log2 pi.
Presentation transcript:

“Quantifying The Dynamics of Your Superorganism Body Using Big Data Supercomputing” Distinguished Lecturer Series Computer Science and Engineering Department University of Washington Seattle, WA October 9, 2014 Dr. Larry Smarr Director, California Institute for Telecommunications and Information Technology Harry E. Gruber Professor, Dept. of Computer Science and Engineering Jacobs School of Engineering, UCSD 1

Abstract As a member of Lee Hood's 100 Person Wellness Project, headquartered in Seattle's Institute for System Biology, I am engaged in experiments to read out the time varying state of a complex dynamical system - my human body. However, the human body is host to 100 trillion microorganisms, ten times the number of cells in the human body, and these microbes contain 100 times the number of DNA genes that our human DNA does. The microbial component of this "superorganism" is comprised of hundreds of species spread over many taxonomic phyla. The human immune system is tightly coupled with this microbial ecology and in cases of autoimmune disease, both the immune system and the microbial ecology can have dynamic excursions far from normal. To provide a deeper context for the microbiome results from the 100 Person Wellness Project, I have been exploring the variation in the microbiome ecology across healthy and chronically ill populations. Our research starts with trillions of DNA bases, produced by Illumina Next Generation sequencers, of the human gut microbial DNA taken from my own body over time, as well as from hundreds of people sequenced under the NIH Human Microbiome Project. To decode the details of the microbial ecology we feed this data into parallel supercomputers, running sophisticated bioinformatics software pipelines. We then use Calit2/SDSC designed Big Data PCs to manage the data and drive innovative scalable visualization systems to examine the complexities of the changing human gut microbial ecology in health and disease. I will show how advanced data analytics tools find patterns in the resulting microbial distribution data that suggest new hypotheses for clinical application.

Calit2 Has Had a Vision of “the Digital Transformation of Health” for a Decade Next Step—Putting You On-Line! –Wireless Internet Transmission –Key Metabolic and Physical Variables –Model -- Dozens of Processors and 60 Sensors / Actuators Inside of our Cars Post-Genomic Individualized Medicine –Combine –Genetic Code –Body Data Flow –Use Powerful AI Data Mining Techniques The Content of This Slide from 2001 Larry Smarr Calit2 Talk on Digitally Enabled Genomic Medicine

By Measuring the State of My Body and “Tuning” It Using Nutrition and Exercise, I Became Healthier 2000 Age Age Age My Decade Long Journey to Being a Quantified Self: I Arrived in La Jolla in 2000 After 20 Years in the Midwest I Reversed My Body’s Decline By Quantifying and Altering Nutrition, Exercise, Sleep, and Stress

From One to a Billion Data Points Defining Me: The Exponential Rise in Body Data in Just One Decade Billion: My Full DNA, MRI/CT Images Million: My DNA SNPs, Zeo, FitBit Hundred: My Blood Variables One: My Weight Weight Blood Variables SNPs Microbial Genome Improving Body Discovering Disease

Early Adopting MDs Are Creating Partnerships with Their Quantified Patients “The 100 participants will be guided on this 9-month journey by a coach and when necessary, be referred to their own health care practitioners.” The data sets that will be evaluated include: –Self-Tracking Devices –Medical History, Traits, Lifestyle –Blood, Urine, Saliva –Gut Microbiome –Whole Genome Sequencing There are 8760 Hours in a Year One of These Hours You Are With a Doctor… The Other 8759 Hours Are Up to You! Will Grow to 1000, then 10,000

Visualizing Time Series of 150 LS Blood and Stool Variables, Each Over 5-10 Years Calit2 64 megapixel VROOM

Only One of My Blood Measurements Was Far Out of Range--Indicating Chronic Inflammation Normal Range <1 mg/L Normal 27x Upper Limit Complex Reactive Protein (CRP) is a Blood Biomarker for Detecting Presence of Inflammation Episodic Peaks in Inflammation Followed by Spontaneous Drops

Adding Stool Tests Revealed Oscillatory Behavior in an Immune Variable Normal Range <7.3 µg/mL 124x Upper Limit Antibiotics Lactoferrin is a Protein Shed from Neutrophils - An Antibacterial that Sequesters Iron Typical Lactoferrin Value for Active Inflammatory Bowel Disease (IBD) Hypothesis: Lactoferrin Oscillations Coupled to Relative Abundance of Microbes that Require Iron

Descending Colon Sigmoid Colon Threading Iliac Arteries Major Kink Confirming the IBD (Crohn’s) Hypothesis: Finding the “Smoking Gun” with MRI Imaging I Obtained the MRI Slices From UCSD Medical Services and Converted to Interactive 3D Working With Calit2 Staff & DeskVOX Software Transverse Colon Liver Small Intestine Diseased Sigmoid Colon Cross Section MRI Jan 2012

Why Did I Have an Autoimmune Disease like IBD? Despite decades of research, the etiology of Crohn's disease remains unknown. Its pathogenesis may involve a complex interplay between host genetics, immune dysfunction, and microbial or environmental factors. --The Role of Microbes in Crohn's Disease Paul B. Eckburg & David A. Relman Clin Infect Dis. 44: (2007) So I Set Out to Quantify All Three!

The Cost of Sequencing a Human Genome Has Fallen Over 10,000x in the Last Ten Years This Has Enabled Sequencing of Both Human and Microbial Genomes

Inclusion of the Microbiome Will Radically Change Medicine and Wellness 99% of Your DNA Genes Are in Microbe Cells Not Human Cells Your Body Has 10 Times As Many Microbe Cells As Human Cells I Will Focus on the Human Gut Microbiome, Which Contains Hundreds of Microbial Species

When We Think About Biological Diversity We Typically Think of the Wide Range of Animals But All These Animals Are in One SubPhylum Vertebrata of the Chordata Phylum All images from Wikimedia Commons. Photos are public domain or by Trisha Shears & Richard Bartz

Think of These Phyla of Animals When You Consider the Biodiversity of Microbes Inside You All images from WikiMedia Commons. Photos are public domain or by Dan Hershman, Michael Linnenbach, Manuae, B_cool Phylum Annelida Phylum Echinodermata Phylum Cnidaria Phylum Mollusca Phylum Arthropoda Phylum Chordata

However, The Evolutionary Distance Between Your Gut Microbes Is Much Greater Than Between All Animals Source: Carl Woese, et al Last Slide Evolutionary Distance Derived from Comparative Sequencing of 16S or 18S Ribosomal RNA Green Circles Are Human Gut Microbes

A Year of Sequencing a Healthy Gut Microbiome Daily - Remarkable Stability with Abrupt Changes Days Genome Biology (2014) David, et al.

To Map Out the Dynamics of My Microbiome Ecology I Partnered with the J. Craig Venter Institute JCVI Did Metagenomic Sequencing on Seven of My Stool Samples Over 1.5 Years Sequencing on Illumina HiSeq 2000 –Generates 100bp Reads –Run Takes ~14 Days –My 7 Samples Produced –>200Gbp of Data JCVI Lab Manager, Genomic Medicine –Manolito Torralba IRB PI Karen Nelson –President JCVI Illumina HiSeq 2000 at JCVI Manolito Torralba, JCVI Karen Nelson, JCVI

We Expanded Our Healthy Cohort to All Gut Microbiomes from NIH HMP For Comparative Analysis 5 Ileal Crohn’s Patients, 3 Points in Time 2 Ulcerative Colitis Patients, 6 Points in Time “Healthy” Individuals Source: Jerry Sheehan, Calit2 Weizhong Li, Sitao Wu, CRBS, UCSD Total of 27 Billion Reads Or 2.7 Trillion Bases IBD Patients 250 Subjects 1 Point in Time Larry Smarr 7 Points in Time Each Sample Has Million Illumina Short Reads (100 bases)

We Created a Reference Database Of Known Gut Genomes NCBI April 2013 –2471 Complete Draft Bacteria & Archaea Genomes –2399 Complete Virus Genomes –26 Complete Fungi Genomes –309 HMP Eukaryote Reference Genomes Total 10,741 genomes, ~30 GB of sequences Now to Align Our 27 Billion Reads Against the Reference Database Source: Weizhong Li, Sitao Wu, CRBS, UCSD

Computational NextGen Sequencing Pipeline: From “Big Equations” to “Big Data” Computing PI: (Weizhong Li, CRBS, UCSD): NIH R01HG ( , $1.1M)

We Used SDSC’s Gordon Data-Intensive Supercomputer to Analyze a Wide Range of Gut Microbiomes Enabled by a Grant of Time on Gordon from SDSC Director Mike Norman Our Team Used 25 CPU-Years To Compute the Comparative Gut Microbiome of My Time Samples and Our Healthy and IBD Controls Starting With the 5 Billion Illumina Reads Received from JCVI Source: Weizhong Li, Sitao Wu, CRBS, UCSD

We Used Dell’s HPC Cloud to Analyze All of Our Human Gut Microbiomes Dell’s Sanger Cluster –32 Nodes, 512 Cores –48GB RAM per Node We Processed the Taxonomic Relative Abundance –Used ~35,000 Core-Hours on Dell’s Sanger Produced Relative Abundance of ~10,000 Bacteria, Archaea, Viruses in ~300 People –~3Million Spreadsheet Cells New System: R Bio-Gen System –48 Nodes, 768 Cores –128 GB RAM per Node Source: Weizhong Li, UCSD

Using Scalable Visualization Allows Comparison of the Relative Abundance of 200 Microbe Species Calit2 VROOM-FuturePatient Expedition Comparing 3 LS Time Snapshots (Left) with Healthy, Crohn’s, UC (Right Top to Bottom)

Using Microbiome Profiles to Survey 155 Subjects for Unhealthy Candidates

Bacteroidetes and Firmicutes Phyla Dominate “Healthy” Subjects in the Pioneer 100 Gut Microbiomes A Few With High % Proteobacteria or Verrucomicrobia

Lessons from Ecological Dynamics: Gut Microbiome Has Multiple Relatively Stable Equilibria “The Application of Ecological Theory Toward an Understanding of the Human Microbiome,” Elizabeth Costello, Keaton Stagaman, Les Dethlefsen, Brendan Bohannan, David Relman Science 336, (2012)

We Found Major State Shifts in Microbial Ecology Phyla Between Healthy and Two Forms of IBD Most Common Microbial Phyla Average HE Average Ulcerative ColitisAverage LS Average Crohn’s Disease Collapse of Bacteroidetes Explosion of Actinobacteria Explosion of Proteobacteria Hybrid of UC and CD High Level of Archaea

Is the Gut Microbial Ecology Different in Crohn’s Disease Subtypes? Ben Willing, GASTROENTEROLOGY 2010;139:1844 –1854 Colonic Crohn’s Disease (CCD) Ileal Crohn’s Disease (ICD)

PCA Analysis on Species Abundance Across People PCA2 PCA1 Analysis by Mehrdad Yazdani, Calit2 Green-Healthy Red-CD Purple-UC Blue-LS ICD CCD Healthy Subset?

KEGG: a Database Resource for Understanding High-Level Functions and Utilities of the Biological System

Using Ayasdi To Discover Patterns in KEGG Cellular Pathway Dataset topological data analysis Source: Pek Lum, Chief Data Scientist, Ayasdi Dataset from Larry Smarr Team With 60 Subjects (HE, CD, UC, LS) Each with 10,000 KEGGs - 600,000 Cells

Disease Arises from Perturbed Cellular Networks: Dynamics of a Prion Perturbed Network in Mice Source: Lee Hood, ISB 33 Our Next Goal is to Create Such Perturbed Networks in Humans

Next Step: Compute Genes and Function Full Processing to Function (COGs, KEGGs) Would Require ~1-2 Million Core-Hours Plus Dedicated Network to Move Data From R Systems / Dell to San Diego

“A Whole-Cell Computational Model Predicts Phenotype from Genotype” A model of Mycoplasma genitalium, 525 genes Using 1,900 experimental observations From 900 studies, They created the software model, Which requires 128 computers to run

Early Attempts at Modeling the Systems Biology of the Gut Microbiome and the Human Immune System

Next Step: Time Series of Metagenomic Gut Microbiomes and Immune Variables in an N=100 Clinic Trial Goal: Understand The Coupled Human Immune-Microbiome Dynamics In the Presence of Human Genetic Predispositions Drs. William J. Sandborn, John Chang, & Brigid Boland UCSD School of Medicine, Division of Gastroenterology

From Quantified Self to National-Scale Biomedical Research Projects My Anonymized Human Genome is Available for Download The Quantified Human Initiative is an effort to combine our natural curiosity about self with new research paradigms. Rich datasets of two individuals, Drs. Smarr and Snyder, serve as 21 st century personal data prototypes.

Thanks to Our Great Team! UCSD Metagenomics Team Weizhong Li Sitao Wu Future Patient Team Jerry Sheehan Tom DeFanti Kevin Patrick Jurgen Schulze Andrew Prudhomme Philip Weber Fred Raab Joe Keefe Ernesto Ramirez JCVI Team Karen Nelson Shibu Yooseph Manolito Torralba SDSC Team Michael Norman Mahidhar Tatineni Robert Sinkovits UCSD Health Sciences Team William J. Sandborn Elisabeth Evans John Chang Brigid Boland David Brenner