“Using Data Analytics to Discover the 100 Trillion Bacteria Living Within Each of Us” Invited Talk Ayasdi Menlo Park, CA December 5, 2014 Dr. Larry Smarr.

Presentation transcript:

“Using Data Analytics to Discover the 100 Trillion Bacteria Living Within Each of Us” Invited Talk Ayasdi Menlo Park, CA December 5, 2014 Dr. Larry Smarr Director, California Institute for Telecommunications and Information Technology Harry E. Gruber Professor, Dept. of Computer Science and Engineering Jacobs School of Engineering, UCSD

From One to a Billion Data Points Defining Me: The Exponential Rise in Body Data in Just One Decade Billion: My Full DNA, MRI/CT Images Million: My DNA SNPs, Zeo, FitBit Hundred: My Blood Variables One: My Weight Weight Blood Variables SNPs Microbial Genome Improving Body Discovering Disease

How Will Detailed Knowledge of Microbiome Ecology Radically Change Medicine and Wellness? 99% of Your DNA Genes Are in Microbe Cells Not Human Cells Your Body Has 10 Times As Many Microbe Cells As Human Cells Challenge: Map Out Microbial Ecology and Function in Health and Disease States

Intense Scientific Research is Underway on Understanding the Human Microbiome June 8, 2012; June 14, 2012; August 18, 2012

Mapping Out the Dynamics of Autoimmune Microbiome Ecology Couples Next Generation Genome Sequencers to Big Data Supercomputers
Metagenomic Sequencing
–JCVI Produced ~150 Billion DNA Bases From Seven of LS Stool Samples Over 1.5 Years
–We Downloaded ~3 Trillion DNA Bases From NIH Human Microbiome Program Data Base
–255 Healthy People, 21 with IBD
Supercomputing (Weizhong Li, JCVI/HLI/UCSD):
–~20 CPU-Years on SDSC’s Gordon
–~4 CPU-Years on Dell’s HPC Cloud
Produced Relative Abundance of
–~10,000 Bacteria, Archaea, Viruses in ~300 People
–~3 Million Filled Spreadsheet Cells
Illumina HiSeq 2000 at JCVI; SDSC Gordon Data Supercomputer
Example: Inflammatory Bowel Disease (IBD)

Computational NextGen Sequencing Pipeline: From Sequence to Taxonomy and Function PI: (Weizhong Li, CRBS, UCSD): NIH R01HG ( , $1.1M)

Next Step Programmability, Scalability and Reproducibility using bioKepler www.kepler-project.org National Resources (Gordon) (Comet) (Stampede) (Lonestar) Cloud Resources Optimized Local Cluster Resources Source: Ilkay Altintas, SDSC

How Best to Analyze The Microbiome Datasets to Discover Patterns in Health and Disease? Can We Find New Noninvasive Diagnostics In Microbiome Ecologies?

We Found Major State Shifts in Microbial Ecology Phyla Between Healthy and Two Forms of IBD Most Common Microbial Phyla Average HE Average Ulcerative Colitis Average LS Average Crohn’s Disease Collapse of Bacteroidetes Explosion of Actinobacteria Explosion of Proteobacteria Hybrid of UC and CD High Level of Archaea
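A phylum-level comparison like the one on this slide can be sketched by collapsing per-taxon read counts into per-phylum relative abundances and averaging them within each group. The sketch below is only an illustration under stated assumptions: the function names, the taxon labels, and all counts are hypothetical and are not data from the talk.

```python
from collections import defaultdict


def phylum_relative_abundance(read_counts, phylum_of):
    """Collapse per-taxon read counts into per-phylum fractions that sum to 1."""
    totals = defaultdict(float)
    for taxon, count in read_counts.items():
        totals[phylum_of[taxon]] += count
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("sample has no reads")
    return {phylum: n / grand_total for phylum, n in totals.items()}


def group_average(samples):
    """Average per-phylum fraction across samples; a missing phylum counts as 0."""
    phyla = set().union(*samples)
    return {p: sum(s.get(p, 0.0) for s in samples) / len(samples) for p in phyla}


# Hypothetical taxon-to-phylum map and read counts (placeholders, not talk data).
phylum_of = {
    "taxon_a": "Bacteroidetes",
    "taxon_b": "Bacteroidetes",
    "taxon_c": "Firmicutes",
    "taxon_d": "Proteobacteria",
}
healthy = phylum_relative_abundance(
    {"taxon_a": 30, "taxon_b": 20, "taxon_c": 45, "taxon_d": 5}, phylum_of
)
disease = phylum_relative_abundance(
    {"taxon_a": 1, "taxon_b": 1, "taxon_c": 48, "taxon_d": 50}, phylum_of
)

for phylum in sorted(healthy):
    print(phylum, round(healthy[phylum], 3), round(disease[phylum], 3))
```

In this toy input the Bacteroidetes fraction falls from 0.5 to 0.02 while Proteobacteria rises from 0.05 to 0.5, the kind of shift the slide labels "collapse" and "explosion".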

Using Scalable Visualization Allows Comparison of the Relative Abundance of 200 Microbe Species Calit2 VROOM-FuturePatient Expedition Comparing 3 LS Time Snapshots (Left) with Healthy, Crohn’s, Ulcerative Colitis (Right Top to Bottom)

Using Dell HPC Cloud and Dell Analytics to Discover Microbial Diagnostics for Disease Dynamics Can We Distinguish Noninvasively Between Health and Disease States? Are There Subsets of Health or Disease States? Can We Track Time Development of the Disease State? Can Novel Microbial Diagnostics Differentiate Health and Disease States?

Using Microbiome Profiles to Survey 155 Subjects for Unhealthy Candidates

Dell Analytics Separates The 4 Patient Types in Our Data Using Our Microbiome Species Data Source: Thomas Hill, Ph.D. Executive Director Analytics Dell | Information Management Group, Dell Software Healthy Ulcerative Colitis Colonic Crohn’s Ileal Crohn’s

I Built on Dell Analytics to Show Dynamic Evolution of My Microbiome Toward and Away from Healthy State – Colonic Crohn’s Source: Thomas Hill, Ph.D. Executive Director Analytics Dell | Information Management Group, Dell Software

I Built on Dell Analytics to Show Dynamic Evolution of My Microbiome Toward and Away from Healthy State – Colonic Crohn’s Healthy Ileal Crohn’s Seven Time Samples Over 1.5 Years Colonic Crohn’s

Dell Analytics Tree Graph Classifies the 4 Health/Disease States With Just 3 Microbe Species Source: Thomas Hill, Ph.D. Executive Director Analytics Dell | Information Management Group, Dell Software
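Three threshold tests on species abundance are enough to give a tree four leaves, one per health or disease state. The sketch below shows only that structure. The species names, the thresholds, and the assignment of states to leaves are all hypothetical placeholders; the slide does not give them, and this is not the Dell Analytics model.

```python
# Node format: (species, threshold, branch_if_below, branch_if_at_or_above).
# A leaf is just the name of a state. Every name and number here is a placeholder.
TREE = (
    "species_A", 1e-3,
    ("species_B", 1e-4, "Ulcerative Colitis", "Healthy"),
    ("species_C", 1e-2, "Colonic Crohn's", "Ileal Crohn's"),
)


def classify(node, profile):
    """Walk the tree using a dict of species -> relative abundance (missing = 0)."""
    while not isinstance(node, str):
        species, threshold, below, at_or_above = node
        node = below if profile.get(species, 0.0) < threshold else at_or_above
    return node


# Hypothetical microbiome profile.
sample = {"species_A": 5e-4, "species_B": 2e-3}
print(classify(TREE, sample))
```

A learned tree would pick the species and thresholds from training data, for example by minimizing Gini impurity at each split.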

Our Relative Abundance Results Across ~300 People Show Why Dell Analytics Tree Classifier Works UC 100x Healthy LS 100x UC We Produced Similar Results for ~2500 Microbial Species Healthy 100x CD

Using Ayasdi’s Advanced Topological Data Analysis to Separate Healthy from Disease States All Healthy All Ileal Crohn’s Healthy, Ulcerative Colitis, and LS All Healthy Using Ayasdi Categorical Data Lens Analysis by Mehrdad Yazdani, Calit2 Talk to Ayasdi in the Intel Booth at SC14

Ayasdi Enables Discovery of Differences Between Healthy and Disease States Using Microbiome Species Healthy LS Ileal Crohn’s Ulcerative Colitis Using Multidimensional Scaling Lens with Correlation Metric High in Healthy and LS High in Healthy and Ulcerative Colitis High in Both LS and Ileal Crohn’s Disease Analysis by Mehrdad Yazdani, Calit2

From Taxonomy to Function: Analysis of LS Clusters of Orthologous Groups (COGs) Analysis: Weizhong Li & Sitao Wu, UCSD

In a “Healthy” Gut Microbiome: Large Taxonomy Variation, Low Protein Family Variation Source: Nature, 486, (2012) Over 200 People

Ratio of HE11529 to Ave HE Test to see How Much Variation There is Within Healthy Most KEGGs Are Within 10x Of Healthy for a Random HE Ratio of Random HE11529 to Healthy Average for Each Nonzero KEGG

However, Our Research Shows Large Changes in Protein Families Between Health and Disease Most KEGGs Are Within 10x In Healthy and Ileal Crohn’s Disease KEGGs Greatly Increased In the Disease State KEGGs Greatly Decreased In the Disease State Over 7000 KEGGs Which Are Nonzero in Health and Disease States Ratio of CD Average to Healthy Average for Each Nonzero KEGG Note Hi/Low Symmetry
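The ratio plots on this and the neighboring slides can be approximated by dividing a group's average KEGG abundance by the healthy average, keeping only KEGGs that are nonzero in both, and counting how many fall inside or outside a 10x band. This is a minimal sketch under those assumptions; the function names, KEGG identifiers, and values are hypothetical.

```python
def fold_ratios(group_avg, healthy_avg):
    """Ratio group/healthy for each KEGG that is nonzero in both averages."""
    return {
        kegg: group_avg[kegg] / healthy_avg[kegg]
        for kegg in group_avg
        if group_avg[kegg] > 0 and healthy_avg.get(kegg, 0) > 0
    }


def summarize(ratios, fold=10.0):
    """Count KEGGs within the fold band, above it, and below its reciprocal."""
    counts = {"within": 0, "increased": 0, "decreased": 0}
    for ratio in ratios.values():
        if ratio > fold:
            counts["increased"] += 1
        elif ratio < 1.0 / fold:
            counts["decreased"] += 1
        else:
            counts["within"] += 1
    return counts


# Hypothetical averages; K_d is zero in healthy, so it is excluded.
healthy_avg = {"K_a": 2.0, "K_b": 1.0, "K_c": 50.0, "K_d": 0.0}
cd_avg = {"K_a": 3.0, "K_b": 40.0, "K_c": 0.5, "K_d": 7.0}

ratios = fold_ratios(cd_avg, healthy_avg)
print(summarize(ratios))
```

Comparing the "increased" and "decreased" counts is one simple way to quantify the high/low symmetry the slide notes.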

Note UC Has Many Few KEGGs that are Much Smaller than HE; Also Fewer KEGGs That are Nonzero; Note Asymmetry Between High & Low Most KEGGs Are Within 10x In Healthy and Ulcerative Colitis KEGGs Greatly Increased In the Disease State KEGGs Greatly Decreased In the Disease State Ratio of UC Average to Healthy Average for Each Nonzero KEGG

Note LS001 Has Many Few KEGGs that are Much Smaller than HE; ~Same # KEGGs That are Nonzero; Note Asymmetry Between High & Low Ratio of LS001 Average to Healthy Average for Each Nonzero KEGG Most KEGGs Are Within 10x In Healthy and LS001 KEGGs Greatly Increased In the Disease State KEGGs Greatly Decreased In the Disease State

We Can Define a Subgroup of the 10,000 KEGGs Which Are Extreme in the Disease State Look for KEGGs That Have the Properties: –Are 100x in All Four Disease States –LS001/Ave HE –Ave CD/Ave HE –Ave UC/Ave HE –Sick HE Person/Ave HE There are 48 of These Extreme KEGGs A New Way to Define What is Wrong with the Microbiome in Disease? Can We Devise an Ayasdi Lens That Can Separate These Extreme KEGGs?
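The selection rule on this slide can be sketched as a filter over four ratio tables, one per comparison against the healthy average. The sketch assumes "100x" means at least 100 times higher than the healthy average, which the slide does not spell out. The function name, table names, and values are hypothetical.

```python
def extreme_keggs(ratio_tables, fold=100.0):
    """Return KEGG IDs whose ratio to the healthy average is >= fold in every table."""
    common = set.intersection(*(set(table) for table in ratio_tables))
    return sorted(
        kegg for kegg in common if all(table[kegg] >= fold for table in ratio_tables)
    )


# Hypothetical ratios to the healthy average for the four comparisons.
ls001 = {"K_a": 250.0, "K_b": 120.0, "K_c": 3.0}
cd = {"K_a": 400.0, "K_b": 90.0, "K_c": 150.0}
uc = {"K_a": 101.0, "K_b": 300.0, "K_c": 200.0}
sick_he = {"K_a": 100.0, "K_b": 500.0}

print(extreme_keggs([ls001, cd, uc, sick_he]))
```

Only K_a survives here: K_b falls below 100x in one table and K_c is absent from another.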

Using Ayasdi Interactively to Explore Protein Families in Healthy and Disease States Source: Pek Lum, Formerly Chief Data Scientist, Ayasdi Dataset from Larry Smarr Team With 60 Subjects (HE, CD, UC, LS) Each with 10,000 KEGGs - 600,000 Cells

CD is Missing a Population of Bacteria That Exists in High Quantities in HE (Circled with Arrow) Problem is That These KEGGs Have Moderate Values of Ave CD/Ave HE How Can We Change the Ayasdi Lenses So That We Pick Out The Very High Values of Ratios to Ave HE? Low in CD and LS Source: Pek Lum, Formerly Chief Data Scientist, Ayasdi

This Ayasdi Lens Does Identify KEGGs In Which Ave CD and LS001 Are Less Than Ave HE Problem is That These KEGGs Have Moderately Low Values of Ave CD/Ave HE How Can We Change the Ayasdi Lenses So That We Pick Out The Very High Values of Ratios to Ave HE?

We Found a Set of Lenses That More Clearly Find the 43 Extreme KEGGs
K00108(choline_dehydrogenase)
K00673(arginine_N-succinyltransferase)
K00867(type_I_pantothenate_kinase)
K01169(ribonuclease_I_(enterobacter_ribonuclease))
K01484(succinylarginine_dihydrolase)
K01682(aconitate_hydratase_2)
K01690(phosphogluconate_dehydratase)
K01825(3-hydroxyacyl-CoA_dehydrogenase_/_enoyl-CoA_hydratase_/3-hydroxybutyryl-CoA_epimerase_/_enoyl-CoA_isomerase_[EC: _ _ ])
K02173(hypothetical_protein)
K02317(DNA_replication_protein_DnaT)
K02466(glucitol_operon_activator_protein)
K02846(N-methyl-L-tryptophan_oxidase)
K03081(3-dehydro-L-gulonate-6-phosphate_decarboxylase)
K03119(taurine_dioxygenase)
K03181(chorismate--pyruvate_lyase)
K03807(AmpE_protein)
K05522(endonuclease_VIII)
K05775(maltose_operon_periplasmic_protein)
K05812(conserved_hypothetical_protein)
K05997(Fe-S_cluster_assembly_protein_SufA)
K06073(vitamin_B12_transport_system_permease_protein)
K06205(MioC_protein)
K06445(acyl-CoA_dehydrogenase)
K06447(succinylglutamic_semialdehyde_dehydrogenase)
K07229(TrkA_domain_protein)
K07232(cation_transport_protein_ChaC)
K07312(putative_dimethyl_sulfoxide_reductase_subunit_YnfH_(DMSO_reductase_anchor_subunit))
K07336(PKHD-type_hydroxylase)
K08989(putative_membrane_protein)
K09018(putative_monooxygenase_RutA)
K09456(putative_acyl-CoA_dehydrogenase)
K09998(arginine_transport_system_permease_protein)
K10748(DNA_replication_terminus_site-binding_protein)
K11209(GST-like_protein)
K11391(ribosomal_RNA_large_subunit_methyltransferase_G)
K11734(aromatic_amino_acid_transport_protein_AroP)
K11735(GABA_permease)
K11925(SgrR_family_transcriptional_regulator)
K12288(pilus_assembly_protein_HofM)
K13255(ferric_iron_reductase_protein_FhuF)
K14588()
K15733()
K15834()
L-Infinity Centrality Lens Using Norm Correlation as Metric (Resolution: 242, Gain: 5.7)
Entropy & Variance Lens Using Angle as Metric (Resolution: 30, Gain: 3.00)
Analysis by Mehrdad Yazdani, Calit2
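In topological data analysis, an L-infinity centrality lens is commonly defined as each point's maximum distance to any other point, so outlying points score high and central points score low. The sketch below computes that value using one minus the Pearson correlation as the distance. That distance is an assumed stand-in for Ayasdi's "Norm Correlation" metric, whose exact definition is not given in the talk, and the sample rows are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_distance(x, y):
    """Assumed metric: 0 for perfectly correlated rows, 2 for anti-correlated rows."""
    return 1.0 - pearson(x, y)


def l_infinity_centrality(rows, dist):
    """For each row, the largest distance to any other row."""
    return [
        max(dist(row, other) for j, other in enumerate(rows) if j != i)
        for i, row in enumerate(rows)
    ]


# Hypothetical rows, e.g. one KEGG's abundance across three subjects.
rows = [[1, 2, 3], [2, 4, 6], [1, 3, 2], [3, 2, 1]]
centrality = l_infinity_centrality(rows, correlation_distance)
print(centrality)
```

In this toy input the third row has the smallest value, so it is the most central under this lens.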

Disease Arises from Perturbed Protein Family Networks: Dynamics of a Prion Perturbed Network in Mice Source: Lee Hood, ISB Our Next Goal is to Create Such Perturbed Networks in Humans

Visualizing Time Series of 150 LS Blood and Stool Variables, Each Over 5-10 Years Calit2 64 megapixel VROOM One Blood Draw For Me

Only One of My Blood Measurements Was Far Out of Range--Indicating Chronic Inflammation Normal Range <1 mg/L Normal 27x Upper Limit C-Reactive Protein (CRP) is a Blood Biomarker for Detecting Presence of Inflammation Episodic Peaks in Inflammation Followed by Spontaneous Drops

Adding Stool Tests Revealed Oscillatory Behavior in an Immune Variable Normal Range <7.3 µg/mL 124x Upper Limit Antibiotics Lactoferrin is a Protein Shed from Neutrophils - An Antibacterial that Sequesters Iron Typical Lactoferrin Value for Active IBD Hypothesis: Lactoferrin Oscillations Coupled to Relative Abundance of Microbes that Require Iron

Fine Time-Resolution Sampling Enables Analysis of Dynamical Innate and Adaptive Immune Dysfunction Normal Innate Immune System Normal Adaptive Immune System

By Overlaying a Number of Immune/Inflammation Variables, It Appears There May be Phase Correlations Variables Plotted: CRP, SED, Lact, Lyzo, SigA, Calp Data Analytics by Benjamin Smarr, UC Berkeley

One Can Use Sine Fitting with Least Squares To Try and Approximate the Time Series Dynamics Data Analytics by Benjamin Smarr, UC Berkeley 5 Sines
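One way to fit sines by least squares is to fix a set of candidate periods and solve the linear problem for an offset plus a sine and cosine amplitude per period. The sketch below does this with the normal equations and needs no external libraries. It assumes the periods are chosen in advance, which may differ from the method used for the slide, and the data are synthetic.

```python
import math


def design_row(t, periods):
    """Basis values at time t: [1, sin, cos, sin, cos, ...], one pair per period."""
    row = [1.0]
    for period in periods:
        omega = 2.0 * math.pi / period
        row += [math.sin(omega * t), math.cos(omega * t)]
    return row


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: duplicate periods or too few samples")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_sines(times, values, periods):
    """Least-squares coefficients via the normal equations (A^T A) c = A^T y."""
    rows = [design_row(t, periods) for t in times]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(k)]
    return solve(ata, aty)


def predict(t, coeffs, periods):
    """Evaluate the fitted curve at time t."""
    return sum(c * x for c, x in zip(coeffs, design_row(t, periods)))


# Synthetic series: offset 5, a 30-sample sine, and a 12-sample cosine.
periods = [30.0, 12.0]
times = list(range(60))
values = [
    5.0 + 3.0 * math.sin(2 * math.pi * t / 30) + 1.5 * math.cos(2 * math.pi * t / 12)
    for t in times
]
coeffs = fit_sines(times, values, periods)
print([round(c, 6) for c in coeffs])
```

The coefficients come back in the order offset, then sine and cosine amplitudes for each period. The phase of each component follows from its sine and cosine amplitudes, for example with `math.atan2`, which is what a phase comparison between variables would use.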

With Low Resolution Sine Fitting, There Is Indication of Phase Correlation Data Analytics by Benjamin Smarr, UC Berkeley 2 Sines

Are There Ayasdi Tools to More Deeply Analyze Such Time Series?

UC San Diego Will Be Carrying Out a Major Clinical Study of IBD Using These Techniques Inflammatory Bowel Disease Biobank For Healthy and Disease Patients Drs. William J. Sandborn, John Chang, & Brigid Boland UCSD School of Medicine, Division of Gastroenterology Already 120 Enrolled, Goal is 1500 Announced Last Friday!

Inexpensive Consumer Time Series of Microbiome Now Possible Through Ubiome Data source: LS (Stool Samples); Sequencing and Analysis Ubiome

By Crowdsourcing, Ubiome Can Show I Have a Major Disruption of My Gut Microbiome (+) (-) LS Sample on September 24, 2014 Visit Ubiome in the Exponential Medicine Healthcare Innovation Lab

Where I Believe We are Headed: Predictive, Personalized, Preventive, & Participatory Medicine Will Grow to 1000, Then 10,000, Then 100,000

Genetic Sequencing of Humans and Their Microbes Is a Huge Growth Area and the Future Foundation of Medicine Twitter 9/27/2014

Thanks to Our Great Team! UCSD Metagenomics Team Weizhong Li Sitao Wu Future Patient Team Jerry Sheehan Tom DeFanti Kevin Patrick Jurgen Schulze Andrew Prudhomme Philip Weber Fred Raab Joe Keefe Ernesto Ramirez Ayasdi Devi Ramanan Pek Lum JCVI Team Karen Nelson Shibu Yooseph Manolito Torralba SDSC Team Michael Norman Mahidhar Tatineni Robert Sinkovits UCSD Health Sciences Team William J. Sandborn Elisabeth Evans John Chang Brigid Boland David Brenner Dell/R Systems Brian Kucic John Thompson