1 DNA sequencing, big data and health Mikael Huss Science for Life Laboratory / Stockholm University @mikaelhuss Follow the Data blog: http://followthedata.wordpress.com Stockholm Big Data Meetup #2

2 All* living organisms have DNA as their blueprint: GTTACGTAACCGTTACGTA… CCTTGATCGTAAC… etc. (2 x 3 billion letters for humans). *OK, some viruses have RNA.

3 Science for Life Laboratory, or SciLifeLab (Karolinska Institute Science Park, Solna). Joint research centre between KI, the Royal Institute of Technology (KTH), Stockholm University and Uppsala University. Presently sequencing ~3 megabases per second, corresponding to about 3 human genome sizes per hour.
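
The throughput figure on this slide can be checked with a quick calculation. The sketch below assumes a genome size of 3 billion bases (the single-copy figure from slide 2) and counts raw bases only; the constant and function names are made up for illustration.

```python
# Back-of-the-envelope check of the throughput quoted on the slide.
BASES_PER_SECOND = 3_000_000        # ~3 megabases per second (from the slide)
HUMAN_GENOME_BASES = 3_000_000_000  # ~3 billion letters, one copy (assumed from slide 2)
SECONDS_PER_HOUR = 3600


def genome_equivalents_per_hour(bases_per_second, genome_size=HUMAN_GENOME_BASES):
    """Return how many genome-sized amounts of raw sequence are produced per hour."""
    return bases_per_second * SECONDS_PER_HOUR / genome_size


if __name__ == "__main__":
    # 3e6 bases/s * 3600 s = 1.08e10 bases/h, i.e. 3.6 genome sizes per hour
    print(genome_equivalents_per_hour(BASES_PER_SECOND))
```

This gives 3.6 genome sizes per hour, consistent with the slide's "about 3". Note that this is raw output: sequencing one genome in practice reads each position many times over, so the number of finished genomes per hour is lower.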

4

5 Mount Sinai Medical Center / Eric Schadt

6 Exploring the human microbiome: an estimated 10x more bacterial cells than human cells in the human body.

7 Environmental samples: soil, ocean, etc. Identifying new viruses in human or environmental samples; <1% known so far.

8 http://www.ted.com/talks/nathan_wolfe_what_s_left_to_explore.html

9 “Big data” in genomics: data is often “transposed” compared to other “big data” types. Genomics: few samples, collected at great cost, information rich. Example: 20 tissue samples x 30,000 features (genes); “large p, small n”. Twitter, log files, purchase data etc.: lots of samples, cheap, low information content. Example: 200,000,000 tweets x 150 features (words). Analysis challenges: “large p, small n”. Samples are hard to come by and expensive to collect, although you get a lot of information about each sample. Hard to get enough data for statistics → extra important to share data and analysis methods globally. Not enough people looking at the data that has been generated already.
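
One way to see why “large p, small n” is a challenge: with only 20 samples and thousands of features, some features will correlate strongly with any outcome purely by chance. The sketch below is an illustration on random data, not an analysis from the talk. It uses 3,000 random features rather than the slide's 30,000 to keep the run short, and the function names are hypothetical.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def abs_correlations(n_samples=20, n_features=3000, seed=0):
    """Correlate pure-noise features with a pure-noise outcome.

    Returns the absolute correlation of each feature with the outcome.
    There is no real signal anywhere in this data.
    """
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_samples)]
    corrs = []
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_samples)]
        corrs.append(abs(pearson(feature, outcome)))
    return corrs


if __name__ == "__main__":
    corrs = abs_correlations()
    print("strongest of first 10 features:", round(max(corrs[:10]), 2))
    print("strongest of all 3000 features:", round(max(corrs), 2))
```

The more features are screened against the same few samples, the stronger the best chance correlation looks. This is why more samples, and therefore shared data, matter so much for the statistics.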

10 Analysis challenges: dealing with the size of raw data. Growth in sequencing capacity has outstripped Moore’s law. Need to throw away data → tailored streaming / approximate algorithms. (Source: The Economist)
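
As an illustration of the streaming idea, reservoir sampling keeps a fixed-size uniform random sample of a stream without storing the stream itself. The talk does not name a specific algorithm, so this is only one textbook example; the function name and the stand-in read identifiers are hypothetical.

```python
import random


def reservoir_sample(stream, k, seed=None):
    """Keep a uniform random sample of k items from a stream of unknown length.

    Only k items are held in memory at any time; everything else is
    discarded as it streams past (classic "Algorithm R").
    """
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Stand-in for a stream of sequencing reads (identifiers are made up).
    reads = (f"read_{n}" for n in range(1_000_000))
    print(reservoir_sample(reads, k=5, seed=42))
```

Memory use depends only on `k`, not on how much data flows past, which is the property that matters when raw data grows faster than storage.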

11 Personal sequencing? Genomics apps

12 Predictive modelling competition for breast cancer prognosis

13 Community genomics & crowdsourced clinical trials https://www.23andme.com/about/factoids/

14 Coming challenges: ecology and lifestyle. Perhaps: “genomic observatories” continuously monitoring environmental DNA → streaming, real-time analysis important. Genes – Epigenetics – Lifestyle – Environment: understanding the interplay of lifestyle (including environment) and genes through the “interface layer”, epigenetics. Massive correlational analyses …

15 Thanks for listening!

