Published by Willa Blankenship. Modified over 8 years ago.
1
DNA sequencing, big data and health
Mikael Huss, Science for Life Laboratory / Stockholm University
@mikaelhuss
Follow the Data blog: http://followthedata.wordpress.com
Stockholm Big Data Meetup #2
2
All* living organisms have DNA as their blueprint
GTTACGTAACCGTTACGTA… CCTTGATCGTAAC… etc. (2 x 3 billion letters for humans)
*OK, some viruses have RNA instead
3
Science for Life Laboratory, or SciLifeLab (Karolinska Institute Science Park, Solna)
Joint research centre between KI, the Royal Institute of Technology (KTH), Stockholm University and Uppsala University
Presently sequencing ~3 megabases per second, corresponding to about 3 human genome sizes per hour
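A quick back-of-the-envelope check of the throughput figure, as a minimal sketch. The genome size of 3 billion letters is taken from the earlier slide and is assumed here to mean one (haploid) copy; the function name is made up for illustration.

```python
# Rough check of "~3 megabases per second ~ about 3 human genome sizes per hour".
# Assumption: one human genome size = 3 billion letters (a single haploid copy).
BASES_PER_SECOND = 3_000_000
GENOME_SIZE_BASES = 3_000_000_000
SECONDS_PER_HOUR = 3600


def genome_sizes_per_hour(bases_per_second, genome_size_bases):
    """Return how many genome-sized chunks of sequence are produced per hour."""
    return bases_per_second * SECONDS_PER_HOUR / genome_size_bases


if __name__ == "__main__":
    rate = genome_sizes_per_hour(BASES_PER_SECOND, GENOME_SIZE_BASES)
    print(f"{rate:.1f} genome sizes per hour")
```

Under these assumptions the rate comes out at roughly 3.6 genome sizes per hour, in line with the "about 3" quoted on the slide.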
5
Mount Sinai Medical Center / Eric Schadt
6
Exploring the human microbiome
Estimated 10x more bacterial cells than human cells in the human body
7
Environmental samples: soil, ocean etc.
Identifying new viruses in human or environmental samples; <1% known so far
8
http://www.ted.com/talks/nathan_wolfe_what_s_left_to_explore.html
9
“Big data” in genomics
Data is often “transposed” compared to other “big data” types
Genomics: few samples, collected at great cost, information rich
Example: 20 tissue samples x 30,000 features (genes); “large p, small n”
Twitter, log files, purchase data etc.: lots of samples, cheap, low information content
Example: 200,000,000 tweets x 150 features (words)
Analysis challenges: “large p, small n”
Samples are hard to come by and expensive to collect, although you get a lot of information about each sample
Hard to get enough data for statistics, so it is extra important to share data and analysis methods globally
Not enough people looking at the data that has been generated already
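One way to see why “large p, small n” is hard for statistics, as a minimal sketch with purely random numbers (no real data). With only 20 samples and tens of thousands of features, some feature will correlate strongly with any outcome by chance alone. The function names and the choice of Pearson correlation are assumptions made for illustration, not something taken from the talk.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def max_spurious_correlation(n_samples, n_features, seed=0):
    """Largest |correlation| between a random outcome and any random feature.

    All values are independent Gaussian noise, so any correlation found
    here is spurious by construction.
    """
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_samples)]
        best = max(best, abs(pearson(feature, outcome)))
    return best


if __name__ == "__main__":
    # Genomics-like shape from the slide: 20 samples x 30,000 features.
    print("20 samples x 30,000 features:", max_spurious_correlation(20, 30_000))
    # Same number of samples, but only a handful of features.
    print("20 samples x 5 features:", max_spurious_correlation(20, 5))
```

The first number is typically much larger than the second, even though nothing in the data is related to anything else. More samples, for example from shared data sets, are what bring it down.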
10
Analysis challenges: dealing with the size of raw data
Growth in sequencing capacity has outstripped Moore’s law
Need to throw away data
Tailored streaming / approximate algorithms
The Economist
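As an illustration of what a streaming algorithm that throws away data can look like, here is a minimal sketch of reservoir sampling. It keeps a fixed-size uniform random sample of reads from a stream of unknown length, in a single pass and with constant memory. This is a generic textbook technique chosen for illustration, not necessarily one used at SciLifeLab, and the read names are made up.

```python
import random


def reservoir_sample(stream, k, seed=None):
    """Keep a uniform random sample of k items from a stream of any length.

    Every item is seen exactly once and memory use stays at k items,
    so everything else in the stream is discarded as it goes by.
    """
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (i + 1),
            # replacing a uniformly chosen existing entry.
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Hypothetical stream of one million read identifiers.
    reads = (f"read_{i}" for i in range(1_000_000))
    sample = reservoir_sample(reads, k=10, seed=42)
    print(sample)
```

The stream is consumed as a generator, so the full set of reads never has to sit in memory at once.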
11
Personal sequencing? Genomics apps
12
Predictive modelling competition for breast cancer prognosis
13
Community genomics & crowdsourced clinical trials
https://www.23andme.com/about/factoids/
14
Coming challenges: ecology and lifestyle
Perhaps: “genomic observatories” continuously monitoring environmental DNA; streaming, real-time analysis important
Genes – Epigenetics – Lifestyle – Environment
Understanding the interplay of lifestyle (including environment) and genes through the “interface layer”, epigenetics
Massive correlational analyses …
15
Thanks for listening!