Applied Statistics – Challenges and Reward. Wenjiang Fu, Ph.D., Computational Genomics Lab, Department of Epidemiology, Michigan State University

1 Applied Statistics – Challenges and Reward. Wenjiang Fu, Ph.D., Computational Genomics Lab, Department of Epidemiology, Michigan State University. fuw@msu.edu, www.msu.edu/~fuw

2 What is Statistics? “Lies, Damned Lies, and Statistics.” “Figures fool when fools figure.” A branch of mathematical science that studies data through probability distributions and modeling. Fields: probability theory, actuarial science, biostatistics, financial statistics, industrial statistics, etc. Related fields: biometrics, bioinformatics, geostatistics, statistical mechanics, econometrics, etc.

3 Grand challenges we are facing … “Data”, Knowledge & Information, Decision, Statistics. The 21st century will be the golden age of statistics!

4 Grand challenges we are facing … 1. Data collection technology has advanced dramatically, but without sufficient statistical sampling design and experimental design. 2. Advancement of technology for discovering and retrieving useful information has been lagging and has become the bottleneck. 3. More sophisticated approaches are needed for decision making and risk management.

5 Statistical Challenges – Massive Amount of Data

6 Statistical Challenges – Image Data

7 Statistical Challenges – Functional Data, Graph (Network) Data, and Shape Data

8 Statistical Challenges – Click Stream Data

9 Statistical Challenges – Data Fusion and Assimilation Data

10 Statistics in Science: cosmic microwave background radiation; high energy physics; tick-by-tick stock data; genomic/proteomic data.

11 Statistics in Science: fingerprints; microarray.

12 What do we do? New ways of thinking and attacking problems: finding sub-optimal but computationally feasible solutions; new paradigms for new types of data; being satisfied with ‘very rough’ approximations; turning research results into easy-to-use, publicly available software and programs. Join forces with computer scientists.

13 Some ‘hot’ research directions: dimension reduction; visualization; dynamic systems; simulation and real-time computation; uncertainty and risk management; interdisciplinary research.

14 Example 1. Sociology data

15 Results from statistical modeling

16 Example 2. Epidemiological study data

17 Results from statistical modeling

18 Example 3. Medical study data: Ob/Gyn. Modeling of PlGF (placental growth factor)

19 SNP: Single Nucleotide Polymorphism. Homologous pairs of chromosomes, each carrying a paternal allele and a maternal allele. Double-stranded sequences: ACGAACAGCT / TGCTTGTCGA and ACGAGCAGCT / TGCTCGTCGA. SNP: A/G.
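A minimal sketch of how the A/G SNP on this slide can be located by comparing the two top-strand sequences position by position. The function name `find_snps` is illustrative and not from the presentation, and the sketch assumes the two sequences are already aligned and of equal length.

```python
def find_snps(seq1, seq2):
    """Return (position, allele1, allele2) for every mismatch between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned and of equal length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq1, seq2)) if a != b]


# Top strands from the slide; positions are 0-based.
print(find_snps("ACGAACAGCT", "ACGAGCAGCT"))  # [(4, 'A', 'G')]
```

Real genotyping works from array intensities or sequencing reads rather than clean strings, so this only illustrates what "a single-nucleotide difference" means.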

20 The International HapMap Consortium (Nature 2003)

21 Allele, Haplotype and Diplotype. SNP 1: two alleles, A and a. SNP 2: two alleles, B and b. Haplotypes: [AB] and [ab]. Diplotype: [AB][ab].
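The definitions on this slide can be made concrete with a short enumeration. The sketch below lists every haplotype for two biallelic SNPs, every unordered diplotype, and the diplotypes consistent with the unphased genotype Aa/Bb. All function names are illustrative, not from the presentation.

```python
from itertools import combinations_with_replacement, product


def haplotypes(snp_alleles):
    """All haplotypes: one allele chosen at each SNP."""
    return ["".join(h) for h in product(*snp_alleles)]


def diplotypes(haps):
    """All unordered pairs of haplotypes (a pair may repeat the same haplotype)."""
    return list(combinations_with_replacement(haps, 2))


def genotype(diplotype):
    """Unphased genotype: the sorted allele pair observed at each SNP."""
    h1, h2 = diplotype
    return tuple("".join(sorted(pair)) for pair in zip(h1, h2))


haps = haplotypes([("A", "a"), ("B", "b")])
dips = diplotypes(haps)
# Diplotypes that produce the same double-heterozygous genotype.
ambiguous = [d for d in dips if genotype(d) == ("Aa", "Bb")]

print(haps)       # ['AB', 'Ab', 'aB', 'ab']
print(len(dips))  # 10
print(ambiguous)  # [('AB', 'ab'), ('Ab', 'aB')]
```

The last line shows why haplotypes generally have to be inferred statistically: a genotype that is heterozygous at both SNPs does not say whether the individual carries [AB][ab] or [Ab][aB].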

22 Microarray Technology: 2 channels. Hybridization: the strand A T C G T A G pairs base by base with its complement T A G C A T C.
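The base pairing behind hybridization (A with T, C with G) can be sketched in a few lines. The names `COMPLEMENT` and `complement` are illustrative; the input is assumed to contain only the letters A, C, G and T.

```python
# Watson-Crick pairing: A<->T, C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand):
    """Return the base-by-base complementary strand (same left-to-right order)."""
    return strand.translate(COMPLEMENT)


print(complement("ATCGTAG"))  # TAGCATC
```

The same function reproduces the lower strands on the SNP slide, for example `complement("ACGAACAGCT")` gives `TGCTTGTCGA`.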

23 Microarray normalization: between slides. Boxplots of log ratios from 3 replicate self-self hybridizations. Left panel: before normalization. Middle panel: after within-print-tip-group normalization. Right panel: after a further between-slide scale normalization.
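A minimal sketch of one common form of between-slide scale normalization, assuming the spread of each slide is measured by its median absolute deviation (MAD) and each slide is rescaled so that all slides share the geometric mean of the MADs. The slide does not state which scale estimate was used, so this choice is an assumption, and the data below are made up for illustration.

```python
import math
from statistics import median


def mad(values):
    """Median absolute deviation from the median."""
    m = median(values)
    return median([abs(v - m) for v in values])


def scale_normalize(slides):
    """Rescale each slide's log ratios so every slide has the same MAD.

    Assumes every slide has a non-zero MAD.
    """
    mads = [mad(s) for s in slides]
    # Common target spread: geometric mean of the per-slide MADs.
    target = math.exp(sum(math.log(m) for m in mads) / len(mads))
    return [[v * target / m for v in s] for s, m in zip(slides, mads)]


# Hypothetical log ratios for two slides; the second is twice as spread out.
slides = [[-2.0, -1.0, 0.0, 1.0, 2.0], [-4.0, -2.0, 0.0, 2.0, 4.0]]
normalized = scale_normalize(slides)
print([round(mad(s), 6) for s in normalized])  # both MADs are now equal
```

Only the spread is adjusted here; location differences are assumed to have been removed by the earlier within-slide step.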

24 Affymetrix SNP Array. Illustration of SNP annotation on the Affymetrix SNP array. Adapted from Matsuzaki et al. 2004. ‘AB’ annotation for an A/C SNP: allele A = A, allele B = C.

25 Computational Genomics Data: SNP Genotype. Error rate: 1–5%. GIGO: garbage in, garbage out.

26 Computational Genomics Data: SNP Genotype

27 Prospects I: Genome-oriented Medicine. Genetic variation influences disease susceptibility, disease progression, therapeutic response, and unwanted drug effects. Genetics is pointing the way to personalized medicine… With the development of the human HapMap project, coupled with advanced statistical approaches, we are entering an era of designing personalized medicine based on an individual’s genetic profile.

28 Whole Genome-wide Association Studies

29 Whole Genome-wide Association Studies. Successful study: Wellcome Trust Case-Control Consortium GWAS on 7 diseases with 14,000 patients and 3,000 shared controls (Nature 2007). Hypertension, diabetes, etc.
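A genome-wide association study repeats a simple case-control comparison at every SNP. The sketch below shows one such comparison, an allelic chi-square test on a 2x2 table of allele counts. The counts are hypothetical, the function name is illustrative, and this is not presented as the analysis used in the study cited on the slide.

```python
import math


def allelic_chi_square(case_counts, control_counts):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    case_counts and control_counts are (risk_allele, other_allele) pairs.
    Returns (chi-square statistic, p-value).
    """
    a, b = case_counts
    c, d = control_counts
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical allele counts at one SNP.
chi2, p = allelic_chi_square((60, 40), (40, 60))
print(chi2, p)
```

Because the test is repeated across a very large number of SNPs, the per-SNP significance threshold has to be far stricter than the usual 0.05.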

30 Recruiting Graduate Students. Epidemiology: study of the distribution of disease. Biostatistics: data modeling, computation. Quantitative Biology Initiative: MSU cross-disciplinary center. Background: Mathematics, Statistics, Physics, Biology, Chemistry, and others. Opportunity: Contact your department graduate director/chairman for funding from the Ministry of Education. MSU Epi/Biostatistics provides partial funding and covers tuition fees. Qualifications: TOEFL, GRE, GPA, reference letters. My contact: fuw@msu.edu, www.msu.edu/~fuw. Application: WWW.MSU.EDU

31 Thank you! Q and A. Office: CMS 415.

