Published by Mary Walsh. Modified over 9 years ago.
1 Applied Statistics – Challenges and Reward
Wenjiang Fu, Ph.D.
Computational Genomics Lab, Department of Epidemiology, Michigan State University
fuw@msu.edu
www.msu.edu/~fuw
2 What is Statistics?
“Lies, Damned Lies, and Statistics”
“Figures fool when fools figure”
A branch of mathematical science that studies data through probability distributions and modeling.
Fields: probability theory, actuarial science, biostatistics, financial statistics, industrial statistics, etc.
Related fields: biometrics, bioinformatics, geostatistics, statistical mechanics, econometrics, etc.
3 Grand challenges we are facing …
Diagram labels: “Data”; Knowledge & Information; Decision; Statistics
The 21st century will be the golden age of statistics!
4 Grand challenges we are facing …
1. Data collection technology has advanced dramatically, but without sufficient statistical sampling design and experimental design.
2. Advancement of technology for discovering and retrieving useful information has been lagging and has become the bottleneck.
3. More sophisticated approaches are needed for decision making and risk management.
5 Statistical Challenges – Massive Amount of Data
6 Statistical Challenges – Image Data
7 Statistical Challenges – Functional Data, Graph (Network) Data, and Shape Data
8 Statistical Challenges – Click Stream Data
9 Statistical Challenges – Data Fusion and Assimilation Data
10 Statistics in Science
- Cosmic microwave background radiation
- High energy physics
- Tick-by-tick stock data
- Genomic/proteomic data
11 Statistics in Science
- Fingerprints
- Microarray
12 What do we do?
- New ways of thinking about and attacking problems
- Finding sub-optimal but computationally feasible solutions
- New paradigms for new types of data
- Be satisfied with ‘very rough’ approximations
- Turn research results into easy-to-use, publicly available software and programs
- Join forces with computer scientists
13 Some ‘hot’ research directions
- Dimension reduction
- Visualization
- Dynamic systems
- Simulation and real-time computation
- Uncertainty and risk management
- Interdisciplinary research
14 Example 1. Sociology data
15 Result through statistical modeling
16 Example 2. Epidemiological study data
17 Results from statistical modeling
18 Example 3. Medical study data: Ob/Gyn
Modeling of PlGF (placental growth factor)
19 SNP: Single Nucleotide Polymorphism
Homologous pair of chromosomes, one carrying the paternal allele and one the maternal allele (each shown with its complementary strand):
ACGAACAGCT
TGCTTGTCGA
and
ACGAGCAGCT
TGCTCGTCGA
SNP A/G (fifth base)
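To make the A/G difference on this slide concrete, here is a minimal Python sketch that compares the two top strands position by position. The function name `find_snps` and the variable names are illustrative assumptions, not part of the talk, and real SNP calling works on aligned reads from many individuals rather than two short strings.

```python
def find_snps(seq_a, seq_b):
    """Return (1-based position, base in seq_a, base in seq_b) for each mismatch.

    Hypothetical helper for illustration; assumes the two sequences are
    already aligned and of equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


# Top strands of the two homologous chromosomes shown on the slide.
chrom_1 = "ACGAACAGCT"
chrom_2 = "ACGAGCAGCT"

print(find_snps(chrom_1, chrom_2))  # [(5, 'A', 'G')]
```

The single mismatch at position 5 is the A/G SNP labeled on the slide.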
20 The International HapMap Consortium (Nature 2003)
21 Allele, Haplotype and Diplotype
SNP 1: two alleles A and a
SNP 2: two alleles B and b
Haplotype [ AB ]
Haplotype [ ab ]
Diplotype [ AB ][ ab ]
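The relationship between genotypes, haplotypes, and diplotypes can be sketched in a few lines of Python. The sketch below enumerates every diplotype consistent with unphased genotypes at each SNP; the function name `compatible_diplotypes` and the two-character genotype strings are assumptions made for illustration only.

```python
from itertools import product


def compatible_diplotypes(genotypes):
    """List the diplotypes consistent with unphased genotypes.

    `genotypes` is a list of two-character strings, one per SNP,
    e.g. ["Aa", "Bb"]. Hypothetical helper for illustration.
    """
    diplotypes = set()
    # For each SNP, choose which of its two alleles sits on haplotype 1;
    # the other allele goes on haplotype 2.
    for choice in product((0, 1), repeat=len(genotypes)):
        hap1 = "".join(g[c] for g, c in zip(genotypes, choice))
        hap2 = "".join(g[1 - c] for g, c in zip(genotypes, choice))
        # A diplotype is an unordered pair, so store it sorted.
        diplotypes.add(tuple(sorted((hap1, hap2))))
    return sorted(diplotypes)


# Heterozygous at both SNPs: two phasings are possible.
print(compatible_diplotypes(["Aa", "Bb"]))  # [('AB', 'ab'), ('Ab', 'aB')]
```

For an individual heterozygous at both SNPs, the genotype data alone cannot distinguish [ AB ][ ab ] from [ Ab ][ aB ], which is why haplotypes usually have to be inferred statistically.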
22 Microarray Technology: 2 channels
Hybridization:
A T C G T A G
| | | | | | |
T A G C A T C
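The base pairing drawn on this slide (A with T, C with G) can be checked with a short Python sketch. The names `COMPLEMENT`, `complement`, and `hybridizes` are illustrative assumptions, and the sketch pairs bases position by position exactly as the slide draws them, ignoring strand orientation and partial matches.

```python
# Watson-Crick pairing rules for DNA.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Return the base-by-base complement of a DNA strand."""
    return "".join(COMPLEMENT[base] for base in strand)


def hybridizes(probe, target):
    """True if every base of `probe` pairs with the base at the same
    position of `target` (perfect match only, as drawn on the slide)."""
    return len(probe) == len(target) and complement(probe) == target


print(complement("ATCGTAG"))             # TAGCATC
print(hybridizes("ATCGTAG", "TAGCATC"))  # True
```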
23 Microarray normalization: between slides
Boxplots of log ratios from 3 replicate self-self hybridizations.
Left panel: before normalization.
Middle panel: after within print-tip group normalization.
Right panel: after a further between-slide scale normalization.
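As a rough illustration of the last step, the sketch below rescales the log ratios of each slide so that all slides share the same spread. It assumes the spread is measured by the median absolute deviation (MAD) and that each slide is rescaled toward the geometric mean of the MADs; the slide does not state which scale estimator was used, so treat this as one common choice rather than the author's exact procedure. The function names and the toy numbers are made up for the example.

```python
from math import exp, log
from statistics import median


def mad(values):
    """Median absolute deviation from the median."""
    center = median(values)
    return median([abs(v - center) for v in values])


def scale_normalize(slides):
    """Rescale each slide's log ratios to a common MAD.

    `slides` is a list of lists of log ratios (already centered by the
    within-slide normalization). Assumes every slide has a nonzero MAD.
    """
    mads = [mad(s) for s in slides]
    # Geometric mean of the per-slide MADs is the common target scale.
    target = exp(sum(log(m) for m in mads) / len(mads))
    return [[v * target / m for v in s] for s, m in zip(slides, mads)]


# Toy log ratios for three replicate slides (invented values).
slides = [
    [-0.2, -0.1, 0.0, 0.1, 0.2],
    [-0.8, -0.4, 0.0, 0.4, 0.8],
    [-0.4, -0.2, 0.0, 0.2, 0.4],
]
normalized = scale_normalize(slides)
print([round(mad(s), 3) for s in normalized])
```

After rescaling, the three boxplots would have comparable box heights, which is the visual effect described for the right panel.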
24 Affymetrix SNP Array
Illustration of SNP annotation on the Affymetrix SNP array. Adapted from Matsuzaki et al. 2004.
‘AB’ coding for an A/C SNP: allele A = A, allele B = C.
25 Computational Genomics Data: SNP Genotype
Error rate: 1–5%. GIGO: garbage in, garbage out.
26 Computational Genomics Data: SNP Genotype
27 Prospects I: Genome-oriented Medicine
Genetic variation influences
- disease susceptibility
- disease progression
- therapeutic response
- unwanted drug effects
Genetics is pointing the way to personalized medicine…
With the development of the human HapMap project, coupled with advanced statistical approaches, we are entering an era of designing personalized medicine based on an individual’s genetic profile.
28 Whole Genome-wide Association Studies
29 Whole Genome-wide Association Studies
Successful study: Wellcome Trust Case-Control Consortium GWAS on 7 diseases with 14,000 patients and 3,000 shared controls (Nature 2007).
Hypertension, diabetes, etc.
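At its simplest, a case-control association study tests, SNP by SNP, whether allele frequencies differ between patients and controls. The sketch below shows a basic allelic chi-square test on a 2x2 table of allele counts. It is a generic textbook test, not the analysis used in the consortium study, and the counts are invented for illustration.

```python
from math import erfc, sqrt


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele table.

    Arguments are allele counts: alternative/reference alleles in cases
    and in controls. Returns (statistic, p_value).
    """
    a, b, c, d = case_alt, case_ref, ctrl_alt, ctrl_ref
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square with 1 df, via the normal distribution.
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


# Invented allele counts for one SNP: 2,000 cases and 2,000 controls,
# two alleles per person.
stat, p = allelic_chi_square(1200, 2800, 1000, 3000)
print(f"chi-square = {stat:.2f}, p = {p:.2e}")
```

In a genome-wide scan this test is repeated for hundreds of thousands of SNPs, so the significance threshold has to be far stricter than the usual 0.05.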
30 Recruiting Graduate Students
Epidemiology: study of the distribution of disease.
Biostatistics: data modeling, computation.
Quantitative Biology Initiative: MSU cross-disciplinary center.
Background: Mathematics, Statistics, Physics, Biology, Chemistry, and others.
Opportunity: Contact your department graduate director/chairman for funding from the Ministry of Education. MSU Epi/Biostatistics provides partial funding and covers tuition fees.
Qualification: TOEFL, GRE, GPA, reference letter.
My contact: fuw@msu.edu, www.msu.edu/~fuw
Application: WWW.MSU.EDU
31 Thank you! Q and A. Office: CMS 415.