Published by Brenda Walton. Modified over 8 years ago.
1
Data Science
Jon Kettenring
2
Brief Bio
1956-1962 BS and MS in Statistics from Stanford U.
1962-1964 US Army in Turkey
1964-1969 PhD in Statistics from U. North Carolina
1969-1984 Bell Labs
1984-2003 Bellcore, Telcordia Technologies
2004-2016 Drew U.
Teaching stints at U. Washington, U. Minnesota, Stanford U., and U. Michigan
3
My Interests
Technical
  Applied statistics, data analysis, data science
  Multivariate methods
  ‘Big data’
  Applied math, software engineering, and machine learning
Personal
  Tennis
  1 wife, 2 children, and 4 grandchildren
4
What is Data Science?
An interdisciplinary field: math, statistics, information technology, probability, machine learning…
Extraction of knowledge from data…‘small’ data, ‘big’ data
Growing field, many graduate programs, lots of jobs (whatever it is)
Buzzwords?
Nate Silver, creator of fivethirtyeight.com: “I think data-scientist is a sexed up term for statistician…Statistics is a branch of science. Data scientist is slightly redundant.”
5
What is Data Science? (Venn diagram courtesy of Drew Conway)
6
Examples of Important Statistical Ideas in Data Science
Sampling
Bias
Variability
Replication
Experimental design
Modeling
Inference
Prediction
7
Examples of Projects (all with underlying data themes)
Past projects
  Exercise and health
  Autism
  ‘Statistics 101’
  Methane hydrates
  ‘Big data’ medical records
Potential projects
  Climate change
  Reproducibility of scientific findings
  Sexism in science
  ‘Big data’ challenges
8
“Playing Dumb on Climate Change”
Naomi Oreskes, New York Times Sunday Review, 1/3/15; artwork by Oliver Munday
Are scientists too conservative?
Presumption of innocence or no effect: “show me the data”.
Type 1 error: conclude effect is real when it isn’t.
Type 2 error: conclude effect is not there when it is.
Have scientists underpredicted the threat of climate change?
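The two error types on this slide can be illustrated with a small simulation. This is a minimal sketch and not part of the original talk: the sample size of 25, the effect size of 0.3, the known standard deviation of 1, and the helper names `rejects_null` and `rejection_rate` are all assumptions chosen for illustration.

```python
import random
from statistics import NormalDist, mean


def rejects_null(sample, sigma=1.0, alpha=0.05):
    """Two-sided z-test of H0: mean = 0, assuming sigma is known."""
    n = len(sample)
    z = mean(sample) / (sigma / n ** 0.5)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def rejection_rate(true_mean, n=25, trials=4000, seed=0):
    """Fraction of simulated studies in which H0 is rejected."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        hits += rejects_null(sample)
    return hits / trials


# No real effect: every rejection is a Type 1 error.
type1_rate = rejection_rate(true_mean=0.0)
# Real (assumed) effect of 0.3: every non-rejection is a Type 2 error.
type2_rate = 1 - rejection_rate(true_mean=0.3)

print(f"Type 1 error rate: {type1_rate:.3f}")
print(f"Type 2 error rate: {type2_rate:.3f}")
```

Under these assumptions the Type 1 rate stays near the chosen 0.05 threshold while the Type 2 rate is much larger, which is one way to read the slide's question about scientists being too conservative: a strict guard against false alarms can mean missing real effects often.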
9
In Malé, the capital of the Maldives, more than 120,000 people live just a meter or so above current sea level.
Julia Fahrenkamp-Uppenbrink et al. Science 2015;350:750-751
Published by AAAS
10
“Why Most Published Research Findings Are False”
Article by J. Ioannidis, PLoS Medicine, August 2005, 696-701
Overreliance on p-values (Type 1 error estimates)
  Reject the null hypothesis if p < .05 (a common practice)
  Publish the ‘research finding’
Bias problems in analysis and reporting
  Bury any negative findings
  Conflicts of interest and prejudice
  Isolated reporting of results from one of many studies
Sample size too small
Effect size too small
Pressure to publish on hot topics
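Two of the points above lend themselves to quick arithmetic. The sketch below is illustrative only and was not on the slide: the function names and the example inputs (20 studies, pre-study odds of 0.1, power of 0.2) are assumptions. The second function uses the positive predictive value formula from the Ioannidis paper in its bias-free form, PPV = (1 − β)R / ((1 − β)R + α), where R is the pre-study odds that a tested relationship is true.

```python
def prob_false_positive_in_k(k, alpha=0.05):
    """Chance that at least one of k independent studies of a null
    effect reaches p < alpha purely by chance."""
    return 1 - (1 - alpha) ** k


def ppv(prior_odds, power, alpha=0.05):
    """Share of 'significant' findings that reflect a true effect,
    given pre-study odds R and power 1 - beta (no bias term)."""
    return power * prior_odds / (power * prior_odds + alpha)


# Reporting one 'hit' out of 20 null studies (assumed count).
one_of_many = prob_false_positive_in_k(20)       # about 0.64

# Small, underpowered study on a long-shot hypothesis (assumed inputs).
weak_study = ppv(prior_odds=0.1, power=0.2)      # about 0.29

# Well-powered study on an even-odds hypothesis (assumed inputs).
strong_study = ppv(prior_odds=1.0, power=0.8)    # about 0.94

print(round(one_of_many, 2), round(weak_study, 2), round(strong_study, 2))
```

With these assumed inputs, a significant result from the weak study is more likely false than true, which matches the slide's concerns about small samples, small effects, and selective reporting.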
11
Sexism in Science: Has Gender Balance Been Reached?
The jungle gym, a new metaphor for careers
  Fits the challenges faced by women
2012 paper in PNAS (Moss-Racusin et al.)
  Male and female science professors evaluated resumes for a lab manager position more favorably if they believed the applicant was male
2014 article in the NYT (10/31): “Academic science isn’t sexist”
  Gender balance achieved in math-intensive fields
Role of implicit bias (a topic in itself)
12
What is ‘Big Data’?
Massive amounts of complex information
Data often observational (not randomized)
Data quality vs. data quantity
Standard data analytical tools break down
  How to visualize the data?
  How to analyze the data?
Examples are everywhere
  Medicine, genomics, environment, business, finance, geography, astronomy, social media…
13
Projects with Data Themes
Specific aspects of climate change (e.g., ocean warming, fish patterns)
Reproducibility of scientific findings
Sexism in science (e.g., implicit biases)
‘Big data’
Any topic of interest that involves statistical data