Data Science, Statistics, and Projects

Data Science, Statistics, and Projects Jon Kettenring

My Background
- 1956-1962: BS and MS in Statistics from Stanford U.
- 1962-1964: US Army in Turkey
- 1964-1969: PhD in Statistics from U. North Carolina
- 1969-1984: Bell Labs
- 1984-2003: Bellcore, Telcordia Technologies
- 2004-2015: Drew U.

Technical interests
- Applied statistics and data analysis
- Applied math, software engineering, and machine learning

Data Science
- A trendy term that captures the interdisciplinary nature of modern data problems.
- Data Science is becoming fashionable in universities.
- Many employers are looking for well-trained data scientists.

What is Data Science?
- Extraction of knowledge from data
- Uses math, statistics, information technology, probability, machine learning, visualization, …
- Applies to “small” and “big” data
- Just buzz words? Just another name for statistics?
- “I think data-scientist is a sexed up term for a statistician.... Statistics is a branch of science. Data scientist is slightly redundant.” (Nate Silver)
- Silver's presidential election forecasts:
  - 2008, Obama vs. McCain: 49 of 50 states called correctly
  - 2012, Obama vs. Romney: 50 of 50 states called correctly

What is Data Science?
- Drew Conway, “The Data Science Venn Diagram”, http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram
- Danger zone: “know enough to be dangerous” … short on math and statistics skills.
- I would substitute “Computer Science” for “Hacking Skills” and “Domain Knowledge” for “Substantive Knowledge.”

Examples of Important Statistical Ideas in Data Science
- Sampling
- Bias
- Variability
- Replication
- Experimental design
- Modeling
- Inference
- Prediction

Examples of Projects (all with underlying data themes)

Past projects:
- Exercise and health
- Autism
- Introductory statistics courses

Potential projects:
- Climate change (as in your reading)
- Reproducibility of scientific findings
- Sexism in science
- Big Data
- Any topic of interest to you that involves data

“Playing Dumb on Climate Change”
Naomi Oreskes, New York Times Sunday Review, 1/3/15 (image by Oliver Munday)
- Are scientists too conservative?
- Presumption of innocence or no effect: “show me the data”.
- Type 1 error: conclude effect is real when it isn’t.
- Type 2 error: conclude effect is not there when it is.
- Have scientists underpredicted the threat of climate change?

Given the growing evidence of the impact of climate change, have scientists been too cautious in making their case, thereby putting us and our planet at risk? Have they focused too much on controlling the type 1 error at the expense of the type 2 error?
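The trade-off between the two error types can be made concrete with a small calculation. The sketch below is illustrative only and is not from the article or the slides: it assumes a one-sided z-test with known standard deviation, and the effect size (0.3) and sample size (50) are made-up values. It shows that tightening the type 1 error rate (alpha) raises the type 2 error rate (beta) when everything else is held fixed.

```python
from statistics import NormalDist


def type2_error(alpha, effect, n, sigma=1.0):
    """Type 2 error rate (beta) of a one-sided z-test of H0: mean = 0.

    Assumes the true mean is `effect`, with n independent observations
    and a known standard deviation sigma (illustrative assumptions).
    """
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)      # reject H0 when the z-statistic exceeds this
    shift = effect * n ** 0.5 / sigma    # mean of the z-statistic when the effect is real
    return z.cdf(critical - shift)       # probability of failing to reject H0


# Hypothetical scenario: a modest real effect and a moderate sample.
for alpha in (0.10, 0.05, 0.01, 0.001):
    beta = type2_error(alpha, effect=0.3, n=50)
    print(f"alpha={alpha:<6} beta={beta:.3f}")
```

Under these assumptions, a stricter alpha always yields a larger beta, which is the tension the questions above point to.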

“Why Most Published Research Findings Are False”
Article by J. Ioannidis, PLoS Medicine, August 2005, 696-701
- Overreliance on p-values (Type 1 error estimates)
- Reject the null hypothesis if p < .05 (a common standard of practice)
- Publish the ‘research finding’
- Bias problems in analysis and reporting
- Bury negative findings
- Conflicts of interest and prejudice
- Isolated reporting of results of one of many studies
- Sample size too small
- Effect size too small
- Pressure to publish on hot topics
- Need for more attention to replication, among other things.
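A short calculation shows why small samples and a p < .05 threshold can produce many false findings. The sketch below uses the positive predictive value from the Ioannidis article in its simplest form (no bias term): PPV = power × R / (power × R + alpha), where R is the pre-study odds that a tested relationship is true. The function name and the example numbers are hypothetical choices for illustration, not values from the slides.

```python
def positive_predictive_value(prior_odds, alpha=0.05, power=0.8):
    """Share of 'significant' findings that are actually true.

    prior_odds: ratio of true to false relationships among those tested (R).
    alpha: type 1 error rate; power: 1 minus the type 2 error rate.
    Ignores bias and multiple competing studies (a simplifying assumption).
    """
    true_positives = power * prior_odds   # true relationships that reach significance
    false_positives = alpha               # null relationships that reach significance
    return true_positives / (true_positives + false_positives)


# Hypothetical scenarios: a well-powered study of a plausible hypothesis
# versus an underpowered study in a field where few hypotheses are true.
print(round(positive_predictive_value(prior_odds=1.0, power=0.8), 3))
print(round(positive_predictive_value(prior_odds=0.1, power=0.2), 3))
```

In the second scenario fewer than half of the significant results reflect a real effect, even though every one of them passed the p < .05 test.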

Sexism in Science (Has gender balance been reached?)
- 2012 paper in PNAS: Male and female science professors evaluated resumes more favorably if they believed the applicant was a male.
- 2014 article in the NYT: “Academic science isn’t sexist.” Gender balance achieved in math-intensive fields.

What is “Big Data”?
- Massive amounts of complex information
- Data often observational (not randomized)
- Data quality vs. data quantity
- Standard data analytical tools break down
- How to visualize the data?
- How to analyze the data?
- Examples arise in many domains: medicine, genomics, environment, business, finance, geography, astronomy, and social media

Projects with Data Themes

Potential projects:
- Climate change
- Reproducibility of scientific findings
- Sexism in science
- Big Data
- Any topic of interest to you that involves data

If you would like to discuss a potential project, please stop by my office or send me an email.