Predict Who Survived the Titanic Disaster
Steps: Goal, Hypotheses, Get Data, Data Management, Statistics & Analysis, Submit Predictions



Goal: Achieve a High Prediction Score
Score = number of passengers in the test dataset whose fate is correctly predicted
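As a minimal sketch of how such a score could be computed, the Python snippet below counts matching predictions. The function name `prediction_score` and the toy lists are assumptions made for illustration and do not come from the slides; 1 stands for "survived" and 0 for "did not survive".

```python
def prediction_score(predicted, actual):
    """Return (number correct, fraction correct) for two equal-length 0/1 lists."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct, correct / len(actual)


# Hypothetical toy data, not taken from the Titanic dataset.
toy_predicted = [1, 0, 0, 1, 0]
toy_actual = [1, 0, 1, 1, 0]

toy_correct, toy_fraction = prediction_score(toy_predicted, toy_actual)
print(toy_correct, toy_fraction)
```

The count is the score as the slide defines it; dividing by the number of test passengers gives the same result as a percentage.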

Hypothesis: Women and Children First

Training and Test Data
All Titanic Passengers: N = 2,223
Training Data: N = 891, 39% survived (used to develop the model)
Test Data: N = 418
Comparable cases: All Employees and a Subset of Current Employees; All Customers and a Subset of Customers

Predictor Variables

Variable   Description                         Type                  Data
pclass     Passenger Class                     Categorical, Ordinal  1 = 1st; 2 = 2nd; 3 = 3rd
name       Name                                Text
sex        Sex                                 Categorical
age        Age                                 Numeric
sibsp      Number of Siblings/Spouses Aboard   Integer
parch      Number of Parents/Children Aboard   Integer
ticket     Ticket Number                       Text
fare       Passenger Fare                      Numeric
cabin      Cabin                               Text
embarked   Port of Embarkation                 Categorical           C = Cherbourg; Q = Queenstown; S = Southampton

Note: pclass is a proxy for socio-economic status (1st ~ Upper; 2nd ~ Middle; 3rd ~ Lower).

Get Data: Read the dataset into Excel, R, etc.

Datasets: Training and Test
Develop the model using the training dataset and apply it to the test data.

Data Management: Some age data are missing, so analyze gender only.
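The missing-data check behind this decision might look like the Python sketch below. It reads a small inline CSV rather than the real file; the rows in `TOY_CSV` and the helper `count_missing` are hypothetical, and the lowercase column names follow the predictor variable list in this presentation, which may differ from the headers in the actual download.

```python
import csv
import io

# Hypothetical toy rows; an empty field stands for a missing value.
TOY_CSV = """pclass,sex,age
1,female,29
3,male,
2,female,
3,male,22
"""


def count_missing(rows, column):
    """Count rows whose value in `column` is empty or whitespace only."""
    return sum(1 for row in rows if row[column].strip() == "")


toy_rows = list(csv.DictReader(io.StringIO(TOY_CSV)))
missing_age = count_missing(toy_rows, "age")
missing_sex = count_missing(toy_rows, "sex")
print(f"age missing: {missing_age}, sex missing: {missing_sex}")
```

A column with no gaps, such as sex in this toy example, can be used for every passenger without imputing values.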

Gender Model
Develop the model on the training data, then apply it to the test data.
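One way to express a gender-only model in Python is a fixed rule that predicts survival for women and non-survival for men. This is a sketch under that assumption; the slides do not show the rule itself, and the function `gender_model` and the toy test rows are hypothetical.

```python
def gender_model(sex):
    """Predict 1 (survived) for female passengers and 0 otherwise."""
    return 1 if sex.strip().lower() == "female" else 0


# Hypothetical toy test rows, not taken from the Titanic dataset.
toy_test = [
    {"passenger": "A", "sex": "female"},
    {"passenger": "B", "sex": "male"},
    {"passenger": "C", "sex": "male"},
    {"passenger": "D", "sex": "Female"},
]

toy_predictions = [gender_model(row["sex"]) for row in toy_test]
print(toy_predictions)
```

The rule uses no age information, so passengers with missing ages still receive a prediction.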

Submit Model

Leaderboard

Statistics & Analysis: 74% of women and 19% of men in the training data survived.
Submit Predictions: 320 of 418 test passengers predicted correctly; 320 / 418 ≈ 76.6%
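The two calculations on this slide can be reproduced with a few lines of Python. The sketch below computes survival rates by sex on hypothetical toy rows (not the real training data) and then repeats the 320 / 418 division from the slide; the helper name `survival_rate_by_sex` is an assumption.

```python
def survival_rate_by_sex(rows):
    """Return {sex: fraction survived} for rows with 'sex' and 'survived' keys."""
    totals, survivors = {}, {}
    for row in rows:
        sex = row["sex"]
        totals[sex] = totals.get(sex, 0) + 1
        survivors[sex] = survivors.get(sex, 0) + row["survived"]
    return {sex: survivors[sex] / totals[sex] for sex in totals}


# Hypothetical toy rows: 3 of 4 women and 1 of 5 men survive.
toy_training = (
    [{"sex": "female", "survived": 1}] * 3
    + [{"sex": "female", "survived": 0}]
    + [{"sex": "male", "survived": 1}]
    + [{"sex": "male", "survived": 0}] * 4
)

toy_rates = survival_rate_by_sex(toy_training)
print(toy_rates)

# Submission score from the slide: 320 correct out of 418 test passengers.
submission_score = 320 / 418
print(f"{submission_score:.4f}")
```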