The Presidential Analysis

Presentation transcript:

The Presidential Analysis By Group 7: 1. Nishant Sharma 2. Karthik Raman 3. Aditi Mukherjee 4. Nandkishor Patil

Data Overview The analysis primarily draws on three datasets. 1. Presidential Polls: This dataset is a collection of state and national polls on the 2016 presidential election, conducted from November 2015 to November 2016. It includes the raw and weighted poll results by state, date, pollster, and pollster rating, and contains 27 relevant socio-geographic survey variables. The original dataset is from the FiveThirtyEight 2016 Election Forecast. Poll results were aggregated from HuffPost Pollster, RealClearPolitics, polling firms and news reports.

Data Overview: 2. Primary Results: This dataset contains data relevant to the 2016 US Presidential Election, including up-to-date primary results across 8 variables. Each row contains the votes and the fraction of votes that a candidate received in a given county's primary. 3. County table: This contains categories such as population, age groups, race, gender, income, and education for each county.

PROBLEM STATEMENTS AND OBJECTIVES
1. Which variables are the major contributors to the Adjusted Poll prediction of the winner of the 2016 Presidential Election?
2. Checking the survey for non-variance and bias.
3. Which variables are the major contributors to a county voting for a particular party?
4. Predicting who will win a county based on demographics, and selecting the best model for it (Random Forest, Naïve Bayes).
5. Can simple mathematics yield a correct prediction from the big-name surveys? (Surprise Analysis?)
6. Which model best classifies the winner class for the Presidential Polls survey? (Neural Networks)

STATISTICAL INTERPRETATION Five-number summary for the Primary Results dataset:
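The summary table itself is not part of this transcript. As a rough illustration of what a five-number summary reports (minimum, first quartile, median, third quartile, maximum), here is a minimal sketch. The function name and the sample vote counts are hypothetical, and the quartile convention (Python's "inclusive" method) is an assumption that may differ slightly from the one the group used.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    data = sorted(values)
    # quantiles(n=4) returns the three cut points Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]

# Hypothetical vote counts, not taken from the Primary Results dataset.
votes = [1, 2, 3, 4, 5]
summary = five_number_summary(votes)
print(summary)
```

Applied per numeric column (for example votes and fraction of votes), this gives the same kind of overview as the slide describes.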

STATISTICAL INTERPRETATION Best Fit Regression model for Surveyor’s opinion on Adjusted Trump Poll

STATISTICAL INTERPRETATION Residuals vs. fitted plot for our model:

STATISTICAL INTERPRETATION Detecting Outliers in the dataset: --With respect to prior grades of surveyors --With respect to population

CLASSIFICATION ANALYSIS OF PRIMARIES 2.) Applying the Naïve Bayes classifier, using: winner ~ income + hispanic + white + college + density.
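The slide does not say which Naïve Bayes variant or library was used. As an illustration only, the sketch below implements a Gaussian Naïve Bayes classifier for numeric predictors such as those in the formula above. The Gaussian assumption, the function names, and the toy rows (two made-up features standing in for income and college) are all hypothetical.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(rows, labels):
    """Estimate a prior and per-feature (mean, variance) for each class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        stats = []
        for column in zip(*members):
            mean = sum(column) / len(column)
            var = sum((x - mean) ** 2 for x in column) / len(column)
            stats.append((mean, var + 1e-9))  # small floor avoids division by zero
        model[label] = (len(members) / len(rows), stats)
    return model

def predict_gaussian_nb(model, row):
    """Return the class with the highest log posterior for one row."""
    best_label, best_score = None, -math.inf
    for label, (prior, stats) in model.items():
        score = math.log(prior)
        for x, (mean, var) in zip(row, stats):
            # Log of the normal density for feature value x.
            score += -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical county rows: [income, college], not real data.
rows = [[30, 10], [32, 12], [60, 40], [62, 42]]
labels = ["Trump", "Trump", "Cruz", "Cruz"]
model = fit_gaussian_nb(rows, labels)
print(predict_gaussian_nb(model, [31, 11]))
```

Each feature contributes independently to the score, which is the "naïve" assumption behind the method.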

CLASSIFICATION ANALYSIS OF PRIMARIES 2.) Naïve Bayes classifier, continued: confusion matrix.

CLASSIFICATION ANALYSIS OF PRIMARIES 1.) Applying Random Forest, using: winner ~ income + hispanic + white + college + density. It has roughly 70% accuracy. The blue line represents the error for Cruz, red is Trump and green is Rubio. Rubio's errors are more erratic since we only have a few data points for him. Confusion matrix:
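The confusion matrix figure is not included in this transcript. To make the terms concrete, here is a minimal sketch of how a confusion matrix and an accuracy figure are computed from actual and predicted winners. The function names and the label lists are hypothetical and do not reproduce the group's results.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))

def accuracy(actual, predicted):
    """Fraction of rows where the predicted label equals the actual one."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

# Hypothetical county winners, not real results.
actual = ["Trump", "Trump", "Cruz", "Rubio", "Cruz"]
predicted = ["Trump", "Cruz", "Cruz", "Trump", "Cruz"]

matrix = confusion_matrix(actual, predicted)
print(matrix[("Trump", "Trump")], accuracy(actual, predicted))
```

A class with few rows, as with Rubio here, contributes few cells to the matrix, which is one reason its error estimate can be erratic.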

ANALYSIS INSIGHTS AND GRAPHS 1.) Raw vs. adjusted polls over time: Clinton's adjustments were much higher.

ANALYSIS INSIGHTS AND GRAPHS 2.) Polling with respect to surveyors (grade-wise): lower-grade pollsters were better at predicting, perhaps due to their lower adjustments.

ANALYSIS INSIGHTS AND GRAPHS 3.) Standard deviation between polls: high standard deviation for both candidates.
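The slide does not state how the spread between polls was computed. One plausible reading, shown here purely as a sketch, is the sample standard deviation of each candidate's adjusted poll values. The function name and the numbers below are hypothetical.

```python
import statistics

def poll_spread(adjusted_values):
    """Sample standard deviation of one candidate's adjusted poll values."""
    return statistics.stdev(adjusted_values)

# Hypothetical adjusted poll percentages, not taken from the dataset.
adj_clinton = [40, 44, 48]
adj_trump = [38, 41, 44]

print(poll_spread(adj_clinton), poll_spread(adj_trump))
```

A large value relative to the gap between the two candidates would mean individual polls disagree by more than the margin they are trying to measure.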

Predictive Analysis To predict who is winning according to each pollster in our dataset, we used a simple formula: if (adjClinton - adjTrump) > 0, set the "winner" column to Clinton, else Trump. The correctness of the formula is confirmed by this map. Map legend: Trump, Clinton.
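The labeling rule above can be sketched in a few lines. The function name and the sample values are hypothetical; only the comparison itself comes from the slide. Note that, as the formula is written, an exact tie is labeled Trump.

```python
def label_winner(adj_clinton, adj_trump):
    """Apply the slide's rule: Clinton if adjClinton - adjTrump > 0, else Trump."""
    return "Clinton" if adj_clinton - adj_trump > 0 else "Trump"

# Hypothetical adjusted poll percentages for three polls.
polls = [(45.2, 41.0), (39.5, 44.1), (42.0, 42.0)]
winners = [label_winner(c, t) for c, t in polls]
print(winners)
```

The resulting "winner" column is the class label that the classifiers on the following slides are trained to predict.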

Predictive Analysis: Finding the best model for classification? 1.) Applying Neural Networks to Presidential Data for our formula: two input nodes, stepmax set to 1e9, a single hidden layer (3 nodes), train set (70% of data), test set (30% of data). Confusion matrix: Result: the ANN is not effective for classification, as it only predicts partially.

Predictive Analysis: Finding the best model for classification? 2.) Applying Naïve Bayes to Presidential Data for our formula: folds = 10, train set (90%), test set (10%). 82% accuracy. Confusion matrix for Naïve Bayes: 3.) Applying Support Vector Machines: folds = 10, train set (90%), test set (10%). 84.5% accuracy. Confusion matrix for SVMs:
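With 10 folds, each round trains on 90% of the rows and tests on the remaining 10%, which matches the split quoted above. The sketch below shows one way to generate such folds. The function name, the seed, and the row count are hypothetical, and the group's actual tooling is not stated in the slides.

```python
import random

def k_fold_indices(n_rows, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# With 100 hypothetical rows, each round uses 90 for training and 10 for testing.
for train, test in k_fold_indices(100, k=10):
    print(len(train), len(test))
```

Averaging the accuracy over the 10 rounds gives a single figure per model, comparable to the percentages reported on the slide.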

Predictive Analysis: (extended) Box-and-whisker plots: SVMs have the highest mean accuracy.

Predictive Analysis: (extended) Parallel plots and scatterplot matrix: 1.) For both SVM and Naïve Bayes, each trial of each cross-validation fold behaved in a random manner. 2.) SVM and Naïve Bayes are strongly correlated to a certain extent.

Conclusion
1. Apart from the raw poll counts, the Adjusted Poll was mostly influenced by the grade of the surveyor, population type, sample size and poll weight.
2. Surprisingly, lower-grade surveyors and v-type populations predicted the actual winner in most cases.
3. We ran two classification models to see which variables are the major contributors in the primaries. The main takeaway is that Donald Trump seems to have a much broader appeal than his two main rivals, at least among Republican primary voters, and he is most successful in counties that have low median income, low college attainment, and a larger Hispanic population.
4. Adjustments in votes were higher for Clinton than for Trump, perhaps due to media bias.
5. Using the adjusted votes, we formulated class labels that match the final outcome of the 2016 election.
6. Using the above prediction formula, we applied three classifiers to find the best result, and concluded that Naïve Bayes and SVMs both work quite well in prediction on the test set.