Assignment 2 (in lab) Peter Fox and Greg Hughes


Assignment 2 (in lab) Peter Fox and Greg Hughes Data Analytics – ITWS-4600/ITWS-6600 Week 3, February 2, 2017

Labs (1 each for Assignment 2)
Regression: new multivariate dataset
kNN: new Abalone dataset
Kmeans: (sort of) new Iris dataset

The Dataset(s)
http://aquarius.tw.rpi.edu/html/DA
Some new ones: dataset_multipleRegression.csv, abalone.csv
Code fragments (i.e. they will not run as-is) are on the following slides, as Lab3b_knn1_2016.R, etc.

How does this work?
The following slides have 3 lab assignments for you to complete.
These should be completed individually.
Once you complete one (or all), please raise your hand or approach me or Rahul to review what you obtained (together these = 10% of your grade).
There is nothing to hand in.
If you do not complete part or all of them today, that is okay, but you will need to schedule a time to show your results.

Refer to Tuesday's slides and the script fragments on the website.

Regression (1)
Retrieve this dataset: dataset_multipleRegression.csv
Using the unemployment rate (UNEM) and the number of spring high school graduates (HGRAD), predict the fall enrollment (ROLL) for this year, given that UNEM=9% and HGRAD=100,000.
Repeat and add per capita income (INC) to the model. Predict ROLL if INC=$30,000.
Summarize and compare the two models. Comment on significance.
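The regression steps above can be sketched in R roughly as follows. This is only an illustration, not the course's script fragment: it assumes the CSV is in the working directory, that its columns are named ROLL, UNEM, HGRAD and INC as on the slide, and that UNEM is stored in percent (so 9 means 9%). If the file is not found, a synthetic stand-in data frame (made-up numbers, not the lab data) is generated so that the sketch still runs.

```r
# Sketch only: file location, column names and units are assumptions.
csv_path <- "dataset_multipleRegression.csv"
if (file.exists(csv_path)) {
  enroll <- read.csv(csv_path)
} else {
  # Synthetic stand-in so the sketch runs without the file; NOT the lab data.
  set.seed(1)
  n <- 30
  enroll <- data.frame(
    UNEM  = runif(n, 5, 11),
    HGRAD = runif(n, 9000, 20000),
    INC   = runif(n, 1900, 3500)
  )
  enroll$ROLL <- 400 * enroll$UNEM + 0.4 * enroll$HGRAD +
    3 * enroll$INC + rnorm(n, sd = 500)
}

# Model 1: enrollment explained by unemployment rate and HS graduates.
fit1 <- lm(ROLL ~ UNEM + HGRAD, data = enroll)
# Model 2: the same model with per capita income added.
fit2 <- lm(ROLL ~ UNEM + HGRAD + INC, data = enroll)

# Predict ROLL at the values given on the slide.
pred1 <- predict(fit1, newdata = data.frame(UNEM = 9, HGRAD = 100000))
pred2 <- predict(fit2, newdata = data.frame(UNEM = 9, HGRAD = 100000,
                                            INC = 30000))
print(pred1)
print(pred2)

# Coefficient p-values and R-squared for each model.
print(summary(fit1))
print(summary(fit2))
# F-test for whether adding INC improves on the smaller (nested) model.
print(anova(fit1, fit2))
```

The p-values in the coefficient tables from summary() and the F-test from anova() are one reasonable basis for the comment on significance.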

Classification (2)
Retrieve the abalone.csv dataset.
The task is predicting the age of abalone from physical measurements. The age of abalone is determined by cutting the shell through the cone, staining it, and counting the number of rings through a microscope: a boring and time-consuming task. Other measurements, which are easier to obtain, are used to predict the age.
Perform kNN classification to predict Age (Rings). Interpretation is not required.
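One possible shape for the kNN exercise is sketched below, using knn() from the class package. Several things here are assumptions rather than part of the lab: the ring-count column is assumed to be named Rings, the three age classes and their cut points are an illustrative choice, the 70/30 train/test split is arbitrary, and k is set by the square-root rule of thumb. If abalone.csv is not found, a synthetic stand-in (not real abalone measurements) is used so that the sketch runs.

```r
library(class)  # provides knn(); a recommended package shipped with R

csv_path <- "abalone.csv"
if (file.exists(csv_path)) {
  abalone <- read.csv(csv_path)
} else {
  # Synthetic stand-in so the sketch runs; NOT real abalone data.
  set.seed(2)
  n <- 300
  rings <- sample(3:20, n, replace = TRUE)
  abalone <- data.frame(
    Length   = 0.10 + 0.03 * rings + rnorm(n, sd = 0.05),
    Diameter = 0.08 + 0.02 * rings + rnorm(n, sd = 0.05),
    Rings    = rings
  )
}

# Bin ring counts into three age classes (cut points are illustrative).
age <- cut(abalone$Rings, breaks = c(-Inf, 8, 11, Inf),
           labels = c("young", "adult", "old"))

# Keep numeric predictors only, drop Rings, and rescale each to [0, 1]
# so that no single measurement dominates the distance calculation.
num_cols <- sapply(abalone, is.numeric)
X <- abalone[, num_cols & names(abalone) != "Rings", drop = FALSE]
rescale <- function(x) (x - min(x)) / (max(x) - min(x))
X <- as.data.frame(lapply(X, rescale))

# Hold out 30% of the rows for testing.
set.seed(42)
train_idx <- sample(nrow(X), size = floor(0.7 * nrow(X)))
k <- round(sqrt(length(train_idx)))  # rule-of-thumb starting value

pred <- knn(train = X[train_idx, , drop = FALSE],
            test  = X[-train_idx, , drop = FALSE],
            cl    = age[train_idx], k = k)

# Confusion table: actual age class versus kNN prediction.
print(table(actual = age[-train_idx], predicted = pred))
```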

Clustering (3)
The Iris dataset (in R use data("iris") to load it).
The 5th column is the species, and you want to find how many clusters there are without using that information.
Create a new data frame and remove the fifth column.
Apply kmeans (you choose k) with 1000 iterations.
Use table(iris[,5],<your clustering>) to assess results.
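A minimal sketch of these clustering steps, using only base R. The choice k = 3, the seed, and the loop over candidate values of k are assumptions made for illustration, and "1000 iterations" is read here as iter.max = 1000.

```r
data("iris")

# New data frame without the 5th column (Species).
iris_features <- iris[, -5]

# kmeans starts from random centers, so fix the seed for repeatability.
set.seed(123)
k <- 3  # illustrative choice; the lab asks you to choose k yourself
fit <- kmeans(iris_features, centers = k, iter.max = 1000)

# Cross-tabulate the true species against the cluster assignments.
print(table(iris[, 5], fit$cluster))

# Optional: compare total within-cluster sum of squares across several k
# to help judge how many clusters the data support.
for (kk in 2:6) {
  f <- kmeans(iris_features, centers = kk, iter.max = 1000)
  cat("k =", kk, " total within-cluster SS =", f$tot.withinss, "\n")
}
```

Cluster numbers are arbitrary labels, so a good result shows each species concentrated in one column of the table, not necessarily on the diagonal.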