Project 1 Binary Classification


Project 1 Binary Classification 2018/11/29

Data
- The input to the classifier is a 2-D point; the output is one of two classes (class 0 is called FALSE, class 1 is called TRUE).
- Training data consist of 200 points whose coordinates are given in xtrain.txt; their class labels are given in ctrain.txt.
- Test data consist of 6831 points whose coordinates are given in xtest.txt.
- For each test point, its probability of appearance is given in ptest.txt.
- For each test point, its probability of belonging to class 1 is given in c1test.txt.

Evaluation
- Use the training data to learn a classifier, then classify each test point as FALSE or TRUE.
- Calculate the following criterion, where p(x) is the probability of appearance (ptest.txt) and p(T|x) is the probability of belonging to class 1 (c1test.txt):

  ErrorRate = Σ_{x classified as FALSE} p(x) · p(T|x) + Σ_{x classified as TRUE} p(x) · (1 − p(T|x))
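The criterion sums, over every test point, the appearance probability weighted by the probability that the assigned label is wrong. A minimal NumPy sketch, using made-up probabilities in place of the real ptest.txt / c1test.txt values:

```python
import numpy as np

# Hypothetical miniature example with 4 test points. p_x stands in for
# ptest.txt (probability of appearance) and p_t for c1test.txt
# (probability of belonging to class 1); pred is the classifier output.
p_x = np.array([0.4, 0.3, 0.2, 0.1])
p_t = np.array([0.9, 0.2, 0.6, 0.1])
pred = np.array([1, 0, 1, 0])  # 1 = TRUE, 0 = FALSE

# A point classified FALSE contributes p(x) * p(T|x);
# a point classified TRUE contributes p(x) * (1 - p(T|x)).
error_rate = np.sum(np.where(pred == 0, p_x * p_t, p_x * (1 - p_t)))
```

With the real data, the same two lines apply after loading the test files and running your classifier.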

Basic requirements
- Use linear regression with basis functions; you may choose any basis functions you like.
  - Calculate the least-squares solution.
  - Calculate the ridge-regression solution for different choices of λ.
- Use logistic regression with basis functions; again, any basis functions may be chosen.
  - Calculate the maximum-likelihood solution.
- Use cross-validation (within the training data) to find an optimal set of basis functions. For example, select from polynomials of different orders, find which order performs best under cross-validation, and then verify whether your choice is correct using the test data.
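The least-squares and ridge solutions share one closed form, w = (ΦᵀΦ + λI)⁻¹Φᵀc, with λ = 0 recovering plain least squares. A sketch with a polynomial basis, using synthetic stand-in data in place of xtrain.txt / ctrain.txt (the polynomial order and λ values here are illustrative choices, not part of the assignment):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for the 200 training points and their labels.
X = rng.standard_normal((200, 2))
c = (X[:, 0] + X[:, 1] > 0).astype(float)

def poly_basis(X, order=2):
    """Polynomial basis of a 2-D input: 1, x1, x2, x1^2, x1*x2, x2^2, ..."""
    cols = [np.ones(len(X))]
    for d in range(1, order + 1):
        for i in range(d + 1):
            cols.append(X[:, 0] ** (d - i) * X[:, 1] ** i)
    return np.column_stack(cols)

def ridge(Phi, c, lam):
    """Solve (Phi^T Phi + lam*I) w = Phi^T c; lam=0 gives least squares."""
    A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ c)

Phi = poly_basis(X, order=2)
w_ls = ridge(Phi, c, lam=0.0)     # least-squares solution
w_ridge = ridge(Phi, c, lam=1.0)  # one ridge solution; sweep lam in practice

# Classify by thresholding the regression output at 0.5.
pred = (Phi @ w_ls >= 0.5).astype(float)
acc = np.mean(pred == c)
```

For cross-validation, the same `ridge` call is simply refit on each training fold for each candidate basis (e.g. each polynomial order) and scored on the held-out fold.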

Plus
Fulfilling the basic requirements earns at most 7/10. The following earn additional credit (at most 3/10 in total):
- Use lasso regression. (+0.5)
- For linear regression: try different sets of basis functions and determine which works better. (+1)
- Use Fisher's LDA. (+0.5)
- For logistic regression: implement the Newton-Raphson method yourself and use it in your experiments. (+0.5)
- For logistic regression: use the Bayesian approach. (+1)
- Divide the training data into multiple classes (e.g. using k-means), train a multi-class classifier accordingly, apply it to the test data, convert the multi-class results back into binary classes, and show the performance. (+3)
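For the Newton-Raphson item, each step solves a weighted least-squares system (IRLS): with y = σ(Φw), the gradient is Φᵀ(y − c) and the Hessian is ΦᵀRΦ with R = diag(y(1 − y)). A minimal sketch on synthetic stand-in data; the small diagonal jitter added to the Hessian is an assumption to keep the solve stable once the data become separable:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-ins for the training points and labels.
X = rng.standard_normal((200, 2))
c = (X[:, 0] - 0.5 * X[:, 1] > 0).astype(float)
Phi = np.column_stack([np.ones(200), X])  # bias + linear basis

def sigmoid(a):
    # Clip to avoid overflow in exp for large |a|.
    return 1.0 / (1.0 + np.exp(-np.clip(a, -30.0, 30.0)))

w = np.zeros(Phi.shape[1])
for _ in range(20):
    y = sigmoid(Phi @ w)
    grad = Phi.T @ (y - c)                 # gradient of negative log-likelihood
    r = y * (1.0 - y)                      # diagonal of R
    H = Phi.T @ (Phi * r[:, None]) + 1e-6 * np.eye(Phi.shape[1])
    w -= np.linalg.solve(H, grad)          # Newton step

acc = np.mean((sigmoid(Phi @ w) >= 0.5).astype(float) == c)
```

The same loop applies unchanged to any basis expansion of the inputs; only the construction of `Phi` changes.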

Report
- .pdf is preferred; .doc and .ppt are acceptable.
- Writing in English is encouraged.
- List all references.
- Including source code is a plus.