Machine Learning
Usman Roshan, Dept. of Computer Science, NJIT

What is Machine Learning?
"Machine learning is programming computers to optimize a performance criterion using example data or past experience." (Introduction to Machine Learning, Alpaydin, 2010)
Examples:
– Facial recognition
– Digit recognition
– Molecular classification

A little history
– 1946: ENIAC, the first general-purpose electronic computer, performs numerical computations
– 1950: Alan Turing proposes the Turing test: can machines think?
– 1952: Arthur Samuel at IBM writes the first game-playing program, for checkers
– 1957: Frank Rosenblatt develops the perceptron; perceptrons can be combined to form a neural network
– 1960s-70s: knowledge-based systems such as ELIZA and MYCIN
– Early 1990s: statistical learning theory emphasizes learning from data instead of rule-based inference
– Current status: machine learning is used widely in industry; various approaches are combined, but data-driven methods are prevalent

Example up-close
Problem: recognize images representing the digits 0 through 9
Input: high-dimensional vectors representing images
Output: 0 through 9, indicating the digit the image represents
Learning: build a model from "training data", then predict "test data" with the model
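
As a concrete illustration of this pipeline, here is a minimal sketch in Python, assuming scikit-learn and its bundled digits dataset are available; the library choice, the nearest-neighbor model, and all parameters are assumptions for illustration, not part of the slides.

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()  # 8x8 digit images flattened to 64-dimensional vectors
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

model = KNeighborsClassifier(n_neighbors=3)  # learning: build a model from training data
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))  # predict test data with the model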

Data model
We assume the data is represented by a set of vectors, each of fixed dimensionality.
Vector: an ordered list of numbers
We may refer to each vector as a datapoint and each dimension as a feature.
Example:
– A bank wishes to classify loan applicants as risky or safe
– Each applicant is a datapoint and is represented by a vector
– Features may include age, income, mortgage/rent, education, family, current loans, and so on
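
To make the vector representation concrete, here is a toy sketch in Python with numpy; the feature values and labels are invented for illustration and are not from the slides.

import numpy as np

# Features (one per dimension): age, income, monthly rent, years of
# education, family size, number of current loans.
applicant_a = np.array([34, 62000, 1400, 16, 3, 1])
applicant_b = np.array([51, 48000,  900, 12, 4, 2])

X = np.vstack([applicant_a, applicant_b])   # dataset: one row per datapoint
y = np.array(["safe", "risky"])             # class labels the bank wants to predict
print(X.shape)                              # (2, 6): 2 datapoints, 6 features each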

Machine learning datasets
– NIPS 2003 feature selection contest
– mldata.org
– UCI machine learning repository

Machine Learning techniques we will learn in the course
– Bayesian classification
– Univariate and multivariate linear regression
– Maximum likelihood estimation
– Naïve Bayes
– Feature selection
– Dimensionality reduction (PCA)
– Clustering
– Nearest neighbor
– Decision trees and random forests
– Linear discrimination
– Logistic regression
– Support vector machines
– Kernel methods
– Regularized risk minimization
– Hidden Markov models
– Graphical models
– Perceptron and neural networks

In practice
– Combination of various methods
– Parameter tuning: trading off error against model complexity
– Data pre-processing: normalization and standardization (see the sketch below)
– Feature selection: discarding noisy features
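
Below is a minimal sketch of these pre-processing steps in Python with numpy; the data values and the variance threshold used for feature selection are assumptions for illustration only.

import numpy as np

X = np.array([[34.0, 62000.0],
              [51.0, 48000.0],
              [29.0, 85000.0]])

# Standardization: zero mean and unit variance per feature.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Normalization: rescale each feature to the range [0, 1].
X_norm = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Crude feature selection: keep only features whose variance exceeds a
# threshold, discarding near-constant (uninformative) features.
keep = X.var(axis=0) > 1e-8
X_sel = X[:, keep]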

Background
Basic linear algebra and probability:
– Vectors
– Dot products
– Eigenvectors and eigenvalues
See the appendix of the textbook for probability background:
– Mean
– Variance
– Gaussian/normal distribution
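
For a quick refresher on these concepts, the short numpy sketch below computes a dot product, an eigen-decomposition, and the sample mean and variance of Gaussian draws; it is illustrative only and not part of the course material.

import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
print(np.dot(u, v))            # dot product: 1*4 + 2*5 + 3*6 = 32

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
vals, vecs = np.linalg.eig(A)  # eigenvalues (3 and 1) and eigenvectors of A
print(vals)

x = np.random.normal(loc=0.0, scale=1.0, size=100000)  # Gaussian/normal samples
print(x.mean(), x.var())       # close to the true mean 0 and variance 1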

Assignments
Implementation of basic classification algorithms with Perl and Python:
– Nearest means (sketched below)
– Naïve Bayes
– k-nearest neighbor
– Cross-validation scripts
Experiment with various algorithms on assigned datasets
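
Here is a sketch of the nearest-means classifier from the list above, written in Python (one of the two assignment languages). The function names and interface are assumptions; the actual assignment specification may differ.

import numpy as np

def fit_nearest_means(X, y):
    """Training: compute the mean vector of each class."""
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def predict_nearest_means(means, X):
    """Prediction: assign each point to the class with the closest mean."""
    classes = list(means)
    centers = np.vstack([means[c] for c in classes])
    # Euclidean distance from every point to every class mean.
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return np.array([classes[i] for i in d.argmin(axis=1)])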

Project
Experiment with the NIPS 2003 feature selection datasets:
– Goal: achieve the highest possible prediction accuracy using scripts we will develop through the course
– Predict the labels of the given datasets with two different classifiers

Exams
One mid-semester exam and a final exam.
What to expect on the exams:
– Basic conceptual understanding of machine learning techniques
– Ability to apply techniques to simple datasets
– Basic runtime and memory requirements
– Simple modifications

Grade breakdown
– Assignments and project: 50%
– Exams: 50%