Published by Hilkka Annikki Laakso. Modified over 5 years ago.
1
Machine Learning – a Probabilistic Perspective
Introduction Cui Jiaqi
2
The goal of machine learning is to develop methods that can automatically detect patterns in data, and then to use the uncovered patterns to predict future data or other outcomes of interest.
3
Types of machine learning
- Type 1: predictive, or supervised, learning
- Type 2: descriptive, or unsupervised, learning
- Type 3: reinforcement learning: learning how to act or behave when given occasional reward or punishment signals
4
Supervised learning
Classification:
- the goal is to make predictions on novel inputs
- the best guess is the MAP (maximum a posteriori) estimate, $\hat{y} = \arg\max_c p(y = c \mid x, D)$
Regression: like classification, except the response variable is continuous
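The MAP estimate mentioned above can be sketched in a few lines. This is only an illustration: the class names and the probabilities in `posterior` are made up, and `map_estimate` is a hypothetical helper, not something from the slides.

```python
def map_estimate(class_probs):
    """Return the class label with the highest posterior probability."""
    return max(class_probs, key=class_probs.get)

# Hypothetical posterior p(y = c | x, D) for a single input x.
posterior = {"setosa": 0.1, "versicolor": 0.7, "virginica": 0.2}
best_guess = map_estimate(posterior)
```

Here `best_guess` is the most probable label. When the largest probability is far from 1, the prediction is uncertain, and it may be better to report the full distribution than a single label.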
5
Unsupervised learning
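We are just given output data, without any inputs.
Two differences from the supervised case:
- we model $p(x_i \mid \theta)$ instead of $p(y_i \mid x_i, \theta)$: supervised learning is conditional density estimation, whereas unsupervised learning is unconditional density estimation
- $x_i$ is a vector of features, so we need to create multivariate probability models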
6
Discovering clusters
- estimate the distribution over the number of clusters, $p(K \mid D)$
- estimate which cluster each point belongs to: let $z_i \in \{1, \ldots, K\}$ represent the cluster to which data point $i$ is assigned
7
Discovering latent factors
- dimensionality reduction, e.g. principal components analysis (PCA)
Discovering graph structure
- given a set of correlated variables, discover which ones are most correlated with which others, represented as a graph G
- two uses: to discover new knowledge, and to get better joint probability density estimators
Matrix completion
8
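K-nearest neighbors
$p(y = c \mid x, D, K) = \frac{1}{K} \sum_{i \in N_K(x, D)} \mathbb{I}(y_i = c)$
- $N_K(x, D)$ are the (indices of the) K nearest points to x in D
- $\mathbb{I}(e)$ is the indicator function: $\mathbb{I}(e) = 1$ if e is true, and 0 if e is false
- the usual distance metric is Euclidean distance
- KNN does not work well with high-dimensional inputs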
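A minimal sketch of a K-nearest-neighbors classifier, assuming Euclidean distance and a training set stored as `(point, label)` pairs. The function name `knn_class_probs` and the toy data are illustrative assumptions.

```python
import math
from collections import Counter

def knn_class_probs(x, data, k):
    """Empirical p(y = c | x, D, K): the fraction of the K nearest
    training points (by Euclidean distance) that carry label c."""
    nearest = sorted(data, key=lambda pair: math.dist(x, pair[0]))[:k]
    counts = Counter(label for _, label in nearest)
    return {label: n / k for label, n in counts.items()}

# Toy training set D of (point, label) pairs; values are made up.
train = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((1.0, 1.0), "b"),
         ((0.9, 1.1), "b"), ((0.2, 0.1), "a")]
probs = knn_class_probs((0.0, 0.1), train, k=3)
```

Sorting the whole training set costs O(N log N) per query, which is fine for a sketch but slow for large N. Classes that do not appear among the K neighbors are simply absent from the returned dictionary, meaning probability 0.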
9
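Linear regression
- model: $y(x) = w^T x + \epsilon$
- the connection between linear regression and Gaussians: if the noise $\epsilon$ is Gaussian, then $p(y \mid x, \theta) = \mathcal{N}(y \mid \mu(x), \sigma^2(x))$ with $\mu(x) = w^T x$
- polynomial regression: replace x with a basis function expansion $\phi(x) = [1, x, x^2, \ldots, x^d]$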
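To make the link between linear regression and Gaussians concrete, here is a sketch of a one-dimensional least-squares fit. It assumes a single scalar input and constant-variance Gaussian noise, in which case the least-squares weights coincide with the maximum-likelihood weights. The helper `fit_line` and the data are illustrative.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = w0 + w1 * x.

    Under a Gaussian noise model with constant variance, these are
    also the maximum-likelihood estimates of w0 and w1.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w1 = sxy / sxx
    w0 = mean_y - w1 * mean_x
    return w0, w1

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # noise-free toy data lying on y = 1 + 2x
w0, w1 = fit_line(xs, ys)
```

Polynomial regression uses the same idea with more input columns (x, x squared, and so on), which requires solving a small linear system rather than the closed form used here.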
10
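Logistic regression
- generalizes linear regression to the (binary) classification setting
- replace the Gaussian distribution for y with a Bernoulli distribution: $p(y \mid x, w) = \mathrm{Ber}(y \mid \mu(x))$
- compute a linear combination of the inputs, but then pass this through a function that ensures $0 \le \mu(x) \le 1$, by defining $\mu(x) = \mathrm{sigm}(w^T x)$, where $\mathrm{sigm}(\eta) = 1 / (1 + e^{-\eta})$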
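The squashing step can be sketched as follows, assuming the standard logistic (sigmoid) function. The weights `w` and input `x` are hypothetical values chosen so that the linear combination is exactly zero.

```python
import math

def sigmoid(eta):
    """Logistic (sigmoid) function; maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def predict_proba(w, x):
    """mu(x) = sigm(w^T x): probability that y = 1 under the Bernoulli model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

w = [-1.0, 2.0]  # hypothetical weights; the first entry acts as a bias
x = [1.0, 0.5]   # first feature fixed to 1 so that w[0] is the bias term
mu = predict_proba(w, x)       # here w^T x = 0
label = 1 if mu > 0.5 else 0   # threshold the probability at 0.5
```

Thresholding `mu` at 0.5 gives a decision rule that is linear in x, since `mu > 0.5` exactly when the linear combination is positive.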
11
Logistic regression
12
Model selection
- misclassification rate: the fraction of examples the classifier labels incorrectly
- validation set: use about 80% of the data for the training set and 20% for the validation set
- cross validation
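The three ideas on this slide can be sketched together. The helper names, the fixed seed, and the toy data are assumptions for illustration. The fold construction shown is one simple way to do K-fold cross validation, not necessarily the one the presenter had in mind.

```python
import random

def train_validation_split(data, train_fraction=0.8, seed=0):
    """Shuffle a copy of the data and split it into training and validation parts."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def misclassification_rate(predict, labelled):
    """Fraction of (x, y) pairs for which predict(x) != y."""
    errors = sum(1 for x, y in labelled if predict(x) != y)
    return errors / len(labelled)

def k_folds(data, k):
    """Yield (train, validation) pairs; each item is held out exactly once."""
    for i in range(k):
        validation = data[i::k]
        train = [d for j, d in enumerate(data) if j % k != i]
        yield train, validation

# Toy labelled data: y = 1 when x >= 5, else 0.
data = [(x, int(x >= 5)) for x in range(10)]
train, valid = train_validation_split(data)
rate = misclassification_rate(lambda x: int(x >= 5), valid)
```

With cross validation, the model is fitted on each `train` part in turn and the validation error is averaged over the folds, which uses the data more efficiently than a single 80/20 split.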