Patterson: Chap 1 A Review of Machine Learning


1 Patterson: Chap 1 A Review of Machine Learning
Dr. Charles Tappert
The information here, although greatly condensed, comes almost entirely from the chapter content.

2 This Chapter
Because the focus of this book is on deep learning, this first chapter presents only a rough review of the classical methods employed in machine learning
These classical methods are covered in more detail in the Duda textbook

3 The Learning Machines
Definition: Machine learning is using algorithms to extract information from raw data and represent it in some type of model
Deep learning emerged around 2006, and deep learning systems are now winning the important machine learning competitions

4 The Learning Machines AI and Deep Learning

5 The Learning Machines Biological Inspiration
Biological neural networks (brains) contain:
Roughly 86 billion neurons
Over 500 trillion connections between neurons
Biological neural networks are much more complex than artificial neural networks (ANNs)
Main properties of ANNs:
The basic unit is the artificial neuron (node)
We can train ANNs to pass along only useful signals

6 The Learning Machines What is Deep Learning?
For the purposes of this book, we define deep learning as neural networks with a large number of parameters and layers in one of four fundamental network architectures:
Unsupervised pretrained networks
Convolutional neural networks
Recurrent neural networks
Recursive neural networks

7 The Learning Machines Going Down the Rabbit Hole
Deep learning has penetrated the computer science consciousness beyond most techniques in recent history
Deep learning models achieve top-flight accuracy
This initiates many philosophical discussions:
Can machines be creative? What is creativity?
Can machines be as intelligent as humans?

8 Framing the Questions
The basics of machine learning are best understood by asking the correct questions:
What is the input data?
What kind of model is best for the data?
What kind of answer would we like to elicit from new data based on this model?

9 Math Behind Machine Learning
Linear Algebra: scalars, vectors, matrices, tensors, hyperplanes, solving systems of equations
Probability and Statistics: conditional probabilities, Bayes' theorem, probability distributions
Students are expected to have the math background for this course

10 How Does Machine Learning Work?
Fundamentally, machine learning is based on algorithmic techniques that minimize the error in Ax = b through optimization, where:
A is a matrix whose rows are the input vectors
x is the weight vector
b is a column vector of output labels
Essentially, we want to determine x = A⁻¹b, but it is usually not possible to invert A, so instead we seek the x that minimizes the error in Ax = b
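Since A is usually not invertible, a least-squares solve is the standard workaround. A minimal NumPy sketch, with synthetic data made up purely for illustration:

```python
import numpy as np

# Hypothetical data: rows of A are input examples, b holds the labels.
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 3))        # 100 samples, 3 features (not square!)
x_true = np.array([2.0, -1.0, 0.5])  # weights we hope to recover
b = A @ x_true                       # noise-free labels, for clarity

# A has no inverse, so instead minimize ||Ax - b|| with a least-squares solver.
x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
```

With noise-free data the recovered `x_hat` matches `x_true` almost exactly; with real, noisy data it is the best fit in the least-squares sense.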

11 How Does Machine Learning Work? Regression, especially Linear
Regression attempts to find a function that describes the relationship between input x and output y
For linear regression, y = a + bx, where a is the intercept and b is the slope
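A minimal sketch of fitting y = a + bx with NumPy; the data here is a made-up toy example generated exactly from a known line:

```python
import numpy as np

# Toy data generated exactly from y = a + b*x with a = 1.0 and b = 2.0
# (values chosen only for illustration).
x = np.linspace(0, 10, 50)
y = 1.0 + 2.0 * x

# np.polyfit with degree 1 fits a line; it returns [slope, intercept].
slope, intercept = np.polyfit(x, y, 1)
```

Because the toy data lies exactly on a line, the fit recovers a and b to machine precision; with noisy data it returns the least-squares line.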

12 How Does Machine Learning Work? Classification
The model attempts to find classes based on a set of input features
The dependent variable y is categorical rather than numerical
The binary classifier is the most basic; for example, whether or not someone has a disease

13 How Does Machine Learning Work? Clustering
Clustering is unsupervised learning that usually involves a distance measure and iteratively moves similar items closer together
At the end of the process, the items are clustered densely around n centroids
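One common clustering algorithm of this kind is k-means; a bare-bones sketch (the point data and the naive "first k points" initialization are made up for illustration):

```python
import numpy as np

def kmeans(points, k, iters=10):
    """Plain k-means: assign each point to its nearest centroid, then
    move each centroid to the mean of the points assigned to it."""
    centroids = points[:k].astype(float).copy()  # naive init: first k points
    for _ in range(iters):
        # distance from every point to every centroid
        d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return centroids, labels

# Two well-separated groups of 2D points (made-up data).
pts = np.array([[0, 0], [0, 1], [1, 0],
                [10, 10], [10, 11], [11, 10]], dtype=float)
centroids, labels = kmeans(pts, 2)
```

On this toy data the two centroids settle near the centers of the two groups, matching the slide's picture of items clustered densely around n centroids.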

14 How Does Machine Learning Work? Underfitting and Overfitting

15 How Does Machine Learning Work? Optimization
Parameter optimization is the process of adjusting the weights to produce accurate estimates of the data
Convergence of an optimization algorithm finds the parameters providing the smallest error across the training samples
The optimization function guides the learning toward a solution of least error

16 How Does Machine Learning Work? Convex Optimization
Convex optimization deals with convex cost functions

17 How Does Machine Learning Work? Gradient Descent
The gradient is a vector of the n partial derivatives of the function f, a generalization of the 1D derivative
Problems: local minima and non-normalized features
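A minimal sketch of gradient descent on a simple convex function; the function, starting point, and step size are all made up for illustration:

```python
import numpy as np

# Minimize the convex function f(w) = ||w - target||^2.
# Its gradient is 2 * (w - target), so stepping against the
# gradient moves w toward the minimizer.
target = np.array([3.0, -1.0])

def grad(w):
    return 2.0 * (w - target)

w = np.zeros(2)   # starting point
lr = 0.1          # learning rate (step size)
for _ in range(200):
    w = w - lr * grad(w)
```

Each step shrinks the distance to the minimizer by a constant factor here; on non-convex functions the same update can instead get stuck in a local minimum, as the slide warns.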

18 How Does Machine Learning Work? Stochastic Gradient Descent (SGD)
Stochastic gradient descent calculates the gradient and updates the parameter vector after each training sample, whereas batch gradient descent calculates the gradient and updates the parameter vector over all training samples
The SGD method speeds up learning
A variant of SGD, called mini-batch, uses more than a single training sample per iteration and leads to smoother convergence
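A mini-batch SGD sketch for the linear model Ax = b from earlier; the data, batch size, and learning rate are made-up illustration values:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(200, 2))     # 200 samples, 2 features
x_true = np.array([1.5, -0.5])
b = A @ x_true                    # noise-free labels

x = np.zeros(2)                   # parameter vector to learn
lr, batch = 0.05, 10
for epoch in range(100):
    perm = rng.permutation(len(A))      # shuffle samples each epoch
    for i in range(0, len(A), batch):
        idx = perm[i:i + batch]
        err = A[idx] @ x - b[idx]             # residual on this mini-batch
        x -= lr * (A[idx].T @ err) / batch    # gradient step on the batch
```

Setting `batch = 1` gives plain SGD; setting `batch = len(A)` recovers batch gradient descent, with mini-batch sizes in between trading noise for speed.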

19 How Does Machine Learning Work? Generative vs Discriminative Models
Two major model types: generative and discriminative
Generative models understand how the data were created in order to generate an output; they generate likely output, such as art similar to that of a well-known artist
Discriminative models simply give us a classification or category for a given input; they are typically used for classification in machine learning

20 Logistic Regression
Logistic regression is a well-known linear classification model
It handles binary classification as well as multiple labels
The dependent variable is categorical (e.g., a class label)
We have three components to solve for our parameter vector x:
A hypothesis about the data
A cost function (maximum likelihood estimation)
An update function (derivative of the cost function)

21 Logistic Regression The Logistic Function
The logistic function is defined as f(x) = 1 / (1 + e^(-x))
This function is useful because it maps the input range of -infinity to +infinity into the output range 0 to 1, which can be interpreted as a probability
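The definition above translates directly into code; a minimal sketch:

```python
import math

def logistic(x):
    """The logistic (sigmoid) function: 1 / (1 + e^(-x)).
    Maps any real input into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# logistic(0) == 0.5; large positive inputs approach 1,
# large negative inputs approach 0.
```

The midpoint value 0.5 at x = 0 is the natural decision threshold when the output is read as the probability that y = 1.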

22 Logistic Regression Understanding Logistic Regression Output
The logistic function is often denoted with the Greek letter sigma because the graph representation resembles an elongated “s” whose max and min asymptotically approach 1 and 0, respectively f(x) represents the probability that y equals 1 (i.e., true)

23 Evaluating Models The Confusion Matrix
Various measures can be read off the confusion matrix: e.g., Accuracy = (TP + TN) / (TP + TN + FP + FN), where TP, TN, FP, and FN are the counts of true positives, true negatives, false positives, and false negatives
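A minimal sketch of counting the confusion-matrix cells and computing accuracy for a binary classifier; the label lists are made-up example data:

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

y_true = [1, 1, 0, 0, 1, 0]   # actual labels
y_pred = [1, 0, 0, 1, 1, 0]   # classifier outputs
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
acc = (tp + tn) / (tp + tn + fp + fn)   # (2 + 2) / 6
```

The same four counts feed the other common measures (precision, recall, F1), which weigh the two kinds of error differently than plain accuracy does.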

24 Building an Understanding of Machine Learning
In this chapter, we introduced the core concepts needed for practicing machine learning
The core mathematical concept of modeling is based around the equation Ax = b
We looked at the core ideas of getting features into the matrix A, ways to change the parameter vector x, and setting the outcomes in the vector b

