CSE344/544 Machine Learning
Richa Singh
Lecture slides are prepared using several teaching resources, and no authorship is claimed for any slide.
Some quotes
- “A breakthrough in machine learning would be worth ten Microsofts” (Bill Gates, Chairman, Microsoft)
- “Machine learning is the next Internet” (Tony Tether, former director, DARPA)
- “Machine learning is the hot new thing” (John Hennessy, President, Stanford)
- “Web rankings today are mostly a matter of machine learning” (Prabhakar Raghavan, former Dir. Research, Yahoo)
- “Machine learning is going to result in a real revolution” (Greg Papadopoulos, former CTO, Sun)
A scientific field is best defined by the central question it studies.
Defining question of CS
We might say the defining question of Computer Science is “How can we build machines that solve problems, and which problems are inherently tractable/intractable?”
Defining question of Statistics
The question that largely defines Statistics is “What can be inferred from data plus a set of modelling assumptions, with what reliability?”
What is Machine Learning
The field of Machine Learning seeks to answer the question: “How can we build computer systems that automatically improve with experience, and what are the fundamental laws that govern all learning processes?”
This question covers a broad range of learning tasks:
- How to design autonomous mobile robots that learn to navigate from their own experience
- How to mine historical medical records to learn which future patients will respond best to which treatments
- How to build search engines that automatically customize to their user’s interests
What is Machine Learning
- Develop methods that can automatically detect patterns in data, and then use the uncovered patterns to predict future data or other outcomes of interest [Murphy]
- Extract important patterns and trends, and understand “what the data says” [Hastie, Tibshirani & Friedman]
Formally:
In ML, we say that a machine learns with respect to a particular task T, performance metric P, and type of experience E if the system reliably improves its performance P at task T following experience E. Depending on how we specify T, P, and E, the learning task might also be called by names such as data mining, autonomous discovery, database updating, or programming by example.
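The T, P, E definition can be made concrete with a toy sketch. Everything here is an assumption chosen for illustration and is not from the slides: the task T is labelling a number as 1 if it is at least 5 (a rule hidden from the learner), the performance metric P is accuracy on ten fixed test points, and the experience E is a growing list of labelled examples used by a 1-nearest-neighbour learner.

```python
def true_label(x):
    """Hidden rule that defines task T (unknown to the learner)."""
    return 1 if x >= 5 else 0


def predict(experience, x):
    """1-nearest-neighbour: copy the label of the closest example seen so far."""
    _, nearest_label = min(experience, key=lambda ex: abs(ex[0] - x))
    return nearest_label


def performance(experience, test_points):
    """Performance metric P: fraction of test points labelled correctly."""
    correct = sum(predict(experience, x) == true_label(x) for x in test_points)
    return correct / len(test_points)


# Fixed test set and a small pool of labelled examples (hypothetical toy data).
test_points = [i + 0.5 for i in range(10)]
all_examples = [(x, true_label(x)) for x in (0.0, 6.0, 4.0)]

# Experience E grows from 1 to 3 examples; P is measured after each step.
scores = [performance(all_examples[:n], test_points) for n in (1, 2, 3)]
print(scores)  # [0.5, 0.8, 1.0]
```

In this sketch accuracy rises from 0.5 to 1.0 as examples are added, which is the sense in which the system "improves its performance P at task T following experience E".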
Where it is currently explored
ML approaches are widely used wherever there is no closed-form solution or direct computer program for a problem.
ML (by examples)
- Classification (from data to discrete classes): spam filtering, object detection, weather prediction
- Regression: weather prediction, stock market, real estate
- Ranking: web search, find similar images
- Collaborative filtering: recommendation systems
- Clustering: group similar things, clustering web search results
Machine Learning: A Case Study
Malfunctioning gearboxes have caused CH-46 US Navy helicopters to crash. Although a gearbox malfunction can be diagnosed by a mechanic before takeoff, what if a malfunction occurs in flight, when it is impossible for a human to detect? Machine Learning was shown to be useful in this domain and thus to have the potential to save human lives!
How did it Work?
Consider the following common situation: you are in your car, speeding away, when you suddenly hear a “funny” noise. To prevent an accident, you slow down, and either stop the car or bring it to the nearest garage. The in-flight helicopter gearbox fault monitoring system was designed following the same idea. The difference, however, is that many gearbox malfunctions cannot be heard by humans and must be monitored by a machine.
So, Where’s the Learning?
Imagine that, instead of driving your good car, you were asked to drive this truck: Would you know a “funny” noise from a “normal” one? Well, probably not, since you’ve never driven a truck before! While you drove your car during all these years, you effectively learned what your car sounds like, and this is why you were able to identify that “funny” noise.
What did the Computer Learn?
Obviously, a computer cannot hear and certainly cannot distinguish between a normal and an abnormal sound. Sounds, however, can be represented as a wave pattern, which is in fact a series of real numbers. And computers can deal with strings of numbers! For example, a computer can easily be programmed to distinguish between strings of numbers that contain a “3” in them and those that don’t.
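A minimal sketch contrasting the two ideas on this slide: a rule that is programmed by hand (does a sequence contain a 3?) and a notion of "normal sound" that is learned from example recordings. This is not the actual helicopter monitoring system. The RMS feature, the threshold `k`, the toy recordings, and all function names are assumptions made for illustration, and "contains a 3" is read here as "the value 3 appears in the sequence".

```python
import math
import statistics


def contains_three(numbers):
    """Hand-written rule: True if the value 3 appears in the sequence."""
    return 3 in numbers


def rms(signal):
    """Root-mean-square amplitude of a sequence of real-valued samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def fit_normal_profile(normal_signals):
    """Learn what 'normal' looks like: mean and spread of RMS over normal recordings."""
    values = [rms(s) for s in normal_signals]
    return statistics.mean(values), statistics.stdev(values)


def is_abnormal(signal, profile, k=3.0):
    """Flag a signal whose RMS lies more than k standard deviations from the learned mean."""
    mean, std = profile
    return abs(rms(signal) - mean) > k * std


# Toy 'normal' recordings, invented for illustration only.
normal_recordings = [
    [0.10, -0.10, 0.10, -0.10],
    [0.12, -0.12, 0.12, -0.12],
    [0.09, -0.09, 0.09, -0.09],
    [0.11, -0.11, 0.11, -0.11],
]
profile = fit_normal_profile(normal_recordings)

print(contains_three([1, 2, 3]))                          # True
print(is_abnormal([0.10, -0.10, 0.10, -0.10], profile))   # False
print(is_abnormal([0.90, -0.90, 0.90, -0.90], profile))   # True
```

The first function never changes with data, while the profile returned by `fit_normal_profile` depends entirely on the recordings it was given, much as a driver's sense of a "funny" noise depends on the car they have driven.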
Machine Learning in Daily Life
- Google search
- Suggestions on social networking websites
- Spam filtering
- Smart navigation
- Autopilots
- Auto-parking
- Automatic cars
- …
Types of Learning
- Supervised: training samples are given and their classes are known
- Unsupervised: training samples are given and their classes are unknown
- Reinforcement learning: labelled training samples are not given, but the available actions are known, and the learner improves from the reward feedback its actions receive
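The difference between the first two settings can be sketched on the same toy data. This is only an illustration under assumed choices: the 1-D samples, the labels "a" and "b", the nearest-class-mean classifier, and the two-cluster k-means routine are not from the slides. Reinforcement learning is not covered by this sketch.

```python
def fit_supervised(samples, labels):
    """Supervised: class labels are given, so learn one mean per known class."""
    groups = {}
    for x, y in zip(samples, labels):
        groups.setdefault(y, []).append(x)
    return {y: sum(xs) / len(xs) for y, xs in groups.items()}


def predict_supervised(class_means, x):
    """Assign x to the class whose mean is closest."""
    return min(class_means, key=lambda y: abs(class_means[y] - x))


def cluster_unsupervised(samples, iterations=10):
    """Unsupervised: no labels, so discover two groups with 1-D k-means (k=2)."""
    centers = [min(samples), max(samples)]
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for x in samples:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            clusters[nearest].append(x)
        # Keep the old center if a cluster happens to be empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return [sorted(c) for c in clusters]


samples = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
labels = ["a", "a", "a", "b", "b", "b"]  # used only in the supervised case

class_means = fit_supervised(samples, labels)
print(predict_supervised(class_means, 1.1))  # a
print(predict_supervised(class_means, 4.0))  # b
print(cluster_unsupervised(samples))         # [[0.8, 1.0, 1.2], [4.8, 5.0, 5.2]]
```

The supervised learner can name the class of a new sample because it was told the names; the unsupervised learner recovers the same two groups but has no names for them.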
Designing a Learning System: An Example
1. Problem Description
2. Choosing the Training Experience
3. Choosing the Target Function
4. Choosing a Representation for the Target Function
5. Choosing a Function Approximation Algorithm
6. Final Design
Issues in Machine Learning
- What algorithms are available for learning a concept? How well do they perform?
- How much training data is sufficient to learn a concept with high confidence?
- When is it useful to use prior knowledge?
- Are some training examples more useful than others?
- What are the best tasks for a system to learn?
- What is the best way for a system to represent its knowledge?
Topics to be Covered
- Concept learning
- Instance based learning
- Decision trees
- Neural network
- Support vector machine and kernel machines
- Reinforcement learning
- Genetic algorithm
- Evolutionary learning algorithms
- Boosting and bagging
- Unsupervised learning
- Regression
Note: for topics already discussed in the Pattern Recognition course, we will study some advanced versions here along with a recap of the basics.
Administrative
Grading
- Assignments, Critiques and Seminar: 30%
- Exams: 35%
- Project: 25%
- Quiz (surprise and announced): 10%
Textbook: Tom Mitchell, Machine Learning
Reference Books:
- C. Bishop, Pattern Recognition and Machine Learning, Springer
- K. Murphy, Machine Learning: a Probabilistic Perspective, MIT Press
Project team size: 2 students
Assignments: individually
Continuous evaluation and absolute grading
Zero tolerance policy for cheating and plagiarism
- First cheating (in assignments/reviews/…): One grade lower
- Second cheating: F grade
- Any cheating in exams: F grade
Course website: / / /
Course mailing list:
Concepts required from PR course
- General idea of pattern classification
- Statistical pattern recognition: A review by A. Jain, R.P.W. Duin and J. Mao (already emailed to the students who pre-registered and will be made available on the course website)
- Evaluation techniques
Slides will be made available on the course website
A recap lecture can be arranged if you require one