Rise of the Machines
Jack Concanon

Machine learning has become a huge business in the past decade, and you will likely interact with a learning machine every day. Many large companies use some form of machine learning to deal with the large amounts of data produced by an internet-connected world. In this talk I will give a brief introduction to what machine learning is, its history, why it is useful, and some examples of machine learning implementations.
An example of a clustering problem

Amazon groups items together using information from customers, such as their viewing and purchasing history. This allows it to suggest other items that a customer may be interested in. It also uses on-line learning: the pool of experience is constantly updated to reflect changes in user behaviour.
Google mines all searches to allow it to suggest possible searches

Hilarity ensues.
All examples of on-line unsupervised learning

Will match a user to a genre of films.
“A field of study that gives computers the ability to learn without being explicitly programmed” - Arthur Samuel

Arthur Samuel was a pioneer of artificial intelligence. He created the first checkers program for IBM, recognised as the first board-game-playing program with learning capabilities. The program could use previous games as a source of moves and was trained by playing against Samuel. He recognised that, given enough training data, a machine could use this experience when calculating possible moves.
“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E” - Tom M. Mitchell

Tom Mitchell, an American computer scientist, gave a more formal definition that lets us see how to build a mathematical model of a learning machine.
Can Machines Think?
Can machines do what we do? Can they pretend? How much help do they need?

Alan Turing asked whether machines can think. This is currently not possible, but machines that follow the same principles as a human brain can be made. The question of machine learning is better asked as: can machines replicate what a human brain does? Much as we learn new things through repetition, a machine can learn a specific output based on input. This is an example of supervised learning, or classification. A learning machine is extremely dependent on the data provided; if the data does not contain any obvious connection then no useful learning will take place. In fact, the opposite will happen: the learning machine will make connections in the data that are not representative of the problem. It is not possible to remove the human element from machine learning, even for the best learning machines in the world, such as IBM’s Watson. Machines are not capable of asking questions, only of analysing data to give approximate answers.
Why Now?
http://internet-map.net

Why should we be interested in machine learning? There is a massive amount of data available, computers are available at all times and the Internet is available almost everywhere. How much data is produced each minute? YouTube users upload 48 hours of video, Facebook users share 684,478 pieces of content, Instagram users share 3,600 new photos, and Tumblr sees 27,778 new posts published. How do we sort and categorise this information? This is an ideal scenario for machines to learn on. By using machine learning for things such as ‘market basket analysis’, Amazon is able to offer recommended items and Netflix is able to offer personalised suggestions. Without machine learning it would be an incredible task to provide this level of feedback for each user.

A constant increase in machine power means that algorithms that were previously too expensive to use can now run in near real time. The neural network was originally regarded as capable but impractical because of the computing resources required.

The image above comes from internet-map.net; the large circles in the middle are Yahoo, Google, Facebook and Bing, all of which have user-interactive websites as their core business. These websites rely on machine learning to operate and would not be possible without it.

Machine learning can automate more and more human responsibilities, even ones that require some sort of intelligence, such as handwriting recognition. Postal services can now sort post automatically using handwriting recognition instead of having humans sort it manually. This provides a massive increase in speed and accuracy.
Supervised Learning
Requires labelled training examples; function approximation; based on an input vector you should receive an expected output; linear regression is the simplest; line of best fit is a valid tool for prediction.

Labelled training examples are, for example, a set of emails flagged as spam or not spam. A learning algorithm can learn from these inputs and then give a reasonable prediction for future unknown inputs. One of the most basic machine learning tools is linear regression, which simply creates a line of best fit through the X and Y values. The line of best fit can then be used to give a reasonable prediction for a new set of inputs.
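To make the idea of learning from labelled examples concrete, here is a minimal sketch in Python. It uses a 1-nearest-neighbour classifier, which is a technique I have swapped in for illustration and is not one named in the talk. The feature choice (suspicious word count, link count) and every number in `TRAINING` are hypothetical.

```python
import math

# Hypothetical labelled training examples: (features, label).
# Features are [suspicious word count, link count]; all values are made up.
TRAINING = [
    ([8.0, 5.0], "spam"),
    ([6.0, 7.0], "spam"),
    ([7.0, 4.0], "spam"),
    ([0.0, 1.0], "not spam"),
    ([1.0, 0.0], "not spam"),
    ([2.0, 1.0], "not spam"),
]


def predict(features, training=TRAINING):
    """Return the label of the closest labelled example (1-nearest-neighbour)."""
    _, nearest_label = min(
        training, key=lambda example: math.dist(example[0], features)
    )
    return nearest_label


if __name__ == "__main__":
    # An unseen email with many suspicious words and links.
    print(predict([7.0, 6.0]))
    # An unseen email with almost none.
    print(predict([1.0, 1.0]))
```

The point is only that the labels do the teaching: the function never contains a rule for what spam is, it just compares a new input with examples a human has already flagged.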
Unsupervised Learning
Unclassified data; finding hidden structures; clustering; data mining.

Unsupervised learning is very useful if we have a large quantity of unknown data and want to see whether there are any patterns in it. For example, given the contents of shopping baskets from supermarkets, we can use data mining techniques to identify common trends, such as people buying certain products together. This information can then be used to optimise the layout of the store to increase sales. A common urban legend is that Walmart found that people who bought nappies were also buying beer. Unsupervised learning is commonly used online in the form of on-line learning; for example, Amazon can keep up to date with the latest shopping trends by constantly retraining its algorithms on the fly.
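The shopping basket idea can be sketched in a few lines of Python. This is only a simple pair count over unlabelled baskets, not a full market basket analysis, and the contents of `BASKETS` are invented to echo the nappies and beer story.

```python
from collections import Counter
from itertools import combinations

# Hypothetical, unlabelled shopping baskets (invented for illustration).
BASKETS = [
    ["nappies", "beer", "milk"],
    ["nappies", "beer"],
    ["bread", "milk"],
    ["nappies", "beer", "bread"],
    ["milk", "bread", "eggs"],
]


def count_pairs(baskets):
    """Count how often each pair of products appears in the same basket."""
    pair_counts = Counter()
    for basket in baskets:
        # Sort so that (beer, nappies) and (nappies, beer) count as one pair.
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    return pair_counts


if __name__ == "__main__":
    for pair, count in count_pairs(BASKETS).most_common(3):
        print(pair, count)
```

Nobody told the program which products belong together; the structure comes out of the data, which is the sense in which this is unsupervised.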
Linear Regression Example

Very simple to implement. Here we plot temperature against the chirp rate of crickets. There is a positive correlation, which allows us to predict either the temperature or the chirp rate from the other. We add a column of ones to the input so that the algorithm can learn an intercept; without it, the fitted line would be forced through the origin.
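Here is a hedged sketch of why the column of ones matters. It is not the code from the slide; it solves the least-squares normal equations by hand for a two-column input. The values in `CHIRPS` and `TEMPS` are invented to lie exactly on a line (they are not real cricket measurements), and the function names are my own.

```python
# Invented data lying exactly on temperature = 25 + 3 * chirps.
CHIRPS = [14.0, 15.0, 16.0, 17.0, 18.0]
TEMPS = [67.0, 70.0, 73.0, 76.0, 79.0]


def add_ones_column(xs):
    """Turn each input x into the row [1, x] so an intercept can be learned."""
    return [[1.0, x] for x in xs]


def fit_normal_equation(design, ys):
    """Least-squares fit for a two-column design matrix.

    Solves (X^T X) theta = X^T y with Cramer's rule and returns
    (intercept, slope).
    """
    a = sum(row[0] * row[0] for row in design)
    b = sum(row[0] * row[1] for row in design)
    d = sum(row[1] * row[1] for row in design)
    p = sum(row[0] * y for row, y in zip(design, ys))
    q = sum(row[1] * y for row, y in zip(design, ys))
    det = a * d - b * b
    intercept = (p * d - b * q) / det
    slope = (a * q - b * p) / det
    return intercept, slope


def fit_through_origin(xs, ys):
    """Least-squares slope when there is no column of ones (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


if __name__ == "__main__":
    print(fit_normal_equation(add_ones_column(CHIRPS), TEMPS))
    print(fit_through_origin(CHIRPS, TEMPS))
```

With the ones column the fit can recover both the intercept and the slope of the invented line; without it, the only free parameter is the slope, so the line has to tilt to compensate for being pinned at the origin.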
Neural Networks

A neural network seeks to represent an actual process in the brain. It is completely generic, i.e. it does not know what data it is representing. A neural network must always have an input layer and an output layer; a hidden layer is frequently added because it helps to represent all types of data. A two-layer network is only capable of linear and logistic regression. The input layer contains a node for each feature in your data (the number of columns in your input vector), and the output node returns a value: either a classification (0 or 1) or a regression (e.g. a price).

The hidden layer transforms the input layer into something the output layer can use. It can detect various features in the input data; for example, a hidden layer might become sensitive to eyes or wheels in an input picture. This is not explicitly programmed but inferred by the neural network. The input layer could be something basic like pixel intensity, but a hidden layer will be able to detect features within that data. It provides a layer of abstraction from the input.

A neural network represents data through connection weights (values between nodes), which are randomly initialised at start-up. It is truly a black box system: given the same input data, two separately trained networks can be entirely different internally. It takes several iterations of training and adjusting to get a neural network to behave how you want.
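The layer structure described here can be sketched as a forward pass in plain Python. This is an illustrative sketch, not the tool from the talk: the sigmoid activation, the bias weights, the layer sizes and the names `make_network` and `forward` are all my assumptions.

```python
import math
import random


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def make_network(n_inputs, n_hidden, seed):
    """Create randomly initialised weights for one hidden layer and one output node.

    Each weight list starts with a bias weight (an assumption of this sketch).
    """
    rng = random.Random(seed)
    hidden = [
        [rng.uniform(-1.0, 1.0) for _ in range(n_inputs + 1)]
        for _ in range(n_hidden)
    ]
    output = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden + 1)]
    return hidden, output


def forward(network, inputs):
    """Feed an input vector through the hidden layer to the single output node."""
    hidden_weights, output_weights = network
    hidden_out = [
        sigmoid(w[0] + sum(wi * x for wi, x in zip(w[1:], inputs)))
        for w in hidden_weights
    ]
    return sigmoid(
        output_weights[0]
        + sum(wi * h for wi, h in zip(output_weights[1:], hidden_out))
    )


if __name__ == "__main__":
    features = [0.2, 0.9, 0.4, 0.7]  # one value per input node
    net_a = make_network(n_inputs=4, n_hidden=3, seed=1)
    net_b = make_network(n_inputs=4, n_hidden=3, seed=2)
    print(forward(net_a, features), forward(net_b, features))
```

Two networks built with different seeds start from different weights, which is one reason the internals of a trained network are hard to interpret or compare.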
Neural Network Visualisation

Here we can see how input data has an effect on the weights connecting the layers. Certain features in the input data will create ‘stronger’ links as they are more frequently activated. Then, when we apply an unknown input, we can make a good guess at the classification or value based on which internal weights it activates.
Neural networks are incredibly slow to train, because they require a large amount of training data to be accurate. The most common way of preparing image data is to scale the image (32 x 32 is the most frequent size) and then remove the colour information. Generally colour does not improve the quality of the neural network unless colour is something you are interested in as a feature. Removing the colour information from the training data actually speeds up network convergence quite dramatically. As an example of the amount of training data and processing power required to make a good network: Google’s facial recognition testing used 10 million training images and ran the neural network across 1000 machines (16,000 cores in total) for three days in order to train the network.
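The preparation step can be sketched without any imaging library. This is an assumption-laden illustration: it represents an image as a nested list of (r, g, b) tuples, scales with nearest-neighbour sampling, and uses the standard ITU-R BT.601 luminance weights for greyscale. The function names are hypothetical.

```python
def resize_nearest(pixels, width, height):
    """Scale a 2-D grid of pixels to width x height by nearest-neighbour sampling."""
    src_height = len(pixels)
    src_width = len(pixels[0])
    return [
        [pixels[y * src_height // height][x * src_width // width] for x in range(width)]
        for y in range(height)
    ]


def to_greyscale(pixels):
    """Replace each (r, g, b) pixel with one luminance value (BT.601 weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in pixels]


def to_input_vector(pixels, size=32):
    """Scale, remove colour, and flatten into values between 0 and 1."""
    small = resize_nearest(pixels, size, size)
    grey = to_greyscale(small)
    return [value / 255.0 for row in grey for value in row]


if __name__ == "__main__":
    # A hypothetical 64 x 48 image that is entirely mid-grey.
    image = [[(128, 128, 128)] * 64 for _ in range(48)]
    vector = to_input_vector(image)
    print(len(vector))
```

A 32 x 32 greyscale image gives 1,024 input values, where keeping three colour channels would give 3,072. Fewer inputs mean fewer weights to train, which is one plausible reason for the faster convergence.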
Example – Cat or Hat?

Here’s an example of a tool I wrote that can learn to characterise images with a common subject. In this example we will train the network on a directory of cat pictures. The software takes each image, resizes and greyscales it, then trains the neural network until we get a root mean square error of less than 0.001. We can then test images by dropping them on the right-hand area of the tool. We label each training image as a cat; for a test image we should then get a value close to one if it is a cat, or far from one if it is not. Google actually published a research paper that studied neural networks for recognising cats and humans; apparently this is very useful for YouTube, where a large portion of all videos contain either a cat or a human.
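The "train until the root mean square error is below a threshold" loop can be sketched as follows. This is not the tool itself: it trains a single sigmoid neuron with batch gradient descent on four invented two-value samples, and it includes both positive and negative examples, which is my assumption. The sample run uses a looser threshold of 0.05 rather than 0.001 so that it finishes quickly.

```python
import math

# Invented samples: ([two pixel-like features], target), 1.0 meaning "cat".
SAMPLES = [
    ([0.9, 0.8], 1.0),
    ([0.8, 0.9], 1.0),
    ([0.1, 0.2], 0.0),
    ([0.2, 0.1], 0.0),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Output of the neuron: close to one for 'cat', far from one otherwise."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def train_until(samples, target_rmse, learning_rate=1.0, max_epochs=100_000):
    """Train until the RMSE drops below target_rmse or max_epochs is reached.

    Returns (weights, bias, rmse, epochs), where rmse is measured on the
    returned weights.
    """
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for epoch in range(1, max_epochs + 1):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        squared_error = 0.0
        for features, target in samples:
            out = predict(weights, bias, features)
            error = out - target
            squared_error += error * error
            # Gradient of the squared error through the sigmoid.
            delta = error * out * (1.0 - out)
            grad_b += delta
            for i, x in enumerate(features):
                grad_w[i] += delta * x
        rmse = math.sqrt(squared_error / len(samples))
        if rmse < target_rmse or epoch == max_epochs:
            return weights, bias, rmse, epoch
        bias -= learning_rate * grad_b
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]


if __name__ == "__main__":
    weights, bias, rmse, epochs = train_until(SAMPLES, target_rmse=0.05)
    print(epochs, rmse, predict(weights, bias, [0.85, 0.85]))
```

Tightening `target_rmse` towards 0.001 makes the loop run for far more epochs, because the sigmoid's gradient shrinks as the outputs approach 0 and 1. That gives a small taste of why training is slow.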
Thanks for Listening!

How much data is made each minute? - http://www.visualnews.com/2012/06/19/how-much-data-created-every-minute/
What does a hidden layer do? - https://stats.stackexchange.com/questions/63152/what-does-the-hidden-layer-in-a-neural-network-compute#63163
Animating neural networks using R - http://beckmw.wordpress.com/2013/03/19/animating-neural-networks-from-the-nnet-package/
Google research document on identifying high level features from images of humans and cats - https://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/archive/unsupervised_icml2012.pdf
Google research documents on machine learning - http://research.google.com/pubs/ArtificialIntelligenceandMachineLearning.html
Coursera Machine Learning - https://www.coursera.org/course/ml