Hierarchical Temporal Memory as a Means for Image Recognition by Wesley Bruning CHEM/CSE 597D Final Project Presentation December 10, 2008
The Grand Scheme A free online resource that lets anyone find information about symbols: a compendium of symbols with their names, meanings, and histories. Symbols? Yes, symbols! The Star of David (hexagram), the Greek letter sigma, the Masonic compass, the Wheel of Dharma, the bass clef, company logos, et cetera. It would fill a niche, a relatively easy one to fill, and one that should be filled sometime. Not for profit.
The Neat Feature Users can search for symbols by drawing or uploading pictures. “What does this mean?” The server(s) will house a program that receives the image and determines which symbol in the database the user desires. Image recognition.
Computer Vision Visual pattern recognition, like understanding language and physically manipulating objects, is difficult for computers. There are no viable algorithms for performing these functions on a computer.[1] For humans, these are easy. [1] J. Hawkins and D. George, “Hierarchical Temporal Memory – Concepts, Theory, and Terminology,” Whitepaper, Numenta Inc.
A Couple of Existing Models “Classic” artificial neural networks: at least 3 layers of nodes. Bayesian networks: directed acyclic graphs.
Hierarchical Temporal Memory Abbreviated HTM. A novel machine learning paradigm. Can be considered a type of artificial neural network, but the founding principles differ. (I will discuss only the higher-level concepts of HTM, not its learning algorithms and the like.)
Why HTM? It's rather new/untested. Has already shown promising results in the area of visual pattern recognition. Biologically inspired.
The Biological Inspiration HTM is based on a hierarchical theory of the human brain's neocortex and thalamus; it seeks to replicate their biological functions. A top-down solution that models the brain as a “device that computes by performing sophisticated pattern matching and sequence prediction.”[1] A hierarchy of uniform processing elements. HTM implements invariant pattern recognition, as seen in the visual cortex. [1] K. L. Rice et al., “A Preliminary Investigation of a Neocortex Model Implementation on the Cray XD1.”
Assumptions in the Basic Theory The neocortex is an efficient pattern matching device, not a computing engine.[1] The brain learns by storing patterns. It recognizes by matching sensory data to learned patterns.[2] The structure of the world is hierarchical, temporally as well as spatially. E.g., “A speaker expresses an idea over time by combining consonants and vowels to make syllables, syllables to make words, etc.”[3] [1] J. Hawkins, "Learn Like a Human." [2] K. L. Rice et al., “A Preliminary Investigation of a Neocortex Model Implementation on the Cray XD1.” [3]
How Does HTM Work? It's (not) a black box! It's a hierarchy of connected nodes.
An HTM Network Multiple levels of nodes. Sensory data enters at the lowest level, and a belief is generated at the top level. Information is exchanged from parent to child and vice versa. Each node performs the same learning algorithm.
This Looks Similar to Some Types of Artificial Neural Networks HTM can be considered a type of ANN, as well as a type of Bayesian network. Big Difference: The majority of these networks try to emulate individual neurons, not the overall structure of the neocortex. Temporal data is typically not handled (well). Different learning algorithms are used.
So How Does it Work? Each node looks at its input and learns the “cause” of its input. A “cause” is whatever causes the input pattern to occur. The outputs of the nodes in one level become the inputs of the nodes in the next level. So! The nodes at the lower levels discover simple causes, such as edges and corners, while the nodes at the higher levels discover complex causes, such as faces. Intermediate nodes find causes of intermediate complexity.
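The data flow described above (outputs of one level become inputs of the next) can be sketched as a toy two-level hierarchy. This is only an illustration under stated assumptions: the `Node` and `TwoLevelNetwork` classes, the 4x4 image size, and the rule "memorize each distinct input and name it by its index" are all hypothetical simplifications, not Numenta's learning algorithm.

```python
class Node:
    """Toy node (hypothetical): memorizes distinct inputs and names each by index."""

    def __init__(self):
        self.patterns = []

    def learn(self, pattern):
        # Store the pattern only if it has not been seen before.
        if pattern not in self.patterns:
            self.patterns.append(pattern)

    def infer(self, pattern):
        # The "cause" is the index of the matching stored pattern, or -1 if unknown.
        return self.patterns.index(pattern) if pattern in self.patterns else -1


def patches(image):
    """Split a 4x4 image (list of 4 rows) into four 2x2 patches as flat tuples."""
    out = []
    for r in (0, 2):
        for c in (0, 2):
            out.append((image[r][c], image[r][c + 1],
                        image[r + 1][c], image[r + 1][c + 1]))
    return out


class TwoLevelNetwork:
    """Four bottom nodes each see one patch; one top node sees their outputs."""

    def __init__(self):
        self.bottom = [Node() for _ in range(4)]
        self.top = Node()

    def _bottom_outputs(self, image):
        return tuple(n.infer(p) for n, p in zip(self.bottom, patches(image)))

    def train(self, image):
        # Lower level learns simple causes (patch patterns) first ...
        for node, patch in zip(self.bottom, patches(image)):
            node.learn(patch)
        # ... then the top level learns a combination of lower-level outputs.
        self.top.learn(self._bottom_outputs(image))

    def recognize(self, image):
        return self.top.infer(self._bottom_outputs(image))


horizontal = [[1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
vertical = [[1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
unseen = [[1, 1, 1, 1]] * 4

net = TwoLevelNetwork()
net.train(horizontal)
net.train(vertical)
print(net.recognize(horizontal))  # 0
print(net.recognize(vertical))    # 1
print(net.recognize(unseen))      # -1 (no stored cause matches)
```

The sketch uses exact matching, so it has none of the noise tolerance or temporal learning that HTM is meant to provide; it shows only how simple causes at the bottom feed a more complex cause at the top.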
Beliefs
How Do Nodes Generate Beliefs? 1. The node looks at its input and assigns a probability that the input matches each stored spatial pattern. 2. The node takes this probability distribution and combines it with previous state information to assign a probability that the current input matches each stored temporal sequence. 3. The distribution over the set of sequences is the output of the node and is passed up the hierarchy. Finally, if the node is still learning, it might modify the set of stored spatial and temporal patterns to reflect the new input.[1] [1] J. Hawkins and D. George, “Hierarchical Temporal Memory – Concepts, Theory, and Terminology.”
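The three steps above can be sketched numerically. Everything specific here is an assumption made for illustration: the function names, the use of exp(-Hamming distance) as a match score, and the `memory` weight that blends the previous belief with a uniform prior. The whitepaper does not prescribe these formulas.

```python
import math


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def spatial_distribution(stored_patterns, observed):
    """Step 1: probability that the input matches each stored spatial pattern.

    Assumed score: exp(-Hamming distance), so closer patterns score higher.
    """
    scores = []
    for pattern in stored_patterns:
        distance = sum(a != b for a, b in zip(pattern, observed))
        scores.append(math.exp(-distance))
    return normalize(scores)


def sequence_distribution(spatial_probs, sequences, previous_belief, memory=0.5):
    """Step 2: combine spatial evidence with the previous state.

    Each sequence is a list of indices into the stored spatial patterns.
    """
    uniform = 1.0 / len(sequences)
    weights = []
    for seq, prev in zip(sequences, previous_belief):
        evidence = sum(spatial_probs[i] for i in seq)
        prior = memory * prev + (1 - memory) * uniform
        weights.append(evidence * prior)
    return normalize(weights)


def node_step(stored_patterns, sequences, previous_belief, observed):
    """Step 3: the distribution over sequences is the node's output."""
    spatial = spatial_distribution(stored_patterns, observed)
    return sequence_distribution(spatial, sequences, previous_belief)


stored = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
sequences = [[0, 1], [2, 3]]   # two hypothetical temporal sequences
belief = [0.5, 0.5]            # no prior preference
belief = node_step(stored, sequences, belief, observed=(1, 0, 0))
print(belief)                  # first sequence receives the larger share
```

Feeding the returned belief back in as `previous_belief` on the next input is what lets earlier observations influence later ones in this sketch.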
In Pictures [Figures: discovering spatial patterns; discovering temporal patterns (sequences)]
Past Trial “Using Numenta’s hierarchical temporal memory to recognize CAPTCHAs”[1] HTM performed well, but performance could have been improved with more time. Concluded that HTMs are well designed for recognizing CAPTCHAs. [1] Y. J. Hall and R. E. Poplin, “Using Numenta’s hierarchical temporal memory to recognize CAPTCHAs.”
Past Trial “Content-Based Image Retrieval Using Hierarchical Temporal Memory”[1] HTM was robust to spatial noise, blurring, and other distortions despite having been trained on only clean, undistorted images. Concluded that HTMs are flexible enough to provide efficient and accurate indexing of line drawings. [1] B. A. Bobier and M. Wirth, “Content-Based Image Retrieval Using Hierarchical Temporal Memory.”
My Own Firsthand First Impression Testing HTM's image recognition capabilities.
Testing HTM's Image Recognition Capabilities
Results With simple black and white tests, it was very successful. Handled noisy data well. Not so good with rotated images. Reason? (predominantly) Training. Training is essential.
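Since the weakness on rotated images is attributed mainly to training, one possible remedy is to add rotated copies of each symbol to the training set. The sketch below is an assumption about how that could be done, not a description of the tests reported here; the function names are hypothetical, and it covers only multiples of 90 degrees on small binary grids (arbitrary angles would need an image library).

```python
def rotate90(grid):
    """Rotate a 2D grid (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def rotated_variants(grid):
    """Return copies of the grid at 0, 90, 180, and 270 degrees."""
    variants = [[list(row) for row in grid]]
    for _ in range(3):
        variants.append(rotate90(variants[-1]))
    return variants


def augment(training_set):
    """Expand (label, grid) pairs so each label appears with all four rotations."""
    return [(label, variant)
            for label, grid in training_set
            for variant in rotated_variants(grid)]


corner = [[1, 0],
          [1, 1]]
for label, grid in augment([("corner", corner)]):
    print(label, grid)
```

Whether this helps an HTM network in practice would still have to be tested.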
Is HTM a Viable Option? Yes. It has already proven to be a good candidate for simple image processing. I still need to conduct more experiments to find its boundaries, e.g. color images, more complex images, and a larger database of trained images. Once these boundaries are found, I must decide whether it is worth finding solutions within HTM technology. I may need to implement additional processing. Numenta is a business; this is their product.