Dialog Processing with Unsupervised Artificial Neural Networks
Andrew Richardson
Thomas Jefferson High School for Science and Technology
Computer Systems Laboratory 2005 - 2006
Dialog Processing with Unsupervised Neural Networks
Contents:
- What I Did
- Background (unsupervised neural networks)
- Program Mechanics
- Attributes of Nodes
- Attributes of Connections
- Lessons from Neurobiology
- Algorithms
- Further Research
What I Did
- Interest in neural networks (unsupervised)
- Most researchers use supervised NNs (boring)
- The theory is really complicated
- Learning from brains...
- I found a new field! (Cognitive science)
- Too complicated for now
- Program a failure
Background: Neural Networks
Outside of research, the neural networks in use today are supervised: the output for each input is compared against the known right answer, and connections that produced the right answer are reinforced. The idea is that connections which have been right in the past will be right in the future.
Background: Unsupervised Neural Networks, or a Connectionist Model
However, I think that unsupervised neural networks have more promise for complex tasks, because they are more analogous to the neurons within the brain. Rather than being adjusted through a series of supervised tests, the network is modified systematically as a stream of inputs, such as words, is read in. In an attempt to mimic the brain, my network reinforces connections between nodes that often fire one after the other. In this case, each word is represented by a node.
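The reinforcement rule described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `reinforce_sequence`, the fixed learning `rate`, and the dictionary of weights keyed by word pairs are all assumptions made for the example.

```python
from collections import defaultdict

def reinforce_sequence(words, weights=None, rate=0.1):
    """Strengthen the link from each word node to the word that fires next.

    `weights` maps (earlier_word, later_word) to a connection strength.
    The fixed `rate` is an assumed, hypothetical learning rate.
    """
    if weights is None:
        weights = defaultdict(float)
    # Pair every word with its successor and reinforce that connection.
    for earlier, later in zip(words, words[1:]):
        weights[(earlier, later)] += rate
    return weights

# Feed two short inputs through the same set of weights.
weights = reinforce_sequence("the cat sat on the mat".split())
weights = reinforce_sequence("the cat ran".split(), weights)
```

After both inputs, the connection from "the" to "cat" has been reinforced twice, while the connection from "cat" to "sat" has been reinforced only once. Pairs that keep occurring together end up with the strongest links.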
Program Mechanics: Nodes
However, it's not as simple as that. If the brain only noted connections between words, it could not form connections to emotions or abstract ideas. To mimic these aspects of the brain, the ones that do the real thinking, the network also includes nodes that do not represent words. These nodes take on meaning as they build connections to words and to each other. In time, they may let the network form complex ideas, represented by nodes that have been shaped by the input text.
Program Mechanics: Attributes of Nodes
Like neurons in the human brain, nodes in my program differ from one another in several ways.
- Plasticity: A measure of how easy it is to modify the connections to and from this node.
- Metaplasticity: A measure of how much more difficult it becomes to modify connections once they have been changed. This is important because it allows connections within the brain to become fixed and finalized after having been changed, resisting further change. Of course, nodes must be able to become less rigid as time goes on, or else the network would become unusable. The ease with which nodes do this also varies. In the human brain, this is important in facilitating short-term memory, wherein connections remain constant for a while after being established, but then become plastic again.
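One way to picture plasticity and metaplasticity working together is sketched below. The class `PlasticNode`, its field names, the multiplicative stiffening rule, and the `recovery` rate are all assumptions chosen for illustration; the slides do not specify the actual formulas.

```python
from dataclasses import dataclass

@dataclass
class PlasticNode:
    base_plasticity: float = 1.0  # resting ease of change (assumed value)
    plasticity: float = 1.0       # current ease of change
    metaplasticity: float = 0.5   # fraction of plasticity lost per change (assumed)
    recovery: float = 0.1         # fraction of the gap regained per rest step (assumed)

    def scaled_change(self, delta):
        """Return the change actually applied to a connection, then stiffen."""
        applied = delta * self.plasticity
        # Each modification makes the next one harder.
        self.plasticity *= 1.0 - self.metaplasticity
        return applied

    def rest(self):
        """Let the node drift back toward its resting plasticity."""
        self.plasticity += self.recovery * (self.base_plasticity - self.plasticity)

node = PlasticNode()
first = node.scaled_change(1.0)   # full-size change
second = node.scaled_change(1.0)  # same request, smaller effect
node.rest()                       # the node becomes a little more plastic again
```

In this sketch the second change has half the effect of the first, and resting slowly restores the node's flexibility. This loosely mirrors connections that hold steady for a while and then become plastic again.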
Program Mechanics: Attributes of Nodes
- Number of Connections: Some nodes have the capacity to connect to more nodes than others. In theory, this matters more once metasystems become more advanced than those in my current project.
- Threshold: Some nodes require more stimulation than others in order to fire.
- Base Values for Connections: Most connections between nodes are still at their default values and do not yet reflect changes from the environment. Each node remembers what these default values are for its connections.
- Type of Node: Neurons in the brain come in different types. I'm not sure what role this plays, but I included it for good measure, because it seems important in the brain.
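The node attributes listed on these two slides could be gathered into a single record, as in the sketch below. The field names, default values, and the `kind` labels are hypothetical; only the list of attributes comes from the slides.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    plasticity: float = 1.0      # ease of modifying this node's connections
    metaplasticity: float = 0.5  # how quickly connections stiffen after a change
    max_connections: int = 8     # capacity for connections (assumed default)
    threshold: float = 1.0       # stimulation needed before the node fires
    base_strength: float = 0.1   # default value for untouched connections
    kind: str = "word"           # node type; "word"/"abstract" are made-up labels

    def fires(self, stimulation):
        """A node fires once its incoming stimulation reaches its threshold."""
        return stimulation >= self.threshold

cat = Node("cat")
idea = Node("idea-17", threshold=2.0, kind="abstract")
```

With these values, a stimulation of 1.0 is enough to fire `cat` but not the more demanding `idea` node.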
Attributes of Connections
The links between nodes are where the network actually stores its memory of past activity, so these attributes are particularly important.
- Strength of Connection: The power a connection has to activate the node at its far end. The same value also records whether the connection is excitatory or inhibitory. It is affected by the attributes of the nodes it connects.
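As a rough sketch of how connection strengths might drive firing, the example below assumes that the sign of a strength marks it as excitatory (positive) or inhibitory (negative). That encoding, the function name `stimulation`, and the example words and numbers are all assumptions for illustration.

```python
def stimulation(target, active, strengths):
    """Sum the signed strengths of connections from active nodes into `target`.

    `strengths` maps (source, target) to a signed value: positive is treated
    as excitatory and negative as inhibitory (an assumed encoding).
    """
    return sum(
        value
        for (source, destination), value in strengths.items()
        if destination == target and source in active
    )

strengths = {
    ("cat", "purr"): 0.8,   # excitatory
    ("warm", "purr"): 0.4,  # excitatory
    ("dog", "purr"): -0.5,  # inhibitory
}
threshold = 1.0  # hypothetical firing threshold for the "purr" node

calm = stimulation("purr", {"cat", "warm"}, strengths)
tense = stimulation("purr", {"cat", "warm", "dog"}, strengths)
fires_when_calm = calm >= threshold
fires_when_tense = tense >= threshold
```

Here the two excitatory inputs together push "purr" past its threshold, but adding the inhibitory input from "dog" pulls the total back below it.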
Lessons From Neurobiology
In designing my project, I tried to copy neurobiology, because designing from scratch is difficult.
- Hebbian learning
- Excitatory/inhibitory connections
- Neurotransmitter types/receivers
- Cognitive science
- Network structures
- Plasticity
- Metaplasticity
Difficulties in Modeling and the Need for Algorithms
The human brain, which can also be thought of as an unsupervised neural network, contains billions of neurons, each with thousands of connections. We cannot expect a computer to handle all this unless the mechanisms are simplified and optimized.
Program Mechanics: Algorithms
An unsupervised neural network can be thought of as a collection of nodes that form connections to each other. At the start, the network is set up with different types of nodes, each with its own characteristics and connections. These initial attributes and connections are all cookie-cutter; they do not encode meaningful information. Only after the network has changed in response to stimuli do the connections and attributes become important. Furthermore, only the connections that have changed to reflect the stimuli carry information, and then only until they revert to being nondescript.
Program Mechanics: Algorithms
So, my program attempts to conserve computational resources by taking advantage of the fact that most nodes aren't important. It keeps track of which nodes encode meaningful information, and keeps only statistical information about the nodes that do not. Whenever new information needs to be assimilated, the program uses this statistical information to predict which nodes should exist, and then brings those nodes into reality so that they can hold useful information. In this way, the program processes no more than is actually needed, while also keeping its stored data from growing too large.
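The idea of tracking only the meaningful nodes can be sketched as a lazily populated network. This is one possible reading of the slide, not the program's actual algorithm: the class `SparseNetwork`, the normal distribution used to generate a new connection's starting strength, and all parameter values are assumptions.

```python
import random

class SparseNetwork:
    """Stores explicit data only for nodes that have been changed by input."""

    def __init__(self, total_nodes, mean_strength=0.1, spread=0.02, seed=0):
        self.total_nodes = total_nodes      # nominal network size
        self.mean_strength = mean_strength  # statistical summary of untouched links
        self.spread = spread                # assumed variation around that mean
        self.rng = random.Random(seed)
        self.materialized = {}              # node id -> {target id: strength}

    def strength(self, source, target):
        """Untouched connections are answered from the statistics alone."""
        links = self.materialized.get(source)
        if links is None or target not in links:
            return self.mean_strength
        return links[target]

    def reinforce(self, source, target, amount):
        """Create the node and connection on first use, then strengthen it."""
        links = self.materialized.setdefault(source, {})
        if target not in links:
            # Draw a plausible starting value from the stored statistics.
            links[target] = self.rng.gauss(self.mean_strength, self.spread)
        links[target] += amount
        return links[target]

net = SparseNetwork(total_nodes=1_000_000)
before = len(net.materialized)
net.reinforce(3, 7, 0.5)
after = len(net.materialized)
```

Although the network nominally has a million nodes, memory is spent on a node only once an input has actually modified it. Every other node is described by the shared mean and spread.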
Theory
- Computational complexity
- Number of important connections proportionate to information to be stored
- How much does it need to know?
- Processing kept to a minimum
- Cognitive science
Further Research: Representations
As it currently stands, the program represents information by storing the connections between nodes, along with a record of which nodes are important. It would be better if information were stored in a more intuitive and more compact manner. Representational standards should be developed based on symbolic cognitive science.
Bibliography
http://scholar.lib.vt.edu/ejournals/SPT/v5n2/dietrich.html - Explanation of the computationalist approach to cognitive science, the approach used in the theory of this program.
http://www.ulg.ac.be/cogsci/jsougne/JScogsci96.pdf - Explanation of how neurons need to be in phase to communicate.
http://yudkowsky.net/bayes/bayes.html - Explanation of Bayesian math, which I'm attempting to use to model this program.
http://www-psych.stanford.edu/~andreas/Research/Papers/TextCategorization/Wiener.Pedersen.Weigend_SDAIR95.ps - Neural net used for topic spotting.
Bibliography
http://acl.ldc.upenn.edu/acl2002/EMNLP/pdfs/EMNLP142.pdf - The ambiguous nature of words described in this article supports the use of neural networks for processing rather than more rigid rule-based approaches.
http://www.cs.stir.ac.uk/~lss/NNIntro/InvSlides.html - This article describes at length the difference between supervised and unsupervised networks. The real power of neural networks is that they can learn, and it is important that I get sufficient learning material for my network. This will include dictionaries (which I am having trouble obtaining) and conversational transcripts.
http://www.dacs.dtic.mil/techs/neural/neural3.html#RTFToC10 - This article talks about how networks can "memorize" data. That is to say, they avoid learning the rules behind the data and instead learn only to respond to the input data used so far. It is also important to consider the topology of networks, because that is an additional level of complexity within the brain, or a neural network.
Bibliography
http://scholar.google.com/scholar?hl=en&lr=&q=cache:fYGM13j1fhUJ:www.physics.brown.edu/users/faculty/intrator/papers/face-j.ps.gz+unsupervised+neural+network - Face recognition is generally done with more rigid algorithms, but this presents a way to use neural networks to achieve the desired recognition.