A Roadmap towards Machine Intelligence

Tomas Mikolov, Armand Joulin and Marco Baroni
Facebook AI Research, NIPS 2015 RAM Workshop

Ultimate Goal for Communication-based AI
A machine that can do almost anything:
- Help students understand their homework
- Help researchers find relevant information
- Write programs
- Help scientists with tasks that are currently too demanding (tasks that would require hundreds of years of work to solve)

The Roadmap
- We describe a minimal set of components we think an intelligent machine will consist of
- Then, an approach to constructing the machine
- And the requirements for the machine to be scalable

Components of Intelligent Machines
- Ability to communicate
- Motivation component
- Learning skills (which further require long-term memory), i.e. the ability to modify itself to adapt to new problems

Components of the Framework
To build and develop intelligent machines, we need:
- An environment that can teach the machine basic communication skills and learning strategies
- Communication channels
- Rewards
- Incremental structure

The Need for New Tasks: Simulated Environment
- No existing dataset known to us would allow us to teach the machine communication skills
- Careful design of the tasks, including how quickly the complexity grows, seems essential for success:
  - If we add complexity too quickly, even a correctly implemented intelligent machine can fail to learn
  - If we add complexity too slowly, we may never reach the final goals

High-Level Description of the Environment
Simulated environment:
- Learner
- Teacher
- Rewards
Scaling up:
- More complex tasks, fewer examples, less supervision
- Communication with real humans
- Real input signals (internet)

Simulated Environment: Agents
- Environment: a simple script-based reactive agent that produces signals for the Learner and represents the world
- Learner: the intelligent machine, which receives an input signal and a reward signal, and produces an output signal to maximize its average incoming reward
- Teacher: specifies tasks for the Learner; script-based at first, later to be replaced by human users

Simulated Environment: Communication
- Both the Teacher and the Environment write to the Learner's input channel
- The Learner's output channel influences its behavior in the Environment, and can be used for communication with the Teacher
- Rewards are also part of the I/O channels (a minimal interface sketch follows below)
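The roles and channels from the last two slides can be made concrete with a minimal sketch. All class names and signatures below are illustrative assumptions; the slides specify only the roles (Environment, Teacher, Learner), the shared I/O channels, and the scalar reward:

```python
# Minimal sketch of the simulated environment's I/O loop.
# Single-character channels and all signatures are assumptions.

class Learner:
    def act(self, inp: str, reward: float) -> str:
        # A real Learner would update itself here to maximize its
        # average incoming reward; this placeholder always emits ' '.
        return " "

class Teacher:
    def emit(self, learner_output: str) -> tuple[str, float]:
        # Script-based at first (later replaced by humans): produce
        # the next input symbol and a reward for the last output.
        return ("a", 0.0)

def run(learner: Learner, teacher: Teacher, steps: int) -> None:
    out, reward = " ", 0.0
    for _ in range(steps):
        inp, reward = teacher.emit(out)  # Teacher/Environment write to the input channel
        out = learner.act(inp, reward)   # Learner writes to the output channel
```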

Visualization for Better Understanding
Example of the input/output streams and their visualization (figure in the original slides).

How to Scale Up: Fast Learners
- It is essential to develop a fast learner: we could easily build a machine today that "solves" simple tasks in the simulated world using a myriad of trials, but this will not scale to complex problems
- In general, showing the Learner a new type of behavior and guiding it through a few tasks should be enough for it to generalize to similar tasks later
- There should be less and less need for direct supervision through rewards

How to Scale Up: Adding Humans
- A Learner capable of fast learning can start communicating with human experts (us), who will teach it novel behavior
- Later, a pre-trained Learner with basic communication skills can be used by human non-experts

How to Scale Up: Adding the Real World
- The Learner can gain access to the internet through its I/O channels
- This can be done by teaching the Learner how to form a query in its output stream

The Need for New Techniques
Certain trivial patterns are still hard to learn:
- The context-free language a^n b^n is out of scope for standard RNNs (see the generator sketch below)
- Sequence memorization breaks
We show this in a recent paper: "Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets"
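To make the a^n b^n task concrete, it can be posed as next-character prediction over concatenated samples of the language. The generator below is an illustrative sketch under that assumption, not the paper's exact data pipeline:

```python
import random

def anbn_sequence(max_n: int = 10) -> str:
    """One sample from the a^n b^n language, e.g. 'aaabbb' for n = 3.
    Once the first 'b' appears, the number of remaining 'b's is fully
    determined, so a model must count the 'a's to predict perfectly."""
    n = random.randint(1, max_n)
    return "a" * n + "b" * n

# A training stream is a concatenation of samples; the model is
# trained to predict the next character at every position.
stream = "".join(anbn_sequence() for _ in range(5))
print(stream)  # e.g. 'aabbabaaabbb...'
```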

Scalability
For the machine to have any hope of scaling to more complex problems, we need:
- Long-term memory
- A (Turing-)complete and efficient computational model
- Incremental, compositional learning
- Fast learning from a small number of examples
- A decreasing amount of supervision through rewards
Further discussed in: A Roadmap towards Machine Intelligence, http://arxiv.org/abs/1511.08130

Some Steps Forward: Stack RNNs (Joulin & Mikolov, 2015)
- A simple RNN extended with a long-term memory module that the neural net learns to control
- The idea itself is very old (from the 80s and 90s)
- Our version is very simple and learns patterns whose complexity far exceeds what was shown before (though still very toy-ish): much less supervision, and it scales to more complex tasks

Stack RNN
Learns algorithms from examples. Adds structured memory to the RNN:
- Trainable (read/write)
- Unbounded
- Actions: PUSH / POP / NO-OP (a minimal update sketch follows below)
Examples of memory structures: stacks, lists, queues, tapes, grids, ...
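The continuous stack update can be sketched in a few lines. The version below follows the soft PUSH/POP/NO-OP mixture described in the paper, but assumes scalar stack cells and a single stack; parameter names and shapes are illustrative:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def stack_update(s, h, A, D):
    """One soft stack update. s: stack values, shape (depth,);
    h: hidden state, shape (dim,); A: action matrix, shape (3, dim);
    D: write vector, shape (dim,). Shapes/initialization are assumptions."""
    push, pop, noop = softmax(A @ h)   # soft action probabilities
    new_top = sigmoid(D @ h)           # value written on PUSH
    s_pad = np.append(s, 0.0)          # zeros below the stack bottom
    s_next = np.empty_like(s)
    # Top cell: mixture of the pushed value, the popped-up cell, and no-op.
    s_next[0] = push * new_top + pop * s_pad[1] + noop * s_pad[0]
    # Deeper cells shift down on PUSH and up on POP.
    for i in range(1, len(s)):
        s_next[i] = push * s[i - 1] + pop * s_pad[i + 1] + noop * s_pad[i]
    return s_next
```

Because every action is a differentiable mixture rather than a discrete choice, the whole stack can be trained end to end with backpropagation.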

Algorithmic Patterns
- Examples of simple algorithmic patterns generated by short programs (grammars)
- The goal is to learn these patterns without supervision, just by observing the example sequences

Algorithmic Patterns: Counting
Performance on simple counting tasks:
- An RNN with a sigmoidal activation function cannot count
- Stack-RNNs and LSTMs can count

Algorithmic Patterns: Sequences
- Sequence memorization and binary addition are out of scope for the LSTM
- The expandable memory of the stacks allows the solution to be learned

Binary Addition
- No supervision in training, just next-symbol prediction
- Learns to: store the digits, decide when to produce output, and carry (see the data sketch below)
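As with a^n b^n, binary addition can be posed as pure next-character prediction over strings such as '101+11=1000.'. The digit order and delimiters below are illustrative assumptions, not the paper's exact encoding:

```python
import random

def addition_example(max_bits: int = 8) -> str:
    """One binary-addition string for next-character prediction,
    e.g. '101+11=1000.'. The model gets no supervision beyond
    predicting the next symbol, so it must discover the carry,
    store the operand digits, and decide when to emit the result."""
    a = random.randint(0, 2 ** max_bits - 1)
    b = random.randint(0, 2 ** max_bits - 1)
    return f"{a:b}+{b:b}={a + b:b}."

print(addition_example())
```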

Stack RNNs: Summary
The good:
- A Turing-complete model of computation (with >= 2 stacks)
- Learns some algorithmic patterns
- Has long-term memory
- Works on some problems that break RNNs and LSTMs
The bad:
- The long-term memory is used only to store partial computation (i.e. learned skills are not stored there yet)
- Does not seem to be a good model for incremental learning, due to its computational inefficiency
- Stacks do not seem to be a very general choice for the topology of the memory

Conclusion
To achieve AI, we need to:
- Create a new set of tasks
- Develop new techniques
- Motivate more people to address these problems