© 2008 SRI International Systems Learning for Complex Pattern Problems Omid Madani AI Center, SRI International.

Foundations of Intelligence: Concepts (Categories)
Intelligent systems categorize their perceptions (objects, events, relations).
Categorization involves substantial abstraction: you rarely see the exact same thing again…
Categorization is necessary for intelligence.
Categories are complex: they have adaptive structure and are composed of parts, of abstractions, …
High intelligence (advanced animals) requires myriad categories.
What are the principles behind such learning and development?
Assumptions/Evidence: these (perceptual) categories are developed mainly in an unsupervised manner.
  – It is doubtful that they are all programmed in; many are not (in particular, for humans).
  – An explicit teacher is absent.

Example Perceptual Concepts
In text, every word, phrase, or expression is a concept: “book”, “new”, “a”, …
Single characters are primitive concepts: “a”, “b”, …, “1”, “2”, “;”, …
Concepts can be composed of other concepts:
  – “n” + “e” = “ne”
  – “new” + “york” = “new york”
Concepts can be abstractions:
  – week-day = {Monday, Tuesday, …}
  – digits = {1, 2, 3, 4, …}
An area code is a concept that involves both composition and abstraction:
  – It is a composition of 3 digits.
  – A digit is a grouping, i.e., the set {0, 1, 2, …, 9} (2 is a digit).
Other examples: phone number, address, resume page, face (in the visual domain), etc.
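The composition and abstraction examples above can be made concrete with a small sketch. This is only an illustration under assumed names (`Primitive`, `Composition`, `Abstraction`, `match_ends`, `matches`, and `word` are all hypothetical and do not come from the talk): a primitive is a single character, a composition is an ordered sequence of concepts, and an abstraction is a set of alternative concepts.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set, Tuple, Union


@dataclass(frozen=True)
class Primitive:
    """A single character such as "a" or "2" (hypothetical class name)."""
    char: str


@dataclass(frozen=True)
class Composition:
    """An ordered sequence of concepts, e.g. "n" + "e" = "ne"."""
    parts: Tuple["Concept", ...]


@dataclass(frozen=True)
class Abstraction:
    """A grouping of alternative concepts, e.g. digit = {0, 1, ..., 9}."""
    members: FrozenSet["Concept"]


Concept = Union[Primitive, Composition, Abstraction]


def match_ends(concept: Concept, text: str, start: int = 0) -> Set[int]:
    """Return every position at which a match of `concept` starting at `start` can end."""
    if isinstance(concept, Primitive):
        ok = text.startswith(concept.char, start)
        return {start + len(concept.char)} if ok else set()
    if isinstance(concept, Abstraction):
        ends: Set[int] = set()
        for member in concept.members:
            ends |= match_ends(member, text, start)
        return ends
    # Composition: thread the set of reachable positions through each part in order.
    positions = {start}
    for part in concept.parts:
        positions = {end for p in positions for end in match_ends(part, text, p)}
    return positions


def matches(concept: Concept, text: str) -> bool:
    """True if `concept` covers the whole of `text`."""
    return len(text) in match_ends(concept, text)


def word(s: str) -> Composition:
    """Build a composition of primitive characters."""
    return Composition(tuple(Primitive(c) for c in s))


# An area code: a composition of three members of the digit abstraction.
DIGIT = Abstraction(frozenset(Primitive(c) for c in "0123456789"))
AREA_CODE = Composition((DIGIT, DIGIT, DIGIT))

# "new" + "york"; the space is modeled here as an explicit part (an assumption).
NEW_YORK = Composition((word("new"), Primitive(" "), word("york")))
```

With these definitions, `matches(AREA_CODE, "650")` holds while `matches(AREA_CODE, "65a")` does not, which mirrors the statement that an area code combines composition (three parts) with abstraction (each part is any digit).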

Acquiring and Developing Concepts
Higher intelligence, such as “advanced” pattern recognition/generation (e.g., vision), may require:
  – Long-term learning (weeks, months, years, …)
  – Cumulative learning (learn these first, then these, then these, …)
  – Massive learning: myriad inter-related categories/concepts
  – Systems learning: multiple algorithms working together
  – Autonomy (relatively little human involvement)
What are the learning processes?
Applications: learning to segment words in a speech stream in any language, visual object recognition, learning to play Go/Chess.

Learning by Repeatedly Predicting in a Rich World
In a nutshell, we seek a system such that:
[Figure: a prediction system reads an input stream (…0011101110000…) and repeatedly predicts, observes, and updates. Initially it works with low-level or “hard-wired” categories (for text input: characters, …; for vision: edges, curves, …). After a while (much learning), it works with higher-level categories, i.e., bigger chunks (e.g., words, digits, phrases, phone numbers, faces, visual objects, home pages, sites, …).]
(Prediction Games in Infinitely Rich Worlds, AAAI FSS07)
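The predict, observe, and update cycle can be sketched in a few lines. The version below is a deliberately simple stand-in and not the system described in the talk: it assumes fixed single-character categories and predicts the next symbol from counts of what followed the previous one. The names `NextSymbolPredictor` and `play` are hypothetical.

```python
from collections import Counter, defaultdict


class NextSymbolPredictor:
    """Count-based predictor: a stand-in for the talk's prediction system."""

    def __init__(self):
        # counts[context][symbol] = how often `symbol` followed `context`
        self.counts = defaultdict(Counter)

    def predict(self, context):
        """Return the symbol seen most often after `context`, or None if unseen."""
        following = self.counts.get(context)
        if not following:
            return None
        return following.most_common(1)[0][0]

    def update(self, context, observed):
        """Record what actually followed `context`."""
        self.counts[context][observed] += 1


def play(stream, predictor):
    """Run the predict / observe / update loop; return (correct, total)."""
    correct = 0
    total = 0
    previous = None
    for symbol in stream:
        if previous is not None:
            if predictor.predict(previous) == symbol:  # predict
                correct += 1
            predictor.update(previous, symbol)         # observe & update
            total += 1
        previous = symbol
    return correct, total
```

The sketch learns incrementally from the stream itself, with no separate labeled training set, which is the property the slide emphasizes. It does not build higher-level categories; that part of the picture is left out here.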

Example Category Node (processed Jane Austen’s online books)
[Figure: the category node “ther ” with weighted prediction edges. Categories appearing before it include “ bro”, “ far”, “toge”, and “nei”; other categories shown are “and ”, “heart”, “love ”, and “by ”. Edge labels are prediction weights (0.087, 0.07, 0.057, 0.052, 0.13, 0.11, 0.10); the node also keeps local statistics (7.1, 0.41).]
(Exploring Massive Learning via a Prediction System, AAAI FSS’07)

Some Challenges or Features of the Task
Lots of:
  – features/predictors (input dimensionality),
  – classes (output dimensionality),
  – instances (episodes).
Uncertainty in the values of features, in the classes, in adequate segmentation, …
  – No one segments them for us! (What about written language?)
We require algorithms that are primarily:
  – incremental,
  – able to handle nonstationarities and uncertainty,
  – asymptotically convergent,
  – efficient in sample complexity.
Objectives and evaluation criteria?

Many-Class Learning (… A Wiring Problem)
The questions raised during this research:
1. Given the need to quickly classify a given instance into one of myriad classes (e.g., millions), how can this be done?
  – How about space efficiency?
2. How can we efficiently learn such efficient classification systems?
[Figure: many-class learning produces a classification system (?).]

A Solution: Index Learning
Input: a tripartite graph over features, instances, and categories.
Output: an index = a sparse weighted bipartite graph from features to categories, i.e., a (sparse) matrix.
[Figure: learning maps the tripartite input graph to the bipartite index; the matrix view shows mostly zero entries.]
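To make the input and output shapes concrete, here is a minimal sketch of an index as a sparse feature-to-category map. The weighting rule is an assumption chosen for brevity (the fraction of a feature's occurrences that co-occur with each category, with small weights dropped); it is not the learning algorithm from the talk, which the slides do not spell out. The names `build_index` and `min_weight` are hypothetical.

```python
from collections import defaultdict


def build_index(instances, min_weight=0.1):
    """Build a sparse feature -> {category: weight} map.

    `instances` is an iterable of (features, category) pairs, i.e. the
    instance layer of the tripartite graph linking features to categories.
    Weights are co-occurrence proportions (an assumed stand-in rule), and
    edges lighter than `min_weight` are dropped to keep the index sparse.
    """
    feature_totals = defaultdict(int)
    cooccur = defaultdict(lambda: defaultdict(int))
    for features, category in instances:
        for f in set(features):  # count each feature once per instance
            feature_totals[f] += 1
            cooccur[f][category] += 1

    index = {}
    for f, per_category in cooccur.items():
        edges = {}
        for category, n in per_category.items():
            weight = n / feature_totals[f]
            if weight >= min_weight:
                edges[category] = weight
        if edges:
            index[f] = edges
    return index


# Hypothetical toy data: three instances, two categories.
toy_instances = [
    (["new", "york"], "city"),
    (["new", "car"], "vehicle"),
    (["york"], "city"),
]
toy_index = build_index(toy_instances)
```

Storing only the surviving edges per feature is what keeps the structure small when there are very many categories: most feature and category pairs never appear at all.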

Classification/Prediction (retrieval & scoring)
1. Features are “activated”.
2. Edges are activated.
3. Receiving classes are activated.
4. Classes are sorted/ranked.
[Figure: a bipartite graph connecting features f1 to f4 with classes c1 to c5.]
See omadani.net for the learning algorithms.
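The four steps above can be sketched as a lookup over a sparse index. The scoring rule here, summing activation times edge weight per class, is an assumption made for illustration; the slide does not state how scores are combined. The function name `rank_classes`, its `top_k` parameter, and the example weights are all hypothetical.

```python
from collections import defaultdict


def rank_classes(index, active_features, top_k=3):
    """Retrieve and score classes for one instance.

    `index` maps feature -> {class: weight}; `active_features` maps
    feature -> activation level. Scores are summed activation * weight
    (an assumed combination rule).
    """
    scores = defaultdict(float)
    for feature, activation in active_features.items():            # 1. features activated
        for cls, weight in index.get(feature, {}).items():         # 2. edges activated
            scores[cls] += activation * weight                     # 3. receiving classes activated
    # 4. classes sorted/ranked: highest score first, ties broken by class name
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Hypothetical index with made-up weights.
example_index = {
    "f1": {"c1": 0.5, "c2": 0.25},
    "f2": {"c2": 0.5},
    "f3": {"c3": 1.0},
}
example_ranking = rank_classes(example_index, {"f1": 1.0, "f2": 1.0})
```

Only classes reachable from an active feature are ever touched, so the cost depends on the number of activated edges rather than on the total number of classes.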

Summary
Encouraging signs that elements of unsupervised (more “autonomous”) long-term learning systems are developing:
  – For instance, efficient many-class learning is a good possibility.
  – Good progress in machine learning (e.g., some evidence that hierarchical networks are useful).
Our work stresses large-scale and long-term learning:
  – A “systems” approach (compared to traditional neural network approaches): we need to solve multiple problems, and we need multiple algorithms.
  – Many challenges:
      – Uncertainties (e.g., feature noise and label noise)
      – Nonstationarities (concepts evolve, the system evolves and develops)
      – System objective(s)?
      – Avoiding accumulation of error, local minima, slow learning
      – Understanding the interaction between different modules (segmentation and concept learning, etc.)
Driven by the goal of robustly solving practical problems (versus being driven by “modeling” the brain), but problems that we think intelligence in the biological world solves.

Expedition (a 1st System)
[Figure: a window slides over the stream “… New Jersey in …”. The window contains the context and the target: the categories in the context are the predictors (active categories), and the target is the category to predict. At the next time step the window advances, giving new predictors and a new target.]
In this example, the context contains one category on each side.
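The window mechanics can be sketched as a generator of (predictors, target) pairs. This assumes the stream has already been segmented into a list of categories, which the real system has to work out for itself; the function name `prediction_episodes` and the `context_size` parameter are hypothetical, and truncating the window at the ends of the stream is a choice made here for simplicity.

```python
def prediction_episodes(categories, context_size=1):
    """Yield (predictors, target) pairs from a segmented stream.

    Each category in turn is the target; up to `context_size` categories
    on each side form the context and act as predictors. The window is
    truncated at the ends of the stream (an assumption of this sketch).
    """
    for i, target in enumerate(categories):
        left = categories[max(0, i - context_size):i]
        right = categories[i + 1:i + 1 + context_size]
        yield left + right, target


# One category of context on each side, as in the slide's example.
episodes = list(prediction_episodes(["New", "Jersey", "in"]))
```

For the stream above, the middle episode has predictors `["New", "in"]` and target `"Jersey"`. Each episode is one round of the prediction game: the predictors are used to rank candidate categories, the true target is then observed, and the weights are updated.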

.. Some Time Later ..
[Figure: the window (context and target) over the stream “… loves New York life …”, with the predictors and the target (category to predict) marked.]
In terms of supervised learning/classification, in this learning activity (prediction games):
  – The set of concepts grows over time.
  – The same holds for features/predictors (concepts ARE the predictors!).
  – The instance representation (segmentation of the data stream) changes/grows over time.

A View of ML: On the Source of Classes (A Spectrum of Feedback-Driven (“Supervised”) Learning)
Classes human defined, labels human/explicitly assigned (a human procures the training data): classic supervised learning.
  – Annotator/editorial label assignment (Reuters RCV1, ODP, …), controlled image tagging, ~Mechanical Turk, explicit personalization (news filtering, spam, …)
Classes human defined, labels implicitly assigned (by the “world” or a “natural” activity, or by machine):
  – Predict a word using context in text, the Newsgroup data set, image tagging in Flickr, users as classes, queries as classes, predict clicks, …
Classes machine defined, labels implicitly assigned (by the “world” or a “natural” activity/machine): autonomous learning systems.
  – Systems acquiring and developing their own concepts, prediction games, complex sensory input streams, cumulative learning, life-long learning, development, …
Along the spectrum toward machine-defined classes: more machine autonomy (less human involvement), more noise/uncertainty, more training data, more classes, more open problems! More interesting!
