Presentation transcript:

Semantic communication with simple goals is equivalent to on-line learning Brendan Juba (MIT CSAIL & Harvard) with Santosh Vempala (Georgia Tech) Full version in Chs. 4 & 8 of my Ph.D. thesis:

Interesting because…
1. On-line learning algorithms provide the first examples of feasible ("universal") semantic communication. Or…
2. Semantic communication problems provide a natural generalization of on-line learning.

So? New models of on-line learning will be needed for most problems of interest. These semantic communication problems may provide a crucible for testing the utility of new learning models.

1. What is semantic communication? 2. Equivalence with on-line learning 3. An application: feasible examples 4. Limits of “basic sensing”

Miscommunication happens… Q: CAN COMPUTERS COPE WITH MISCOMMUNICATION AUTOMATICALLY??

What is semantic communication? A study of compatibility problems by focusing on the desired functionality (the "goal"). [Diagram: the Environment gives x to the User; the User exchanges messages with the Server S and reports back; the Environment checks "user message = f(x)?" A user that succeeds with every server in a class S is an "S-universal user for computing f."]

Multi-session goals [GJS'09]. [Diagram: the Environment runs session 1, session 2, session 3, …] An infinite-session strategy must make zero errors after a finite number of rounds. This work considers "one-round" goals: one session = one round.

Summary: 1-round goals
- The goal is given by an Environment (entity) and a Referee (predicate).
- An adversary chooses an infinite sequence of Environment states σ_1, σ_2, …
- On round i, the Referee produces a Boolean verdict based on σ_i and the messages received from the User and the Server.
- Achieving the goal = the Referee rejects only finitely often.

S-universal user for a 1-round goal. A user strategy is S-universal if, for every server S in the class S, the goal is achieved in the system with S (thus, for every sequence of Environment states, the Referee rejects the messages sent by the user and S only finitely many times: "finitely many errors").

Anatomy of a user. [Diagram: a Controller interacts with the Environment and receives sensing feedback.] The sensing feedback is goal-specific (e.g., an interactive proof verifier for f); the Controller is a generic strategy-search algorithm (e.g., enumeration). Motivation for this work: can we find an efficient strategy-search algorithm in any nontrivial setting? Strangely, learning theory has played no role so far…

Sensing for multi-session goals. [Diagram: across sessions 1, 2, 3, …, sensing lets the user notice failures and decide "I'd better try something else!"] Safety: errors are detected within a finite number of rounds. Viability: an appropriate communication strategy sees no failures within a finite number of rounds. This work bounds all delays to one round. 1-safety: errors are detected within one round. 1-viability: an appropriate communication strategy sees no failures within one round.

Key definition: generic universal user. For a given class of user strategies U, we say that a (controller) strategy is an m-error generic universal user for U if, for any 1-round goal, any class of servers S, and any sensing function V such that (i) V is 1-safe for the goal with every S in S, and (ii) V is 1-viable for the goal with every S in S via some user strategy U in U, the controller strategy using V makes at most m(U) errors with any server S in S for which U in U is 1-viable.
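
To make the definitions concrete, here is a minimal Python sketch of the 1-round setup; the interface names (Server, Controller, run_rounds, referee, sensing) are mine, not an API from the paper. It simply plays out a sequence of rounds and counts the Referee's rejections; achieving the goal means this count stays finite.

```python
from typing import Callable, Iterable, Protocol


class Server(Protocol):
    def respond(self, user_msg: str) -> str: ...


class Controller(Protocol):
    def next_message(self, state) -> str: ...
    def feedback(self, ok: bool) -> None: ...   # sensing verdict for the previous round


def run_rounds(states: Iterable,                # adversarial sequence sigma_1, sigma_2, ...
               controller: Controller,
               server: Server,
               referee: Callable[[object, str, str], bool],   # verdict from (state, user msg, server msg)
               sensing: Callable[[object, str, str], bool],   # 1-safe / 1-viable feedback
               ) -> int:
    """Play out the 1-round goal and return the number of errors (Referee rejections)."""
    errors = 0
    for sigma in states:
        user_msg = controller.next_message(sigma)
        server_msg = server.respond(user_msg)
        if not referee(sigma, user_msg, server_msg):
            errors += 1                         # achieving the goal = finitely many of these
        controller.feedback(sensing(sigma, user_msg, server_msg))
    return errors
```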

1. What is semantic communication? 2. Equivalence with on-line learning 3. An application: feasible examples 4. Limits of “basic sensing”

Recall: on-line learning [BF'72, L'88]. [Diagram: on trial i, the Environment presents x_i, the algorithm guesses y_i, and it is told whether f(x_i) = y_i for the target f ∈ C.] An m-mistake-bounded learning algorithm for C: for any f ∈ C and any sequence x_1, x_2, x_3, …, the algorithm makes at most m(f) wrong guesses. An algorithm is said to be conservative if its state only changes following a mistake.
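
As a concrete example of a conservative mistake-bounded learner, the classic halving algorithm for a finite class of {0,1}-valued hypotheses predicts by majority vote and, only after a mistake, discards every hypothesis that voted for the wrong label; it makes at most log_2 |C| mistakes. A minimal sketch (the class name is mine):

```python
from collections import Counter


class HalvingLearner:
    """Conservative mistake-bounded learner for a finite class C of {0,1}-valued hypotheses.
    Each mistake eliminates at least half of the remaining hypotheses,
    so at most log2(|C|) mistakes are made on any target f in C."""

    def __init__(self, hypotheses):
        self.version_space = list(hypotheses)   # hypotheses still consistent with the feedback

    def predict(self, x):
        votes = Counter(h(x) for h in self.version_space)
        return votes.most_common(1)[0][0]       # majority vote

    def update(self, x, y_guess, correct):
        # Conservative: the state only changes following a mistake.
        if not correct:
            # The wrong guess was the majority label, so at least half of the
            # version space predicted it; all of those hypotheses are eliminated.
            self.version_space = [h for h in self.version_space if h(x) != y_guess]
```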

Main result. A conservative m-mistake-bounded learning algorithm for C is an (m+1)-error generic universal user for C; an m-error generic universal user for C is an m-mistake-bounded learning algorithm for C. (⇒: on an error, the user's message cannot have been consistent with the viable f ∈ C. ⇐: on-line learning is captured by a 1-round goal in which each f ∈ C is represented by a server S_f.)
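
The forward direction is essentially a wrapper: use the conservative learner's current prediction as the round's message and treat a negative sensing verdict as the learner's mistake signal. A minimal sketch in the spirit of the reduction, reusing the illustrative interfaces above (not the paper's construction verbatim):

```python
class LearnerAsUniversalUser:
    """Wrap a conservative mistake-bounded learner as a generic universal user:
    a negative sensing verdict plays the role of "your last guess was a mistake".
    If some strategy in the class is 1-viable, 1-safety of sensing means every error
    is flagged within the round, so the learner's mistake bound caps the errors."""

    def __init__(self, learner):
        self.learner = learner
        self.last = None                      # (state, guess) from the previous round

    def next_message(self, state):
        guess = self.learner.predict(state)
        self.last = (state, guess)
        return guess

    def feedback(self, ok: bool):
        state, guess = self.last
        # Conservative learner: its state changes only when sensing flags an error.
        self.learner.update(state, guess, correct=ok)
```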

1. What is semantic communication? 2. Equivalence with on-line learning 3. An application: feasible examples 4. Limits of “basic sensing”

Theorem. There is an O(n^2 (b + log n))-mistake-bounded learning algorithm for halfspaces with b-bit integer weights over Q^n, running in time polynomial in n, b, and the length of the longest instance on each trial. Key point: the number of mistakes depends only on the representation size of the halfspace, not on the examples. The algorithm is based on a reduction of halfspace learning to convex feasibility with a separation oracle [MT'94], combined with a technique for convex feasibility over sets of lower dimension [GLS'88].
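
For flavor only, here is an ellipsoid-style sketch of the convex-feasibility idea in the easy case where the weight vectors consistent with the target occupy a region of non-negligible volume inside the unit ball; the theorem's actual algorithm must also handle lower-dimensional feasible sets, which is where the [GLS'88] machinery enters. The class and its interface are my own illustration, not the algorithm from the theorem.

```python
import numpy as np


class EllipsoidHalfspaceLearner:
    """Illustrative conservative learner for homogeneous halfspaces sign(w* . x).
    It keeps an ellipsoid containing the consistent weight vectors, predicts with
    the center, and on a mistake performs a central cut with the violated
    constraint. The mistake bound comes from a volume argument, under this
    sketch's simplifying assumption that the feasible set has real volume."""

    def __init__(self, n):
        assert n >= 2
        self.n = n
        self.c = np.zeros(n)    # ellipsoid center = current hypothesis
        self.P = np.eye(n)      # shape matrix; start from the unit ball

    def predict(self, x):
        return 1 if self.c @ x >= 0 else -1

    def update(self, x, y):
        """Call only on a mistake: y * (c . x) <= 0 although y * (w* . x) > 0."""
        n, P, c = self.n, self.P, self.c
        a = -y * np.asarray(x, dtype=float)   # keep the half-space { w : a.w <= a.c }
        denom = np.sqrt(a @ P @ a)
        if denom == 0:
            return
        ahat = P @ a / denom
        self.c = c - ahat / (n + 1)
        self.P = (n * n / (n * n - 1.0)) * (P - (2.0 / (n + 1)) * np.outer(ahat, ahat))
```

Each central cut shrinks the ellipsoid's volume by a factor of roughly e^(-1/(2(n+1))), which is what drives the mistake bound under the volume assumption.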

Interesting because…
1. On-line learning algorithms provide the first examples of feasible ("universal") semantic communication. (This confirms a main conjecture from [GJS'09].)

Extension beyond one round. Work by Auer and Long [AL'99] yields efficient universal user strategies for k-round goals (when U is a class of stateless strategies and k ≤ log log n), or for classes of (log log n)-bit-valued functions, given an efficient mistake-bounded algorithm for one round (resp. bitwise).

But of course, halfspaces << general protocols. We believe that only relatively weak functions are learnable. ☞ There are limits to what can be obtained by this equivalence…

1. What is semantic communication? 2. Equivalence with on-line learning 3. An application: feasible examples 4. Limits of “basic sensing”

Theorem. If C = {f:X→Y} is such that for every (x,y) ∈ X×Y some f satisfies f(x)=y, then any mistake-bounded learning algorithm for C (from 0-1 feedback) must make Ω(|Y|) mistakes on some f w.h.p. E.g., linear transformations…

Sketch. Idea: negative feedback is not very informative; many f ∈ C remain indistinguishable. For every distribution over user strategies and every x, some y is guessed with probability at most 1/|Y|. By a min-max argument, there is a distribution over f such that negative feedback is received with probability at least 1 − 1/|Y| on each guess. After k guesses, the total probability of positive feedback has only increased by a k/(1 − k/|Y|) factor.
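
A tiny simulation conveys the intuition: with only right/wrong feedback, even the optimal strategy of never repeating a rejected value makes about (|Y| − 1)/2 wrong guesses in expectation against a uniformly random target, and a class as in the theorem contains a consistent f for every such target. This is a hypothetical illustration, not the proof's construction.

```python
import random


def mistakes_to_identify(num_labels: int, trials: int = 100_000) -> float:
    """Average number of wrong guesses before hitting a uniformly random target
    label, when the only feedback is right/wrong and the guesser never repeats
    a rejected label. The expectation is (|Y| - 1) / 2."""
    total = 0
    for _ in range(trials):
        target = random.randrange(num_labels)
        candidates = list(range(num_labels))
        random.shuffle(candidates)            # any fixed non-repeating order does equally well
        total += candidates.index(target)     # wrong guesses made before reaching the target
    return total / trials


if __name__ == "__main__":
    for y in (4, 16, 64):
        print(y, round(mistakes_to_identify(y), 2))   # roughly (y - 1) / 2
```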

So, generic universal users for such a class must be exponentially inefficient in the message length. Likewise, traditional hardness results for Boolean concepts show that, e.g., DFAs [KV'94] and AC^0 circuits [K'93] don't have efficient generic universal users.

Recall the anatomy of a user. [Diagram: Controller, Environment, sensing feedback.] Sensing was only introduced to make the problem easier to solve!

We don't have to use "basic sensing"! Any feedback we can provide is fair game. Interesting because…
2. Semantic communication problems provide a natural generalization of on-line learning. The negative results ⇒ new models of learning are needed to tackle these problems, and semantic communication problems provide natural motivation.

References
[GJS'09] Goldreich, Juba, Sudan. A theory of goal-oriented communication. ECCC Technical Report, 2009.
[BF'72] Bārzdiņš, Freivalds. On the prediction of general recursive functions. Soviet Math. Dokl. 13:1224–1228, 1972.
[L'88] Littlestone. Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm. Mach. Learn. 2(4):285–318, 1988.
[AL'99] Auer, Long. Structural results about on-line learning models with and without queries. Mach. Learn. 36(3):147–181, 1999.
[MT'94] Maass, Turán. How fast can a threshold gate learn? In Computational Learning Theory and Natural Learning Systems: Constraints and Prospects, vol. 1, MIT Press, 1994.
[GLS'88] Grötschel, Lovász, Schrijver. Geometric Algorithms and Combinatorial Optimization. Springer, 1988.
[KV'94] Kearns, Valiant. Cryptographic limitations on learning Boolean formulae and finite automata. J. ACM 41:67–95, 1994.
[K'93] Kharitonov. Cryptographic hardness of distribution-specific learning. In: 25th STOC, pp. 372–381, 1993.
[J'10] Juba. Universal Semantic Communication. Ph.D. thesis, MIT, 2010. Available online. (Springer edition coming soon.)