Proof Planning as Understanding as Cortical Functions
Brendan Juba
With Manuel Blum, Matt Humphrey, and Ryan Williams
WHAT and WHY
- WANTED: a proof-clustering algorithm
  - Characterizes high-level ideas
  - Aids theorem provers
- WANTED: a definition of CONSCSness
  - Aids in answering fundamental questions
  - Basis for developing protocols
  - Directs development of robots, etc.

Recap. WHAT: we wished to find a proof-clustering algorithm. WHY: we felt that sets of proofs would serve as a suitable characterization of high-level ideas. A natural first step: well-typed output, in which the specifics would (hopefully) provide insight. The algorithm should also improve theorem proving (by aiding proof planners), and it plays an important role in a larger project: CONSCSness, a computer scientist's definition of consciousness. Such a definition MIGHT help resolve philosophical and ethical disputes, and should prove invaluable in developing protocols to test for CONSCSness, in developing robots (or other systems) that are CONSCS, etc.
CONTENTS
- NO algorithms
- INSTEAD: where to discover an algorithm
  - Viewing the neocortex as a proof planner
  - Why we expect suitability
  - The link to understanding

We don't yet have the algorithm we wanted. This is unsurprising: the DESIRED FUNCTION is NOT FULLY SPECIFIED, and "guess and check" is not so efficient. The algorithm should come hand-in-hand with reasons to believe it is desirable; those reasons are provided by (falsifiable) theories of the function of the neocortex. WE SHOW that a suitable algorithm is PREDICTED by the MEMORY-PREDICTION FRAMEWORK (HAWKINS); WE INDICATE where the algorithm is IMPLEMENTED (demonstrating that the neocortex acts as a proof planner); and WE ELABORATE on the INTERPRETATION of the cortical system as the mechanism for UNDERSTANDING. This will suggest it is THE desired function. We'll come back to this short outline later.
Proof Planning in the Memory-Prediction Framework
- Suppose Alice is studying proofs... Under the framework:
  - Regions of cortex representing proof steps switch on in sequence
  - Hierarchically higher regions form "names" -- "names" and "names of patterns of names"
  - Alice can recall the patterns later
  - The patterns serve as proof plans (more...)

Consider ALICE. Alice sits reading proof after proof, diligently studying. Maybe it's an algorithms test? Maybe it's personal edification? Who knows. Under the memory-prediction framework, we expect regions of the cortex encoding the steps of the proofs to switch on in sequence. These encodings are passed as inputs to hierarchically higher regions of the cortex. Receiving these inputs, the higher region assigns NAMES to RECURRING PATTERNS. In the future, these NAMES are passed to STILL HIGHER regions of the cortex whenever the pattern is seen or anticipated. The STILL HIGHER regions in turn form NAMES out of RECURRING PATTERNS of NAMES. Alice can RECALL these PATTERNS, PATTERNS OF PATTERNS, and so on. Down the hierarchy, the recalled patterns trigger a sequence of proof steps (the actual steps triggered will depend on context). Through this link, the PATTERNS serve as PROOF PLANS.
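The naming process sketched in the notes can be caricatured as repeated chunking: one level of the hierarchy gives a fresh name to each recurring adjacent pair of symbols from the level below. The following toy Python sketch is purely illustrative -- the step labels, the pair-based chunking, and the bracketed naming scheme are all assumptions of the sketch, not the (undiscovered) cortical mechanism.

```python
# Toy sketch of one level of hierarchical "naming" of recurring patterns.
# Step labels and the naming scheme are illustrative assumptions only.
from collections import Counter

def name_recurring_pairs(sequence, min_count=2):
    """One hierarchy level: assign fresh names to adjacent pairs that recur."""
    pairs = Counter(zip(sequence, sequence[1:]))
    names = {p: f"<{p[0]}+{p[1]}>" for p, c in pairs.items() if c >= min_count}
    # Re-encode the sequence, replacing each named pair by its name.
    out, i = [], 0
    while i < len(sequence):
        pair = tuple(sequence[i:i + 2])
        if pair in names:
            out.append(names[pair])
            i += 2
        else:
            out.append(sequence[i])
            i += 1
    return out, names

# "Alice reads proofs": steps recur, and a higher level names the patterns.
steps = ["induct", "base", "simplify", "induct", "base", "simplify"]
level1, names1 = name_recurring_pairs(steps)
```

Applying the same function again to `level1` would name "patterns of names," giving the "names of patterns of names" of the slide; in the framework, each such level also flows back down to regenerate the step sequence.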
Patterns serve as proof plans?
- Generates a sequence of proof steps
- Features: expectancy, generality
- Satisfied by named patterns in the cortex
- Proof steps encoded in lower regions

A proof plan is a method for generating sequences of proof steps. Most significantly, plans should feature:
- expectancy (we have some reason to expect success, since they appeared in real proofs)
- generality (if we allow the actual steps generated to vary with context, we generate more than one proof)
In case of failure, it should also be possible to identify where the pattern fails to be instantiated into a valid proof sequence; this yields patchability. The output sequences of proof steps (encoded in their respective hierarchically lower cortical regions) should have these features.
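What generality and patchability ask of a plan can be made concrete with a deliberately naive sketch: a plan is a sequence of abstract steps, and a context supplies the concrete step for each. All names here (the plan, the context entries) are hypothetical examples.

```python
# Toy sketch: a proof plan as a step-sequence template whose concrete steps
# vary with context ("generality"). Representation is an illustrative assumption.
def instantiate(plan, context):
    """Fill each abstract step with a context-specific concrete step."""
    steps = []
    for abstract_step in plan:
        concrete = context.get(abstract_step)
        if concrete is None:
            # "Patchability": report exactly where instantiation fails.
            return None, abstract_step
        steps.append(concrete)
    return steps, None

plan = ["induct", "base_case", "inductive_step"]
ctx = {"induct": "induct on n",
       "base_case": "verify n = 0",
       "inductive_step": "assume P(n), show P(n+1)"}
steps, failure = instantiate(plan, ctx)
```

Different contexts yield different proofs from the same plan, and a missing context entry pinpoints the step at which the plan fails to instantiate.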
Where’s the algorithm?
- The critical link between cortical regions
- A cortical region forms a name for an input pattern
- Translated: it forms a proof plan from a pattern of already-formed proof plans
- Our algorithm! Presently: not understood.

The algorithm is implemented in the mechanism that forms names for patterns in the hierarchy: the link between a cortical region and the cortical regions hierarchically below it. Recall: as names in lower regions are flashed to a higher region, the higher region forms a name for the pattern of names. In proof-planning terms, lower levels interpret input proofs in terms of constituent proof tactics and plans, and the higher level forms a proof plan out of patterns of these already-formed proof plans. Exactly what we wanted! Unfortunately, the mechanism is PREDICTED to exist -- it has NOT been DISCOVERED, and the LINK between CORTICAL REGIONS is NOT UNDERSTOOD.
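The "proof plan out of already-formed proof plans" picture can be sketched as a plan library whose entries may name other plans; expanding a plan walks the hierarchy down to primitive proof steps. The library entries and plan names below are hypothetical illustrations, not claims about actual cortical encodings.

```python
# Toy sketch: plans composed of already-formed plans, as a recursive library.
# Library contents and names are illustrative assumptions.
def expand(plan, library):
    """Recursively expand a plan whose steps may themselves name sub-plans."""
    steps = []
    for step in plan:
        if step in library:                 # the step is itself a named plan
            steps.extend(expand(library[step], library))
        else:
            steps.append(step)              # a primitive proof step
    return steps

library = {
    "INDUCTION": ["base_case", "inductive_step"],
    "STRONG_INDUCTION": ["INDUCTION", "strengthen_hypothesis"],
}
flat = expand(["STRONG_INDUCTION", "conclude"], library)
```

The missing piece, as the slide says, is the learning direction: the mechanism that *adds* a new named entry to the library from recurring patterns of existing entries.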
Proof Planning and the Cortical Algorithms
- Conservative learning algorithms yield lower bounds
- Proof planning: a restricted domain
- Decoded cortical algorithms yield a system for learning and utilizing proof plans
- "But, is it any good?"

Relationships between proof-planning algorithms and cortical algorithms: proof-plan learning algorithms that generate only "natural" proof plans (e.g., conservative ones) yield lower bounds on the variety of proof plans and, potentially, on the computational power of the cortex. These are only lower bounds, since proof planning is a RESTRICTED DOMAIN in every sense, restricting the domain of the learning algorithm. In the other direction, decoding the cortical algorithms yields a system for learning and utilizing proof plans (since the cortical algorithms treat a superset of these patterns). We haven't yet justified the use of these algorithms -- why should it be good?
CONTENTS (you are here)
- Where to discover an algorithm
- Viewing the neocortex as a proof planner
- Why we expect suitability
- The link to understanding

We have just shown that, under the memory-prediction framework, the cortex implements algorithms capable of learning and utilizing proof plans. Next, we explain why we anticipate that the algorithms utilized by the cortex should be ideal for proof planning, by linking them to understanding.
Understanding in the Memory-Prediction Framework

Take a minute to look at this famous example. It is a visual example, but under Mountcastle's common cortical algorithm hypothesis, it is suitable. Under the memory-prediction framework, understanding occurs when the cortex predicts a pattern that is confirmed by the input (e.g., when the input matches some known pattern) -- in this case, when your cortex guesses "Dalmatian." For proof plans, this is the case where a proof matches a known plan.
Understanding as Proof Plans
- Share several characteristics
- Identifying a proof plan permits:
  - Prediction
  - Correction of "minor" mistakes
  - Re-use of ideas and/or techniques
  - Generation of summaries

You know a proof plan for a proof -- so what? Why elevate this to the status of "UNDERSTANDING"? Knowledge of a proof plan has characteristics identified with understanding. The property stressed in the memory-prediction framework is the ability to predict the next step, by generating it from the plan. By regenerating portions of a proof from the plan, one can correct mistakes -- only "minor" mistakes, since one has to identify the proof with the plan in the first place! Trivially, provided the plan has generality, it can be used to generate proofs of other theorems. Expressing the proof in terms of constituent plans at various levels yields summaries with varying levels of detail. These four are the "big ones" that came to mind; I'd like to know about other properties of understanding.
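Assuming the same naive sequence-of-steps representation as before, the first two properties -- prediction and correction of minor mistakes -- fall out of identifying a proof with a plan up to a small number of mismatches. The tolerance threshold and step names are illustrative assumptions.

```python
# Toy sketch: identifying a proof with a plan buys prediction and
# minor-mistake correction. Representation is an illustrative assumption.
def matches(plan, proof, tolerance=1):
    """Identify proof with plan if they differ in at most `tolerance` steps."""
    if len(proof) > len(plan):
        return False
    mismatches = sum(p != q for p, q in zip(plan, proof))
    return mismatches <= tolerance

def predict_next(plan, proof_so_far):
    """Prediction: generate the next step from the plan."""
    return plan[len(proof_so_far)] if len(proof_so_far) < len(plan) else None

def correct(plan, proof):
    """Correction: regenerate the proof-so-far from the plan."""
    return list(plan[:len(proof)])

plan = ["induct", "base_case", "inductive_step", "conclude"]
partial = ["induct", "base_cse"]   # one "minor" mistake
```

Note the slide's caveat built into `matches`: a proof with too many deviations is never identified with the plan at all, so only minor mistakes are correctable.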
Ideal Proof Plans
- Goals of proof planning:
  - Mimic human theorem proving
  - Produce human-oriented output
- Goal for CONSCS:
  - Characterize high-level ideas

Under the memory-prediction framework, the proof plans generated by the cortex are IDEAL in many ways. The aims of using proof plans to prove theorems were: mimicry of the human theorem-proving process (trivially satisfied), and production of human-oriented hierarchical proofs as output ("hiproofs," Lamport's "structured proofs"). Since identification of an (internal) proof plan is taken to be understanding, and the plans generated by cortical functions should match those actually used by humans, including plans generated by cortical functions should trivialize understanding. Likewise, the cortical functions should yield precisely the class of patterns characterizing high-level ideas, as needed by CONSCS.
Directions for Future Work
- Decipher the cortical algorithms!!
- Automate learning of proof plans
- Analyze the cortical functions
- Refine definitions for CONSCS

CLEARLY, the summer's work only scratched the surface of this problem. The most glaring omission is our lack of an algorithm. We indicated where we expect the ideal function to be implemented; one has "only" to decipher the implementation. This is the prerequisite for any of the other points above. (Point mentioned by Luis von Ahn: deciphering the cortex solves AI. We only mention point #2 since that's the goal we started with, but under Hawkins' framework and Mountcastle's hypothesis, we should be able to do much more with this algorithm, like solving vision, language, etc.; this slide could certainly include far more than four points.) Once the implementation is understood, we can replicate the function digitally and thus learn these "ideal" proof plans automatically. We can also perform various analyses of the cortical functions: how powerful are they? What are good inputs? Bad inputs? Finally, a formal understanding of the cortical functions will permit formalization of portions of the definition of CONSCS.

(Question by Russell Schwartz: why not use richer classes of patterns? A more powerful theorem prover should be useful. Response #1: if you can get it, more power to you. Right now we'd be happy simply having something as powerful as a human. Of course, there are still some nagging questions about what class of patterns really yields an "optimal" theorem prover in terms of its output over time. Response #2, one I didn't think of until later: the downside to a richer class of patterns is that if a pattern can't be learned by the cortex, then we don't expect a human to be capable of learning the pattern, which prevents the output from being "human-oriented," i.e., understandable.)