Slide 1
Exploiting Cognitive Constraints To Improve Machine-Learning Memory Models
Michael C. Mozer, Department of Computer Science, University of Colorado, Boulder
Slide 2
Why Care About Human Memory?
- The neural architecture of human vision has inspired computer vision. Perhaps the cognitive architecture of memory can inspire the design of RAM systems.
- Understanding human memory is essential for ML systems that predict what information will be accessible or interesting to people at any moment.
- E.g., selecting material for students to review to maximize long-term retention (Lindsey et al., 2014)
Slide 3
The World's Most Boring Task
Stimulus X -> Response a
Stimulus Y -> Response b
[Figure: distribution of response latencies (frequency vs. response latency)]
Slide 4
Sequential Dependencies
Dual Priming Model (Wilder, Jones, & Mozer, 2009; Jones, Curran, Mozer, & Wilder, 2013)
- Recent trial history leads to an expectation of the next stimulus
- Response latencies are fast when reality matches expectation
- Expectation is based on exponentially decaying traces of two different stimulus properties (see the sketch below)
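The published model is more elaborate (it combines two such traces, one per stimulus property), but purely as illustration, here is a minimal Python sketch of a single exponentially decaying trace turning recent trial history into an expectation. The decay constant and the match score are hypothetical placeholders, not fitted values from the papers.

```python
import numpy as np

DECAY = 0.6  # hypothetical decay constant; the real model fits this to data

def step(trace, stimulus, n=2):
    """Update one exponentially decaying trace and score the new stimulus.

    trace[i] accumulates decayed evidence that stimulus i occurs; a stimulus
    matching the expectation should yield a fast response latency."""
    if trace.sum() > 0:
        expectation = trace / trace.sum()
    else:
        expectation = np.full(n, 1.0 / n)  # no history yet: uniform expectation
    match = expectation[stimulus]          # high match -> fast response
    trace = DECAY * trace + np.eye(n)[stimulus]
    return trace, match

trace = np.zeros(2)
for s in [0, 0, 0, 1]:                     # three repetitions, then a switch
    trace, match = step(trace, s)
    print(f"stimulus {s}: expectation match = {match:.2f}")
```

On this toy sequence the match score rises across the repetitions and collapses at the switch, mirroring fast responses to expected stimuli and slow responses to surprises.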
Slide 5
Examining Longer-Term Dependencies (Wilder, Jones, Ahmed, Curran, & Mozer, 2013)
Slide 6
Declarative Memory
Cepeda, Vul, Rohrer, Wixted, & Pashler (2008)
[Figure: timeline of the study and test phases]
Slide 7
Forgetting Is Influenced By The Temporal Distribution Of Study
Spaced study produces more robust & durable learning than massed study.
Slide 8
Experimental Paradigm To Study Spacing Effect
Slide 9
Cepeda, Vul, Rohrer, Wixted, & Pashler (2008)
[Figure: % recall as a function of intersession interval (days)]
Slide 10
Optimal Spacing Between Study Sessions as a Function of Retention Interval
Slide 11
Predicting The Spacing Curve
[Diagram: characterization of student and domain, forgetting after one session, and intersession interval feed into the Multiscale Context Model, which outputs predicted recall. Plot axes: Intersession Interval (Days) vs. % Recall]
Slide 12
- Multiscale Context Model (Mozer et al., 2009): neural network; explains spacing effects
- Multiple Time Scale Model (Staddon, Chelaru, & Higa, 2002): cascade of leaky integrators; explains rate-sensitive habituation
- Kording, Tenenbaum, & Shadmehr (2007): Kalman filter; explains motor adaptation
Slide 13
Key Features Of Models
Each time an event occurs in the environment...
- A memory of this event is stored via multiple traces
- Traces decay exponentially at different rates
- Memory strength is a weighted sum of traces
- Slower scales are downweighted relative to faster scales
- Slower scales store memory (learn) only when faster scales fail to predict the event (sketched in code below)
[Diagram: fast, medium, and slow traces summed into overall trace strength]
Slide 14
[Figure: trace strength over time; each event occurrence bumps the traces, which then decay at their scale-specific rates]
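The following is a minimal sketch, in Python, of the key features listed on slide 13: multiple traces with different exponential decay rates, a downweighted sum as memory strength, and slower scales learning only the portion of an event that faster scales failed to predict. The decay rates, weights, learning rate, and cascade rule are all illustrative assumptions, not the published model's parameters.

```python
import numpy as np

decays  = np.array([0.5, 0.9, 0.99])   # fast, medium, slow retention per step
weights = np.array([1.0, 0.5, 0.25])   # slower scales downweighted in the sum

def present_event(traces, lr=0.5):
    """Store an event of strength 1 across scales. Each scale learns in
    proportion to the part of the event that the faster scales (pre-update)
    failed to predict -- the cascade property on slide 13."""
    predicted = 0.0
    new = traces.copy()
    for i in range(len(traces)):
        residual = max(1.0 - predicted, 0.0)           # unpredicted portion
        new[i] = traces[i] + lr * residual * (1.0 - traces[i])
        predicted += traces[i]                         # faster scales' prediction
    return new

def run(schedule, horizon=30):
    traces = np.zeros(3)
    for t in range(horizon):
        if t in schedule:
            traces = present_event(traces)
        traces = traces * decays       # exponential decay at each scale's rate
    return float(weights @ traces)     # memory strength = weighted sum of traces

# Same number of presentations; spacing changes what the slow scales learn.
print(f"massed (t=0,1,2)  : {run({0, 1, 2}):.3f}")
print(f"spaced (t=0,10,20): {run({0, 10, 20}):.3f}")
```

With spaced presentations the fast trace has decayed by the time the item recurs, so more of the prediction error reaches the slow, durable scales; the spaced schedule ends with higher strength, echoing slide 7.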
Slide 15
Exponential Mixtures ➜ Scale Invariance
- An infinite mixture of exponentials gives exactly a power function
- A finite mixture of exponentials gives a good approximation to a power function
- With appropriate mixture weights, can fit arbitrary power functions
[Diagram: sum of several exponential decay curves approximating a power-law curve]
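The infinite-mixture claim is a standard Gamma-function identity: weighting decay rates λ by density λ^(α−1) gives ∫₀^∞ λ^(α−1) e^(−λt) dλ = Γ(α) t^(−α), an exact power function. The sketch below (hypothetical time constants; weights fit by least squares on relative error) checks how well just three exponentials track t^(−1/2) over two decades:

```python
import numpy as np

t = np.logspace(0, 2, 400)            # t from 1 to 100
target = t ** -0.5                    # a power-law forgetting curve

# Three exponentials with hypothetical fast/medium/slow time constants
taus = np.array([2.0, 10.0, 50.0])
basis = np.exp(-t[:, None] / taus[None, :])

# Fit mixture weights by least squares on *relative* error,
# so small late-time values count as much as large early ones
w, *_ = np.linalg.lstsq(basis / target[:, None], np.ones_like(t), rcond=None)
approx = basis @ w

rel_err = np.max(np.abs(approx - target) / target)
print("mixture weights:", np.round(w, 3))
print(f"max relative error on t in [1, 100]: {rel_err:.1%}")
```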
Slide 16
Relationship To Memory Models In Ancient NN Literature
- Focused backprop (Mozer, 1989), LSTM (Hochreiter & Schmidhuber, 1997): little/no decay
- Multiscale backprop (Mozer, 1992), Tau net (Nguyen & Cottrell, 1997): learned decay constants; no enforced dominance of fast scales over slow scales
- Hierarchical recurrent net (El Hihi & Bengio, 1995): fixed decay constants
- History compression (Schmidhuber, 1992; Schmidhuber, Mozer, & Prelinger, 1993): event based, not time based
Slide 17
Sketch of Multiscale Memory Module
- x_t: activation of 'event' in input to be remembered, in [0,1]
- m_t: memory trace strength at time t
- Activation rule (memory update) based on the error between input x_t and memory m_t (one possible reading is sketched below)
- Activation rule consistent with the 3 models (for the Kording model, ignore KF uncertainty)
- This update is differentiable ➜ can backprop through the memory module
- Redistributes activation across time scales in a manner that depends on the temporal distribution of input events
- Could add an output gate as well to make it even more LSTM-like
[Diagram: module with fixed decay constants and learned weights mapping x_t to m_t]
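The slide only sketches the module, so the following is a hedged Python reading rather than the published architecture: fixed decay constants, mixing weights that a full model would learn, and a correction driven by the error x_t − m_t (the gradient of the squared error with respect to the traces), gated by the input so storage happens when the event is present. The class name, gating choice, and all constants are assumptions.

```python
import numpy as np

class MultiscaleMemory:
    """One possible reading of slide 17 (a sketch, not Mozer's actual code).

    A pool of self-recurrent traces with fixed decay constants; the read-out
    m_t is a weighted sum of traces. Every operation is differentiable, so
    in a full model the weights could be trained by backprop."""

    def __init__(self, decays=(0.5, 0.9, 0.99), weights=(0.5, 0.3, 0.2)):
        self.decays = np.array(decays)    # fixed time constants
        self.weights = np.array(weights)  # learned in a full model
        self.traces = np.zeros(len(self.decays))

    def read(self):
        return float(self.weights @ self.traces)   # m_t

    def step(self, x_t):
        """x_t in [0,1]: activation of the 'event' feature to be remembered."""
        error = x_t - self.read()              # input vs. current memory state
        # Gradient-style correction, gated by x_t so the memory only
        # stores/corrects when the feature is actually present.
        self.traces += x_t * error * self.weights
        self.traces *= self.decays             # scale-specific exponential decay
        return self.read()

mem = MultiscaleMemory()
for t, x in enumerate([1, 0, 0, 0, 1, 0, 0]):
    print(f"t={t}  x_t={x}  m_t={mem.step(x):.3f}")
```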
Slide 18
Sketch of Multiscale Memory Module
- Pool of self-recurrent neurons with fixed time constants
- Input is the response of a feature-detection neuron; x_t indicates whether the feature is detected at time t
- This memory module stores the particular feature that is detected
- When the feature is detected, the memory updates: the memory state is compared to the input, and a correction is made to the memory so that it represents the input strongly
[Diagram: module with fixed decay constants and learned weights]
Slide 19
Why Care About Human Memory?
Understanding human memory is essential for ML systems that predict what information will be accessible or interesting to people at any moment.
- E.g., shopping patterns
- E.g., pronominal reference
- E.g., music preferences