
1 Reconstructing Visual Experiences from Brain Activity Evoked by Natural Movies
Shinji Nishimoto, An T. Vu, Thomas Naselaris, Yuval Benjamini, Bin Yu, Jack L. Gallant
Presented by Matthew Leming

2 Overview
Background: What is fMRI? What vocabulary does this paper use? How is it relevant to computer vision?
The challenge presented and the main goal of the paper
The experiment in the paper (encoding video and associating it with fMRI activity)
Nature of the results
Discussion

3 Background questions
What is fMRI?
Why is fMRI the best option we have for modeling brain activity? Are there alternatives?
What are the limitations of fMRI?

4 Challenge presented
Brain activation evoked by static images is easy to see with fMRI. Because of fMRI's limited temporal resolution, however, it is difficult to relate fMRI activation to moving images. Image source: Wikimedia Commons

5 Goal of paper
To model dynamic visual stimuli in the early human visual system using fMRI data and moving visual input. This in turn helps us understand how the early human visual system processes what we see. Image source: Wikimedia Commons

6 Vocabulary in the paper
fMRI – "Functional Magnetic Resonance Imaging" – measures brain activity with a strong magnet, via changes in blood oxygenation
BOLD signal – "Blood Oxygen Level Dependent" signal, the indirect measure of neural activation that fMRI records
Areas V1/V2/V3 – successive areas of visual processing in the brain, with V1 being the lowest level
"Static visual stimuli" – stuff that stands still
"Dynamic visual stimuli" – stuff that moves
"Eccentricity" – distance from the center of the visual field (the fovea)

7 Vocabulary (cont.)
"Motion-energy encoding model" – the model they used to encode video so that it could more easily be associated with particular BOLD signals in the brain
Gabor filter – linear filter used for edge detection, used here to encode spatiotemporal (motion) information in the video
Image source: Wikimedia Commons
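
To make the Gabor filter concrete, here is a minimal sketch of sampling a 2D Gabor kernel in NumPy. The parameterization (size, wavelength, theta, sigma, phase) is illustrative, not the paper's exact filter bank; the paper uses a large bank of such filters tiled over space, orientation, and temporal frequency.

```python
import numpy as np

def gabor_2d(size, wavelength, theta, sigma, phase=0.0):
    """Sample a 2D Gabor filter: a sinusoidal carrier under a Gaussian envelope.

    size: kernel width/height in pixels; wavelength: carrier period in pixels;
    theta: orientation in radians; sigma: envelope std in pixels.
    (Illustrative parameterization, not the paper's exact filter bank.)
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates so the carrier oscillates along orientation theta.
    x_theta = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    carrier = np.cos(2.0 * np.pi * x_theta / wavelength + phase)
    return envelope * carrier

# Filtering a frame means convolving it with this kernel. Squaring and
# summing the responses of a quadrature pair (phase=0 and phase=pi/2)
# gives a local "energy" measure that is insensitive to exact phase.
```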

8 What is the problem with modeling "dynamic visual stimuli" (i.e., moving objects) with fMRI?
fMRI is very slow and has trouble keeping up with faster changes in perception.
How is it addressed in this paper?
A new motion-energy encoding model that describes fast visual information and slow hemodynamics separately. In other words, it directly encodes the parts of the video that have quick movement, then associates those features with the BOLD time signal from the fMRI.
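
A minimal sketch of this two-stage idea, under simplifying assumptions: compute fast motion-energy features from the stimulus, convolve them with a slow hemodynamic response, and regress each voxel's BOLD signal on the result. The function names, the fixed double-gamma HRF, and the plain least-squares fit are all illustrative; the paper fits voxel-specific hemodynamic weights and uses regularized regression.

```python
import numpy as np

def canonical_hrf(tr=1.0, duration=24.0):
    """A rough double-gamma hemodynamic response: positive peak near 5 s,
    undershoot near 15 s. (An assumption for illustration; the paper fits
    voxel-specific temporal weights rather than a fixed canonical HRF.)"""
    t = np.arange(0.0, duration, tr)
    peak = t**5 * np.exp(-t)
    undershoot = t**15 * np.exp(-t)
    hrf = peak / peak.max() - 0.35 * undershoot / undershoot.max()
    return hrf / np.abs(hrf).sum()

def fit_voxel(features, bold, hrf):
    """Fit one voxel: regress its BOLD time course on HRF-convolved
    motion-energy features.

    features: (T, K) fast motion-energy channels sampled at the fMRI rate
    bold:     (T,)   the voxel's measured BOLD signal
    """
    # Convolving each fast feature channel with the slow HRF models the
    # sluggish hemodynamic coupling separately from the fast stimulus.
    design = np.column_stack(
        [np.convolve(features[:, k], hrf)[:len(bold)]
         for k in range(features.shape[1])]
    )
    # Plain least squares for brevity; the paper regularizes the fit.
    weights, *_ = np.linalg.lstsq(design, bold, rcond=None)
    return weights
```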

9 What they did
Take three subjects.
Show them a bunch of 1-second video clips.
Take fMRI data from their V1, V2, and V3 areas (in the occipital lobe) while they watch, in approx. 4-second intervals.
Image source: Wikimedia Commons

10 What they did (cont.)
Build a dictionary (Bayesian encoding mechanism) associating video clips with brain activity, filtering the video clips with temporal filters (i.e., filters that detect movement; see the Encoding Model slide below).
Show the subjects new video clips.
Reconstruct what they are seeing from the dictionary.
Image source: Wikimedia Commons
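
A minimal sketch of the dictionary-style reconstruction, under simplifying assumptions: the fitted encoding model predicts a voxel-response pattern for every clip in a large dictionary; under isotropic Gaussian noise the log-likelihood of the observed pattern reduces to a negative squared error; and averaging the likeliest clips approximates the posterior mean. The names and the top_k value are illustrative, not the paper's exact procedure.

```python
import numpy as np

def reconstruct(observed, predicted_responses, clips, top_k=100):
    """Rank dictionary clips by how well their predicted voxel responses
    match the observed response, then average the best matches.

    observed:            (V,)      observed voxel response pattern
    predicted_responses: (N, V)    encoding-model predictions per clip
    clips:               (N, H, W) grayscale frames for each clip
    """
    # Under isotropic Gaussian noise, log-likelihood is (up to constants)
    # the negative squared distance between observed and predicted responses.
    errs = np.sum((predicted_responses - observed) ** 2, axis=1)
    best = np.argsort(errs)[:top_k]
    # Averaging the likeliest clips approximates the posterior mean.
    return clips[best].mean(axis=0)
```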

11 What does this have to do with computer vision?

12 Per the abstract: "Visualization of the fit models reveals how early visual areas represent the information in movies...These results demonstrate that dynamic brain activity measured under naturalistic conditions can be decoded using current fMRI technology."
This is a demonstration of:
1. How the brain processes visual information
2. Ways that we can read the brain for this type of activity

13 Encoding model

14 Results

15 (figure-only slide)

16 Encoding filters and accuracy

17 (figure-only slide)

18 Implications of results
Human speed perception depends on eccentricity, i.e., on distance from the center of the visual field. The fovea (center of the visual field) is more sensitive to slower-moving objects, while the periphery is more sensitive to faster-moving objects.

19 Discussion
Encoding models that accounted explicitly for color in the video did not improve accuracy. Why?
The encoding model that included direction in its motion information was only a slight improvement over the one that included motion but not direction; both were huge improvements over the static model. Why?
What are the implications of this research?
Do you think this implies that we can read minds with machines (as news articles covering this research claimed)?

