Using Evoked Magnetoencephalographic Responses for the Cognitive Neuroscience of Language Alec Marantz MIT KIT/MIT MEG Joint Research Lab Department of Linguistics and Philosophy

From Cog Sci to Cog Neurosci: Cognitive Science, including Linguistics, has used behavioral data to develop computational theories of language representation and use. These theories play out along the dimensions of time (sequential processing stages), space (separation of processing functions), and complexity (difficulty of processing).

Cognitive Neuroscience of Language: Cognitive Science moves to Cognitive Neuroscience when the temporal, spatial, and complexity dimensions of cognitive theories are mapped onto the time course, localization, and intensity of brain activity. However, because fMRI and PET lack temporal information, the development of Neurolinguistics with these techniques has tended to flatten theories of the Cognitive Neuroscience of Language.

Cognitive Science: Taft & Forster 1977 (traditional articulated Cog Sci). Affix stripping, followed by recombination of stem and affix.

Sample prediction from model: -semble is a stem, since assemble, resemble, and dissemble are words; -sassin (as in assassin) is not a stem, since only assassin is a word. It should take longer to reject “semble” as a non-word than “sassin,” since “semble” is a lexical item (“semble” requires looping from box 4 through box 5 in the model before reaching box 7, while “sassin” passes directly from box 4 to box 7, “No”).
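
The prediction can be illustrated with a toy sketch. It does not reproduce the boxes of Taft & Forster's flowchart; the tiny lexicon, the prefix list, and the stage names are all hypothetical, chosen only to show why an item whose stem has a lexical entry passes through one more step before being rejected.

```python
# Toy affix-stripping lexical decision (hypothetical lexicon and stage names,
# not the actual Taft & Forster flowchart).
PREFIXES = ("as", "re", "dis")   # assumption: tiny illustrative prefix list
STEMS = {"semble"}               # bound stem shared by assemble/resemble/dissemble
WORDS = {"assemble", "resemble", "dissemble", "assassin"}


def lexical_decision(item):
    """Return (is_word, stages), where stages lists the steps visited."""
    stages = ["strip_affix"]
    stem = item
    for prefix in PREFIXES:
        # Strip a prefix only if what remains is a listed stem.
        if item.startswith(prefix) and item[len(prefix):] in STEMS:
            stem = item[len(prefix):]
            break
    stages.append("stem_lookup")
    if stem in STEMS:
        # The stem has a lexical entry, so the stem + affix combination
        # must be checked before a decision can be made (the extra loop).
        stages.append("check_combination")
    stages.append("whole_word_lookup")
    return item in WORDS, stages


for probe in ("semble", "sassin"):
    is_word, stages = lexical_decision(probe)
    print(probe, is_word, len(stages))
```

Both probes are rejected, but “semble” visits one more stage than “sassin”; in the model that extra step is what should lengthen its rejection time.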

Taft 2004: further behavioral support for articulated model of processing stages. A more contemporary instantiation of the model makes predictions about RTs based, e.g., on a theory of the experimental task.

Flattened computational model: Gonnerman & Plaut (2000)

Masked priming experiment compares responses to:
- Semantic: sofa-COUCH
- Morphological: hunter-HUNT
- Orthographic: passive-PASS
- Unrelated: award-MUNCH
Claim: failure to find special location for the morphological condition (using fMRI) supports flat model in which morphology is an emergent property of semantic and phonological/orthographic relatedness.

fMRI experiment consistent with flattened computational model. Temporal/sequential processing not at issue. But the masked priming experimental design is confounded with respect to predictions from a Taft-style model with affix-stripping, since the “orthographic” items consist of possible stems and strippable affixes (e.g., tenable/ten, passive/pass).

Articulated vs. Flattened Model: Taft’s articulated affix-stripping model predicts that “tenable” and “bendable” should be processed in the same “places” (in the model/brain) and in the same temporal sequence (affix stripping followed by stem activation followed by recombination), with differences in “complexity” (measured, e.g., by level of brain activity or latency of brain events). Thus the cognitive science model predicts the fMRI results and makes further predictions testable with techniques that allow exploration of the latency of brain responses.

MEG allows cognitive neuroscience to fully embrace cognitive science: MEG records the magnetic fields generated by electrical activity in the brain, millisecond by millisecond. MEG has the spatial resolution, the temporal resolution, and the sensitivity necessary to test predictions from cognitive science along the space, time, and complexity dimensions.

Plot:
- Examples of MEG experiments exploiting the temporal, spatial, and intensity resolution of the technique
- A return to Taft’s stages
- The future: even closer ties between experimental designs in cognitive science and cognitive neuroscience

KIT/MIT MEG Lab

Magnetoencephalography (MEG) = study of the brain’s magnetic fields http://www.ctf.com/Pages/page33.html

Magnetoencephalography (MEG). [Figure: distribution of magnetic field at 93 ms (auditory M100), outgoing vs. ingoing field; averaged epoch of activity in all sensors, overlapping waveforms, one line per sensor.] Liina Pylkkänen, Aug 03, Tateshina

MEG exemplified

Parametric variation in letter string length and in added visual noise; categorical symbol vs. letter manipulation.

M100 response varies in intensity with visual noise; M170 response varies in intensity with string length. Note separation in space and temporal sequence (M100 vs. M170), consistent with sequential processing model. [Figure panels: M100 response, M170 response]

Intensity of M170 response to letters as compared to symbols confirms function of processing at M170 time & location (“visual word form” or “letter string” area). Reaction time to read words is predicted by a combination of M170 amplitude and latency.

Latency coding? Response latency correlates with stimulus properties.

Auditory M100 (from auditory cortex)

Frequency of tone predicts latency of M100 peak

Temporal Coding?: Shape of response over time at M100 latency and source location correlates with phonetic category of stimulus

Voiced (b,d) vs. voiceless (p,t) consonant auditory evoked response

Different ways of measuring the shape of the M100 response to voiced vs. voiceless consonants yield good computational “experts” that can classify data from a single response as either a pa/ta or a ba/da with significantly greater than chance accuracy

Sequential processing of words

What happens in the brain when we read words? [Figure: evoked response over time; time axis -100 to 700 msec, amplitude axis 0 to 200 fT; labeled windows 150-200 ms (M170), 200-300 ms (M250), 300-400 ms (M350), 400-500 ms. Source: Pylkkänen and Marantz, Trends in Cognitive Sciences.] Letter string processing (Tarkiainen et al. 1999); lexical activation (Pylkkänen et al. 2002).

Note left lateralization of responses in standard perisylvian language areas

Latency of M350 is sensitive to lexical factors such as lexical frequency and repetition. Repetition: Pylkkänen, Stringfellow, Flagg, Marantz, Biomag2000 Proceedings, 2000. Frequency: Embick, Hackl, Schaeffer, Kelepir, Marantz, Cognitive Brain Research, 2001.

M350 is (in time and place) the locus of lexical activation; lexical decision modulated by competition among activated items occurs later and elsewhere

Vitevitch and Luce (1998), stages of word processing: Phonotactic probability (sub-lexical frequency of bits of words) affects lexical activation, with frequency being facilitatory. Phonological neighborhood density affects lexical decision (“after” activation), with density being inhibitory. Phonotactic probability and neighborhood density are usually highly correlated, so the same items that facilitate activation inhibit decision. So, words with high phonotactic probabilities from dense neighborhoods should show quicker M350 latencies but slower RTs in lexical decision.

Words and non-words with high probability sound sequences, from dense neighborhoods, show quicker M350s and slower RTs.

Pylkkänen et al. (2002): the M350 is not sensitive to competition from phonological neighbors; RT is. [Figure labels: NEIGHBORHOOD COMPETITION EFFECT; SUBLEXICAL PHON FREQUENCY EFFECT; **]

Irregular Past Tense Priming: Stockall & Marantz (to appear in Mental Lexicon). In cross-modal priming (hear one word, make a lexical decision on a letter string presented immediately after), irregulars don’t generally prime their stems behaviorally: gave-GIVE, taught-TEACH. Allen & Badecker show that orthographic overlap in this experimental design leads to RT inhibition and that past-tense/stem pairs with higher orthographic overlap yield less priming than those with less overlap.

Prediction of linguistic theories (e.g., Distributed Morphology): Irregular past tense/“stem” priming paradigms (gave/give, taught/teach) should yield identity priming at the stage of root/stem activation (the M350) and form competition effects among allomorphs subsequently, slowing reaction time relative to pure stem/stem identity priming.

MEG irregular past-tense priming experiment. Design: visual-visual immediate priming, lexical decision on the target (see Pastizzo and Feldman 2002). [Figure: trial timeline showing +, prime, target; duration of trial (ms): 450, 50, 200, 0 …2500ms]

MEG Results: M350 priming for Past Tense/Stem equivalent to identity priming (n=8).
- Significant priming for Identity condition (*p=0.01)
- TAUGHT-TEACH vs. SMACK-TEACH (*p=0.04)
- GAVE-GIVE vs. PLUM-GIVE (*p=0.05)
- No reliable effect for STIFF-STAFF vs. GRAB-STAFF (p=0.13)
[Figure axis: Amount of Priming]

RT Results: Competition effects; no significant priming for TAUGHT-TEACH.
- Significant priming for Identity condition (**p=0.0009) and GAVE-GIVE (*p=0.03)
- Significant inhibition for STIFF-STAFF (*p=0.01)
- No reliable effect for TAUGHT-TEACH (p=0.21), but a trend towards inhibition

MEG & RT Results: MEG taps stem activation; RT reflects decision in the face of competition.

Follow-up: Add regulars and ritzy/glitzy condition.
- Regulars: walk-walked
- Orthographic & Semantic Overlap: boil-broil
- Reverse order, stem before past tense

ritzy-glitzy items: drop~drip, clash~clang, flip~flop, blossom~bloom, pet~pat, ghost~ghoul, gloom~glum, shrivel~shrink, squish~squash, crumple~rumple, boil~broil, screech~scream, strain~sprain, converge~merge, mangle~tangle, scald~scorch, slim~trim, crinkle~wrinkle, bump~lump, attain~gain, burst~bust, scrape~scratch

Order effect on RT; i.e., on form competition

Linguistic Computational Models of Morphology fully supported: The relation between irregular past tense form and stem is like that between regular past tense form and stem (or between identical stems), not like that between words phonologically/orthographically and semantically related (boil-broil). Root priming separates from form competition (between allomorphs of stem) in the time course of lexical access.

Taft (2004), “Morphological Decomposition and the Reverse Base Frequency Effect.” Claim: Base frequency effects (RT to complex word correlates with frequency of stem) reflect access of the stem of morphologically complex forms, whereas surface frequency effects (RT to complex word correlates with frequency of complex word) reflect the stage of checking the recombination of stem and affix for existence and/or well-formedness. “The suggestion being made, then, is that the advantage at the early stages of processing of having a relatively high base frequency could be potentially obscured by counterbalancing factors happening at later stages of processing.” [750-1]

Lexical Decision Task: non-word foils consisting of existing words with ungrammatical affixes (mirths, kettled, joying, redly, iratest) (just like the Devlin “orthographic” cases). Three classes of words:
- “mending” class: low surface frequency, low base frequency
- “seeming” class: low surface frequency, high base frequency
- “growing” class: mid surface frequency, high base frequency

Claim: the advantage of high base frequency for “seem” at the stem access stage (indexed by the M350) is offset in RT by a disadvantage from the low frequency of –ing with the “seem” stem, i.e., at the post-affix recombination stage, indexed by RT. (For Taft, manipulating the foils in lexical decision attenuated the surface frequency effect, arguing for two stages of processing in the indirect fashion typical of good cognitive science.)

Reilly and Holt 2004, with the KIT/MIT MEG Team: Replicate Taft’s experiment in the MEG Lab. Predict:
- base frequency affects root access and thus M350 latency
- surface frequency affects post-M350 recombination stage and thus RT

Results: M350 Latency tracks Base Frequency, RT tracks Surface Frequency.
- Surface Frequency effect at RT (significant at .05 level): Mending and Seeming slower than Growing
- Base Frequency effect at M350 Latency (significant at .05 level): Mending slower than Seeming and Growing

Conclusion: MEG serves as a tool to upgrade cognitive science (& linguistics) to cognitive neuroscience without losing the empirically motivated richness of cognitive computational theories. Cog Sci notions of space, time, and complexity map onto brain space, latency, and magnitude of neural activity.

What’s the next step? Traditional approaches to MEG analysis involve averaging together many responses (repeated from an experimental “bin”) prior to computing differences in responses by condition within each subject. This contrasts with standard cognitive science practice (e.g., with RT) of including a dependent measure from each trial in the ANOVA. To fully incorporate cognitive theories into cognitive neuroscience, including the correlation of continuous variables with continuous response measures and the use of item analyses in complex designs, we need to include single trial MEG data in our analyses.
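
The contrast between bin averaging and trial-level analysis can be sketched as follows. This is only an illustration under stated assumptions: the trial values are made up, the predictor is a hypothetical continuous item variable, and `bin_means` and `pearson_r` are illustrative helper names, not tools used in the talk.

```python
from statistics import mean


def bin_means(trials):
    """Average latency per condition: the traditional 'bin' summary."""
    bins = {}
    for condition, _, latency in trials:
        bins.setdefault(condition, []).append(latency)
    return {condition: mean(values) for condition, values in bins.items()}


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Made-up trials: (condition label, continuous item predictor, peak latency in ms)
trials = [
    ("high", 5.0, 330.0), ("high", 4.5, 338.0), ("high", 4.0, 345.0),
    ("low", 2.0, 362.0), ("low", 1.5, 371.0), ("low", 1.0, 380.0),
]

means = bin_means(trials)                      # one number per condition
r = pearson_r([t[1] for t in trials],          # uses every trial and the
              [t[2] for t in trials])          # continuous predictor
print(means, round(r, 3))
```

The bin summary reduces each condition to a single value, whereas the trial-level correlation keeps the continuous predictor and every observation, which is what item analyses and correlational designs require.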

Why not single trial MEG? For the type of experiment discussed in this talk, we would need to extract response amplitude and latency information from each trial, given a “response” defined in terms of source localization. So, we would look at each single response for dipole source activation (latency of peak response, amplitude of response) for a source identified from grand averaged data for a subject.

M100 Latency, Single Trials (Marantz, in preparation): Left hemisphere M100 source computed via single dipole model from grand averaged response to 60 tones, 30 at 200 Hz and 30 at 1 kHz. Weight matrix from dipole source used as spatial filter over raw data to derive dipole activation latency for each tone individually.
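
A minimal sketch of the spatial-filter step, assuming the dipole fit has already been reduced to one weight per sensor. The function names, the 80-130 ms search window, and the toy numbers are hypothetical; the talk does not specify how the peak was picked.

```python
def source_timecourse(weights, trial):
    """Project one trial onto the dipole.

    weights: one spatial-filter weight per sensor.
    trial:   list of time samples, each a list of sensor values.
    """
    return [sum(w * v for w, v in zip(weights, sample)) for sample in trial]


def peak_latency(weights, trial, times_ms, window=(80.0, 130.0)):
    """Latency (ms) of the largest absolute source activation in the window.

    The window bounds are an assumption made for this illustration.
    """
    course = source_timecourse(weights, trial)
    candidates = [
        (abs(activation), t)
        for activation, t in zip(course, times_ms)
        if window[0] <= t <= window[1]
    ]
    return max(candidates)[1]


# Toy trial: two sensors, five time samples (made-up numbers).
weights = [1.0, -1.0]
times_ms = [70.0, 90.0, 110.0, 130.0, 150.0]
trial = [[5.0, 0.0], [1.0, 0.0], [3.0, -1.0], [2.0, 0.0], [9.0, 0.0]]
print(peak_latency(weights, trial, times_ms))
```

Applying `peak_latency` to each trial in turn gives one latency per tone, which can then be entered into a trial-level analysis in the same way as a reaction time.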

Single trial M100 latencies [Figure: 200 Hz vs. 1 kHz]

Single trial analysis as in behavioral studies is possible using only normal MEG techniques and tools: no fancy pre-processing, no fancy localization or statistical tools. For responses less automatic than the M100, expect overlap in scatter plots to be greater (approaching that for RTs in, e.g., lexical decision experiments).

Taft & Forster re-visited: Is RT slow-down for -semble (bound stem) over -sassin (pseudo-stem) attributable to lexical access for “semble” but not for “sassin,” as Taft claims, or to response competition from words (resemble, dissemble, assemble vs. assassin)? Prediction: slow-down at lexical access should show up at M350, while slow-down for response competition should occur after (as shown by neighborhood density and past tense studies).

Brown & Marantz (in preparation): 3 subjects; 20 real stems, 20 pseudo stems (matched by Taft & Forster along various dimensions) per condition. Single trial analysis of MEG data: M350 dipole activation peak analysis, with M350 dipole fitted over left-hemisphere sensors on the grand average to all stimuli in the experiment.

Slow-down is observed at M350: for 3 subjects and 108 observations, the difference is significant over the single trial MEG data but not yet for RT.
                                    Real Stems (-semble)   Pseudo Stems (-sassin)   p
Reaction time                       784 ms                 719 ms                   0.16
M350 latency (over single trials)   356 ms                 339 ms                   0.005
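
The slides do not say which test produced these p-values. As one way to compare two conditions over single-trial latencies, here is a sketch of a two-sided permutation test on the difference in means. This is a swapped-in technique chosen for illustration, and the latencies below are made up, not the study's data.

```python
import random
from statistics import mean


def permutation_p(a, b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in means.

    Condition labels are shuffled n_perm times; the p-value is the share of
    shuffles whose absolute mean difference is at least the observed one
    (with the usual +1 correction so p is never exactly zero).
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Made-up single-trial peak latencies (ms) for two conditions.
real_stems = [360, 355, 350, 365, 358, 362]
pseudo_stems = [338, 340, 335, 342, 337, 341]
print(permutation_p(real_stems, pseudo_stems))
```

Because the test works on individual trials, it applies equally to reaction times and to single-trial M350 latencies, which is the sense in which MEG can serve as another dependent variable alongside RT.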

Taft theory of decomposition in which bound stems have lexical entries is fully supported by the MEG data. Single trial MEG data is at least as consistent as reaction time data. MEG can be used on par with RT to add additional dependent variables to experiments testing computational theories within cognitive neuroscience.

Thank you. marantz@mit.edu
