Natural Language Processing Syntax
Syntactic structure
Parse trees (bracketed form):
John likes Mary: [S [NP [PN John]] [VP [Vt likes] [NP [PN Mary]]]]
Every man likes Mary: [S [NP [Det Every] [Noun man]] [VP [Vt likes] [NP [PN Mary]]]]
Word categories in the example sentences:
PN Vt Det Noun: John hates some films
Det Noun Vt Det Noun: Each woman has many friends
Det Noun Vt PN: Every man likes Mary
Det Noun Vt Det Noun: The farmer owns a car
Parsing
Parsing is the process of recovering the syntactic structure of a sentence. Two main strategies:
– Top-down parsing
– Bottom-up parsing
Bottom-up parsing
Following this strategy, the analysis starts at the level of the words and proceeds upwards to the higher levels of structure.
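The bottom-up strategy can be illustrated with a minimal shift-reduce sketch. The lexicon and the four rules below are assumptions read off the example sentences in these slides, and the greedy "reduce whenever possible" policy is a simplification that happens to work for this toy grammar; it is not a general parsing algorithm.

```python
# Toy lexicon and grammar (assumed from the example sentences in the slides).
LEXICON = {
    "john": "PN", "mary": "PN",
    "every": "Det", "each": "Det", "some": "Det",
    "many": "Det", "the": "Det", "a": "Det",
    "man": "Noun", "woman": "Noun", "films": "Noun",
    "friends": "Noun", "farmer": "Noun", "car": "Noun",
    "likes": "Vt", "hates": "Vt", "has": "Vt", "owns": "Vt",
}
RULES = [
    ("NP", ("PN",)),
    ("NP", ("Det", "Noun")),
    ("VP", ("Vt", "NP")),
    ("S", ("NP", "VP")),
]


def reduce_stack(stack):
    """Replace the top of the stack by a rule's left-hand side while any rule matches."""
    reduced = True
    while reduced:
        reduced = False
        for lhs, rhs in RULES:
            n = len(rhs)
            if len(stack) >= n and tuple(node[0] for node in stack[-n:]) == rhs:
                children = stack[-n:]
                del stack[-n:]
                stack.append((lhs, *children))
                reduced = True
                break


def bottom_up_parse(sentence):
    """Start from the words and build upwards; return a tree or None."""
    stack = []
    for word in sentence.lower().split():
        category = LEXICON.get(word)
        if category is None:
            return None  # unknown word
        stack.append((category, word))  # shift
        reduce_stack(stack)             # reduce
    if len(stack) == 1 and stack[0][0] == "S":
        return stack[0]
    return None


def bracket(tree):
    """Render a tree as a bracketed string."""
    if isinstance(tree[1], str):
        return f"[{tree[0]} {tree[1]}]"
    return "[" + tree[0] + " " + " ".join(bracket(c) for c in tree[1:]) + "]"


print(bracket(bottom_up_parse("Every man likes Mary")))
```

Words are shifted onto a stack one at a time, and the top of the stack is reduced to a higher-level category as soon as it matches the right-hand side of a rule, so the tree grows from the words up to S.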
The transformational model
One of the major achievements in theoretical linguistics has been the development of the Transformational Model (TM) by Chomsky [Cho65]. The TM is an attempt to represent, through mathematical abstraction, the linguistic competence of an individual. Linguistic competence is the implicit knowledge that every adult has about their own language. Chomsky [Cho57] defines a language as a finite or infinite set of well-formed strings of symbols taken from a finite vocabulary, called the lexicon. Well-formedness is defined by the grammar of the language.
A grammar is a finite specification of the sentences of a language: it may consist of an explicit list of every sentence in the language (if the language is finite) or of a set of generative rules capable of producing all and only the grammatical sentences of the language they define. Chomsky showed that a natural language, like English, cannot be properly represented by a finite-state grammar. He also argued that a context-free grammar does not have the power to define a natural language. In [Cho65], Chomsky proposed the transformational model as the first representation of linguistic competence.
[Figure: components of the transformational model. Labels: Syntactic Component (Base Component, Transformational Component), Surface Structure, Semantic Component, Phonological Component, Meanings, Sounds.]
At the core of the syntactic component there are two structures: the deep structure and the surface structure. The deep structure contains all the information pertinent to the semantic interpretation of the sentence. The surface structure captures all the information relevant to the phonological interpretation of the sentence. The base component comprises a context-free grammar and a lexicon. This component generates the deep structure of the sentence.
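As a rough illustration of a base component, the sketch below pairs a small context-free grammar with a lexicon and enumerates every tree they generate. The rules and words are assumptions taken from the example sentences in these slides, and treating the generated tree as the "deep structure" is a deliberate simplification.

```python
from itertools import product

# Hypothetical base component: context-free rules plus a lexicon.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["PN"], ["Det", "Noun"]],
    "VP": [["Vt", "NP"]],
}
LEXICON = {
    "PN": ["John", "Mary"],
    "Det": ["every"],
    "Noun": ["man"],
    "Vt": ["likes"],
}


def expand(symbol):
    """Yield every tree derivable from symbol, as nested tuples."""
    if symbol in LEXICON:
        for word in LEXICON[symbol]:
            yield (symbol, word)
        return
    for rhs in GRAMMAR[symbol]:
        # Combine every choice of subtree for each right-hand-side symbol.
        for children in product(*(list(expand(s)) for s in rhs)):
            yield (symbol, *children)


def leaves(tree):
    """Return the words of a tree, left to right."""
    if isinstance(tree[1], str):
        return [tree[1]]
    return [word for child in tree[1:] for word in leaves(child)]


sentences = [" ".join(leaves(tree)) for tree in expand("S")]
for sentence in sentences:
    print(sentence)
```

Because this toy grammar has no recursive rules, the enumeration is finite; a grammar with recursion would need a depth bound or random sampling instead.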
The transformational component consists of a set of rewrite rules, or transformations, that are applied to the deep structure. They rearrange its constituents and add, delete or replace elements until the sentence reaches its final form, the surface structure. This process rests on the assumption that transformations do not modify meaning.
Since the publication of Chomsky's first paper [Cho57], many linguists have worked on developing transformation rules that give a correct account of English. This work has been a source of disagreement and controversy among linguists. A set of accepted transformations is listed below. In each rule, SD is the structural description (the pattern the rule matches) and SC is the structural change; the numbers in SC refer to the positions of the elements in SD.
NUMBER-AGREEMENT
SD: NP V REST
SC: 1,[num] 2,[num] 3
THERE-INSERTION
SD: NP V REST
SC: [There+2] 1 3
PASSIVE-FORMATION
SD: NP V NP REST
SC: 3 [BE+2] [BY+1] 4
DO-INSERTION
SD: Q NP V REST
SC: 0 [DO+2] 3 4
DATIVE-MOVEMENT
SD: NP V NP1 to NP2 REST
SC:
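To make the SD/SC notation concrete, here is a small sketch that applies the structural change of PASSIVE-FORMATION and THERE-INSERTION to a sequence of constituents already matched against the structural description. The list encoding of a structural change and the function name `apply_transformation` are hypothetical, and the output stays symbolic (for example `[BE+likes]`); no morphology is realised.

```python
# A structural change is encoded (hypothetically) as a list whose items are either
# an SD position (int, 1-based) or a (marker, position) pair such as ("BE", 2).
PASSIVE_FORMATION = [3, ("BE", 2), ("BY", 1), 4]   # SC: 3 [BE+2] [BY+1] 4
THERE_INSERTION = [("There", 2), 1, 3]             # SC: [There+2] 1 3


def apply_transformation(constituents, change):
    """Rearrange constituents (one per SD element) according to a structural change."""
    result = []
    for item in change:
        if isinstance(item, int):
            part = constituents[item - 1]
        else:
            marker, position = item
            part = f"[{marker}+{constituents[position - 1]}]"
        if part:  # skip an empty REST
            result.append(part)
    return result


# SD: NP V NP REST, matched as John / likes / Mary / (empty REST)
print(apply_transformation(["John", "likes", "Mary", ""], PASSIVE_FORMATION))

# SD: NP V REST, matched as a man / is / in the garden
print(apply_transformation(["a man", "is", "in the garden"], THERE_INSERTION))
```

Matching a sentence against the structural description is assumed to have happened already; a fuller treatment would operate on trees rather than flat lists of strings.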