References Kempen, Gerard & Harbusch, Karin (2002). Performance Grammar: A declarative definition. In: Nijholt, Anton, Theune, Mariët & Hondorp, Hendri.


References

Kempen, Gerard & Harbusch, Karin (2002). Performance Grammar: A declarative definition. In: Nijholt, Anton, Theune, Mariët & Hondorp, Hendri. (Eds.), Computational Linguistics in the Netherlands 2001. Amsterdam: Rodopi.

Harbusch, Karin & Kempen, Gerard (2002). A quantitative model of word order and movement in English, Dutch and German complement constructions. Proceedings of the 19th International Conference on Computational Linguistics (COLING-2002), Taipei (Taiwan), pp. 328-334.

Project Objective

Problem

For the user of a Data Base, the textual part of a query result is best presented as a well-organized set of natural language sentences. The usual approaches to this problem combine sentences stored in the Data Base, but cannot produce new sentences or recombine parts of those already available. The goal of the project is to develop a generating system that is fed semantically rich information. In this way a closer and more content-sensitive relation can be established between the query and the information in the Data Base. Such a generator would at the same time receive more of the syntactically relevant information, such as the Logical Form, the temporal structure of the event, the Argument Structure and others. This information helps not only the process of generation, but possibly also the discourse organization of the text and the interchange of information with different ontologies and semantic webs. The model of the semantic representation and the procedures developed should come as close as possible to universal compatibility with other generators. One way to test this property relies on the two different generators available in the project, the Performance Grammar Generator and Delilah: the semantic model should work successfully with both.
An important question to be tackled is whether, and to what extent, the complex entailment calculations can be recovered by procedural means in a system operating over the Logical Form and the semantic representation.

Objective

Build a generating system that operates with both the conceptual level and the Logical Form and that is able to generate natural language sentences from the selected semantic material. The important tasks within this objective are to preserve the entailment relations, i.e. to make sure that the output preserves the information from the Data Base without adding any new meaning to it, and to stay compatible with the other projects in I2RP working on related problems. The Data Base used for testing and applying the system is the Rijksmuseum’s Data Base on Rembrandt, the same one that is used by the Cuypers project of CWI.

Principles

Base the model on the semantics and Logical Form of the information from the Data Base instead of on statistical and other shallow methods.
Keep all the modules universally applicable to any natural language, not only adjusted to Dutch, English or another language of application.
Use the semantic base of the generator to enable it to establish relations between different lexical contents denoting the same concepts.
Follow the findings of experimental and theoretical sciences such as psycholinguistics, semantics and syntax in the architecture of the system, instead of simple result-oriented procedures, which usually lead to oversimplifications and therefore to less robust applications.

Current Work

Parsing-Generating System with the conceptual interface

Two software applications are being developed and upgraded: the Performance Grammar Generator, a generator based on the Performance Grammar model of Gerard Kempen, and Delilah, the work of Cremers and Hijzelendoorn, which does both parsing and generating.
The current work is based on the latter and presents an interface system that allows Delilah to generate on the basis of a parsed natural language input, e.g. its conceptual material. Delilah’s Parsing-Generating system takes a sentence of Dutch as its input and outputs new grammatical sentences of Dutch containing all the concepts from the input. At the current point of its development, the system cannot control the process of generation so that only the sentences entailed by the input are generated. This means that among the sentences generated by the system, a large number will have a meaning contradictory or orthogonal to that of the input: based on the same concepts, but in very different relations and realizations. There are three plausible ways to solve this problem. The first one is Generate and Test, which means introducing another module that tests the generated sentences for entailment and ends the recursive generation once a satisfying sentence is produced. The other two are the Controlled procedure and the hybrid solution.

Future Plans

A richer semantic representation

Research in the areas of Logical Form, thematic roles, Aspectual and Argument Structure, as well as Event Structure and nominalization, is expected to lead to a richer semantic representation that includes event structure and in this way improves both the parser and the generator. It will also contribute to enriching the lexical information and allow entailment to be calculated at a more fine-grained level.

Improving entailment calculations

The general question of to what extent information can be preserved in the processes of parsing and generation will be explored more thoroughly, in order to find the optimal way to make the generation process controlled and strictly dependent on the entailment relations with the Data Base.
Synchronizing the work with the other projects in I2RP

There is already a significant overlap in areas of interest and in problems to be solved with other projects inside I2RP. We are trying to establish common ground in the approach to discourse structure, conceptual networks and ontologies with the other groups, especially with the Cuypers project at CWI, with which our project overlaps the most. We are also going to organize the cooperation so that we work on the same Data Bases and try to establish compatibility between each other’s applications.

Covering the Rembrandt Data Base

The lexical and conceptual world of the Rembrandt Data Base will be further incorporated in the developed systems. One of the next steps is to impose a certain hierarchical or ontological organization of the set of concepts on the generating system, especially on its semantic component.

Semantically Based Parsing-Generating System

Boban Arsenijević, dr. Crit Cremers, prof. Gerard Kempen, with the help of Hilke Reckman (a member of ToKeN’s Narator project)
b.arsenijevic@let.leidenuniv.nl, C_L_J_M_Cremers@let.leidenuniv.nl, kempen@fsw.LeidenUniv.nl, H_G_B_Reckman@let.leidenuniv.nl

The hybrid solution would still use the conceptual level, but the selection of the material for the base of generation would take place at the level of the semantic form, which should allow it to keep the advantages of both other solutions.

Delilah’s Parser-Generator is a parser and a generator at the same time. Its parsing segment introduces the Data Base, which can be any text in Dutch. It parses the input into a semantic representation, and then further extracts the concepts related to the meaning of the sentence.
No matter how many concepts are extracted, it is able to make a new sentence, possibly with new relations between the concepts and new words realizing them, but one that contains all the concepts extracted from the input. The current version prefers an unstructured Data Base to a structured one.

The Controlled procedure is supposed to work only at the level of the semantic representation, without the concepts, and to generate the sentence only once the proper material is selected. It is much faster than the Generate and Test model, but also has certain disadvantages.

De Spreekbuis
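
The Generate and Test approach described on this poster can be illustrated with a minimal Python sketch. It is not Delilah’s implementation. The triple-based meaning representation, all function names, and the entailment test (plain membership in a set of source facts) are assumptions made purely for illustration; a real entailment module would operate over the Logical Form.

```python
from itertools import permutations
from typing import Iterator, Optional

# Hypothetical toy stand-in for a semantic representation:
# a fact is a (predicate, agent, patient) triple.
Fact = tuple[str, str, str]


def generate_candidates(predicate: str, entities: list[str]) -> Iterator[Fact]:
    """Combine the extracted concepts in every possible arrangement,
    the way an uncontrolled generator would."""
    for agent, patient in permutations(entities, 2):
        yield (predicate, agent, patient)


def is_entailed(candidate: Fact, source_facts: set[Fact]) -> bool:
    """Toy entailment test (an assumption): the candidate counts as
    entailed only if it is literally one of the source facts."""
    return candidate in source_facts


def generate_and_test(
    predicate: str, entities: list[str], source_facts: set[Fact]
) -> Optional[Fact]:
    """Generate candidates one by one and stop at the first that passes
    the entailment test; return None if none does."""
    for candidate in generate_candidates(predicate, entities):
        if is_entailed(candidate, source_facts):
            return candidate
    return None


def realize(fact: Fact) -> str:
    """Turn a fact into a trivially simple English sentence."""
    predicate, agent, patient = fact
    return f"{agent} {predicate} {patient}."


if __name__ == "__main__":
    facts = {("painted", "Rembrandt", "The Night Watch")}
    result = generate_and_test("painted", ["The Night Watch", "Rembrandt"], facts)
    if result is not None:
        print(realize(result))
```

In this toy run the first candidate reverses agent and patient, so it uses the same concepts in a relation the source does not entail and is rejected by the test; the second candidate passes. The cost of the approach is visible even here: every rejected candidate is generated in full before it is discarded, which is why a procedure that selects the material before generating can be faster.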

