Published by Derek Hoover. Modified over 9 years ago.
1
Natural Language Processing Group
Computer Sc. & Engg. Department
JADAVPUR UNIVERSITY
KOLKATA – 700 032, INDIA.
Professor Sivaji Bandyopadhyay
sivaji_ju@vsnl.com
2
Research Areas
Natural Access to Internet & Other Resources
– Headline Generation
– Headline Translation
– Document Translation
– Multilingual Multidocument Summarization
Cross-lingual Information Management
– Multilingual and Cross-lingual IR
– Open Domain Question Answering
3
Natural Access to Internet & Other Resources
Headline Generation
– Treated as a machine translation problem: the input document is identified by a set of features, and the output headline represents some of them
– Example Base: sets of features in the input document and the corresponding headline template(s)
– Implemented for generating headlines from cricket news in English
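The feature-to-template idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the feature names (`winner`, `loser`, `margin`, `player`, `score`), the templates, and the rule of preferring the template that covers the most features are hypothetical and are not taken from the group's system.

```python
# Hypothetical example base: (required feature names, headline template).
EXAMPLE_BASE = [
    ({"winner", "loser", "margin"}, "{winner} beat {loser} by {margin}"),
    ({"winner", "loser"}, "{winner} beat {loser}"),
    ({"player", "score"}, "{player} hits {score}"),
]


def generate_headline(features):
    """Fill the template whose required features are all present.

    Assumption: when several templates apply, the one using the most
    features wins. Features not used by the template are simply ignored,
    so the headline represents only some of the input features.
    """
    candidates = [
        (len(required), template)
        for required, template in EXAMPLE_BASE
        if required.issubset(features)
    ]
    if not candidates:
        return None
    _, template = max(candidates, key=lambda c: c[0])
    return template.format(**features)


if __name__ == "__main__":
    print(generate_headline(
        {"winner": "Team A", "loser": "Team B", "margin": "5 wickets"}
    ))
```

Feature extraction from the news text itself is left out of the sketch; the dictionary passed in stands for its output.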
4
Natural Access to Internet & Other Resources
Headline Translation
– A hybrid MT system for translating English news headlines to Bengali
– Syntactic and semantic classification of news headlines has been done
– Anaphora and coreference classes have been identified in news headlines
– Translation strategy: the input headline is first searched in the Translation Memory; failing that, it is tagged and searched in the Tagged Example Base; failing that, it is analyzed and matched against the Phrasal Example Base; failing that, heuristics are applied
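The translation strategy is a fallback cascade, which can be sketched as follows. This is only an illustration: the table contents, the toy tagger, and the `<bn: ...>` placeholder strings (standing in for Bengali output) are hypothetical, and the real system's matching is certainly richer than dictionary lookup.

```python
# All tables hold hypothetical placeholder data; "<bn: ...>" stands in
# for real Bengali output.
TRANSLATION_MEMORY = {"india win series": "<bn: india win series>"}
TOY_TAGS = {"india": "NOUN", "australia": "NOUN", "series": "NOUN",
            "win": "VERB", "beat": "VERB"}
# Tag sequence -> target template; slots reorder subject-verb-object
# into subject-object-verb.
TAGGED_EXAMPLE_BASE = {("NOUN", "VERB", "NOUN"): "<bn: {0} {2} {1}>"}
PHRASAL_EXAMPLE_BASE = {"prime minister": "<bn: prime minister>"}


def from_memory(headline):
    return TRANSLATION_MEMORY.get(headline)


def from_tagged(headline):
    words = headline.split()
    tags = tuple(TOY_TAGS.get(w, "UNK") for w in words)
    template = TAGGED_EXAMPLE_BASE.get(tags)
    return template.format(*words) if template else None


def from_phrases(headline):
    for phrase, target in PHRASAL_EXAMPLE_BASE.items():
        if phrase in headline:
            return headline.replace(phrase, target)
    return None


def from_heuristics(headline):
    # Last resort: always produces something, marked as uncertain.
    return "<bn?: " + headline + ">"


STAGES = [
    ("translation_memory", from_memory),
    ("tagged_example_base", from_tagged),
    ("phrasal_example_base", from_phrases),
    ("heuristics", from_heuristics),
]


def translate_headline(headline):
    """Return (stage name, output) from the first stage that succeeds."""
    headline = headline.lower().strip()
    for name, stage in STAGES:
        result = stage(headline)
        if result is not None:
            return name, result


if __name__ == "__main__":
    for h in ["India win series", "Australia beat India",
              "Prime minister visits Kolkata", "Rain stops play"]:
        print(translate_headline(h))
```

Returning the stage name alongside the output makes it easy to see how often each resource is actually used, which is one way such a cascade could be evaluated.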
5
Natural Access to Internet & Other Resources
Document Translation
– Prototype developed for a hybrid MT system from English to Bengali
– Translation strategy:
  – identify the constituent phrases of a sentence using a shallow parser
  – translate the phrases individually using an Example Base
  – arrange the translated phrases using heuristics to form the target language output
– Verb phrases are translated using Morphological Paradigm Suffix Tables
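A minimal sketch of the phrase-by-phrase strategy is given below. It assumes the shallow parser has already produced `(chunk type, text)` pairs, uses placeholder strings in place of Bengali, and applies a single hypothetical ordering heuristic (move verb phrases to the end, reflecting verb-final order). The table names and contents are invented for illustration.

```python
# Hypothetical phrase-level example base ("<bn: ...>" is a placeholder).
EXAMPLE_BASE = {
    "the boy": "<bn: the boy>",
    "a mango": "<bn: a mango>",
}
# Hypothetical verb lexicon: surface form -> (target root, tense, person).
VERB_ROOTS = {"ate": ("<bn-root: eat>", "past", 3)}
# Hypothetical paradigm suffix table keyed by (tense, person).
SUFFIX_TABLE = {("past", 3): "<sfx: past-3>"}


def translate_chunk(kind, text):
    """Translate one chunk; verb phrases go through the suffix table."""
    if kind == "VP":
        root, tense, person = VERB_ROOTS[text]
        return root + "+" + SUFFIX_TABLE[(tense, person)]
    return EXAMPLE_BASE[text]


def translate_sentence(chunks):
    """chunks: list of (chunk type, text) pairs from a shallow parser."""
    translated = [(kind, translate_chunk(kind, text)) for kind, text in chunks]
    # Ordering heuristic (an assumption): keep non-verb chunks in source
    # order and place verb phrases last.
    non_verbs = [t for kind, t in translated if kind != "VP"]
    verbs = [t for kind, t in translated if kind == "VP"]
    return " ".join(non_verbs + verbs)


if __name__ == "__main__":
    print(translate_sentence(
        [("NP", "the boy"), ("VP", "ate"), ("NP", "a mango")]
    ))
```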
6
Natural Access to Internet & Other Resources
Multilingual Multidocument Summarization
– Multidocument summarization in each language:
  – summarize one of the documents using extraction methods
  – revise the summary using the other documents
– The summary in the target language is the reference summary
– Translate all summaries to the target language
– Revise the reference summary
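The monolingual "extract, then revise" step might look roughly like the sketch below. The slides do not say how sentences are scored or how revision is done, so both choices here are assumptions: sentences are ranked by summed word frequency, and revision appends the top sentence of each other document unless it overlaps too much with what the summary already contains. The translation step is not covered.

```python
import re
from collections import Counter


def sentences(text):
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def words(sentence):
    return set(re.findall(r"[a-z]+", sentence.lower()))


def extract(doc, k=1):
    """Pick the k sentences with the highest summed word frequency."""
    sents = sentences(doc)
    freq = Counter(w for s in sents for w in words(s))
    ranked = sorted(sents, key=lambda s: sum(freq[w] for w in words(s)),
                    reverse=True)
    chosen = set(ranked[:k])
    return [s for s in sents if s in chosen]  # keep document order


def revise(summary, other_docs, max_overlap=0.5):
    """Append each other document's top sentence if it adds new content."""
    revised = list(summary)
    for doc in other_docs:
        for s in extract(doc, k=1):
            w = words(s)
            is_new = all(
                len(w & words(t)) / max(len(w), 1) <= max_overlap
                for t in revised
            )
            if is_new:
                revised.append(s)
    return revised


if __name__ == "__main__":
    base = extract("The team won the match. The team celebrated. Rain fell.")
    print(revise(base, ["The team won the match easily. Fans cheered.",
                        "The captain scored a century. He smiled."]))
```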
7
Cross-lingual Information Management
Multilingual and Cross-lingual IR
– A Cross Language Database (CLDB) system in Bengali and Hindi has been developed
– The natural language query is analyzed using a Template Grammar and Knowledge Bases to produce the corresponding SQL statement
– A cooperative response is given in the query language
– Anaphora / coreference in CLDB has been studied
– Database updates and elliptical queries are also supported
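The query-to-SQL step can be sketched with a small template table. This is a simplified stand-in: the CLDB system works on Bengali and Hindi queries with a Template Grammar and Knowledge Bases, whereas the sketch uses English patterns, a hypothetical `employee(name, dept)` table, and plain regular expressions. The "no matching records" message is a simplified reading of a cooperative response.

```python
import re
import sqlite3

# Hypothetical templates: (query pattern, SQL statement, slot names).
TEMPLATES = [
    (re.compile(r"^who works in (?P<dept>\w+)\??$"),
     "SELECT name FROM employee WHERE dept = ?", ("dept",)),
    (re.compile(r"^how many people work in (?P<dept>\w+)\??$"),
     "SELECT COUNT(*) FROM employee WHERE dept = ?", ("dept",)),
]


def to_sql(query):
    """Map a query to (SQL, parameters), or None if no template matches."""
    q = query.lower().strip()
    for pattern, sql, slots in TEMPLATES:
        m = pattern.match(q)
        if m:
            return sql, tuple(m.group(s) for s in slots)
    return None


def answer(conn, query):
    parsed = to_sql(query)
    if parsed is None:
        return "Sorry, I could not understand the question."
    sql, params = parsed
    rows = conn.execute(sql, params).fetchall()
    if not rows:
        # Say what was looked for instead of returning an empty result.
        return "No matching records were found for " + ", ".join(params) + "."
    return ", ".join(str(r[0]) for r in rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (name TEXT, dept TEXT)")
    conn.execute("INSERT INTO employee VALUES ('Asha', 'sales')")
    print(answer(conn, "Who works in sales?"))
    print(answer(conn, "Who works in hr?"))
```

Slot values are passed as SQL parameters rather than spliced into the statement, which keeps user text out of the SQL string.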
8
Cross-lingual Information Management
Open Domain Question Answering
– Work is in progress for English
– Currently building a set of question templates (Qtargets) and the corresponding answer patterns with relative weights
– Processing steps:
  – the input question is analyzed to produce the corresponding question template
  – the appropriate answer pattern is retrieved
  – the answer is generated using the input document and the synthesis rules of the language
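A minimal sketch of the Qtarget and weighted answer-pattern idea follows. The single `DATE` Qtarget, the two answer patterns, and their weights are hypothetical, and the final synthesis step (generating a full answer sentence) is omitted; the sketch returns only the extracted answer string and the weight of the pattern that found it.

```python
import re

# Hypothetical question templates: (Qtarget name, question pattern).
QTARGETS = [
    ("DATE", re.compile(r"^when was (?P<topic>.+?) (?:born|founded)\??$",
                        re.I)),
]

# Hypothetical answer patterns per Qtarget, each with a relative weight.
# "{topic}" is filled with the topic from the question; "{{4}}" becomes
# the regex quantifier "{4}" after formatting.
ANSWER_PATTERNS = {
    "DATE": [
        (1.0, r"{topic} was (?:born|founded) in (?P<answer>\d{{4}})"),
        (0.5, r"{topic} \((?P<answer>\d{{4}})\)"),
    ],
}


def answer_question(question, document):
    """Return (Qtarget, answer, weight) for the best pattern, or None."""
    for qtarget, q_re in QTARGETS:
        m = q_re.match(question.strip())
        if m:
            break
    else:
        return None
    topic = re.escape(m.group("topic"))
    candidates = []
    for weight, template in ANSWER_PATTERNS[qtarget]:
        found = re.search(template.format(topic=topic), document, re.I)
        if found:
            candidates.append((weight, found.group("answer")))
    if not candidates:
        return None
    weight, answer = max(candidates)  # highest-weight pattern wins
    return qtarget, answer, weight


if __name__ == "__main__":
    doc = ("Acme Institute (1899) is a fictional school. "
           "Acme Institute was founded in 1901.")
    print(answer_question("When was Acme Institute founded?", doc))
```

The weights let a precise pattern override a looser one when both match, as in the example document, where the explicit "was founded in" pattern outranks the parenthesised year.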