pre-CODIE System: Crosslingual On-Demand Information Extraction
IE from Japanese source text without knowing Japanese
Kiyoshi Sudo, Satoshi Sekine, Ralph Grishman
New York University
May 28, 2003, HLT-NAACL
Information Extraction
Mapping from Text to Table by pattern matching
Text: Fletcher Maddox, former Dean of the UCSD Business School, announced the formation of La Jolla Genomatics together with his two sons. La Jolla Genomatics will release its product Geninfo in June. Geninfo is a turnkey system to assist biotechnology researchers in keeping up with the voluminous literature in all aspects of their field.
Table:
–Company: La Jolla Genomatics
–Product: Geninfo
–Date: June 1999
Pattern of dependency structure: release, with subject organization, object product, and "in" date
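The mapping from text to table on this slide can be illustrated with a minimal sketch. It assumes the sentence has already been parsed and tagged; the `Node` class, the `PATTERN` dictionary layout, and the `SLOTS` mapping are hypothetical structures invented for this illustration, not the pre-CODIE system's own data format.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A node in a hand-built dependency tree (hypothetical structure)."""
    word: str
    ne_type: str = ""  # named-entity class, e.g. "organization"
    children: dict = field(default_factory=dict)  # relation name -> Node

# Pattern from the slide: release(subject=organization, object=product, in=date)
PATTERN = {
    "head": "release",
    "args": {"subject": "organization", "object": "product", "in": "date"},
}

# Which table column each entity type fills (illustrative mapping)
SLOTS = {"organization": "Company", "product": "Product", "date": "Date"}

def match(node, pattern):
    """Return a table row if the node matches the pattern, else None."""
    if node.word != pattern["head"]:
        return None
    row = {}
    for relation, ne_type in pattern["args"].items():
        child = node.children.get(relation)
        if child is None or child.ne_type != ne_type:
            return None
        row[SLOTS[ne_type]] = child.word
    return row

# Hand-built tree for "La Jolla Genomatics will release its product Geninfo in June"
sentence = Node("release", children={
    "subject": Node("La Jolla Genomatics", "organization"),
    "object": Node("Geninfo", "product"),
    "in": Node("June", "date"),
})

row = match(sentence, PATTERN)
print(row)
```

A real system would take the tree from a dependency parser and a named-entity tagger; here both are written by hand so the matching step stays visible.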
CODIE is…
Cross-lingual Information Extraction
–IE from Japanese source text without knowing Japanese
On-Demand Information Extraction
–IE based on the user's scenario request
How to use CODIE…
Step 1: Query
–The pre-CODIE system takes the user's scenario request, either as keywords or as narrative sentences.
Step 2: Configuration
–The user specifies which slots make up the table.
Step 3: Slot Assignment
–The user associates each pattern with the appropriate slot.
Step 4: Extraction
–The pre-CODIE system shows the table, with each slot filled by the pattern matches.
Incremental Development
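Steps 2 to 4 can be sketched as plain data plus one function. Everything here is an assumption made for illustration: the slot names, the representation of a pattern element as a (head, relation) pair, and the shape of the match list are hypothetical. Reading "Incremental Development" as adding an assignment and extracting again is also only one interpretation of the slide.

```python
# Step 2: Configuration. The user names the table's slots (hypothetical names).
slots = ["Company", "Product", "Date"]

# Step 3: Slot Assignment. Each pattern element is tied to a slot.
# Pattern elements are written as (head, relation) pairs purely for illustration.
assignment = {
    ("release", "subject"): "Company",
    ("release", "object"): "Product",
}

# Pattern matches as a matcher might return them: (head, relation, matched text).
matches = [
    ("release", "subject", "La Jolla Genomatics"),
    ("release", "object", "Geninfo"),
    ("release", "in", "June"),
]

def extract(slots, assignment, matches):
    """Step 4: Extraction. Fill one table row; unassigned slots stay empty."""
    row = {slot: "" for slot in slots}
    for head, relation, text in matches:
        slot = assignment.get((head, relation))
        if slot is not None:
            row[slot] = text
    return row

first = extract(slots, assignment, matches)

# Incremental development (assumed reading): add one assignment, extract again.
assignment[("release", "in")] = "Date"
second = extract(slots, assignment, matches)

print(first)
print(second)
```

In the first pass the Date slot stays empty because no pattern element is assigned to it; after the extra assignment the same matches fill all three slots.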
CODIE Architecture
[Architecture diagram. Components: English Query; English Template; IBM MT system between English and Japanese; IR (Japanese Query); Pattern Acquisition; Slot Assignment; Information Extraction (Pattern Matching); Japanese Template]
Step 1: Query
Step 2: Configuration
Step 3: Slot Assignment
Step 3.1: Example
Step 4: Extraction