Machine Learning for Information Extraction Li Xu.


1 Machine Learning for Information Extraction Li Xu

2 Objective
Learn how to apply machine learning concepts to an application.
Learn how to improve the performance of an existing application by applying machine learning algorithms.

3 Introduction
Information Extraction (IE) is concerned with extracting the relevant data from a collection of documents.
Key component: extraction patterns.
Machine learning algorithms.

4 IE for Free Text
Syntactic and semantic constraints
Systems: AutoSlog, LIEP, PALKA, CRYSTAL, CRYSTAL + Webfoot, HASTEN

5 IE from Online Documents
WHISK (Soderland 1998)
–Domain: rental ads
–Precision: ~95%; Recall: 73%-90%
RAPIER (Califf & Mooney 1997)
–Domain: software jobs
–Precision: 84%; Recall: 53%
SRV (Freitag 1998)
–Domain: seminar announcements
–Precision: speaker, 75%; location, 75%; start time, 99%; end time, 96%
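The precision and recall figures on slide 5 can be read with the standard definitions: precision is the fraction of extracted items that are correct, and recall is the fraction of correct items that were extracted. The sketch below is only an illustration of those definitions; the function name `precision_recall` and the sample seminar data are invented here and do not come from the slides or from any of the cited systems.

```python
def precision_recall(extracted, gold):
    """Return (precision, recall) for extracted items against a gold standard."""
    extracted, gold = set(extracted), set(gold)
    correct = len(extracted & gold)  # items that were extracted and are right
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical (slot, value) pairs for one seminar announcement.
gold = {
    ("speaker", "A. Smith"),
    ("location", "Room 101"),
    ("start time", "3:00 pm"),
    ("end time", "4:00 pm"),
}
extracted = {
    ("speaker", "A. Smith"),
    ("start time", "3:00 pm"),
    ("end time", "5:00 pm"),  # wrong value, so it counts against precision
}

p, r = precision_recall(extracted, gold)
print(f"precision={p:.2f} recall={r:.2f}")
```

A system can trade one measure for the other: extracting fewer, safer items tends to raise precision and lower recall, which is one way to read the gap between RAPIER's 84% and 53%.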

6 WHISK

7 RAPIER

8 SRV

9 Problems
Bottom-up search
–RAPIER
–WHISK
Single-slot extraction rules
–SRV
–RAPIER
Heavy dependence on layout patterns

10 Obituary Ontology

11 Improvement

12 Lexical Object
Relational Learning
–FOIL
–Feature design
Regular expression
Rote Learning
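Slide 12 lists regular expressions and rote learning among the ways to recognize lexical objects. As a rough illustration of how the two differ, the sketch below pairs a rote learner, which only memorizes values seen in training, with a hand-written regular expression for dates. The class `RoteLearner`, the pattern `DATE_PATTERN`, and the sample obituary sentence are all assumptions made for this example, not the author's implementation.

```python
import re


class RoteLearner:
    """Memorizes training values verbatim and finds them again in new text."""

    def __init__(self):
        self.known = set()

    def train(self, examples):
        for text in examples:
            self.known.add(text.lower())

    def extract(self, document):
        lowered = document.lower()
        # Report every memorized value that occurs in the document.
        return sorted(value for value in self.known if value in lowered)


# Hypothetical pattern for dates written like "March 3, 1998".
DATE_PATTERN = re.compile(
    r"\b(?:January|February|March|April|May|June|July|August|"
    r"September|October|November|December)\s+\d{1,2},\s+\d{4}\b"
)

# Invented sample text and training values.
document = (
    "John Doe died on March 3, 1998, in Provo. "
    "Services will be held at Berg Mortuary."
)

learner = RoteLearner()
learner.train(["Berg Mortuary", "Wasatch Lawn"])

dates = DATE_PATTERN.findall(document)
places = learner.extract(document)
print(dates, places)
```

The rote learner cannot recognize a value it has never seen, while the regular expression generalizes to any date in the expected format; that contrast is presumably why the slide lists both alongside relational learning.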

13 Multi-slot Hierarchy

14 Multi-slot Boundary
Relational Learning
Feature Design
–Individual heuristics
–Combining heuristics
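The relational learning step on slides 12 and 14 names FOIL, which grows a rule one literal at a time and scores each candidate literal by its information gain. The sketch below shows that gain measure in a simplified form: it assumes the candidate literal introduces no new variables, so the number of positive bindings still covered equals the positive count after adding the literal. The function name `foil_gain` and the example counts are assumptions for illustration; the slides do not say how the measure is used for boundary features.

```python
import math


def foil_gain(p0, n0, p1, n1):
    """FOIL information gain for adding one literal to a rule.

    p0, n0: positive/negative examples covered before adding the literal.
    p1, n1: positive/negative examples covered after adding the literal.
    Simplifying assumption: no new variables, so t (positives still covered) = p1.
    """
    if p0 == 0 or p1 == 0:
        return 0.0  # nothing positive is covered, so the literal gains nothing
    before = math.log2(p0 / (p0 + n0))
    after = math.log2(p1 / (p1 + n1))
    return p1 * (after - before)


# Hypothetical counts: the literal keeps 8 of 10 positives and drops 8 of 10 negatives.
gain = foil_gain(p0=10, n0=10, p1=8, n1=2)
print(round(gain, 3))
```

A literal that leaves the ratio of positives to negatives unchanged scores zero, so the learner prefers literals that exclude negatives while keeping most positives covered.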

15 Conclusion
How can machine learning algorithms be applied to IE?
What are the problems of each system?
How can an existing IE approach be improved through machine learning?
How can the problems that appeared in other machine-learning-based IE systems be avoided?

