Knowledge Discovery for a Focused Domain Scanning of documents and messages of interest to a business and the extraction of relevant facts for knowledge.


1 Knowledge Discovery for a Focused Domain
Scanning of documents and messages of interest to a business and the extraction of relevant facts for knowledge discovery by computer.
– Driven by a Brief, a description of the domain of interest.
– Based on linguistic and statistical analysis of text.
– Supported by Lexica and Knowledge Bases.
– Knowledge discovery by inference and visualization.
Academic project partners:
– TAI Research Centre, Helsinki University, Department of Computational Linguistics, VTT Information Technology
Industrial project partners
Linguistic partners: Conexor Oy and Lingsoft Oy
Contact: Matti Keijola, +358 9 451 2163, matti.keijola@hut.fi

2 The BRIEFS Refinement Process
Diagram labels: Text; Morpho/Syntax; Expressions; Semantics; Elements; Event scenarios; Element relations; Knowledge base; Inference of trends and consequences; Interactive visualization

3 BRIEFS Architecture
Ontology side: Brief/Context; Linguistic/Statistical Analysis; Terms/Relations; Ontology Creation; Ontology (augmented with linguistic/statistical knowledge); Ontologies’ Repository
Document side: Domain Documents; Linguistic/Statistical Analysis; Relevance Estimation; Relevant Documents; Information Extraction; Inference and Knowledge Discovery; Browsing and Visualization
Stores and flows: Ontological Information; Knowledge Warehouse; Domain Information
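The slide names a Relevance Estimation step between Domain Documents and Relevant Documents but does not say how relevance is computed. The sketch below is one simple possibility, assuming the Brief has been reduced to a set of terms and that relevance is the fraction of those terms found in a document. The function names, the threshold and the sample texts are hypothetical and are not taken from BRIEFS.

```python
import re

def tokenize(text):
    """Lower-case the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def relevance(brief_terms, document):
    """Fraction of brief terms that occur in the document (0.0 to 1.0).

    Assumption: plain term overlap stands in for the linguistic and
    statistical analysis mentioned on the slide.
    """
    terms = {t.lower() for t in brief_terms}
    if not terms:
        return 0.0
    return len(terms & tokenize(document)) / len(terms)

def select_relevant(brief_terms, documents, threshold=0.5):
    """Keep documents whose overlap score reaches the (hypothetical) threshold."""
    return [d for d in documents if relevance(brief_terms, d) >= threshold]

# Illustrative brief and documents, invented for this example.
brief = ["wap", "phone", "operator"]
docs = [
    "The operator launched a WAP service for its phone customers.",
    "Quarterly weather summary for the region.",
]
relevant = select_relevant(brief, docs)
```

A real system would presumably weight terms and use the ontology rather than a flat term set; the overlap score is only meant to make the data flow concrete.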

4 Example: BRIEFS WAP Industry Relations
Entity types: Company; Technology; Product/Service; Person; Stock Exchange; Location; Symbol
Relations: Is_in; Member_of; Symbol_of; Marketed_by, owned_by, manufactured-by...; Uses...; Owned_by…; Employed_by...; Is_used_in; Deals with...
Attributes: Name; category, function, features; title, function; geography
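The entity types and relation labels on this slide can be held in a small typed store. The following is a minimal sketch, not the BRIEFS data model: the class names are hypothetical, and because the diagram's arrows are lost in the transcript, the pairing of each relation with particular entity types is an assumption.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Entity:
    kind: str   # an entity type from the slide, e.g. "Company", "Technology"
    name: str   # the "Name" attribute shown for each entity type

@dataclass(frozen=True)
class Relation:
    label: str      # a relation label from the slide, e.g. "Marketed_by"
    source: Entity
    target: Entity

class RelationStore:
    """Hypothetical in-memory store of extracted relations."""

    def __init__(self):
        self._relations = set()

    def add(self, label, source, target):
        self._relations.add(Relation(label, source, target))

    def targets(self, source, label):
        """Names of entities reached from `source` via `label`, sorted."""
        return sorted(
            r.target.name
            for r in self._relations
            if r.source == source and r.label == label
        )

# Illustrative instances; which types each relation links is assumed.
nokia = Entity("Company", "Nokia")
phone = Entity("Product/Service", "Nokia 7110")
wap = Entity("Technology", "WAP")

store = RelationStore()
store.add("Marketed_by", phone, nokia)
store.add("Uses", phone, wap)
```

Frozen dataclasses make entities hashable, so repeated extractions of the same fact collapse into one stored relation.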

5 Extraction Results Database
Diagram labels: DOCUMENTS; TEMPLATE; TR; TE

6 BRIEFS Frequency Chart

7 BRIEFS Relation List

8 BRIEFS: Extracted Deal Events

9 Clustering by Extracted Deal Events

10 Some Annotation Issues
Identifying names of concepts is important for IE.
– But what is a name?
“The 7110 phone...”
“The Nokia 7110 mobile phone...”
“Nokia’s 7110 phone unit”
“The phone…”
“It …”
Harmonisation is important for computer-based KA
Coreference and pronominal anaphora resolution
Some concepts are not addressed directly by name
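To make the harmonisation problem concrete, here is a deliberately naive sketch that maps the five mentions from the slide to one canonical name. It assumes a tiny hand-made lexicon and resolves "The phone" and "It" to the most recently named entity; both the lexicon and the recency heuristic are illustrative assumptions, not the coreference method used in BRIEFS.

```python
import re

# Hypothetical lexicon: a distinctive token mapped to a canonical name.
KNOWN_NAMES = {"7110": "Nokia 7110"}
# Hypothetical expressions that refer back to an earlier mention.
ANAPHORS = {"it", "the phone"}

def harmonise(mentions):
    """Return one canonical name (or None) per mention, in order.

    Named mentions are looked up in the lexicon; anaphoric mentions take
    the most recently resolved name (a naive recency heuristic).
    """
    resolved = []
    last = None
    for mention in mentions:
        tokens = re.findall(r"[a-z0-9]+", mention.lower())
        canonical = next(
            (KNOWN_NAMES[t] for t in tokens if t in KNOWN_NAMES), None
        )
        if canonical is None and " ".join(tokens) in ANAPHORS:
            canonical = last
        if canonical is not None:
            last = canonical
        resolved.append(canonical)
    return resolved

mentions = [
    "The 7110 phone",
    "The Nokia 7110 mobile phone",
    "Nokia's 7110 phone unit",
    "The phone",
    "It",
]
canonical_names = harmonise(mentions)
```

Recency alone breaks as soon as two candidate entities are in play, which is one reason the slide lists coreference and anaphora resolution as a separate issue.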

11 Coreference and Anaphora Resolution

12 Some Potential Uses of a BRIEFS-like System
– Follow-up of the presence of entities (e.g. companies, persons, products, technologies, …) in news reports, and discovery of trends in that presence
– Follow-up of deals companies make, discovery of evolving networks of deals
– Follow-up of events in industry, discovery of trends and traits
– Follow-up of development of new technologies
– Follow-up of changes in business practices
– Investor/advisor review
– Other domains: e.g. maintenance reports
In summary: extracting specific data from written documents and messages, harmonizing the data, accumulating it into knowledge warehouses, and making inferences based on the accumulated data
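Two of the uses listed on this slide, following the presence of entities over time and following networks of deals, reduce to simple accumulation once the facts have been extracted and harmonised. The sketch below assumes extraction output has already been reduced to (period, entity) pairs and (company, company) pairs; the function names, record shapes and placeholder data are hypothetical.

```python
from collections import Counter, defaultdict

def mention_counts(records):
    """Accumulate (period, entity) pairs into per-entity counts by period."""
    counts = defaultdict(Counter)
    for period, entity in records:
        counts[entity][period] += 1
    return counts

def deal_network(deals):
    """Build an undirected adjacency map from (company_a, company_b) pairs."""
    graph = defaultdict(set)
    for a, b in deals:
        graph[a].add(b)
        graph[b].add(a)
    return graph

# Placeholder records standing in for extraction results.
records = [
    ("2000-01", "CompanyA"),
    ("2000-01", "CompanyA"),
    ("2000-02", "CompanyA"),
    ("2000-02", "CompanyB"),
]
deals = [("CompanyA", "CompanyB"), ("CompanyB", "CompanyC")]

counts = mention_counts(records)
network = deal_network(deals)
```

Comparing an entity's counts across periods gives a crude trend signal, and re-running `deal_network` as new deals arrive shows how the network evolves.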

