School of Library and Information Science

Presentation on theme: "School of Library and Information Science"— Presentation transcript:

1 School of Library and Information Science
Data Mining. David Eichmann, School of Library and Information Science, The University of Iowa

2 Why? Given enough data represented through enough dimensions, we lose the ability to see the patterns.

3 How?
Decision Trees; Nearest Neighbor Clustering; Neural Networks; Rule Induction; K-Means Clustering
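Of the techniques listed on this slide, k-means clustering is compact enough to sketch in a few lines. What follows is the textbook assign-then-update loop, not code from the presentation; the function names, the caller-supplied starting centroids, and the sample points are all assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Group each point under the index of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: squared_distance(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, centroids, max_iterations=100):
    """Alternate assignment and centroid update until nothing moves.

    `centroids` holds the starting guesses (hypothetical input; the
    slide does not say how they would be chosen).
    """
    centroids = [tuple(float(v) for v in c) for c in centroids]
    for _ in range(max_iterations):
        clusters = assign(points, centroids)
        # New centroid = per-axis mean of members; empty clusters stay put.
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else c
            for c, members in zip(centroids, clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


# Illustrative data: two visually obvious groups in two dimensions.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, [(0, 0), (10, 10)])
```

With well-separated groups like these, the loop settles after a single update; real data usually needs several restarts from different starting centroids.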

4 What is it? The automated extraction of hidden predictive information from databases. Key points: automated, hidden, predictive.

5 The Typical Process

6 Evaluation Criteria: Receiver Operating Characteristic (ROC) Curves
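A ROC curve plots the true positive rate against the false positive rate as the decision threshold sweeps from strict to lenient. The sketch below shows one way to compute the curve's points and the area under it; the function names and the sample scores and labels are invented for illustration and do not come from the presentation.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) points.

    `labels` are 1 for positive and 0 for negative. Assumes at least
    one example of each class is present.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every example tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a list of (x, y) points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical classifier output: higher score means "more likely positive".
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]
curve = roc_points(scores, labels)
auc = area_under_curve(curve)
```

An area of 1.0 means every positive outranks every negative; 0.5 is what random ordering would give.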

7 But Nobody Said We Had To Do MATH….

8 Forms of Data
Structured: Databases
Semi-Structured: Forms; Tables on the Web; Bibliographic citations; Graphs & charts
Unstructured: Full text (e.g., journal articles, physician chart notes); Images

9 Text Mining
The corpus is now a collection of text artifacts: full text when you’ve got it (e.g., newswire), metadata when you don’t (e.g., MEDLINE). The trick then becomes extracting ‘interesting’ relationships between ‘interesting’ entities: who killed whom, who works for whom, who makes what.

10 The Classic Entities: Persons, Organizations, Places (Geography), Events

11 A Newswire Example
APW [Israel (0.271), Jonathan Pollard (0.153), Benjamin Netanyahu (0.102), Bill Clinton (0.102), United States (0.055), ...]
Persons: Bill Clinton (3), Jonathan Pollard (8), Moshe Fogel (2), Benjamin Netanyahu (2)
Organizations: Israeli Embassy (1), Cabinet (1)
Places: Israel (16), United States (5), Washington (2)

12 In the Medical/Health Realm
UMLS is an excellent framework: Organism, Chemical, Activity, Disease

13 A MEDLINE Example
Document: Reconstructive surgery in Nicaragua
Provided MeSH Keywords: Human; Nicaragua Z; Surgery, Plastic/* G
Phrases: [Reconstructive, surgery] [Nicaragua] [letter]
MeSH Terms: Surgery (1) G; Letter [Publication Type] (1)
Other Phrases: Reconstructive surgery (1)

14 Concept Extraction Example
“Roman forces under Julius Caesar invade Britain.”
(S (NP (NP Roman forces) (PP under (NP Julius Caesar))) (VP invade (NP Britain)) .)
Entity Attributes: <organization Roman forces> <person Julius Caesar> <placename Britain>
Concepts: <Roman forces - under - Julius Caesar> <Roman forces - invade - Britain>
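One way to get from the bracketed parse on this slide to its two concept triples is sketched below. The slide does not describe how the presenter's system does this step, so the rule used here (pair a noun phrase with a sibling PP or VP that contains another noun phrase) and all function names are assumptions made for illustration.

```python
def parse_tree(text):
    """Turn a bracketed parse string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        # tokens[pos] is "(" and tokens[pos + 1] is the node label.
        label, children = tokens[pos + 1], []
        pos += 2
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                child, pos = read(pos)
                children.append(child)
            else:
                children.append(tokens[pos])
                pos += 1
        return (label, children), pos + 1

    return read(0)[0]


def words(node):
    """Join the words sitting directly under a node."""
    return " ".join(c for c in node[1] if isinstance(c, str))


def head_phrase(np):
    """Follow nested NPs down to the innermost leading noun phrase."""
    for child in np[1]:
        if not isinstance(child, str) and child[0] == "NP":
            return head_phrase(child)
    return words(np)


def concepts(node):
    """Collect (subject, relation, object) triples, deepest nodes first."""
    subtrees = [c for c in node[1] if not isinstance(c, str)]
    found = []
    for child in subtrees:
        found.extend(concepts(child))
    subject = next((c for c in subtrees if c[0] == "NP"), None)
    for sibling in subtrees:
        if subject is not None and sibling[0] in ("PP", "VP"):
            obj = next((g for g in sibling[1]
                        if not isinstance(g, str) and g[0] == "NP"), None)
            if obj is not None:
                found.append((head_phrase(subject), words(sibling),
                              head_phrase(obj)))
    return found


parse = ("(S (NP (NP Roman forces) (PP under (NP Julius Caesar))) "
         "(VP invade (NP Britain)) .)")
triples = concepts(parse_tree(parse))
```

For the slide's sentence this yields the same two triples shown under Concepts. The rule is deliberately naive and would need refinement for passives, conjunctions, and longer prepositional chains.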

15 And a Small Demo…

