Data Mining. David Eichmann, School of Library and Information Science, The University of Iowa

Presentation transcript:

Data Mining
  David Eichmann
  School of Library and Information Science
  The University of Iowa

Why? Given enough data represented through enough dimensions, we lose the ability to see the patterns.

How?
  Decision Trees
  Nearest Neighbor Clustering
  Neural Networks
  Rule Induction
  K-Means Clustering
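As an illustration of one technique on this list, here is a minimal sketch of k-means clustering in plain Python. It is not taken from the slides: the sample points, the function names, and the choice to seed the centroids with the first k points are all assumptions made to keep the example small and deterministic.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
        for p in points
    ]


def kmeans(points, k, max_iterations=100):
    """Alternate assignment and centroid update until the centroids stop moving."""
    # Assumption: seed with the first k points; real tools usually seed randomly.
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(max_iterations):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                # New centroid is the per-dimension mean of the cluster members.
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                # Keep the old centroid if a cluster ends up empty.
                updated.append(centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


# Hypothetical two-dimensional data with two visually obvious groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = kmeans(points, 2)
print(labels)
print(centroids)
```

With more dimensions the code is unchanged, which is the point of the "Why?" slide: the grouping is found arithmetically even when it can no longer be seen by eye.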

What is it?
  The automated extraction of hidden predictive information from databases.
  Key points
    Automated
    Hidden
    Predictive

The Typical Process

Evaluation Criteria
  Receiver Operating Characteristic Curves
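A receiver operating characteristic curve plots the true positive rate against the false positive rate as the decision threshold on a classifier's score is lowered. The sketch below computes those points and the area under the curve by the trapezoid rule. The labels and scores are made-up illustration data, and the function names are assumptions, not something the slides specify.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per distinct score."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")

    # Visit examples from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Examples with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        points.append((false_pos / negatives, true_pos / positives))
    return points


def area_under_curve(points):
    """Trapezoid-rule area under a curve given as ordered (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical data: 1 marks a true positive case, 0 a true negative case.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(labels, scores)
auc = area_under_curve(points)
print(points)
print(auc)
```

An area of 1.0 means every positive example outranks every negative one; 0.5 is what random scoring would give on average.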

But Nobody Said We Had To Do MATH….

Forms of Data
  Structured
    Databases
    Forms
  Semi-Structured
    Tables on the Web
    Bibliographic citations
    Graphs & charts
  Unstructured
    Full text (e.g., journal articles, physician chart notes)
    Images

Text Mining
  Corpus now is a collection of text artifacts
    Full text when you've got it (e.g., newswire)
    Metadata when you don't (e.g., MEDLINE)
  The trick then becomes extracting interesting relationships between interesting entities
    Who killed who
    Who works for who
    Who makes what
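Once relationships have been extracted, one simple way to hold them is as (subject, relation, object) triples that can be queried in either direction. The sketch below assumes that representation; the class, the relation names, and the placeholder entities are all hypothetical, and the extraction step itself is not shown.

```python
from collections import defaultdict


class RelationIndex:
    """Stores (subject, relation, object) triples grouped by relation name."""

    def __init__(self):
        self._by_relation = defaultdict(list)

    def add(self, subject, relation, obj):
        self._by_relation[relation].append((subject, obj))

    def who(self, relation, obj):
        """Subjects standing in `relation` to `obj`, e.g. who works for X."""
        return [s for s, o in self._by_relation[relation] if o == obj]

    def what(self, subject, relation):
        """Objects that `subject` stands in `relation` to, e.g. what X makes."""
        return [o for s, o in self._by_relation[relation] if s == subject]


# Placeholder triples standing in for the output of an extraction step.
index = RelationIndex()
index.add("Person A", "works_for", "Organization X")
index.add("Person B", "works_for", "Organization X")
index.add("Organization X", "makes", "Product P")

print(index.who("works_for", "Organization X"))
print(index.what("Organization X", "makes"))
```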

The Classic Entities
  Persons
  Organizations
  Places (Geography)
  Events

A Newswire Example
  APW [Israel(0.271), Jonathan Pollard (0.153), Benjamin Netanyahu(0.102), Bill Clinton(0.102), United States(0.055),...]
  Persons
    Bill Clinton (3)
    Jonathan Pollard (8)
    Moshe Fogel (2)
    Benjamin Netanyahu (2)
    Israeli Embassy (1)
  Organizations
    Cabinet (1)
  Places
    Israel (16)
    United States (5)
    Washington (2)

In the Medical/Health Realm
  UMLS an excellent framework
    Organism
    Chemical
    Activity
    Disease

A MEDLINE Example
  Document: Reconstructive surgery in Nicaragua
  Provided MeSH Keywords
    Human
    Nicaragua Z
    Surgery, Plastic/* G
  Phrases
    [Reconstructive, surgery] [Nicaragua] [letter]
  MeSH Terms
    Surgery (1) G
    Letter [Publication Type] (1)
  Other Phrases
    Reconstructive surgery (1)

Concept Extraction Example
  Roman forces under Julius Caesar invade Britain.
  (S (NP (NP Roman forces) (PP under (NP Julius Caesar))) (VP invade (NP Britain)).)
  Entity Attributes:
  Concepts:
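The bracketed parse on this slide can be read into a small tree and searched for noun phrases, which are the natural candidates for entities and concepts. The sketch below is one way to do that in plain Python; the function names are assumptions, and it only reads an existing parse rather than producing one from raw text.

```python
import re


def tokenize(text):
    """Split a bracketed parse into parentheses and bare tokens."""
    return re.findall(r"\(|\)|[^\s()]+", text)


def parse(tokens):
    """Turn the token list into nested (label, children) tuples."""
    position = 0

    def node():
        nonlocal position
        if tokens[position] != "(":
            raise ValueError("expected '(' at token %d" % position)
        position += 1
        label = tokens[position]
        position += 1
        children = []
        while tokens[position] != ")":
            if tokens[position] == "(":
                children.append(node())
            else:
                children.append(tokens[position])
                position += 1
        position += 1  # consume the closing parenthesis
        return (label, children)

    return node()


def leaves(tree):
    """Words under a node, in sentence order."""
    words = []
    for child in tree[1]:
        words.extend(leaves(child) if isinstance(child, tuple) else [child])
    return words


def phrases(tree, wanted):
    """Text of every constituent labelled `wanted`, outermost first."""
    label, children = tree
    found = [" ".join(leaves(tree))] if label == wanted else []
    for child in children:
        if isinstance(child, tuple):
            found.extend(phrases(child, wanted))
    return found


parse_text = "(S (NP (NP Roman forces) (PP under (NP Julius Caesar))) (VP invade (NP Britain)).)"
tree = parse(tokenize(parse_text))
noun_phrases = phrases(tree, "NP")
verb_phrases = phrases(tree, "VP")
print(noun_phrases)
print(verb_phrases)
```

The innermost noun phrases (Roman forces, Julius Caesar, Britain) line up with the kinds of entities listed on the earlier slides, and the verb phrase supplies the relationship between them.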

And a Small Demo…