School of Library and Information Science


Data Mining. David Eichmann, School of Library and Information Science, The University of Iowa

Why? Given enough data represented through enough dimensions, we lose the ability to see the patterns.

How? Decision Trees; Nearest Neighbor Clustering; Neural Networks; Rule Induction; K-Means Clustering
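As an illustration of one technique from this list, here is a minimal sketch of k-means clustering (Lloyd's algorithm). The function name `k_means`, the sample points, and the choice to seed the centroids with the first k points are assumptions made for the example; they do not come from the slides.

```python
from math import dist  # Euclidean distance, Python 3.8+


def k_means(points, k, iterations=10):
    """Alternate between assigning points to the nearest centroid
    and moving each centroid to the mean of its assigned points."""
    centroids = list(points[:k])  # assumption: seed with the first k points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # leave an empty cluster's centroid where it is
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, clusters


# Hypothetical two-dimensional data with two visible groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.0), (1.0, 0.5), (9.0, 8.5)]
centroids, clusters = k_means(points, k=2)
print(centroids)
print(clusters)
```

With real data the result depends on the starting centroids, so the algorithm is usually run several times from different seeds.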

What is it? The automated extraction of hidden predictive information from databases. Key points: Automated, Hidden, Predictive

The Typical Process

Evaluation Criteria: Receiver Operating Characteristic (ROC) Curves
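A ROC curve plots the true positive rate against the false positive rate as the decision threshold is lowered. The sketch below shows one way to compute the curve and the area under it. The labels and scores are made-up illustration data, and the code assumes that scores are distinct and that both classes are present.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs,
    one per threshold, starting from (0, 0)."""
    positives = sum(labels)
    negatives = len(labels) - positives
    # Visit examples from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), reverse=True)
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, label in ranked:
        if label:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical classifier output: 1 = positive, 0 = negative.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, scores)
area = auc(curve)
print(curve)
print(area)
```

An area of 1.0 means every positive is ranked above every negative; 0.5 is what random ranking would give.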

But Nobody Said We Had To Do MATH….

Forms of Data: Structured, Semi-Structured, Unstructured. Examples: Databases; Forms; Tables on the Web; Bibliographic citations; Graphs & charts. Unstructured: Full text (e.g., journal articles, physician chart notes); Images

Text Mining: The corpus is now a collection of text artifacts: full text when you’ve got it (e.g., newswire), metadata when you don’t (e.g., MEDLINE). The trick then becomes extracting ‘interesting’ relationships between ‘interesting’ entities: who killed whom, who works for whom, who makes what.

The Classic Entities: Persons, Organizations, Places (Geography), Events

A Newswire Example: APW19981001.0262 [Israel(0.271), Jonathan Pollard (0.153), Benjamin Netanyahu(0.102), Bill Clinton(0.102), United States(0.055), ...]. Persons: Bill Clinton (3), Jonathan Pollard (8), Moshe Fogel (2), Benjamin Netanyahu (2), Israeli Embassy (1). Organizations: Cabinet (1). Places: Israel (16), United States (5), Washington (2)

In the Medical/Health Realm: UMLS is an excellent framework. Organism, Chemical, Activity, Disease

A MEDLINE Example. Document: 89316090 - Reconstructive surgery in Nicaragua. Provided MeSH Keywords: Human; Nicaragua (Z01.107.169.690); Surgery, Plastic/* (G02.403.810.788). Phrases: [Reconstructive, surgery] [Nicaragua] [letter]. MeSH Terms: Surgery (1) (G02.403.810.762); Letter [Publication Type] (1). Other Phrases: Reconstructive surgery (1)
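MeSH tree numbers such as G02.403.810.788 encode a position in a hierarchy, with each dot-separated segment adding one level. The sketch below shows how that structure can relate an extracted term to a provided keyword. The helper names `ancestors` and `share_parent` are hypothetical, and the code only manipulates strings; it does not query MeSH.

```python
def ancestors(tree_number):
    """List the tree numbers above this one, from the top level down."""
    parts = tree_number.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts))]


def share_parent(a, b):
    """True when two tree numbers sit directly under the same node."""
    return a.rsplit(".", 1)[0] == b.rsplit(".", 1)[0]


provided = "G02.403.810.788"   # tree number listed with "Surgery, Plastic" above
extracted = "G02.403.810.762"  # tree number listed with "Surgery" above

print(ancestors(provided))
print(share_parent(provided, extracted))
```

Here the two tree numbers differ only in the last segment, so the extracted term lands next to the keyword the indexers provided.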

Concept Extraction Example: “Roman forces under Julius Caesar invade Britain.” Parse: (S (NP (NP Roman forces) (PP under (NP Julius Caesar))) (VP invade (NP Britain)) .) Entity Attributes: <organization Roman forces> <person Julius Caesar> <placename Britain> Concepts: <Roman forces - under - Julius Caesar> <Roman forces - invade - Britain>
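The step from the bracketed parse to the concept triples can be sketched as follows. This is a minimal illustration with hand-written rules that fit this one sentence shape (a subject NP with an optional PP, followed by a VP with an object NP); it is not the system behind the slides, and all function names are hypothetical.

```python
import re


def parse(text):
    """Parse a bracketed tree into nested lists: [label, child, child, ...]."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read():
        nonlocal pos
        pos += 1                  # skip "("
        node = [tokens[pos]]      # constituent label, e.g. "NP"
        pos += 1
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                node.append(read())
            else:
                node.append(tokens[pos])
                pos += 1
        pos += 1                  # skip ")"
        return node

    return read()


def words(node):
    """Flatten a subtree back into its text."""
    if isinstance(node, str):
        return node
    return " ".join(words(child) for child in node[1:])


def children(node, label):
    """Direct sub-constituents of a node that carry the given label."""
    return [c for c in node[1:] if isinstance(c, list) and c[0] == label]


def head_word(node):
    """First bare word directly under a node (the preposition or verb here)."""
    return next(c for c in node[1:] if isinstance(c, str))


def concepts(tree):
    """Emit (subject, relation, object) triples for the sentence shape above."""
    subject_np = children(tree, "NP")[0]
    inner = children(subject_np, "NP")
    subject = words(inner[0]) if inner else words(subject_np)
    triples = []
    for pp in children(subject_np, "PP"):
        triples.append((subject, head_word(pp), words(children(pp, "NP")[0])))
    for vp in children(tree, "VP"):
        triples.append((subject, head_word(vp), words(children(vp, "NP")[0])))
    return triples


tree = parse("(S (NP (NP Roman forces) (PP under (NP Julius Caesar))) "
             "(VP invade (NP Britain)) .)")
triples = concepts(tree)
for subject, relation, obj in triples:
    print(f"<{subject} - {relation} - {obj}>")
```

Real text needs many more rules (passives, conjunctions, intransitive verbs), which is why the parse tree rather than the raw word sequence is the usual starting point.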

And a Small Demo…