April 23, 2001 LBSC 878 Text Data Mining Douglas W. Oard

Outline
Knowledge Discovery in Databases
Knowledge Discovery in Text
Scoping the Problem

What can we find in databases?
Data
Information
Knowledge
Wisdom

Knowledge Discovery in Databases
Select, Preprocess, Warehouse, Transform, Data Mining, Presentation
Data
– Convert schema
– Model noise
– Remove outliers
– Handle missing data
– Convert feature space
– Reduce dimensionality
– Synthesize features
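Two of the steps named on this slide, removing outliers and handling missing data, can be illustrated on a single numeric column. The sketch below is only one possible approach and is not taken from the lecture: it assumes outliers are flagged with a median-absolute-deviation score (the 3.5 cutoff and the 0.6745 scaling constant are conventional choices, assumed here) and that missing values are filled with the median of the remaining values. The function names `mask_outliers` and `fill_missing` are hypothetical.

```python
from statistics import median

def mask_outliers(values, threshold=3.5):
    """Replace outliers with None, using a modified z-score based on the
    median absolute deviation (MAD). The threshold is an assumed default."""
    known = [v for v in values if v is not None]
    med = median(known)
    mad = median([abs(v - med) for v in known])
    if mad == 0:
        return list(values)  # no spread to measure against; leave data as-is
    return [
        None if v is not None and 0.6745 * abs(v - med) / mad > threshold else v
        for v in values
    ]

def fill_missing(values):
    """Fill None entries with the median of the known values."""
    fill = median([v for v in values if v is not None])
    return [fill if v is None else v for v in values]

raw = [10.0, 11.0, None, 9.0, 250.0, 10.0]   # made-up example column
cleaned = fill_missing(mask_outliers(raw))
print(cleaned)
```

Masking outliers before imputing keeps the extreme value (250.0 here) from influencing the fill value.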

The Data Mining Process
Choose a model
– Classification, clustering, dependency modeling, sequence analysis,...
Specify what it means for the model to fit data
– Minimum squared error, closest cubic spline,...
Find the model parameters that best fit the data
– Exhaustive search, heuristic search,...
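The three steps above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the lecture's own example: the model is assumed to be a straight line y = a*x + b, the fit criterion is minimum squared error, and the parameters are found by exhaustive search over a small, made-up grid of candidate values. The names `squared_error` and `exhaustive_search` are hypothetical.

```python
from itertools import product

def squared_error(params, points):
    """Fit criterion: sum of squared residuals for the line y = a*x + b."""
    a, b = params
    return sum((y - (a * x + b)) ** 2 for x, y in points)

def exhaustive_search(points, slopes, intercepts):
    """Try every (slope, intercept) pair and keep the one with least error."""
    return min(product(slopes, intercepts),
               key=lambda params: squared_error(params, points))

points = [(0, 1), (1, 3), (2, 5), (3, 7)]   # made-up data lying on y = 2x + 1
grid = [i / 2 for i in range(-10, 11)]      # candidates -5.0 .. 5.0, step 0.5
best = exhaustive_search(points, grid, grid)
print(best, squared_error(best, points))
```

Exhaustive search is only practical for tiny parameter spaces like this one; a heuristic search would replace the `min` over the full grid with a guided walk through the candidates.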

Knowledge Discovery in Text
Information Retrieval, Metadata Assignment, Warehouse, Transform, Data Mining, Presentation
Documents
– Named entities
– Anaphora resolution
– Temporal expressions
– Slot filling

Text Metadata Mining
Bibliometrics
– Impact assessment, co-citation analysis,...
Web analysis
– Clickstreams, link analysis, …
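Co-citation analysis, mentioned above, counts how often two documents appear together in the reference list of a third. A minimal sketch follows, assuming the citation data is available as a mapping from each citing paper to the papers it cites; the paper identifiers and the function name `cocitation_counts` are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cocitation_counts(citations):
    """Count, for each unordered pair of cited papers, how many citing
    papers reference both of them."""
    counts = Counter()
    for cited in citations.values():
        # sorted(set(...)) removes duplicates and gives each pair one canonical order
        for pair in combinations(sorted(set(cited)), 2):
            counts[pair] += 1
    return counts

# Made-up citation data: citing paper -> papers it cites
citations = {
    "P1": ["A", "B", "C"],
    "P2": ["A", "B"],
    "P3": ["B", "C"],
}
counts = cocitation_counts(citations)
print(counts.most_common())
```

Pairs with high counts are candidates for being topically related, which is the signal co-citation analysis relies on.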

Text-derived Data Mining
Theme relationship analysis
– Proximity-based phrase clustering
Literature-based discovery
– Based on associating phrases and index terms
New event detection
– Cluster then identify outliers
Multidocument summarization
– Perspective analysis, temporal evolution, …
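One way to read "cluster then identify outliers" for new event detection is a single-pass clustering of incoming documents, where a document that fits no existing cluster is treated as the first report of a new event. The sketch below is an assumption-laden illustration rather than the method from the lecture: it uses raw word counts, cosine similarity, and an arbitrary similarity threshold of 0.3; the function names and the sample headlines are made up.

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words term counts (lowercased, whitespace-tokenized)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def detect_new_events(docs, threshold=0.3):
    """Single-pass clustering: a document that is not similar enough to any
    existing cluster starts a new one and is flagged as a new event."""
    clusters = []            # each cluster is the summed term counts of its members
    new_event_indices = []
    for i, text in enumerate(docs):
        vec = vectorize(text)
        best = max(clusters, key=lambda c: cosine(vec, c), default=None)
        if best is not None and cosine(vec, best) >= threshold:
            best.update(vec)                 # fold the document into the cluster
        else:
            clusters.append(Counter(vec))    # outlier: start a new cluster
            new_event_indices.append(i)
    return new_event_indices

docs = [
    "earthquake strikes coastal city",
    "rescue teams reach earthquake city",
    "central bank raises interest rates",
    "bank rates rise again",
]
print(detect_new_events(docs))
```

The threshold controls the trade-off: a higher value flags more documents as new events, a lower value merges more of them into existing clusters.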