Course on Data Mining (581550-4): Seminar Meetings. Association Rules, Episodes, Text Mining. 02.11., 09.11.

Presentation transcript:

Course on Data Mining: Seminar Meetings
Schedule overview: Association Rules, Episodes, Text Mining, Clustering, KDD Process, Home Exam
Legend: M = seminar by Mika, P = seminar by Pirjo

Today
R. Feldman, M. Fresko, H. Hirsh, et al.: "Knowledge Management: A Text Mining Approach", Proc. of the 2nd Int'l Conf. on Practical Aspects of Knowledge Management (PAKM98), 1998.
B. Lent, R. Agrawal, R. Srikant: "Discovering Trends in Text Databases", Proc. of the 3rd Int'l Conference on Knowledge Discovery in Databases and Data Mining, 1997.

Good to Read as Background
Both papers refer to the Agrawal and Srikant paper we had last week:
Rakesh Agrawal and Ramakrishnan Srikant: Mining Sequential Patterns. Int'l Conference on Data Engineering.

Knowledge Management: A Text Mining Approach
R. Feldman, M. Fresko, H. Hirsh, et al.
Bar-Ilan University and Instinct Software, Israel; Rutgers University, USA; LIA-EPFL, Switzerland
Published in PAKM'98 (Int'l Conf. on Practical Aspects of Knowledge Management)
Data Mining course, Autumn 2001, University of Helsinki
Summary by Mika Klemettinen

KM: A Text Mining Approach
Basic idea (see selected phases on the next slides):
1. Get input data in SGML (or XML) format. Select only the contents of the desired elements (title, abstract, etc.).
2. Do linguistic preprocessing:
2.1 Term extraction (use linguistic software for this)
2.2 Term generation (combine adjacent terms into morpho-syntactic patterns such as "noun-noun", "adj.-noun", etc. by calculating association coefficients)
2.3 Term filtering (select only the top M most frequent ones)
3. Create taxonomies (there is a tool for this)
4. Generate associations (you may constrain the creation)
5. Visualize/explore the results
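Steps 2.2 and 2.3 can be illustrated with a minimal Python sketch. It is not the tool used by Feldman et al.: the input format (lists of (word, tag) pairs), the tag names NOUN and ADJ, the function names, and the use of pointwise mutual information as the association coefficient are all assumptions made for illustration.

```python
import math
from collections import Counter

# Assumed tag names; the allowed morpho-syntactic patterns for adjacent words.
PATTERNS = {("NOUN", "NOUN"), ("ADJ", "NOUN")}

def generate_terms(tagged_docs, min_score=0.0):
    """Step 2.2 (sketch): combine adjacent words that match a pattern and
    whose association score (here: pointwise mutual information, an assumed
    choice of coefficient) exceeds min_score. Returns term -> frequency."""
    unigrams, bigrams = Counter(), Counter()
    n_pairs = 0
    for doc in tagged_docs:
        unigrams.update(word for word, _ in doc)
        for (w1, t1), (w2, t2) in zip(doc, doc[1:]):
            n_pairs += 1
            if (t1, t2) in PATTERNS:
                bigrams[(w1, w2)] += 1
    n_words = sum(unigrams.values())
    terms = Counter()
    for (w1, w2), count in bigrams.items():
        p_pair = count / n_pairs
        p_w1 = unigrams[w1] / n_words
        p_w2 = unigrams[w2] / n_words
        pmi = math.log2(p_pair / (p_w1 * p_w2))
        if pmi > min_score:
            terms[f"{w1} {w2}"] = count
    return terms

def filter_terms(terms, m):
    """Step 2.3 (sketch): keep only the top m most frequent terms."""
    return [term for term, _ in terms.most_common(m)]

docs = [
    [("data", "NOUN"), ("mining", "NOUN"), ("is", "VERB"), ("fun", "ADJ")],
    [("data", "NOUN"), ("mining", "NOUN"), ("finds", "VERB"),
     ("frequent", "ADJ"), ("patterns", "NOUN")],
]
top_terms = filter_terms(generate_terms(docs), 1)
```

With this toy input, "data mining" occurs twice and "frequent patterns" once, so the top-1 filter keeps only "data mining".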

2.1: Term Extraction

3: Taxonomy Construction

4: Association Rule Generation

4: Association Rule Generation
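As a rough illustration of step 4, the sketch below derives simple one-term-to-one-term association rules from the term sets of documents, using the usual support and confidence thresholds. The function name, the input format, and the restriction to single-term rules are assumptions for illustration; the system described in the paper also lets the user constrain rule creation, which is not modelled here.

```python
from collections import Counter
from itertools import combinations

def association_rules(doc_terms, min_support, min_confidence):
    """Return sorted rules (lhs, rhs, support, confidence) between single
    terms, where support is the fraction of documents containing both terms
    and confidence is count(lhs and rhs) / count(lhs)."""
    n_docs = len(doc_terms)
    term_count, pair_count = Counter(), Counter()
    for terms in doc_terms:
        unique = sorted(set(terms))
        term_count.update(unique)
        pair_count.update(combinations(unique, 2))
    rules = []
    for (a, b), count in pair_count.items():
        support = count / n_docs
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / term_count[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)

documents = [
    {"ibm", "data mining"},
    {"ibm", "data mining"},
    {"ibm", "patent"},
    {"data mining"},
]
rules = association_rules(documents, min_support=0.5, min_confidence=0.6)
```

Here only the pair ("data mining", "ibm") reaches the support threshold (2 of 4 documents), and both rule directions have confidence 2/3.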

5.1: Visualization/Exploration

5.2: Visualization/Exploration

Discovering Trends in Text Databases
Brian Lent, Rakesh Agrawal and Ramakrishnan Srikant
IBM Almaden Research Center, USA
Published in KDD'97
Data Mining course, Autumn 2001, University of Helsinki
Summary by Mika Klemettinen

Discovering Trends in Text Databases
Basic ideas:
- Identify frequent phrases using sequential pattern mining (see the slides and summaries from the Agrawal et al. paper "Mining Sequential Patterns" (MSP))
- Generate histories of phrases
- Find phrases that satisfy a specified trend
Definitions:
- Phrase: a phrase p is <(w1)(w2) … (wn)>, where each wi is a word
- 1-phrase: <<(IBM)> <(data)(mining)>>
- 2-phrase: <<(IBM)> <(data)(mining)>> <<(Anderson)(Consulting)> <(decision)(support)>>
- Itemset, sequence, "is contained", etc.: as in the MSP paper
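The definitions above can be made concrete with a small Python sketch. The tuple representation and the function names are assumptions for illustration. Containment follows the subsequence idea from the MSP paper, simplified so that every element is a single word and gap constraints are ignored.

```python
# A phrase is modelled as a tuple of words; a 1-phrase as a tuple of phrases.
phrase = ("data", "mining")
one_phrase = (("IBM",), ("data", "mining"))

def is_contained(phrase, words):
    """True if the words of `phrase` occur in `words` in the same order,
    possibly with other words in between (subsequence containment)."""
    remaining = iter(words)
    # `w in remaining` consumes the iterator up to the match, so each next
    # word is only searched for after the previous one was found.
    return all(w in remaining for w in phrase)

def support(phrase, documents):
    """Fraction of documents that contain the phrase."""
    return sum(is_contained(phrase, doc) for doc in documents) / len(documents)

documents = [
    ["IBM", "data", "stream", "mining"],
    ["mining", "data"],
]
```

For these two toy documents, `support(phrase, documents)` is 0.5: the first document contains "data" followed later by "mining", while the second has the words in the wrong order.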

Discovering Trends in Text Databases
Gaps:
- Minimum and maximum gaps between adjacent words: identify relations of words/phrases inside sentences/paragraphs, between words/phrases in different paragraphs, between words/phrases in different sections, etc.
- Sentence boundary: 1000
- Paragraph boundary:
- Section boundary:
Phases:
- Partition the data/documents based on their time stamps and create phrases for each partition (Lent et al. use patent documents)
- Select the frequent phrases and save their frequencies
- Define shape queries using SDL (Shape Definition Language)
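The gap and history ideas can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: only the sentence offset of 1000 comes from the slide (the paragraph and section offsets are not given here, so they are left out), all function names are hypothetical, phrases are matched as contiguous words, and the `is_rising` check is a plain stand-in for a shape query rather than actual SDL.

```python
SENTENCE_STEP = 1000  # sentence boundary offset, as given on the slide

def number_words(sentences, sentence_step=SENTENCE_STEP):
    """Assign each word a position; crossing a sentence boundary adds
    sentence_step, so a small maximum gap cannot span two sentences."""
    positions, pos = [], 0
    for sentence in sentences:
        for word in sentence:
            pos += 1
            positions.append((word, pos))
        pos += sentence_step
    return positions

def within_gap(pos_a, pos_b, min_gap, max_gap):
    """True if the distance from pos_a to pos_b respects both gap limits."""
    return min_gap <= pos_b - pos_a <= max_gap

def contains(doc, phrase):
    """True if `phrase` occurs in `doc` as a run of contiguous words."""
    n = len(phrase)
    return any(tuple(doc[i:i + n]) == tuple(phrase)
               for i in range(len(doc) - n + 1))

def phrase_history(docs_by_period, phrase):
    """Per time partition, the fraction of documents containing the phrase."""
    return [(period, sum(contains(d, phrase) for d in docs) / len(docs))
            for period, docs in sorted(docs_by_period.items())]

def is_rising(history):
    """Stand-in for a shape query: frequency strictly increases over time."""
    values = [freq for _, freq in history]
    return len(values) >= 2 and all(a < b for a, b in zip(values, values[1:]))

positions = number_words([["data", "mining"], ["text", "trends"]])
docs_by_period = {
    1995: [["data", "mining"], ["databases"]],
    1996: [["data", "mining"], ["text", "data", "mining"]],
}
history = phrase_history(docs_by_period, ("data", "mining"))
```

In the toy data, "mining" gets position 2 and "text" position 1003, so a maximum gap below 1000 keeps matches inside one sentence; the history of "data mining" rises from 0.5 in 1995 to 1.0 in 1996.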

Discovering Trends in Text Databases

Discovering Trends in Text Databases

Discovering Trends in Text Databases