Chapter 2 Data, Text, and Web Mining. Data Mining Concepts and Applications  Data mining (DM) A process that uses statistical, mathematical, artificial.

Slides:



Advertisements
Similar presentations
Supporting End-User Access
Advertisements

By: Mr Hashem Alaidaros MIS 211 Lecture 4 Title: Data Base Management System.
Data Mining Glen Shih CS157B Section 1 Dr. Sin-Min Lee April 4, 2006.
Chapter 9 Business Intelligence Systems
Chapter 9 DATA WAREHOUSING Transparencies © Pearson Education Limited 1995, 2005.
DATA, TEXT, AND WEB MINING
Data Mining Knowledge Discovery in Databases Data 31.
DATA WAREHOUSING.
Database Access and Techniques, most notably Data Mining.
Data Mining By Archana Ketkar.
Data Mining – Intro.
1 Data and Knowledge Management. 2 Data Management: A Critical Success Factor The difficulties and the process Data sources and collection Data quality.
CS157A Spring 05 Data Mining Professor Sin-Min Lee.
DASHBOARDS Dashboard provides the managers with exactly the information they need in the correct format at the correct time. BI systems are the foundation.
Business Intelligence
Chapter 4: Data Mining for Business Intelligence
Enterprise systems infrastructure and architecture DT211 4
Chapter 4 Data, Text, and Web Mining
OLAM and Data Mining: Concepts and Techniques. Introduction Data explosion problem: –Automated data collection tools and mature database technology lead.
Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization.
CISB594 – Business Intelligence Data Mining. CISB594 – Business Intelligence Reference Materials used in this presentation are extracted mainly from the.
Data Mining Techniques
Intelligent Systems Lecture 23 Introduction to Intelligent Data Analysis (IDA). Example of system for Data Analyzing based on neural networks.
Data Mining Chun-Hung Chou
Data Management Turban, Aronson, and Liang Decision Support Systems and Intelligent Systems, Seventh Edition.
DATA, TEXT, AND WEB MINING
Ihr Logo Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization Turban, Aronson, and Liang.
Chapter 4 Data, Text, and Web Mining
CISB594 – Business Intelligence Data Mining Part I.
Chapter 7 DATA, TEXT, AND WEB MINING Pages , 311, Sections 7.3, 7.5, 7.6.
Copyright © 2009 Pearson Education, Inc. Slide 6-1 Chapter 6 E-commerce Marketing Concepts.
Data Mining and Application Part 1: Data Mining Fundamentals Part 2: Tools for Knowledge Discovery Part 3: Advanced Data Mining Techniques Part 4: Intelligent.
Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D.
Zhangxi Lin ISQS 3358 Texas Tech University 1.  Define data mining and list its objectives and benefits  Understand different purposes and applications.
Chapter 1 Introduction to Data Mining
Introduction to Data Mining Group Members: Karim C. El-Khazen Pascal Suria Lin Gui Philsou Lee Xiaoting Niu.
INTRODUCTION TO DATA MINING MIS2502 Data Analytics.
Introduction to Text and Web Mining. I. Text Mining is part of our lives.
Datawarehouse Objectives
Data Mining Chapter 1 Introduction -- Basic Data Mining Tasks -- Related Concepts -- Data Mining Techniques.
Fox MIS Spring 2011 Data Mining Week 9 Introduction to Data Mining.
Data Mining By Dave Maung.
C6 Databases. 2 Traditional file environment Data Redundancy and Inconsistency: –Data redundancy: The presence of duplicate data in multiple data files.
Ihr Logo Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization Turban, Aronson, and Liang.
CISB594 – Business Intelligence Data Mining. CISB594 – Business Intelligence Reference Materials used in this presentation are extracted mainly from the.
Data Mining – Intro. Course Overview Spatial Databases Temporal and Spatio-Temporal Databases Multimedia Databases Data Mining.
6.1 © 2010 by Prentice Hall 6 Chapter Foundations of Business Intelligence: Databases and Information Management.
Advanced Database Course (ESED5204) Eng. Hanan Alyazji University of Palestine Software Engineering Department.
Data Mining In contrast to the traditional (reactive) DSS tools, the data mining premise is proactive. Data mining tools automatically search the data.
CRM - Data mining Perspective. Predicting Who will Buy Here are five primary issues that organizations need to address to satisfy demanding consumers:
Chapter 5: Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization DECISION SUPPORT SYSTEMS AND BUSINESS.
Data Mining BY JEMINI ISLAM. Data Mining Outline: What is data mining? Why use data mining? How does data mining work The process of data mining Tools.
CHAPTER 4 Data Warehousing, Access, Analysis, Mining, and Visualization 2 1.
CISB594 – Business Intelligence Data Mining. CISB594 – Business Intelligence Reference Materials used in this presentation are extracted mainly from the.
MIS2502: Data Analytics Advanced Analytics - Introduction.
An Introduction Student Name: Riaz Ahmad Program: MSIT( ) Subject: Data warehouse & Data Mining.
DATA MINING PREPARED BY RAJNIKANT MODI REFERENCE:DOUG ALEXANDER.
Academic Year 2014 Spring Academic Year 2014 Spring.
Data Mining. Overview the extraction of hidden predictive information from large databases Data mining tools predict future trends and behaviors, allowing.
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 28 Data Mining Concepts.
Chapter 8: Web Analytics, Web Mining, and Social Analytics
1 Ahmed K. Ezzat, Data, Text, and Web Mining for BI Data Mining and Big Data.
Dr.S.Sridhar,Ph.D., RACI(Paris),RZFM(Germany),RMR(USA),RIEEEProc.
Data Mining – Intro.
MIS2502: Data Analytics Advanced Analytics - Introduction
DATA MINING © Prentice Hall.
Data Warehousing and Data Mining
Supporting End-User Access
TEXTAND WEB MINING.
TEXT and WEB MINING.
Presentation transcript:

Chapter 2 Data, Text, and Web Mining

Data Mining Concepts and Applications  Data mining (DM) A process that uses statistical, mathematical, artificial intelligence and machine-learning techniques to extract and identify useful information and subsequent knowledge from large databases

Data Mining Concepts and Applications  Major characteristics and objectives of data mining Data are often buried deep within very large databases, which sometimes contain data from several years; sometimes the data are cleansed and consolidated in a data warehouse Data are often buried deep within very large databases, which sometimes contain data from several years; sometimes the data are cleansed and consolidated in a data warehouse The data mining environment is usually client/server architecture or a Web-based architecture The data mining environment is usually client/server architecture or a Web-based architecture

Data Mining Concepts and Applications  Major characteristics and objectives of data mining Sophisticated new tools help to remove the information ore buried in corporate files or archival public records; finding it involves massaging and synchronizing the data to get the right results. Sophisticated new tools help to remove the information ore buried in corporate files or archival public records; finding it involves massaging and synchronizing the data to get the right results. The miner is often an end user, empowered by data drills and other power query tools to ask ad hoc questions and obtain answers quickly, with little or no programming skill The miner is often an end user, empowered by data drills and other power query tools to ask ad hoc questions and obtain answers quickly, with little or no programming skill

Data Mining Concepts and Applications  Major characteristics and objectives of data mining Striking it rich often involves finding an unexpected result and requires end users to think creatively Striking it rich often involves finding an unexpected result and requires end users to think creatively Data mining tools are readily combined with spreadsheets and other software development tools; the mined data can be analyzed and processed quickly and easily Data mining tools are readily combined with spreadsheets and other software development tools; the mined data can be analyzed and processed quickly and easily Parallel processing is sometimes used because of the large amounts of data and massive search efforts Parallel processing is sometimes used because of the large amounts of data and massive search efforts

Data Mining Concepts and Applications  How data mining works Data mining tools find patterns in data and may even infer rules from them Data mining tools find patterns in data and may even infer rules from them Three methods are used to identify patterns in data: Three methods are used to identify patterns in data: 1.Simple models 2.Intermediate models 3.Complex models

Data Mining Concepts and Applications  Classification Supervised induction used to analyze the historical data stored in a database and to automatically generate a model that can predict future behavior  Common tools used for classification are: Neural networks Neural networks Decision trees Decision trees If-then-else rules If-then-else rules

Data Mining Concepts and Applications  Clustering Partitioning a database into segments in which the members of a segment share similar qualities  Association A category of data mining algorithm that establishes relationships about items that occur together in a given record

Data Mining Concepts and Applications  Sequence discovery The identification of associations over time  Visualization can be used in conjunction with data mining to gain a clearer understanding of many underlying relationships

Data Mining Concepts and Applications  Regression is a well-known statistical technique that is used to map data to a prediction value  Forecasting estimates future values based on patterns within large sets of data

Data Mining Concepts and Applications  Hypothesis-driven data mining Begins with a proposition by the user, who then seeks to validate the truthfulness of the proposition  Discovery-driven data mining Finds patterns, associations, and relationships among the data in order to uncover facts that were previously unknown or not even contemplated by an organization

Data Mining Concepts and Applications – Marketing – Banking – Retailing and sales – Manufacturing and production – Brokerage and securities trading – Insurance – Computer hardware and software – Government and defense – Airlines – Health care – Broadcasting – Police – Homeland security Data mining applications

Data Mining Techniques and Tools  Data mining tools and techniques can be classified based on the structure of the data and the algorithms used: Statistical methods Statistical methods Decision trees Decision trees Defined as a root followed by internal nodes. Each node (including root) is labeled with a question and arcs associated with each node cover all possible responses

Data Mining Techniques and Tools  Data mining tools and techniques can be classified based on the structure of the data and the algorithms used: Case-based reasoning Case-based reasoning Neural computing Neural computing Intelligent agents Intelligent agents Genetic algorithms Genetic algorithms Other tools Other tools Rule inductionRule induction Data visualizationData visualization

Data Mining Techniques and Tools  A general algorithm for building a decision tree: 1. Create a root node and select a splitting attribute. 2. Add a branch to the root node for each split candidate value and label 3. Take the following iterative steps: a.Classify data by applying the split value. b.If a stopping point is reached, then create leaf node and label it. Otherwise, build another subtree

Data Mining Techniques and Tools

 Classes of data mining tools and techniques as they relate to information and business intelligence (BI) technologies Mathematical and statistical analysis packages Mathematical and statistical analysis packages Personalization tools for Web-based marketing Personalization tools for Web-based marketing Analytics built into marketing platforms Analytics built into marketing platforms Advanced CRM tools Advanced CRM tools Analytics added to other vertical industry-specific platforms Analytics added to other vertical industry-specific platforms Analytics added to database tools (e.g., OLAP) Analytics added to database tools (e.g., OLAP) Standalone data mining tools Standalone data mining tools

Data Mining Project Processes

Text Mining  Text mining Application of data mining to nonstructured or less structured text files. It entails the generation of meaningful numerical indices from the unstructured text and then processing these indices using various data mining algorithms

Text Mining  Text mining helps organizations: Find the “hidden” content of documents, including additional useful relationships Find the “hidden” content of documents, including additional useful relationships Relate documents across previous unnoticed divisions Relate documents across previous unnoticed divisions Group documents by common themes Group documents by common themes

Text Mining  Applications of text mining Automatic detection of spam or phishing through analysis of the document content Automatic detection of spam or phishing through analysis of the document content Automatic processing of messages or s to route a message to the most appropriate party to process that message Automatic processing of messages or s to route a message to the most appropriate party to process that message Analysis of warranty claims, help desk calls/reports, and so on to identify the most common problems and relevant responses Analysis of warranty claims, help desk calls/reports, and so on to identify the most common problems and relevant responses

Text Mining  Applications of text mining Analysis of related scientific publications in journals to create an automated summary view of a particular discipline Analysis of related scientific publications in journals to create an automated summary view of a particular discipline Creation of a “relationship view” of a document collection Creation of a “relationship view” of a document collection Qualitative analysis of documents to detect deception Qualitative analysis of documents to detect deception

Text Mining  How to mine text 1. Eliminate commonly used words (stop-words) 2. Replace words with their stems or roots (stemming algorithms) 3. Consider synonyms and phrases 4. Calculate the weights of the remaining terms

Web Mining  Web mining The discovery and analysis of interesting and useful information from the Web, about the Web, and usually through Web- based tools

Web Mining

 Web content mining The extraction of useful information from Web pages  Web structure mining The development of useful information from the links included in the Web documents  Web usage mining The extraction of useful information from the data being generated through webpage visits, transaction, etc.

Web Mining  Uses for Web mining: Determine the lifetime value of clients Determine the lifetime value of clients Design cross-marketing strategies across products Design cross-marketing strategies across products Evaluate promotional campaigns Evaluate promotional campaigns Target electronic ads and coupons at user groups Target electronic ads and coupons at user groups Predict user behavior Predict user behavior Present dynamic information to users Present dynamic information to users

Web Mining