Data Mining By Dave Maung
What is Data Mining? The process of automatically searching large volumes of data for patterns. Also known as knowledge discovery in databases (KDD).
Different Types of Data Mining: Relational data mining, text mining, and web mining.
Relational Data Mining: A data mining technique for relational databases. Relational data mining algorithms look for patterns across multiple tables. Uses classification rules and association rules.
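Association rules are commonly scored by support (how often the items occur together) and confidence (how often the rule holds when its left side occurs). The following is a minimal sketch, assuming the rows from several tables have already been joined into one item set per transaction; the data and the function names `support` and `confidence` are hypothetical illustrations, not part of the original slides.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base


# Hypothetical data: each set is one customer's purchases after joining tables.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support(transactions, {"bread", "milk"})
rule_confidence = confidence(transactions, {"bread"}, {"milk"})
print(rule_support, rule_confidence)
```

A rule such as "bread -> milk" would typically be kept only if both scores exceed thresholds chosen by the analyst.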
Classification: Predicting the class of an item. Finding rules that partition the given data into disjoint groups. A popular classification method is the decision tree.
Decision Tree: A graph of decisions and their possible consequences. Decision trees are constructed to help make decisions. A decision tree uses a tree structure.
Example of Decision Tree
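One way to picture a decision tree is as a nested structure in which each internal node tests one attribute and each leaf holds a predicted class. The sketch below is illustrative only: the attributes (`outlook`, `humidity`, `windy`), the class labels, and the `classify` helper are hypothetical and are not taken from the original slide.

```python
# Hypothetical decision tree: internal nodes test one attribute,
# leaves (plain strings) hold the predicted class.
TREE = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "humidity",
            "branches": {"high": "stay in", "normal": "play"},
        },
        "overcast": "play",
        "rainy": {
            "attribute": "windy",
            "branches": {True: "stay in", False: "play"},
        },
    },
}


def classify(tree, item):
    """Walk from the root to a leaf, following the branch that matches the item."""
    node = tree
    while isinstance(node, dict):
        value = item[node["attribute"]]
        node = node["branches"][value]
    return node


print(classify(TREE, {"outlook": "sunny", "humidity": "high", "windy": False}))
```

Each path from the root to a leaf corresponds to one classification rule, so the tree partitions the items into disjoint groups.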
Text Mining is the process of extracting interesting, non-trivial information and knowledge from unstructured text.
Text Mining (continued): Also known as intelligent text analysis, text data mining, unstructured data management, or knowledge discovery in text.
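A very simple first step in text mining is turning unstructured text into term counts. The sketch below assumes English text and a tiny hand-written stop-word list; `STOP_WORDS`, `top_terms`, and the sample sentence are hypothetical, and real text mining goes well beyond counting words.

```python
import re
from collections import Counter

# Hypothetical stop-word list; real systems use much larger ones.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "from"}


def top_terms(text, n=3):
    """Return the n most frequent non-stop-word terms in text with their counts."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)


sample = (
    "Text mining extracts knowledge from unstructured text. "
    "Mining text is harder than mining tables."
)
print(top_terms(sample, 2))
```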
Web Mining is the extraction of interesting and potentially useful patterns and implicit information from artifacts or activity related to the World Wide Web.
Web Mining (continued): Three knowledge discovery domains pertain to web mining: web content mining, web structure mining, and web usage mining.
Web Content Mining is an automatic process that goes beyond keyword extraction. There are two groups of web content mining strategies: those that mine the content of documents directly, and those that improve on the content search of other tools such as search engines.
Web Structure Mining: The link structure of the World Wide Web can reveal more information than just the information contained in documents.
Web Structure Mining (example): Links pointing to a document indicate the popularity of the document. Links coming out of a document indicate the richness or perhaps the variety of topics covered in the document.
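The two link measures above correspond to the in-degree and out-degree of a page in the link graph. A minimal sketch, assuming the links have already been extracted into a dictionary; the page names and the `links` structure are hypothetical.

```python
from collections import Counter

# Hypothetical link graph: page -> pages it links to.
links = {
    "home.html": ["about.html", "products.html", "news.html"],
    "about.html": ["home.html"],
    "products.html": ["home.html", "news.html"],
    "news.html": ["home.html"],
}

# Out-degree: number of links leaving each page (variety of topics covered).
out_degree = {page: len(targets) for page, targets in links.items()}

# In-degree: number of links pointing at each page (popularity).
in_degree = Counter(target for targets in links.values() for target in targets)

most_popular = max(in_degree, key=in_degree.get)
print(most_popular, in_degree[most_popular], out_degree["home.html"])
```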
Web Usage Mining: Web servers record and accumulate data about user interactions whenever requests for resources are received. Web usage mining analyzes the web access logs of different web sites.
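Before access logs can be analyzed, each line has to be split into fields. The sketch below assumes the server writes the Common Log Format, which the slides do not specify; `LOG_PATTERN`, `parse_line`, and the sample entry are hypothetical.

```python
import re

# Assumed Common Log Format:
# host ident user [timestamp] "method path protocol" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return the fields of one access-log line as a dict, or None if malformed."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# Hypothetical log entry for illustration.
entry = parse_line(
    '192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'
)
print(entry["host"], entry["path"], entry["status"])
```

Lines that do not match the assumed format return `None`, so they can be skipped or counted separately.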
Web Usage Mining (continued): There are two main tendencies in web usage mining: general access pattern tracking and customized usage tracking.
General Access Pattern Tracking: Analyzes web logs to understand access patterns and trends. Helps give a better structure and grouping of resource providers. Can be used to restructure sites into a more efficient grouping and to target specific users with specific ads.
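General access patterns can be approximated by counting requests per page and per site section. This is a minimal sketch, assuming the log has already been reduced to (visitor, path) pairs; the data and the `section` helper are hypothetical.

```python
from collections import Counter

# Hypothetical (visitor, path) pairs already extracted from an access log.
requests = [
    ("10.0.0.1", "/products/shoes"),
    ("10.0.0.2", "/products/shoes"),
    ("10.0.0.1", "/products/hats"),
    ("10.0.0.3", "/about"),
    ("10.0.0.2", "/products/hats"),
    ("10.0.0.3", "/products/shoes"),
]

# Hits per page show which resources are requested most often.
page_hits = Counter(path for _, path in requests)


def section(path):
    """First path segment, e.g. '/products/shoes' -> 'products'."""
    return path.strip("/").split("/")[0]


# Hits per top-level section hint at how resources could be grouped.
section_hits = Counter(section(path) for _, path in requests)

print(page_hits.most_common(1), dict(section_hits))
```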
Customized Usage Tracking: Analyzes individual trends in order to customize web sites to users. The success of such applications depends on what and how much valid and reliable knowledge one can discover from the large raw log data.
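Individual trends can be sketched as one usage profile per visitor. The example below assumes visitors can be identified reliably (for instance by a login name), which raw logs do not always allow; the visitor names, paths, and the `favorite_page` helper are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical (visitor, path) pairs already extracted from an access log.
requests = [
    ("alice", "/sports"),
    ("alice", "/sports"),
    ("alice", "/weather"),
    ("bob", "/finance"),
    ("bob", "/finance"),
    ("bob", "/sports"),
]

# Build one page-count profile per visitor.
profiles = defaultdict(Counter)
for visitor, path in requests:
    profiles[visitor][path] += 1


def favorite_page(visitor):
    """Most requested page for one visitor, or None if the visitor is unknown."""
    if visitor not in profiles:
        return None
    return profiles[visitor].most_common(1)[0][0]


print(favorite_page("alice"), favorite_page("bob"))
```

A site could use such a profile to decide which section to show first to a returning visitor.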
Web Mining Architecture
Reference http://wikipedia.com