Published by Sherilyn Warner. Modified over 9 years ago.
Page 1 WEB MINING by NINI P SURESH. PROJECT CO-ORDINATOR: Kavitha Murugeshan
Page 2 OUTLINE
- Introduction
- Data mining vs. Web mining
- Web mining subtasks
- Challenges
- Taxonomy
- Web content mining
- Web structure mining
- Web usage mining
- Applications
Page 3 INTRODUCTION
Users increasingly need automated tools to find, extract, filter and evaluate the information and resources they want. Search engines aim only to discover resources on the Web.
Page 4 INTRODUCTION
Need for Web Mining:
- Narrow search scope
- Low precision
Page 5 INTRODUCTION
Other Approaches:
- Database approach (DB)
- Information retrieval
- Natural language processing (NLP)
- Web document community
Page 6 WEB MINING DEFINITION
Web mining refers to the overall process of discovering potentially useful and previously unknown information or knowledge from Web data.
Page 7
DATA MINING: Extraction of useful patterns from data sources such as databases, texts, the Web, images, etc.
WEB MINING: Extraction of relevant information hidden in Web-related data, such as hypertext documents on the Web.
Page 8 WEB MINING SUBTASKS
- Resource finding
- Information selection & preprocessing
- Generalization
- Analysis
Page 9 CHALLENGES
- Searching for relevant information on the Web
- Creating knowledge
- Personalization of information
- Learning patterns
- Uniformity & standardisation
Page 10 CHALLENGES
- Redundant information
- Noisy Web
- Monitoring changes
- Sites providing services
- Privacy
Page 11 TAXONOMY
Web Mining
- Web Structure Mining
  - Link Mining
  - URL Mining
  - Internal Structure Mining
- Web Content Mining
  - Web Text Mining
  - Web Multimedia Mining
- Web Usage Mining
  - Personalized Usage Tracking
  - General Access Pattern Tracking
Page 12 WEB CONTENT MINING
- Discovers useful information by analysing page content
- Automatic process that goes beyond keyword extraction
- Approaches to restructure document content
- Two groups of mining strategies
Page 13 WEB CONTENT MINING
Agent-based Approach:
- Intelligent search agents
- Information filtering/categorization
- Personalized Web agents
Page 14 WEB CONTENT MINING
Database Approach:
- Multilevel databases
- Web query systems
Page 15 WEB STRUCTURE MINING
- Discovers structure information from the Web
- Web graph: Web pages as nodes & hyperlinks as edges
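To make the Web graph idea concrete, here is a minimal sketch that turns a list of hyperlinks into an adjacency list and counts in-links per page. The page names and the helper `build_web_graph` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def build_web_graph(links):
    """Build an adjacency list from (source_page, target_page) hyperlink pairs."""
    graph = defaultdict(set)
    for source, target in links:
        graph[source].add(target)
        graph[target]  # touch the key so pages without out-links still become nodes
    return {page: sorted(targets) for page, targets in graph.items()}


# Hypothetical hyperlinks between four pages (not taken from the slides).
links = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A"), ("D", "C")]
graph = build_web_graph(links)

# In-degree: how many hyperlinks point at each page.
in_degree = {page: 0 for page in graph}
for targets in graph.values():
    for target in targets:
        in_degree[target] += 1

print(graph)
print(in_degree)
```

Pages are nodes, each hyperlink is a directed edge, and link-based measures such as in-degree can be read straight off this structure.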
Page 16 WEB STRUCTURE MINING
Two algorithms for handling links:
- PageRank
- HITS
Page 17 WEB STRUCTURE MINING
PageRank:
- Metric for ranking hypertext documents
- A page's rank depends on the ranks of the pages pointing to it
- Iterative process
Page 18 WEB STRUCTURE MINING
PageRank notation:
- n: number of nodes in the graph
- Outdegree(q): number of hyperlinks on page q
- d: damping factor
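The formula these symbols belong to is not preserved in this transcript. As a minimal sketch, assume the widely used form PR(p) = (1 - d)/n + d * (sum of PR(q)/Outdegree(q) over all pages q that link to p), where d is taken to be the probability of following a link. The slide's own convention for d may differ, and the value 0.85, the example graph, the function name `pagerank` and the treatment of pages without out-links are all assumptions made here for illustration.

```python
def pagerank(graph, d=0.85, iterations=50):
    """Iteratively compute PageRank.

    graph maps each page to the list of pages it links to; every linked
    page must also be a key. d is the assumed damping factor.
    """
    n = len(graph)
    rank = {page: 1.0 / n for page in graph}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - d) / n for page in graph}
        for q, targets in graph.items():
            if targets:
                # q passes its rank on in equal parts: PR(q) / Outdegree(q).
                share = d * rank[q] / len(targets)
                for p in targets:
                    new_rank[p] += share
            else:
                # Assumption: a page with no out-links spreads its rank evenly.
                share = d * rank[q] / n
                for p in graph:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical four-page graph.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(graph)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this example page C, which receives the most in-links, ends up with the highest rank, while D, which nothing links to, keeps only the random-jump share.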
Page 19 WEB STRUCTURE MINING
HITS:
- Iterative algorithm
- Identifies topic hubs & authorities
- Input: search results returned by a traditional text indexing technique
Page 20 WEB STRUCTURE MINING
- Assigns each hub a weight based on the authoritativeness of the pages it points to
- Outputs the pages with the largest hub & authority weights
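The hub and authority updates can be sketched as follows. This is an illustrative version of the standard HITS iteration, not necessarily the exact variant the slides had in mind: the example graph, the function name `hits`, the fixed iteration count and the Euclidean normalisation are assumptions.

```python
import math


def hits(graph, iterations=50):
    """Return (hub, authority) scores for a graph given as page -> linked pages.

    Every linked page must also appear as a key of graph.
    """
    pages = list(graph)
    hub = {page: 1.0 for page in pages}
    auth = {page: 1.0 for page in pages}
    for _ in range(iterations):
        # Authority of p: sum of the hub scores of pages linking to p.
        auth = {page: 0.0 for page in pages}
        for q, targets in graph.items():
            for p in targets:
                auth[p] += hub[q]
        # Hub of q: sum of the authority scores of the pages q links to.
        hub = {q: sum(auth[p] for p in graph[q]) for q in pages}
        # Normalise both score vectors so the values stay bounded.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for page in scores:
                scores[page] /= norm
    return hub, auth


# Hypothetical result set: four pages and the links among them.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
hub, auth = hits(graph)
best_hub = max(hub, key=hub.get)
best_authority = max(auth, key=auth.get)
print("best hub:", best_hub, "best authority:", best_authority)
```

In practice the input graph would be built from the pages returned by a text search plus their neighbours; here it is hard-coded to keep the sketch short.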
Page 21 WEB USAGE MINING
- Extracts information from server logs
- Discovers user access patterns of Web pages
- Decomposed into 3 subtasks: preprocessing, mining algorithms and pattern analysis
Pipeline: site files and raw logs -> Preprocessing -> user session file -> Mining algorithms -> rules, patterns & statistics -> Pattern Analysis -> interesting rules, patterns & statistics
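As a starting point for the raw-log stage, the sketch below parses one server log entry. The slides do not say which log format is used; this assumes the Common Log Format, and the sample line, the regular expression and the function name `parse_log_line` are illustrative.

```python
import re
from datetime import datetime

# Assumed layout: host ident user [timestamp] "METHOD url PROTOCOL" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) \S+" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_log_line(line):
    """Return a dict with the fields of one log entry, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    return {
        "host": match.group("host"),
        "time": datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z"),
        "method": match.group("method"),
        "url": match.group("url"),
        "status": int(match.group("status")),
    }


# Hypothetical log entry.
sample = '192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'
entry = parse_log_line(sample)
print(entry)
```

Each parsed entry carries the host, timestamp and requested URL, which is the information the later preprocessing steps work from.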
Page 22 WEB USAGE MINING
Preprocessing:
- Data cleaning
- User identification
- User session identification
- Access path supplement
- Transaction identification
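The first three preprocessing steps can be sketched as below. This is a simplified illustration under stated assumptions: users are identified by host address alone, a session ends after 30 minutes of inactivity, and requests for images, stylesheets and scripts or with a non-200 status are treated as noise. None of these heuristics, nor the record layout and function names, come from the slides.

```python
from collections import defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds of inactivity that end a session (assumed)
IGNORED_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js")  # assumed noise


def clean(records):
    """Data cleaning: keep successful requests for actual pages."""
    return [
        r for r in records
        if r["status"] == 200 and not r["url"].lower().endswith(IGNORED_SUFFIXES)
    ]


def sessionize(records):
    """User and session identification: group page views per host, split on long gaps."""
    by_user = defaultdict(list)
    for record in sorted(records, key=lambda r: r["time"]):
        by_user[record["host"]].append(record)
    sessions = []
    for host, visits in by_user.items():
        current = [visits[0]["url"]]
        for previous, visit in zip(visits, visits[1:]):
            if visit["time"] - previous["time"] > SESSION_TIMEOUT:
                sessions.append((host, current))
                current = []
            current.append(visit["url"])
        sessions.append((host, current))
    return sessions


# Hypothetical parsed log records; "time" is seconds since the start of the log.
records = [
    {"host": "10.0.0.1", "time": 0, "url": "/index.html", "status": 200},
    {"host": "10.0.0.1", "time": 5, "url": "/logo.gif", "status": 200},
    {"host": "10.0.0.1", "time": 60, "url": "/products.html", "status": 200},
    {"host": "10.0.0.2", "time": 90, "url": "/index.html", "status": 200},
    {"host": "10.0.0.2", "time": 120, "url": "/missing.html", "status": 404},
    {"host": "10.0.0.1", "time": 4000, "url": "/contact.html", "status": 200},
]
sessions = sessionize(clean(records))
for host, pages in sessions:
    print(host, pages)
```

Access path supplement and transaction identification are left out; they would need the site's link structure in addition to the log.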
Page 23 WEB USAGE MINING
Pattern discovery:
- Statistical analysis
- Association rules
- Clustering analysis
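Of the techniques listed, association rules are easy to illustrate on session data. The sketch below derives rules of the form "sessions that visit X also visit Y" between pairs of pages, using the usual support and confidence measures. The sessions, the thresholds and the function name are hypothetical, and a real miner such as Apriori would also handle larger itemsets.

```python
from itertools import permutations


def pairwise_association_rules(sessions, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) tuples for page pairs."""
    n = len(sessions)
    page_sets = [set(session) for session in sessions]
    pages = sorted(set().union(*page_sets))

    def support(items):
        # Fraction of sessions that contain every page in items.
        return sum(1 for pages_seen in page_sets if items <= pages_seen) / n

    rules = []
    for a, b in permutations(pages, 2):
        pair_support = support({a, b})
        if pair_support < min_support:
            continue
        confidence = pair_support / support({a})
        if confidence >= min_confidence:
            rules.append((a, b, pair_support, confidence))
    return rules


# Hypothetical user sessions (lists of visited pages).
sessions = [
    ["/index", "/products", "/cart"],
    ["/index", "/products"],
    ["/index", "/about"],
    ["/products", "/cart"],
]
rules = pairwise_association_rules(sessions)
for a, b, sup, conf in rules:
    print(f"{a} -> {b}  support={sup:.2f}  confidence={conf:.2f}")
```

Raising `min_support` or `min_confidence` prunes weaker rules, which is one simple way to separate interesting patterns from the rest.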
Page 24 WEB USAGE MINING
- Classification analysis
- Sequential patterns
- Dependency modeling
Page 25 WEB USAGE MINING
Pattern Analysis:
- Eliminates irrelevant rules or patterns
- Extracts interesting patterns
Page 26 APPLICATIONS
- Personalized services
- Improving website design
- System improvement
- Predicting trends
- Carrying out intelligent business
Page 27 PROS
- High trade volumes
- Classify threats & fight against terrorism
- Establish better customer relationships
- Increase profitability
Page 28 CONS
- Invasion of privacy
- Discrimination by controversial attributes
Page 29 CONCLUSION
- Rapidly growing area
- Promising area of future research
Page 30 REFERENCES
[1] http://en.wikipedia.org/wiki/Web_mining
[2] http://www.galeas.de/webimining.html
[3] Jaideep Srivastava, Robert Cooley, Mukund Deshpande, Pang-Ning Tan, Web Usage Mining: Discovery and Applications of Usage Patterns from Web Data, SIGKDD Explorations, ACM SIGKDD, Jan 2000.
[4] Miguel Gomes da Costa Júnior, Zhiguo Gong, Web Structure Mining: An Introduction, Proceedings of the 2005 IEEE International Conference on Information Acquisition.
[5] R. Cooley, B. Mobasher, J. Srivastava, Web Mining: Information and Pattern Discovery on the World Wide Web, ICTAI97.
[6] Brijendra Singh, Hemant Kumar Singh, Web Data Mining Research: A Survey, 2010 IEEE.
[7] Soumen Chakrabarti, Mining the Web: Discovering Knowledge from Hypertext Data, Part 2, 2003 edition.
[8] Anthony Scime, Web Mining: Applications and Techniques.
Page 31 WEB MINING Thank You