WEB MINING. In recent years the growth of the World Wide Web exceeded all expectations. Today there are several billions of HTML documents, pictures and.


1 WEB MINING

2 In recent years the growth of the World Wide Web has exceeded all expectations. Today several billion HTML documents, pictures and other multimedia files are available via the internet, and the number is still rising. Given the impressive variety of the web, however, retrieving interesting content has become a very difficult task.

3
- The Web is the single largest data source in the world
- Due to the heterogeneity and lack of structure of web data, mining is a challenging task
- Multidisciplinary field: data mining, machine learning, natural language processing, statistics, databases, information retrieval, multimedia, etc.

4
- Enormous wealth of information on the Web
- Lots of data on user access patterns
- Possible to mine interesting nuggets of information

5 Structured Data / Unstructured Data: OLE DB offers some solutions!

6

7 APPLICATIONS OF WEB MINING
- E-commerce (Infrastructure)
- Generate user profiles
- Targeted advertising
- Fraud
- Similar image retrieval
- Information retrieval (Search) on the Web
- Automated generation of topic hierarchies
- Web knowledge bases
- Extraction of schema for XML documents
- Network Management
- Performance management
- Fault management

8 [Diagram: Service Provider Network, Router, Server]
Objective: To deliver content to users quickly and reliably
- Traffic management
- Fault management

9
- Examines the contents of web pages as well as the results of web searches
- Can be thought of as extending the work performed by basic search engines
- Search engines have crawlers to search the web and gather information, indexing techniques to store the information, and query processing support to provide information to the users
- Web Content Mining is the process of extracting knowledge from web contents
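The gather, index and query stages described above can be illustrated with a small sketch. This is not taken from the slides: it assumes the crawler has already fetched the pages, uses a deliberately naive whitespace tokenizer, and the page names and helper functions (`tokenize`, `build_index`, `search`) are hypothetical.

```python
from collections import defaultdict

def tokenize(text):
    """Lowercase and split on whitespace; a deliberately naive tokenizer."""
    return text.lower().split()

def build_index(pages):
    """Indexing stage: map each term to the set of page ids containing it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for term in tokenize(text):
            index[term].add(page_id)
    return index

def search(index, query):
    """Query stage: return ids of pages that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    # Copy the first posting set so the index itself is never mutated.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Hypothetical pages standing in for what a crawler would have gathered.
pages = {
    "a.html": "web mining extracts knowledge from web data",
    "b.html": "search engines index web pages",
    "c.html": "data mining methods for structured data",
}
index = build_index(pages)
matches = search(index, "web mining")
```

Real search engines add ranking, stemming and stop-word handling on top of this; the sketch only shows how an inverted index lets a query be answered without rescanning every page.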

10
- Many methods are designed to analyze structured data
- If we can represent documents by a set of attributes, we will be able to use existing data mining methods
- How to represent a document?
- Vector-based representation (referred to as "bag of words" as it is invariant to permutations)
- Use statistics to add a numerical dimension to unstructured text
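A minimal sketch of the bag-of-words idea follows. The slide does not say which statistic is meant, so the TF-IDF weighting here is one common choice and an assumption, as are the whitespace tokenizer and the helper names (`bag_of_words`, `to_vector`, `tf_idf`).

```python
import math
from collections import Counter

def bag_of_words(text):
    """Count term occurrences; word order is discarded."""
    return Counter(text.lower().split())

def to_vector(bag, vocabulary):
    """Turn a bag into a fixed-length count vector over a shared vocabulary."""
    return [bag[term] for term in vocabulary]

def tf_idf(docs):
    """Weight raw counts by log(N / document frequency).

    TF-IDF is an assumed example of a statistic, not named in the slides.
    """
    bags = [bag_of_words(d) for d in docs]
    vocabulary = sorted(set().union(*bags))
    n = len(docs)
    # Document frequency: number of documents in which each term appears.
    df = {t: sum(1 for b in bags if t in b) for t in vocabulary}
    vectors = [[b[t] * math.log(n / df[t]) for t in vocabulary] for b in bags]
    return vocabulary, vectors

# Two orderings of the same words give the same bag (permutation invariance).
same = bag_of_words("web mining web") == bag_of_words("mining web web")

vocabulary, vectors = tf_idf(["web mining", "web usage"])
```

Once each document is a numeric vector over the same vocabulary, existing data mining methods for structured data (clustering, classification) can be applied to it. Terms that occur in every document, such as "web" in this toy corpus, receive weight zero under this weighting.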

11 WEB USAGE MINING PROCESS

12 Web Usage Mining Process

13 WEB USAGE MINING PROCESS

