Web Mining Ref: http://www.csse.monash.edu.au/courseware/cse5230/html/lectures.html.

Lecture Outline
- How big is the web?
- What is “web data”?
- A taxonomy of web mining tasks
- Example: targeted advertising
- Example: personalization
- References

How big is the web?
- It is not easy to determine the size of the web
  - In 1999, one estimate was that there were approximately 350 million web pages, growing at about 1 million pages per day
  - In 2001, Google announced that they were indexing around 3 billion web documents
  - In 2002, Google reported searching 3,083,324,652 web pages
- No matter which of these is more accurate, it’s very big!
- We can view the web as the world’s biggest database
  - The word “database” is used loosely here, because the web has no real formal structure or database schema
  - This makes the application of data mining to the web potentially very useful, but also difficult

What is “web data”?
Web data can be classified as follows [Dun2002]:
- The actual content of web pages (text, images, multimedia)
- Intrapage structure: the HTML or XML mark-up specifying the organization of the page content
- Interpage structure: the links into and out of web pages
- Usage data describing how the users of a web site access pages (navigation patterns)
- User profiles: these can include demographic data obtained from a registration process, or perhaps IP addresses. They can also include information found in cookies

A taxonomy of web mining tasks (1)
- Web Content Mining
  - Web Page Content Mining
  - Search Result Mining
- Web Structure Mining
- Web Usage Mining
  - General Access Pattern Tracking
  - Customized Usage Tracking
From [Dun2002], following [Zai1999].

A taxonomy of web mining tasks (2)
- Web content mining
  - Examines the contents of web pages (text, graphics)
  - Examines the results of web searches
  - Mining systems built on top of existing search engines
  - Similar to traditional information retrieval (text categorization, text filtering, etc.)
  - Often goes further than simple keyword search, e.g. may cluster similar pages
- Web structure mining
  - Looks at page structure, e.g. text in <H1> tags may be more important
  - Looks at links between pages, e.g. pages with many incoming links may be more useful
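The two web structure mining signals mentioned above (heading text and incoming links) can be sketched as follows. This is a minimal illustration using Python's standard html.parser module; the three-page site, its page names and the StructureParser class are all made up for the example, and real systems weigh links far more carefully than a raw count.

```python
from collections import Counter
from html.parser import HTMLParser


class StructureParser(HTMLParser):
    """Collects <h1> text and outgoing link targets from one page."""

    def __init__(self):
        super().__init__()
        self.in_h1 = False
        self.h1_text = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <H1> arrives as "h1".
        if tag == "h1":
            self.in_h1 = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "h1":
            self.in_h1 = False

    def handle_data(self, data):
        if self.in_h1 and data.strip():
            self.h1_text.append(data.strip())


# Hypothetical three-page site, keyed by page name.
pages = {
    "a.html": '<H1>Rugby news</H1><a href="b.html">b</a><a href="c.html">c</a>',
    "b.html": '<h1>Fixtures</h1><a href="c.html">c</a>',
    "c.html": "<h1>Results</h1><p>No links here.</p>",
}

headings = {}
incoming = Counter()
for name, html in pages.items():
    parser = StructureParser()
    parser.feed(html)
    headings[name] = " ".join(parser.h1_text)
    # Count each linking page at most once per target.
    for target in set(parser.links):
        incoming[target] += 1

print(headings)
print(incoming.most_common())
```

Here c.html has the most incoming links, so under the slide's heuristic it would be treated as the most useful page of the three.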

A taxonomy of web mining tasks (3)
- Web usage mining
  - Looks at log files of web access
  - General access tracking looks at history of pages visited
  - Customised usage tracking may be focused on particular kinds of usage, or particular users
  - Involves mining of sequential patterns
  - Can use association rule discovery
  - These patterns can be clustered to reveal users with similar access behaviour
  - Can be used to improve web site design
  - Customize presentation via collaborative filtering
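As a rough sketch of how association rules might be derived from access logs: the example below parses a few made-up log lines in Common Log Format, groups requests into sessions, and computes support and confidence for a rule between two pages. The log contents and the "one host equals one session" rule are simplifying assumptions; note also that association rules ignore the order of visits, whereas sequential pattern mining would keep it.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical access-log lines in Common Log Format.
LOG = """\
10.0.0.1 - - [05/Dec/2002:10:00:00 +0000] "GET /home HTTP/1.0" 200 512
10.0.0.1 - - [05/Dec/2002:10:00:09 +0000] "GET /fixtures HTTP/1.0" 200 734
10.0.0.2 - - [05/Dec/2002:10:01:00 +0000] "GET /home HTTP/1.0" 200 512
10.0.0.2 - - [05/Dec/2002:10:01:30 +0000] "GET /fixtures HTTP/1.0" 200 734
10.0.0.2 - - [05/Dec/2002:10:02:00 +0000] "GET /tickets HTTP/1.0" 200 300
10.0.0.3 - - [05/Dec/2002:10:03:00 +0000] "GET /home HTTP/1.0" 200 512
"""

LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "GET (\S+) [^"]*" \d{3} \S+$')

# Simplification: treat all requests from one host as a single session.
sessions = defaultdict(set)
for line in LOG.splitlines():
    match = LINE.match(line)
    if match:
        host, path = match.groups()
        sessions[host].add(path)

n = len(sessions)
page_count = defaultdict(int)
pair_count = defaultdict(int)
for visited in sessions.values():
    for page in visited:
        page_count[page] += 1
    for pair in combinations(sorted(visited), 2):
        pair_count[pair] += 1


def rule(a, b):
    """Support and confidence of the rule a -> b over all sessions."""
    both = pair_count[tuple(sorted((a, b)))]
    return both / n, both / page_count[a]


support, confidence = rule("/home", "/fixtures")
print(f"/home -> /fixtures: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy log, two of the three sessions contain both pages, and every session that visits /fixtures also visits /home, so the reverse rule has confidence 1.0.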

Example: targeted advertising (1)
- In marketing, targeting is any technique used to direct marketing or advertising effort to the portion of the population thought to be most valuable to the business, e.g. those
  - Likely to purchase
  - Likely to spend a lot
- The business wants to avoid spending money on sending advertising to people who will not respond to it
- In the web context, this can mean displaying an ad for a web site on a different web site
  - Can use web usage information to work out what kind of people use a site: target demographics
  - Sell advertising to companies wanting to target that demographic

Example: targeted advertising (2)
- For example, the Rugby Heaven web site (http://rugbyheaven.smh.com.au/) is hosting advertising for:
  - MLC life insurance
  - Fintrack Financial Services
  - Business Review Weekly (BRW)
- They appear to think that this site is likely to be popular with older people who have money!
- The URL for the BRW ad. is:
  http://campaigns.f2.com.au/event.ng/Type=click&FlightID=10928&AdID=24947&TargetID=2389&Segments=2,13,23,31,35,77,81,88,93,94,153,855,976,993,1145,1301,1989,2320,2389,2394,2396,2477,2534,2576,2581,2689&Targets=535,2389,40,60,1834&Values=25,31,43,48,50,60,72,81,91,100,110,135,150,157,233,239,366,422,605,791,804,805,806,1203,1278,1403,1432,1476,1485,1499&RawValues=&Redirect=http:%2F%2Fwww.brw.com.au%2Fsubscription%2Fsubscribe.asp
- It is clear that some sophisticated targeting is going on
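To see what the ad URL carries, it can be pulled apart with Python's standard urllib.parse module. The sketch below only splits the fields that are visible in the URL itself; what the numeric segment, target and value IDs mean inside the ad server is not known from the slide, and the variable names are chosen for the example.

```python
from urllib.parse import unquote, urlsplit

# The BRW ad URL quoted in the slide, split across lines for readability.
url = (
    "http://campaigns.f2.com.au/event.ng/Type=click&FlightID=10928"
    "&AdID=24947&TargetID=2389"
    "&Segments=2,13,23,31,35,77,81,88,93,94,153,855,976,993,1145,1301,"
    "1989,2320,2389,2394,2396,2477,2534,2576,2581,2689"
    "&Targets=535,2389,40,60,1834"
    "&Values=25,31,43,48,50,60,72,81,91,100,110,135,150,157,233,239,366,"
    "422,605,791,804,805,806,1203,1278,1403,1432,1476,1485,1499"
    "&RawValues="
    "&Redirect=http:%2F%2Fwww.brw.com.au%2Fsubscription%2Fsubscribe.asp"
)

# The parameters sit in the path after "event.ng/", not in a "?" query string.
path = urlsplit(url).path
raw = path.split("/event.ng/", 1)[1]
params = dict(field.split("=", 1) for field in raw.split("&"))

segments = params["Segments"].split(",")
targets = params["Targets"].split(",")
redirect = unquote(params["Redirect"])

print("ad:", params["AdID"], "flight:", params["FlightID"])
print(len(segments), "segment IDs,", len(targets), "target IDs")
print("click goes to:", redirect)
```

The long lists of segment and value IDs attached to a single click are what suggest that the visitor has been matched against many audience categories before the ad was chosen.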

Example: personalization (1)
- Personalization spans the areas of web content mining and web usage mining
- Personalization aims to modify document contents or access patterns to better match the preferences of a particular user
- Personalization can involve
  - Dynamically creating and serving web pages that are unique to an individual user
  - Determining which pages to retrieve or link to on a user-by-user basis

Example: personalization (2)
- Unlike targeting, personalization can be done on the target web page itself (a targeted advertisement, by contrast, promotes another site)
- Simple example: including the name of the user in the page content
- Personalization techniques include
  - Use of cookies
  - Use of user databases
  - Use of web usage patterns to identify similar users (for use in collaborative filtering)
- Often requires a user to log in; this part is not data mining
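The "simple example" above, putting the user's name into the page using a cookie, might look like the sketch below. It uses Python's standard http.cookies module; the cookie name "username" and the render_greeting function are assumptions made for the illustration, not part of any particular site.

```python
from html import escape
from http.cookies import SimpleCookie


def render_greeting(cookie_header):
    """Return a page fragment personalized from a 'username' cookie, if present."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    if "username" in cookie:
        # Escape the value before placing it in HTML.
        return f"<p>Welcome back, {escape(cookie['username'].value)}!</p>"
    return "<p>Welcome!</p>"


print(render_greeting("username=Alice; theme=dark"))
print(render_greeting("theme=dark"))
```

As the slide notes, this step involves no data mining: the site simply echoes back something it already stored about the user.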

Example: personalization (3)
- A classic example of personalization is recommending to a user
  - A product very similar to something they have bought before (if the web site is selling something)
  - Content that is similar to something they have used before
- Personalization techniques can be based on clustering, classification or even prediction
  - With classification, the desires of a user are determined based on the class to which he/she is assigned. Classes may be predetermined by experts.
  - With clustering, clusters of users with similar navigation or purchasing behaviour are found, and the user’s desires are determined on this basis
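One way to picture the classification approach is a nearest-profile rule: experts describe each class by a typical usage profile, and a user is assigned to the closest one. This is only one of many possible classifiers, chosen here for brevity; the class names, the profile numbers and the content attached to each class are all hypothetical. It needs Python 3.8 or later for math.dist.

```python
import math

# Hypothetical classes "predetermined by experts": each is a typical profile
# of page views per site section (news, fixtures, shop).
CLASS_PROFILES = {
    "news reader": (8.0, 1.0, 0.0),
    "match goer": (2.0, 7.0, 1.0),
    "shopper": (1.0, 1.0, 6.0),
}

# Hypothetical content to promote for each class.
CLASS_CONTENT = {
    "news reader": "latest headlines",
    "match goer": "ticket offers",
    "shopper": "merchandise deals",
}


def classify(user_views):
    """Assign a user to the class whose profile is nearest (Euclidean distance)."""
    return min(CLASS_PROFILES, key=lambda c: math.dist(user_views, CLASS_PROFILES[c]))


user = (1.0, 6.0, 2.0)  # mostly views the fixtures section
label = classify(user)
print(label, "->", CLASS_CONTENT[label])
```

With clustering, by contrast, the profiles would not be supplied by experts but discovered from the usage data itself.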

Example: personalization (4)
- Amazon.com makes use of personalization
  - They make use of the user’s past behaviour
  - They also use collaborative filtering: they recommend products bought by users who have similar profiles to the current user
  - Could use clustering, or information filtering techniques
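The general idea of collaborative filtering can be sketched in a few lines. This is a generic user-based scheme with Jaccard similarity over purchase sets, shown purely as an illustration and not as a description of how Amazon.com actually works; the users, products and function names are invented.

```python
# Hypothetical purchase histories: user -> set of product names.
purchases = {
    "ann": {"book A", "book B", "book C"},
    "bob": {"book A", "book B", "book D"},
    "cat": {"book E"},
}


def jaccard(a, b):
    """Overlap between two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, k=1):
    """Suggest items bought by the k most similar users but not by `user`."""
    mine = purchases[user]
    others = sorted(
        (u for u in purchases if u != user),
        key=lambda u: jaccard(mine, purchases[u]),
        reverse=True,
    )
    suggestions = set()
    for neighbour in others[:k]:
        suggestions |= purchases[neighbour] - mine
    return sorted(suggestions)


print(recommend("ann"))
```

Here ann and bob share two of four distinct purchases, so bob is ann's nearest neighbour and the item only bob has bought is suggested to ann.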

References
[Dun2002] Margaret H. Dunham, Data Mining: Introductory and Advanced Topics, Prentice Hall, Upper Saddle River, NJ, USA, 2002, pp. 195-220.
[Zai1999] Osmar R. Zaïane, Resource and Knowledge Discovery from the Internet and Multimedia Repositories, PhD Thesis, Simon Fraser University, Canada, March 1999.