IR CW2–09 Webpage summarization 01/13 Can webpage summarization improve the search engine user experience? By Barney Staddon.

Presentation transcript:

IR CW2–09 Webpage summarization 01/13 Can webpage summarization improve the search engine user experience? By Barney Staddon

IR CW2–09 Webpage summarization 02/13
- The need for webpage summarization
- Types of summary
- The approaches
- The problems
- The solutions (Do they work?)
- Conclusions

IR CW2–09 Webpage summarization 03/13
The need for webpage summarization
- Vertical listing is used by most search engines
- If a summary is provided, shouldn’t it be useful?
- “Only 1 in 4 user-queries is initially successful” Microsoft (2009)
- A good summary could avoid ‘dead’ visits
- Fewer ‘dead’ visits make a better experience

IR CW2–09 Webpage summarization 04/13
Types of summary
- Extract or abstract?
- Query-relevant or generic?

IR CW2–09 Webpage summarization 05/13
The approaches
Content-based summarization (diagram: target webpage to summary)

IR CW2–09 Webpage summarization 06/13
The approaches
Context-based summarization (figure source: Amitay & Paris, 2000)

IR CW2–09 Webpage summarization 07/13
Content problems
Webpage text
- Is there a pre-authored summary available?
- What text is important and relevant?
- Are words, phrases or sentences extracted?
- Is it good quality?
Summary
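
The first question on this slide, whether a pre-authored summary is available, can be illustrated with a small sketch. It assumes the author's summary lives in a `<meta name="description">` tag, which is one common place to look but not one the slides name. The class `MetaDescriptionParser` and the function `pre_authored_summary` are hypothetical names chosen for this example. Only Python's standard-library `html.parser` is used.

```python
from html.parser import HTMLParser
from typing import Optional


class MetaDescriptionParser(HTMLParser):
    """Collects the first non-empty <meta name="description"> content value."""

    def __init__(self) -> None:
        super().__init__()
        self.description: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only <meta> tags matter, and only until a description has been found.
        if tag != "meta" or self.description is not None:
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "description":
            content = attr_map.get("content", "").strip()
            if content:
                self.description = content


def pre_authored_summary(html: str) -> Optional[str]:
    """Return the page's meta description, or None if it has none."""
    parser = MetaDescriptionParser()
    parser.feed(html)
    return parser.description


if __name__ == "__main__":
    page = (
        '<html><head><meta name="description" '
        'content="A short guide to tea."></head><body><p>Hi</p></body></html>'
    )
    print(pre_authored_summary(page))
```

When this returns None, a summarizer would have to fall back on extracting text from the page body, which is where the remaining questions on the slide apply.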

IR CW2–09 Webpage summarization 08/13
Context problems
Content text
- Where does the context come from?
- Is there enough context text and is it relevant?
- Are words, phrases, or sentences extracted?
- Is it good quality?
Summary

IR CW2–09 Webpage summarization 09/13
The solutions
- Open Directory Project
- HTML parser
- Term Frequency – Inverse Document Frequency
- Lexical chains (disambiguation)
- Sentence segmentation
- Contextual linking
- Contextual query data
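
Two of the listed solutions, sentence segmentation and Term Frequency – Inverse Document Frequency, can be combined into a simple extractive summarizer. The sketch below is not the presenter's method. It assumes a naive regex-based sentence splitter, treats each sentence as a "document" when computing IDF, and uses hypothetical function names (`split_sentences`, `tokenize`, `summarize`). Passing a `query` restricts scoring to the query terms, which gives a rough query-relevant extract; omitting it gives a generic one.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive segmentation: break after ., ! or ? when followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercase alphanumeric tokens; no stemming or stop-word removal.
    return re.findall(r"[a-z0-9]+", sentence.lower())


def summarize(text, max_sentences=2, query=None):
    """Return up to max_sentences sentences, in original order, ranked by TF-IDF."""
    sentences = split_sentences(text)
    tokenized = [tokenize(s) for s in sentences]
    n = len(sentences)

    # Document frequency: number of sentences containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    query_terms = set(tokenize(query)) if query else None

    scores = []
    for tokens in tokenized:
        score = 0.0
        for term, count in Counter(tokens).items():
            if query_terms is not None and term not in query_terms:
                continue  # query-relevant mode: ignore non-query terms
            tf = count / len(tokens)
            idf = math.log(n / df[term])  # 0 for terms in every sentence
            score += tf * idf
        scores.append(score)

    # Pick the highest-scoring sentences, then restore document order.
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:max_sentences]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    text = (
        "Tea is a drink. Tea is made from leaves. "
        "Coffee comes from roasted beans."
    )
    print(summarize(text, max_sentences=1, query="coffee"))
```

Because the output is a set of whole sentences lifted from the page, this produces an extract rather than an abstract, and it makes no attempt at cohesion between the selected sentences.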

IR CW2–09 Webpage summarization 10/13 Bing ‘hover’ (

IR CW2–09 Webpage summarization 11/13 Bing ‘hover’ (

IR CW2–09 Webpage summarization 12/13
Conclusions
Content-based summarization
- More likely to find good quality pre-authored summary
- Random extracts can be more like a preview
- More space is useful
Context-based summarization
- Only as good as the search engine it’s linked to
- Requires greater processing power
Can webpage summarization improve the search engine user experience? Yes!
Preview/summary/excerpt/snippet – representative & viewable.
Is cohesion necessary?
A better way to present?

IR CW2–09 Webpage summarization 13/13
Can webpage summarization improve the search engine user experience?
Questions
References:
Microsoft. (2009). Bing: New Features Relevant to Webmasters. [Online]. Available from: publishers-released.aspx [Accessed 03/12/09]
Amitay, E. and Paris, C. (2000). Automatically Summarising Web Sites - Is There A Way Around It? [Online]. Available from: amitay.pdf?key1=354816&key2= &coll=GUIDE&dl=GUIDE&CFID= &CFTOKEN= [Accessed 03/12/09]