Trends in Web Search and its relevance to Digital Libraries Min-Yen Kan Web IR NLP Group (WING) National University of Singapore.

Presentation transcript:

Trends in Web Search and its relevance to Digital Libraries Min-Yen Kan Web IR NLP Group (WING) National University of Singapore

Min-Yen Kan, 26 Sep 2008, World Scientific Talk

Tips on Web Searching
Visualize results, then come up with multiple queries
Use multiple search engines
Advanced Search
– inurl:, site:
– “Phrasal search”
But that’s just general search…
Federated resources / Niche search engines
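The advanced-search tips on this slide (several queries per need, `site:`, `inurl:`, and quoted phrases) can be illustrated with a small query builder. This is only a sketch: the function names `build_query` and `query_variants` are hypothetical, and the operators are assumed to be written as plain prefixes inside the query string, which is how the major engines accepted them at the time of the talk.

```python
def build_query(terms, phrase=None, site=None, inurl=None):
    """Assemble one query string from plain terms and optional operators."""
    parts = list(terms)
    if phrase:
        # Quoting asks the engine for the exact phrase ("phrasal search").
        parts.append(f'"{phrase}"')
    if site:
        # Restrict results to one site or domain.
        parts.append(f"site:{site}")
    if inurl:
        # Require a token to appear in the result URL.
        parts.append(f"inurl:{inurl}")
    return " ".join(parts)


def query_variants(terms, phrase, sites):
    """Produce one query per site, following the 'multiple queries' tip."""
    return [build_query(terms, phrase=phrase, site=s) for s in sites]


if __name__ == "__main__":
    for q in query_variants(["survey"], "digital libraries", ["nus.edu.sg", "acm.org"]):
        print(q)
```

Each generated string can be pasted into any engine that supports these operators; support and exact behaviour vary between engines.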

Site- and Task-specific resources
Site Prestige: Know what others think and do
– Google PageRank (Link structure), Alexa (Traffic)
– Google Trends / Insight (Queries)
Social Searching (Web 2.0): The voice of the reader / critic
– (Bookmarks / Tags) Del.icio.us, Citeulike.org, Bibsonomy.org
– (News) Digg / Slashdot
– (Blogs) Google Blog, Technorati
People Search: Finding public information on a person
– Spock (web), Zabasearch (US only)
– LinkedIn, Facebook
– Must validate your sources

Expert Search
Find people who will advocate on your behalf
What do they want?
Scholar:
– Active? → Check their recent articles
– Names common? → Define area of interest
– Compare against peers
– Download vs. citation counts
Patent search:
– Referenced by: (citation count; different from Scholar)
Identifying webfaced advocates:
– Blog search, PageRank
How do machines do it?
– Expert search task as benchmark test
– Download web pages to analyze
– Needed to deal with spam pages
– Used PageRank to assess prestige → Impact
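The slide says only that PageRank was used to assess prestige. As an illustration of what that score measures, here is a minimal sketch of the textbook power-iteration form of PageRank on a toy link graph. It is not the system described in the talk; the damping factor of 0.85, the iteration count, and the example graph are all assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank.

    links maps each page to the list of pages it links to.
    Returns a dict of page -> score; scores sum to 1.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a baseline share from the random-jump term.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's score evenly among its out-links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: spread its score over all pages.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    toy_graph = {"a": ["c"], "b": ["c"], "c": ["a"]}  # hypothetical pages
    for page, score in sorted(pagerank(toy_graph).items()):
        print(page, round(score, 3))
```

In the toy graph, page `c` is linked from both other pages and ends up with the highest score, which is the sense in which link structure stands in for prestige.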

Problem or opportunity?
Revenue from print continually declining
Students and researchers rely on the internet
Researchers want archiving rights
– freedom of academic information
Characteristics:
Not zero-sum content
Distribution is now largely the role of search engines
→ Necessitates new role of publisher and new revenue model
– Will classic models work? Advertising, Subscription, Transactional & Bundling
– Variants? Versioning (Varian), Moving window (JSTOR)
The game has fundamentally changed

Forecasting
– Content is becoming free
– MIT / Stanford opening up textbooks
– Open access archiving
→ long term: content will not be primary revenue source
eBook revenue hasn’t lived up to its promise yet…
– Device gap: iPhone and nextGen devices
→ Revenue may be further down the pipe
+ Academic publishers
– Connect to libraries and federations at institution level
– Individual customers are secondary
Trusted source
– Expertise in copyediting, typesetting, project management, distribution, social networking
– Many individual web publishers rediscovering same problems
→ Consultancy model
→ Win-win partnerships with individual authors

Web Trends
Social Content
Wisdom of masses: Crowdsourcing
Rich Media
Open Source / Access
Paradigmatic change
– Classifieds → Craigslist
– POTS → Skype
– CD store → iTunes
– Publishers → ??
slash/iA_WebTrends_2007_2_1024_768.gif

Where is research going?
Search API usage
Browser as computer
Web page structure, mining text data
Modeling web users at tasks: Exploring / Fact-finding
Personalization, recommending
Social networks
Understanding opinion
Query and log analysis
User centric / Server centric

Webfaced pop quiz – which is which?
Springer
American Statistical Society
World Scientific
courtesy:

Forecast: Know your strengths
Get advocates
Make it easy for individuals to press their institution to buy your materials
Know who is accessing (not necessarily buying) your content
Content revenue will continue to decline
Find an economic model that works for you
Work as partners in content creation
Be savvy on trends
Be visible: do “white hat” Search Engine Optimization (SEO)
Make your abstracts indexable by others
+ Academic publishers
– Connect to libraries and federations at institution level
– Individual customers are secondary
Trusted source
– Expertise in copyediting, typesetting, project management, distribution, social networking
– Many individual web publishers rediscovering same problems
→ Consultancy model
→ Win-win partnerships with individual authors

Trends in Digital Libraries
Expanding types of information in search
Automated tools for DLs
Usability in E-books and online media
User modeling
Personalization, annotation and relation to other user tasks
>> NUS

Scholarly Digital Libraries
ForeCite: our scholarly DL
Data Cleaning
Slide and Document Alignment
Searching in the OPAC
Math Information Retrieval

ForeCite: Beyond the document as an item
A user-centric DL framework
Put author / reader functionality together
Tagging, correction, annotation and viewing
Automatic tools: keyphrases and sentence classification
For use on and offline, organizes local PDF files for you
Only need your web browser
Server / Client

Data Cleaning
Addresses
– Dongwon Lee, 110 E. Foster Ave. #410, State College, PA,
– LEE Dong, 110 East Foster Avenue Apartment 410, Univ. Park, PA
Products
– Honda Fit vs. Honda Jazz
– Apple iPod Nano 4GB vs. 4GB iPod nano 4GB
Idea: use web as additional context for disambiguation and clustering
Placed 3rd in Web People Search Task (WEPS 2007)
Search results:
Query                 Pages      Pages with + “aho”   Ratio
“Jeffrey D. Ullman”   384,000    174,000              45%
“J. Ullman”           124,000     41,000              33%
“Shimon Ullman”        27,300         66               0%
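The percentages on this slide are consistent with dividing the hit count for a name plus the context term “aho” by the hit count for the name alone. The sketch below reproduces that arithmetic from the slide's own figures. The helper names and the 10% decision threshold are assumptions for illustration, not the group's actual method.

```python
# Hit counts copied from the slide: (pages for name, pages for name + "aho").
HIT_COUNTS = {
    "Jeffrey D. Ullman": (384_000, 174_000),
    "J. Ullman": (124_000, 41_000),
    "Shimon Ullman": (27_300, 66),
}


def context_ratio(hits_name, hits_with_context):
    """Fraction of a name's pages that also mention the context term."""
    return hits_with_context / hits_name


def shares_context(name, threshold=0.10):
    """Hypothetical rule: the name is tied to the context term if the
    co-occurrence ratio reaches the threshold (0.10 is an assumed value)."""
    return context_ratio(*HIT_COUNTS[name]) >= threshold


if __name__ == "__main__":
    for name, counts in HIT_COUNTS.items():
        pct = round(context_ratio(*counts) * 100)
        print(f"{name}: {pct}% linked to context: {shares_context(name)}")
```

Under this reading, the two “Ullman” variants that co-occur often with “aho” plausibly refer to the same person, while “Shimon Ullman” almost never does, so the web counts separate him from the other two.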

Slides and their relationship to documents
Document in focus
Slides in Focus

Searching in Libraries

Symbolic Information Search
How do users want to search math materials?
Our answer: Text-to-Expression Linking
– Resolve text keywords to expressions
– e.g., “Pythagorean Theorem” → “a^2 + b^2 = c^2” or “x^2 + y^2 = z^2”
Reduce the need for expression input
Solves the notational variation problem
Not quite right…
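The idea of resolving a text keyword to one or more expressions can be pictured as a lookup from a normalized keyword to its known notational variants. The sketch below is a deliberately simple illustration using the slide's own example. The table `EXPRESSION_INDEX` and the function `link_text_to_expressions` are hypothetical, and a real system would need to build such an index from documents rather than hard-code it.

```python
# Hypothetical index: normalized keyword -> known notational variants.
EXPRESSION_INDEX = {
    "pythagorean theorem": ["a^2 + b^2 = c^2", "x^2 + y^2 = z^2"],
}


def normalize(text):
    """Lowercase and collapse whitespace so keyword variants match."""
    return " ".join(text.lower().split())


def link_text_to_expressions(query, index=EXPRESSION_INDEX):
    """Return every expression linked to the keyword, or an empty list."""
    return index.get(normalize(query), [])


if __name__ == "__main__":
    print(link_text_to_expressions("Pythagorean Theorem"))
```

Because the user types words and receives all linked variants, there is no need to enter an expression, and differences in variable names are handled by the index rather than by the searcher.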

Conclusions
Consider us your research WING!
Trade data and problems for solutions and interns
Meanwhile:
Use better search strategies
Practice white hat SEO
Identify webfaced advocates

References
Kahin and Varian (2000) Internet Publishing and Beyond
Towle et al. (2007) Electronic Books in the Period, Pub Res Q 23:
Photo Credits
Flickr Creative Commons Search
Thanks to all of you for listening & my fellow WING group members

Abstract
I will present trends in current academic research on web search and digital libraries, and discuss their relevance to publishers and their economic model. With respect to the web, I will cover how search engines are starting to specialize and use clickthrough and ad data to improve relevance ranking. With respect to digital library research, I will discuss my group's research at NUS on advancing the state of the art in scholarly digital libraries. I will cover advances in how we deal with data cleaning issues, and with slide and equation retrieval and alignment.