© Tefko Saracevic, Rutgers University. Web sources and library & information services: finding, evaluating and using a variety of Web sources for searching.

Presentation transcript:

© Tefko Saracevic, Rutgers University
Web sources and library & information services
Finding, evaluating and using a variety of Web sources for searching and reference

Similarities between Web searching & IR & reference
Basic principles of approach are the same
– human-human interaction: interview; social, organizational, cognitive, affective aspects to explore, including task, need …
– preparation of search concepts, terms, logic
– determination of range, restrictions
– estimation of relevance

Differences
Vastly different sources
– as to contents, authority, reliability, persistence
– variation in amounts, depth, breadth
Very different organization
– little standardization, few if any fields
Quite different search engines & capabilities
– basic & advanced
– also different from engine to engine
Differing search strategies needed

Also: invisible Web
Materials that general search engines cannot or WILL not include in their collection of Web pages (indexes)
You cannot find these materials through general search engines
Contains a vast amount of information
– much of it authoritative and of high quality

Why search engines miss?
Size: Web is huge, cannot cover all
Economics: associated costs are high
– also pay per crawl & rank
Technical: still limited capabilities
Spam: eliminating the bad also loses the good
Restrictions: some sites do not let crawlers in
Deep structure: some sites are complex
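The "Restrictions" point can be made concrete. One common way a site keeps crawlers out is a robots.txt file; the slide does not name the mechanism, so treating it as the cause here is an assumption. The following is a minimal sketch using Python's standard urllib.robotparser. The rules, the crawler name "ExampleBot" and the example.org URLs are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary site (not taken from the slides).
ROBOTS_TXT = """\
User-agent: *
Disallow: /catalog/
Disallow: /journals/
"""


def allowed(url: str, agent: str = "ExampleBot") -> bool:
    """Return True if the sample rules let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())  # load rules from text, no network access
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in ("http://example.org/index.html",
                "http://example.org/catalog/record?id=42"):
        print(url, "->", "crawlable" if allowed(url) else "blocked")
```

Pages under the disallowed paths never reach a well-behaved engine's index, which is one way library catalogs and online journals end up in the invisible Web.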

Needed for Web searching
Knowledge & competencies
– variety of Web sources
– their organization
– search engines
– Web search strategies
– search dynamics, feedback
Keeping up & up & up
– constant updates, changes, innovations
– many domain/subject specific

Web size - who knows?
Estimated over 16 million web servers (Lawrence & Giles, 1999)
– but only a fraction of direct search relevance
Domains of sites
– 83% commercial; 6% scientific or educational; 3% health; 2.5% personal; 2% societies; 1.5% government; about 1% each community, religion; 1.5% pornographic
Web Characterization Project (OCLC)
– statistics, trends, reports, links …
– for 2001 it reports 8.5 million web sites

Organization of sources
No standardization across sources
Major approaches in search engines
– classification: many directory types used
– statistical analyses of terms, links
Metatags in sources
– to enable retrieval by fields
– HTML “keywords”, “description”: 34% of sites use them
– Dublin Core: 0.3% of sites use it
Organization: hindrance to retrieval
– also faked contents to force retrieval
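To show what "retrieval by fields" from metatags can look like, here is a minimal sketch that pulls the HTML "keywords" and "description" metatags out of a page with Python's standard html.parser. The sample page and the names MetaTagParser and extract_meta are hypothetical, chosen only for illustration.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect the content of <meta name="keywords"> and <meta name="description">."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name in ("keywords", "description"):
            self.fields[name] = attr.get("content") or ""


def extract_meta(html: str) -> dict:
    """Return a dict with whichever of the two metatags the page declares."""
    parser = MetaTagParser()
    parser.feed(html)
    return parser.fields


# Made-up sample page.
SAMPLE = """<html><head>
<title>Reference desk</title>
<meta name="keywords" content="digital libraries, reference, searching">
<meta name="description" content="Guide to Web sources for reference work">
</head><body>...</body></html>"""

if __name__ == "__main__":
    print(extract_meta(SAMPLE))
```

Because the page author writes these fields, an engine that trusts them is also open to the "faked contents" problem noted on the slide.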

Sources & search engines
Indexed by search engines (publicly indexed)
– by terms, selection, links, registration
Not publicly indexed
– many domain sources will not be found, e.g. digital libraries, online journals, reference
– many commercial sites will hardly be found
Differing approaches to inclusion/selection
– mostly automatic; also generic source providers
– increasingly added human evaluation & selection

Search engine coverage
No engine covers more than 16% of the WWW
Share of the combined coverage of the top 11 engines:
– Northern Light 38.3%; Snap 37.1%; AltaVista 37.1%; HotBot 27.1%; MS 20.3%; Infoseek 19.2%; Google 18.6%; Yahoo 17.6%; Excite 13.5%; Lycos 5.9%; EuroSeek 5.2%
– HotBot, MS, Snap & Yahoo use Inktomi as search provider, but have different filtering & Inktomi databases
Northern Light has a ‘special collection’: documents not part of the publicly indexable web
Hard to discern & compare coverage
Many national search engines with their own coverage
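The two sets of figures on this slide can be related by simple arithmetic. Assuming the 16% ceiling belongs to the engine with the largest relative share (Northern Light at 38.3%), which the slide does not state explicitly, the combined coverage of the 11 engines works out to about 16 / 38.3, or roughly 42% of the Web. The sketch below uses that assumption to estimate each engine's absolute coverage; the function names are illustrative.

```python
# Relative shares (percent of the combined coverage), as listed on the slide.
RELATIVE = {
    "Northern Light": 38.3, "Snap": 37.1, "AltaVista": 37.1, "HotBot": 27.1,
    "MS": 20.3, "Infoseek": 19.2, "Google": 18.6, "Yahoo": 17.6,
    "Excite": 13.5, "Lycos": 5.9, "EuroSeek": 5.2,
}

# Assumption: the best engine covers 16% of the whole Web.
TOP_ABSOLUTE = 16.0


def combined_coverage(relative: dict, top_absolute: float) -> float:
    """Percent of the Web covered by all engines together, under the assumption above."""
    return 100.0 * top_absolute / max(relative.values())


def absolute_coverage(relative: dict, top_absolute: float) -> dict:
    """Estimated percent of the whole Web covered by each engine."""
    combined = combined_coverage(relative, top_absolute)
    return {name: share * combined / 100.0 for name, share in relative.items()}


if __name__ == "__main__":
    print(f"combined: {combined_coverage(RELATIVE, TOP_ABSOLUTE):.1f}% of the Web")
    for name, pct in absolute_coverage(RELATIVE, TOP_ABSOLUTE).items():
        print(f"{name:15s} {pct:5.1f}%")
```

On this reading, more than half of the Web was out of reach of all eleven engines combined, which is the practical argument for knowing sources beyond the general engines.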

Search features among engines
Some search features are the same across all engines, but details differ, particularly in advanced search
– Boolean available, but the default is sometimes AND, sometimes OR
– differences may be found in: phrases, proximity, truncation, case sensitivity, relevance feedback, field searching, special features
– term expansion to concepts (latent semantic indexing)
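The effect of the default Boolean operator is easy to see on a toy collection. The sketch below builds a tiny inverted index and runs the same two-term query with AND and with OR as the default. The documents and function names are invented for illustration and do not describe any particular engine.

```python
# Three made-up documents, keyed by document id.
DOCS = {
    1: "digital libraries and reference services",
    2: "reference desk for health questions",
    3: "digital health records",
}


def build_index(docs: dict) -> dict:
    """Map each term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index.setdefault(term, set()).add(doc_id)
    return index


def search(index: dict, query: str, default: str = "AND") -> set:
    """Combine the postings of all query terms with the default operator."""
    postings = [index.get(term, set()) for term in query.lower().split()]
    if not postings:
        return set()
    if default == "AND":
        return set.intersection(*postings)  # every term must occur
    return set.union(*postings)             # any term may occur


if __name__ == "__main__":
    index = build_index(DOCS)
    print("AND default:", sorted(search(index, "digital reference", "AND")))
    print("OR default: ", sorted(search(index, "digital reference", "OR")))
```

With an AND default the query returns only document 1; with an OR default it returns all three, so the same query typed into two engines can produce very different result sets.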

Search strategies & outputs
Geared toward very short searches
– the big majority of searches use 2-3 terms (av. 2.5); the IR average [figure missing in transcript] makes a big difference
Directory browsing is a big component, which is not the case in IR
Geared toward limited top outputs
Ranking output by relevance predominates
– relevance calculations differ & are proprietary (secret)
– except Google, which published its method
– affects search strategy: you have to guess how it is done
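The method Google published is generally known as PageRank, which scores a page by the links pointing to it. The sketch below is a simplified power-iteration version on a toy link graph. The graph, the damping value of 0.85 and the iteration count are illustrative assumptions, and this is not a description of how any engine actually computes its rankings in production.

```python
def pagerank(links: dict, damping: float = 0.85, iterations: int = 50) -> dict:
    """Simplified PageRank by power iteration.

    `links` maps each page to the list of pages it links to; every link
    target must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Toy graph (hypothetical): C is linked to by A, B and D.
LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}

if __name__ == "__main__":
    for page, score in sorted(pagerank(LINKS).items(), key=lambda kv: -kv[1]):
        print(page, round(score, 3))
```

Page D, which nothing links to, ends up at the bottom regardless of its content. This is why link-based ranking changes search strategy: term matching alone does not decide what appears in the top outputs.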

Meta search engines
Search engines that cover search engines; many around, e.g.
– All4one: four windows, good for comparison
– CDNET Search.com: meta engine of meta engines, customization
– Search Engines Worldwide: 174 countries, over 1300 engines
More on the horizon & differing
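A meta search engine has to merge ranked lists from engines whose relevance scores are not comparable. One simple approach, assumed here because the slide does not say how any of these meta engines merge results, is to ignore the scores and add up reciprocal ranks: a URL earns 1/position from every engine that returns it. The engine names and URLs below are hypothetical.

```python
def merge_results(result_lists: dict) -> list:
    """Merge ranked URL lists from several engines by summed reciprocal rank."""
    scores = {}
    for results in result_lists.values():
        for position, url in enumerate(results, start=1):
            scores[url] = scores.get(url, 0.0) + 1.0 / position
    # Highest combined score first; ties broken alphabetically for stability.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Made-up result lists from three imaginary engines.
RESULTS = {
    "engine_a": ["example.org/a", "example.org/b", "example.org/c"],
    "engine_b": ["example.org/b", "example.org/c", "example.org/d"],
    "engine_c": ["example.org/b", "example.org/a"],
}

if __name__ == "__main__":
    for url in merge_results(RESULTS):
        print(url)
```

A URL that several engines rank highly rises to the top of the merged list, while a URL found by only one engine still appears, which is the coverage benefit a meta engine offers.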

Specialized meta engines
Selective, with directories & a large number of databases & search engines
– Complete Planet
– Invisible Web
U.S. federal information via Government Printing Office Access
– Federal Bulletin Board (file libraries for download from many agencies)

Reference (expert) services
Reference services: several models
– Q&A, directories, answers etc., e.g.
– Martindale’s Reference Desk: comprehensive
– Ask Jeeves!: most popular
– Ask ERIC: education questions & answers
– Information Please: almanac type questions
Academic libraries developing reference models: a new service area

Libraries as Web sources
Academic libraries providing open collections & services; models vary
– Rutgers libraries: big long term effort
– various sources & links involved; for domain information & sources go to:
– Electronic Reference Sources; Subject Research Guides: Social Sciences & Law; Library & Information Science
– University of California, Berkeley: a most elaborate effort together with Sun Corporation

Virtual libraries on the Web
Libraries emerging only on the Web
– more & more libraries & organizations involved
Examples of academic & public libraries
– Virtual Library: Switzerland, US, UK & other countries; ‘oldest virtual library on the Web’
– Toronto Public Library
– Internet Public Library, Michigan

Domain sites
Many domain/issue specific sites
– rich & often unique coverage & services
– different approaches & requirements
Examples in health related domains:
– Medscape: registration required
– Rxlist: The Internet Drug Index
– Mayo Clinic HealthOasis

Societies, organizations, publishers
A great many rich sources for searching
– differences in requirements, depth, richness
Examples from a variety of organizations:
– Assoc. for Computing Machinery Digital Library: subscription or registration
– State Department: about the U.S. & other countries
– R.R. Bowker: Free Resources from Bowker; Library Resource Guide
– Genealogy

Language barriers on the Web
English still the major language
– but declining, now slightly over 50%
Multilingual retrieval search engines
– Euroseek: searches 40 languages
– All the Web: 45 languages
– in both, a search in a given language covers primarily sources in that language

Language barriers: translations
A number of translation sites
– machine aided, i.e. plug in terms, phrases, sentences in one language & review in the other, but effectiveness???
– Free Translations
– Babel Fish
– Travlang: great for travelers (phrases)

Key professional competencies
Knowledge of SOURCES in area of interest
– search engines not enough: not too helpful in finding these other sources; structure hard to discern
Evaluation of sources: a key professional skill!
– standard criteria: quality, veracity, coverage, etc.
– plus Web criteria: authority; accuracy; currency (timeliness); objectivity; coverage; persistence; usability

competencies …
Knowledge of users & use
Knowledge of searching
Use of technology
Adaptability, flexibility
Integration with other resources
Teaching others
Constant learning & update
