Published by Keanu Meeler. Modified over 10 years ago.
1
Sometimes Google Isn’t Enough: Finding Information on the Invisible Web
Shirley McDonald shirley.mcdonald@lpsb.org
Hilda Donaldson hilda.donaldson@lpsb.org
2
First: a definition of the Visible (Surface) Web
“It’s made up of HTML Web pages that the search engines have chosen to include in their indices. It’s no more complicated than that.” (Sherman and Price)
3
Static Web pages
Fixed, or static, pages are stored files that stay the same for every visitor until their author edits them, so other pages can link to them.
Ex: http://www.truthorfiction.com http://exploratorium.com
4
Dynamic Web Pages
Dynamic pages are generated only in response to a specific query and do not exist after that query.
Ex: www.mapquest.com http://www.aeroseek.com/webtrax/
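To make the idea of a page "generated only by a specific query" concrete, here is a minimal Python sketch. The function name `render_directions_page`, the host `maps.example`, and the `from`/`to` parameters are all hypothetical; this is not how MapQuest or any real site is built, only an illustration that the HTML is assembled at request time and is never stored as a file a crawler could find.

```python
from html import escape
from urllib.parse import parse_qs, urlparse


def render_directions_page(url: str) -> str:
    """Build an HTML page on the fly from the query string of `url`."""
    params = parse_qs(urlparse(url).query)
    # Fall back to "unknown" when a parameter is missing from the query.
    start = params.get("from", ["unknown"])[0]
    end = params.get("to", ["unknown"])[0]
    return (
        "<html><body>"
        f"<h1>Directions from {escape(start)} to {escape(end)}</h1>"
        "</body></html>"
    )


# Hypothetical request: the page exists only as the answer to this query.
page = render_directions_page(
    "http://maps.example/route?from=Lafayette&to=Baton+Rouge"
)
print(page)
```

A different query string yields a different page, which is why a search engine that only follows fixed links has nothing to index here.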
5
The Invisible, Deep, or Hidden Web
Web sites or information that Google or other popular search engines are not capable of indexing
Web sites specifically excluded by the search engine
6
Invisible (Deep or Hidden) Web
Public information is 400 to 550 times larger than on the surface web
550 billion individual documents vs. one billion on the surface web
Quality content is 1,000 to 2,000 times greater than on the surface web
95% of the Deep Web is accessible to the public (no fees or subscription required)
(Bergman)
7
Hidden Web sites
Opaque Web – material that can be, but is not, included in search engine results. Ex: new material added and not yet picked up.
Private Web – sites intentionally excluded from search engine results. Ex: password protected
Proprietary Web – sites that require user registration. Ex: eBay, New York Times
Pay per click – Ex: overture.com, FindWhat.com
9
Content of Databases
Information stored in tables (Access, Oracle, SQL Server, DB2) and accessible only by query.
Examples:
Phone books, people finders
Patents, laws
Items for sale in a Web store or Web-based auctions
Digital exhibits
Multimedia and graphical files
Stock and bond prices
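The phrase "accessible only by query" can be illustrated with a small sketch. It uses Python's built-in `sqlite3` module rather than any of the products named on the slide, and the `phone_book` table, its rows, and the `lookup` helper are invented for the example. The point is that the records have no address of their own: they surface only when someone submits a query.

```python
import sqlite3

# An in-memory database standing in for a site's back-end store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE phone_book (name TEXT, city TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO phone_book VALUES (?, ?, ?)",
    [
        ("A. Reader", "Lafayette", "555-0101"),
        ("B. Writer", "Baton Rouge", "555-0102"),
    ],
)


def lookup(city: str) -> list[tuple[str, str]]:
    """Return (name, phone) rows for one city, as a search form would."""
    rows = conn.execute(
        "SELECT name, phone FROM phone_book WHERE city = ? ORDER BY name",
        (city,),
    )
    return rows.fetchall()


print(lookup("Lafayette"))
```

A crawler that never fills in the search form never issues the query, so it never sees these rows.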
10
Examples of Hidden Sites
Pages in searchable databases: medical (WebMD.com), patent, scientific, legal (Lexis and Westlaw), reference
Pages requiring login or registration: Blackboard, New York Times
Government publications or databases: ERIC
Online databases: Gale Research
PDF files, audio, video, any new format
11
More Examples
Dictionaries and thesauri
Sites that require forms to be filled out (ex: travel directions, job hunting)
Product catalogs and library catalogs
Newspaper and magazine archives
Dynamic web pages (ex: airline flight checkers, MapQuest)
Interactive tools (ex: calculators)
12
How are pages excluded from search engines?
Google’s PageRank™ puts pages at the top of the hit list according to the number of other pages that link to them (popularity)
Webmasters who have figured out how to manipulate PageRank™ are able to move their pages to the top of the hit list
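The link-popularity idea behind PageRank can be sketched in a few lines of Python. This is the simplified textbook form of the algorithm (repeatedly passing each page's score along its outgoing links), not Google's production ranking, and the `toy_web` graph, the damping value 0.85, and the iteration count are assumptions chosen for illustration.

```python
def pagerank(
    links: dict[str, list[str]],
    damping: float = 0.85,
    iterations: int = 50,
) -> dict[str, float]:
    """Simplified PageRank: every link target must also be a key of `links`."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            # A page with no outgoing links spreads its score evenly.
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "orphan" links out, but nothing links to it.
toy_web = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["home"],
    "orphan": ["home"],
}
scores = pagerank(toy_web)
top_page = max(scores, key=scores.get)
print(top_page, scores)
```

In this toy graph the page that receives the most links ends up with the highest score, and the page nobody links to ends up with the lowest, which is the behavior the slide describes and the behavior that link manipulation tries to exploit.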
13
Faulty typing and/or judgment
Search engine spiders and crawlers cannot see a site unless another page links to it
Search engines can primarily see text pages in HTML form
This will change in the future as search engines become more capable of retrieving the “hidden” web
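Crawlers discover pages by following links outward from pages they already know. The sketch below models that with a breadth-first walk over an in-memory link graph; the `toy_site` pages and the `crawl` function are hypothetical, and no network access is involved. It shows why a page that nothing links to stays invisible even though it is publicly available.

```python
from collections import deque


def crawl(link_graph: dict[str, list[str]], seeds: list[str]) -> set[str]:
    """Return every page reachable from `seeds` by following links."""
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        for target in link_graph.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical site: the report links out, but no page links to it.
toy_site = {
    "index.html": ["about.html", "contact.html"],
    "about.html": ["index.html"],
    "contact.html": [],
    "unlinked-report.html": ["index.html"],
}
indexed = crawl(toy_site, ["index.html"])
invisible = set(toy_site) - indexed
print(sorted(indexed), sorted(invisible))
```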
14
Use of blocking techniques by the webmaster or server
Password protection
HTML blocking in the web page
A listing on the server of blocked pages
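The slide does not name the mechanisms, but "a listing on the server of blocked pages" most plausibly refers to a robots.txt file, and "HTML blocking in the web page" to a robots meta tag such as `<meta name="robots" content="noindex">`; both readings are assumptions. The sketch below uses Python's standard `urllib.robotparser` to show how a well-behaved crawler would consult such a listing. The rules, the `ExampleBot` agent name, and the example.com URLs are made up.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content served from the site root.
robots_txt = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())


def may_crawl(url: str, agent: str = "ExampleBot") -> bool:
    """Report whether the parsed rules allow `agent` to fetch `url`."""
    return parser.can_fetch(agent, url)


print(may_crawl("http://www.example.com/index.html"))
print(may_crawl("http://www.example.com/private/grades.html"))
```

Compliance is voluntary: the file asks crawlers to stay out but does not enforce it, which is why password protection appears on the slide as a separate technique.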
15
Searching the Invisible Web
Use the following to get around, just like the visible web:
Directories – subject guides compiled by human editors
Search Engines
Specialized Databases
16
Directories to search the Invisible Web
Big Hub http://www.thebighub.com/
Complete Planet: The Deep Web Directory (70,000 searchable databases and specialty search engines) http://www.completeplanet.com
Digital Librarian: A Librarian’s Choice of the Best of the Web www.digital-librarian.com
17
More directories
IncyWincy: The Invisible Web Search Engine. Offers Web Search, Directory Search, Metasearch, News. Note: Kids & Teens, Reference. http://www.incywincy.com
Invisible Web Directory http://www.invisible-web.net/
18
Infomine: Scholarly Internet Resource http://infomine.ucr.edu
Invisible Web Directory http://www.invisible-web.net/
Librarian’s Index to the Internet www.lii.org
Open Directory Project (dmoz) http://www.dmoz.org (want to edit?)
ProFusion: The Original Meta-Search Engine http://www.profusion.com/
19
Search Engines for the Invisible Web
AlltheWeb: find it all http://www.alltheweb.com
Bright Planet http://www.brightplanet.com/
Direct Search: SearchCenter (59 pages!). Updates are available by email through Resourceshelf. http://www.freepint.com/gary/direct.htm
IxQuick: the world’s most powerful metasearch engine http://ixquick.com/
20
More Search Engines
Search-22 http://www.search-22.com
Search Adobe PDF Online http://searchpdf.adobe.com/
Turbo10 http://turbo10.com
Vivisimo/Vivisimo Clustering http://www.vivisimo.com
21
Specialized Databases
Library of Congress http://catalog.loc.gov
LookSmart’s Find Articles (over 900 publications) http://www.findarticles.com
National Science Digital Library http://www.nsdl.org
Singing Fish – audio and video http://www.singingfish.com
22
Choosing the Best Search
NoodleTools http://www.noodletools.com/debbie/literacies/information/5locate/adviceengine.html
Great chart that connects the information need to the search strategy
How to Choose a Search Engine or Directory http://library.albany.edu/internet/choose.html
23
Access to the Hidden Web is Constantly Improving
“Google Scholar Offers Access to Academic Information.” written by Danny Sullivan, November 18, 2004 http://searchenginewatch.com/searchday/article.php/3437471
Google makes arrangements with publishers to get into password-protected sites – sometimes shows only the abstract
Includes libraries of Oxford, Stanford, Michigan, Harvard, NY Public
http://scholar.google.com/
24
Issues
“Let a Thousand Googles Bloom.” – by Lawrence Lessig http://www.latimes.com/news/opinion/commentary
Questions the legality and copyright issues
“Does Google move augur commercialization of libraries?” – Detroit Free Press http://www.freep.com/news/statewire/sw108716_20041214.htm
25
Alternative to Google Scholar
“Internet Archive to Build Alternative to Google.” – by Mark Chillingworth
“Ten major international libraries have agreed to combine their digitized book collections in a free text-based archive hosted online by the not-for-profit Internet Archive.”
Open Access
26
Bibliography
Bergman, Michael K. “The Deep Web: Surfacing Hidden Value.” http://www.beta.brightplanet.com/deepcontent/tutorials/DeepWeb/index.asp (8 November 2004).
Cadwallader, Joy. “Searching the Invisible Web.” http://www.inf.aber.ac.uk/academicliaison/internet/invisible.asp (4 November 2004).
Chillingworth, Mark. “Internet archive to build alternative to Google.” Information World. http://www.iwr.co.uk/IWR/1160176 (30 December 2004).
Cohen, Laura. “How to Choose a Search Engine or Directory.” http://library.albany.edu/internet/choose.html (4 November 2004).
“Does Google move augur commercialization of libraries?” http://www.freep.com/news/statewire/sw108716_20041214.htm (15 December 2004).
Grimes, Brad. “Expand your Web search horizons: six tips for finding the info you want by searching hidden corners of the Web.” PC World. June, 2002.
“Invisible Web: What it is, Why it exists, How to find it, and Its inherent ambiguity.” http://www.lib.berkeley.edu/TeachingLib/Guides/Internet/InvisibleWeb.html (4 November 2004).
Lessig, Lawrence. “Let a Thousand Googles Bloom.” http://www.latimes.com/news/opinion/commentary/la-oe-lesig12Jan12,1,1292618.story?ctrack=1 (13 January 2005).
McLaughlin, Laurianne. “Beyond Google: the web is so full of useful info that no search engine can find it all. But a multitude of specialty sites deliver shopping advice, reference databases, leisure-time ideas, and more – fast.” PC World. April, 2004.
27
Bibliography
Niederlander, Mary. “More on Searching: The Hidden Web or Invisible Web Resources.” http://www.librarysupportstaff.com/hiddenweb.html (4 November 2004).
O’Leary, Mick. “Invisible Web Discovers Hidden Treasures.” Information Today. January, 2000.
“Search Engines 101 – Search Engines Explained.” http://www.submittoday.com/search_engines_101.htm (4 November 2004).
“Searching the Hidden Web.” http://www2.canisius.edu/canhp/canlib/guides/hidden-web.html (4 November 2004).
Sherman, Chris and Gary Price. “The invisible web: uncovering sources search engines can’t see.” Library Trends. Fall, 2003.
Smith, C. Brian. “Invisible Web: Explore hidden troves of information.” http://www.libraryspot.com/features/invisibleweb.htm (4 November 2004).
Sullivan, Danny. “Google Scholar Offers Access to Academic Information.” http://searchenginewatch.com/searchday/article.php/3437471 (1 December 2004).
Vine, Rita. “Going beyond Google for faster and smarter web searching.” Teacher Librarian. October, 2004.