Evaluation of the NSDL and Google for Obtaining Pedagogical Resources

Frank McCown, Johan Bollen, and Michael L. Nelson
Old Dominion University, Computer Science Department
Norfolk, Virginia, USA

ECDL 2005, September 21, 2005

[Diagram: "Useful K-12 Educational Content" shown within "The Entire Web"]

Related Work

- Many studies compare web search engines' ability to find "relevant" documents
  - Human evaluators (typically students) are used
  - Top 10 or 20 results are evaluated
- None have used the NSDL or evaluated the educational usefulness of search results
- Sumner et al. found that educators expect digital libraries to save them time over web search engines because DLs filter their content [1]

[1] T. Sumner et al.: Understanding Educator Perceptions of "Quality" in Digital Libraries (JCDL 2003)

Virginia Standards of Learning

BIO.3 The student will investigate and understand the chemical and biochemical principles essential for life. Key concepts include:
a) water chemistry and its impact on life processes;
b) the structure and function of macromolecules;
c) the nature of enzymes; …

Evaluators

Query Ratings

"Were the search terms well chosen?"
1 = Strongly agree
2 = Agree
3 = Neutral
4 = Disagree
5 = Strongly disagree

Average rating: 2.08
Median rating: 2
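A minimal sketch (in Python) of how such Likert summary statistics can be computed. The individual ratings below are hypothetical stand-ins; the transcript reports only the mean (2.08) and median (2):

    import statistics

    # Hypothetical Likert ratings (1 = strongly agree ... 5 = strongly disagree)
    # for "Were the search terms well chosen?"
    ratings = [2, 1, 3, 2, 2, 3, 1, 2]

    print("mean:", statistics.mean(ratings))      # average agreement
    print("median:", statistics.median(ratings))  # middle rating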

Link Ratings

"Were the search results useful for educating a student about the Learning Statement?"
1 = Strongly agree
2 = Agree
3 = Neutral
4 = Disagree
5 = Strongly disagree
6 = N/A

A Wilcoxon signed-rank test revealed a statistically significant difference (p < 0.05) between the NSDL and Google ratings.
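A hedged sketch of how such a paired test can be run with SciPy. The per-query rating pairs below are hypothetical, not the study's data:

    from scipy.stats import wilcoxon

    # Hypothetical paired samples: for each query, a summary rating of Google's
    # top-10 results and of the NSDL's top-10 results (lower = more useful).
    google_ratings = [2, 1, 3, 2, 2, 1, 4, 2, 3, 2]
    nsdl_ratings   = [3, 2, 4, 3, 5, 2, 4, 3, 4, 3]

    stat, p = wilcoxon(google_ratings, nsdl_ratings)
    print(f"W = {stat}, p = {p:.4f}")  # p < 0.05 -> ratings differ significantly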

Calculate Precision

Google's precision = 145/380 = 38.2%
NSDL's precision = 57/334 = 17.1%
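Precision here is the fraction of evaluated results judged useful. A minimal sketch reproducing the arithmetic above:

    # Precision = (number of results rated useful) / (number of results evaluated)
    def precision(useful, evaluated):
        return useful / evaluated

    print(f"Google: {precision(145, 380):.1%}")  # 38.2%
    print(f"NSDL:   {precision(57, 334):.1%}")   # 17.1%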

[Chart: search result ratings, all domains]

[Chart: median search result ratings for each domain]

Ranking of Search Results

How well did Google and the NSDL rank their search results?
Performed a Spearman rank correlation between result rankings and usefulness ratings:
- Google: rho = 0.125, p = 0.001 **
- NSDL: rho = 0.057, p = 0.173
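A sketch of this correlation with SciPy, using hypothetical rank/rating pairs (the study's raw data are not in the transcript):

    from scipy.stats import spearmanr

    # Hypothetical data: each result's rank position (1 = top) and its usefulness
    # rating (1 = strongly agree ... 5 = strongly disagree). A positive rho means
    # lower-ranked results tended to get worse ratings, i.e., the ranking helps.
    rank_positions = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    ratings        = [1, 2, 1, 3, 2, 4, 3, 5, 4, 5]

    rho, p = spearmanr(rank_positions, ratings)
    print(f"rho = {rho:.3f}, p = {p:.3f}")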

How Did This Happen?

Why did Google perform better than the NSDL?
- Google has more stuff: 8 billion pages vs. a few million?
- But Google's stuff is not screened for educational quality
- Google and the NSDL show different results:
  - Across 38 queries, only 6 duplicate results appeared in the top 10 results
  - 25% of NSDL results were not indexed by Google
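A hedged sketch of how top-10 overlap between two engines can be measured. The URLs below are hypothetical; in practice URLs would need to be normalized before comparison:

    # Count results that appear in both engines' top-10 lists for one query.
    def top10_overlap(results_a, results_b):
        return len(set(results_a[:10]) & set(results_b[:10]))

    google_top10 = ["http://example.org/cells", "http://example.org/enzymes"]  # hypothetical
    nsdl_top10   = ["http://example.org/enzymes", "http://example.org/water"]  # hypothetical
    print(top10_overlap(google_top10, nsdl_top10))  # -> 1 shared result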

How Did This Happen?

- More NSDL results were inaccessible or login-protected (NSDL = 9.3%, Google = 5.3%)
- 81% of the inaccessible NSDL results were from …
- Some NSDL resources were not for K-12 students
- 17% of NSDL results were from …

Improvements for the NSDL

- Provide advanced search features
- Provide the ability to target the grade-level appropriateness of information
- Rank results based on relevance

Thank You

Questions?

Evaluation results, data files, and slides: