Using Search Engines and Web Crawlers in Social Science Research Mike Thelwall Head, Statistical Cybermetrics Research Group University of Wolverhampton,

Presentation transcript:

Using Search Engines and Web Crawlers in Social Science Research Mike Thelwall Head, Statistical Cybermetrics Research Group University of Wolverhampton, UK RC33 August 2004

Link Analysis in Social Science Research
- Use to study web phenomena
  - E.g. NGO web site interlinking
  - E.g. university web site interlinking
- Use to study offline phenomena with web aspects
  - E.g. scholarly communication
  - E.g. the perception of news events
- The web is a free, accessible massive data source for information about many aspects of life

What use is hyperlink data to qualitative researchers?
- Part of a mixed methodology
- Numbers to back up theories
- To obtain samples of types of Web pages for qualitative analyses
- Background information on how the Web is used

Quick example 1: UK university interlinking with geographic clusters indicated

Quick example 2: Asia-Pacific university interlinking. {Research with Alastair Smith, VUW, NZ}

Quick example 3: Geographic interlinking trends for UK universities.

Talk overview
- A social science approach for link analysis
- Data collection with commercial search engines
- Data collection and analysis with SocSciBot

A social science approach for link analysis 1: Preliminary steps
1. Formulate an appropriate research question, taking into account existing knowledge of web structure
2. Conduct a pilot study
3. Identify web pages or sites that are appropriate to address a research question
4. Collect link data from a commercial search engine or a personal crawler, taking appropriate safeguards to ensure that the results obtained are accurate

A social science approach for link analysis 2: Validation
5. Partially validate the link count results through correlation tests
6. Partially validate the interpretation of the results through a link classification exercise or web author interviews
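One way to picture the correlation test in step 5 is a rank correlation between link counts and some independent measure of the same sites. The sketch below computes Spearman's rho using only the Python standard library. The choice of Spearman's rho, the function names, and every number in the example are assumptions made for illustration; the talk does not say which test or data were used.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical data: inlink counts for five sites and an invented
# external score for the same five sites, in the same order.
inlink_counts = [120, 45, 300, 80, 15]
external_scores = [3.1, 2.0, 4.5, 2.8, 1.2]
print(spearman(inlink_counts, external_scores))
```

A rank-based measure is used here because link counts are often highly skewed, which is a design choice of this sketch rather than something stated in the slides. A strong correlation would only partially validate the counts, which is why step 6 follows.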

A social science approach for link analysis 3: Reporting
8. Report results with an interpretation consistent with the link classification exercise; include either a detailed description of the classification or exemplars to illustrate the categories
9. Report the limitations of the study and the parameters used in data collection and processing

Link data from commercial search engines
- Commercial search engines can give information about the existence of links in the web
- Can be used for data collection
- Advanced interfaces are usually needed, or special commands

Google
- Can find all links to a given web page with the link: command
- E.g. link:

Yahoo! site-specific searches
- Yahoo! allows searching for links between pairs of web sites/web spaces
- E.g. linkdomain:db.dk +site:ac.uk returns web pages in the ac.uk domain that link to the db.dk site
[Diagram: a page at …ac.uk/… linking to …db.dk/…]
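When many pairs of sites are involved, query strings of the form shown on the slide can be generated rather than typed by hand. The sketch below only builds the strings and does not contact any search engine. The function names and the example site list are hypothetical, and the sketch does not assume that any search engine still accepts the linkdomain: syntax.

```python
from itertools import permutations


def link_query(target_site, source_space):
    """Build a query for pages in source_space that link to target_site."""
    return f"linkdomain:{target_site} +site:{source_space}"


def pairwise_queries(sites):
    """Map each ordered (source, target) pair of distinct sites to its query."""
    return {
        (source, target): link_query(target, source)
        for source, target in permutations(sites, 2)
    }


# The example from the slide: pages in ac.uk that link to db.dk.
print(link_query("db.dk", "ac.uk"))

# Hypothetical set of sites; each ordered pair gets its own query.
for pair, query in pairwise_queries(["a.example", "b.example", "c.example"]).items():
    print(pair, query)
```

Ordered pairs are used because links are directional: pages in A linking to B and pages in B linking to A are separate counts.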

SocSciBot
- Personal crawler for link research
- Available free at socscibot.wlv.ac.uk
- Crawls sets of web sites and analyses the links between them, producing:
  - Link lists
  - Link counts
  - Network diagrams
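To make the idea of link counts between sites concrete, here is a minimal sketch that extracts links from pages that have already been downloaded and counts the links between a chosen set of sites. It is not SocSciBot's implementation and does no crawling. The class and function names, the rule that a site is identified by its hostname, and the example pages are all assumptions for illustration.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect the href values of all <a> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def site_of(url):
    """Identify a site by its lower-cased hostname (an assumption of this sketch)."""
    return urlparse(url).netloc.lower()


def count_intersite_links(pages, sites):
    """Count links between distinct sites.

    pages maps a page URL to its HTML; sites is the set of hostnames studied.
    Returns a Counter keyed by (source_site, target_site).
    """
    counts = Counter()
    for page_url, html in pages.items():
        source = site_of(page_url)
        if source not in sites:
            continue
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links against the page that contains them.
            target = site_of(urljoin(page_url, href))
            if target in sites and target != source:
                counts[(source, target)] += 1
    return counts


# Hypothetical pages standing in for crawled content.
pages = {
    "http://a.example/index.html":
        '<a href="http://b.example/">B</a> <a href="/about.html">About</a>',
    "http://b.example/links.html":
        '<a href="http://a.example/">A</a> <a href="http://a.example/staff.html">Staff</a>',
}
for pair, n in count_intersite_links(pages, {"a.example", "b.example"}).items():
    print(pair, n)
```

The resulting (source, target) counts are the kind of data from which a link list or a network diagram could be drawn; links within a single site are deliberately ignored.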

Reprise: Link Analysis in Social Science Research
- Use to study web phenomena
  - E.g. NGO web site interlinking
  - E.g. university web site interlinking
- Use to study offline phenomena with web aspects
  - E.g. scholarly communication
  - E.g. the perception of news events
- The web is a free, accessible massive data source for information about many aspects of life
- But don’t forget the need for validation!