Methods for Exploiting Academic Hyperlinks Mike Thelwall Statistical Cybermetrics Research Group University of Wolverhampton, UK.


The Problem
To map patterns of communication between researchers in a country based upon university web sites.
Patterns of communication are also mapped based upon journal citations or journal title words.
Provides useful information about the structure and evolution of research fields.
Can identify previously unknown field connections.
Web analysis could illustrate wider and more current patterns.

Data collection
Web crawler.
AltaVista advanced queries, e.g. host:wlv.ac.uk AND link:gla.ac.uk.
AllTheWeb advanced queries.
Google does not support the same level of Boolean querying.
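The query pattern on this slide can be generated for every ordered pair of university domains. The sketch below is only an illustration of that idea: the function name `link_queries` is hypothetical, and the only query syntax it relies on is the `host:... AND link:...` form quoted on the slide. It builds query strings and does not contact any search engine.

```python
from itertools import permutations


def link_queries(domains):
    """Build one 'host:SOURCE AND link:TARGET' query per ordered pair of domains.

    The query form is copied from the slide; the helper itself is hypothetical.
    """
    return {
        (source, target): f"host:{source} AND link:{target}"
        for source, target in permutations(domains, 2)
    }


# The two domains named on the slide.
queries = link_queries(["wlv.ac.uk", "gla.ac.uk"])
for pair, query in queries.items():
    print(pair, "->", query)
```

Each query asks for pages hosted at the source domain that link to the target domain, so both directions of a pair are queried separately.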

Types of link count
Direct link counts: inter-site links only.
Co-inlink counts: B and C are co-inlinked.
Co-outlink counts: D and E are co-outlinked.
[Figure: link diagram of sites A to F]
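The three link counts can be sketched from a list of (source site, target site) pairs. The edge list below is an assumption about the slide's diagram (A links to B and C; D and E both link to F), and the function name `link_counts` is hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def link_counts(links):
    """Return (direct, co_inlink, co_outlink) counts for (source, target) site pairs."""
    # Direct link counts: inter-site links only, so self-links are dropped.
    direct = Counter((s, t) for s, t in links if s != t)

    targets_of = defaultdict(set)  # source site -> sites it links to
    sources_of = defaultdict(set)  # target site -> sites linking to it
    for s, t in direct:
        targets_of[s].add(t)
        sources_of[t].add(s)

    # Two sites are co-inlinked once for every third site that links to both.
    co_inlink = Counter()
    for targets in targets_of.values():
        co_inlink.update(combinations(sorted(targets), 2))

    # Two sites are co-outlinked once for every third site that both link to.
    co_outlink = Counter()
    for sources in sources_of.values():
        co_outlink.update(combinations(sorted(sources), 2))

    return direct, co_inlink, co_outlink


# Assumed reading of the slide's diagram.
links = [("A", "B"), ("A", "C"), ("D", "F"), ("E", "F")]
direct, co_in, co_out = link_counts(links)
print(dict(direct), dict(co_in), dict(co_out))
```

Pairs are stored in sorted order, so the co-inlink and co-outlink counts are symmetric by construction.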

Alternative Document Models
Domain ADM: count links between domains (ignoring multiple links) instead of pages.
[Figure: diagram of pages P1 to P6]

Alternative Document Models
Directory ADM: counts links between directories, estimated using URL slashes.
University ADM: counts links between entire university Web sites; too extreme for most purposes.
ADMs reduce the impact of replicated links, e.g. a subsite of 1000 pages linking to another university home page in its navigation bar.
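A minimal sketch of how the alternative document models might be applied to page-level link data follows. The names `adm_key` and `count_links` are hypothetical, the URLs are invented for illustration, and treating the last three host labels as the university site is an assumption that only suits names such as wlv.ac.uk.

```python
from urllib.parse import urlsplit


def adm_key(url, model):
    """Map a page URL to the document it belongs to under the chosen model."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if model == "page":
        return host + parts.path
    if model == "directory":
        # Directory estimated from URL slashes: keep everything up to the last slash.
        return host + parts.path.rsplit("/", 1)[0] + "/"
    if model == "domain":
        return host
    if model == "university":
        # Assumption: the last three labels (e.g. wlv.ac.uk) identify the university.
        return ".".join(host.split(".")[-3:])
    raise ValueError(f"unknown model: {model}")


def count_links(links, model):
    """Count distinct document-to-document links, ignoring multiple links."""
    pairs = {(adm_key(s, model), adm_key(t, model)) for s, t in links}
    return sum(1 for s, t in pairs if s != t)


# Invented example: a 1000-page subsite linking to another university home page.
links = [(f"http://sub.wlv.ac.uk/p{i}.html", "http://www.gla.ac.uk/") for i in range(1000)]
for model in ("page", "directory", "domain", "university"):
    print(model, count_links(links, model))
```

Under the page model every replicated navigation-bar link is counted, while the coarser models collapse them into a single link.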

Some Inter-University Hyperlink Patterns For the UK and Europe

Citation-Style Hyperlink Analysis
Citation counts are known to be reasonable indicators of research quality, but is the same true for inlink counts?
Counts of links to universities within a country can correlate significantly with measures of research productivity.
The significance of this result is in giving ‘permission’ to investigate the use of inter-university links for researching scholarly communication.
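One way to test for such a relationship is a rank correlation between inlink counts and a research productivity measure. The slides do not say which statistic was used, so Spearman's coefficient here is an assumption, and the numbers are made-up placeholders rather than real university data.

```python
def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Placeholder values for five hypothetical universities (not real data).
inlink_counts = [120, 450, 80, 900, 300]
research_scores = [10, 40, 5, 70, 35]
print(spearman(inlink_counts, research_scores))
```

A rank-based statistic is a cautious choice when the counts are highly skewed, but it is only one option.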

Most links are only loosely related to research
90% of links between UK university sites have some connection with scholarly activity, including teaching and research.
But less than 1% are equivalent to citations.
So link counts do not measure research dissemination but are more a natural by-product of scholarly activity.
Cannot use link counts to assess research.
Can use link counts to track an aspect of communication.

Links to UK universities against their research productivity
The reason for the strong correlation is the quantity of Web publication, not its quality.
This is different to citation analysis.

Universities tend to link to neighbours

Universities cluster geographically

Language is a factor in international interlinking
English is the dominant language for Web sites in the Western EU.
In a typical country, 50% of pages are in the national language(s) and 50% in English.
Non-English-speaking countries interlink extensively in English.
{Research with Rong Tang & Liz Price}

Can map patterns of international communication
Counts of links between EU universities in Swedish are represented by arrow thickness.

Counts of links between EU universities in French are represented by arrow thickness.

Which language???

Linking patterns vary enormously by discipline
No evidence of a significant geographic trend.
Disciplinary differences in the extent of interlinking: e.g., Web use in history is very low, in chemistry very high.
Individual research projects can have an enormous impact upon individual departments, e.g. arts web sites are often for specific exhibitions or for digital media projects.
Links are not frequent enough to reliably reveal patterns of interdisciplinarity.

Clustering using links

Background: Power laws in Academic Webs
Academic Webs have a topology dominated by power laws, including:
Counts of links to pages (inlink counts).
Counts of links from pages (outlink counts).
Groups of interconnected pages: directed component sizes and undirected component sizes.
Power laws mean that clustering connected components will not yield useful results.
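The quantities listed on this slide can be computed from a page-level link list. The sketch below, with hypothetical function names and an invented toy graph, tabulates inlink and outlink count distributions and finds undirected component sizes with a union-find structure. It does not fit or test a power law; it only produces the frequencies one would inspect.

```python
from collections import Counter


def degree_distributions(links):
    """Return (inlink, outlink) distributions: count value -> number of pages.

    Pages with zero inlinks (or zero outlinks) do not appear in the respective tally.
    """
    inlinks = Counter(target for _, target in links)
    outlinks = Counter(source for source, _ in links)
    return Counter(inlinks.values()), Counter(outlinks.values())


def undirected_component_sizes(links):
    """Sizes of connected components when link direction is ignored, largest first."""
    parent = {}

    def find(page):
        parent.setdefault(page, page)
        while parent[page] != page:
            parent[page] = parent[parent[page]]  # path halving
            page = parent[page]
        return page

    for source, target in links:
        root_s, root_t = find(source), find(target)
        if root_s != root_t:
            parent[root_s] = root_t

    sizes = Counter(find(page) for page in list(parent))
    return sorted(sizes.values(), reverse=True)


# Invented toy graph, for illustration only.
links = [("a", "b"), ("b", "c"), ("d", "e"), ("f", "c")]
in_dist, out_dist = degree_distributions(links)
print(in_dist, out_dist, undirected_component_sizes(links))
```

On real academic Web data, a heavily skewed component size list (one very large component and many tiny ones) is the pattern that makes component-based clustering uninformative.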

Page Outlinks

Topological component sizes

Community Identification Algorithm
Can apply to page, directory and domain models.
Gives complementary results: a “layered approach”.

Stretching links further: co-inlinks, co-outlinks
For the UK academic Web, about 42% of domains connected by links alone host similar disciplines, and about 43% of those connected by links, co-inlinks and co-outlinks.
But over 100 times more domains are colinked or coupled than are directly linked.
Links in any form are less than 50% reliable as indicators of subject similarity.

Summary
Studies of the relatively restricted subdomain of university web sites:
Produce directly useful results.
For Web IR, they also help refine methodologies and help build intuition.