Analysing the link structures of the Web sites of national university systems Mike Thelwall Statistical Cybermetrics Research Group University of Wolverhampton,

Presentation transcript:

Analysing the link structures of the Web sites of national university systems Mike Thelwall Statistical Cybermetrics Research Group University of Wolverhampton, UK

Why analyse university link structures? Obtain evidence of the online impact of work. Identify trends in informal scholarly communication. Basic research: the Web is important and a valid object for scientific study.

Methodology: Data collection. Web crawler. AltaVista advanced queries, e.g. host:york.ac.uk AND link:wlv.ac.uk. AllTheWeb advanced queries. Google: does not support the same level of Boolean querying.
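
The query pattern on this slide can be generated for every ordered pair of universities. The following is a minimal sketch, assuming the "host:... AND link:..." syntax shown above; the function name build_link_queries is hypothetical, and no search engine is contacted.

```python
from itertools import permutations


def build_link_queries(domains):
    """Return one 'host:SOURCE AND link:TARGET' query per ordered pair of domains.

    The query syntax is copied from the slide; the function name is illustrative.
    """
    return [
        f"host:{source} AND link:{target}"
        for source, target in permutations(domains, 2)
    ]


queries = build_link_queries(["york.ac.uk", "wlv.ac.uk"])
for query in queries:
    print(query)
```

Each query would ask for pages hosted at the source university that link to the target university, so n universities need n(n-1) queries.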

Methodology: Data analysis 1. Link counts to target universities. Inter-site links only. Colink counts.
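
A sketch of how these counts could be computed from crawler output, assuming the crawl yields (source URL, target URL) pairs. Two simplifications are assumptions rather than the presenter's stated method: a site is identified by the last three labels of the host name (which suits names such as wlv.ac.uk only), and a colink is counted once for each third site that links to both members of a pair.

```python
from collections import Counter, defaultdict
from itertools import combinations
from urllib.parse import urlsplit


def site_of(url, labels=3):
    """Map a URL to a site name such as 'wlv.ac.uk' (assumes 3-label sites)."""
    host = urlsplit(url).hostname or ""
    return ".".join(host.lower().split(".")[-labels:])


def inter_site_links(links):
    """Keep only links whose source and target belong to different sites."""
    pairs = [(site_of(source), site_of(target)) for source, target in links]
    return [(source, target) for source, target in pairs if source != target]


def inlink_counts(links):
    """Count inter-site links received by each target site."""
    return Counter(target for _, target in inter_site_links(links))


def colink_counts(links):
    """Count, for each pair of sites, how many other sites link to both."""
    targets_by_source = defaultdict(set)
    for source, target in inter_site_links(links):
        targets_by_source[source].add(target)
    counts = Counter()
    for targets in targets_by_source.values():
        for pair in combinations(sorted(targets), 2):
            counts[pair] += 1
    return counts


# Illustrative data only, not taken from the study.
links = [
    ("http://www.york.ac.uk/a.html", "http://www.wlv.ac.uk/"),
    ("http://www.york.ac.uk/b.html", "http://scit.wlv.ac.uk/x.html"),
    ("http://www.york.ac.uk/a.html", "http://www.york.ac.uk/c.html"),
    ("http://www.york.ac.uk/b.html", "http://www.man.ac.uk/"),
    ("http://www.man.ac.uk/", "http://www.wlv.ac.uk/"),
]
print(inlink_counts(links))
print(colink_counts(links))
```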

Methodology: Data analysis 2. Alternative Document Models (ADMs). Aggregate Web pages into documents based upon site, domains or directories for link counting. Can't be done easily from search engine data. Produces better results in some situations than simple link counting.
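
The aggregation idea can be sketched as follows, under the assumption that links between the same pair of aggregated documents are counted once. The model names ("page", "directory", "domain", "site") and the three-label site rule are illustrative choices, not definitions taken from the slides.

```python
from urllib.parse import urlsplit


def document_id(url, model, site_labels=3):
    """Return the identifier of the 'document' that a URL belongs to."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if model == "page":
        return host + parts.path
    if model == "directory":
        # Drop the file name and keep the directory part of the path.
        return host + parts.path.rsplit("/", 1)[0] + "/"
    if model == "domain":
        return host
    if model == "site":
        # Assumes sites are named by their last three labels, e.g. york.ac.uk.
        return ".".join(host.split(".")[-site_labels:])
    raise ValueError(f"unknown model: {model}")


def count_links(links, model):
    """Count distinct (source document, target document) pairs."""
    return len({(document_id(s, model), document_id(t, model)) for s, t in links})


# Illustrative data only: four pages at one university link to the same page.
links = [
    ("http://www.york.ac.uk/cs/a.html", "http://www.wlv.ac.uk/index.html"),
    ("http://www.york.ac.uk/cs/b.html", "http://www.wlv.ac.uk/index.html"),
    ("http://www.york.ac.uk/maths/c.html", "http://www.wlv.ac.uk/index.html"),
    ("http://lib.york.ac.uk/d.html", "http://www.wlv.ac.uk/index.html"),
]
for model in ("page", "directory", "domain", "site"):
    print(model, count_links(links, model))
```

Coarser models shrink the count because many pages collapse into one document, which reduces the influence of heavily replicated links.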

Methodology: Data analysis 3. Statistical techniques for evaluating results. Correlation with known research performance measures. Factor analysis, Multi-Dimensional Scaling and Cluster analysis for patterns. Techniques from Communication Networks research.
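
As an illustration of the correlation step, the sketch below computes a Spearman rank correlation between link counts and a research measure. The slides do not say which correlation coefficient was used, so the choice of Spearman is an assumption, and the numbers are invented for the example.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical inlink counts and research scores for four universities.
inlinks = [120, 45, 300, 80]
research = [5.1, 2.0, 6.3, 3.9]
rho = spearman(inlinks, research)
print(rho)
```

A rank-based coefficient is a cautious default here because link counts tend to be highly skewed.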

Methodology: Data analysis 4. Simple graphical techniques. Display linkages above a certain threshold. Community identification techniques from computer science.
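
A rough sketch of the thresholding idea: keep only the pairs of universities whose link count reaches a threshold, then group the universities that remain connected. Connected components are used here as a deliberately crude stand-in for the community identification techniques the slide mentions, and the link counts are invented.

```python
from collections import defaultdict


def strong_links(link_counts, threshold):
    """Keep the pairs whose link count is at least the threshold."""
    return {pair: n for pair, n in link_counts.items() if n >= threshold}


def connected_groups(edges):
    """Group nodes that are joined, directly or indirectly, by the edges."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Hypothetical link counts between pairs of university sites.
link_counts = {
    ("gla.ac.uk", "ed.ac.uk"): 40,
    ("ed.ac.uk", "st-and.ac.uk"): 25,
    ("man.ac.uk", "salford.ac.uk"): 30,
    ("gla.ac.uk", "man.ac.uk"): 3,
}
groups = connected_groups(strong_links(link_counts, threshold=20))
print(groups)
```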

Results 1: Links associate with research. Counts of links to UK, Australian and Taiwanese universities correlate significantly with measures of research productivity. Counts of links in China appear not to. Results are better with ADMs for the UK but not for Taiwan.

Results 2: Most links are only loosely related to research. A random sample of links between UK university sites revealed that over 90% had some connection with scholarly activity, including teaching and research. Less than 1% were equivalent to citations.

Results 3: Links are related to geography. Interlinking between universities in the UK decreases with geographic distance.

Results 4: Universities cluster by geographic region. This is clearest for Scotland but also holds for other groupings, including Manchester-based universities. Coherent clusters are difficult to extract because of overlapping trends.

Results 5: Linguistic factors in EU communication. English is the dominant language for Web sites in the Western EU. In a typical country, 50% of pages are in the national language(s) and 50% in English. Non-English-speaking countries extensively interlink in English. (Research with Rong Tang, SUNY Albany.)

Results 6: Power laws in the Web. Academic Webs have a topology dominated by power laws, including: inlink counts, outlink counts, directed component sizes and undirected component sizes.
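
One quick way to look for a power law in inlink counts is to tabulate how many pages have each count and fit a straight line on log-log axes, since a power law f(k) = C k^(-a) becomes log f = log C - a log k. The sketch below does this with ordinary least squares. It is a rough diagnostic under that assumption, not the method used in the study, and the test distribution is synthetic.

```python
from collections import Counter
from math import log


def degree_frequencies(counts):
    """Map each positive link count to the number of pages with that count."""
    return dict(Counter(k for k in counts if k > 0))


def loglog_slope(frequencies):
    """Least-squares slope of log(frequency) against log(link count)."""
    xs = [log(k) for k in frequencies]
    ys = [log(f) for f in frequencies.values()]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


# Synthetic frequencies following f(k) = 1000 * k**-2 exactly.
synthetic = {k: 1000 / k**2 for k in range(1, 11)}
slope = loglog_slope(synthetic)
print(slope)
```

A slope near -2 on this synthetic input simply recovers the exponent that was built in. On real data, a straight-looking log-log plot is suggestive rather than conclusive evidence of a power law.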

Results 6: Power laws in the Web

Results 7: Academic Web Topology

Criticism. What do the statistics mean? A variety of factors influence link creation, mainly informal. About 90% of inter-site links have some connection to research. Links are an informal scholarly communication soup, from which patterns can be sieved out.

The future. Results of research leading into: improved Web-related policy making in the EU; improved Web information retrieval algorithms; improved understanding of informal scholarly communication on the Web. It is easy to get some statistics, but very hard to get meaningful statistics.