The July 2011 Webometrics repository ranking
Isidro F. Aguillo
repositories.webometrics.info
The views expressed in this presentation are those of the presenter, not necessarily those of the CCHS or the CSIC.

Agenda
Introduction to the Cybermetrics Lab
Webometrics, an emerging discipline
Webometrics, OA and repositories
Ranking Web
–Preliminary results July 2011
Final comments
Open debate

The Cybermetrics Lab
Scholars doing scientific research
–Researchers belonging to the Spanish National Research Council (CSIC)
–The largest Spanish public research organization
–Recognised by our peers
–15 years' experience in quantitative analysis and evaluation of scholarly communication and academic institutions
–Papers in refereed scientific journals, contributions to international conferences, reports to governmental bodies
–Funded by public resources
–International cooperation projects funded by the European Commission
Research Agenda
–Promote Open Access initiatives
–Global coverage, including developing countries
–Building Cybermetrics/Webometrics as an emerging discipline

Webometrics
[Diagram of webometric indicator families. Heading: Criteria. Labels: Search Engines, Size, Web 2.0, Visibility, Networks, Mentions, Activity, Impact, TrafficRank, Presence, Position, Popularity, Behavior, Visits/visitors, Analytics (usage).]
Indicators shown in the diagram:
–Number of external inlinks, Web impact factor, g-factor, PageRank
–Inter-linking, co-linking, clusters, similarity, network measurements
–Names of authors, papers, institutions, journals, hot topics
–Size, geographical coverage, languages, biases, algorithms, updating frequency, operators
–Number of webpages, rich files, academic papers, media files, languages, age
–Social networks presence, blogmetrics, wikimetrics
–Patterns of visits, referrers, referrals
–Number of visits, visitors, geographical and temporal distribution
–Frequency, presence in selected HTML tags, title, URL, bad practices
–Presence in search engines and directories
–Rank in search results

Webometrics, OA and repositories
Webometrics requires public Web
–Direct crawling
  –OA Electronic Journals
  –Repositories
–Indirect crawling: Search engines as proxies
  –Link analysis
  –Mention analysis
Analytics
–Usage
  –from log files
  –Google Analytics or similar
OpenAIRE WP8
–Combining Bibliometrics, Webometrics and Analytics indicators

A few objectives and some problems
Priorities in OA initiatives
–Populate the repositories
–Obtain mandates
–Apply standards
–Increase visibility
Intellectual property issues
–Authors not transferring full rights to publishers
–Participation in repositories intended for:
  –Increasing the number of citations
  –Improving author (and institutional) prestige
–But … current OA practices mean some rights are being lost
  –At the level of repository
  –At the level of institution

Transfer of institutional rights
Research results are the most important assets of the universities, but in a few cases the repository is outside the institutional web domain
–HAL Sciences de l'Homme et de la Société
–White Rose Consortium ePrints Repository http://eprints.whiterose.ac.uk/
–University of Arizona's Campus Repository http://arizona.openrepository.com/
–Paris Institute of Technology Pastel Theses http://pastel.archives-ouvertes.fr/
–Universidad de Chile Cybertesis http://
–Open Access Server Woods Hole http://darchive.mblwhoilibrary.org/
–TeesRep Teesside University http://tees.openrepository.com/
–Auckland Univ Technology ScholarlyCommons http://aut.researchgateway.ac.nz/
–University of Wolverhampton Digital Repository http://wlv.openrepository.com/
–HAL Ecole Polytechnique http://hal-polytechnique.archives-ouvertes.fr/

A different point of view
Regarding naming
–The institutional repository URL should be in the institutional web domain
–The relevant item is the full-text file, not the webpage of the record
–It is recommended that the URL of the file includes:
  –Institutional web domain
  –Last name of (main) author
  –Explicit file type (something.pdf)
Regarding linking
–The item URL (not the record) should be easily linkable (citable): short, with no complex or long numerical codes
–Nothing against PURLs, but not as the main linking target

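The naming and linking recommendations above can be turned into a simple checklist. The following is only a minimal sketch under stated assumptions: the function name `check_item_url`, the length limit and the digit-run threshold are hypothetical (the slide says only "short" and "long numerical codes"), the list of file extensions is borrowed from the rich-file formats named in the methodology, and the example URLs are invented.

```python
import re
from urllib.parse import urlparse

# Assumed thresholds: the slide says only "short" and "no long numerical codes".
MAX_URL_LENGTH = 100
LONG_DIGIT_RUN = 6
# Rich-file formats named in the ranking methodology.
RICH_FILE_EXTENSIONS = (".pdf", ".doc", ".docx", ".ppt", ".pptx", ".ps", ".eps")


def check_item_url(url, institutional_domain, author_last_name):
    """Return a list of recommendations that the full-text URL does not meet."""
    issues = []
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    domain = institutional_domain.lower()

    # The file should live in the institutional web domain (or a subdomain).
    if host != domain and not host.endswith("." + domain):
        issues.append("outside institutional web domain")

    path = parsed.path.lower()
    # The URL should carry the (main) author's last name.
    if author_last_name.lower() not in path:
        issues.append("author last name missing")
    # The URL should end in an explicit file type, e.g. something.pdf.
    if not path.endswith(RICH_FILE_EXTENSIONS):
        issues.append("no explicit file type")
    # The URL should be short and free of long numerical codes.
    if len(url) > MAX_URL_LENGTH:
        issues.append("URL too long")
    if re.search(r"\d{%d,}" % LONG_DIGIT_RUN, url):
        issues.append("long numerical code")
    return issues


# Invented example URLs for a hypothetical institution "example.edu".
good = check_item_url(
    "http://repository.example.edu/papers/aguillo2011ranking.pdf",
    "example.edu", "Aguillo")
bad = check_item_url(
    "http://example.openrepository.com/handle/10150/1234567",
    "example.edu", "Aguillo")
```

In this sketch the first URL passes every check, while the second is flagged for being hosted outside the institutional domain, lacking the author name and file type, and containing a long numerical code.
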
Recommended URL

Discrepancies in record numbers

DOIs recognise the publisher, not the author

Complex URLs

Ranking Web of Repositories (July 2011)

July 2011 edition
Repositories with their own domain or subdomain
–1,222 repositories
–Including 1,154 institutional repositories
–Plus 49 portals
Major changes from previous editions
–Sources
  –Exalead data no longer collected
  –Yahoo Site Explorer instead of Yahoo Search
    –Only for Size
–New formats added: docx, pptx, eps
–Total number of rich files excluded from Size count
–Scholar full count (50%) + Scholar (50%)

Methodology
Columns: Source, Operator, Normalization, Weight, Indicator
–SIZE (20%). Source: Google, Yahoo SE [1], Bing. Operator: site [2]
–RICH FILES (15%). Source: Google, Yahoo, Bing. Operator: filetype [2] (pdf, doc, docx, ppt, pptx, ps, eps)
–SCHOLAR (15%). Source: Google Scholar. Operator: site (at least summaries). 50% total + 50% ( )
–VISIBILITY (50%). Source: Yahoo SE [1]. Operator: linkdomain
Normalization: log-normalization [3]
[1] Yahoo is using the Bing database, except for Site Explorer (SE) and a few national mirrors (until mid 2012)
[2] Number of rich files excluded from the global size count
[3] $\ln(a_i + 1) / \ln(a_{\max} + 1)$

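To make the weights and the log-normalization formula concrete, here is a minimal sketch of how a composite score might be computed. It rests on assumptions the slide does not state: that the normalized indicators are combined as a weighted sum, and that each indicator has already been reduced to a single count per repository (how the values from Google, Yahoo and Bing are merged is not shown). The function names and the example counts are hypothetical.

```python
import math

# Weights as listed on the methodology slide.
WEIGHTS = {"size": 0.20, "rich_files": 0.15, "scholar": 0.15, "visibility": 0.50}


def log_normalize(value, max_value):
    """Log-normalization from the slide: ln(a_i + 1) / ln(a_max + 1)."""
    if max_value <= 0:
        return 0.0  # no repository has any count for this indicator
    return math.log(value + 1) / math.log(max_value + 1)


def composite_scores(raw):
    """Map {repository: {indicator: count}} to {repository: score in [0, 1]}.

    Assumption: the score is the weighted sum of the normalized indicators.
    """
    maxima = {k: max(counts[k] for counts in raw.values()) for k in WEIGHTS}
    return {
        name: sum(WEIGHTS[k] * log_normalize(counts[k], maxima[k]) for k in WEIGHTS)
        for name, counts in raw.items()
    }


# Invented counts for two hypothetical repositories.
raw_counts = {
    "repo_a": {"size": 1000, "rich_files": 500, "scholar": 200, "visibility": 3000},
    "repo_b": {"size": 100, "rich_files": 50, "scholar": 0, "visibility": 30},
}
scores = composite_scores(raw_counts)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Because the logarithm compresses large counts, a repository with ten times fewer pages loses far less than 90% of its normalized Size value, which is why very large repositories do not dominate the scale.
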
Log-normalization
[Figure: axes SCORE and RANK; labels WR, QS, CWTS, ARWU, HEEACT, log-norm, z-score]

Top Repositories

Top Institutional Repositories

Top Portals

Final comments
Providers and end-users of repositories are scientists and their institutions
–For them, papers are the most important asset they produce
–The importance of granting increased access and visibility is universally acknowledged
–But some practices are detaching deposited material from its authors, making the papers difficult to cite (link) and penalizing the prestige of the scientists and their academic employers
Ranking Web of Repositories intends to promote OA initiatives and support best practices
–The current classification still does not reflect the diversity of repositories, but further efforts will be made in the future
–The methodology is also evolving, but overall results do not change abruptly between consecutive editions

Thank you!
Questions?
repositories.webometrics.info