

arXiv.org: 250,000 documents, 47,000 registered users, 1 million+ downloads per year. Cost per paper: $10,000 (commercial journal), $1,000 (non-profit journal), $10 (arXiv).

Goal: Process increasing number of submissions at constant or declining cost

arXiv has an active core of users: 10% of users are responsible for about 1/3 of all submissions, and 50% of all users have logged in (to submit or update a paper) in the past 1.5 years.

Authentication and Access Control: We recently moved from an HTTP authentication/Berkeley database system to a system based on cookies and a relational database. Currently, all registered users (who haven’t been suspended) can submit to all subject classes in all archives; the original submitter or somebody with the paper password can update the paper. People are allowed to register depending on their address: can register, but can’t unless company=ibm,lucent,…; this list is hard to maintain (we have to block popular ISPs in every country), exceptions are dealt with manually at great cost (each case takes detective work), and there are many people in .edu (alumni, non-research staff) who shouldn’t be able to submit. Because registration and submission are linked, the user database can’t be used to offer other services: notification, personalization.

Endorsements and Trust Management (diagram labels: Administrators, Grandfathered Users): In the new system, everyone will be able to register. Users who registered under the old system will still be able to upload to any archive or subject class, but new users will need to be endorsed by an author with a publication history in that category. The burden shifts from one senior staff person to 47,000 registered users. User database can be used

(Diagram labels: Endorsee, Endorser, Endorsement code)
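
The endorsement rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not arXiv's actual implementation: the `User` fields, the function names, and the `MIN_PAPERS_TO_ENDORSE` threshold are all hypothetical, and the endorsement-code exchange between endorsee and endorser is left out.

```python
from dataclasses import dataclass, field

# Hypothetical threshold: how many papers in a category count as a
# "publication history" there. The slides do not give a number.
MIN_PAPERS_TO_ENDORSE = 1


@dataclass
class User:
    name: str
    grandfathered: bool = False   # registered under the old system
    suspended: bool = False       # administrators can disable submit/endorse
    papers_by_category: dict = field(default_factory=dict)  # category -> paper count
    endorsed_for: set = field(default_factory=set)          # categories endorsed in


def can_endorse(user: User, category: str) -> bool:
    """An endorser needs a publication history in the category."""
    if user.suspended:
        return False
    return user.papers_by_category.get(category, 0) >= MIN_PAPERS_TO_ENDORSE


def endorse(endorser: User, endorsee: User, category: str) -> bool:
    """Record an endorsement if the endorser is qualified; report success."""
    if not can_endorse(endorser, category):
        return False
    endorsee.endorsed_for.add(category)
    return True


def can_submit(user: User, category: str) -> bool:
    """Grandfathered users submit anywhere; new users need an endorsement."""
    if user.suspended:
        return False
    return user.grandfathered or category in user.endorsed_for


alice = User("alice", grandfathered=True, papers_by_category={"q-bio": 3})
bob = User("bob")      # new user, no endorsements yet
carol = User("carol")  # another new user

before = can_submit(bob, "q-bio")
accepted = endorse(alice, bob, "q-bio")
after = can_submit(bob, "q-bio")
```

Keeping the submission check separate from registration reflects the point made on the slide: anyone can hold an account, and only the right to submit depends on endorsement.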

Web-based interface for administrators: view user history and publications; monitor the endorsement process; manage authority records; disable ability to submit or endorse; keep “institutional memory”.

Future Directions:
Flexible submission queue (currently, submissions are published the following evening and we can’t easily delay a submission).
Validating metadata form (force users to clean up entry errors, so administrators don’t have to).
Automatic protection (suspicious submissions and endorsements will be automatically delayed).
New search engine based on Lucene.
Retrofit notification (current awareness) to use the new user database.

Classifying Articles with the Support Vector Machine. Paul Ginsparg, Paul Houle, Thorsten Joachims, Jae-Hoon Sul. Goal: identify papers in existing archives that are relevant to a new subject archive, q-bio (Quantitative Biology).

Active Training of SVM (figure legend: training, q-bio; training, not q-bio; other, far from margin; other, close to margin): The SVM finds the maximum-margin hyperplane. We do a first training run on one year of data, then identify other papers that lie close to the dividing line. We iteratively classify these by hand to refine the classification.
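
The selection step in this loop, picking the unlabeled papers nearest the dividing line for hand classification, can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes a linear SVM has already been trained and is summarized by a weight vector `w` and bias `b`, and the feature vectors and paper identifiers are invented for the example.

```python
import math


def decision_value(w, b, x):
    """Signed score w.x + b; its sign is the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def closest_to_margin(w, b, unlabeled, k):
    """Return the ids of the k unlabeled papers nearest the hyperplane.

    The geometric distance of x from the hyperplane is |w.x + b| / ||w||.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    ranked = sorted(
        unlabeled.items(),
        key=lambda item: abs(decision_value(w, b, item[1])) / norm,
    )
    return [paper_id for paper_id, _ in ranked[:k]]


# Hypothetical trained model and unlabeled feature vectors.
w, b = [1.0, -1.0], 0.0
unlabeled = {
    "paper-a": [3.0, 0.0],   # far on the positive side
    "paper-b": [0.6, 0.5],   # very close to the line
    "paper-c": [0.0, 2.5],   # far on the negative side
    "paper-d": [1.0, 1.2],   # close to the line
}

to_label = closest_to_margin(w, b, unlabeled, k=2)
print(to_label)  # -> ['paper-b', 'paper-d']
```

In an iterative setup, the papers returned here would be labeled by hand, added to the training set, and the SVM retrained before the next selection round.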

Classifier performance improves as the size of a category increases.

Time Series Analysis of Content and Usage Information. Paul Ginsparg, Jon Kleinberg.

Kleinberg’s algorithm uses a hidden Markov model to detect bursts of word usage in arXiv titles, revealing intellectual trends in the last decade of high-energy physics theory.
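
To make the idea concrete, here is a simplified two-state sketch in the spirit of that model, not the analysis actually run on arXiv. Each time step is either in a baseline state or a burst state with an elevated word rate; entering the burst state carries a cost, and the Viterbi algorithm finds the cheapest state sequence. The parameters `s` and `gamma` and the sample counts are assumptions chosen for illustration.

```python
import math


def burst_states(hits, totals, s=2.0, gamma=1.0):
    """Label each time step 0 (baseline) or 1 (burst).

    hits[t]   = titles containing the word in period t
    totals[t] = all titles in period t
    s         = how much higher the burst rate is than the baseline rate
    gamma     = scales the cost of entering the burst state
    """
    p0 = sum(hits) / sum(totals)      # baseline rate over the whole series
    if not 0.0 < p0 < 1.0:
        raise ValueError("word must appear in some but not all titles")
    p1 = min(s * p0, 0.9999)          # elevated rate, kept below 1
    rates = (p0, p1)
    up_cost = gamma * math.log(len(hits))

    def fit_cost(p, r, d):
        # Negative log binomial likelihood; the binomial coefficient is
        # dropped because it is the same for both states.
        return -(r * math.log(p) + (d - r) * math.log(1.0 - p))

    cost = [0.0, float("inf")]        # the sequence starts in the baseline state
    back = []                         # back[t][state] = best previous state
    for r, d in zip(hits, totals):
        new_cost, pointers = [], []
        for state in (0, 1):
            # Only the move 0 -> 1 is charged; staying or dropping is free.
            candidates = [
                cost[prev] + (up_cost if prev == 0 and state == 1 else 0.0)
                for prev in (0, 1)
            ]
            best_prev = 0 if candidates[0] <= candidates[1] else 1
            new_cost.append(candidates[best_prev] + fit_cost(rates[state], r, d))
            pointers.append(best_prev)
        cost = new_cost
        back.append(pointers)

    # Trace the cheapest path backwards from the final step.
    state = 0 if cost[0] <= cost[1] else 1
    path = []
    for pointers in reversed(back):
        path.append(state)
        state = pointers[state]
    path.reverse()
    return path


# Hypothetical yearly counts of titles containing one word, out of 10 titles.
hits = [1, 1, 1, 8, 9, 8, 1, 1]
totals = [10] * 8
states = burst_states(hits, totals)
print(states)  # -> [0, 0, 0, 1, 1, 1, 0, 0]
```

Raising `gamma` makes the detector more reluctant to declare a burst, so short spikes are ignored; raising `s` requires a sharper rise in usage before a period counts as bursty.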

Review papers have a distinctive pattern of use: an initial spike after announcement, followed by a long, nearly constant tail. (Figure annotations: announcement; cited by other papers; web link added.)