Simeon Warner - 21 April 2003 The arXiv eprint archive CS520 guest lecture Simeon Warner
What is arXiv?
  Eprint - lots of meanings, here: pre-print, post-print, no-print, refereed or unrefereed, openly accessible via internet
  230,000 papers (35k/year, ~150/day)
  Mostly physics, some math and cs
  Mostly TeX/LaTeX source
  Automated PS/PDF production
  17 mirror sites world-wide, nightly update
  Secondary submission site in Lyon, France
Full-text downloads/week
History
  hep-th created (reflector, messages were saved). ~200 users
  ftp interface. hep-ph… added
  web interface
  data on remote sites moved to main site, they become mirrors
  automatic PS from TeX
  PDF generation, mirror network grows
  ~70,000 users
Not often cited as HCI exemplar
Submission form
Submission methods
Submissions to different areas
Why is cs archive failing? Why are physics/math working?
  Strong pre-print culture in physics
  Active promotion from inside communities
  CS less TeX based
  CS has stronger homepage culture?
  CS has Citeseer (aka ResearchIndex)
  CS researchers hate arXiv interface?
  CS has more conference publication?
  LESSON: Can’t apply “one size fits all”
Different habits in CS (1)
Different habits in CS (2)
Current problems at arXiv
  Too much admin time
    150 new/day, 30 replacements/day
  Inappropriate submissions
    Encouraged by arXiv’s popularity
    Should be “peer reviewable” quality
    Must be appropriate to subject areas
  Copyright PS/PDF
    Don’t (can’t?) automatically detect
  Submission size
    (otherwise) good tools produce bloated figures
Discipline-Based Repositories, Institutional Repositories, Open Access…
  Driven by cost, ideology, self-promotion…
  arXiv is exemplar archive
    Largest, best-known, discipline based
  Need open archiving
    Current interest in institutional repositories
    Slice the other way to discipline based
    Coexist?
  Metadata sharing - OAI
    Essential infrastructure for a more dispersed model than arXiv
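To make the "Metadata sharing - OAI" point concrete, here is a minimal sketch of how a harvester might talk to a repository over OAI-PMH. It builds a `ListRecords` request URL and pulls identifiers and titles out of a Dublin Core (`oai_dc`) response. The base URL, the helper names `build_list_records_url` and `parse_records`, and the sample record are assumptions made for illustration and are not taken from the lecture. The response is a hand-written sample, so the sketch runs without network access.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Assumed endpoint, for illustration only; check the repository's documentation.
BASE_URL = "http://export.arxiv.org/oai2"

# XML namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec      # selective harvesting by set
    if from_date:
        params["from"] = from_date    # selective harvesting by datestamp
    return base_url + "?" + urlencode(params)

def parse_records(xml_text):
    """Extract identifier and title from each record in a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": rec.findtext(".//dc:title", namespaces=NS),
        })
    return records

# Invented sample response; not a real record from any repository.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:0001</identifier>
        <datestamp>2003-04-21</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example title</dc:title>
          <dc:creator>Example, A.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(build_list_records_url(BASE_URL, set_spec="cs"))
    print(parse_records(SAMPLE_RESPONSE))
```

A real harvester would also follow the `resumptionToken` that the protocol uses for paging through large result sets; that is omitted here to keep the sketch short.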
Publishers and Peer-Review
  Peer-reviewed publication necessary for academic reward (promotion, tenure, prestige)
    System slow to change
  Physics publishers accept/live-with arXiv
    Can even submit to journals via arXiv
    Not true in chemistry/biology
  Peer-review costs ~$500/paper
    Where would that come from without subscriptions?
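As a rough sense of scale for the cost question, the sketch below multiplies the ~$500/paper peer-review figure by the 35k papers/year submission rate quoted earlier in the lecture. Applying the per-paper cost to every arXiv submission is an assumption made here for illustration; the lecture does not make this calculation, and the function name is hypothetical.

```python
# Figures quoted in the lecture; combining them is an illustrative assumption.
PAPERS_PER_YEAR = 35_000            # arXiv submissions per year
REVIEW_COST_PER_PAPER_USD = 500     # approximate peer-review cost per paper

def annual_review_cost(papers_per_year, cost_per_paper):
    """Total yearly cost if every submitted paper were peer reviewed."""
    return papers_per_year * cost_per_paper

if __name__ == "__main__":
    total = annual_review_cost(PAPERS_PER_YEAR, REVIEW_COST_PER_PAPER_USD)
    print(f"${total:,} per year")   # prints "$17,500,000 per year"
```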
Resources
  arXiv:
  Front (interface to math arXiv):
  Algebraic Geometry and Topology (arXiv overlay journal)
  OAI:
  BOAI: