11/18/02 Travis Brooks - ASIST: The Unpublishing of High Energy Physics. Travis Brooks, SPIRES Scientific Databases Manager, Stanford Linear Accelerator Center.
What is SPIRES? Bibliographic database of over half a million High Energy Physics (HEP)-related articles. Citation searching and tracking for e-prints and journals. First website in the U.S. Over 25,000 searches a day. Mirrors in 5 countries.
Unpublished Research: I am a former HEP theorist, so the words “unpublished research” immediately call to mind the eprint archive arXiv.org and its use in High Energy Physics (HEP), especially theory. HEP theory is a relatively tight community of over 1,000 scientists.
hep-th (Pr)eprints, a Timeline: Prior to 1974, preprints sent by mail to select groups. 1974: SPIRES indexes preprints, allowing more general distribution and retrieval. 1991: arXiv.org (then LANL) allows immediate universal electronic access to the full text of preprints. Preprints become eprints, posted by the author with no content review. Demise of all HEP journals predicted.
Current use of hep-th: Studied hep-th from ,000 papers. 13,000 eventually published in journals. 1,000 in conferences. 3,000 remain eprints only.
A New Type of Publication? Over these 6 years hep-th has remained stable as a “mature” arXiv. Over 90% of papers published in Phys. Rev. D were submitted to arXiv.
Topics: How do HEP theorists use eprints? (From a statistical view, and from a physics researcher’s view.) Implications and reasons for the success of eprints in HEP theory. Issues and opportunities in HEP experimental research.
Cite Counts: Much research has been done using citations as a measure of eprint usage. Citations are important as a measure of what scientists read. They are also a mark of quality: the citing author believes the cited work to be important enough to revise, extend, or improve upon its ideas. Citations show where the action is.
Cite Counts II: Cites to HEP and related eprints from journals have been found to be high and rising (Brown 2001, Youngen 1998, others). hep-th eprints are of similar quality (as measured by cites from all sources) to average journals, with a similar impact factor (Fabbrichesi and Montolli, 2001).
Time series of cites: Brody (2000) has examined the time series of citations within the arXiv. SPIRES allows citation tracking of an article through its life as an eprint and then as a journal article, making no distinction between the two. This reflects the HEP scientific culture.
Why Citations over time? When (in a paper’s publication journey) does most citing occur? Plot the number of citations a published hep-th article receives per month after its arXiv submission. 8,000 published papers in the sample. Includes citations from journal papers and arXiv papers (essentially the same set).
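The per-month citation count described on this slide could be computed along the following lines. This is a minimal sketch, not the analysis actually run on SPIRES: the function names, the `(submission_date, citing_dates)` record layout, and the toy records are all assumptions made for illustration.

```python
from collections import Counter
from datetime import date


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end (day of month is ignored)."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def citations_per_month(papers, max_months=36):
    """Average citations per paper for each month after arXiv submission.

    `papers` is a list of (submission_date, [citing_dates]) pairs; this
    layout is hypothetical, not the SPIRES record format.
    """
    counts = Counter()
    for submitted, citing_dates in papers:
        for cited_on in citing_dates:
            month = months_between(submitted, cited_on)
            if 0 <= month < max_months:
                counts[month] += 1
    n = len(papers)
    return [counts[m] / n if n else 0.0 for m in range(max_months)]


# Made-up records for illustration only, not SPIRES data.
toy_papers = [
    (date(1998, 1, 15), [date(1998, 2, 3), date(1998, 3, 20), date(1998, 9, 1)]),
    (date(1998, 5, 2), [date(1998, 6, 30), date(1998, 7, 11)]),
]
rates = citations_per_month(toy_papers, max_months=12)
```

Because citations from journal papers and arXiv papers are pooled, the sketch keeps a single list of citing dates per paper and makes no distinction by source.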
What do HEP theorists read? Wherever the citation peak is, that is when the most exposure occurred. Citations show that the work was not only read, but taken seriously. If HEP theorists treated unpublished eprints differently from published, peer-reviewed papers, one would expect to see higher citation rates after publication.
They read eprints, not journals: Journal lag time is roughly 6 months. The citation peak occurs after eprint release, not journal release. HEP theorists don’t care whether an article is published or not when citing it. No visible bump in citations at journal release.
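The comparison behind this slide can be sketched as a check on where the monthly citation curve peaks relative to the journal lag. The helper names and the curve below are assumptions: the numbers are an invented shape used only to exercise the code, not measured hep-th rates, and the 6-month default simply mirrors the rough lag quoted above.

```python
def peak_month(rates):
    """Index of the month with the highest citation rate (first one on ties)."""
    return max(range(len(rates)), key=lambda m: rates[m])


def peaks_before_publication(rates, journal_lag_months=6):
    """True if the citation peak falls before the assumed journal release month."""
    return peak_month(rates) < journal_lag_months


# Invented curve for illustration only, not measured data.
illustrative_rates = [0.2, 0.9, 1.1, 1.0, 0.8, 0.7, 0.7, 0.6, 0.5, 0.5, 0.4, 0.4]
peak = peak_month(illustrative_rates)
early_peak = peaks_before_publication(illustrative_rates)
```

A curve that instead rose around the lag month would make `peaks_before_publication` return False, which is the pattern one would expect if journal release drove readership.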
From a HEP theorist’s perspective: You read the arXiv papers to find out the latest scientific information. You base your work on what you read in the arXiv. Scientific priority is given by the arXiv time stamp, not the journal submission date. You don’t notice whether a paper is published.
Peer Review? This dependence upon the arXiv is not the loss of peer review. All hep-th articles are posted for all of your peers to see! Put shoddy work out there for all to see, and it is known. Post uninteresting, incoherent ramblings, and they are ignored.
Why do they still publish? Only a few articles remain unpublished forever. They publish “for the record,” or more likely, “for the tenure/search committee.” Respected, tenured authors may not publish at all: Dr. Edward Witten has 9 papers with over 50 citations that are not published in conferences or journals.
HEP theorist’s viewpoint: arXiv is for daily (journal-like) communication. Journals are for “archival” value. Overheard about a paper not sent to hep-th: “He didn’t publish it, he just sent it to Phys. Rev. D.” Eprints are really published literature now.
Why HEP Theory? No proprietary/patent issues. Papers can be verified by hand, by any knowledgeable reader. Work is like a continuing dialog, each paper sparking new, creative ideas.
Same basic style: Note that the basic publication style has not really changed. HEP theory has not moved away from papers written by a few authors to more complex technology-enabled collaborations.
HEP Experiment: HEP experiment has had more radical changes in working style, pushing pre-publication scientific collaboration to new levels. Close to 1,000 “authors” on a paper.
Experimental Data: Worldwide data processing grid. World’s largest database (over 600 TB) from one experiment. Unpublished, how is it maintained? Will it persist as useful data? Current solution is to publish a 2-year summary paper of all HEP data. Web, db, and maybe raw data may change this.
Conclusions: hep-th eprints are an incredibly successful tool, filling many traditional journal roles. Still a traditional publication model, simply a different medium. Opportunities for truly different uses of unpublished research in HEP experiment.