Distributed Current Awareness Services Thomas Krichel 2003-09-18.


Thanks
JISC, sponsor of Mailbase and JISCMail
Mailman team
WoPEc project
Manchester Computing
Bob Parks & Washington University in St. Louis
SB RAS (Siberian Branch of the Russian Academy of Sciences)
– Sergei I. Parnov
– Tatyana I. Yakovleva
Heinrich Stammerjohans and the SINN03 organizers

What is current awareness?
An old-fashioned concept that implies a series of reports on
– New items in a library
– Per subject category
Thus current awareness implies a two-dimensional classification on time and subject matter.

Is it useful in 7 A. Google?
The time component is something that search engines cannot do easily
– Cannot divide indexed items according to type.
– Do not understand subject matter.
– Do not have a mode to find recent items.
But generally can we trust computers to do it?

Computers & thematic component
In computer-generated current awareness one can filter for keywords.
This is classic information retrieval, and we all know what the problems are with that.
In academic digital libraries the papers describe research results, so they contain ideas that have not been seen before. Therefore getting the keywords right is impossible.

Computers and time component
In a digital library the date of a document can mean anything.
The metadata may be dated in some implicit form.
– Recently arrived records can be calculated
– But record handles may be unstable
– Recently arrived records do not automatically mean new documents.
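One way to calculate recently arrived records is to compare two snapshots of the record handles. The sketch below is only an illustration under that assumption: the function name and the handles are hypothetical, and it is not a description of how RePEc or NEP actually does it.

```python
def recent_arrivals(previous_handles, current_handles):
    """Return handles in the current snapshot that the previous one lacked."""
    return sorted(set(current_handles) - set(previous_handles))


# Hypothetical handles, for illustration only.
previous = {"RePEc:aaa:wpaper:001", "RePEc:aaa:wpaper:002"}
current = {
    "RePEc:aaa:wpaper:001",
    "RePEc:aaa:wpaper:2",    # same document as ...:002, handle rewritten
    "RePEc:bbb:wpaper:9",    # a record that really is new
}

arrivals = recent_arrivals(previous, current)
print(arrivals)
```

The rewritten handle is reported as an arrival although the document is not new, which is the instability problem named on the slide.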

We need human users!
Cataloguers are expensive.
We need volunteers to do the work.
Junior researchers have good incentives
– Need to be aware of the latest literature
– Absent from the informal circulation channels of top-level academics
– Need to get their name around among researchers in the field.

History
We use the RePEc digital library about economics
System was conceived by Thomas Krichel
Named NEP by Sune Karlsson
Implemented by José Manuel Barrueco Cruz.
Started to run in May 1998, has been expanding since…

General set-up
General editor compiles a list of recent additions to the RePEc working papers data.
– Computer generated
– Journal articles are excluded
– Examined by the General Editor (GE, a person)
This list forms an issue of nep-all
nep-all contains all new papers
Circulated to
– nep-all subscribers
– Editors of subject reports

Subject reports
These are filtered versions of nep-all.
Each report has an editor who does the filtering.
Each pertains to a subject defined by one or more words
Circulated by .

Report management
Reports are in a flat space, without hierarchy.
They vary in size.
Report creation has not followed an organized path
– Volunteers have come forward with ideas.
– If a report's creator retires as editor, a volunteer from among the subscribers is easily found.
– It has become practice for the GE to ask for a CV before awarding an editorship.

NEP evaluation
Ideally one would have a model of
– Readers
– Subjects
– Resource constraints
This model would predict values of observable variables in an optimum state.
Distance between actual and optimum state can be calculated.

Data on readers
Readers are people who have subscribed to reports.
They are proxied by addresses.
Since , Thomas Krichel has captured readership data
– Once a month
– For every report
No historic readership data

Data on papers difficult
Logs of Mailbase, JISCMail and Mailman do not have detailed headers
– Date information is difficult to parse and unreliable
– Only reliable from with dummy subscriber set up
Dates of issues (as opposed to mail dates) changed by editors
Paper handles garbled by
– Mailing software
– Editing software
Report issue parser > 500 lines of Perl, growing!

Coverage ratio analysis
Coverage ratio is announced papers / size of nep-all
It is a time-varying characteristic of NEP as a whole.
We expect it to increase over time because we have an expanding portfolio of reports.
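As a rough illustration, the coverage ratio for one nep-all issue could be computed as below. This sketch assumes that an "announced" paper is one from the nep-all issue that appears in at least one subject report. The function name, report names and paper identifiers are hypothetical.

```python
def coverage_ratio(nep_all_issue, subject_report_issues):
    """Share of papers in a nep-all issue announced in at least one subject report."""
    all_papers = set(nep_all_issue)
    if not all_papers:
        return 0.0
    announced = set()
    for papers in subject_report_issues.values():
        announced.update(papers)
    # Only papers that were in this nep-all issue count as covered.
    return len(announced & all_papers) / len(all_papers)


# Hypothetical data: four new papers, two subject reports.
nep_all = ["p1", "p2", "p3", "p4"]
reports = {"nep-mac": ["p1", "p2"], "nep-fin": ["p2", "p3"]}

ratio = coverage_ratio(nep_all, reports)
print(ratio)
```

Here "p2" is announced twice but counted once, and "p4" is never announced, so three of the four papers are covered.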

Target-size theory
Subject concepts are fuzzy.
Evidence of subject is flimsy at times.
Editors have a target size for a report issue.
Depending on the size of the nep-all issue, editors are more or less choosy.
This theory should be most appropriate for medium-size reports.
This could be confirmed by further research.

Lousy paper theory
Some papers in RePEc
– are not good
– are perceived not to be good
They will never be announced.
Editors dispute this theory but it may be possible to show that they are wrong.

Future developments
Thomas Krichel sees NEP as a crucial tool for alternative peer review.
Lousy paper theory supports that.
But evaluation of papers is not enough.
It is only a necessary step to the evaluation of the author.
It will have to be done with respect to a neighborhood of the author.
ACIS project is crucial.

Evaluation through downloads
Data from Tim Brody shows that download data is strongly correlated with impact as measured by citations.
But downloads of one document have to be compared to a neighborhood of other documents
– some areas of interest are more popular than others
– logs accumulate over time
NEP data crucial.
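One possible way to compare a document with its neighborhood is to divide its download count by the mean count of the papers announced in the same report issue, since those papers share a subject and an announcement date. This is an assumed normalization chosen for illustration, not a method stated in the talk. The names and numbers are hypothetical.

```python
def relative_downloads(downloads, issue_papers):
    """Scale each paper's downloads by the mean over its report issue."""
    counts = [downloads.get(paper, 0) for paper in issue_papers]
    mean = sum(counts) / len(counts) if counts else 0
    if mean == 0:
        # No downloads at all in this neighborhood: nothing to compare.
        return {paper: 0.0 for paper in issue_papers}
    return {paper: downloads.get(paper, 0) / mean for paper in issue_papers}


# Hypothetical download counts for one issue of one report.
downloads = {"p1": 30, "p2": 10, "p3": 20}
scores = relative_downloads(downloads, ["p1", "p2", "p3"])
print(scores)
```

A score above 1 means the paper was downloaded more than the average paper announced alongside it.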

Download data manipulable
If Tim's work becomes more widely known, authors will rush to download their own papers.
This needs to be filtered.
In addition, we need good filtering for search engine access.

Ticketing system to be done
Ticketing is issuing a URL for downloads that has an encrypted string encoding
– report reader address
– report issue data
This is not an access restriction tool.
Repeated downloads with the same ticket will be discarded.
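The sketch below shows one way such tickets might work. Note the substitution: instead of an encrypted string it uses a keyed hash (HMAC) of the reader address and issue data, so the ticket is opaque and the server would keep its own record of which reader and issue each ticket stands for. The key, the URL layout, the parameter names and all function names are assumptions made for illustration.

```python
import hashlib
import hmac
from urllib.parse import quote

SECRET_KEY = b"replace-with-a-real-secret"  # assumption: known to the server only


def make_ticket(reader_address, report, issue_date, key=SECRET_KEY):
    """Derive an opaque ticket from the reader address and report issue data."""
    message = "\x1f".join([reader_address, report, issue_date]).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def ticket_url(base_url, paper_handle, ticket):
    """Build a download URL that carries the ticket (hypothetical layout)."""
    return f"{base_url}?paper={quote(paper_handle)}&ticket={ticket}"


def count_downloads(log):
    """Count downloads per paper, discarding repeats with the same ticket."""
    seen = set()
    counts = {}
    for ticket, paper in log:
        if (ticket, paper) in seen:
            continue  # repeated download with the same ticket
        seen.add((ticket, paper))
        counts[paper] = counts.get(paper, 0) + 1
    return counts


t1 = make_ticket("reader1@example.org", "nep-mac", "2003-09-14")
t2 = make_ticket("reader2@example.org", "nep-mac", "2003-09-14")
log = [(t1, "p1"), (t1, "p1"), (t2, "p1"), (t1, "p2")]
counts = count_downloads(log)
print(counts)
```

Anyone with the URL can still download the paper, which matches the point that this is not an access restriction tool. The ticket only makes repeated downloads recognizable.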

Aggregation
The data is very rich
– disaggregated by issue time
– disaggregated by report
– disaggregated by download time
We need to merge it with data on authors, using the RePEc author service
We need to produce numbers for authors.
This can be done in many ways.
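A minimal sketch of such an aggregation follows, assuming each download record carries the paper, the report, the issue date and the download date, and that authorship is available as a mapping from paper to authors. The record layout, field names and data are hypothetical, and crediting one download to every co-author is only one of the many ways mentioned above.

```python
from collections import Counter


def downloads_per(records, field):
    """Count download records along one dimension, e.g. 'report'."""
    return Counter(record[field] for record in records)


def downloads_per_author(records, authors_of):
    """Credit each download to every registered author of the paper."""
    totals = Counter()
    for record in records:
        for author in authors_of.get(record["paper"], []):
            totals[author] += 1
    return totals


# Hypothetical download records and authorship data.
records = [
    {"paper": "p1", "report": "nep-mac", "issue": "2003-09-14", "downloaded": "2003-09-15"},
    {"paper": "p1", "report": "nep-fin", "issue": "2003-09-14", "downloaded": "2003-09-16"},
    {"paper": "p2", "report": "nep-mac", "issue": "2003-09-21", "downloaded": "2003-09-22"},
]
authors_of = {"p1": ["author-a", "author-b"], "p2": ["author-a"]}

by_report = downloads_per(records, "report")
by_author = downloads_per_author(records, authors_of)
print(by_report, by_author)
```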

Conclusion
NEP is an innovative digital library service.
– Model implementation
– Generates rich and interesting data if properly monitored.
Run by volunteers
– No requirement for funding to run.
– Technical infrastructure quite weak.
– Needs an investment in specific software.

Thank you for your attention!