1 Subject-specific international services in Physics
Eberhard R. Hilf, H. Stamerjohanns, and Thomas Severiens, Institute for Science Networking
3 September 2001, Duisburg, Germany
Workshop: International Interdisciplinary Open Archives and Subject-specific Services in Mathematics and Physics

2 E.R.Hilf, Institute for Science Networking, Germany: 3.9.2001 Duisburg
Content of talk:
I: Why subject-specific services?
II: Open Archive Distributed in Physics
III: International embedding and organization

3 Part I: Why subject-specific services?
Knowledge repository requirements
1. Restricted
2. Complete
3. Professional
4. Research-driven
5. Additional subject-specific services

4 1. Why restrict the knowledge base?
Higher ratio of relevant information retrieved
Less 'misunderstanding' [different meanings and content for the same word in different fields]
Search for "Ideal":
Altavista: no relevant hit in the first twenty
Google: no relevant hit in the first twenty
>Science>Math: one in five
PhysDoc (in title): third title relevant; with metadata: all relevant
Mpress (in title): only relevant documents
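The comparison on this slide is, in effect, a precision measurement over the first hits of each service. A minimal sketch of that measure follows; the function name `precision_at_k` and the relevance lists are illustrative assumptions, not data from the talk.

```python
def precision_at_k(relevance, k):
    """Fraction of the first k results that were judged relevant.

    `relevance` is a list of booleans in ranking order (hypothetical
    judgments made by a reader, not measured values).
    """
    top = relevance[:k]
    if not top:
        return 0.0
    return sum(1 for judged in top if judged) / len(top)


# Illustrative judgments only: a general engine with no relevant hit in
# the first twenty, and a subject service where every hit is relevant.
general_engine = [False] * 20
subject_service = [True] * 20

print(precision_at_k(general_engine, 20))
print(precision_at_k(subject_service, 20))
```

A restricted, subject-specific index raises this ratio simply because fewer off-topic documents compete for the top ranks.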

5 Machine-readable metadata
Tool for authors in MathNet and Phys-Net
Web form for adding metadata: MyMetaMaker
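As an illustration of what machine-readable metadata can look like, the sketch below renders Dublin Core elements as HTML meta tags. It is not the output format of MyMetaMaker; the function `dc_meta_tags` and the sample record are assumptions made for this example.

```python
from html import escape


def dc_meta_tags(record):
    """Render a dict of Dublin Core elements as HTML <meta> tags.

    Values may be a single string or a list of strings (repeated elements).
    """
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for element, values in record.items():
        if isinstance(values, str):
            values = [values]
        for value in values:
            lines.append(
                '<meta name="DC.%s" content="%s">'
                % (escape(element), escape(value, quote=True))
            )
    return "\n".join(lines)


# Hypothetical record, for illustration only.
print(dc_meta_tags({
    "title": "An example preprint",
    "creator": ["A. Author", "B. Author"],
    "subject": "Physics",
}))
```

Tags of this kind can be placed in the head of a department web page so that a gatherer can read title, creator and subject without guessing from the page text.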

6 Subject-specific Additional Information
Examination regulations
Teaching plans
Technical specifications for experiments

7 Problem of Interdisciplinarity
Documents are of interest only in border areas
Border areas are often the most active scientifically
Upgrade services in both fields
Add functionality to the services users are used to
Use the knowledge repository of both fields
A) Intellectual mapping of keywords failed [few usable docs, level mismatch]
B) Automated mapping: 17,000 INSPEC records with PACS and MSC. Statistical analysis, ranking, grammar truncation.
Workpackage 9 of CARMEN (BMBF): J. Pluemer et al. (Osnabrueck), Th. Severiens (ISN).
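The automated mapping in B) can be pictured as a co-occurrence count over records that carry both classification schemes. The sketch below is one simple way to do this and is not the CARMEN method itself; the function name, the ranking by conditional frequency, and the toy records are assumptions.

```python
from collections import Counter, defaultdict


def cooccurrence_mapping(records):
    """Rank MSC codes for each PACS code by how often they co-occur.

    `records` is an iterable of (pacs_codes, msc_codes) pairs, one per
    document. Returns {pacs: [(msc, share_of_pacs_records), ...]} with the
    most frequent MSC code first (ties broken alphabetically).
    """
    pair_counts = defaultdict(Counter)
    pacs_counts = Counter()
    for pacs_codes, msc_codes in records:
        for pacs in set(pacs_codes):
            pacs_counts[pacs] += 1
            for msc in set(msc_codes):
                pair_counts[pacs][msc] += 1
    mapping = {}
    for pacs, counter in pair_counts.items():
        ranked = sorted(counter.items(), key=lambda item: (-item[1], item[0]))
        mapping[pacs] = [(msc, n / pacs_counts[pacs]) for msc, n in ranked]
    return mapping


# Toy records, invented for illustration (not taken from INSPEC).
toy_records = [
    (["03.65"], ["81"]),
    (["03.65"], ["81", "82"]),
    (["05.20"], ["82"]),
]
print(cooccurrence_mapping(toy_records))
```

With enough doubly classified records, the top-ranked pairs give a candidate concordance that can then be checked by hand.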

8 Problem of Interdisciplinarity II

9 2. Why a complete repository?
Prime research needs
instant (Web, no delay) information on all relevant new results
complete information from anywhere in the world
One-stop service despite a multitude of distributed heterogeneous repositories.
Consequences for financing concepts

10 3. Why a professional content repository?
Researchers mostly need information from their professional colleagues.
Only in their own subject field can researchers act as referees and quality filters for the wider public, commenting and selecting.
The Web allows multilevel professional quality management for all heterogeneous purposes

11 4. Why a research-driven repository?
Authors have the highest motivation to be read and to get their documents distributed and archived.
Author communication communities are subject-specific.
Scientists understand only their subject colleagues
Research is most often organized in subject-specific topical institutes

12 Part II: Distributed Open Archive for Physics (OAD)
Vision of the ultimate subject-specific Open Archive
All departments worldwide as prime, complete, open, free repositories
Secondary virtual add-on services use these:
1. Quality filters
2. Collections
3. Topical archives

13 Present incomplete realization
All worldwide departments
Few cooperate by local quality filters yet
Few comply with metadata (1000 of 40000 documents)
Few give explicit open access (keep author's rights)
PhysNet

14 Completeness of data in a heterogeneous world
Free locally posted documents: PhysDoc
Free archived theses [Depts, Univs., DDB,..]
Free preprint repositories: ArXiv
Free fulltext journals
Free research lab docs: CERN, ANL,..
University Publishers
Journals of Natl. Societies: APS, IoPP
Commercial journals

15 OAD Physics Project 2001
Oai compliant service provider for
PhysDoc [1.000 out of 40.000]
ArXiv
IoPP [APS]
PhysDiss [European]
NDLTD [2001]
Cornell, CERN, MIT [Oai-compliant document providers]

16 Part III: Organizing international distributed repositories

17 Oai: Cooperation of repositories
Data providers comply with Oai:
Yes, if they are not service providers [Departments]
Yes, if they are free access providers [ArXiv]
Subtle, if national society publishers [APS, IoPP]
No, if commercial publishers [Elsevier,..]
Cut-throat competition of service providers with the best service for the same documents
Commercial publishers collect free access documents
Oai lists
scirus

18 Political and Funding Policy
Effective services for research
Money to libraries per number of accessible documents
Multiple access ways [TibOrder vs others]
Regulations for hiring scientists to universities
Funding self-organization of research communities
University publishers as regular prime research outlet
Fund IuK research to professionalize content search

19 Subject-specific National Port of Entry
German Physical Society DPG plan
Cooperative project of partners [FIZ, TIB, ISN; KFP]
Rescue boat syndrome?

20 International Networking
No-bias policy: no single society allowed to dominate
Funding policy: each society finds its own funds
Broker policy: 'democratic' network of brokers [DFN-Project]
Department cooperation:
1. Operator
2. Quality filters [select what to enter into PhysNet]
3. Metadata for documents
4. Home page for document lists
University publishers (vetting and archiving)
National entry points for Oai.
PhysNet Charter

22 Joint project VT-ISN
Funded under a new scheme jointly by NSF and DFG (German Science Foundation)
One application, one refereeing body, one funding scheme
Thus one team, one final intelligent online service suited to be adapted to any language and any field.
Started: 1 March 2001

23 Searching and retrieving in the past age
Libraries with different schemes (OPAC, PICA..)
Multiple publishers with a monopoly on content and different schemes
PhysDoc
arXiv
Inconvenience for the user!
SFX or MetaSearch
Cross Ref. Links
Multiple costs for providers

24 Activities in e-Archives
Universities / Univ. Libraries
–OPUS (Stuttgart...)
–Eldorado (Dortmund)
–e-Lib (Osnabrück)
–MILESS (Essen)
–COPACABANA (Oldenburg)...
Regional Bibliographic Utility Systems
–PiCarta GBV Göttingen
–BSZ-Media Server Baden-Württemberg
–DigiBib North Rhine-Westphalia...
National Projects
–GlobalInfo (BMBF) [Metadata rdf XML]
–DissOnline (DFG) [all fields]
–Virtual Subject-based Library (DFG)
Institution-owned publications are mostly:
Dissertations
Teaching
old digitized material

25 A discussion in the train
Scientist
Did not know what services we were deprived of
Librarian
Assumed to know what services are good for science
did ask the scientists: "What new services are needed?"
The young Elsevier......

26 Principles for Document Services for the Sciences
Must be scalable [1 Bill. Docs in Physics/a]
Distributed databases [author controlled]
Free distribution [exclusive author's right]
Worldwide accepted metadata standards [DC]
Free access to all research results [ownership]
Comply with needs of scientists
Competitive add-on services
To serve what customers want, not what they ask for.

27 PhysNet, a field-specific service
Headed by EPS, controlled by its Action Committee on Publication and Scientific Communication

28 The Concept of PhysNet
Crawl across all distributed Physics Departments
Same metadata as Math-Net [IMU, EPS]
Distributed gatherers [locally allow/deny !!]
Distributed brokers [no nation to dominate]
Agreements for an unbiased distributed system [Charter]
Distributed manpower [at present: 1 Mill. $/a]
Serve all types of information

29 Diagram labels: Departments worldwide, Physicists, Harvest gatherer, SOIF, DC, Harvest Broker
To cope with about 3000 distributed repositories
their local documents and document lists
numerous distrib. gatherers
numerous brokers
No central repository

30 PhysDep
Linklist + search engine, approved by National Societies
business model
administrational inform.
distributed gatherers: 26
search depth: 2-full
acceptance 500/day 400/day
PhysDoc
publications
distributed gatherers: 3
search depth: special

31 Present Status (April 2001)
About 40 local, regional, national gatherers
Brokers at US, DE, Russia, Hungary, France, UK, DK, India, Japan, Australia,.., EPS [DFN-Project]
39.000 documents and document lists
MyMetaMaker author tool to add DC:metadata [with Mathematics (IMU) and Physics (EPS)]
Distributed physicists'/institutes' homepages system with DC:metadata [jointly with Math.]
30.000 page impressions per month...
Online skim through

32 A field-specific professional service has to meet the expectation of a quality service:
The service should not contain everything but only material certified by physicists to be relevant and good physics.
Thus we need certification levels.
PhysNet has just one: what is on Physics Departments' webservers

33 Scholarly Publishing, Vetting and Peer Reviewing, Metadata, and Archiving in the past age
Diagram labels: Library, Publisher, PhysDoc, Physics Dep., arXiv, Peer Reviewing, Document + Metadata, National Library, Author
A lot of work for the author!
Exclusive rights for the publisher!
Some e-prints free for the community!
High prices for the library!

34 What refereeing do we need?
Instant publishing before refereeing
Time stamp for prime research before refereeing
Archiving of relevant information
Competitive (parallel) refereeing
Multilevel refereeing
Full information published to be fair to referees
Open refereeing [signed annotation instead of advice]
Voluntary refereeing to be a pleasure for referees

35 The role of University Libraries
Be Oai-Database Provider of complete local information
Assure free full-text access to all research material
Assure correct metadata usage (by training or adding)
Do handshake with National Archives
Be Oai-Service Provider of complete local information
Vetting system with the local department scientists
Train users to pick from the multitude of Oai-Service Providers

36 Scenario for Tomorrow: OAi Data and Service Providers including Vetting to Peer Reviewing
Diagram labels: Library, Physics Dep., Group heads, Referees of other Univ., Referees of Learned Societies, Document, Metadata, National Library, Archiving, Author, Documents, Reviews, Metadata, Service X, arXiv, PhysDoc, Data Provider, Service Provider, Multi-level Peer Rev.

37 Vetting and reviewing at German Universities
Cooperation of universities in North Germany (Hamburg, Oldenburg, Bremen, Kiel, Rostock, Greifswald):
in evaluation of online teaching and research
in usage and production of multimedia
in e-publishing and establishing a joint university press for e-publications
–pilot project of Hamburg + Oldenburg
–local vetting with department scientists and library
–peer reviewing between different universities
–shared functions (work flow system, marketing...)
–separate functions (business model, financing...)
Virtual university press of an open and growing number of online and peer-reviewed university presses

38 The concept of the Open Archive Initiative OAi
Three layers:
1. Discussion (workshops, meetings,..)
2. Concept (free access, a multitude of data providers and service providers, but one standard to be accepted internationally)
3. Software and workforce sharing.

39 PhysNet, MareNet, PhysDiss and Math-Net have complied with the concept of the Open Archive Initiative right from their beginning in 1995/6

40 A success story: Dissertations Online in Germany
Workflow and metadata from author to department, library, national archive
All fields, all universities
One scheme for DC metadata
Local archives, national providing
Formal rules for all.
Online skim through

41 Retrieval interface TheO
–using Dublin Core Set for Theses and Diss.
–Work of Bahne, Törner, Schwänzl, Plümer,..

42 The role of Service Providers

43 Scenario of Tomorrow: Types of Searching and Retrieving offers
Competition by
-quality of add-ons
-level of refereeing
-quality of contents
-specialization
-depth of search
-size
-comfort of retrieval
-level of integration
-local focus
-...
Diagram labels: Publisher Servers, PhysDoc, arXiv, University Libraries, Servers (Google,..)

44 Implementing OAI at German Universities
DINI:
–German Initiative for Networked Information
–carries out guidance for implementations all over Germany
–develops a strategy to cover German universities (libraries with document servers)
Aim:
–Serving a distributed archive network
–Setting up a contact point for OAI in Germany

45 PhysNet as Oai Data Provider and Server

46 Diagram labels: Any Oai Data Provider, Harvest gatherer, DC-converter, SQL DB: MySQL, OAI-Data Provider, OAI-Harvester, OAi Broker, Any Oai Service Provider

47 DataProvider Implementation: 8 March 2001
Service Provider Implementation: 13 April 2001, 11.30 am VT-time
Skim through

48 Collections to be Represented in Oai-PhysDoc
PhysDoc:
–Distributed document database for Physics worldwide
–using HARVEST as retrieval mechanism
University document servers
North German Univ. superstructure
Physics part
Physics part of NDLTD
Arxiv, MIT,..
Physics part

49 OAI Identify
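For readers unfamiliar with the protocol, an Identify request is an ordinary HTTP request with a `verb` query parameter. The sketch below only builds such request URLs; the helper name `oai_request_url` and the base URL are placeholders, not the address of the PhysNet data provider.

```python
from urllib.parse import urlencode


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI request URL: base URL plus verb and optional arguments."""
    query = {"verb": verb}
    query.update(arguments)
    return base_url + "?" + urlencode(query)


# Placeholder base URL, for illustration only.
BASE = "http://example.org/oai"

print(oai_request_url(BASE, "Identify"))
print(oai_request_url(BASE, "ListRecords", metadataPrefix="oai_dc"))
```

A service provider would fetch the first URL to learn about a repository and then page through records with the second kind of request.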

50 OAI Implementation
modified HARVEST holds SOIF and DC metadata in local text files
storage size no problem
decision to convert data offline and store structured data in SQL database (mysql)
use DC when possible, otherwise map SOIF to DC
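The offline conversion step can be sketched as follows: read one SOIF object from text and store its attributes as rows in an SQL table. This is a simplified illustration, not the PhysNet code (which, according to the talk, is written in PHP against MySQL); it uses Python's built-in sqlite3 so that it runs without a server, treats value sizes as character counts, and the table layout and sample record are assumptions.

```python
import re
import sqlite3

# "@TYPE { URL" header line and "Name{size}:<tab>" attribute prefix.
HEADER = re.compile(r"@(\w+) \{ (\S+)\n")
ATTR = re.compile(r"([\w.-]+)\{(\d+)\}:\t")


def parse_soif(text):
    """Parse one SOIF object into (url, {attribute: value})."""
    head = HEADER.match(text)
    if head is None:
        raise ValueError("not a SOIF object")
    url, pos, attrs = head.group(2), head.end(), {}
    while True:
        match = ATTR.match(text, pos)
        if match is None:
            break
        size = int(match.group(2))
        attrs[match.group(1).lower()] = text[match.end():match.end() + size]
        pos = match.end() + size + 1  # skip the value and its newline
    return url, attrs


def store(conn, url, attrs):
    """Store each attribute as one (url, element, value) row."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metadata (url TEXT, element TEXT, value TEXT)"
    )
    conn.executemany(
        "INSERT INTO metadata VALUES (?, ?, ?)",
        [(url, name, value) for name, value in attrs.items()],
    )


# Invented sample record, for illustration only.
sample = (
    "@FILE { http://www.example.org/paper.html\n"
    "Title{13}:\tA test report\n"
    "Author{6}:\tA. Doe\n"
    "}\n"
)
conn = sqlite3.connect(":memory:")
store(conn, *parse_soif(sample))
print(conn.execute("SELECT element, value FROM metadata").fetchall())
```

Keeping the converted records in a database means the OAI server can answer requests with plain SQL queries instead of re-reading the gatherer's text files.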

51 OAI Implementation
Diagram labels: documents, HARVEST, SQL DB, normalize metadata, OAI Server

52 OAI Implementation
software written in PHP
protocol
–easy because it uses modified implementation of HU Berlin
metadata converter
–maps SOIF to DC
–converts different DC representations to one common one
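The two tasks of the metadata converter can be illustrated with a small sketch: attribute names that are already Dublin Core in some spelling are reduced to one common form, and remaining SOIF attributes are mapped through a lookup table. The mapping table and the function names are assumptions for this example (and the sketch is in Python, whereas the talk's implementation is in PHP).

```python
# Hypothetical mapping from SOIF attribute names to Dublin Core elements.
SOIF_TO_DC = {
    "title": "title",
    "author": "creator",
    "keywords": "subject",
    "abstract": "description",
}


def normalize_element(name):
    """Return the plain DC element for an attribute name, or None.

    Spellings such as 'DC.Title', 'dc:title' or 'DC.Creator.PersonalName'
    are reduced to 'title' / 'creator'; other names go through SOIF_TO_DC.
    """
    lowered = name.strip().lower()
    for prefix in ("dc.", "dc:"):
        if lowered.startswith(prefix):
            return lowered[len(prefix):].split(".")[0]
    return SOIF_TO_DC.get(lowered)


def to_dublin_core(attrs):
    """Convert {attribute: value} into {dc_element: [values]}."""
    record = {}
    for name, value in attrs.items():
        element = normalize_element(name)
        if element is not None:
            record.setdefault(element, []).append(value)
    return record


print(to_dublin_core({
    "DC.Title": "An example",
    "dc:creator": "A. Author",
    "Author": "B. Author",
    "Type": "HTML",  # no DC equivalent in the table above, so it is dropped
}))
```

Using DC where the author supplied it and falling back to the mapping otherwise follows the rule stated in the talk: use DC when possible, otherwise map SOIF to DC.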

53 Future work
improve metadata converter
–improve summarizers
–closer look at different DC representations
tell people to use metadata
–OAI workshops
ease production of metadata

54 DC-Standards and Sets of OAi
German National Library DC7
TheO-Duisburg
OCLC-NDLTD
Math-Net Worldwide: Int.Math.Union
Phys-Net Worldwide: EPS
Differences: Html 2, 4; XML, rdf

55 Advantages of the new Scenario
Less work for the author!
Non-exclusive rights for the publisher!
Most e-prints free for the community!
Lower costs for the library!
Value-added services by different providers!
Easy integration of metadata into existing services!
Open multi-level peer reviewing!
Fewer printed journals but more accessible e-publications!
Immediate publication!
OAi = "Napster for the Sciences" (Richard Sietmann in: c't 6/2001, p. 78)

56 Diagram labels: Departments worldwide, Physicists, Harvest gatherer, SOIF, DC, DC-converter, SQL DB: MySQL, Marian for ranking, OAI-Data Provider, OAI-Harvester, OAi Broker, Harvest Broker, Marian Bypass

57 National activities to support the OAi
DINI German Initiative for Networked Information
–similar to CNI
–Cooperation between Research Libraries (DBV), Computer Centres (ZKI), Media Centres (AMH), Initiative of Learned Societies IuK
–DINI's Appeal to join the OAi (2000)
–Training camps for German Oai-Data Providers

58 Next steps I
German National Library DDB set: write MMM and import (all fields)
VT ETD-Metadata: write MMM and import
Use set for import/export VT-ISN
Good for increasing number of documents
Increase acceptance
Prove VT-ISN

59 Next steps II
Benefit from research for added intelligence at VT for MARIAN broker
Branching queries to VT and import answers
Install modules
Go towards one joint broker

60 Thanks for the invitation
Virginia Tech group of E. Fox: leading in development of online digital library concepts for any learned field and language
ISN complements in developing online services for a specific field (Physics) and just one single language (broken English).

61 Next steps III
MARIAN Bypasses for ISN-PhysDoc:
Intelligence for the Oai-Dataprovider
Intelligence for the Oai-Serviceprovider

62 Worked on add-on services
Learning personalized search engine
Learning personalized browser engine
Diagram labels: query, User and his institute, server, DB of information, link stars, browsing

63 VT-ISN steps IV
Joint representation of DFG-NSF project
One server, one address, one project, one crew..

64 Next steps V
Share user statistics and evaluation
What do physicists want? (In a new surrounding)
Experiments and evaluation instead of questionnaires

65 Acceptance of a service:
1. Bottom up: just do it and spread the rumour
2. Top down: Charter of IMU, EPS
3. To register: Institutions, Departments, Graduate Schools, Universities
4. Joint international standards and cooperation
5. Work sharing (infinite workforce)
6. Professionalism:
let the scientists provide content
let the libraries and computer centers provide service
let the administration levels find a way to assure quality selection for the worldwide services.

