RMLL visits at CERN – July 2012

4 What is it used for?
‣ Depositing
‣ Archiving
‣ Organizing
‣ Disseminating
‣ Any type of document
‣ ~350GB of PDFs at CERN
‣ ~20TB of images and videos
‣ 1M records

5 What is Invenio?
‣ Integrated digital library / repository software
‣ A platform of choice for managing documents in HEP
‣ Also adopted in other fields (medium to big repositories)
‣ Web application
‣ Open-source GPL-2 project
‣ LAMP stack: Python (mostly), MySQL and Apache
‣ Based on open standards: MARCXML, OAI-PMH, OpenURL, OpenSearch, etc.
‣ Flexible, scriptable
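To make the "open standards" point concrete, here is a minimal sketch of reading one MARCXML record with the Python standard library. It is not Invenio's own code: the sample record, the helper names `get_subfield` and `parse_record`, and the choice of fields are illustrative assumptions. Only the MARC21 slim namespace and the usual meaning of tags 001 (control number), 100 (main author) and 245 (title) come from the MARCXML standard.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the MARC21 "slim" XML schema.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical record, made up for this example.
SAMPLE_RECORD = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <controlfield tag="001">12345</controlfield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="a">Doe, J.</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">An example preprint title</subfield>
  </datafield>
</record>"""


def get_subfield(record, tag, code):
    """Return the text of the first matching subfield, or None if absent."""
    path = "marc:datafield[@tag='%s']/marc:subfield[@code='%s']" % (tag, code)
    node = record.find(path, MARC_NS)
    return node.text if node is not None else None


def parse_record(xml_text):
    """Extract a few common fields from a single MARCXML record."""
    record = ET.fromstring(xml_text)
    recid = record.find("marc:controlfield[@tag='001']", MARC_NS)
    return {
        "id": recid.text if recid is not None else None,
        "author": get_subfield(record, "100", "a"),
        "title": get_subfield(record, "245", "a"),
    }


if __name__ == "__main__":
    print(parse_record(SAMPLE_RECORD))
```

Because the record format is a published standard, the same few lines would work on records exported by any MARCXML-producing repository, which is the practical benefit the slide alludes to.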


7 Invenio’s gears
‣ Lots of Python, with a sprinkle of C and Lisp(!)
‣ 630K lines of Python code
‣ MySQL ISAM for storing data
‣ Native indexing engine
‣ Apache + mod_wsgi + mod_xsendfile

8 Invenio’s History
‣ 1954: CERN library starts paper dissemination of preprints (early Open Access)
‣ 1965: First computers at CERN library to help with cataloging
‣ 1990: Electronic distribution of preprints via FTP
‣ 1993: CERN Preprint Server, web front-end of the electronic preprint catalogue. Institutional repository
‣ 1996: CERN Library Server (weblib): added books, periodicals and "other material"
‣ 2000: CERN Document Server: multimedia material, internal notes
‣ 2002: First public release of the software under GNU GPL. Worldwide installations and collaborations

9 Open Access at CERN
“Consistent with the stated position of the Collaborations and the General Conditions applicable to Experiments at CERN, every effort will be made to publish papers under Open Access conditions, as defined by the SCOAP3 initiative. As at the date of this document, the Creative Commons Attribution ("cc by") license meets these conditions.”
OA at CERN has a long history: the CERN Convention of 1953 states that "...the results of its experimental and theoretical work shall be published or otherwise made generally available".

10 Our Development Environment
‣ Git distributed version control system
‣ Trac for ticket tracking
‣ VirtualBox + Vagrant for testing deployment
‣ We develop on SLC5/6 (based on RHEL5/6), on Ubuntu, on Debian…

11 Quality Assurance
‣ Coding standards: e.g. PEP8 (Style Guide for Python), etc.
‣ Documentation: "If the code and the comments disagree, then both are probably wrong." (attributed to Norm Schryer)
‣ Test suite: ~1,000 unit/regression/web tests
‣ Security: XSS, CSRF, SQL injection, etc.
‣ Code review
‣ Kwalitee check: "measuring" quality. "It looks like quality, it sounds like quality, but it’s not quite quality." (CPAN Testing Service, quoting Michael Schwern)
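As an illustration of the security items above, the sketch below shows two generic defenses: bound query parameters against SQL injection and output escaping against XSS. It is not taken from Invenio. The table layout and the helpers `find_titles` and `render_title` are hypothetical, and the standard-library `sqlite3` module stands in for MySQL so that the example runs without a database server.

```python
import html
import sqlite3


def find_titles(conn, author):
    """Look up titles by author using a bound parameter.

    The user-supplied value is passed separately from the SQL text,
    so it cannot change the structure of the query.
    """
    cur = conn.execute("SELECT title FROM records WHERE author = ?", (author,))
    return [row[0] for row in cur.fetchall()]


def render_title(title):
    """Escape a stored value before placing it into HTML output."""
    return "<h1>%s</h1>" % html.escape(title)


# In-memory database with one hypothetical record whose title contains markup.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (author TEXT, title TEXT)")
conn.execute(
    "INSERT INTO records VALUES (?, ?)",
    ("Doe, J.", "<script>alert(1)</script>"),
)

if __name__ == "__main__":
    # A classic injection attempt is treated as a plain (non-matching) string.
    print(find_titles(conn, "x' OR '1'='1"))
    # The stored markup is rendered inert.
    print(render_title(find_titles(conn, "Doe, J.")[0]))
```

Checks of this kind are natural candidates for the regression test suite, since each one can be pinned down with a single assertion.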

12 Our community
‣ 30 institutions worldwide: CERN + DESY + Fermilab + SLAC, EPFL, …
‣ ADS and arXiv joining forces
‣ Translated so far into 26 languages
‣ 45 committers (in the last year)
‣ Free + paid support

13 An example installation
‣ 1 load balancer (HAProxy + Apache mod_proxy + mod_evasive)
‣ 5 worker nodes: 2 VMs for static files, 3 real machines for Python-handled requests
‣ 2 DB nodes (MySQL master + MySQL replica)
‣ AFS distributed FS for backups and file storage
‣ Sustained the recent Higgs announcement load (230 requests per second, with peaks of 800 req/s)

14 What’s next?
‣ Werkzeug/Flask + Jinja2 + WTForms for the web framework
‣ SQLAlchemy for DB abstraction
‣ Twitter Bootstrap + jQuery for the style
‣ Optional Solr indexing


16 Outline
‣ History and Features
‣ Technologies
‣ Development

17 What is Indico?
‣ Web-based event organization
‣ Archive of event metadata and related documents (minutes, slides, etc.)
‣ Booking service and collaboration hub: rooms, videoconference, webcast

18 What is Indico?
‣ Started as a European project in 2002
‣ First used in 2004
‣ In production at CERN
‣ And in >100 institutions around the world: GSI, DESY, Fermilab, …
‣ Free and open source

19 Indico @ CERN
‣ > 170,000 events
‣ > 700,000 presentations
‣ > 900,000 files

20 Event Management with Indico All kinds of events

21 Managing Simple Events

22 Managing Meetings

23 Managing Conferences

24 Full Lifecycle

25 Managing Conferences

26 Collaboration Hub Room Booking

27 Collaboration Hub
‣ Collaboration service requests: videoconference, webcast, recording

28 Technology
‣ Python >2.6 + WSGI
‣ Libraries: babel, webassets, pytz, zope.index, zope.interface, simplejson, suds, lxml, zc.queue, python-dateutil, pypdf, pyatom, reportlab, etc.
‣ Mako 0.4.1+ as template engine
‣ ZODB as underlying database
‣ Web frameworks: jQuery, Backbone.js
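For readers unfamiliar with WSGI, the interface named in the first bullet, here is a minimal sketch of a WSGI application and a way to call it without starting a server. It is not Indico code: `application` and `call_app` are hypothetical names, the example is written for Python 3 rather than the Python 2 line the slide refers to, and only the standard-library `wsgiref` package is used.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A WSGI app: a callable taking the request environ and a start_response hook."""
    body = ("Hello from %s\n" % environ.get("PATH_INFO", "/")).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    # The response body is an iterable of byte strings.
    return [body]


def call_app(app, path):
    """Invoke a WSGI app in-process and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the keys required by the WSGI spec
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    print(call_app(application, "/event/1"))
```

Any WSGI-capable server, such as Apache with mod_wsgi, can host a callable of this shape, which is what lets an application stay independent of the web server in front of it.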

29 Infrastructure

30 Compatibility
‣ Compatible with many browsers: IE8+, FF3.6+, Google Chrome, Safari, etc.
‣ Mobile version in progress

31 Development Tools
‣ Git as version control system
‣ Eclipse + PyDev
‣ Unit and Selenium tests + Jenkins (continuous integration server)
‣ Sphinx for documentation
‣ Trac as project site
‣ GitHub
‣ Transifex for i18n: /indico/

32 What’s Next?
‣ Enhance the software: v1.0 end of 2012
‣ Enlarge the community: more advertising

33 What’s Next?
‣ Build on community adoption
‣ Indic8r will be an Indico content aggregator

34 Questions?


