Published by Nigel Donald Bond. Modified over 8 years ago.
Dr Tim Smith, CERN/IT
For the visit of the Alliance of German Science Organizations
[Oct 2013] - 2 As Designed: Worldwide LHC Computing Grid
Distributed Data Management
– Limited network resources
– Optimize / minimize movement
– File placement logic
– Deterministic / static
Site Data Management
– HSMs
– Transparent file access and movement
– Disk-tape migration/recall
[Oct 2013] - 3 Research Data Infrastructure of today
Distributed Data Management
– Network: a resource to schedule
– Dynamic data placement
– Data transfer services
– Experiment replica management rules
Site Data Management
– Independent technology choices
– Decoupled tiers
– Disk caches managed by owners
– Bulk 3rd party migration to tertiary storage by owners
AAA: any data, any time, anywhere
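The "experiment replica management rules" bullet can be pictured with a small sketch. This is only an illustration under stated assumptions: the `Site` and `ReplicaRule` types, their fields, the site names and the "most free space first" policy are all hypothetical and are not taken from the slides or from any real grid middleware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str
    tier: int        # 0, 1 or 2 (hypothetical field)
    free_tb: float   # free disk space in TB (illustrative value)


@dataclass(frozen=True)
class ReplicaRule:
    dataset: str
    copies: int      # how many replicas the experiment asks for
    tier: int        # which tier should hold them


def place_replicas(rule, sites, size_tb):
    """Pick sites for a rule: matching tier, enough space, most free space first."""
    candidates = [s for s in sites if s.tier == rule.tier and s.free_tb >= size_tb]
    # Sort by free space (descending), then by name so ties are deterministic.
    candidates.sort(key=lambda s: (-s.free_tb, s.name))
    if len(candidates) < rule.copies:
        raise ValueError(f"only {len(candidates)} eligible sites for {rule.dataset}")
    return [s.name for s in candidates[: rule.copies]]


# Made-up sites and rule, for illustration only.
sites = [
    Site("site-a", 1, 50.0),
    Site("site-b", 1, 200.0),
    Site("site-c", 2, 500.0),
    Site("site-d", 1, 5.0),
]
rule = ReplicaRule(dataset="example-dataset", copies=2, tier=1)
chosen = place_replicas(rule, sites, size_tb=10.0)
print(chosen)
```

The point of the sketch is the contrast with the static design on the previous slide: placement is computed from a declared rule and the current state of the sites rather than fixed in advance.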
[Oct 2013] - 4 CERN Infrastructure of tomorrow
Connectivity (100 Gbps)
2015: 15k servers, 300k VMs
[Oct 2013] - 5 Big Data … in small pieces
Long tail of science
Big facilities
Data Size x (a small number) x (a large number)
Dedicated Big Data Stores
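One reading of this slide is that a few big facilities with very large datasets and a long tail of many small datasets can add up to comparable totals. The arithmetic can be sketched as below; every number is made up purely for illustration and none comes from the talk.

```python
def total_volume_gb(size_gb: int, count: int) -> int:
    """Total volume = size of one dataset x number of datasets."""
    return size_gb * count


# Hypothetical figures: large data size x a small number of producers ...
big_facilities = total_volume_gb(size_gb=1_000_000, count=4)
# ... versus small data size x a large number of producers.
long_tail = total_volume_gb(size_gb=2, count=2_000_000)

print(big_facilities, long_tail)
```

With these assumed values both products come to the same total, which is the sense in which the long tail is "big data in small pieces".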
[Oct 2013] - 6 http://zenodo.org
[Oct 2013] - 7 Naming
Zenodotus of Ephesus
– First librarian of the Ancient Library of Alexandria
– First recorded use of metadata
[Oct 2013] - 8 Features
http://www.altmetric.com
http://www.datacite.org
http://www.openaire.eu
[Oct 2013] - 9 Communities
[Oct 2013] - 10 Deposit http://www.dropbox.com
[Oct 2013] - 11 HEP: Data Reduction / Analysis
Data stages: Publication, Reduced, Reconstructed, Raw
Researchers: T2s, T1s
Analysis Coordinators: T1s
Production Managers: T0, T1s
Axes: File Size, # Files
[Oct 2013] - 12 HEP: More than Data
Papers, Tabular Data, Correlation Matrices, Internal Notes, Wikis, Presentations
Quality monitoring data, Filter / selection algorithms, Formatters
Calibration Data, Conditions Data, Log Books
Researchers: T2s, T1s
Analysis Coordinators: T1s
Production Managers: T0, T1s
Workflows
Contextual metadata
SW: 10M LoC
[Oct 2013] - 13 Deposit
[Oct 2013] - 14 Differentiating Features
Easy to use and attractive
– DropBox integration
– Drag-n-drop deposition
Low barriers
– Little fixed metadata
Open on input as well as output
– No restrictions on type of data
– No restrictions on format of data
– No restrictions on licences
Distributed community curation
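The "low barriers" and "open on input" points can be sketched as a deposit check that asks for only a handful of fields and never inspects the file type, format or licence. This is a minimal illustration: the field names in `REQUIRED_FIELDS` and the `accept_deposit` helper are hypothetical and are not Zenodo's actual schema or API.

```python
# Hypothetical minimal set of fixed metadata; not taken from the slides.
REQUIRED_FIELDS = ("title", "creators", "description")


def missing_fields(metadata: dict) -> list:
    """Return the required fields that are absent or empty."""
    return [field for field in REQUIRED_FIELDS if not metadata.get(field)]


def accept_deposit(metadata: dict, filename: str) -> dict:
    """Accept any file; only the few fixed metadata fields are checked."""
    missing = missing_fields(metadata)
    if missing:
        raise ValueError(f"missing metadata: {', '.join(missing)}")
    # The filename extension and any licence value are deliberately not
    # validated: no restriction on type, format or licence.
    return {"file": filename, **metadata}


record = accept_deposit(
    {
        "title": "Example dataset",
        "creators": ["A. Researcher"],
        "description": "Illustrative record only",
        "licence": "any-licence-string",
    },
    "measurements.xyz",
)
print(record["file"])
```

Everything beyond the small required set passes through untouched, which is the design choice the slide describes: a low fixed cost for the depositor, with richer curation left to the communities.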
[Oct 2013] - 15 Retro/Per-spective
OpenAIRE
– FP7 Open Access pilot for peer reviewed articles
OpenAIREplus
– FP7 OA pilot for publications and research data
CERN
– Cloud Service
[Oct 2013] - 16 Interested Communities
Workshops
– Proceedings and presentations
Projects
– Research output and project artifacts
Research Groups
– Datasets
– Snapshots of a live store
Universities
– Datasets and articles
Libraries
– Newsletters
– Data not fitting in traditional repositories
Publishers
– Publication/subsidiary datasets and software
– Scanned and annotated logbooks
Young Radiation Oncologists’ Conference
[Oct 2013] - 17 Perceived Attraction
Trust / Security / Know-how
– LHC data is thought safe there
– Bit Preservation & Media Migration
Longevity
– An institute with a clear future
– A memory institution for HEP
Not a company
Not a profit enterprise
– No tricks and changes
[Oct 2013] - 18
http://zenodo.org
www.openaire.eu
@zenodo_org
@openaire_eu
lars.holm.nielsen@cern.ch