The Helmholtz Association Project „Large Scale Data Management and Analysis“ (LSDMA) Kilian Schwarz, GSI; Christopher Jung, KIT.


Overview
- Motivation
- Data Life Cycle
- LSDMA's dual approach
- Facts and Numbers
- Initial Communities
- LSDMA, FAIR and ALICE

Why is Scientific Big Data important?
Honestly, I do not need to explain this to you.

Examples of Scientific Big Data in non-HEP
Examples of sciences with Big Data:
- Systems Biology: ~10 TB per day in high-throughput microscopy (zebrafish embryos; see the quick conversion below)
- Climate simulation: PB per year
- Brain research: 1 PB per year for brain mapping
- Photon Science: XFEL, 10 PB/year
- and many other sciences which do not yet know their needs
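For scale, a quick back-of-the-envelope conversion turns the microscopy rate into an annual volume; the assumption of year-round operation is mine, not a figure from the slide:

```python
# Back-of-the-envelope: convert a daily data rate into an annual data volume.
# Assumes continuous operation on 365 days per year (an upper bound).
daily_rate_tb = 10                              # ~10 TB/day, high-throughput microscopy
annual_volume_pb = daily_rate_tb * 365 / 1000   # 1 PB = 1000 TB
print(annual_volume_pb, "PB per year")          # -> 3.65 PB per year
```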

Challenges of Big Data
- Non-reproducibility of scientific data (or reproducible only at high cost)
- Current analysis methods scale poorly
- Existing big data knowledge in the respective fields
- Each discipline has its specific needs
- Multidisciplinary research
- Metadata
- Authentication and authorization (single sign-on)
- Data privacy (incl. removal of private data)
- "Good scientific practice"
- Cost estimation for long-term archival (at different service levels)
- Data preservation
- Open Access
- …

Data Life Cycle
Inspiration for LSDMA: support the whole data life cycle!
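To illustrate what "the whole data life cycle" spans, here is a minimal sketch of a dataset record moving through life-cycle stages. The stage names, the Dataset class and the example dataset name are illustrative assumptions, not part of LSDMA's actual services:

```python
from dataclasses import dataclass, field

# Hypothetical life-cycle stages; the breakdown actually used in LSDMA may differ.
STAGES = ["acquisition", "ingest", "analysis", "archival", "access"]

@dataclass
class Dataset:
    """Toy record that tracks a dataset as it moves through its life cycle."""
    name: str
    metadata: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def advance(self, stage: str, **info):
        """Record that the dataset reached a stage and attach stage-specific metadata."""
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage}")
        self.metadata.update(info)
        self.history.append(stage)

# Example: a (hypothetical) microscopy dataset moving through three stages.
ds = Dataset("zebrafish-embryo-run-042")
ds.advance("acquisition", instrument="high-throughput microscope", size_tb=10)
ds.advance("ingest", storage="disk pool")
ds.advance("archival", tape_copies=2)
print(ds.history)   # ['acquisition', 'ingest', 'archival']
```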

Dual approach: community-specific and generic
Data Life Cycle Labs (DLCLs): joint R&D with the scientific user communities
- Optimization of the data life cycle
- Community-specific data analysis tools and services
Data Services Integration Team (DSIT): generic R&D
- Interface between federated data infrastructures and the DLCLs/communities
- Integration of data services into the scientific working process

Facts and numbers
- Initial project period: funded by the Helmholtz Association (13 MEUR for 5 years)
- To become part of the sustainable programme-oriented funding of the Helmholtz Association in 2015
- Partners: 4 Helmholtz research centers, 6 universities and the German climate research center
- Leading project partner: KIT

Initial communities
- Energy: smart grids, battery research, fusion research
- Earth and Environment: climate models, environmental satellite data
- Health: virtual human brain map
- Key Technologies: synchrotron radiation, nanoscopy, systems biology, electron-microscopical imaging techniques
- Structure of Matter: Photon Science with PETRA III and XFEL (14 experiments with big and small communities)

LHC Computing – Prototype for FAIR
- FAIR profits from the computing experience of an already running experiment
- ALICE can test new developments for FAIR
- New FAIR developments are on the way, and to some extent they already go back to ALICE
- FAIR will play an increasing role (funding, network architecture, software development and more...)

Goals for GSI/FAIR in LSDMA
To be developed within LSDMA (DLCL: Structure of Matter) in collaboration with LSDMA-DSIT, the FAIR community, and ALICE (wherever synergy can be found):
- Parallel and distributed computing
  - triggerless "online" system: porting of the needed algorithms to GPUs
  - Grid/Cloud infrastructure: enable the submission of compute jobs to clouds
  - create interfaces to existing environments (AliEn, ...)
- Data archives
  - long-term data archives, including concepts for xrootd and gStore
  - metadata catalog and data analysis
- Metropolitan Area Systems
  - include the distributed FAIR T0/T1 centre into a global Grid/Cloud infrastructure
  - Federated Identity Management
- Global Federations
  - Global File System
  - Optimization of Data Storage: hot versus cold data (see the sketch below), corrupt and incomplete data sets, parallel storage, 3rd-party copy
- Additional synergies via DSIT
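To make the "hot versus cold data" item above concrete, here is a minimal sketch of such a tiering decision based on last access time. The 90-day threshold, the tier names and the overall policy are purely illustrative assumptions, not an existing GSI/LSDMA tool:

```python
import os
import time

# Illustrative policy (assumption): files untouched for 90 days count as "cold".
HOT_THRESHOLD_DAYS = 90
DISK_POOL = "xrootd-disk-pool"        # hypothetical name for the fast disk tier
TAPE_ARCHIVE = "gstore-tape-archive"  # hypothetical name for the tape tier

def classify(path: str) -> str:
    """Return 'hot' or 'cold' based on the file's last access time."""
    age_days = (time.time() - os.path.getatime(path)) / 86400
    return "hot" if age_days < HOT_THRESHOLD_DAYS else "cold"

def plan_migration(paths):
    """Yield (path, target tier) pairs: hot data stays on disk, cold data moves to tape."""
    for p in paths:
        yield p, DISK_POOL if classify(p) == "hot" else TAPE_ARCHIVE

# Example usage:
# for path, target in plan_migration(["/data/run042/raw.root"]):
#     print(path, "->", target)
```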

Next Steps at GSI
- Advertise LSDMA positions (2 for the FAIR DLCL) – do you know candidates?
  - GSI DSIT already started to hire people
- Discussion with the FAIR experiments and ALICE
- Set-up of e-science infrastructures, first for PANDA and CBM, based on the experiences with ALICE (AliEn/xrootd/...)
- Include smaller FAIR experiments
- Continue to develop the existing e-science infrastructure, also in close collaboration with DSIT and ALICE

Summary and Outlook
- There are many challenges in Scientific Big Data
- LSDMA is a sustainable Helmholtz Association project supporting the whole data life cycle with both a community-specific and a generic approach
- FAIR is an important initial community in the research field 'Structure of Matter'; several developments are planned -> synergies with ALICE
- GSI has two open job positions for LSDMA