The Helmholtz Association Project „Large Scale Data Management and Analysis“ (LSDMA) Kilian Schwarz, GSI; Christopher Jung, KIT.

1 The Helmholtz Association Project „Large Scale Data Management and Analysis“ (LSDMA) Kilian Schwarz, GSI; Christopher Jung, KIT

2 (05.10.2012, Christopher Jung, SCC, KIT) Overview: Motivation; Data Life Cycle; LSDMA’s dual approach; Facts and Numbers; Initial Communities; LSDMA, FAIR and ALICE

3 Why is Scientific Big Data important? Honestly, I do not need to explain this to you.

4 Examples of Scientific Big Data in non-HEP fields. Sciences with Big Data: Systems Biology: ~10 TB per day in high-throughput microscopy (zebrafish embryos); Climate simulation: 10-100 PB per year; Brain research: 1 PB per year for brain mapping; Photon Science: XFEL, 10 PB per year; and many other sciences which do not yet know their needs.
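The figures on slide 4 mix per-day and per-year units, which makes them hard to compare at a glance. The following is a minimal sketch that puts them on a common scale. It assumes decimal units (1 PB = 1000 TB) and continuous data taking on 365 days per year; neither assumption is stated on the slide, and the function names are illustrative only.

```python
# Assumption: decimal units, 1 PB = 1000 TB (not stated on the slide).
TB_PER_PB = 1000


def tb_per_day_to_pb_per_year(tb_per_day, days=365):
    """Convert a daily rate in TB to a yearly volume in PB.

    Assumes data is produced on every one of `days` days per year.
    """
    return tb_per_day * days / TB_PER_PB


def pb_per_year_to_tb_per_day(pb_per_year, days=365):
    """Convert a yearly volume in PB to an average daily rate in TB."""
    return pb_per_year * TB_PER_PB / days


# Figures taken from slide 4; only the microscopy rate needs converting.
rates_pb_per_year = {
    "Systems biology (microscopy)": tb_per_day_to_pb_per_year(10),
    "Climate simulation (lower bound)": 10,
    "Climate simulation (upper bound)": 100,
    "Brain mapping": 1,
    "Photon science (XFEL)": 10,
}

for name, pb in rates_pb_per_year.items():
    tb_day = pb_per_year_to_tb_per_day(pb)
    print(f"{name}: {pb:.2f} PB/year, about {tb_day:.1f} TB/day")
```

Under these assumptions, 10 TB per day of microscopy data corresponds to 3.65 PB per year, which is the same order of magnitude as the XFEL and lower-bound climate figures.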

5 Challenges of Big Data: Non-reproducibility of scientific data (or reproducibility only at high cost); Current analysis methods scale poorly; Existing big data knowledge in the respective fields; Each discipline has its specific needs; Multidisciplinary research; Metadata; Authentication and authorization (single sign-on); Data privacy (incl. removal of private data); “Good scientific practice”; Cost estimation for long-term archival (at different service levels); Data preservation; Open Access; …

6 Data Life Cycle. Inspiration for LSDMA: support the whole data life cycle!

7 Dual approach: community-specific and generic. Data Life Cycle Labs (DLCLs): joint R&D with the scientific user communities (optimization of the data life cycle; community-specific data analysis tools and services). Data Services Integration Team (DSIT): generic R&D (interface between federated data infrastructures and DLCLs/communities; integration of data services into the scientific working process).

8 Facts and numbers: Initial project period: 1.1.2012-31.12.2016; Funded by the Helmholtz Association (13 MEUR for 5 years); To become part of the sustainable program-oriented funding of the Helmholtz Association in 2015; Partners: 4 Helmholtz research centers, 6 universities and the German climate research center; Leading project partner: KIT.

9 Initial communities. Energy: smart grids, battery research, fusion research. Earth and Environment: climate model, environmental satellite data. Health: virtual human brain map. Key Technologies: synchrotron radiation, nanoscopy, systems biology, electron-microscopical imaging techniques. Structure of Matter: Photon Science (Petra 3, XFEL) and FAIR@GSI (14 experiments with big and small communities).

10 LHC Computing: Prototype for FAIR. FAIR profits from computing experience within an already running experiment; ALICE can test new developments in FAIR; new FAIR developments are on the way, and to some extent they already go back to ALICE; FAIR will play an increasing role (funding, network architecture, software development and more...).

11 Goals for GSI/FAIR in LSDMA. Parallel and distributed computing: triggerless “online” system (porting of needed algorithms to GPU); Grid/Cloud infrastructure (enabling the submission of compute jobs to Clouds); creating interfaces to existing environments (AliEn, ...). Data archives: long-term data archives including concepts for xrootd and gStore; metadata catalog and data analysis. To be developed within LSDMA (DLCL: structure of matter) in collaboration with LSDMA DSIT, the FAIR community, and ALICE (wherever synergy can be found). Metropolitan Area Systems: including the distributed FAIR T0/T1 centre in a global Grid/Cloud infrastructure; Federated Identity Management. Global Federations: Global File System; optimization of data storage (hot versus cold data, corrupt and incomplete data sets, parallel storage, 3rd party copy). Additional synergies via DSIT.
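Slide 11 lists "hot versus cold data" as one storage optimization topic without saying how the distinction would be made. As an illustration only, the sketch below classifies data sets by how recently they were last accessed. The 30-day threshold, the data set names, and the function names are all hypothetical and are not taken from LSDMA, FAIR, or ALICE.

```python
from datetime import datetime, timedelta

# Hypothetical threshold: data untouched for longer than this counts as cold.
HOT_WINDOW = timedelta(days=30)


def classify(last_access, now, hot_window=HOT_WINDOW):
    """Return "hot" if the data set was accessed within hot_window, else "cold"."""
    return "hot" if now - last_access <= hot_window else "cold"


def plan_tiers(last_access_by_dataset, now, hot_window=HOT_WINDOW):
    """Group data set names into hot and cold tiers, each sorted by name."""
    tiers = {"hot": [], "cold": []}
    for name, last_access in sorted(last_access_by_dataset.items()):
        tiers[classify(last_access, now, hot_window)].append(name)
    return tiers


# Made-up catalog entries: data set name mapped to last access time.
now = datetime(2012, 10, 5)
catalog = {
    "run_a": now - timedelta(days=2),
    "run_b": now - timedelta(days=400),
}

print(plan_tiers(catalog, now))
```

In a real system the cold tier would typically map to cheaper, slower storage such as a tape archive, and the classification would likely use more signals than last access time alone.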

12 Next Steps at GSI: Advertise LSDMA positions (2 for FAIR DLCL; do you know candidates? GSI DSIT has already started to hire people); Discussion with FAIR experiments and ALICE; Set-up of e-science infrastructures, first for PANDA and CBM, based on the experiences with ALICE (AliEn/xrootd/...); Include smaller FAIR experiments; Continue to develop existing e-science infrastructure, also in close collaboration with DSIT and ALICE.

13 Summary and Outlook: There are many challenges in Scientific Big Data; LSDMA is a sustainable Helmholtz Association project, supporting the whole data life cycle, using a community-specific and a generic approach; FAIR is an important initial community in the research field ‘structure of matter’, with several developments planned that offer synergies with ALICE; GSI has two open job positions for LSDMA.
