Grid-based Data Stream Processing in e-Science (Lehrstuhl Informatik III: Datenbanksysteme)

Presentation transcript:

Slide 1: Grid-based Data Stream Processing in e-Science
Richard Kuntschke (1), Tobias Scholl (1), Sebastian Huber (1), Alfons Kemper (1), Angelika Reiser (1), Hans-Martin Adorf (2), Gerard Lemson (3), and Wolfgang Voges (3)
(1) Lehrstuhl Informatik III: Datenbanksysteme, Fakultät für Informatik, Technische Universität München
(2) Max-Planck-Institut für Astrophysik
(3) Max-Planck-Institut für extraterrestrische Physik

Slide 2: Important Challenges in e-Science
In general:
- Large and exponentially growing amounts of data
- Distributed data archives
- No unique identifiers
- Uncertainty
In astrophysics:
- Spectral Energy Distributions (SEDs)
  - Used to classify celestial objects (active galactic nuclei, brown dwarfs, neutron stars, ...)
  - Generation requires spatial (astrometric) matching

Slide 3: Spatial (Astrometric) Matching
Current solutions ...
- ... load all data into main memory
  - Uses a lot of memory
  - Infeasible if memory size is insufficient
- ... process all data at once and deliver the complete result at the end
  - Inefficient
  - No results until all processing has completed
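
The contrast described on this slide can be illustrated with a small Python sketch. It is not taken from the presentation: the function names `match_batch` and `match_streaming`, the toy equality predicate, and the sample data are all assumptions made for illustration. The batch variant materializes its whole input before producing anything, while the streaming variant emits each match as soon as the corresponding input element arrives (it still keeps the reference list in memory).

```python
from typing import Callable, Iterable, Iterator, List, Tuple

Pair = Tuple[int, int]

def match_batch(stream: Iterable[int], reference: List[int],
                is_match: Callable[[int, int], bool]) -> List[Pair]:
    """Batch style: load the whole input, return all results at the end."""
    data = list(stream)  # the entire input is held in memory
    return [(x, r) for x in data for r in reference if is_match(x, r)]

def match_streaming(stream: Iterable[int], reference: List[int],
                    is_match: Callable[[int, int], bool]) -> Iterator[Pair]:
    """Streaming style: handle one element at a time, emit matches at once."""
    for x in stream:
        for r in reference:
            if is_match(x, r):
                yield (x, r)

consumed: List[int] = []  # records which input elements have been read so far

def source() -> Iterator[int]:
    """Hypothetical input stream that logs every element it hands out."""
    for x in [1, 5, 9, 13]:
        consumed.append(x)
        yield x

reference = [1, 9]
same = lambda a, b: a == b

results = match_streaming(source(), reference, same)
first = next(results)                      # first result is available early
consumed_at_first_result = list(consumed)  # only one input element was read
rest = list(results)                       # drain the remaining matches
```

In this sketch the first match is delivered after a single input element has been read, which is the behaviour the slide finds missing in current solutions.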

Slide 4: Our Contributions
StarGlobe
- Grid-based P2P Data Stream Management System implemented on top of Globus
- In-network processing
- Early filtering
- Parallelization
- Pipelining
- Load-balancing
- Mobile user-defined operators
Astrophysical Example Workflow
- Astrometric matching
- Performance evaluation

Slide 5: The StarGlobe Architecture
[Figure: StarGlobe architecture. Labels: Stream (1 Publish), Query (2 Subscribe), filter, transform, Load mobile operators, Fct-Provider.]

Slide 6: Traditional Approach: Bring Data to Code
[Figure: query plan. Labels: union, NN_10, Data-Prov. A, Data-Prov. B, Data-Prov. C, Data-Prov. D.]

Slide 7: New Approach: Bring Code to Data
[Figure: query plan. Labels: Data-Prov. A, B, C, and D, each with scan and NN_10; union; NN_10; Fct-Provider (NN_10).]
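
The difference between the two plans on slides 6 and 7 can be sketched as follows. The sketch assumes that NN_10 denotes a 10-nearest-neighbour operator, which the transcript does not state explicitly. The planar Euclidean distance, the function names, and the randomly generated provider data are likewise illustrative assumptions, not StarGlobe code.

```python
import heapq
import random
from typing import Dict, List, Tuple

Point = Tuple[float, float]

def sq_dist(p: Point, q: Point) -> float:
    """Squared planar distance (a simplification of a sky distance)."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def nearest(points: List[Point], target: Point, k: int) -> List[Point]:
    """Return the k points closest to target; ties are broken by coordinates."""
    return heapq.nsmallest(k, points, key=lambda p: (sq_dist(p, target), p))

def data_to_code(providers: Dict[str, List[Point]], target: Point, k: int):
    """Traditional plan: ship every tuple to one site, then run the operator."""
    shipped = [p for pts in providers.values() for p in pts]  # union
    return nearest(shipped, target, k), len(shipped)

def code_to_data(providers: Dict[str, List[Point]], target: Point, k: int):
    """New plan: run the operator at each provider, ship only local winners."""
    shipped = [p for pts in providers.values() for p in nearest(pts, target, k)]
    return nearest(shipped, target, k), len(shipped)  # union, then final NN

# Hypothetical data: four providers with 50 points each.
rng = random.Random(42)
providers = {
    name: [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(50)]
    for name in ("A", "B", "C", "D")
}
target = (50.0, 50.0)

central_result, central_shipped = data_to_code(providers, target, 10)
pushed_result, pushed_shipped = code_to_data(providers, target, 10)
```

Both plans return the same ten points, because any point among the global top ten is necessarily among the top ten of its own provider. In this toy setup the second plan ships 40 tuples instead of 200.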

Slide 8: Mobile User-Defined Operators
- Load user-defined operators from function provider servers in the network
- Common interface for integrating external operators
- Push-based iterator
- Flexibility

Slide 9: StreamIterator Interface
open(Config, StreamWriter)
- Configuration parameters
- Writer for result stream
next(StreamIteratorEvent)
- Next element in input stream
- Writing output to result stream using StreamWriter.write()
close()
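
A minimal Python rendition of such a push-based operator interface might look as follows. Only the method names open, next, close and StreamWriter.write() come from the slide. The Python types, the dictionary-based configuration and events, and the `EnergyFilter` operator are hypothetical and serve only to show how a user-defined operator could plug into the interface.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List

class StreamWriter:
    """Stand-in for the writer of the result stream; collects written events."""
    def __init__(self) -> None:
        self.results: List[Any] = []

    def write(self, event: Any) -> None:
        self.results.append(event)

class StreamIterator(ABC):
    """Push-based operator: open once, next per input element, close once."""
    @abstractmethod
    def open(self, config: Dict[str, Any], writer: StreamWriter) -> None: ...

    @abstractmethod
    def next(self, event: Any) -> None: ...

    @abstractmethod
    def close(self) -> None: ...

class EnergyFilter(StreamIterator):
    """Hypothetical operator: forwards events at or above an energy threshold."""
    def open(self, config: Dict[str, Any], writer: StreamWriter) -> None:
        self.min_energy = config["min_energy"]  # configuration parameter
        self.writer = writer                    # writer for the result stream
        self.seen = 0

    def next(self, event: Dict[str, Any]) -> None:
        self.seen += 1
        if event["energy"] >= self.min_energy:
            self.writer.write(event)            # push output downstream

    def close(self) -> None:
        self.writer = None                      # release the writer

writer = StreamWriter()
op = EnergyFilter()
op.open({"min_energy": 1.0}, writer)
for ev in [{"id": 1, "energy": 0.4}, {"id": 2, "energy": 1.7}, {"id": 3, "energy": 1.0}]:
    op.next(ev)
op.close()
```

The operator never pulls from its input; the caller pushes each element in via next(), which is what distinguishes this style from a classic pull-based iterator.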

Slide 10: Communication between StreamProcessor and StreamIterator

Slide 11: Astrophysical Example Workflow

Slides 12-14: Distributed Query Evaluation Plan

Slide 15: Evaluation of Early Filtering

Slide 16: Conclusion
- Synergies between research in computer science and other scientific disciplines, e.g., astrophysics
- StarGlobe
  - Handling large data volumes efficiently: early filtering, parallelization, pipelining
  - Returning first results early on: pipelining
  - Flexible support of domain-specific application logic: mobile user-defined operators
- Results also applicable to other domains

Slide 17: The SED Scenario
1. Catalog query
2. Spatial (astrometric) matching
3. Assembly of raw photometry
4. Photometric transformation
5. SED classification
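
Step 2 of this scenario can be illustrated with a small sketch. Because the catalogs share no unique identifiers, sources are paired by sky position. The sketch assumes that two sources match when their angular separation is within a fixed radius; the radius, the nested-loop search without a spatial index, the function names, and the sample coordinates are all illustrative assumptions rather than the matching procedure used in the presentation.

```python
import math
from typing import Iterable, Iterator, List, Tuple

# (identifier, right ascension in degrees, declination in degrees)
Source = Tuple[str, float, float]

def angular_separation_deg(ra1: float, dec1: float,
                           ra2: float, dec2: float) -> float:
    """Great-circle separation via the haversine formula, angles in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to guard against rounding slightly above 1.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))

def astrometric_match(stream: Iterable[Source], catalog: List[Source],
                      radius_arcsec: float) -> Iterator[Tuple[str, str]]:
    """Yield (stream id, catalog id) for every pair within the match radius."""
    radius_deg = radius_arcsec / 3600.0
    for sid, ra, dec in stream:
        for cid, cra, cdec in catalog:
            if angular_separation_deg(ra, dec, cra, cdec) <= radius_deg:
                yield (sid, cid)

catalog = [("c1", 10.0, 20.0), ("c2", 150.0, -30.0)]
stream = [
    ("s1", 10.0002, 20.0001),   # under one arcsecond from c1
    ("s2", 10.5, 20.0),         # about half a degree from c1
    ("s3", 150.0, -30.0003),    # about one arcsecond from c2
]
matches = list(astrometric_match(stream, catalog, radius_arcsec=2.0))
```

Written as a generator, the matcher emits each pair as soon as the corresponding stream element has been examined, so it fits the pipelined processing style advocated in the talk.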

Slide 18: Peer Architecture
- Built on top of Open Grid Services Architecture (OGSA)
- Peers implemented as communicating grid services
- Availability of services according to capabilities of peers
- FluX query engine for subscription evaluation

Slide 19: Communication between StreamProcessor and StreamIterator
[Figure: message exchange between StreamProcessor and StreamIterator. Labels: Input stream, Output stream, StreamIteratorEvent, photon, open(Config, Writer), next(StreamIteratorEvent), write(StreamIteratorEvent).]
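
The message order suggested by these labels can be made explicit with a short sketch that records every call. The calls open, next, and write are named on the slide; that close() follows once the input ends is an assumption based on the StreamIterator interface, and the class names `RecordingWriter` and `PassThrough` as well as the string events are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Tuple

Trace = List[Tuple[str, Any]]

class RecordingWriter:
    """Writer that logs each write() call and appends to the output stream."""
    def __init__(self, trace: Trace, output: List[Any]) -> None:
        self.trace = trace
        self.output = output

    def write(self, event: Any) -> None:
        self.trace.append(("write", event))
        self.output.append(event)

class PassThrough:
    """Hypothetical operator that forwards every event unchanged."""
    def open(self, config: Dict[str, Any], writer: RecordingWriter) -> None:
        self.writer = writer

    def next(self, event: Any) -> None:
        self.writer.write(event)

    def close(self) -> None:
        pass

class StreamProcessor:
    """Drives an operator: open once, next per input element, then close."""
    def __init__(self, iterator: PassThrough, config: Dict[str, Any]) -> None:
        self.iterator = iterator
        self.config = config
        self.trace: Trace = []

    def run(self, input_stream: Iterable[Any]) -> List[Any]:
        output: List[Any] = []
        writer = RecordingWriter(self.trace, output)
        self.trace.append(("open", None))
        self.iterator.open(self.config, writer)
        for event in input_stream:
            self.trace.append(("next", event))
            self.iterator.next(event)   # the operator may call write() here
        self.trace.append(("close", None))
        self.iterator.close()
        return output

processor = StreamProcessor(PassThrough(), config={})
output = processor.run(["photon-1", "photon-2"])
```

The recorded trace interleaves next and write calls, showing that output for the first element is produced before the second element is even read.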

Slide 20: Sequence Diagram