Spatial Data Management Challenges in the Simulation Sciences

Spatial Data Management Challenges in the Simulation Sciences
Thomas Heinis, Farhan Tauheed, Anastasia Ailamaki

Simulation-Based Scientific Discovery
Simulating natural phenomena leads to in-depth understanding!

Simulation Background
Multitude of analysis and update queries
[Figure: timeline of time steps 1 to 3 with SIMULATION and MONITOR phases and analysis queries]
→ Massive updates, few queries
→ In-memory spatial indexing

In-Memory Spatial Indexing
Using current approaches:
[Figure: time breakdown]
Most potential in reducing comparisons!

Research Directions
Avoid data-oriented partitioning
Avoid hierarchical structure
→ Use uniform space-oriented partitioning
→ Configuration of the grid is difficult!
Compression has only limited potential!
Cache alignment is crucial
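The idea of uniform space-oriented partitioning can be illustrated with a minimal sketch. The class `UniformGrid`, its methods, and the `cell_size` parameter are hypothetical names chosen for illustration and are not taken from the slides; the sketch only assumes 3D points and axis-aligned box queries.

```python
import math
from collections import defaultdict


class UniformGrid:
    """Illustrative uniform grid over 3D points (not the authors' implementation)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(set)  # cell coordinates -> ids of objects in the cell
        self.points = {}               # object id -> (x, y, z)

    def cell_of(self, point):
        # Space-oriented: the cell depends only on the coordinates,
        # not on how the data happens to be distributed.
        return tuple(math.floor(c / self.cell_size) for c in point)

    def insert(self, obj_id, point):
        self.points[obj_id] = point
        self.cells[self.cell_of(point)].add(obj_id)

    def range_query(self, lo, hi):
        """Return sorted ids of points inside the axis-aligned box [lo, hi]."""
        lo_cell, hi_cell = self.cell_of(lo), self.cell_of(hi)
        result = []
        for i in range(lo_cell[0], hi_cell[0] + 1):
            for j in range(lo_cell[1], hi_cell[1] + 1):
                for k in range(lo_cell[2], hi_cell[2] + 1):
                    # .get avoids creating empty cells while querying
                    for obj_id in self.cells.get((i, j, k), ()):
                        p = self.points[obj_id]
                        if all(l <= c <= h for l, c, h in zip(lo, p, hi)):
                            result.append(obj_id)
        return sorted(result)


grid = UniformGrid(cell_size=1.0)
grid.insert("a", (0.2, 0.3, 0.1))
grid.insert("b", (2.5, 2.5, 2.5))
grid.insert("c", (0.9, 1.1, 0.4))
hits = grid.range_query((0.0, 0.0, 0.0), (1.0, 1.5, 1.0))
```

There is no tree to traverse or rebalance: a lookup is one division per axis. The cost is that `cell_size` must be chosen up front, which is the configuration difficulty noted on the slide.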

Massive Changes
Simulations cause massive changes
E.g., neural plasticity simulation:
  all elements move
  average change of 0.04 μm in 285 μm³
  only 0.5% move more than 0.1 μm
Considerable indexing overhead
State of the art:
  moving-object indexes assume a known trajectory
  grace windows shift overhead to query execution
[Figure annotation: 38%]
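Statistics like those quoted for the neural plasticity simulation can be computed from two consecutive time steps. The following is a small sketch; the function name `displacement_stats` and the four sample positions are invented for illustration and are not data from the simulation.

```python
import math


def displacement_stats(old_positions, new_positions, threshold):
    """Return (mean displacement, fraction of elements that moved more than threshold)."""
    distances = [math.dist(p, q) for p, q in zip(old_positions, new_positions)]
    mean = sum(distances) / len(distances)
    fraction_above = sum(d > threshold for d in distances) / len(distances)
    return mean, fraction_above


# Made-up positions (in μm) for two consecutive time steps.
old = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (2.0, 2.0, 2.0), (3.0, 3.0, 3.0)]
new = [(0.0, 0.0, 0.03), (1.0, 1.04, 1.0), (2.05, 2.0, 2.0), (3.0, 3.0, 3.2)]
mean, frac = displacement_stats(old, new, threshold=0.1)
```

When every element moves but almost none moves far, an index that reacts to every position change does a lot of work for very little structural change.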

Research Directions
Use data itself!
  Inherent connectivity
  Added connectivity (FLAT [ICDE 2012])
Use uniform grid as index
  Small overhead to build
  Only few objects change cell → few updates required
→ Tunable parameter: grid resolution
  Big cells → few updates but inefficient queries
  Small cells → many updates but efficient queries
[Figure: simulation time steps]
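The effect of grid resolution on update cost can be sketched as follows. The helpers `cell_of`, `build_grid`, and `apply_timestep`, the random workload, and the three cell sizes are assumptions made for illustration; they are not the authors' implementation or data.

```python
import math
import random
from collections import defaultdict


def cell_of(point, cell_size):
    return tuple(math.floor(c / cell_size) for c in point)


def build_grid(positions, cell_size):
    """Build a cell -> object-id map in a single pass over the data."""
    cells = defaultdict(set)
    for obj_id, p in positions.items():
        cells[cell_of(p, cell_size)].add(obj_id)
    return cells


def apply_timestep(cells, positions, new_positions, cell_size):
    """Move objects to their new positions; touch the grid only on a cell change."""
    updates = 0
    for obj_id, new_p in new_positions.items():
        old_cell = cell_of(positions[obj_id], cell_size)
        new_cell = cell_of(new_p, cell_size)
        if old_cell != new_cell:
            cells[old_cell].discard(obj_id)
            cells[new_cell].add(obj_id)
            updates += 1
        positions[obj_id] = new_p
    return updates


# Synthetic workload: every object moves, but only by a small amount.
random.seed(0)
old = {i: tuple(random.uniform(0.0, 10.0) for _ in range(3)) for i in range(10_000)}
new = {i: tuple(c + random.uniform(-0.05, 0.05) for c in p) for i, p in old.items()}

updates = {}
for size in (0.1, 1.0, 5.0):
    positions = dict(old)  # fresh copy per resolution
    cells = build_grid(positions, size)
    updates[size] = apply_timestep(cells, positions, new, size)
```

On this synthetic workload, larger cells lead to fewer grid updates per time step, but each cell then holds more objects that a query must compare against, which is the trade-off listed on the slide.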

Spatial Indexing for the Simulation Sciences
Challenges are not independent
New class of spatial indexes:
  Efficient in main memory
  Efficient support of massive updates
  Likely based on a grid: quick to build and only few updates needed
  New design point: probably not as fast to query as known spatial indexes, in exchange for fast updates
More challenges ahead!
  Out-of-core model building (large-scale spatial join)
  In-situ compression
  New storage media

Thank You!