Query Execution on NetTraveler
Angel L. Villalaín-García, Manuel Rodríguez-Martínez
University of Puerto Rico - Mayaguez Campus
WALSAIP 1/6/

Objectives
- Develop a framework for Parallel and Distributed Query Optimization and Execution on NetTraveler
- Facilitate and optimize the access of data across WANs
- Transparent data access
- Uniform access interface
- Robust operation by exploiting replication
Road Map
- Objectives
- Motivation
- Problem Formulation
- Proposed Solution
- Execution Example
- Contributions
- Technical Details
- Status
- Next Steps
- Summary
- Questions & Demo
Motivation
Problem Formulation
- Dispersed and heterogeneous data sources
- No uniformity on WANs
- Several limitations: bandwidth, memory, power, processing capabilities
Problem Formulation (cont.)
- Traditional DBMS Plan
- Centralized Query Optimizer
- Scan Relations
Query: SELECT A.id, A.name FROM A, B WHERE A.id = B.id AND A.sage < 30
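To make the centralized plan concrete, here is a minimal Python sketch of how a single site might evaluate the slide's query: scan both relations, apply the selection on `A.sage`, hash join on `id`, and project. The rows, the function name `centralized_plan`, and the in-memory representation are assumptions made for illustration only; they are not taken from NetTraveler.

```python
# Hypothetical in-memory relations; all rows are invented for illustration.
A = [
    {"id": 1, "name": "Ana", "sage": 25},
    {"id": 2, "name": "Luis", "sage": 41},
    {"id": 3, "name": "Eva", "sage": 29},
]
B = [{"id": 1}, {"id": 3}, {"id": 4}]


def centralized_plan(a_rows, b_rows):
    """Scan -> select -> hash join -> project, all at a single site."""
    # Build phase: count how many B rows carry each join key.
    b_counts = {}
    for row in b_rows:
        b_counts[row["id"]] = b_counts.get(row["id"], 0) + 1

    result = []
    # Probe phase: scan A, apply A.sage < 30, then look up A.id in the hash table.
    for row in a_rows:
        if row["sage"] < 30 and row["id"] in b_counts:
            # SQL join semantics: one output row per matching B row.
            result.extend([(row["id"], row["name"])] * b_counts[row["id"]])
    return result


print(centralized_plan(A, B))
```

Every step runs at one site, so all of A and B must be available there before any work can begin.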
Proposed Solution
- Decentralized Query Optimizer
- Distributed and Parallel DBMS Plan
- Scan Replicated Relations
Query: SELECT A.id, A.name FROM A, B WHERE A.id = B.id AND A.sage < 30
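One way to picture a parallel plan for the same query is a partitioned hash join: both relations are hash-partitioned on `id`, each partition pair is joined independently, and the partial results are merged. The sketch below uses Python threads as stand-ins for separate sites. The rows, the function names, and the choice of two workers are assumptions for illustration, not a description of NetTraveler's actual executor.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical rows, invented for illustration.
A = [
    {"id": 1, "name": "Ana", "sage": 25},
    {"id": 2, "name": "Luis", "sage": 41},
    {"id": 3, "name": "Eva", "sage": 29},
    {"id": 4, "name": "Raul", "sage": 22},
]
B = [{"id": 1}, {"id": 3}, {"id": 4}, {"id": 5}]


def partition(rows, n):
    """Hash-partition rows on the join key so equal ids land in the same bucket."""
    buckets = [[] for _ in range(n)]
    for row in rows:
        buckets[hash(row["id"]) % n].append(row)
    return buckets


def join_partition(a_part, b_part):
    """Local plan for one worker: select on sage, hash join on id, project."""
    b_counts = Counter(row["id"] for row in b_part)
    out = []
    for row in a_part:
        if row["sage"] < 30:
            # Counter returns 0 for ids with no match, so nothing is emitted.
            out.extend([(row["id"], row["name"])] * b_counts[row["id"]])
    return out


def parallel_plan(a_rows, b_rows, workers=2):
    """Join each partition pair concurrently, then merge the partial results."""
    a_parts = partition(a_rows, workers)
    b_parts = partition(b_rows, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(join_partition, a_parts, b_parts)
        return sorted(row for part in partials for row in part)


print(parallel_plan(A, B))
```

Because matching ids always hash to the same partition, no worker needs to see another worker's data, which is what lets the partitions be placed on different nodes.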
Execution Example
[Diagram: query Q submitted to QSB 1 (knows: QSB 2, QSB 3, QSB 4); nodes QSB 2, QSB 3, QSB 4, IG 1, IG 2, IG 3; labels Q and R between nodes.]
Technical Details
- IG Level Operations: Replicates Management, Manage Partitions (Pre Hashed, On The Fly), Hashing Mechanism, Parallel Ops
- QSB Level Operations: Memory Management, Schedulers Mechanisms, Hash Join, Exchange Op, Parallel Ops
- Optimizer Level Operations: Site Management, Scheduling Management, Parallel & Distributed Ops, Physical Optimizer, Logical Optimizer
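The slide lists an Exchange Op and on-the-fly hashing without further detail. Assuming the exchange operator is meant in the usual sense (routing a stream of tuples to parallel consumers), a minimal Python sketch could look like the following. The queue-based design and the names `exchange`, `consumer`, and `SENTINEL` are hypothetical and not taken from NetTraveler.

```python
import queue
import threading

SENTINEL = None  # Marks the end of the stream on each output queue.


def exchange(rows, key, outputs):
    """Route each row to one output queue by hashing its key on the fly."""
    for row in rows:
        outputs[hash(row[key]) % len(outputs)].put(row)
    for q in outputs:
        q.put(SENTINEL)


def consumer(inbox, results, slot):
    """Stand-in for a downstream operator; here it only collects its rows."""
    collected = []
    while True:
        row = inbox.get()
        if row is SENTINEL:
            break
        collected.append(row)
    results[slot] = collected


rows = [{"id": i} for i in range(6)]  # Hypothetical input stream.
queues = [queue.Queue() for _ in range(2)]
results = [None, None]

workers = [
    threading.Thread(target=consumer, args=(q, results, i))
    for i, q in enumerate(queues)
]
for w in workers:
    w.start()
exchange(rows, "id", queues)
for w in workers:
    w.join()

print([[r["id"] for r in part] for part in results])
```

Pre-hashed partitions would skip the routing step because the data is already stored by bucket; on-the-fly hashing, as sketched here, pays the routing cost at query time but works on any input.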
Contributions
- Facilitate integration for scientific applications
- Heterogeneous data sources
- Heterogeneous schemas
- Load Balancing
- Spread work to various nodes
- Robustness
- Can get data from multiple sources
- Asynchronous
- Dynamically replace nodes used for processing
- Decentralized Query Optimization
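The robustness items (data available from multiple sources, nodes replaced dynamically) can be illustrated with a small failover sketch: a scan is attempted against each replica in turn until one answers. Everything here is hypothetical, including the `ReplicaUnavailable` exception, the `make_replica` helper, and the site names; the slides do not describe how NetTraveler actually selects or replaces nodes.

```python
class ReplicaUnavailable(Exception):
    """Raised by a (hypothetical) replica that cannot serve a scan."""


def make_replica(site, data, alive=True):
    """Build a fake replica: a callable that returns a relation or fails."""
    def scan(relation):
        if not alive:
            raise ReplicaUnavailable(f"{site} is down")
        return data[relation]
    scan.site = site  # Remember which site this replica represents.
    return scan


def scan_with_failover(replicas, relation):
    """Try each replica in order and return the first successful scan."""
    failed = []
    for replica in replicas:
        try:
            return replica.site, replica(relation)
        except ReplicaUnavailable:
            failed.append(replica.site)
    raise ReplicaUnavailable(f"no replica could serve {relation!r}; tried {failed}")


data = {"A": [(1, "Ana"), (3, "Eva")]}  # Invented rows.
replicas = [
    make_replica("site-1", data, alive=False),
    make_replica("site-2", data),
]
site, rows = scan_with_failover(replicas, "A")
print(site, rows)
```

A real system would also need to decide the replica order (for example by load or distance) and to resume partially completed scans; this sketch ignores both.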
Status
- IG Level Operations: Manage Replicates, Manage Partitions (Pre Hashed, On The Fly), Hashing Mechanism, Parallel Ops
- QSB Level Operations: Memory Management, Schedulers Mechanisms, Hash Join, Exchange Op, Parallel Ops
- Optimizer Level Operations: Site Management, Scheduling Management, Parallel & Distributed Ops, Physical Optimizer, Logical Optimizer
Next Steps
- Study scheduling effect and improvements
- Hash Join operators and functionalities
- User interface for configuration and demonstration purposes
Additional Areas of Research
- Distributed Catalog Manager (Oliver Moreno)
- Server-Side Query Recovery Mechanism (Victor Kareh)
- NetTraveler System Administration (Osvaldo Ferrero)
Summary
- Facilitate and optimize the access of data across WANs
- Query Parallelization and Execution
- Exploiting Replication
- Response Time improvement
- See website for API and user manual
Demo and Questions