Using Space and Attribute Partitioned Partial Replicas for Data Subsetting and Aggregation Queries
Li Weng, Umit Catalyurek, Tahsin Kurc, Gagan Agrawal, Joel Saltz
Motivation: Data-Driven Science
- Data-driven applications from science, engineering, and biomedicine
  - Examples: Oil Reservoir Management, Magnetic Resonance Imaging
- Large spatio-temporal datasets
- Several attributes at each point
11/11/2018 ICPP2006
Replication of Scientific Datasets
- A variety of queries on the same dataset
  - Each requires a different spatio-temporal region and subset of attributes
  - No chunking and indexing strategy can optimize for all
- Replication: create multiple copies
  - Use different chunking and indexing schemes
  - Large storage overhead
Partial Replication
- Can we get the benefits of replication without the large overheads?
  - Not all attributes are accessed uniformly
  - Not all spatio-temporal regions are accessed with uniform probability
- Partial replication: each replica has
  - Only a subset of attributes (attribute partitioned), and/or
  - Only a rectilinear spatio-temporal region (space partitioned)
- Challenge: no single partial replica may be able to answer the query
  - Can we choose and combine partial replicas to optimize query processing?
Prior Work (CCGRID 05)
- Query planning with partial replicas
  - Cost models
  - Greedy selection algorithm
- Only considered space partitioned replicas
- Considered SQL SELECT queries
- Implemented as an extension to the Automatic Data Virtualization System (HPDC 04)
Contributions
- Support the combined use of space and attribute partitioned partial replicas
  - Dynamic programming algorithm for selecting the best set of attribute partitioned replicas
  - New greedy strategy for recommending a combination of replicas
- Extend the replica selection algorithm to handle queries with aggregations, where replicas may be unevenly stored across storage units
System Overview
The Replica Selection Module is tightly coupled with our prior work on supporting SQL SELECT queries on scientific datasets in a cluster environment.
Outline
- Introduction
  - Motivation
  - Contributions
  - System overview
- Query execution and algorithm design
  - Uniformly partitioned chunks and select queries
  - Uneven partitioning and aggregation operation
- Experimental results
- Related work
- Conclusions
Uniformly Partitioned Chunks and Select Queries
Computing Goodness Value
- $\text{goodness} = \text{useful data}_{\text{per-chunk}} / \text{cost}_{\text{per-chunk}}$
- Chunk: an atomic unit in space partitioned replicas, or a logical unit in attribute partitioned replicas
- Full chunks and partial chunks of a partial replica
- $\text{cost}_{\text{per-chunk}} = t_{read} \cdot n_{read} + t_{seek}$
  - $t_{read}$: average read time for a disk page
  - $n_{read}$: number of pages fetched
  - $t_{seek}$: average seek time
- Fragment: an intermediate unit between a replica and its chunks
  - A group of full or partial chunks having the same goodness value in a replica
  - $\text{goodness}_{\text{per-fragment}} = \text{useful data}_{\text{per-fragment}} / \text{cost}_{\text{per-fragment}}$
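The per-chunk goodness value can be illustrated with a minimal Python sketch. The names `DiskModel`, `chunk_cost`, and `goodness`, the timing constants, and the data units are assumptions chosen for illustration; they do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class DiskModel:
    """Hypothetical disk parameters (seconds); values are illustrative only."""
    t_read: float  # average read time for one disk page
    t_seek: float  # average seek time


def chunk_cost(n_read: int, disk: DiskModel) -> float:
    """cost per chunk = t_read * n_read + t_seek."""
    return disk.t_read * n_read + disk.t_seek


def goodness(useful_data: float, cost: float) -> float:
    """goodness = useful data / cost."""
    return useful_data / cost


disk = DiskModel(t_read=0.001, t_seek=0.01)

# A full chunk: every one of its 100 pages is useful to the query.
full = goodness(useful_data=100, cost=chunk_cost(100, disk))

# A partial chunk: the same 100 pages are read, but only 25 are useful.
partial = goodness(useful_data=25, cost=chunk_cost(100, disk))
```

Under these assumed numbers the full chunk scores higher than the partial chunk, because both pay the same read and seek cost while the partial chunk returns less useful data.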
An Example: Query and Intersecting Replicas
- 3 full chunks and 2 partial chunks; 3 fragments
- Composite Replica 2: 10 full chunks; 1 fragment
General Structure of Replica Selection Algorithm
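Dynamic Programming Algorithm
Notation:
- R: a group of attribute-partitioned replicas
- R': the optimal combination (output)
- l: the number of attributes referred to in Q
- $M_{1..l}$: the referred attribute list
Steps:
1. Input R.
2. Calculate $\mathrm{Cost}_{j,j}$ for each single attribute.
3. Foreach k from 2 to l, and foreach u from 1 to l-k+1 (with v = u+k-1):
   a. If a replica r1 contains only $M_{u..v}$: calculate $\mathrm{Cost}_{u..v}$ and set $\mathrm{Loc}_{u..v}$->s = -1, $\mathrm{Loc}_{u..v}$->r = r1.
   b. Otherwise, if a replica r2 contains $M_{u..v}$: calculate $\mathrm{Cost}_{u..v}$ and set $\mathrm{Loc}_{u..v}$->s = -1, $\mathrm{Loc}_{u..v}$->r = r2.
   c. Otherwise, set $\mathrm{Cost}_{u..v} = \infty$.
   d. Find the minimum $q_{min} = \mathrm{Cost}_{u..p} + \mathrm{Cost}_{p+1..v}$ over the split points p; if it is lower than $\mathrm{Cost}_{u..v}$, set $\mathrm{Cost}_{u..v} = q_{min}$, $\mathrm{Loc}_{u..v}$->s = p, $\mathrm{Loc}_{u..v}$->r = -1.
4. Output($\mathrm{Loc}_{1..l}$) and output R'.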
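The dynamic programming idea can be sketched in Python as an interval DP over the ordered attribute list. This is a sketch under stated assumptions, not the authors' implementation: the function `select_replicas`, the `cost` callback, and the example cost model (cost equals the number of attributes a replica stores) are hypothetical, and only contiguous spans of the attribute list are considered.

```python
import math


def select_replicas(attrs, replicas, cost):
    """attrs: ordered, non-empty list of attributes referenced by the query.
    replicas: dict name -> frozenset of attributes the replica stores.
    cost(name, span): hypothetical cost of serving `span` from replica `name`.
    Returns (total_cost, [(replica_name, attrs_served), ...])."""
    n = len(attrs)
    best = {}  # (u, v) -> cheapest known cost for attrs[u..v], inclusive
    loc = {}   # (u, v) -> ("replica", name) or ("split", p)

    def single(u, v):
        span = frozenset(attrs[u:v + 1])
        # Prefer a replica holding exactly the span, then any proper superset.
        exact = [r for r, a in replicas.items() if a == span]
        superset = [r for r, a in replicas.items() if a > span]
        for group in (exact, superset):
            if group:
                r = min(group, key=lambda name: cost(name, span))
                return cost(r, span), r
        return math.inf, None  # no single replica covers this span

    for k in range(1, n + 1):            # span length
        for u in range(0, n - k + 1):    # span start
            v = u + k - 1
            c, r = single(u, v)
            best[u, v], loc[u, v] = c, ("replica", r)
            for p in range(u, v):        # try every split point
                q = best[u, p] + best[p + 1, v]
                if q < best[u, v]:
                    best[u, v], loc[u, v] = q, ("split", p)

    def collect(u, v):
        kind, x = loc[u, v]
        if kind == "split":
            return collect(u, x) + collect(x + 1, v)
        return [(x, attrs[u:v + 1])]

    return best[0, n - 1], collect(0, n - 1)


# Hypothetical replicas and cost model, for illustration only.
replicas = {
    "r1": frozenset("ab"),
    "r2": frozenset("cd"),
    "r3": frozenset("abcdef"),
}
total, plan = select_replicas(
    ["a", "b", "c", "d"], replicas, lambda name, span: len(replicas[name])
)
```

In this example the wide replica `r3` could answer the query alone at cost 6, but combining `r1` and `r2` costs 4, so the split is preferred. If no combination covers the query, the returned cost is infinite and the plan contains a `None` replica.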
Greedy Strategy
Notation:
- Q: an issued query
- R: the partial replicas
- D: the original dataset
- F: all fragments intersecting with the query boundary
- Fmax: the fragment with the maximum goodness value in F
- S: the ordered list of the candidate fragments in decreasing order of their goodness value
Steps:
1. Input Q, R, D.
2. Calculate the fragment set F.
3. While F is not null:
   a. Append Fmax into S.
   b. Remove Fmax from F.
   c. If an overlap with Fmax exists in F: subtract the overlap and re-compute the goodness value.
4. Add D if needed.
5. Output S.
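The greedy loop can be sketched in Python as follows. To keep the sketch short, fragments and the query are modeled as sets of abstract cell ids rather than rectilinear regions, and each fragment carries one fixed cost; the function `greedy_select` and this data model are assumptions for illustration, not part of the original system.

```python
def greedy_select(query_cells, fragments):
    """query_cells: set of cell ids requested by the query.
    fragments: dict name -> (set of cell ids, cost); a hypothetical model.
    Returns (S, uncovered): the ordered candidate list and the cells that
    no fragment covers (served from the original dataset D)."""
    # Keep only the part of each fragment that intersects the query.
    remaining = {
        name: (set(cells) & query_cells, cost)
        for name, (cells, cost) in fragments.items()
    }
    remaining = {n: fc for n, fc in remaining.items() if fc[0]}

    chosen, uncovered = [], set(query_cells)
    while remaining:
        # Goodness is re-evaluated on the current (overlap-free) cells.
        best = max(remaining, key=lambda n: len(remaining[n][0]) / remaining[n][1])
        cells, _ = remaining.pop(best)
        chosen.append(best)
        uncovered -= cells
        # Subtract the overlap with the chosen fragment; drop empty fragments.
        remaining = {
            n: (c - cells, cost)
            for n, (c, cost) in remaining.items()
            if c - cells
        }
    if uncovered:
        chosen.append("D")  # fall back to the original dataset for the rest
    return chosen, uncovered


fragments = {
    "A": ({0, 1, 2, 3, 4, 5}, 2.0),  # goodness 6 / 2.0 = 3.0
    "B": ({4, 5, 6, 7}, 1.0),        # goodness 4 / 1.0 = 4.0
    "C": ({6, 7}, 1.0),              # goodness 2 / 1.0 = 2.0
}
S, uncovered = greedy_select(set(range(10)), fragments)
```

Here `B` is picked first, `C` is then fully covered and dropped, `A` is picked with its overlap removed, and cells 8 and 9 fall back to the original dataset.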
Uneven Partitioning and Aggregation Operations
Computing Goodness Value
- $\mathrm{Goodness}(F) = \sum_{p \in P} \mathrm{data}(F) \,/\, \max_{p \in P} \left( \mathrm{cost}_p(\mathrm{CurLoad}) + \mathrm{cost}_p(F) \right)$
  - P: all available storage nodes
  - CurLoad: current workload across all storage nodes due to previously chosen candidate replicas
- $\mathrm{cost}_{\text{fragment}} = t_{read} \cdot n_{read} + t_{seek} \cdot n_{seek} + t_{filter} \cdot n_{filter} + t_{agg} \cdot n_{agg} + t_{trans} \cdot n_{trans}$
  - $t_{filter}$: average filtering time for a tuple
  - $n_{filter}$: number of total tuples in all chunks
  - $t_{agg}$: average aggregate computation time for a tuple
  - $n_{agg}$: number of total useful tuples
  - $t_{trans}$: network transfer time for one unit of data
  - $n_{trans}$: the amount of data after the aggregate operation
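A small Python sketch may help make the two formulas concrete. The function names, the dictionary layout, and all numbers are assumptions for illustration; in particular, the sketch reads the numerator as the sum of the useful data a fragment holds on each storage node.

```python
def fragment_cost(t, n):
    """Five-term fragment cost: read, seek, filter, aggregate, transfer.
    t: per-unit times, n: counts, both keyed by the term name."""
    return sum(t[k] * n[k] for k in ("read", "seek", "filter", "agg", "trans"))


def goodness(data_per_node, cost_per_node, cur_load):
    """Useful data summed over nodes, divided by the cost on the most
    heavily loaded node once this fragment is added to the current load."""
    nodes = list(cur_load)
    total_data = sum(data_per_node.get(p, 0) for p in nodes)
    bottleneck = max(cur_load[p] + cost_per_node.get(p, 0.0) for p in nodes)
    return total_data / bottleneck


# Illustrative timing constants and counts (not measured values).
t = {"read": 1.0, "seek": 2.0, "filter": 0.5, "agg": 0.5, "trans": 1.0}
n = {"read": 10, "seek": 2, "filter": 4, "agg": 2, "trans": 3}
cost = fragment_cost(t, n)

# Node n0 is already busy with previously chosen replicas.
cur_load = {"n0": 4.0, "n1": 0.0}
g_x = goodness({"n0": 100}, {"n0": 2.0}, cur_load)  # cheap, but on the busy node
g_y = goodness({"n1": 100}, {"n1": 4.0}, cur_load)  # costlier, but on the idle node
```

With these assumed values, fragment Y scores higher than fragment X even though Y is more expensive in isolation, because X would lengthen the work on the node that is already the bottleneck.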
Workload aware greedy strategy
Notation:
- Q: an issued query
- F: the interesting fragment sets (all fragments intersecting with the query boundary)
- D: the original dataset
- Fmax: the fragment with the maximum goodness value in F
- S: the ordered list of the candidate fragments in decreasing order of their goodness value
Steps:
1. Input Q, F, D.
2. Foreach Fi in F: if no overlap with F-{Fi} exists, append Fi into S.
3. While F is not NULL:
   a. Calculate the current goodness value for each Fi in F.
   b. Append Fmax into S.
   c. Remove Fmax from F.
   d. If an overlap with Fmax exists in F: subtract the overlap.
4. Add D if needed.
5. Output S.
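The workload aware variant can be sketched in Python along the same lines. As before, this is only a sketch under assumptions: fragments are sets of abstract cell ids with a per-node cost map, the function `workload_aware_greedy` is hypothetical, per-node costs are assumed positive, and a fragment that overlaps no other fragment is assumed to leave the candidate set once it is appended.

```python
def workload_aware_greedy(query_cells, fragments, nodes):
    """fragments: dict name -> (set of cell ids, {node: cost}); hypothetical.
    Returns (S, uncovered)."""
    frags = {
        n: (set(c) & query_cells, dict(cost))
        for n, (c, cost) in fragments.items()
    }
    frags = {n: f for n, f in frags.items() if f[0]}
    load = {p: 0.0 for p in nodes}  # CurLoad: work already assigned per node
    chosen, uncovered = [], set(query_cells)

    def take(name):
        cells, cost = frags.pop(name)
        chosen.append(name)
        uncovered.difference_update(cells)
        for p, c in cost.items():
            load[p] += c
        return cells

    # Fragments that overlap no other fragment are selected directly.
    for name in list(frags):
        if all(frags[name][0].isdisjoint(frags[o][0]) for o in frags if o != name):
            take(name)

    def goodness(name):
        cells, cost = frags[name]
        return len(cells) / max(load[p] + cost.get(p, 0.0) for p in nodes)

    while frags:
        cells = take(max(frags, key=goodness))  # goodness uses the current load
        for other in list(frags):
            rest = frags[other][0] - cells      # subtract the overlap
            if rest:
                frags[other] = (rest, frags[other][1])
            else:
                del frags[other]
    if uncovered:
        chosen.append("D")  # original dataset serves whatever is left
    return chosen, uncovered


fragments = {
    "A": ({0, 1}, {"n0": 1.0}),
    "B": ({2, 3, 4, 5}, {"n0": 2.0}),
    "C": ({4, 5, 6, 7}, {"n1": 2.0}),
}
S, uncovered = workload_aware_greedy(set(range(8)), fragments, ["n0", "n1"])
```

In this example `A` is taken first because it overlaps nothing. `B` and `C` would tie on data per unit cost, but `C` is preferred because node `n0` already carries the load from `A`.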
Experimental Setup & Design
- A Linux cluster connected via switched Fast Ethernet. Each node has a PIII 933MHz CPU, 512 MB of main memory, and three 100GB IDE disks.
- Performance evaluation of the combination of space-partitioned and attribute-partitioned replicas, and the benefit of attribute-partitioned replicas
- Scalability test when increasing the number of nodes hosting the dataset
- Performance test when query sizes are varied
- Performance evaluation for aggregate queries with unevenly partitioned replicas
SELECT attrlist from IPARS where RID in [0,1] and TIME in [1000,1399] and X>=0 and X<=11 and Y>=0 and Y<=28 and Z>=0 and Z<=28;
- attr+space part: the combined use of all replicas
- space part: only use the space-partitioned replicas
A run-time optimization
Query: SELECT * from IPARS where TIME>=1000 and TIME<=1599 and X>=0 and X<=11 and Y>=0 and Y<=31 and Z>=0 and Z<=31;
- Up to 4 nodes, query execution time scales linearly.
- Because seek cost dominates the total I/O overhead, execution time is not halved when using 8 nodes.
Query: SELECT * from IPARS where TIME>=1000 and TIME<=TIMEVAL and X>=0 and X<=11 and Y>=0 and Y<=28 and Z>=0 and Z<=28;
- Our algorithm chose {1,3,4,6} out of all replicas in Table #1.
- The query filters out 83% of the retrieved data when using the original dataset only; in the presence of replicas, it needs to filter out only about 50% of the retrieved data.
Aggregate Queries with Unevenly Partitioned Replicas
- Alg: solution by the proposed algorithm
- Alg+Ref: solution after the refinement step
- Solution-1 & 2: two manually created solutions
Related Work
- Replication research
  - Exact copies of portions of data
  - Data availability and reliability
  - Multi-disk system with replicated data
- Data caching techniques
  - Using aggregate memory and cooperative caches
  - Management and replacement of replicas
- Our previous work on performance optimization using space partitioned replicas
Conclusions
- The proposed cost models are capable of estimating execution time trends.
- The designed greedy strategy, together with the dynamic programming algorithm, can choose a good set of candidate replicas that decrease the query execution time.
- Our implementations show good scalability.
- When data transfer bandwidth is the limiting factor, using a combination of space and attribute partitioned replicas should be preferred.