Published by Garry Woods. Modified over 9 years ago.
1
CS632
Query Optimization over Web Services
Utkarsh Srivastava, Kamesh Munagala, Jennifer Widom, Rajeev Motwani
Presented by Ajay Kumar Sarda
2
Motivation
- Web services are emerging as a popular standard for sharing data and functionality
- Databases sit behind web services
- DBMS-like capabilities are needed when the data sources are web services
- Queries spanning multiple web services need query optimization
3
Motivating Example
- A credit card company wants to send out mail for its new credit card offer
- I: potential recipient names
- WS_1: name (n) -> credit rating (cr)
- WS_2: name (n) -> credit card number (ccn)
- WS_3: card number (ccn) -> payment history (ph)
- One possible execution is WS_1, WS_2, WS_3. Is it optimal?
4
Challenges
- Web services have different response times
- Precedence constraints
- Trade-off between a linear pipeline and parallelism
- Overhead of parsing SOAP/XML headers
5
Related Work
- Query optimization in the presence of limited access patterns
- Binding patterns such as R(A^b, B^f), where A must be bound and B is free
- Searches a space of annotated query plans and prunes invalid and non-viable plans
- Starts with an initial set S of plans containing only atomic plans
- S is iteratively updated by adding new plans obtained by combining plans from S using selection and join operations
6
Outline of the Talk
- WSMS preliminaries
- Query optimization with and without precedence constraints
- Data chunking
- Experimental evaluation
- Conclusion
- Future work
7
WSMS (Web Service Management System) Architecture
8
Query Model
- Web service WS_i is denoted WS_i(X_i^b, Y_i^f)
- X_i: bound attributes
- Y_i: free attributes
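The bound/free notation can be made concrete with a minimal Python sketch. It encodes the three services of the motivating example and derives which services must run first. The `WebService` class and the `prerequisites` helper are hypothetical names introduced for illustration, and the rule used (a service depends on every service that returns one of its bound attributes, unless the query input already supplies it) is an assumption about how precedence constraints arise, not something stated on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WebService:
    name: str
    bound: frozenset  # X_i: attributes that must be supplied as input
    free: frozenset   # Y_i: attributes returned as output


def prerequisites(services, input_attrs):
    """Map each service name to the services that must run before it.

    Assumption: a service depends on every service that returns one of its
    bound attributes, unless the query input already supplies that attribute.
    """
    producers = {}
    for ws in services:
        for attr in ws.free:
            producers.setdefault(attr, set()).add(ws.name)

    prereq = {}
    for ws in services:
        needed = ws.bound - frozenset(input_attrs)
        prereq[ws.name] = set()
        for attr in needed:
            prereq[ws.name] |= producers.get(attr, set())
    return prereq


# Services from the motivating example; the input I supplies names (n).
services = [
    WebService("WS1", frozenset({"n"}), frozenset({"cr"})),
    WebService("WS2", frozenset({"n"}), frozenset({"ccn"})),
    WebService("WS3", frozenset({"ccn"}), frozenset({"ph"})),
]
constraints = prerequisites(services, {"n"})
print(constraints)
```

Under these assumptions only WS3 has a prerequisite (WS2, which supplies ccn), so WS1 is free to run anywhere in the plan.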
9
Query Model (Contd.)
10
Query Plans
11
Execution Model
- A thread T_i is created for each web service WS_i
- T_i takes its input from a join thread J_i
- J_i joins the outputs of the parents of WS_i
- J_out joins the outputs of all leaf web services
12
Execution Model (Contd.)
13
Statistics
- Per-tuple response time c_i
  - c_i = 1/r_i, where r_i is the maximum rate at which results of invocations can be obtained from WS_i
  - Depends on web service provisioning, network conditions, and the load on the web service
- Selectivity s_i
  - Average number of returned tuples, per input tuple, that remain unfiltered after applying predicates
  - s_i <= 1: selective; s_i > 1: proliferative
14
Bottleneck Cost Metric
- Consider a query plan H
- P_i(H): the set of predecessors of WS_i in H
- R[S]: the combined selectivity of all the web services in S, i.e. the product of their selectivities
- For every tuple in the input I to plan H, the average number of tuples that WS_i needs to process is R[P_i(H)]
- The average processing time required by WS_i per original input tuple in I is R[P_i(H)] * c_i
- Cost of the query plan H: cost(H) = max_i ( R[P_i(H)] * c_i )
15
Bottleneck Cost Metric (Contd.)
- Plan 1: max(2*I, 10*0.1*I, 5*0.5*I) = 2.5*I
- Plan 2: max(2*I, 10*I, 5*5*I) = 25*I
- Plan 2 is 10 times slower than Plan 1
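The metric can be checked with a short Python sketch that evaluates cost(H) = max_i ( R[P_i(H)] * c_i ) on a plan DAG. The per-service values (WS1: c = 2, s = 0.1; WS2: c = 10, s = 5; WS3: c = 5) and the two plan shapes are assumptions inferred from the arithmetic in the plan comparison, since the plan figure is not part of this transcript. The function names are hypothetical.

```python
from math import prod


def all_predecessors(parents, ws):
    """Return P_i(H): every transitive predecessor of ws in the plan DAG."""
    seen = set()
    stack = list(parents[ws])
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(parents[p])
    return seen


def bottleneck_cost(parents, cost, sel):
    """cost(H) = max over services of R[P_i(H)] * c_i, per input tuple."""
    return max(
        prod(sel[p] for p in all_predecessors(parents, ws)) * cost[ws]
        for ws in parents
    )


# Assumed values, chosen to reproduce the slide's arithmetic.
cost = {"WS1": 2.0, "WS2": 10.0, "WS3": 5.0}
sel = {"WS1": 0.1, "WS2": 5.0, "WS3": 1.0}  # WS3's selectivity is a placeholder

# Each plan maps a service to its direct parents.
plan1 = {"WS1": [], "WS2": ["WS1"], "WS3": ["WS2"]}  # linear pipeline
plan2 = {"WS1": [], "WS2": [], "WS3": ["WS2"]}       # WS1 runs in parallel

print(bottleneck_cost(plan1, cost, sel))  # 2.5
print(bottleneck_cost(plan2, cost, sel))  # 25.0
```

Placing the selective WS1 ahead of the slow, proliferative WS2 is what removes the bottleneck: WS2 then sees only a tenth of the input.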
16
Q.O. without Precedence Constraints
Lemma: "There exists an optimal plan that is a linear ordering of the selective web services, i.e., has no parallel dispatch of data."
17
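Q.O. without Precedence Constraints
Lemma: "Let WS_1, ..., WS_n be a plan with a linear ordering of the selective web services. If c_i > c_{i+1}, then WS_i and WS_{i+1} can be swapped without increasing the cost of the plan."
Case c_i > c_{i+1}, with F_i denoting the number of tuples reaching position i per input tuple:
- Before the swap: WS_i (s_i, c_i) costs F_i * c_i, and WS_{i+1} (s_{i+1}, c_{i+1}) costs F_i * s_i * c_{i+1}
- After the swap: WS_{i+1} costs F_i * c_{i+1}, and WS_i costs F_i * s_{i+1} * c_i
- Both costs after the swap are at most F_i * c_i, because c_{i+1} < c_i and s_{i+1} <= 1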
18
Q.O. without Precedence Constraints (Contd.)
Theorem: "For selective web services with no precedence constraints, the optimal plan is a linear ordering of the web services by increasing response time, ignoring selectivities."
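The theorem reduces plan selection to a sort. The Python sketch below orders services by increasing response time and, as a sanity check, compares the result against every linear ordering by brute force. It checks linear orderings only, so it illustrates the theorem rather than proving it. The selectivities are the ones used later in the experiments (0.4, 0.3, 0.2, 0.1); the costs are hypothetical values picked for illustration.

```python
from itertools import permutations


def linear_cost(order, cost, sel):
    """Bottleneck cost of a linear pipeline that visits services in `order`."""
    rate = 1.0   # tuples reaching the current service per input tuple
    worst = 0.0
    for ws in order:
        worst = max(worst, rate * cost[ws])
        rate *= sel[ws]
    return worst


def order_by_response_time(cost):
    """Theorem's ordering: increasing response time, ignoring selectivities."""
    return sorted(cost, key=cost.get)


sel = {"WS1": 0.4, "WS2": 0.3, "WS3": 0.2, "WS4": 0.1}
cost = {"WS1": 2.0, "WS2": 0.5, "WS3": 1.5, "WS4": 1.0}  # hypothetical costs

best_order = order_by_response_time(cost)
best_cost = linear_cost(best_order, cost, sel)

# Brute-force check over all 4! linear orderings.
cheapest = min(linear_cost(p, cost, sel) for p in permutations(cost))

print(best_order, best_cost, cheapest)
```

Ordering by selectivity instead (WS4 first) would make the first stage cost 1.0 per input tuple, twice the bottleneck of the response-time ordering for these values.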
19
Q.O. with Precedence Constraints
- The plan DAG H is constructed incrementally by greedily adding one web service at a time
- The web service chosen is the one that can be added to H at minimum cost and whose prerequisite web services have all already been added to H
- M_i: the set of all web services that are prerequisites for WS_i
20
Adding a Web Service to the Plan
- Given a partial plan H-bar, add WS_x
- Compute the best cut C_x such that placing edges from the web services in C_x to WS_x minimizes the cost
- PC_x: the set of all the web services in C_x together with all their predecessors in H-bar
- Cost incurred by adding WS_x: cost(WS_x) = R[PC_x] * c_x, where c_x is the response time of WS_x
21
Adding a Web Service (Contd.)
- Associate a variable z_i with every WS_i, set to 1 if WS_i belongs to PC_x and 0 otherwise
- The optimal set PC_x is obtained by solving a linear program (LP)
22
Greedy Algorithm
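The greedy construction described on the preceding slides can be sketched in Python as follows. This is not the slides' method in one important respect: instead of solving the LP, the sketch enumerates every candidate predecessor set by brute force, which is only workable for a handful of services. Function names are hypothetical. The example uses the precedence constraints and selectivities from the experiments (WS_1 < WS_3, WS_2 < WS_4; selectivities 2, 1, 0.1, 0.1); the costs (1 for WS_1 to WS_3, 2 for WS_4) are assumed values within the range the experiments mention.

```python
from itertools import combinations
from math import prod


def best_predecessors(x, placed, preds, prereq, sel):
    """Brute-force stand-in for the LP: find the set PC_x minimizing R[PC_x].

    A valid PC_x contains all prerequisites M_x and is closed under
    predecessors in the partial plan.
    """
    best_set, best_rate = None, float("inf")
    candidates = sorted(placed)
    for size in range(len(candidates) + 1):
        for combo in combinations(candidates, size):
            chosen = set(combo)
            if not prereq[x] <= chosen:
                continue  # a prerequisite is missing
            if any(not preds[w] <= chosen for w in chosen):
                continue  # not closed under predecessors
            rate = prod(sel[w] for w in chosen)
            if rate < best_rate:
                best_set, best_rate = chosen, rate
    return best_set, best_rate


def greedy_plan(cost, sel, prereq):
    """Add one service at a time, always the one with minimum added cost."""
    preds, order, placed = {}, [], set()
    remaining = set(cost)
    while remaining:
        best = None
        for x in sorted(remaining):
            if not prereq[x] <= placed:
                continue  # prerequisites not yet in the plan
            pc, rate = best_predecessors(x, placed, preds, prereq, sel)
            candidate = (rate * cost[x], x, pc)
            if best is None or candidate[:2] < best[:2]:
                best = candidate
        if best is None:
            raise ValueError("precedence constraints cannot be satisfied")
        _, x, pc = best
        preds[x] = pc
        order.append(x)
        placed.add(x)
        remaining.remove(x)
    return order, preds


prereq = {"WS1": set(), "WS2": set(), "WS3": {"WS1"}, "WS4": {"WS2"}}
sel = {"WS1": 2.0, "WS2": 1.0, "WS3": 0.1, "WS4": 0.1}
cost = {"WS1": 1.0, "WS2": 1.0, "WS3": 1.0, "WS4": 2.0}  # assumed costs

order, preds = greedy_plan(cost, sel, prereq)
print(order)
print(preds)
```

For these values WS4 ends up behind WS1, WS2, and WS3: routing its input through the selective WS3 cuts its load to 0.2 tuples per input tuple, even though only WS2 is a required prerequisite.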
23
Data Chunking
- Every web service call incurs overhead for parsing SOAP/XML headers and for the network
- Pass tuples to a web service in chunks
- The response time of WS_i depends on the input chunk size
- c_i(k): response time of WS_i on a chunk of size k
- A limit k_i^max exists on the maximum chunk size
24
Data Chunking (Contd.)
- The query optimizer must decide on the optimal chunk size for each web service
- "The optimal chunk size to be used by WS_i is k_i* such that c_i(k_i*)/k_i* is minimized"
- Profiling is combined with query processing to try out various chunk sizes
- Intermediate tuples between any two web services in the pipelined plan are buffered
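Choosing k_i* amounts to minimizing the per-tuple response time c_i(k)/k over the chunk sizes that were profiled. A minimal Python sketch, assuming the profiler yields a table of measured response times per chunk size; the table values and the function name are hypothetical.

```python
def optimal_chunk_size(response_time, k_max):
    """Return the profiled chunk size k <= k_max that minimizes c(k) / k."""
    candidates = [k for k in response_time if k <= k_max]
    if not candidates:
        raise ValueError("no profiled chunk size within the limit")
    return min(candidates, key=lambda k: response_time[k] / k)


# Hypothetical profile for one web service: chunk size -> response time.
profile = {1: 1.0, 5: 2.0, 10: 3.0, 20: 8.0}

print(optimal_chunk_size(profile, k_max=20))  # 10
print(optimal_chunk_size(profile, k_max=5))   # 5
```

In this profile the per-tuple time falls from 1.0 to 0.3 as fixed overhead is amortized, then rises again at k = 20, so the largest allowed chunk is not automatically the best one.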
25
Experimental Evaluation
- Metric: total running time
- Compare the plans produced by the optimizer against:
  - Parallel: dispatch data to the web services in parallel
  - SelOrder: choose the web service with lower selectivity first
- Compare the running time with and without chunking
- Compare the WSMS cost against the cost of the slowest web service
26
Experimental Setup
- The WSMS prototype is a multithreaded system written in Java
- Apache Axis tools are used for communicating with web services
- Java Reflection
- Different costs are obtained by varying delays
- Different selectivities are obtained by rejecting each tuple with probability 1 - s_i
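The selectivity knob in this setup is easy to mimic. The Python sketch below is a stand-in for illustration, not the Java prototype: each tuple is kept with probability s_i, that is, rejected with probability 1 - s_i. The artificial delay that models cost is left out so the example runs instantly, and the function name and seed are arbitrary choices.

```python
import random


def simulated_service(selectivity, rng):
    """Return a stand-in web service that keeps each tuple with probability s_i."""
    if not 0.0 <= selectivity <= 1.0:
        # Rejection alone cannot model proliferative services (s_i > 1).
        raise ValueError("selectivity must lie in [0, 1]")

    def call(tuples):
        # rng.random() < s_i holds with probability s_i.
        return [t for t in tuples if rng.random() < selectivity]

    return call


rng = random.Random(42)  # fixed seed so the run is repeatable
ws = simulated_service(0.3, rng)
survivors = ws(range(10_000))
observed = len(survivors) / 10_000
print(observed)
```

The observed fraction of surviving tuples should land close to the configured selectivity of 0.3.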
27
No Precedence Constraints
- Web services: WS_1, WS_2, WS_3, WS_4
- Selectivities set to 0.4, 0.3, 0.2, 0.1
- Range of costs c varied from [0.2, 2] to [2, 2]
- Parallel: WS_4
- SelOrder: WS_4
28
Precedence Constraints
- Web services: WS_1, WS_2, WS_3, WS_4
- Constraints: WS_1 < WS_3 and WS_2 < WS_4 (WS_1 precedes WS_3, WS_2 precedes WS_4)
- Selectivities: 2, 1, 0.1, 0.1
- Uniform cost for WS_1, WS_2, WS_3, with the cost of WS_4 varied from 0.4 to 2
29
Data Chunking
- Web services: WS_1, WS_2, WS_3, WS_4
- No precedence constraints
- Uniform cost
- Selectivity set to 0.5
- Web services are arranged in a linear pipeline (Optimizer)
- Equal chunk size
30
WSMS Cost vs. Bottleneck Cost
- No precedence constraints
- Uniform web service costs
- Selectivity set to 0.5
- Web services arranged in a linear pipeline
31
Future Work
- Different input tuples following different plans
- Adaptive plans that change with response times
- Web services with monetary costs
- Multiple web services for the same data
- Profiling techniques that track response times and selectivities
- Caching techniques at the WSMS
32
Conclusion
- Web Service Management System (WSMS)
- Bottleneck cost: the cost of a pipelined plan
- Optimal pipelined plan respecting precedence constraints
- Optimal chunk size
33
References
- U. Srivastava, J. Widom, K. Munagala, and R. Motwani. Query Optimization over Web Services.
- Query optimization in the presence of limited access patterns. In Proc. of ACM SIGMOD Conf. on Management of Data.
34
Thank You!