1
Efficient Management of Time Series in P2P
Application to technical analysis and the study of mobile objects
G. Gardarin, B. Nguyen, L. Yeh, K. Zeitouni, B. Butnaru, I. Sandu-Popa
Laboratoire PRiSM – Université de Versailles Saint-Quentin
BDA’09 - Namur
2
Motivation
Technical analysis (economics)
  Determine buy/sell operations based on time series calculations
  Empirical parameter tuning: many simulations to be run
  Very large time series (quotes every 15 seconds over thousands of items, with years of data)
  Objective: delegate computing power and caching to a P2P network
Mobile objects
  Compute aggregate queries over time series of sensor data (see paper for more details)
Gardarin et al. -- BDA'09
3
/!\ Time Series vs. Data Streams /!\
Time Series
  Persistent data queried on demand using complex queries.
  Historical data. Size is an issue.
  Current commercial performance: under 1M per second
Data Stream
  Transient data queried by simple continuous queries (event detection).
  Real-time oriented.
4
Contributions
Extensible TS model with functional operators compatible with XQuery 1.1
Efficient P2P techniques for XQuery 1.1 with special TS management
Current XQ 1.1 engines have very poor performance.
5
Outline
Introduction
Time Series Model
P2P TS computing
Experiments
Conclusion
6
Time Series Model
7
ROSeS Model: an infinite vector
TS entry: Date (ISO), Value (xs:double)
PX1-Close (CAC)
Date   2007-01-05  2007-01-08  …  2009-03-06
Value  5517.35     5518.59     …  2534.45
TS define a (metric) vector space: $TS_3 = TS_1 + s \cdot TS_2$
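The vector-space view can be illustrated with a small sketch. The dict-based representation, the function names `scale` and `add`, and the choice to add only on shared dates are assumptions made for illustration; the slide does not describe how ROSeS stores or aligns series.

```python
from datetime import date

# A time series is modelled here as a dict mapping dates to floats.
# This representation is an assumption, not the ROSeS storage format.
def scale(ts, s):
    """Multiply every value of ts by the scalar s."""
    return {d: s * v for d, v in ts.items()}

def add(ts1, ts2):
    """Add two series entry by entry on the dates they share."""
    return {d: ts1[d] + ts2[d] for d in sorted(ts1.keys() & ts2.keys())}

ts1 = {date(2007, 1, 5): 5517.35, date(2007, 1, 8): 5518.59}
ts2 = {date(2007, 1, 5): 1.0, date(2007, 1, 8): 2.0}
ts3 = add(ts1, scale(ts2, 0.5))  # TS3 = TS1 + s * TS2 with s = 0.5
```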
8
Granularity
A choice of semantics is needed in order to change scale.
Adopted semantics:
  Entry date = interval start (included)
  Next entry date = interval end (excluded)
The beginning of the day must be defined with millisecond precision.
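A minimal sketch of the adopted interval semantics, assuming the entries are kept as two parallel sorted lists. The function name `value_at` is hypothetical, and treating the last entry's interval as open-ended is an assumption of this sketch.

```python
from bisect import bisect_right
from datetime import datetime

def value_at(dates, values, t):
    """Return the value whose interval [dates[i], dates[i+1]) contains t.

    Returns None when t precedes the first entry. The last entry is
    treated as valid for every later t (an assumption of this sketch).
    """
    i = bisect_right(dates, t) - 1
    return values[i] if i >= 0 else None

dates = [datetime(2007, 1, 5), datetime(2007, 1, 8)]
values = [5517.35, 5518.59]
```

With these two entries, a timestamp on 2007-01-06 maps to 5517.35, while midnight on 2007-01-08 already belongs to the second interval because the interval end is excluded.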
9
Null values
Unknown value (?)
Undefined value (!)
Date   2007-01-05  2007-01-06  2007-01-08  2007-01-09  2009-03-06
Value  5517.35     !           5518.59     ?           2534.45
Assume the value ! for all dates preceding the first one.
Managing the “end” of a TS also needs the ! value.
10
Relational-like operators
Filter
Map
11
Union and Intersection
12
K-ary joins
$JOIN_{fun}(S_1, \ldots, S_k) = \{[t, m] \mid [t, val_1] \in S_1 \wedge \ldots \wedge [t, val_k] \in S_k \wedge m = fun(val_1, \ldots, val_k)\}$
/!\ Behavior for null values must be defined.
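The join definition can be sketched as follows. The dict representation and the null-handling policy (an undefined input wins over an unknown one, and either suppresses the call to `fun`) are assumptions of this sketch, since the slide only states that a policy must be chosen.

```python
UNKNOWN = "?"    # unknown value marker
UNDEFINED = "!"  # undefined value marker

def join(fun, *series):
    """K-ary join: keep dates present in every series and combine values with fun.

    Null handling is an assumption of this sketch: if any input is '!' the
    result is '!', otherwise if any input is '?' the result is '?'.
    """
    if not series:
        return {}
    common = set(series[0])
    for s in series[1:]:
        common &= set(s)
    out = {}
    for t in sorted(common):
        vals = [s[t] for s in series]
        if UNDEFINED in vals:
            out[t] = UNDEFINED
        elif UNKNOWN in vals:
            out[t] = UNKNOWN
        else:
            out[t] = fun(*vals)
    return out
```

For example, `join(lambda a, b: a + b, s1, s2)` returns one entry per date that appears in both `s1` and `s2`.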
13
Some window functions
Moving Average
Relative Strength Index (RSI)
Moving Average Convergence/Divergence (MACD)
In general:
  “Constant” or linear complexity in w
  Linear in t
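The complexity remark can be illustrated with a moving average that keeps a running sum, which is one common way to obtain a cost that is constant in the window width w and linear in the series length t. This is a sketch over a plain list of values, not the implementation used in the talk.

```python
def mavg(values, w):
    """Moving average over the last w values, computed with a running sum.

    Each step adds one value and drops one, so the total cost is linear
    in the length of the series and does not depend on w.
    """
    if w <= 0:
        raise ValueError("w must be positive")
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= w:
            running -= values[i - w]  # drop the value that left the window
        if i >= w - 1:
            out.append(running / w)
    return out
```

A naive version that re-sums the window at every step would instead be linear in w for each output entry.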
14
Some buy/sell rules
$SELL = SEL_{>1.1}(MAVG_{26}(S) / MAVG_{12}(S))$
$BUY = SEL_{>0}(XAVG_{9}(MAVG_{12}(S) - MAVG_{26}(S)))$
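As a rough illustration, the SELL rule can be sketched over a plain list of closing values. Reading SEL with a threshold as "keep the entries whose value exceeds the threshold" is an assumption, as are the function names and the way the two averages are aligned on the same end index.

```python
def mavg(values, w):
    """Moving average of the last w values; result[i] ends at values[i + w - 1]."""
    return [sum(values[i - w + 1:i + 1]) / w for i in range(w - 1, len(values))]

def sell_signals(values, long_w=26, short_w=12, threshold=1.1):
    """Indices i of values where MAVG_long / MAVG_short exceeds threshold."""
    long_avg = mavg(values, long_w)
    short_avg = mavg(values, short_w)
    offset = long_w - short_w  # align both averages on the same end index
    signals = []
    for k, la in enumerate(long_avg):
        sa = short_avg[k + offset]
        if sa != 0 and la / sa > threshold:
            signals.append(k + long_w - 1)
    return signals
```

On a series that drops sharply, the short average falls faster than the long one, so the ratio rises above the threshold and the later indices are reported.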
15
TS/XML: a practical exchange format
TS Schema
XQ 1.1 mavg implementation (naïve)
/!\ Limited maths functions
Benefit from the expressive power of XQ to write rules!
16
Preliminary results
N       W    Our Java  System X
1000    10   <1        16
1000    50   <1        45
1000    100  <1        91
2000    10   <1        28
2000    50   <1        90
2000    100  <1        178
4000    10   <1        53
4000    50   <1        179
4000    100  <1        357
16000   10   4         212
16000   50   4         765
16000   100  4         1404
100000  10   25        1914
100000  50   25        5026
100000  100  25        9251
500000  10   128       9862
500000  50   130       28259
500000  100  129       49347
Xeon X5450 @ 3.00GHz, Java 6, 1GB heap
mavg, MACD (java)
17
XQ Problems
Significant overhead from XML type-checking and structure (limit to xs:double)
Limited maths functions
TS are in fact manipulated in let clauses
Enhance our XQ processor with non-XQ functions on XML-TS data that respect our schema
18
P2P TS Computing
19
How can we achieve scalability?
Observation:
  Many runs of a given user share intermediate results
  Many users share intermediate results
Divide computation cost by n
Divide disk read/write time by n
Divide memory usage by n
20
TS Distribution – horizontal partitioning
CHORD DHT
/!\ Overlap is necessary
/!\ Choice limits window size
N/K (N)
21
DHT
Two sorts of “key/value” pairs
  Key: TSName -> list of slice IDs (numbered)
  Key: TSName+SliceID -> peer containing the slice
Connect/Disconnect is managed by the DHT
Computation algorithm
  P1 wants to compute Q1
  P1 gets the location of all TS slices needed
  Ship query to peers
  Compute query on peer (if possible)
  Transfer results to P1
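The two kinds of key/value pairs and the computation steps can be mimicked with an in-memory dict standing in for the DHT. Everything here is illustrative: the peer names, the slice contents, and the functions `plan` and `run` are hypothetical, and no network or Chord routing is involved.

```python
# A plain dict stands in for the DHT; peer names and slice IDs are made up.
dht = {
    "CAC40": [7, 8, 9],     # TSName -> list of slice IDs
    ("CAC40", 7): "peerA",  # TSName + SliceID -> peer holding the slice
    ("CAC40", 8): "peerB",
    ("CAC40", 9): "peerA",
}

# Local storage of each peer: slice ID -> values (hypothetical data).
peers = {
    "peerA": {7: [1.0, 2.0], 9: [5.0]},
    "peerB": {8: [3.0, 4.0]},
}

def plan(dht, ts_name):
    """Group the slices of ts_name by the peer that stores them."""
    by_peer = {}
    for slice_id in dht[ts_name]:
        peer = dht[(ts_name, slice_id)]
        by_peer.setdefault(peer, []).append(slice_id)
    return by_peer

def run(dht, ts_name, query, peers):
    """Ship query to each peer, evaluate it on the local slices, gather results."""
    results = {}
    for peer, slice_ids in plan(dht, ts_name).items():
        for slice_id in slice_ids:
            results[slice_id] = query(peers[peer][slice_id])
    return [results[s] for s in dht[ts_name]]
```

Calling `run(dht, "CAC40", max, peers)` evaluates `max` on each slice where it is stored and returns the per-slice results in slice order.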
22
(Naïve) Caching
JOIN(MAVG(CAC40,10), SCALE(RSI(CAC40, 14), 100), SUM) [7, 8, 9]
Limitation: equivalent expressions
Open issue: how to choose the peer?
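The limitation named on this slide shows up directly in a cache keyed on the literal expression text and slice ID. The class below is a hypothetical sketch of such a naive cache, not the XQ2P implementation.

```python
class NaiveCache:
    """Cache keyed on the literal expression string and the slice ID."""

    def __init__(self):
        self.store = {}
        self.misses = 0

    def get(self, expr, slice_id, compute):
        """Return the cached result for (expr, slice_id), computing it on a miss."""
        key = (expr, slice_id)
        if key not in self.store:
            self.misses += 1
            self.store[key] = compute()
        return self.store[key]
```

Two requests for `"JOIN(A, B, SUM)"` on slice 7 share one computation, but `"JOIN(B, A, SUM)"` on the same slice is a miss even though the result is the same, because the keys differ textually.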
23
Experiments
24
XQ2P Prototype
98% XQuery 1.0 compliant database (java)
XQ 1.1 window functionalities
Optimized external TS functions
P2P storage and computing
http://cassiopee.prism.uvsq.fr/XQ2P/
25
P2PTester infrastructure
26
Experimental evaluation using P2PTester (4 machines)
P    T_INDEX  T_R     T_P   T_Q  T_NET  T_P2P
8    7.1      56.8    4473  <1   400    4930
16   6.3      100.8   2176  <1   400    2677
32   7.4      236.8   1106  <1   400    1743
64   8.4      537.6   580   <1   400    1518
128  8.9      1139.2  286   <1   400    1825
256  9.7      2483.2  140   <1   400    3023
27
Relative gain simulation (no caching)
28
Conclusion
Already efficient, still lots to do…
29
Current / Future Work
TS granularity operators
“Enhanced” caching (canonical form transformation)
Date-based join optimization
TS distance computation, top-k
XQ 1.1 window operator optimization
Other XQ2P improvements
30
Thank you!