1
Locality Optimizations in OceanStore
Patrick R. Eaton, Dennis Geels
An introduction to introspective techniques for exploiting locality in wide area storage utilities.
2
Agenda
OceanStore Review
Problem Overview
Previous Work
Proposed Solution
Prefetching Algorithm
Preliminary Results
Future Work
3
OceanStore Review
Properties of OceanStore relevant to introspective locality optimizations
–implemented in the extremely wide area
–has many places to put any single piece of data
–cannot rely on users to make relationships among data explicit
–depends on effective locality optimizations for improved performance
–locality optimization cannot be solved exactly
4
Problem Overview
Passively observe data accesses
–data is shared among multiple users
–single users access the network from different physical locations
–data is replicated across the network
Optimize the location of data to provide quicker access to users
–cluster semantically related data
–replicate data to move it closer to consumers
–migrate primary replicas toward the source of updates
5
Measurable Attributes
File Temperature
–A measure that indicates the frequency of access to the file
–A hot file is frequently accessed
Semantic Distance (Kuenning)
–Any measure that can quantify relationships between files on the range [0, ∞)
–Local distance relates one instance of a file access to another
–Reference distance is an aggregate measure that summarizes all local distances for a pair of files
–Typical measures use access order or timing information
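The attributes above can be illustrated with a minimal sketch. It assumes temperature is a plain access count, local distance is the number of positions separating two accesses within a small window, and reference distance is the arithmetic mean of the local distances. The slide fixes none of these choices; the function names and the `window` parameter are hypothetical.

```python
from collections import Counter, defaultdict

def file_temperature(accesses):
    """Count accesses per file; a higher count means a hotter file."""
    return Counter(accesses)

def local_distances(accesses, window=4):
    """Yield (earlier, later, distance) for accesses at most `window` apart.

    Assumption: distance is the number of positions between the two accesses.
    """
    for j, later in enumerate(accesses):
        for i in range(max(0, j - window), j):
            if accesses[i] != later:
                yield accesses[i], later, j - i

def reference_distances(accesses, window=4):
    """Summarize local distances per ordered file pair (assumed: arithmetic mean)."""
    totals = defaultdict(lambda: [0, 0])  # pair -> [sum of distances, count]
    for earlier, later, dist in local_distances(accesses, window):
        totals[(earlier, later)][0] += dist
        totals[(earlier, later)][1] += 1
    return {pair: total / count for pair, (total, count) in totals.items()}

trace = ["A", "B", "C", "A", "B", "D"]
temps = file_temperature(trace)
ref = reference_distances(trace)
```

A smaller reference distance for a pair suggests a closer relationship between the two files; a measure based on timing would slot into `local_distances` in the same way.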
6
Prefetching Techniques
Automatic Prefetching (Griffioen and Appleton)
–construct a probability graph that records accesses which follow within a lookahead period
–predict a prefetch when the chance of an access is above a tunable parameter
Context Modeling (Kroeger and Long)
–record in a trie all access sequences which have been observed
–maintain pointers to all nodes which represent current contexts
–predict a prefetch when the chance of an access to a child of a current context is above a probability threshold
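As a rough illustration of the probability-graph idea, the sketch below counts which files follow each file within a lookahead window and predicts those whose share of the outgoing edge weight meets a threshold. Computing the chance as edge weight divided by total outgoing weight is an assumption made here, and the class and parameter names are hypothetical rather than taken from the original work.

```python
from collections import defaultdict, deque

class ProbabilityGraph:
    def __init__(self, lookahead=2, threshold=0.5):
        self.threshold = threshold
        self.recent = deque(maxlen=lookahead)  # files still inside the lookahead period
        self.edges = defaultdict(lambda: defaultdict(int))  # src -> dst -> count

    def access(self, name):
        """Record an access, then return the files predicted to follow it."""
        for prev in self.recent:
            if prev != name:
                self.edges[prev][name] += 1
        self.recent.append(name)
        return self.predict(name)

    def predict(self, name):
        """Return successors whose chance meets the threshold (assumed formula)."""
        followers = self.edges.get(name, {})
        total = sum(followers.values())
        return sorted(dst for dst, n in followers.items()
                      if n / total >= self.threshold)

graph = ProbabilityGraph(lookahead=1, threshold=0.6)
predictions = [graph.access(f) for f in ["A", "B", "A", "B", "A", "C"]]
```

In this run, `B` is predicted after the later accesses to `A` because it accounts for most of the edge weight leaving `A`.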
7
Our Approach
Exploit the ideas of semantic distance to compute relationships among data
–Cluster data based on the observed relationships
–Store a summary of these relationships with the data
Migrate (prefetch) files based on familiar patterns in the access stream
–recognize higher order correlations as in context modeling
–tolerate noise in the access stream
8
Motivation for Prefetching Algorithm
[Figure: example access stream A B Y Z K C A B]
Many patterns can be predicted only by observation of higher-order correlation, that is, by combining several pieces of past history.
Other patterns can only be detected through identification and filtering of noise.
9
General Prefetching Algorithm
Update
–Record the most recent file accesses in the file history buffer (FHB)
–Each time a new file S is accessed, extract all triples of the form (FHB(i), FHB(j)) → S from the FHB and update them in the second-order distance table
Predict
–Each time a new file S is accessed, examine the distance table entries of (FHB(i), S)
–Prefetch files that are predicted with confidence above a certain threshold
Problems
–O(k²) work to update the distance table
–Noise infects the distance table
[Figure: an example FHB (y B w g o F w K) and its distance table, with entries keyed by pairs such as (B,F), (y,B), and (o,w)]
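The update and predict steps can be sketched as follows. This is a simplified stand-in, not the authors' implementation: it stores successor counts instead of semantic distances, and it treats the fraction of a pair's recorded successors as the confidence. The class name, the `history` and `threshold` parameters, and the count-based confidence are all assumptions.

```python
from collections import defaultdict, deque
from itertools import combinations

class SecondOrderPrefetcher:
    def __init__(self, history=4, threshold=0.5):
        self.fhb = deque(maxlen=history)  # file history buffer
        self.threshold = threshold
        # (earlier, later) pair -> successor -> count (counts stand in for distances)
        self.table = defaultdict(lambda: defaultdict(int))

    def access(self, s):
        history = list(self.fhb)
        # Update: every ordered pair in the FHB is a context that preceded s.
        # This loop is the O(k^2) cost noted on the slide.
        for a, b in combinations(history, 2):
            self.table[(a, b)][s] += 1
        # Predict: look up the contexts (FHB(i), s) and collect likely successors.
        confidence = defaultdict(float)
        for a in history:
            followers = self.table.get((a, s))
            if not followers:
                continue
            total = sum(followers.values())
            for nxt, n in followers.items():
                confidence[nxt] = max(confidence[nxt], n / total)
        self.fhb.append(s)
        return sorted(f for f, c in confidence.items()
                      if c >= self.threshold and f != s)

prefetcher = SecondOrderPrefetcher(history=2, threshold=0.5)
results = [prefetcher.access(f) for f in ["A", "B", "C", "A", "B", "C"]]
```

With a larger `history`, unrelated accesses also get recorded as contexts and successors, which is one way to see the noise problem listed above.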
10
Optimizations to the Prefetching Algorithm
First-order distance table
–Records files that are close, as measured by semantic distance
–Allows reverse lookup
Use first-order distance tables to filter out irrelevant file relationships
–Update only relevant entries in the second-order distance table
–Search for predictions based on only relevant access pairs
[Figure: indicative FHBs and a first-order distance table]
11
Prefetching Algorithm Example
Update
–Extract relevant triples by intersecting the FHB with the results from the reverse lookup in the first-order tables
Predict
–Extract relevant doubles by intersecting the FHB with the results from the reverse lookup in the first-order tables
–Prefetch if the second-order table predicts a future access with sufficient confidence
[Figure: worked example with FHB (y Q t u v R w x), new access S, a first-order table, and a second-order table containing the entry (Q,R). The update step finds the parents of S, then the parents of R, and updates the table. The predict step finds the parents of R and checks the table for a prediction.]
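One possible reading of this example is sketched below. It assumes the first-order table is supplied up front as a mapping from each file to its set of close files, that a "parent" of S is any file whose first-order entry lists S, and that confidence is a fraction of successor counts. The table contents are loosely based on the slide's example; the class and method names are hypothetical.

```python
from collections import defaultdict, deque

class FilteredPrefetcher:
    def __init__(self, first_order, history=8, threshold=0.5):
        # Reverse lookup: file -> files whose first-order entry lists it.
        self.parents = defaultdict(set)
        for parent, children in first_order.items():
            for child in children:
                self.parents[child].add(parent)
        self.fhb = deque(maxlen=history)
        self.threshold = threshold
        self.table = defaultdict(lambda: defaultdict(int))  # second-order table

    def _relevant(self, history, name):
        """Intersect the FHB with the first-order parents of `name`."""
        candidates = self.parents.get(name, set())
        return [(i, f) for i, f in enumerate(history) if f in candidates]

    def access(self, s):
        history = list(self.fhb)
        # Update: only triples (a, b) -> s where b is a parent of s
        # and a is an earlier parent of b.
        for j, b in self._relevant(history, s):
            for _, a in self._relevant(history[:j], b):
                self.table[(a, b)][s] += 1
        # Predict: only doubles (a, s) where a is a parent of s.
        predicted = set()
        for _, a in self._relevant(history, s):
            followers = self.table.get((a, s), {})
            total = sum(followers.values())
            predicted.update(f for f, n in followers.items()
                             if n / total >= self.threshold)
        self.fhb.append(s)
        return sorted(predicted)

first_order = {
    "Q": {"a", "b", "R", "c", "d"},
    "R": {"b", "S", "g", "h", "t"},
    "S": {"t", "d", "e", "R", "v"},
}
p = FilteredPrefetcher(first_order)
first_pass = [p.access(f) for f in "y Q t u v R w x S".split()]
second_pass = [p.access(f) for f in ["Q", "n", "R"]]
```

On the first pass only the triple (Q, R) → S is recorded, even though the FHB holds eight files. On the second pass the access to R predicts S despite the unrelated access `n` between Q and R.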
12
Preliminary Results (Local System)
13
Future Work
Retarget the simulations to model OceanStore
Continue to refine the prefetching algorithm
Examine the potential of higher order prefetching
Combine prefetching and clustering
Look for opportunities to test the ideas on different workloads