Replication and Consistency (3). Reference: Automated Hoarding for Mobile Computers by G. Kuenning and G. Popek.

Hoarding
- Prefetching and storing objects for access when the network is disconnected or weakly connected.
- Example: Internet Explorer lets you browse web pages offline.

Problem
- Local storage space is limited. How do I hoard all the data I need so that I can continue to work while I’m offline?
  - Suppose I have 80 GB of files on the connected machine and a 20 GB hard drive in the mobile computer. Which files should I copy so that I can continue to work without disruption?
  - If the mobile device also has an 80 GB hard drive, everything fits and no hoarding profile is needed.
- The difficult challenge is selecting which files should be stored locally.

Problem
- User input is necessary.
  - Users should say what they want hoarded for disconnected access.
  - But users do not know which files they really need. They can say that they need PowerPoint, but not that they also need some .dll file required to print the presentation on a laser printer.

SEER System Overview
- Observer
  - Observes user behavior in the form of high-level file accesses (e.g., open).
- Correlator
  - Computes the relation between files (called semantic distance).
- Clustering Algorithm
  - Clusters related files into projects.
- Replication System
  - Takes care of replicating the files.
- External Investigators
  - Can add additional file-relation data.

Flow Diagram
- Observing user behavior
- Computing semantic distance
- User wishes to disconnect
- Clustering the files into projects
- Storing whole projects on local disk until the maximum hoard size is reached
- Disconnect
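The "storing whole projects" step in the flow above can be sketched as a greedy fill. This is only an illustration under stated assumptions: the `Project` type is hypothetical, the projects are assumed to arrive already ranked by priority, each file is assumed to belong to one project, and skipping a project that does not fit (rather than stopping) is a choice made here, not something the slides specify.

```python
from dataclasses import dataclass


@dataclass
class Project:
    """Hypothetical cluster of related files produced by the clustering step."""
    name: str
    files: dict[str, int]  # file path -> size (any consistent unit)

    @property
    def size(self) -> int:
        return sum(self.files.values())


def fill_hoard(projects: list[Project], max_hoard_size: int) -> list[Project]:
    """Pick whole projects, in the given (assumed priority) order, while they fit."""
    hoard: list[Project] = []
    used = 0
    for project in projects:
        # Only whole projects are hoarded; a project that does not fit is skipped.
        if used + project.size <= max_hoard_size:
            hoard.append(project)
            used += project.size
    return hoard


ranked = [
    Project("thesis", {"thesis.tex": 40, "refs.bib": 10}),
    Project("videos", {"talk.mp4": 500}),
    Project("slides", {"talk.ppt": 30}),
]
chosen = [p.name for p in fill_hoard(ranked, max_hoard_size=100)]
```

With a hoard size of 100, the "videos" project is too large and is skipped, while "thesis" and "slides" are stored in full.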

Semantic Distance
- A measure of the relationship between two file references.
- Assumption:
  - Semantic locality correlates with temporal locality.
- Definition 1:
  - Semantic distance is equal to the elapsed clock time between the references.
  - Problems: interruptions such as phone calls; disparity between human and computer time scales.
- Definition 2:
  - Semantic distance is equal to the number of intervening references.
  - Problem: lifetimes are more important than points in time.

Semantic Distance
- Definition 3:
  - The semantic distance between the opening of file A and the opening of file B is 0 if A is still open when B is opened.
  - Otherwise, the semantic distance is equal to the number of intervening references.
- Not symmetric!

Semantic Distance
- Example:
  - Reference sequence: A B B C C A D D
  - The first appearance of a file is its open, the second is its close.
  - A => B, C, D = 0, 0, 3
  - B => C, D = 1, 2
  - C => D = 1
  - All other distances are undefined.
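The example can be reproduced with a short sketch of Definition 3. It rests on two assumptions read off the example values rather than stated on the slides: the count of intervening references is taken over opens only and includes the open of the second file, and each file is opened exactly once. The function name `lifetime_distances` is made up for this illustration.

```python
def lifetime_distances(sequence: str) -> dict[tuple[str, str], int]:
    """Distances between file opens; first appearance = open, second = close."""
    open_count = 0                     # number of opens seen so far
    opened_at: dict[str, int] = {}     # file -> value of open_count at its open
    closed: set[str] = set()
    distances: dict[tuple[str, str], int] = {}
    for ref in sequence:
        if ref in opened_at and ref not in closed:
            closed.add(ref)            # second appearance: the file is closed
            continue
        open_count += 1                # first appearance: the file is opened
        for earlier, position in opened_at.items():
            if earlier in closed:
                # Opens since `earlier` was opened, including this one.
                distances[(earlier, ref)] = open_count - position
            else:
                # `earlier` is still open, so the distance is 0.
                distances[(earlier, ref)] = 0
        opened_at[ref] = open_count
    return distances


example = lifetime_distances("ABBCCADD")
```

Only pairs in open order appear in the result, which reflects the asymmetry: there is an entry for (A, D) but none for (D, A).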

Semantic Distance
- O(N²) space complexity
- O(N) time complexity per update
  - N is the number of tracked files
- Improvement
  - Only the n (= 20) nearest distances are tracked for each file.
  - Only distances from files within a distance of less than M (= 100) are updated.
  - Replacements among the n nearest files use a priority and aging system.
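The bounded neighbor list can be sketched as follows. This is a deliberately simplified stand-in: it evicts the farthest entry when the list overflows, whereas the slides describe a priority and aging system whose details are not given here. The function name and the dictionary layout are assumptions for illustration.

```python
def update_neighbors(neighbors: dict[str, float], other: str, distance: float,
                     n: int = 20, m: float = 100) -> None:
    """Record the distance to `other`, keeping at most n entries below m."""
    if distance >= m:
        return                         # too far away to be worth tracking
    neighbors[other] = distance
    if len(neighbors) > n:
        # Simplification: drop the farthest neighbor (not the real policy).
        farthest = max(neighbors, key=neighbors.get)
        del neighbors[farthest]


nearest: dict[str, float] = {}
for name, dist in [("a", 5), ("b", 1), ("c", 3), ("d", 150)]:
    update_neighbors(nearest, name, dist, n=2, m=100)
```

Because each file keeps at most n entries, the table grows linearly with the number of files instead of quadratically.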

Data Reduction
- A file relation is wanted, not a reference relation.
- The distances computed for a file pair could be averaged, but the arithmetic mean doesn’t always work well.
  - Example: if three event pairs produce distances of 1, 1, and 1498, the arithmetic mean is 500, the same as for distances of 500, 500, and 500.
- The geometric mean is used instead.
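A small calculation shows why the geometric mean separates the two cases in the example. The helper names are made up for this sketch, and it assumes all distances are positive; how zero distances are folded in is not covered by the slides.

```python
import math


def arithmetic_mean(distances: list[float]) -> float:
    return sum(distances) / len(distances)


def geometric_mean(distances: list[float]) -> float:
    """n-th root of the product, computed via logarithms (needs values > 0)."""
    return math.exp(sum(math.log(d) for d in distances) / len(distances))


mostly_close = [1, 1, 1498]
always_far = [500, 500, 500]

am_close = arithmetic_mean(mostly_close)   # same as for always_far
gm_close = geometric_mean(mostly_close)    # cube root of 1498, roughly 11.4
gm_far = geometric_mean(always_far)        # stays at about 500
```

The arithmetic mean cannot tell the two file pairs apart, while the geometric mean keeps the pair that is usually referenced together much closer.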

Other Measures
- Files in the same directory
- Files with related filenames
- Hot links
  - Example: #include in C/C++ programs
- External investigators
  - Provide additional information about file relationships.
  - Example: an investigator that parses C source files for #include.
  - This information is used during clustering.
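An external investigator of the #include kind might look roughly like this. It is a hypothetical sketch, not SEER's actual investigator: the regular expression handles only plain `#include <...>` and `#include "..."` lines and ignores comments, macros, and conditional compilation.

```python
import re

# Matches lines such as:  #include <stdio.h>   or   # include "util.h"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)


def find_includes(source_text: str) -> list[str]:
    """Return the header names referenced by #include lines, in order."""
    return INCLUDE_RE.findall(source_text)


source = '#include <stdio.h>\n  # include "util.h"\nint main(void) { return 0; }\n'
headers = find_includes(source)
```

Each returned header name is a candidate relation between the source file and the header that the clustering step could take into account.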

Agglomerative Algorithm
- Bottom-up algorithm
  - Compute the n nearest neighbors of each file.
  - Intersect the nearest-neighbor lists of all file pairs.
  - If an intersection has more than k members, join the clusters.
  - Parameters: n, k
  - Transitive
- O(N) space complexity
- O(N²) time complexity
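The shared-neighbor rule can be sketched with a union-find structure, which also gives the transitive behavior: if A joins B and B joins C, all three end up in one cluster. The function name and input layout are assumptions; the neighbor lists are taken as already computed.

```python
from itertools import combinations


def cluster(neighbors: dict[str, set[str]], k: int) -> list[set[str]]:
    """Join files whose nearest-neighbor lists share more than k members."""
    parent = {f: f for f in neighbors}

    def find(f: str) -> str:
        while parent[f] != f:
            parent[f] = parent[parent[f]]   # path halving
            f = parent[f]
        return f

    # Comparing all file pairs is what makes the running time quadratic.
    for a, b in combinations(neighbors, 2):
        if len(neighbors[a] & neighbors[b]) > k:
            parent[find(a)] = find(b)

    clusters: dict[str, set[str]] = {}
    for f in neighbors:
        clusters.setdefault(find(f), set()).add(f)
    return list(clusters.values())


lists = {
    "a": {"b", "c", "x"},
    "b": {"a", "c", "x"},
    "c": {"a", "b", "x"},
    "d": {"e", "y"},
    "e": {"d", "y"},
}
result = sorted(sorted(c) for c in cluster(lists, k=1))
```

Here a, b, and c share two neighbors pairwise and form one cluster, while d and e share only one neighbor and stay separate.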

Modified Agglomerative Algorithm
- Use the Correlator's list as the n nearest neighbors.
- Use k-near and k-far instead of k.
  - Combine clusters using k-near instead of k.
  - Share files between clusters when the number of shared neighbors is at least k-far but less than k-near.
- x is the number of shared neighbors and is modified by external investigators.
- O(N) time complexity

Relationship            Action
k-near <= x             Clusters merged
k-far <= x < k-near     Files inserted; clusters not merged
x < k-far               No action
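The three-way decision in the relationship/action table can be written as a small function. The function name, the returned labels, and the threshold values in the example are placeholders chosen for this sketch; the slides do not give concrete settings for k-near and k-far.

```python
def relation_action(x: int, k_near: int, k_far: int) -> str:
    """Map the number of shared neighbors x to the action from the table."""
    if x >= k_near:
        return "merge"     # clusters merged
    if x >= k_far:
        return "insert"    # files inserted; clusters not merged
    return "none"          # no action


# Placeholder thresholds, for illustration only.
actions = [relation_action(x, k_near=5, k_far=3) for x in (6, 5, 4, 3, 2)]
```

The middle band is what lets one file appear in more than one project without forcing the two projects to collapse into a single cluster.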

Issues
- Meaningless activities
- Shared libraries
- Critical files
- Detecting hoard misses
- Temporary files and directories
- Non-files
- Simultaneous accesses
- Non-open references
- Parameter settings
- Avoiding deadlock
- Tracing system calls

Conclusion
- SEER provides a high level of automation.
- SEER allows disconnected computing without user interaction.
- SEER has had almost no hoard misses in tests.
- SEER uses too much memory.
