Automating Hoarding Prasun Dewan Department of Computer Science University of North Carolina



2 Manual Hoarding
Per-user, per-workstation hoard profiles, specifying:
  - Files to be added or deleted
  - Current and future (+)
    - children (c)
    - or descendants (d)
  - Priority
Personal files:
    a /coda/usr/jjk d+
    a /coda/usr/jjk/papers 100:d+
Executables:
    a /usr/X11/bin/xterm
    a /usr/X11/bin/xinit
Source files:
    a /coda/src/venus 100:c+
    a /coda/include 100:c+

3 LRU
- Works well when
  - activity remains same
    - not if context switch occurs after cache fill
    - context switch occurs after disconnection
    - application used in disconnected state may not be the same as the ones running when hoarding occurs
  - can afford to have misses
    - need to keep entire working set in disconnected state
    - some dynamically accessed files may not have been referenced recently
- Problems addressed by program trace approach

4 Per-program Hoarding
- Fixing activity switch problem
  - Per-program traces
  - User specifies which programs will be used
[Figure: program trace of exec() and open() calls touching /a/b/c, /a/b/c1, /a/b/c2, /a/b/c3, /a/b/x/f, /a/b/x/g, /me/file1, /me/file2]

5 Uniting Traces
- Fixing cache miss problem
  - Look at multiple executions of program (< n)
  - Unite accesses of all traces
    - chance of a dynamically accessed file being missed is lowered
    - may not want to do so for (execution-specific) data
      - distinguish data from program: data is in a directory with a different root directory and has a different extension
[Figure: united trace touching /a/b/c, /a/b/c1, /a/b/c2, /a/b/c3, /a/b/x/f, /a/b/x/g, /me/file1, /me/file2]
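The uniting step can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the trace contents, and the rule "a file is data if its top-level directory differs from the program's" are hypothetical simplifications of the heuristic on the slide (the extension test is omitted), and data files are taken from the most recent trace only.

```python
def top_level(path):
    """Return the first component of an absolute path, e.g. '/a/b/c' -> 'a'."""
    return path.split("/")[1]


def unite_traces(traces, n, program_root):
    """Unite the program-file accesses of the last n traces (assumes n >= 1).

    traces: list of traces, oldest first; each trace is a list of paths.
    program_root: top-level directory assumed to hold program files.
    """
    recent = traces[-n:]
    if not recent:
        return set()
    hoard = set()
    for trace in recent:
        # Program files: union over all recent executions.
        hoard |= {p for p in trace if top_level(p) == program_root}
    # Execution-specific data: assumed here to come from the latest trace only.
    hoard |= {p for p in recent[-1] if top_level(p) != program_root}
    return hoard


# Hypothetical split of the files in the figure across two executions.
traces = [
    ["/a/b/c", "/a/b/c1", "/a/b/x/f", "/me/file1"],
    ["/a/b/c", "/a/b/c2", "/a/b/c3", "/a/b/x/g", "/me/file2"],
]
hoard = unite_traces(traces, n=2, program_root="a")
```

With these inputs the hoard set contains every file under /a from both executions but only /me/file2 from the data files, which is one of the data-file choices listed on the following slides.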

6 Aggregation Choice
- Possible to choose:
  - Most recent trace
  - Trace unification

7 Data File Choice
- Possible to choose:
  - Data files of all executions
  - Data files of all executions by specific user
  - Data files of most recent execution by specific user

8 Multi-Program Activities
- Bookends
  - Snapshot spying
  - User specifies start and end of spying period
  - Associates it with a bookend name
- For each bookend, can ask for hoarding of:
  - All accesses recorded
  - Accesses in traces of each program executed
    - Data file filtering

9 Program Trace Limitations
- User involvement
  - Bookend definition
  - Hoarding decisions
    - Data file filtering
    - Most recent vs. aggregated
- Fixed by semantic distance approach

10 Example of Program Trace Limitations
- Wish to hoard all chapters of a book written using tex
- Define a bookend for this project
  - Get all files accessed by programs (tex) executed during bookend definition
  - Scheme will get tex and all dynamic files accessed by it
- Data file choices:
  - Get all my data files accessed during bookend spying
    - Must access all chapters in snapshot
  - Get all of my data files accessed by tex
    - Will get more than I want
  - Get all of my data files accessed during last trace of tex
    - May not have accessed book recently

11 Semantic Distance Concept
- Between files
- Low if they belong to same project
- High if they do not
- Use it to determine files in a project
- Hoard all or no files of a project (working set)

12 Temporal Semantic Distance
Clock time elapsed between most recent opens/execs of the files
  - Clock time not a good indicator
    - Coffee break between references to related files

13 Sequence-based Semantic Distance
Number of intervening references (including open of first file) between the most recent opens/execs of the files
[Figure: open/close timeline of A (source file), B (include), C (include), and B again, with distances labelled 1, 2, 3]
- Non-commutative
  - Looks only at first reference time (open)
  - Files accessed during reference lifetime (open to close) should have equal semantic distance

14 Lifetime-based Semantic Distance
- SD(F1, F2)
  - 0 if F2 opened before F1 closed
  - # intervening opens otherwise
- Consider an exec as open immediately followed by close
- Considers only last reference
  - Dynamic linking conditional
[Figure: open/close timeline of A (source file), B (include), C (include), and B again, with distances labelled 0, 0, 3]
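A minimal Python sketch of this definition, assuming the file activity arrives as a list of (kind, file) events. The event format and the function name are hypothetical, and the count of intervening opens is assumed to include the open of F2 itself, which the slide does not state explicitly.

```python
def lifetime_distances(events):
    """Return {(f1, f2): SD(f1, f2)} using the most recent open of each file.

    events: iterable of ("open" | "close" | "exec", filename) tuples.
    """
    open_count = 0   # number of opens seen so far
    last_open = {}   # file -> index of its most recent open
    open_refs = {}   # file -> number of opens not yet closed
    sd = {}

    def do_open(name):
        nonlocal open_count
        open_count += 1
        for f1, idx in last_open.items():
            if f1 == name:
                continue
            if open_refs.get(f1, 0) > 0:
                sd[(f1, name)] = 0                 # f1 still open: distance 0
            else:
                sd[(f1, name)] = open_count - idx  # opens since f1's open
        last_open[name] = open_count
        open_refs[name] = open_refs.get(name, 0) + 1

    def do_close(name):
        open_refs[name] = max(0, open_refs.get(name, 0) - 1)

    for kind, name in events:
        if kind == "open":
            do_open(name)
        elif kind == "close":
            do_close(name)
        elif kind == "exec":
            # An exec is treated as an open immediately followed by a close.
            do_open(name)
            do_close(name)
    return sd


# Hypothetical event stream: A stays open while B and C are opened,
# then B is opened again after A has been closed.
events = [
    ("open", "A"), ("open", "B"), ("close", "B"),
    ("open", "C"), ("close", "C"), ("close", "A"),
    ("open", "B"), ("close", "B"),
]
sd = lifetime_distances(events)
```

Because only the most recent reference is kept, SD(A, B) ends up as 3 here even though B was first opened during A's lifetime (distance 0); this is the limitation that aggregating over references is meant to address.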

15 Aggregation-based Semantic Distance
- Take arithmetic mean of SD(F1_i, F2_i), 1 <= i <= number of references to F1
  - 1, 1, 1498 vs. 500, 500, 500
- Take geometric mean
- Efficiency:
  - O(N^2) storage
    - Track n (20) closest neighbours
  - O(N) cost per reference
    - Update SDs of files accessed in the last m (100) references
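The slide's numeric example can be checked with a short Python sketch. The function names are illustrative, and the geometric mean as written assumes strictly positive distances (a distance of 0 would need special handling that is not covered here).

```python
import math


def arithmetic_mean(distances):
    return sum(distances) / len(distances)


def geometric_mean(distances):
    # Computed via logarithms; assumes every distance is > 0.
    return math.exp(sum(math.log(d) for d in distances) / len(distances))


close_pair = [1, 1, 1498]    # usually adjacent, one outlier
far_pair = [500, 500, 500]   # consistently far apart

am_close, am_far = arithmetic_mean(close_pair), arithmetic_mean(far_pair)
gm_close, gm_far = geometric_mean(close_pair), geometric_mean(far_pair)
```

Both arithmetic means are 500, so the arithmetic mean cannot tell the two pairs apart. The geometric mean of 1, 1, 1498 is the cube root of 1498 (between 11 and 12), while that of 500, 500, 500 stays at 500, so the usually-close pair is ranked as much closer.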

16 Clustering
- Goal
  - Cluster files into projects based on SDs
- Difficulties
  - No objective measure of goodness of clustering
  - Need overlapping clusters
    - Common header files
  - SD not commutative

17 Distance-based Threshold
F1, F2 in same cluster if SD(F1, F2) <= p or SD(F2, F1) <= p
  - Size of project not considered
  - For any p, one can imagine a project with > p files
- Combine clusters if they have overlapping files
  f_1, f_2, ..., f_p combined with f_p, f_p+1, ..., f_l
  - All files will become one cluster
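The collapse into a single cluster can be illustrated with a small Python sketch. The function name and the distance table are hypothetical; the code simply applies the threshold rule and then merges any clusters that share a file.

```python
def threshold_clusters(files, sd, p):
    """Cluster files with a distance threshold p, merging overlapping clusters.

    sd: {(f1, f2): distance} for ordered pairs; missing pairs count as far apart.
    """
    clusters = [{f} for f in files]
    for (a, b), d in sd.items():
        if d <= p:               # SD(a, b) <= p in either direction qualifies
            clusters.append({a, b})
    merged = True
    while merged:                # repeat until no two clusters overlap
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if clusters[i] & clusters[j]:
                    clusters[i] |= clusters.pop(j)
                    merged = True
                    break
            if merged:
                break
    return clusters


files = ["f1", "f2", "f3", "f4", "f5", "f6"]
# Hypothetical chain: each file is close only to its successor.
chain = {(files[i], files[i + 1]): 1 for i in range(len(files) - 1)}
result = threshold_clusters(files, chain, p=1)
```

Although f1 and f6 are never close to each other, the chain of overlapping pairs pulls all six files into one cluster, which is the failure mode named on the slide.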

18 Common Neighbours-based Threshold
- Based on the n (nearest) neighbours
- Look at # common neighbours, c
- Two thresholds:
  - k_f (far) < k_n (near)

    Condition           Action
    k_n <= c            Clusters combined into one
    k_f <= c < k_n      Files inserted in each other’s clusters
    c < k_f             No action

19 Combining Phase
[Figure: files A to G, with pairs of files linked at the k_n or k_f level]
Clusters after successive combining steps:
    {A, B}
    {A, B, C}
    {D, E} {A, B, C}
    {D, E} {A, B, C} {F, G}
    {D, E, F, G} {A, B, C}

20 Insertion Phase
[Figure: files A to G, with pairs of files linked at the k_n or k_f level]
Clusters before insertion:
    {D, E, F, G} {A, B, C}
Clusters after insertion:
    {A, B, C, D} {C, D, E, F, G}
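The two phases can be sketched in Python as below. This is a rough illustration, not the system's implementation: the function name, the thresholds, and the common-neighbour counts are all hypothetical, chosen only so that the result matches the clusters shown on the two slides (the exact links in the figure are not recoverable from the transcript).

```python
def two_phase_clusters(files, common, k_near, k_far):
    """Cluster files from pairwise common-neighbour counts.

    common: {(f1, f2): number of common neighbours}.
    Pairs with count >= k_near have their clusters combined; pairs with
    k_far <= count < k_near are inserted into each other's clusters.
    """
    parent = {f: f for f in files}

    def find(f):
        # Union-find lookup with path halving.
        while parent[f] != f:
            parent[f] = parent[parent[f]]
            f = parent[f]
        return f

    # Combining phase.
    for (a, b), c in common.items():
        if c >= k_near:
            parent[find(a)] = find(b)
    clusters = {}
    for f in files:
        clusters.setdefault(find(f), set()).add(f)

    # Insertion phase: clusters may now overlap but are not merged.
    for (a, b), c in common.items():
        if k_far <= c < k_near:
            clusters[find(a)].add(b)
            clusters[find(b)].add(a)
    return sorted(sorted(c) for c in clusters.values())


files = list("ABCDEFG")
# Hypothetical counts: 5 means "near" link, 3 means "far" link.
common = {
    ("A", "B"): 5, ("B", "C"): 5, ("D", "E"): 5,
    ("F", "G"): 5, ("E", "F"): 5, ("C", "D"): 3,
}
result = two_phase_clusters(files, common, k_near=4, k_far=2)
```

Under these assumed counts, combining yields {A, B, C} and {D, E, F, G}, and the single far-level link between C and D then places each of the two files in the other's cluster without merging the clusters.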

21 Other Correlating Factors
- Directory membership
  - Files with common ancestor directories related
- File naming conventions
  - Source and header files have same prefix
- Other relations
  - #include files, import statements, common words
Ancestor level is automatically recorded and subtracted from shared neighbours.
An external investigator generates a relationship weight, which is added to shared neighbours.

22 Another Option
- Add/subtract from SD
- SD is asymmetric
- Directly modifying shared neighbour count has more impact

23 Searching Programs
- Example: find
- Opens all files in a subtree
- Destroys LRU and SD information
- Accesses of meaningless program ignored
  - Program accessing > d % of possible directory members
- Important to detect meaningless phase rather than program
  - Get working directory
    - Does exhaustive search
    - Accesses during search ignored rather than entire program calling getcwd
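One possible reading of the "d % of possible directory members" test, sketched in Python. The function name, the directory listing format, and the interpretation of "possible directory members" as "all entries of the directories the program touched" are assumptions made for illustration.

```python
import posixpath


def is_meaningless(accessed, listing, d_percent):
    """Return True if the program touched more than d_percent of the
    members of the directories it accessed.

    accessed: iterable of absolute paths opened by the program.
    listing: {directory: [entry names]} for the directories involved.
    """
    accessed = set(accessed)
    touched_dirs = {posixpath.dirname(p) for p in accessed}
    possible = set()
    for d in touched_dirs:
        possible |= {posixpath.join(d, name) for name in listing.get(d, ())}
    if not possible:
        return False
    fraction = 100.0 * len(accessed & possible) / len(possible)
    return fraction > d_percent


listing = {"/src": ["a.c", "b.c", "c.c", "d.c"]}
search_like = ["/src/a.c", "/src/b.c", "/src/c.c", "/src/d.c"]
editor_like = ["/src/a.c"]
```

With a threshold of 50, the search-like program (4 of 4 members) is flagged and its accesses would be ignored, while the editor-like program (1 of 4) is not.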

24 Shared Libraries
- Accessed by all programs
  - All clusters will be combined via them
- Files involved in more than a certain percentage (1%) of accesses are ignored and always put in hoard set

25 Temporary Files
- Not important by definition
- But may have small semantic distance to other files
- System disregards files in certain directories

26 Rarely Accessed Critical Files
- Hardly accessed but important
  - Bootstrapping
  - Suspend/resume files
- User-specified lists
- System-specific heuristics
  - . files in Unix

27 Non-Files
- Can be critical
  - Device file
- Access to them may not be recorded
  - Symbolic link points to actual file
- Non-directories take no space
  - Always hoarded
- Directories may be needed to do offline file-name translation
  - Replication system makes decisions regarding them

28 Handling Hoard Miss
- If hoard miss
  - Add file and its project to hoard set
- Record it for goodness measure

29 Goodness Measure
- Caching
  - Cache miss rate
- Hoarding
  - Time to first cache miss
    - Does not take into account working set size vs. hoard size
      working set no miss
      Working set ~ hoard size -> high miss rate
  - Miss-free hoard size

30 Miss-free Hoard Size Under LRU
- Look at references before most recent disconnection
  - F4 F3 F1 F2 F1 F5
- Keep only most recent reference to each file
  - F4 F3 F2 F1 F5
- Mark files accessed since disconnection
  - F3 F5
- Locate the first marked file in sorted list
  - F3
- Sum the size of all files between this file and end of sorted list
  - F3 + F2 + F1 + F5
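The steps above can be sketched in Python as follows. The function name and the file sizes are hypothetical; files referenced only after disconnection (and so absent from the LRU list) are ignored in this sketch.

```python
def miss_free_hoard_size(refs_before, refs_after, sizes):
    """Smallest LRU hoard size that would have produced no misses.

    refs_before: references before the most recent disconnection, oldest first.
    refs_after: files accessed since disconnection.
    sizes: {file: size}.
    """
    # Keep only the most recent reference to each file, oldest first.
    seen, lru = set(), []
    for f in reversed(refs_before):
        if f not in seen:
            seen.add(f)
            lru.append(f)
    lru.reverse()

    marked = set(refs_after)
    for i, f in enumerate(lru):
        if f in marked:
            # Everything from the first marked file to the end must be hoarded.
            return sum(sizes[g] for g in lru[i:])
    return 0


refs_before = ["F4", "F3", "F1", "F2", "F1", "F5"]
refs_after = ["F3", "F5"]
sizes = {"F1": 10, "F2": 20, "F3": 30, "F4": 40, "F5": 50}  # assumed sizes
size = miss_free_hoard_size(refs_before, refs_after, sizes)
```

With the assumed sizes, the result is 30 + 20 + 10 + 50 = 110, the sum for F3, F2, F1, and F5 as on the slide; F4 is older than the first marked file and is not needed.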

31 Live Usage
- Gathered user traces of activities
- Few hoard misses in actual usage

32 Comparison Experiments
- Gathered user traces of activities
- Replayed each trace, simulating disconnection durations of
  - 24 hours
  - 7 days
- Assumed infinitesimal reconnection only for re-hoarding
- Mode of traced activities
  - Connected
    - Can do activities normally not done in disconnected mode
      - Web access
    - Access patterns remain same
  - Disconnected mode
    - Actual hoard misses could influence activities
    - But misses were few anyway
- Semantic distance leads to hoard size slightly bigger than WS
- Much better than LRU

33 Unresolved Issues
- Hoarding of fine-grained data