Group Based Management of Distributed File Caches

Amer, Long, Burns

Prefetching vs. Automated Grouping: “Reducing Access Latency”
Dynamic file grouping
Implicit prefetching
Increasing retention priority
Modelling that allows overlapping partitions
Why? Scenarios.

Success and accuracy depend on the model and the strength of the relationships.
Edge priority
Predicting file access behavior: frequency or recency? Location? Single file or group of files? Input data?

Hot data items where access costs are lower.
Track the sequence of accesses, not their timing.
Aggregating cache
Successors: immediate and transitive
Group [g] for retrieval.
The server maintains the immediate successors.
The client then uses LRU replacement in its cache: the requested file goes at the front, followed by the other [g – 1] group members.
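The grouping steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class names `SuccessorServer` and `ClientCache` are hypothetical, the server is assumed to remember only the single most recent immediate successor of each file, and the client is assumed to report every access to the server.

```python
from collections import OrderedDict


class SuccessorServer:
    """Hypothetical server: remembers the latest immediate successor of each file."""

    def __init__(self):
        self.successor = {}   # file -> most recently observed next file
        self.last = None      # previously requested file

    def record(self, name):
        # Only the order of accesses is tracked, never their timing.
        if self.last is not None and self.last != name:
            self.successor[self.last] = name
        self.last = name

    def build_group(self, name, g):
        # Follow immediate successors transitively until the group holds g files.
        group = [name]
        current = name
        while len(group) < g:
            nxt = self.successor.get(current)
            if nxt is None or nxt in group:
                break
            group.append(nxt)
            current = nxt
        return group


class ClientCache:
    """Hypothetical client cache with LRU replacement; first key = most recent."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def access(self, name, server, g):
        server.record(name)  # assumption: every access is reported to the server
        if name in self.items:
            self.items.move_to_end(name, last=False)
            return True  # cache hit
        # Miss: fetch the whole group; inserting in reverse order leaves the
        # requested file at the front, followed by the other g - 1 members.
        for member in reversed(server.build_group(name, g)):
            self.items[member] = True
            self.items.move_to_end(member, last=False)
        while len(self.items) > self.capacity:
            self.items.popitem(last=True)  # evict from the LRU end
        return False
```

With a server that has seen the accesses A, B, C repeated, a miss on A with g = 3 brings A, B and C into the client cache, so the following request for B is a hit.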

Server Side: Comparing LRU, LFU, Dgrouping
Client side: demand fetches, cache hit ratios, miss rates.
Server side: access statistics are piggybacked on client file requests.
Client cache -> server cache: drop in hit rate for the server cache.
Interdependency among file access events.
Comparisons based on recency vs. frequency for maintaining successors.

Associating files with successors:
Sequences or single files?
CAB, CDB….: the successors of C are A and D.
Tracking sequences: more metadata; reduces the likelihood of repeated successors.
Single file: greater predictability.
The ideal may be a combination of frequency and recency.
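To make the recency vs. frequency question concrete, here is a small sketch of per-file successor lists under both policies. The function names and the list size `k` are illustrative assumptions, not taken from the slides. The frequency version keeps full counts for simplicity, so it is not a bounded-metadata scheme.

```python
from collections import Counter, OrderedDict


def successors_by_recency(trace, k):
    """Keep the k most recently observed successors of each file."""
    table = {}
    for prev, nxt in zip(trace, trace[1:]):
        entry = table.setdefault(prev, OrderedDict())
        entry[nxt] = True
        entry.move_to_end(nxt)          # newest successor goes last
        while len(entry) > k:
            entry.popitem(last=False)   # drop the oldest successor
    # Report each list with the most recent successor first.
    return {f: list(reversed(e)) for f, e in table.items()}


def successors_by_frequency(trace, k):
    """Report the k most frequently observed successors of each file."""
    counts = {}
    for prev, nxt in zip(trace, trace[1:]):
        counts.setdefault(prev, Counter())[nxt] += 1
    return {f: [s for s, _ in c.most_common(k)] for f, c in counts.items()}


trace = list("CABCABCDB")   # C is followed by A twice, then by D once
recent = successors_by_recency(trace, 1)["C"]      # ["D"]
frequent = successors_by_frequency(trace, 1)["C"]  # ["A"]
```

With room for a single successor, the two policies disagree about C here: recency keeps D, the last file seen after C, while frequency keeps A, the file seen after C most often.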

Efficient Massive Sharing of Content Among Peers
Triantafillou, Xiruhaki, Koubarakis

Exploit semantics of documents, construct clusters.
Goal is to achieve global load balancing and short response time.
Assumptions:
Known popularity
Documents accompanied by keywords

Idea is to impose order in chaos!
Goal is then to achieve inter- and intra-cluster load balancing.
Each document is associated with a popularity.
Requests can either retrieve or publish.

Response time is bounded by the number of nodes in the cluster.
Processing:
Obtain the semantic category from the keywords in the request.
Map the category to a cluster using locally held metadata.
Forward the request to that cluster.
Intra-cluster load balancing: forward each request to a random node in the cluster.
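The processing steps above might look like the following sketch. All tables, node names and the `categorize` rule (majority vote over known keywords) are hypothetical; the slides do not say how a category is derived from keywords or how the metadata is stored.

```python
import random
from collections import Counter

# Hypothetical local metadata; none of these values come from the slides.
KEYWORD_TO_CATEGORY = {"jazz": "music", "guitar": "music", "goal": "sports"}
CATEGORY_TO_CLUSTER = {"music": 0, "sports": 1}
CLUSTER_NODES = {0: ["n1", "n2", "n3"], 1: ["n4", "n5"]}


def categorize(keywords):
    """Pick the category that most request keywords map to (an assumed rule)."""
    votes = Counter(
        KEYWORD_TO_CATEGORY[k] for k in keywords if k in KEYWORD_TO_CATEGORY
    )
    if not votes:
        raise LookupError("no known category for these keywords")
    return votes.most_common(1)[0][0]


def route(keywords, rng=random):
    """Return (category, cluster, node) for a request."""
    category = categorize(keywords)
    cluster = CATEGORY_TO_CLUSTER[category]      # local metadata lookup
    node = rng.choice(CLUSTER_NODES[cluster])    # random node inside the cluster
    return category, cluster, node
```

Choosing a random node on every request spreads load across the cluster without any coordination between its members.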

Inter-cluster load balancing: NP-complete
Assumptions:
All peers have the same capacity.
Peers have enough storage space to store the documents of the categories they contribute.
Q: Partition N into k clusters such that:
If two documents belong to the same category, then their host nodes also belong to the same cluster.
Clusters have equal normalized probabilities.

Partitioning into clusters with nearly equal normalized probabilities.
Using a greedy algorithm, MinDiff:
All clusters are initially empty.
For each semantic category, assign it to the cluster that minimizes the total difference of popularities among clusters.
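A rough sketch of the greedy step follows. It rests on several assumptions that the slides leave open: "total difference" is read as the sum of pairwise absolute differences between cluster popularities, categories are processed in the order given, popularities are raw request counts with no normalization, and ties go to the lowest-numbered cluster. The function names are illustrative.

```python
from itertools import combinations


def total_diff(loads):
    """Sum of pairwise absolute differences between cluster popularities."""
    return sum(abs(a - b) for a, b in combinations(loads, 2))


def min_diff(popularity, k):
    """Greedily assign each category to the cluster that keeps loads closest."""
    loads = [0] * k          # all clusters start empty
    assignment = {}
    for category, pop in popularity.items():
        # Try the category in every cluster and keep the most balanced outcome.
        best = min(
            range(k),
            key=lambda c: total_diff(loads[:c] + [loads[c] + pop] + loads[c + 1:]),
        )
        loads[best] += pop
        assignment[category] = best
    return assignment, loads


# Hypothetical popularities, expressed as request counts.
assignment, loads = min_diff({"a": 4, "b": 3, "c": 2, "d": 1}, k=2)
```

Because whole categories are assigned to clusters, documents of the same category always end up in the same cluster, which matches the first constraint of the partitioning question.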

Discussions
