1
Adaptive Storage Management for Modern Data Centers
Imranul Hoque
2
Applications in Modern Data Centers
– More than 500 million active users on Facebook
– More than 20 PB of data (over 260 billion files) in the photo application alone
– Over 2.5 million websites have integrated with Facebook
3
Application Characteristics
– Type of data: networked data (integration between content and social sites)
– Volume of data / scale of system: TB to PB (photos, videos, news articles, etc.)
– End-user performance requirement: low latency (the real-time web)
4
New Classes of Storage Systems
– Networked data: traditional systems struggle with representation and with joins (e.g., friends of friends). New system: graph database. Example: Twitter uses FlockDB.
– Large data volume and massive system scale: traditional systems trade consistency against performance and availability, and target small but frequent reads/writes plus batch transactions with rare writes rather than heavy read/write workloads. New system: key-value storage. Example: Digg uses Cassandra.
– Low latency and high throughput: the disk is the bottleneck. New system: in-memory storage. Example: Craigslist uses Redis.
The absence of adaptive storage management techniques causes these new storage systems to perform sub-optimally.
5
Adaptive Storage Management
– Graph database (Bondhu). Use cases: online social networks, Amazon-style recommendation engines, hierarchical data sets. Scope of performance improvement: data layout on disk.
– Key-value storage (Shomota). Use cases: status updates, image tags, click streams. Scope: load distribution across servers.
– In-memory storage (Sreeti). Use cases: status updates, financial tick data, online gaming stats. Scope: construction of the working set.
How should the adaptive techniques be designed?
6
Hypothesis Statement
“Adaptive techniques that leverage the underlying heterogeneity of the system as a first-class citizen can significantly improve the performance of these new classes of storage systems.”
7
Bondhu: Leveraging Heterogeneity
[Diagram: placement techniques map the social graph onto the hard disk drive.]
Exploit heterogeneity in the social graph to make better data placement decisions.
8
Shomota: Leveraging Heterogeneity
[Diagram: a table is split into tablets (SUN through SAT key ranges) distributed unevenly across Servers 1 to 4.]
Mitigate load heterogeneity across servers to alleviate hot spots via adaptive load balancing techniques.
9
Sreeti: Leveraging Heterogeneity
[Diagram: user requests drive a prefetching strategy (persistent storage to main memory) and a swapping strategy (main memory to persistent storage).]
Exploit heterogeneity in user access patterns to design prefetching and swapping techniques for better performance.
10
Contribution
– Bondhu. Storage type: graph. Heterogeneity leveraged: social graph (exploit). Technique: data layout on disk. Status: mature.
– Shomota. Storage type: key-value. Heterogeneity leveraged: load (mitigate). Technique: load balancing across servers. Status: on-going.
– Sreeti. Storage type: in-memory. Heterogeneity leveraged: request pattern (exploit). Technique: pre-fetching and swapping. Status: future work.
11
Bondhu: A Social Network-Aware Disk Manager for Graph Databases
12
Visualization of Blocks Accessed
– Facebook New Orleans network [Viswanath2009]
– Neo4j graph database, 400 KB of blocks per user
– blktrace tool used to trace the blocks read by getProperty()
[Plot: the accessed property blocks are scattered across the disk.]
13
Sequential vs. Random Access
How bad is random access? Measured with the fio benchmarking tool.
[Plot: sequential vs. random read throughput, three measurements: 70 vs. 0.7 MB/s, 150 vs. 1 MB/s, and 98 vs. 0.8 MB/s.]
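For intuition only, here is a minimal Python sketch of such a measurement (our illustration, not part of the talk; fio remains the right tool, since the OS page cache easily distorts a naive benchmark like this):

```python
import os
import random
import time

def read_throughput_mb_s(path, block=4096, reads=2000, sequential=True):
    """Crude sequential-vs-random read benchmark (Unix-only, uses os.pread).
    Caveat: results are inflated unless the page cache is cold."""
    size = os.path.getsize(path)  # file must be larger than reads * block
    fd = os.open(path, os.O_RDONLY)
    if sequential:
        offsets = range(0, reads * block, block)
    else:
        offsets = [random.randrange(0, size - block) for _ in range(reads)]
    start = time.perf_counter()
    for off in offsets:
        os.pread(fd, block, off)  # read `block` bytes at offset `off`
    os.close(fd)
    return reads * block / (time.perf_counter() - start) / 1e6

# e.g. read_throughput_mb_s("/tmp/bigfile", sequential=False)
```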
14
Social Network-Aware Disk Manager
Approaches in other systems:
– Popularity-based approach in a multimedia file system [Wong1983]
– Tracking block access patterns [Li2004][Bhadkamkar2009]
Properties of online social networks [Mislove2007]:
– Strong community structure
– Small-world phenomenon
Exploit heterogeneity in the social graph:
– Keep related users’ data close by on disk
– Reduce seek time, rotational latency, and the number of seeks
15
The Bondhu System
– A novel framework for disk layout algorithms, based on community detection
– Integration into Neo4j, a widely used open-source graph database
– Experimentation using a real social graph: response time improved by 48% compared to the default Neo4j layout
16
Problem Definition
The logical block addressing (LBA) scheme gives a one-dimensional representation of the disk.
[Diagram: vertices V1 through V7 of the social graph are assigned to consecutive disk blocks L1 through L7. The naive order V1, V2, ..., V7 has layout cost 18; the reordering V5, V1, V3, V2, V4, V6, V7 lowers the cost to 14.]
Finding the minimum-cost layout is NP-hard, so Bondhu uses a fast multi-level heuristic.
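The slides do not define the cost metric explicitly, but the example is consistent with the classic minimum linear arrangement objective, which we assume here: with π(v) the block position of vertex v and w(u, v) the edge weight,

```latex
\mathrm{cost}(\pi) \;=\; \sum_{(u,v) \in E} w(u,v)\,\lvert \pi(u) - \pi(v) \rvert
```

Minimizing this over all orderings π is the NP-hard problem the slide refers to, hence the multi-level heuristic.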
17
Disk Layout Algorithm
[Diagram: the layout pipeline, consisting of the community detection, intra-community layout, and inter-community layout modules described on the next slides.]
18
Community Detection Module
Goal: organize the users of the social graph into clusters. Based on community detection algorithms:
– Graph-partitioning driven (ParCom) [Karypis1998]
– Modularity-optimization driven (ModCom) [Blondel2008]
[Diagram: the example graph V1 through V7 split into two communities.]
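ModCom refers to the Louvain method of Blondel et al. Below is a minimal sketch of the clustering step using networkx's Louvain implementation (the library choice is ours, for illustration; requires networkx >= 2.8):

```python
import networkx as nx
from networkx.algorithms import community

# Toy social graph; in Bondhu the edge weights would come from one of
# the OSN models on slide 21.
G = nx.Graph()
G.add_weighted_edges_from([
    ("v1", "v2", 1), ("v1", "v3", 1), ("v2", "v3", 1),  # one community
    ("v4", "v5", 1), ("v4", "v6", 1), ("v5", "v6", 1),  # another community
    ("v3", "v4", 1),                                    # weak cross-community tie
])

# Modularity-driven clustering in the spirit of ModCom [Blondel2008].
communities = community.louvain_communities(G, weight="weight", seed=42)
print(communities)  # e.g. [{'v1', 'v2', 'v3'}, {'v4', 'v5', 'v6'}]
```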
19
Intra-community Layout Module
Goal: create a disk layout for each community. A greedy sketch of the idea appears below.
[Diagram: within each community, edge weights drive the ordering of members on consecutive blocks (e.g., V2, V1, V3, V5 for one community and V7, V6, V4 for the other); each laid-out community is then collapsed into a super-node VC.]
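A possible greedy instantiation of the intra-community ordering (a sketch of the general idea; the slide does not spell out the exact heuristic):

```python
def edge_w(weight, u, v):
    # Undirected edge-weight lookup; 0 if the edge is absent.
    return weight.get(frozenset((u, v)), 0)

def intra_community_layout(vertices, weight):
    """Order one community's vertices onto consecutive disk blocks by
    repeatedly appending the unplaced vertex most strongly tied to the
    vertices placed so far."""
    remaining = set(vertices)
    # Seed with the vertex of highest weighted degree.
    first = max(remaining,
                key=lambda v: sum(edge_w(weight, v, u) for u in remaining))
    order = [first]
    remaining.remove(first)
    while remaining:
        nxt = max(remaining,
                  key=lambda v: sum(edge_w(weight, v, u) for u in order))
        order.append(nxt)
        remaining.remove(nxt)
    return order

# Example: a 3-user community where v1--v3 is the strongest tie.
w3 = {frozenset(("v1", "v2")): 1,
      frozenset(("v1", "v3")): 2,
      frozenset(("v2", "v3")): 1}
print(intra_community_layout(["v1", "v2", "v3"], w3))  # e.g. ['v1', 'v3', 'v2']
```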
20
Inter-community Layout Module
Goal: fix the order of the communities themselves on disk.
[Diagram: the two communities are collapsed into super-nodes VA and VB, connected by the total cross-community edge weight (2 in the example); ordering the super-nodes yields the final layout V2, V1, V3, V5, V7, V6, V4.]
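The collapsing step can be sketched as follows (again our illustration): each community becomes a super-node, and the weight between two super-nodes is the total weight of the edges crossing them. The greedy ordering above can then be reused at the super-node level.

```python
from itertools import combinations

def collapse_communities(communities, weight):
    """Collapse communities into super-nodes (VA, VB on the slide); the
    weight between two super-nodes is the summed weight of edges crossing
    between them. Indices into `communities` name the super-nodes."""
    lookup = lambda u, v: weight.get(frozenset((u, v)), 0)
    super_weight = {}
    for (i, a), (j, b) in combinations(enumerate(communities), 2):
        total = sum(lookup(u, v) for u in a for v in b)
        if total:
            super_weight[frozenset((i, j))] = total
    return super_weight
```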
21
Modeling OSN Dynamics
– Uniform model: assign equal weight to every edge.
– Preferential model: the weight of edge (Vi, Vj) is proportional to the edge degrees; we use [edge_degree(Vi) + edge_degree(Vj)] / 2.
– Overlap model: the weight is proportional to the number of common friends; we use (c + 1), where c = the number of common friends.
[Diagram: endpoints of degree 4 and 3 give a preferential weight of 3.5; a pair with one common friend gets an overlap weight of 2.]
The sketch below turns these models into code.
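The three models translate directly into code; this sketch uses networkx for degrees and common neighbors (the helper name and graph API are our choices):

```python
import networkx as nx

def edge_weight(G, u, v, model="uniform"):
    """The three edge-weight models from the slide."""
    if model == "uniform":
        return 1                                # equal weight everywhere
    if model == "preferential":
        return (G.degree(u) + G.degree(v)) / 2  # mean endpoint degree
    if model == "overlap":
        c = len(list(nx.common_neighbors(G, u, v)))
        return c + 1                            # common friends + 1
    raise ValueError(f"unknown model: {model}")

# Slide example: endpoints of degree 4 and 3 get preferential weight 3.5.
```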
22
Implementation and Evaluation
– Modified the PropertyStore of Neo4j
– Facebook New Orleans network [Viswanath2009]: 63,731 users and 817,090 links; weights assigned according to the uniform, preferential, and overlap models
– Workload: a sample social network application issuing the ‘list all friends’ operation (1500 random users, 6 times per user)
– Metrics: cost (defined earlier) and response time (the time to fetch the data blocks of all friends of a random user)
23
Visualization using Bondhu
[Plots: the default layout produces scattered disk accesses; the Bondhu layout produces clustered disk accesses.]
24
Effect of Block Size
The file system reads data in 4 KB chunks, so one chunk holds the blocks of 1024, 102, or 10 users for block sizes of 4 B, 40 B, and 400 B respectively (4096/4, 4096/40, and 4096/400, rounded down).
– Default layout: the expected number of friends per chunk drops roughly 10x at each step as the block size grows from 4 B to 40 B to 400 B.
– Bondhu layout: little decrease from 4 B to 40 B, then a rapid decrease from 40 B to 400 B.
[Plots: results with caching disabled and with caching enabled, where data is served from memory.]
25
Response Time Metric vs. Cost Metric
[Plots: response time at block sizes of 40 B and 400 B.]
The improvement in response time is due to better placement decisions.
26
Effect of Different Models
– Layout models: preferential, overlap, uniform, and the default layout
– Workloads: random, preferential, overlap (1000 random users, 1000 requests, 10 measurements)
[Plot: the complex models add little benefit over the uniform model.]
27
Effect of OSN Evolution
[Plot: a layout computed on a snapshot with 33% fewer nodes is still better than the default layout by 72%.]
28
Summary
– Adaptive disk layout techniques that exploit the heterogeneity (community structure) of the social graph
– Implementation in the Neo4j graph database
– Extensive trace-driven experimentation: 48% improvement in median response time
– Little additional benefit from the more complex models
– Re-organization can be infrequent
29
Shomota: An Adaptive Load Balancer for Distributed Key-Value Storage Systems
30
31
Load Heterogeneity in K-V Storage
– Hash-partitioned vs. range-partitioned: range partitioning enables efficient range scans and searches, while hash partitioning spreads keys evenly. A code sketch of the contrast follows.
– Range partitioning leads to uneven space distribution; the solution is to partition the table into tablets and move them around.
– A small number of very popular records can also skew the load.
[Diagram: tablets (SUN through SAT key ranges) spread unevenly across Servers 1 to 4.]
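The trade-off is easy to see in code. This sketch (ours, with hypothetical split points) contrasts the two schemes:

```python
import bisect
import hashlib

def hash_partition(key, n_servers):
    """Hash partitioning: keys spread almost evenly across servers,
    but a range scan must contact every server."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % n_servers

def range_partition(key, split_points):
    """Range partitioning: the sorted key space is cut into tablets at the
    split points, so range scans touch few servers -- but popular key
    ranges become the hot spots Shomota targets."""
    return bisect.bisect_right(split_points, key)

splits = ["g", "n", "t"]                  # 4 tablets: ..g, g..n, n..t, t..
print(range_partition("monday", splits))  # 1, i.e. the g..n tablet
print(hash_partition("monday", 4))        # some server in 0..3
```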
32
The Shomota System
Goal: mitigate load heterogeneity. Algorithms for the load balancing problem:
– Load means both space and bandwidth
– Evenly distribute the spare capacity
– Distributed algorithm, not a centralized one
– Reduce the number of tablet moves
Previous solutions are one-dimensional, redistribute the key space, or only handle bulk loading [Stoica2001, Byers2003, Karger2004, Rao2003, Godfrey2005, Silberstein2008].
33
System Modeling and Assumptions
[Diagram: a table is split into tablets; tablet i carries bandwidth load B_i and space load S_i, and each server (A, B, C) carries the aggregate load (B_A, S_A), (B_B, S_B), (B_C, S_C) of its tablets.]
Assumptions:
1. Loads are bounded by 0.01 in both dimensions.
2. The number of tablets is much larger than the number of nodes.
34
System State
Each server is a point in the two-dimensional (bandwidth B, space S) load space. The target point is the system-wide average load; the target zone around it helps achieve convergence.
Goal: move tablets around so that every server lies within the target zone.
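A membership test for the target zone might look like this (our reading of the slide; the exact zone geometry is not specified):

```python
def in_target_zone(b, s, b_avg, s_avg, eps=0.01):
    """True if a server's (bandwidth, space) load point lies within eps
    of the target point (the global average) in both dimensions. The
    0.01 bound mirrors the assumption on slide 33; the box-shaped zone
    is our assumption."""
    return abs(b - b_avg) <= eps and abs(s - s_avg) <= eps
```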
35
Load Balancing Algorithms
The algorithm alternates between two phases over time (Phase 1, Phase 2, Phase 1, Phase 2, ...):
– Phase 1: global averaging. The variance of the approximation of the average decreases exponentially fast [Kempe2003]. A sketch follows this list.
– Phase 2: gossip. A point selection strategy (midpoint or greedy) picks a target load for each pair of gossiping servers, and a tablet transfer strategy moves tablets toward the selected point at minimum cost (space transferred).
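Phase 1 is gossip-based averaging in the style of Kempe et al.'s push-sum protocol. A self-contained sketch (our simplification: synchronous rounds, uniform random targets):

```python
import random

def push_sum(loads, rounds=30, seed=1):
    """Push-sum gossip [Kempe2003]: every server ends up with an estimate
    of the global average load, with no central coordinator. Each round,
    every node keeps half of its (sum, weight) pair and ships the other
    half to a random peer; the estimate sum/weight converges to the mean."""
    rng = random.Random(seed)
    n = len(loads)
    s = [float(x) for x in loads]  # running sums
    w = [1.0] * n                  # running weights
    for _ in range(rounds):
        inbox = [(0.0, 0.0)] * n
        for i in range(n):
            j = rng.randrange(n)   # random gossip target
            s[i] /= 2
            w[i] /= 2              # keep one half...
            ds, dw = inbox[j]
            inbox[j] = (ds + s[i], dw + w[i])  # ...send the other half
        for i in range(n):
            s[i] += inbox[i][0]
            w[i] += inbox[i][1]
    return [si / wi for si, wi in zip(s, w)]

print(push_sum([10, 50, 20, 40]))  # every estimate approaches 30.0
```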
36
Summary
Distributed load balancing techniques for key-value storage systems:
– Mitigate both space and throughput heterogeneity across servers
– PeerSim-based simulation; integration into Voldemort is on-going
– Simulation results exhibit fast convergence while keeping data movement low
37
Sreeti: Access Pattern-Aware Memory Management
38
In-memory Storage System
– Growth of the Internet population: search engines, social networking, blogging, e-commerce, media sharing
– User expectation: fast response time plus high availability
– Serving a large number of users in real time: option 1 is SSD, option 2 is memory
– Emerging trends: memory caching systems (Memcached) and in-memory storage systems (Redis, VoltDB, etc.)
39
Motivation
– Existing in-memory storage systems assume there is enough RAM to fit all data in memory.
– Counter-example: the values associated with keys are large.
– Approach taken by existing systems: Redis and Memcached use LRU for swapping.
Performance of in-memory storage systems can be improved further if heterogeneity in user request patterns is leveraged.
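For reference, the LRU baseline the slide attributes to Redis and Memcached fits in a few lines (a textbook sketch, not either system's actual implementation):

```python
from collections import OrderedDict

class LRUCache:
    """Minimal least-recently-used cache: the eviction baseline that
    request-pattern-aware policies would try to beat."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used
```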
40
Proposal for Sreeti System
Adaptive techniques for prefetching, caching, and swapping that exploit heterogeneity in user request patterns, driven by associative rule mining. A sketch of the idea follows.
[Diagram: user requests reach the application; data is fetched from persistent storage into main memory ahead of use, and swapped back out when cold.]
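One way the associative-rule idea could work (purely illustrative; the class and its parameters are hypothetical, not Sreeti's actual design): mine co-occurrence rules from the request stream, then prefetch the likely followers of each requested key.

```python
from collections import defaultdict, deque

class AssociativePrefetcher:
    """Learn which keys are requested shortly after one another, then
    suggest followers to prefetch from persistent storage into memory."""
    def __init__(self, window=3, min_support=2):
        self.recent = deque(maxlen=window)  # sliding window of recent keys
        self.min_support = min_support
        self.cooccur = defaultdict(int)     # (earlier key, later key) counts

    def record(self, key):
        for prev in self.recent:
            self.cooccur[(prev, key)] += 1
        self.recent.append(key)

    def prefetch_candidates(self, key):
        return [b for (a, b), n in self.cooccur.items()
                if a == key and n >= self.min_support]

p = AssociativePrefetcher()
for k in ["profile:1", "photos:1", "profile:1", "photos:1"]:
    p.record(k)
print(p.prefetch_candidates("profile:1"))  # ['photos:1']
```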
41
Hypothesis Statement
“Adaptive techniques that leverage the underlying heterogeneity of the system as a first-class citizen can significantly improve the performance of these new classes of storage systems.”
– Graph database: exploit heterogeneity via disk layout
– Key-value storage: mitigate heterogeneity via load balancing
– In-memory storage: exploit heterogeneity via prefetching, caching, and swapping
42
Summary
– Bondhu: graph storage; exploits social-graph heterogeneity via data layout on disk
– Shomota: key-value storage; mitigates load heterogeneity via load balancing across servers
– Sreeti: in-memory storage; exploits request-pattern heterogeneity via pre-fetching and swapping