1
Finding a Needle in Haystack: Facebook's Photo Storage
Doug Beaver, Sanjeev Kumar, Harry C. Li, Jason Sobel, Peter Vajgel
Presented by: Pallavi Kasula
2
Photos @ Facebook
                 April 2009                  October 2010               August 2012
Total            15 billion photos           65 billion photos
                 60 billion images           260 billion images
                 1.5 petabytes               20 petabytes
Upload rate      220 million photos/week     1 billion photos/week      2 billion photos/week*
                 (25 terabytes)              (60 terabytes)
Serving rate     550,000 images/sec          1 million images/sec       *
3
Goals
- High throughput and low latency
- Fault-tolerant
- Cost-effective
- Simple
4
Typical Design
5
NFS-based Design
6
NFS-based Design
- Typical website: small working set, infrequent access to old content, ~99% CDN hit rate
- Facebook: large working set, frequent access to old content, ~80% CDN hit rate
7
NFS-based Design: Metadata Bottleneck
- Each image is stored as its own file
- Large metadata size severely limits the metadata hit ratio
- Image read performance:
  - ~10 IOPS per image read (large directories: thousands of files)
  - ~3 IOPS per image read (smaller directories: hundreds of files)
  - ~2.5 IOPS per image read (with a file handle cache)
8
NFS-based Design
9
Haystack-based Design
Photo URL: http://<CDN>/<Cache>/<Machine id>/<Logical volume, Photo>
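To make the routing concrete, here is a small sketch of building and parsing a URL in this format; the helper names and example values are illustrative, not Facebook's code.

```python
# Hypothetical sketch of the photo URL scheme above.
def build_photo_url(cdn: str, cache: str, machine_id: int,
                    logical_volume: int, photo_id: int) -> str:
    """Each URL component tells the next tier (CDN, then Cache, then Store
    machine) where to find the photo."""
    return f"http://{cdn}/{cache}/{machine_id}/{logical_volume},{photo_id}"

def parse_photo_url(url: str):
    """Split a photo URL back into its routing components."""
    _, _, cdn, cache, machine_id, volume_photo = url.split("/", 5)
    logical_volume, photo_id = volume_photo.split(",")
    return cdn, cache, int(machine_id), int(logical_volume), int(photo_id)

print(build_photo_url("cdn.example.com", "cache42", 7, 341, 981123))
# -> http://cdn.example.com/cache42/7/341,981123
```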
10
Store Organization
- A Store machine's capacity is organized into physical volumes, e.g. 10 TB -> 100 physical volumes of 100 GB each.
- Logical volumes: physical volumes on different machines are grouped together; a photo written to a logical volume is stored on all of its physical volumes.
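A minimal sketch of this grouping, with made-up machine names and sizes; it models only the structure, not the replication traffic.

```python
# Illustrative grouping of physical volumes into a logical volume.
from dataclasses import dataclass, field

@dataclass
class PhysicalVolume:
    machine: str          # Store machine hosting this volume
    volume_file: str      # e.g. "/hay/haystack_341"
    capacity_gb: int = 100

@dataclass
class LogicalVolume:
    logical_id: int
    replicas: list = field(default_factory=list)   # same data on different machines

# A 10 TB Store machine is carved into ~100 physical volumes of 100 GB each;
# one volume from each of several machines forms a logical volume.
lv = LogicalVolume(341)
lv.replicas.append(PhysicalVolume("store-a", "/hay/haystack_341"))
lv.replicas.append(PhysicalVolume("store-b", "/hay/haystack_341"))
```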
11
Photo upload
12
Haystack Directory
13
Haystack Directory
- Logical-to-physical volume mapping
- Load balancing: writes across logical volumes, reads across physical volumes
- Decides whether a request is handled by the CDN or by the Cache
- Identifies and marks volumes as read-only
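A rough sketch of these Directory responsibilities, with hypothetical class and method names: writes go to a randomly chosen write-enabled logical volume, reads to a randomly chosen physical replica.

```python
# Hypothetical sketch of the Directory's balancing decisions described above.
import random

class Directory:
    def __init__(self):
        self.logical_to_physical = {}   # logical volume id -> list of (machine, physical volume)
        self.read_only = set()          # logical volumes marked read-only

    def pick_write_volume(self):
        """Balance writes across write-enabled logical volumes."""
        writable = [lv for lv in self.logical_to_physical if lv not in self.read_only]
        return random.choice(writable)

    def pick_read_replica(self, logical_volume):
        """Balance reads across the physical replicas of one logical volume."""
        return random.choice(self.logical_to_physical[logical_volume])

    def mark_read_only(self, logical_volume):
        """Called when a volume fills up or its Store machine looks unhealthy."""
        self.read_only.add(logical_volume)
```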
14
Haystack Cache
15
Haystack Cache
- Distributed hash table with the photo id as key
- Receives HTTP requests from CDNs and browsers
- Caches a photo only if both of the following conditions are met:
  - The request comes directly from a browser, not from the CDN
  - The photo is fetched from a write-enabled Store machine
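The two-condition admission rule above reduces to a simple predicate; this is an illustrative sketch, not the Cache's actual code.

```python
def should_cache(request_from_browser: bool, store_is_write_enabled: bool) -> bool:
    # Requests that came through the CDN will be cached by the CDN itself,
    # and photos on read-only volumes are older, rarely re-read content,
    # so caching them buys little.
    return request_from_browser and store_is_write_enabled
```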
16
Haystack Store
17
Haystack Store
- Each Store machine manages multiple physical volumes
- Each physical volume is represented as a single large file, e.g. "/hay/haystack_<logical volume id>"
- Keeps in-memory mappings from photo ids to filesystem metadata (file, offset, size, etc.)
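A minimal sketch of that in-memory mapping, assuming a (key, alternate key) pair identifies a needle; the class and field names are illustrative.

```python
# Assumed shape of a Store machine's in-memory mapping:
# (key, alternate key) -> where the needle lives inside the volume file.
from dataclasses import dataclass

@dataclass
class NeedleLocation:
    flags: int      # e.g. a deleted bit
    offset: int     # byte offset of the needle in the volume file
    size: int       # number of bytes to read for this needle

class PhysicalVolumeIndex:
    def __init__(self, volume_path: str):
        self.volume_path = volume_path     # e.g. "/hay/haystack_341"
        self.needles = {}                  # (key, alt_key) -> NeedleLocation

    def locate(self, key: int, alt_key: int):
        return self.needles.get((key, alt_key))
```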
18
Layout of Haystack Store file
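The store file is a superblock followed by a sequence of needles, each carrying a header magic number, cookie, key, alternate key, flags, size, the photo data, a footer magic number, a checksum, and padding. A rough, assumption-laden packing sketch (magic values and exact field widths here are made up):

```python
import struct
import zlib

HEADER_MAGIC = 0xDEADBEEF   # hypothetical marker used to detect corruption
FOOTER_MAGIC = 0xFEEDFACE

def pack_needle(cookie: int, key: int, alt_key: int, flags: int, data: bytes) -> bytes:
    header = struct.pack(">IQQIBQ", HEADER_MAGIC, cookie, key, alt_key, flags, len(data))
    footer = struct.pack(">II", FOOTER_MAGIC, zlib.crc32(data))
    needle = header + data + footer
    pad = (-len(needle)) % 8            # needles are aligned to an 8-byte boundary
    return needle + b"\x00" * pad
```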
19
Haystack Store Read
- Cache machine supplies the logical volume id, key, alternate key, and cookie to the Store machine
- Store machine looks up the relevant metadata in its in-memory mappings
- Seeks to the appropriate offset in the volume file and reads the entire needle
- Verifies the cookie and the integrity of the data
- Returns the data to the Cache
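Building on the in-memory mapping sketch above, the read path might look like this; cookie and checksum verification are left as comments.

```python
DELETED_FLAG = 0x1   # assumed bit inside NeedleLocation.flags

def read_photo(volume_index, key: int, alt_key: int, cookie: int) -> bytes:
    """Sketch of the read path: one in-memory lookup, one seek, one read."""
    loc = volume_index.locate(key, alt_key)
    if loc is None or (loc.flags & DELETED_FLAG):
        raise KeyError("photo missing or deleted")
    with open(volume_index.volume_path, "rb") as f:
        f.seek(loc.offset)            # jump straight to the needle: no directory walks
        needle = f.read(loc.size)     # the whole needle in a single read
    # A real Store machine now checks the needle's cookie against the supplied
    # cookie and verifies the data checksum before replying to the Cache.
    return needle
```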
20
Haystack Store Write
- Web server provides the logical volume id, key, alternate key, cookie, and data to the Store machines
- Each Store machine synchronously appends the needle to its physical volume file
- Updates its in-memory mappings as needed
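A corresponding write sketch: append the packed needle to the volume file, fsync, then update the in-memory mapping. It reuses the hypothetical pack_needle and NeedleLocation from the earlier sketches.

```python
import os

def write_photo(volume_index, key: int, alt_key: int, cookie: int, data: bytes) -> None:
    """Sketch of the write path: synchronous append, then update the mapping."""
    needle = pack_needle(cookie, key, alt_key, flags=0, data=data)
    with open(volume_index.volume_path, "ab") as f:
        f.seek(0, os.SEEK_END)
        offset = f.tell()             # where this needle starts in the volume file
        f.write(needle)
        f.flush()
        os.fsync(f.fileno())          # append durably before acknowledging the web server
    volume_index.needles[(key, alt_key)] = NeedleLocation(
        flags=0, offset=offset, size=len(needle))
```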
21
Haystack Store Delete
- Store machine sets the delete flag in both the in-memory mapping and in the volume file
- Space occupied by deleted needles is not reclaimed immediately. How to reclaim it? Compaction!
- This matters because roughly 25% of photos get deleted in a given year
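A sketch of delete and compaction against the same hypothetical structures as the earlier sketches: delete only flips a flag, and compaction copies live needles into a fresh file.

```python
def delete_photo(volume_index, key: int, alt_key: int) -> None:
    """Mark the needle deleted in memory; a real Store also sets the flag on disk."""
    loc = volume_index.needles.get((key, alt_key))
    if loc is not None:
        loc.flags |= DELETED_FLAG     # space stays occupied until compaction

def compact(volume_index, new_path: str) -> None:
    """Copy only live needles into a fresh volume file, reclaiming dead space."""
    with open(volume_index.volume_path, "rb") as src, open(new_path, "wb") as dst:
        for needle_key, loc in list(volume_index.needles.items()):
            if loc.flags & DELETED_FLAG:
                del volume_index.needles[needle_key]   # drop deleted entries entirely
                continue
            src.seek(loc.offset)
            needle = src.read(loc.size)
            loc.offset = dst.tell()                    # new location in the compacted file
            dst.write(needle)
    volume_index.volume_path = new_path                # a real system swaps files atomically
```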
22
Layout of Haystack Index file
23
Haystack Index File
- The index file provides the minimal metadata required to locate a particular needle in the store file
- Main purpose: load the needle metadata into memory quickly, without traversing the much larger store file
- The index is usually less than 1% of the size of the store file
- Problem: the index is updated asynchronously, so it can be stale in two ways: needles may exist without a corresponding index record, and index records are not updated when a needle is deleted
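An illustrative loader for such an index file; the record packing here is an assumption, chosen only to mirror the fields named above (key, alternate key, flags, offset, size).

```python
import struct

# Illustrative index-record packing; real on-disk widths may differ.
INDEX_RECORD = struct.Struct(">QIBQQ")   # key, alt_key, flags, offset, size

def load_index(index_path: str) -> dict:
    """Rebuild the in-memory mapping from the small index file, without
    scanning the much larger store file."""
    needles = {}
    with open(index_path, "rb") as f:
        while chunk := f.read(INDEX_RECORD.size):
            if len(chunk) < INDEX_RECORD.size:
                break                                  # truncated record from a crash
            key, alt_key, flags, offset, size = INDEX_RECORD.unpack(chunk)
            needles[(key, alt_key)] = (flags, offset, size)
    return needles

# Because index writes are asynchronous, the tail of the store file may hold
# needles with no index record; the Store scans that tail on startup to catch up.
```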
24
File System
Haystack Store uses XFS, an extent-based file system. Advantages:
- Blockmaps for several contiguous large files are small enough to hold in main memory
- Efficient file preallocation, which mitigates fragmentation
25
Failure Recovery
- Pitchfork: a background task that periodically checks the health of each Store machine
- If a health check fails, it marks all logical volumes on that Store machine as read-only
- Underlying failures are addressed manually, with operations such as bulk sync
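A schematic of a pitchfork-style loop, with hypothetical is_healthy and mark_read_only methods, just to show where the read-only marking happens.

```python
import time

def pitchfork_loop(stores, directory, interval_s: int = 60):
    """Schematic checker: probe each Store machine; on failure, mark its
    logical volumes read-only so writes stop while reads continue elsewhere."""
    while True:
        for store in stores:
            if not store.is_healthy():            # e.g. connect, then attempt a test read
                for lv in store.logical_volumes:
                    directory.mark_read_only(lv)
        time.sleep(interval_s)
```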
26
Optimizations
- Compaction: reclaims space used by deleted and duplicate needles
- Saving more memory: the delete flag is folded into the mapping by setting the offset to 0, and the cookie value is not kept in memory
- Batch upload
27
Evaluation
- Characterize photo requests seen by Facebook
- Effectiveness of the Directory and the Cache
- Store performance
28
Photo Requests
29
Traffic Volume
30
Haystack Directory
31
Haystack Cache hit rate
32
Benchmark Performance
33
Production Data
34
Production Data
35
Conclusion
Haystack is a simple and effective storage system:
- Optimized for random reads
- Cheap commodity storage
- Fault-tolerant
- Incrementally scalable
36
Q&A Thanks