1 Boxwood: Abstractions as the Foundation for Storage Infrastructure
  Lidong Zhou, Microsoft Research Silicon Valley
  Joint work with Chandu Thekkath, John MacCormick, Nick Murphy, and Marc Najork

2 Distributed Storage Applications are Hard to Build (Boxwood, 12/06/2004)
  - Distributed storage: low hardware cost, but high development/deployment cost
    - Application logic on low-level storage interface
    - Hardware parallelism and concurrency control
    - Fault tolerance a necessity
    - Incremental expansion and dynamic reconfiguration vs. system consistency
  - Our goal: Distributed storage applications made easy to design, build, and deploy

3 Target Application and Setting
  - Enterprise storage applications and back-end storage for data-intensive Internet services

4 Roadmap: Boxwood Vision; Boxwood Architecture; Building Applications on Boxwood; Performance; Related Work and Conclusion

5 Boxwood Vision
  - Incorporate rich virtualized abstractions into low levels of the storage
  - An evolution path for distributed storage:
    [Diagram labels: Storage Applications]

6 Boxwood Vision
  - Incorporate rich virtualized abstractions into low levels of the storage
  - An evolution path for distributed storage:
    [Diagram labels: Virtual Disk; Storage Applications]

7 Boxwood Vision
  - Incorporate rich virtualized abstractions into low levels of the storage
  - An evolution path for distributed storage:
    [Diagram labels: Storage Applications; Tree, Table, List, ……]

8 Why High-Level Abstractions
  - Reduce the complexity of distributed storage applications
  - Natural continuum of storage virtualization
  - High-level programming language for building distributed storage applications
  - Potential built-in performance optimization by exploiting structural information
    - Caching
    - Prefetching

9 Roadmap: Boxwood Vision; Boxwood Architecture; Building Applications on Boxwood; Performance; Related Work and Conclusion

10 Boxwood Architecture
  [Diagram labels: Storage Application; High-level Storage Abstractions (B-Tree); Chunk Store; Reliable Media (Replicated Logical Device); Magnetic Media; Services (Locking, Logging, Consensus)]

11 Chunk Store
  - Persistent storage with malloc-like interface
  - Virtualization layer that hides the distributed nature
  - Manage address space or free space for higher layers
  - Reliable storage through replicated logical device
  [Diagram labels: Chunk Store (Allocate, De-allocate, Read, Write); Replicated Logical Device]
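
To make the malloc-like interface concrete, here is a minimal in-memory sketch of the four operations named on the slide. It is not Boxwood's implementation: the class name `ChunkStore`, the integer handles, the `offset` parameter, and the dictionary that stands in for the replicated logical device are all assumptions made for illustration.

```python
import itertools


class ChunkStore:
    """Toy in-memory chunk store (hypothetical; not the Boxwood API)."""

    def __init__(self):
        self._chunks = {}                # handle -> bytearray (stand-in for replicated storage)
        self._next = itertools.count(1)  # opaque handle generator; handles are never reused here

    def allocate(self, size):
        """Reserve a zero-filled chunk of `size` bytes and return its handle."""
        handle = next(self._next)
        self._chunks[handle] = bytearray(size)
        return handle

    def deallocate(self, handle):
        """Free the chunk; later reads or writes on the handle raise KeyError."""
        del self._chunks[handle]

    def write(self, handle, data, offset=0):
        """Overwrite part of a chunk without changing its size."""
        chunk = self._chunks[handle]
        if offset < 0 or offset + len(data) > len(chunk):
            raise ValueError("write outside chunk bounds")
        chunk[offset:offset + len(data)] = data

    def read(self, handle):
        """Return an immutable copy of the chunk contents."""
        return bytes(self._chunks[handle])


store = ChunkStore()
h = store.allocate(8)
store.write(h, b"boxwood")
data = store.read(h)      # 7 bytes written into an 8-byte chunk, so one zero byte remains
store.deallocate(h)
```

Callers only ever see handles, which is what lets a real chunk store place or move data across machines without the higher layers noticing.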

12 B-Tree Abstraction
  - B-Tree: A proven useful data structure for storage applications
  - Distributed/reliable B-Link trees in Boxwood
    - B-Link trees: high concurrency with simple locking
    - Distributed reliable storage from chunk store
    - Caching for performance
    - Distributed lock service for consistency
    - Logging for recovery
  [Diagram labels: B-Link Tree (Insert, Delete, Lookup, Enumerate, Create); Chunk Store; Locking; Logging]
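
The "high concurrency with simple locking" property of B-Link trees comes from each node carrying a high key and a pointer to its right sibling, so a reader that lands on a node that has just been split can recover by moving right. The sketch below shows only that lookup step on a hand-built tree; the `Node` layout and function names are assumptions, and locking, insertion, and persistence are deliberately left out.

```python
import bisect


class Node:
    def __init__(self, keys, values=None, children=None, high_key=None, right=None):
        self.keys = keys          # sorted keys (separators in internal nodes)
        self.values = values      # leaf only: values parallel to keys
        self.children = children  # internal only: len(keys) + 1 children
        self.high_key = high_key  # largest key this node may hold; None means unbounded
        self.right = right        # right sibling on the same level

    @property
    def is_leaf(self):
        return self.children is None


def move_right(node, key):
    """Follow right links while the key lies beyond this node's high key."""
    while node.high_key is not None and key > node.high_key:
        node = node.right
    return node


def lookup(root, key):
    node = move_right(root, key)
    while not node.is_leaf:
        # Child i covers keys <= keys[i]; the last child covers the rest.
        node = node.children[bisect.bisect_left(node.keys, key)]
        node = move_right(node, key)
    i = bisect.bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return node.values[i]
    return None


# Leaf `a` has just been split into `a` and `a2`, but the parent has not yet
# been told about `a2`: it still routes every key <= 50 to `a`.
b = Node([60, 70], values=["sixty", "seventy"])
a2 = Node([30, 40], values=["thirty", "forty"], high_key=50, right=b)
a = Node([10, 20], values=["ten", "twenty"], high_key=20, right=a2)
root = Node([50], children=[a, b])

found = lookup(root, 40)    # reaches `a`, then moves right to `a2`
missing = lookup(root, 25)  # moves right to `a2`, key is absent
```

Because a stale parent pointer is harmless to readers, a split does not need to hold locks on the whole root-to-leaf path.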

13 Boxwood Services
  - Distributed lock service for coordinating concurrent access to shared data
  - Logging and recovery service for atomicity in face of transient failures
  - Consensus service for system consistency
  - Clean design of these services is crucial for scalability and for managing complexity
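
As a rough illustration of what coordinating concurrent access through a lock service can look like to a client, the sketch below keeps a lock table keyed by resource name. The slide does not specify lock modes or an API: the shared/exclusive modes, the `try_acquire` and `release` names, and the single-process table (no network, leases, or failure handling) are all assumptions.

```python
class LockService:
    """Toy single-process lock table (hypothetical; not the Boxwood API)."""

    MODES = ("shared", "exclusive")

    def __init__(self):
        self._held = {}  # resource -> (mode, set of client ids holding it)

    def try_acquire(self, client, resource, mode):
        """Grant the lock if compatible with current holders; never blocks."""
        if mode not in self.MODES:
            raise ValueError("unknown lock mode: " + repr(mode))
        entry = self._held.get(resource)
        if entry is None:
            self._held[resource] = (mode, {client})
            return True
        held_mode, clients = entry
        if mode == "shared" and held_mode == "shared":
            clients.add(client)  # any number of readers may share the lock
            return True
        return False             # exclusive conflicts with every other holder

    def release(self, client, resource):
        entry = self._held.get(resource)
        if entry is None or client not in entry[1]:
            raise KeyError("lock not held by this client")
        entry[1].discard(client)
        if not entry[1]:
            del self._held[resource]


locks = LockService()
r1 = locks.try_acquire("node-1", "tree-root", "shared")     # granted
r2 = locks.try_acquire("node-2", "tree-root", "shared")     # granted, readers share
w1 = locks.try_acquire("node-3", "tree-root", "exclusive")  # refused while readers hold it
locks.release("node-1", "tree-root")
locks.release("node-2", "tree-root")
w2 = locks.try_acquire("node-3", "tree-root", "exclusive")  # granted once readers are gone
```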

14 Roadmap: Boxwood Vision; Boxwood Architecture; Building Applications on Boxwood; Performance; Related Work and Conclusion

15 Distributed Storage Applications on Boxwood: A Recipe
  1. Design applications for local storage
     - Map application logic to storage abstractions
  2. Adapt the design for a distributed storage infrastructure
     - Boxwood abstractions are virtualized
     - Boxwood offers facilitating distributed services
  Separating algorithmic design from distributed system concerns is attractive.

16 From B-Link Tree Algorithm to Distributed Reliable B-Link Trees
  B-Link trees on a single machine
  [Diagram labels: B-Link Tree Algorithm; Local Locks; Logging; Local Disks]

17 From B-Link Tree Algorithm to Distributed Reliable B-Link Trees
  Distributed and reliable B-Link trees
  [Diagram labels: B-Link Tree Algorithm; Global Lock Service; Reliable Logging; Chunk Store; Replicated Logical Device]
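
The step from the single-machine design to the distributed one keeps the algorithm and swaps the components underneath it. A minimal sketch of that idea follows, using a trivial counter in place of the B-Link tree algorithm. Everything in it is a stand-in chosen for illustration: `LocalStore`, `ReplicatedStore`, and `increment` are hypothetical names, and `threading.Lock` plays the role of both the local lock and the global lock service.

```python
import threading


class LocalStore:
    """Stand-in for a local disk."""

    def __init__(self):
        self._data = {}

    def read(self, key):
        return self._data.get(key, 0)

    def write(self, key, value):
        self._data[key] = value


class ReplicatedStore:
    """Stand-in for replicated storage: every write goes to all replicas."""

    def __init__(self, replicas=2):
        self.replicas = [{} for _ in range(replicas)]

    def read(self, key):
        return self.replicas[0].get(key, 0)

    def write(self, key, value):
        for replica in self.replicas:
            replica[key] = value


def increment(store, lock, key):
    """Application logic: identical for the local and the distributed setup."""
    with lock:
        value = store.read(key) + 1
        store.write(key, value)
        return value


# Step 1: design against local storage and a local lock.
local_result = increment(LocalStore(), threading.Lock(), "counter")

# Step 2: same function, different backends. A real deployment would pass a
# handle to a global lock service here instead of an in-process lock.
replicated = ReplicatedStore(replicas=3)
global_lock = threading.Lock()
increment(replicated, global_lock, "counter")
replicated_result = increment(replicated, global_lock, "counter")
```

The application function never mentions replication or distribution, which is the separation the recipe argues for.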

18 BoxFS: Multi-Node File Server on Boxwood
  - Exported via NFS v2
  - Directory/File B-Tree
    - Directory: maps names to NFS file handle with embedded B-tree handle
    - File: maps block number to chunk handle
  - File blocks: chunks
  - Locking/caching at file system level
  - ~2500 lines of C# code
  [Diagram labels: BoxFS; B-Link Tree; Chunk Store; Services]
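
The two mappings on this slide (directory: name to file handle; file: block number to chunk handle) can be sketched with plain dictionaries standing in for the B-trees and the chunk store. This is not BoxFS code (which the slide says is C#): the function names, the 4-byte block size, and the integer handles are assumptions chosen to keep the example small.

```python
import itertools

BLOCK_SIZE = 4  # unrealistically small so the block mapping is easy to see

chunks = {}     # chunk handle -> bytes               (stand-in for the chunk store)
directory = {}  # file name -> file handle            (stand-in for a directory B-tree)
files = {}      # file handle -> {block no. -> chunk} (stand-in for per-file B-trees)

_chunk_handles = itertools.count(1)
_file_handles = itertools.count(1)


def write_file(name, data):
    """Replace the file's contents, storing each block in its own chunk."""
    if name not in directory:
        directory[name] = next(_file_handles)
    blocks = {}
    for block_no, start in enumerate(range(0, len(data), BLOCK_SIZE)):
        chunk_handle = next(_chunk_handles)
        chunks[chunk_handle] = data[start:start + BLOCK_SIZE]
        blocks[block_no] = chunk_handle
    # Chunks of any previous version are not freed in this sketch.
    files[directory[name]] = blocks
    return directory[name]


def read_file(name):
    """Resolve name -> file handle -> block map -> chunks, in block order."""
    blocks = files[directory[name]]
    return b"".join(chunks[blocks[n]] for n in sorted(blocks))


fh = write_file("a.txt", b"hello boxwood")
content = read_file("a.txt")
block_count = len(files[fh])  # 13 bytes at 4 bytes per block
```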

19 Roadmap: Boxwood Vision; Boxwood Architecture; Building Applications on Boxwood; Performance; Related Work and Conclusion

20 Prototype Deployment and Performance Evaluation
  - System setup
    - Eight Dell PowerEdge 2650 servers with a single 2.4 GHz Xeon processor, 1 GB of RAM
    - Gigabit Ethernet switch
    - Adaptec AIC-7899 dual SCSI adapter, and 5 SCSI drives
  - Performance evaluation
    - Single-machine non-replicated performance (BoxFS vs. NFS)
    - B-tree operation scalability
    - BoxFS operation scalability

21 BoxFS vs. NFS over NTFS: Connectathon Benchmarks

22 B-Tree Scaling (Private Tree)

23 BoxFS Scaling (Read)

24 B-Tree Scaling (Shared Tree)

25 BoxFS Scaling (Write/MkDirEnt)

26 Roadmap: Boxwood Vision; Boxwood Architecture; Building Applications on Boxwood; Performance; Related Work and Conclusion

27 Related Work
  - Distributed Storage/Operating Systems
    - Virtual/Logical disks
    - File systems
    - Database systems
  - Scalable Distributed Data Structures
    - Linear Hash Table (LH) and its variants (Litwin, 1980-present)
    - Scalable distributed hash table (Gribble et al., 2000)
    - Highly concurrent B-trees (Lehman and Yao, 1981; Sagiv, 1986)

28 Conclusion and Future Directions
  - A storage infrastructure offering virtualized high-level abstractions is promising
  - Future Work:
    - Explore more abstractions and applications; expose flexible interfaces (e.g., through hints)
    - Leverage high-level abstractions for better load balancing, prefetching, and caching
    - Graceful degradation during massive failures

