Efficient data maintenance in GlusterFS using databases

1 Efficient data maintenance in GlusterFS using databases
Joseph Fernandes Dan Lambright

2 Who we are
Joseph Fernandes (Senior Engineer, Red Hat Storage)
Dan Lambright (Principal Engineer, Red Hat Storage)

3 Agenda
Quick GlusterFS overview
Data maintenance challenges
Existing solutions
Proposed solution: optimized database
Case study: GlusterFS data cache tier
Lessons learned
What's next

4 What is GlusterFS
Distributed file system
Software-defined NAS
Transport: TCP/IP or RDMA
Access: native client, SMB, NFS

5 What is Data Maintenance
Maintenance tasks performed on data for protection, performance, and optimum storage utilization

6 Challenges in Data Maintenance
Data maintenance has an overhead on CPU, memory, storage, and network. Therefore:
Fast search: search should be precise and fast
Rich metadata: should have rich metadata filters (modification frequency, IO sizes, etc.)
Distributed: should deal with the distributed nature of the data
Load balancing: should balance the load

7 Existing Solutions
File system crawl: slow
File system log: fast to write, but slow to read and takes more space
Metadata databases: Gluster does not have one
In-memory inode caches: not durable

8 Optimized DB for GlusterFS
Proposed Optimized DB for GlusterFS

9 Optimized DB for GlusterFS
“Record now, consume later”
Database optimized to record fast
Good querying capabilities
Embedded database

10 LibgfDB
API abstraction: works with any DB
Rich search filters: frequency counters, IO size counters, parts of file metadata, etc.
Non-centralized: local to bricks
Performance optimization options
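
The "record now, consume later" idea behind these search filters can be sketched with Python's built-in sqlite3 module. This is only an illustration: the `file_heat` table, its columns, and the helper names are hypothetical and are not the LibgfDB schema or API.

```python
import sqlite3

# Hypothetical schema (not the LibgfDB schema): one row per file,
# updated each time an IO is recorded.
SCHEMA = """
CREATE TABLE IF NOT EXISTS file_heat (
    gfid        TEXT PRIMARY KEY,
    last_write  REAL NOT NULL DEFAULT 0,
    last_read   REAL NOT NULL DEFAULT 0,
    write_count INTEGER NOT NULL DEFAULT 0,
    read_count  INTEGER NOT NULL DEFAULT 0
)
"""


def record_io(conn, gfid, now, is_write):
    """Record one IO: create the row if missing, then bump time and counter."""
    kind = "write" if is_write else "read"  # fixed choice, never user input
    conn.execute("INSERT OR IGNORE INTO file_heat (gfid) VALUES (?)", (gfid,))
    conn.execute(
        f"UPDATE file_heat SET last_{kind} = ?, {kind}_count = {kind}_count + 1 "
        "WHERE gfid = ?",
        (now, gfid),
    )


def files_touched_since(conn, since, min_hits=1):
    """Files read or written at or after `since` with at least `min_hits` IOs."""
    rows = conn.execute(
        "SELECT gfid FROM file_heat "
        "WHERE (last_write >= ? OR last_read >= ?) "
        "AND write_count + read_count >= ? ORDER BY gfid",
        (since, since, min_hits),
    )
    return [row[0] for row in rows]


def files_idle_since(conn, before):
    """Files neither read nor written at or after `before`."""
    rows = conn.execute(
        "SELECT gfid FROM file_heat "
        "WHERE last_write < ? AND last_read < ? ORDER BY gfid",
        (before, before),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
record_io(conn, "a", 100.0, is_write=True)
record_io(conn, "a", 150.0, is_write=False)
record_io(conn, "b", 10.0, is_write=False)
conn.commit()

hot = files_touched_since(conn, since=90.0, min_hits=2)
cold = files_idle_since(conn, before=90.0)
print(hot, cold)
```

The recording path does only a keyed insert and update, while the more expensive filtering is deferred to whoever queries the store later.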

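11 [Architecture diagram] The Gluster client sends IO to the Gluster brick, where the CTR xlator (above the POSIX xlator) inserts/updates records in the datastore through LIBGFDB. Data maintenance scanners query the datastore through LIBGFDB.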

12 DataStore Optimization: Sqlite3
PRAGMA page_size: align page size
PRAGMA cache_size: increased cache size
PRAGMA journal_mode: change to WAL
PRAGMA wal_autocheckpoint: autocheckpoint less often
PRAGMA synchronous: set to NORMAL
PRAGMA auto_vacuum: set to NONE
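
As a rough illustration, these PRAGMAs could be applied from Python's built-in sqlite3 module as below. The numeric values, the file name, and the helper names are assumptions chosen for the example; the slides do not state which values Gluster uses.

```python
import os
import sqlite3
import tempfile


def open_tuned_db(path, page_size=4096, cache_pages=12500, checkpoint_pages=25000):
    """Open a SQLite database tuned to record fast (all values illustrative)."""
    conn = sqlite3.connect(path)
    # page_size and auto_vacuum only take effect on a database that has
    # no tables yet, so set them first and before switching to WAL.
    conn.execute(f"PRAGMA page_size = {int(page_size)}")
    conn.execute("PRAGMA auto_vacuum = NONE")
    conn.execute(f"PRAGMA cache_size = {int(cache_pages)}")
    # Write-ahead logging: writers append to the WAL instead of
    # rewriting pages in the main database file.
    conn.execute("PRAGMA journal_mode = WAL").fetchone()
    # Checkpoint the WAL into the database file less often.
    conn.execute(f"PRAGMA wal_autocheckpoint = {int(checkpoint_pages)}").fetchone()
    # NORMAL skips the sync on every commit and syncs at checkpoints.
    conn.execute("PRAGMA synchronous = NORMAL")
    return conn


def read_settings(conn):
    """Read back the current value of each tuned PRAGMA."""
    names = ("page_size", "cache_size", "journal_mode",
             "wal_autocheckpoint", "synchronous", "auto_vacuum")
    return {name: conn.execute(f"PRAGMA {name}").fetchone()[0] for name in names}


db_path = os.path.join(tempfile.mkdtemp(), "brick.db")  # hypothetical file name
conn = open_tuned_db(db_path)
settings = read_settings(conn)
print(settings)
conn.close()
```

With `synchronous = NORMAL` in WAL mode, a power loss can lose the most recent commits but should not corrupt the database, which is the trade-off a "record fast" store accepts.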

13 DataStore Optimization: Sqlite3
[Diagram: insert/update path through the buffer cache, shared memory file, and write-ahead log (WAL), with sync and checkpoint into the database file]

14 Cache Tiering (Gluster 3.7 feature)
Logical volume composed of diverse storage units
  Secure / nonsecure, compressed / uncompressed, etc.
Cache tiering
  Fast storage as cache for slow storage
  Fa$t SSD, slow HDD
  Fast 2X replicated, slow erasure coded
What goes in the cache?
  DB tracks usage patterns
  Files migrate between tiers per usage
  Migration is slow

15 Policies for Smart Migration
File size
Sequential vs. random
Access rate
Migration frequency
Break files into chunks
  Gluster “sharding” feature

16 Gluster implementation
New volume type: tier
Attach / detach hot bricks to existing volumes
Migration uses existing mechanisms
Tweaks to Distributed Hash Table (DHT)
  Old DHT: destination node = hash(file+path)
  New: always try hot tier first
Hot tier may be multiple bricks. Which brick on tier?
  Choose with old DHT algorithm
  “Stacking DHT”
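
The "hot tier first, then old DHT within the tier" lookup can be sketched as follows. This is a simplified stand-in: the modulo-of-hash placement, the function names, and the brick names are assumptions made for the example and do not reproduce Gluster's actual DHT layout logic.

```python
import hashlib


def dht_pick(name, bricks):
    """Pick a brick by hashing the file name (simplified stand-in for DHT)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return bricks[int.from_bytes(digest[:4], "big") % len(bricks)]


def tier_lookup(name, hot_bricks, cold_bricks, exists):
    """Try the hot tier first, then the cold tier.

    Within each tier the brick is chosen by the ordinary hash, so the
    tier choice is stacked on top of the per-tier placement.
    `exists(brick, name)` is a caller-supplied check (hypothetical).
    """
    for bricks in (hot_bricks, cold_bricks):
        brick = dht_pick(name, bricks)
        if exists(brick, name):
            return brick
    return None


hot = ["hot-1", "hot-2"]
cold = ["cold-1", "cold-2", "cold-3"]

# The file starts on the cold tier, on the brick its hash selects.
stored = {(dht_pick("video.mp4", cold), "video.mp4")}
exists = lambda brick, name: (brick, name) in stored

before = tier_lookup("video.mp4", hot, cold, exists)

# Simulate a promotion: the file now also resolves on the hot tier.
stored.add((dht_pick("video.mp4", hot), "video.mp4"))
after = tier_lookup("video.mp4", hot, cold, exists)
print(before, after)
```

Because each tier applies the same hash over its own brick list, no extra lookup table is needed to find a file after it moves between tiers.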

17 [Architecture diagram] Client side: other client xlators, the tier xlator, a hot DHT and a cold DHT, and the replication xlator. Server side, for each of the hot tier and the cold tier: other server xlators, CTR xlator, POSIX xlator, brick storage, and a heat data store. Promotion and demotion move files between the cold and hot tiers.
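
One way to picture how the heat data stores could drive promotion and demotion is the sketch below. The `HeatRecord` type, the single time-window rule, and the function name are assumptions for illustration; the slides do not specify the actual migration policy.

```python
from dataclasses import dataclass


@dataclass
class HeatRecord:
    """Hypothetical row read from a tier's heat data store."""
    name: str
    tier: str           # "hot" or "cold"
    last_access: float  # seconds since some epoch


def plan_migration(records, now, window):
    """Return (promote, demote) lists for one migration cycle.

    Cold-tier files accessed within `window` seconds are promoted;
    hot-tier files idle for longer than `window` are demoted.
    """
    promote = sorted(r.name for r in records
                     if r.tier == "cold" and now - r.last_access <= window)
    demote = sorted(r.name for r in records
                    if r.tier == "hot" and now - r.last_access > window)
    return promote, demote


records = [
    HeatRecord("a", "cold", 95.0),  # recently used on the cold tier
    HeatRecord("b", "cold", 10.0),  # idle on the cold tier
    HeatRecord("c", "hot", 20.0),   # idle on the hot tier
    HeatRecord("d", "hot", 99.0),   # recently used on the hot tier
]
promote, demote = plan_migration(records, now=100.0, window=30.0)
print(promote, demote)
```

Since migration is slow, a real policy would likely also weigh file size and access rate before moving anything.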

18 Benchmarking: how well does it work?
Many benchmarks are a poor fit for tiering
  A cache miss triggers migration, which is costly
Tiering needs stable workloads
  Data stays in hot tier for hours or longer
  e.g. a set of videos popular for several days
New benchmarking tool
  Can use with dm-cache, Ceph tiering, …
DB results
  Scalability problems

19 Lessons learned: DB updates can be expensive
Updates can be expensive: read + modify + update
Scalability issues: DB queries may not scale; with a single database file and WAL, complex queries can be slow
Durable metadata: durability (ACID semantics) is expensive, so the DB is not suited for durable metadata

20 What's next
Libgfdb performance options: PLog
Sqlite3 database sharding
Ceph tier implementation: Bloom filters
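
For readers unfamiliar with Bloom filters, here is a minimal generic sketch of the data structure. It is not the Ceph implementation; the class, its sizes, and the hashing scheme are assumptions chosen for brevity.

```python
import hashlib


class BloomFilter:
    """Fixed-size set summary: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True only if every derived bit is set.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


seen = BloomFilter()
seen.add("gfid-1")
seen.add("gfid-2")
print("gfid-1" in seen)
```

Recording an access only sets a few bits in memory, so it avoids the read + modify + update cost of a database row, at the price of approximate answers and no per-file counters.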

21 Feature Page: es/data-classification
Gluster Forge:
Joseph Fernandes
Dan Lambright

22 THANK YOU

