Efficient data maintenance in GlusterFS using databases

Joseph Fernandes Dan Lambright

Who we are
Joseph Fernandes (Senior Engineer, Red Hat Storage)
Dan Lambright (Principal Engineer, Red Hat Storage)

Agenda
Quick GlusterFS overview
Data maintenance challenges
Existing solutions
Proposed solution: optimized database
Case study: GlusterFS data cache tier
Lessons learned
What's next

What is GlusterFS
Distributed file system
Software-defined NAS
TCP/IP or RDMA
Native client, SMB, NFS

What is Data Maintenance
Maintenance tasks performed on data for protection, performance, and optimum storage utilization

Challenges in Data Maintenance
Data maintenance has an overhead on CPU, memory, storage, and network. Therefore:
Fast search: search should be precise and fast
Rich metadata: should have rich metadata filters (modification frequency, IO sizes, etc.)
Distributed: should deal with the distributed nature of the data
Load balancing: should do load balancing

Existing Solutions
File system crawl: slow
File system log: fast writes, but slow reads and more space
Metadata databases: Gluster does not have one
In-memory inode caches: not durable
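
To illustrate why the crawl approach is slow, here is a minimal Python sketch (not GlusterFS code). The helper name `find_cold_files_by_crawl` and the choice of modification time as the filter are assumptions made for illustration. Every call has to walk the whole tree and stat every file, so the cost grows with the total number of files rather than with the number of matches.

```python
import os
import time


def find_cold_files_by_crawl(root, max_age_seconds, now=None):
    """Return sorted paths under root not modified within max_age_seconds.

    Hypothetical helper: it visits and stats every file on every call.
    """
    now = time.time() if now is None else now
    cold = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.stat(path).st_mtime
            except OSError:
                continue  # file disappeared or is unreadable; skip it
            if now - mtime > max_age_seconds:
                cold.append(path)
    return sorted(cold)
```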

Optimized DB for GlusterFS
Proposed Optimized DB for GlusterFS

Optimized DB for GlusterFS
“Record now, consume later”
Database optimized to record fast
Good querying capabilities
Embedded database

LibgfDB
API abstraction: any DB
Rich search filters: frequency counters, size-of-IO counters, parts of file metadata, etc.
Non-centralized: local to bricks
Performance optimization options
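
The "record now, consume later" idea with a frequency-counter filter can be sketched with Python's built-in `sqlite3` module. This is not the libgfdb API: the table `file_heat`, its columns, and the functions `record_write` and `query_cold` are hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical schema: one row per file, keyed by a file identifier.
SCHEMA = """
CREATE TABLE IF NOT EXISTS file_heat (
    gfid        TEXT PRIMARY KEY,
    last_write  REAL NOT NULL,
    write_count INTEGER NOT NULL DEFAULT 0
)
"""


def open_heat_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_write(conn, gfid, when):
    """Record now: bump the write counter, inserting the row on first sight."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE file_heat SET last_write = ?, write_count = write_count + 1 "
            "WHERE gfid = ?",
            (when, gfid),
        )
        if cur.rowcount == 0:
            conn.execute(
                "INSERT INTO file_heat (gfid, last_write, write_count) "
                "VALUES (?, ?, 1)",
                (gfid, when),
            )


def query_cold(conn, older_than, max_writes):
    """Consume later: files last written before older_than with few writes."""
    rows = conn.execute(
        "SELECT gfid FROM file_heat "
        "WHERE last_write < ? AND write_count <= ? ORDER BY gfid",
        (older_than, max_writes),
    )
    return [row[0] for row in rows]
```

A maintenance scanner would then call something like `query_cold(conn, older_than=cutoff, max_writes=1)` instead of crawling the brick.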

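[Diagram: the Gluster client sends IO to the Gluster brick, where the CTR xlator, stacked above the POSIX xlator, inserts/updates records in the data store through LIBGFDB. Data maintenance scanners query the data store through LIBGFDB.]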

Datastore Optimization: Sqlite3
PRAGMA page_size: align page size
PRAGMA cache_size: increased cache size
PRAGMA journal_mode: change to WAL
PRAGMA wal_autocheckpoint: checkpoint less often
PRAGMA synchronous: set to NORMAL
PRAGMA auto_vacuum: set to NONE
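
As a rough sketch of how these pragmas could be applied when a database is opened, using Python's `sqlite3` module. The slide gives no numbers, so the numeric values below (page size, cache pages, checkpoint interval) are placeholders, and `open_tuned_db` and `read_settings` are hypothetical helper names.

```python
import sqlite3


def open_tuned_db(path, page_size=4096, cache_pages=12500,
                  autocheckpoint_pages=25000):
    """Open a database with the record-fast settings listed above.

    All numeric defaults are illustrative assumptions, not GlusterFS values.
    """
    conn = sqlite3.connect(path)
    # page_size must be set before the database file is first written
    # and before switching to WAL mode.
    conn.execute(f"PRAGMA page_size = {int(page_size)}")
    conn.execute("PRAGMA auto_vacuum = NONE")
    conn.execute(f"PRAGMA cache_size = {int(cache_pages)}")
    conn.execute("PRAGMA journal_mode = WAL").fetchone()
    # A larger interval means automatic checkpoints run less often.
    conn.execute(f"PRAGMA wal_autocheckpoint = {int(autocheckpoint_pages)}").fetchone()
    conn.execute("PRAGMA synchronous = NORMAL")
    return conn


def read_settings(conn):
    """Read back the pragma values so they can be checked or logged."""
    names = ("page_size", "cache_size", "journal_mode",
             "wal_autocheckpoint", "synchronous", "auto_vacuum")
    return {n: conn.execute(f"PRAGMA {n}").fetchone()[0] for n in names}
```

Reading the settings back is worthwhile because SQLite silently ignores some pragmas when they are issued at the wrong time, for example `page_size` on a database that already has content.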

DataStore Optimization: Sqlite3
[Diagram: Sqlite3 write path. Labels: insert/update, buffer cache, shared memory file, write-ahead log (WAL), sync, checkpoint, database file.]
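
The relationship between the WAL and the database file can be observed directly. The sketch below, again using Python's `sqlite3` module, commits some rows in WAL mode, then forces a checkpoint that copies the logged pages into the database file and truncates the log. The function name and table are hypothetical.

```python
import os
import sqlite3


def write_then_checkpoint(path, n_rows=100):
    """Commit rows in WAL mode, then checkpoint; return WAL sizes and status."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode = WAL").fetchone()
    conn.execute("CREATE TABLE IF NOT EXISTS t (k INTEGER PRIMARY KEY, v TEXT)")
    with conn:  # one transaction; committed pages are appended to the WAL
        conn.executemany("INSERT INTO t (v) VALUES (?)", [("x",)] * n_rows)

    wal_path = path + "-wal"
    wal_before = os.path.getsize(wal_path)
    # TRUNCATE: move WAL content into the database file, then empty the WAL.
    busy, _log_frames, _checkpointed = conn.execute(
        "PRAGMA wal_checkpoint(TRUNCATE)").fetchone()
    wal_after = os.path.getsize(wal_path)
    conn.close()
    return wal_before, busy, wal_after
```

Checkpointing less often keeps inserts and updates cheap, at the price of a larger WAL that readers have to consult.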

Cache Tiering (Gluster 3.7 feature)
A logical volume composed of diverse storage units
  Secure / nonsecure, compressed / uncompressed, etc.
Cache tiering
  Fast storage as cache for slow storage
  Fa$t SSD, slow HDD
  Fast 2X replicated, slow erasure coded
What goes in the cache?
  DB tracks usage patterns
  Files migrate between tiers per usage
  Migration is slow

Policies for Smart Migration
File size
Sequential vs. random
Access rate
Migration frequency
Break files into chunks
  Gluster “sharding” feature
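
A policy combining two of these inputs, access rate and file size, might look like the following Python sketch. The thresholds, the `FileStats` fields, and the `decide` function are all hypothetical; the slides do not specify the actual rules.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    size_bytes: int
    reads_in_window: int   # accesses counted during the last scan window
    writes_in_window: int
    on_hot_tier: bool


def decide(stats, promote_hits=3, max_hot_size=1 << 30):
    """Return 'promote', 'demote', or 'stay' for one file.

    promote_hits and max_hot_size are illustrative thresholds only.
    """
    hits = stats.reads_in_window + stats.writes_in_window
    if not stats.on_hot_tier:
        # Migration is slow, so only move files that are both busy
        # and small enough to be worth copying.
        if hits >= promote_hits and stats.size_bytes <= max_hot_size:
            return "promote"
        return "stay"
    # On the hot tier: evict files nobody touched during the window.
    return "demote" if hits == 0 else "stay"
```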

Gluster implementation
New volume type: tier
Attach / detach hot bricks to existing volumes
Migration uses existing mechanisms
Tweaks to Distributed Hash Table (DHT)
  Old DHT: destination node = hash(file+path)
  New: always try hot tier first
Hot tier may be multiple bricks. Which brick on tier?
  Choose with old DHT algorithm
  “Stacking DHT”
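
The stacked lookup can be sketched as follows. This is a simplification under stated assumptions: a SHA-256 digest taken modulo the brick count stands in for Gluster's real DHT hashing and layout, and `exists_on` is a hypothetical callback that reports whether a brick holds the file.

```python
import hashlib


def pick_brick(name, bricks):
    """Stand-in for the 'old DHT' step: hash the name onto one brick."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return bricks[int.from_bytes(digest[:4], "big") % len(bricks)]


def locate(name, hot_bricks, cold_bricks, exists_on):
    """Try the hot tier first; fall back to the cold tier.

    Within each tier the brick is chosen by the same hash step,
    which is the sense in which the two DHTs are stacked.
    """
    hot = pick_brick(name, hot_bricks)
    if exists_on(hot, name):
        return hot
    return pick_brick(name, cold_bricks)
```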

[Diagram: tier translator stack. Labels: other client xlators, tier xlator, hot DHT, cold DHT, replication xlator; a hot tier and a cold tier, each with other server xlators, a CTR xlator, a POSIX xlator, brick storage, and a heat data store; promotion and demotion between the tiers.]

Benchmarking: how well does it work?
Many benchmarks are a poor fit for tiering
  A cache miss triggers migration, which is costly
Tiering needs stable workloads
  Data stays in hot tier for hours or longer
  e.g. a set of videos popular for several days
New benchmarking tool
  Can use with dm-cache, Ceph tiering, …
DB results
  Scalability problems

Lessons Learned: DB updates can be expensive
DB queries may have scalability problems
Durability (ACID semantics) is expensive
Updates can be expensive: read + modify + update
Scalability issues: with a single database file and WAL, complex queries can be slow
Durable metadata: the DB is not suited for it

What's next
Libgfdb performance options: PLog
Sqlite3 database sharding
Ceph tier implementation: Bloom filters
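
For readers unfamiliar with Bloom filters: they record set membership in a fixed-size bit array, so "was this file accessed recently?" can be answered without a per-file database update, at the cost of occasional false positives. The Python sketch below is a generic textbook version, not Ceph's or Gluster's implementation; the sizes and the SHA-256-based hashing are assumptions.

```python
import hashlib


class BloomFilter:
    """Fixed-size membership filter: no false negatives, some false positives."""

    def __init__(self, n_bits=1024, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray((n_bits + 7) // 8)

    def _positions(self, key):
        # Derive n_hashes bit positions by salting the key with an index.
        for i in range(self.n_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, key):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(key))
```

Recording an access is then `accessed.add(gfid)`, a few bit writes in memory, rather than a read + modify + update cycle against the database.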

Feature Page es/data-classification Gluster Forge: Joseph Fernandes Dan Lambright

THANK YOU
