1
Research Introduction
Taeuk Kim
Laboratory for Advanced System Software
Sogang University
2
Contents
Background
Introduction of Parallel File System
LADS
FT-LADS
What I will do
3
Background
Large amounts of scientific data
Satellite images
Nuclear data
Big data era
Cooperative research and analysis => large-scale data transfers ↑
4
Background
Network bandwidth scale ↑
10G Ethernet, InfiniBand on private networks
File system bandwidth is relatively ↓
Bottleneck on the storage server ↑
One solution to the file system bottleneck is a PFS (Parallel File System)
5
Parallel File System
Time consumed by I/O processing ↑
Idea: distribute the I/O workload
Manage files by chunking them into objects
Maintain multiple storage servers and process file objects in parallel with multi-threading
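
The chunking idea on this slide can be sketched in a few lines. This is only an illustration: it assumes fixed-size chunks placed on servers in round-robin order, and the function name and parameters (chunk_layout, chunk_size, num_servers) are hypothetical, not part of any PFS API.

```python
def chunk_layout(file_size, chunk_size, num_servers):
    """Split a file into fixed-size objects and assign each to a server.

    Assumption: objects are placed round-robin across servers.
    Returns a list of (object_index, server_index, start_offset, length).
    """
    layout = []
    for index, start in enumerate(range(0, file_size, chunk_size)):
        # The last object may be shorter than chunk_size.
        length = min(chunk_size, file_size - start)
        layout.append((index, index % num_servers, start, length))
    return layout


if __name__ == "__main__":
    # A 10-byte file, 4-byte chunks, 2 storage servers.
    for entry in chunk_layout(10, 4, 2):
        print(entry)
```

Because consecutive objects land on different servers, several threads can read or write them at the same time.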
6
Parallel File System - Lustre
MDS/MGS: Metadata Server, Management Server
MDT: Metadata Target
OSS: Object Storage Server
OST: Object Storage Target
7
LADS
There are still problems -> I/O contention on the PFS
Many clients use the same PFS -> QoS ↓
Disk failure -> the disk is rebuilt from RAID
Unbalanced I/O load => performance ↓
If we can avoid I/O congestion, performance will be better
LADS (Layout-Aware Data Scheduling) is motivated by the above -> an application-level solution for data transfer
It helps avoid I/O contention on the OSSs/OSTs
8
LADS
Avoiding congestion in cluster storage
When the data arrival rate at an OSS or OST > its data service rate -> overflow occurs (in Lustre)
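
The arrival-rate versus service-rate condition can be illustrated with a toy model. This is a sketch under simple assumptions: time advances in discrete steps, rates are constant, and the server has a bounded request queue. The names (backlog_over_time, queue_capacity) are hypothetical and do not model Lustre's internals.

```python
def backlog_over_time(arrival_rate, service_rate, queue_capacity, steps):
    """Track the queue backlog of one server over discrete time steps.

    Assumption: each step, arrival_rate units arrive and at most
    service_rate units are served; anything beyond queue_capacity
    is counted as overflow.
    """
    backlog = 0
    overflow = 0
    history = []
    for _ in range(steps):
        backlog += arrival_rate
        backlog -= min(backlog, service_rate)
        if backlog > queue_capacity:
            overflow += backlog - queue_capacity
            backlog = queue_capacity
        history.append(backlog)
    return history, overflow


if __name__ == "__main__":
    # Arrival rate above service rate: the backlog grows until it overflows.
    print(backlog_over_time(5, 3, 6, 4))
    # Arrival rate below service rate: the queue stays empty.
    print(backlog_over_time(3, 5, 6, 4))
```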
9
LADS
In a PFS, we can know the layout of objects
-> It is possible to distribute objects well and avoid congestion
Knowing the layout makes it possible to schedule I/O so that it does not become congested
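
One way to picture how layout knowledge is used: once the object-to-OST mapping is known, pending objects can be grouped into one queue per OST, so a scheduler can see how much work each OST has. The sketch below assumes the layout is already available as (object_id, ost_index) pairs; the function name queues_by_ost is hypothetical.

```python
from collections import defaultdict


def queues_by_ost(object_layout):
    """Group object ids into one queue per OST.

    Assumption: object_layout is an iterable of (object_id, ost_index)
    pairs obtained from the file system's layout information.
    """
    queues = defaultdict(list)
    for object_id, ost in object_layout:
        queues[ost].append(object_id)
    return dict(queues)


if __name__ == "__main__":
    layout = [("a", 0), ("b", 1), ("c", 0)]
    print(queues_by_ost(layout))
```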
10
LADS
Transfer protocol
Step 1 – (src) NEW_FILE to comm’s Q, NEW_BLOCK to OST’s Q
Step 2 – (sink) FILE_ID to comm’s Q
Step 3 – (src) wake up N I/O threads, read data from the OST, and send it to the sink
Step 4 – (sink) receive the request and read from the src’s RMA buffer
Step 5 – (sink) store data objects in the sink’s OST (layout-aware)
Step 6 – (src) on BLOCK_DONE, release the RMA buffer and go to step 3
Scheduling
Layout-aware: knowing which OST stores which data objects
Congestion-aware: the default is RR (round-robin), but when a thread is accessing an OST, other threads skip that OST and move on to the next
Object caching
When the RMA buffer is full, an NVM buffer is used
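
The congestion-aware rule above (round-robin by default, skipping an OST that another thread is already accessing) can be sketched as a selection function. This is an illustrative sketch, not the LADS implementation: the names pick_next_ost, queues, and busy are hypothetical, and the set of busy OSTs is assumed to be maintained elsewhere by the I/O threads.

```python
def pick_next_ost(start, queues, busy):
    """Return the next OST index to serve, or None if nothing is available.

    Assumptions: queues[i] is the list of pending blocks for OST i, and
    busy is the set of OST indices currently accessed by another thread.
    The scan starts at `start` and proceeds in round-robin order.
    """
    count = len(queues)
    for step in range(count):
        ost = (start + step) % count
        # Skip OSTs that are in use or have no pending blocks.
        if ost not in busy and queues[ost]:
            return ost
    return None


if __name__ == "__main__":
    queues = [["b0"], ["b1"], ["b2"]]
    # OST 0 is busy, so a thread starting at 0 moves on to OST 1.
    print(pick_next_ost(0, queues, busy={0}))
```

Returning None when every OST is busy or empty leaves the choice of waiting or retrying to the caller.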
11
FT-LADS
Disk failure situation (at Facebook)
New data: 500TB~900TB/day
Failure: 100k blocks lost/day
Reconstruction traffic: 180TB/day
Recovery from disk failure is important
Currently, LADS has no fault tolerance mechanism
12
FT-LADS
FT-LADS is a poster accepted at IEEE CLUSTER’16
It proposes a fault tolerance mechanism for LADS
File logging: keeps a log of sent blocks, one log file per file
Char type logging
Encoding type logging
Int type logging
Binary type logging: block number as a binary value
Bit binary logging: the n-th bit position represents the n-th block
Transaction logging: keeps the logs of multiple files in one log file
Universal logging: keeps the logs of all files in one log file
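
Bit binary logging, where the n-th bit stands for the n-th block, can be sketched with a byte array. This is a minimal illustration under assumptions: the bit order within a byte (least significant bit first) and the helper names (new_log, mark_block_sent, pending_blocks) are hypothetical, and the actual FT-LADS log format is not given in these slides.

```python
def new_log(total_blocks):
    """Create an all-zero bitmap with one bit per block."""
    return bytearray((total_blocks + 7) // 8)


def mark_block_sent(log, block):
    """Set the bit for `block` (assumed order: least significant bit first)."""
    byte_index, bit_index = divmod(block, 8)
    log[byte_index] |= 1 << bit_index


def pending_blocks(log, total_blocks):
    """List the blocks whose bit is still 0, i.e. those to resend after a failure."""
    return [
        block
        for block in range(total_blocks)
        if not ((log[block // 8] >> (block % 8)) & 1)
    ]


if __name__ == "__main__":
    log = new_log(10)
    for block in (0, 3, 9):
        mark_block_sent(log, block)
    print(pending_blocks(log, 10))
```

A log like this costs one bit per block, so a file of 10 blocks needs only 2 bytes, and after a fault the transfer can resume from the blocks whose bits are still unset.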
13
What I will do
Support FT-LADS improvement
Implementing open issues, etc.
Contribute a new idea
Thinking about core-awareness on NUMA
14
Q&A
Thank you