SQCK: A Declarative File System Checker
Haryadi S. Gunawi, Abhishek Rajimwale, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau
University of Wisconsin – Madison
OSDI ’08 – December 9th, 2008
2/25 Corrupt file systems
− File systems
  − Store massive amounts of data
  − Must be reliable
− Corrupted file system images
  − Due to hardware errors, file system bugs, etc.
  − Need to be repaired as soon as possible
3/25 Who should repair?
− Does journaling (write-ahead logging) help?
  − No, only for crashes
− Does the file system repair itself online?
  − No, not enough machinery
− Fsck: the last line of defense
  − It’s a “must have” utility
    − XFS: “no need for fsck, ever”, but it shipped an fsck in the end
  − Must be fully reliable
4/25 But … fsck is complex
− Fsck has a big task
  − Turn any corrupt image into a consistent image
  − E.g. check if a data block is shared by two inodes
− How are checkers implemented?
  − Written in C → hard to reason about
  − Large and complex
    − Ext2 fsck: 150 checks in 16 KLOC
    − XFS fsck: 340 checks in 22 KLOC
  − Hundreds of cluttered if-check statements
− Bottom line: fsck code is “untouchable”
5/25 Two Questions
− Are current checkers really reliable?
− If not, how should we build robust checkers?
6/25 e2fsck is unreliable
− Analyze e2fsck (ext2 file system checker)
− Findings:
  − Inconsistent repair
    − The file system becomes unreadable
  − Consistent but not “correct”
    − Fsck deletes valid directory entries
    − Fsck loses a huge number of files
7/25 SQCK
− Lesson: Complexity is the enemy of reliability
  − Big task + bad design → complexity → unreliability
  − Need a higher-level approach for simplicity
− SQCK (SQL-based Fsck)
  − Use a declarative query language to write checks
  − Put simply: write fewer lines of code
− Evaluation
  − Simple and reliable: e2fsck in 150 queries (vs. 16 KLOC of C)
  − More: great flexibility and reasonable performance
8/25 Outline
− Introduction
− Analysis of e2fsck
− SQCK Design
− SQCK Evaluation
− Conclusion
9/25 Methodology
− E2fsck task: cross-check all ext2 metadata
  − An indirect pointer should not point to the superblock
  − A subdirectory should only be accessible from one directory
− Inject a single corruption
  − Observe how e2fsck repairs that single corruption
  − Only corrupt on-disk pointers
    − Corrupt an indirect pointer to point to the superblock
    − Corrupt a directory entry to point to another directory
  − Usually, a corrupt pointer is simply cleared to zero
10/25 Inconsistent (Out-of-order) Repair
[Figure: an inode's indirect pointer (*ind) is corrupted to point to the superblock instead of an indirect block. Ideal fsck: 1. Check bad indirect pointer, 2. Check indirect content. e2fsck: runs step 2 before step 1, so the superblock is checked as if it were an indirect block and entries in it are cleared to 0.]
11/25 Consistent but Incorrect Repair (1)
[Figure: directory trees over /, a1, b1, a2, b2, comparing the repair made by an ideal fsck with the repair made by e2fsck; the diagram labels include X and LF.]
− Kidnapping problem!
− E2fsck does not use all available information
12/25 Result Summary
− Four problems
  − Inconsistent
  − Information-incomplete
  − Policy-inconsistent
  − Insecure
− E2fsck does not handle all corruptions
  − “Warning: Programming bug in e2fsck! Or some bonehead (you) is checking a mounted (live) filesystem.”
− Not simple implementation bugs
  − Difficult to combine available information
  − Difficult to ensure correct ordering
13/25 Outline
− Introduction
− Analysis
− SQCK Design
− SQCK Evaluation
− Conclusion
14/25 Fsck Properties
− Hundreds of checks
− Complex cross-checks
− Must be ordered correctly
− Taxonomy of checks in e2fsck:

                          Single instance   Multiple instances
  Same structure                63                  11
  Different structures          12                  35

[Figure: each cell is illustrated with instances of two structures, A { x, y } and B { m, n }.]
15/25 A Declarative Approach
− Lesson: Complexity is the enemy of reliability
− SQCK
  − Use a declarative query language (e.g. SQL). Why?
    − It is declarative: high-level intent is clear
    − Fit for cross-checking massive information
− Goals achieved
  − Simple: e2fsck in 150 queries (vs. 16 KLOC of C)
  − Reliable: each check/query is easy to understand
  − Flexible: plug in/out different queries
16/25 Using SQCK
− Take a file system image
− Load metadata into database tables
  − Temporary tables
  − Ex: InodeTable, GroupDescTable, DirEntryTable
− Run checks and repairs (in the form of queries)
− Flush any modifications, and delete the tables
[Figure: File system image → Scanner → Loader → Database tables → Checks + Repairs → Flush]
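The load / check / repair / flush cycle above can be sketched end to end with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the slides say the first SQCK generation used MySQL, and `scan_image`, the column names `gd_num`, `start_blk`, `end_blk`, and the sample rows are all hypothetical stand-ins for a real scanner and schema.

```python
import sqlite3

def scan_image():
    """Hypothetical scanner output: (gd_num, start_blk, end_blk, blockBitmap).
    A real scanner would parse these from an ext2 image."""
    return [(0, 1, 8192, 3), (1, 8193, 16384, 5)]  # group 1's bitmap is misplaced

def load(conn, rows):
    # Metadata goes into a temporary table with a dirty flag per row.
    conn.execute(
        "CREATE TEMP TABLE GroupDescTable ("
        " gd_num INTEGER PRIMARY KEY, start_blk INTEGER, end_blk INTEGER,"
        " blockBitmap INTEGER, dirty INTEGER DEFAULT 0)")
    conn.executemany(
        "INSERT INTO GroupDescTable (gd_num, start_blk, end_blk, blockBitmap)"
        " VALUES (?, ?, ?, ?)", rows)

def check_and_repair(conn):
    # One check plus its repair: clear a bitmap pointer that lies outside
    # its group and mark the row dirty so the flush step picks it up.
    conn.execute(
        "UPDATE GroupDescTable SET blockBitmap = 0, dirty = 1"
        " WHERE blockBitmap NOT BETWEEN start_blk AND end_blk")

def flush(conn):
    # Collect modified rows (a real flush would write them back to the
    # image), then delete the table.
    dirty = conn.execute(
        "SELECT gd_num, blockBitmap FROM GroupDescTable"
        " WHERE dirty = 1 ORDER BY gd_num").fetchall()
    conn.execute("DROP TABLE GroupDescTable")
    return dirty

def run_sqck_sketch():
    conn = sqlite3.connect(":memory:")
    load(conn, scan_image())
    check_and_repair(conn)
    written_back = flush(conn)
    conn.close()
    return written_back
```

Calling `run_sqck_sketch()` returns the rows that would be written back to the image; the dirty flag keeps the flush step from rewriting untouched metadata.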
17/25 Declarative check (example 1)
− Cross-checking a single instance of a structure
− “Find block bitmap that is not located within its block group”

e2fsck (C):

    first_block = sb->s_first_data_block;
    last_block = first_block + blocks_per_group;
    for (i = 0, gd = fs->group_desc; i < fs->group_desc_count; i++, gd++) {
        if (i == fs->group_desc_count - 1)
            last_block = sb->s_blocks_count;
        if ((gd->bg_block_bitmap < first_block) ||
            (gd->bg_block_bitmap >= last_block)) {
            px.blk = gd->bg_block_bitmap;
            if (fix_problem(BB_NOT_GROUP, ...))
                gd->bg_block_bitmap = 0;
        }
        ...
    }

SQCK (SQL):

    SELECT *
    FROM GroupDescTable G
    WHERE G.blockBitmap NOT BETWEEN G.start AND G.end
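To see that the query and the loop express the same check, the following sketch runs both over the same toy data. The columns `start_blk` and `end_blk` stand in for the slide's `G.start` and `G.end` (renamed here to stay clear of SQL keywords), and the rows are made up. One detail worth noting: SQL `BETWEEN` is inclusive at both ends, so `end_blk` is assumed to be the last block of the group, whereas the C loop compares against an exclusive `last_block`.

```python
import sqlite3

# Hypothetical group descriptors: (gd_num, start_blk, end_blk, blockBitmap).
# start_blk and end_blk are the first and last block of the group, inclusive.
GROUPS = [
    (0, 1, 8192, 3),          # bitmap inside its own group
    (1, 8193, 16384, 8195),   # bitmap inside its own group
    (2, 16385, 24576, 42),    # corrupt: bitmap lies in group 0
]

def check_imperative(groups):
    """Loop-style check, in the spirit of the C code."""
    bad = []
    for gd_num, start_blk, end_blk, bitmap in groups:
        if bitmap < start_blk or bitmap > end_blk:
            bad.append(gd_num)
    return bad

def check_declarative(groups):
    """The same check expressed as a single query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE GroupDescTable ("
        " gd_num INTEGER, start_blk INTEGER, end_blk INTEGER,"
        " blockBitmap INTEGER)")
    conn.executemany("INSERT INTO GroupDescTable VALUES (?, ?, ?, ?)", groups)
    rows = conn.execute(
        "SELECT G.gd_num FROM GroupDescTable G"
        " WHERE G.blockBitmap NOT BETWEEN G.start_blk AND G.end_blk"
        " ORDER BY G.gd_num").fetchall()
    conn.close()
    return [r[0] for r in rows]
```

Both functions flag only group 2. The declarative version states the invariant directly, with no loop state such as `last_block` to keep correct across iterations.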
18/25 Declarative check (example 2)
− Cross-checking multiple instances of the same structure
− “Find false parents (i.e. directory entries that point to a subdirectory that already belongs to another directory)”
  − Must read all directory entries in dir data blocks
  − Wrong implementation in e2fsck (the kidnapping problem)
19/25 Declarative check (example 2)

e2fsck (C):

    if ((dot_state > 1) &&
        (ext2fs_test_inode_bitmap(ctx->inode_dir_map, dirent->inode))) {
        // e2fsck_get_dir_info is 20 lines long
        subdir = e2fsck_get_dir_info(dirent->inode);
        ...
        if (subdir->parent) {
            // the subdirectory already has a parent: clear this entry
            if (fix_problem(LINK_DIR, ...)) {
                dirent->inode = 0;
                goto next;
            }
        } else {
            // the first directory seen claiming the subdirectory becomes its parent
            subdir->parent = ino;
        }
    }
20/25 Declarative check (example 2)

SQCK (SQL):

    SELECT F.*    -- returns the false parent(s)
    FROM DirEntryTable P, DirEntryTable C, DirEntryTable F
    WHERE
      -- P says C is its child
      P.entry_num >= 3 AND P.entry_ino = C.ino AND
      -- and C says P is its parent
      C.entry_num = 2 AND C.entry_ino = P.ino AND
      -- F also says C is its child
      F.entry_num >= 3 AND F.entry_ino = C.ino AND
      F.ino <> P.ino

[Figure: P and F both point to C; C points back to P.]
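The false-parent query can be tried on a toy directory tree with Python's sqlite3 module. The schema below is an assumption inferred from the query itself: each row is one directory entry, `ino` is the directory holding the entry, `entry_num` 1 and 2 are "." and "..", and `entry_ino` is the inode the entry points to. The inode numbers and the `name` column are invented for illustration.

```python
import sqlite3

# Toy tree: / (2) contains a1 (11) and b1 (12); a1 contains a2 (13);
# b1 should contain b2 (14), but its entry was corrupted to point to a2.
ENTRIES = [
    (2, 1, 2, "."),   (2, 2, 2, ".."),  (2, 3, 11, "a1"), (2, 4, 12, "b1"),
    (11, 1, 11, "."), (11, 2, 2, ".."), (11, 3, 13, "a2"),
    (12, 1, 12, "."), (12, 2, 2, ".."), (12, 3, 13, "b2"),  # corrupt entry
    (13, 1, 13, "."), (13, 2, 11, ".."),
    (14, 1, 14, "."), (14, 2, 12, ".."),
]

FALSE_PARENT_QUERY = """
SELECT F.ino, F.entry_num, F.entry_ino, F.name
FROM DirEntryTable P, DirEntryTable C, DirEntryTable F
WHERE P.entry_num >= 3 AND P.entry_ino = C.ino      -- P says C is its child
  AND C.entry_num = 2  AND C.entry_ino = P.ino      -- C says P is its parent
  AND F.entry_num >= 3 AND F.entry_ino = C.ino      -- F also claims C
  AND F.ino <> P.ino
"""

def find_false_parents(entries):
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE DirEntryTable ("
        " ino INTEGER, entry_num INTEGER, entry_ino INTEGER, name TEXT)")
    conn.executemany("INSERT INTO DirEntryTable VALUES (?, ?, ?, ?)", entries)
    rows = conn.execute(FALSE_PARENT_QUERY).fetchall()
    conn.close()
    return rows
```

On this data the query returns the single corrupted entry in b1. Because the child's ".." entry is consulted, the true parent a1 keeps a2 regardless of the order in which directories are scanned, which is the information the first-come-first-served C code ignores.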
21/25 Declarative Repairs
− Running declarative checks is only part of the problem
  − Must also perform declarative repairs
− A repair = an update query
  − Some repairs simply update a few fields

        ... SET T.field = newValue, T.dirty = 1

− A repair = a series of queries
  − Ex: Reconnect an orphan directory to the lost+found directory
  − Combine a series of queries with C code
    − All repairs are written in SQL
    − C code is only used for connecting them
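A multi-query repair might look like the following sketch, which reconnects orphan directories to lost+found. It is not SQCK's actual repair: Python plays the role the slides give to C (glue between queries), and the schema, the inode numbers in `ROOT_INO` and `LOST_FOUND_INO`, and the `#<ino>` entry name are assumptions made for the example. Every modified or added row is marked `dirty = 1`, as in the update pattern above.

```python
import sqlite3

ROOT_INO = 2          # assumed root directory inode in this toy image
LOST_FOUND_INO = 11   # assumed lost+found inode in this toy image

def make_table():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE DirEntryTable ("
        " ino INTEGER, entry_num INTEGER, entry_ino INTEGER, name TEXT,"
        " dirty INTEGER DEFAULT 0)")
    conn.executemany(
        "INSERT INTO DirEntryTable (ino, entry_num, entry_ino, name)"
        " VALUES (?, ?, ?, ?)",
        [(2, 1, 2, "."), (2, 2, 2, ".."), (2, 3, 11, "lost+found"),
         (11, 1, 11, "."), (11, 2, 2, ".."),
         (20, 1, 20, "."), (20, 2, 15, "..")])   # dir 20 has no parent entry
    return conn

def reconnect_orphans(conn):
    # Query 1: directories that no regular entry (entry_num >= 3) points to.
    orphans = [r[0] for r in conn.execute(
        "SELECT D.ino FROM DirEntryTable D"
        " WHERE D.entry_num = 1 AND D.ino <> ?"
        " AND NOT EXISTS (SELECT 1 FROM DirEntryTable P"
        "                 WHERE P.entry_num >= 3 AND P.entry_ino = D.ino)"
        " ORDER BY D.ino", (ROOT_INO,)).fetchall()]
    for ino in orphans:
        # Query 2: next free entry slot in lost+found.
        (next_num,) = conn.execute(
            "SELECT MAX(entry_num) + 1 FROM DirEntryTable WHERE ino = ?",
            (LOST_FOUND_INO,)).fetchone()
        # Query 3: add the orphan to lost+found.
        conn.execute(
            "INSERT INTO DirEntryTable VALUES (?, ?, ?, ?, 1)",
            (LOST_FOUND_INO, next_num, ino, "#%d" % ino))
        # Query 4: point the orphan's '..' at lost+found.
        conn.execute(
            "UPDATE DirEntryTable SET entry_ino = ?, dirty = 1"
            " WHERE ino = ? AND entry_num = 2", (LOST_FOUND_INO, ino))
    return orphans
```

The host code only sequences the queries and passes values between them; the decisions about what is wrong and what to change stay in SQL.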
22/25 Outline
− Introduction
− Analysis
− SQCK Design
− SQCK Evaluation
− Conclusion
23/25 SQCK Evaluation
− Complexity
  − 150 queries in 1100 lines of SQL statements (compared to 16,000 lines of C in e2fsck)
− Reliability
  − Passes hundreds of corruption scenarios
− Flexibility
  − Add new checks/repairs
  − Enable different versions of e2fsck
− Performance
  − Introduce some optimizations
24/25 SQCK vs. e2fsck
− Reasonable performance
  − First generation of SQCK (with MySQL)
  − Within 1.5x of e2fsck
− Future optimizations
  − Hierarchical checks
  − Concurrent queries
25/25 Conclusion
− Complexity is the enemy of reliability
  − Recovery code is complex
− SQCK: Build recovery tools with a higher-level approach
Thank you! Questions?
ADvanced Systems Laboratory
www.cs.wisc.edu/adsl