SQCK: A Declarative File System Checker. Haryadi S. Gunawi, Abhishek Rajimwale, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, University of Wisconsin.


Slide 1: SQCK: A Declarative File System Checker
Haryadi S. Gunawi, Abhishek Rajimwale, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau
University of Wisconsin-Madison
OSDI '08, December 9th, 2008

Slide 2: Corrupt file systems
- File systems
  - Store massive amounts of data
  - Must be reliable
- Corrupted file system images
  - Due to hardware errors, file system bugs, etc.
  - Need to be repaired a.s.a.p.

Slide 3: Who should repair?
- Does journaling (write-ahead log) help?
  - No, only for crashes
- Does the file system repair itself online?
  - No, not enough machinery
- Fsck: the last line of defense
  - It's a "must have" utility
    - XFS: "no need fsck ever", but deploys fsck in the end
  - Must be fully reliable

Slide 4: But … fsck is complex
- Fsck has a big task
  - Turn any corrupt image into a consistent image
  - E.g. check if a data block is shared by two inodes
- How are checkers implemented?
  - Written in C → hard to reason about
  - Large and complex
    - Ext2 fsck: 150 checks in 16 KLOC
    - XFS fsck: 340 checks in 22 KLOC
  - Hundreds of cluttered if-check statements
- Bottom line: fsck code is "untouchable"

Slide 5: Two Questions
- Are current checkers really reliable?
- If not, how should we build robust checkers?

Slide 6: e2fsck is unreliable
- Analyze e2fsck (ext2 file system checker)
- Findings:
  - Inconsistent repair
    - The file system becomes unreadable
  - Consistent but not "correct"
    - Fsck deletes valid directory entries
    - Fsck loses a huge number of files

Slide 7: SQCK
- Lesson: Complexity is the enemy of reliability
  - Big task + bad design → complexity → unreliability
  - Need a higher-level approach for simplicity
- SQCK (SQL-based Fsck)
  - Use a declarative query language to write checks
  - Put simply: write fewer lines of code
- Evaluation
  - Simple and reliable: e2fsck in 150 queries (vs. 16 KLOC of C)
  - More: great flexibility and reasonable performance

Slide 8: Outline
- Introduction
- Analysis of e2fsck
- SQCK Design
- SQCK Evaluation
- Conclusion

Slide 9: Methodology
- E2fsck task: cross-check all ext2 metadata
  - An indirect pointer should not point to the superblock
  - A subdir should only be accessible from one directory
- Inject single corruption
  - Observe how e2fsck repairs a single corruption
  - Only corrupt on-disk pointers
    - Corrupt an indirect pointer to point to the superblock
    - Corrupt a directory entry to point to another directory
  - Usually, a corrupt pointer is simply cleared to zero

Slide 10: Inconsistent (Out-of-order) Repair
[Diagram: an inode whose indirect pointer (*ind) has been corrupted to point to the superblock, repaired two ways.]
- Ideal fsck: 1. Check bad indirect pointer, then 2. Check indirect content
- e2fsck: 2. Check indirect content runs first, then 1. Check bad indirect pointer; entries in the superblock end up zeroed
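The ordering problem on this slide can be reproduced in a toy model. The sketch below is not e2fsck code: the image layout, the block numbers, and both check functions are hypothetical, chosen only to show why the pointer check has to run before the content check.

```python
SUPERBLOCK = 0   # block number reserved for the superblock (toy value)
MAX_BLOCK = 999  # highest valid block number in this toy image

def make_image():
    # The inode's indirect pointer has been corrupted to point at the superblock.
    return {
        "inode": {"ind": SUPERBLOCK},
        # Superblock fields, which are not block pointers at all.
        "blocks": {SUPERBLOCK: [1024, 4096, 8192, 2]},
    }

def check_indirect_pointer(img):
    # Check 1: clear an indirect pointer that targets a reserved block.
    if img["inode"]["ind"] == SUPERBLOCK:
        img["inode"]["ind"] = None

def check_indirect_content(img):
    # Check 2: treat the target block as a pointer array and zero invalid entries.
    ind = img["inode"]["ind"]
    if ind is None:
        return
    block = img["blocks"][ind]
    img["blocks"][ind] = [p if 0 < p <= MAX_BLOCK else 0 for p in block]

def run(checks):
    img = make_image()
    for check in checks:
        check(img)
    return img

ideal = run([check_indirect_pointer, check_indirect_content])
out_of_order = run([check_indirect_content, check_indirect_pointer])

print("ideal superblock:       ", ideal["blocks"][SUPERBLOCK])
print("out-of-order superblock:", out_of_order["blocks"][SUPERBLOCK])
```

In both runs the bad pointer ends up cleared, but only the out-of-order run overwrites superblock fields on the way there.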

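Slide 11: Consistent but Incorrect Repair (1)
[Diagram: a directory tree under / containing a1, b1, a2 and b2, shown before corruption, after an ideal fsck repair, and after the e2fsck repair. X marks a cleared entry; LF is lost+found.]
- Kidnapping problem!
- E2fsck does not use all available information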

Slide 12: Result Summary
- Four problems
  - Inconsistent
  - Information-incomplete
  - Policy-inconsistent
  - Insecure
- E2fsck does not handle all corruptions
  - "Warning: Programming bug in e2fsck! Or some bonehead (you) is checking a mounted (live) filesystem."
- Not simple implementation bugs
  - Difficult to combine available information
  - Difficult to ensure correct ordering

Slide 13: Outline
- Introduction
- Analysis
- SQCK Design
- SQCK Evaluation
- Conclusion

Slide 14: Fsck Properties
- Hundreds of checks
- Complex cross-checks
  - Taxonomy of checks in e2fsck:

                            Single instance   Multiple instances
    Same structure                63                  11
    Different structures          12                  35

- Must be ordered correctly
[Diagram: example structures A { x, y } and B { m, n } illustrating each of the four categories.]

Slide 15: A Declarative Approach
- Lesson: Complexity is the enemy of reliability
- SQCK
  - Use a declarative query language (e.g. SQL). Why?
    - It is declarative: high-level intent is clear
    - Fit for cross-checking massive information
- Goals achieved
  - Simple: e2fsck in 150 queries (vs. 16 KLOC of C)
  - Reliable: each check/query is easy to understand
  - Flexible: plug in/out different queries

Slide 16: Using SQCK
- Take a fs image
- Load metadata into db tables
  - Temporary tables
  - Ex: InodeTable, GroupDescTable, DirEntryTable
- Run checks and repairs (in the form of queries)
- Flush any modification, and delete tables
[Diagram: File system image → Scanner → Loader → Database tables → Checks + Repairs → Flush]
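The load / check-and-repair / flush cycle on this slide can be sketched end to end with an in-memory database. This is only an illustration under several assumptions: SQLite stands in for the database SQCK actually used, the scanner output is a hard-coded list, and the `links_count` and `dirty` columns and the repair rule are hypothetical. Only the table name InodeTable comes from the slide.

```python
import sqlite3

def load(conn, scanned_inodes):
    # Load scanned metadata into a temporary table.
    conn.execute(
        "CREATE TEMP TABLE InodeTable ("
        " ino INTEGER PRIMARY KEY,"
        " links_count INTEGER,"
        " dirty INTEGER DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO InodeTable (ino, links_count) VALUES (?, ?)",
        scanned_inodes,
    )

def run_checks_and_repairs(conn):
    # Hypothetical check + repair: a negative link count is reset to 0,
    # and the row is marked dirty so that it gets flushed later.
    conn.execute(
        "UPDATE InodeTable SET links_count = 0, dirty = 1"
        " WHERE links_count < 0"
    )

def flush(conn):
    # Collect modified rows (a real tool would write them back to the image),
    # then delete the temporary table.
    rows = conn.execute(
        "SELECT ino, links_count FROM InodeTable WHERE dirty = 1 ORDER BY ino"
    ).fetchall()
    conn.execute("DROP TABLE InodeTable")
    return rows

conn = sqlite3.connect(":memory:")
load(conn, [(11, 2), (12, -1), (13, 1)])  # stand-in for scanner output
run_checks_and_repairs(conn)
written_back = flush(conn)
print(written_back)
```

Tracking modifications with a per-row flag keeps the flush step independent of which repairs ran.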

Slide 17: Declarative check (example 1)
- Cross-checking a single instance of a structure
- "Find block bitmap that is not located within its block group"

e2fsck (C):

    first_block = sb->s_first_data_block;
    last_block = first_block + blocks_per_group;
    for (i = 0, gd = fs->group_desc; i < fs->group_desc_count; i++, gd++) {
        /* the last group may be shorter than the others */
        if (i == fs->group_desc_count - 1)
            last_block = sb->s_blocks_count;
        if ((gd->bg_block_bitmap < first_block) ||
            (gd->bg_block_bitmap >= last_block)) {
            px.blk = gd->bg_block_bitmap;
            if (fix_problem(BB_NOT_GROUP, ...))
                gd->bg_block_bitmap = 0;
        }
        ...
    }

SQCK (SQL):

    SELECT *
    FROM GroupDescTable G
    WHERE G.blockBitmap NOT BETWEEN G.start AND G.end
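To see the slide's query run, here is a minimal sketch against a toy GroupDescTable. SQLite is used as a stand-in database, the `grp` column and the sample rows are made up, and `end` is assumed to hold the last block of the group (inclusive). It is quoted because END is an SQL keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE GroupDescTable ('
    ' grp INTEGER PRIMARY KEY,'   # hypothetical group number column
    ' start INTEGER,'             # first block of the group
    ' "end" INTEGER,'             # last block of the group (assumed inclusive)
    ' blockBitmap INTEGER)'       # location of the group's block bitmap
)
conn.executemany(
    "INSERT INTO GroupDescTable VALUES (?, ?, ?, ?)",
    [
        (0, 1, 8192, 3),          # bitmap inside its group
        (1, 8193, 16384, 8195),   # bitmap inside its group
        (2, 16385, 24576, 5),     # corrupted: bitmap points into group 0
    ],
)

# The check from the slide, projected onto two columns for readability.
bad = conn.execute(
    'SELECT G.grp, G.blockBitmap'
    ' FROM GroupDescTable G'
    ' WHERE G.blockBitmap NOT BETWEEN G.start AND G."end"'
).fetchall()
print(bad)
```

The per-group bounds that the C loop recomputes on every iteration are plain columns here, which is what lets the check collapse into a single WHERE clause.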

Slide 18: Declarative check (example 2)
- Cross-checking multiple instances of the same structure
- "Find false parents (i.e. directory entries that point to a subdirectory that already belongs to another directory)"
  - Must read all directory entries in dir data blocks
  - Wrong implementation in e2fsck (the kidnapping problem)

Slide 19: Declarative check (example 2)

e2fsck (C):

    if ((dot_state > 1) &&
        (ext2fs_test_inode_bitmap(ctx->inode_dir_map, dirent->inode))) {
        /* e2fsck_get_dir_info is 20 lines long */
        subdir = e2fsck_get_dir_info(dirent->inode);
        ...
        if (subdir->parent) {
            /* subdir already has a parent: clear this entry */
            if (fix_problem(LINK_DIR, ...)) {
                dirent->inode = 0;
                goto next;
            }
        } else {
            /* the first directory seen becomes the parent */
            subdir->parent = ino;
        }
    }

Slide 20: Declarative check (example 2)

SQCK (SQL):

    SELECT F.*    -- returns the false parent(s)
    FROM DirEntryTable P, DirEntryTable C, DirEntryTable F
    WHERE
      -- P says C is its child
      P.entry_num >= 3 AND P.entry_ino = C.ino AND
      -- and C says P is its parent
      C.entry_num = 2 AND C.entry_ino = P.ino AND
      -- F also says C is its child
      F.entry_num >= 3 AND F.entry_ino = C.ino AND
      F.ino <> P.ino

[Diagram: directory entries F, P and C, where both F and P claim C as a child.]
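The false-parent query can be tried on a small directory tree. In this sketch SQLite is a stand-in database and the inode numbers are invented. The column meanings are assumptions read off the slide: `ino` is the directory that owns the entry, `entry_num` 1 is ".", 2 is "..", and 3 or higher are regular entries. The tree is / with a1 and b1, where a1 contains a2, and b1's entry for b2 has been corrupted to point at a2.

```python
import sqlite3

ROOT, A1, B1, A2, B2 = 2, 11, 12, 13, 14  # made-up inode numbers

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DirEntryTable (ino INTEGER, entry_num INTEGER, entry_ino INTEGER)"
)
conn.executemany(
    "INSERT INTO DirEntryTable VALUES (?, ?, ?)",
    [
        (ROOT, 1, ROOT), (ROOT, 2, ROOT), (ROOT, 3, A1), (ROOT, 4, B1),
        (A1, 1, A1), (A1, 2, ROOT), (A1, 3, A2),
        (B1, 1, B1), (B1, 2, ROOT), (B1, 3, A2),  # corrupted: should be B2
        (A2, 1, A2), (A2, 2, A1),                 # a2's ".." still names a1
        (B2, 1, B2), (B2, 2, B1),
    ],
)

false_parents = conn.execute(
    """
    SELECT F.ino, F.entry_num, F.entry_ino
    FROM DirEntryTable P, DirEntryTable C, DirEntryTable F
    WHERE P.entry_num >= 3 AND P.entry_ino = C.ino   -- P says C is its child
      AND C.entry_num = 2 AND C.entry_ino = P.ino    -- C says P is its parent
      AND F.entry_num >= 3 AND F.entry_ino = C.ino   -- F also claims C
      AND F.ino <> P.ino
    """
).fetchall()
print(false_parents)
```

Because the query consults a2's own ".." entry, it singles out b1's entry as the false one regardless of which directory happens to be scanned first.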

Slide 21: Declarative Repairs
- Running declarative checks is only part of the problem
  - Must also perform the declarative repairs
- A repair = an update query
  - Some repairs simply update a few fields
  - ... SET T.field = newValue, T.dirty = 1
- A repair = a series of queries
  - Ex: Reconnect an orphan directory to the lost+found directory
  - Combine a series of queries with C code
    - All repairs are written in SQL
    - C code is only used for connecting them
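A multi-query repair such as reconnecting an orphan directory might look like the sketch below. This is not SQCK's actual repair: SQLite is a stand-in database, Python plays the role the slide gives to C (gluing queries together), and the inode numbers, the `dirty` column on DirEntryTable, and the orphan definition (a non-root directory that no regular entry points to) are assumptions.

```python
import sqlite3

ROOT, LOST_FOUND, ORPHAN = 2, 11, 20  # made-up inode numbers

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DirEntryTable ("
    " ino INTEGER, entry_num INTEGER, entry_ino INTEGER,"
    " dirty INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO DirEntryTable (ino, entry_num, entry_ino) VALUES (?, ?, ?)",
    [
        (ROOT, 1, ROOT), (ROOT, 2, ROOT), (ROOT, 3, LOST_FOUND),
        (LOST_FOUND, 1, LOST_FOUND), (LOST_FOUND, 2, ROOT),
        (ORPHAN, 1, ORPHAN), (ORPHAN, 2, 15),  # ".." names a directory that is gone
    ],
)

FIND_ORPHANS = """
    SELECT C.ino FROM DirEntryTable C
    WHERE C.entry_num = 2 AND C.ino <> ?
      AND NOT EXISTS (SELECT 1 FROM DirEntryTable P
                      WHERE P.entry_num >= 3 AND P.entry_ino = C.ino)
"""

def reconnect_orphans(conn):
    orphans = [row[0] for row in conn.execute(FIND_ORPHANS, (ROOT,)).fetchall()]
    for ino in orphans:
        # Query 1: pick the next free entry slot in lost+found.
        (slot,) = conn.execute(
            "SELECT COALESCE(MAX(entry_num), 2) + 1 FROM DirEntryTable WHERE ino = ?",
            (LOST_FOUND,),
        ).fetchone()
        # Query 2: add an entry in lost+found that points to the orphan.
        conn.execute(
            "INSERT INTO DirEntryTable VALUES (?, ?, ?, 1)",
            (LOST_FOUND, slot, ino),
        )
        # Query 3: make the orphan's '..' point back at lost+found.
        conn.execute(
            "UPDATE DirEntryTable SET entry_ino = ?, dirty = 1"
            " WHERE ino = ? AND entry_num = 2",
            (LOST_FOUND, ino),
        )
    return orphans

repaired = reconnect_orphans(conn)
dirty_rows = conn.execute(
    "SELECT ino, entry_num, entry_ino FROM DirEntryTable"
    " WHERE dirty = 1 ORDER BY ino, entry_num"
).fetchall()
remaining = conn.execute(FIND_ORPHANS, (ROOT,)).fetchall()
print(repaired, dirty_rows, remaining)
```

Each step stays a small query; the host code only sequences them and passes values from one query to the next.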

Slide 22: Outline
- Introduction
- Analysis
- SQCK Design
- SQCK Evaluation
- Conclusion

Slide 23: SQCK Evaluation
- Complexity
  - 150 queries in 1100 lines of SQL statements
  - (compared to 16,000 lines of C in e2fsck)
- Reliability
  - Pass hundreds of corruption scenarios
- Flexibility
  - Add new checks/repairs
  - Enable different versions of e2fsck
- Performance
  - Introduce some optimizations

Slide 24: SQCK vs. e2fsck
- Performance is reasonable
  - First generation of SQCK (with MySQL)
  - Within 1.5x of e2fsck
- Future optimizations
  - Hierarchical checks
  - Concurrent queries

Slide 25: Conclusion
- Complexity is the enemy of reliability
  - Recovery code is complex
- SQCK: Build recovery tools with a higher-level approach

Slide 26: Thank you! Questions?
ADvanced Systems Laboratory
www.cs.wisc.edu/adsl

