Published by Nelson Caldwell. Modified over 9 years ago.
1
Reliability of Parallel Build Systems
Derrick Coetzee, George Necula
UC Berkeley
Creative Commons Zero Waiver: To the extent possible under law, the author, Derrick Coetzee, waives all copyright and related or neighboring rights to this work.
2
Why parallelize builds?
Developer cycle time
  – Faster builds = developers get more work done, higher morale
Continuous integration
  – Faster builds = tests run more often
Check-in verification systems
  – Faster builds = more throughput on check-in queue
3
Parallel build systems today
Job scheduling
Typical example: make -j
  – Find n build steps that have no unbuilt dependencies and run them
  – Whenever one exits, start the next one
Depends on the dependency graph being correct and complete
Coarse-grained task parallelism
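The job-scheduling loop above can be sketched in a few lines. This is a simplified model, not how make is implemented: it runs steps in rounds of at most n rather than starting a new job the moment one exits, and the step names (lex.yy.c, lex.o, parse.o, prog) are hypothetical. It also shows why the scheduler depends on a complete graph: drop one edge and two steps that must run in order land in the same round.

```python
def schedule_rounds(deps, n):
    """Group build steps into rounds of at most n steps whose
    declared dependencies have all been built already."""
    done, remaining, rounds = set(), set(deps), []
    while remaining:
        # A step is ready once every declared dependency is done.
        ready = sorted(s for s in remaining if deps[s] <= done)
        if not ready:
            raise ValueError("dependency cycle or undeclared step")
        batch = ready[:n]
        rounds.append(batch)
        done.update(batch)
        remaining.difference_update(batch)
    return rounds

# Hypothetical project: lex.o is compiled from the generated lex.yy.c.
complete = {
    "lex.yy.c": set(),
    "parse.o": set(),
    "lex.o": {"lex.yy.c"},
    "prog": {"parse.o", "lex.o"},
}
# Same project with the lex.o -> lex.yy.c edge missing.
incomplete = {**complete, "lex.o": set()}

rounds_complete = schedule_rounds(complete, 2)
rounds_incomplete = schedule_rounds(incomplete, 3)
print(rounds_complete)
print(rounds_incomplete)
```

With the incomplete graph, lex.o and lex.yy.c are scheduled together, so whether lex.o sees the freshly generated file depends on timing.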
4
What could go wrong?
Incomplete dependency information
  – Serial builds → incorrect incremental builds
  – Parallel builds → nondeterministic builds, build breaks, incorrect builds
  – Developer changes can introduce or remove dependencies at any time,
    e.g. #include "yy.lex.h"
5
Example of missing dependencies
gcc test.c -o test
  – What files does it read/write/test existence of?
6
Example of missing dependencies
gcc test.c -o test
  – What files does it read/write/test existence of?
Actual: 5 processes, 119 files/directories

  /usr/bin/gcc          /etc/ld.so.hwcap    /tmp
  /usr/lib/gcc/…/cc1    /lib/libc.so.6      /tmp/ccdCCHK0.s
  /usr/bin/as           /proc/meminfo       /tmp/ccKs1ykU.c
  /usr/bin/ld           test.c.gch          /tmp/cc0YtTuE.o
  /usr/bin/nm           /usr/lib/crt1.o     /tmp/ccGGL3Eo.ld
  /usr/bin/strip        /usr/…/lib/specs    /tmp/ccG4c608.le
  …                     …                   …
7
Parallel builds are error-prone
Missing dependencies cause errors
Nondeterministic builds make errors difficult to reproduce
Unnecessary dependencies limit scalability
An alternative:
  – Developer specifies serial build (easier!)
  – Serial build is automatically parallelized
  – Nondeterminism is eliminated
8
Build transactions
Each build step’s file operations are monitored using system call interception
A transaction manager inserts locks before accessing each file (may suspend processes)
Ensure that the parallel build behaves the same way as the serial build
  – Use concurrency control techniques from databases
  – Schedule is conflict-equivalent to the user’s serial schedule
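To make "conflict-equivalent to the serial schedule" concrete, here is a minimal checker. It rests on assumptions not stated on the slide: file operations are reduced to "read" and "write", the serial order is increasing transaction id, and two operations conflict when they touch the same path from different transactions and at least one is a write. An interleaving passes if every conflicting pair occurs in serial order.

```python
def conflict_equivalent_to_serial(schedule):
    """schedule is a list of (tid, op, path) in execution order,
    with op in {"read", "write"}. Serial order is increasing tid."""
    for i, (t1, op1, p1) in enumerate(schedule):
        for t2, op2, p2 in schedule[i + 1:]:
            same_file = p1 == p2
            conflicting = t1 != t2 and "write" in (op1, op2)
            # A later transaction's operation ran before a conflicting
            # operation of an earlier transaction: not equivalent.
            if same_file and conflicting and t1 > t2:
                return False
    return True

# Compile (tid 1) writes test.o, link (tid 2) reads it.
good = [
    (1, "read", "/etc/ld.so.cache"),
    (2, "read", "/etc/ld.so.cache"),
    (1, "write", "test.o"),
    (2, "read", "test.o"),
]
# The link step looks at test.o before the compile step has written it.
bad = [
    (2, "read", "test.o"),
    (1, "write", "test.o"),
]
print(conflict_equivalent_to_serial(good))
print(conflict_equivalent_to_serial(bad))
```

Two reads of the same file never conflict, which is why both steps can read /etc/ld.so.cache in any order.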
9
Build transactions example
(1) Compile test.c to test.o, then (2) link:

  tid  Lock/unlock  Lock type  Path               Result
  1    LOCK         READ       /etc/ld.so.cache   OK
  2    LOCK         READ       /etc/ld.so.cache   OK
  …    …            …          …                  …
  1    LOCK         CREATE     test.o             OK
  …    …            …          …                  …
  2    LOCK         TEST       test.o             BLOCKED
  …    …            …          …                  …
  1    UNLOCK       CREATE     test.o             OK
  2    LOCK         TEST       test.o             OK
10
Build transactions example
What if transaction 2 takes the lock first?

  tid  Lock/unlock  Lock type  Path               Result
  1    LOCK         READ       /etc/ld.so.cache   OK
  2    LOCK         READ       /etc/ld.so.cache   OK
  …    …            …          …                  …
  2    LOCK         TEST       test.o             OK
  …    …            …          …                  …
  1    LOCK         CREATE     test.o             ROLLBACK 2
  …    …            …          …                  …
  2    LOCK         TEST       test.o             BLOCKED
  …    …            …          …                  …
  1    UNLOCK       CREATE     test.o             OK
  2    LOCK         TEST       test.o             OK
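The trace above can be reproduced with a toy lock manager. This is a sketch under assumptions the slides do not spell out: READ and TEST are treated as shared locks, CREATE as exclusive, and a transaction's id doubles as its position in the serial order. A request that conflicts with an earlier transaction's lock blocks; a request that conflicts only with later transactions rolls them back. The class and method names are invented for illustration.

```python
SHARED = {"READ", "TEST"}  # assumption: these lock types are compatible

class LockManager:
    def __init__(self):
        self.held = {}  # path -> list of (tid, lock_type)

    def lock(self, tid, lock_type, path):
        holders = self.held.setdefault(path, [])
        conflicts = [t for t, kind in holders
                     if t != tid
                     and not (kind in SHARED and lock_type in SHARED)]
        if any(t < tid for t in conflicts):
            return "BLOCKED"  # wait for the earlier transaction
        victims = sorted(set(conflicts))
        for t in victims:
            self.rollback(t)  # later transactions must be redone
        self.held[path].append((tid, lock_type))
        if victims:
            return "ROLLBACK " + ", ".join(str(t) for t in victims)
        return "OK"

    def unlock(self, tid, lock_type, path):
        self.held[path].remove((tid, lock_type))
        return "OK"

    def rollback(self, tid):
        # Release every lock the rolled-back transaction holds.
        for path in self.held:
            self.held[path] = [(t, k) for t, k in self.held[path]
                               if t != tid]

lm = LockManager()
results = [
    lm.lock(1, "READ", "/etc/ld.so.cache"),
    lm.lock(2, "READ", "/etc/ld.so.cache"),
    lm.lock(2, "TEST", "test.o"),
    lm.lock(1, "CREATE", "test.o"),
    lm.lock(2, "TEST", "test.o"),
    lm.unlock(1, "CREATE", "test.o"),
    lm.lock(2, "TEST", "test.o"),
]
print(results)
```

The sketch only tracks lock state; undoing the file-system effects of the rolled-back step is a separate problem.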
11
Avoiding cascading rollback
To ensure conflict-equivalence to the serial schedule, transactions must commit in order
  – Strict two-phase locking is too strict
Instead, take advantage of the fact that the dependency graph (and lock set) changes very little from build to build
Predicted locks
  – Derived from set of possible conflicts during previous run
  – Never block
  – Give no privilege to access data
  – Block conflicting lock attempts by transactions with larger timestamps
12
Build transactions example
Compile step followed by a link step:

  tid  Lock/unlock     Lock type  Path               Result
  1    PREDICTED LOCK  CREATE     test.o             OK
  1    LOCK            READ       /etc/ld.so.cache   OK
  2    LOCK            READ       /etc/ld.so.cache   OK
  …    …               …          …                  …
  2    LOCK            TEST       test.o             BLOCKED
  …    …               …          …                  …
  1    LOCK            CREATE     test.o             OK
  …    …               …          …                  …
  1    UNLOCK          CREATE     test.o             OK
  2    LOCK            TEST       test.o             OK
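A predicted lock can be modelled as an entry that is consulted when later transactions request locks but is never itself refused. The sketch below replays the table above under the same assumptions as before (READ and TEST shared, CREATE exclusive, tid as timestamp). The names are hypothetical, and rollback is deliberately left out, mirroring the deck's note that it is not yet implemented.

```python
SHARED = {"READ", "TEST"}  # assumption: these lock types are compatible

def conflicts(a, b):
    return not (a in SHARED and b in SHARED)

class PredictiveLockManager:
    def __init__(self):
        self.held = []       # real locks: (tid, lock_type, path)
        self.predicted = []  # predicted locks, same shape

    def predict(self, tid, lock_type, path):
        # Predicted locks never block and grant no access.
        self.predicted.append((tid, lock_type, path))
        return "OK"

    def lock(self, tid, lock_type, path):
        for t, kind, p in self.held + self.predicted:
            if p == path and t < tid and conflicts(kind, lock_type):
                return "BLOCKED"  # earlier transaction holds or expects it
        for t, kind, p in self.held:
            if p == path and t > tid and conflicts(kind, lock_type):
                raise NotImplementedError("rollback is not modelled here")
        # Taking the real lock discharges this transaction's prediction.
        entry = (tid, lock_type, path)
        self.predicted = [e for e in self.predicted if e != entry]
        self.held.append(entry)
        return "OK"

    def unlock(self, tid, lock_type, path):
        self.held.remove((tid, lock_type, path))
        return "OK"

plm = PredictiveLockManager()
predicted_results = [
    plm.predict(1, "CREATE", "test.o"),
    plm.lock(1, "READ", "/etc/ld.so.cache"),
    plm.lock(2, "READ", "/etc/ld.so.cache"),
    plm.lock(2, "TEST", "test.o"),
    plm.lock(1, "CREATE", "test.o"),
    plm.unlock(1, "CREATE", "test.o"),
    plm.lock(2, "TEST", "test.o"),
]
print(predicted_results)
```

Because the prediction makes transaction 2 wait instead of reading early, the ROLLBACK from the previous example never occurs.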
13
Preliminary results - Linux kernel build
[Figure: plot, x-axis "Number of concurrent processes"]
14
Preliminary results - Linux kernel build
Statistics:
  – Number of transactions/build steps: 2,949
  – Parallel build time: 3m9s
  – Total lock requests: 1,859,172
  – Lock requests blocked due to conflict: 1,697
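Two ratios follow directly from the figures above and help put them in perspective: the share of lock requests that blocked, and the average number of lock requests per build step. The snippet uses only the numbers on the slide; the variable names are illustrative.

```python
transactions = 2_949
lock_requests = 1_859_172
blocked = 1_697

# Share of lock requests that had to wait on a conflict.
blocked_pct = 100 * blocked / lock_requests
# Average number of lock requests issued per build step.
requests_per_step = lock_requests / transactions

print(f"{blocked_pct:.3f}% of lock requests blocked")
print(f"{requests_per_step:.1f} lock requests per build step")
```

Fewer than one request in a thousand blocked, which is consistent with the claim that conflicts between build steps are rare.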
15
Future work: Unimplemented stuff
Haven’t yet implemented rollback
  – Needed for “unexpected dependencies”
Fast cross-platform system call interception
  – ptrace, binary translation, custom filesystem?
Multiversion timestamping
  – Useful for builds that read/write the same file multiple times
Append-only files
  – Log files, standard out
16
Future work: Diagnosing make build bugs
If two build steps experience a conflict, but neither depends on the other directly or indirectly…
  – This proves the make build is nondeterministic
  – Isolates most important missing dependencies
Filter dependency graph by “files in my source repository”
  – Finds other interesting dependencies (e.g. headers)
Easy bug-finding tool for existing projects
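The check described above amounts to a reachability query on the declared dependency graph. The sketch below assumes the observed conflicts are available as (step, step, path) triples and that the graph maps each step to its direct dependencies; both data shapes and the step names are made up for illustration.

```python
def reachable(deps, start):
    """All steps that start depends on, directly or indirectly."""
    seen, stack = set(), [start]
    while stack:
        step = stack.pop()
        for dep in deps.get(step, ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen

def unordered_conflicts(deps, observed):
    """Conflicts between steps that the declared graph leaves unordered.
    Each one points at a likely missing dependency."""
    return [
        (a, b, path)
        for a, b, path in observed
        if b not in reachable(deps, a) and a not in reachable(deps, b)
    ]

# Declared graph: the lex.o -> lex.yy.c edge was forgotten.
declared = {
    "lex.yy.c": set(),
    "parse.o": set(),
    "lex.o": set(),
    "prog": {"parse.o", "lex.o"},
}
# Conflicts seen while monitoring file accesses during a build.
observed = [
    ("lex.yy.c", "lex.o", "lex.yy.c"),  # generator writes, compiler reads
    ("lex.o", "prog", "lex.o"),         # already ordered by the graph
]
suspects = unordered_conflicts(declared, observed)
print(suspects)
```

Only the first conflict is reported, because prog already depends on lex.o in the declared graph.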
17
Future work: Process hierarchies
Long-running process spawning many short-lived processes (e.g. make)
Rolling back make would be very bad
Solution is virtualization:
  – Lie to make (your children have completed)
  – Predict outputs of children based on previous build; block make if it tries to access these
  – Rolling back make (if necessary) isn’t so bad now
18
Future work: Intra-build step parallelism
Efficient parallel parsing for compilation
  – Ref Par Lab Browser’s work (Seth Fowler)
Efficient parallel optimization
  – Unexplored?
Efficient parallel linking
  – Ref Google’s gold linker
19
Questions?
20
Future work: Validated incremental builds
Observation: most build steps produce same output files as in previous build
Go ahead and use the old versions
  – If they’re wrong, we’ll find out when that file is rebuilt
Eliminates blocking for a faster parallel build, at the cost of more rollbacks
21
Future work: Distributed parallel builds
How to automatically partition builds between machines based on dependency graph?
How to efficiently handle unexpected dependencies?