SCons: Big Signature Refactoring 18 April 2007
Build tools do TWO things
  Build things in the correct order
  NOT build what’s up to date
Building things in the correct order
  Simplest way to get correct order is to just write down the commands in a script
  If our dependencies never changed, we’d just script it and be done with it
  Dependency management is necessary because our software changes
Not rebuilding what’s up-to-date
  Simplest way to get up-to-date software is to build from scratch every time
  If our tools were infinitely fast, we’d just build from scratch
  Deciding what’s up-to-date is necessary because our tools take time
How does Make decide what’s up to date?
  Target file
  Source file(s)
  If timestamp(target) > timestamp(source) for all source files, it’s up to date
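The timestamp test above can be sketched in a few lines of Python. This is only an illustration of the rule on the slide, not Make's actual implementation; the function name make_style_up_to_date is hypothetical, and treating a missing target as out of date is an added assumption.

```python
import os


def make_style_up_to_date(target, sources):
    """Return True if target exists and is newer than every source file."""
    # Assumption: a target that does not exist is always out of date.
    if not os.path.exists(target):
        return False
    target_mtime = os.path.getmtime(target)
    # The rule from the slide: timestamp(target) > timestamp(source)
    # must hold for all source files.
    return all(target_mtime > os.path.getmtime(src) for src in sources)
```

A call such as make_style_up_to_date("foo.o", ["foo.c", "foo.h"]) only reads file metadata, which is why the test is cheap.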
What’s wrong with the way Make does it?
  Contents can change without modifying timestamp (incorrectly not rebuilding)
  Timestamps can change without modifying the contents (unnecessary rebuilds)
    –This can be positive: touching the source file and rebuilding ensures a target is up-to-date
  Can’t handle timestamps rolling back in time
What’s right with the way Make does it?
  It uses file metadata (the timestamp) to approximate whether the file contents have changed
  It’s a cheap test
What file metadata can we use?
  Timestamp (only metadata Make uses)
  Size
  Content checksum
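As a rough sketch, the three kinds of metadata listed above can be gathered with the Python standard library. The function name file_metadata and the dictionary keys are illustrative choices, and MD5 is used for the checksum only because it is the checksum that appears elsewhere in this talk.

```python
import hashlib
import os


def file_metadata(path):
    """Collect timestamp, size and content checksum for one file."""
    st = os.stat(path)
    # The checksum is the expensive part: it has to read the whole file.
    with open(path, "rb") as f:
        checksum = hashlib.md5(f.read()).hexdigest()
    return {
        "timestamp": int(st.st_mtime),  # modification time, whole seconds
        "size": st.st_size,             # length in bytes
        "csig": checksum,               # content checksum (hex MD5)
    }
```

Timestamp and size come from a single stat call, while the checksum costs a full read of the file, which is the trade-off between the cheap test and the accurate one.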
Way SCons currently does it
  Metadata stored in a .sconsign file
    –Used to be one per directory
    –Still can be configured that way
    –sconsign script to dump metadata info
    –Contrast ClearCase, which stores metadata from a custom file system
  Use the metadata for more sophisticated decisions
Example
  Program('foo.c')
  $ scons -Q
  gcc -o foo.o -c foo.c
  gcc -o foo foo.o
  $ sconsign .sconsign
  === .:
  foo: 8f72e133e001cb380a13bcb6a16fb16f None
          foo.o: e61afae6ccfe99a63b0b4c15f18422f6
  foo.o: e61afae6ccfe99a63b0b4c15f18422f6 None
          foo.c: b489a8c34c318fc60c8dac54fd58b791
          foo.h: c864c870c5c6f984fca5b0ebd7361a7d
More readable output
  $ sconsign --verbose .sconsign
  === .:
  foo:
      bsig: 8f72e133e001cb380a13bcb6a16fb16f
      csig: None
      timestamp:
      size: 6762
      implicit:
          foo.o: e61afae6ccfe99a63b0b4c15f18422f6
  foo.o:
      bsig: e61afae6ccfe99a63b0b4c15f18422f6
      csig: None
      timestamp:
      size: 1488
      implicit:
          foo.c: b489a8c34c318fc60c8dac54fd58b791
          foo.h: c864c870c5c6f984fca5b0ebd7361a7d
Readable timestamps, too
  $ sconsign --verbose --readable .sconsign
  === .:
  foo:
      bsig: 8f72e133e001cb380a13bcb6a16fb16f
      csig: None
      timestamp: 'Tue Apr 17 19:15: '
      size: 6762
      implicit:
          foo.o: e61afae6ccfe99a63b0b4c15f18422f6
  foo.o:
      bsig: e61afae6ccfe99a63b0b4c15f18422f6
      csig: None
      timestamp: 'Tue Apr 17 19:15: '
      size: 1488
      implicit:
          foo.c: b489a8c34c318fc60c8dac54fd58b791
          foo.h: c864c870c5c6f984fca5b0ebd7361a7d
[Diagram: foo.c, foo.o, foo.h, foo]
  SourceSignatures('MD5')
  Program('foo.c')
  $ sconsign --verbose .sconsign
  === .:
  foo:
      bsig: 8f72e133e001cb380a13bcb6a16fb16f
      csig: None
      timestamp:
      size: 6762
      implicit:
          foo.o: e61afae6ccfe99a63b0b4c15…
  foo.o:
      bsig: e61afae6ccfe99a63b0b4c15f18422f6
      csig: None
      timestamp:
      size: 1488
      implicit:
          foo.c: b489a8c34c318fc60c8dac54…
          foo.h: c864c870c5c6f984fca5b0eb…
“Build signatures”
  bsig(foo.o) = md5( sig(foo.c) + sig(foo.h) + sig(cmd_line) )
  bsig(foo) = md5( sig(foo.o) + sig(cmd_line) )
  $ sconsign --verbose .sconsign
  === .:
  foo:
      bsig: 8f72e133e001cb380a13bcb6a16fb16f
      csig: None
      timestamp:
      size: 6762
      implicit:
          foo.o: e61afae6ccfe99a63b0b4c15…
  foo.o:
      bsig: e61afae6ccfe99a63b0b4c15f18422f6
      csig: None
      timestamp:
      size: 1488
      implicit:
          foo.c: b489a8c34c318fc60c8dac54…
          foo.h: c864c870c5c6f984fca5b0eb…
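The two formulas above can be sketched in Python as follows. This is a minimal illustration, not SCons's own code: the helper names md5_hex and build_signature are hypothetical, and joining the signatures by plain string concatenation is an assumption about what "+" means on the slide.

```python
import hashlib


def md5_hex(text):
    """Hex MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def build_signature(dep_sigs, cmd_line):
    """bsig = md5(sig(dep1) + sig(dep2) + ... + sig(cmd_line))."""
    # Assumption: "+" is string concatenation of hex signatures.
    return md5_hex("".join(dep_sigs) + md5_hex(cmd_line))


# Signatures of the source files (here: MD5 of made-up contents).
sig_foo_c = md5_hex("int main(void) { return 0; }\n")
sig_foo_h = md5_hex("#define FOO 1\n")

# bsig(foo.o) depends on both sources and the compile command line.
bsig_foo_o = build_signature([sig_foo_c, sig_foo_h], "gcc -o foo.o -c foo.c")

# bsig(foo) depends on the signature of foo.o and the link command line.
bsig_foo = build_signature([bsig_foo_o], "gcc -o foo foo.o")
```

In the listing above, the signature recorded for foo.o under foo is the same value as foo.o's own bsig, so the signature of foo can only be reproduced by first recomputing the signatures of everything beneath it.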
[Diagram: foo.c, foo.o, foo.h, foo]
  SourceSignatures('MD5')
  TargetSignatures('content')
  Program('foo.c')
  $ sconsign --verbose .sconsign
  === .:
  foo:
      bsig: 27d34d ce2f9cc5d8f3e3fbbb
      csig: None
      timestamp:
      size: 6762
      implicit:
          foo.o: c f22aad3e8c1bc…
  foo.o:
      bsig: e61afae6ccfe99a63b0b4c15f18422f6
      csig: None
      timestamp:
      size: 1488
      implicit:
          foo.c: b489a8c34c318fc60c8dac54…
          foo.h: c864c870c5c6f984fca5b0eb…
[Diagram: foo.c, foo.o, foo.h, foo]
  SourceSignatures('timestamp')
  Program('foo.c')
  $ sconsign --verbose .sconsign
  === .:
  foo:
      bsig:
      csig: None
      timestamp:
      size: 6762
      implicit:
          foo.o:
  foo.o:
      bsig:
      csig: None
      timestamp:
      size: 1488
      implicit:
          foo.c:
          foo.h:
[Diagram: foo.c, foo.o, foo.h, foo]
  SourceSignatures('timestamp')
  TargetSignatures('content')
  Program('foo.c')
  $ sconsign --verbose .sconsign
  === .:
  foo:
      bsig:
      csig: None
      timestamp:
      size: 6762
      implicit:
          foo.o:
  foo.o:
      bsig:
      csig: None
      timestamp:
      size: 1488
      implicit:
          foo.c:
          foo.h:
Problems with how SCons does it now
  Can’t switch between content and timestamps
  Can’t mix content + timestamps in same config
    –Example: want to use content for all input files except one really big file where you want to use timestamps
  Stores information about last build decision, not just metadata about the current state of file
  Must have complete dependency graph to make the same signature as last time
    –Can’t build only part of the DAG
  Can’t use dependency output from tools
    –Example: gcc -MD output
New .sconsign format
  $ sconsign .sconsign
  === .:
  SConstruct: 059bf2bda6723d166c5dab7d54a0ca
  foo: c3d f56d
          foo.o: cc74a5b5cd4b174a59b58495cd2ef1f
          c4245ece9e7108d276b3c8eb7662d921 [$LINK -o $TARGET...]
  foo.c: b489a8c34c318fc60c8dac54fd58b
  foo.h: c864c870c5c6f984fca5b0ebd7361a7d
  foo.o: cc74a5b5cd4b174a59b58495cd2ef1f
          foo.c: b489a8c34c318fc60c8dac54fd58b
          foo.h: c864c870c5c6f984fca5b0ebd7361a7d
          d055c09cba5c626f5e38f2f17c29c6fa [$CC -o $TARGET -c...]
=== .:
  SConstruct:
      csig: 059bf2bda6723d166c5dab7d54a0ca13
      timestamp:
      size: 17
  foo:
      csig: c3d f56d87
      timestamp:
      size: 6762
      implicit:
          foo.o:
              csig: cc74a5b5cd4b174a59b58495cd2ef1f9
              timestamp:
              size: 1488
      action: c4245ece9e7108d276b3c8eb7662d921 [$LINK -o $TARGET $LINKFLAGS $SOURCES...]
  foo.c:
      csig: b489a8c34c318fc60c8dac54fd58b791
      timestamp:
      size: 55
  foo.h:
      csig: c864c870c5c6f984fca5b0ebd7361a7d
      timestamp:
      size: 19
  foo.o:
      csig: cc74a5b5cd4b174a59b58495cd2ef1f9
      timestamp:
      size: 1488
      implicit:
          foo.c:
              csig: b489a8c34c318fc60c8dac54fd58b791
              timestamp:
              size: 55
          foo.h:
              csig: c864c870c5c6f984fca5b0ebd7361a7d
              timestamp:
              size: 19
      action: d055c09cba5c626f5e38f2f17c29c6fa [$CC -o $TARGET -c $CFLAGS $CCFLAGS...]
New .sconsign format
  Every file entry is consistent:
    foo.c: b489a8c34c318fc60c8dac54fd58b
    –Content signature (if read, None if not)
    –Timestamp
    –Length
    –Just stores file state at last time used
  Source files are explicitly stored
    –Can be used for caching checksums
  Actions and their signatures are explicitly stored
No build signatures!
  Up-to-date decision is done by comparing the current metadata of each input file with the information stored the last time the target was built
  Each decision can be independent:
    –Example: “Rebuild this target file if:”
      Any input text file has different content than last time
      Any input graphic file has a different timestamp than last time
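The per-file decision described above might look roughly like this in Python. All names here (content_changed, timestamp_changed, pick_decider, needs_rebuild) are hypothetical and are not SCons API; the metadata dictionaries with csig and timestamp keys, and the use of a .png suffix to stand for "graphic file", are assumptions made for the example.

```python
def content_changed(current, stored):
    """Input is out of date if its content signature differs."""
    return current["csig"] != stored["csig"]


def timestamp_changed(current, stored):
    """Input is out of date if its timestamp differs (newer OR older)."""
    return current["timestamp"] != stored["timestamp"]


def pick_decider(name):
    # Example policy from the slide: graphics by timestamp, text by content.
    return timestamp_changed if name.endswith(".png") else content_changed


def needs_rebuild(current_inputs, stored_inputs, decider_for):
    """Compare each input's current metadata with what was stored last build."""
    # A different set of inputs than last time always forces a rebuild.
    if set(current_inputs) != set(stored_inputs):
        return True
    return any(
        decider_for(name)(current, stored_inputs[name])
        for name, current in current_inputs.items()
    )
```

Because each input is compared only against its own stored entry, the decision for one target does not require recomputing anything for the rest of the graph, and a different rule can be chosen per file.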
Supporting Slides
Build signature boils down the states of source+dependency files at the time the target was built
    –Not complete state, just our signature calc
[Diagram: foo.c, foo.o, foo.h, foo]