1
Asserting expectations
2
Observation vs assertion
Observation is the act of the programmer watching the states and events of the system. The real burden lies in judging whether the observed state is sane. Assertion is an automated observation technique: the programmer specifies a condition that must hold at a particular point in the program, and the program checks it on every run. The figure illustrates that manual observation can examine only a small number of variable values during execution, whereas assertions can check large parts of the execution and catch infections before they obscure their origins.
3
Basic assertions
Basic assertions are straightforward pieces of code that check for infections.
Example: we could write an assert function of our own:
    #include <stdio.h>
    #include <stdlib.h>

    void assert(int x)
    {
        if (!x) {
            printf("Assertion failed!\n");
            abort();
        }
    }
We can now use the function in our code to assert that a divisor is nonzero:
    assert(divisor != 0);
This not only keeps the debugging code separate from the rest but, more importantly, automatically checks that divisor is nonzero every time the code runs.
4
Assertion and macros
But we want top-shelf debugging, not a 'script kiddie' report that tells us only "Assertion failed!" and nothing else.
You need macros! We've been spoiled: ever wonder how most IDEs can tell you the line number and other important information about a failure?
    #ifndef NDEBUG
    #define assert(ex) \
        ((ex) ? 1 : (cerr << __FILE__ << ":" << __LINE__ \
                          << ": assertion '" #ex "' failed\n", \
                     abort(), 0))
    #else
    #define assert(x) ((void) 0)
    #endif
Because this version writes to cerr, it needs C++ with <iostream> included and std::cerr in scope.
When a failure happens (in our example, divisor == 0), the __FILE__ and __LINE__ macros report the location, and the "stringize" operator (#ex) outputs the assertion that actually failed. This would look like:
    Divide.c:37: assertion 'divisor !=0' failed
    Abort (core dumped)
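To make the stringize mechanism easier to try out, here is a minimal sketch of the same idea. It is a variation, not the macro from the slide: the names CHECK, check_failed, and divide are made up for illustration, and the macro throws an exception instead of calling abort() so that a caller can inspect the report.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the slides): builds the report and throws
// instead of calling abort(), so the failure can be observed by a caller.
[[noreturn]] inline void check_failed(const char* file, int line, const char* expr) {
    std::ostringstream msg;
    msg << file << ":" << line << ": assertion '" << expr << "' failed";
    throw std::logic_error(msg.str());
}

// CHECK is an illustrative name. #ex turns the checked expression into a
// string literal; __FILE__ and __LINE__ expand at the point of use.
#define CHECK(ex) ((ex) ? (void)0 : check_failed(__FILE__, __LINE__, #ex))

// Example use: guard a division against a zero divisor.
inline int divide(int dividend, int divisor) {
    CHECK(divisor != 0);
    return dividend / divisor;
}
```

Calling divide(1, 0) raises a std::logic_error whose message names the file, the line, and the text divisor != 0. Production code would more likely keep abort() so that a core dump is available.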
5
Asserting invariants
The most important use of assertions in debugging is to ensure the sanity of data invariants: properties that must hold throughout the entire execution.
Pre-conditions and post-conditions can be used to ensure that invariants hold at the beginning and end of public methods.
Takeaway: pre- and post-conditions are just checks or assertions in methods that make sure the important data is intact. They can be as simple as an if statement.
6
Asserting invariants
Example:
    class Time {
    public:
        int hour();     // 0..23
        int minutes();  // 0..59
        int seconds();  // 0..60 (incl. leap seconds)
        void set_hour(int h);
        …
    };
Any time from 00:00:00 to 23:59:60 is valid.
    void Time::set_hour(int h)
    {
        // precondition
        assert((0 <= hour() && hour() <= 23) &&
               (0 <= minutes() && minutes() <= 59) &&
               (0 <= seconds() && seconds() <= 60));
        …
        // postcondition
    }
Checking the same condition again as the postcondition ensures that whatever set_hour() does, the invariant is not violated.
7
Ensuring sanity We all need ways to keep sane.
What did you notice about the elegance of the Time example? Did you think, "meh, there's probably a more sophisticated way, like introducing a specific helper function"? You did!? I knew I had a quality reader; I spotted you back on the 4th slide.
In the Time example we can design a function that checks the sanity of an object:
    bool Time::sane()
    {
        return (0 <= hour() && hour() <= 23) &&
               (0 <= minutes() && minutes() <= 59) &&
               (0 <= seconds() && seconds() <= 60);
    }
sane() is now the invariant of a Time object. It should:
• Always hold before every public method
• Always hold after every public method
8
Locating infections
Using the sane() function created on the previous slide, we can check the invariant in our methods:
    void Time::set_hour(int h)
    {
        assert(sane()); // precondition
        // Actual code goes here
        assert(sane()); // postcondition
    }
We can locate and understand infections by where the failures take place:
• Precondition failure = infection before the method
• Postcondition failure = infection within the method
• All assertions pass = no infection
Takeaway: you can rule out an entire class as an infection site by wrapping each public method that changes the state in the sanity checks. If they all pass, you can be confident that the class (Time, in this example) is not an infection site.
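The pieces from the last few slides can be put together roughly as follows. This is a minimal sketch, not the textbook's class: the constructor and the private member names (hour_, minutes_, seconds_) are assumptions added so the example is complete, and the constructor deliberately performs no check so that an infected object can be built for demonstration.

```cpp
#include <cassert>

class Time {
public:
    // Assumed constructor (not on the slides); it does not validate its
    // arguments, so it can also create an "infected" object for testing.
    Time(int h, int m, int s) : hour_(h), minutes_(m), seconds_(s) {}

    int hour() const { return hour_; }       // 0..23
    int minutes() const { return minutes_; } // 0..59
    int seconds() const { return seconds_; } // 0..60 (incl. leap seconds)

    // The class invariant: true if the object holds a valid time.
    bool sane() const {
        return (0 <= hour() && hour() <= 23) &&
               (0 <= minutes() && minutes() <= 59) &&
               (0 <= seconds() && seconds() <= 60);
    }

    void set_hour(int h) {
        assert(sane()); // precondition: fails if the infection came from outside
        hour_ = h;
        assert(sane()); // postcondition: fails if this method caused the infection
    }

private:
    int hour_;
    int minutes_;
    int seconds_;
};
```

With this layout, calling set_hour(99) on a sane object trips the second assertion, which points at set_hour() itself (or its caller's argument) as the origin of the infection.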
9
Complex invariants
As data structures become more complex, so do their invariants.
In this example (a red/black tree), every property of the tree is checked in an individual helper function, and a sane() function calls them all together. This way, if anything goes wrong in a red/black tree, the sane() invariant will catch it!
    class RedBlackTree {
        …
        boolean sane() {
            assert(rootHasNoParent());
            assert(rootIsBlack());
            assert(redNodesHaveOnlyBlackChildren());
            assert(equalNumberOfBlackNodesOnSubtrees());
            assert(treeIsAcyclic());
            assert(parentsAreConsistent());
            return true;
        }
    }
10
Invariant in debuggers
If you've written a function that checks data invariants, you can use it on the fly in an interactive debugger such as GDB:
    (gdb) break 'Time::set_hour(int)' if !sane()
    Breakpoint 3 at 0x2dcf: file Time.C, line 45.
    (gdb) _
GDB will interrupt execution as soon as the breakpoint condition holds at the specified location, that is, as soon as the Time invariant is violated. This works even if the assertions have been disabled.
11
Design by contract concept
EIFFEL is a language built around the concept of design by contract.
A contract is a set of preconditions that must be met by the caller (client) and a set of postconditions that are guaranteed by the callee (supplier).
Remember our set_hour() assertions from slide 6? Of course you do. Well, this is how EIFFEL would do it by contract:
    set_hour(h : INTEGER) is
            -- Set the hour from 'h'
        require
            sane_h : 0 <= h and h <= 23
        ensure
            hour_set : hour = h
            minute_unchanged : minutes = old minutes
            second_unchanged : seconds = old seconds
This contract specifies interface properties, which serve as a specification of what the function should do.
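C++ has no require, ensure, or old keywords, but the contract can be imitated by hand. The sketch below is one possible emulation under stated assumptions: the struct name ContractTime and the choice of exceptions (std::invalid_argument for a broken precondition, std::logic_error for a broken postcondition) are illustrative and do not come from the slides or from EIFFEL.

```cpp
#include <stdexcept>

// Hypothetical type for illustration; fields mirror the Eiffel example.
struct ContractTime {
    int hour = 0;
    int minutes = 0;
    int seconds = 0;

    void set_hour(int h) {
        // require: the caller (client) must supply a valid hour.
        if (!(0 <= h && h <= 23))
            throw std::invalid_argument("require sane_h violated");

        // Snapshot the values that Eiffel's 'old' keyword would refer to.
        const int old_minutes = minutes;
        const int old_seconds = seconds;

        hour = h; // the actual work

        // ensure: the callee (supplier) guarantees these on exit.
        if (!(hour == h && minutes == old_minutes && seconds == old_seconds))
            throw std::logic_error("ensure violated");
    }
};
```

Using two different exception types keeps the blame assignment of the contract visible: a violated require points at the caller, a violated ensure at the method itself.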
12
Defining invariants with Z
Z is a specification language based on discrete mathematics; it strives to be as complete and unambiguous as possible.
(Figure: an example of the Date class in Z)
(Figure: an example of set_hour() in Z)
Like a contract, Z expresses a specification. Z can express anything, but the specification must then be validated against the code, which can be difficult. Contracts in EIFFEL, by contrast, integrate the specification into the code; the caveat is that they are limited to what the programming language can express.
13
Java Modeling Language (JML)
Let's keep rolling with the set_hour() function as we learn about JML.
JML is a specification language that excels in both expressive power and the quantity and quality of its tools.
You can write assertions as special comments in the JAVA code:
    /*@ requires 0 <= h && h <= 23
      @ ensures hours() == h &&
      @         minutes() == \old(minutes()) &&
      @         seconds() == \old(seconds())
      @*/
    void Time::set_hour(int h) …
As in this example, assertions are written as ordinary Boolean JAVA expressions together with some extra operators such as \old, which stands for the value of the variable at the moment the method was entered. Using JMLC, the JML compiler, such a specification can be translated into assertions that are then checked at runtime.
14
Java Modeling Language (JML) More Uses
• Documentation: The JMLDOC documentation generator produces HTML containing both JAVADOC comments and JML specifications. This is a great help for browsing and publishing JML specifications.
• Unit testing: JMLUNIT combines the JML compiler JMLC with JUNIT (Chapter 3 "Making Programs Fail") such that one can test units against JML specifications.
• Invariant generation: The DAIKON invariant detection tool can report detected invariants in JML format, thus allowing simple integration with JML.
• Static checking: The ESC/Java static checker checks simple JML specifications statically, using the deduction techniques laid out in Chapter 7 "Deducing Errors." In particular, it can leverage specified invariants to detect potential null pointer exceptions or out-of-bound array indexes.
• Verification: JML specifications can be translated into proof obligations for various theorem provers. The more properties are explicitly specified, the easier it is to prove them.
15
Relative debugging
If we do not want to, or cannot, check against a specification, we can compare against a reference run: we use a reference program and compare its results with those of the program under test.
Typical uses:
• The program has been modified: compare the old program's output with the new one's.
• The environment has changed: compare the program's output in the old environment with its output in the new one.
• The program has been reimplemented: compare aspects of the old program with aspects of the new one.
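The "program has been reimplemented" case can be sketched in a few lines: run a trusted reference and the new version on the same inputs and report where they first disagree. Everything here is made up for illustration (first_divergence, isqrt_reference, isqrt_fast); it shows the idea of comparing against a reference run and is not how a relative debugger such as GUARD is implemented.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Returns the index of the first input on which the two versions disagree,
// or -1 if they agree on every input.
long first_divergence(const std::function<int(int)>& reference,
                      const std::function<int(int)>& candidate,
                      const std::vector<int>& inputs) {
    for (std::size_t i = 0; i < inputs.size(); ++i) {
        if (reference(inputs[i]) != candidate(inputs[i]))
            return static_cast<long>(i);
    }
    return -1;
}

// Reference version: slow but obviously correct integer square root (n >= 0).
int isqrt_reference(int n) {
    int r = 0;
    while ((r + 1) * (r + 1) <= n) ++r;
    return r;
}

// Reimplementation: binary search for the largest r with r * r <= n (n >= 0).
int isqrt_fast(int n) {
    int lo = 0;
    int hi = n;
    while (lo < hi) {
        const int mid = lo + (hi - lo + 1) / 2;
        if (static_cast<long long>(mid) * mid <= n)
            lo = mid;
        else
            hi = mid - 1;
    }
    return lo;
}
```

A result of -1 from first_divergence(isqrt_reference, isqrt_fast, inputs) means the reimplementation matches the reference on those inputs; any other value names the first input worth inspecting in a debugger.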
16
Relative assertions
The textbook uses the relative debugger GUARD to illustrate relative assertions.
GUARD runs two programs and compares variable values throughout the runs.
Specified like so:
    assert ==
18
MALLOC_CHECK
A neat way to catch common heap errors is to set the MALLOC_CHECK_ environment variable (note the trailing underscore).
In this example it detects a double deallocation of heap memory:
    $ MALLOC_CHECK_=2 myprogram myargs
    free() called on area that was already free'd()
    Aborted (core dumped)
The core file generated at program abort can be read in by a debugger, and it will lead directly to the location where free() was called the second time.
19
ELECTRICFENCE
Buffer overflows are major problems in computer science and security.
A quality reader like yourself would, I'm sure, love to know the methods and tools that prevent and detect (this is a debugging class, after all) such a horrendous problem.
ELECTRICFENCE library, baby! (horn)
The basic idea of this bad boy is to allocate arrays in memory such that each array is preceded and followed by a nonexistent memory area.
If the program tries to access that area, meaning an overflow occurred, the OS will abort the program.
You use ELECTRICFENCE by linking it into the program:
    gcc -g -o sample-with-efence sample.c -lefence
20
VALGRIND
Did you like ELECTRICFENCE? Pretty sweet, right?
Well, get ready to meet something named after the entrance to Valhalla.
VALGRIND detects:
• Read access to noninitialized memory
• Write or read access to nonallocated memory
• Write or read access across array boundaries
• Write or read access in specific stack areas
• Memory leaks (areas that were allocated but never deallocated)
Continue reading to see an example.
What's VALGRIND referencing here? Don Quixote? The dreadful origami wars? No one truly knows.
21
VALGRIND
A note on how VALGRIND detects and provides details about invalid memory areas:
VALGRIND is built around an interpreter for x86 machine code instructions. It interprets the machine instructions of the program to be debugged and keeps track of the used memory in so-called shadow memory.
• Each memory bit is associated with a controlling value bit (V-bit). Each V-bit is initially unset. VALGRIND sets it as soon as the associated memory bit is written.
• In addition, each byte is associated with an allocated bit (A-bit), which is set if the corresponding byte is currently allocated. When some memory area is deallocated, VALGRIND clears the A-bits.
"VALGRIND generates the previous error message for the sample program: a[0] and a[1] are allocated and initialized, with their A- and V-bits set (shown in gray). In contrast, a[2] is neither allocated nor initialized. Accessing it causes VALGRIND to issue an error message."
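The bookkeeping described above can be imitated with a toy model. This is a deliberately simplified sketch, not VALGRIND's implementation: it tracks one A-bit and one V-bit per byte (VALGRIND keeps a V-bit per memory bit), addresses are plain indexes into a small simulated memory, and the names ShadowMemory and Access are invented for the example.

```cpp
#include <cstddef>
#include <vector>

enum class Access { Ok, NotAllocated, NotInitialized };

// Toy shadow memory: one A-bit and one V-bit per simulated byte.
class ShadowMemory {
public:
    explicit ShadowMemory(std::size_t size)
        : a_bits_(size, false), v_bits_(size, false) {}

    // Allocation sets the A-bits; the contents are not yet initialized.
    void allocate(std::size_t start, std::size_t len) {
        for (std::size_t i = start; i < start + len && i < a_bits_.size(); ++i) {
            a_bits_[i] = true;
            v_bits_[i] = false;
        }
    }

    // Deallocation clears the A-bits.
    void deallocate(std::size_t start, std::size_t len) {
        for (std::size_t i = start; i < start + len && i < a_bits_.size(); ++i)
            a_bits_[i] = false;
    }

    // A write to allocated memory sets the V-bit.
    Access write(std::size_t addr) {
        if (addr >= a_bits_.size() || !a_bits_[addr]) return Access::NotAllocated;
        v_bits_[addr] = true;
        return Access::Ok;
    }

    // A read is checked against both bits.
    Access read(std::size_t addr) const {
        if (addr >= a_bits_.size() || !a_bits_[addr]) return Access::NotAllocated;
        if (!v_bits_[addr]) return Access::NotInitialized;
        return Access::Ok;
    }

private:
    std::vector<bool> a_bits_;
    std::vector<bool> v_bits_;
};
```

Replaying the slide's scenario with this model: after allocating two cells and writing both, reads of cells 0 and 1 succeed, while a read of cell 2 is reported as an access to nonallocated memory.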
22
VALGRIND Another Example
Here's how you would normally run your program:
    myprog arg1 arg2
And oops, you have a memory leak in there, you don't know where it came from, and now you are restarting your computer.
Enter VALGRIND! (Actually, just one small tool in the VALGRIND chest: memory leak detection.)
    valgrind --leak-check=yes myprog arg1 arg2
This makes your program run much slower and use a lot more memory, but that is by design: VALGRIND issues messages about the memory errors and leaks that it detects. Unfortunately, it cannot tell you why the memory leaked, but it can tell you where the leaked memory was allocated. For more information about where uninitialized values came from, you can add the --track-origins=yes flag.
Here's a leaky program (on the next slide):
23
leaker.c
    #include <stdlib.h>

    void f(void)
    {
        int* x = malloc(10 * sizeof(int));
        x[10] = 0;  // problem 1: heap block overrun
    }               // problem 2: memory leak -- x not freed

    int main(void)
    {
        f();
        return 0;
    }
Memory Leak messages
    ==19182== 40 bytes in 1 blocks are definitely lost in loss record 1 of 1
    ==19182==    at 0x1B8FF5CD: malloc (vg_replace_malloc.c:130)
    ==19182==    by 0x : f (a.c:5)
    ==19182==    by 0x80483AB: main (a.c:11)
24
Language extensions: CYCLONE
CYCLONE is a safer dialect of the C programming language. It contains pointers with special properties:
Special (never-NULL) pointer
• Declares a pointer that cannot be NULL
• Declared using '@' instead of '*'
• int getc
Fat pointer
• Records location and bound information
• Declared using '?' instead of '*'
• int strlen (const char? s);
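To get a feel for what a fat pointer buys you, here is a rough C++ imitation. It is only a sketch of the idea: CYCLONE inserts the bounds checks itself, whereas here they are written by hand, and the names FatCharPtr and checked_strlen are invented for the example.

```cpp
#include <cstddef>
#include <stdexcept>

// Hand-rolled "fat pointer": the address travels together with its bound.
struct FatCharPtr {
    const char* base;   // start of the array
    std::size_t bound;  // number of accessible elements

    // Every access is checked against the recorded bound.
    char at(std::size_t i) const {
        if (i >= bound)
            throw std::out_of_range("fat pointer access out of bounds");
        return base[i];
    }
};

// A strlen that can never run past the end of the array: if no terminating
// '\0' is found within the bound, the checked access throws.
std::size_t checked_strlen(FatCharPtr s) {
    std::size_t n = 0;
    while (s.at(n) != '\0') ++n;
    return n;
}
```

With a plain char pointer, strlen() on an unterminated array reads past the end and the behavior is undefined; with the bound attached, the same mistake becomes a well-defined, reportable error.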
25
Language extensions: CYCLONE
Restrictions imposed by CYCLONE to preserve safety:
• NULL checks are inserted to prevent segmentation faults.
• Pointer arithmetic is restricted.
• Pointers must be initialized before use.
• Dangling pointers are prevented through region analysis and limitations on free().
• Only "safe" casts and unions are allowed.
• goto into scopes is disallowed.
• switch labels in different scopes are disallowed.
• Pointer-returning functions must execute return.
• setjmp() and longjmp() are not supported.
26
Assertion and production code
Developers wonder: should I keep all my checks in production?
The textbook lists some checks that you should NEVER turn off:
• Critical results: If your program computes a result that people's lives, health, or money depend on, it is a good idea to validate the result using some additional computation.
• External conditions: Any conditions that are not within our control must be checked for integrity.
27
Assertion and production code
Some arguments to consider:
• The more active assertions there are, the greater the chances of catching infections. Because not every infection results in a failure, assertions increase your chances of detecting defects that would otherwise go unnoticed. Therefore, assertions should remain turned on.
• The sooner a program fails, the easier it is to track the defect. The larger the distance between defect and failure, the more difficult it is to track the infection chain. The more active assertions there are, the sooner an infection will be caught, which significantly eases debugging. This idea of making code "fail fast" is an argument for leaving assertions turned on.
• Defects that escape into the field are the most difficult to track. Failing assertions can give the essential clues on how the infection spread.
28
Assertion and production code
• By default, failing assertions are not user friendly. This is not yet a reason to turn off assertions. An unnoticed incorrect behavior is far more dangerous than a noticed aborted behavior. When something bad may happen, do not shoot the messenger (by turning assertions off), but make sure the program fails gracefully.
• Assertions impact performance. This argument is true, but it should be weighed against the benefits of assertions. As with every performance issue, one must first measure how much performance is actually lost. Only if this amount is intolerable should one look for where exactly the performance is lost.
29
Tools
JML -- The Iowa State University JML tools include the JML compiler
GUARD -- The GUARD relative debugger
VALGRIND -- The VALGRIND tool for Linux is part of Linux distributions for x86 processors
PURIFY -- PURIFY, marketed by IBM, is also available for Solaris and Windows
INSURE++ -- INSURE++ is a commercial tool that detects memory problems by instrumenting C and C++ source code.
CYCLONE -- The CYCLONE dialect was developed by Jim
30
Sources
Zeller, Andreas. Why Programs Fail: A Guide to Systematic Debugging. 2nd ed. San Francisco, Calif.: Morgan Kaufmann, 2009.
Zeller, Andreas. Web. 28 Oct < >.