Maintenance Reverse Engineering Ethics 8 April
Software Engineering Elaborated Steps Concept Requirements Architecture Design Implementation Unit test Integration System test Maintenance
Best (and Worst) Testing Practices (Boris Beizer)
- Unit testing to 100% coverage: necessary but not sufficient for new or changed code
- Integration testing: at every step; not once
- System testing: AFTER unit and integration testing
- Testing to requirements: test to end users AND internal users
- Test execution automation: not all tests can be automated
- Test design automation: implies building a model. Use only if you can manage the many tests
- Stress testing: only need to do it at the start of testing. Runs itself out
- Regression testing: needs to be automated and frequent
- Reliability testing: not always applicable. Statistics skills required
- Performance testing: need to consider payoff
- Independent test groups: not for unit and integration testing
- Usability testing: only useful if done early
- Beta testing: not instead of in-house testing
Maintenance: the Final Chapter
Cost of Maintenance
- Estimates of percentage of total life cycle cost: 40% - 90%
- Cost of fixing a bug:
  Requirements  1x
  Design        5x
  Coding        10x
  Testing       20x
  Delivery      200x
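The multipliers above can be read as simple arithmetic. The sketch below is only an illustration: the type names FixPhase and FixCostDemo and the base cost of 100 are assumptions made for the example, not figures from the slide.

```java
// Relative cost of fixing a bug, using the multipliers from the slide.
enum FixPhase {
    REQUIREMENTS(1), DESIGN(5), CODING(10), TESTING(20), DELIVERY(200);

    final int multiplier;

    FixPhase(int multiplier) {
        this.multiplier = multiplier;
    }

    // Scales a base cost (the cost of the fix at requirements time) by the phase multiplier.
    long fixCost(long baseCost) {
        return baseCost * multiplier;
    }
}

final class FixCostDemo {
    public static void main(String[] args) {
        long baseCost = 100; // hypothetical cost unit, chosen only for illustration
        for (FixPhase phase : FixPhase.values()) {
            System.out.println(phase + ": " + phase.fixCost(baseCost));
        }
    }
}
```

With these numbers, a fix that costs 100 units at requirements time costs 20,000 units after delivery.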
Problems of Maintenance Organizational Alignment with objectives Cost benefit analysis Process Impact Documentation Regression testing Technical Building software that is maintainable Professional hierarchy
Objectives of Maintenance Change over time! At release: bug-free Six months later: competitive or competition-leading features Two years later: reduce maintenance cost
Building Maintainable Software Code Well documented code Names, headers, style, … Can names be too long? Decoupled code Documentation Architecture, design documentation, use cases, requirements, … But only if maintained!!!!!
Software Maintenance Types
- Adaptive maintenance: changes needed as a consequence of operating system, hardware, or DBMS changes
- Corrective maintenance: the identification and removal of faults in the software
- Perfective maintenance: changes required as a result of user requests
- Preventive maintenance: changes made to software to make it more maintainable
Why adaptation? Lehman’s Law (1985): if a program doesn’t adapt, it becomes increasingly useless Example: programs that didn’t adapt to the web The majority of maintenance is concerned with evolution deriving from user requested changes
Lehman’s Second Law As an evolving program changes, its structure tends to become more complex Extra resources must be devoted to preserving the semantics and simplifying the structure For most software, nothing has been done about it, so changes are increasingly more expensive and difficult
Reengineering
- Code gets messy over time
- Extreme programming: refactoring
- At some point, quality suffers
  - Changes slow
  - Fixes introducing errors
- Need to invest in the code!
- Rules as to when to rewrite a module
- Abstractions: variables -> methods
- Harder: when is REDESIGN needed?
Lehman’s Five Laws
- The law of continuing change: A program that is used in a real-world environment necessarily must change or become less and less useful in that environment.
- The law of increasing complexity: As an evolving program changes, its structure becomes more complex unless active efforts are made to avoid this phenomenon.
- The law of large program evolution: Program evolution is a self-regulating process, and measurement of system attributes such as size, time between releases, number of reported errors, etc., reveals statistically significant trends and invariances.
- The law of organizational stability: Over the lifetime of a program, the rate of development of that program is approximately constant and independent of the resources devoted to system development.
- The law of conservation of familiarity: Over the lifetime of a system, the incremental system change in each release is approximately constant.
References
- Lehman, M. and Belady, L. (1985). Program Evolution: Processes of Software Change, volume 27 of A.P.I.C. Studies in Data Processing. Academic Press.
- Lehman, M.M. and Ramil, J.F. (2001). “Rules and Tools for Software Evolution Planning and Management”, Annals of Software Eng., spec. issue on software management, vol. 11, pp. 15-44.
Notes
- The first law tells us that system maintenance is an inevitable process. Fault repair is only part of the maintenance activity; changing system requirements will always mean that a system must be changed if it is to remain useful. Thus, software engineering should be concerned with producing systems whose structure is such that the costs of change are minimized.
- The second law states that, as a system is changed, its structure is degraded, and additional costs, over and above those of simply implementing the change, must be accepted if the structural degradation is to be reversed. It suggests that program restructuring is an appropriate process to apply: the maintenance process should perhaps include explicit restructuring activities aimed simply at improving the adaptability of the system.
- The third law suggests that large systems have a dynamic of their own, which is established at an early stage in the development process. This dynamic determines the gross trends of the system maintenance process, and the particular decisions made by maintenance management are overwhelmed by it. This law is a result of fundamental structural and organizational effects.
- The fourth law suggests that most large programming projects work in what Lehman terms a 'saturated' state. That is, a change of resources or staffing has imperceptible effects on the long-term evolution of the system.
- The fifth law is concerned with the change increments in each system release.
- Lehman's laws are really hypotheses, and little work has been carried out to validate them. Nevertheless, they do seem to be sensible, and maintenance management should not attempt to circumvent them but should use them as a basis for planning the maintenance process. It may be that business considerations require them to be ignored at any one time (say it is necessary to make several major system changes). In itself, this is not impossible, but management should realize the likely consequences for future system change.
Steps for handling a change Understand the problem Design the changes Analyze impact Implement changes Update documentation Regression test Release
Cost Benefit (Risk) Analysis Will this problem reduce the number of programs that I sell? Will this problem impact future sales? How many people will it affect? How important are the customers it will affect? Is it a “show stopper” or an annoyance?
Patches
- What is a patch?
  - Quick fix that doesn’t go through the full process
- When should it be used?
  - Error that is preventing use of the system
- Problems with use
  - Multiple patches can be order dependent
  - Users can barely track which ones have been applied
  - Code version explosion
  - Permanent fix may or may not be compatible
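The order dependence of patches can be shown with a toy model. In this sketch a patch is just a text rewrite of a one-line configuration. PATCH_A, PATCH_B, the class name, and the configuration string are all hypothetical and chosen only to make the effect visible.

```java
import java.util.List;
import java.util.function.UnaryOperator;

final class PatchOrderDemo {
    // Hypothetical patch A: raises the timeout.
    static final UnaryOperator<String> PATCH_A =
            s -> s.replace("timeout=30", "timeout=60");

    // Hypothetical patch B: written against the output of patch A,
    // so it only matches once the timeout is already 60.
    static final UnaryOperator<String> PATCH_B =
            s -> s.replace("timeout=60", "timeout=60;retries=3");

    // Applies the patches one after another, in list order.
    static String applyAll(String base, List<UnaryOperator<String>> patches) {
        String result = base;
        for (UnaryOperator<String> patch : patches) {
            result = patch.apply(result);
        }
        return result;
    }

    public static void main(String[] args) {
        String base = "timeout=30";
        System.out.println(applyAll(base, List.of(PATCH_A, PATCH_B)));
        System.out.println(applyAll(base, List.of(PATCH_B, PATCH_A)));
    }
}
```

Applying A then B yields timeout=60;retries=3. Applying B then A yields only timeout=60, because B silently finds nothing to change. A user who cannot tell which patches were applied, and in which order, cannot tell which of the two systems they are running.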
Legacy Systems Existing systems that are still useful May not want to invest in enhancements Future functions will use new process May not be able to easily modify Unsupported language or libraries Lack of skills No source code available!
Handling Legacy Systems Incorporation Business as usual Encapsulation Accessed from new system Adapters Wrapper around the legacy system Adapters in new system
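A wrapper around a legacy system might look roughly like the sketch below. Everything in it is hypothetical: LegacyPayroll stands in for a component whose source may not be available, and its CALCPAY call and "ID|CENTS" record format are invented for illustration. The point is only that the new system programs against PayrollService and never sees the legacy calling convention.

```java
// Hypothetical legacy component with an awkward, string-based interface.
final class LegacyPayroll {
    // Returns a record of the form "EMP_ID|GROSS_IN_CENTS" (made-up format).
    String CALCPAY(String empId) {
        return empId + "|" + (empId.length() * 100_000);
    }
}

// Interface the new system is written against.
interface PayrollService {
    long grossPayInCents(int employeeId);
}

// Wrapper (adapter): translates between the new interface and the legacy call.
final class LegacyPayrollAdapter implements PayrollService {
    private final LegacyPayroll legacy;

    LegacyPayrollAdapter(LegacyPayroll legacy) {
        this.legacy = legacy;
    }

    @Override
    public long grossPayInCents(int employeeId) {
        // Convert the typed argument into the legacy string format.
        String record = legacy.CALCPAY(Integer.toString(employeeId));
        // Parse the legacy record back into a typed result.
        String[] fields = record.split("\\|");
        return Long.parseLong(fields[1]);
    }
}
```

If the legacy system is later replaced, only the adapter needs to change; callers of PayrollService are unaffected.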
Reverse Engineering
Reverse Engineering
- What is it?
  - Discovering the technology through analysis of a program’s structure and operation
  - Analyzing a system to identify its components and interrelationships in order to create a higher abstraction
- Is it legal?
  - Associated with hackers and crackers
Fundamental Problem
Understanding code with …
- no comments
- meaningless variable names
- no visible structure
void p (int M) { int c = 2; while (c <= M) { int t = 2; boolean f = true; while (t * t <= c) { if (c % t == 0) { f = false; break; } t++; } if (f) l(c); c++; } }
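One possible reading of the routine p on this slide, recovered by hand: it visits every integer from 2 to M and passes the primes to l. The sketch below rewrites it with names and structure. The names PrimeLister, isPrime, and primesUpTo are made up for this example, and since l is not shown on the slide, the sketch assumes it merely reports its argument and collects the values in a list instead.

```java
import java.util.ArrayList;
import java.util.List;

final class PrimeLister {

    // True when candidate has no divisor d with d >= 2 and d * d <= candidate
    // (the inner loop of p, with t renamed to divisor and f turned into the return value).
    static boolean isPrime(int candidate) {
        if (candidate < 2) {
            return false;
        }
        for (int divisor = 2; (long) divisor * divisor <= candidate; divisor++) {
            if (candidate % divisor == 0) {
                return false;
            }
        }
        return true;
    }

    // Equivalent of p(M): walks 2..limit and keeps the primes.
    // Assumption: the original l(c) only reports c, so collecting is a fair stand-in.
    static List<Integer> primesUpTo(int limit) {
        List<Integer> primes = new ArrayList<>();
        for (int candidate = 2; candidate <= limit; candidate++) {
            if (isPrime(candidate)) {
                primes.add(candidate);
            }
        }
        return primes;
    }

    public static void main(String[] args) {
        System.out.println(primesUpTo(20)); // prints [2, 3, 5, 7, 11, 13, 17, 19]
    }
}
```

The behavior is unchanged; only the abstraction level is higher, which is the goal of reverse engineering as defined earlier.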
Reverse Engineering Lots of tools for simple translation Disassemblers, decompilers, hex editors, … How useful are these? What can they do and not do? Approaches to Understanding Source-to-source translation Object recovery and specification Incremental approaches Component-based approaches Wikibook on the topic http://en.wikibooks.org/wiki/Reverse_Engineering
Uses of Reverse Engineering
Reasonably legal
- managing clearly owned code
- recovery of data from proprietary file formats
- creation of hardware documentation from binary drivers (often used for producing Linux drivers)
- enhancing consumer electronics devices
- malware analysis
- discovery of undocumented APIs (but probably a bad idea)
- criminal investigation
- copyright and patent litigation
Probably unethical even when legal
- malware creation, often involving a search for security holes
- breaking software copy protection (games and expensive engineering software)
Digital Millennium Copyright Act (1998) Criminalizes production and dissemination of technology that can circumvent measures taken to protect copyright Exceptions Interoperability between software components Retrieval of data from proprietary software Full text http://www.copyright.gov/legislation/dmca.pdf
Ethics
ACM Code of Ethics and Professionalism (Excerpt)
GENERAL MORAL IMPERATIVES
- Contribute to society and human well-being
- Avoid harm to others
- Be honest and trustworthy
- Be fair and take action not to discriminate
- Honor property rights including copyrights and patents
- Give proper credit for intellectual property
- Respect the privacy of others
- Honor confidentiality
ORGANIZATIONAL LEADERSHIP IMPERATIVES
- Articulate social responsibilities
- Enhance the quality of working life
- Proper and authorized uses of computing and communication resources
- Ensure that those affected by a system have their needs clearly articulated; validate the system to meet requirements
- Protect the dignity of users
Intellectual Honesty (McConnell, Code Complete) Refusing to pretend you’re an expert when you’re not Readily admitting your mistakes Trying to understand a compiler warning rather than suppressing the message Clearly understanding your program – not compiling it to see if it works Providing realistic status reports Providing realistic schedule estimates and holding your ground when management asks you to adjust them
Whistle Blowing What are the alternatives? When is it okay? When is it not a choice?
Ethics of a project intended use potential misuse consequences fairness to the knowing users implications for unknowing users