Yazd University, Electrical and Computer Engineering Department
Course Title: Advanced Software Engineering
By: Mohammad Ali Zare Chahooki
Machine Learning Applications in Software Engineering
Introduction
In software engineering, there are three categories of entities: processes, products, and resources.
Processes are collections of software-related activities, such as constructing specifications, detailed design, or testing.
Products are the artifacts, deliverables, and documents that result from a process activity, such as a specification document, a design document, or a segment of code.
Resources are what a process activity requires, such as personnel, software tools, or hardware.
Machine learning methods have been utilized to … develop better software products and … to make the software development process more efficient and effective.
Challenges in Software Engineering
The essential difficulties in developing and maintaining large software are:
- Complexity
- Conformity
- Changeability
- Invisibility
Complexity:
“Software entities are more complex for their size than perhaps any other human construct.”
“Many of the classical problems of developing software products derive from this essential complexity and its nonlinear increases with size.”
Conformity:
Software must conform to the many different human institutions and systems it comes to interface with.
Changeability:
“The software product is embedded in a cultural matrix of … applications, users, laws, and machine vehicles. These all change continually, and … their changes inexorably force change upon the software product.”
Invisibility:
“The reality of software is not inherently embedded in space.”
“As soon as we attempt to diagram software structure, we find it to constitute not one, but several, general directed graphs, superimposed one upon another.”
Applications of ML in SE
Software quality prediction: the predictions are then the basis for ranking modules … thus enabling a manager to select as many modules from the top of the list as resources allow for reliability enhancement.
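The ranking step described above can be sketched as follows. This is a minimal illustration, assuming each module already has a predicted risk score from some learned model; the module names, scores, costs, and the `select_modules` helper are hypothetical and not taken from the slides.

```python
def select_modules(predictions, costs, budget):
    """Rank modules by predicted risk and pick from the top of the
    list until the available resources are used up.

    predictions: dict module -> predicted risk score (higher = riskier)
    costs:       dict module -> cost of enhancing that module (assumed unit)
    budget:      total resources available
    """
    ranked = sorted(predictions, key=lambda m: predictions[m], reverse=True)
    selected, spent = [], 0
    for module in ranked:
        if spent + costs[module] > budget:
            break  # stop at the first module that no longer fits
        selected.append(module)
        spent += costs[module]
    return ranked, selected


# Illustrative data only.
predictions = {"parser": 0.82, "ui": 0.35, "network": 0.67, "logging": 0.10}
costs = {"parser": 5, "ui": 2, "network": 4, "logging": 1}
ranked, selected = select_modules(predictions, costs, budget=9)
print(ranked)
print(selected)
```

Stopping at the first module that does not fit keeps the selection a strict prefix of the ranked list, which matches the "from the top of the list" wording; a real planner might instead skip expensive modules and continue.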
Software size estimation: work planning and … estimations of the required effort are based on the estimated size of the software. Software size can be measured in lines of code (LOC).
- Software development cost prediction
- Project or software effort prediction
- Maintenance task effort prediction
- Software resource analysis, to identify classes of software modules that have high development effort or faults
- Correction cost estimation
- Software reliability prediction: software reliability growth models can be used to characterize how software reliability varies with time and other factors. The models offer mechanisms for estimating current reliability measures and for predicting their future values.
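As one concrete instance of a reliability growth model, the sketch below uses the Goel-Okumoto model, whose expected cumulative number of failures by time t is m(t) = a(1 - e^(-bt)). The slides do not name a specific model, so this choice is an assumption, and the parameter values are hypothetical; in practice a and b would be fitted to observed failure data.

```python
import math


def expected_failures(t, a, b):
    """Goel-Okumoto mean value function: m(t) = a * (1 - exp(-b * t))."""
    return a * (1.0 - math.exp(-b * t))


def failure_intensity(t, a, b):
    """Derivative of m(t): lambda(t) = a * b * exp(-b * t)."""
    return a * b * math.exp(-b * t)


# Hypothetical parameters: a = total expected faults,
# b = fault detection rate per week of testing.
a, b = 100.0, 0.1
for week in (0, 10, 20, 40):
    found = expected_failures(week, a, b)
    remaining = a - found
    rate = failure_intensity(week, a, b)
    print(week, round(found, 1), round(remaining, 1), round(rate, 2))
```

The current reliability measure here is the failure intensity at the present week, and evaluating the same functions at a later week gives the predicted future value.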
- Defect prediction
- Reusability prediction
- Software release schedule
- Prediction of the testability of program modules
- Discovery of sub-system structures: sub-systems are composed and … Results may be evaluated using software engineering principles such as high cohesion and low coupling.
- Association rule mining for program understanding
- Discovering loop invariants: loops that do not terminate, or that terminate without achieving their goal behavior
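Evaluating a discovered sub-system structure against cohesion and coupling can be sketched by counting dependencies that stay inside a sub-system versus those that cross sub-system boundaries. The dependency edges, the two candidate decompositions, and the `cohesion_coupling` helper are hypothetical illustrations, and raw edge counts are only one simple way to quantify the two principles.

```python
def cohesion_coupling(edges, assignment):
    """Count dependencies inside sub-systems (cohesion) and
    across sub-systems (coupling).

    edges:      iterable of (source_module, target_module) dependencies
    assignment: dict module -> sub-system label
    """
    intra = inter = 0
    for src, dst in edges:
        if assignment[src] == assignment[dst]:
            intra += 1
        else:
            inter += 1
    return intra, inter


# Illustrative dependency graph and two candidate decompositions.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e"), ("c", "d")]
good = {"a": "S1", "b": "S1", "c": "S1", "d": "S2", "e": "S2"}
poor = {"a": "S1", "b": "S2", "c": "S1", "d": "S2", "e": "S1"}
print(cohesion_coupling(edges, good))
print(cohesion_coupling(edges, poor))
```

A decomposition with more intra-sub-system edges and fewer crossing edges would be preferred under the high cohesion, low coupling principle.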
Software categorization keeps developers informed about related software, so that they can:
- learn the “best practice”
- increase software reuse

Detecting copy-paste-related bugs:
- Copy-pasted code is common in large systems because of code reuse.
- This practice is prone to bugs, e.g., identifiers are not changed consistently.
- Therefore, we must detect copy-pasted code and … handle minor modifications?
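The inconsistent-identifier case can be sketched as follows: identifiers are replaced by a placeholder so that renamed copies compare equal, and the identifiers of the original are then mapped onto the copy position by position. This is a simplified illustration, not the method of any particular tool; the tokenizer, the keyword list, and the sample code fragments are assumptions.

```python
import re

KEYWORDS = {"for", "if", "else", "while", "return"}
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokens(line):
    return TOKEN.findall(line)


def is_identifier(tok):
    return re.match(r"[A-Za-z_]", tok) is not None and tok not in KEYWORDS


def normalize(line):
    """Replace every identifier with ID so renamed copies compare equal."""
    return tuple("ID" if is_identifier(t) else t for t in tokens(line))


def inconsistent_renames(original, copy):
    """Return identifiers of `original` that map to more than one name
    in `copy`, or None if the fragments are not clones after normalization."""
    if len(original) != len(copy):
        return None
    mapping = {}
    for line_a, line_b in zip(original, copy):
        if normalize(line_a) != normalize(line_b):
            return None
        for ta, tb in zip(tokens(line_a), tokens(line_b)):
            if is_identifier(ta):
                mapping.setdefault(ta, set()).add(tb)
    return {name: names for name, names in mapping.items() if len(names) > 1}


original = ["for (i = 0; i < len1; i++)", "sum += buf1[i] * len1;"]
buggy_copy = ["for (i = 0; i < len2; i++)", "sum += buf2[i] * len1;"]
print(inconsistent_renames(original, buggy_copy))
```

In the copied fragment, `len1` was renamed to `len2` in the loop header but not in the loop body, so it maps to two different names and is reported as a likely forgotten rename.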
Analyzing bug reports:
- Most open source software development projects have bug repositories, which hold valuable information for both developers and users.
- Bug repositories contain duplicate bug reports.
- Bug report assignment is time-consuming; therefore, developer recommendation is another application.
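Duplicate bug report detection can be sketched with a plain word-overlap (Jaccard) similarity between a new report and the existing ones. This is a deliberately simple stand-in for the learned similarity measures used in practice; the bug IDs, report texts, threshold, and helper names are hypothetical.

```python
import re


def words(text):
    """Lowercased set of alphanumeric words in a report."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of the word sets of two reports."""
    wa, wb = words(a), words(b)
    if not wa and not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def likely_duplicates(new_report, existing, threshold=0.5):
    """Return (similarity, bug_id) pairs at or above the threshold,
    most similar first."""
    scored = [(jaccard(new_report, text), bug_id)
              for bug_id, text in existing.items()]
    return sorted((pair for pair in scored if pair[0] >= threshold),
                  reverse=True)


# Illustrative repository contents.
existing = {
    101: "Crash when opening large PDF file",
    102: "Toolbar icons missing after update",
    103: "Application crashes opening a large PDF",
}
print(likely_duplicates("crash opening large pdf file", existing))
```

Without stemming, "crash" and "crashes" count as different words, which is why report 103 scores lower than 101 here; a fuller pipeline would normalize word forms before comparing.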
Fault localization:
- Running tests produces execution traces.
- Some tests fail and the other tests pass.
- Given many execution traces generated by tests, we can suggest likely faulty statements.
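One well-known way to turn passing and failing traces into a ranking of suspicious statements is the Tarantula score, which for a statement s is (failed(s)/total_failed) / (failed(s)/total_failed + passed(s)/total_passed). The slides do not name a specific technique, so Tarantula is used here as an assumed example, and the traces are made up.

```python
def tarantula(traces):
    """Rank statements by Tarantula suspiciousness.

    traces: list of (set_of_executed_statement_ids, passed) pairs,
            where passed is True for a passing test.
    """
    total_passed = sum(1 for _, ok in traces if ok)
    total_failed = len(traces) - total_passed
    statements = set().union(*(stmts for stmts, _ in traces))
    scores = {}
    for s in statements:
        passed = sum(1 for stmts, ok in traces if ok and s in stmts)
        failed = sum(1 for stmts, ok in traces if not ok and s in stmts)
        fail_ratio = failed / total_failed if total_failed else 0.0
        pass_ratio = passed / total_passed if total_passed else 0.0
        denom = fail_ratio + pass_ratio
        scores[s] = fail_ratio / denom if denom else 0.0
    # Most suspicious first; ties broken by statement id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Illustrative traces: statement ids executed by each test, and its outcome.
traces = [
    ({1, 2, 3}, True),
    ({1, 2, 4}, True),
    ({1, 3, 4, 5}, False),
    ({1, 2, 5}, False),
]
for statement, score in tarantula(traces):
    print(statement, round(score, 2))
```

Statement 5 is executed only by failing tests, so it receives the highest score, while statements executed equally often by passing and failing tests land in the middle of the ranking.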
Stabilizing buggy applications:
- Users may report bugs in a program … those bug reports can be used to prevent the program from crashing.
- Given a program state S and an event e, predict whether e is likely to result in a bug, based on:
  - positive samples: past bugs
  - negative samples: “not bug” reports
Guiding software changes:
- Programmers start changing some locations …
- Suggest locations that other programmers have changed together with this location.
- E.g., “Programmers who changed this function also changed …”
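The "also changed" suggestion can be sketched as a simple co-change count over the version history: for the location being edited, count how often each other location appeared in the same past change, and report those above a confidence threshold. The commit history, function names, threshold, and `suggest_co_changes` helper are hypothetical.

```python
from collections import Counter


def suggest_co_changes(history, changed, min_confidence=0.5):
    """Suggest locations that were often changed together with `changed`.

    history: list of sets, each the locations touched by one past change
    changed: the location the programmer is editing now
    Returns (confidence, location) pairs, highest confidence first, where
    confidence = changes touching both / changes touching `changed`.
    """
    together = Counter()
    occurrences = 0
    for commit in history:
        if changed in commit:
            occurrences += 1
            together.update(commit - {changed})
    if occurrences == 0:
        return []
    suggestions = [(count / occurrences, name)
                   for name, count in together.items()
                   if count / occurrences >= min_confidence]
    return sorted(suggestions, key=lambda s: (-s[0], s[1]))


# Illustrative change history.
history = [
    {"parse", "lex"},
    {"parse", "lex", "emit"},
    {"parse", "docs"},
    {"emit", "docs"},
]
print(suggest_co_changes(history, "parse"))
```

Here `lex` was changed in two of the three past changes that touched `parse`, so it is the only suggestion above the 0.5 threshold.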
References
Du Zhang and Jeffrey J. P. Tsai. Advances in Machine Learning Applications in Software Engineering. Idea Group Publishing.
Du Zhang and Jeffrey J. P. Tsai. "Machine learning and software engineering." Software Quality Journal 11.2 (2003).
Tao Xie, et al. "Data mining for software engineering." Computer 42.8 (2009).