Applications of Data Mining in Software Engineering


Applications of Data Mining in Software Engineering
The majority of collaborative software development organizations use revision control software (e.g., CVS, Subversion, Git) to manage the ongoing development of digital assets that may be worked on by a team of people. These systems maintain a historical record of each revision and allow users to access and revert to previous versions. That record provides a way to analyze the historical artefacts produced during software development, such as the number of lines written, the authors who wrote particular lines, or any number of common software metrics.
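As a rough illustration of mining a revision history, the sketch below totals the lines added per author. It assumes the log text resembles the output of `git log --numstat --pretty=format:--%an` (an author line prefixed with `--`, followed by tab-separated added/deleted/path lines); the sample log, author names, and function name are hypothetical.

```python
from collections import Counter

# Hypothetical sample; real input would come from a revision control log.
SAMPLE_LOG = (
    "--alice\n"
    "10\t2\tsrc/parser.py\n"
    "3\t0\tREADME\n"
    "\n"
    "--bob\n"
    "5\t5\tsrc/parser.py\n"
    "-\t-\tlogo.png\n"      # binary files report '-' instead of line counts
    "\n"
    "--alice\n"
    "1\t1\tsrc/lexer.py\n"
)


def lines_added_per_author(log_text):
    """Sum the 'added' column of each numstat line under its commit author."""
    added = Counter()
    author = None
    for line in log_text.splitlines():
        if line.startswith("--"):
            author = line[2:]            # a new commit begins
        elif line.strip() and author is not None:
            parts = line.split("\t")
            # Skip binary entries, whose counts are '-' rather than digits.
            if len(parts) == 3 and parts[0].isdigit():
                added[author] += int(parts[0])
    return added


totals = lines_added_per_author(SAMPLE_LOG)
for name, count in totals.most_common():
    print(name, count)
```

The same loop could accumulate deletions or per-file counts; lines added is only one of the metrics mentioned above.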

Continued
Most large organizations (and many smaller ones) also use a system for tracking software defects. Defect-tracking data can be mined to discover patterns in software development processes, including time-to-fix, defect-prone components, and authors associated with many defects. Some bug trackers are able to correlate defects with source code in a revision control system.
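A minimal sketch of the two simplest defect-tracker measures named above, time-to-fix and defect-prone components. The record layout (`component`, `opened`, `closed`) and the sample data are assumptions for illustration, not the schema of any particular bug tracker.

```python
from collections import Counter
from datetime import date
from statistics import median

# Hypothetical defect records; 'closed' is None for a defect that is still open.
DEFECTS = [
    {"id": 1, "component": "parser", "opened": date(2024, 1, 1), "closed": date(2024, 1, 4)},
    {"id": 2, "component": "parser", "opened": date(2024, 1, 2), "closed": date(2024, 1, 12)},
    {"id": 3, "component": "ui", "opened": date(2024, 1, 5), "closed": date(2024, 1, 6)},
    {"id": 4, "component": "parser", "opened": date(2024, 1, 7), "closed": None},
]


def days_to_fix(defects):
    """Return the number of days each closed defect stayed open."""
    return [(d["closed"] - d["opened"]).days for d in defects if d["closed"] is not None]


def defects_per_component(defects):
    """Count reported defects (open or closed) for each component."""
    return Counter(d["component"] for d in defects)


fix_times = days_to_fix(DEFECTS)
median_fix = median(fix_times)
by_component = defects_per_component(DEFECTS)

print("median days to fix:", median_fix)
print("most defect-prone:", by_component.most_common(1))
```

The median is used here rather than the mean because a few long-lived defects can dominate an average; that choice is a suggestion, not something the slides prescribe.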

Continued
Virtually all software development teams use some form of electronic communication (e-mail, instant messaging, etc.) as part of collaborative development. Communication in small teams may be primarily or exclusively verbal, but such cases are inconsequential from a data mining perspective. Text mining techniques can be applied to archives of such communication to gain insight into development processes, bugs, and design decisions.
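One very basic text mining step on a communication archive is counting which terms appear most often. The sketch below assumes the messages are already available as plain strings; the messages, the stopword list, and the function name are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical archive of team messages.
MESSAGES = [
    "The parser crashes on empty input, see bug 12",
    "Fixed the parser crash; lexer still fails on empty input",
    "Design decision: keep the lexer separate from the parser",
]

# A tiny illustrative stopword list; real analyses use much larger ones.
STOPWORDS = {"the", "on", "see", "still", "from", "keep"}


def term_frequencies(messages):
    """Count lowercase alphabetic tokens across all messages, skipping stopwords."""
    counts = Counter()
    for message in messages:
        for token in re.findall(r"[a-z]+", message.lower()):
            if token not in STOPWORDS:
                counts[token] += 1
    return counts


counts = term_frequencies(MESSAGES)
print(counts.most_common(4))
```

Note that "crash" and "crashes" are counted separately here; stemming or lemmatization would be needed to merge them.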

Continued
Software documentation and knowledge bases can be mined to provide further insight into software development processes. This is useful to organizations that use the same processes across multiple projects and want to examine a process in terms of its overall effectiveness or its fitness for a given project. Although knowledge bases may contain source code, this approach focuses primarily on retrieving information from natural-language text.

Data Mining Techniques in Software Engineering
Association rules and frequent patterns: A frequent pattern is a pattern (a set of items, subsequences, sub-graphs, etc.) that occurs frequently in a data set. The idea was first proposed by [AIS93] in the context of frequent itemsets and association rule mining for market basket analysis.
Classification: Classification is a data mining function that assigns items in a collection to target categories or classes. The goal of classification is to accurately predict the target class for each case in the data. For example, a classification model could be used to identify loan applicants as low, medium, or high credit risks.
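To make frequent patterns and association rules concrete in a software engineering setting, the sketch below treats each commit as a "basket" of changed files, finds file pairs that meet a minimum support, and computes the confidence of a rule such as lexer.py -> parser.py. The commit data and function names are hypothetical, and this is a brute-force pair count rather than a full algorithm such as Apriori.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: the set of files changed in each commit.
COMMITS = [
    {"parser.py", "lexer.py"},
    {"parser.py", "lexer.py", "README"},
    {"parser.py", "tests.py"},
    {"lexer.py", "parser.py"},
    {"README"},
]


def frequent_pairs(transactions, min_support):
    """Return {pair: support} for item pairs whose support >= min_support."""
    counts = Counter()
    for items in transactions:
        # Sorting gives each pair one canonical order.
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    total = len(transactions)
    return {pair: n / total for pair, n in counts.items() if n / total >= min_support}


def confidence(transactions, antecedent, consequent):
    """Fraction of transactions containing antecedent that also contain consequent."""
    with_antecedent = [t for t in transactions if antecedent in t]
    if not with_antecedent:
        return 0.0
    return sum(consequent in t for t in with_antecedent) / len(with_antecedent)


pairs = frequent_pairs(COMMITS, min_support=0.5)
conf_lexer_to_parser = confidence(COMMITS, "lexer.py", "parser.py")
conf_parser_to_lexer = confidence(COMMITS, "parser.py", "lexer.py")
print(pairs)
print(conf_lexer_to_parser, conf_parser_to_lexer)
```

The two confidence values differ because a rule is directional: every commit touching lexer.py also touches parser.py, but not the other way round.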

Continued
Clustering: A cluster is a group of objects that are similar to one another. In other words, similar objects are grouped into the same cluster and dissimilar objects are placed in different clusters.
Text mining: Text mining, also referred to as text data mining and roughly equivalent to text analytics, is the process of deriving high-quality information from text. High-quality information is typically derived by identifying patterns and trends through means such as statistical pattern learning.
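As an illustration of clustering, here is a small k-means sketch in plain Python. k-means is one common clustering algorithm, chosen here as an example; the slides do not name a specific one. The points stand for hypothetical per-module metrics, and seeding the centroids with the first k points is a simplifying assumption that keeps the run deterministic.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=10):
    """Assign each point to one of k clusters; returns (labels, centroids)."""
    # Assumption: seed the centroids with the first k points (deterministic).
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids


# Hypothetical (metric_1, metric_2) values for six modules.
POINTS = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.0), (8.0, 9.5)]

labels, centroids = kmeans(POINTS, k=2)
print(labels)
```

With this data the three small-valued modules end up in one cluster and the three large-valued modules in the other, matching the idea that similar objects share a cluster.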