Dependency Architecture

Slides:



Advertisements
Similar presentations
Supplement 02CASE Tools1 Supplement 02 - Case Tools And Franchise Colleges By MANSHA NAWAZ.
Advertisements

1.3 Executing Programs. How is Computer Code Transformed into an Executable? Interpreters Compilers Hybrid systems.
C++ Functions. 2 Agenda What is a function? What is a function? Types of C++ functions: Types of C++ functions: Standard functions Standard functions.
1 BTEC HNC Systems Support Castle College 2007/8 Systems Analysis Lecture 9 Introduction to Design.
Higher Computing Computer structure. What we need to know! Detailed description of the purpose of the ALU and control unitDetailed description of the.
S2008Final_part1.ppt CS11 Introduction to Programming Final Exam Part 1 S A computer is a mechanical or electrical device which stores, retrieves,
Unit-1 Introduction Prepared by: Prof. Harish I Rathod
Relational Operator Evaluation. Overview Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g.,
Query Execution. Where are we? File organizations: sorted, hashed, heaps. Indexes: hash index, B+-tree Indexes can be clustered or not. Data can be stored.
Limiting datasets Some reports can take hours and even days to run. The Retrieve Catalog Records (p_ret_01) is one such report. One way to significantly.
Overview of Compilation Prepared by Manuel E. Bermúdez, Ph.D. Associate Professor University of Florida Programming Language Principles Lecture 2.
Operating Systems, Winter Semester 2011 Practical Session 9, Memory 1.
CS 3204 Operating Systems Godmar Back Lecture 18.
CS703 - Advanced Operating Systems By Mr. Farhan Zaidi.
1 Requirements Management - II Lecture # Recap of Last Lecture We talked about requirements management and why is it necessary to manage requirements.
Microsoft Access 2007 Introduction to Database. What is a Database? Database--A Collection of Tables Usually Associated With a General Topic Database.
CSC 108H: Introduction to Computer Programming
Certification of Reusable Software Artifacts
Appendix 1 - Packages Jim Fawcett copyright (c)
Why indexing? For efficient searching of a document
Semi-Automated Software Restructuring
Why Metrics in Software Testing?
EGR 2261 Unit 11 Pointers and Dynamic Variables
Compiler Design (40-414) Main Text Book:
Dependency Analysis: - CSE681 Pr#2 - an Interesting Design Example
Chapter 2 Memory and process management
Software Metrics 1.
Dependency Analysis Use Cases
Definition CASE tools are software systems that are intended to provide automated support for routine activities in the software process such as editing.
UNIT 2 – CHAPTER 1 – LESSON 1 DIGITAL INFORMATION.
Project Center Use Cases Revision 2
Project Center Use Cases
CSE687 – Object Oriented Design
Learning to Program D is for Digital.
Segments Basic Uses: slides minutes
Practical Office 2007 Chapter 10
Architecture Concept Documents
Project Center Use Cases
Topics Introduction to Repetition Structures
Database Applications (15-415) DBMS Internals- Part VII Lecture 16, October 25, 2016 Mohammad Hammoud.
FACE RECOGNITION TECHNOLOGY
Run-time organization
History of compiler development
Software Documentation
Other Kinds of Arrays Chapter 11
Project Center Use Cases Revision 3
Physical Database Design for Relational Databases Step 3 – Step 8
Algorithm Analysis CSE 2011 Winter September 2018.
Dependency Analysis Use Cases
Project Center Use Cases Revision 3
Scripts & Functions Scripts and functions are contained in .m-files
Starter Write a program that asks the user if it is raining today.
CSCI/CMPE 3334 Systems Programming
Chapter 8: Introduction to High-Level Language Programming
Compiler Construction
Data Structures and Algorithms
Caching Principles and Paging Performance
CSE 303 Concepts and Tools for Software Development
Practical Session 9, Memory
Caching Principles and Paging Performance
CSE 451: Operating Systems Autumn 2003 Lecture 10 Paging & TLBs
Lecture 13: Query Execution
Spreadsheets, Modelling & Databases
CSE 451: Operating Systems Autumn 2003 Lecture 10 Paging & TLBs
.NET Framework Data Providers
Chapter 1 Introduction.
Software Development Environment, File Storage & Compiling
Year 10 Computer Science Hardware - CPU and RAM.
Executable Specifications
Chapter 1: Introduction to Computers and Programming
Presentation transcript:

Dependency Architecture Jim Fawcett CSE681 – Software Modeling & Analysis Fall 2002

Use Cases Dependency analysis generates information for: Building test plans: Don’t test a module until all the modules on which it depends have been tested. Software maintenance: What modules depend on the module we plan to change? We need to test them after the change to see if they have been adversely affected. Documentation: Documenting dependency information is an integral part of the design exposition.

Scope of Analysis This architecture is concerned with dependencies between a program’s modules. A module is a relatively small partition of a program’s source code into a cohesive part. A typical module should consist of about 400 source lines of code (SLOC). Obviously some will be smaller, some larger, but this is a good target size Typical project sizes are: Modest size research project – 10,000 sloc  25 modules Modest size commercial product – 600 kslocs  1,500 modules

Conclusions from Use Case Analysis Even for relatively modest sized research projects, there is too much information to do an adequate analysis by hand. We need automated tools. The tools need to show dependencies in both ways, e.g.: What files does this file depend on? What files depend on this file? The tools need to disclose dependencies between all files in the project.

Critical Issues Scanning for Dependencies in C# modules Data structure used to hold dependencies Displaying large amounts of information to user False dependencies due to unneeded includes in C++ modules Dependence on System Libraries

Dependency Scanning Will naïve scanning work for 1500 files? If opening and scanning a single file takes 25 msec, then: Finding dependencies for 1 file takes: 0.025 X1500 / 60 = 0.625 minutes Finding dependencies for all files takes: 0.625 X 1500 / 60 = 15.6 hours! So let’s scan each file once and store all its identifiers in hash table in RAM. If that takes 30 msec per file: Then making hash tables for all files takes: 0.03 X 1500 / 60 = 0.75 minutes If hash table lookup takes 10 sec per file then finding dependencies between all files takes: 0.00001 X 1500 X 1500 / 60 + 0.75 = 1.125 minutes!

Timing Results Parsing Prototype Source Conservative Estimate Prototype Results Open file, parse, store in Hashtable – Millisec 25 7 Hashtable Lookup - Microsec 10 0.6

Comparison of Estimated with Measured Naïve scanning – scan each file 1500 times: Estimated time to complete scanning of 1500 files: 15.6 hours Measured time to complete scanning of 1500 files: 4.4 hours Processing each file once and storing in Hashtable, then doing lookups for each file: Estimated time to complete processing: 1.1 minutes Measured time to complete processing: 0.2 minutes

Hash Table Layout

Memory to Store Hash Tables Assume each file is about 500 lines of source code  about 30 chars X 500 = 15 KB Assume that 1/3 of that is identifiers The rest is comments, whitespace, keywords, and punctuators  5 KB of indentifier storage Assume HashTable takes 10 KB per file, so the total RAM required for this data is: 0.01 X 1500 = 15 MB. That’s large, but acceptable on today’s desktop machines.

File Scanning For each file in C# file set: For each class and struct identifer in file Look in every other file’s HashTable for those identifiers If found, other file depends on current file Record dependency Complexity is O(n2) For each file in C++ file set: #include statements completely capture dependency. Complexity is O(n)

C# Scanning Process

C# Scanning Activities Define file set User supplies by browsing, selection, patterns User may wish to scan subdirectory Extract token information from each file: Extract tokens from each file and store in HashTable. Save list of Class and Struct identifiers from scan Create HashedFile type with filename, class and struct list, and HashTable as data. Store HashedFiles in ArrayList For each HashedFile in list: Walk through ArrayList searching HashTables for the identifiers in class and struct list (note that this is very fast). First time one is found, stop processing file – dependency found.

C# Scan Activity Diagram

C++ Scan Activity Diagram

Memory to Hold Dependencies Naïve storage uses a dense matrix. With 1500 files, that’s 2,250,000 elements. Assume each path name is stored only once and we save 75 bytes of path information, so with 1500 files  112.5 KB Dependency is a boolean and takes 1 byte to store  2.25 MB. So, the total dependency matrix takes 2.36 MB. Therefore, naïve storage is acceptable.

Dependency Matrix

False Dependencies in C++ files Need to scan both .h and .cpp files. Could programmatically comment out each include – one at a time – and attempt to compile, thus finding the ones actually needed. We would probably do this with a seperate tool. We could also just scan, as we do for C#, but that is harder for C++ since we need to check dependencies on global functions and data as well as classes and structs.

Dependence on System Libraries Not practical to scan for system dependencies in C#. Can’t find source modules. System dependencies can be found using reflection, but are not particularly useful. System dependencies in C++ are easy to find from #include<someSystemHeader> This information is often useful, so why not provide it?

C# Scanner Class Diagram

Displaying Large Sets of Dependencies User will probably want to: Enter a name and get list of dependencies. Find all files with no dependencies. Find all files dependent on only the files processed so far this run. Show list of files entered so far and list of files not entered yet. Select subset of files for display. Show a compressed (bitmap?) matrix. Show a scrolling list of files with their dependencies. Show list of names, not matrix row. Matrix row may be far too long to view (e.g., 1500 elements).

Partitions

Summary of Critical Issues Scanning for Dependencies in C# modules √ Data structure used to hold dependencies √ Displaying large amounts of information ~√ False C++ dependencies ~√ Dependence on System Libraries C# X C++ √

Prototype Code Scanning – critically important Sizes - important How much time to open file and scan for class, struct identifiers? How much time to build HashTables and HashedFile objects? How much time to evaluate dependencies between two files by HashTable lookup? Sizes - important How big is HashedFile object for typical files? User Display – could leave to design team with requirement for early evaluation. Mockup display alternatives.