1 Grokking Software Architecture
Richard C. Holt
Software Architecture Group (SWAG), School of Computer Science, University of Waterloo, Canada
2008 Working Conference on Reverse Engineering

2 Retrospective: 1998 to 2008
Ten years ago: WCRE most influential paper, “Structural Manipulations of Software Architecture using Tarski Relational Algebra”
Today: Retrospective, “Grokking Software Architecture”
17 papers in WCRE

3 Grokking Software Architecture

4 Overview of Talk: 4 Parts
Part 1. 1998 paper: Hopes & claims
Part 2. Software Architecture
Part 3. Formalizing Boxology
Part 4. ROP: Relation-Oriented Programming & Grok-Like Languages

5 Part 1. 1998 paper: Hopes & claims
Represent software architecture as a typed graph
–Graphs with “colors” of edges & nodes
Manipulate & visualize these architectural graphs
Manipulations can be specified algebraically and automatically executed
In brief: Formalize architectural diagrams and reap the benefits arising from the corresponding mathematics.

6 Top View of As-Built Software Architecture (250 KLOC System)

7 View of One Subsystem of the 250 KLOC System

8 CS 746G Topics in Software Architecture, University of Waterloo
1) CS746 in Winter 1998: Linux (Operating System)
2) CS746 in Winter 1999: Apache (Web Server)
3) CS746 in Winter 2000: Mozilla (Web Browser)
4) CS746 in Winter 2001: Eazel Nautilus (File Manager)
5) CS798 in Winter 2002: Postgres et al (Data Base)
6) CS746 in Winter 2003: EMACS et al (Editor)
7) CS746 in Winter 2004: Gnumeric (Spreadsheet)
8) CS746 in Fall 2004: Mozilla (Web Browser -- again)
9) CS746 in Fall 2005: Open Office (Open Source Office Suite)
10) CS746 in Fall 2006: Asterisk (Open Phone Switch)
11) CS746 in Fall 2008: MySQL

9 Process of View Creation
[Figure: pipeline from source code to architectural diagram. Tools: Parser; Grok (fact manipulator); Clustering; Layouter; Browser. Artifacts: Source code; Facts extracted from code; Hierarchic decomposition; Architectural diagram.]

10 Transformations to do Hiding
[Figure: Graph G with nodes a to h and boxes S, T and V, together with the derived graphs H and I]
Graph H = hide(hide(G,T),V)
Graph I = hideExt(G, S)

11 Lifting Calls Up to File Level
call is a procedure call
fileCall is a file level call
fileCall := funcDef o call o inv funcDcl
[Figure: file main.c holds the procedure body main (funcDef); file start.h holds the procedure header startup (funcDcl); main calls startup (call); fileCall is the resulting edge between the files]
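The lifting formula on this slide can be tried out with ordinary sets of pairs. The following is a minimal Python sketch, not Grok itself; the helper names `compose` and `inverse` and the three sample facts are assumptions made for illustration, with the facts read off the labels in the slide's figure.

```python
def compose(r, s):
    """Relational composition r o s: all (x, z) with x r y and y s z for some y."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}


def inverse(r):
    """Relational inverse: flip every pair."""
    return {(y, x) for (x, y) in r}


# Hypothetical facts, chosen to match the labels on the slide's figure.
func_def = {("main.c", "main")}       # main.c holds the body of procedure main
call = {("main", "startup")}          # main calls startup
func_dcl = {("start.h", "startup")}   # start.h holds the header of startup

# fileCall := funcDef o call o inv funcDcl
file_call = compose(compose(func_def, call), inverse(func_dcl))
print(file_call)
```

Under these assumed facts the only lifted edge runs from main.c to start.h: the procedure-level call has been replaced by a dependency between the files that hold the caller's body and the callee's header.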

12 Part 2. Software Architecture: Boxology Approach
Software architecture:
–What is it?
–State of practice
–How is it represented?
–Keep it simple
–Models & tools
–Views of architecture
Extracting As-Built architecture

13 Software Architecture: What is it?
Confusion. I have a sneaking suspicion that ‘architecture’ is one of the most overused and least understood terms in professional software development circles. Gorton
Consensus. Architecture captures system structure in terms of components [parts] and how they interact. Gorton

14 Software Architecture: State of the Practice
“It’s common for there to be little or no documentation covering the architecture in many projects.” Gorton
“I'm hopeless when it comes to documentation.” Torvalds
“The architecture that actually predominates in practice is the ‘big ball of mud’” Foote et al

15 Software as Spaghetti (Foote et al)

16 Software Architecture: How is it Represented in Practice?
…predominant tools used for architecture documentation are Microsoft Word, Visio and Power Point Gorton
What’s needed: Concepts, notations and tools that
–are easy to use and
–help us produce useful, understandable documentation

17 KISS: Keep It Simple, Stupid
“Any fool can make things bigger, more complex, and more violent. It takes a touch of genius - and a lot of courage - to move in the opposite direction.” Einstein
“Make everything as simple as possible, but not simpler.” Einstein

18 Models and Tools for Software Architecture
“UML has, for better or (many would say) worse, become the industry standard ADL [Architecture Design Language]” Shaw
UML “lacks, however, a robust suite of tools for analysis, consistency checking” Shaw

19 UML Component Diagram: Box and Arrow Diagram (Gorton)

20 Views of Software Architecture (Kruchten)
[Figure: As-Built View, Users’ View, Concurrency View and Deployment View, plus Scenarios. Stakeholders shown: end user; system engineer; integrator; programmers & software managers.]

21 Extracting the As-Built Architecture from the Code
“Reverse engineering is the process of analyzing a subject system to create representations of the system at a higher level of abstraction.” Chikofsky
Relational approach.
–Parse the code to produce relations, e.g., (call, P, Q) means proc P calls Q
–Manipulate edges into as-built architecture
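To make the relational approach concrete, here is a small Python sketch of extracted facts held as (relation, source, target) triples and then grouped into one binary relation per edge type. The facts and the function name `group_by_relation` are hypothetical; only the (call, P, Q) triple shape comes from the slide.

```python
from collections import defaultdict

# Hypothetical extracted facts in (relation, source, target) form.
facts = [
    ("call", "P", "Q"),            # proc P calls proc Q
    ("call", "Q", "R"),
    ("contain", "file1.c", "P"),   # assumed containment fact, for illustration
]


def group_by_relation(triples):
    """Split a flat list of triples into one set of (source, target) pairs per relation."""
    relations = defaultdict(set)
    for rel, src, dst in triples:
        relations[rel].add((src, dst))
    return dict(relations)


relations = group_by_relation(facts)
print(sorted(relations["call"]))
```

Once the facts are in this shape, each named relation is just a set of edges that the algebraic operators discussed later in the talk can manipulate.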

22 Boxology as a Central ADL (Architectural Design Language)
“The most widely used design notation [for software architecture] is informal ‘block and arrow’ diagrams.” Gorton

23 Cross Fertilization!! Rev Eng, S/W Arch, Relational Approach
Reverse engineering
–Architecture extraction
–As-Built view: Code is king
–Traceability
Software architecture
–Need for representation & tools
–Simplicity & utility
Relational approach
–Boxology
–Formalization: Tarski algebra

24 Part 3. Formalizing Boxology
Boxology is the “Representation of an organized structure as a graph of labeled nodes (‘boxes’) and connections between them (as lines or arrows).” Wikipedia
“Toward boxology: preliminary classification of architectural styles” Shaw

25 Example Typed Graph
[Figure: typed graph over nodes r, a, b, v, w, x, y, z with edges labeled C, I, E and U]
C = { (r,a), (r,b), (a,v), (a,w), (a,x), (b,y), (b,z) }
I = { (a,b) }
E = { (b,y) }
U = { (v,w), (x,y) }
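One simple way to hold such a typed graph in a program is a mapping from edge type to a set of node pairs. The Python sketch below uses the four edge sets from the slide; the dictionary layout and the helper names `nodes` and `edge_types_between` are assumptions for illustration, not the representation used by Grok.

```python
# Edge sets copied from the slide, keyed by edge type ("color").
typed_graph = {
    "C": {("r", "a"), ("r", "b"), ("a", "v"), ("a", "w"),
          ("a", "x"), ("b", "y"), ("b", "z")},
    "I": {("a", "b")},
    "E": {("b", "y")},
    "U": {("v", "w"), ("x", "y")},
}


def nodes(graph):
    """Every node that appears as the source or target of some edge."""
    return {n for edges in graph.values() for edge in edges for n in edge}


def edge_types_between(graph, src, dst):
    """Sorted list of the edge types that connect src to dst."""
    return sorted(t for t, edges in graph.items() if (src, dst) in edges)


print(sorted(nodes(typed_graph)))
print(edge_types_between(typed_graph, "b", "y"))
```

Note that the pair (b,y) carries two types at once, C and E, which is why the graph is stored as one edge set per type rather than one label per edge.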

26 Boxology is Just Scribbling?
Box & arrow diagrams
–Are just scribbles? No
–Formalized by typed graphs
–Visualized as (nested) boxes & arrows
–Manipulated by Tarski algebra etc.
–Exchanged as Triples (RSF), extended to TA, or GXL or …

27 Boxology has Semantics? Yes
Compare to BNF
–Semantics by informal attachment to productions
Compare to Codd’s relational approach
–Semantics by interpretation of tables
Semantics by attributes & descriptions
–Separation of concerns
–Structure then semantics
Use box/arrow diagrams as underlying formalism for software architecture (Mini-MOF?)

28 Adding Algebra to Boxology
Tables then Codd relational algebra
–N-ary relations
Boxes/arrows then Tarski relational algebra
–Binary relations

29 Example Typed Graph
[Figure: typed graph over nodes r, a, b, v, w, x, y, z with edges labeled C, I, E and U]
C = { (r,a), (r,b), (a,v), (a,w), (a,x), (b,y), (b,z) }
I = { (a,b) }
E = { (b,y) }
U = { (v,w), (x,y) }

30 Tarski Algebraic Operators
Union: I + E = {(a,b), (b,y)}
Intersection: E ^ C = {(b,y)}
Difference: C - E = {(r,a), (r,b), (a,v), (a,w), (a,x), (b,z)}
Inverse: inv E = {(y,b)}
Composition: I o E = {(a,y)}
Identity: id = {(r,r), (a,a), (b,b), (w,w) … }
Transitive closure: C+ = {(r,a), (r,b), (r,v), (r,w), (r,x), (r,y), (r,z), (a,v), (a,w), (a,x), (b,y), (b,z)}
Reflexive transitive closure: C* = ID + C+
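The operators on this slide map directly onto set operations, so the listed results can be checked mechanically. Below is a minimal Python sketch; the function names are assumptions for illustration, and the naive fixed-point loop used for transitive closure is just one simple way to compute it, not necessarily how Grok does.

```python
def compose(r, s):
    """Composition r o s: all (x, z) with x r y and y s z for some y."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}


def inverse(r):
    """Inverse: flip every pair."""
    return {(y, x) for (x, y) in r}


def identity(node_set):
    """Identity relation over a given set of nodes."""
    return {(n, n) for n in node_set}


def transitive_closure(r):
    """R+: keep adding pairs reachable by one more step until nothing new appears."""
    closure = set(r)
    while True:
        new_pairs = compose(closure, r) - closure
        if not new_pairs:
            return closure
        closure |= new_pairs


# Edge sets from the example typed graph.
C = {("r", "a"), ("r", "b"), ("a", "v"), ("a", "w"),
     ("a", "x"), ("b", "y"), ("b", "z")}
I = {("a", "b")}
E = {("b", "y")}

union = I | E                          # I + E
intersection = E & C                   # E ^ C
difference = C - E                     # C - E
c_plus = transitive_closure(C)         # C+
all_nodes = {n for edge in C for n in edge}
c_star = identity(all_nodes) | c_plus  # C* = ID + C+

print(sorted(compose(I, E)))
print(len(c_plus))
```

Python's built-in `|`, `&` and `-` on sets play the roles of `+`, `^` and `-` in the slide's notation; only composition, inverse, identity and closure need helper functions.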

31 TA Schemas for Box and Arrow Diagrams (Malton, WCRE 2005)
A Schema in TA
–Determines
  Types of boxes
  Types of edges
  Allowed connectivity between edges
  Supports inheritance in schemas
–Also attributes (strings) on boxes & on edges
[Figure: schema with boxes proc and var and edges call and ref, linked by instance edges to boxes p, q, x, y]

32 Why Formalize Boxology?? ’Cause it Makes Our Life Better
Clear understanding & clear specification
–What does RSF mean?
–Meaning is independent of implementation
–Clarifies deeper concepts, e.g., expressiveness
Generality
Progress in reverse engineering
Progress in software architecture
Not just scribbling

33 Part 4. ROP: Relation-Oriented Programming & Grok-Like Languages
A paradigm shift

34 Example: Mickey Eats Swiss Cheese
[Figure: the “eat” relation over Garfield, Fluffy, Nancy, Mickey, Roquefort and Swiss]
Mickey . eat
–Swiss
–Roquefort
eat . Mickey
–Garfield
–Fluffy
eat o eat
–(Garfield Swiss)
–(Garfield Roquefort)
–(Fluffy Swiss)
–(Fluffy Roquefort)
eat+
–…
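The projections and composition in this example can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the four `eat` edges below are inferred from the results listed on the slide, Nancy is left out because her edges cannot be recovered from the transcript, and the helper names `successors`, `predecessors` and `compose` are hypothetical.

```python
# Edges inferred from the slide's listed results; Nancy's edges are unknown and omitted.
eat = {
    ("Garfield", "Mickey"),
    ("Fluffy", "Mickey"),
    ("Mickey", "Swiss"),
    ("Mickey", "Roquefort"),
}


def successors(node, r):
    """node . r: everything that node is related to."""
    return {y for (x, y) in r if x == node}


def predecessors(r, node):
    """r . node: everything that is related to node."""
    return {x for (x, y) in r if y == node}


def compose(r, s):
    """Composition r o s: all (x, z) with x r y and y s z for some y."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}


print(sorted(successors("Mickey", eat)))
print(sorted(predecessors(eat, "Mickey")))
print(sorted(compose(eat, eat)))
```

Read aloud, `eat o eat` answers "who eats something that eats what?": each cat is paired with each cheese that Mickey eats.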

35 Example ROP/Grok Program: Is relation R a tree?
How you would program this test …

36 Grok Program: Is R a Tree?
Pseudo code:
  if R has no loops &
     R has one root &
     R has only single parents then
     put “R is a tree”
Assume each node is a source or target of the contain C relation

37 Grok Program: Is R a Tree?
Pseudo code:
  if R has no loops
Grok code:
  if # ( R+ ^ ID ) = 0
[Figure: nodes a, b, c, d joined by R edges] Does transitive closure of R have any self-loops? Yes

38 Grok Program: Is R a Tree?
Pseudo code:
  if R has no loops &
     R has one root
Grok code:
  if # ( R+ ^ ID ) = 0 &
     # (dom R - rng R) = 1
[Figure: nodes a to g, with dom and rng marked] Does R have exactly one source? Yes

39 Grok Program: Is R a Tree?
Pseudo code:
  if R has no loops &
     R has one root &
     R has only single parents
Grok code:
  if # ( R+ ^ ID ) = 0 &
     # (dom R - rng R) = 1 &
     # ((R o inv R) - ID) = 0
[Figure: nodes a, b, c, d with R, inv R and R o inv R edges] Does my child have another parent? Yes

40 Grok Program: Is R a Tree?
Pseudo code:
  if R has no loops &
     R has one root &
     R has only single parents then
     put “R is a tree”
Grok code:
  if # ( R+ ^ ID ) = 0 &
     # (dom R - rng R) = 1 &
     # ((R o inv R) - ID) = 0 then
     put “R is a tree”
Moral: Relational programming is not like low level (Java level) programming. Loops typically disappear.
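The three-part tree test can be mirrored in Python to see each condition at work. This is a sketch rather than Grok: the helper names are assumptions, R is taken to hold (parent, child) pairs, and the single-parent condition is written as "no two distinct nodes share a child", i.e. the count of (R o inv R) - ID must be zero.

```python
def compose(r, s):
    """Composition r o s: all (x, z) with x r y and y s z for some y."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}


def inverse(r):
    """Inverse: flip every pair."""
    return {(y, x) for (x, y) in r}


def transitive_closure(r):
    """R+: add pairs reachable by one more step until nothing new appears."""
    closure = set(r)
    while True:
        new_pairs = compose(closure, r) - closure
        if not new_pairs:
            return closure
        closure |= new_pairs


def is_tree(r):
    """True when r (as parent -> child edges) has no loops, one root, and single parents."""
    node_set = {n for edge in r for n in edge}
    ident = {(n, n) for n in node_set}
    dom = {x for (x, _) in r}
    rng = {y for (_, y) in r}

    no_loops = len(transitive_closure(r) & ident) == 0        # # (R+ ^ ID) = 0
    one_root = len(dom - rng) == 1                            # # (dom R - rng R) = 1
    single_parents = len(compose(r, inverse(r)) - ident) == 0  # no shared children
    return no_loops and one_root and single_parents


tree = {("a", "b"), ("a", "c"), ("b", "d")}
cycle = {("a", "b"), ("b", "a")}
two_parents = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}
two_roots = {("a", "c"), ("b", "d")}

print(is_tree(tree), is_tree(cycle), is_tree(two_parents), is_tree(two_roots))
```

The only explicit loop sits inside `transitive_closure`; the test itself is three set expressions, which is the point the slide's moral makes about relational programming.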

41 Notation: Does it Matter?
“By relieving the brain of all unnecessary work, a good notation sets it free to concentrate on more advanced problems, and, in effect, increases the mental power of the race.” Alfred North Whitehead

42 Wins & Losses Using Tarski Algebra
Wins
–Good for computing new edges and for finding properties of edges, e.g., nodes in loops, leaves, etc.
Losses
–Not good for locating patterns involving several nodes, e.g., finding complete connected sub-graphs

43 Notation: Grok (Tarski) vs. Crocopat
My parent’s (P) children (C) are my (reflexive) siblings (S)
Grok: S := P o C
Crocopat: S(x,z) := EX(y, P(x,y) & C(y,z))
[Figure: nodes x, y, z with P, C and S edges]
Should Crocopat add Tarski operators??
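The two notations describe the same relation, which a short Python sketch can confirm on a toy family. The names `people`, `P`, `C` and the sample facts are hypothetical; the first form mirrors the Tarski-style composition and the second mirrors the predicate-style definition with an explicit existential over y.

```python
def compose(r, s):
    """Composition r o s: all (x, z) with x r y and y s z for some y."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}


# Hypothetical family: P(x, y) means y is a parent of x; C(y, z) means z is a child of y.
people = {"me", "mom", "sis"}
P = {("me", "mom"), ("sis", "mom")}
C = {("mom", "me"), ("mom", "sis")}

# Tarski style: S := P o C
s_tarski = compose(P, C)

# Predicate style: S(x, z) := EX(y, P(x, y) & C(y, z))
s_predicate = {
    (x, z)
    for x in people
    for z in people
    if any((x, y) in P and (y, z) in C for y in people)
}

print(sorted(s_tarski))
print(s_tarski == s_predicate)
```

The composition form never names the intermediate node y, which is what makes it compact; the predicate form names every variable, which is what lets it express patterns over several nodes.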

44 Characterizing Grok-Like Languages
Relational
Useful for software analysis
Expressiveness
–How powerful can a query be? Codd algebra and Crocopat are more powerful.
–How well can a query meet our needs? How writeable? How readable?
Performance of implementation
–Can hold large graphs?
–Fast enough to manipulate large graphs?

45 Performance of Grok-Like Languages
Size & speed: OK for Grok & Crocopat
–All memory resident, no disk access
–Hundreds of thousands of edges
–Modeling million-line systems
–Most operations not more than a few seconds
–Crocopat scales up a bit more for transitive closure
–Housekeeping, e.g., time to read files, is critical
–Need to test on 64-bit implementations

46 Data Structures for Binary Relations
Tables, one for each type of relation: DBMS
Single table of triples: Grok
Linked lists (pointers and nodes): Lsedit, JGrok (caches sorted lists)
BDD (Binary Decision Diagram): RelView, Crocopat
–Memory efficient storage of binary relations
–Works well with dense graphs
–Proven useful: RelView, Crocopat
–Surprising (to me): BDD efficient for transitive closure

47 Discussion of Grok-Like Languages
[Figure: Grok-Like Languages]
PS: Paul Klint’s relational language...

48 Progress: Using Grok-Like Languages
1. Enforce architecture rules. Holt 96, Feijs 98, Knodel 08
2. Lift dependency edges. Holt 98, Feijs 98
3. Find design pattern instances. Consens 98, Beyer 02
4. Find violations of patterns. Guo 99
5. Find anti-patterns. vanEmden 02, Feijs 98
6. Change impact analysis. Feijs 98
7. Specify extraction from syntax. Lin 08
8. Find source of dependency. Fahmy 01, Feijs 98
9. Locate uses of protocols. Wu 01
10. Type inference using transitive closure. vanDeursen 99
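As an illustration of the first two uses in this list, the Python sketch below lifts file-level dependencies to subsystem level and then subtracts the allowed edges to find rule violations. Everything in it is hypothetical: the subsystem and file names, the `allowed` rule set, and the helper names are invented for the example and do not come from any of the cited papers.

```python
def compose(r, s):
    """Composition r o s: all (x, z) with x r y and y s z for some y."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}


def inverse(r):
    """Inverse: flip every pair."""
    return {(y, x) for (x, y) in r}


# Hypothetical system: three subsystems, one file each.
contain = {("ui", "ui/view.c"), ("core", "core/logic.c"), ("db", "db/store.c")}
file_dep = {
    ("ui/view.c", "core/logic.c"),
    ("core/logic.c", "db/store.c"),
    ("ui/view.c", "db/store.c"),   # skips the core layer
}

# Hypothetical architecture rule: ui may use core, core may use db.
allowed = {("ui", "core"), ("core", "db")}

# Lift file-level edges to subsystem level, then subtract what the rule permits.
subsystem_dep = compose(compose(contain, file_dep), inverse(contain))
violations = subsystem_dep - allowed

print(sorted(subsystem_dep))
print(sorted(violations))
```

With these assumed facts the single violation is the direct dependency from ui to db; the whole check is two compositions and one set difference.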

49 Grokking Software Architecture: Conclusions

50 Conclusions
Typed graphs nicely formalize various software structures
Software architecture can benefit from a ROP approach
Tarski algebra, added to boxology, is elegant
–Does not handle multi-node patterns
Grok-like (ROP) languages are elegant and sufficiently efficient
–ROP is high level, is faster, more reliable, more flexible
Lots of
–Work done so far
–Room for more work
