Hands-on Refactoring with Wrangler. Simon Thompson, Huiqing Li, Xingdong Bian. University of Kent.

Overview What is refactoring? Examples The process of refactoring Tool building and infrastructure What is in Wrangler … demo Latest advances: data, processes, erlide.

Introducing refactoring

Soft-ware There’s no single correct design … … different options for different situations. Maintain flexibility as the system evolves.

Refactoring Refactoring means changing the design or structure of a program … without changing its behaviour. Refactor / Modify

Examples

Generalisation

    -module(test).
    -export([f/1]).
    add_one([H|T]) -> [H+1 | add_one(T)];
    add_one([]) -> [].
    f(X) -> add_one(X).

    -module(test).
    -export([f/1]).
    add_one(N, [H|T]) -> [H+N | add_one(N,T)];
    add_one(N,[]) -> [].
    f(X) -> add_one(1, X).

Generalisation and renaming

    -module(test).
    -export([f/1]).
    add_int(N, [H|T]) -> [H+N | add_int(N,T)];
    add_int(N,[]) -> [].
    f(X) -> add_int(1, X).

Generalisation

    -export([printList/1]).
    printList([H|T]) ->
        io:format("~p\n",[H]),
        printList(T);
    printList([]) -> true.

    printList([1,2,3])

    -export([printList/2]).
    printList(F,[H|T]) ->
        F(H),
        printList(F, T);
    printList(F,[]) -> true.

    printList(fun(H) -> io:format("~p\n", [H]) end, [1,2,3]).

Generalisation

    -export([printList/1]).
    printList([H|T]) ->
        io:format("~p\n",[H]),
        printList(T);
    printList([]) -> true.

    -export([printList/1]).
    printList(F,[H|T]) ->
        F(H),
        printList(F, T);
    printList(F,[]) -> true.
    printList(L) ->
        printList(fun(H) -> io:format("~p\n", [H]) end, L).

Asynchronous to synchronous

Sender, before:

    Pid ! {self(),msg}

Receiver, before:

    {Parent,msg} -> body

Sender, after:

    Pid ! {self(),msg},
    receive {Pid, ok} -> ok end

Receiver, after:

    {Parent,msg} -> Parent ! {self(),ok}, body
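The asynchronous-to-synchronous change can be tried out with a small runnable sketch. The module name sync_sketch, the function names and the one-second timeout are assumptions made for illustration; they are not part of Wrangler or of the slides.

```erlang
-module(sync_sketch).
-export([start/0, send_async/1, send_sync/1]).

%% Hypothetical receiver: acknowledges each message, then carries on.
start() ->
    spawn(fun loop/0).

loop() ->
    receive
        {Parent, msg} ->
            Parent ! {self(), ok},   % acknowledgement added by the refactoring
            loop()
    end.

%% Before the refactoring: send and continue without waiting.
send_async(Pid) ->
    Pid ! {self(), msg},
    ok.

%% After the refactoring: block until the receiver acknowledges.
%% The timeout is an addition of this sketch, so a dead receiver
%% cannot hang the caller forever.
send_sync(Pid) ->
    Pid ! {self(), msg},
    receive
        {Pid, ok} -> ok
    after 1000 ->
        timeout
    end.
```

Both sides have to change together: a sender that waits for an acknowledgement from an unmodified receiver would block, which is why this is a tool-supported refactoring rather than a local edit.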

Refactoring

Refactoring = Transformation + Condition Transformation Ensure change at all those points needed. Ensure change at only those points needed. Condition Is the refactoring applicable? Will it preserve the semantics of the module? the program?

Transformations

Condition > Transformation Renaming an identifier "The existing binding structure should not be affected. No binding for the new name may intervene between the binding of the old name and any of its uses, since the renamed identifier would be captured by the renaming. Conversely, the binding to be renamed must not intervene between bindings and uses of the new name."
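A small sketch of why the renaming condition matters in Erlang. The module and function names are hypothetical. Renaming Z to a fresh name is harmless; renaming it to Y, which is already bound as a parameter, turns the binding into a pattern match against the existing Y, so the binding structure changes.

```erlang
-module(rename_sketch).
-export([original/2, safe/2, captured/2]).

%% Original function: Z is bound once and used once.
original(X, Y) ->
    Z = X + 1,
    Z * Y.

%% Z renamed to W: W is fresh, so the binding structure is unchanged.
safe(X, Y) ->
    W = X + 1,
    W * Y.

%% Z renamed to Y: Y is already bound, so "Y = X + 1" is now a match
%% that fails with badmatch unless Y happens to equal X + 1.
captured(X, Y) ->
    Y = X + 1,
    Y * Y.
```

A tool that checks the condition before transforming would reject the second renaming instead of producing captured/2.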

Which refactoring exactly?
Generalise f by making 23 a parameter of f:

    f(X) -> Con = 23, g(X) + Con

This one occurrence? All occurrences (in the body)? Some of the occurrences … to be selected.
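To make the choice concrete, here is a sketch with a hypothetical variant of f in which 23 occurs twice. The second occurrence, the helper g/1 and the names f_one and f_all are assumptions for illustration only.

```erlang
-module(gen_sketch).
-export([f/1, f_one/2, f_all/2]).

%% Hypothetical helper so that the example is self-contained.
g(X) -> X * 2.

%% Assumed original: the literal 23 occurs twice in the body.
f(X) ->
    Con = 23,
    g(X) + Con + 23.

%% Generalising over the first occurrence only.
f_one(N, X) ->
    Con = N,
    g(X) + Con + 23.

%% Generalising over all occurrences in the body.
f_all(N, X) ->
    Con = N,
    g(X) + Con + N.
```

Both results agree with f when called with 23, but they are different functions for any other argument, so the tool has to ask which occurrences are meant.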

Compensate or crash?

    -export([oldFun/1, newFun/1]).
    oldFun(L) -> newFun(L).
    newFun(L) -> … ….

or?

    -export([newFun/1]).
    newFun(L) -> … ….

Refactoring tools

Tool support Bureaucratic and diffuse. Tedious and error prone. Semantics: scopes, types, modules, … Undo/redo Enhanced creativity

Semantic analysis Binding structure Dynamic atom creation, multiple binding occurrences, pattern semantics etc. Module structure and projects No explicit projects for Erlang; cf Erlide / Emacs. Type and effect information Need effect information for e.g. generalisation.

Erlang refactoring: challenges Multiple binding occurrences of variables. Indirect function call or function spawn: apply (lists, rev, [[a,b,c]]) Multiple arities … multiple functions: rev/1 Concurrency Refactoring within a design library: OTP. Side-effects.
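The indirect-call challenge can be illustrated with a short sketch. It uses lists:reverse/1 from the standard library in place of the slide's shorthand rev; the module and function names are hypothetical.

```erlang
-module(dyn_sketch).
-export([direct/0, via_apply/0, via_atom/1]).

%% A direct call: easy for a tool to find when renaming the callee.
direct() ->
    lists:reverse([a, b, c]).

%% An indirect call with literal atoms: still statically visible.
via_apply() ->
    apply(lists, reverse, [[a, b, c]]).

%% The function name is manufactured at run time, so no static
%% analysis can be sure which function is called here.
via_atom(Suffix) ->
    F = list_to_atom("rev" ++ Suffix),
    apply(lists, F, [[a, b, c]]).
```

All three return the same list when via_atom/1 is given "erse", but only the first two can be updated reliably by a renaming. The same name can also cover several functions: lists exports both reverse/1 and reverse/2.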

Static vs dynamic Aim to check conditions statically. Static analysis tools possible … but some aspects intractable: e.g. dynamically manufactured atoms. Conservative vs liberal. Compensation?

Architecture of Wrangler

Wrangler in Emacs

Refactorings in Wrangler Renaming variable, function, module, process Function generalisation Move function between modules. Function extraction Fold against definition Introduce and fold against macros. Tuple function arguments together Register a process From function to process Add a tag to messages All these refactorings work across multiple-module projects and respect macro definitions.

Wrangler demo

Tool building

Wrangler and RefactorErl
Wrangler: Lightweight. Better integration with interactive tools (e.g. emacs). Undo/redo external? Ease of implementing conditions.
RefactorErl: Higher entry cost. Better for a series of refactorings on a large project. Transaction support. Ease of implementing transformations.

Duplicate Code Detection Especially for Erlang/OTP programs. Report syntactically well-formed code fragments that are identical after consistent renaming of variables … … ignoring differences in literals and layout. Integrated with the refactoring environment.

Code Inspection Support Variable use/binding information. Caller functions. Caller/callee modules. Case/if/receive expressions nested more than a specified level. Long function/modules. Non tail-recursive servers. Non-flushed unknown messages...

Integration … with IDEs Back to the future? Programmers' preference for emacs and gvim … … though some IDE interest: Eclipse, NetBeans … Issue of integration with multiple IDEs: building common interfaces.

Integration … with tools Test data sets and test generation. Makefiles, etc. Working with macros e.g. QuickCheck uses Erlang macros … … in a particular idiom.

APIs … programmer / user API in Erlang to support user-programmed refactorings: declarative, straightforward and complete but relatively low-level. Higher-level combining forms? OK for transformations, but need a separate condition language.

Verification and validation Possible to write formal proofs of correctness: check conditions and transformations at different levels of abstraction, possibly name binding and substitution for renaming etc., with a more abstract formulation for e.g. data type changes. Use of Quviq QuickCheck to verify refactorings in Wrangler.

Clone detection

The Wrangler Clone Detector Uses syntactic and static semantic information. Syntactically well-formed code fragments … identical after consistent renaming of variables, … with variations in literals, layout and comments. Integrated within the refactoring environment.

The Wrangler Clone Detector Makes use of token stream and annotated AST.
Token-based approaches:
- Efficient.
- Report non-syntactic clones.
AST-based approaches:
- Report syntactic clones.
- Checking for consistent renaming is easier.

The Wrangler Clone Detector
Source Files
-> Tokenisation -> Token Stream
-> Normalisation -> Normalised Token Stream
-> Suffix Tree Construction -> Suffix tree
-> Clone Collector -> Initial Clones
-> Clone Filter -> Filtered Initial Clones
-> Clone Decomposition -> Syntactic Clones
-> Consistent Renaming Checking -> Clones to report
-> Formatting -> Reported Code Clones
Parsing + Static Analysis -> Annotated ASTs
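The tokenisation, normalisation and consistent-renaming stages can be sketched with the standard erl_scan module. This is a minimal illustration and not Wrangler's implementation: the module name norm_sketch, the choice to keep atoms as they are, and the positional renaming check are all assumptions of the sketch.

```erlang
-module(norm_sketch).
-export([normalise/1, same_shape/2, consistent/2]).

%% Tokenise a source string and abstract away variable names and
%% literal values. Layout and comments are already dropped by erl_scan.
normalise(Source) ->
    {ok, Tokens, _End} = erl_scan:string(Source),
    [norm(T) || T <- Tokens].

norm({var, _Anno, _Name}) -> var;
norm({Cat, _Anno, _Val}) when Cat =:= integer; Cat =:= float;
                              Cat =:= string; Cat =:= char -> literal;
norm({Cat, _Anno, Val}) -> {Cat, Val};   % atoms are kept (sketch assumption)
norm({Sym, _Anno}) -> Sym.               % punctuation and keywords

%% Two fragments are clone candidates if their normalised streams agree.
same_shape(A, B) ->
    normalise(A) =:= normalise(B).

%% Candidates are only reported if the variables correspond one to one.
consistent(A, B) ->
    VA = vars(A),
    VB = vars(B),
    length(VA) =:= length(VB)
        andalso one_to_one(lists:usort(lists:zip(VA, VB))).

vars(Source) ->
    {ok, Tokens, _End} = erl_scan:string(Source),
    [Name || {var, _Anno, Name} <- Tokens].

one_to_one(Pairs) ->
    N = length(Pairs),
    length(lists:usort([L || {L, _} <- Pairs])) =:= N
        andalso length(lists:usort([R || {_, R} <- Pairs])) =:= N.
```

For example, "X + X" and "A + B" have the same normalised shape but are not related by a consistent renaming, which is why a token-level match alone over-reports and a later checking stage is needed.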

Clone detection demo

Support for clone removal Refactorings to support clone removal.
- Function extraction.
- Generalise a function definition.
- Fold against a function definition.
- Move a function between modules.

Case studies Applied the clone detector to Wrangler itself with threshold values of 30 (minimum clone size in tokens) and 2 (minimum number of duplicates).
- 36 final clone classes were reported … 12 are across modules, and 3 are duplicated function definitions.
- Without syntactic checking and consistent variable renaming checking, 191 would have been reported.
Applied to a third-party code base (32k LoC, 89 modules): 109 clone classes reported.

Data-oriented refactorings

Tupling parameters

With separate parameters:

    -module(tup1).
    -export([gcd/2]).
    gcd(X,Y) ->
        if X>Y -> gcd(X-Y,Y);
           Y>X -> gcd(Y-X,X);
           true -> X
        end.

With the parameters tupled:

    -module(tup1).
    -export([gcd/1]).
    gcd({X,Y}) ->
        if X>Y -> gcd({X-Y,Y});
           Y>X -> gcd({Y-X,X});
           true -> X
        end.

Introduce records …

With tuples:

    -module(rec1).
    g({A, B})-> A + B.
    h(X, Y)->
        g({X, X}),
        g(Y).

With a record (field names f1, f2):

    -module(rec1).
    -record(rec,{f1, f2}).
    g(#rec{f1=A, f2=B})-> A + B.
    h(X, Y)->
        g(#rec{f1=X,f2=X}),
        g(#rec{f1=element(1,Y), f2=element(2,Y)}).

Introduce records in a project Need to replace other expressions … Replace tuples with record Record update expression Record access expression Chase dependencies across functions … … and across modules.
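The other expression forms mentioned above look as follows in Erlang. This is a sketch only: the module name rec_sketch and the functions sum, bump and from_tuple are hypothetical, with the record definition borrowed from the rec1 example.

```erlang
-module(rec_sketch).
-export([sum/1, bump/1, from_tuple/1]).

-record(rec, {f1, f2}).

%% Record pattern in a function head, replacing the tuple pattern {A, B}.
sum(#rec{f1 = A, f2 = B}) ->
    A + B.

%% Record access (R#rec.f1) and record update (R#rec{...}) expressions,
%% replacing element/2 and setelement/3 on the tuple.
bump(R) ->
    R#rec{f1 = R#rec.f1 + 1}.

%% Conversion at a boundary where callers still pass plain tuples.
from_tuple({X, Y}) ->
    #rec{f1 = X, f2 = Y}.
```

Every function that builds, matches or updates the value has to move to these forms together, which is why the dependencies have to be chased across functions and modules.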

Refactoring and Concurrency

Wrangler and processes Refactorings which address processes Register a process. Rename a registered process. From function to process. Add tags to messages sent / received.
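As an illustration of what "register a process" produces, here is a minimal sketch. The registered name echo_server, the message format and the timeout are assumptions of the example, not output from Wrangler.

```erlang
-module(reg_sketch).
-export([start/0, ping/0]).

%% After the refactoring: the spawned process is registered under a
%% name, so clients no longer need the pid passed around.
%% Calling start/0 twice fails, because the name is already taken.
start() ->
    Pid = spawn(fun loop/0),
    register(echo_server, Pid),
    Pid.

loop() ->
    receive
        {From, ping} ->
            From ! {echo_server, pong},
            loop()
    end.

%% Clients address the process by its registered name.
ping() ->
    echo_server ! {self(), ping},
    receive
        {echo_server, pong} -> pong
    after 1000 ->
        timeout
    end.
```

The tool has to establish that the name is not already in use and that only one such process is spawned, which relies on the process analysis rather than on syntax alone.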

Challenges to implementation Data gathering is a challenge because Processes are syntactically implicit. Pid to process links are implicit. Communication structure is implicit. Side effects.

Underlying analysis Analyses include Annotation of the AST, using call graph. Forward program slicing. Backwards program slicing.

Wrangler and Erlide

Erlide is an Eclipse plugin for Erlang. Distribution simplified. Integration with the edit undo history. Notion of project. Refactoring API in the Eclipse LTK. Ongoing support for Erlide from Ericsson.

Issues on integration LTK has a fixed workflow for interactions. New file vs set of diffs as representation. Fold and generalise interaction pattern. Cannot support rename / create file. Other refactorings involve search … a different API.

Conclusions

Future work Concurrency: continue work. Refactoring within a design library: OTP. Working with Erlang Training and Consulting. Continue integration with Eclipse + other IDEs. Test and property refactoring in. Clone detection: fuller integration.

Acknowledgements Wrangler development funded by EPSRC. The developers of syntax-tools, distel and Erlide. George Orosz and Melinda Toth. Zoltán Horváth and the RefactorErl group at Eötvös Loránd Univ., Budapest.

Property discovery in Wrangler Clone detection … … and elimination. Find code that is similar … … common abstraction … … accumulate the instances. Examples: Test code from Ericsson: different medium and codec. Clone removal example: 2.6k to 2.0k and counting.

Other Wrangler developments Fully integrated into Eclipse … keeps the reviewers happy! User experience: preview the changes, code inspector, Respecting test code in e.g. EUnit. Multi-version: Erlang, OS, Java, Eclipse. Windows installer.

Next steps Refine the notion of similarity … … to take account of insert/delete in seqs of commands. Support property extraction from 'free' and EUnit tests. Refactorings of tests and properties themselves. Further integration into Erlide: allow use of the contextual menu. Case study with Lambda Stream.

Case Studies Applied the clone detector to Wrangler itself and other Erlang applications with the thresholds of 30 for the minimum size of the clone (in tokens) and 2 for the minimum number of duplicates.
Table: columns Wrangler, Mnesia, Yaws; rows No. of files, Size (K LoC), Time (min), No. of clones, Inter-module clones. Surviving values: Time (min) "<6 <3"; Inter-module clones "35518".

Clearly a clone From the Dialyzer user interface.

Less clearly worth replacing

    OkButton = gs:button(WinPacker, [{label, {text, "Ok"}}, {pack_xy, {2,3}}]),
    CancelButton = gs:button(WinPacker, [{label, {text, "Cancel"}}, {pack_xy, {3,3}}]),

Also from the Dialyzer GUI … would it be clearer to have an intervening common function call?

Related Work Existing clone detection approaches: Program text-based. Token-based. AST-based. PDG-based. Hybrid approaches. Language dependent or independent?

Future Work Use visualization techniques to improve the presentation of clone results. Extend the current approach to find “similar” code fragments. How to automate or semi-automate the workflow of clone detection and removal.

Conclusions The Wrangler clone detector
- Relatively efficient
- No false positives
Refactorings support interactive removal of clones. Integrated in the development environment.

Questions?

Installation: Mac OS X and Linux
Requires R11B-5, 12B, 13B + Emacs.
Download Wrangler, then run make and sudo make install.
Add to the .emacs file:

    (add-to-list 'load-path "/usr/local/share/wrangler/elisp")
    (require 'wrangler)

Installation: Windows Requires R11B-5, 12B, 13B + Emacs. Download the installer; no other actions are required.

Installation: Eclipse + ErlIDE Install Erlang R11B-5 or later, if it isn't already present on your system. On Windows systems, use a path with no spaces in it. Install Eclipse 3.4, if you haven't already.

Starting Wrangler in Emacs Open emacs, and open a .erl file. M-x erlang-refactor-on or C-c, C-r. New menus: Refactor and Inspector. Customise for dir. Undo: C-c, C-_

Preview Feature Preview changes before confirming the change. Emacs ediff is used.

Stopping Wrangler in Emacs M-x erlang-refactor-off to stop Wrangler. Shortcut: C-c, C-r

Hands On Check out sample code from: svn co anches/refa anches/refa Or use your own project code Feedback: or