Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent Erlang refactoring with relational database Anikó Víg and Tamás Nagy Supervisors:

Supported by ELTE IKKK, Ericsson Hungary, in cooperation with University of Kent
Erlang refactoring with relational database
Anikó Víg and Tamás Nagy
Supervisors: Zoltán Horváth and Simon Thompson
Project members: László Lövei, Tamás Kozsik

Refactor
Refactoring is restructuring program code without altering its external behaviour.
Aims: restructuring code, improving code quality, enforcing coding conventions, optimization, migrating to a new API.
Refactoring tools are in general not available for functional languages; the only ones are HaRe (AST) and Clean refactoring (relational database).

Erlang
A functional programming language and runtime environment developed by Ericsson.
Designed for building distributed, reliable, soft real-time concurrent systems (telecommunication).
Highly dynamic nature.
Lightweight processes and message passing.
Messages can be sent to a process ID or a registered name, which are bound to function code at runtime.
Dynamically created code can be run (eval, hot code replacement).

Erlang
Function identifiers are atoms, and atoms may be constructed at runtime (and passed as arguments to apply, spawn).
The syntactic category of an atom may be a function, a "string", etc.
Side effects are restricted to message passing and built-in functions.
Modules have explicit interface definitions with static export and import lists.
No static type system: variables are not typed statically and can hold a value of any data type.
Variables are assigned a value only once in their lifetime.

The refactor tool
[Architecture diagram. Labelled components: source code in Emacs, Distel, Erlang node, AST of the source, ODBC (?), MySQL, source in the database.]

Code, AST

Storing the code in the database
Every node in a tree has a unique identifier.
Every module has a unique module identifier.
Almost every syntax-tree node type has its own database table.
The records in the tables contain the identifier of the current node and the identifiers of its children.
We store the positions, node types and names in separate tables.
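
As a rough illustration of this storage scheme, the sketch below uses Python's sqlite3 module in place of MySQL and stores the Erlang expression X = 42. The table and column names (node_type, name, pos, match_expr, mid, id) are assumptions made for the example, not the tool's actual schema.

```python
import sqlite3

# Hypothetical schema: every table and column name here is illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node_type  (mid INTEGER, id INTEGER, type TEXT);
    CREATE TABLE name       (mid INTEGER, id INTEGER, name TEXT);
    CREATE TABLE pos        (mid INTEGER, id INTEGER, line INTEGER, col INTEGER);
    -- One table per syntax-tree node type: a row holds the node's own id
    -- and the ids of its children.
    CREATE TABLE match_expr (mid INTEGER, id INTEGER, pattern INTEGER, body INTEGER);
""")

# Erlang source stored as module 1:   X = 42
rows = {
    "node_type":  [(1, 1, "variable"), (1, 2, "integer"), (1, 3, "match_expr")],
    "name":       [(1, 1, "X"), (1, 2, "42")],
    "pos":        [(1, 1, 1, 1), (1, 2, 1, 5), (1, 3, 1, 1)],
    "match_expr": [(1, 3, 1, 2)],
}
for table, values in rows.items():
    placeholders = ", ".join("?" * len(values[0]))
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", values)

# Follow a child identifier from the match expression to the name table.
pattern_name = conn.execute("""
    SELECT name.name
    FROM match_expr
    JOIN name ON name.mid = match_expr.mid AND name.id = match_expr.pattern
    WHERE match_expr.mid = 1 AND match_expr.id = 3
""").fetchone()[0]
print(pattern_name)
```

Keeping names and positions in their own tables means a renaming only has to touch the name table, while the structural tables stay unchanged.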

Semantic information
We store the semantic information in separate tables: identical variables, function definitions and their calling expressions, the scope of the nodes, and the hierarchy of scopes.
AST + semantic information = graph, not tree.
+ Searching and using a graph is more efficient with a relational database than traversing the tree.
- Storing and recovering the code and communicating with the database is more expensive.
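
To make the efficiency argument concrete, here is a minimal sketch, again with sqlite3 standing in for MySQL, of a semantic table that links call expressions to function definitions. The tables fun_def and fun_call and their columns are hypothetical; the point is only that all call sites of a function come back from one indexed query, without walking any syntax tree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical semantic tables; names and columns are assumptions.
    CREATE TABLE fun_def  (mid INTEGER, id INTEGER, name TEXT, arity INTEGER);
    CREATE TABLE fun_call (mid INTEGER, id INTEGER, def_mid INTEGER, def_id INTEGER);
    CREATE INDEX fun_call_by_def ON fun_call (def_mid, def_id);
""")
conn.execute("INSERT INTO fun_def VALUES (1, 10, 'area', 2)")
conn.executemany(
    "INSERT INTO fun_call VALUES (?, ?, ?, ?)",
    [(1, 25, 1, 10), (1, 40, 1, 10), (2, 7, 1, 10)],  # two local calls, one remote
)

def call_sites(conn, def_mid, def_id):
    """Return (module id, node id) for every call of the given definition."""
    rows = conn.execute(
        "SELECT mid, id FROM fun_call WHERE def_mid = ? AND def_id = ? ORDER BY mid, id",
        (def_mid, def_id),
    )
    return rows.fetchall()

sites = call_sites(conn, 1, 10)
print(sites)
```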

Our own tables: visibility of variables

Our own tables: function calls

Our own tables: visibility of scopes

Problems during the build-up
Problem: the Erlang preprocessor substitutes the macro definitions. Solution: we use epp_dodger instead of epp.
Problem: every node has only the line number as position information. Solution: we use erl_scan1 (modified by Huiqing) instead of erl_scan; it can return the column information too.

Problems during the build-up
Erlang syntax trees have several kinds of comments: not only comment nodes, but also pre- and postcomments, each of which can be a list of comment nodes.
erl_comment_scan:file collects the comments from the file, and we have to store them in the database too, with the correct position information.
The same problem with the column information arises here, so we use erl_recomment1 instead of erl_recomment.

The algorithms
We use a postorder traversal of the syntax tree to assign identifiers and put the nodes into the correct tables.
We use a preorder traversal of the syntax tree to get the visibility information.
The other information comes from the database (collected by separate processes), for example the information about function calls.
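
The two traversals can be sketched as follows. This is an assumption-laden toy: the Node class, the SCOPE_KINDS set and the idea that a node's scope is the identifier of the nearest enclosing scope-opening node are all invented for illustration. Postorder suits identifier assignment because a parent's record refers to its children's identifiers, so the children must be numbered first; preorder suits visibility because a node has to know its scope before its children are visited.

```python
from itertools import count

SCOPE_KINDS = {"function_clause", "fun_expr"}  # assumed scope-opening node kinds

class Node:
    def __init__(self, kind, name=None, children=()):
        self.kind, self.name, self.children = kind, name, list(children)
        self.id = None      # filled in by assign_ids
        self.scope = None   # filled in by assign_scopes

def assign_ids(node, counter):
    """Postorder: children receive identifiers before their parent."""
    for child in node.children:
        assign_ids(child, counter)
    node.id = next(counter)

def assign_scopes(node, enclosing=None):
    """Preorder: a node is given its scope before its children are visited."""
    node.scope = enclosing
    inner = node.id if node.kind in SCOPE_KINDS else enclosing
    for child in node.children:
        assign_scopes(child, inner)

def walk(node):
    """Yield the nodes in preorder."""
    yield node
    for child in node.children:
        yield from walk(child)

# Erlang clause:   f(X) -> Y = X.
tree = Node("function_clause", "f", [
    Node("variable", "X"),
    Node("match_expr", children=[Node("variable", "Y"), Node("variable", "X")]),
])
assign_ids(tree, count(1))
assign_scopes(tree)
summary = [(n.id, n.kind, n.scope) for n in walk(tree)]
print(summary)
```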

Analysis for Refactoring Steps
Syntax analysis (AST)
Static semantics (annotated AST or relational database): scope, visibility, binding structure, type information
Side-condition analysis
Compensations
Dynamic function calls (apply, spawn, etc.)
Syntactical, semantical and library coverage

Rename variable
Definition: Find every occurrence of the variable (i.e. the variables with the same name in the visibility range of the variable) and replace every occurrence with the new name.
Precondition: The new variable name is not visible at any occurrence of the variable.
Limited to one module (no global variables).
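
A minimal sketch of this step on top of a relational store, with sqlite3 standing in for MySQL. The var_occ table, its binding column (grouping occurrences of the same variable) and its scope column are hypothetical, and visibility is simplified: a name counts as visible at an occurrence if it occurs anywhere in the same scope.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: "binding" groups occurrences of the same variable,
# "scope" identifies the enclosing function clause.
conn.execute("CREATE TABLE var_occ (id INTEGER PRIMARY KEY, name TEXT, binding INTEGER, scope INTEGER)")
# Erlang clauses:   f(X) -> Y = X + 1, Y * X.      g(A) -> A.
conn.executemany(
    "INSERT INTO var_occ VALUES (?, ?, ?, ?)",
    [(1, "X", 1, 1), (2, "Y", 2, 1), (3, "X", 1, 1), (4, "Y", 2, 1), (5, "X", 1, 1),
     (6, "A", 3, 2)],
)

def rename_variable(conn, occurrence_id, new_name):
    """Rename every occurrence of the variable that owns occurrence_id."""
    (binding,) = conn.execute(
        "SELECT binding FROM var_occ WHERE id = ?", (occurrence_id,)).fetchone()
    # Precondition (simplified): the new name must not occur in any scope
    # that contains an occurrence of the variable being renamed.
    clash = conn.execute("""
        SELECT 1 FROM var_occ
        WHERE name = ?
          AND scope IN (SELECT scope FROM var_occ WHERE binding = ?)
        LIMIT 1
    """, (new_name, binding)).fetchone()
    if clash is not None:
        raise ValueError(f"{new_name} is already visible")
    conn.execute("UPDATE var_occ SET name = ? WHERE binding = ?", (new_name, binding))

rename_variable(conn, 1, "Z")   # f(Z) -> Y = Z + 1, Y * Z.
names = [row[0] for row in conn.execute("SELECT name FROM var_occ ORDER BY id")]
print(names)
```

Renaming Y to Z afterwards is rejected by the precondition check, because Z now occurs in the same clause.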

Rename function
Definition: The refactoring relies on finding the definition and every call site of a given function and substituting the new name there.
Preconditions:
No name clash in the current module (existing functions, import list).
No name clash in other modules, if the function is exported.

Reorder arguments
Definition: Change the order of the arguments in the same way at the definition and at every call site of a given function.
Preconditions: The parameters have no side effects (this check is only planned).
Tricky: implicit function calls require deleting and creating subtrees (the same problem arises in the tuple arguments refactoring step too).
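
The core of the transformation is applying one permutation consistently. The sketch below works on plain lists of argument texts rather than on database rows; the function name reorder and the convention that order[i] is the old 1-based position of the argument that ends up at position i + 1 are assumptions for the example.

```python
def reorder(args, order):
    """Return args permuted so that new position i+1 holds old position order[i]."""
    if sorted(order) != list(range(1, len(args) + 1)):
        raise ValueError("order must be a permutation of the argument positions")
    return [args[old - 1] for old in order]

# Erlang before:   send(Pid, Msg, Timeout) -> ...
# Call sites:      send(P, hello, 5000)    send(self(), Reply, infinity)
order = [3, 1, 2]
definition = reorder(["Pid", "Msg", "Timeout"], order)
calls = [reorder(call, order)
         for call in (["P", "hello", "5000"], ["self()", "Reply", "infinity"])]
print(definition, calls)
```

The same order list is used for the definition and for every call, which is what keeps the program's behaviour unchanged as long as the argument expressions have no side effects.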

Implicit function example

Tuple arguments
Definition: Change the way some arguments are used at the definition and at every call site of a given function by grouping them into one tuple argument.
Preconditions:
The given position must be within a formal argument of a function definition.
The function must be a declared function, not a fun expression.
The given number must not be too large.
No name clash if the arity changes (not only in the current module, if the function is exported).
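
A small sketch of the grouping itself, on lists of argument texts. The helper tuple_args and its parameters start (1-based position of the first argument to group) and number (how many arguments to group) are hypothetical names; "the given number must not be too large" is read here as "the group must not run past the last argument".

```python
def tuple_args(args, start, number):
    """Group `number` arguments beginning at 1-based position `start` into a tuple."""
    if start < 1 or number < 1 or start + number - 1 > len(args):
        raise ValueError("the group does not fit into the argument list")
    first = start - 1
    grouped = "{" + ", ".join(args[first:first + number]) + "}"
    return args[:first] + [grouped] + args[first + number:]

# Erlang before:   move(X, Y, Colour) -> ...      call:  move(1, 2, red)
# Erlang after:    move({X, Y}, Colour) -> ...    call:  move({1, 2}, red)
definition = tuple_args(["X", "Y", "Colour"], 1, 2)
call = tuple_args(["1", "2", "red"], 1, 2)
print(definition, call)
```

In this example the arity drops from 3 to 2, which is why the name-clash precondition has to be checked against move/2 as well.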

Eliminate variable
Definition: All instances of a variable are replaced with its bound value in the region where the variable is visible. The variable can be left out where its value is not used.
Preconditions:
The variable has exactly one binding occurrence, on the left-hand side of a pattern-matching expression, and it is not part of a compound pattern.
The expression bound to the variable has no side effects.
Every variable of the expression is visible (that is, not shadowed) at every occurrence of the variable to be eliminated.

Eliminate variable (cont.)
Decide whether an occurrence is needed (remove or replace). Remove it if it is:
not at the end of a block expression,
not at the end of a clause body,
not at the end of a receive expression action,
not at the end of a try expression body, after branch or handler.
Replicate the subtree, because we need unique identifiers in the new subtrees.
The subtree can contain every node type, so we had to implement most of the syntax tool module for the database representation.
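
The need for replication can be seen in a toy model where nodes are dictionaries with an id field. Everything here (the node layout, the helpers replicate, substitute and render) is invented for illustration; the real tool works on database rows. Each replaced occurrence gets its own copy of the bound expression with fresh identifiers, so no two nodes in the result share an identifier.

```python
from itertools import count

def node(ids, kind, value, children=()):
    """Create a toy syntax-tree node with a fresh unique identifier."""
    return {"id": next(ids), "kind": kind, "value": value, "children": list(children)}

def replicate(n, ids):
    """Copy a subtree, giving every copied node a new identifier."""
    return node(ids, n["kind"], n["value"], [replicate(c, ids) for c in n["children"]])

def substitute(n, var_name, bound_expr, ids):
    """Replace each occurrence of var_name with its own copy of bound_expr."""
    if n["kind"] == "variable" and n["value"] == var_name:
        return replicate(bound_expr, ids)
    n["children"] = [substitute(c, var_name, bound_expr, ids) for c in n["children"]]
    return n

def render(n):
    if n["kind"] == "op":
        left, right = n["children"]
        return f"({render(left)} {n['value']} {render(right)})"
    return str(n["value"])

def collect_ids(n):
    return [n["id"]] + [i for c in n["children"] for i in collect_ids(c)]

# Erlang before:   B = A + 1, B * B      after eliminating B:   (A + 1) * (A + 1)
ids = count(1)
bound = node(ids, "op", "+", [node(ids, "variable", "A"), node(ids, "integer", 1)])
body = node(ids, "op", "*", [node(ids, "variable", "B"), node(ids, "variable", "B")])

result = substitute(body, "B", bound, ids)
text = render(result)
all_ids = collect_ids(result)
print(text, all_ids)
```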

Planned refactor steps (short term)
Merge subexpression duplicates: All instances of the same subexpression are stored in a variable that the user gives, then all instances of the original subexpression are changed to the variable.
Extract function: An alternative of a function definition might contain a sequence of expressions which can be considered a logical unit, hence a function definition can be created from it. The extracted function is lifted to the module level, and it is parameterised with the variables that the expressions depend on. The sequence of expressions is replaced with a function call expression.

Planned refactor steps (middle term)
Tuple to record
Specialisation of functions
Generalisation of functions
Fusion of functions
Modification of data structures

Fusion and comparison
We are working together with the University of Kent. They released the Wrangler Erlang refactorer in January.
Their approach works with annotated abstract syntax trees (AAST), without a database.
We plan to make a common version with more refactorings.
The two tools have only two refactorings in common (rename variable, rename function).
The two tools gave the same results on bigger test bases too.

Testing
We tested the tool with more than 200 small test cases, written to cover the possible branches of the refactorings. The tool produced the same results as the original result files.
We also tested the tool on a real, bigger codebase. Our version is currently slower than the AAST approach on big multi-module systems and huge source files.

Future work
We plan to fuse the two tools, at first just "under a common umbrella" and later with a common interface level.
Compare the two approaches on more complicated refactorings (generalisation).
We plan to eliminate the ODBC connection and call the MySQL database directly from the Erlang node (a mysql module was released for Erlang in the middle of January).
Expand the number of refactorings.