Code recognition & CL modeling through AST
Xingzhong Xu, Hong Man

Outline
o Introduction of AST in SSP
o AST for Code Recognition
o AST for Cognitive Linguistic Modeling
o Summary and Future Work
Semantic Signal Processing Stevens

Introduction of AST in SSP
Most language applications use an Abstract Syntax Tree (AST) as an Intermediate Representation (IR) that helps the computer understand code semantically in the programming domain.*
Signal processing code:
    for (i = 0; i < n; i++) {
        acc0 += d_taps[i] * input[i];
    }
o How can we analyze it semantically?
o How can we model it semantically?
*Terence Parr, The Definitive ANTLR Reference: Building Domain-Specific Languages (Pragmatic Programmers), 2007
**ANTLR
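
As a rough illustration of what an AST for the loop body might look like, the sketch below builds the tree for `acc0 += d_taps[i] * input[i];` by hand and prints it in a LISP-like form, similar in spirit to the text that ANTLR's `toStringTree()` produces. The `Node` type, the `INDEX` node name, and the helper functions are assumptions made for this example; they are not taken from the SSP demo or from an ANTLR grammar.

```cpp
#include <string>
#include <vector>

// Hypothetical minimal AST node: a token text plus ordered children.
struct Node {
    std::string text;
    std::vector<Node> children;
};

// Render the tree as "(root child1 child2 ...)"; a leaf prints as its text.
std::string toStringTree(const Node& n) {
    if (n.children.empty()) return n.text;
    std::string out = "(" + n.text;
    for (const Node& c : n.children) out += " " + toStringTree(c);
    return out + ")";
}

// Hand-built AST for: acc0 += d_taps[i] * input[i];
// "INDEX" is an assumed imaginary node for an array subscript.
Node buildAccumulateStatement() {
    Node tapsAt{"INDEX", {Node{"d_taps", {}}, Node{"i", {}}}};
    Node inputAt{"INDEX", {Node{"input", {}}, Node{"i", {}}}};
    Node product{"*", {tapsAt, inputAt}};
    return Node{"+=", {Node{"acc0", {}}, product}};
}
```

Calling `toStringTree(buildAccumulateStatement())` yields `(+= acc0 (* (INDEX d_taps i) (INDEX input i)))`: the operator structure is explicit, while punctuation such as brackets and semicolons is gone.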

Code Recognition
To perform code re-hosting and other semantic code analysis, we first need to recognize the functionality of each code segment. In computer science, there are two approaches to code recognition:
1. AST-based recognition [Gabel, 2008] [Roy, 2009]
   o Generate the AST
   o Run a tree matcher
2. Random-test-based recognition [Jiang, 2009] [Bertran, 2005]
   o Segment the code
   o Test the I/O behavior
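
The second approach can be sketched as follows: feed two code segments the same random inputs and compare their outputs. This is a minimal sketch, not the method of the cited papers. The kernels `dotPlain`, `dotUnrolled`, and `sumOnly`, the `Kernel` signature, and the input ranges are all assumptions chosen for illustration; integer data is used so that outputs can be compared exactly.

```cpp
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

// Assumed common signature for the code segments under test.
using Kernel = std::function<long long(const std::vector<int>&, const std::vector<int>&)>;

// Plain multiply-accumulate loop.
long long dotPlain(const std::vector<int>& taps, const std::vector<int>& in) {
    long long acc = 0;
    for (std::size_t i = 0; i < taps.size(); i++)
        acc += static_cast<long long>(taps[i]) * in[i];
    return acc;
}

// Same computation, unrolled by four with a remainder loop.
long long dotUnrolled(const std::vector<int>& taps, const std::vector<int>& in) {
    long long acc0 = 0, acc1 = 0, acc2 = 0, acc3 = 0;
    std::size_t i = 0;
    for (; i + 4 <= taps.size(); i += 4) {
        acc0 += static_cast<long long>(taps[i + 0]) * in[i + 0];
        acc1 += static_cast<long long>(taps[i + 1]) * in[i + 1];
        acc2 += static_cast<long long>(taps[i + 2]) * in[i + 2];
        acc3 += static_cast<long long>(taps[i + 3]) * in[i + 3];
    }
    for (; i < taps.size(); i++)
        acc0 += static_cast<long long>(taps[i]) * in[i];
    return acc0 + acc1 + acc2 + acc3;
}

// A different computation (no multiply), used as a negative example.
long long sumOnly(const std::vector<int>& taps, const std::vector<int>& in) {
    long long acc = 0;
    for (std::size_t i = 0; i < taps.size(); i++) acc += taps[i] + in[i];
    return acc;
}

// Returns true if f and g agree on every randomly generated input pair.
bool sameIoBehavior(const Kernel& f, const Kernel& g, int trials, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> len(1, 32), val(-100, 100);
    for (int t = 0; t < trials; t++) {
        const std::size_t n = static_cast<std::size_t>(len(rng));
        std::vector<int> taps(n), in(n);
        for (std::size_t i = 0; i < n; i++) {
            taps[i] = val(rng);
            in[i] = val(rng);
        }
        if (f(taps, in) != g(taps, in)) return false;
    }
    return true;
}
```

Agreement on random inputs is only evidence of equivalence, not a proof; in exchange, the test does not care how differently the two segments are written, which is where a purely syntactic tree match can fail.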

Code Recognition
The AST represents the source code in the programming domain. Radio and computational primitives have characteristic features in the AST.
o Filter ≈ LOOP + ACCUMULATION + MULTIPLY
    for (i = 0; i < n; i++) {
        acc0 += d_taps[i] * input[i];
    }
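
The feature rule above (Filter ≈ LOOP + ACCUMULATION + MULTIPLY) could be checked with a simple recursive tree walk. The sketch below is an assumption about how such a matcher might look, not the demo's actual ANTLR tree pattern; the `Node` type and the node kinds `LOOP`, `ACCUMULATION`, `MULTIPLY`, `INDEX`, `ID`, and `BLOCK` are hypothetical names.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical AST node that keeps only the node kind.
struct Node {
    std::string kind;
    std::vector<Node> children;
};

Node make(std::string kind, std::vector<Node> children = {}) {
    return Node{std::move(kind), std::move(children)};
}

// True if the subtree rooted at n contains a node of the given kind.
bool containsKind(const Node& n, const std::string& kind) {
    if (n.kind == kind) return true;
    for (const Node& c : n.children)
        if (containsKind(c, kind)) return true;
    return false;
}

// True if the subtree contains an ACCUMULATION whose operands include a MULTIPLY.
bool hasMultiplyAccumulate(const Node& n) {
    if (n.kind == "ACCUMULATION" && containsKind(n, "MULTIPLY")) return true;
    for (const Node& c : n.children)
        if (hasMultiplyAccumulate(c)) return true;
    return false;
}

// Count LOOP nodes matching the filter feature. Note that an outer loop
// wrapping a matching inner loop is counted as well in this simple version.
int countFilterLoops(const Node& n) {
    int count = (n.kind == "LOOP" && hasMultiplyAccumulate(n)) ? 1 : 0;
    for (const Node& c : n.children) count += countFilterLoops(c);
    return count;
}

// for (...) { acc0 += d_taps[i] * input[i]; }
Node firLoop() {
    return make("LOOP", {make("ACCUMULATION",
        {make("ID"), make("MULTIPLY", {make("INDEX"), make("INDEX")})})});
}

// for (...) { total += x; }  accumulates, but without a multiply.
Node countingLoop() {
    return make("LOOP", {make("ACCUMULATION", {make("ID"), make("ID")})});
}
```

For a tree such as `make("BLOCK", {firLoop(), countingLoop()})`, `countFilterLoops` reports one match: only the first loop combines all three features.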

Code Recognition Result
To test the idea, I designed a code recognition demo (not fully debugged).
Source: GNU-Radio (C++)
Objective: Recognize and print the filter code.
Platform: Ubuntu, Java SE 1.6+, ANTLR 3.2
Process:
1. Generate the AST for each C++ file.
2. Match the filter sub-tree pattern.
3. Print the matched code segment.

Code Recognition Result
Result:
o GNU-Radio contains 932 C++ source files in total.
o 689 files were analyzed successfully (to be continued).
o 59 filter patterns were found.
Matched code segments:
    for (i = 0; i < n; i += N_UNROLL) {
        acc0 += d_taps[i + 0] * input[i + 0];
        acc1 += d_taps[i + 1] * input[i + 1];
        acc2 += d_taps[i + 2] * input[i + 2];
        acc3 += d_taps[i + 3] * input[i + 3];
    }

    for (int j = 0; j next_bit()-1.0; sum += *in++ * d_pn; }

    for (i = 0; i < d_ff_taps.size(); i++)
        acc += conj(d_ff_delayline[(i + d_ff_index) & ff_mask]) * d_ff_taps[i];

CL Modeling
Intermediate Representation: AST (Programming Domain)
CL Modeling (Signal Processing Domain)
Example statement: k = N - i;

CL Modeling
Example statement: k = N - i;
Rewrite and map the structure and tokens of the AST onto the CL Modeling Tree.

CL Modeling Result
To test our idea, I designed a CL Modeling demo based on the AST.*
o A tree rewriter translates and modifies the current AST into the CL Modeling Tree.
o The CL Modeling XML file is printed from the CL Modeling Tree.
*Terence Parr, Language Implementation Patterns: Create Your Own Domain-Specific and General Programming Languages, Pragmatic Programmers, 2010.
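
The two steps (rewrite the AST, then print XML) might look roughly like the sketch below for the statement `k = N - i;`. The slides do not show the CL Modeling schema, so the element names `Assign`, `Subtract`, `Add`, `Multiply`, and `Variable`, the `name` attribute, and both node types are purely hypothetical stand-ins chosen to illustrate the flow.

```cpp
#include <map>
#include <string>
#include <vector>

// Programming-domain AST node (hypothetical).
struct Node {
    std::string text;
    std::vector<Node> children;
};

// Modeling-tree node (hypothetical): an element tag plus an optional name.
struct ClNode {
    std::string tag;
    std::string name;
    std::vector<ClNode> children;
};

// Rewrite step: map operator tokens to assumed element names;
// every other token becomes a Variable carrying its text as the name.
ClNode rewrite(const Node& ast) {
    static const std::map<std::string, std::string> opTags = {
        {"=", "Assign"}, {"-", "Subtract"}, {"+", "Add"}, {"*", "Multiply"}};
    ClNode out;
    const auto it = opTags.find(ast.text);
    if (it != opTags.end() && !ast.children.empty()) {
        out.tag = it->second;
    } else {
        out.tag = "Variable";
        out.name = ast.text;
    }
    for (const Node& c : ast.children) out.children.push_back(rewrite(c));
    return out;
}

// Print step: serialize the modeling tree as indented XML.
// Simplification: attribute values are not XML-escaped.
std::string toXml(const ClNode& n, int depth = 0) {
    const std::string pad(depth * 2, ' ');
    std::string open = pad + "<" + n.tag;
    if (!n.name.empty()) open += " name=\"" + n.name + "\"";
    if (n.children.empty()) return open + "/>\n";
    std::string out = open + ">\n";
    for (const ClNode& c : n.children) out += toXml(c, depth + 1);
    return out + pad + "</" + n.tag + ">\n";
}

// Hand-built AST for: k = N - i;
Node assignmentAst() {
    return Node{"=", {Node{"k", {}},
                      Node{"-", {Node{"N", {}}, Node{"i", {}}}}}};
}
```

With these assumed names, `toXml(rewrite(assignmentAst()))` produces an `Assign` element containing a `Variable` named `k` and a `Subtract` element with the variables `N` and `i`. Keeping the rewrite and the printing separate means the mapping table can change without touching the serializer.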

Summary & Future Work
The programming-domain AST is a key interface for language applications. In the SSP project it serves:
o Code Recognition: determining the functionality of a code segment.
o Cognitive Linguistic Modeling: an intermediate form for modeling the radio code.
Future Work:
o Cover more code: C++, Matlab, VHDL, etc.
o Discover more computational and radio primitives.
o Fully support CL Modeling.

References
1. Jiang, L. and Su, Z. 2009. Automatic Mining of Functionally Equivalent Code Fragments via Random Testing. In Proceedings of the Eighteenth International Symposium on Software Testing and Analysis.
2. Gabel, M., Jiang, L., and Su, Z. 2008. Scalable Detection of Semantic Clones. In Proceedings of the 30th International Conference on Software Engineering.
3. Roy, C.K., Cordy, J.R., and Koschke, R. 2009. Comparison and Evaluation of Code Clone Detection Techniques and Tools: A Qualitative Approach. Science of Computer Programming.
4. Bertran, M., Babot, F., and Climent, A. 2005. An Input/Output Semantics for Distributed Program Equivalence Reasoning. Electron. Notes Theor. Comput. Sci. 137, 1 (Jul. 2005).
5. Parr, T. 2007. The Definitive ANTLR Reference: Building Domain-Specific Languages. Pragmatic Programmers.
6. Parr, T. 2010. Language Implementation Patterns: Create Your Own Domain-Specific and General Programming Languages. Pragmatic Programmers.