Code recognition & CL modeling through AST Xingzhong Xu Hong Man.


1 Code recognition & CL modeling through AST Xingzhong Xu Hong Man

2 Outline
o Introduction of AST in SSP
o AST for Code Recognition
o AST for Cognitive Linguistic Modeling
o Summary and Future Work
Semantic Signal Processing, Stevens

3 Introduction of AST in SSP
Most language applications use an Abstract Syntax Tree (AST) as an Intermediate Representation (IR) to help the computer understand code semantically in the programming domain.*
Signal processing code:
    for (i = 0; i < n; i++){ acc0 += d_taps[i] * input[i]; }
How do we analyze it semantically? How do we model it semantically?
*Terence Parr, The Definitive ANTLR Reference: Building Domain-Specific Languages (Pragmatic Programmers), 2007
**ANTLR

4 Code Recognition
To perform code re-hosting and other semantic code analysis, we first need to recognize the functionality of each code segment. In computer science, there are two approaches to code recognition:
1. AST-based recognition [Gabel, 2008] [Roy, 2009]
   o Generate the AST
   o Run a tree matcher
2. Random-test-based recognition [Jiang, 2009] [Bertran, 2005]
   o Segment the code
   o Test the I/O behavior
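The second approach can be illustrated with a minimal C++ sketch: feed the same random inputs to a reference behavior and to a candidate code segment, and compare the outputs. This is only an illustration of the I/O-testing idea, not the method of [Jiang, 2009]; the names `Segment`, `referenceFilter`, `unrolledCandidate`, and `behavesAlike`, the trial count, and the tolerance are all assumptions made for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

// A code segment under test, modeled as (taps, input) -> one output sample.
using Segment = std::function<double(const std::vector<double>&,
                                     const std::vector<double>&)>;

// Reference behavior: the plain multiply-accumulate loop from the slides.
double referenceFilter(const std::vector<double>& taps,
                       const std::vector<double>& input) {
    double acc = 0.0;
    for (std::size_t i = 0; i < taps.size(); i++) acc += taps[i] * input[i];
    return acc;
}

// Candidate: a 4-way unrolled variant that looks different textually.
double unrolledCandidate(const std::vector<double>& taps,
                         const std::vector<double>& input) {
    double acc0 = 0.0, acc1 = 0.0, acc2 = 0.0, acc3 = 0.0;
    std::size_t n = taps.size(), i = 0;
    for (; i + 3 < n; i += 4) {
        acc0 += taps[i + 0] * input[i + 0];
        acc1 += taps[i + 1] * input[i + 1];
        acc2 += taps[i + 2] * input[i + 2];
        acc3 += taps[i + 3] * input[i + 3];
    }
    for (; i < n; i++) acc0 += taps[i] * input[i];  // leftover elements
    return acc0 + acc1 + acc2 + acc3;
}

// Random test: two segments are treated as alike if their outputs agree
// (within a tolerance) on every randomly generated input pair.
bool behavesAlike(const Segment& a, const Segment& b,
                  int trials = 100, std::size_t n = 16, double tol = 1e-9) {
    std::mt19937 rng(42);  // fixed seed keeps the sketch reproducible
    std::uniform_real_distribution<double> dist(-1.0, 1.0);
    for (int t = 0; t < trials; t++) {
        std::vector<double> taps(n), input(n);
        for (std::size_t i = 0; i < n; i++) {
            taps[i] = dist(rng);
            input[i] = dist(rng);
        }
        if (std::fabs(a(taps, input) - b(taps, input)) > tol) return false;
    }
    return true;
}
```

Calling `behavesAlike(referenceFilter, unrolledCandidate)` compares the two segments purely by behavior, so the unrolled variant is accepted even though its source text differs. Agreement on random inputs is evidence of equivalence, not a proof.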

5 Code Recognition
The AST represents the source code in the programming domain. Radio and computational primitives have characteristic features in the AST.
o Filter ≈ LOOP + ACCUMULATION + MULTIPLY
    for (i = 0; i < n; i++){ acc0 += d_taps[i] * input[i]; }
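The pattern Filter ≈ LOOP + ACCUMULATION + MULTIPLY can be sketched as a sub-tree search over a small hand-built AST. The demo described in these slides uses ANTLR with Java; the C++ below is only a stand-in, and the node labels (`FOR`, `ADD_ASSIGN`, `MUL`, `INDEX`, `ID`) and every function name are assumptions chosen for the example, not ANTLR output.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Minimal AST node: a label plus ordered children (labels are illustrative).
struct Node {
    std::string label;
    std::vector<std::shared_ptr<Node>> kids;
};
using NodePtr = std::shared_ptr<Node>;

NodePtr make(const std::string& label, std::vector<NodePtr> kids = {}) {
    return std::make_shared<Node>(Node{label, std::move(kids)});
}

// True if the subtree rooted at n contains a node with the given label.
bool contains(const NodePtr& n, const std::string& label) {
    if (n->label == label) return true;
    for (const auto& k : n->kids)
        if (contains(k, label)) return true;
    return false;
}

// ACCUMULATION + MULTIPLY: "acc += <expression containing a product>".
bool isAccumulatedProduct(const NodePtr& n) {
    if (n->label != "ADD_ASSIGN" || n->kids.size() != 2) return false;
    return contains(n->kids[1], "MUL");
}

bool hasAccumulatedProduct(const NodePtr& n) {
    if (isAccumulatedProduct(n)) return true;
    for (const auto& k : n->kids)
        if (hasAccumulatedProduct(k)) return true;
    return false;
}

// LOOP: collect every FOR node that encloses an accumulated product.
void findFilterLoops(const NodePtr& n, std::vector<NodePtr>& out) {
    if (n->label == "FOR" && hasAccumulatedProduct(n)) out.push_back(n);
    for (const auto& k : n->kids) findFilterLoops(k, out);
}

// Hand-built AST for: for (i = 0; i < n; i++) { acc0 += d_taps[i] * input[i]; }
NodePtr exampleFilterLoop() {
    auto product = make("MUL", {make("INDEX", {make("ID"), make("ID")}),
                                make("INDEX", {make("ID"), make("ID")})});
    auto body = make("ADD_ASSIGN", {make("ID"), product});
    return make("FOR", {make("ASSIGN"), make("LT"), make("POST_INC"), body});
}
```

Running `findFilterLoops` on a tree that contains `exampleFilterLoop()` collects the `FOR` node. The check is deliberately loose: an outer loop that merely encloses a filter loop is reported as well, so a real matcher would need a tighter pattern.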

6 Code Recognition Result
To test the idea, I designed a code recognition demo (not fully debugged).
Source: GNU Radio 3.2.2 (C++)
Objective: Recognize and print the filter code.
Platform: Ubuntu 10.04 + Java SE 1.6 + ANTLR 3.2
Process:
o Generate the AST for each C++ file.
o Match the filter sub-tree pattern.
o Print the matched code segment.

7 Code Recognition Result
Result:
o 932 C++ source files in total in GNU Radio.
o 689 files successfully analyzed (to be continued).
o 59 filter patterns found.
Matched code segments:
    for (i = 0; i < n; i += N_UNROLL){
      acc0 += d_taps[i + 0] * input[i + 0];
      acc1 += d_taps[i + 1] * input[i + 1];
      acc2 += d_taps[i + 2] * input[i + 2];
      acc3 += d_taps[i + 3] * input[i + 3];
    }

    for (int j = 0; j next_bit()-1.0; sum += *in++ * d_pn; }

    for (i=0; i < d_ff_taps.size(); i++)
      acc += conj(d_ff_delayline[(i+d_ff_index) & ff_mask]) * d_ff_taps[i];

8 CL Modeling
Intermediate Representation: AST (Programming Domain)
CL Modeling (Signal Processing Domain)
Example statement: k = N - i;

9 CL Modeling
Example statement: k = N - i;
Rewrite and map the structure and tokens from the AST to the CL Modeling Tree.

10 CL Modeling Result
To test our idea, I designed a CL Modeling demo based on the AST.*
o A tree rewriter translates and modifies the current AST into a CL Modeling Tree.
o Based on the CL Modeling Tree, the demo prints the CL Modeling XML file.
https://sites.google.com/site/stevensxingzhong/home/clmb
*Terence Parr, Language Implementation Patterns: Create Your Own Domain-Specific and General Programming Languages, Pragmatic Programmers, 2010.
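The two steps of the demo (rewrite the AST, then print XML) can be sketched for the statement k = N - i. The slides do not give the CL Modeling schema, so the element names `assignment`, `difference`, and `symbol`, the `name` attribute, and the AST labels are hypothetical placeholders; only the overall shape (a label-mapping tree rewriter followed by an XML printer) follows the description above. The self-referential `std::vector<Tree>` member assumes a C++17 compiler.

```cpp
#include <map>
#include <string>
#include <vector>

struct Tree {
    std::string label;       // node type
    std::string text;        // token text, empty for interior nodes
    std::vector<Tree> kids;  // ordered children
};

// Tree rewriter: keeps the structure and maps AST labels to model element
// names. The mapping table is hypothetical, not the real CL Modeling schema.
Tree rewrite(const Tree& ast) {
    static const std::map<std::string, std::string> names = {
        {"ASSIGN", "assignment"}, {"SUB", "difference"}, {"ID", "symbol"}};
    auto it = names.find(ast.label);
    Tree out{it != names.end() ? it->second : ast.label, ast.text, {}};
    for (const Tree& k : ast.kids) out.kids.push_back(rewrite(k));
    return out;
}

// XML printer: one element per node, two spaces of indentation per level.
// Token text is assumed to be a plain identifier, so nothing is escaped.
std::string toXml(const Tree& t, int depth = 0) {
    std::string pad(depth * 2, ' ');
    std::string open = "<" + t.label;
    if (!t.text.empty()) open += " name=\"" + t.text + "\"";
    if (t.kids.empty()) return pad + open + "/>\n";
    std::string xml = pad + open + ">\n";
    for (const Tree& k : t.kids) xml += toXml(k, depth + 1);
    return xml + pad + "</" + t.label + ">\n";
}

// Hand-built AST for: k = N - i;
Tree exampleAst() {
    return Tree{"ASSIGN", "", {
        Tree{"ID", "k", {}},
        Tree{"SUB", "", {Tree{"ID", "N", {}}, Tree{"ID", "i", {}}}}}};
}
```

`toXml(rewrite(exampleAst()))` yields a nested document with an `assignment` root, a `symbol` child for k, and a `difference` element holding the symbols N and i. Keeping the rewriter and the printer separate mirrors the two-step process on the slide.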

11 Summary & Future Work
The programming-domain AST is a key interface for language applications. In the SSP project it serves two purposes:
o Code Recognition: determine the functionality of a code segment.
o Cognitive Linguistic Modeling: act as an intermediate form for modeling the radio code.
Future Work:
o Cover more code: C++, Matlab, VHDL, etc.
o Discover more computational and radio primitives.
o Fully support CL Modeling.

12 References
1. Jiang, L. and Su, Z. 2009. Automatic mining of functionally equivalent code fragments via random testing. In Proceedings of the Eighteenth International Symposium on Software Testing and Analysis.
2. Gabel, M., Jiang, L., and Su, Z. 2008. Scalable detection of semantic clones. In Proceedings of the 30th International Conference on Software Engineering.
3. Roy, C.K., Cordy, J.R., and Koschke, R. 2009. Comparison and evaluation of code clone detection techniques and tools: A qualitative approach. Science of Computer Programming.
4. Bertran, M., Babot, F., and Climent, A. 2005. An input/output semantics for distributed program equivalence reasoning. Electron. Notes Theor. Comput. Sci. 137, 1 (Jul. 2005).
5. Parr, T. 2007. The Definitive ANTLR Reference: Building Domain-Specific Languages. Pragmatic Programmers.
6. Parr, T. 2010. Language Implementation Patterns: Create Your Own Domain-Specific and General Programming Languages. Pragmatic Programmers.

