IBM Software Group ® CDT DOM Proposal
Doug Schaefer, IBM, Tech Lead, Eclipse CDT
December 2004



IBM Software Group | Rational software © 2004 IBM Corporation

Where are we (IBM) coming from?
- We're part of the Model-Driven Development (MDD) team in the Rational Software Division of the IBM Software Group
- CDT is the core C++ component of Rational Software Architect
- The C++ component includes visualization of code as UML as well as transformation of UML to C++
- We need an accurate, complete DOM that allows programmatic code change
- Since we do both Java and C++, we'd like the architectures of the JDT and CDT to be "similar".

What have we done until now?
- Started work on parser in Dec
- Considered a number of options but settled on a handwritten parser
  - Needed smart control over ambiguities in C/C++
  - Needed to support a number of clients with different needs
  - Similar in approach to newer versions of gcc.
- First contributed in 1.1 for CModel replacing previous JavaCC-based parser
- Added Indexing and Search as client in 1.2
  - Replaced previous ctags-based index (not without issue)
- Numerous clients in 2.0, including content assist, F3, type browsers

Parsing Architecture 1.0
- We had concerns over the scalability of an AST
- Settled on a callback-based architecture
  - Clients created only the data that they needed
  - Client would pass in a requestor to the parser and the parser would pass parse info back to it as the parse progressed
  - Despite the names, the IAST* parse info does not form a proper AST
- Client would also be responsible for passing in ScannerInfo
  - Compiler command line arguments that affect parsing
    - Include paths, macro defs
  - Needed to properly parse
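The requestor idea above can be illustrated with a minimal sketch. Every type here (`ParseRequestor`, `ScannerSettings`, the toy `parse` method) is a hypothetical stand-in invented for illustration and is not the CDT 1.x/2.x API; the "parser" only splits words so that the callback flow is visible.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of a callback-based parse; none of these types are real CDT API. */
final class RequestorSketch {

    /** Stand-in for the requestor a client passes in; by default every callback is ignored. */
    interface ParseRequestor {
        default void enterFunction(String name, int offset) {}
        default void acceptVariable(String name, int offset) {}
    }

    /** Stand-in for ScannerInfo; the toy parser ignores it, it only marks where it is passed. */
    static final class ScannerSettings {
        final List<String> includePaths;
        final List<String> macroDefinitions;

        ScannerSettings(List<String> includePaths, List<String> macroDefinitions) {
            this.includePaths = includePaths;
            this.macroDefinitions = macroDefinitions;
        }
    }

    /** Toy parser: reports each space-separated word; words ending in "()" count as functions. */
    static void parse(String source, ScannerSettings settings, ParseRequestor requestor) {
        int offset = 0;
        for (String word : source.split(" ")) {
            if (word.endsWith("()")) {
                requestor.enterFunction(word.substring(0, word.length() - 2), offset);
            } else if (!word.isEmpty()) {
                requestor.acceptVariable(word, offset);
            }
            offset += word.length() + 1; // +1 for the separating space
        }
    }

    /** An outline-style client: it keeps only function names and builds no other data. */
    static List<String> functionNames(String source) {
        final List<String> names = new ArrayList<>();
        ScannerSettings none = new ScannerSettings(
                Collections.<String>emptyList(), Collections.<String>emptyList());
        parse(source, none, new ParseRequestor() {
            @Override
            public void enterFunction(String name, int offset) {
                names.add(name);
            }
        });
        return names;
    }
}
```

The client decides what to retain, which is the flexibility the slide describes; the cost is that each client must manage its own data.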

Parsing Architecture 1.0 – What worked well
- Pretty accurate parse information (usually)
  - Content assist worked really well for C and C++
  - Accurate outline view
  - Accurate search results, F3
  - Ability to generate type info for class and hierarchy browsers
- Flexibility for clients
  - Enabled us to pile on the clients in 2.0

Parser Architecture 1.0 – What didn't work well
- Performance
  - Assumed parse times would be quick, < 1 sec
    - Finding parse times usually in 2-4 second range
  - Index takes a lot of time, memory, and CPU power to generate
  - Content assist times out regularly
- Hard to provide accurate results consistently
  - F3 and content assist often produce no results
  - Need for accurate ScannerInfo often hard to satisfy
- Scalability to large projects
  - Problems only worsen when project grows
  - Index times longer than already long build times
  - ScannerInfo different for different files in project

Why a DOM?
- Although flexible, the callback mechanism was hard to define
  - Unable to provide all information that every possible client would require
  - Often unable to determine parser context for certain constructs
  - Clients had added complexity to manage their own data
  - Parser had added complexity to provide enough data
    - Parse mode proliferation
- Need to address scalability
  - Cut down on the amount of parsing we need to do
  - Can we reuse parse results?
- Need data structures to help programmatically make changes to the source code

DOM Architecture
- The DOM is composed of the parser/scanner and three levels of data
  1) Physical AST with mapping back through macro expansions to source
  2) Logical Scope/Binding tree with cross translation unit indexing
  3) AST Rewriter
- Firm up interfaces for DOM creation
  - Navigate from Core Model to DOM
  - Hide parser from clients (use core model)
  - Allows us to play with how the DOM is created to improve scalability

Goals for 3.0
- To reduce the fixed cost of a parse (preprocess, scan & syntax matching) and to allow for lazy semantic evaluation
  - Improve performance & reduce memory footprint of navigation features
- To provide a "complete" physical AST which can make our clients aware of preprocessor macro expansions in source code
- To provide better support for C
  - Link-time resolution of cross references
  - Tailored implementation of parser/semantic bindings

Physical AST - IASTNode
- Given a file to parse, the AST Framework shall return an IASTTranslationUnit which then can be traversed or visited
- The IASTNode interface hierarchy represents constructs in the C/C++ grammar.

    long x;                            /* IASTSimpleDeclaration with 1 IASTDeclarator */
    long y();                          /* IASTSimpleDeclaration with 1 IASTFunctionDeclarator */
    int f() { return sizeof( int ); }  /* IASTFunctionDefinition */

- Physical tree is unaware of any semantic knowledge
  - Declaration before use (C++)
  - Scoping
  - Type compatibility
  - lValue vs. rValue
- Allows for quick generation of syntax tree
  - Only slight overhead to cost of scan & preprocess
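A rough sketch of what "traversed or visited" could look like for the three declarations on this slide. The `Node` and `Visitor` types below are simplified, hypothetical stand-ins rather than the proposed IASTNode interfaces, and the tree is built by hand instead of by a parser. The tree holds syntax only, with no scoping or type information.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified physical tree; these are not the proposed CDT interfaces. */
final class PhysicalTreeSketch {

    /** Called once per node; returning false skips that node's children. */
    interface Visitor {
        boolean visit(Node node);
    }

    static final class Node {
        final String kind;
        final List<Node> children = new ArrayList<>();

        Node(String kind, Node... kids) {
            this.kind = kind;
            for (Node kid : kids) {
                children.add(kid);
            }
        }

        /** Depth-first, pre-order traversal. */
        void accept(Visitor visitor) {
            if (visitor.visit(this)) {
                for (Node child : children) {
                    child.accept(visitor);
                }
            }
        }
    }

    /** Hand-built tree for: long x; long y(); int f() { return sizeof( int ); } */
    static Node sampleTranslationUnit() {
        return new Node("TranslationUnit",
                new Node("SimpleDeclaration", new Node("Declarator")),
                new Node("SimpleDeclaration", new Node("FunctionDeclarator")),
                new Node("FunctionDefinition",
                        new Node("FunctionDeclarator"),
                        new Node("CompoundStatement")));
    }

    /** Counts nodes of one syntactic kind using a visitor. */
    static int count(Node root, String kind) {
        final int[] total = {0};
        root.accept(node -> {
            if (node.kind.equals(kind)) {
                total[0]++;
            }
            return true; // always descend into children
        });
        return total[0];
    }
}
```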

Logical Tree - IBinding
- Logical elements are higher-level constructs that map onto a physical IASTNode
- For 3.0: IASTName#resolveBinding()

    long x;                            /* IASTName for x resolves to an IVariable */
    long y();                          /* IASTName for y resolves to an IFunction */
    int f() { return sizeof( int ); }  /* IASTName for f resolves to an IFunction */

- Beyond 3.0 – IASTBinaryExpression could bind to a user-defined operator
- Semantic errors return an IProblemBinding describing the error
- IASTTranslationUnit can be asked for all declarations or references of a particular IBinding
- Bindings can be resolved completely lazily on request or in a full traversal of the AST
  - Indexer would require full binding resolution
  - Most other clients do not require all bindings to be resolved
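Lazy resolution, as described above, might be sketched as a name that looks up its binding on first request and caches the result. The `Binding`, `Scope`, and `Name` classes are hypothetical simplifications (a real scope lookup is far more involved), and the "problem" kind merely stands in for the idea of an IProblemBinding.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of lazy binding resolution; not the proposed CDT types. */
final class LazyBindingSketch {

    /** Result of resolving a name: "variable", "function", or "problem". */
    static final class Binding {
        final String kind;
        final String name;

        Binding(String kind, String name) {
            this.kind = kind;
            this.name = name;
        }

        boolean isProblem() {
            return kind.equals("problem");
        }
    }

    /** Toy scope: a flat map from declared name to its kind, counting lookups. */
    static final class Scope {
        private final Map<String, String> declaredKinds = new HashMap<>();
        int lookups; // how many semantic lookups were actually performed

        void declare(String name, String kind) {
            declaredKinds.put(name, kind);
        }

        Binding lookup(String name) {
            lookups++;
            String kind = declaredKinds.get(name);
            // An undeclared name yields a problem binding instead of null or an exception.
            return new Binding(kind == null ? "problem" : kind, name);
        }
    }

    /** A name node: no semantic work happens until resolveBinding() is called. */
    static final class Name {
        private final String text;
        private final Scope scope;
        private Binding cached;

        Name(String text, Scope scope) {
            this.text = text;
            this.scope = scope;
        }

        Binding resolveBinding() {
            if (cached == null) {
                cached = scope.lookup(text); // resolved once, on first request
            }
            return cached;
        }
    }
}
```

A client such as an indexer would call `resolveBinding()` on every name in a full traversal, while a client interested in a single selection pays for one lookup only.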

Macro Awareness in the Physical AST
- Nearly all parser clients in the CDT are concerned with offsets
  - Selection
  - Search Markers
  - Navigation
  - Compare
- Within a preprocessor macro expansion, our IScanner implementation massages the offsets as tokens arrive
- However, the CDT 2.x AST Nodes are unaware as to whether or not they are a result of a macro expansion
- This deficiency affects different clients differently.

The Good – Outline View

The Bad – Search Markers Slightly Off

The Ugly – A Refactoring Gone Wrong

Introducing IASTNodeLocation
- Every node in the AST derives from IASTNode
- Every IASTNode provides a mechanism for resolving its location in the physical AST
  - External Header Files
  - Resources
  - Working Copies
  - Macro Expansions
- getNodeLocations() returns an array of IASTNodeLocation, as a node may span multiple locations (when macros are involved)
- Interpretation of macro expansion locations is up to the client
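One way a client might interpret a node that spans several locations is sketched below. The `Location` and `MacroExpansionLocation` classes, and the assumption that a macro expansion location carries the offset and length of the macro invocation in the file, are illustrative guesses and not the proposed IASTNodeLocation interface. A search-marker client could use the enclosing range, while a refactoring client could refuse to edit any node that involves a macro.

```java
import java.util.List;

/** Hypothetical sketch of multi-location nodes; not the proposed CDT interface. */
final class NodeLocationSketch {

    /** A plain range in a file. */
    static class Location {
        final int offset;
        final int length;

        Location(int offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /** Assumed here to record where the macro invocation sits in the file. */
    static final class MacroExpansionLocation extends Location {
        final String macroName;

        MacroExpansionLocation(String macroName, int offset, int length) {
            super(offset, length);
            this.macroName = macroName;
        }
    }

    /** True if any part of the node came out of a macro expansion. */
    static boolean involvesMacro(List<Location> locations) {
        for (Location location : locations) {
            if (location instanceof MacroExpansionLocation) {
                return true;
            }
        }
        return false;
    }

    /** Smallest {offset, length} range covering every location of the node. */
    static int[] enclosingRange(List<Location> locations) {
        if (locations.isEmpty()) {
            throw new IllegalArgumentException("node has no locations");
        }
        int start = Integer.MAX_VALUE;
        int end = Integer.MIN_VALUE;
        for (Location location : locations) {
            start = Math.min(start, location.offset);
            end = Math.max(end, location.offset + location.length);
        }
        return new int[] {start, end - start};
    }
}
```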

Better support for C
- Support for link-time bindings
  - Resolving function references in C
    - Will greatly aid navigation/search features for this style of C code
    - Refactoring/programmatic edit support will be difficult
      - Definite candidate for "Preview" pane in refactoring
      - Without full build-model to reference, resolution is heuristic-based
- Stricter emphasis upon providing custom implementation for C & GCC
  - AST & supporting data structures will be more compact memory-wise
  - Syntax parsing will be more accurate
  - Algorithms for semantic analysis in C are far less rigorous
    - Should yield better performance for nearly all clients

Parser Language Variants
- Interfaces
  - org.eclipse.cdt.core.dom.ast – Base interfaces that apply to both ANSI C & ISO C++
  - org.eclipse.cdt.core.dom.ast.c – C99 specific sub-interfaces
  - org.eclipse.cdt.core.dom.ast.cpp – ISO C++ 98 specific sub-interfaces
  - org.eclipse.cdt.core.dom.ast.gnu – GNU extensions that apply to both C & C++
  - org.eclipse.cdt.core.dom.ast.gnu.c – GNU extensions that apply to C
  - org.eclipse.cdt.core.dom.ast.gnu.cpp – GNU extensions that apply to C++
- Implementations
  - org.eclipse.cdt.internal.core.parser2 – supporting infrastructure
  - org.eclipse.cdt.internal.core.parser2.c – C99 & GCC support
  - org.eclipse.cdt.internal.core.parser2.cpp – ISO C++ 98 & G++ support
- Other variants may subclass C or C++ Source Code Parser and choose which GNU extensions they wish to enable through a configuration interface

Physical Tree Hierarchy
- C++ extends the common AST
- GNU C++ extends C++
- C++ and C are unrelated outside of the common AST package

Logical Tree Hierarchy
- C extends common
- C++ extends common
- Theoretically, new bindings could be added for future variants
  - e.g. GCC Signatures

AST Rewriting
- AST Rewriter accepts requests to change AST
  - Add (Insert/Set), Remove, Replace
  - The new code can be a new AST Node or a String
- AST Rewrite gathers change requests and then executes
  - Analyzes the change requests for validity
  - Produces text edits that can be applied to documents to effect the change
- The analysis can decide not to do a change because it is too hard
  - E.g. when macros are involved
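The gather, analyze, and emit-text-edits flow can be sketched as follows. This is a hypothetical simplification: change requests are expressed directly as source ranges and strings rather than AST nodes, `markMacroExpansion` is an invented way to tell the rewriter where macros are, and the validity analysis is limited to dropping edits that touch a macro expansion and rejecting overlaps.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of a rewriter that turns change requests into text edits. */
final class RewriteSketch {

    static final class TextEdit {
        final int offset;
        final int length;
        final String text;

        TextEdit(int offset, int length, String text) {
            this.offset = offset;
            this.length = length;
            this.text = text;
        }
    }

    private final List<TextEdit> requested = new ArrayList<>();
    private final List<int[]> macroRanges = new ArrayList<>();

    /** Invented helper: records a source range produced by a macro expansion. */
    void markMacroExpansion(int offset, int length) {
        macroRanges.add(new int[] {offset, length});
    }

    void replace(int offset, int length, String text) {
        requested.add(new TextEdit(offset, length, text));
    }

    void remove(int offset, int length) {
        replace(offset, length, "");
    }

    void insert(int offset, String text) {
        replace(offset, 0, text);
    }

    /** Analysis step: skips edits touching macros, sorts the rest, rejects overlaps. */
    List<TextEdit> rewrite() {
        List<TextEdit> accepted = new ArrayList<>();
        for (TextEdit edit : requested) {
            if (!touchesMacro(edit)) {
                accepted.add(edit); // "too hard" edits are silently left out
            }
        }
        accepted.sort(Comparator.comparingInt((TextEdit edit) -> edit.offset));
        for (int i = 1; i < accepted.size(); i++) {
            TextEdit previous = accepted.get(i - 1);
            if (previous.offset + previous.length > accepted.get(i).offset) {
                throw new IllegalStateException("overlapping change requests");
            }
        }
        return accepted;
    }

    private boolean touchesMacro(TextEdit edit) {
        // Treat an insertion as one character wide, so inserting at the start of
        // or inside a macro range is conservatively refused.
        int editEnd = edit.offset + Math.max(edit.length, 1);
        for (int[] range : macroRanges) {
            if (edit.offset < range[0] + range[1] && range[0] < editEnd) {
                return true;
            }
        }
        return false;
    }

    /** Applies sorted, non-overlapping edits back to front so offsets stay valid. */
    static String apply(String document, List<TextEdit> edits) {
        StringBuilder result = new StringBuilder(document);
        for (int i = edits.size() - 1; i >= 0; i--) {
            TextEdit edit = edits.get(i);
            result.replace(edit.offset, edit.offset + edit.length, edit.text);
        }
        return result.toString();
    }
}
```

Separating `rewrite()` from `apply()` mirrors the slide's split between analysis and text edits: a client can inspect or preview the accepted edits before touching the document.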

DOM as a Service
- We need to take some effort to minimize parsing within Eclipse
  - C and C++ are mature languages: hence, larger source code bases
    - Multiple clients parsing on resource-change events can cripple the system
  - A complex web of #include's throughout the code base is difficult to optimize per-parse without having knowledge of previous parses
    - Same with templates …
- The Indexer is already parsing continually; we should be able to leverage that information for all other clients that require saved-file parse trees
- Since parsing can be a processor- and memory-intensive operation, it is difficult for the indexer to co-ordinate its priority vs. other parser clients for system resources
  - A user requests an Open Declaration, which competes against a running indexer for memory & CPU
- Eventual Goal
  - AST Service w/Index
  - Incremental Parse
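A minimal sketch of the "reuse parse results" idea behind an AST service, assuming a cache keyed by file path and a modification stamp. The class name, the stamp-based invalidation, and the `Function` used as a parser are all assumptions made for illustration; the slides do not specify how the service would be built, and this sketch ignores #include dependencies entirely.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of a shared AST cache; not a design taken from the slides. */
final class AstServiceSketch {

    private static final class Entry {
        final long stamp;
        final Object ast;

        Entry(long stamp, Object ast) {
            this.stamp = stamp;
            this.ast = ast;
        }
    }

    private final Map<String, Entry> cache = new HashMap<>();
    private final Function<String, Object> parser;
    private int parseCount;

    /** The parser is injected so that every client goes through the same service. */
    AstServiceSketch(Function<String, Object> parser) {
        this.parser = parser;
    }

    /** Returns the cached tree for a saved file, reparsing only when the stamp changed. */
    synchronized Object getAst(String path, long modificationStamp) {
        Entry entry = cache.get(path);
        if (entry == null || entry.stamp != modificationStamp) {
            entry = new Entry(modificationStamp, parser.apply(path));
            parseCount++;
            cache.put(path, entry);
        }
        return entry.ast;
    }

    synchronized int parseCount() {
        return parseCount;
    }
}
```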