Data-Driven Inference of API Mappings. Department of Computer Science, Rutgers University. Amruta Gokhale, Daeyoung Kim


Data-Driven Inference of API Mappings
Amruta Gokhale, Daeyoung Kim, Vinod Ganapathy
Department of Computer Science, Rutgers University
PROMOTO 2014

Personal story: Change in environment is hard!
Nagpur, India: 45° C
New Jersey, USA: -10° C

Mobile app for a single platform
iPhone app

Challenge: Porting apps across multiple mobile platforms
Platforms: Android, BlackBerry 10, Windows Phone
Apps: iPhone app, Android app, BlackBerry app, Windows Phone app

Porting assistance
Porting to Windows Phone:
– Developer guides for porting
– Discussion forums on porting

Challenges in porting apps
Different SDKs for app development
Different programming languages
Different development environments
Different debugging aids
Every mobile platform exposes its own programming API
Platform | Language | Development Tools
Android | Java | Eclipse
iOS | Objective-C | Xcode
Windows Phone | C# | Visual Studio

Using API documentation to write app (iPhone app)
iOS class | iOS method name | Description
CGGeometry | CGRect CGRectMake(CGFloat x, y, width, height) | Returns a rectangle with the specified coordinate and size values.
CGGeometry | bool CGRectContainsPoint(CGRect rect, CGPoint point) | Returns whether a rectangle contains a specified point.

Using API documentation to write app (Android app, Android phone)
Android class | Android method name | Description
android.graphics | void drawRect(Rect r, Paint paint) | Draws the specified Rect using the specified Paint.
android.graphics | bool contains(int x, int y) | Returns true if (x,y) is inside the rectangle.

Can we do better than searching API documentation for each new platform?

APIs often have similar functionality
Android class name | Android method name
android.graphics | void drawRect(Rect r, Paint paint)
android.graphics | bool contains(int x, int y)
iOS class name | iOS method name
CGGeometry | CGRect CGRectMake(CGFloat x, y, width, height)
CGGeometry | bool CGRectContainsPoint(CGRect rect, CGPoint point)

API mapping databases
API mapping databases map methods in a source API to methods in a target API
iOS Method | Android Method
CGGeometry.CGRectMake() | android.graphics.drawRect()
CGGeometry.CGRectContainsPoint() | android.graphics.contains()
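As an illustration of the idea on this slide, a mapping database can be sketched as a lookup table from source-API methods to target-API methods. This is a minimal sketch, not the tool's actual storage format: the names `API_MAPPINGS`, `add_mapping`, and `lookup` are hypothetical, and the two entries are the ones shown in the slide's table.

```python
from collections import defaultdict

# Hypothetical in-memory mapping database (an assumption for illustration).
# Keys are source (iOS) methods, values are lists of target (Android) methods.
API_MAPPINGS = defaultdict(list)


def add_mapping(source_method, target_method):
    """Record that source_method corresponds to target_method."""
    if target_method not in API_MAPPINGS[source_method]:
        API_MAPPINGS[source_method].append(target_method)


def lookup(source_method):
    """Return the known target methods for a source method (empty if none)."""
    return list(API_MAPPINGS.get(source_method, []))


# Entries taken from the slide's example table.
add_mapping("CGGeometry.CGRectMake()", "android.graphics.drawRect()")
add_mapping("CGGeometry.CGRectContainsPoint()", "android.graphics.contains()")
```

A developer porting an iPhone app would call `lookup("CGGeometry.CGRectMake()")` to get candidate Android methods instead of searching the Android documentation by hand.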

Platform APIs ~ Natural languages
Source API ~ Unknown source language
Target API ~ Unknown target language

English language text, Spanish language text -> NLP Toolkit -> Word mappings
English word | Spanish word
north | norte
exit | salida

Mappings between English and Spanish words
English words: enlargement, society, state, control, importance
Spanish words: amplificacion, estado, sociedad, importancia, control

iOS API methods "text", Android API methods "text" -> NLP Toolkit -> API method mappings
iPhone API method | Android API method
CGRectMake | drawRect
CGRectContainsPoint | contains

iOS and Android API methods' mappings
iOS methods: CGRectGetHeight, CGRectGetWidth, CGRectMake, CGRectContainsPoint, CGContextFillRect
Android methods: height, drawRect, width, setStyle, contains

API mapping tools
windowsphone.interoperabilitybridges.com/porting
API mappings from Android, iPhone to Windows Phone

Creating API mapping databases
Mapping databases are populated manually by domain experts
Painstaking, error-prone and expensive
– Hard to evolve API mapping databases as the corresponding APIs evolve

Our contribution
We propose to automatically create API mapping databases
Prototyped in a tool called DDR (Data-Driven Rosetta)
– Creates mappings between iOS API and Android API
Leverages NLP approach to identify likely API mappings

Workflow of DDR
Source Apps -> Source Program Path Extraction -> Source Program Paths: GetWidth() GetHeight() RectMake() ………
Target Apps -> Target Program Path Extraction -> Target Program Paths: height() width() setStyle() drawRect() ………
Source Program Paths, Target Program Paths -> NLP Inference Engine -> Output Mappings
Source method | Target method | PR
CGRectMake() | drawRect() | 0.60
GetWidth() | width() | 0.45

Program path extraction
Mobile app binary -> Disassembler -> Intermediate code representation -> Control flow graph constructor -> Control flow graph -> Program path extractor -> Program paths
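The last stage of this pipeline, turning a control flow graph into program paths, can be sketched as a depth-first enumeration of acyclic paths. This is only a sketch under stated assumptions: the graph encoding (node name mapped to a pair of API calls and successor nodes), the function name `extract_paths`, and the `max_paths` cap are all hypothetical, and the slides do not say how DDR handles loops or path explosion.

```python
def extract_paths(cfg, entry, max_paths=100):
    """Enumerate acyclic paths through a control flow graph.

    cfg maps a node name to (api_calls, successors). Each returned path is
    the sequence of API calls seen along one walk from the entry node.
    A walk ends at a node with no unvisited successors, so back edges
    (loops) are followed at most once per path.
    """
    paths = []

    def walk(node, visited, calls):
        if len(paths) >= max_paths:  # hypothetical cap against path explosion
            return
        api_calls, successors = cfg[node]
        calls = calls + list(api_calls)
        next_nodes = [s for s in successors if s not in visited]
        if not next_nodes:
            paths.append(calls)
            return
        for succ in next_nodes:
            walk(succ, visited | {succ}, calls)

    walk(entry, {entry}, [])
    return paths


# Toy graph with one branch; method names are taken from the slides.
cfg = {
    "entry": (["CGRectMake"], ["then", "else"]),
    "then": (["CGRectGetWidth"], ["exit"]),
    "else": (["CGRectGetHeight"], ["exit"]),
    "exit": (["CGRectContainsPoint"], []),
}
paths = extract_paths(cfg, "entry")
```

Each path is a sequence of API method names, which plays the role of a sentence of "text" for the NLP inference engine.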

NLP inference engine
Matching Canonical Correlation Analysis (MCCA) [ACL '08*]
1. Define a generative model
2. Inference on the model is done via the Expectation-Maximization (EM) algorithm
* Learning Bilingual Lexicons from Monolingual Corpora, Haghighi et al., ACL '08

Generative model
Source feature extraction -> Source word features
Target feature extraction -> Target word features
Source word features, Target word features, Seed Mappings -> Generative Model

Generative model
Features computed from individual languages:
1. Frequency of words
2. Substring properties
3. Context counts
Features form the observed data explained via a generative process
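The three feature families listed above can be sketched for API methods by treating each program path as a sentence and each method name as a word. This is a hedged illustration, not DDR's feature extractor: the function name `method_features`, the character n-gram length, and the context window size are assumptions chosen for the example.

```python
from collections import Counter, defaultdict


def method_features(paths, ngram=3, window=1):
    """Compute per-method features from program paths.

    For each method name this returns its frequency, the set of lowercase
    character n-grams in its name (a stand-in for "substring properties"),
    and counts of the methods that appear within `window` positions of it.
    """
    freq = Counter()
    context = defaultdict(Counter)
    for path in paths:
        for i, method in enumerate(path):
            freq[method] += 1
            lo, hi = max(0, i - window), min(len(path), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    context[method][path[j]] += 1

    features = {}
    for method in freq:
        name = method.lower()
        substrings = {name[k:k + ngram] for k in range(len(name) - ngram + 1)}
        features[method] = {
            "frequency": freq[method],
            "substrings": substrings,
            "context": dict(context[method]),
        }
    return features


# Two toy Android program paths using method names from the slides.
feats = method_features([["height", "width", "drawRect"], ["width", "drawRect"]])
```

Features are computed separately for the source and the target API, so the two sides never need to share a vocabulary.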

Generative model
Relating a pair of mapped methods: CGRectMake (source method features) and drawRect (target method features)
Common, hidden concept behind the generation processes

Inference algorithm
E-step: Find the maximum weighted (partial) bipartite matching
M-step: Find the best parameters of the model by performing canonical correlation analysis (CCA)
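The E-step can be illustrated on a tiny example. The sketch below finds a maximum-weight partial bipartite matching by exhaustive search with memoization, which is exact but exponential in the number of target methods, so it is only suitable for toy inputs. The function name, the rule that non-positive edges stay unmatched, and the example weights are assumptions for illustration; the M-step (CCA) is not shown.

```python
from functools import lru_cache


def max_weight_partial_matching(weights):
    """Return (total_weight, pairs) for a maximum-weight partial matching.

    weights[i][j] is the score of matching source i to target j. Each source
    and each target is used at most once, and any source may stay unmatched.
    Edges with non-positive weight are never selected.
    """
    n_src = len(weights)
    n_tgt = len(weights[0]) if weights else 0

    @lru_cache(maxsize=None)
    def best(i, used):
        # `used` is a bitmask of target indices already matched.
        if i == n_src:
            return 0.0, ()
        best_total, best_pairs = best(i + 1, used)  # leave source i unmatched
        for j in range(n_tgt):
            if used & (1 << j) or weights[i][j] <= 0:
                continue
            total, pairs = best(i + 1, used | (1 << j))
            total += weights[i][j]
            if total > best_total:
                best_total, best_pairs = total, ((i, j),) + pairs
        return best_total, best_pairs

    total, pairs = best(0, 0)
    return total, list(pairs)


# Hypothetical edge weights between three iOS and two Android methods.
sources = ["CGRectMake", "CGRectGetWidth", "CGRectGetHeight"]
targets = ["drawRect", "width"]
weights = [[0.60, 0.10], [0.05, 0.45], [0.05, 0.40]]
total, pairs = max_weight_partial_matching(weights)
named = [(sources[i], targets[j]) for i, j in pairs]
```

Here `CGRectGetHeight` stays unmatched because both targets are taken by higher-weight edges, which is what "partial" matching allows. For realistic API sizes a polynomial-time assignment algorithm would be needed instead of this search.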

Our modifications to inference algorithm
String similarity function: method names instead of method signatures
Output: a list of top 10 mappings sorted in decreasing order of edge weights
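Both modifications can be sketched briefly. The slides do not name the string similarity function, so the sketch below substitutes Python's `difflib.SequenceMatcher` ratio over lowercase method names as a stand-in; `name_similarity`, `top_mappings`, and the example edge weights are hypothetical.

```python
from difflib import SequenceMatcher


def name_similarity(source_name, target_name):
    """Similarity in [0, 1] computed on method names only, not signatures."""
    return SequenceMatcher(None, source_name.lower(), target_name.lower()).ratio()


def top_mappings(edge_weights, k=10):
    """Return the k highest-weight mappings, sorted by decreasing weight.

    edge_weights maps (source_method, target_method) to an edge weight.
    """
    ranked = sorted(edge_weights.items(), key=lambda item: item[1], reverse=True)
    return [(src, tgt, weight) for (src, tgt), weight in ranked[:k]]


# Hypothetical edge weights; the first two values echo the workflow slide.
edges = {
    ("CGRectMake", "drawRect"): 0.60,
    ("GetWidth", "width"): 0.45,
    ("GetWidth", "height"): 0.10,
}
ranking = top_mappings(edges, k=2)
```

Comparing names rather than full signatures avoids penalizing pairs such as `GetWidth` and `width`, whose parameter and return types differ across languages.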

Implementation
Collected 50 Android apps and 50 iOS apps
3,414 unique iOS API methods
2,229 unique Android API methods
Evaluation in progress!

Conclusion
It is becoming increasingly important to port apps to a variety of platforms
Key challenge: Different platforms use different programming APIs
API mapping databases help, but they are created manually by domain experts
We presented a methodology to automate the creation of API mapping databases