Obstacles and opportunities with using visual and domain-specific languages in scientific programming
Michael Jones, Christopher Scaffidi

Presentation transcript:

Obstacles and opportunities with using visual and domain-specific languages in scientific programming Michael Jones, Christopher Scaffidi School of Electrical Engineering and Computer Science Oregon State University

2 Scientists as end-user programmers
Professional objectives
– Discovering knowledge
– Disseminating knowledge
– Funding knowledge creation
Reasons for programming
– Data acquisition
– Data analysis
– Data visualization
Background  Study  Results  Conclusion
Photo credit: Microsoft Office (2010)

3 Results from prior empirical studies
Testing
– Often reliant on visualization, rarely systematic
Reuse
– Usually white-box rather than black-box
Languages
– Usually procedural and compiled
– Rarely change during a project

4 Few mentions of visual or domain-specific languages (DSLs)
Three studies mentioned…
– Matlab was used to pre-process data
– Hypercard, LabVIEW, spreadsheets were used
More widespread use might be expected!
– Matlab: “enjoys wide usage among scientists”
– LabVIEW: over 1 million users worldwide
– Excel: 500 million users… maybe some scientists?

5 We need more details
Key research questions:
– To what extent are DSLs used by scientists?
– What concerns do scientists have about using DSLs and other languages?

6 Interviews of scientists
Artifact walkthrough
– Interviewee identified a recent programming project
– We didn’t mention visual/domain-specific languages!
– The selected project served as an anchor for questions
Interviewed 10 scientists
– One later requested that his data not be used
– Recruited by emails to physical/natural scientists

7 Participant and project overview
Programming experience (yrs) | Job title | Field | Project age (years)
15 | Professor | Biology | 6
5 | Scientist | |
15 | Scientist | Bioinformatics | 6
1 | Graduate student | Bioinformatics | 3
15 | Scientist | Meteorology | 18
4 | Professor | Chemistry | 2
1 | Professor | Chemistry | 11
25 | Scientist | Physics | 2
15 | Scientist | Physics | 6

8 Participants and their projects
Most participants were end-user programmers
– 6 produced code for own use
– 2 produced code for others’ use
Each program performed scientific modeling
– Mining historical data to generate models
  E.g., mining weather + disease data to create models
– Running models to make predictions + visualizations
  E.g., simulating chemical reactions

9 DSLs were indeed in widespread use
Language / tool | # projects using it
Excel | 2
Matlab | 2
Minitab | 1
Sigma Plot | 1
ArcGIS | 1
Flash | 1
Silverlight | 1
MayaVi | 1
Only 2 of 8 projects had exclusively relied on non-DSLs (C, C++, Perl)

10 Most projects demonstrated a language transition
Addition or replacement of languages
– 5 of 8 projects added one or more DSL
– 4 of 8 projects added a non-DSL
Lots of “language thrashing”
– Biologists: Excel, Matlab, C, Perl, Minitab, Sigma Plot
– Chemists: Fortran, C, Perl, Java, PHP, ArcGIS
What concerns motivated language choices?...

11 Four concerns that drove language choices

12 Concern #1: Need for specific functionality
Input data: physical measurements, historical data, hypothetical values
1. Pre-process by transforming (often DSL)
2. Instantiate, run models (not DSL)
Models’ output/predictions
3. Post-process: transform or visualize (often DSL)
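The three-stage flow on this slide (pre-process, run models, post-process) can be sketched roughly as below. This is only an illustration: the function names, the toy linear-trend "model", and the sample temperature readings are all assumptions made for the example and are not taken from any participant's project. The slide reports that stages 1 and 3 were often done in a DSL and stage 2 in a traditional language; here all three are written in Python purely to keep the sketch self-contained.

```python
from statistics import mean


def preprocess(raw_readings):
    """Stage 1 (hypothetical): drop missing readings, convert Fahrenheit to Celsius."""
    return [(f - 32.0) * 5.0 / 9.0 for f in raw_readings if f is not None]


def run_model(temps_c, steps):
    """Stage 2 (toy model, an assumption): extend the mean step-to-step change.

    Needs at least two cleaned readings so that a trend can be computed.
    """
    deltas = [b - a for a, b in zip(temps_c, temps_c[1:])]
    trend = mean(deltas)
    return [temps_c[-1] + trend * (i + 1) for i in range(steps)]


def postprocess(predictions):
    """Stage 3 (hypothetical): render the predictions as a text bar chart."""
    lines = []
    for i, p in enumerate(predictions):
        bar = "#" * max(0, round(p))
        lines.append(f"t+{i + 1}: {bar} {p:.1f}")
    return "\n".join(lines)


if __name__ == "__main__":
    raw = [32.0, None, 50.0, 68.0]      # made-up measurements with one gap
    cleaned = preprocess(raw)           # stage 1
    predicted = run_model(cleaned, 2)   # stage 2
    print(postprocess(predicted))       # stage 3
```

Keeping each stage behind its own function boundary mirrors the hand-offs on the slide: each boundary is a point where, in the projects studied, data could cross from one language or tool to another.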

13 Concern #2: Complexity
Complaints about complexity
– Traditional languages: poor fit, unintuitive
– Regardless of language: algorithmic complexity
– When mixing languages: hard to trace data dependencies and relationships
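To make the last complaint concrete, here is a minimal sketch of one way data dependencies across mixed-language steps could be recorded and traced. It is not something the study participants used: the log format, the function names, and the example program and file names are all hypothetical, and the sketch assumes log entries are appended in the order the steps actually ran.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Short content hash, so a changed dataset is distinguishable from an old one."""
    return hashlib.sha256(data).hexdigest()[:12]


def record_step(log, program, inputs, outputs):
    """Append one entry: which program turned which inputs into which outputs.

    `inputs` and `outputs` map a dataset name to its bytes (hypothetical format).
    """
    log.append({
        "program": program,
        "inputs": {name: fingerprint(data) for name, data in inputs.items()},
        "outputs": {name: fingerprint(data) for name, data in outputs.items()},
    })


def upstream(log, dataset):
    """Return the programs that contributed to `dataset`, most recent first."""
    chain = []
    wanted = {dataset}
    for entry in reversed(log):            # walk backwards through the run order
        if wanted & set(entry["outputs"]):
            chain.append(entry["program"])
            wanted |= set(entry["inputs"])  # now look for producers of its inputs
    return chain


if __name__ == "__main__":
    log = []
    record_step(log, "spreadsheet cleanup", {"raw.csv": b"1,2"}, {"clean.csv": b"1,2"})
    record_step(log, "model.c", {"clean.csv": b"1,2"}, {"pred.dat": b"3"})
    print(upstream(log, "pred.dat"))
```

The log is deliberately language-neutral: each step only needs to report names and bytes, so a spreadsheet export, a C program, and a Perl script could in principle all append to the same record.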

14 Concern #3: Cost and time
Tendency to grab what was familiar
– Little resistance to licensing costs
– Often reliant on what colleagues recommended (including grad students)
Led to some unpleasant surprises
– Often reliant on outside consultants acting as advisors or developers
Big trouble when grants run out!

15 Concern #4: Performance
Traditional languages were preferred over DSLs
– One project team used parallel programming; one team wanted it
– Challenges: shared servers, time for revising code, insufficient funds

16 Two non-language concerns
Lack of version control
– Diving into coding
– No planning ahead for managing versions
– No planning ahead for team coordination
Lack of documentation
– No proactive recognition of the need
– Insufficient secondary notation support in DSLs
– Heavy reliance on naming conventions

17 Key results
DSLs are…
– Alive and well among scientists
– Often used for data transformation and visualization
Opportunities to help scientists with…
– Selecting appropriate languages
– Tracing data flow among multiple programs
– Improving performance
– Managing different versions

18 Questions?
Princeton pulsar lab, where I worked with astrophysicists ( )
Photo credit: colleague Ingrid Stairs (2006)