Investment in High Performance Computing A Predictor of Research Competitiveness in U.S. Academic Institutions Amy Apon, Ph.D. Director, Arkansas High.

Presentation transcript:

Investment in High Performance Computing: A Predictor of Research Competitiveness in U.S. Academic Institutions
Amy Apon, Ph.D., Director, Arkansas High Performance Computing Center; Professor, CSCE, University of Arkansas
Stan Ahalt, Ph.D., Director, RENCI; Professor, Computer Science, UNC-CH
Work supported by the NSF through Grant # University of Arkansas and RENCI/UNC-CH Really

Collaborators
Amy Apon (University of Arkansas)
Stanley Ahalt (RENCI, University of North Carolina)
Vijay Dantuluri (RENCI, University of North Carolina)
Constantin Gurdgiev (IBM)
Moez Limayem (University of Arkansas)
Linh Ngo (University of Arkansas)
Michael Stealey (RENCI, University of North Carolina)

Research Study
Background and motivation
Research hypothesis
Data acquisition
Analysis and results
Discussion

Research and Computing Cyberinfrastructure Ecosystem Foundation Computational and Data Driven Science Nanotechnology High Energy Physics Health Sciences Global Climate Modeling …

[Figure. Credit: NSF OCI]

Conversation with a Chancellor
HPC guys: “This is a great investment! We think we can run the HPC center with only $1M/year in hardware and $1M/year in staffing.”
Chancellor: “Which 20 faculty do you want me to fire?”

HPC: High rePeating Cost
Computer equipment is usually treated as a capital expense, with costs for substantial clusters in the range of $1M+
Warranties on these generally last 3 years, or 5 years at most, after which repairs become prohibitively expensive
Even without that, the pace of technology advances requires a refresh every 3-5 years
Staffing is a long-term repeating cost!

HPC: High rePeating Cost
[Figure: Ranks of Top 500 computers and appearances in succeeding lists]

Some Observations

What is the ROI? Can I convince my VPR that the funds invested in HPC add value to the institution and create opportunity? What if this is not true?

Hypothesis
Investment in high performance computing, as measured by entries on the Top 500 list, is a predictive factor in the research competitiveness of U.S. academic institutions.
We study Carnegie Foundation institutions with “Very High” and “High” research activity (about 200 institutions).

Data Acquisition
Independent variables
Top 500 List count and rank of entries
o Mapped from “supercomputer site” to “institution”
o We note that entries are voluntary: the absence of an entry does not mean that an institution does not have HPC
Dependent variables
NSF and other federal funding summary and award information
Publication counts
U.S. News and World Report rankings
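The mapping from “supercomputer site” to “institution” and the per-institution entry count can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the site names, institution names, and ranks are all hypothetical, and the choice to skip unmapped sites is an assumption.

```python
from collections import Counter

# Hypothetical mapping from Top 500 "supercomputer site" names to institutions.
SITE_TO_INSTITUTION = {
    "Example Supercomputing Center": "Example State University",
    "Example State University": "Example State University",
    "Sample Institute HPC Lab": "Sample Institute of Technology",
}

def summarize_entries(list_entries, site_to_institution):
    """Return (entry counts, best rank) per institution for one Top 500 list.

    list_entries is an iterable of (site_name, rank) pairs. Sites with no
    mapping (for example, non-academic sites) are skipped.
    """
    counts = Counter()
    best_rank = {}
    for site, rank in list_entries:
        institution = site_to_institution.get(site)
        if institution is None:
            continue
        counts[institution] += 1
        # A lower rank number means a higher position on the list.
        best_rank[institution] = min(rank, best_rank.get(institution, rank))
    return counts, best_rank

# Hypothetical entries from a single list.
entries = [
    ("Example Supercomputing Center", 42),
    ("Example State University", 310),
    ("Sample Institute HPC Lab", 128),
    ("Unmapped Vendor Site", 7),
]
counts, best_rank = summarize_entries(entries, SITE_TO_INSTITUTION)
print(dict(counts))
print(best_rank)
```

Because entries are voluntary, a count of zero from a table like this only means "no entry found", not "no HPC at the institution".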

Data from the Top 500 List An historical record without comparison of supercomputers

Data from the Top 500 List About 100 U.S. institutions have appeared on a Top 500 List

Analysis Examples
Correlation analysis
Regression analysis

Simple Example of ROI
Evidence based on 2006 NSF funding
Average NSF funding: $30,354,000
Average NSF funding: $7,781,
of Top NSF-funded Universities with HPC
98 of Top NSF-funded Universities without HPC
With HPC | Without HPC

More evidence, NSF funding Longer Example of ROI

Correlation Analysis
[Table: pairwise correlations. Columns: Counts, NSF, Pubs, All Fed, DOE, DOD, NIH, USNews. Rows: dRankSum, Counts, NSF, Pubs, All Fed, DOE, DOD, NIH.]
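A pairwise correlation table of this kind can be computed as sketched below. The data are invented for illustration and are not the study's values; only three of the variables are shown, and the function name is hypothetical.

```python
import numpy as np

def correlation_matrix(columns):
    """Return (names, matrix) of pairwise Pearson correlations.

    columns maps a variable name to a list of per-institution values.
    """
    names = list(columns)
    data = np.array([columns[name] for name in names], dtype=float)
    # np.corrcoef treats each row as one variable by default.
    return names, np.corrcoef(data)

# Hypothetical per-institution values, invented for illustration only.
example = {
    "Counts": [0, 1, 3, 5, 8],                # Top 500 list entries
    "NSF": [5.0, 9.0, 20.0, 31.0, 44.0],      # NSF funding, $M per year
    "Pubs": [400, 650, 900, 1500, 2100],      # publication counts
}
names, corr = correlation_matrix(example)
for name, row in zip(names, corr):
    print(name.ljust(8), " ".join(f"{value:6.3f}" for value in row))
```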

Regression Analysis
Two Stage Least Squares (2SLS) regression is used to analyze the research-related returns to investment in HPC
We model two relationships
Model 1: NSF Funding as a function of contemporaneous and lagged Appearance on the Top 500 List Count (APP) and Publication Count (PuC), and
Model 2: Publication Count (PuC) as a function of contemporaneous and lagged Appearance on the Top 500 List Count (APP) and NSF Funding
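One way to write the two models down is shown here. The slides do not give the functional form or the number of lags, so the linear specification, the single one-year lag, the contemporaneous-only treatment of the second regressor, and all coefficient and error symbols are assumptions made for illustration; i indexes institutions and t indexes years.

```latex
\begin{align}
\text{Model 1:}\quad \mathrm{NSF}_{i,t} &= \alpha_0 + \alpha_1\,\mathrm{APP}_{i,t} + \alpha_2\,\mathrm{APP}_{i,t-1} + \alpha_3\,\mathrm{PuC}_{i,t} + \varepsilon_{i,t} \\
\text{Model 2:}\quad \mathrm{PuC}_{i,t} &= \beta_0 + \beta_1\,\mathrm{APP}_{i,t} + \beta_2\,\mathrm{APP}_{i,t-1} + \beta_3\,\mathrm{NSF}_{i,t} + u_{i,t}
\end{align}
```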

Endogeneity
Funding allows an institution to acquire resources
Resources are used to perform research, which leads to more funding
Resources are also cited in the argument for research funding
NSF funding begets HPC resources, which beget NSF funding …

Regression Analysis
Initial tests revealed significant problems with endogeneity of Publication Counts (PuC) and NSF Funding.
To correct for this, we used a 2SLS estimation method, with the number of undergraduate Student Enrollments (SN) acting as an instrumental variable in the first-stage regression for PuC (Model 1) and NSF (Model 2).
In both cases, SN was found to be a suitable instrument for the endogenous regressor.
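The 2SLS procedure described above can be sketched with plain NumPy. This is an illustration on simulated data, not the authors' code or data: the data-generating coefficients, the sample size, and the function names are all assumptions. The simulation builds in an unobserved shock that drives both PuC and NSF, so ordinary least squares overstates the PuC effect, while instrumenting PuC with SN recovers a value close to the simulated one.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares coefficients for y ~ X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def two_stage_least_squares(y, exog, endog, instrument):
    """2SLS with one exogenous regressor, one endogenous regressor, one instrument.

    Returns coefficients ordered as [constant, exog, endog].
    """
    const = np.ones(len(y))
    # Stage 1: regress the endogenous regressor on the instrument and exogenous terms.
    Z = np.column_stack([const, exog, instrument])
    endog_hat = Z @ ols(Z, endog)
    # Stage 2: replace the endogenous regressor with its stage-1 fitted values.
    X_hat = np.column_stack([const, exog, endog_hat])
    return ols(X_hat, y)

# Simulated data (all values invented): SN instruments PuC in a Model 1 style regression.
rng = np.random.default_rng(0)
n = 5000
sn = rng.normal(size=n)                         # standardized enrollment (instrument)
app = rng.poisson(1.0, size=n).astype(float)    # Top 500 appearances (exogenous)
shock = rng.normal(size=n)                      # unobserved factor causing endogeneity
puc = 1.0 * sn + 0.5 * app + shock + rng.normal(size=n)
nsf = 2.0 + 1.5 * app + 0.5 * puc + 2.0 * shock + rng.normal(size=n)

beta_ols = ols(np.column_stack([np.ones(n), app, puc]), nsf)
beta_2sls = two_stage_least_squares(nsf, app, puc, sn)
print("OLS  [const, APP, PuC]:", np.round(beta_ols, 3))
print("2SLS [const, APP, PuC]:", np.round(beta_2sls, 3))
```

The second-stage coefficients from this manual approach are the 2SLS point estimates, but the standard errors from a naive second-stage OLS would be wrong; a dedicated IV routine is needed for valid inference.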

First Result
A single HPC investment yields statistically significant immediate returns in terms of new NSF funding
An entry on a list results in an increase of yearly NSF funding of $2.4M
o Confidence level 95%
o Confidence interval $769K-$4M
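As a quick consistency check on the reported figures, the snippet below works backward from the interval to an implied standard error. It assumes a symmetric, normal-based 95% interval (critical value 1.96) and treats the rounded endpoints on the slide as exact; neither assumption is stated in the source.

```python
# Reported on the slide, in millions of dollars per year.
estimate = 2.4
lower, upper = 0.769, 4.0

Z_95 = 1.96  # assumed normal critical value for a two-sided 95% interval

midpoint = (lower + upper) / 2
implied_se = (upper - lower) / (2 * Z_95)
z_score = estimate / implied_se

print(f"interval midpoint:      {midpoint:.4f}")
print(f"implied standard error: {implied_se:.4f}")
print(f"estimate / SE:          {z_score:.2f}")
```

The midpoint sits close to the reported $2.4M, which is what a symmetric interval around the point estimate would give after rounding.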

Second Result
A single HPC investment yields statistically significant immediate returns in terms of increased academic publications
An entry results in an increase in yearly publications of 60
o Confidence level 95%
o Confidence interval

Third Result
Analysis of the rank of the system shows that rank has a positive impact on competitiveness, but with reduced confidence.
We have not studied returns to other institutions of investments by resource providers, or returns to overall U.S. competitiveness.

Fourth Result
HPC investments suffer from fast depreciation over a 2-year horizon
Consistent investments in HPC, even at modest levels, are strongly correlated with research competitiveness.
Inconsistent investments have a significantly less positive ROI

Discussion
More study is needed to precisely determine the rate of depreciation of HPC investments
The publication counts include all publications, not just those related to HPC
More study is needed regarding how use of national systems, such as TeraGrid, may impact research competitiveness

Data from TeraGrid Usage

Questions?