Bibliometric evidence for empirical trade-offs in national funding strategies. Duane Shelton and Loet Leydesdorff. ISSI 2011, Durban.
Outline: Modeling of Input-Output Relations; Best Models from Correlations and Regression; Trade-offs in Allocation of R&D Investments; Validation by Forecasts from Extrapolations, Regressions, and Individual Country Models; Conclusions.
Some prior work. Leydesdorff: a series starting in 1990 with regressions of papers on GERD; most recently, a 2009 publication with Wagner on which GERD components best encourage papers. Shelton: started in 2006 by modeling paper share as a function of GERD share to account for the US decline; recently, a 2010 presentation with Foland using GERD components to account for the European Paradox.
Output dependent variables (DVs). Papers and paper share: Science Citation Index, Scopus. Patents and patent share: Triadic, USPTO, PCT. The full paper covers all; here we will focus on those in red.
Input variables (IVs) from OECD. Overall GERD (Gross Domestic Expenditure on R&D). GERD source components: Government, Industry, Abroad (funding from abroad), Other. GERD spending components: HERD (higher education sector), BERD (business sector), Non-Profit (other than universities), GOVERD (government labs). Number of researchers.
Shares provide the best national comparisons. Some indicators are nearly zero-sum: countries compete for a nearly fixed number of slots for paper publications and patent grants. (Paper submissions and patent applications are unbounded.) The slots do rise slowly with time, and this complicates national comparisons. Thus, in analyzing the relative positions of nations, their share of most outputs is a more relevant indicator. Modeling of the inputs that drive these output shares is also best done in shares. Of course, once a model is built for shares, it can easily be used to calculate absolutes.
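The conversion between absolutes and shares mentioned on this slide can be sketched as below. This is a minimal illustration only: the country labels and counts are made up, and the function names `to_shares` and `to_absolutes` are hypothetical, not taken from the paper.

```python
def to_shares(absolutes):
    """Convert absolute counts per country into fractional shares of the total."""
    total = sum(absolutes.values())
    if total <= 0:
        raise ValueError("total must be positive")
    return {country: value / total for country, value in absolutes.items()}


def to_absolutes(shares, total):
    """Recover absolute counts from shares once a total is known or forecast."""
    return {country: share * total for country, share in shares.items()}


# Hypothetical paper counts for three made-up countries (not real data).
papers = {"A": 300.0, "B": 150.0, "C": 50.0}
shares = to_shares(papers)
recovered = to_absolutes(shares, total=500.0)
print(shares)
print(recovered)
```

Because the shares sum to one by construction, a gain for one country necessarily shows up as a loss for the others, which is the nearly zero-sum behavior described above.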
The size of nations is a confounding factor. All inputs and outputs depend on the size of the country, which makes all country-wise correlations high and obscures which variables are most important. One might divide all variables by some measure of size, but stepwise multiple linear regression can also tease out which IVs are best for predicting the output DV: IVs are added one by one, in order of which most improves the model for the DV.
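The one-by-one selection of IVs described on this slide might look like the following forward-selection sketch. Several things here are assumptions rather than the authors' procedure: the data are synthetic, the stopping rule is a minimum gain in R² (statistics packages usually use F-tests or p-values instead), and the names `r_squared`, `forward_stepwise`, and `min_gain` are hypothetical.

```python
import numpy as np


def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on the columns of X plus a constant."""
    A = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - float(resid @ resid) / float(((y - y.mean()) ** 2).sum())


def forward_stepwise(X, y, names, min_gain=0.01):
    """Add IVs one at a time, each time picking the one that raises R^2 the most."""
    chosen, best_r2 = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        scores = {j: r_squared(X[:, chosen + [j]], y) for j in remaining}
        j_best = max(scores, key=scores.get)
        if scores[j_best] - best_r2 < min_gain:
            break  # assumed stopping rule: the extra IV adds too little
        chosen.append(j_best)
        remaining.remove(j_best)
        best_r2 = scores[j_best]
    return [names[j] for j in chosen], best_r2


# Synthetic data (not real): every variable scales with country size,
# so all pairwise correlations are high, but only one input drives the output.
rng = np.random.default_rng(0)
size = rng.lognormal(size=40)
gov = size * np.exp(0.3 * rng.standard_normal(40))
herd = size * np.exp(0.3 * rng.standard_normal(40))
gerd = size * np.exp(0.3 * rng.standard_normal(40))
papers = gov + 0.01 * rng.standard_normal(40)

X = np.column_stack([gerd, gov, herd])
selected, r2 = forward_stepwise(X, papers, ["GERD", "Government", "HERD"])
print(selected, round(r2, 4))
```

In this toy setup all three inputs correlate with the output through size, yet the selection keeps only the input that actually drives it, which is the point the slide makes about confounding.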
Figure: step-wise regression of 2007 SCI paper share (ps07) vs. three IVs (Government GERD share, HERD share, overall GERD share), showing the fit of the regression line.
Correlations: Papers vs. Inputs. Table columns: SCI, Scopus. Rows by input pair: Capital vs. Labor (GERD, Researchers); Funding Components (Industry, Government); Spending Components (HERD, BERD). Red indicates the strongest correlation of each pair; it will dominate a 2-IV model.
Regressions of SCI paper share in 2007. Table columns: IV1, IV2, Coeff1, Coeff2, Const, p1, p2, R². Models listed (IV1 + IV2): GERD + Researchers; GERD; Government + Industry; Government; HERD; Government + HERD. For example, the best single-IV model is: Papers07 = Governments
Figure: step-wise regression of 2007 triadic patent share (Patents07) vs. three IVs (Industry GERD share, BERD share, Government GERD share), showing the fit of the regression line. The fit is OK, but not as good as for the paper models.
Shelton, R. D. & Leydesdorff, L. (in preparation). Publish or Patent: Bibliometric evidence for empirical trade-offs in national funding strategies
Correlations: Patents vs. Inputs. Table columns: Triadic, USPTO. Rows by input pair: Capital vs. Labor (GERD, Researchers); Funding Components (Industry, Government); Spending Components (HERD, BERD). Red indicates the strongest correlation of each pair; it will dominate a 2-IV model.
Regressions for 2007 triadic patent share. Table columns: IV1, IV2, Coeff1, Coeff2, Const, p1, p2, R². Models listed (IV1 + IV2): GERD + Researchers; Industry + Government; Industry + BERD; Industry + NonProfit; Industry; BERD. For example, the best single-IV model is: Patents07 = Industrys
Regressions show a trade-off in allocations. To maximize papers, a country should maximize its government funding of R&D rather than its industry funding. To maximize patents, a country should do the opposite: maximize its industrial funding of R&D, which can be encouraged by government. Similarly, spending in the higher education sector seems to encourage papers, while spending in the business sector does more to encourage patents. Thus these allocations are simply a choice between longer- and shorter-term benefits of R&D. This is not surprising, but the regressions provide some quantitative confirmation of this logic.
Summary of models for paper share. Simple extrapolations of trends in output paper share $m_i$ provide a reality check for models based on input resource drivers. The Shelton Model, based on GERD share, works well for big countries; it accounts for the decline of the US and EU due to the rise of China's share of GERD $w_i$: $m_i = k_i w_i$. The Shelton-Leydesdorff Model, based on government share, accounts for the EU increase in efficiency in the 1990s and the long-term US decline: $m_i = k_i' w_i' + c'$. Adding a second IV, HERD spending share $w_i''$, works even better; this accounts for the EU passing the US: $m_i = k_i' w_i' + k_i'' w_i'' + c''$.
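The three model forms on this slide can be written as small functions, as in the sketch below. The numeric values are hypothetical and chosen only to show the arithmetic. Calibrating the coefficient from a single base year (`calibrate_k`) is an assumption made for illustration, not necessarily how the authors obtained their coefficients.

```python
def shelton(w, k):
    """Paper share from overall GERD share: m = k * w."""
    return k * w


def shelton_leydesdorff(w_gov, k_gov, c):
    """Paper share from government funding share: m = k' * w' + c'."""
    return k_gov * w_gov + c


def two_iv(w_gov, w_herd, k_gov, k_herd, c):
    """Paper share from government and HERD shares: m = k' * w' + k'' * w'' + c''."""
    return k_gov * w_gov + k_herd * w_herd + c


def calibrate_k(m_base, w_base):
    """Assumed calibration: the ratio of paper share to GERD share in a base year."""
    return m_base / w_base


# Hypothetical country: 30% paper share on 25% GERD share in the base year,
# with GERD share then falling to 20% as other countries invest more.
k = calibrate_k(m_base=0.30, w_base=0.25)
forecast = shelton(w=0.20, k=k)
print(round(k, 3), round(forecast, 3))
```

With a fixed coefficient, a falling input share translates directly into a falling output share, which is how the first model accounts for a decline driven by another country's rising GERD share.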
Validation of paper share models. Like any theory, models need to be tested to see how well they account for new phenomena. Scattergrams can show how well regression models fit a year's data, or perhaps a new data point, but they do not forecast the future so well. Once key IVs are identified by statistics, individual country models can be built and tested by “forecasting the past.” Simple extrapolation of output DVs serves as a reality check.
Extrapolation of SCI paper shares. This model forecasts that the PRC will not pass the US until about 2020, and the EU27 until after 2025.
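A trend extrapolation of this kind, and the year at which one country's share overtakes another's, can be sketched as follows. The slide does not state the functional form used, so the straight-line fit is an assumption, and the two series are invented numbers rather than SCI data. The function names are hypothetical.

```python
import numpy as np


def fit_trend(years, shares):
    """Least-squares line through (year, share); returns (slope, value at years[0])."""
    t = np.asarray(years, dtype=float) - years[0]
    slope, intercept = np.polyfit(t, np.asarray(shares, dtype=float), 1)
    return float(slope), float(intercept)


def crossing_year(base_year, riser, leader):
    """Year at which the riser's trend line meets the leader's, or None if it never does."""
    (s_r, i_r), (s_l, i_l) = riser, leader
    if s_r <= s_l:
        return None  # the riser is not gaining on the leader
    return base_year + (i_l - i_r) / (s_r - s_l)


# Invented series: one share rising by 1 point per year, one falling by 0.5.
years = list(range(2000, 2006))
rising = [0.05 + 0.010 * (y - 2000) for y in years]
declining = [0.30 - 0.005 * (y - 2000) for y in years]

cross = crossing_year(years[0], fit_trend(years, rising), fit_trend(years, declining))
print(round(cross, 1))
```

Fitting against years measured from the first observation keeps the least-squares problem well conditioned; the crossing year is then shifted back to the calendar scale.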
Extrapolation of paper share in the Scopus database. This can be compared to a recent similar forecast by the UK Royal Society.
Scattergram of paper share vs. government funding share
Same scattergram focused on smaller countries
Scattergram of patent share vs. industrial funding share
Same scattergram focused on smaller countries
Performance of Shelton Model in forecasting from 2005 to 2010. Based on forecasts of GERD and its share from 2005 data. Accuracy for the US and EU is not bad. The PRC is growing more slowly than forecast.
Performance of Shelton-Leydesdorff model: forecasting from 2005 to 2010. Uses a 5-year average of rates of increase in government funding (Gov). The EU and PRC fit well, but the US is worse than forecast because its rate of Gov increase has plummeted to near zero. (Individual models used.)
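A forecast driven by a 5-year average growth rate can be sketched as below. The slide does not say how the average was formed, so taking the arithmetic mean of year-over-year rates and compounding it forward is an assumption. The series is invented, and `average_growth_rate` and `project` are hypothetical names.

```python
def average_growth_rate(series, window=5):
    """Arithmetic mean of the last `window` year-over-year growth rates."""
    if len(series) < window + 1:
        raise ValueError("need at least window + 1 observations")
    recent = series[-(window + 1):]
    rates = [recent[i + 1] / recent[i] - 1.0 for i in range(window)]
    return sum(rates) / window


def project(last_value, rate, years_ahead):
    """Compound the last observed value forward at a constant annual rate."""
    return last_value * (1.0 + rate) ** years_ahead


# Invented government funding series growing 10% per year (not real data).
funding = [100.0 * 1.1 ** i for i in range(6)]
rate = average_growth_rate(funding)
forecast = project(funding[-1], rate, years_ahead=5)
print(round(rate, 3), round(forecast, 1))
```

A forecast of this kind assumes the recent rate persists, so it will overshoot whenever the growth rate later drops toward zero, as the slide reports for the US.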
Conclusions. Regressions show that investment choices are complementary: some are best for papers and some for patents. Models based on these resource inputs have some success in forecasting. But a take-away for the professors in the audience: just using HERD share to predict paper share is surprisingly accurate. Thus, if nations want to excel in papers, they should just give money to professors!
ps07 = HERD (p = 0.000, R² = 98.6%). Paper share ≈ HERD share!
Performance of HERD as predictor of paper share. Forget statistics: simply predicting paper share with HERD share works well for the US and EU. It also predicts that the EU should lead the US.
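Checking such a "no statistics" predictor amounts to comparing each country's paper share with its HERD share directly. A minimal sketch with invented shares for two made-up countries follows; the function names are hypothetical and nothing here reproduces the slide's data.

```python
def identity_errors(herd_share, paper_share):
    """Absolute error per country when paper share is predicted to equal HERD share."""
    return {c: abs(paper_share[c] - herd_share[c]) for c in paper_share}


def mean_absolute_error(errors):
    """Average of the per-country absolute errors."""
    return sum(errors.values()) / len(errors)


# Invented shares for two made-up countries (not real data).
herd = {"A": 0.30, "B": 0.20}
papers = {"A": 0.28, "B": 0.23}

errors = identity_errors(herd, papers)
mae = mean_absolute_error(errors)
predicted_leader = max(herd, key=herd.get)
print(round(mae, 3), predicted_leader)
```

The predicted leader is simply the country with the larger HERD share, which is the kind of ranking statement the slide makes about the EU and the US.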