Kevin W. Boyack
Sandia National Laboratories
Sackler Colloquium on Mapping Knowledge Domains
May 11, 2003
An indicator-based characterization of PNAS
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under Contract DE-AC04-94AL85000.
2 Outline
Data sources
Impact vs. funding
Map of highest impact work
Topics and Import-Export
Bibliographic coupling and external references
3 Indicators
Lots of work by NSF, OECD
–Many ways of counting
–Often slanted toward economic measures
–Not often directly correlating inputs and outputs
–Rarely taking any firm stand
Some studies relating funding and impact
–Most recent from Britain or Australia (biomed)
Few large-scale import-export studies
4 Data Sources
Used ISI/SCIE data as base set
–Used only articles, letters, notes, reviews (ALNR)
–Did not include commentaries, editorials, corrections
Medline for MeSH terms
NIA grants (dollar amounts, durations, etc.)
PNAS full text (not used)
PNAS tables of contents (topics)
5 Data Merges
NIA GRANTS: YEAR, PI, INST, FUNDING
–Amounts
–Durations
ISI/SCIE: VOL, PAGE, YEAR, AUTHOR, INST, REFERENCES, COUNTS
MEDLINE: VOL, PAGE, MeSH
–Funding type
–Descriptors
PNAS TOC: VOL, PAGE, TOPICS
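The field lists above suggest that the ISI/SCIE, Medline, and PNAS TOC records share VOL and PAGE. A minimal sketch of such a merge follows, assuming (volume, page) is the join key. The record layout, lowercase field names, and sample values are hypothetical, and the NIA grant join is left out because it relies on author, institution, and year instead.

```python
# Hypothetical records; field names loosely follow the slide, values are invented.
isi = [
    {"vol": 80, "page": 1, "year": 1983, "author": "SMITH J", "inst": "EXAMPLE UNIV", "cites": 12},
    {"vol": 80, "page": 9, "year": 1983, "author": "JONES A", "inst": "SAMPLE INST", "cites": 3},
]
medline = [{"vol": 80, "page": 1, "mesh": ["Support, U.S. Gov't, P.H.S."]}]
toc = [
    {"vol": 80, "page": 1, "topic": "Biochemistry"},
    {"vol": 80, "page": 9, "topic": "Genetics"},
]


def index_by_key(records):
    """Index records by the assumed (volume, page) join key."""
    return {(r["vol"], r["page"]): r for r in records}


def merge_sources(isi, medline, toc):
    """Attach MeSH terms and TOC topic to each ISI record (left join on vol/page)."""
    med_idx = index_by_key(medline)
    toc_idx = index_by_key(toc)
    merged = []
    for rec in isi:
        key = (rec["vol"], rec["page"])
        row = dict(rec)  # copy so the base set is not modified
        row["mesh"] = med_idx.get(key, {}).get("mesh", [])
        row["topic"] = toc_idx.get(key, {}).get("topic")
        merged.append(row)
    return merged


merged = merge_sources(isi, medline, toc)
```

Using the ISI/SCIE records as the left side keeps every paper in the base set even when no Medline or TOC record is found for it.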
6 Percentiles vs. Counts
This study uses percentiles exclusively rather than citation counts
–Percentiles enable cross-year comparisons
Only 30-40% of papers have more citations than the mean
Calculation of percentiles
–Papers ranked for each year by citation count
–Rankings converted into percentiles
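The two calculation steps can be sketched as follows. The slide does not say how ties are handled or which percentile convention is used. This sketch assumes a paper's percentile is the share of same-year papers with strictly fewer citations, and the function name and input layout are illustrative only.

```python
from collections import defaultdict


def citation_percentiles(papers):
    """papers: iterable of (paper_id, year, citations). Returns {paper_id: percentile}."""
    by_year = defaultdict(list)
    for pid, year, cites in papers:
        by_year[year].append((pid, cites))

    result = {}
    for group in by_year.values():
        counts = [c for _, c in group]
        n = len(counts)
        for pid, cites in group:
            # Assumed convention: percent of same-year papers with fewer citations.
            below = sum(1 for c in counts if c < cites)
            result[pid] = 100.0 * below / n
    return result


# Invented example: a skewed year where one paper pulls the mean up.
papers = [("a", 1983, 0), ("b", 1983, 2), ("c", 1983, 2), ("d", 1983, 50), ("e", 1990, 7)]
pct = citation_percentiles(papers)
```

In the invented 1983 group the mean is 13.5 citations and only one of four papers exceeds it, which is the kind of skew that makes raw counts hard to compare across years.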
7 Counts/Percentiles for 1983 Papers
8 Funding and Impact
Effect of funding type
Effect of funding amount
9 MeSH Support Types
Support, U.S. Gov't, P.H.S.
–NIH
Support, U.S. Gov't, Non-P.H.S.
–All other US agencies (NSF, DOE, DOD, etc.)
Support, Non-U.S. Gov't
–US non-government (academia, industry …)
–Foreign
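As a rough illustration, the three MeSH support tags could be turned into a per-paper funding label like this. The short group labels and the way combinations are joined are assumptions made for the sketch. They are not necessarily the categories used in the study.

```python
# Mapping from MeSH support tags (as listed on the slide) to short labels.
# The labels themselves are hypothetical.
SUPPORT_GROUPS = {
    "Support, U.S. Gov't, P.H.S.": "NIH",
    "Support, U.S. Gov't, Non-P.H.S.": "Other US agency",
    "Support, Non-U.S. Gov't": "Non-government or foreign",
}


def funding_category(mesh_terms):
    """Combine the support tags found on one paper into a single label."""
    groups = sorted({SUPPORT_GROUPS[t] for t in mesh_terms if t in SUPPORT_GROUPS})
    return " + ".join(groups) if groups else "No support listed"


example = funding_category(["Humans", "Support, U.S. Gov't, P.H.S.", "Support, Non-U.S. Gov't"])
```

Non-support MeSH descriptors are ignored, so the same term list used for topic analysis can be passed in unchanged.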
10 Funding Categories
11 Impact by Funding Category
12 Impact Stability
13 Matching Papers to Grants
PNAS author = Grant PI (last name + first initial)
AND PNAS author institution = Grant PI institution
AND PNAS publication year >= Grant initial year
AND (PNAS publication year <= Grant initial year + 5 OR PNAS publication year <= Grant final year + 2)
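The matching rule can be sketched as a single predicate. The slide does not show how the final OR groups with the AND conditions. This sketch assumes the two upper bounds on publication year are alternatives, so a paper must fall on or after the grant's initial year and no later than either bound. The dictionary keys and the case-insensitive exact institution comparison are also assumptions.

```python
def normalize_name(last, first):
    """Reduce a name to (LAST NAME, first initial), as in the matching rule."""
    return (last.strip().upper(), first.strip()[:1].upper())


def paper_matches_grant(paper, grant):
    """Return True if a paper plausibly stems from a grant under the assumed rule."""
    same_person = normalize_name(paper["last"], paper["first"]) == normalize_name(
        grant["pi_last"], grant["pi_first"]
    )
    same_inst = paper["inst"].strip().upper() == grant["inst"].strip().upper()
    year = paper["year"]
    # Assumed grouping: lower bound AND (either upper bound).
    in_window = year >= grant["start"] and (
        year <= grant["start"] + 5 or year <= grant["end"] + 2
    )
    return same_person and same_inst and in_window


# Invented example records.
grant = {"pi_last": "Smith", "pi_first": "Jane", "inst": "Example Univ", "start": 1990, "end": 1997}
paper = {"last": "SMITH", "first": "J", "inst": "EXAMPLE UNIV", "year": 1998}
matched = paper_matches_grant(paper, grant)
```

Real institution strings rarely match exactly, so an actual pipeline would probably need some normalization or fuzzy comparison that is not shown here.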
14 NIA Fraction of PNAS: 4.1%
15 Impact by Grant Amount
16 Cumulative Histograms by Range
17 Impact by Publishing Institution
18 Impact by Institution and Funding
19 Map of Highest Impact Papers
Used top quartile of cited docs per year
–Number of citations as of 12/31/2002
Citation based map
–Direct and bibliographic coupling
–Henry Small's combined linkage formula, with a direct weight of 5 (rather than Small's 2)
–Outer references included
Divided into 70 clusters
Shift in content over time
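To make the two link types concrete, the sketch below counts shared references (bibliographic coupling) between pairs of papers and adds a weighted bonus when one paper cites the other directly. It is not Small's combined linkage formula, whose exact form and normalization are not given on the slide. The un-normalized sum, the function name, and the sample data are assumptions. Only the direct weight of 5 comes from the slide.

```python
from itertools import combinations


def link_weights(refs, direct_weight=5):
    """refs: {paper_id: set of cited ids}. Returns {(a, b): weight} with a < b."""
    weights = {}
    for a, b in combinations(sorted(refs), 2):
        coupling = len(refs[a] & refs[b])            # references the two papers share
        direct = int(b in refs[a] or a in refs[b])   # 1 if either paper cites the other
        w = direct_weight * direct + coupling        # assumed un-normalized combination
        if w:
            weights[(a, b)] = w
    return weights


# Invented example: p1 cites p2 directly and shares r1, r2 with it.
refs = {
    "p1": {"r1", "r2", "p2"},
    "p2": {"r1", "r2", "r3"},
    "p3": {"r3"},
    "p4": {"r9"},
}
weights = link_weights(refs)
```

Because references outside the paper set (r1, r2, and so on) count toward coupling, this sketch is consistent with including outer references. The resulting pair weights would then feed an ordination and clustering step that is not shown.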
20 Top Quartile are Highly Cited
21 Ordination
22 Clustering
23 Highest Impact Map – Time Progression
24 Core (BioMed) – Time Progression
25 Another View
26 AIDS Research – Time Progression
27 Cluster Timeline
28 Diagnostic Terms/Topics by Cluster
29 Diagnostic Terms/Topics by Cluster
30 PNAS Topics
BIOLOGICAL SCIENCES
–Agricultural Sciences
–Applied Biological Sciences
–Biochemistry
–Biophysics
–Cell Biology
–Developmental Biology
–Ecology
–Evolution
–Genetics
–Immunology
–Medical Sciences
–Microbiology
–Neurobiology
–Pharmacology
–Physiology
–Plant Biology
–Population Biology
–Psychology
PHYSICAL SCIENCES
–Applied Mathematics
–Applied Physical Sciences
–Astronomy
–Chemistry
–Computer Sciences
–Engineering
–Geology
–Geophysics
–Mathematics
–Physics
–Statistics
SOCIAL SCIENCES
–Anthropology
–Economic Sciences
–Psychology
–Social Sciences
31 Impact by PNAS Topic
32 Topic Import-Export Matrix
33 Topic Map
34 More Fun
Looking for a better way to show evolution of science over time periods
–Should show splitting, joining of clusters, rather than the more continuous evolution that our current techniques show
Map short time periods (e.g. 2 years) with overlaps and use overlaps to join maps
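One way to read the proposal is sketched below: build overlapping short windows, cluster each window separately, then link clusters in adjacent windows by the papers they share. The window step, the shared-paper linking rule, and all names are assumptions. The slide only states the general idea.

```python
def overlapping_windows(first_year, last_year, length=2, step=1):
    """Return (start, end) year windows of the given length, advancing by step."""
    windows = []
    start = first_year
    while start + length - 1 <= last_year:
        windows.append((start, start + length - 1))
        start += step
    return windows


def link_clusters(clusters_a, clusters_b):
    """Each argument: {cluster_id: set of paper ids}. Returns shared-paper counts."""
    links = {}
    for ca, papers_a in clusters_a.items():
        for cb, papers_b in clusters_b.items():
            shared = len(papers_a & papers_b)
            if shared:
                links[(ca, cb)] = shared
    return links


windows = overlapping_windows(1995, 1998)
# Invented example: cluster A in one window maps onto X and Y in the next (a split).
links = link_clusters({"A": {1, 2, 3, 4}}, {"X": {1, 2}, "Y": {3, 4}, "Z": {9}})
```

A cluster with links to several clusters in the next window suggests a split, and several clusters linking to one suggests a join.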
35 Big Change in Clusters with One Year
36 Another Example (3 Year Change)
37 Bib Coupling Distribution Changes - Why?
38 Bib Coupling Distribution Changes - Why?
39 Distribution of References
Top references
3194 Laemmli UK (1970), Nature 227,
Maniatis T (1982), Mol Cloning Laboratory.
2659 Sanger F (1977), P Natl Acad Sci USA 74,
Sambrook J (1989), Mol Cloning Laboratory.
1149 Chirgwin JM (1979), Biochemistry-US 18,
Lowry OH (1951), J Biol Chem 193,
Bradford MM (1976), Anal Biochem 72,
Maxam AM (1980), Method Enzymol 65,
Southern EM (1975), J Mol Biol 98,
Towbin H (1979), P Natl Acad Sci USA 76,
Chomczynski P (1987), Anal Biochem 162,
Feinberg AP (1983), Anal Biochem 132,
Rigby PWJ (1977), J Mol Biol 113,
Thomas PS (1980), P Nat Acad Sci US-B 77,
Miller JH (1972), Expt Mol Genetics.
41 Distribution of References
42 Few References Account for Tail
All references
31 references removed
43 Questions and Things to Do
How to best show the real evolution of science?
Does this indicate a lack of a new biomedical revolution to drive the next generation of research?
Compare coupling distributions of PNAS to other journals