The future of the British RAE: the REF (Research Excellence Framework)
Jonathan Adams
Research Assessment Exercise - timeline
1980s - policy on concentration and selectivity
1st Research Selectivity Exercise
Modified and formalised as the RAE
Polytechnics access research funding, enter a streamlined RAE
1996 and further cycles, higher quality thresholds for funding
2008 – new Roberts profiling format
The shift to metrics
Evolution
–RAE = peer review of an evidence portfolio, including data on outputs, training and grant funding
–RAE2008 profiling adds emphasis to the data
Discontinuity
–Treasury's 2007 announcement was disruptive, from many perspectives
Compromise
–HEFCE consultations shifted emphasis away from the gross simplification, and restored peer review
Research assessment must support the UK's enhanced international research status
Is the assessment dividend beginning to plateau?
Has the RAE delivered all it can?
If there is a shift to metrics, then disproportionate change should be avoided
Research performance - indicators, not metrics
[Diagram: Inputs, Research "black box", Outputs; labels: Funding, Numbers, Publications, research quality, Time; annotations: "What we want to know" and "What we have to use"]
How can we judge possible metrics?
Relevant and appropriate
–Are metrics correlated with other performance estimates?
–Do metrics really distinguish excellence as we see it?
–Are these the metrics the researchers would use?
Cost effective
–Data accessibility, coverage, cost and validation
Transparent, equitable and stable
–Is it clear what the metrics do?
–Are all institutions, staff and subjects treated equitably?
–How do people respond, and can they manipulate metrics?
–Once an indicator is made a target for policy, it starts to lose the information content that initially qualified it to play such a role
Three proposed data components
Research funding
Research training
Research output
–The key quality measure
All have multiple components
PLUS Peer Review
HEFCE favours bibliometrics: impact is related to RAE2001 grade (data for UoA14 Biology)
Impact index is coherent across UK grade levels (data for core science disciplines, grade at RAE96)
Impact and RAE2001 grade (data for UoA14 Biology): the residual variance is very great
What is the right impact score?
Correct counts
–25% of cites are to non-SCI outputs
Proliferating versions
–How do you collate?
Collaboration vs fractional citations
–Fractional citation counts would work against trends and policy
Self citation – does it matter?
–It is part of the sociology of research
Normalisation strategies
Clustering into subject groups
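The collaboration point can be made concrete with a minimal sketch of whole versus fractional counting. The papers, institution names and the equal-split rule below are illustrative assumptions, not data or a method taken from the talk.

```python
from collections import defaultdict

# Hypothetical papers: (citation count, institutions on the address line).
papers = [
    (12, ["Leeds"]),
    (30, ["Leeds", "Otago", "Toronto"]),  # a collaborative paper
    (8, ["Otago"]),
]

def citation_totals(papers, fractional):
    """Sum citations per institution.

    Whole counting credits every institution with all of a paper's citations.
    Fractional counting (assumed here to be an equal split) divides them
    among the institutions on the paper.
    """
    totals = defaultdict(float)
    for cites, institutions in papers:
        share = cites / len(institutions) if fractional else cites
        for inst in institutions:
            totals[inst] += share
    return dict(totals)

whole = citation_totals(papers, fractional=False)
frac = citation_totals(papers, fractional=True)
```

Under the equal-split assumption the collaborative paper contributes a third as much to each partner, which is the sense in which fractional counts would work against collaboration.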
TOTAL INSTITUTIONAL OUTPUT
[Diagram categories: Non-print; UNPUBLISHED OR CLIENT REPORTS etc; PUBLISHED; PUBLICATIONS]
INSTITUTIONAL PUBLICATIONS
Books and chapters
Conference proceedings
Journal articles
Will be in WoS within 2-3 months
INSTITUTIONAL PUBLICATIONS
Journals covered by Thomson WoS and/or Scopus
Articles in journals not covered by Thomson WoS and/or Scopus, or journal not covered at time of publication
INSTITUTIONAL PUBLICATIONS: journals covered by Thomson WoS and/or Scopus
[Timeline diagram: 2001 to 2007, marking the CENSUS PERIOD and the CENSUS DATE]
INSTITUTIONAL PUBLICATIONS: journals covered by Thomson WoS and/or Scopus
[Timeline diagram: 2001 to 2007, marking the CENSUS PERIOD and the CENSUS DATE]
All papers with an institutional address published by all staff and students employed or in training during the census period
Papers with an institutional address published by staff who left or retired before the census date
INSTITUTIONAL PUBLICATIONS
[Timeline diagram: CENSUS PERIOD and CENSUS DATE]
All papers with an institutional address published by all staff and students employed or in training during the census period
Papers without that institutional address published by staff recruited during the census period
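PAPERS BY ADDRESS vs PAPERS BY AUTHOR
[Timeline diagram: CENSUS PERIOD and CENSUS DATE, with Leavers and Recruits marked]
Papers by address: papers published during the census period by staff while at the institution
Papers by author: papers published during the census period by staff present at census date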
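The difference between the two counting rules can be sketched with sets. Everything here (the `Paper` record, the staff names, "Univ A" and "Univ B") is hypothetical and only illustrates how leavers and recruits move papers between the two views.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Paper:
    title: str
    author: str
    address: str  # institution named on the paper

HOME = "Univ A"                       # hypothetical institution being assessed
CENSUS_STAFF = {"stayer", "recruit"}  # hypothetical staff in post at the census date

# All papers are assumed to fall inside the census period.
papers = [
    Paper("P1", "stayer", "Univ A"),
    Paper("P2", "leaver", "Univ A"),   # written here, author left before the census date
    Paper("P3", "recruit", "Univ B"),  # written elsewhere, author recruited later
    Paper("P4", "recruit", "Univ A"),
]

# Papers by address: the institution's name is on the paper.
by_address = {p.title for p in papers if p.address == HOME}
# Papers by author: the author is on staff at the census date.
by_author = {p.title for p in papers if p.author in CENSUS_STAFF}

lost_with_leavers = by_address - by_author     # counted by address only
gained_with_recruits = by_author - by_address  # counted by author only
```

In this toy case the two views agree on P1 and P4 and differ only on the leaver's and the recruit's papers.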
Quality differentiation: do you assess total activity or selected papers? (data for UoA18 Chemistry)
The average does not describe the profile
Two units in the same field differ markedly in average normalised citation impact (2.39 vs. 1.86) because of an exceptionally high outlier in one group, but the groups have similar profiles
[Chart labels: Average = 2.39; Average = 1.86]
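A small sketch of why a single outlier separates the averages while the profiles stay alike. The impact values are invented for illustration and do not reproduce the 2.39 and 1.86 figures on the slide.

```python
from statistics import mean, median

# Hypothetical normalised citation impact per paper for two units.
# The units are identical except for one exceptional paper in unit_a.
unit_a = [0.5, 0.8, 1.0, 1.2, 1.5, 2.0, 12.0]
unit_b = [0.5, 0.8, 1.0, 1.2, 1.5, 2.0, 2.4]

mean_a, mean_b = mean(unit_a), mean(unit_b)
median_a, median_b = median(unit_a), median(unit_b)

print(f"unit A: mean {mean_a:.2f}, median {median_a:.2f}")
print(f"unit B: mean {mean_b:.2f}, median {median_b:.2f}")
```

The means differ by about a factor of two, while the medians, and six of the seven values, are the same.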
Distribution of data values - income
[Chart: axis labelled Maximum and Minimum]
Distribution of data values - impact
The variables for which we have data are skewed and therefore difficult to picture in a simple way
Simplifying the data picture
Scale data relative to a benchmark, then categorise
–Could do this for any data set
All journal articles
–Uncited articles (take out the zeroes)
–Cited articles
  Cited less often than benchmark
  Cited more often than benchmark
  –Cited more often but less than twice as often
  –Cited more than twice as often
    »Cited less than four times as often
    »Cited more than four times as often
Categorising the impact data
This grouping is the equivalent of a log2 transformation. There is no place for zero values on a log scale.
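The categories above can be sketched as bands on a log2 scale. This is an illustration only: the band labels, the sample citation counts, the benchmark of 10 and the choice to put exact ties in the upper band are all assumptions.

```python
import math
from collections import Counter

def impact_band(cites, benchmark):
    """Place one paper in a band relative to a benchmark citation count."""
    if cites == 0:
        return "uncited"  # zero has no place on a log scale, so it gets its own band
    level = math.log2(cites / benchmark)  # 0 = benchmark, 1 = twice, 2 = four times
    if level < 0:
        return "below benchmark"
    if level < 1:
        return "1x to 2x benchmark"
    if level < 2:
        return "2x to 4x benchmark"
    return "over 4x benchmark"

# Hypothetical citation counts for one unit, against a benchmark of 10.
cites = [0, 0, 3, 5, 10, 15, 25, 30, 50]
profile = Counter(impact_band(c, benchmark=10) for c in cites)
```

The resulting `profile` is a count of papers per band, which is the kind of summary a skewed distribution needs in place of a single average.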
UK ten-year profile: 680,000 papers, AVERAGE RBI = 1.24
[Chart annotations: MODE; MODE (cited); MEDIAN; THRESHOLD OF EXCELLENCE?]
Profiles are informative and work well across institutions and subjects
HEIs – 10 year totals smoothed
Absolute volume would add a further element for comparisons
HEIs – 10 year totals by volume
Normalisation strategy will affect the outcome (data for UoA13 Psychology)
Subject clustering needs to fit UK research
[Tree diagram: leaf labels are RAE subject areas, from Clinical Lab Sci... to Celtic stud.; cluster labels: Engineering, Medical, Physical, Maths, Bio-Med, Environment, Social, Arts & hums]
This tree diagram illustrates similarity in the frequency with which journals were submitted to RAE1996
How should we map data to disciplines? i.e. what is Chemistry?
[Figure label: Thomson]
How well do metrics respond to variation?
Subject differences
–Can we accept differences in criteria and balance between clusters?
–What about divergence within clusters?
–How do metrics support the growth of interdisciplinarity?
–How can emerging (marginal?) research groups be recognised?
Differences in mode
–Where is the balance between basic and applied research?
Differences in people
–Career breaks, career development
How well do metrics represent different HEIs?
Output coverage by articles on Thomson Reuters databases
What will it cost?
Data costs
–Core data – how much, from whom?
–Data cleaning and validation
Pilot studies are elucidating this – and the task is big
Requirements on institutions
–Pilot studies will elucidate this
System development
System maintenance
Will it cover institutional quality assurance?
Other issues
Census period
–What about synchrony and sequence?
Weighting indicators
–ERA will weight research training at 0
–Need to weight within types as well as between
Interface between quantitative (indicators) and qualitative (peer review)
–Role of panel members
–Risk of mis-match
Do outputs hang together with income and training?
We can tell you …
You are the REF
Check it out now
RAE2008.com
How can we judge possible metrics?
Relevant and appropriate - YES
–Technical correctness of metrics is not a problem, but there is a lot of work to do in refining and comparing options
Cost - MAYBE
–Data accessibility is not a problem
–But we have yet to scope full system requirements
So is there a problem?
–Are all subjects, HEIs, staff and modes treated equitably?
–What will 50,000 intelligent people start to do?
–Goodhart's Law - for how long will the metrics track excellence?
Researchers must decide, not metricians (RMM, 1997)
–The devil is in the detail: get involved
REF pilot projects
20+ institutions (July 08)
Collect and collate databases, reconciling authors to staff (Oct 08)
Compare Thomson and Scopus coverage
Collate and normalise citation counts (Dec 08)
Run evaluations of alternative methodologies
Disseminate outcomes and consult (Mar 09)
Over 8,000 people participated in recent PBRF rounds (50,000 in the RAE). Thomson recorded fewer than 5,000 articles per year recently (100,000 for the UK). That is less than one article per NZ researcher per year.
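The arithmetic behind that comparison, as a short sketch. It uses the slide's round figures as if they were exact; since "over 8,000" is a lower bound and "fewer than 5,000" an upper bound, the NZ rate computed here is an upper limit.

```python
# Figures quoted on the slide, treated as exact for illustration.
nz_researchers = 8_000   # "over 8,000" PBRF participants
nz_articles = 5_000      # "fewer than 5,000" articles per year
uk_researchers = 50_000  # RAE participants
uk_articles = 100_000    # UK articles per year

nz_rate = nz_articles / nz_researchers  # articles per researcher per year
uk_rate = uk_articles / uk_researchers

print(f"NZ: at most {nz_rate:.3f} articles per researcher per year")
print(f"UK: about {uk_rate:.1f} articles per researcher per year")
```

On these figures the indexed output per participant in NZ is less than a third of the UK's, which bears on the data-coverage question.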
Implications for Aotearoa New Zealand
Relative data coverage
–Balance of regional journals
International = trans-Atlantic
–The relevance of citations
Scale factors and relative load
–Fixed costs
Community size and anonymity
Compatibility of stakeholder and researcher views on assessment outcomes