Or how to go beyond a league table position in evaluating a UK university department
Every five years or so all UK university departments have their research
assessed in one gigantic peer review exercise, now called the Research
Excellence Framework (REF). Each discipline is
assessed separately, and scores can be used to compile a league table. The
exercise has direct financial implications: the better the research, the more
money universities get from the government. But if you know what academics are
like, you will not be surprised to learn that those in the UK obsess about this
exercise and its results to a far greater extent than the money involved would
justify. The results of the latest exercise have just been published, and turned into league tables by
Times Higher Education (THE) here.
You could say that the REF now provides the
same sort of incentive system for UK universities as profit does for a firm. In
some cases academics whose research is below their departmental average are put
under pressure to leave by one means or another, and most academics feel acutely
the pressure to improve on how their own output will be assessed by this
exercise. In contrast, poor performance in teaching or administration is not
nearly such a serious issue.
Many academics complain bitterly about the indignity of all
this. An alternative system would be one where getting tenure was the last
performance hurdle an academic had to pass, and from then on they were free to
do what they liked. Research money could all be allocated on a
project-by-project basis. I personally doubt that would be a better system from society’s
point of view, and I do find it annoying how academics can complain so much
about pressures that are taken for granted elsewhere.
It would be a mistake, however, to think that the position in
some REF league table told you all you needed to know to evaluate the quality
of research in a department. The REF releases a wealth of data, and going
beyond the headline number (usually the GPA score) can be informative. In the
latest exercise departments were evaluated under three headings: outputs,
environment and impact. Details about what is involved for each category can be
found here.
Outputs, which has the highest weight in the total (65%), looks
at the quality of the four best recent publications of each submitted member of
staff. The key word to note here is ‘submitted’. A department/university can
choose not to submit all its staff to the REF, and by leaving out staff that
it considers to be well below average it can raise its GPA
score (if it gets its assessments right). So to the extent that staff are not
submitted, the GPA will overestimate the average quality of the research done
in that department. As I said, league tables normally just look at the GPA
score [1].
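To see why selective submission flatters the headline number, here is a minimal sketch. It assumes, purely for illustration, that each member of staff can be given a single quality score on the REF's 0-4 scale (in reality the REF grades individual outputs, not people) and that a department's GPA is the simple average of those scores. The ten scores and the 2.5 cut-off are made up.

```python
# Hypothetical per-person quality scores for a ten-person department.
staff_scores = [3.8, 3.5, 3.4, 3.2, 3.0, 2.9, 2.5, 2.2, 1.8, 1.5]

def gpa(scores):
    """Simple average of the scores that are actually submitted."""
    return sum(scores) / len(scores)

# Option 1: submit everyone.
all_gpa = gpa(staff_scores)

# Option 2: leave out anyone the department rates below 2.5 (assumed cut-off).
submitted = [s for s in staff_scores if s >= 2.5]
selective_gpa = gpa(submitted)

print(f"Submit all {len(staff_scores)} staff: GPA = {all_gpa:.2f}")
print(f"Submit best {len(submitted)} staff: GPA = {selective_gpa:.2f}")
```

The selective GPA is higher even though exactly the same research has been done, which is the sense in which the GPA overestimates average quality when staff are left out.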
To some it may seem strange that this is allowed, but there are
arguments to justify it. Departments do pay a significant financial penalty for
leaving staff out: they only get money for submitted staff. To get a guide to
the total amount of quality-adjusted research done in a department, simply
multiply the GPA score by the number of
people submitted (called ‘power’ by THE).
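As a quick sketch of that calculation, the snippet below multiplies GPA by staff submitted for four departments, using the figures from the table in this post. The function name `power` is just a label chosen here, and the products may differ slightly from THE's published power figures because the inputs are rounded.

```python
# (GPA score, number of staff submitted), taken from the table in this post.
departments = {
    "UCL": (3.78, 37),
    "LSE": (3.55, 51),
    "Oxford": (3.44, 84),
    "Cambridge": (3.42, 27),
}

def power(gpa, staff_submitted):
    """Quality-adjusted volume of research: GPA times staff submitted."""
    return gpa * staff_submitted

# Rank departments by power rather than by GPA.
ranked = sorted(departments, key=lambda name: power(*departments[name]), reverse=True)

for name in ranked:
    gpa, staff = departments[name]
    print(f"{name:10s} GPA {gpa:.2f} x {staff:2d} staff = power {power(gpa, staff):.1f}")
```

On this measure the ordering changes: Oxford, with by far the largest submission of the four, moves ahead of UCL and LSE despite its lower GPA.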
The decision about whether or not to submit a member of staff is
an agonising one [2] that involves many difficult trade-offs. To the individual
not being submitted it is a nasty slap in the face. For the department, the
perceived benefits in getting a higher position in GPA-based league tables may
outweigh the financial cost of not submitting staff members. Decisions on this
front do vary from university to university, and from department to department:
in economics, compare the third and fourth columns of the table below.
Although it only counts for 15% of the total GPA score, the
‘environment’ heading may be of particular interest to potential PhD students.
It is based on a number of different criteria, including the number of PhDs,
the support provided for research, and research income from outside grants.
Only three economics departments had all elements of environment judged to be of the
highest (4*) quality this time: UCL, LSE and Oxford.
Impact is a new category, accounting for 20% of the total. It
is based on case studies where research has engaged with public, private and
third sector organisations, or directly with the public. For example, one of
Oxford’s case studies for economics was my own work on fiscal councils. A quick
look at the results suggests that this new element has had a significant
influence on the overall results. In economics, for example, the only
department where all the submitted case studies were judged to be of the
highest quality was Bristol. So while Bristol only came 12th= on published
outputs, a strong impact and environment score lifted them to 6th in the
overall ranking.
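A rough sketch of how the three headings combine may help here. It assumes the overall GPA is approximately the weighted average of a GPA for each heading, using the 65/20/15 weights described above. The two departments and their sub-scores are hypothetical, not taken from the REF results.

```python
# Weights for the three headings, as described in the post.
WEIGHTS = {"outputs": 0.65, "impact": 0.20, "environment": 0.15}

def overall_gpa(outputs, impact, environment):
    """Weighted average of the three sub-scores (an approximation)."""
    return (WEIGHTS["outputs"] * outputs
            + WEIGHTS["impact"] * impact
            + WEIGHTS["environment"] * environment)

# Hypothetical departments: A is slightly stronger on outputs,
# B is much stronger on impact and environment.
dept_a = overall_gpa(outputs=3.10, impact=3.20, environment=3.00)
dept_b = overall_gpa(outputs=3.00, impact=4.00, environment=3.60)

print(f"Department A overall GPA: {dept_a:.2f}")
print(f"Department B overall GPA: {dept_b:.2f}")
```

Even with only 35% of the weight between them, impact and environment can be enough to reverse an ordering based on outputs alone.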
As with any evaluation system, there are difficult judgements
to make on the details, and these can create opportunities to ‘play the
system’. Chris Bertram focuses on one particular issue at Crooked
Timber. Each iteration of the assessment exercise attempts to change the
details of the rules to avoid this, only to allow some new possibility to
exploit the system. Partly as a result, after each exercise many academics feel
that there must be a better and less time-consuming way to judge the quality of
research produced by individual academics
or departments, but perhaps the fact that we keep returning to the same basic
procedure suggests otherwise.
REF 2014 results: economics and econometrics
| University | GPA Score | No. of staff submitted | Eligible staff | Power | % 4* Outputs | % 4* Environment | % 4* Impact |
|---|---|---|---|---|---|---|---|
| UCL | 3.78 | 37 | 45 | 139 | 70 | 100 | 92 |
| LSE | 3.55 | 51 | 56 | 182 | 56 | 100 | 87 |
| Oxford | 3.44 | 84 | 97 | 289 | 43 | 100 | 64 |
| Cambridge | 3.42 | 27 | 38 | 92 | 55 | 13 | 50 |
| Warwick | 3.41 | 42 | 52 | 142 | 43 | 38 | 60 |
| Bristol | 3.32 | 19 | 25 | 62 | 22 | 63 | 100 |
| Essex | 3.25 | 33 | 40 | 108 | 29 | 63 | 20 |
| Edinburgh | 3.14 | 18 | 28 | 55 | 31 | 50 | 13 |
| Royal Holloway | 3.11 | 14 | 23 | 45 | 35 | 0 | 60 |
| Nottingham | 3.05 | 35 | 46 | 107 | 20 | 13 | 18 |
| UEA | 3.04 | 14 | 22 | 43 | 20 | 0 | 20 |
| Surrey | 3.01 | 21 | 25 | 62 | 27 | 13 | 0 |
| Queen Mary | 2.98 | 24 | 31 | 73 | 20 | 13 | 13 |
| York | 2.93 | 28 | 46 | 82 | 14 | 0 | 40 |
| St. Andrews | 2.92 | 21 | 31 | 60 | 24 | 0 | 0 |
| Manchester | 2.89 | 33 | 45 | 96 | 11 | 13 | 40 |
| Glasgow | 2.86 | 24 | 30 | 68 | 18 | 0 | 0 |
| Sussex | 2.84 | 17 | 24 | 49 | 15 | 0 | 37 |
| Exeter | 2.78 | 25 | 31 | 68 | 13 | 25 | 13 |
| Birmingham | 2.78 | 24 | 27 | 67 | 8 | 0 | 27 |
| Southampton | 2.70 | 22 | 28 | 59 | 22 | 0 | 10 |
| Birkbeck | 2.60 | 25 | 32 | 65 | 10 | 0 | 0 |
| Leicester | 2.59 | 22 | 29 | 58 | 19 | 0 | 0 |
| Sheffield | 2.58 | 15 | 26 | 38 | 8 | 0 | 40 |
| Aberdeen | 2.48 | 19 | 26 | 48 | 5 | 0 | 0 |
| City | 2.44 | 14 | 26 | 33 | 17 | 0 | 20 |
| Kent | 2.32 | 22 | 26 | 51 | 3 | 0 | 13 |
| Brunel | 2.20 | 26 | 29 | 58 | 2 | 0 | 0 |
Note that many economics departments are assessed under
Business and Management, and are not included here. Sources: columns 2, 3 and 5,
Times Higher Education; column 4, HESA; columns 6-8, REF.
[1] THE publishes an alternative university-wide ranking
(aggregated across departments) that multiplies the GPA by the proportion of
staff submitted, but that implicitly gives the research of non-submitted staff
a score of zero, which is likely to be too extreme. It is better to simply note
either the power score, or the proportion of staff submitted. Approximate
information on the number of staff eligible for submission by department can be
found here.
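For illustration, the sketch below applies the same adjustment (GPA multiplied by the proportion of staff submitted) at department level, using figures from the table in this post. THE's own version is university-wide, the eligible staff numbers are only approximate, and the function name is a label chosen here.

```python
# (GPA score, staff submitted, eligible staff), taken from the table in this post.
rows = {
    "UCL": (3.78, 37, 45),
    "Cambridge": (3.42, 27, 38),
    "York": (2.93, 28, 46),
    "Birmingham": (2.78, 24, 27),
}

def intensity_weighted_gpa(gpa, submitted, eligible):
    """GPA scaled by the share of eligible staff submitted.

    This implicitly scores the research of non-submitted staff at zero.
    """
    return gpa * submitted / eligible

for name, (gpa, submitted, eligible) in rows.items():
    share = submitted / eligible
    adjusted = intensity_weighted_gpa(gpa, submitted, eligible)
    print(f"{name:10s} GPA {gpa:.2f}, {share:.0%} submitted, adjusted {adjusted:.2f}")
```

Because non-submitted staff count as zero, a department that submits nearly everyone (Birmingham here) can overtake one with a higher GPA but a more selective submission (York, or even Cambridge), which shows how strong the implicit assumption is.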
[2] This is based on my own experience at my previous
university, where I was research director for the school. Thankfully I have
played no role in these decisions at Oxford!
By UAE do you actually mean UEA? (I wouldn't normally be this pedantic but it is my alma mater)
As in some way this leads towards postgraduate education, I wonder about the relative merits of Masters degrees v PhDs, from both a student/University financing point of view and a relative economic merit basis for society later on. My feeling is that PhD funding is easier to get than that for a Masters course, but that in general a Masters course provides more economic benefit to society by its more applied nature, but I could be wrong. I would appreciate your views on this as it is something that has puzzled me for years...
I think it is better calibrated for disciplines dealing mainly in numbers than those which mostly use letters.
One issue with the REF is that funding for the next several years is allocated on output since the last hiring round. What do you think about replacing the "Output" measure with something like a "piece rate" system where an AER is worth £x, an EJ worth £y, etc. to the university? There could also be some sort of top-up based on citations. These rates could be periodically reviewed to ensure that the ESRC's budget constraint was respected.
A forward-looking system might ensure that universities hire those that they think will be the most productive in the future, which is presumably what a social planner would do (rather than, e.g., hiring those who were productive many moons ago).
(Of course, there might still be a case for the impact and environment elements, so these wouldn't need to change.)
Interesting point! However, who would be able (today) to detect "the Einsteins" of tomorrow? The benevolent dictator?
I wonder how these schools compare internationally; should top students consider studying outside their own country?
By the way, the U of Chicago looked at grades and GRE test scores and decided that after a certain not very high level, they were not predictive of success in economics. The same result was obtained by Terman studying students in the SF Bay area in general.
I considered applying to British schools for an undergraduate education, but I did not think that my chances were very good.
Citations!?
No, no, bloody no. The tyranny of citation metrics has already skewed and screwed academia too far.
In my own (Russell Group) Faculty, we are dominated by solid, but essentially second-rate academics with solid citation metrics. Many of these are game players who know how to get their citation scores on the up escalator (incremental research in buzzy areas; networking; informal mutually citing cliques).
In my own Dept, we have only one academic out of 40-odd who, in an internal audit, was judged to have four 4* REF outputs this time round. He is doing work that is systematically destroying a cosy consensus that has held for half a century in his area of study. He has an enormous institutionalised inertia to push against. In other words, he is saying what people don't want to hear. Consequently, he reaps few citations...at the moment. His work will be the basis of the next generation's textbooks though.
He is 40 years old and is still an Assistant Prof. He has been passed over for promotion for each of the last 5 years because his h-index is "judged" to be deficient.
Academic appraisal is certainly needed, but management by metrics is no solution.
original poster here
@ anon above: ok; perhaps citations would create perverse incentives.
Perhaps you could use the average citations/article to rank journals objectively (for instance); i.e. to judge how much an AER is worth relative to an EJ.
My question is, why would anyone bother to rate you anyway? In my world we have laws, constants and other universally accepted numbers that actually work. In macro-economics, there is no Ohm's Law; no Boltzmann constant; no Standard Molar Volume etc etc.
Current macroeconomic thinking at SWL's department is based on a book that flogs the "3-equation New Keynesian model" (an Oxford in-house publication). Now, if I was asked which macroeconomic theory we should preserve for posterity, that would not be my first choice!
The kids you have to be sorry for are those that are paying £9k a year to be taught this crap. Social Sciences, in aggregate, gets a lot of stick, a lot of it not deserved. The macroeconomics sub-division of it however drags it down. The latter has supplied fodder to organisations like the IMF and Troika and similar "Deficit Hawks". Supra-national organisations that have caused death, pain and suffering, based on the myth of "austerity" and a total lack of understanding of how a Sovereign fiat currency economy, actually works.
Does blogging count anywhere?
Re tenure and freedom thereafter. As an undergraduate at Cambridge in the late 70s I would have been hostile to the apparent lack of regular performance appraisal, and would have considered published research a mark of successful performance. In retrospect it does seem a simplistic view. What has mattered most to me in my life was the quality of teaching. So here is a link to an obituary of a man who taught me, written by a man who also taught me, which I feel is a fitting commentary on the idea of data-driven performance appraisal of academics.
http://www.hist.cam.ac.uk/news/096-in-memoriam-arthur-hibbert
I know many staff and students at those top rated institutions who are not happy with both the teaching and research conditions. One also must be wary of peer review; for example if the dominant paradigm in a subject says that the way you get to the truth is through rational expectations optimisation models, and you do that, that will get you a higher ranking. Is that good for learning or society? Most probably, not.
When I taught in the UK, the informal joke was that the RAE was a "Soviet Britain" equivalent of the 5-year plan. And it felt that way preparing for it. One important person told us essentially to practice fraud with research funding requests--adding an extra "0" to the end of the sum we really needed, for example. Writing a book was frowned upon, because it counted as one out of four publications. Better to go for the quick stuff. And the important journals in my discipline show it--they have this quality of "what I did over my summer vacation," to get stuff out as quickly as possible. So I really have to disagree with Wren-Lewis here about RAE (or whatever it is called today) acting as incentives, unless the word "perverse" is added right before it.