Tuesday, 23 December 2014

How to reference the REF

Or how to go beyond a league table position in evaluating a UK university department

Every five years or so, all UK university departments get their research assessed in one gigantic peer review exercise, which is now called the REF. Each discipline is assessed separately, and scores can be used to compile a league table. The exercise has direct financial implications: the better the research, the more money universities get from the government. But if you know what academics are like, you will not be surprised to learn that those in the UK obsess about this exercise and its results to a far greater extent than the money involved would justify. The results of the latest exercise have just been published, and turned into league tables by Times Higher Education (THE) here.

You could say that the REF now provides the same sort of incentive system for UK universities as profit does for a firm. In some cases academics whose research is below their departmental average are put under pressure to leave by one means or another, and most academics feel acutely the pressure to improve on how their own output will be assessed by this exercise. In contrast, poor performance in teaching or administration is not nearly such a serious issue.

Many academics complain bitterly about the indignity of all this. An alternative system would be one where getting tenure was the last performance hurdle an academic had to pass, and from then on they were free to do what they liked. Research money could all be allocated on a project by project basis. I personally doubt that would be a better system from society’s point of view, and I do find it annoying how academics can complain so much about pressures that are taken for granted elsewhere.

It would be a mistake, however, to think that the position in some REF league table told you all you needed to know to evaluate the quality of research in a department. The REF releases a wealth of data, and going beyond the headline number (usually the GPA score) can be informative. In the latest exercise departments were evaluated under three headings: outputs, environment and impact. Details about what is involved for each category can be found here.

Outputs, which has the highest weight in the total (65%), looks at the quality of the four best recent publications of each submitted member of staff. The key word to note here is ‘submitted’. A department/university can choose not to submit all its staff to the REF, and by leaving out staff whose research it considers well below average it can raise its GPA score (if it gets its assessments right). So to the extent that staff are not submitted, the GPA will overestimate the average quality of the research done in that department. As I said, league tables normally just look at the GPA score [1].

To some it may seem strange that this is allowed, but there are arguments to justify it. Departments do pay a significant financial penalty for leaving staff out - they only get money for submitted staff. To get a guide to the total amount of quality adjusted research done in a department, simply multiply the GPA score by the number of people submitted (called ‘power’ by THE).
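As a rough illustration of that multiplication, here is a minimal Python sketch. The GPA and staff numbers are copied from the economics table in this post; the function name `power` is just a label chosen for the example, not anything THE publishes.

```python
# GPA score and number of staff submitted, copied from the table in this post.
departments = {
    "UCL": (3.78, 37),
    "LSE": (3.55, 51),
    "Oxford": (3.44, 84),
}


def power(gpa, staff_submitted):
    """Quality-adjusted volume of research: GPA times staff submitted."""
    return gpa * staff_submitted


for name, (gpa, staff_submitted) in departments.items():
    print(f"{name}: {power(gpa, staff_submitted):.1f}")
```

The results land close to, but not exactly on, the Power column of the table, presumably because the staff numbers shown here are rounded.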

The decision about whether or not to submit a member of staff is an agonising one [2] that involves many difficult trade-offs. To the individual not being submitted it is a nasty slap in the face. For the department, the perceived benefits in getting a higher position in GPA based league tables may outweigh the financial cost of not submitting staff members. Decisions on this front do vary from university to university, and from department to department: in economics, compare the third and fourth columns of the table below.
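One way to make that comparison concrete is to divide staff submitted by eligible staff. The sketch below does this for four departments, with numbers copied from the table; the helper name `share_submitted` is made up for the example, and the eligible staff figures are only approximate (see [1]).

```python
# (staff submitted, eligible staff) from the third and fourth columns of the table.
staff = {
    "LSE": (51, 56),
    "Cambridge": (27, 38),
    "Birmingham": (24, 27),
    "York": (28, 46),
}


def share_submitted(submitted, eligible):
    """Proportion of eligible staff that the department chose to submit."""
    return submitted / eligible


# List departments from the most to the least inclusive submission.
ranked = sorted(staff.items(), key=lambda item: share_submitted(*item[1]), reverse=True)
for name, (submitted, eligible) in ranked:
    print(f"{name}: {share_submitted(submitted, eligible):.0%}")
```

On these figures the share submitted runs from roughly nine in ten at LSE and Birmingham down to about six in ten at York.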

Although it only counts for 15% of the total GPA score, the ‘environment’ heading may be of particular interest to potential PhD students. It is based on a number of different criteria, including the number of PhDs, the support provided for research, and research income from outside grants. Only three economics departments had all elements of environment judged to be of the highest (4*) quality this time: UCL, LSE and Oxford.

Impact is a new category, accounting for 20% of the total. It is based on case studies where research has engaged with public, private and third sector organisations, or directly with the public. For example, one of Oxford’s case studies for economics was my own work on fiscal councils. A quick look at the results suggests that this new element has had a significant influence on the overall results. In economics, for example, the only department where all the submitted case studies were judged to be of the highest quality was Bristol. So while Bristol only came 12th= on published outputs, a strong impact and environment score lifted them to 6th in the overall ranking.

As with any evaluation system, there are difficult judgements to make on the details, and these can lead to possibilities to ‘play the system’. Chris Bertram focuses on one particular issue at Crooked Timber. Each iteration of the assessment exercise attempts to change the details of the rules to avoid this, only to allow some new possibility to exploit the system. Partly as a result, after each exercise many academics feel that there must be a better and less time consuming way to judge the quality of research produced by individual academics or departments, but perhaps the fact that we keep returning to the same basic procedure suggests otherwise.

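REF 2014 results: economics and econometrics

| University | GPA Score | No. of staff submitted | Eligible staff | Power | % 4* Outputs | % 4* Environ. | % 4* Impact |
|---|---|---|---|---|---|---|---|
| UCL | 3.78 | 37 | 45 | 139 | 70 | 100 | 92 |
| LSE | 3.55 | 51 | 56 | 182 | 56 | 100 | 87 |
| Oxford | 3.44 | 84 | 97 | 289 | 43 | 100 | 64 |
| Cambridge | 3.42 | 27 | 38 | 92 | 55 | 13 | 50 |
| Warwick | 3.41 | 42 | 52 | 142 | 43 | 38 | 60 |
| Bristol | 3.32 | 19 | 25 | 62 | 22 | 63 | 100 |
| Essex | 3.25 | 33 | 40 | 108 | 29 | 63 | 20 |
| Edinburgh | 3.14 | 18 | 28 | 55 | 31 | 50 | 13 |
| Royal Holloway | 3.11 | 14 | 23 | 45 | 35 | 0 | 60 |
| Nottingham | 3.05 | 35 | 46 | 107 | 20 | 13 | 18 |
| UEA | 3.04 | 14 | 22 | 43 | 20 | 0 | 20 |
| Surrey | 3.01 | 21 | 25 | 62 | 27 | 13 | 0 |
| Queen Mary | 2.98 | 24 | 31 | 73 | 20 | 13 | 13 |
| York | 2.93 | 28 | 46 | 82 | 14 | 0 | 40 |
| St. Andrews | 2.92 | 21 | 31 | 60 | 24 | 0 | 0 |
| Manchester | 2.89 | 33 | 45 | 96 | 11 | 13 | 40 |
| Glasgow | 2.86 | 24 | 30 | 68 | 18 | 0 | 0 |
| Sussex | 2.84 | 17 | 24 | 49 | 15 | 0 | 37 |
| Exeter | 2.78 | 25 | 31 | 68 | 13 | 25 | 13 |
| Birmingham | 2.78 | 24 | 27 | 67 | 8 | 0 | 27 |
| Southampton | 2.70 | 22 | 28 | 59 | 22 | 0 | 10 |
| Birkbeck | 2.60 | 25 | 32 | 65 | 10 | 0 | 0 |
| Leicester | 2.59 | 22 | 29 | 58 | 19 | 0 | 0 |
| Sheffield | 2.58 | 15 | 26 | 38 | 8 | 0 | 40 |
| Aberdeen | 2.48 | 19 | 26 | 48 | 5 | 0 | 0 |
| City | 2.44 | 14 | 26 | 33 | 17 | 0 | 20 |
| Kent | 2.32 | 22 | 26 | 51 | 3 | 0 | 13 |
| Brunel | 2.20 | 26 | 29 | 58 | 2 | 0 | 0 |
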
Note that many economics departments are assessed under Business and Management, and are not included here. Sources: columns 2, 3 and 5: Times Higher Education; column 4: HESA; columns 6-8: REF.



[1] THE publishes an alternative university wide ranking (aggregated across departments) that multiplies the GPA by the proportion of staff submitted, but that implicitly gives the research of non-submitted staff a score of zero, which is likely to be too extreme. It is better to simply note either the power score, or the proportion of staff submitted. Approximate information on the number of staff eligible for submission by department can be found here.
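To see why that weighting is so severe, here is a small Python sketch that applies the same idea at department level. THE does this across whole universities, so using single departments is purely my illustration, and the function name `intensity_weighted_gpa` is made up for the example. The figures are copied from the economics table.

```python
# (GPA, staff submitted, eligible staff), copied from the table in this post.
departments = {
    "York": (2.93, 28, 46),
    "Birmingham": (2.78, 24, 27),
}


def intensity_weighted_gpa(gpa, submitted, eligible):
    """GPA scaled by the share of staff submitted.

    Equivalent to averaging over all eligible staff while scoring
    every non-submitted member of staff at zero.
    """
    return gpa * submitted / eligible


for name, (gpa, submitted, eligible) in departments.items():
    weighted = intensity_weighted_gpa(gpa, submitted, eligible)
    print(f"{name}: GPA {gpa:.2f}, intensity-weighted {weighted:.2f}")
```

York has the higher GPA of the two, but it falls well behind Birmingham once non-submitted staff are counted as zero, which shows how much the ranking can hinge on that assumption.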

[2] This is based on my own experience at my previous university, where I was research director for the school. Thankfully I have played no role in these decisions at Oxford!



13 comments:

  1. By UAE do you actually mean UEA? (I wouldn't normally be this pedantic but it is my alma mater)

  2. As in some way this leads towards postgraduate education, I wonder about the relative merits of Masters degrees v PhDs, from both a student/University financing point of view and a relative economic merit basis for society later on. My feeling is that PhD funding is easier to get than that for a Masters course, but that in general a Masters course provides more economic benefit to society by its more applied nature, but I could be wrong. I would appreciate your views on this as it is something that has puzzled me for years...

  3. I think it is better calibrated for disciplines dealing mainly in numbers than those which mostly use letters.

  4. One issue with the REF is that funding for the next several years is allocated on output since the last hiring round. What do you think about replacing the "Output" measure with something like a "piece rate" system where an AER is worth £x; an EJ worth £y etc to the university. There could also be some sort of top up based on citations. These rates could be periodically reviewed to ensure that the ESRC's budget constraint was respected.

    A forward looking system might ensure that universities hire those that they think will be the most productive in the future, which is presumably what a social planner would do (rather than, e.g. hiring those who were productive many moons ago.)

    (Of course, there might still be a case for the impact and environmental element, so this wouldn't need to change.)

    Replies
    1. Interesting point! However, who would be able (today) to detect "the Einsteins" of tomorrow? The benevolent dictator?

    2. I wonder how these schools compare internationally; should top students consider studying outside their own country?

      By the way, the U of Chicago looked at grades and GRE test scores and decided that after a certain not very high level, they were not predictive of success in economics. The same result was obtained by Terman studying students in the SF Bay area in general.

      I considered applying to British schools for an undergraduate education, but I did not think that my chances were very good.

    3. Citations!?

      No, no, bloody no. The tyranny of citation metrics has already skewed and screwed academia too far.

      In my own (Russell Group) Faculty, we are dominated by solid, but essentially second-rate academics with solid citation metrics. Many of these are game players who know how to get their citation scores on the up escalator (incremental research in buzzy areas; networking; informal mutually citing cliques).

      In my own Dept, we have only one academic out of 40-odd who, in an internal audit, was judged to have four 4* REF outputs this time round. He is doing work that is systematically destroying a cosy consensus that has held for half a century in his area of study. He has an enormous institutionalised inertia to push against. In other words, he is saying what people don't want to hear. Consequently, he reaps few citations...at the moment. His work will be the basis of the next generation's textbooks though.

      He is 40 years old and is still an Assistant Prof. He has been passed over for promotion for each of the last 5 years because his h-index is "judged" to be deficient.

      Academic appraisal is certainly needed, but management by metrics is no solution.

    4. original poster here

      @ anon above: ok; perhaps citations would create perverse incentives.

      Perhaps you could use the average citations/article to rank journals objectively (for instance); i.e. to judge how much an AER is worth relative to an EJ.

    5. My question is, why would anyone bother to rate you anyway? In my world we have laws, constants and other universally accepted numbers that actually work. In macro-economics, there is no Ohm's Law; no Boltzmann constant; no Standard Molar Volume etc etc.

      Current macroeconomic thinking at SWL's department is based on a book that flogs the "3-equation New Keynesian model" (an Oxford in-house publication). Now, if I was asked which macroeconomic theory we should preserve for posterity, that would not be my first choice!

      The kids you have to be sorry for are those that are paying £9k a year to be taught this crap. Social Sciences, in aggregate, gets a lot of stick, a lot of it not deserved. The macroeconomics sub-division of it however drags it down. The latter has supplied fodder to organisations like the IMF and Troika and similar "Deficit Hawks": supra-national organisations that have caused death, pain and suffering, based on the myth of "austerity" and a total lack of understanding of how a sovereign fiat currency economy actually works.

  5. Re tenure and freedom thereafter. As an undergraduate at Cambridge in the late 70s I would have been hostile to the apparent lack of regular performance appraisal, and would have considered published research a mark of successful performance. In retrospect it does seem a simplistic view. What has mattered most to me in my life was the quality of teaching. So here is a link to an obituary of a man who taught me, written by a man who also taught me, which I feel is a fitting commentary on the idea of data driven performance appraisal of academics.

    http://www.hist.cam.ac.uk/news/096-in-memoriam-arthur-hibbert

  6. I know many staff and students at those top rated institutions who are not happy with either the teaching or the research conditions. One also must be wary of peer review; for example, if the dominant paradigm in a subject says that the way you get to the truth is through rational expectations optimisation models, and you do that, that will get you a higher ranking. Is that good for learning or society? Most probably not.

  7. When I taught in the UK, the informal joke was that the RAE was a "Soviet Britain" equivalent of the 5-year plan. And it felt that way preparing for it. One important person told us essentially to practice fraud with research funding requests--adding an extra "0" to the end of the sum we really needed, for example. Writing a book was frowned upon, because it counted as one out of four publications. Better to go for the quick stuff. And the important journals in my discipline show it--they have this quality of "what I did over my summer vacation," to get stuff out as quickly as possible. So I really have to disagree with Wren-Lewis here about RAE (or whatever it is called today) acting as incentives, unless the word "perverse" is added right before it.

