And yet we persist in doing idiot things that can only possibly have this result:
Assessing school-teachers on the improvement their kids show in tests between the start and end of the year (which obviously results in their doing all they can to depress the start-of-year scores).
Assessing researchers by the number of their papers (which can only result in slicing research into minimal publishable units).
Assessing them — heaven help us — on the impact factors of the journals their papers appear in (which feeds the brand-name fetish that is crippling scholarly communication).
Assessing researchers on whether their experiments are “successful”, i.e. whether they find statistically significant results (which inevitably results in p-hacking and HARKing, hypothesising after the results are known; see the sketch below).
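For anyone who hasn't watched that last one play out, here is a minimal sketch of the mechanism behind the simplest form of p-hacking: test enough outcome measures and pure noise will hand you a “significant” result. This assumes Python with numpy and scipy, and all the study parameters (20 outcomes, 30 subjects per group, 1000 simulated studies) are made up for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

n_studies = 1000   # simulated "studies", each run on pure noise
n_outcomes = 20    # outcome measures tested per study
n_per_group = 30   # sample size per group

false_positives = 0
for _ in range(n_studies):
    # Both groups are drawn from the SAME distribution: no real effect exists.
    a = rng.standard_normal((n_outcomes, n_per_group))
    b = rng.standard_normal((n_outcomes, n_per_group))
    # The p-hacker tests every outcome and reports only the best one.
    p_values = [stats.ttest_ind(a[i], b[i]).pvalue for i in range(n_outcomes)]
    if min(p_values) < 0.05:
        false_positives += 1

print(f"Null studies reporting a 'significant' result: "
      f"{false_positives / n_studies:.0%}")
# With 20 independent tests at alpha = 0.05, roughly
# 1 - 0.95**20, i.e. about 64%, of no-effect studies
# still yield at least one "significant" finding.
```

Run it and almost two thirds of studies of nothing at all come back “successful”. Reward researchers for significance and this is the machine you have built.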