It has been a while since I have had time to read articles, but I am glad I took the time to read ‘Student evaluations of teaching (mostly) do not measure teaching effectiveness’ by Anne Boring, Kellie Ottoboni, and Philip B. Stark. I will just quote some of their conclusions – though I was already aware of the issue, the paper was still thought-provoking!
“Universities generally treat SET [student evaluations of teaching] as if they primarily measure teaching effectiveness or teaching quality. […T]he best evidence so far shows that they do not: they have biases that are stronger than any connection they might have with effectiveness. […] On the whole, high SET seem to be a reward students give instructors who make them anticipate getting a good grade, for whatever reason” (1)
“We find that the association between SET and an objective measure of teaching effectiveness, performance on anonymously graded final, is weak and – for these data – generally not statistically significant. In contrast, the association between SET and (perceived) instructor gender is large and statistically significant: Instructors whom (students believe) are male receive significantly higher average SET. […] We therefore conclude that SET primarily do not measure teaching effectiveness, that they are strongly and non-uniformly biased by factors including the gender of the instructor and student, that they disadvantage female instructors, and that it is impossible to adjust for these biases. SET should not be relied upon as a measure of teaching effectiveness. Relying on SET for personnel decisions has disparate impact by gender, in general.” (2)
Source: Boring, A., Ottoboni, K., and Stark, P. B. (2016). Student evaluations of teaching (mostly) do not measure teaching effectiveness. ScienceOpen Research.