Wednesday, October 17, 2018

Wow! 29 Teams of Analysts, One Identical Data Set, One Identical Research Question, 29 Different Outcomes.

This incredible paper, Many Analysts, One Data Set: Making Transparent How Variations in Analytic Choices Affect Results, saw 61 analysts (in 29 teams) given the same data set to address the same research question: are soccer referees more likely to give red cards to dark-skinned players than to light-skinned players?

The outcomes?

Twenty teams found a statistically significant positive effect, while 9 teams did not, and despite all teams working from the same data set, effect sizes ranged (in odds-ratio units) from 0.89 to 2.93 (where 1.0 would mean no effect at all).
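For anyone rusty on what those odds-ratio units mean, here's a minimal sketch in Python using entirely made-up counts (not the paper's data): it builds a hypothetical 2x2 table of red cards by skin tone and computes the odds ratio, which is why a value of 1.0 corresponds to no effect.

# Illustrative only: hypothetical counts, not taken from the study.
dark_red, dark_no_red = 30, 970      # hypothetical dark-skinned players: red card vs. no red card
light_red, light_no_red = 15, 985    # hypothetical light-skinned players: red card vs. no red card

odds_dark = dark_red / dark_no_red       # odds of a red card for dark-skinned players
odds_light = light_red / light_no_red    # odds of a red card for light-skinned players

odds_ratio = odds_dark / odds_light
print(round(odds_ratio, 2))  # about 2.03 with these made-up numbers; 1.0 would mean identical odds

So a team reporting 2.93 estimated that dark-skinned players faced nearly triple the odds of a red card, while a team reporting 0.89 estimated slightly lower odds, all from the same data.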

Why so many differences?

Because results depend a great deal on a team's chosen analytic strategy, which in turn is shaped by their statistical comfort and choices, and by how those interact with their pre-existing working theories.

Now these results weren't incentivized examples of p-hacking. The authors of this study point out that the variability seen was based on "justifiable, but subjective, analytic decisions", and while there's no obvious way to ensure a researcher has chosen the right methodology for their study, the authors suggest that,
"transparency in data, methods, and process gives the rest of the community opportunity to see the decisions, question them, offer alternatives, and test these alternatives in further research".
Something all the more important in cases where authors might in fact have biases that would incentivize them to favour a particular outcome, and why I wish I'd been offered more in the way of stats and critical appraisal in medical school (and maybe less in the way of embryology, for instance).

[Photo by Timur Saglambilek from Pexels]