Using judgement when rating cricketers

Yesterday Zak Crawley scored 171*. 

Before the West Indies series I said “he would be a bold and wrong selection, going against the publicly available data… If he succeeds, I will give them credit.” Time for me to issue a credit to the England selectors.

Moving swiftly on: with Crawley and Ben Stokes outperforming their ratings, it’s time for a rethink. There’s qualitative data I’ll start using: the judgement of others. We’re in uncharted waters; we’re Going Beyond Stats.

I appraise batsmen based on four years of data (six if they rarely play). It’s a stable method. I don’t have meaningless volatile opinions (like a certain former England captain). There are limitations though. Injuries, new techniques, changing roles can all have an effect. My ratings system is too slow (look at Bairstow – his stellar 2016 is still boosting his rating).

Adding the 171* to my database would boost Crawley’s expected Test average by three runs (going from 26 to 29). Is that enough? Is he better than that?

Rob Key previously said “Crawley is young and his numbers will improve. You just have to watch him bat to know that.” Actually I can’t: I don’t trust my eye. But if people I respect are saying a player is better than their stats, maybe there’s something in that.

With a base average of 29, your chances of 171* (in the first innings, at home, against a strong attack) are around one in 150. In other words, yesterday shouldn’t have happened. Much more likely for a player with a base average of 35 – it’s a one in 60 shot.
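One way to sanity-check odds like these is a toy constant-hazard model: a batsman averaging `a` survives each run with probability a/(a+1), so the chance of reaching 171 is (a/(a+1))^171. This is my assumption for illustration, not necessarily the model behind the figures above, which also fold in first innings, home conditions and attack strength (the raw toy model gives somewhat longer odds):

```python
def p_reach(score: int, avg: float) -> float:
    """Probability of reaching `score` under a constant-hazard (geometric)
    model, where a batsman averaging `avg` survives each run with
    probability avg / (avg + 1)."""
    return (avg / (avg + 1)) ** score

# Compare the two base averages discussed above.
for avg in (29, 35):
    p = p_reach(171, avg)
    print(f"avg {avg}: 1 in {1 / p:.0f}")
```

Whatever the exact odds, the structure of the argument is the same: the higher base average makes the observed innings several times less surprising.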

After a finite number of innings there is uncertainty about how good a player is. I can tell you roughly how good a player has been, give or take a bit. Luck is a factor, more so the less data you have. Error bars give a range of possible ratings for a player, with a 95% chance their true level of ability lies somewhere in that range.
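To make the error bars concrete: if individual scores are roughly as spread out as the average itself (a reasonable assumption for batting, where score distributions are close to exponential), the 95% interval shrinks with the square root of the number of dismissals. A sketch under that assumption:

```python
import math

def error_bar(avg: float, dismissals: int) -> float:
    """Half-width of an approximate 95% interval, assuming the standard
    deviation of individual scores roughly equals the average."""
    return 1.96 * avg / math.sqrt(dismissals)

# An average of 29 over 83 dismissals: close to the +/- 7 used for Crawley.
print(round(error_bar(29, 83), 1))
```

The exact constant matters less than the shape: quadruple the dismissals and the error bars halve.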

Let’s jam those two concepts together to add a third step to my ratings framework:

  1. Rate the player based on their stats.
  2. Add error bars to that rating, based on the number of dismissals.
  3. NEW: Adjust the rating within the plausible range based on the views of people you trust.

Oh, and “the views of people you trust” can include selectors. For instance, if someone has rubbish data but is batting at four, then the selectors are implicitly telling us that person is better than their stats. We should use that information.

What does that mean for Crawley?

  1. Rating 29
  2. Margin of error +/- 7 (after 83 dismissals in the last three years).
  3. Boost the rating by 5.0 to reflect the excitement around him and the fact that England choose to bat him at three.

Crawley is now expected to average 34 in Tests. This recognises that he is probably at the upper end of the range of plausible averages (22 to 36) because he is trusted to bat at three, and is highly regarded.
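The three steps can be sketched as one function: take the stats rating, form the plausible range from the error bars, then apply the judgement boost but never step outside that range. The function name and the clamping rule are my framing of the steps above; the inputs are the figures quoted in this piece:

```python
def judged_rating(stat_rating: float, half_width: float, boost: float) -> float:
    """Steps 1-3: clamp the judgement-boosted rating to the range of
    plausible averages implied by the error bars."""
    lo, hi = stat_rating - half_width, stat_rating + half_width
    return min(max(stat_rating + boost, lo), hi)

print(judged_rating(29, 7, 5))   # Crawley: 34, inside 22 to 36
print(judged_rating(40, 12, 4))  # Stokes: 44, inside 28 to 52
```

The clamp is the important design choice: judgement can move a rating around within what the data allows, but it can never push the rating somewhere the data says is implausible.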

I will do the same exercise with all county cricketers to redefine their red ball ratings to incorporate the role they are given. However, there are about to be several meaningless games in the Bob Willis Trophy, so that may have to wait for next year.

I’ll also do that with England’s Test batsmen. For instance, Stokes at 40 +/- 12 might be re-rated at 44 (based on his two-year record and pundit judgement).

One innings doesn’t change everything, but I’ve seen enough outliers to want to try something new. Where does this leave us, now that the data is being tinkered with and we lose the safety of being moored to pure stats? I draw a parallel with the astronomers who kept the Earth at the centre of the solar system even as more data made that harder to believe: models became ever more contorted to fit the data, before being binned. Is that happening here? Time will tell. #OnOn.