Adjusting averages (Lyon vs Ashwin)

Pick a big enough sample size and conditions should average out… At least that’s what I’d always assumed.

Let’s disprove that conceptually and then with numbers.

Is it fair to directly compare the averages of these two players? Bowler A plays his home Tests in Sri Lanka, ending his career with a bounty of Bangladesh wickets. Bowler B is part of a four-man attack, bowls a lot against the top order, and plays his home games in Australia. He never gets a sniff of the tail – fewer than 10% of his wickets are batsmen averaging 10 or less.

No matter how big the sample size, A and B aren’t on a level playing field; there is a bias in favour of A.

How would we expect A or B to perform against a batsman that averaged 30, on an average pitch? Our best bet is to adjust their stats for:

  1. Ground
  2. Batsmen bowled to
  3. Innings number
  4. Specific pitch condition
  5. Ball age (maybe)
  6. Match situation (eg. Team playing for draw / declaration)

Here, I’ll do the first two, looking at the spinners with 50 or more wickets over the last four years. I’ll assume that factors 3-6 average out over a career.

For “Ground” take a weighted average of spinners’ averages at the stadia where each bowler has taken wickets (ie. Mehidy Hasan Miraz’s 29 wickets at 19 at Dhaka are still valuable, but worth more like 29 wickets at 21).
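To make the “Ground” step concrete, here is a minimal sketch of one way it could work. It is not necessarily the exact calculation behind the numbers in this post. The assumptions are mine: grounds are weighted by the bowler’s wickets there, the adjustment is a simple ratio against an overall spin average, and the figures of 30 (spinners at Dhaka) and 33 (spinners everywhere) are placeholders chosen only to reproduce the “19 becomes roughly 21” example.

```python
def ground_factor(wickets_by_ground, spin_avg_by_ground, overall_spin_avg):
    """Ratio above 1 means the bowler's grounds were kinder to spin than usual.

    Assumption: each ground is weighted by the wickets the bowler took there.
    """
    total_wickets = sum(wickets_by_ground.values())
    weighted_ground_avg = sum(
        wkts * spin_avg_by_ground[ground]
        for ground, wkts in wickets_by_ground.items()
    ) / total_wickets
    return overall_spin_avg / weighted_ground_avg


def ground_adjusted_average(raw_avg, factor):
    """Scale the raw average by the ground factor."""
    return raw_avg * factor


# Placeholder inputs, NOT real data: spinners average 30 at Dhaka, 33 overall.
wickets_by_ground = {"Dhaka": 29}
spin_avg_by_ground = {"Dhaka": 30.0}

factor = ground_factor(wickets_by_ground, spin_avg_by_ground, overall_spin_avg=33.0)
adjusted = ground_adjusted_average(19.0, factor)
print(round(adjusted, 1))  # 20.9, i.e. 29 wickets at roughly 21
```

The wickets themselves are untouched; only the cost of each one moves, which is why spin-friendly home grounds push an average up rather than wiping the wickets out.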

For “Batsmen bowled to”, each run conceded is worth one run, but each wicket is awarded a value based on who was dismissed, so getting Virat Kohli gives you more credit than Ishant Sharma.
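A rough sketch of that wicket weighting follows. The linear scaling and the reference average of 30 (borrowed from the “batsman that averaged 30” yardstick earlier) are my assumptions for illustration, and the batting averages in the example are invented rather than taken from any scorecard.

```python
REFERENCE_AVG = 30.0  # assumed yardstick: a wicket of a 30-average batsman counts as 1.0


def wicket_value(batsman_avg, reference_avg=REFERENCE_AVG):
    """Credit for one dismissal, scaled linearly by the batsman's average."""
    return batsman_avg / reference_avg


def opposition_adjusted_average(runs_conceded, dismissed_batsman_avgs):
    """Runs count at face value; wickets are replaced by their summed credit."""
    credit = sum(wicket_value(avg) for avg in dismissed_batsman_avgs)
    return runs_conceded / credit


# Invented example: 120 runs conceded for four wickets, mostly top-order batsmen.
dismissed = [50.0, 40.0, 35.0, 10.0]
raw = 120 / len(dismissed)                             # 30.0
adjusted = opposition_adjusted_average(120, dismissed)
print(round(raw, 1), round(adjusted, 1))
```

A bowler who mostly removes the tail would see the opposite effect: the credit sum drops below the wicket count and the adjusted average rises.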

Data is the four years to 17th Feb 2021. Note that a positive adjustment to a bowler’s average is bad; a negative one is good.

The mean adjustment is really interesting: it increases spinners’ averages by 4%. This indicates that just looking at raw averages flatters spinners. Why is this? I think it’s a function of when spinners bowl. If they don’t get much action in the first 30 overs, three wickets will already be down. Thus they’ll disproportionately dismiss the (weaker) lower middle order.

Lyon vs Ashwin

The similarity of their adjusted records looks striking when compared to raw averages. Let’s take a closer look and see if it stacks up.

Firstly, who they dismiss:

It’s not like Ashwin is getting an easy ride, but 30% of Lyon’s wickets come against batsmen who’ve averaged over 40, while for Ashwin that figure is 20%.

Secondly, where they bowl: Ashwin plays on a mix of pitches, while Lyon has taken over half his wickets at grounds where spinners traditionally struggle.

Overall, Lyon has done amazingly well to average under 30 over the last four years given where he has bowled and to whom.

Other observations

While Ravi Jadeja’s raw average of 24.6 flatters him, he’s still right up there after adjustment.

Moeen Ali can feel aggrieved not to be ahead of Dom Bess as England’s second spinner.

Roston Chase is better than his average would say – but with relatively little data the error bars get large (60 wickets means his rating is 37 +/- 5).
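For a feel of where an error bar of that size can come from, here is a back-of-envelope sketch. It assumes the runs conceded per wicket are spread roughly like an exponential distribution (standard deviation about equal to the mean), so the standard error of an average is about the average divided by the square root of the wickets. That is my simplification, not necessarily how the figure above was derived.

```python
import math


def average_standard_error(bowling_avg, wickets):
    """Approximate standard error, assuming sd of runs-per-wicket ~ the mean."""
    return bowling_avg / math.sqrt(wickets)


print(round(average_standard_error(37, 60), 1))   # 4.8, close to the +/- 5 quoted
print(round(average_standard_error(37, 240), 1))  # 2.4: four times the wickets halves it
```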

Nathan Lyon is the best current spinner – we adjust his average down by 11%, of which 8% comes from where he plays. He also gets a boost from who he bowls to: as part of a four-man attack, Lyon does feature more against the top order.

Where do we go with this? Extending this to pace bowlers is harder, as strictly one should adjust for when in the innings they bowl (the new ball is helpful). This would need a model of wicket and run probability by ball bowled, and then a comparison of each player’s actual results with what the average player would achieve.
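As a sketch of what that comparison might look like, assume we already had per-ball baselines for an average bowler, bucketed by ball age. Everything here is hypothetical: the buckets, the probabilities and the function names are placeholders to show the shape of the calculation, not fitted values.

```python
# Hypothetical per-ball baselines for an average bowler, by age of ball (overs).
BASELINE = {
    "0-20": {"p_wicket": 0.020, "runs": 0.50},
    "20-50": {"p_wicket": 0.014, "runs": 0.52},
    "50-80": {"p_wicket": 0.012, "runs": 0.55},
}


def expected_average(balls_by_bucket, baseline=BASELINE):
    """Average an ordinary bowler would be expected to record from these balls."""
    exp_runs = sum(n * baseline[b]["runs"] for b, n in balls_by_bucket.items())
    exp_wkts = sum(n * baseline[b]["p_wicket"] for b, n in balls_by_bucket.items())
    return exp_runs / exp_wkts


def performance_ratio(actual_runs, actual_wickets, balls_by_bucket):
    """Below 1 means the bowler beat what the average player would achieve."""
    actual_avg = actual_runs / actual_wickets
    return actual_avg / expected_average(balls_by_bucket)


new_ball = expected_average({"0-20": 600})
old_ball = expected_average({"50-80": 600})
print(round(new_ball, 1), round(old_ball, 1))
```

With these placeholder numbers a bowler who only ever gets the new ball is expected to average far less than one who only bowls with the old ball, which is the bias the adjustment would need to remove.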

PS. This would be easy to check… if you had CricViz data. Expected averages would tell the story. Especially comparing head-to-head for the games in which both Lyon and Ashwin played. And splitting LHB and RHB so there was no bias driven by matchups.

The Ashes: A tale of two spinners

I wrote an Ashes preview. It was boring. You won’t be subjected to it. Fortunately, when researching that I noticed a strange feature of Nathan Lyon’s bowling: he is great in the first innings of a Test.

At the time of writing it’s unclear whether we’ll see Moeen Ali vs Nathan Lyon as the opposing spinners in the 2019 Ashes – Ali’s batting has been poor of late, so it’s hard to justify his selection. Easier to make seam-friendly wickets and neutralise Lyon. Career averages show why that’s tempting:

Fig 1 – Nathan Lyon and Moeen Ali’s Test bowling records (as at 30/7/19)

That data masks two things. Firstly, since 2017 both bowlers average 29. Secondly, and more interestingly, it hides how they perform through a match.

Fig 2 – Lyon (Yellow Triangle) and Ali (Green Square) by Innings of the match. Axes are the same in all four charts.

Let’s walk through that quartet of charts. In the first innings, Nathan Lyon is about as good as it gets. An average of 32 is 11 runs per wicket better than the average for all spinners. He’s right up there with Ashwin & Jadeja. Moeen Ali is, frankly, awful, averaging 16 more runs per wicket than Shane Shillingford. That Green Square is poles apart from Lyon’s Yellow Triangle.

Through the second and third innings, Nathan Lyon stubbornly refuses to improve. The chasing pack catches him, then outshines him by the third innings. Ali is comparable with him at that point (and within touching distance of the rest).

Now it gets weird. If anything, Lyon is worse in the fourth innings. A bowling average of 34 is now ten runs worse than that for all spinners since 2010. The control is still there, as his economy rate is unaffected. The sample size is fine (58 wickets in the fourth innings). Odd.

Meanwhile, the fourth innings is Ali’s playground. With 59 wickets at 22, he’s right up there with the big boys. Go Green Square, go!
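If you want to reproduce this kind of split for another bowler, a minimal sketch is below. The input format (one tuple per bowling innings) is my assumption, and the four spells are made up purely to show the arithmetic. The one thing to get right is to total runs and wickets within each innings number before dividing, rather than averaging the per-match averages.

```python
from collections import defaultdict


def averages_by_innings(spells):
    """spells: iterable of (innings_number, runs_conceded, wickets_taken)."""
    runs = defaultdict(int)
    wickets = defaultdict(int)
    for innings, conceded, taken in spells:
        runs[innings] += conceded
        wickets[innings] += taken
    # Skip any innings number with no wickets, where the average is undefined.
    return {i: runs[i] / wickets[i] for i in sorted(runs) if wickets[i] > 0}


# Made-up spells for illustration only.
spells = [(1, 90, 2), (1, 70, 3), (4, 40, 4), (4, 48, 0)]
print(averages_by_innings(spells))  # {1: 32.0, 4: 22.0}
```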

Let’s end with some practical uses for this, before it becomes pub trivia.

  • Nathan Lyon can be part of a four-man attack for Australia – he can bowl effectively in the first innings, so Australia don’t need to play four quicks to have sufficient firepower early in the match.
  • Moeen Ali shouldn’t bowl in the first innings for England. Stokes can play the role of fourth bowler, and Ali should bowl no more than ten overs per day. Save him for later in the game.

Further Reading

Cricinfo independently noticed this back in 2017 (ie. I haven’t copied them, honest!). Unfortunately for them, they attributed the difference to the Asian continent. That quirk has now been ironed out.