A review of England’s batting options

Eeny meeny miny moe

Anon, Pre-1820

Whinging about selection is part of how I traditionally spend the days leading up to an England Test. It’s my habit, and I’m probably not alone in that.

With the new(ish) England selection panel of Ed Smith, Trevor Bayliss, and James Taylor, whinging about batting selection has been more difficult.

Burns in for Cook? The logical choice. Moeen Ali recalled? Makes sense. Buttler plucked from White Ball obscurity? Not what I would have done (Hildreth or Livingstone), but OK.

Looking for some whinging ammunition ahead of England’s first warm-up game against a West Indies Board XI on 15th Jan*, I did some analysis of England-qualified batsmen. Specifically, I looked at their records over the last 3 years in all Red Ball Cricket (Test to 2nd XI, adjusted for difficulty).

What I expected to see was a clear hierarchy of players, with some of my favourites at the top, and England’s sub-optimal picks somewhere down the list. Actually, the selectors’ choices are supported by the data, and England have a big group of players who are of very similar abilities.

Below I’ve grouped players by expected Test average, based on the last 3 years:

World Class (Expected Average 42+) – Root & Bairstow

Test Regulars (Expected Average 35-42) – Pope, Burns, Ali, Stokes

Plausible Selections (Expected Average 30-35) – Stoneman, Roy, Buttler, Westley, Wells, Jennings, Livingstone, Gubbins, Brown, Ballance, Foakes, Clarke, Hales, Denly, Woakes, Duckett.

Wildcards (Data says Expected Average >30, but reasons to be suspicious) – Northeast: mostly driven by 2016 scores in Division 2, with a poor run at Hampshire lately. Hughes: scored 425 runs for 3 dismissals in 2nd XI cricket over the last 3 years, but didn’t play a first class game in 2018 and only made 209 runs at 23 in the 2018 North Staffs Premier League, so it’s probably safe to rule him out of Ashes contention.

Conclusion:

From a batting perspective, England have chosen well. They’ve picked all the World Class and Regular players (apart from Pope, who only has 32 completed innings, and is on the fringes of the squad). All their other batsmen are from the Plausible Selections bucket. England have a lot of Plausible Selections; it doesn’t really matter which of them they pick. Dropping Buttler for Hales would be worth about 4 runs over the course of a Test. As long as the selectors keep picking players that are amongst the best available, I’ll cut them some slack.

Other Discoveries:

  • England’s batting is weaker than at the start of the decade. England were spoiled by a team with 7 batsmen who averaged over 40 – like the side that beat South Africa by an innings in Durban in 2009. Pragmatically, they now use 2 or 3 all-rounders (Stokes, Ali, Woakes) and often use 8 batsmen to do the job that 7 did at the start of the decade.
  • A number of players have been tried who currently average under 30 in Tests: Stoneman, Westley, Jennings, Duckett, Hales, Pope. This analysis indicates that these were good selections, and much of the underperformance is due to chance. An example: Stoneman averaged 28 in 11 Tests, against an expectation of 34. But 11 Tests is a small sample size, and 7 of those Tests were away, including an Ashes series.
  • Bairstow is one of England’s two best batsmen. Dropping him would be an error.

*England’s Squad to tour the West Indies (Batsmen only):

Joe Root (Yorkshire) (captain), Moeen Ali (Worcestershire), Jonny Bairstow (Yorkshire), Rory Burns (Surrey), Jos Buttler (Lancashire), Joe Denly (Kent), Ben Foakes (Surrey), Keaton Jennings (Lancashire), Ben Stokes (Durham), Chris Woakes (Warwickshire)

Explaining the Underperformance of Overseas batsmen in County Cricket

At Globogym we’re better than you. And we know it!

Dodgeball – 2005

In last week’s blog, the data showed how poorly some overseas players performed in First Class cricket compared with their Test performances.

Surprisingly, overseas players perform 21% worse in Division 1 than their Test averages would suggest. Contrast that with England players, who do 28% better. Two examples jump out: Pujara scoring 172 runs at 14, and Kane Williamson scoring 260 runs at 26. How can we explain those scores?

As there have been only 20 non-England Test players in Division 1 over the last three years, the sample size is too small for meaningful analysis. To get more insight, I’ve combined Division 1 and Division 2, which increases the sample size to 331 completed innings. I then found 3 factors which influence performance:

  • SA / NZ / Australian players outperform other nations (probably because these are the countries with conditions most similar to those in England).
  • Test players will average more in Division 2 than Division 1.
  • Top order (1-3) batsmen are most affected by English conditions (this makes sense – they will face lengthy spells against the best County bowlers with the ball swinging and seaming more than they are used to). Middle order players (numbers 4-7) are unaffected, while tailenders get a boost to their average.

I created a model to quantify this behaviour, combining these factors. The best fit to the data is as follows:

  • SANZAR +10%, others -10%
  • Top order -25%, Middle order +3%, Lower order +25%
  • Division 2 +10%
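As a rough illustration, the factors above can be applied to a Test average in a few lines of Python. This is a minimal sketch, not the model itself: the function name and dictionary keys are hypothetical, and it assumes the three adjustments are multiplied together, which the post does not state.

```python
# Percentage adjustments quoted in the post, expressed as multipliers.
NATION = {"SANZAR": 1.10, "other": 0.90}
POSITION = {"top": 0.75, "middle": 1.03, "lower": 1.25}
DIVISION = {1: 1.00, 2: 1.10}


def expected_county_average(test_average, nation, position, division):
    """Scale a Test average to an expected County Championship average.

    Assumption: the three adjustments are multiplied together; the post
    does not say how they are combined.
    """
    return test_average * NATION[nation] * POSITION[position] * DIVISION[division]


# A non-SANZAR top order batsman with a Test average of 48, in Division 1.
example = expected_county_average(48, "other", "top", 1)
print(round(example, 1))
```

Under that multiplicative assumption, a Test average of 48 comes out at roughly 32 in Division 1. The published figures may include adjustments that this sketch does not reproduce.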

Applying this makes Pujara’s performance less of an outlier, and more a function of being a number 3, and therefore the wrong type of overseas batsman to go for. Using my model, his expected average in D1 is just 36, and while he underperformed this, it’s no longer an outlier. Similarly, Azhar Ali (Test Avg 48) would be expected to average 33, and averaged 34.

But – the current iteration of the model has arbitrary cut-offs (why should a number 4 outscore a number 3 by 25%?) and the above table has a high standard deviation. I’ll enhance it once it can be tested against 2019 data.

What the current model can do is make predictions:

Poor 2019 Overseas Player selections

Azhar Ali will be playing for Somerset next season. He’ll be 34 by then, and will be expected to average 30. I hope they aren’t paying him too much. Next season could be the one where Somerset’s batting frailty bites.

Bancroft at Durham and Joe Burns at Lancashire should struggle at the top of the order.

Top 2019 Overseas Player picks

1. S. Marsh had better hope Glamorgan bat him below 3 – he could do well if he avoids the new ball.

2. Temba Bavuma isn’t the strongest Test batsman, but as a 28-year-old he’ll be at or near his peak, and Division 2 cricket with Northamptonshire should suit him. It helps that he doesn’t start until 14th May.

3. Bowlers! Abbas, Worrall, and Siddle should be far more valuable than top order batsmen. That said, I’ve not done the analysis of bowlers yet. Watch this space.

Test vs County Cricket Averages

“Coach woulda put me in fourth quarter, we would’ve been state champions. No doubt. No doubt in my mind.”

Napoleon Dynamite (2004)

It’s often assumed that we cannot compare Test and first class batting performances – the old comparing ‘apples to oranges’ conundrum. But if we can quantify the relative values of the different formats, we can compare like with like.

Looking at batting performance of players who’ve played across multiple formats in English* domestic cricket (2016-2018), one can assess the relative difficulty of each tier. My analysis found that it’s 19% harder to bat in Test Cricket than it is in Division 1.

If a player averages 40 in Division 1 – the data says you could expect him to average 31 in Test cricket, 44 in Division 2, and 54 in the 2nd XI.

That tells us that you’d need to consistently average over 55 in Division 2 to average 40 in Test cricket – hence so few England players being pulled from those ranks in recent years.

It also means that Hildreth (who I’ve previously thought of as an England option as he averages 41 in Division 1) would be expected to average 32 in Tests, and therefore isn’t the batsman we are looking for.
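The conversions above can be sketched as a simple rescaling between tiers. The snippet below is illustrative only: it takes the four benchmark averages quoted earlier (31, 40, 44, 54) and assumes averages scale in the same proportion at every level, which is a simplification. The names are hypothetical.

```python
# Expected averages for the same batsman in each tier, as quoted in the post.
TIER_BENCHMARK = {"Test": 31, "Division 1": 40, "Division 2": 44, "2nd XI": 54}


def convert_average(average, from_tier, to_tier):
    """Rescale an average from one tier to another using the benchmark ratios.

    Assumption: the ratio between tiers is the same for every batsman.
    """
    return average * TIER_BENCHMARK[to_tier] / TIER_BENCHMARK[from_tier]


# Division 2 average needed for an expected Test average of 40.
needed_in_div2 = convert_average(40, "Test", "Division 2")
print(round(needed_in_div2, 1))  # 56.8

# A Division 1 average of 41 converted to an expected Test average.
hildreth_test = convert_average(41, "Division 1", "Test")
print(round(hildreth_test))  # 32
```

Both results line up with the figures in the text: a little over 55 needed in Division 2, and about 32 in Tests for a batsman averaging 41 in Division 1.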

A few examples of 2016-2018 Division 1 and Test averages:

Note that only Root and Buttler underperformed in Division 1 relative to Test Cricket.

At this point it’s worth going into the assumptions – professionally I’m always keen to show where the data ends and the judgement begins. The data can tell us performances for each player who crosses tiers. Judgement needs to be applied to appraise that data and turn it into a single factor.

Some options:

  • Jonas (@cric_analytics) has looked at a minimum of 10 innings in both competitions – the downside of this is that it excludes valid data points. For instance, Ben Stokes scored 226 @ 28.3 in D1 in the last 3 years – 10 runs below his Test average. That should count towards the total, even if it’s a small sample. Jonas reckoned a 20% gap between Test and County cricket – slightly wider than my data suggests.
  • Include all overlap – the risk is that this is skewed by a few high/low scores from one-test wonders against weak/strong opponents. This gives a mere 2% difference between Test and D1.
  • Overseas players included: this gave an 8% gap between D1 and Test – but playing away from home knocks 10% off batting average, so this is not a fair comparison. To put it another way, Pujara playing for Yorkshire averaged 14, because every game was an away game.
  • I have used relative performance for English players with >4 completed innings in each format, and weighted the overall result according to the lower of the completed innings in each format. For instance, Ben Stokes has played 8 completed D1 innings, but 46 Test innings – so the overall result is weighted with a factor of 8 because of Stokes’ performances, while Dawid Malan played 36 D1, 26 Test innings, so is more useful for this exercise and receives a weighting of 26.
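The weighting in the last option can be sketched as follows. This is a hedged illustration rather than the actual calculation: it assumes "relative performance" means each player's Test average divided by his Division 1 average, and every average in the sample data is invented. Only the innings counts in the first two rows (8 and 46, 36 and 26) come from the text.

```python
def weighted_tier_ratio(players, min_innings=5):
    """Weighted mean of Test / Division 1 average ratios.

    Each player is a tuple:
    (d1_average, d1_completed_innings, test_average, test_completed_innings).
    Players need more than 4 completed innings in each format, and each
    ratio is weighted by the smaller of the two innings counts.
    """
    total = 0.0
    weight_sum = 0
    for d1_avg, d1_inns, test_avg, test_inns in players:
        if d1_inns < min_innings or test_inns < min_innings:
            continue  # too little data in one of the formats
        weight = min(d1_inns, test_inns)
        total += weight * (test_avg / d1_avg)
        weight_sum += weight
    return total / weight_sum


# Averages below are made up for illustration only.
sample = [
    (30.0, 8, 36.0, 46),   # weight 8
    (40.0, 36, 30.0, 26),  # weight 26
    (50.0, 3, 20.0, 12),   # excluded: only 3 completed Division 1 innings
]
ratio = weighted_tier_ratio(sample)
print(round(ratio, 3))
```

Weighting by the smaller innings count means a player with a long Test career but only a handful of Division 1 innings cannot dominate the result.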

Adjusting for the level individuals are playing at allows comparison of players in different tiers. In future posts I’ll look at some implications of this data:

  1. 2nd XI players with the potential to be First Class batsmen
  2. England’s best available batsmen
  3. Overseas players: who has & hasn’t succeeded – will look at any trends in the data.
  4. It’ll take more number crunching, but I’m interested in linking First Class and List A performance to see how well correlated they are, and using that to gauge the quality of players for whom limited data is available (there are a lot of players with a handful of FC games behind them – too few completed innings to fairly appraise them).

*I know it’s English and Welsh. Sorry Glamorgan. There isn’t an easy word for English and Welsh, so I’ll use English as shorthand for English and Welsh.

The Journey Begins

Thanks for joining me!

“You’d better listen to her, because the Pentagon does”

Top Gun (1986)

A bit about me before I get into the numbers:

It’s easy to have an opinion, and particularly easy to broadcast that view online. Filtering out the noise is a challenge.

So why should anyone care what I think about cricket?

Well, my CV for starters: a Master’s degree in Physics from Oxford (the 4th year was focused on simulations of Earth’s atmosphere), then I qualified as an accountant, spent 2 years in Banking Front Office (where I cut my teeth on Excel modelling), and after a further role in Banking Finance I’m now working for a FTSE-100 retailer, doing modelling and strategy.

It’s not quite the Pentagon, but you should listen to me, because some people at a FTSE-100 retailer do.

In 2011 I built a test match simulator – which could predict the outcome of an innings from a given starting point, based on ball by ball bowler vs batsman probabilities, and running the simulated innings enough times to get a reasonable sample (>1,000). This was mainly for gambling, and it works.
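For readers unfamiliar with the approach, here is a toy version of that kind of simulation. It is not the simulator described above: the per-ball probabilities are invented, one set of probabilities is used for every batsman and bowler, and all names are hypothetical.

```python
import random

# Hypothetical per-ball outcome probabilities for a single matchup.
# "W" is a wicket; the numbers are runs scored off the ball.
OUTCOMES = ["W", 0, 1, 2, 4, 6]
WEIGHTS = [0.02, 0.70, 0.15, 0.05, 0.07, 0.01]


def simulate_innings(rng, wickets_in_hand=10, max_balls=900):
    """Play one innings ball by ball and return the total runs scored.

    The starting point is set by wickets_in_hand and max_balls.
    """
    runs = 0
    for _ in range(max_balls):
        outcome = rng.choices(OUTCOMES, weights=WEIGHTS)[0]
        if outcome == "W":
            wickets_in_hand -= 1
            if wickets_in_hand == 0:
                break  # all out
        else:
            runs += outcome
    return runs


def simulate_many(n_sims=1000, seed=1):
    """Repeat the innings n_sims times with a seeded generator."""
    rng = random.Random(seed)
    return [simulate_innings(rng) for _ in range(n_sims)]


totals = simulate_many()
mean_total = sum(totals) / len(totals)
print(round(mean_total))
```

A real model would vary the probabilities by batsman, bowler and match situation. The loop structure is the point here: simulate ball by ball, then repeat enough times for the distribution of totals to settle.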

Later I expanded this to cover the two white ball formats, though the 50 over model has always received more attention than the 20-20 one – I don’t mind 20-20, but I struggle to love it.

With a full-time job and a young family, cricket data comes third on the list – and that means I will focus on red ball cricket. There are a lot of professionals who have got further than me in 20-20, and I’m not going to stand out by splitting my efforts across 3 formats.

Let’s see if I can come up with some original thoughts, and some predictions which stand the test of time.

Ed Bayliss, Dec 2018.