One of the benefits of Twitter is hearing new ideas. Jonas (@cric_analytics) has suggested the third innings should pause when the lead reaches 300, then the fourth innings takes place.
That way, a team that’s winning doesn’t have to pointlessly bat until the lead is over 500, before crushing an inferior opponent. Here’s how Jonas puts it:
I’ve modelled how this would work in practice, with the aim of answering two questions:
Does this make the strong team more likely to win? (Probably)
Is the game over sooner? (Generally)
Here’s the summary from the single scenario I looked at:
Scenario: West Indies vs England, Bridgetown.
England have batted first and scored 360. West Indies slipped up and were bowled out for 210. We join the action at lunch on day three. England lead by 150. Two versions of this were modelled: one under the existing laws, and one where England temporarily declare the third innings once they have scored 150 more.
Let’s see what happens:
In 92% of cases England made it to 150 without being bowled out – and so, with a lead of 300, temporarily declared
West Indies scored under 300 83% of the time – so the third innings did not need to re-commence
When the West Indies scored more than 300, sometimes the game meandered to a bore draw because the West Indies couldn’t confidently declare
Here’s the distribution of match end times depending on which rules apply:
We can see that there’s a big shift towards Day 4 finishes under compulsory declaration at 300 – mainly from the team batting fourth being bowled out for less than 300.
Worth noting the result wasn’t significantly affected by the rules being used. This would be different in other scenarios – such as if there was less time in the game.
Conclusion – This could be very useful in county cricket (where matches are only 4 days long). Suggest more modelling is required (especially scenarios where the odds are shifted from the draw being favourite to a result being favourite). A trial in County Championship Division 2 would be fascinating.
West Indies can beat England against the odds, but they’ll need their pace bowlers to perform.
The blueprint – Bridgetown 2015. 1-0 down in the series, with a first innings deficit of 68, the West Indies were about to be batted out of the Test. Hearing a wicket fall, a reveller in the Party Stand asked “Was that Trott or Cook?” and was baffled to learn that it was in fact Root, and England were 28-4. The new ball had done the damage, and by the time 20 overs had been bowled it was 39-5 and the game was back in the balance.
West Indies were eventually set 192. Darren Bravo marshalled the batsmen to the target with five wickets in hand. The hosts had accrued only three scores over 30 in the Test, but somehow pulled off an unlikely victory, and drew the series 1-1.
With that surprise firmly in mind, let’s make some informed predictions for the upcoming series.
1) One spinner is the right choice. This decade the average is 32 for spinners, 26 for pace bowlers. It may be that pitches are turning more than they used to, and it’s true that spinners get 37% of wickets in the Caribbean, but this turn hasn’t delivered cheaper wickets. That said, if a team can reliably judge a pitch as more spin friendly than the average West Indian pitch, then they should go with two spinners – selectors just need to be sure there will be more in the pitch for spinners than quicks before making that decision.
2) West Indies’ best chance will come if their fast bowlers can keep England under 225 in one innings. Turning pitches or not, the West Indies have no elite spinners. If they are going to win this series it will be through devastating fast bowling.
They are unlikely to amass buckets of runs – so Holder’s bowling unit needs to neutralise England’s batting. Specifically, if England score fewer than 225 in one innings, that sets up a target within the range of the West Indian batting.
Taking all factors into account, modelling suggests the probabilities for the first test are: 24% WI. 7% Draw. 69% Eng.
West Indies will probably lose: their batting and spin bowling is inferior to England’s. But if we’ve learned anything from the 2015 series, it’s that home advantage is real, and the new ball could do some serious damage, leaving mystified England supporters to ask “was that Burns or Jennings?” as Stokes returns to the pavilion.
Clive (@vanillawallah) was looking at Kohli’s scores in ODIs since the last World Cup, suggesting that:
Kohli is consistent
He succeeds more than he fails
To check this, I compared Kohli’s performances against what my model would expect him to do – Kohli’s run ranges are broadly in line with what you would expect given his average. His consistency is a consequence of his ability, rather than a specific trait of his batting.
I modelled 1,000 innings for Kohli batting at 3 for India, with an assumed average of 95 (his average over the last 54 games / 3 ½ years).
The results show slightly more single figure scores in the real world vs model, offsetting slightly fewer scores in the teens. This is likely due to small sample sizes.
Two interesting observations:
In a quarter of innings he would (and did) score a hundred. Phenomenal.
The run distribution is skewed towards the 30-50 range by Kohli running out of time – caused by India successfully chasing down targets and the match ending while he is mid-innings.
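As a rough cross-check on those score ranges, here is a minimal sketch of the simplest possible innings model: every run carries the same chance of dismissal, so scores follow a geometric distribution. This is an assumption for illustration, not the model used above, and the function names are hypothetical. It ignores innings that end not out, so it gives about a 35% chance of a hundred from an average of 95, higher than the quarter reported above, which fits with the point about Kohli running out of time in successful chases.

```python
def dismissal_prob_per_run(average: float) -> float:
    """Per-run dismissal probability whose geometric mean score equals `average`."""
    return 1.0 / (average + 1.0)


def prob_score_at_least(average: float, runs: int) -> float:
    """P(score >= runs) when every run carries the same dismissal risk."""
    return (1.0 - dismissal_prob_per_run(average)) ** runs


def prob_score_in_range(average: float, low: int, high: int) -> float:
    """P(low <= score <= high) under the same constant-risk assumption."""
    return prob_score_at_least(average, low) - prob_score_at_least(average, high + 1)


kohli_average = 95  # the assumed average quoted in the post
p_hundred = prob_score_at_least(kohli_average, 100)
p_single_figures = prob_score_in_range(kohli_average, 0, 9)

print(f"P(100+): {p_hundred:.1%}")
print(f"P(0-9):  {p_single_figures:.1%}")
```

The same two functions can be pointed at any run range to compare against real-world score buckets.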
Rest of the Top 3
Clive also pulled in data on all other top 3 ODI batsmen since the last World Cup. This is a much larger sample size, and it is worth checking the distribution as a way of verifying my modelling.
Simulating 1,000 innings with two openers: one of whom averages 35, one of whom averages 45 reasonably reflects the real world distribution of scores that Clive showed.
– The real world having more low scores (probably from the times when weaker openers have been selected)
– More hundreds modelled than seen.
P.S. Appreciate this is White Ball ODI Cricket rather than Red Ball Data. Don’t tell the Branding Police.
In this post I consider the evolution of England’s batting – how it steadily improved through the 2000s, peaked in 2010-11 (as England became World Number 1), tailed off from 2013, and is only recently recovering as we enter 2019.
I took the career averages of the top 7 batsmen for each England Test since 2000, and adjusted them for the age of the batsmen (I’ll cover how I do that in a later post). To eliminate artificially low results, Nightwatchmen are excluded. Where someone only played a few Tests, I made a judgement about what their long term average would have been had they played more Tests.
To bring out the trend, the chart above is smoothed with a moving average of the last five Tests.
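For anyone wanting to reproduce the smoothing, a trailing five-Test moving average can be sketched as below. The function name and the sample figures are made up for illustration, and how the first four Tests are handled (here, averaging whatever is available) is my assumption.

```python
def trailing_moving_average(values, window=5):
    """Average each value with up to `window - 1` values before it."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)   # shorter window at the start of the series
        chunk = values[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Illustrative team totals only, not the real data behind the chart
team_totals = [218, 230, 225, 240, 250, 262]
print(trailing_moving_average(team_totals))
```

Because each point only looks backwards, the smoothed line lags any real change in batting strength by a few Tests.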
Evolution: 2000 to 2019
Weakest Team: 29th June 2000, vs West Indies (Home) – Age adjusted Average 218
Vaughan, Hick, Stewart, Knight, White
If you don’t want to remember how bad England used to be, I suggest you skip to the next paragraph. Don’t worry, we’ll be talking about 2005 soon enough.
Let’s reel off why this team was the weakest this century: England had no batsmen in the top 10. Over five tests the highest total England could manage was 303. Only Trescothick and Atherton averaged over 29. Ramprakash and Hick never settled at Test level. By 2000 Hick was 34 and Stewart was 35. The weakest link was White – averaging 25 in 30 Tests is not enough for a number 7. The need for “the next Botham” was real.
That England won that series 3-1 was down to Cork, Gough, Caddick and White dominating with the ball rather than England imposing themselves with the bat.
England vs West Indies in 2000 marked a watershed for West Indies cricket: this was their first series defeat in England since 1969. Their record in Tests in England since is W1 D2 L13. England were on the up.
That’s more like it. Five of the fifteen England batsmen in the modern era to average over 40. This was a good batting side (rather than a great one), with room for improvement at 6 and 7. Flintoff was a proper all-rounder – a luxury England had not had for a long time – albeit one with a career batting average of 32. Geraint Jones was carried that summer, averaging 25 (in line with his career average of 24).
Strongest Team: 26th December 2010, vs Australia (Away) – Total Average 305
It’s too soon for many to realise just how good this team was: World no.1 from August 2011 to August 2012, with a strong enough four man bowling attack to confidently play six specialist batsmen.
In the 2010-11 Ashes series six of the top 7 averaged over 40; they accrued nine hundreds in only five tests.
But the side was aging: Collingwood 34, Strauss 33, Pietersen 30. The eldest (Strauss & the retiring Collingwood) needed to be replaced in 2011. As it was, Strauss stayed on for 18 tests, but would pass 100 only twice more, averaging 31 after the 2010/11 Ashes.
As players retired they were, predictably, hard to replace. England were also unlucky in that Pietersen and Trott didn’t go on to play full careers with England.
By March 2014 England’s ICC Test Rating had slumped to 100: they went from best in the world to average in 38 months. In May 2014 they lost a home series against Sri Lanka. The side had some stars (Cook and Root) and some young players picked too early to succeed (Ali and Buttler), while Bell had gone on a bit too long.
Current Team: 23rd Jan 2019, vs West Indies (Away) – Expected Average 268
Not bad, probably the best selections that could have been made, and should be too strong for the West Indies.
It’s important to see this side for what it is: lacking in stars, yet well balanced with three all-rounders. With Ali bolstering the batting at 8, this team are likely to continue the trend of winning at home but losing away against the top 6.
Verifying the Data
To check this model (age adjusted batting average) against reality, I compared this to the ICC rankings. The correlation is clear. Worth noting that since the Age adjusted Batting Average is smoothed using a 5 point moving average, there is a time lag in the orange curve. This correlation is surprising as the ability of the top 7 batsmen makes up less than half of the strength of a team (the remainder being bowling ability and tail batting strength).
The 2019 side are at about the level of the 2005 Ashes side, through having no weak links rather than being packed with world-beating batsmen.
Managers tend to pick the strategy that is least likely to fail, rather than the strategy that is most efficient. The pain of looking bad is worse than the gain of making the best move.
In the last 35 years England have had just 15 batsmen who averaged more than 40 over their career. Expectations should shift: aspire to players averaging 40; accept batsmen averaging 35.
The chart below may surprise you – it surprised me. How could barely any recent English batsman reach the benchmark set for them? Averaging 40 (at least in my head) was a minimum, not an elite average.
The data speaks for itself – 45 isn’t the new 40. 35 is the benchmark, and has been for a long time.
We, the red ball loving hordes (and our journalist generals) need to help the selectors by having realistic expectations.
The selectors should return the favour: stick with players that are good enough, even if they aren’t stars, and even if pundits are piling on the pressure.
Next time someone is 10 tests into their career, averaging 34 and with the data saying they would average 35 long term, let’s not call for a change because they aren’t scoring enough. Only remove them if a better prospect comes along – not someone with similar numbers who we might want to gamble on.
There’s a great case study: Andrew Strauss retired in 2012, and received wisdom is that he is yet to be replaced as an opener. We wanted the next Strauss. We should have been looking for the next Rob Key (15 tests averaging 31 between 2003-2005 while we waited for the next star batsman to come along).
Remember who Carberry got his runs against? An away Ashes series in 2013: Harris, Johnson, Siddle, Lyon, Watson. Those 281 runs were well earned.
With hindsight, pretty much every pick between Robson and Jennings was an error. England had viable alternatives for Strauss 3 times: Compton, Carberry and Robson. Having rejected them, England played people out of position (Trott / Ali) and gambled on youth (Duckett / Hameed) as the next cabs off the rank, moving ever further down the list of possibles.
England chose weaker options because they weren’t willing to settle for a batsman averaging in the low-30s. That cost England runs – and since the selectors are employed to pick the best team possible, this is a failure. One they don’t get criticised enough for. Fear not, dear reader, we know England’s best batting options – and will collectively tut if the selectors deviate from them!
Conclusion: England should hold their nerve, even if Burns and Jennings are only averaging 33 coming into the Ashes.
CricViz now use False Shot Percentages as a metric for assessing batsmen. Most recently they have done this as one factor when considering Australia’s options for the Sri Lanka tour.
A key point is that False Shots and averages are not equivalents – if two batsmen both have a 10% False Shot rate, the more attacking batsman will average more because he will score more runs for each error he makes. One has to combine False Shot Rate and Strike Rate to get a useful metric.
As such, I’ve used the data CricViz published, and overlaid that with First Class Strike Rates to give an expected average derived from False Shot %.
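To make the combination concrete, here is a minimal sketch of one way Strike Rate and False Shot % could be turned into an expected average. Runs per false shot follows directly from the two rates; the number of false shots a batsman survives per dismissal is a made-up constant here (the post does not state one), so the absolute numbers are illustrative and only the comparison between batsmen matters.

```python
def runs_per_false_shot(strike_rate: float, false_shot_pct: float) -> float:
    """strike_rate is runs per 100 balls; false_shot_pct is false shots per 100 balls."""
    return strike_rate / false_shot_pct


def expected_average(strike_rate: float, false_shot_pct: float,
                     false_shots_per_dismissal: float) -> float:
    """Runs scored per false shot times false shots survived per dismissal."""
    return runs_per_false_shot(strike_rate, false_shot_pct) * false_shots_per_dismissal


FALSE_SHOTS_PER_DISMISSAL = 8  # hypothetical constant, for illustration only

# Two batsmen with the same 10% False Shot rate but different tempo
steady = expected_average(50, 10, FALSE_SHOTS_PER_DISMISSAL)
attacking = expected_average(70, 10, FALSE_SHOTS_PER_DISMISSAL)
print(steady, attacking)
```

With identical error rates, the batsman striking at 70 comes out 40% ahead of the one striking at 50, which is the point made above.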
The chart shows that Maxwell leads the options (due to his Strike Rate of >70 runs per hundred balls, combined with a healthy 10.4% False Shot rate). This is interesting because his 3 year Sheffield Shield average was only 43. Worth bearing in mind he isn’t a Red Ball regular, with only 962 runs in the last 3 years.
Handscomb (real world average 50, False Shot average 57) can feel hard-done-by to have missed out on selection. Given he averages 38 in Tests, his omission looks an odd choice.
There is evidence that Pucovski is as good as the hype – CricViz’s data suggesting that not only has he performed well (FC Average 49 after 8 games), but that it isn’t a fluke (v.low False Shots implying he may have been unlucky to average only 49 in those 8 matches). Still, it’s a small sample size.
Conclusions: False Shots combined with Strike Rate are a potentially useful tool in predicting player averages when limited data is available (such as young players). However, more evidence is required of long term correlations before False Shot % and Strike Rate replace averages.
Whinging about selection is part of how I traditionally spend the days leading up to an England Test. It’s my habit, and I’m probably not alone in that.
With the new(ish) England selection panel of Ed Smith, Trevor Bayliss, and James Taylor, whinging about batting selection has been more difficult.
Burns in for Cook? The logical choice. Moeen Ali recalled? Makes sense. Buttler plucked from White Ball obscurity? Not what I would have done (Hildreth or Livingstone), but OK.
Looking for some whinging ammunition ahead of England’s first warm up game against a West Indies Board XI on 15th Jan*, I did some analysis of England qualified batsmen. Specifically, their records in the last 3 years of all Red Ball Cricket (Test to 2nd XI, adjusted for difficulty).
What I expected to see was a clear hierarchy of players, with some of my favourites at the top, and England’s sub-optimal picks somewhere down the list. Actually, the selectors’ choices are supported by the data, and England have a big group of players who are of very similar abilities.
Below I’ve grouped players by expected Test average, based on the last 3 years:
World Class (Expected Average 42+) – Root & Bairstow
Test Regulars (Expected Average 35-42) – Pope, Burns, Ali, Stokes
Wildcards (Data says Expected Average >30, but reasons to be suspicious)– Northeast: mostly driven by 2016 scores in Division 2. A poor run at Hampshire lately. Hughes: scored 425-3 in 2nd XI last 3 years. Didn’t play a first class game in 2018, only made 209 runs at 23 in the 2018 North Staffs Premier League, so probably safe to rule him out of Ashes contention.
From a batting perspective, England have chosen well. They’ve picked all the World Class and Regular players (apart from Pope, who only has 32 completed innings, and is on the fringes of the squad). All their other batsmen are from the Plausible Selections bucket. England have a lot of Plausible Selections; it doesn’t really matter which of them they pick. Dropping Buttler for Hales would be worth about 4 runs over the course of a Test. As long as the selectors keep picking players that are amongst the best available, I’ll cut them some slack.
England’s batting is weaker than at the start of the decade. England were spoiled by a team with 7 batsmen who averaged over 40 – like this side that beat South Africa by an innings in Durban in 2009. Pragmatically, they use 2 or 3 all-rounders (Stokes, Ali, Woakes) and often use 8 batsmen to do the job that 7 did at the start of the decade.
A number of players have been tried that currently average under 30 in Tests: Stoneman, Westley, Jennings, Duckett, Hales, Pope. This analysis indicates that these were good selections, and much of the underperformance is due to chance. An example: Stoneman averaged 28 in 11 tests, against an expectation of 34. But 11 tests is a small sample size, and 7 of those tests were away, including an Ashes series.
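The small-sample point can be put into rough numbers. The sketch below assumes scores follow a geometric distribution (constant dismissal risk per run), that 11 Tests means about 20 innings, and that not outs can be ignored. All three are simplifying assumptions of mine rather than part of the analysis above.

```python
import math


def innings_std_dev(average: float) -> float:
    """Standard deviation of a geometric score distribution with this mean."""
    return math.sqrt(average * (average + 1))


def standard_error_of_average(average: float, innings: int) -> float:
    """How far an observed average typically strays from the true one."""
    return innings_std_dev(average) / math.sqrt(innings)


expected_avg = 34   # expectation quoted in the post
observed_avg = 28   # Stoneman's Test average quoted in the post
innings = 20        # assumption: roughly two innings per Test over 11 Tests

se = standard_error_of_average(expected_avg, innings)
shortfall_in_se = (expected_avg - observed_avg) / se
print(f"standard error: {se:.1f} runs, shortfall: {shortfall_in_se:.2f} standard errors")
```

On those assumptions a six-run shortfall is less than one standard error, so it is well within what chance alone would produce.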
Bairstow is one of England’s two best batsmen. Dropping him would be an error.
*England’s Squad to tour the West Indies (Batsmen only):
Joe Root (Yorkshire) (captain), Moeen Ali (Worcestershire), Jonny Bairstow (Yorkshire), Rory Burns (Surrey), Jos Buttler (Lancashire), Joe Denly (Kent), Ben Foakes (Surrey), Keaton Jennings (Lancashire), Ben Stokes (Durham), Chris Woakes (Warwickshire)
At Globogym we’re better than you. And we know it!
Dodgeball – 2005
In last week’s blog, the data showed how poorly some overseas players performed in First Class cricket compared with their Test performances.
Looking at overseas players, surprisingly they perform 21% worse in Division 1 than their Test average. Contrast that with England players who do 28% better. Two examples jump out: Pujara scoring 172 runs at 14, Kane Williamson scoring 260 runs at 26. How can we explain those scores?
As there have been only 20 non-England Test players in Division 1 over the last three years, the sample size is too small for meaningful analysis. To get more insight, I’ve combined Division 1 and Division 2, which increases the sample size to 331 completed innings. I then found 3 factors which influence performance:
SA / NZ / Australian players outperform other nations (probably because these are the countries with conditions most similar to those in England).
Test players will average more in Division 2 than Division 1.
Top order (1-3) batsmen are most affected by English conditions (this makes sense – they will face lengthy spells against the best County bowlers with the ball swinging and seaming more than they are used to). Middle order players (numbers 4-7) are unaffected, while tailenders get a boost to their average.
I created a model to quantify this behaviour, combining these factors. The best fit to the data is as follows:
SANZAR +10%, others -10%
Top order -25%, Middle order +3%, Lower order +25%
Division 2 +10%
Applying this makes Pujara’s performance less of an outlier, and more a function of being a number 3, and therefore the wrong type of overseas batsman to go for. Using my model, his expected average in D1 is just 36, and while he underperformed this, it’s no longer an outlier. Similarly, Azhar Ali (Test Avg 48) would be expected to average 33, and averaged 34.
But – the current iteration of the model has arbitrary cut-offs (why should a number 4 outscore a number 3 by 25%?) and the above table has a high standard deviation. I’ll enhance it once it can be tested against 2019 data.
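For illustration, here is a minimal sketch of how the three adjustments could be combined. Multiplying them together is my assumption, as are the function and table names. For a top three batsman from outside SA / NZ / Australia with a Test average of 48, it gives about 32 in Division 1 rather than the 33 quoted above, so the fitted model evidently differs in some detail.

```python
# Adjustment factors as listed in the post
REGION_FACTOR = {"sanzar": 1.10, "other": 0.90}
DIVISION_FACTOR = {1: 1.00, 2: 1.10}


def position_factor(batting_position: int) -> float:
    """Top order (1-3) -25%, middle order (4-7) +3%, lower order +25%."""
    if batting_position <= 3:
        return 0.75
    if batting_position <= 7:
        return 1.03
    return 1.25


def expected_county_average(test_average: float, region: str,
                            batting_position: int, division: int) -> float:
    """Assumed multiplicative combination of the three factors."""
    return (test_average
            * REGION_FACTOR[region]
            * position_factor(batting_position)
            * DIVISION_FACTOR[division])


# A Test average of 48, batting in the top three, Division 1
print(round(expected_county_average(48, "other", 3, 1), 1))
```

Moving the same player to number 4 lifts the estimate sharply, which is the arbitrary cut-off problem noted above.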
What the current model can do is make predictions:
Poor 2019 Overseas Player selections
Azhar Ali will be playing for Somerset next season. He’ll be 34 by then, and will be expected to average 30. I hope they aren’t paying him too much. Next season could be the one where Somerset’s batting frailty bites.
Bancroft at Durham and Joe Burns at Lancashire should struggle at the top of the order.
Top 2019 Overseas Player picks
1. S.Marsh better hope Glamorgan bat him below 3 – he could do well if he avoids the new ball.
2. Temba Bavuma isn’t the strongest Test batsman, but as a 28 year old he’ll be at or near his peak, and Division 2 cricket with Northamptonshire should suit him. It helps he doesn’t start until 14th May.
3. Bowlers! Abbas, Worrall, and Siddle should be far more valuable than top order batsmen. That said, I’ve not done the analysis of bowlers yet. Watch this space.
“Coach woulda put me in fourth quarter, we would’ve been state champions. No doubt. No doubt in my mind.”
Napoleon Dynamite (2004)
It’s often assumed that we cannot compare Test and first class batting performances – the old comparing ‘apples to oranges’ conundrum. But if we can quantify the relative values of the different formats, we can compare like with like.
Looking at batting performance of players who’ve played across multiple formats in English* domestic cricket (2016-2018), one can assess the relative difficulty of each tier. My analysis found that it’s 19% harder to bat in Test Cricket than it is in Division 1.
If a player averages 40 in Division 1 – the data says you could expect him to average 31 in Test cricket, 44 in Division 2, and 54 in the 2nd XI.
That tells us that you’d need to consistently average over 55 in Division 2 to average 40 in test cricket – hence so few England players being pulled from those ranks in recent years.
It also means that Hildreth (who I’ve previously thought of as an England option as he averages 41 in Division 1) would be expected to average 32 in Tests, and therefore isn’t the batsman we are looking for.
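The conversions above can be sketched as a simple lookup. The tier levels are taken straight from the worked example (40 in Division 1 being worth 31 in Tests, 44 in Division 2 and 54 in the 2nd XI); treating the relationship as a fixed ratio at every level of ability is my assumption, and the names are hypothetical.

```python
# Equivalent averages for the same batsman at each level, from the example in the post
TIER_LEVEL = {
    "Test": 31.0,
    "Division 1": 40.0,
    "Division 2": 44.0,
    "2nd XI": 54.0,
}


def convert_average(average: float, from_tier: str, to_tier: str) -> float:
    """Scale an average from one tier to another using the fixed ratios above."""
    return average * TIER_LEVEL[to_tier] / TIER_LEVEL[from_tier]


# Hildreth: 41 in Division 1, expressed as a Test average
print(round(convert_average(41, "Division 1", "Test")))

# What Division 2 average corresponds to 40 in Tests?
print(round(convert_average(40, "Test", "Division 2"), 1))
```

The second figure comes out above 55, in line with the Division 2 threshold mentioned above.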
A few examples of 2016-2018 Division 1 and Test averages:
Note that only Root and Buttler underperformed in Division 1 relative to
At this point it’s worth going into the assumptions – professionally I’m always keen to show where the data ends and the judgement begins. The data can tell us performances for each player who crosses tiers. Judgement needs to be applied to appraise that data and turn it into a single factor.
Jonas (@cric_analytics) has looked at players with a minimum of 10 innings in both competitions – the downside of this is that it excludes valid data points. For instance, Ben Stokes scored 226 @ 28.3 in D1 in the last 3 years – 10 runs below his Test average. That should count towards the total, even if it’s a small sample. Jonas reckoned a 20% gap between Test and County cricket – slightly wider than my data suggests.
Include all overlap – the risk is that this is skewed by a few high/low scores from one-test wonders against weak/strong opponents. This gives a mere 2% difference between Test and D1.
Overseas players included: this gave an 8% gap between D1 and Test – but playing away from home knocks 10% off batting average, so this is not a fair comparison. To put it another way, Pujara playing for Yorkshire averaged 14, because every game was an away game.
I have used relative performance for English players with >4 completed innings in each format, and weighted the overall result according to the lower of the completed innings in each format. For instance, Ben Stokes has played 8 completed D1 innings, but 46 Test innings – so the overall result is weighted with a factor of 8 because of Stokes’ performances, while Dawid Malan played 36 D1, 26 Test innings, so is more useful for this exercise and receives a weighting of 26.
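The weighting scheme described above can be sketched as follows. The innings counts for Stokes and Malan are the ones quoted; the averages in the example are placeholders for illustration (Malan’s are not given in the post), and the function and field names are hypothetical.

```python
def weighted_test_to_d1_ratio(players, min_innings=5):
    """Weight each player's Test/D1 ratio by the smaller of their innings counts."""
    weighted_sum = 0.0
    total_weight = 0
    for p in players:
        weight = min(p["d1_innings"], p["test_innings"])
        if weight < min_innings:          # needs >4 completed innings in each format
            continue
        ratio = p["test_avg"] / p["d1_avg"]
        weighted_sum += weight * ratio
        total_weight += weight
    return weighted_sum / total_weight


players = [
    # Innings counts from the post; averages are illustrative placeholders
    {"name": "Stokes", "d1_innings": 8, "test_innings": 46, "d1_avg": 28.3, "test_avg": 38.3},
    {"name": "Malan", "d1_innings": 36, "test_innings": 26, "d1_avg": 40.0, "test_avg": 30.0},
]
print(round(weighted_test_to_d1_ratio(players), 2))
```

Using the lower innings count stops a player with a long Test career but a handful of county innings from dominating the result.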
Adjusting for the level individuals are playing at, allows comparison of players in different tiers. In future posts I’ll look at some implications of this data:
2nd XI players with the potential to be First Class batsmen
England’s best available batsmen
Overseas players: who has & hasn’t succeeded – will look at any trends in the data.
It’ll take more number crunching, but I’m interested in linking First Class / List A performance – to see how well correlated they are, and use that to gauge quality of players for which limited data is available (there are a lot of players with a handful of FC games behind them – too few completed innings to fairly appraise them).
*I know it’s English and Welsh. Sorry Glamorgan. There isn’t an easy word for English and Welsh, so I’ll use English as shorthand for English and Welsh.
“You’d better listen to her, because the Pentagon does”
Top Gun (1986)
A bit about me before I get into the numbers:
It’s easy to have an opinion, and particularly easy to broadcast that view online. Filtering out the noise is a challenge.
So why should anyone care what I think about cricket?
Well, my CV for starters – Masters degree in Physics from Oxford (4th year was focused on simulations of Earth’s atmosphere), then qualified as an accountant, spent 2 years in Banking Front Office (where I cut my teeth on Excel modelling), and after a further role in Banking Finance I’m now working for a FTSE-100 retailer, doing modelling and strategy.
It’s not quite the Pentagon, but you should listen to me, because some people at a FTSE-100 retailer do.
In 2011 I built a test match simulator – which could predict the outcome of an innings from a given starting point, based on ball by ball bowler vs batsman probabilities, and running the simulated innings enough times to get a reasonable sample (>1,000). This was mainly for gambling, and it works.
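To give a flavour of the approach, here is a stripped-down sketch of a ball by ball Monte Carlo innings simulation. It is not the simulator itself: the outcome probabilities are invented, and it uses one fixed set of probabilities for every ball rather than bowler vs batsman match-ups.

```python
import random

# Hypothetical per-ball outcome probabilities (they sum to 1)
BALL_OUTCOMES = {"wicket": 0.02, 0: 0.70, 1: 0.18, 2: 0.03, 4: 0.06, 6: 0.01}


def simulate_innings(rng, wickets_in_hand=10, max_balls=None):
    """Play one innings ball by ball until all out or the balls run out."""
    outcomes = list(BALL_OUTCOMES)
    weights = list(BALL_OUTCOMES.values())
    runs = wickets = balls = 0
    while wickets < wickets_in_hand and (max_balls is None or balls < max_balls):
        result = rng.choices(outcomes, weights=weights)[0]
        balls += 1
        if result == "wicket":
            wickets += 1
        else:
            runs += result
    return runs


def simulate_many(n=1000, seed=1):
    """Repeat the innings n times with a seeded generator for repeatable results."""
    rng = random.Random(seed)
    return [simulate_innings(rng) for _ in range(n)]


totals = simulate_many()
print(f"mean total: {sum(totals) / len(totals):.0f}")
print(f"share of totals under 200: {sum(t < 200 for t in totals) / len(totals):.0%}")
```

Starting from a given match position is then a matter of passing in the wickets and balls remaining instead of the defaults.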
Later I expanded this to cover the two white ball formats, though the 50 over model has always received more attention than the 20-20 one – I don’t mind 20-20, but I struggle to love it.
With a full time job, and a young family, cricket data comes third on the list – and that means I will focus on red ball cricket. There are a lot of professionals who have got further than me in 20-20, and I’m not going to stand out by splitting my efforts across 3 formats.
Let’s see if I can come up with some original thoughts, and some predictions which stand the test of time.