England vs Pakistan Preview – August 2020

An impressive batting unit supported by an exciting bowling attack versus an impressive batting unit supported by an exciting bowling attack.

What will decide the series? Here’s my usual pre-series ramble, with stats on the Pakistan squad at the bottom.

  1. All rounders. England are running out fast. It is debatable whether England will even have an all rounder for the first Test. Some names to mull: Ben Stokes (is he fit to bowl?), Joe Denly (dropped), Moeen Ali (dropped), Chris Woakes (struggling with the bat), Sam Curran (is he good enough at either discipline?). As for Pakistan, Shadab Khan averages 34 with the bat from five Tests, but only 27 in First Class – more a number seven than a six. Pakistan will be gambling either way: a five man attack lengthens the tail, an inexperienced four man attack has nowhere to hide.
  2. Pitch preparation. England would be stronger if they could confidently not pick a spinner (Bess didn’t contribute a lot with the ball against West Indies). Pakistan are itching to play two spinners. Why would the Old Trafford groundsman produce a deck that turns? Worth noting in the two Manchester Tests this summer, spin averaged 52 while pace was around half that at 27.
  3. Naseem Shah and Kashif Bhatti. Pakistan’s batting is solid, enough talent that they can cover if any one of them goes full Shai Hope. For all the excitement, I’m uncertain about their bowling. Mohammad Abbas is a banker (Test average 21, and the same average in two devastating seasons at Leicestershire). Shaheen Shah Afridi has 30 Test wickets, so has some track record. Yasir Shah has a proven record – it’s just a bit mediocre (averaging 34 in the last four years). One of Naseem Shah or Bhatti thus has to step up. The signs are good – both average 17 in domestic cricket in the last four years (Bhatti has 125 wickets, the younger N. Shah has only 26).
  4. Rest. West Indies clearly don’t read this blog (or The Daily Telegraph), else they wouldn’t have knackered their bowlers playing three back-to-back Tests. A three Test BTB series is more like a tournament than a traditional Test series: you’ve got to manage bowler workload. The easiest way to do that is to pick the best team for the first game, then half the pace bowlers miss the second Test, and the others miss the third. Sohail Khan is good enough to rotate in – but are the other Pakistan squad bowlers Test standard? With England’s squad depth, their edge will get bigger as the series goes on.
  5. England’s Ashes tunnel vision. Picking Crawley and Bess with an eye on December 2021 is silly. When the West Indies series got real, Crawley was dropped and Bess didn’t get a chance to bowl. Pakistan will be only too happy if England’s team sheet has a number three with a First Class average of 31, and an off spinner for Pakistan’s right handed middle order to milk. Bess isn’t a bad player, it’s just that England would have a better chance of winning playing an extra pacer.

England start as favourites. Burns, Sibley, Root, Stokes are a fine core of a batting order, and there are healthy bowling options. If Stokes can bowl, a balanced England team playing on increasingly familiar territory should be too strong.

Get to know the Pakistan squad: Stats

Batting:
Note through this lens Babar Azam isn’t the standout batsman.
Bowling:
Note the domestic four year averages of Abbas, N.Shah and Bhatti.

Bob Willis Trophy preview: part two

It’s the evening before the county season starts, and the squads have been announced. That means I can tell you which teams have the best chance of success.

Here’s a unique preview – data driven, based on each player’s red ball performances in the last four years. Most previews name a couple of stars, “one to watch”, and throw in some juicy facts and interviews. Redballdata.com sadly has none of this.

So how can I help you? Without the Test and overseas players, we’ll see lots of talent emerge from the 2nd XI. You and I may not know the names, but I’ve rated those players. The database uses the last four years’ data for Test, county, and 2nd XI Cricket, adjusted for difficulty. For each group, I’ve ranked teams in order of strength, and below the commentary you can zoom into each squad to see the individual rankings.

The North Group

Lancashire (Favourites): Have the bowling to force results. Livingstone, Vilas are two of the best batsmen in county cricket.
Yorkshire: Challengers, can they win their first game without the ODI players? Will Olivier come to the party? So far he hasn’t shone in the County game. Excellent top order, but will they miss Bresnan’s batting?
Durham: Raine & Rushworth are an effective pair. Deep batting line up covers the lack of stars. Having four home games helps. I’ve put a couple of pounds on Durham at 33-1 (Ladbrokes), if that sort of thing is of interest to you.
Notts: Best batting in the division: I’m baffled at how that unit struggled so badly last year. Worried about the bowling, especially if Fletcher is out for a while. Ball can’t do it on his own. Mullaney might bowl a lot this year, which is no bad thing.
Derbyshire: Hard to see this very raw attack winning the group. Batting’s not too bad mind (Godleman, Reece, du Plooy).
Leicestershire: On paper the weakest team. Maybe one of the younger bowlers will surprise us, otherwise 20 wickets is a tall order. Competitive top three batsmen (Azad, Slater, Ackermann) but not much after that.

The Central Group

Somerset (Favourites): Best attack of the 18 counties. Should win the weakest division.
Warwickshire: Will the batsmen let down the bowlers? Much depends on the ageing Bell, Bresnan and Patel. Better reserves than most.
Worcestershire: A couple of batsmen light. Moeen Ali and Ed Barnard are fine all rounders which help balance the side out. Banana skin vs Gloucestershire first up as Worcestershire won’t have Ali (England duty).
Northants: About three bowlers light. Can Sanderson repeat the magic of 2019 (60 wkts at 20)? If not, “definitely viewing it as a squad competition” might make for some weak teams by late August.
Gloucestershire: This campaign may be an awkward reminder that overseas talent is needed for Gloucestershire to survive in Division 1 next year. Dent and Higgins are clearly talented, but there are stronger squads out there.
Glamorgan: Cooke will need to deliver for Glamorgan to get enough runs on the board. The injury to Timm van der Gugten is unfortunate – Glamorgan are the weakest attack in the Central Division without him. This year could be valuable experience for the core of a fine future team – Selman (age 24), Carlson (22), Douthwaite (23), Carey (23), Bull (25).

The South Group

Essex (Favourites) are good. The best team in a tough league. Expect Harmer to deliver with the ball, supported by Porter & Cook. Sir A. N. Cook is the best batsman on show in county cricket.
Surrey … Imagine what they could do at full strength. Can hardly blame them if this year is a struggle. Adding Jamie Overton helps, an unexpected development.
Middlesex are my kind of team – enough batting and bowling to compete, maybe slightly under the radar. Lack of spin options may be exposed in their three away matches, if groundsmen play their cards right.
Hampshire: Ditto. No Abbott. No Edwards. Need to get through the games without the ODI players (Vince & Dawson), and see what happens. Mason Crane has an opportunity – there’s lots of right handers out there.
Kent: May do OK against Hampshire and Sussex’s attacks. The other three sides will take some withstanding though. Could do with Denly making an appearance.
Always up against it, Sussex have given a chance to lesser known players this year. A shame. Not sure where Wiese, Wells, Bopara and Beer are. I’ll give anyone sitting this tournament out the benefit of the doubt: I’m not playing cricket in a pandemic, so can’t expect them to.

Anyone can win. Don’t expect it to be the best team – it’s only a five match series. The bookmakers know this – there are 11 teams with more than a five percent chance of winning, yet no team has a greater than fifteen percent chance.

Tomorrow I’ll be following Durham-Yorkshire. A Durham win would make the North group so much more interesting.

Before you go, here are some trends we might see this year:

  • 0fers – there are bowlers that just aren’t ready for this level. They’ll go wicketless, and heap pressure onto captain and opening bowlers. Canny batsmen will get after them.
  • Clusters of wickets – inevitable when the standard is this variable.
  • The league won’t be won by stars – it’ll be won by the deepest batting lineups, and the bowling attacks that never let up. Hence Lancashire, Yorkshire, Somerset, Essex being favoured. Many won’t see it that way – they’ll talk of centuries and five-fors, but it’ll be “Not Collapsing” and “9-2-30-1” wot won it.

Bob Willis Trophy preview: part one

Strange times. This year’s County Championship makes the best of a bad situation by fitting in a five-match group stage across August and September. Here’s what I think will happen, based on the Playing Conditions; disrupted squads; and the weather. Part two of this post will look at which players and teams I expect to do well.

Playing conditions

  • A reduction from a minimum of 96 overs to a minimum of 90 overs in a day’s play.
  • Each county’s first innings of a match can last no longer than 120 overs
  • The follow-on will increase from 150 to 200 runs
  • The new ball will be available after 90 overs rather than 80 overs
  • Eight points for a draw
  • Three regional groups of six. Two group winners with the most points contest the final.

Impacts

Perversely, more draws. Fewer overs per day removes up to 24 overs from a match. Capping an innings at 120 overs limits a team’s ability to go big batting once. Add to that the increased points on offer for a draw, and canny captains (once behind in the match) may change focus to points accumulation. While there is a need to win the group and outscore one of the other group winners, a defeat makes qualification very unlikely – so conservative cricket may dominate the first two rounds. The last thing you want to do is give your rivals a 20 point head start.

Mismatches – all 18 counties together for the first time since 1999 gives an opportunity for the stronger players in the second division to prove themselves. However, there is the potential for some mismatches. Gary Ballance against some of the weaker attacks in the North group, for example.

Lopsided groups. The South group is toughest, and thus we are obliged to tag it the “group of death”. Sussex and Middlesex are the second division teams in that group, but are better than that. It will be difficult to win the South group, and the winner may not even qualify for the final if their victories don’t yield sufficient points.

Nothing to play for. After two defeats, a team is almost certainly out of contention. With no relegation, I hope teams do the decent thing and give 100 percent. This will be difficult. “Come on lads, let’s do it for the fans streaming this whilst working from home!” Hopefully something resembling the best possible team is selected, though it would be totally understandable if this weren’t the case: players may have other priorities in a pandemic.

Spinners to the fore. A new ball after 90 overs favours spinners (who will have the ball in their hands more) and lower middle order batsmen (who get easier conditions for longer). Win the toss and bat, surely.

Timing and weather

August / early September matches should slightly favour the bowlers. Last year’s first innings scores were 20 runs lower in August/September than the rest of the season. The Test matches have offered turn, indicating what pitches might do given the dry summer we’ve had.

The long range forecast from the Met Office is understandably vague, though hints at more weather disruption in the north than the south.

Confidence is low, but the second week of August is likely to see a mixture of dry and settled conditions, interspersed with occasional bouts of wetter and windier weather. The majority of the unsettled weather will most likely be in the north and west, though it may spread further south and east from time to time. Temperatures are likely to be around the average for this time of year, with any particularly warm weather being short-lived and generally towards the south or southeast. Looking further ahead into late-August, there are some tentative signs that conditions could become more widely dry and settled, particularly in the southeast.

Availability

These aren’t the county sides you’re used to. No overseas players. No England Test players. Won’t see much of the England white ball crew either. That means ignore the 2019 league positions and look at who will actually be playing. Are Hampshire a credible force without Vince, Dawson, Edwards and Abbott?

This is a great opportunity for the 200-250th best cricketers in England & Wales to get a run of five games. Let’s see how many of them can translate second XI success to First Class.

I’d normally end with a proper conclusion – but without analysing the squads that would be a mistake. Will save that for next time – once the teams announce their squads, I can pull in the ratings from my database to see who is best on paper.

For now, my hunch is that the Central group is the best one to be in. Can’t wait to run the numbers on Gloucestershire, Northamptonshire and Somerset and see who is best placed.

Pace bowlers struggle in back-to-back Tests; Pope Catholic.

No sensational claims today – just quantifying what you already know. Bowling pace across two Tests in quick succession makes a player tired and less effective.

I took all match performances this century, comparing data against the next game that player bowled in. Then cut the data by the number of days between games. For instance, in this England vs West Indies series, the first game started on 8th July, the second on the 16th July – eight days apart. Any gap of nine days or under between start dates, to me, is “back-to-back” (BTB).
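As a rough illustration of that cut, here is a minimal sketch of how a bowler's matches might be tagged as back-to-back from their start dates. The nine-day threshold and the July 2020 dates come from this post; the function and variable names are my own placeholders, not the code behind the analysis.

```python
from datetime import date

BTB_MAX_GAP_DAYS = 9  # from the text: nine days or under between start dates


def is_back_to_back(prev_start: date, next_start: date,
                    max_gap: int = BTB_MAX_GAP_DAYS) -> bool:
    """True if next_start falls within max_gap days after prev_start."""
    gap_days = (next_start - prev_start).days
    return 0 < gap_days <= max_gap


def flag_second_of_pair(start_dates: list[date]) -> list[bool]:
    """For one bowler's Tests in date order, flag matches that follow a short gap."""
    if not start_dates:
        return []
    flags = [False]  # the first match has no earlier game to compare against
    for prev_start, next_start in zip(start_dates, start_dates[1:]):
        flags.append(is_back_to_back(prev_start, next_start))
    return flags


# Start dates of the July 2020 England vs West Indies Tests.
series_starts = [date(2020, 7, 8), date(2020, 7, 16), date(2020, 7, 24)]
btb_flags = flag_second_of_pair(series_starts)
print(btb_flags)
```

With eight days between each start date, the second and third Tests are both flagged. Averages can then be compared between flagged and unflagged matches for the same bowler.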

What did I find? Pace bowlers in the second of a pair of back to back Tests average 7% more than in the first game.

With more number crunching we can get closer to understanding why this happens:

  • Are there more long hops, driving up averages as bowlers leak runs? No – Economy rates don’t significantly rise.
  • More fielding errors? No – there’s no effect on spinners – so it’s probably not fielding causing it.
  • Pace bowlers must just be less potent when tired. Interestingly, at higher workloads the effect is bigger – pace bowlers bowling over 40 overs in each match of BTB Tests add 8% to their average in the 2nd Test.

Conclusion: this is significant – I’m adding it as an input to my Test match model. Front line pace bowlers add 3.5% to their average if there is a quick turnaround from the last Test. This rises to 4% if they bowled over 40 overs in the prior Test. Conversely, their expected average is reduced by 3.5% if the match isn’t BTB.

In case you’re wondering why it’s only 3.5%, and not the 7% I quoted earlier: during the first BTB Test the bowler will be well rested, so 3.5% better than usual. In the second Test they can expect to be 3.5% worse than usual, giving a 7% gap between performances.
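To make the arithmetic concrete, here is a small sketch of how such an adjustment could be applied. The percentages are the ones quoted above; the function name, the way the three cases are combined, and the base average of 27 are illustrative assumptions rather than the model's actual code.

```python
RESTED_DISCOUNT = 0.035      # not BTB: 3.5% better than usual
BTB_PENALTY = 0.035          # quick turnaround: 3.5% worse than usual
BTB_HEAVY_PENALTY = 0.04     # quick turnaround after 40+ overs in the prior Test
HEAVY_WORKLOAD_OVERS = 40


def adjusted_average(base_average: float, is_btb: bool,
                     prior_test_overs: float = 0.0) -> float:
    """Scale a front line pace bowler's expected average for rest and workload."""
    if not is_btb:
        return base_average * (1 - RESTED_DISCOUNT)
    if prior_test_overs > HEAVY_WORKLOAD_OVERS:
        return base_average * (1 + BTB_HEAVY_PENALTY)
    return base_average * (1 + BTB_PENALTY)


base = 27.0  # hypothetical bowler
rested = adjusted_average(base, is_btb=False)
tired = adjusted_average(base, is_btb=True, prior_test_overs=30)
relative_gap = tired / rested - 1
print(round(rested, 2), round(tired, 2), round(relative_gap, 3))
```

The gap between the rested and tired figures comes out at a little over 7%, which is how two 3.5% adjustments line up with the 7% headline.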

PS. We finally have a mechanism for why home advantage gets bigger as a series goes on – hosts have a huge player pool to draw from, tourists are drained by practice matches.

PPS. This nugget of trivia will make you feel tired just reading it: 38 year old Courtney Walsh delivered a mammoth 128 overs across two BTB Tests in March 2001. It was six months after Ambrose retired, so Walsh was asked the impossible. West Indies lost the series 2-1. To his credit, he took 9-216 across the two games. Understandably, he almost immediately retired from Tests.

Implications for the July 2020 England vs West Indies series

While for most participants the sample size is too small, England’s veterans have kindly left a trail of data over the years: Anderson averages 25 in non BTB, 27 in the second Test of BTB. Broad averages 27 in non BTB, 29 in second Test of BTB (bear this in mind if Broad is picked for the third Test).

Notes:

  • Third Test is back-to-back-to-back – WI will be out on their feet unless some of Raymon Reifer, Rahkeem Cornwall, Chemar Holder get rotated in. Expect Cornwall & one other.
  • What the heck were West Indies thinking fielding first at Old Trafford? The betting markets thought it was a mistake at the time. If they’d planned to field first, why didn’t they bring a fresh bowler into the team?

England vs West Indies (July 2020) – Preview

I recommend you read this back-to-front. Like a newspaper: skip to the tables at the end, digest the stats, make your own mind up – then read my words and see if we’ve reached the same conclusion.

On paper this series is a mismatch – the fourth ranked team hosting the eighth. West Indies averaging 23 runs per wicket over the last three years, facing English bowlers in English conditions. Yet there are reasons to believe in the tourists: eight of their expected top nine are peaking, aged between 27 and 30. They could have the best Test opening bowlers right now in Kemar Roach and Jason Holder. Roach averages 22 over the last four years; Holder 23.

Talk is cheap. It’s easy to argue this either way. What does the data say?

By my ratings, England are 50 runs per innings stronger, a 59% chance of winning (West Indies 29%, Draw 12%). Bookmakers only give West Indies an 11% chance. Intriguing.
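One way to size up the gap between the model and the market is to turn both into expected value. The sketch below assumes the bookmakers' 11% is a clean implied probability with no margin built in (real prices include one), and that the model's 29% is right. Both are big assumptions, and the function names are mine.

```python
def fair_decimal_odds(probability: float) -> float:
    """Decimal odds that exactly match a probability, ignoring bookmaker margin."""
    return 1.0 / probability


def expected_profit_per_unit(model_probability: float, decimal_odds: float) -> float:
    """Expected profit on a one-unit stake if the model probability is correct."""
    return model_probability * decimal_odds - 1.0


# Model probabilities quoted above: England, West Indies, draw.
model = {"England": 0.59, "West Indies": 0.29, "Draw": 0.12}
total_probability = sum(model.values())

bookmaker_wi_probability = 0.11
wi_odds = fair_decimal_odds(bookmaker_wi_probability)
wi_edge = expected_profit_per_unit(model["West Indies"], wi_odds)
print(round(wi_odds, 2), round(wi_edge, 2))
```

A positive figure only says the model and the market disagree. It says nothing about which one is wrong.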

Do people underestimate this West Indian side? The difficulty of batting in the West Indies Regional Four Day Competition is roughly comparable with County Championship Division 1 – so the last-six-year domestic records of Brathwaite (avg 45), Hope (57) and Chase (46) indicate their underwhelming Test records are misleading. Note Hope hasn’t played a domestic game in three years. He averages 52 in ODIs, but it looks worryingly like he’ll never fulfill his Test potential. Modern cricket.

Some thoughts on the optimum makeups of the sides:

Holder is best at eight. West Indies’ strength is in bowling; their weakness in batting. With canny selection they can paper over the cracks. Jason Holder, Raymon Reifer and Rahkeem Cornwall could feasibly be 8-9-10 giving West Indies the best of both worlds. However, the lure of picking the best bowlers would lengthen the tail with a batsman being displaced (Holder, West Indies’ highest placed batsman in the ICC rankings, moving up to six as part of a five man attack). That would be a mistake – the West Indies win probability would drop by 4%.

West Indies only have one other decision to make: do West Indies need a front line spinner? This decision should be based on reading the pitch. If not, Roston Chase covers those overs. If they do, then J Holder, Cornwall, Reifer/Gabriel, Roach is logical. Cornwall isn’t the Test prospect he appears: expect a mid-30s average. While he has a fantastic domestic average (23) over the last four years, this is flattered by spinning domestic conditions. Remember that Chase also averages 24 in that period, but 42 in Tests.

The hosts’ shaky top order means England have to pick a number eight that can bat – which limits their choices. If Jack Leach plays, then one of the batting bowlers (likely Chris Woakes) needs to play. Woakes loves bowling at home: in the last four years he averages 21. Alternatively, Moeen Ali could play: this is Stuart Broad’s best chance of joining Archer/Anderson/Stokes as England’s pace quartet. Broad may not make the cut – he’s played every home Test since 2012, but is sliding down the pecking order.

Leach (SLA) is the best slow bowling option. West Indies’ middle order is packed with right handers. Leach & Parkinson turn the ball away, so have an advantage. Leach also has the best county average over the last four years (23). Meanwhile Ali averages 40 against right handers. If Ali plays (for his batting), the West Indies should focus on seeing off the new ball, because favourable conditions await.

It doesn’t really matter which ‘keeper England choose. The gap was marginal when I looked at it before [link]. This just isn’t a debate that excites me – it’s a judgement call, and no criticism should be levied at selectors if it fails. Unlike Zak Crawley, who would be a bold and wrong selection, going against the publicly available data. His best first class season saw an average of 34. If he’s picked and fails, it’s not his fault – blame the selectors. If he succeeds, I will give them credit.

Both teams impress with the ball. The batting will decide the series. England at full strength are better than the West Indies. Most of that advantage comes from Root and Pope. Neither team has much in the way of batting reserves. With Root unavailable for the first Test, England have a lacklustre choice of alternatives. Ballance and Kohler-Cadmore aren’t in the squad. The replacements are c.14 runs per innings weaker than Root.

While the West Indies batsmen are at their peak, England are looking to the future. If England go 2-0 up (which is perfectly plausible), they could have six players aged 24 or under (Sibley, Lawrence, Pope, Bess, Curran, Mahmood) in the dead rubber to ensure the old farts don’t break down with three Tests over 21 days. Need to keep something in the tank for Pakistan.

Look out for bowler workloads. Tests on the 8th, 16th, 24th July. James Anderson is 37 years old. Roach and Holder are easily West Indies’ best bowlers. This might have some anti-cricket effects: if the opposition are 200-1 chasing 260 on the fifth day, do you take your best bowler off the field to rest for the next Test? Don’t want to risk them in a lost cause. No problem to fatigue (not injure) Reifer or Archer, but not the star bowlers.

And a left-field hypothesis, which I don’t really believe: Stokes will fail with the bat because he needs a crowd. He feeds off it. Away from thousands of fans he isn’t the same player. In six years of county cricket he averages 25. In the UAE he contributed 88-6.

PS. I’ve cut home advantage in my model to 10% (from 20%) to reflect the lack of crowd. No idea if that’s the right thing to do. The Conversation reckons it’s nil for crowd-free football. Betfair podcast thinks it’s also nil.

Appendix – Data tables

I had these spreadsheets in front of me as printouts when I appeared as a guest on three recent Betfair “Cricket only Bettor” podcasts, which you can listen to here, here and here.

West Indies Batsmen

West Indies Bowlers

England Batsmen

England Bowlers

Test partnerships – does it matter who bats with whom?

Does cricket lose something when its myths are dispelled? Some fictions are unhelpful, such as Michael Vaughan’s success without having thrived at county level. However, we like to believe in partnerships: every smile and punch of gloves boosting the batting of our heroes, spurring them on to greater heights.

Thus I write hesitantly – I am loath to reduce cricket to a spreadsheet, even though I literally do that. Hopefully some unsolved X factors will remain after the stats revolution.

On to today’s topic. Last time we saw that right-left partnerships don’t influence white ball run rate. This post covers the currency of red ball cricket: averages. Does who you’re batting with impact your average?

Considering the period 2010 to today, seven pairs performed much better than expected based on the records of the individuals in that partnership. Two pairs performed worse. They are shown below, ordered by how surprising that out-performance is.

That’s nine outliers – seven good and two bad.

But what are the chances each outlier was just a fluke? After all, Clarke & Ponting only had 20 partnerships in the 2010s. After this analysis of error bars on averages we have a way to answer that – by quantifying how likely it is that a specific average (eg. Jermaine Blackwood averaging 37 in England) is arrived at by chance, based on the sample size.

With 120 partnerships (min 20 innings) since 2010, we would expect about six pairs (roughly 5%) to lie two standard deviations or more from their expected average by chance alone. Actually we have nine. On the face of it, that’s evidence that some duos do get a boost from batting together. However, two of the nine drop off the list with further scrutiny. Kayes and Iqbal happened to bat together more at home than away. Bell/Pietersen somehow had 19 of their 23 partnerships in the first innings. Adjust the calculations to reflect that, and we have seven outliers, whilst by chance we would expect to have six. In layman’s terms, if each duo batted together enough times, their partnership average would eventually reach their combined average.
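For anyone who wants to check the "six by chance" figure, here is a back-of-envelope sketch. It assumes partnership averages are roughly normally distributed around expectation and that the 120 pairs are independent. Neither is strictly true (the same batsmen appear in several pairs), so treat it as a sanity check rather than the calculation behind the post.

```python
import math


def two_sided_tail(z: float) -> float:
    """Probability a standard normal value lies more than z SDs from the mean."""
    return math.erfc(z / math.sqrt(2))


def prob_at_least(k: int, n: int, p: float) -> float:
    """Binomial probability of k or more outliers among n independent pairs."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


N_PAIRS = 120
p_outlier = two_sided_tail(2.0)            # a little under the 5% rule of thumb
expected_outliers = N_PAIRS * p_outlier    # between five and six pairs
p_seven_or_more = prob_at_least(7, N_PAIRS, p_outlier)
print(round(p_outlier, 4), round(expected_outliers, 1), round(p_seven_or_more, 2))
```

Under those assumptions, seeing seven or more outliers from chance alone is far from rare, which fits the reading that the adjusted count of seven is unremarkable.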

Here’s the chart of all 120 pairs, plotting variance to expectation against frequency. Even with small sample sizes, most partnerships average within five runs of expectation.

Where does this leave us? Remembering that “absence of evidence is not evidence of absence“, the jury’s deliberations will continue, but they will now be leaning in favour of specific partnerships not making a significant impact on a player’s average. Cricket is a one on one sport, bowler against the batsman on strike.

***

PS. How did I arrive at the expected average for a partnership? Start with the mean of the post-2010 average of the two players in each partnership. Add 1.5 runs for any partnership that isn’t two openers, on the basis that one of the batsmen will start the partnership with their eye-in. Add 4.6% for the extras that would be scored in that innings. It’s a slightly different formula for when a senior batsman is with a tailender.
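That recipe can be sketched in a few lines. The 1.5 runs and 4.6% come from the paragraph above; applying the extras uplift after the eye-in bonus is my assumption about the order of operations, the batting averages in the example are made up, and the separate senior-batsman-with-tailender formula is not covered.

```python
EYE_IN_BONUS_RUNS = 1.5   # added when the pair isn't two openers
EXTRAS_UPLIFT = 0.046     # extras scored during the partnership


def expected_partnership_average(avg_a: float, avg_b: float,
                                 both_openers: bool) -> float:
    """Expected partnership average from two batsmen's post-2010 averages."""
    expected = (avg_a + avg_b) / 2
    if not both_openers:
        expected += EYE_IN_BONUS_RUNS
    # Assumed order: uplift applied to the total including the eye-in bonus.
    return expected * (1 + EXTRAS_UPLIFT)


# Hypothetical batsmen averaging 40 and 50.
middle_order_pair = expected_partnership_average(40.0, 50.0, both_openers=False)
opening_pair = expected_partnership_average(40.0, 50.0, both_openers=True)
print(round(middle_order_pair, 2), round(opening_pair, 2))
```

A pair's actual partnership average is then compared against this figure to see how far from expectation they sit.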

PPS. Why the cut-off in 2010? “No balls” dropped off then. Here’s the 50 year history of extras in Test cricket. Extras count towards partnership totals, so the maths gets more involved when extras vary significantly by year.

Do right-left pairings score faster in ODIs?

Let’s start with the superficial (Boo! Hiss!) – a right-left pair score 0.8 runs per hundred balls faster than a right-right duo.

ODI partnership summary – min 120 balls, top nine teams only, up to 18 June 2020.

But right-left pairings aren’t something exotic. They are the normal state of affairs. 48% of ODI runs are scored by this combination. No bowler should be fazed by normality.

Jarrod Kimber, while concluding that “it’s complicated”, suggested the quicker left-right scoring is a combination of additional wides and ensuring unfavourable spin matchups for the fielding team.

But what about taking into account how quickly players usually score? Gayle, Munro, Morgan are quick scoring left handers, who will be involved in fast scoring partnerships.

I’ve taken each ODI pairing of the last five years and looked at how quickly they should score together – which is the mean of their strike rates. For instance, Shikhar Dhawan (98) and Rohit Sharma (96) would be expected to score 97 runs per hundred balls. Actually, they favoured setting a base, and scored at 86 per hundred balls. No right-left benefit there. However, the Dhawan-Sharma point is anecdotal – the real story is in the general case.

Two ways we can look at this – firstly, excess runs per hundred balls (ie. take all the right-left pairings, compare the runs they scored against expectation based on individual strike rates, and divide by the number of balls bowled). Right-left combinations are weaker than right-right pairs on this metric by 0.2 runs per hundred balls.

Next, because the first method is weighted towards players that batted together lots (Roy-Bairstow’s blitzes have a big impact), we take the raw average of each pairing. For example, Dhawan-Sharma’s impact score is 86 minus 97, being -11 runs per hundred balls. Taking the average for all right-left pairs, they come out 0.4 runs slower per hundred balls than right-right partnerships.
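The difference between the two measures is easier to see in code. This sketch uses the Dhawan-Sharma strike rates quoted above. The runs and balls for that pair are placeholders chosen only to reproduce the 86 per hundred balls, and the second pairing is entirely invented to show how weighting changes the answer.

```python
from dataclasses import dataclass


@dataclass
class Pairing:
    sr_a: float   # individual strike rate of batsman A (runs per 100 balls)
    sr_b: float   # individual strike rate of batsman B
    runs: int     # runs scored together
    balls: int    # balls faced together

    @property
    def expected_sr(self) -> float:
        return (self.sr_a + self.sr_b) / 2

    @property
    def impact(self) -> float:
        """Actual minus expected runs per hundred balls for this pairing."""
        return 100 * self.runs / self.balls - self.expected_sr


def ball_weighted_excess(pairings: list[Pairing]) -> float:
    """Method one: total excess runs divided by total balls, per hundred balls."""
    excess = sum(p.runs - p.expected_sr * p.balls / 100 for p in pairings)
    return 100 * excess / sum(p.balls for p in pairings)


def unweighted_mean_impact(pairings: list[Pairing]) -> float:
    """Method two: raw average of each pairing's impact score."""
    return sum(p.impact for p in pairings) / len(pairings)


pairs = [
    Pairing(sr_a=98, sr_b=96, runs=860, balls=1000),   # Dhawan-Sharma, placeholder volume
    Pairing(sr_a=90, sr_b=110, runs=220, balls=200),   # invented pairing
]
weighted = ball_weighted_excess(pairs)
unweighted = unweighted_mean_impact(pairs)
print(round(weighted, 2), round(unweighted, 2))
```

The pair that batted together for longer dominates the first figure, while the second treats both pairs equally. That is why it is worth running both.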

That’s 2-0 to the right-right pairings. Right-left combinations look slower than right-right pairings, once you adjust for who is batting.

But could it be impacted by time of the innings? For instance, do lots of right-left pairs open the batting, so score more slowly at that stage of the innings? Let’s repeat those same two calculations, but just for openers.

Darn it. We have three measures saying right-left pairings are of no benefit, against one saying that they are.

We need more data.

The good news – I’ve finally found a use for all those meaningless T20Is: to test right-left supremacy.

Running the same methodology for 2015-20, it’s nice to see some familiar faces. Dhawan and Sharma top the list, with 1,663 runs together. This time their collective strike rate of 141 is much closer to what we’d expect. And the general case:

Conclusion & Discussion: If anything your team will score faster with two right-handers batting together. Why should that be? One thought: with a left-right combination, the bowler must have a different approach for each batsman, and adopt the optimum lines and lengths for the player on strike. However, with two right handers that isn’t necessary. Is there a risk that a bowler tries to apply the same plan to two quite different right-handed players? I’ve no idea, but it kinda feels possible.

***

This has all been a bit dry, so let’s have some fun. Firstly, the Campbell-Hope award for the pairings who added up to more than the sum of their parts:

Min 300 runs. Top nine teams only.

And the same for slow scoring – where two batsmen either don’t gel or happen to have come together to consolidate not dominate:

Min 300 runs. Top nine teams only.

PS. That was supposed to be some harmless trivia. But Angelo had to spoil it. Did you see him in four of the twelve pairings? Another hypothesis to test: “Is Angelo Mathews better with some players than others”?

Further reading: Cricinfo analysis of ODI partnership averages. Concluded no advantage to left-right partnerships. Doesn’t cover strike rates though – so I may have done something original here.

IPLsplaining

Himanish Ganjoo (@hganjoo153 on Twitter) kindly shared some IPL data with me. Now, I’ve not seen the IPL for a long time, and the last T20 I went to was almost a year ago*. But I can play with data. Here I’ll explore batting in the last five overs.

Batsmen have scored 59,958 runs in overs 16-20 in the IPL, at a strike rate of 154. What makes a successful batsman? To start with, I’ll check the correlations between strike rate and Dots/Singles/Boundaries.

There’s a weak inverse correlation between dots and strike rate.
The inverse correlation between % of balls hit for a single vs strike rate is more compelling
Well now. That’s rather a good fit.

Strike rate in the last five overs is all about boundary hitting. The slow players hit one ball per over to the boundary, where the four top batsmen hit two.
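For readers who want to repeat the correlation check on their own ball-by-ball data, here is a minimal sketch of the calculation. The numbers are invented toy values for five hypothetical batsmen, not the IPL data behind the charts, and are only there to show the function working.

```python
import math


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy values only: one entry per hypothetical batsman in overs 16-20.
strike_rate = [120.0, 140.0, 155.0, 175.0, 200.0]
boundary_pct = [15.0, 20.0, 25.0, 30.0, 35.0]
singles_pct = [45.0, 40.0, 38.0, 33.0, 28.0]

r_boundaries = pearson_r(boundary_pct, strike_rate)
r_singles = pearson_r(singles_pct, strike_rate)
print(round(r_boundaries, 2), round(r_singles, 2))
```

A value near +1 or -1 means a tight straight-line relationship. On real data the dots and singles correlations would be expected to be weaker than the boundary one, as the charts above suggest.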

Slowcoaches

Let’s look at the batsmen that don’t sparkle at the end of the innings:

Not a boundary hitter in sight. None of them have hit 20% of deliveries to the boundary, so all of them underperform.

A shallow read of this says these players are either batting too high (shouldn’t be batting at all) or too low (being exposed trying to keep up at this stage of the innings). Since I know little about T20 I won’t try and go further than that!

Really surprised to see Shakib Al Hasan on the list. There’s a wider point – Al Hasan’s strike rate in ODIs is a healthy 83, yet in T20Is it’s an anaemic 124. I may follow up and see how common that is.

Another way

What about six hitting? I know it’s supposed to matter, but it’s not essential. Here’s some fine batsmen doing it differently:

On average 7.2% of balls in the last five overs in the IPL are hit for six. You can be a successful batsman at the death even if you can’t hit sixes as well as that. These players manage it. All keep their dot ball percentage under 30, they hit way more fours than average, and take slightly more singles.

It’s good to see – there’s room for those that keep it on the deck, even at the end of a T20 innings. Selectors take note.

Farming the strike

If one of the rare 200+ SR players bats with a 130SR player, they would expect to score 0.7 runs per ball more than their partner. There’s an argument for refusing singles, apart from on the last ball of the over.

Similarly, the weakest batsmen should be looking to turn the strike back to an elite batsman. If batting normally is worth 1.3 runs per ball, then the cost of taking a single is only 0.3 runs that ball, and it should be made up for by having the better batsman facing.
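The arithmetic behind those two paragraphs, as a rough sketch. It looks only at the next two balls and assumes each batsman scores at a fixed rate, nobody gets out, and a single is always on offer. Those are my simplifying assumptions, and the function name is just for illustration.

```python
def two_ball_gain_from_single(striker_rpb, partner_rpb):
    """Extra runs expected over two balls if the striker takes a single first.

    Assumes fixed runs per ball for each batsman and ignores wickets.
    """
    bat_normally = 2 * striker_rpb   # striker faces both balls
    rotate = 1 + partner_rpb         # single, then the partner faces the next ball
    return rotate - bat_normally

# 130 SR player gets the 200 SR player on strike: positive, so worth doing.
weak_hands_over = two_ball_gain_from_single(1.3, 2.0)

# 200 SR player gives the strike to the 130 SR player: negative, so refuse the single.
elite_hands_over = two_ball_gain_from_single(2.0, 1.3)

print(round(weak_hands_over, 2), round(elite_hands_over, 2))
```

The weaker batsman gives up 0.3 runs on the first ball and gets 0.7 back on the second, so the pair come out 0.4 ahead. The elite batsman taking a single loses 1.7 over the same two balls.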

The data doesn’t really bear that out (if it did, the trendline for strike rate vs singles wouldn’t be a straight line). Maybe T20 cricket hasn’t fully absorbed this lesson. Or maybe it has, but doesn’t show up as this analysis is based on the last 12 years.

Conclusion

That boundary % chart will stay with me. Boundaries are so valuable that the skill of turning a dot into a one, or finding the gap so one becomes two doesn’t really show up. But we’d be fools for thinking that sixes are the only currency. Fours are OK with me.

* At Cheltenham. Benny Howell took his only T20 five wicket haul. It rained a lot.

What if the 2005 Ashes had been a draft?

What would happen if the 2005 Ashes series started with a draft? I ran this scenario as a way to test my upgraded Test match model. I enlisted outsiders to draft the teams, which meant they were eagle-eyed in reviewing the results (thanks to Rob and Pud for their contribution).

Brilliantly, the series was decided in the last hour at the Oval, with Michael Vaughan shepherding the tail against the new ball.

Model updates

Since the last iteration I’ve added matchups, refreshed ground data, added realistic spin/seam performance by innings, and had another go at lifelike bowling changes.

With this much improvement comes lots of testing, and this exercise is just one small part of that.

Rating Players

Instead of career averages, I used performances up to July 2005 to rate the players. This is how I would have rated players at the time – serving as an additional check of my ratings process.

It throws up a few oddities: Having averaged 54 over the last four years’ County Championship, Rob Key looked Kevin Pietersen’s equal.

The Draft

Squad analysis

Rob foolishly excluded Martyn and Thorpe, but we’ll let him off because England dropped Thorpe in the real world.

Gilchrist is so much better than Geraint Jones that it was a surprise Gilchrist was eighth pick: there was huge value in securing his services early.

Clever from Rob to grab Flintoff and Warne. Once he had done that, there was a premium on Collingwood as the last all rounder: he should have been earlier than 18th pick.

The Series

Rob negotiated a tricky chase of 190 at Lord’s before comfortable back-to-back wins for Pud at Trent Bridge and Edgbaston. McGrath’s match figures of 6-74 at Edgbaston exposed Rob’s tail.

Hubris set in for Pud at Headingley – winning the toss and batting, nobody made it to 30. Then all four bowlers conceded centuries as Rob amassed 504 (Strauss 235*) to set up a comfortable win.

All square two-all going to the Oval. A characteristically flat pitch, yet the pressure almost got to Rob at the toss. With Warne struggling, Rob considered fielding first before his better judgement kicked in.

Three scores in excess of 400 put the game out of Pud’s reach, leaving him 102 overs to survive to share the Ashes. Wickets fell steadily. Collingwood (23) was fifth man out just after lunch, leaving Vaughan (102*) and Gilchrist much to do.

Bizarrely, Gilchrist (52 from 68) counter-attacked. Pud’s views when Warne bagged the wicket are unbroadcastable. With ten overs to go, Vaughan and Harmison were standing firm, but two wickets in two balls for Hoggard won the match and the series for Rob.

Batting Averages

Andrew Strauss was “Man of the Series” for his 557 runs at an average of 80.

Bowling Averages

Warne’s performance was unlucky. His average of 46 was unexpected. Subsequent testing confirmed that he should have thrived against Pud’s numerous right handers, but it didn’t happen for him.

Model upgrades required

– Bring back best bowlers when a team is seven or eight down. Collingwood shouldn’t have bowled at the tail as much as he did – this is why Collingwood bagged 19 wickets at 23.

– Build in the ability to play for the draw. Gilchrist’s five-an-over antics were unlikely on the fifth day with 300 required to win.

Conclusion

A decent hour’s entertainment and two improvements for the model. A success.

Batting ability in Test cricket is not normally distributed (it just looks like it is).

How is talent distributed in elite cricket? Bell curve (ie. normal distribution), or something else? Here I’ll argue that the distribution of ability is the tail of a normal distribution. The evidence is strong at county level, but rather weaker for Test cricket. As you’ll see, I’ve not let that stop me.

1. Marathon Running & County Cricket

Let’s start with a different sport. Here’s the distribution of running performances for millions of marathon runners:

Fig 1 – Distribution of marathon times. Taken from Allen et al.: Reference-Dependent Preferences: Evidence from Marathon Runners. See here.

The spread of marathon times across the population is broadly a bell curve, but there are some subtleties: firstly, that the unfit are less likely to take up long distance running (myself included), so the distribution is lopsided. Secondly, marathon runners appear to have target times, and performances are bunched around times like four hours.

Focus on the distribution of the elite – the quicker the time, the fewer runners are capable of it. Lots of runners at the bottom of the elite pile, then fewer and fewer as the pace goes up.

County cricket fits that pattern (based on my ratings of how players across 2nd XI and the County Championship would fare in Division 1). Loads of quite talented players who could just about make the grade, whittled down to 22 who would average over 40.

Fig 2 – distribution of redballdata county batting ratings, min 30 innings. Excludes overseas players.

2. Test Cricket

Fans of Occam’s razor might want to look away now. This section sees me building a house on sand.

My previous post demonstrated that averages are a function of luck and talent. We know the impact of luck, we have the actual averages. Thus we can work backwards to estimate the distribution of batting talent. I’ll now suggest a distribution of batting ability in Test cricket.

We start by making a graph of the averages of batsmen in Test cricket. Looks a teensy bit like a bell curve, and nothing like the County chart. There are only 300 players, so it’s not a smooth distribution.

Fig 3 – Career averages, batsmen minimum 20 matches, since 1970, batting in the top six.

a. Talent Distribution in Test Cricket

However, selection isn’t perfect. Nor is there a continuous supply of Test standard cricketers in each country. This means a sprinkling of selections who are of a lower standard. Also, each country is a different standard. This means the true distribution of Test batting ability is the sum of the curves for each country.

Putting all that together, the distribution takes the form:

Fig 4 – Suggested distribution of talent in Test Cricket. Each curve is the tail of a normal distribution plus a small number of weaker players. To reflect the relative strengths of cricketing nations (and variation over time), the Overall curve is the sum of three curves (for an inferior, average, and superior team).

That yellow curve is probably smoother in the real world. Still, not terrible as a first attempt at answering the question “what does the distribution of Test batting talent over the last 50 years look like”?
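For anyone who wants to play with the shape, here is a minimal sketch of that construction: each team is the upper tail of a normal distribution plus a few weaker selections, and the overall population is the three teams pooled. Every number in it (pool mean and spread, the cutoffs, the 10% share of weaker picks, the width of the "weaker" band) is a placeholder I have made up for illustration, not the values behind Fig 4.

```python
import random

def draw_between(rng, mean, sd, low, high):
    """Rejection-sample a normal(mean, sd) value in the range [low, high)."""
    while True:
        x = rng.gauss(mean, sd)
        if low <= x < high:
            return x

def sample_team(rng, n, cutoff, pool_mean=25.0, pool_sd=8.0,
                weak_share=0.1, weak_band=8.0):
    """Draw n 'true averages' for one team (all parameters are hypothetical).

    Most players come from the tail above `cutoff`; a fraction `weak_share`
    come from the band just below it, to mimic imperfect selection.
    """
    talents = []
    for _ in range(n):
        if rng.random() < weak_share:
            talents.append(draw_between(rng, pool_mean, pool_sd,
                                        cutoff - weak_band, cutoff))
        else:
            talents.append(draw_between(rng, pool_mean, pool_sd,
                                        cutoff, float("inf")))
    return talents

rng = random.Random(2005)  # fixed seed so the sketch is repeatable
cutoffs = {"inferior": 30.0, "average": 35.0, "superior": 40.0}

overall = []
for team, cutoff in cutoffs.items():
    overall.extend(sample_team(rng, 100, cutoff))

print(len(overall), round(min(overall), 1), round(max(overall), 1))
```

A histogram of `overall` gives a lumpy curve of the same general flavour. The lumps come from using just three teams with sharp cutoffs.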

b. The Luck Curve

The median player had 75 completed innings, so I’ve used that to derive the spread in averages (versus “true” averages). A reminder: this comes from a simulation of many careers.

Fig 5 – impact of luck on average for a top order batsman that has been dismissed 75 times.

Strictly, I should merge many luck curves – a tight one for Tendulkar (292 dismissals), a wide one for Moin Khan (26 dismissals). Still, every journey starts with a single step.
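A rough sketch of how a luck curve like this can be simulated. It assumes every innings ends in a dismissal and that scores follow a geometric distribution (the same chance of getting out on every run). That scoring model is my simplification for this example, and it may well differ from the simulation behind Fig 5.

```python
import math
import random
import statistics

def simulate_career_average(true_avg, dismissals, rng):
    """Average over one simulated career under an assumed geometric scoring model."""
    p_survive = true_avg / (true_avg + 1.0)  # chance of adding one more run before getting out
    total_runs = 0
    for _ in range(dismissals):
        u = 1.0 - rng.random()               # uniform on (0, 1], avoids log(0)
        total_runs += int(math.log(u) / math.log(p_survive))
    return total_runs / dismissals

rng = random.Random(75)  # fixed seed so the sketch is repeatable
careers = [simulate_career_average(40.0, 75, rng) for _ in range(2000)]

spread = statistics.stdev(careers)
theory = math.sqrt(40.0 * 41.0 / 75)  # geometric model: sd of the average is sqrt(m(m+1)/n)

print(round(statistics.mean(careers), 1), round(spread, 1), round(theory, 1))
```

Under this model a batsman with a true average of 40 has a standard deviation of between four and five runs on his career average after 75 dismissals. Swap 75 for 26 or 292 to see how much wider or tighter the curve gets.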

c. Talent * Luck = Performance

We now combine the Talent and Luck curves (probability densities) and compare them to the observed distribution:

Fig 6 – Actual batting averages vs a Theoretical distribution based on proposed luck and talent curves

Not a bad fit. Naturally, the Actual (blue) curve is noisy as there are only 300 players that meet the criteria for inclusion. There are fewer players with very high averages than the talent curve I’ve derived would indicate – implying the real talent curve drops off more steeply than mine.
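Despite the heading, the combination is a mixture rather than a literal product: for each talent level, spread that level's weight along its luck curve, then add everything up. A minimal sketch follows. It assumes a normal-shaped luck curve whose width follows the geometric scoring model (sd of sqrt(t(t+1)/n)), and it uses an invented talent table, so treat the output as a shape to play with rather than a reconstruction of Fig 6.

```python
import math

def normal_pdf(x, mu, sd):
    """Density of a normal(mu, sd) distribution at x."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def observed_density(avg_grid, talent_weights, dismissals=75):
    """Mix the luck curves of each talent level, weighted by how common that talent is."""
    density = []
    for a in avg_grid:
        total = 0.0
        for talent, weight in talent_weights.items():
            luck_sd = math.sqrt(talent * (talent + 1) / dismissals)  # assumed luck width
            total += weight * normal_pdf(a, talent, luck_sd)
        density.append(total)
    return density

# Hypothetical talent table: true average -> share of players (sums to 1).
talent_weights = {30: 0.10, 35: 0.35, 40: 0.30, 45: 0.15, 50: 0.07, 55: 0.03}

step = 0.5
grid = [i * step for i in range(201)]  # averages from 0 to 100
density = observed_density(grid, talent_weights)

area = sum(density) * step             # should be close to 1 for a proper density
peak = grid[density.index(max(density))]
print(round(area, 3), peak)
```

Even with a lopsided talent table, the luck curves smear it into something that looks much more like a bell curve, which is the point of the post's title.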

Discussion

What use is knowing how talented players are (rather than just knowing how well they performed)? In order to judge if a player has been unlucky or is unsuited to Test cricket, one needs to know the level of talent they need to have.

If you feel uneasy about the hand-waving approach I’ve applied here, then don’t worry – because so do I. Tinkering to make one curve look like another (noisy) curve is not the most rigorous analysis I’ve done. Just take away the message that luck plays a big role in averages, and we can’t yet use numbers to know how talented Test batsmen really are.

Further reading

Always worth seeing if someone has asked this question in baseball. Here’s analysis that finds batting ability would be normally distributed if you assume fielding is 30% of the value of a player. I can’t comment on baseball, but for cricket that figure is too high. Thus it’s an interesting technique, but not contradictory to my curves. If one could quantify the value of fielding (and/or other attributes) for top order batsmen, then the approach in the linked piece could be replicated.

***

*Since 1970, batting in the top six, min 20 matches