Cricketing Barbarians

Rugby Union has a representative team called the Barbarians. They are something of an oddity in the professional era – an invitational team who play against international sides in exhibition matches. These are high scoring, free flowing matches that encourage an attacking and entertaining spectacle. It’s a great antidote to the win-at-all-costs culture that has come with professionalism.

Cricket has a significant gap between the top ODI teams and the rest. Matches between these sides will rarely be balanced – since 2015 there have been 69 games between the top eight sides and Non-World-Cup (NWC) teams. The NWCs have won just five and tied one. There were likely some pretty dull days among the 63 defeats. The perception is that it’s hard to sell these games, so boards would rather have long series between the top teams than host a “minnow”.

How about we use the first paragraph as a solution to the second? Picture an invitational side, the best of the NWC teams*, playing an exhibition 50 over match at Lord’s against an England XI to start the international summer. No stats or averages up for grabs. No wider context, apart from the love of Cricket and a desire to grow the game by giving the best-of-the-rest a chance to show what they can do.

It has been tried before: a three day warm up game in 2012. England scraped a three wicket win in a balanced contest. A shortened red ball practice match might not be the right format – it’s unlikely to pull in the crowds in the way an exhibition 50 over game could.

Marketing these games would not be taxing – take a leaf out of the Barbarians’ book and have big name guest coaches (such as Kumar Sangakkara). The Baa-Baas have a nice touch where each player wears their own club socks to complement the black and white hoops of the Barbarians kit. This MCC NWC XI could do something similar with helmets**.

Numbers time

Since this is redballdata.com, we’d better have some stats to support the idea that a composite Barbarian team would be more successful than individual countries.

This isn’t intended to be a comprehensive review of who the best NWC players are – more an indicative view of how players have fared against the best ODI sides. Bertus de Jong is a useful source on Associate Cricket, if you’d like to know more.

Here are some candidates to be on display for this theoretical XI, based on performances against the World Cup teams since 2015. It would be easy to find a competitive top six from these players, ideally not just drawing players from the 11-13th best teams (Ireland, Scotland and Zimbabwe). Note that no individual team can field seven players who have averaged over 27 against the big names.

Fig 1: ODI Batting 2015-2019, NWC vs World Cup Teams. Ranked by average. Excludes players that retired before 1/1/18.

As for bowling, not many great averages, but these Economy rates would keep the NWC XI in the game. As for individual countries, Zimbabwe have plenty of bowling which could challenge the top ODI teams. Ireland and Scotland don’t have that depth.

Fig 2: ODI Bowling 2015-2019, NWC vs World Cup Teams. Ranked by average. Excludes players that retired before 1/1/18.

This Barbarian concept could work. The hosts would get a spectacle and something a little different for the fans, while giving the Associates & Affiliates a chance for their best players to gain experience of competing against a top team.

*If this comes across as condescending to the sides ranked 11-20, it isn’t intended to. The world is getting smaller, and if Cricket doesn’t widen its popularity, richer sports will. Think of this proposal as a means to an end, building towards bilateral series.

** It may not surprise you to learn that I don’t work in Marketing.

Further Reading

The Cricketer magazine flagged the best players missing from the 2019 World Cup in an article here.

Here are some highlights of an England vs Barbarians fixture in 2018.

Left arm pace in ODIs – where are all the part-timers?

Like most sports fans my weekends can include shouting at the radio. Unlike most sports fans I’m usually het up about statistics, not necessarily the performances on the field.

Last weekend it was claimed that the Indian middle order has a weakness against left arm pace in ODIs. I won’t name the individual that said it, because they are an excellent commentator and this isn’t intended to be a criticism of them.

What’s wrong with that claim? Left arm pace bowlers are normally front line bowlers, so are better than the average bowler. That means that when it is said that “X does badly against left arm quicks” we really mean “X is less good against the better bowlers”. Of course they are, we all are!

Time for a couple of charts. Firstly, there’s a clear distinction between performance of front line bowlers (ie. those that bowl on average more than six overs per innings) and the “change” bowlers:

Fig 1: Bowling records for the 10 World Cup teams this decade.

Note the key difference in average between front line and backup bowlers – 12.1 runs per wicket. It’s likely that the backup bowlers bowl in the middle overs, so flattering their Economy Rates compared to the bowlers trusted to finish the innings.

Sampling the data in any way that includes a greater proportion of front line bowlers will give metrics that indicate batsmen are struggling. For instance, by only measuring performance against left arm pace bowlers!
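A rough sketch of that sampling effect, in Python. Only the 12.1 run gap comes from Fig 1; the front line average of 30.0 and the wicket shares are invented purely for illustration. The point is that the blended runs per wicket falls as front line bowlers take a bigger share of the wickets in the sample, even though no batsman has changed.

```python
# Gap between backup and front line bowling averages, taken from Fig 1.
AVERAGE_GAP = 12.1
# Assumption for illustration only: this front line average is invented.
FRONT_LINE_AVERAGE = 30.0
BACKUP_AVERAGE = FRONT_LINE_AVERAGE + AVERAGE_GAP


def blended_average(front_line_wicket_share: float) -> float:
    """Runs per wicket for a sample, weighted by each group's share of wickets."""
    backup_share = 1.0 - front_line_wicket_share
    return (front_line_wicket_share * FRONT_LINE_AVERAGE
            + backup_share * BACKUP_AVERAGE)


# Hypothetical samples: all bowlers, then a sub-group that is nearly all front line.
all_bowlers = blended_average(0.70)
mostly_front_line = blended_average(0.95)
print(f"All bowlers: {all_bowlers:.1f}, sub-group: {mostly_front_line:.1f}")
```

With these made-up shares the sub-group looks about three runs per wicket harder to face, purely because of who is in it.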

Next, the same view as above, but with left arm pace bowlers. Note how high the average overs per innings are for left arm pace bowlers. There’s a full list of bowlers at the end of this piece.

Fig 2: Bowling records for the 10 World Cup teams this decade. Split between left arm pace and others.

Left arm pace bowlers average 12% less than other bowlers (admittedly while conceding runs 3% faster). For analysis of the advantages left arm pace bowlers have, refer to this Cricinfo article http://www.espncricinfo.com/magazine/content/story/851399.html

But why wouldn’t there be part-time Left arm quicks? It could be margin of error: bowling over the wicket to a right hander, straying onto the pads is risky while conservatively keeping a consistent line outside the off stump takes bowled and LBW out of the equation.

Summing up, we can draw two conclusions. When considering performance against a sub group of bowlers, one needs to adjust for the quality of that sub-group. The smaller the sub-group, the more careful you need to be. Also, expect all batsmen to average 12% less against left arm pace bowlers in ODIs.
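As a minimal sketch of that adjustment: before calling something a weakness, compare a batsman's record against left arm pace with a baseline that is already 12% lower. The 12% figure is from Fig 2; the function names and the example batsman are hypothetical.

```python
# From Fig 2: left arm pace bowlers average 12% less than other bowlers.
LEFT_ARM_PACE_DISCOUNT = 0.12


def expected_average_vs_left_arm_pace(average_vs_others: float) -> float:
    """Baseline a batsman should be judged against when facing left arm pace."""
    return average_vs_others * (1.0 - LEFT_ARM_PACE_DISCOUNT)


def relative_performance(actual_vs_left_arm: float, average_vs_others: float) -> float:
    """Above 1.0 beats the adjusted baseline, below 1.0 falls short of it."""
    return actual_vs_left_arm / expected_average_vs_left_arm_pace(average_vs_others)


# Hypothetical batsman: averages 40 against everyone else, 36 against left arm pace.
baseline = expected_average_vs_left_arm_pace(40.0)
ratio = relative_performance(36.0, 40.0)
print(f"Baseline {baseline:.1f}, relative performance {ratio:.2f}")
```

In this invented case the batsman averages four runs fewer against left arm pace, yet is slightly ahead of the adjusted baseline, so there is no weakness to report.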

Appendix

Fig 3: All left arm pace bowlers to have played ODI Cricket for the 10 world cup teams since 2010.

There’s barely a part-timer in the group. Only Anderson, Franklin, Udana, Reifer averaged less than six overs per innings – and Raymon Reifer has only played two games!

Still reading? Here’s another example to make the point: imagine a naïve Cricket Analyst for a Test team at the end of last Century. Crunching the numbers they see that most batsmen underperform against leg spin, so recommend the selectors fast track a ‘leggie’. The unfortunate Analyst didn’t notice that there weren’t very many leg spinners out there. All they’ve really discovered is that it wasn’t easy to face Shane Warne, Mushtaq Ahmed or Stuart MacGill. That’s not especially insightful: right data, wrong conclusion.

Matchups and Opportunity Cost

There’s a theory (which I just invented) that you could listen to old radio broadcasts of Cricket and be able to judge the date by the buzzwords of the era. For 2019, it’s “Matchups”: pitting bowlers against the optimum batsmen to stifle run scoring and take cheap wickets.

Matchups seem like a plausible proposition – get enough data, find some patterns, check you’ve got a decent sample size and out will pop some options to consider. Note the need for a plausible proposition (ie. not “Roy struggles against the flipper in the top of the hour when the bowling is from the North-West”).

There are three issues I have with the use of Matchups.

Firstly, they aren’t publicly available – if a pundit refers to X having a weakness against a particular type of bowling, the viewer/listener has no way of knowing if that’s a fact or an opinion. In times gone by, we could accept that all such utterances were opinions, and who better to go to for opinions than people who report on the game for a living? The balance has shifted – so now when hearing “Bairstow struggles against spin early in the innings”, it could be opinion, bad data*, or a solid piece of analysis. There’s something unsatisfying about that.

Secondly, we don’t know if Matchups work. If each one is a hypothesis, it should be easy to aggregate them in order to compare results and expectation. I expect much of this is – understandably – happening behind closed doors. My hunch is also that many Matchups evaporate as statistical flukes, so are of no benefit. If you’re aware of a rigorous assessment of Matchups, please do drop me a line on twitter or via the Contact page on this site.

Finally, and of relevance to the Cricket World Cup, there’s an opportunity cost associated with changing bowling plans. Especially in ODIs where bowlers need rest during an innings.

Let’s explore that Opportunity Cost – what are the downsides of opening with spin? We can expect more teams to open with spin against England after Bairstow fell first ball against Imran Tahir. Here’s how South Africa used their bowling resources that day:

Fig 1: Overs bowled by each player

Early wickets have a big impact on expected score – but one cannot fully appraise the impact of opening with Tahir without taking all factors into account.

  • Rabada didn’t get the new ball. He then had to condense 10 overs into 44, rather than across 50 – does that impact the pace he can bowl?
  • After 24 overs, with the score on 131-3, Faf du Plessis threw the ball to JP Duminy. Five of the next eight overs were bowled by Duminy and Markram. On this occasion it worked – 5-0-30-0 is not too bad. But it’s the big picture that matters, not one innings.
  • Pretorius only bowled seven overs, Phehlukwayo eight. Without a medium pacer or second spinner that can bowl 10 overs in a row, once a team opens with spin, they are probably going to underuse their fourth and fifth bowler.

What are the factors to consider when weighing up whether to open with a spinner in a four pace / one spin attack?

  1. Will it work? What is the increase in chance of a wicket versus the default option?
  2. What are the relative strengths of your sixth (and possibly seventh) best bowlers, compared to your fourth and fifth?
  3. How fit is the bowler who won’t now be opening? Are you confident they can bowl 10 out of 44 overs? How many days since your last game?

What have we learned? The value of a Matchup is the expected gain from one pairing over another, less the downsides of changing the bowling order to accommodate using a specific bowler at a particular time.
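That definition can be written down as a one line calculation. This is only a sketch: measuring both sides in runs is my assumption, and the inputs below are hypothetical rather than measured.

```python
def matchup_value(expected_gain_runs: float, reshuffle_cost_runs: float) -> float:
    """Net value of a Matchup: the gain from the pairing, less the cost of
    rearranging the bowling order to make that pairing happen."""
    return expected_gain_runs - reshuffle_cost_runs


# Hypothetical inputs: opening with spin saves 4 runs against this opener,
# but underusing the fourth and fifth bowlers later costs 5.5 runs.
net = matchup_value(expected_gain_runs=4.0, reshuffle_cost_runs=5.5)
print(f"Net value of the Matchup: {net:+.1f} runs")
```

A Matchup that looks good in isolation can come out negative once the knock-on cost is included.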

* A word on bad data: Andrew Strauss averaged 91.5 against Mitchell Johnson in Tests. It’s a nice piece of trivia, but it’s only based on Strauss scoring 183-2 against Johnson. I doubt this would have much predictive power. Using that as a basis of prediction is roughly the equivalent of writing off Graham Gooch after he bagged a pair on debut.

Further reading: Cricmetric.com claims to have Matchup data for Batsmen vs Bowlers – I’ve no reason to doubt their data.

What’s a dropped catch worth in ODI Cricket?

Jason Roy dropped the ball today. I didn’t see it, but apparently it was rather an easy catch. Pakistan went on from 135-2 (24 overs) to finish 348-8, a score just out of England’s reach. The final winning margin was 14 runs.

What did that drop do to Pakistan’s expected score? Here’s the simulations for the two scenarios: 136-2 (24.1) and 135-3 (24.1)

Fig 1: Two scenarios for the 145th ball of Pakistan’s Innings: Out or one run scored.

If Hafeez had been out, the mean score was 350, while the dropped catch increased the mean score to 377. That’s a 27 run impact.

Can we break that down?

  • Firstly, the runs scored on that ball. Value = one run. Easy.
  • Secondly, the reduced run rate as a new batsman plays themselves in. According to some analysis I’ve done on how batsmen play themselves in, that’s worth four runs (Hafeez had faced 12 balls by this point, so would have been just starting to accelerate).
  • The rest of the impact (22 runs) comes from two factors: more conservative batting from Pakistan as a result of having fewer wickets in hand, and the increased chance of getting bowled out (and thus not using all their overs).
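The breakdown above is simple enough to check in a few lines. The two mean scores and the first two components are the figures quoted in this post; the third is just the remainder.

```python
MEAN_SCORE_IF_OUT = 350      # simulated mean if the catch is taken
MEAN_SCORE_IF_DROPPED = 377  # simulated mean after the drop

total_cost = MEAN_SCORE_IF_DROPPED - MEAN_SCORE_IF_OUT

components = {
    "run scored on that ball": 1,
    "new batsman playing himself in": 4,
}
# Whatever is left over is attributed to wickets in hand and the risk
# of being bowled out before the overs are used up.
remainder = total_cost - sum(components.values())
components["fewer wickets in hand / bowled out risk"] = remainder

for reason, runs in components.items():
    print(f"{reason}: {runs} runs")
print(f"Total cost of the drop: {total_cost} runs")
```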

To generalise, the cost of a dropped catch would be a function of:

  • Runs scored on that ball
  • Whether the surviving batsman is set
  • How long left in the innings (the wicket affects the value of future deliveries. Thus the later in the innings a wicket falls, the lower the value of that wicket)
  • How many wickets the batting team has in hand (does the wicket cause more defensive batting)? In this case, being three wickets down after half the innings still leaves plenty of scope for aggressive batting so doesn’t have as big an impact as it could.
  • Strike Rate and Average of the reprieved batsman relative to the rest of the team (dropping Wahab Riaz is better than dropping Babar Azam).

Interesting topic. I might come back to this when other people drop sitters.

World Cup Scheduling is Bangladesh’s friend

Had a brief look at the Cricket World Cup fixtures, didn’t see anything of interest – with 40 odd days to play nine matches each, there would be plenty of time between games so no need to take fatigue into account.

Actually, the fixture list has some unnecessary oddities. There are two occasions when a team plays twice in three days.

Afghanistan follow their match against India on 22nd June with one against Bangladesh on the 24th.

Then India have their own congestion when their 30th June game against England is followed by – you’ve guessed it – Bangladesh on the 2nd July.

Individual performances in the Royal London One Day Cup show that when bowlers have had less rest than the batsmen they are bowling to, the batsmen get a boost. This was particularly clear cut when teams played twice in three days.

Bangladesh may have a better chance of making the semi finals than the 18% implied by the bookies.

Fantastic boundaries and when to find them

Using a ball-by-ball database of 2019 ODIs, I’ve looked at boundary hitting through the innings. This was to refresh my ODI model, which was based on how people batted in 2011.

Fig 1: Boundary hitting by over. ODIs between the top nine teams, Q1 2019

Key findings:

  • First 10 over powerplay: 10% of balls hit for four, c.2% sixes. Just two fielders outside the ring.
  • Middle overs 10 – 40: c. 8% balls hit for four, c. 2% sixes. Four fielders outside the ring limits boundary options. Keeping wickets in hand means batsmen don’t risk hitting over the top, though if a team has wickets in hand the six hitting rate starts to pick up from the 30th over.
  • Overs 40-45: Six hitting reaches 5%. No increase in the number of fours: five boundary riders give bowlers plenty of cover.
  • Overs 46-50: Boundary rate c.18% with boundaries of both types picking up.
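To put those rates in run terms, here is a small sketch that converts them into expected boundary runs per over. It is not the model itself, just arithmetic on the approximate figures above. The 8% fours rate for overs 40-45 is my reading of "no increase", and overs 46-50 are left out because only the combined 18% boundary rate is given.

```python
BALLS_PER_OVER = 6

# (share of balls hit for four, share hit for six), from the key findings.
BOUNDARY_RATES = {
    "powerplay": (0.10, 0.02),
    "middle overs": (0.08, 0.02),
    "overs 40-45": (0.08, 0.05),  # assumption: fours stay at the middle overs rate
}


def boundary_runs_per_over(four_rate: float, six_rate: float) -> float:
    """Expected runs per over that come from fours and sixes alone."""
    return BALLS_PER_OVER * (4 * four_rate + 6 * six_rate)


runs_by_phase = {
    phase: boundary_runs_per_over(fours, sixes)
    for phase, (fours, sixes) in BOUNDARY_RATES.items()
}
for phase, runs in runs_by_phase.items():
    print(f"{phase}: {runs:.2f} boundary runs per over")
```

On these numbers the extra sixes in overs 40-45 are worth roughly one more run per over than the middle overs.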

These probabilities have been added to the model, which now makes some sense and isn’t claiming a 6% chance England score 500!

An early view of what the model thinks for Thursday’s Cricket World Cup opener – if England bat first 342 is par. 69% chance England get to 300, 20% chance of England getting to 400. I can believe that, it is The Oval after all.

The ODIs they are a’changing

My ODI model was built in those bygone 260-for-six-from-50-overs days. Having dusted it off in preparation for the Cricket World Cup it failed its audition: England hosted Pakistan recently, passing 340 in all four innings. Every time, the model stubbornly refused to believe they could get there. Time to revisit the data.

Dear reader, the fact that you are on redballdata.com means you know your Cricket. Increased Strike Rates in ODIs are not news to you. This might be news to you though – higher averages cause higher strike rates.

Fig 1: ODI Average and Strike Rate by Year. Top 9 teams only. Note the strength of correlation.

Why should increasing averages speed up run scoring? Batsmen play themselves in, then accelerate*. The higher your batsmen’s averages, the greater proportion of your team’s innings is spent scoring at 8 an over.

Let’s explore that: Assume** everyone scores 15 from 20 to play themselves in, then scores at 8 per over. Scoring 30 requires 32 balls. Scoring 50 needs 46 balls, while hundreds are hit in 84 balls. The highest Strike Rates should belong to batsmen with high averages.
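That toy assumption is easy to sketch in code. The constants are the ones stated above; scoring evenly during the first 20 balls is an extra simplification of mine. The unrounded results land within a ball of the figures quoted.

```python
PLAY_IN_RUNS = 15      # runs scored while playing in
PLAY_IN_BALLS = 20     # balls taken to play in
SET_RUNS_PER_OVER = 8  # scoring rate once set


def balls_needed(target_runs: float) -> float:
    """Balls needed to reach a score under the toy assumption (not rounded)."""
    if target_runs <= PLAY_IN_RUNS:
        # Simplification: runs come evenly while playing in.
        return target_runs * PLAY_IN_BALLS / PLAY_IN_RUNS
    return PLAY_IN_BALLS + (target_runs - PLAY_IN_RUNS) * 6 / SET_RUNS_PER_OVER


def strike_rate(target_runs: float) -> float:
    """Runs per 100 balls for an innings that ends on the given score."""
    return 100 * target_runs / balls_needed(target_runs)


for score in (30, 50, 100):
    print(f"{score} runs: {balls_needed(score):.2f} balls, "
          f"strike rate {strike_rate(score):.0f}")
```

The strike rate climbs from 96 for a score of 30 to about 119 for a hundred, with no change in how anyone bats.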

Here’s a graph to demonstrate that – it’s the top nine teams in the last ten years, giving 90 data points of runs per wicket vs Strike Rate

Fig 2: Runs per over and runs per wicket for the first five wickets for the top nine teams this decade, each data point is one team for one year. Min 25 innings.

Returning to the model, what was it doing wrong? It believed batsmen played the situation, and that 50-2 with two new batsmen was the same as 50-2 with two players set on 25*. Cricket just isn’t played that way. Having upgraded the model to reflect batsmen playing themselves in, now does it believe England could score 373-3 and no-one bat an eyelid? Yes. ODI model 3.0 is dead. Long live ODI model 4.2!

Fig 3: redballdata.com does white ball Cricket. Initially badly, then a bit better.

Still some slightly funny behaviour, such as giving England a 96% chance of scoring 200 off 128 or a 71% chance of scoring 39 off 15. Having said that, this is at a high scoring ground with an excellent top order. Will keep an eye on it.

In Summary, we’ve looked at how higher averages and Strike Rates are correlated, suggested that the mechanism for that is that over a longer innings more time is spent scoring freely, and run that through a model which is now producing not-crazy results, just in time for the World Cup.

*Mostly. Batsmen stop playing themselves in once you are in the last 10 overs. Which means one could look at the impact playing yourself in has on average and Strike Rate. But it’s late, and you’ve got to be up early in the morning, so we’ll leave that story for another day.

**Bit naughty this. I have the data on how batsmen construct their innings, but will be using it for gambling purposes, so don’t want to give it away for free here. Sorry.

Preview: RLODC 2019 Semi Final 1

Nottinghamshire vs Somerset 12th May 2019

redballdata.com modelling: Nottinghamshire 51% – Somerset 49%

At first glance Notts look unstoppable: W6 L1 NR1, NRR +0.6. Two days of rest and home advantage.

Their batting is excellent: Hales and Duckett average high 30s at a run a ball over their careers, which more often than not means a solid platform, with runs on the board and wickets in hand for Mullaney, Moores and Fletcher to work with at the end of the innings. During the group stages Notts scored over 400 twice in seven innings (Somerset’s highest is 358).

However – Somerset’s strength is their bowling – specifically taking wickets.

This makes for a rather unusual range of first innings scores if Notts bat first. Remember that Trent Bridge is a high scoring ground.

Fig 1: Notts projected runs.

Notts are just as likely to score 201-225 as they are 426-450! Such an even distribution is very rare. Nottinghamshire have a roughly 1500-1 chance of breaking the List A world record of 496.

Compare that to the more steady Somerset. Ali, Hildreth, Abell are dependable but not explosive batsmen. Batting deep means they can dig themselves out of trouble and find their way to a total. Thus Somerset have a 66% chance of scoring in the range 276-375.

Fig 2: Somerset projected runs

These are two evenly matched teams.

If you want an even contest that bubbles up over time, hope that Somerset bat first – they will get a reasonable score. Personally, I’d like to see Notts bat first because *cliche* anything could happen. Yes, I appreciate that means a good chance of a low score that Somerset fly past, or a high score that the visitors will get nowhere near.

“Royal London One Day Cup – Group Stage Review” or “Notts and Hants FTW”

If imitation is the sincerest form of flattery, I have something of a crush on International Cricket Captain. Much of the modelling I’ve done is an attempt to recreate what that game could do in simulating whole matches in the blink of an eye. Here is a link to the International Cricket Captain website, if you think you might have 300 hours to kill this summer.

There are two parts of the International Cricket Captain engine I’ve not incorporated: Form and Fatigue. I don’t believe in form and won’t incorporate it until it shows up in the numbers (if the facts change, I’ll change my mind). Let’s look at fatigue instead…

Background

Fixture congestion is nothing new – who can forget 1066, when Harold II’s middle order collapsed at Sussex just 19 days after an attritional fixture on a Yorkshire out-ground.

The Royal London One Day Cup (RLODC) has a punishing schedule – most matches are played less than 48 hours after the last one finished. Some teams get longer breaks – which means we have tired players against slightly less tired ones. This gives us some tasty data to measure the impact of fatigue.

Before we get into the numbers, I’d like to define the tiredness in question – it’s mid-week weariness. Not the short term fatigue that means that as a bowler goes through a spell their effectiveness drops, nor the possibility of long term decline over a season from a relentless schedule. This tiredness is like the mid-music-festival malaise one might experience on the Saturday of Glastonbury, when the preceding days take their toll.

To define a “fatigue factor” we need to see how players fare when one team has had more rest than the other.

Findings

Factors affecting RLODC Team Performance

  • Home Advantage: Home team gains 0.13 runs per over. Away team loses 0.13 runs per over. Net effect on a match 13 runs. I wasn’t specifically looking for this, but had to analyse it as a factor that needed to be controlled for before concluding on Fatigue.
  • Fatigue: Batting team better rested gains 0.23 runs per over. More rested Bowlers concede 0.23 fewer runs per over. Maximum impact on a match 23 runs.
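The match level figures follow from the per over effects, assuming both innings run the full 50 overs. A quick sketch of that arithmetic, with a function name of my own choosing:

```python
OVERS_PER_INNINGS = 50


def match_swing(runs_per_over_effect: float, overs: int = OVERS_PER_INNINGS) -> float:
    """Net runs over a match: the favoured side gains the effect when batting
    and the opposition loses the same amount in the other innings."""
    gain_when_batting = runs_per_over_effect * overs
    saved_when_bowling = runs_per_over_effect * overs
    return gain_when_batting + saved_when_bowling


home_advantage = match_swing(0.13)
fatigue = match_swing(0.23)
print(f"Home advantage: {home_advantage:.0f} runs, fatigue: {fatigue:.0f} runs")
```

A shortened or rain affected match would scale both numbers down in proportion to the overs bowled.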

Implications i. 2019 RLODC

Fatigue has an interesting effect on the semi finals: the winners of the North and South groups host the winners of quarter finals between the teams which finished second and third in the groups. The quarter finals take place on the 10th May 2019, the semi finals on the 12th May 2019.

Nottinghamshire and Hampshire have been the best teams in the group stage, and will have both home advantage and the benefit of >6 days rest, rather than the two days of rest the quarter finalists have.

I will be running these extra inputs through my 50 over model this weekend to see if this insight offers any gambling opportunities. My expectation is that I’m late to the party on this, and the odds will already factor in rest periods and home advantage.

Implications ii. Selection

In a tournament like the RLODC, we should see more rotation of bowlers in and out of the team, particularly if a squad has bowling depth. Sussex only used eight bowlers in as many matches: who knows whether giving Hamza a day off might have been the difference that got them into the quarter finals, instead of mid-table disappointment. Just imagine if Sussex had had Chris Jordan available to them for the second half of the group stages, rather than on England duty.

Further Reading

Green All Over – Betting Blog, see link for a post on the impact of rest on Baseball odds (which reminded me that there was a potential input I was ignoring).

No winning on Tour

Tours are strange beasts. Anyone who has ever been on a Club Rugby tour can attest that pre-match preparation isn’t entirely conducive to peak performance.

Professional sport should be the opposite of this. Next time you are watching Cricket on TV and they cut to the pavilion balcony, count how many non-playing staff are on hand. I’m not criticising touring parties for being too large – I’ve no data to assess that on. My point is that lots of money is spent by governing bodies to ensure enough specialists are on hand to keep eleven cricketers playing at their best.

Here’s a theory – all this investment in the extra 1% is missing the wood for the trees. The tour scheduling is an unseen problem.

Recall the post-before-last regarding Home Advantage growing as a series goes on, which left your correspondent with an effect but no obvious cause? Going through the archives of @Chrisps01’s blog turned up a possible clue – [link] – some analysis on rest periods between matches. A quick re-cut of the data and I could look at this effect quantitatively with two decades’ worth of data.

There’s a certain base advantage in the first Test of a series, which is kept at the same level if subsequent Tests are played back-to-back (ie with less than a seven day gap between matches). Away teams are at a much bigger disadvantage when there is a longer gap between Tests.

Think back to summer 2017 – on August 29th West Indies beat England by five wickets to square the series with just the Lord’s Test to come. On September 2nd & 3rd the full strength West Indies team toiled in a meaningless draw against Leicestershire. England rested. West Indies put up little resistance in the third Test, scoring just 300 runs over two innings.

Why might away teams struggle with longer gaps between Tests? Here’s how I rationalise it:

  1. With very short gaps between Tests, both teams are fully focused on recovery and getting the XI back ready to play the next Test. Both teams are therefore doing the same things and so no team gains an advantage over the other.
  2. Longer gaps between Tests mean tour matches for the away team, and (in the modern era) rest for the home team. Even if not all of the team are involved in a tour match, the focus of the touring party is likely to be distracted by a competitive fixture.
  3. Players for the host team may get the opportunity to go home for a few days during a break in the series – the away team will still be living in hotels.
  4. The data implies that the home team’s activities result in better performance in the next Test.

Touring teams should revisit their itinerary so they are best placed to compete throughout a series: plenty of rest, no meaningless mid-series tour matches.