Impact of “Not Outs” on Averages

A campaign will consist of many skirmishes and a handful of battles. I’m a combatant in a (meaningless) long-standing disagreement with an old friend about the value of Chris Woakes. Today’s battleground is whether Woakes’ Test batting average is artificially inflated because he gets a lot of “not outs” batting at eight. My opponent believes Woakes is “Bang Average” and therefore clutches any straw which supports that case.

I’d read this in Cricinfo back in 2013 and drew the conclusion that if not outs do make a difference, it’s so small I could ignore it for modelling/gambling purposes. As an aside, generally I like to look at things myself before concluding – for some reason the Cricinfo piece sufficed. Possibly because the author was trying and failing to make the case for adjusting averages to reflect not outs.

As counter-arguments go, I couldn’t rely on picking holes in someone else’s argument: I needed some data. Time for Statsguru!

Batting at seven vs eight

Players who have occupied both batting positions will give us the best data on the impact of those positions on average. I took players who had at least 15 completed innings in each role since 1990 and compared performance.
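For anyone wanting to repeat the comparison, here is a minimal sketch of the filter and the two numbers compared per player. The class, the function names and the sample figures are all invented for illustration; they are not Statsguru output or any real player's record.

```python
from dataclasses import dataclass


@dataclass
class PositionRecord:
    """Aggregate record for one player at one batting position (hypothetical structure)."""
    runs: int
    innings: int
    not_outs: int

    @property
    def completed_innings(self) -> int:
        # A completed innings is one that ended in a dismissal.
        return self.innings - self.not_outs

    @property
    def average(self) -> float:
        # Batting average divides runs by dismissals, not by innings batted.
        return self.runs / self.completed_innings

    @property
    def not_out_rate(self) -> float:
        return self.not_outs / self.innings


def qualifies(a: PositionRecord, b: PositionRecord, minimum: int = 15) -> bool:
    """True if the player has enough completed innings in both positions."""
    return a.completed_innings >= minimum and b.completed_innings >= minimum


def compare(higher: PositionRecord, lower: PositionRecord) -> tuple[float, float]:
    """Change in average and in not out rate when batting in the lower position."""
    return (lower.average - higher.average,
            lower.not_out_rate - higher.not_out_rate)


# Invented figures for illustration only.
at_seven = PositionRecord(runs=900, innings=34, not_outs=4)
at_eight = PositionRecord(runs=600, innings=30, not_outs=6)

if qualifies(at_seven, at_eight):
    avg_change, not_out_change = compare(at_seven, at_eight)
    print(f"Average change at eight: {avg_change:+.2f}")
    print(f"Not out rate change at eight: {not_out_change:+.3f}")
```

The filter works on completed innings rather than innings batted because the average's denominator is dismissals, so that is what controls how noisy each player's figure is.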

Fig 1 – Averages batting at seven or eight.
Fig 2 – Impact of additional not outs on average for players batting at eight rather than seven. Note the wide spread of results – there’s no clear trend. Excel’s trendline says the extra not outs from batting at eight do not boost average.

There’s no clear difference between batting at seven and batting at eight.

Six versus seven is where it gets more interesting…

Batting at six vs seven

Fig 3 – Averages batting at six or seven. There seems to be a general benefit to batting at seven, but some outliers in Dhoni, Watling, Whittall.
Fig 4 – Impact of additional not outs on average for players batting at seven rather than six. Now we have something. The players with more not outs tend to get a higher average. Excel’s trendline agrees – cutting from bottom left to top right.
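The trendline can be reproduced outside Excel. Assuming it is Excel's default linear trendline, that is an ordinary least squares fit, sketched below. The four data points are invented purely to show the calculation; they are not the values behind Fig 4.

```python
def trendline(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-style and variance-style sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented points: extra share of innings ending not out (x)
# against change in batting average (y), one pair per player.
extra_not_out_share = [0.00, 0.02, 0.04, 0.06]
change_in_average = [0.0, 0.5, 2.5, 3.0]

slope, intercept = trendline(extra_not_out_share, change_in_average)
print(f"slope = {slope:.1f} runs per unit of not out share")
print(f"intercept = {intercept:.2f} runs")
```

A positive slope is the "bottom left to top right" pattern: more not outs going with a higher average. With a spread as wide as the one in these charts, the slope on its own says little about how reliable the relationship is.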

Discussion

Let’s take stock after that whirlwind of charts. Generally, players that batted at seven got a boost to their batting average, relative to batting at six. This benefit correlates with an increased proportion of not outs when batting lower down.

There is no extra benefit from batting at eight rather than seven. That is not to say there’s no overall benefit to batting at eight rather than six: batting at eight simply gets the same boost, relative to six, as batting at seven does.

I’m no fan of the proposals put forward so far for punishing players for being not out. Yet being not out is correlated with higher averages.

A suggested mechanism: Outrunning Bears

Remember the old joke: two guys, out in the forest, chance upon a bear. The bear starts wandering towards them, and one chap starts tying his shoelaces. Second guy asks “what are you doing – that won’t help you outrun the bear?” First guy answers “I don’t need to outrun the bear, I just need to outrun you”.

In nightmarish batting conditions, the top order have next to no chance of protecting their average. Only one batsman outruns the bear to be not out. There’s an advantage in being the last good batsman in the lineup – you just have to survive while the tail gets blown away, and it’s like the barrage never happened – your average is unscathed.
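To see why surviving a collapse protects an average, here is a small worked example. The career figures and the score of 5 are invented; the only real ingredient is the definition of batting average as runs divided by dismissals.

```python
def average(runs, dismissals):
    """Batting average: runs scored per dismissal."""
    return runs / dismissals


# Invented career record going into the collapse: an average of exactly 30.
runs_before, dismissals_before = 900, 30
collapse_score = 5  # both hypothetical batsmen scrape five runs

# A top-order batsman is out for 5: runs and dismissals both go up.
average_if_dismissed = average(runs_before + collapse_score,
                               dismissals_before + 1)

# The last good batsman finishes 5 not out: runs go up, dismissals do not.
average_if_not_out = average(runs_before + collapse_score,
                             dismissals_before)

print(f"Before the collapse: {average(runs_before, dismissals_before):.2f}")
print(f"Out for 5:           {average_if_dismissed:.2f}")
print(f"5 not out:           {average_if_not_out:.2f}")
```

The same five runs drag one average down and nudge the other up, which is the bear-outrunning effect in miniature.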

Fig 5 – Batting position of the not out batsman when a Test team has been bowled out for less than 100 since 1990 (top six Test teams only)

There we have it – number seven is having his average flattered because when the going gets tough, the number seven gets red ink. Well, 13% of the time anyway.

Extend that to all tricky batting situations, and there is likely to be a real impact on averages: the top six rarely get a not out in tricky conditions, so that benefit belongs to numbers seven to eleven.

Conclusion

Let’s go back to the original question – is Woakes’ batting average benefiting from coming in at eight? I think so.

Can I quantify it? Not yet. All I’ve shown is that lower-order batsmen are more likely to survive in bad conditions, yet how often do they miss out on the best batting situations? If a team ends 400/4 declared, numbers seven and eight don’t see any of that action.

Does it matter? If comparing two players who bat in the same position then there’s no impact on their data. If comparing a seven and a six’s record, then yes – a rule of thumb would be:

Average adjustment = -70 * (additional not out % from batting at seven not six)

Which works out as about -1.5 runs in moving from seven to six.
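As a sketch, the rule of thumb is a one-line function. One assumption is needed to make the numbers agree: the -1.5 figure only follows if the "not out %" is entered as a fraction (0.02 for two percentage points) rather than as 2. That reading, and the function and variable names, are mine.

```python
SLOPE = -70  # runs of adjustment per unit of extra not out share (from the rule above)


def average_adjustment(extra_not_out_share):
    """Adjustment in runs to a number seven's average when comparing with a six.

    extra_not_out_share is assumed to be a fraction of innings,
    e.g. 0.02 if two percentage points more innings end not out at seven.
    """
    return SLOPE * extra_not_out_share


# Working backwards from the quoted -1.5 run adjustment gives the
# extra not out share it implies.
implied_share = -1.5 / SLOPE

print(f"Implied extra not out share: {implied_share:.3f}")
print(f"Adjustment at that share: {average_adjustment(implied_share):+.1f} runs")
```

So the quoted adjustment corresponds to roughly two extra not outs per hundred innings at seven compared with six.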

Further reading

The Institute & Faculty of Actuaries know a thing or two about risk. Their take is here. I didn’t find it persuasive.

Appendix: Detailed Data

Fig 6 – Full list, batsmen with more than 15 completed innings batting at seven and eight in Tests since 1990.
Fig 7 – Full list, batsmen with more than 15 completed innings batting at six and seven in Tests since 1990.

Anderson vs Woakes

When I was at university there was a rumour that one of the Geology professors was about to predict a massive earthquake in South America. This would have been a career limiting move if nothing happened.

In the end neither the bold prediction nor the earthquake materialised.

I thought of that professor’s reputational gamble when I had the idea of asking whether Chris Woakes might be preferred to James Anderson for the Fourth Ashes Test. To misquote Nasser Hussain, “No Ed Bayliss, you cannot do that.”

The scenario

If you are reading this years from now, Sir James Anderson is currently England’s best bowler, though he doesn’t bat very well. Woakes is a decent batsman, and almost good enough to get into the England team as a bowler. Woakes shores up a mediocre top seven and gives the team balance, especially as Jack Leach is a non-batting spinner. Anderson pulled up during the first Test with a calf injury. He missed the next two Tests and has been added to the squad for the fourth. The series is level 1-1 with two to play. Current speculation is that Woakes might make way for Anderson.

Fig 1 – Career Test records

When weighing the merit of the two players, I’ll look at two factors: England and Australia’s expected runs. To do this, I’ll run my model using each player’s career record as the input* and see how the different teams fare.

Batting

If Woakes were dropped, England would have Broad, Leach and Anderson as a long tail. That means a higher probability that a good batsman gets left stranded and not out. The following table shows the impact on expected runs over the course of a match of replacing Woakes with Anderson and rejigging the batting order:

Fig 2 – Comparing modelled runs scored per match by batting position in the two scenarios. Note that Bairstow would expect to score two runs fewer per game as a result of more frequently running out of partners.

England would expect to score 29 runs fewer per match with Anderson rather than Woakes.

Interesting that Broad batting at ten outscores Leach in that position by so much – I think it’s because the likely partnerships with Leach at ten (9th wicket: Broad-Leach, 10th wicket: Leach-Anderson) won’t last long.

Bowling

From a bowling perspective, Anderson has an average that’s four runs per wicket better than Woakes. Their strike rates are similar (Anderson 56, Woakes 59). It’s likely this gap is narrower in English conditions (both average 23 at home), but let’s use the raw data rather than run the risk of flattering Woakes.

Note that England have a solid fifth bowler in Ben Stokes (unlike some teams that would need to use a part-timer if they are bowling all day).

Running this through the model, adjusting for home advantage and Australia’s brittle batting order, the benefit of Anderson’s bowling over Woakes is 13 runs per match. Not enough to offset the weaker batting.

That seems a little low to me: four wickets per match at four extra runs per wicket would be 16 runs. I think it ends up lower because Australia are away from home and aren’t that strong at batting.
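Putting the batting and bowling figures side by side makes the trade-off explicit. This is just the arithmetic on the numbers quoted above, not a rerun of the model; the variable names are mine.

```python
# Figures quoted in this piece, in runs per match.
BATTING_COST = 29         # fewer England runs with Anderson instead of Woakes
BOWLING_GAIN_MODEL = 13   # fewer Australia runs according to the model

# Back-of-envelope alternative to the modelled bowling gain.
WICKETS_PER_MATCH = 4
AVERAGE_GAP = 4           # runs per wicket, Anderson better than Woakes
bowling_gain_rough = WICKETS_PER_MATCH * AVERAGE_GAP

net_model = BOWLING_GAIN_MODEL - BATTING_COST
net_rough = bowling_gain_rough - BATTING_COST

print(f"Rough bowling gain: {bowling_gain_rough} runs")
print(f"Net effect using the model: {net_model:+d} runs per match")
print(f"Net effect using the rough figure: {net_rough:+d} runs per match")
```

Either way the net comes out negative for England, so the choice between the modelled and the rough bowling figure does not change the answer.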

Conclusions

Bringing Anderson into the team for Woakes would be a mistake. Maybe there’s a case for such a change in a must-win match (as the odds of a draw are reduced), but the model does not support such a change for the fourth Test.

It’s important to put this analysis into context. I’m not saying that all specialist bowlers should be replaced by all-rounders. Nor am I saying that Anderson shouldn’t be in the team because he can’t bat.

The head-to-head between Woakes and Anderson is considered in this specific scenario where England have a high-quality fifth bowler (Test average 32), but two weak batsmen in Broad and Leach.

James Anderson is England’s best bowler. If fit he should play. If Anderson is fit one needs to reframe the question: you can pick two of Woakes, Broad and Archer. Just make sure one of them is Woakes. Whatever you do, don’t bring in Anderson for Woakes.

*This might be slightly contentious. Any debate on this topic (though the participant may not realise it) will boil down to whether they believe that career record is the right input to use. For example, I’m not making an adjustment for Woakes’ unusually strong home record, nor am I adjusting to reflect more recent performances (which would boost Anderson’s bowling). Nor am I adjusting because Woakes hasn’t scored many runs this series.