Quantitative Methods for Studying Small Samples
One of the most important methods used for developing match expectancies in football is the Poisson distribution. A quick search online turns up plenty of material on how to predict soccer betting winners with the help of this method.
In this method, the home team is assigned an expected average score based on the strength of its attack and the quality of the visiting team’s defence. An expected average score is assigned to the visiting team in the same way.
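As a rough illustration of how two expected average scores turn into match probabilities, here is a minimal Python sketch. The means of 1.5 and 1.0 goals are hypothetical values chosen only for the example, and the sketch assumes that home and away goals are independent, which is a simplifying assumption rather than something stated in this article.

```python
import math

def poisson_pmf(k, mean_goals):
    """Probability of exactly k goals when the expected average is mean_goals."""
    return math.exp(-mean_goals) * mean_goals ** k / math.factorial(k)

def outcome_probabilities(home_mean, away_mean, max_goals=10):
    """Sum scoreline probabilities into home win, draw and away win.

    Assumes home and away goals are independent (a simplification).
    Scorelines above max_goals are ignored because they are very unlikely.
    """
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_mean) * poisson_pmf(a, away_mean)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win

# Hypothetical expected average scores, for illustration only.
home_win, draw, away_win = outcome_probabilities(home_mean=1.5, away_mean=1.0)
print("home win:", round(home_win, 3))
print("draw:", round(draw, 3))
print("away win:", round(away_win, 3))
```

Small changes in the two means shift all three probabilities, which is why errors in those means matter so much.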
As is obvious, this method runs into major problems at the start of a season, when too few games have been played to form a reliable sample. A single high-scoring match or a run of goalless matches can shift your estimates considerably, which can lead to large errors in the parameters.
One suggested way to measure the extent of these parameter errors is bootstrapping. Bootstrapping techniques create new samples by resampling the data you already have. At the time of writing, the majority of English Premier League teams had played fewer than 5 home and 5 away matches. We highly recommend the following two methods:
Method no. 1 - Using the Straightforward approach
This method involves sampling with replacement: you create new samples of the same size as the original, and the same value may be picked more than once.
So, if we were to take the example of the home matches played by Leicester City, they scored 1, 2, 2 and 3 goals against Crystal Palace, Arsenal, West Ham and Aston Villa respectively in the EPL 2015/16 season. In this particular sample there’s a mean figure of 2 home goals scored per match.
Now let’s draw further random samples of 4 values each from these figures. It’s quite similar to generating random values in a Monte Carlo simulation. The extra sample sets could be:
Sample set 1 – 1, 2, 1, 1
Sample set 2 – 3, 3, 2, 2
Sample set 3 - 1, 1, 3, 2
Sample set 4 - 2, 2, 2, 1
Please note that a value of 2 goals is twice as likely to be drawn as a value of 1 or 3, because it appears twice in the original sample. Note also that the mean can differ from sample to sample; it isn’t 2 in every instance.
The averages of these samples are 1.25, 2.5, 1.75 and 1.75 respectively. You may think that the average is actually 2, but these resamples show that it could plausibly lie anywhere from 1.25 to 2.5.
You can extend this by generating a large number of bootstrap samples and examining the standard deviation of their averages.
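The resampling described in method no. 1 can be sketched in a few lines of Python. This is only an illustration: the number of resamples (10,000) and the fixed random seed are arbitrary choices made for the example, and the function name is hypothetical.

```python
import random
import statistics

# Leicester's home goals from the four matches mentioned above.
home_goals = [1, 2, 2, 3]

def bootstrap_means(values, n_samples, seed=0):
    """Resample `values` with replacement n_samples times; return each mean.

    The seed is an arbitrary choice so that the run is repeatable.
    """
    rng = random.Random(seed)
    size = len(values)
    return [
        statistics.mean(rng.choices(values, k=size))  # same size as original
        for _ in range(n_samples)
    ]

means = bootstrap_means(home_goals, n_samples=10_000)
print("average of the bootstrap means:", round(statistics.mean(means), 3))
print("standard deviation of the means:", round(statistics.stdev(means), 3))
print("lowest and highest mean:", min(means), max(means))
```

The standard deviation printed here gives a feel for how far the true scoring rate could plausibly sit from the observed 2 goals per match.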
Method no. 2 - The “let’s get crazy” approach
For Leicester’s matches, you first generate an expected score for each game. You can arrive at this score in the same way as in the Poisson method, but using the previous season’s data.
For instance, take Leicester’s match against Aston Villa. Leicester scored 28 goals in its 19 home matches during the 2014–15 EPL, an average of 1.474 per match. Aston Villa, on the other hand, conceded 32 goals in its 19 away matches.
Leicester’s attack strength works out to 1, implying that at home they scored just like an average team. Aston Villa’s average of goals conceded away was 1.684 per match.
Dividing this figure by 1.474 gives 114.29%, implying that Aston Villa conceded about 14% more goals than an average team when playing away. Hence, Leicester could be expected to score an average of 1 x 1.1429 x 1.474 = 1.684 goals when playing Aston Villa at home.
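The arithmetic for the Aston Villa match can be checked with a short Python sketch. It assumes that the league-average home scoring rate equals Leicester’s own 28/19, which is what an attack strength of 1 implies; the function and variable names are hypothetical.

```python
def expected_home_goals(home_scored_avg, away_conceded_avg, league_home_avg):
    """Expected goals = attack strength x defence weakness x league average."""
    attack_strength = home_scored_avg / league_home_avg
    defence_weakness = away_conceded_avg / league_home_avg
    expected = attack_strength * defence_weakness * league_home_avg
    return attack_strength, defence_weakness, expected

# Figures quoted in the text for the 2014-15 season.
leicester_home_avg = 28 / 19       # goals scored per home match
villa_away_conceded_avg = 32 / 19  # goals conceded per away match

# Assumption: an attack strength of 1 means the league average
# matches Leicester's own home scoring rate.
league_home_avg = 28 / 19

attack, defence, expected = expected_home_goals(
    leicester_home_avg, villa_away_conceded_avg, league_home_avg
)
print("attack strength:", round(attack, 4))
print("defence weakness:", round(defence, 4))
print("expected goals:", round(expected, 3))
```

Swapping in another opponent’s away record gives the expected goals for each of Leicester’s other home matches.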
You can repeat the same process for all of Leicester’s matches to arrive at the expected number of goals per match (see the table below). You’ll see that Leicester had actually been overperforming, scoring more goals than expected in every match except the one against Crystal Palace.
Opponent / Crystal Palace / Arsenal / West Ham / Aston Villa
Expected goals / 1.263 / 1.158 / 1.526 / 1.684
Actual goals / 1 / 2 / 2 / 3
Difference (actual minus expected) / -0.263 / 0.842 / 0.474 / 1.316
Just as in method no. 1, we sample with replacement, this time from the residuals (the differences between actual and expected goals). Possible samples of residuals are:
Sample 1 - 0.474, 0.474, 1.316, 1.316
Sample 2 - 0.474, -0.263, -0.263, 0.474
We can arrive at other samples of the home goals by adding the sample residuals to the expected scores:
Sample 1 - 1.737, 1.632, 2.842, 3.000
Sample 2 - 1.737, 0.895, 1.263, 2.158
Every sample has its own average, which gives an alternative estimate of the average number of goals scored by the home team. Taken together, these estimates show how much the parameter could vary.
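The residual resampling of method no. 2 can be sketched in Python as follows. The expected and actual goals are taken from the table above; the function names, the number of resamples and the random seed are hypothetical choices made for the illustration.

```python
import random
import statistics

# Expected and actual home goals for Leicester, taken from the table.
expected = [1.263, 1.158, 1.526, 1.684]
actual = [1, 2, 2, 3]

# Residual = actual goals minus expected goals, per match.
residuals = [round(a - e, 3) for a, e in zip(actual, expected)]

def apply_residuals(expected_goals, drawn_residuals):
    """Build one synthetic sample by adding residuals to expected goals."""
    return [round(e + r, 3) for e, r in zip(expected_goals, drawn_residuals)]

def residual_bootstrap_means(expected_goals, residuals, n_samples, seed=0):
    """Resample residuals with replacement; return each synthetic sample's mean.

    The seed is an arbitrary choice so that the run is repeatable.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        drawn = rng.choices(residuals, k=len(expected_goals))
        means.append(statistics.mean(apply_residuals(expected_goals, drawn)))
    return means

means = residual_bootstrap_means(expected, residuals, n_samples=10_000)
print("residuals:", residuals)
print("average of the bootstrap means:", round(statistics.mean(means), 3))
print("standard deviation of the means:", round(statistics.stdev(means), 3))
```

Note that a synthetic “goal count” built this way is usually not a whole number, which is fine here because only the sample averages are used.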
Although the calculation isn’t trivial, you don’t need a full-blown programming language to examine the effects of a small sample. A spreadsheet is enough to test the possible parameters. Keep in mind, however, that if you use the second method you’ll also have to analyse the residuals of the away team’s expected goals.