The impact of tutoring on post-pandemic Year 6 maths performance
28th August 2025 by Timo Hannay [link]
As has been widely reported, including several times before on this blog, the COVID-19 pandemic and resulting lockdowns caused large reductions in academic progress among school children when compared to previous cohorts. In order to mitigate this, the Department for Education (DfE) launched the National Tutoring Programme (NTP) in the 2020/21 school year. This provided additional funds to state schools to support one-to-one and small-group tutoring for pupils who had fallen behind. Between November 2020 and the end of the programme in August 2024, schools ran an estimated 6.17 million tutoring courses, equivalent to about 16.4 million hours of tutoring. Since then, schools have still been able to buy extra tutoring, but are required to meet the full cost from existing funds.
The following analysis examines outcomes for primary-school pupils who received online maths tutoring provided by Third Space Learning (TSL). We would like to express our gratitude to TSL for providing access to their proprietary data, as well as funding this work, and to the DfE for granting access to National Pupil Database (NPD) records. This has allowed us to determine Key Stage 2 (KS2) outcomes for children who received TSL tutoring, and to compare these with outcomes for other similar children who did not. Responsibility for this analysis and its conclusions rests with SchoolDash alone.
In summary, we find that:
- TSL pupils are considerably different from KS2 pupils as a whole. In particular, they include a much higher proportion of children who are eligible for free school meals (FSM).
- Consistent with this, uncalibrated results from Standard Assessment Tests (SATs), taken at the end of KS2 (age 11), show that TSL pupils as a group performed below national averages, not only in Maths, but also in Reading and Writing.
- Interestingly, TSL pupils with higher session attendance rates tended to do better, but this could have been caused at least in part by confounding factors such as better preparedness, support or motivation.
- Looking at the subset of TSL pupils who were previously underperforming, and for whom 80% or more of tutoring sessions had good audio quality, their attainment gaps relative to similar control-group pupils appeared to reduce with increasing numbers of sessions.
- Furthermore, this effect was specific to Maths. For TSL pupils receiving 20 or more sessions, the gap in pass rates compared to control-group pupils was 1.8 to 2.5 times narrower in Maths than in GPS or Reading, respectively. However, the sample sizes were relatively small and statistical significance correspondingly low.
- It is important to note that control-group pupils might have received some other form of tutoring or support. All we know with certainty is that they did not receive TSL tutoring.
- Overall, these results suggest measurable improvements in Maths performance for pupils receiving TSL tutoring, but also demonstrate how difficult it can be to fully control for confounding factors, with the consequence that the results presented here probably underestimate the true effects. It would be helpful in future to analyse higher tuition doses with larger sample sizes, and to more fully control for pupil characteristics and educational context.
National Key Stage 2 attainment measures
To provide some background, this section will look at national trends in KS2 Standard Assessment Test (SAT) results using data from the 2022/23 school year (ie, tests sat by Year 6 pupils in May 2023). Raw test marks are converted by the DfE into scaled scores ranging from 80 to 120, with the 'Expected Standard' set by definition at 100 and the 'Greater Depth' threshold at 110. For simplicity, we will refer to pupils obtaining scores of 100 or more as achieving a 'pass', but the official terminology is 'working at the expected standard'.
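To make these thresholds concrete, here is a minimal sketch of how a scaled score might be labelled and a pass rate computed. The function names and labels are ours, chosen for illustration, and we assume that a score exactly on a threshold counts as meeting it.

```python
# Thresholds taken from the text above; names and labels are illustrative only.
MIN_SCORE, MAX_SCORE = 80, 120
EXPECTED_STANDARD = 100  # a 'pass' in this article's shorthand
GREATER_DEPTH = 110      # assumption: scores at or above this count as Greater Depth


def classify_scaled_score(score: int) -> str:
    """Return a simple label for a KS2 scaled score."""
    if not MIN_SCORE <= score <= MAX_SCORE:
        raise ValueError(f"scaled scores run from {MIN_SCORE} to {MAX_SCORE}")
    if score >= GREATER_DEPTH:
        return "greater depth"
    if score >= EXPECTED_STANDARD:
        return "pass"
    return "below expected standard"


def pass_rate(scores: list) -> float:
    """Proportion of scores at or above the Expected Standard."""
    return sum(s >= EXPECTED_STANDARD for s in scores) / len(scores)
```

For example, a hypothetical group with scores of 95, 100, 105 and 112 would have three passes out of four.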
The results presented below are derived from pupil-level records in the NPD, to which we were granted access through the Office for National Statistics Secure Research Service. In order to comply with DfE data-protection rules, no data points corresponding to fewer than 10 pupils are presented. This has some marginal effects on a couple of the figures below; these are noted where relevant.
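As a rough sketch of this kind of small-count suppression, the snippet below blanks out any data point based on fewer than 10 pupils. The function name and the use of None as a placeholder are our own assumptions, not a description of the DfE's or the Secure Research Service's tooling.

```python
from typing import Dict, Optional

SUPPRESSION_THRESHOLD = 10  # from the text: fewer than 10 pupils are not shown


def suppress_small_counts(counts: Dict[int, int]) -> Dict[int, Optional[int]]:
    """Replace any pupil count below the threshold with None so it is not published."""
    return {
        score: (n if n >= SUPPRESSION_THRESHOLD else None)
        for score, n in counts.items()
    }
```

In a distribution chart, suppressed values would typically appear as missing columns at the sparsely populated extremes.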
Figure 1 shows the distribution of scaled scores in Maths, Reading and Grammar, Punctuation and Spelling (GPS) for all Year 6 pupils attending state schools in England in 2023. One artefact of the conversion to scaled scores is the fact that certain scores are much more likely than other adjacent scores, resulting in jagged distributions, particularly in Reading. Modal scores are in the range of 106-110. There appear to be 'ceiling effects' in which the attainment of the highest-performing pupils is not fully discriminated. In contrast, there do not appear to be significant 'floor effects' for low-attaining pupils.
(Use the menu below to switch between Maths, Reading and GPS. Hover over the columns to see the corresponding data values.)
Figure 1: Distributions of pupils by KS2 test score (2023)
Due to the jagged distributions seen above, cumulative data can be easier to interpret. These are shown in Figure 2. It is clear from this that most pupils obtained scores in excess of the 100 'pass mark' across Maths, Reading and GPS.
(Use the menu below to switch between Maths, Reading and GPS. Hover over the columns to see the corresponding data values.)
Figure 2: Cumulative distributions of pupils by KS2 test score (2023)
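The conversion from a jagged per-score distribution to a cumulative one can be sketched as follows. This assumes the cumulative figure counts pupils scoring at or below each value; the function name and the sample counts are hypothetical.

```python
from itertools import accumulate


def cumulative_percentages(counts_by_score: dict) -> dict:
    """Percentage of pupils scoring at or below each scaled score."""
    scores = sorted(counts_by_score)
    total = sum(counts_by_score.values())
    running = accumulate(counts_by_score[s] for s in scores)
    return {s: 100 * c / total for s, c in zip(scores, running)}


# Invented counts: a dip at 100 disappears once the values are accumulated.
example = cumulative_percentages({98: 1, 100: 1, 102: 2})
```

Because each cumulative value sums over all lower scores, irregularities between adjacent scores are smoothed out.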
Figure 3 shows demographic details for the full DfE data set. Boys were slightly more numerous than girls. Pupils eligible for free school meals (FSM) accounted for about 29%, those with English as an additional language (EAL) about 22%, pupils receiving support for special educational needs (SEN) about 15% and those with an Education, Health and Care (EHC) plan about 2%. Note that these data cover only state schools.
(Hover over the columns to see corresponding data values.)
Figure 3: Composition of pupils in DfE KS2 test score data set (2023)
These demographic factors often correlate with substantial differences in attainment, as shown in Figure 4. Boys tend to slightly underperform girls in Reading (yellow columns) and GPS (red), but marginally outperform them in Maths (blue). EAL pupils overperform in Maths and GPS, but not in Reading. On average, FSM, SEN and EHC pupils tend to underperform in all three subjects.
(Hover over the columns to see corresponding data values.)
Figure 4: Mean KS2 test scores by pupil type and subject (2023)
Figure 5 shows a similar breakdown to the one in Figure 4, but using pass rates instead of test scores (where a 'pass' corresponds to a score of 100 or higher). The differences here are more pronounced. For example, the proportions of EHC pupils passing are only about half of those across all pupils.
Note that these pass rates are higher than those reported by the DfE. This is because we are looking only at pupils who took the tests, while the DfE national figures include some of those who didn't – eg, those deemed 'absent', 'unable' or 'just arrived' (see the DfE documentation for details). The DfE data assume that the latter group did not meet the expected level, whilst we exclude them from our analysis.
(Hover over the columns to see corresponding data values.)
Figure 5: KS2 test pass rates by pupil type and subject (2023)
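The difference between our pass rates and the DfE's comes down to the choice of denominator, as this small sketch shows. The function names and the numbers in the usage note are invented for illustration.

```python
def pass_rate_takers_only(passed: int, sat_test: int) -> float:
    """Pass rate among pupils who actually took the test (the approach used here)."""
    return passed / sat_test


def pass_rate_all_eligible(passed: int, sat_test: int, did_not_sit: int) -> float:
    """Pass rate when pupils who did not sit the test are counted as not passing."""
    return passed / (sat_test + did_not_sit)
```

With a hypothetical 730 passes among 950 test-takers and 50 further pupils who did not sit the test, the first approach gives about 77% and the second 73%.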
Third Space Learning pupils
This section looks at the characteristics of pupils who received TSL tutoring in the context of the full DfE data set explored above. The latter contains over 640,000 pupil records (see above for exact sample sizes), while the full TSL data set comprises 14,749 pupil records. These were collected between September 2021 and January 2024, and cover the 2021/22 and 2022/23 school years, representing a total of 762 distinct schools.
The TSL records were matched against the DfE pupil records based on school identifier, sex, full name and/or date of birth. This resulted in around 11,300 TSL records that were successfully matched against corresponding DfE pupil records. Some TSL pupils failed to match because they were not members of the 2022/23 Year 6 cohort; in other cases the pupil data were incomplete or did not match the corresponding DfE information.
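One way such a matching step could work is sketched below. The field names (id, school_id, sex, name, dob), the order of the passes and the rule that a match requires exactly one candidate are all assumptions made for illustration; the actual procedure may well have differed in its details.

```python
from collections import defaultdict


def normalise(value) -> str:
    """Lower-case and collapse whitespace so trivial differences do not block a match."""
    return " ".join(str(value).lower().split())


def make_key(rec: dict, fields: tuple) -> tuple:
    """School and sex are always required; the remaining fields vary by pass."""
    return (rec["school_id"], rec["sex"]) + tuple(normalise(rec[f]) for f in fields)


def match_records(tsl_records: list, dfe_records: list) -> dict:
    """Map TSL record ids to DfE record ids where exactly one candidate exists.

    Tries name + date of birth first, then name only, then date of birth only.
    This ordering is an assumption made for illustration.
    """
    matches, used = {}, set()
    for fields in (("name", "dob"), ("name",), ("dob",)):
        # Index the DfE records that have not yet been claimed by an earlier pass.
        index = defaultdict(list)
        for rec in dfe_records:
            if rec["id"] not in used:
                index[make_key(rec, fields)].append(rec["id"])
        for rec in tsl_records:
            if rec["id"] in matches or any(rec.get(f) is None for f in fields):
                continue  # already matched, or missing a field needed for this pass
            candidates = index.get(make_key(rec, fields), [])
            if len(candidates) == 1 and candidates[0] not in used:
                matches[rec["id"]] = candidates[0]
                used.add(candidates[0])
    return matches
```

Requiring a single unambiguous candidate trades coverage for accuracy: records that are incomplete, ambiguous or belong to another cohort simply remain unmatched.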
Figure 6 shows the same data as in Figure 3, but this time with TSL pupils added (red columns). TSL pupils as a whole were a bit more likely to be girls (consistent with their slight relative underperformance in Maths) and much more likely to be eligible for FSM. They were considerably less likely to be SEN or EHC, presumably because those pupils already had other forms of support in place.
(Hover over the columns to see corresponding values.)
Figure 6: Composition of pupils in DfE KS2 test score data set (2023)
Figure 7 shows the same data as in Figure 1, but with TSL pupils overlaid (red columns), and with the vertical axis showing the proportion of pupils in each data set in order to cater for the very different sample sizes. As expected from their profiles, TSL pupils skew towards lower SAT scores across Maths, Reading and GPS.
(Use the menu below to switch between Maths, Reading and GPS. Hover over the columns to see the corresponding data values.)
Figure 7: Distributions of pupils by KS2 test score (2023)
Figure 8 shows the same data as in Figure 7, but expressed as cumulative totals. Once again, TSL pupils as a whole skew towards lower SAT scores in Maths, Reading and GPS.
(Use the menu below to switch between Maths, Reading and GPS. Hover over the columns to see the corresponding data values.)
Figure 8: Cumulative distributions of pupils by KS2 test score (2023)
Though not shown in the figures, TSL pupils also tended to show lower progress, not just lower attainment. KS2 progress is measured by comparing prior attainment at the end of KS1 (age 7) with SATs attainment at the end of KS2 (age 11). Pupils who perform the same at age 11 as peers who had similar prior attainment at KS1 receive a progress score of zero. (This isn't to say that they have not progressed, only that they have made average progress.) Those who outperform their peers receive positive progress scores, while those who underperform relative to their prior attainment receive negative progress scores. The mean progress score for all pupils in a subject is by definition zero. Taking all TSL pupils together (n=11,345), their mean progress scores were -0.62 for Reading, 0.00 for Writing and -0.96 for Maths. This is a further indication that they were particularly challenged in Maths – though, interestingly, less so in Reading and not at all in Writing.
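The idea behind a progress score can be sketched in a few lines. This is a simplified illustration rather than the DfE's actual methodology: it assumes each pupil has already been assigned to a prior-attainment group, and the field names (id, ks1_group, ks2_score) are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def progress_scores(pupils: list) -> dict:
    """KS2 score minus the mean KS2 score of pupils in the same prior-attainment group."""
    by_group = defaultdict(list)
    for p in pupils:
        by_group[p["ks1_group"]].append(p["ks2_score"])
    group_mean = {g: mean(scores) for g, scores in by_group.items()}
    return {p["id"]: p["ks2_score"] - group_mean[p["ks1_group"]] for p in pupils}


# Invented data: two pupils in each prior-attainment group.
example_pupils = [
    {"id": "a", "ks1_group": "low", "ks2_score": 96},
    {"id": "b", "ks1_group": "low", "ks2_score": 100},
    {"id": "c", "ks1_group": "high", "ks2_score": 108},
    {"id": "d", "ks1_group": "high", "ks2_score": 112},
]
example_progress = progress_scores(example_pupils)
```

Because each score is measured against its own group's mean, the scores within every group, and therefore across all pupils, average out to zero. A subgroup such as TSL pupils can still have a negative mean.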
Figure 9 shows the same data as in Figure 4, but with TSL pupil data added (red columns). TSL pupils tend to achieve lower SAT scores across most pupil types, the exceptions being SEN and EHC. Broadly similar patterns apply across Maths, Reading and GPS.
(Use the menu below to switch between Maths, Reading and GPS. Hover over the columns to see the corresponding data values.)
Figure 9: Mean KS2 test scores by pupil type and subject (2023)
Figure 10 shows the same data as in Figure 5, but with TSL pupils added (red columns). They showed lower pass rates across most pupil types. Patterns for Maths, Reading and GPS were broadly similar.
(Use the menu below to switch between Maths, Reading and GPS. Hover over the columns to see the corresponding data values.)
Figure 10: KS2 test pass rates by pupil type and subject (2023)
The effects of tutoring
Since TSL pupils are atypical, these differences from the national average do not necessarily tell us anything about the impact of tutoring per se. Given that, how can we determine whether the tuition provided has had any effect? This section looks into that question, attempting to allow for potential confounding factors. As we shall see, this is easier said than done, but it is possible to discern some differences between pupils who received TSL tutoring and their peers.
One potential route is to look at the 'dosage' effects of increased amounts of tuition. Figure 11 shows that higher attendance rates correlate with increasing Maths pass rates. However, this might be due at least in part to confounding factors that correlate with higher attendance, such as greater support, preparation and/or motivation.
Furthermore, the relationship with the number of sessions attended is not straightforward: pupils receiving 15 or more sessions showed lower pass rates than those attending fewer sessions. Note, however, that the NTP itself defined a 'course' as corresponding to about 15 hours of tuition. This very likely underlies the step-like reduction in pass rate seen in Figure 11 between pupils attending 10+ sessions (red line) and those attending 15+ sessions (yellow): the latter group were those who received full NTP-like support and are therefore likely to have also been the ones who were most in need of academic help.
(Click on the figure legend to turn individual lines on or off; double-click to show one on its own. Hover over the graph to see the corresponding data values.)
Figure 11: KS2 Maths pass rates by TSL tuition attendance and numbers of sessions
In order to concentrate on pupils affected by the NTP, we separated out the 39% of TSL pupil records that were aimed at 'SATs preparation' (as opposed to the 55% that were 'diagnostic' and the 6% that came from 'teacher selection'), and the 70% of TSL pupils who were assessed as 'Working Towards' (as opposed to the 27% 'Meeting Expectations' and 3% 'Exceeding Expectations'). We also focused on those pupils for whom at least 80% of tutoring sessions were reported to have good audio quality. In short, these were the pupils most likely to have been in need of NTP-style tutoring and to have received such support in appropriate quantities.
Among these pupils, 69% of those attending at least 10 sessions achieved the Expected Standard (n=975), rising to 70% for those with 90% good audio (n=807) and 71% for those with 100% good audio (n=748). These seem impressively close to the national averages for all pupils.
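The selection criteria above can be expressed as a simple filter. In this sketch the field names (purpose, assessment, sessions, good_audio_sessions, maths_score) are hypothetical stand-ins for whatever the underlying data actually contain.

```python
EXPECTED_STANDARD = 100


def in_focus_group(rec: dict, min_sessions: int = 10, min_good_audio: float = 0.8) -> bool:
    """True if a record meets the selection criteria described in the text."""
    if rec["sessions"] == 0 or rec["sessions"] < min_sessions:
        return False
    return (
        rec["purpose"] == "SATs preparation"
        and rec["assessment"] == "Working Towards"
        and rec["good_audio_sessions"] / rec["sessions"] >= min_good_audio
    )


def focus_group_pass_rate(records: list, min_sessions: int = 10, min_good_audio: float = 0.8):
    """Return (pass rate, n) for the focus group, or (None, 0) if it is empty."""
    group = [r for r in records if in_focus_group(r, min_sessions, min_good_audio)]
    if not group:
        return None, 0
    passes = sum(r["maths_score"] >= EXPECTED_STANDARD for r in group)
    return passes / len(group), len(group)
```

Raising min_good_audio from 0.8 to 0.9 or 1.0 gives the progressively smaller groups quoted above.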
A more sophisticated approach is to compare TSL pupils with a control group of otherwise similar pupils who did not receive TSL tutoring. To this end, each TSL pupil was matched with a similar non-TSL pupil from the DfE data set. In this context, 'similar' meant having the same sex, FSM status, EAL status, SEN status and month of birth. It also meant that the control pupil attended a sufficiently similar school to the TSL pupil, based on SchoolDash's standard algorithm. This takes into account the following school-level pupil characteristics: age range, sex ratio, and proportions of FSM/Pupil Premium, EAL and SEN pupils. It is possible to assign more than one control-group pupil to each TSL pupil, but this was found to make little difference to the results whilst considerably complicating the analysis, so in the results presented here we have used only one control-group pupil for each TSL pupil.
It is vital to note that control-group pupils might have received other forms of tutoring or support; all we know for sure is that they did not receive TSL tutoring.
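A one-to-one exact-matching step of this kind might look something like the sketch below. It is not SchoolDash's implementation: the similar_schools lookup is a hypothetical stand-in for the school-similarity algorithm, the field names are invented, and the choices to pick at random among eligible candidates and to use each control pupil only once are our own assumptions.

```python
import random
from collections import defaultdict

MATCH_FIELDS = ("sex", "fsm", "eal", "sen", "birth_month")


def pick_controls(tsl_pupils: list, candidates: list, similar_schools: dict, seed: int = 0) -> dict:
    """Assign one unused, exactly matching non-TSL pupil to each TSL pupil.

    similar_schools maps a school id to the set of school ids treated as
    sufficiently similar (a placeholder for the real similarity algorithm).
    """
    rng = random.Random(seed)  # seeded so that repeated runs give the same controls
    pool = defaultdict(list)
    for c in candidates:
        pool[tuple(c[f] for f in MATCH_FIELDS)].append(c)

    used, controls = set(), {}
    for t in tsl_pupils:
        key = tuple(t[f] for f in MATCH_FIELDS)
        allowed_schools = similar_schools.get(t["school_id"], set())
        eligible = [
            c for c in pool[key]
            if c["id"] not in used and c["school_id"] in allowed_schools
        ]
        if eligible:
            chosen = rng.choice(eligible)
            used.add(chosen["id"])
            controls[t["id"]] = chosen["id"]
    return controls
```

TSL pupils with no eligible candidate are simply left without a control, which is one reason matched samples tend to be smaller than the original data set.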
Figure 12 shows mean Maths test score results for TSL pupils (red line) and their corresponding controls (blue). TSL pupils show an increase in mean score with higher numbers of sessions, while the control group shows a reduction (probably because the pupils most similar to those receiving higher numbers of TSL sessions are more likely to be among those who have fallen furthest behind). In short, there are signs that TSL tutoring has closed the gap.
But not eliminated it altogether. The most likely explanation is that even though we have matched each TSL pupil with an otherwise similar non-TSL pupil, we have not been able to fully control for each pupil's educational context. In other words, the TSL pupils might have forms of underperformance or disadvantage that are simply not captured in the official data.
(Click on the figure legend to turn individual lines on or off; double-click to show one on its own. Hover over the graph to see the corresponding data values.)
Figure 12: KS2 Maths test scores by TSL tuition attendance and numbers of sessions
Figure 13 shows the same analysis as Figure 12, but looks at KS2 Maths pass rates instead of test scores. There is a very similar effect, with the gap narrowing for pupils who received 20 or more sessions.
(Click on the figure legend to turn individual lines on or off; double-click to show one on its own. Hover over the graph to see the corresponding data values.)
Figure 13: KS2 Maths pass rates by TSL tuition attendance and numbers of sessions
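The comparison between TSL pupils and their matched controls at increasing numbers of sessions can be sketched as follows. The input format, the function name and the threshold values are assumptions for illustration; gaps are expressed as TSL minus control, so a negative value means TSL pupils scored lower.

```python
EXPECTED_STANDARD = 100


def gaps_by_min_sessions(pairs: list, thresholds: tuple = (10, 15, 20)) -> dict:
    """Gap between TSL pupils and their controls for pupils with at least n sessions.

    pairs is a list of (sessions, tsl_score, control_score) tuples, one per matched pair.
    """
    result = {}
    for n in thresholds:
        subset = [(t, c) for sessions, t, c in pairs if sessions >= n]
        if not subset:
            continue  # no pupils reached this number of sessions
        size = len(subset)
        tsl_mean = sum(t for t, _ in subset) / size
        control_mean = sum(c for _, c in subset) / size
        tsl_pass = sum(t >= EXPECTED_STANDARD for t, _ in subset) / size
        control_pass = sum(c >= EXPECTED_STANDARD for _, c in subset) / size
        result[n] = {
            "n": size,
            "score_gap": tsl_mean - control_mean,
            "pass_rate_gap": tsl_pass - control_pass,
        }
    return result
```

Note that the groups are nested: every pupil in the 20+ group also appears in the 10+ group, so the values at different thresholds are not independent of one another.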
If these effects are caused by TSL tuition then we would expect them to have more impact on attainment in Maths than in Reading or GPS. Figure 14 shows that this is the case for the gap in test scores among pupils who received at least 20 sessions.
(Hover over the columns to see the corresponding data values.)
Figure 14: Differences in KS2 Maths test scores between TSL and control pupils (2023)
Figure 15 shows a similar trend for the gap in test pass rates (again for pupils receiving at least 20 sessions), which was around 1.8-2.5 times narrower in Maths than in Reading and GPS. However, note that due to the relatively small sample size, the p-value for the null hypothesis that the underlying Maths and Reading distributions are the same is only just below 0.2, while the usual cutoff for statistical significance is 0.05. This result should therefore be seen as indicative rather than conclusive.
(Hover over the columns to see corresponding values.)
Figure 15: Differences in KS2 Maths pass rates between TSL and control pupils (2023)
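To illustrate why sample size matters so much here, the sketch below computes a p-value using a standard two-sided two-proportion z-test. This is a generic textbook test chosen for illustration and not necessarily the one used in this analysis; it also treats the two groups as independent, which is a simplification. The counts in the usage note are invented.

```python
from math import erfc, sqrt


def two_proportion_p_value(pass_a: int, n_a: int, pass_b: int, n_b: int) -> float:
    """Two-sided p-value for the null hypothesis that two pass rates are equal."""
    p_a, p_b = pass_a / n_a, pass_b / n_b
    pooled = (pass_a + pass_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # everyone passed or everyone failed: no evidence of a difference
    z = (p_a - p_b) / se
    return erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|)) for a standard normal
```

With hypothetical pass rates of 60% and 50% in groups of 100 pupils each, the p-value is well above the 0.05 cutoff, whereas the same pass rates in groups of 1,000 would fall far below it. The same observed gap can therefore be inconclusive in a small sample and convincing in a large one.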
Overall, there are signs that maths tutoring did indeed have an impact, but the signals are relatively faint. This could be for one or more of the following reasons:
- The effects of tutoring are indeed modest, at least for the relatively low doses provided as part of the NTP (~15 hours compared to months of lost school time).
- It is not possible in practice to fully control for the educational context of each pupil, and so to determine an appropriate baseline against which to set test score expectations.
- Even control-group pupils might have received support, albeit of a different kind, so we are not necessarily comparing against a baseline of zero extra support.
Yet, difficult though this kind of research is, its potential value is undeniable – and becoming all the more important in the face of what looks set to become a boom in AI-enabled personal tutoring. Without running similar analyses with higher doses, larger sample sizes and better controls we will not know whether this is having the desired effects. We encourage TSL, the DfE and others to continue gathering the data and crunching the numbers.
As ever, we welcome your feedback: [email protected].