Science Achievement in Grades 4 and 5
Exhibit 2.1.1a presents the estimated growth in students’ science achievement between the fourth and fifth grade for the nine participating countries.¹ The numerical results tab displays each country’s average achievement at each grade, along with a growth estimate and its standard error (in parentheses). The graphical results tab shows the distribution of changes in achievement scores, including percentiles, and provides confidence intervals² associated with average growth.
On average, students in all countries showed growth in science achievement between Grades 4 and 5. The interactive table can be sorted by different criteria. When sorted by magnitude of growth, the exhibit highlights Slovenia (growth of +37 scale score points), Korea (+34 scale score points), and Italy (+32 scale score points) as having the most substantial average achievement gains between the two years.
Note that the growth estimates are provided with a margin of error, given in the form of a standard error for each country. For example, Korea’s estimated average growth of 34 points has a 95% confidence interval ranging from 31 to 37 points (34 plus or minus twice the standard error of 1.7, rounded). Besides the standard error, the percentiles of the change distribution are also informative. Across countries, the distribution of change scores generally widens as the average growth decreases (Jordan is the notable exception), meaning that countries with higher average growth tend to have smaller differences among students in how much they learn. In countries with smaller average growth, some students showed a substantially higher test score in 2024 than in 2023, while others showed a lower test score in 2024 than in 2023.
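The interval arithmetic for Korea can be checked with a few lines of Python. This is only a minimal sketch of the "estimate plus or minus twice the standard error" rule described in the text; the function name `confidence_interval` and the `multiplier` parameter are illustrative choices, not part of any TIMSS tooling.

```python
def confidence_interval(estimate, standard_error, multiplier=2.0):
    """Return (lower, upper) as estimate +/- multiplier * standard_error.

    The multiplier of 2 follows the rule of thumb used in the text;
    1.96 is the usual normal-theory value for a 95% interval.
    """
    half_width = multiplier * standard_error
    return estimate - half_width, estimate + half_width


# Korea's figures as quoted in the text: growth of 34 points, standard error 1.7.
lower, upper = confidence_interval(34, 1.7)
print(round(lower), round(upper))  # 31 37

# An interval that excludes zero corresponds to significance at the 5% level.
excludes_zero = lower > 0 or upper < 0
print(excludes_zero)  # True
```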
It is helpful to relate these estimates of average growth to how progressions from lower to higher achievement are described in TIMSS. The TIMSS International Benchmarks are located 75 points apart (Advanced at 625, High at 550, Intermediate at 475, Low at 400). The estimated average growth in the four countries with the highest growth between Grades 4 and 5 therefore equates to an average student advancing nearly half a benchmark interval, which is a substantial amount.
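As a rough illustration of the "nearly half a benchmark" comparison, the sketch below expresses growth as a fraction of the 75-point spacing between benchmarks. It uses only the three growth values named earlier in this section (Slovenia, Korea, Italy); the fourth high-growth country is not named here, so it is left out. The variable names are illustrative.

```python
# TIMSS International Benchmarks as listed in the text.
benchmarks = {"Low": 400, "Intermediate": 475, "High": 550, "Advanced": 625}

# Spacing between adjacent benchmarks (75 points for every adjacent pair).
cut_scores = sorted(benchmarks.values())
spacings = {b - a for a, b in zip(cut_scores, cut_scores[1:])}
assert len(spacings) == 1, "benchmarks are expected to be evenly spaced"
spacing = spacings.pop()

# Average growth (scale score points) for the countries named in the text.
growth = {"Slovenia": 37, "Korea": 34, "Italy": 32}

# Growth expressed as a fraction of one benchmark interval.
fractions = {country: points / spacing for country, points in growth.items()}
for country, fraction in fractions.items():
    print(f"{country}: {fraction:.2f} of a benchmark interval")
```

Under these figures the fractions fall between roughly 0.43 and 0.49, consistent with "nearly half".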
The graphical results tab of the exhibit provides a more comprehensive look at patterns in growth within and across countries. The variation in standard deviations between grades and countries, in interquartile ranges, and in the 5th–95th percentile difference in the growth distribution all indicate that substantial variability in achievement growth exists both between and within countries. Thus, while one country’s average growth may exceed another’s, all nine countries include students with varying growth trajectories between the two years. Overall, most students showed a gain, some showed very little change, and a smaller number showed a decline in achievement. For instance, while North Macedonia showed the lowest average growth in science achievement (+10 scale score points), some of its students demonstrated substantial growth in science and likely outpaced students with low or no growth in countries with higher average growth. Notably, the 95% confidence intervals for average growth do not cover zero in any of the nine countries, indicating statistical significance³ at the 5% level for the growth estimate in all nine countries.
Exhibit 2.1.1b complements Exhibit 2.1.1a and presents average science achievement results for the nine TIMSS 2023 Longitudinal countries in both assessment years: Grade 4 (2023) and Grade 5 (2024). For each country and grade, the numerical results tab shows the average scale score, accompanied by its standard error in parentheses, the 95% confidence interval for the average science achievement, and the standard deviation of student scores with its standard error. As noted above, all countries show gains in average science achievement between the two years.
The exhibit also provides information about within-country score variability and shows that standard deviations generally increased from Grade 4 to Grade 5. In some countries the increase was two points or less, while larger increases can be seen in Jordan (from 103 to 108) and Georgia (from 72 to 80). This suggests that the achievement distributions within some countries widen over time.
¹See the TIMSS 2023 International Results for more information about the science assessment.
²One can think of a confidence interval as a “net” we cast to catch the true average of a whole population. A 95% confidence interval means that if we were to repeat our study 100 times, the nets cast from those 100 studies would catch the true average about 95 times. It doesn’t mean there’s a 95% chance our specific net caught the true value; it either did or it didn’t. Any specific 95% confidence interval is one such application of a method that works 95% of the time.
³Statistical significance means nothing more than that the difference found is big enough that we would expect to see it by random chance less than 5% of the time if there were truly no difference. It’s surprising enough for us to look closer but does not imply practical significance. Also, in very large studies, such as TIMSS, where many students take the test, there is reason to be cautious. One can occasionally find very small differences that are technically statistically significant but are so small they don’t actually matter much in terms of real-world effects (e.g., Berkson, J. (1938)).
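The "net" description of a confidence interval can be illustrated with a small simulation. The sketch below is not how TIMSS estimates standard errors; it assumes simple random samples from a normal population with made-up parameters (`true_mean`, `sd`, `n`), and simply counts how often the interval catches the true mean across repeated hypothetical studies.

```python
import random
import statistics


def simulate_coverage(true_mean=30.0, sd=40.0, n=200, studies=2000, seed=0):
    """Share of simulated studies whose 95% interval contains true_mean.

    All parameters are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(studies):
        # One simulated study: a simple random sample of n change scores.
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        mean = statistics.fmean(sample)
        se = statistics.stdev(sample) / n ** 0.5
        # The interval is the "net"; check whether it catches the true mean.
        if mean - 1.96 * se <= true_mean <= mean + 1.96 * se:
            hits += 1
    return hits / studies


coverage = simulate_coverage()
print(f"Share of intervals containing the true mean: {coverage:.3f}")
```

The printed share should land close to 0.95. Each individual interval either contains the true mean or does not, and the 95% describes the long-run behavior of the method.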