Systematic Review

Sample size in educational research: A rapid synthesis

[version 1; peer review: 1 approved with reservations, 1 not approved]
PUBLISHED 09 Oct 2023

This article is included in the Datta Meghe Institute of Higher Education and Research collection.

Abstract

Background: 
A review was conducted, through an in-depth study of the published literature, with the goal of evaluating sample size in educational research. The sample size, represented by the letter “n,” is a key factor in this research because it specifies the number of participants who represent the target population. Although various studies defining procedures for calculating sample sizes have been published in the literature, there is still much uncertainty. It is vital to understand that there is no single all-encompassing method for determining sample sizes across study designs; instead, different study designs call for different approaches.
Methods: 
Information was retrieved from the databases in accordance with the updated PRISMA recommendations. Keywords were used to retrieve relevant articles from two databases (Google Scholar and PubMed). Articles were selected through thorough scrutiny and application of inclusion and exclusion criteria.
Results: Seven articles were selected from the 9282 retrieved. The enrolled studies were compared in relation to their methods, objectives, and outcomes.
Conclusions: Taken together, the seven studies indicated that testing any novel approach essentially requires n = 24.24 (i.e., 25) participants in each group. The median sample size for simulation-based educational research was 30. Further research is required to determine the proper sample size, and whether a single universal formula can serve all types of designs.

Keywords

sample size, rapid review, study design, educational research

Introduction

The term “sample size” describes the number of subjects or observations that make up a study; “n” is typically used to represent this number. The size of a sample affects two statistical properties: 1) the accuracy of estimates and 2) the study’s ability to draw inferences.1

Clinical research studies can be categorized as surveys, experiments, observational studies, and other types. Good research planning involves many different factors. The first step is to define the practical issue; the second is to choose the relevant participants and controls, as well as the experimental or observational units.

The inclusion and exclusion criteria must be carefully defined and should account for any potential variables that could affect the measurements and units being observed. The study design must be precise, and the procedures must follow the best techniques currently available. Based on these considerations, the study’s sample size needs to be appropriate for its goals and potential variability. The sample must be “large enough” for an effect of the expected scientific significance to reach statistical significance. At the same time, it is crucial that the sample not be “too big,” in which case a statistically significant effect of minor scientific import could still be found.2 Sample size also has economic significance: an insufficient study may waste resources because it cannot yield valuable results, whereas an excessively large study consumes more resources than required. In studies involving human or animal subjects, sample size is a crucial ethical concern, because a poorly planned experiment exposes participants to potentially hazardous procedures without contributing new information.3,4 Therefore, calculating power and sample size is crucial in the design of clinical research. Numerous studies printed in national and international journals have been found to disclose sample size estimates incorrectly or to use smaller samples than necessary, which reduced their power.1,2

Much confusion remains despite the fact that numerous studies clarifying the methods of sample size computation have been published in the existing literature. It is crucial to realize that there is no single universal formula for calculating sample sizes for all study designs; instead, different study designs require different methods.3,4 This study was conducted with the aim of assessing sample size in educational research.

Methods

To conduct this rapid review, the Preferred Reporting Items for Systematic reviews and Meta-Analyses literature search extension (PRISMA-S) criteria were used to guide the search.5 Using the Boolean AND operator, researchers searched two databases (Google Scholar and PubMed) for publications containing the keywords “Sample Size” AND “Educational research.” Inclusion required free full-text, unlocked articles, pertinent terminology and information, and the English language. Exclusion criteria included abstracts, locked articles and journals, data of no relevance, and languages other than English. The entire review planning process was carried out and authorized by the principal investigator. The full search is presented in Figure 1.


Figure 1. Schematic presentation of the study selection process following the updated PRISMA guidelines.

Results

Seven studies were selected from the 9282 articles retrieved from Google Scholar and PubMed, after application of stringent inclusion and exclusion criteria. All information related to the selected articles is shown in Table 1.

Table 1. Comparison of the studies in relation to their methods, article type, objectives and conclusions.

1. McConnell et al.6 (Editorial)
   Objective: To discuss sample size calculation in the context of medical research interventions.
   Method: To teach nursing and anaesthetic colleagues about programmed intermittent epidural bolus analgesia, the authors created a scenario in which they planned to estimate the required sample size. To this end, they developed a questionnaire and weekly tests to evaluate their colleagues’ understanding of the novel method and the efficacy of the intervention.
   Conclusion: The formula produced n = 24.24, or 25 in each group, for a total sample size of 50 students. Using the effect size when estimating the sample size is extremely important.

2. Staffa et al.7 (Review)
   Objective: To disseminate, to paediatric surgeons, a method for selecting a sample size to identify an effect of therapeutic significance through interpretation and validation of the findings.
   Method: Using various examples, the authors applied a five-step technique to validate sample size and statistical power analyses: define the primary outcome of interest and the expected effect size and power; identify the relevant statistic and statistical test; perform the necessary calculations using software or a reference table; and make a formal power and sample size statement for the publication, grant application, or project proposal.
   Conclusion: Choosing the suitable statistical test for sample size calculation depends on the type of data, the clinical hypothesis, and its applications.

3. Dreyhaupt et al.8 (Review)
   Objective: To describe the implementation and general principles of cluster randomization, and to outline its use in prospective two-arm comparative educational research.
   Method: The study compared individual randomization with cluster randomization in educational research to evaluate systematic bias reduction, and demonstrated the general principles, implementation, and aspects of cluster randomization in a prospective two-arm study.
   Conclusion: Studies involving cluster randomization require a considerably larger sample size and more complex calculation methods.

4. Cook et al.9 (Systematic review)
   Objective: To determine study power across a range of effect sizes by re-analysing meta-analyses of simulation-based education.
   Method: The authors re-analysed 897 studies of simulation-based education to determine study power across a range of effect sizes.
   Conclusion: The median sample size was 25 for the 627 no-intervention comparison studies and 30 for studies comparing different simulation groups.

5. Agnihotram 201810 (Review)
   Objective: To determine the minimal sample size for a variety of objectives, providing a quick overview of the statistical methods employed in the various phases of a research study.
   Method: The author discussed the steps for estimating the sample size:
   1. Clearly state the aim of the study, followed by the objectives.
   2. Choose the appropriate study design to meet the objectives.
   3. Define the target population.
   4. Use a statistical/sampling technique.
   5. Decide on data collection tools.
   6. Perform appropriate statistical analysis.
   7. Communicate results and interpretation using tables and figures.
   Conclusion: The sample size formula depends on the primary research purpose, conclusions, variables, planned statistical analysis, number of groups, and sampling technique.

6. Ferreira et al.11 (Review)
   Objective: To validate a priori hypotheses and sample sizes for evaluating the intensity and duration of physical activity in a paediatric population, using objective methodologies as the standard.
   Method: Electronic databases were searched; physical activity intensity was measured by questionnaire and duration by accelerometer.
   Conclusion: Agreement between subjective and objective approaches for determining the intensity and duration of physical activity was weak to moderate. Sample sizes of 50 to 99 subjects provided stable method-to-method agreement.

7. Guo et al.12 (Review)
   Objective: To determine the sample size for two independent groups with equal and unequal unknown variances when both power and differential cost are taken into account.
   Method: The Welch approximate test was applied to derive various sample size allocation ratios by minimizing the total cost (or, equivalently, maximizing statistical power); two types of hypotheses, superiority and equivalence of two means, were used for sample size planning.
   Conclusion: The proposed sample size formula should be used whenever a cost factor is involved and the population variances are unknown and unequal.

Discussion

Research in health science education is expanding. Emerging educational research relies on relevant conceptual frameworks, reliable research techniques, and important discoveries.13,14 Prior reviews have shown that many educational research articles employ small sample sizes, and that researchers rarely take the expected effect size into account, plan the sample size in advance, or describe the actual precision when evaluating the results.9,15,16

Statistical power is defined as “the likelihood that the null hypothesis will be rejected in the sample if the observed effect in the population is equal to the effect size.”17 In other words, power is the probability that a study will uncover a real, statistically significant effect. Studies with higher power are preferable because lower-power studies may miss potentially important connections. A power of 90% is ideal, and 80% is typically considered the minimum. Power is affected by the sample size (the number of observations), the effect size (the magnitude of the effect), and the risk of type I error (the likelihood of declaring a “significant” difference when there is none, represented by alpha).9,18
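The interplay of sample size, effect size, and alpha can be sketched numerically. The following is a minimal, illustrative Python computation, not taken from any of the reviewed studies, of approximate power for a two-sided, two-sample comparison of means under the normal approximation; the function names and defaults are assumptions for illustration only.

```python
from math import erf, sqrt

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def norm_ppf(p):
    """Standard normal quantile by bisection (sufficient for this sketch)."""
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if norm_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def power_two_sample(n_per_group, d, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison of means
    with standardized effect size d (normal approximation)."""
    z_crit = norm_ppf(1 - alpha / 2)
    return norm_cdf(d * sqrt(n_per_group / 2.0) - z_crit)

print(round(power_two_sample(64, 0.5), 2))  # ≈ 0.81: adequately powered
print(round(power_two_sample(30, 0.5), 2))  # ≈ 0.49: underpowered
```

Under this approximation, 64 participants per group give roughly 80% power for a “medium” effect (d = 0.5), matching the conventional benchmark, while 30 per group fall well short.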

The study adopted a convenience sampling approach to primary research for determining sample size in education by examining simulation-based education. First, most studies in the sample only had the power to find effects with moderate to large standardized mean differences (SMD ≥ 0.8), while some studies only had the power to find effects of immensely large magnitude (≥ 2 standard deviations). Second, most of the negative studies, those that did not find a statistically significant difference, had very broad confidence intervals (CI), signifying the possibility of large and likely important differences. These two findings are connected: in these trials, the lack of a statistically significant result established neither superiority nor equivalence of the interventions under study.9

In one study, the authors aimed to present sample size calculations in the context of medical educational interventions, focusing on computing sample sizes to compare distinct groups where the result is a continuous (interval or ratio) dependent variable, as in interventional designs. The criteria for forecasting the sample size, such as the relevance factor, preferred statistical significance, predicted difference in score, and approximate measurement variation (which may be estimated from previous studies), were discussed in order to determine the number of participants required to assess the effects of an intervention on a specific outcome or the association between variables.6,19 Interventions in education frequently concentrate on changing latent constructs, which are theoretical and cannot be readily observed or quantified. This causes the validated scales to vary, changing how the outcome measures are calculated. The educational researchers advocated the use of effect size in determining the sample size. The study design often governs the relationship between larger effect sizes and smaller sample sizes; effect sizes of 0.20, 0.50, and 0.80 are conventionally categorized as “small,” “medium,” and “large,” respectively.20 Finally, the calculation yielded a sample size of 24.24 for each group.21,22
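For concreteness, the standard normal-approximation formula for comparing two means, n = 2(z₁₋α/₂ + z₁₋β)²/d² per group, can be evaluated at Cohen’s conventional effect sizes. This is an illustrative sketch, not the exact computation used in the cited study (which reported 24.24); with α = 0.05 and 80% power it gives ≈ 24.5, rounded up to 25 per group, for d = 0.80.

```python
from math import ceil

# Standard normal quantiles (assumed conventions for this sketch):
Z_ALPHA = 1.9600  # z_{1 - 0.05/2}, two-sided alpha = 0.05
Z_BETA = 0.8416   # z_{0.80}, power = 80%

def n_per_group(d, z_alpha=Z_ALPHA, z_beta=Z_BETA):
    """Normal-approximation per-group n for comparing two means with
    standardized effect size d: n = 2 * (z_alpha + z_beta)^2 / d^2."""
    return 2 * (z_alpha + z_beta) ** 2 / d ** 2

for label, d in [("small", 0.20), ("medium", 0.50), ("large", 0.80)]:
    n = n_per_group(d)
    print(f"{label} effect (d = {d}): n = {n:.2f} -> {ceil(n)} per group")
```

Smaller expected effects demand sharply larger samples: under these assumptions a “small” effect needs roughly 393 participants per group, versus about 25 for a “large” one.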

Further, the authors also discussed errors to avoid. First, treating sample size estimation merely as small, medium, or large leads to a failure in the accuracy of the evaluation tool and sample characteristics.23 Second, unless necessary, researchers should avoid creating new institute-specific assessment instruments, because these must be validated for accuracy and reliability before use in interventional studies.24 Third, prospective dropout and attrition rates must be considered.25 Finally, the effect size should not be equated with its true significance, and a confidence interval should be employed to convey the precision of the sample and effect sizes.6

The objective of another report was to establish, during the planning stage, the optimal number of subjects for a study, with sufficient patients and statistical power calculations to resolve the most clinically important questions. Randomised controlled trials, which frequently use parallel group designs, require evaluation of the sample size that must be randomised to each arm in order to achieve the standard 80% or 90% power to detect a clinically meaningful effect. The need for a control arm, statistical comparability, structural equality, and resemblance of management conditions and observations are among the themes the author elaborated as essential for educational research investigations. If an educational research study exhibits these traits and the test arm’s success is significantly greater than that of the control arm, the difference cannot be the result of chance. Cluster randomization is usually performed for non-therapeutic interventions such as prevention programs, healthcare programs, and training programs. Each cluster may contain from two to thousands of individuals, and education research may also consider different cluster sizes.8

Minimizing or reducing contamination bias is the fundamental reason for performing cluster-randomized studies. Observations within clusters are typically more similar to one another than observations from distinct clusters, creating a unique data structure known as statistical dependency. As a result, the effective sample size of a cluster-randomized study is less than the actual sample size (i.e., the number of enrolled students), which has an impact on sample size computation. Consequently, it is inappropriate to use typical methods that presume the statistical independence of all observations to calculate the sample size for cluster-randomized investigations.8
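The loss of effective sample size from within-cluster dependency is commonly quantified by the design effect, DE = 1 + (m − 1) × ICC, where m is the cluster size and ICC is the intraclass correlation. A brief sketch; the numbers are hypothetical, not taken from the cited study.

```python
def design_effect(cluster_size, icc):
    """Design effect for equal-sized clusters: DE = 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc

def effective_n(total_n, cluster_size, icc):
    """Effective sample size after accounting for within-cluster dependency."""
    return total_n / design_effect(cluster_size, icc)

# Hypothetical example: 20 classes of 30 students (600 enrolled), ICC = 0.05.
print(round(design_effect(30, 0.05), 2))     # 2.45
print(round(effective_n(600, 30, 0.05), 1))  # ≈ 244.9
```

Even a modest ICC of 0.05 with clusters of 30 more than halves the information in the data: 600 enrolled students carry the statistical weight of only about 245 independent observations.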

The purpose of the study by Staffa et al., conducted for paediatric surgeons, was to disseminate a method for selecting a sample size to identify an effect of therapeutic significance through interpretation and validation of the findings. Using a five-step approach, it is possible to calculate the minimum sample size necessary to ensure sufficient power and accurate interpretation of a study’s findings.7 The sample size available to assess a significant effect on the basis of research or primary data must be justified using a power calculation, and the research sample should have adequate statistical power to identify clinically meaningful effects in scientific investigations.7,26 The sample size of the prior control group determined the statistical power. To compare two groups effectively, comparisons must be made with a historical control group that is comparable to the research group and for which data on assessed confounders are available. The suggested five-step approach can be used with any type of data or study design, although power and sample size primers do not provide examples for every possible research circumstance. The fundamental objective of the primers was to compare two treatment groups; due to multiplicity and multiple testing, there is a higher risk of false-positive results (type I error) when comparing more than two groups.27,28

Guo et al. used two different types of hypotheses, superiority/non-inferiority and equivalence of two means, taking sample size planning factors into account. When population variances are unknown, no exact sample size can be found through the traditional sample size formula, and the resulting sample size must be large enough to meet the required level of significance and power (the probability of a correct decision). The cost constraint depends on the two experimental goals: for a given level of α and power 1 − β, the allocation minimizing total cost yields ratios that are a function of the unit cost ratio and the standard deviations.12
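A classic result consistent with this reasoning (though not the exact formulas derived by Guo et al.) is that the cost-minimizing allocation ratio is n₂/n₁ = (σ₂/σ₁)·√(c₁/c₂), where σᵢ are the group standard deviations and cᵢ the per-subject costs. An illustrative sketch with hypothetical numbers:

```python
from math import sqrt

def optimal_allocation_ratio(sd1, sd2, cost1, cost2):
    """Cost-optimal allocation n2/n1 = (sd2/sd1) * sqrt(cost1/cost2):
    sample more heavily from the more variable, cheaper group."""
    return (sd2 / sd1) * sqrt(cost1 / cost2)

# Hypothetical: group 2 is twice as variable, and group 1 costs
# four times as much per subject, so group 2 gets 4x the subjects.
print(optimal_allocation_ratio(1.0, 2.0, 4.0, 1.0))  # 4.0
```

The rule captures the intuition in the text: unequal variances and unequal unit costs both push the optimal design away from the usual 1:1 allocation.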

Historically, three methods have been employed to determine sample size. The first is an interval strategy, where the confidence level is high (e.g., 95%) and the sampling error between the true parameter and its estimate is kept to a preset modest amount, such as 3 percent; since no hypothesis testing is involved, no threshold of significance is required. The second is a hypothesis-related approach in which both the null and alternative hypotheses must be precisely specified beforehand to detect a significant difference between the parameters under study while simultaneously meeting the required level of significance (type I error rate) and the desired power (the probability of correctly accepting the specified alternative). The third strategy uses an “indifference zone,” where populations that perform better than the others are placed in a zone where they are more likely to be chosen correctly.29
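The interval strategy can be made concrete with the familiar formula for estimating a proportion, n = z²p(1 − p)/e². An illustrative sketch (the 95%/3% figures mirror the example above; p = 0.5 is a conservative assumption, not a value from the source):

```python
from math import ceil

def n_for_proportion(margin, p=0.5, z=1.96):
    """Interval approach: n = z^2 * p * (1 - p) / margin^2.
    p = 0.5 is the most conservative (maximum-variance) choice."""
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)

# 95% confidence, sampling error kept to +/- 3 percentage points:
print(n_for_proportion(0.03))  # 1068
```

Note how the required n scales with the inverse square of the margin: halving the allowable error roughly quadruples the sample.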

“A priori” literally translates from Latin as “what comes before”; a priori hypotheses are a fundamental part of the scientific method since they are created based on assumptions.30 From these assumptions, three hypotheses were inferred. With reference to objective methodologies, the aim of this systematic review was to offer proof for a priori hypotheses and sample sizes for evaluating the intensity and duration of physical activity in a paediatric population. The results suggest that the degree of agreement between subjective and objective measures for determining the intensity and duration of physical exercise should be assumed weak to moderate.11

Currently, there are no data to support an a priori assumption regarding how well the different methods of assessment agree. A robust a priori hypothesis is necessary to select a sample size, attain precision, or have sufficient power to reject a false null hypothesis. Researchers cannot disregard cost and feasibility, which are frequently the true drivers of sample size; nonetheless, typical power calculations yield specific sample sizes only by making precise assumptions. This study’s results indicate that, for assessing nearly all physical activity intensity and duration parameters, a sample size of 50–99 subjects offers stable agreement between subjective and objective approaches. Potential explanations for stable agreement in this sample size interval include the degree of uniformity displayed in each (often non-representative) sample studied, the accuracy of the subjective method created for a target sample, and the inadequacy of the correlation coefficient for detecting agreement issues. Additionally, studies with small samples showed higher variability in their findings, perhaps because these studies were designed less rigorously than those with larger samples.31

The “vibration of effects” diminishes the reliability of agreement measures in samples with fewer than 50 respondents. The study suggests that the decreased reliability of agreement measures in studies with samples of 100 or more persons stems primarily from researchers’ attempts to ignore the exaggerated effects that can arise when a finding is made in a small-sample trial.32 Methodologically rigorous systematic evaluations addressing the agreement between subjective and objective measures for assessing physical activity have frequently found low methodological quality in the primary studies.32–35

The COSMIN checklist, employed in the cited study, identified the absence of an a priori hypothesis and small sample size (n = 50) as the primary factors affecting the methodological standard of the retrieved studies, followed by a lack of data regarding missing subjects and how missing data were handled. The authors noted that questionnaires, diaries, and/or logs that received low ratings in methodological quality evaluations are ineffective tools for gathering subjective data.32

The sample size depended on the degree of heterogeneity when the analysis was performed by multiple investigators and teams. Moreover, studies with limited data showed higher variability in their findings, perhaps because they were designed less rigorously than studies with larger samples.33,34

A statistician is essential for determining the number of subjects and analysing the final results of the entire investigation. To perform a suitable, well-defined study that produces rational and trustworthy inferences applicable to the sample population, the investigator must understand the fundamentals of analytical methods. Clinicians can use statistics to extract crucial information from empirical data, which improves patient care. Statistical notions must be considered from the initial planning stage to the final reporting phase. In general, there are two sorts of sample size estimation problems: the sample size for (a) an estimation study and (b) a study that tests a hypothesis, i.e., a comparison study.10

In an estimation study, the researcher is interested in estimating one or more parameters, such as the mean haemoglobin level or the prevalence of arthritis. In studies that test hypotheses, researchers are interested in comparing population characteristics at one or more time points, or the characteristics of two or more populations; for instance, they might compare the prevalence of arthritis between two populations before and after an intervention. A researcher who wants more precise estimates should select a larger sample, because as the required precision grows (i.e., the margin of error shrinks), the minimum sample size increases; likewise, a higher confidence level requires a larger sample for estimating a parameter. In studies testing hypotheses, the sample size computation aims to achieve the desired power for detecting a difference that is therapeutically or experimentally significant at a predetermined significance level.35

Statistically, there are various methods, tests, and formulas for estimating the sample size required to perform research and other relevant studies. However, the appropriate number needed for any given research has not yet been definitively established; for example, a common rule of thumb for pilot studies is 12 participants per group.36

Conclusion

This review suggests that sample size should be considered as early as possible in the research phase to gather more insightful background that will ultimately have a stronger influence on pedagogic application. All types of research investigations require determination of the sample size, and selecting the appropriate formula is essential. A suitable sample size formula is chosen according to the study’s main goal, outcome variable, study plan, intended statistical analysis, study groups, and sampling procedure. The sample size needed for a study is determined by a variety of factors, including the feasibility of the study, its power, the accuracy of the calculated value, its analytical relevance and confidence level, its ability to detect a clinically significant difference, and practical considerations such as financial support, workforce, subject availability, and time. Studies involving cluster randomization require a larger sample size and a more complex method of calculation. The sample size for testing any new method basically required 24.24 (i.e., 25) members in each group, and the median sample size for simulation-based educational research was 30. Further research is needed on the appropriate sample size and on whether a single universal formula can cover every study design.

How to cite this article
Besekar S, Jogdand S and Naqvi W. Sample size in educational research: A rapid synthesis [version 1; peer review: 1 approved with reservations, 1 not approved]. F1000Research 2023, 12:1291 (https://doi.org/10.12688/f1000research.141173.1)
Open Peer Review

Reviewer Report 16 Feb 2024
Jorge M. Mendes, NOVA Information Management School, Universidade Nova de Lisboa, Lisbon, Portugal; The Knowledge Hub Universities, Cairo, Egypt
Approved with Reservations
The article titled "Sample size in educational research: A rapid synthesis" underwent a systematic review to evaluate the adequacy of sample sizes employed in educational research studies. The investigation focused on the crucial role of sample size, denoted as "n," …
How to cite this report
M. Mendes J. Reviewer Report For: Sample size in educational research: A rapid synthesis [version 1; peer review: 1 approved with reservations, 1 not approved]. F1000Research 2023, 12:1291 (https://doi.org/10.5256/f1000research.154590.r238774)
Author Response 13 Apr 2024
Smruti Besekar, Pharmacology, Datta Meghe Institute of Higher Education & Research, Sawangi, India

Dear reviewer,

I really appreciate your valuable time and efforts. I valued your suggestions and tried to make corrections accordingly. I have revised the title, methodology section and even …
Reviewer Report 05 Feb 2024
Francesco Innocenti, Maastricht University, Maastricht, The Netherlands 
Not Approved
The general impression is that this paper is a collection of sentences about sample size calculations taken from different sources, assembled without a clear structure.
The authors stated at the end of the introduction that "the study was conducted …
How to cite this report
Innocenti F. Reviewer Report For: Sample size in educational research: A rapid synthesis [version 1; peer review: 1 approved with reservations, 1 not approved]. F1000Research 2023, 12:1291 (https://doi.org/10.5256/f1000research.154590.r232933)