<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">F1000Research</journal-id>
            <journal-title-group>
                <journal-title>F1000Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2046-1402</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/f1000research.182391.1</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Research Article</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>Trends in the Psychometric Characteristics of NECO Mathematics Senior School Certificate Examination Over a Period of Five Years (2020-2024) among&#x00a0;Osun State&#x00a0;Candidates,&#x00a0;Nigeria</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 1; peer review: 1 approved with reservations, 1 not approved]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Alaba Adeyemi</surname>
                        <given-names>Adediwura</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-4659-3185</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Beatrice Oluwakemi</surname>
                        <given-names>Babayemi</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0009-0004-7087-8991</uri>
                    <xref ref-type="aff" rid="a2">2</xref>
                </contrib>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Odunayo Ibukun</surname>
                        <given-names>Odumbo</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0009-0006-6554-2224</uri>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a3">3</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>Department of Educational Foundation, Faculty of Education, Obafemi Awolowo University, Ile-Ife, Osun State, 10222, Nigeria</aff>
                <aff id="a2">
                    <label>2</label>Department of Art and Social Science, Faculty of Art, Obafemi Awolowo University, Ile-Ife, Osun State, 10222, Nigeria</aff>
                <aff id="a3">
                    <label>3</label>Department of Humanities, Faculty of Education, Kampala International University - Western Campus, Bushenyi, Western Region, 00000, Uganda</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:odumboodunayo@kiu.ac.ug">odumboodunayo@kiu.ac.ug</email>
                </corresp>
                <fn fn-type="conflict">
                    <p>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>27</day>
                <month>5</month>
                <year>2026</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2026</year>
            </pub-date>
            <volume>15</volume>
            <elocation-id>818</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>13</day>
                    <month>5</month>
                    <year>2026</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2026 Alaba Adeyemi A et al.</copyright-statement>
                <copyright-year>2026</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://f1000research.com/articles/15-818/pdf"/>
            <abstract>
                <p>This study examined the psychometric characteristics of the National Examinations Council (NECO) Senior School Certificate Examination (SSCE) Mathematics test in Osun State, Nigeria, from 2020 to 2024. A random sample of 10% of the 211,753 candidates was selected. Candidates&#x2019; item responses were used to examine three properties: item difficulty, item discrimination, and test reliability. The data were analysed using descriptive statistics, one-way ANOVA, and Scheffe post hoc tests. Item difficulty remained largely stable over the years, except in the most recent examination year, which exhibited a marked change. Item discrimination indices changed considerably over the five-year period, reflecting variation in item quality, although overall discrimination remained within acceptable limits. The KR-20 reliability coefficients were high throughout, indicating that the test maintained strong internal consistency across the period. The study concluded that the NECO SSCE Mathematics examination is highly reliable but requires ongoing psychometric evaluation to maintain standards of reliability, fairness, and validity over time.</p>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>SSCE</kwd>
                <kwd>NECO</kwd>
                <kwd>Mathematics</kwd>
                <kwd>Secondary Schools</kwd>
                <kwd>Examination</kwd>
            </kwd-group>
            <funding-group>
                <funding-statement>The author(s) declared that no grants were involved in supporting this work.</funding-statement>
            </funding-group>
        </article-meta>
    </front>
    <body>
        <def-list>
            <title>List of Abbreviations</title>
            <def-item>
                <term id="G7">ANOVA</term>
                <def>
                    <p>Analysis of Variance</p>
                </def>
            </def-item>
            <def-item>
                <term id="G17">BECE</term>
                <def>
                    <p>Basic Education Certificate Examination</p>
                </def>
            </def-item>
            <def-item>
                <term id="G4">CTT</term>
                <def>
                    <p>Classical Test Theory</p>
                </def>
            </def-item>
            <def-item>
                <term id="G15">Df</term>
                <def>
                    <p>Degrees of Freedom</p>
                </def>
            </def-item>
            <def-item>
                <term id="G8">DIF</term>
                <def>
                    <p>Differential Item Functioning</p>
                </def>
            </def-item>
            <def-item>
                <term id="G13">F</term>
                <def>
                    <p>F-statistic (in ANOVA)</p>
                </def>
            </def-item>
            <def-item>
                <term id="G5">IRT</term>
                <def>
                    <p>Item Response Theory</p>
                </def>
            </def-item>
            <def-item>
                <term id="G6">KR-20</term>
                <def>
                    <p>Kuder&#x2013;Richardson Formula 20</p>
                </def>
            </def-item>
            <def-item>
                <term id="G1">NECO</term>
                <def>
                    <p>National Examinations Council</p>
                </def>
            </def-item>
            <def-item>
                <term id="G16">OECD</term>
                <def>
                    <p>Organisation for Economic Co-operation and Development</p>
                </def>
            </def-item>
            <def-item>
                <term id="G9">OMR</term>
                <def>
                    <p>Optical Mark Recognition</p>
                </def>
            </def-item>
            <def-item>
                <term id="G11">

                    <inline-formula>

                        <mml:math display="inline">
                            <mml:mover accent="true">
                                <mml:mi mathvariant="normal">p</mml:mi>
                                <mml:mo stretchy="true">&#x00af;</mml:mo>
                            </mml:mover>
                        </mml:math>
</inline-formula>
</term>
                <def>
                    <p>Mean Item Difficulty Index</p>
                </def>
            </def-item>
            <def-item>
                <term id="G12">rpbis</term>
                <def>
                    <p>Point-Biserial Correlation Coefficient</p>
                </def>
            </def-item>
            <def-item>
                <term id="G10">SD</term>
                <def>
                    <p>Standard Deviation</p>
                </def>
            </def-item>
            <def-item>
                <term id="G2">SSCE</term>
                <def>
                    <p>Senior School Certificate Examination</p>
                </def>
            </def-item>
            <def-item>
                <term id="G14">Sig.</term>
                <def>
                    <p>Significance (p-value)</p>
                </def>
            </def-item>
            <def-item>
                <term id="G3">WAEC</term>
                <def>
                    <p>West African Examinations Council</p>
                </def>
            </def-item>
        </def-list>
        <sec id="sec1" sec-type="intro">
            <title>Introduction</title>
            <p>Large-scale national public examinations play a crucial strategic role in a country&#x2019;s education system, particularly when certification, school transfers, or access to additional educational resources depend on examination outcomes. In Nigeria, the National Examinations Council (NECO) Senior School Certificate Examination (SSCE) Mathematics is considered a high-stakes assessment, used to award secondary school completion certificates, secure admission to tertiary institutions, and signal competence to the labour market. Consequently, the interpretation of NECO Mathematics scores, and the credibility of decisions that depend on these scores across examination sessions, are largely contingent on the examination&#x2019;s psychometric quality. From a psychometric perspective, the characteristics of large-scale assessments most commonly evaluated are item difficulty, item discrimination, test reliability, and item bias. Item difficulty refers to the proportion of candidates who answer an item correctly, whereas item discrimination refers to the extent to which an item distinguishes between candidates of high and low ability. Reliability concerns the consistency of scores, whereas bias analysis examines whether test items function equivalently across subgroups (e.g., gender, school type, or location). Together, these indices provide empirical evidence for the validity, fairness, and technical soundness of an examination.
                <sup>
                    <xref ref-type="bibr" rid="ref1">1</xref>,
                    <xref ref-type="bibr" rid="ref2">2</xref>
                </sup>
            </p>
            <p>There has been growing concern among educational stakeholders in Nigeria over the past decade, driven by inconsistent student performance in public examinations, particularly in Mathematics. These inconsistencies could indicate disparities in instructional quality, curriculum coverage, or learner readiness, but they could also be due to inconsistencies in item quality and in the test-construction process. Empirical studies conducted in the past decade have shown that public examination items in Nigeria sometimes exhibit disparities in difficulty distribution, weak discrimination parameters, and occasional DIF, making scores from one year to the next incomparable.
                <sup>
                    <xref ref-type="bibr" rid="ref3">3</xref>,
                    <xref ref-type="bibr" rid="ref4">4</xref>
                </sup>
            </p>
            <p>In contemporary measurement research, trend analysis plays a vital role alongside single-year statistical analyses. The psychometric characteristics of an examination may change over time, so monitoring them is necessary for sound assessment. Over the last decade, such work has involved determining whether the essential characteristics of an examination remain stable or exhibit systematic drift in difficulty, reliability, or bias. This literature also includes substantive discussions of high-stakes examinations such as the NECO SSCE, which underscore the importance of such monitoring in nurturing public trust and shaping government actions, particularly in international educational achievement comparisons.
                <sup>
                    <xref ref-type="bibr" rid="ref5">5</xref>,
                    <xref ref-type="bibr" rid="ref6">6</xref>
                </sup>
            </p>
            <p>In Osun State, where Mathematics performance has remained a key policy concern, a systematic examination of psychometric trends provides valuable evidence for educational planning, test-development reforms, and accountability. Understanding how item characteristics have evolved from 2020 to 2024 can inform NECO&#x2019;s item-writing practices, guide teacher preparation strategies, and support policymakers in interpreting examination outcomes more cautiously. Consequently, this study investigates trends in the psychometric characteristics of NECO SSCE Mathematics over five years, with a focus on candidates in Osun State, Nigeria.</p>
            <p>Psychometric qualities are the attributes of test items and test forms that provide empirically supported evidence of the quality, credibility, and fairness of measurement in educational assessment. Evaluating these properties is fundamental to large-scale public examinations and to ensuring that test scores adequately and reliably reflect examinees&#x2019; true abilities in the domain of interest. This evaluation is a key component in the introduction, quality control, and scoring of high-stakes examinations such as national benchmark examinations. One of the most widely studied psychometric indices is item difficulty, which measures the proportion of examinees who answer a given item correctly. A difficulty index helps to show whether test items are well aligned with the examination population and curriculum expectations. An item that is too easy or too difficult contributes little to measurement and may distort score distributions and undermine test validity. A well-constructed public examination generally contains items at three levels of difficulty (easy, moderate, and difficult) to ensure adequate measurement precision across the ability continuum.
                <sup>
                    <xref ref-type="bibr" rid="ref7">7</xref>,
                    <xref ref-type="bibr" rid="ref8">8</xref>
                </sup> Item discrimination, which is closely related to item difficulty, is the extent to which an item differentiates between examinees of high and low ability. Discrimination indices indicate the measurement quality of an item and bear on the reliability and overall validity of the test. An item with a high discrimination coefficient contributes strongly to the rank ordering of candidates around a given ability level. Items with low discrimination, however, may introduce noise and make it difficult to infer the ability level associated with the item.
                <sup>
                    <xref ref-type="bibr" rid="ref9">9</xref>,
                    <xref ref-type="bibr" rid="ref10">10</xref>
                </sup> Reliability is another crucial psychometric property; it denotes the consistency and stability of test scores across items, forms, or administrations. In public examinations, internal consistency indices such as KR-20 or Cronbach&#x2019;s alpha are widely used to estimate the degree to which the test items cohere in measuring the target construct. According to the literature,
                <sup>
                    <xref ref-type="bibr" rid="ref11">11</xref>,
                    <xref ref-type="bibr" rid="ref12">12</xref>
                </sup> adequate reliability is a prerequisite for valid interpretation of scores, particularly for consequential decisions, such as certification and admission, that attach social consequences to an individual&#x2019;s performance.</p>
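            <p>As a minimal sketch of how two of the indices described above can be computed under Classical Test Theory, the following Python example calculates KR-20 and an uncorrected point-biserial discrimination index. The response matrix, the function names <monospace>kr20</monospace> and <monospace>point_biserial</monospace>, and the use of the population variance of total scores are assumptions made for illustration; they are not taken from the study&#x2019;s data or analysis scripts.</p>
            <preformat>
```python
from statistics import mean, pstdev, pvariance

# Hypothetical scored responses (1 = correct, 0 = incorrect).
# Rows are candidates and columns are items; the values are illustrative only.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

def kr20(scored):
    """Kuder-Richardson Formula 20 for dichotomously scored items.

    Assumption: the variance of total scores is the population variance.
    """
    n_candidates = len(scored)
    k = len(scored[0])
    totals = [sum(row) for row in scored]
    # Proportion correct for each item (the item difficulty index).
    p = [sum(row[j] for row in scored) / n_candidates for j in range(k)]
    sum_pq = sum(pj * (1 - pj) for pj in p)
    return (k / (k - 1)) * (1 - sum_pq / pvariance(totals))

def point_biserial(scored, j):
    """Point-biserial correlation between item j and the total score.

    The total score includes item j itself (no item-rest correction).
    Items that every candidate answers identically are not handled here.
    """
    totals = [sum(row) for row in scored]
    right = [t for row, t in zip(scored, totals) if row[j] == 1]
    wrong = [t for row, t in zip(scored, totals) if row[j] == 0]
    p = len(right) / len(scored)
    return (mean(right) - mean(wrong)) / pstdev(totals) * (p * (1 - p)) ** 0.5

print(round(kr20(responses), 2))
print(round(point_biserial(responses, 0), 2))
```
</preformat>
            <p>For this toy matrix the sketch gives a KR-20 of 0.80 and a point-biserial of about 0.71 for the first item. Other conventions, such as the sample variance or an item-rest correction, would give slightly different values.</p>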
            <p>Over the past 10&#x00a0;years, studies have indicated that the psychometric properties of public examinations in Nigeria vary from year to year. Previous studies of the NECO and WAEC Mathematics examinations noted that, although overall difficulty levels are sometimes similar across the two bodies, the discrimination indices vary significantly across test versions and years, indicating potential issues with item quality and calibration. Other authors
                <sup>
                    <xref ref-type="bibr" rid="ref13">13</xref>,
                    <xref ref-type="bibr" rid="ref14">14</xref>
                </sup> also reported similar findings in content-specific analyses of the NECO examinations, with some items exhibiting weak discrimination and low psychometric capacity despite not being particularly difficult.</p>
            <p>These findings highlight the importance of conducting regular psychometric evaluations and monitoring item parameters in public examinations. Tracking methodologies make it possible to detect fluctuations in psychometric quality; without such monitoring, the comparability of results across examination cohorts breaks down, and the large-scale examination system that provides so much information loses its credibility. Awareness and assessment of psychometric characteristics within and across examination years therefore remain a key concern for test producers, policy developers, and educational measurement specialists.</p>
            <p>The National Examinations Council (NECO) was established as an alternative national examining body in Nigeria, tasked with conducting credible, valid, and reliable public examinations. Questions about the quality and comparability of NECO examinations, particularly in high-stakes subjects such as Mathematics and English Language, have attracted sustained scholarly attention since the council&#x2019;s inception. As a result, a considerable body of empirical research has sought to ascertain and critique the psychometric characteristics of NECO test items under the frameworks of Classical Test Theory (CTT) and Item Response Theory (IRT). Several studies conducted in the last decade have examined NECO examination items across subject areas, with a major focus on item difficulty, discrimination, dimensionality, and model&#x2013;data fit. Using IRT-based approaches, researchers reported that some NECO multiple-choice test forms did not fully meet the unidimensionality assumption and exhibited local item dependence and misfitting items in certain administrations. In fact,
                <sup>
                    <xref ref-type="bibr" rid="ref3">3</xref>
                </sup> in their psychometric study of NECO English Language items, found discrepancies in item parameter estimates and instances of poor item fit, pointing to weaknesses in item calibration and pretesting procedures and underscoring the need for continuous psychometric scrutiny of NECO examinations to ensure measurement precision and construct validity.</p>
            <p>Empirical assessments in Mathematics have yielded mixed psychometric findings across years. A comparison of Mathematics items from NECO and the West African Examinations Council (WAEC) indicates that the two tests exhibit similar difficulty across administrations. NECO Mathematics items, however, display greater within-test variability in discrimination than WAEC Mathematics items. This variability raises concerns about the uniformity of measurement standards and the stability of score interpretation over time.
                <sup>
                    <xref ref-type="bibr" rid="ref1">1</xref>
                </sup>
            </p>
            <p>Beyond item quality, research is increasingly focusing on fairness and bias in NECO examinations.
                <sup>
                    <xref ref-type="bibr" rid="ref15">15</xref>
                </sup> demonstrated, through Differential Item Functioning (DIF) analyses, that some Mathematics items in the NECO examinations functioned differently across subgroups defined by gender, school type (public versus private), and location (urban versus rural), even after controlling for candidates&#x2019; overall ability. This differential functioning poses a threat to score equating and may systematically favour or disadvantage particular individuals or groups, thereby undermining the validity of decisions based on examination results.
                <sup>
                    <xref ref-type="bibr" rid="ref16">16</xref>,
                    <xref ref-type="bibr" rid="ref17">17</xref>
                </sup>
            </p>
            <p>Research on differential item functioning (DIF) frequently links deviations from unidimensionality to the presence of item bias. For instance, when a Mathematics examination assesses mathematical reasoning and language skills simultaneously, or when test-taking techniques come into play, item parameters become less predictable and differences among subgroups become more apparent. Thus,
                <sup>
                    <xref ref-type="bibr" rid="ref18">18</xref>,
                    <xref ref-type="bibr" rid="ref6">6</xref>
                </sup> opined that the very idea of fairness becomes inchoate when test validity, the measurement model, and the many empirical conditions that affect item parameters are not carefully considered.</p>
            <p>Evidence from the literature indicates that the NECO examinations are a noteworthy asset to national assessment and certification; however, psychometric challenges persist. The evidence demands routine item analysis, longitudinal monitoring of item parameters, and thoroughgoing bias validation to be integrated as cardinal components of NECO quality assurance activities. It is imperative that these issues be addressed up front, particularly in mathematics examinations, where problems of psychometric quality can distort the interpretation of students&#x2019; competence and may yield ill-informed decisions regarding educational policies.</p>
        </sec>
        <sec id="sec2">
            <title>Research objectives</title>
            <p>The main goal of this study is to examine how the psychometric properties of Mathematics in the NECO SSCE have evolved from 2020 to 2024 for candidates in Osun State, utilizing Classical Test Theory as the analytical framework. The specific objectives of the study are to:
                <list list-type="roman-lower">
                    <list-item>
                        <label>i.</label>
                        <p>Evaluate trends in item difficulty indices of NECO SSCE Mathematics multiple-choice items from 2020 to 2024 among candidates in Osun State.</p>
                    </list-item>
                    <list-item>
                        <label>ii.</label>
                        <p>Examine trends in the item discrimination indices of NECO SSCE Mathematics items across the five examination years.</p>
                    </list-item>
                    <list-item>
                        <label>iii.</label>
                        <p>Evaluate the reliability of the NECO SSCE Mathematics tests administered across the five years.</p>
                    </list-item>
                    <list-item>
                        <label>iv.</label>
                        <p>Compare yearly variations in psychometric characteristics (item difficulty, item discrimination, distractor efficiency, and reliability) of NECO SSCE Mathematics examinations from 2020 to 2024.</p>
                    </list-item>
                </list>
            </p>
            <sec id="sec3">
                <title>Research questions</title>
                <p>

                    <list list-type="roman-lower">
                        <list-item>
                            <label>i.</label>
                            <p>What is the trend observed in the difficulty indices of NECO SSCE Mathematics multiple-choice items from 2020 to 2024 among Osun State candidates?</p>
                        </list-item>
                        <list-item>
                            <label>ii.</label>
                            <p>How do the NECO SSCE Mathematics items&#x2019; discrimination indices vary across the five examination years (2020&#x2013;2024)?</p>
                        </list-item>
                        <list-item>
                            <label>iii.</label>
                            <p>What is the extent to which the reliability coefficients of the NECO SSCE Mathematics examinations remain consistent across the years 2020 to 2024?</p>
                        </list-item>
                    </list>
                </p>
            </sec>
            <sec id="sec4">
                <title>Hypotheses</title>
                <p>

                    <list list-type="roman-lower">
                        <list-item>
                            <label>i.</label>
                            <p>The item difficulty of NECO SSCE Mathematics examinations between 2020 and 2024 varied significantly.</p>
                        </list-item>
                        <list-item>
                            <label>ii.</label>
                            <p>The item discrimination of NECO SSCE Mathematics examinations between 2020 and 2024 did not vary significantly.</p>
                        </list-item>
                    </list>
                </p>
            </sec>
            <sec id="sec5">
                <title>Methodology</title>
                <p>The study employed a descriptive quantitative design, using the NECO SSCE Mathematics examinations from 2020 to 2024 as the test data and the responses of Mathematics candidates in Osun State schools. The study population comprised all candidates from Osun State who enrolled in and participated in the NECO SSCE Mathematics examination between 2020 and 2024. Data from the examination board indicate that 66,256 candidates registered in 2020, 34,434 in 2021, 34,682 in 2022, 35,118 in 2023, and 41,263 in 2024, for a total of 211,753 candidates over the five years. These candidates came from public and private institutions and represented a range of ability levels and learning settings in Osun State. Given the large population and the longitudinal nature of the research, a representative sample was selected for in-depth psychometric evaluation. Proportional random sampling was used to draw 10% of the total population, yielding 21,175 candidates. The sampling method ensured that each examination year was represented in the sample in proportion to its share of the overall population, which preserved the population&#x2019;s characteristics and enabled more robust trend comparisons. Under proportional allocation, the target sample sizes were 6,626 for 2020, 3,443 for 2021, 3,468 for 2022, 3,512 for 2023, and 4,126 for 2024. Candidates were selected at random from the examination records for each year, so that every candidate had an equal and independent probability of inclusion. The primary instrument for data collection was an electronic spreadsheet of OMR data containing candidates&#x2019; item-level responses for the years 2020 to 2024. The data were analysed systematically to achieve the research objectives and to ensure a thorough evaluation of the psychometric properties of the NECO SSCE Mathematics examination. The Classical Test Theory (CTT) model was employed to provide evidence regarding the quality of the test items and of the overall assessment.</p>
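                <p>The proportional allocation described above can be checked with a short calculation. The Python sketch below applies the stated 10% fraction to the yearly registration figures reported in the text and then draws a simple random sample for each year. Rounding to the nearest whole candidate, the fixed random seed, and the use of integer positions as stand-ins for candidate records are assumptions of this illustration, not details reported by the study.</p>
                <preformat>
```python
import random

# Registered candidates per year, as reported in the Methodology text.
registered = {2020: 66256, 2021: 34434, 2022: 34682, 2023: 35118, 2024: 41263}
SAMPLING_FRACTION = 0.10  # the 10% fraction stated in the text

def proportional_allocation(population, fraction):
    """Allocate the sample to each year in proportion to its enrolment."""
    # Rounding to the nearest whole candidate is an assumption of this sketch.
    return {year: round(count * fraction) for year, count in population.items()}

allocation = proportional_allocation(registered, SAMPLING_FRACTION)
total_population = sum(registered.values())
total_sample = sum(allocation.values())

# Simple random sampling without replacement within each year.
# Integer positions stand in for candidate records (hypothetical identifiers).
rng = random.Random(2024)  # fixed seed so the illustration is repeatable
sample_ids = {
    year: rng.sample(range(registered[year]), size)
    for year, size in allocation.items()
}

print(allocation)
print(total_population, total_sample)
```
</preformat>
                <p>Under these assumptions the yearly allocations sum to 21,175, which agrees with the sample size reported for the population of 211,753 candidates.</p>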
            </sec>
        </sec>
        <sec id="sec6" sec-type="results">
            <title>Results</title>
            <p>

                <bold>Research Question 1:</bold> What is the trend observed in the difficulty indices of NECO SSCE Mathematics multiple-choice items from 2020 to 2024 among Osun State candidates?</p>
            <p>The proportion of candidates who answered each item correctly was used to calculate the item&#x2019;s difficulty index (p-value) for each examination year. The mean value obtained across all items for each year was computed. The results are presented in 
                <xref ref-type="table" rid="T1">
Table 1</xref>.</p>
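            <p>The calculation of the difficulty index can be sketched in Python as follows. This is a minimal illustration on toy data, not the authors&#x2019; analysis code: the function name and the four hypothetical candidates are assumptions, and responses are assumed to be already scored 1 (correct) or 0 (incorrect).</p>
            <code language="python">
```python
from statistics import mean


def item_difficulty(scored):
    """Return one p-value per item: the proportion of candidates answering correctly.

    scored is a list of candidate rows, each a list of 0/1 item scores.
    """
    n_candidates = len(scored)
    n_items = len(scored[0])
    return [sum(row[j] for row in scored) / n_candidates for j in range(n_items)]


# Toy data: 4 hypothetical candidates (rows) by 3 items (columns).
scored = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]

p_values = item_difficulty(scored)  # per-item difficulty indices
mean_p = mean(p_values)             # mean difficulty for the whole test

print(p_values)
print(round(mean_p, 2))
```
</code>
            <p>Applied to one examination year, the same two steps give the 60 item p-values and their yearly mean.</p>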
            <table-wrap id="T1" orientation="portrait" position="float">
                <label>
Table 1. </label>
                <caption>
                    <title>NECO mathematics average difficulty indices trend (2020&#x2013;2024).</title>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="1" rowspan="1" valign="top">Year</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">Mean item difficulty (p&#x0304;)</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">
Interpretation</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2020</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.75</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Very Easy</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2021</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.70</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Moderately Easy</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2022</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.74</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Moderately Easy</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2023</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.80</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Very Easy</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2024</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.65</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Moderately Easy</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
            <p>
                <xref ref-type="table" rid="T1">
Table 1</xref> presents the average item difficulty indices for the NECO Senior School Certificate Examination (SSCE) Mathematics test from 2020 to 2024. The mean item difficulty index (p&#x0304;) is the average proportion of candidates who answered the items correctly; higher values indicate easier items, whereas lower values indicate harder items. The 2020 examination had a high mean difficulty index (p&#x0304; = 0.75), implying that the items were, on the whole, accessible to Osun State candidates and that most candidates answered most items correctly. The mean difficulty index fell to 0.70 in 2021, indicating a moderately easy test that remained within an acceptable range for large-scale testing. In 2022, the index rose to 0.74, indicating that the items were somewhat easier than those of 2021. The 2023 examination recorded a further increase to 0.80, the highest value in the period, indicating that the items were again largely easy for candidates. The 2024 examination showed a substantial decrease to 0.65, the lowest value in the period, indicating that this examination was more demanding than those of previous years, although it remained moderately easy. Over the five years, the mean difficulty therefore did not follow a steady trend but alternated between easier and harder examinations. Although every examination fell in the easy range, the year-to-year variation in mean difficulty indicates that the test forms were not equivalent in difficulty. Systematic item pretesting and difficulty balancing would help to establish consistent standards across NECO Mathematics examinations.</p>
            <p>

                <bold>Research Question 2:</bold> How do the NECO SSCE Mathematics items&#x2019; discrimination indices vary across the five examination years (2020&#x2013;2024)?</p>
            <p>
                <xref ref-type="table" rid="T2">
Table 2</xref> displays the average item discrimination indices for the NECO SSCE Mathematics examination from 2020 to 2024. The mean discrimination index, calculated as the point-biserial correlation coefficient (rpbis), reflects the extent to which items distinguish between higher-performing and lower-performing candidates. Higher values indicate items that rank candidates more accurately. The 2020 and 2021 examinations produced identical mean discrimination indices of 0.27, indicating moderate discriminating power: the items differentiated between candidates of higher and lower ability but fell short of the 0.30 benchmark for highly effective items. The mean index rose to 0.29 in 2022, a small improvement that approached that benchmark. It decreased to 0.25 in 2023, the lowest value in the period, indicating that, on average, the 2023 items distinguished less well between high- and low-performing candidates. In 2024, the mean index increased markedly to 0.36, indicating good to very good discrimination; the 2024 items were more effective than those of previous years at differentiating candidates by ability. Over the five years, the discrimination indices fluctuated rather than improving steadily, with moderate values from 2020 to 2023 followed by a clear improvement in 2024. The 2024 rise suggests either that item construction standards improved or that the items were better matched to candidates&#x2019; actual ability levels. These annual fluctuations indicate that NECO Mathematics examinations need regular item analysis and quality assurance procedures to maintain consistent item quality across years.</p>
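            <p>For readers unfamiliar with the point-biserial coefficient, the following Python sketch shows one common way to compute it for a single item. It is an illustration under stated assumptions, not the authors&#x2019; procedure: the function name and toy data are hypothetical, the total score is assumed to include the item itself (the text does not say whether a corrected item-rest correlation was used), and the population standard deviation of the total scores is used.</p>
            <code language="python">
```python
from math import sqrt
from statistics import mean, pstdev


def point_biserial(item, total):
    """Point-biserial correlation between a 0/1 item and candidates' total scores."""
    correct = [t for x, t in zip(item, total) if x == 1]
    wrong = [t for x, t in zip(item, total) if x == 0]
    sd_total = pstdev(total)  # population SD of the total scores
    if not correct or not wrong or sd_total == 0:
        return float("nan")   # undefined when the item or the totals do not vary
    p = len(correct) / len(item)  # item difficulty
    q = 1 - p
    return (mean(correct) - mean(wrong)) / sd_total * sqrt(p * q)


# Toy data: 4 hypothetical candidates; the two highest scorers got the item right.
item_scores = [1, 1, 0, 0]
total_scores = [4, 3, 2, 1]

r_pbis = point_biserial(item_scores, total_scores)
print(round(r_pbis, 3))
```
</code>
            <p>Averaging this coefficient over the 60 items of one examination gives a yearly mean of the kind reported in Table 2.</p>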
            <table-wrap id="T2" orientation="portrait" position="float">
                <label>
Table 2. </label>
                <caption>
                    <title>NECO mathematics average discrimination indices trend (2020&#x2013;2024).</title>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="1" rowspan="1" valign="top">Year</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">Mean discrimination (rpbis)</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">
Interpretation</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2020</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.27</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Moderate Index</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2021</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.27</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Moderate Index</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2022</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.29</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Good</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2023</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.25</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Weak Index</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2024</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.36</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">Very Good Index</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
            <p>

                <bold>Research Question 3:</bold> What is the extent to which the reliability coefficients of the NECO SSCE Mathematics examinations remain consistent across the years 2020 to 2024?</p>
            <p>
                <xref ref-type="table" rid="T3">
Table 3</xref> presents the KR-20 reliability coefficients for the NECO Senior School Certificate Examination (SSCE) Mathematics tests over five successive years of administration, from 2020 to 2024. The results indicate that the reliability of the NECO Mathematics examinations was maintained throughout the period. The KR-20 coefficients were similarly high in all five years: 0.90 in 2020, 0.88 in 2021, 0.87 in 2022, 0.89 in 2023, and 0.90 in 2024. Every value exceeded the minimum acceptable reliability of 0.70, and all were at or above 0.87, suggesting high internal consistency. The year-to-year fluctuations were small (a range of 0.03), indicating that the items functioned consistently and homogeneously across years. The return of KR-20 to 0.90 in 2024, matching the 2020 value, supports the inference that NECO&#x2019;s test construction and quality assurance processes were applied consistently.
                <statement id="state1">
                    <label>Hypothesis 1:</label>
                    <p>The item difficulty of NECO SSCE Mathematics examinations differs significantly across the years 2020 to 2024.</p>
                </statement>
            </p>
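            <p>The KR-20 coefficient is conventionally defined as k/(k&#x00a0;&#x2212;&#x00a0;1) multiplied by one minus the ratio of the summed item variances (p&#x00a0;&#x00d7;&#x00a0;q) to the variance of the total scores, where k is the number of items. A minimal Python sketch on toy data follows. It is illustrative only: the function name and the four hypothetical candidates are assumptions, and the population variance of the total scores is used, which may differ from the convention of the software used in the study.</p>
            <code language="python">
```python
from statistics import pvariance


def kr20(scored):
    """Kuder-Richardson formula 20 for a matrix of 0/1 item scores.

    scored is a list of candidate rows, each a list of 0/1 item scores.
    """
    n_candidates = len(scored)
    k = len(scored[0])  # number of items
    totals = [sum(row) for row in scored]
    var_total = pvariance(totals)  # population variance of total scores
    pq_sum = 0.0
    for j in range(k):
        p = sum(row[j] for row in scored) / n_candidates
        pq_sum += p * (1 - p)  # variance of a dichotomous item
    return k / (k - 1) * (1 - pq_sum / var_total)


# Toy data: 4 hypothetical candidates (rows) by 3 items (columns).
scored = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]

reliability = kr20(scored)
print(round(reliability, 2))
```
</code>
            <p>With only three items the toy coefficient is modest; longer tests of homogeneous items, such as a 60-item examination, tend to yield higher values.</p>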
            <table-wrap id="T3" orientation="portrait" position="float">
                <label>
Table 3. </label>
                <caption>
                    <title>NECO mathematics reliability coefficients (KR-20) trend (2020&#x2013;2024).</title>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="1" rowspan="1" valign="top">Year</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">
KR-20 Reliability</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="middle">2020</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.90</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="middle">2021</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.88</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="middle">2022</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.87</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="middle">2023</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.89</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="middle">2024</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">0.90</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
            <p>Descriptive statistics of the item difficulty indices for NECO SSCE Mathematics items from 2020 to 2024 are presented in 
                <xref ref-type="table" rid="T4">
Table 4</xref>.</p>
            <table-wrap id="T4" orientation="portrait" position="float">
                <label>
Table 4. </label>
                <caption>
                    <title>Descriptive statistics of item difficulty of NECO SSCE mathematics between 2020&#x2013;2024.</title>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="1" rowspan="1" valign="top">Year</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">N</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">

                                <inline-formula>

                                    <mml:math display="inline">
                                        <mml:mover accent="true">
                                            <mml:mi>x</mml:mi>
                                            <mml:mo stretchy="true">&#x00af;</mml:mo>
                                        </mml:mover>
                                    </mml:math>
</inline-formula>
</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">SD</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">Min</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">
Max</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2020</td>
                            <td align="left" colspan="1" rowspan="1" valign="middle">60</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.75</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.22773</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.00</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">1.00</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2021</td>
                            <td align="left" colspan="1" rowspan="1" valign="middle">60</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.70</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.23711</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.00</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.91</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2022</td>
                            <td align="left" colspan="1" rowspan="1" valign="middle">60</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.74</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.14051</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.09</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.88</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2023</td>
                            <td align="left" colspan="1" rowspan="1" valign="middle">60</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.80</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.16507</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.04</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.94</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2024</td>
                            <td align="left" colspan="1" rowspan="1" valign="middle">60</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.65</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.19615</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.17</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.90</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Total</td>
                            <td align="left" colspan="1" rowspan="1" valign="middle">300</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.73</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.20221</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.00</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">1.00</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
            <p>Each examination comprised 60 multiple-choice items, giving 300 items across the five examination years. The item difficulty indices ranged from 0.00 to 1.00 over the five years, with higher values indicating easier items. The mean item difficulty across the five years was 0.73 (SD&#x00a0;=&#x00a0;0.20), indicating that the NECO SSCE Mathematics items were, on average, moderately easy for candidates: a typical item was answered correctly by about 73% of candidates. The difference in mean item difficulty across the five years was then tested using a one-way analysis of variance (ANOVA). The result is presented in 
                <xref ref-type="table" rid="T5">
Table 5</xref>.</p>
            <table-wrap id="T5" orientation="portrait" position="float">
                <label>
Table 5. </label>
                <caption>
                    <title>
One-Way ANOVA showing the difference in item difficulty of NECO SSCE mathematics between 2020&#x2013;2024.
</title>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="1" rowspan="1" valign="top"/>
                            <th align="left" colspan="1" rowspan="1" valign="top">Sum of Squares</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">df</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">Mean Square</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">F</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">
Sig.</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Between Groups</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.806</td>
                            <td align="left" colspan="1" rowspan="1" valign="middle">4</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.202</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">5.206</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&lt;.001</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Within Groups</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">11.419</td>
                            <td align="left" colspan="1" rowspan="1" valign="middle">295</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.039</td>
                            <td colspan="1" rowspan="1"/>
                            <td colspan="1" rowspan="1"/>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Total</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">12.226</td>
                            <td align="left" colspan="1" rowspan="1" valign="middle">299</td>
                            <td colspan="1" rowspan="1"/>
                            <td colspan="1" rowspan="1"/>
                            <td colspan="1" rowspan="1"/>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
            <p>
                <xref ref-type="table" rid="T5">
Table 5</xref> presents the results of a one-way analysis of variance (ANOVA), which tested whether the mean item difficulty of the NECO SSCE Mathematics examinations differed significantly across the examination years from 2020 to 2024. The computed F ratio (F
                <sub>(4, 295)</sub>&#x00a0;=&#x00a0;5.206, p&#x00a0;&lt;&#x00a0;.001) was statistically significant at the .05 level, which implies that the mean item difficulty indices differed significantly across the five examination years. In other words, the difficulty of the NECO SSCE Mathematics items was not constant from 2020 to 2024: at least one year&#x2019;s mean item difficulty differed significantly from that of the others. A Scheffe post hoc test was therefore conducted to determine where the differences lay. The results are presented in 
                <xref ref-type="table" rid="T6">
Table 6</xref>.</p>
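            <p>The arithmetic behind the F ratio can be retraced from the values in Table 4 and Table 5 with a short Python sketch. It is a consistency check under stated assumptions, not the authors&#x2019; analysis: the variable names are hypothetical, and the standard deviations in Table 4 are assumed to be sample standard deviations (n&#x00a0;&#x2212;&#x00a0;1 denominator).</p>
            <code language="python">
```python
# Values transcribed from Table 4 (per-year SDs) and Table 5 (sums of squares).
ITEMS_PER_YEAR = 60
N_YEARS = 5
SD_BY_YEAR = [0.22773, 0.23711, 0.14051, 0.16507, 0.19615]
SS_BETWEEN = 0.806
SS_WITHIN_REPORTED = 11.419

# Degrees of freedom: number of groups minus one, and total items minus groups.
df_between = N_YEARS - 1
df_within = N_YEARS * ITEMS_PER_YEAR - N_YEARS

# Within-groups sum of squares rebuilt from the SDs (assumed n - 1 denominator).
ss_within = sum((ITEMS_PER_YEAR - 1) * sd ** 2 for sd in SD_BY_YEAR)

# Mean squares and F ratio from the reported sums of squares.
ms_between = SS_BETWEEN / df_between
ms_within = SS_WITHIN_REPORTED / df_within
f_ratio = ms_between / ms_within

print(df_between, df_within)
print(round(ss_within, 3), round(f_ratio, 3))
```
</code>
            <p>Because the unit of analysis is the item (60 items in each of five years), the degrees of freedom are 4 and 295, and the within-groups sum of squares rebuilt from Table 4 agrees with the value in Table 5 to three decimal places.</p>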
            <table-wrap id="T6" orientation="portrait" position="float">
                <label>
Table 6. </label>
                <caption>
                    <title>Scheffe multiple comparison of item difficulty of NECO SSCE mathematics between 2020&#x2013;2024.</title>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="1" rowspan="1" valign="top">(I) NECO Item Difficulty Indices</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">(J) NECO Item Difficulty Indices</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">Mean Difference 
(I-J)</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">Std. Error</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">
Sig.</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td align="left" colspan="1" rowspan="4" valign="top">2020</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">2021</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.05799</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.03592</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.626</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2022</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.01342</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.03592</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.998</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2023</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2212;.05094</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.03592</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.734</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2024</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.10131</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.03592</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.096</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="4" valign="top">2021</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">2020</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2212;.05799</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.03592</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.626</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2022</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2212;.04457</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.03592</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.819</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2023</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2212;.10893</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.03592</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.059</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2024</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.04332</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.03592</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.834</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="4" valign="top">2022</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">2020</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2212;.01342</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.03592</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.998</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2021</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.04457</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.03592</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.819</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2023</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2212;.06436</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.03592</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.524</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2024</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.08789</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.03592</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.203</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="4" valign="top">2023</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">2020</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.05094</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.03592</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.734</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2021</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.10893</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.03592</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.059</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2022</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.06436</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.03592</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.524</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2024</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.15225
                                <sup>*</sup>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.03592</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.002</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="4" valign="top">2024</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">2020</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2212;.10131</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.03592</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.096</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2021</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2212;.04332</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.03592</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.834</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2022</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2212;.08789</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.03592</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.203</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2023</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2212;.15225
                                <sup>*</sup>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.03592</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.002</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
            <p>
                <xref ref-type="table" rid="T6">
Table 6</xref> presents the Scheff&#x00e9; post hoc analysis of item difficulty indices for the NECO SSCE Mathematics examinations from 2020 to 2024. Most pairwise comparisons between examination years did not reach the 0.05 significance level, indicating that item difficulty was broadly consistent across those years. The one exception was the comparison between the 2023 and 2024 examinations, where the mean difference in item difficulty was 0.15225 (p&#x00a0;=&#x00a0;0.002). Because the difficulty index is the proportion of candidates answering an item correctly, this positive mean difference (2023 minus 2024) indicates that the 2023 items were, on average, easier than the 2024 items.
                <statement id="state2">
                    <label>Hypothesis 2:</label>
                    <p>The item discrimination of NECO SSCE Mathematics examinations between 2020 and 2024 did not vary significantly.</p>
                </statement>
            </p>
            <p>
                <xref ref-type="table" rid="T7">
Table 7</xref> presents descriptive statistics of the item discrimination indices for NECO SSCE Mathematics in each of the five years from 2020 to 2024. Each year&#x2019;s examination comprised 60 multiple-choice items, for a total of 300 items analyzed. Mean discrimination ranged from 0.2546 (2023) to 0.3559 (2024). The 2020 and 2021 examinations had mean values of 0.2723 and 0.2744, respectively, indicating moderate discrimination, and the 2022 mean of 0.2861 was slightly higher. The 2023 examination had the lowest mean (0.2546), indicating the weakest differentiation among candidates that year. The 2024 examination had the highest mean (0.3559), suggesting an improved ability to differentiate among candidates of varying ability, although it also had the largest spread (SD&#x00a0;=&#x00a0;0.22995).</p>
            <table-wrap id="T7" orientation="portrait" position="float">
                <label>
Table 7. </label>
                <caption>
                    <title>Descriptive statistics of item discrimination of NECO SSCE mathematics between 2020&#x2013;2024.</title>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="1" rowspan="1" valign="top">Year</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">N</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">

                                <inline-formula>

                                    <mml:math display="inline">
                                        <mml:mover accent="true">
                                            <mml:mi>x</mml:mi>
                                            <mml:mo stretchy="true">&#x00af;</mml:mo>
                                        </mml:mover>
                                    </mml:math>
</inline-formula>
</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">SD</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">Min</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">
Max</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2020</td>
                            <td align="left" colspan="1" rowspan="1" valign="middle">60</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.2723</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.10728</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2212;.05</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.40</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2021</td>
                            <td align="left" colspan="1" rowspan="1" valign="middle">60</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.2744</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.10239</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2212;.05</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.42</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2022</td>
                            <td align="left" colspan="1" rowspan="1" valign="middle">60</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.2861</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.07985</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.01</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.42</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2023</td>
                            <td align="left" colspan="1" rowspan="1" valign="middle">60</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.2546</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.06861</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2212;.06</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.36</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2024</td>
                            <td align="left" colspan="1" rowspan="1" valign="middle">60</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.3559</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.22995</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2212;.05</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.74</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Total</td>
                            <td align="left" colspan="1" rowspan="1" valign="middle">300</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.2887</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.13489</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2212;.06</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.74</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
            <p>According to 
                <xref ref-type="table" rid="T8">
Table 8</xref>, the one-way ANOVA indicates significant differences in the mean discrimination indices of the NECO SSCE Mathematics examinations administered between 2020 and 2024 (F(4,&#x00a0;295)&#x00a0;=&#x00a0;5.375; p&#x00a0;&lt;&#x00a0;0.001). A significant omnibus result means that the mean discrimination of at least one year differs from that of the others; it does not identify which years differ.</p>
            <table-wrap id="T8" orientation="portrait" position="float">
                <label>
Table 8. </label>
                <caption>
                    <title>One-Way ANOVA showing the difference in item discrimination indices of NECO SSCE mathematics between 2020&#x2013;2024.</title>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="1" rowspan="1" valign="top"/>
                            <th align="left" colspan="1" rowspan="1" valign="top">Sum of squares</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">df</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">Mean square</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">F</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">Sig.</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Between Groups</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.370</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">4</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.092</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">5.375</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.000</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Within Groups</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">5.071</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">295</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.017</td>
                            <td colspan="1" rowspan="1"/>
                            <td colspan="1" rowspan="1"/>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">Total</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">5.441</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">299</td>
                            <td colspan="1" rowspan="1"/>
                            <td colspan="1" rowspan="1"/>
                            <td colspan="1" rowspan="1"/>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
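            <p>As a consistency check, the one-way ANOVA can be approximately reconstructed from the summary statistics in Table 7 alone. The sketch below assumes equal group sizes (60 items per year) and uses the means and standard deviations exactly as printed, so small rounding differences from Table 8 are expected. The variable names are illustrative and are not taken from the original analysis.</p>
            <code language="python"><![CDATA[
```python
# One-way ANOVA recomputed from the summary statistics printed in Table 7.
# Each year has n = 60 items; values are copied from the table as rounded there.
n = 60
means = [0.2723, 0.2744, 0.2861, 0.2546, 0.3559]
sds = [0.10728, 0.10239, 0.07985, 0.06861, 0.22995]

k = len(means)                 # number of groups (examination years)
grand_mean = sum(means) / k    # simple average is valid because group sizes are equal

# Between-groups sum of squares: n times the squared deviations of the group means.
ss_between = n * sum((m - grand_mean) ** 2 for m in means)
# Within-groups sum of squares: (n - 1) times each group variance, summed over groups.
ss_within = (n - 1) * sum(sd ** 2 for sd in sds)

df_between = k - 1
df_within = k * n - k
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(f"SS between = {ss_between:.3f}, SS within = {ss_within:.3f}")
print(f"F({df_between}, {df_within}) = {f_stat:.3f}")
```
]]></code>
            <p>Under these assumptions, the recomputed sums of squares and F-statistic agree with the values reported in Table 8 to within rounding, which suggests that Tables 7 and 8 are internally consistent.</p>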
            <p>Following the significant omnibus result, a Scheff&#x00e9; post hoc pairwise comparison was conducted to identify which years differed. The results are presented in 
                <xref ref-type="table" rid="T9">
Table 9</xref>.</p>
            <table-wrap id="T9" orientation="portrait" position="float">
                <label>
Table 9. </label>
                <caption>
                    <title>Scheff&#x00e9; multiple comparison of item discrimination of NECO SSCE mathematics between 2020&#x2013;2024.</title>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="1" rowspan="1" valign="top">(I) NECO item discrimination indices</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">(J) NECO item discrimination indices</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">Mean difference 
(I-J)</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">Std. Error</th>
                            <th align="left" colspan="1" rowspan="1" valign="top">
Sig.</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td align="left" colspan="1" rowspan="4" valign="top">2020</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">2021</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2212;.00202</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.02394</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">1.000</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2022</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2212;.01380</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.02394</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.988</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2023</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.01772</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.02394</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.968</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2024</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2212;.08358
                                <sup>*</sup>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.02394</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.017</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="4" valign="top">2021</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">2020</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.00202</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.02394</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">1.000</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2022</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2212;.01178</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.02394</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.993</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2023</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.01975</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.02394</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.954</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2024</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2212;.08156
                                <sup>*</sup>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.02394</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.022</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="4" valign="top">2022</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">2020</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.01380</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.02394</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.988</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2021</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.01178</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.02394</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.993</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2023</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.03152</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.02394</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.784</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2024</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2212;.06978</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.02394</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.078</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="4" valign="top">2023</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">2020</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2212;.01772</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.02394</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.968</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2021</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2212;.01975</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.02394</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.954</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2022</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2212;.03152</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.02394</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.784</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2024</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">&#x2212;.10130
                                <sup>*</sup>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.02394</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.002</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="4" valign="top">2024</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">2020</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.08358
                                <sup>*</sup>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.02394</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.017</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2021</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.08156
                                <sup>*</sup>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.02394</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.022</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2022</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.06978</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.02394</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.078</td>
                        </tr>
                        <tr>
                            <td align="left" colspan="1" rowspan="1" valign="top">2023</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.10130
                                <sup>*</sup>
                            </td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.02394</td>
                            <td align="left" colspan="1" rowspan="1" valign="top">.002</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
            <p>
                <xref ref-type="table" rid="T9">
Table 9</xref> shows that most year-to-year pairwise comparisons were not significant (p&#x00a0;&gt;&#x00a0;0.05). Specifically, the comparisons between 2020 and 2021, 2020 and 2022, 2020 and 2023, 2021 and 2022, 2021 and 2023, and 2022 and 2023 revealed no significant differences, indicating that item discrimination was stable from 2020 to 2023. The comparison between 2022 and 2024 also fell short of significance (Mean Difference&#x00a0;=&#x00a0;&#x2212;0.06978, p&#x00a0;=&#x00a0;.078). However, the 2024 examination differed significantly from three of the earlier years: 2020 (Mean Difference&#x00a0;=&#x00a0;&#x2212;0.08358, p&#x00a0;=&#x00a0;.017), 2021 (Mean Difference&#x00a0;=&#x00a0;&#x2212;0.08156, p&#x00a0;=&#x00a0;.022), and 2023 (Mean Difference&#x00a0;=&#x00a0;&#x2212;0.10130, p&#x00a0;=&#x00a0;.002). Because each difference is computed as the earlier year minus 2024, the negative values indicate that the 2024 items discriminated significantly better among candidates than those of 2020, 2021, and 2023.</p>
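            <p>The standard error and the pattern of significant comparisons in Table 9 can be approximately reproduced from the within-groups mean square in Table 8. The sketch below is illustrative only: it assumes 60 items per year, uses values as rounded in the tables, and takes the 5% critical value of F(4,&#x00a0;295) to be roughly 2.40, an approximation from standard tables rather than a value reported in this study.</p>
            <code language="python"><![CDATA[
```python
import math

# Values copied from Table 8 (as printed) and the study design.
ss_within, df_within = 5.071, 295
k, n = 5, 60                      # five examination years, 60 items per year
ms_within = ss_within / df_within

# Standard error of a difference between two group means with equal n.
se_diff = math.sqrt(ms_within * (1 / n + 1 / n))

# Mean differences (earlier year minus 2024) copied from Table 9.
diffs = {2020: -0.08358, 2021: -0.08156, 2022: -0.06978, 2023: -0.10130}

# ASSUMPTION: approximate 5% critical value of F(4, 295) from standard tables.
F_CRIT = 2.40

# Scheffe statistic for a pairwise contrast: squared t-ratio divided by (k - 1).
scheffe = {year: (d / se_diff) ** 2 / (k - 1) for year, d in diffs.items()}
significant = sorted(year for year, f in scheffe.items() if f > F_CRIT)

print(f"SE of difference = {se_diff:.5f}")
for year, f in scheffe.items():
    print(f"{year} vs 2024: F_S = {f:.2f}")
print("Significant at the 5% level:", significant)
```
]]></code>
            <p>Under these assumptions, the computed standard error matches the value of .02394 shown in Table 9, and only the comparisons of 2020, 2021, and 2023 with 2024 exceed the critical value, in line with the reported p-values.</p>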
        </sec>
        <sec id="sec7" sec-type="discussion">
            <title>Discussion</title>
            <p>This study evaluated changes in the psychometric properties of the NECO Senior School Certificate Examination (SSCE) in Mathematics over the five-year period from 2020 to 2024, using data from candidates in Osun State.</p>
            <p>The item difficulty analysis showed that the average difficulty of NECO SSCE Mathematics items remained within acceptable CTT limits (p&#x00a0;&#x2248;&#x00a0;0.30&#x2013;0.80). Item difficulty was stable from 2020 to 2023, but the 2024 examination departed from this established pattern, as confirmed by the ANOVA and by the Scheff&#x00e9; post hoc comparison between 2023 and 2024. Variations in item difficulty across years are common in large-scale assessments because they reflect changes in educational programs, test developers&#x2019; understanding of learning objectives, and the setting of assessment standards.
                <sup>
                    <xref ref-type="bibr" rid="ref19">19</xref>
                </sup> The significant difference involving the 2024 items implies a possible recalibration of examination standards, which, while not inherently problematic, underscores the importance of systematic equating and longitudinal monitoring to ensure comparability of scores across years.
                <sup>
                    <xref ref-type="bibr" rid="ref20">20</xref>,
                    <xref ref-type="bibr" rid="ref21">21</xref>
                </sup>
            </p>
            <p>Item discrimination varied notably across the five examination years. The NECO Mathematics items distinguished adequately between higher- and lower-ability candidates, with mean discrimination indices in every year above the commonly used minimum of 0.20. Discrimination peaked in 2024, which recorded the highest mean value. This pattern may reflect greater attention by the examining body to item quality, for example through improved item writing, moderation, and review. High discrimination indices are desirable in high-stakes testing because they support more meaningful interpretation of test scores.
                <sup>
                    <xref ref-type="bibr" rid="ref9">9</xref>
                </sup> Negative discrimination values occurred in several years, and approximately 10 percent of items did not function properly; possible causes include ambiguous wording, incorrect answer keys, and content that did not match the test objectives. Consistent with previous studies of public examination systems, this finding underscores the need for regular post-examination item evaluation in developing assessment systems.
                <sup>
                    <xref ref-type="bibr" rid="ref22">22</xref>,
                    <xref ref-type="bibr" rid="ref14">14</xref>
                </sup>
            </p>
            <p>The KR-20 reliability coefficients for the five examination years ranged from 0.87 to 0.90, confirming high internal consistency in every administration. Year-to-year differences were minimal (below 0.02), and the overall range across the five years was 0.03. Because these variations are psychometrically negligible, the NECO SSCE Mathematics examination maintained consistent measurement precision throughout the period. The high coefficients likely reflect sound test construction practices, adequate test length, and items that target a common core construct. This stability is consistent with expectations for large-scale assessments, in which reliability should remain stable across administrations.</p>
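            <p>For readers unfamiliar with the coefficient, the following minimal sketch shows how KR-20 is computed from dichotomously scored items. The response matrix is hypothetical toy data, not data from this study, and the function name is illustrative. The sketch uses the population variance of total scores; some texts use the sample variance instead, which gives slightly different values.</p>
            <code language="python"><![CDATA[
```python
def kr20(responses):
    """KR-20 for a matrix of 0/1 item scores (rows = candidates, columns = items)."""
    n_candidates = len(responses)
    n_items = len(responses[0])

    # Proportion correct (p) for each item, then the sum of p * (1 - p).
    p = [sum(row[j] for row in responses) / n_candidates for j in range(n_items)]
    sum_pq = sum(pj * (1 - pj) for pj in p)

    # Population variance of candidates' total scores (an assumed convention).
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_candidates
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_candidates

    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)


# Hypothetical toy data (NOT from the study): 6 candidates, 5 items.
toy = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
print(round(kr20(toy), 4))  # 0.8333
```
]]></code>
            <p>Coefficients of 0.87 to 0.90, as reported here for 60-item examinations, indicate that the sum of the item variances is small relative to the variance of total scores.</p>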
            <sec id="sec8">
                <title>Implications for examination quality assurance</title>
                <p>The results indicate that the NECO SSCE Mathematics examination has maintained strong reliability and that its items are of acceptable quality. The year-to-year variation in item indices nevertheless shows that test development is a dynamic process that requires ongoing psychometric evaluation to maintain valid results in high-stakes certification and selection examinations. In line with earlier recommendations,
                    <sup>
                        <xref ref-type="bibr" rid="ref19">19</xref>,
                        <xref ref-type="bibr" rid="ref21">21</xref>
                    </sup> regular item analysis, alongside structured feedback loops for item writers and moderators, would help sustain improvements in discrimination quality while ensuring that changes in difficulty do not compromise fairness or comparability across cohorts. Such practices are critical for strengthening public confidence in examination outcomes and supporting evidence-based assessment reforms in Nigeria.</p>
            </sec>
        </sec>
        <sec id="sec9" sec-type="conclusion">
            <title>Conclusion</title>
            <p>Using Classical Test Theory, this study examined the psychometric characteristics of the NECO Senior School Certificate Examination (SSCE) Mathematics test over the five-year period from 2020 to 2024 for candidates in Osun State. The results show that item quality in this high-stakes public examination was broadly consistent across years, although item difficulty and discrimination shifted in 2024. The study concludes that the NECO SSCE Mathematics test has strong psychometric properties, particularly with respect to reliability. Continuous, systematic monitoring of item difficulty and discrimination is nevertheless needed to ensure fair testing and comparable evaluation of academic performance across years.</p>
            <sec id="sec10">
                <title>Recommendations</title>
                <p>Based on the findings of this study, the following recommendations are made:
                    <list list-type="roman-lower">
                        <list-item>
                            <label>i.</label>
                            <p>

                                <bold>Routine Post-Examination Item Analysis:</bold> NECO should institutionalize comprehensive post-examination item analysis after each examination cycle to identify poorly functioning items, particularly those with low or negative discrimination indices, for revision or elimination.</p>
                        </list-item>
                        <list-item>
                            <label>ii.</label>
                            <p>

                                <bold>Strengthening Item Writer Training:</bold> Regular capacity-building workshops should be organized for item writers and moderators, with emphasis on writing items that achieve optimal difficulty and high discrimination in line with Classical Test Theory guidelines.</p>
                        </list-item>
                        <list-item>
                            <label>iii.</label>
                            <p>

                                <bold>Monitoring Longitudinal Item Trends:</bold> NECO should adopt a structured framework for monitoring longitudinal trends in psychometric indices to ensure consistency of examination standards across years and prevent unintended shifts in difficulty.</p>
                        </list-item>
                        <list-item>
                            <label>iv.</label>
                            <p>

                                <bold>Use of Statistical Evidence in Test Review:</bold> Decisions regarding item retention, modification, or replacement should be guided by empirical psychometric evidence rather than solely by expert judgment.</p>
                        </list-item>
                        <list-item>
                            <label>v.</label>
                            <p>

                                <bold>Expansion to Advanced Psychometric Models:</bold> Future evaluations of NECO examinations should complement Classical Test Theory with Item Response Theory analyses to provide deeper insights into item functioning and candidate ability estimation.</p>
                        </list-item>
                        <list-item>
                            <label>vi.</label>
                            <p>

                                <bold>Policy Support for Examination Quality Assurance:</bold> Educational policymakers should support the integration of psychometric research findings into national examination quality assurance policies to enhance public confidence in examination results.</p>
                        </list-item>
                    </list>
                </p>
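                <p>As a minimal sketch of the routine item analysis described in recommendation (i), and assuming a dichotomously scored (0/1) response matrix with candidates in rows and items in columns, the Classical Test Theory indices could be computed along the following lines. The function names, the choice of a corrected (item-rest) point-biserial as the discrimination index, the use of KR-20 as the reliability coefficient and the demonstration data are all illustrative assumptions, not the procedure used in this study.</p>
                <preformat>
```python
from statistics import mean, pstdev, pvariance


def item_difficulty(responses, item):
    """Proportion of candidates answering the item correctly (CTT p-value)."""
    return mean(row[item] for row in responses)


def item_discrimination(responses, item):
    """Corrected point-biserial: correlation of the item with the rest score."""
    item_scores = [row[item] for row in responses]
    rest_scores = [sum(row) - row[item] for row in responses]
    sd_item, sd_rest = pstdev(item_scores), pstdev(rest_scores)
    if sd_item == 0 or sd_rest == 0:
        # No variance: the correlation is undefined; this sketch reports 0.0.
        return 0.0
    mean_item, mean_rest = mean(item_scores), mean(rest_scores)
    covariance = mean(
        (x - mean_item) * (y - mean_rest)
        for x, y in zip(item_scores, rest_scores)
    )
    return covariance / (sd_item * sd_rest)


def kr20(responses):
    """Kuder-Richardson 20 reliability, using population variances throughout."""
    n_items = len(responses[0])
    total_variance = pvariance([sum(row) for row in responses])
    if n_items == 1 or total_variance == 0:
        raise ValueError("KR-20 needs several items and varying total scores")
    difficulties = [item_difficulty(responses, i) for i in range(n_items)]
    sum_pq = sum(p * (1 - p) for p in difficulties)
    return (n_items / (n_items - 1)) * (1 - sum_pq / total_variance)


if __name__ == "__main__":
    # Hypothetical responses: five candidates, four items (1 = correct).
    demo_responses = [
        [1, 1, 1, 1],
        [1, 1, 1, 0],
        [1, 1, 0, 0],
        [1, 0, 0, 0],
        [0, 0, 0, 0],
    ]
    for i in range(len(demo_responses[0])):
        print(i, item_difficulty(demo_responses, i),
              item_discrimination(demo_responses, i))
    print("KR-20:", kr20(demo_responses))
```
                </preformat>
                <p>The item-rest correlation is used in this sketch, rather than the item-total correlation, so that an item's own score does not inflate its discrimination estimate.</p>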
            </sec>
        </sec>
        <sec id="sec11">
            <title>Ethical approval</title>
            <p>Ethical approval was not required for this study as it involved secondary analysis of anonymized examination data with no direct involvement of human participants.</p>
        </sec>
    </body>
    <back>
        <sec id="sec14" sec-type="data-availability">
            <title>Data availability</title>
            <p>Open Science Framework: Adediwura, A. A., Babayemi, B. O., &amp; Odumbo, O. I. (2026, May 12). Trends in the Psychometric Characteristics of NECO Mathematics Senior School Certificate Examination Over a Period of Five Years (2020&#x2013;2024) among Osun State&#x2019;s Candidates. 
                <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.17605/OSF.IO/GSPU5">https://doi.org/10.17605/OSF.IO/GSPU5</ext-link>
                <sup>
                    <xref ref-type="bibr" rid="ref23">23</xref>
                </sup>
            </p>
            <p>This project contains the following extended data:
                <list list-type="bullet">
                    <list-item>
                        <label>&#x2022;</label>
                        <p>ODUMBO DATA REPOSITORY file.pdf/doc.</p>
                    </list-item>
                </list>
            </p>
            <p>Data are available under the terms of the 
                <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/publicdomain/zero/1.0/legalcode">Creative Commons Zero &#x201c;No rights reserved&#x201d; data waiver</ext-link> (CC0 1.0 Public domain dedication).
            </p>
        </sec>
        <ref-list>
            <title>References</title>
            <ref id="ref1">
                <label>1</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Aborisade</surname>
                            <given-names>OJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Fajobi</surname>
                            <given-names>OO</given-names>
                        </name>
</person-group>:
                    <article-title>Comparative analysis of psychometric properties of mathematics items constructed by WAEC and NECO in Nigeria using item response theory approach.</article-title>
                    <source>

                        <italic toggle="yes">Educ Res Rev.</italic>
</source>
                    <year>2020</year>;<volume>15</volume>(<issue>1</issue>):<fpage>1</fpage>&#x2013;<lpage>7</lpage>.
                    <pub-id pub-id-type="doi">10.5897/ERR2019.3850</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref2">
                <label>2</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Oghenerume</surname>
                            <given-names>RA</given-names>
                        </name>
</person-group>:
                    <article-title>Item statistics disparity between 2023 WASSCE and NECO SSCE mathematics large-scale assessments.</article-title>
                    <source>

                        <italic toggle="yes">Int J Educ Res.</italic>
</source>
                    <year>2025</year>;<volume>16</volume>(<issue>1</issue>).</mixed-citation>
            </ref>
            <ref id="ref3">
                <label>3</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Jimoh</surname>
                            <given-names>K</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Opesemowo</surname>
                            <given-names>AA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Faremi</surname>
                            <given-names>YA</given-names>
                        </name>
</person-group>:
                    <article-title>Psychometric analysis of SSCE 2017 NECO English language multiple choice items using IRT.</article-title>
                    <source>

                        <italic toggle="yes">J Appl Res Multidiscip Stud.</italic>
</source>
                    <year>2022</year>.</mixed-citation>
            </ref>
            <ref id="ref4">
                <label>4</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Adediwura</surname>
                            <given-names>AA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Asowo</surname>
                            <given-names>PA</given-names>
                        </name>
</person-group>:
                    <article-title>Examining the nature of item bias on NECO mathematics senior school certificate dichotomously scored items in Nigeria.</article-title>
                    <source>

                        <italic toggle="yes">Int J Contemp Educ.</italic>
</source>
                    <year>2022</year>.</mixed-citation>
            </ref>
            <ref id="ref5">
                <label>5</label>
                <mixed-citation publication-type="book">
                    <collab>OECD</collab>:
                    <chapter-title>An OECD learning framework 2030.</chapter-title>
                    <source>

                        <italic toggle="yes">The future of education and labor.</italic>
</source>
                    <publisher-loc>Cham</publisher-loc>:
                    <publisher-name>Springer International Publishing</publisher-name>;<year>2019</year>; p.<fpage>23</fpage>&#x2013;<lpage>35</lpage>.
                    <pub-id pub-id-type="doi">10.1007/978-3-030-26068-2_3</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref6">
                <label>6</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kane</surname>
                            <given-names>MT</given-names>
                        </name>
</person-group>:
                    <article-title>Validating the interpretations and uses of test scores.</article-title>
                    <source>

                        <italic toggle="yes">J Educ Meas.</italic>
</source>
                    <year>2021</year>;<volume>58</volume>(<issue>2</issue>):<fpage>135</fpage>&#x2013;<lpage>150</lpage>.</mixed-citation>
            </ref>
            <ref id="ref7">
                <label>7</label>
                <mixed-citation publication-type="book">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>De Ayala</surname>
                            <given-names>RJ</given-names>
                        </name>
</person-group>:
                    <source>

                        <italic toggle="yes">The theory and practice of item response theory.</italic>
</source>
                    <publisher-loc>New York</publisher-loc>:
                    <publisher-name>Guilford Press</publisher-name>;
                    <edition>2nd ed</edition>
                    <year>2013</year>.</mixed-citation>
            </ref>
            <ref id="ref8">
                <label>8</label>
                <mixed-citation publication-type="book">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Crocker</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Algina</surname>
                            <given-names>J</given-names>
                        </name>
</person-group>:
                    <source>

                        <italic toggle="yes">Introduction to classical and modern test theory.</italic>
</source>
                    <publisher-loc>Boston</publisher-loc>:
                    <publisher-name>Cengage Learning</publisher-name>;
                    <edition>2nd ed</edition>
                    <year>2018</year>.</mixed-citation>
            </ref>
            <ref id="ref9">
                <label>9</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Downing</surname>
                            <given-names>SM</given-names>
                        </name>
</person-group>:
                    <article-title>Reliability: On the reproducibility of assessment data.</article-title>
                    <source>

                        <italic toggle="yes">Med Educ.</italic>
</source>
                    <year>2003</year>;<volume>37</volume>(<issue>9</issue>):<fpage>830</fpage>&#x2013;<lpage>837</lpage>.</mixed-citation>
            </ref>
            <ref id="ref10">
                <label>10</label>
                <mixed-citation publication-type="book">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Bond</surname>
                            <given-names>TG</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Fox</surname>
                            <given-names>CM</given-names>
                        </name>
</person-group>:
                    <source>

                        <italic toggle="yes">Applying the Rasch model: Fundamental measurement in the human sciences.</italic>
</source>
                    <publisher-loc>New York</publisher-loc>:
                    <publisher-name>Routledge</publisher-name>;
                    <edition>3rd ed</edition>
                    <year>2015</year>.</mixed-citation>
            </ref>
            <ref id="ref11">
                <label>11</label>
                <mixed-citation publication-type="book">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kline</surname>
                            <given-names>RB</given-names>
                        </name>
</person-group>:
                    <source>

                        <italic toggle="yes">Principles and practice of structural equation modeling.</italic>
</source>
                    <publisher-loc>New York</publisher-loc>:
                    <publisher-name>Guilford Press</publisher-name>;
                    <edition>4th ed</edition>
                    <year>2016</year>.</mixed-citation>
            </ref>
            <ref id="ref12">
                <label>12</label>
                <mixed-citation publication-type="book">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Lane</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Raymond</surname>
                            <given-names>MR</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Haladyna</surname>
                            <given-names>TM</given-names>
                        </name>
</person-group>:
                    <source>

                        <italic toggle="yes">Handbook of test development.</italic>
</source>
                    <publisher-loc>New York</publisher-loc>:
                    <publisher-name>Routledge</publisher-name>;
                    <edition>2nd ed</edition>
                    <year>2016</year>.</mixed-citation>
            </ref>
            <ref id="ref13">
                <label>13</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Aborisade</surname>
                            <given-names>OJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Fajobi</surname>
                            <given-names>OO</given-names>
                        </name>
</person-group>:
                    <article-title>Comparative analysis of psychometric properties of mathematics items constructed by WAEC and NECO in Nigeria using item response theory approach.</article-title>
                    <source>

                        <italic toggle="yes">Educ Res Rev.</italic>
</source>
                    <year>2020</year>;<volume>15</volume>(<issue>1</issue>):<fpage>1</fpage>&#x2013;<lpage>7</lpage>.
                    <pub-id pub-id-type="doi">10.5897/ERR2019.3850</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref14">
                <label>14</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Jimoh</surname>
                            <given-names>MA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Yusuf</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Adebayo</surname>
                            <given-names>SO</given-names>
                        </name>
</person-group>:
                    <article-title>Dimensionality and item functioning of public examination items in Nigeria.</article-title>
                    <source>

                        <italic toggle="yes">J Educ Meas Eval.</italic>
</source>
                    <year>2022</year>;<volume>14</volume>(<issue>2</issue>):<fpage>45</fpage>&#x2013;<lpage>62</lpage>.</mixed-citation>
            </ref>
            <ref id="ref15">
                <label>15</label>
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ekong</surname>
                            <given-names>EM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ubi</surname>
                            <given-names>IO</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Eni</surname>
                            <given-names>EI</given-names>
                        </name>
</person-group>:
                    <article-title>Differential item functioning of 2018 basic education certificate examination (BECE) in mathematics: A comparative study of male and female candidates.</article-title>
                </mixed-citation>
            </ref>
            <ref id="ref16">
                <label>16</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Adeyemi</surname>
                            <given-names>AA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Arogundade</surname>
                            <given-names>I</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Oluwakemi</surname>
                            <given-names>BBO</given-names>
                        </name>
</person-group>:
                    <article-title>Hybrid learning approaches and their effect on students' engagement and academic performance in secondary schools in some Nigerian states.</article-title>
                    <source>

                        <italic toggle="yes">Int J Res Innov Soc Sci.</italic>
</source>
                    <year>2025</year>;<volume>9</volume>(<issue>11</issue>).</mixed-citation>
            </ref>
            <ref id="ref17">
                <label>17</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Zumbo</surname>
                            <given-names>BD</given-names>
                        </name>
</person-group>:
                    <article-title>A measure of fairness: Using differential item functioning to detect bias.</article-title>
                    <source>

                        <italic toggle="yes">Educ Meas Issues Pract.</italic>
</source>
                    <year>2016</year>;<volume>35</volume>(<issue>1</issue>):<fpage>3</fpage>&#x2013;<lpage>12</lpage>.</mixed-citation>
            </ref>
            <ref id="ref18">
                <label>18</label>
                <mixed-citation publication-type="book">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Millsap</surname>
                            <given-names>RE</given-names>
                        </name>
</person-group>:
                    <source>

                        <italic toggle="yes">Statistical approaches to measurement invariance.</italic>
</source>
                    <publisher-loc>New York</publisher-loc>:
                    <publisher-name>Routledge</publisher-name>;<year>2018</year>.</mixed-citation>
            </ref>
            <ref id="ref19">
                <label>19</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Haladyna</surname>
                            <given-names>TM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Rodriguez</surname>
                            <given-names>MC</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Downing</surname>
                            <given-names>SM</given-names>
                        </name>
</person-group>:
                    <article-title>A review of multiple-choice item-writing guidelines for classroom assessment.</article-title>
                    <source>

                        <italic toggle="yes">Appl Meas Educ.</italic>
</source>
                    <year>2018</year>;<volume>31</volume>(<issue>1</issue>):<fpage>1</fpage>&#x2013;<lpage>18</lpage>.</mixed-citation>
            </ref>
            <ref id="ref20">
                <label>20</label>
                <mixed-citation publication-type="book">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kolen</surname>
                            <given-names>MJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Brennan</surname>
                            <given-names>RL</given-names>
                        </name>
</person-group>:
                    <source>

                        <italic toggle="yes">Test equating, scaling, and linking.</italic>
</source>
                    <publisher-loc>New York</publisher-loc>:
                    <publisher-name>Springer</publisher-name>;
                    <edition>3rd ed</edition>
                    <year>2014</year>.</mixed-citation>
            </ref>
            <ref id="ref21">
                <label>21</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kim</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lee</surname>
                            <given-names>WC</given-names>
                        </name>
</person-group>:
                    <article-title>Trends in item difficulty and discrimination across repeated large-scale assessments.</article-title>
                    <source>

                        <italic toggle="yes">Educ Meas Issues Pract.</italic>
</source>
                    <year>2021</year>;<volume>40</volume>(<issue>3</issue>):<fpage>23</fpage>&#x2013;<lpage>34</lpage>.</mixed-citation>
            </ref>
            <ref id="ref22">
                <label>22</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Awopeju</surname>
                            <given-names>OA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Afolabi</surname>
                            <given-names>ERI</given-names>
                        </name>
</person-group>:
                    <article-title>Comparative analysis of classical test theory and item response theory-based item parameter estimates of senior school certificate mathematics examination.</article-title>
                    <source>

                        <italic toggle="yes">Eur Sci J.</italic>
</source>
                    <year>2016</year>;<volume>12</volume>(<issue>28</issue>):<fpage>263</fpage>&#x2013;<lpage>284</lpage>.</mixed-citation>
            </ref>
            <ref id="ref23">
                <label>23</label>
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Adediwura</surname>
                            <given-names>AA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Babayemi</surname>
                            <given-names>BO</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Odumbo</surname>
                            <given-names>OI</given-names>
                        </name>
</person-group>:
                    <article-title>Trends in the Psychometric Characteristics of NECO Mathematics Senior School Certificate Examination Over a Period of Five Years (2020&#x2013;2024) among Osun State&#x2019;s Candidates.</article-title>
                    <year>2026, May 12</year>.
                    <pub-id pub-id-type="doi">10.17605/OSF.IO/GSPU5</pub-id>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report489954">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.201328.r489954</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Ayanwale</surname>
                        <given-names>Musa Adekunle</given-names>
                    </name>
                    <xref ref-type="aff" rid="r489954a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-7640-9898</uri>
                </contrib>
                <aff id="r489954a1">
                    <label>1</label>University of Pretoria, Pretoria, South Africa</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>13</day>
                <month>6</month>
                <year>2026</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2026 Ayanwale MA</copyright-statement>
                <copyright-year>2026</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport489954" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.182391.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>reject</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The topic is important because monitoring psychometric quality is essential for maintaining fairness, reliability, and public confidence in high-stakes examinations. The study benefits from a large dataset and addresses a relevant assessment issue. However, the manuscript contains several methodological, psychometric, statistical, and reporting weaknesses that substantially limit confidence in the conclusions.</p>
            <p> The most significant concern is comparing psychometric indices across different examination years without evidence that the examination forms are psychometrically comparable. The authors do not report any equating procedures, anchor items, linking design, or measurement invariance analyses. Consequently, differences observed across years may reflect differences in examination forms rather than genuine changes in examination quality. This limitation affects the validity of the reported trend analyses and should be explicitly addressed.</p>
            <p> The methodology section lacks sufficient detail to permit replication. Important information is missing regarding item analysis procedures, the computation of difficulty and discrimination indices, the treatment of missing responses, the statistical software used, classification criteria for psychometric indices, and data cleaning procedures. These omissions reduce transparency and reproducibility.</p>
            <p> The statistical analyses also require reconsideration. The rationale for applying ANOVA to item-level psychometric indices is not adequately justified, assumptions are not discussed, and several reporting inconsistencies are present. For example, some tables refer to 2025 rather than 2024, and discrepancies exist between ANOVA values reported in the text and those reported in the tables. These issues raise concerns regarding the accuracy of the analyses and interpretation.</p>
            <p> Furthermore, the study relies exclusively on Classical Test Theory despite extensive discussion of Item Response Theory and Differential Item Functioning in the literature review. Additional psychometric evidence relating to dimensionality, item fit, local independence, validity, and fairness would strengthen the study considerably. There is also a mismatch between the stated objectives and the reported results: distractor efficiency is listed as a psychometric characteristic of interest, yet no distractor analysis is presented.</p>
            <p> </p>
            <p> The discussion section primarily repeats the results rather than providing a deeper interpretation. Greater engagement with the literature is needed to explain possible reasons for the observed fluctuations in difficulty and discrimination, particularly the notable changes reported for the 2024 examination. Consideration of curriculum changes, examination reforms, post-pandemic educational effects, and differences in candidate preparation would strengthen the discussion.</p>
            <p>Is the work clearly and accurately presented and does it cite the current literature?</p>
            <p>Partly</p>
            <p>If applicable, is the statistical analysis and its interpretation appropriate?</p>
            <p>No</p>
            <p>Are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Partly</p>
            <p>Is the study design appropriate and is the work technically sound?</p>
            <p>No</p>
            <p>Are the conclusions drawn adequately supported by the results?</p>
            <p>Partly</p>
            <p>Are sufficient details of methods and analysis provided to allow replication by others?</p>
            <p>No</p>
            <p>Reviewer Expertise:</p>
            <p>Test development, Psychometrics, Test Theories, Educational Measurement.</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to state that I do not consider it to be of an acceptable scientific standard, for reasons outlined above.</p>
        </body>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report489957">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.201328.r489957</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Onuh</surname>
                        <given-names>Omale</given-names>
                    </name>
                    <xref ref-type="aff" rid="r489957a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0009-0001-5432-0806</uri>
                </contrib>
                <aff id="r489957a1">
                    <label>1</label>Joseph Sarwuan Tarka University, Makurdi, Nigeria</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>3</day>
                <month>6</month>
                <year>2026</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2026 Onuh O</copyright-statement>
                <copyright-year>2026</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport489957" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.182391.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>Despite these strengths, the manuscript requires substantial revision before it can be considered suitable for indexing. Several methodological, statistical, conceptual, and presentation issues need to be addressed. First, there is a mismatch between the stated objectives and the analyses conducted. While distractor efficiency was included among the psychometric characteristics to be examined, no results or analyses relating to distractor efficiency were presented. The authors should either provide the relevant analyses or revise the objectives accordingly.</p>
            <p> </p>
            <p> The methodology section lacks sufficient detail to allow replication of the study. Important information regarding the procedures used for item analysis, computation of psychometric indices, handling of missing data, criteria for classifying item difficulty and discrimination levels, and statistical software employed is not adequately described. Furthermore, the study compares psychometric characteristics across different examination years without providing evidence that the examination forms are comparable. Since different test forms were administered across years, the absence of equating procedures, anchor items, or measurement invariance analyses raises concerns about the validity of direct comparisons and the interpretation of trends.</p>
            <p> </p>
            <p> There are also several inconsistencies and errors in the reporting of results. For example, some tables refer to the year 2025 instead of 2024, and discrepancies exist between values reported in the narrative and those presented in the tables. The reporting of ANOVA statistics is also inconsistent, with degrees of freedom stated differently in the text and tables. Such errors suggest inadequate proofreading and raise concerns about the accuracy of the analyses.</p>
            <p> </p>
            <p> The discussion section largely repeats the results rather than providing a deeper interpretation of the findings. The authors should engage more critically with the literature and explore possible explanations for the observed fluctuations in item difficulty and discrimination, particularly the notable changes observed in 2024. Issues such as curriculum modifications, examination reforms, changes in candidate preparation, and post-pandemic educational effects could be considered. In addition, the literature review would benefit from greater synthesis and engagement with recent psychometric research, particularly studies focusing on longitudinal assessment monitoring and examination quality assurance.</p>
            <p> </p>
            <p> The manuscript also requires substantial language editing. Numerous grammatical errors, awkward sentence constructions, repetitions, and unclear expressions reduce the readability of the paper. In some sections, sentences appear duplicated or poorly structured, which detracts from the overall quality of the manuscript.</p>
            <p> In conclusion, the study addresses an important topic and has the potential to contribute to the field of educational measurement and assessment. However, the current version contains significant methodological and reporting weaknesses that must be addressed. I therefore recommend major revision. The manuscript may become suitable for publication after the authors carefully address the methodological concerns, correct statistical inconsistencies, strengthen the discussion and literature review, improve reporting transparency, and undertake thorough language editing.</p>
            <p>Is the work clearly and accurately presented and does it cite the current literature?</p>
            <p>Yes</p>
            <p>If applicable, is the statistical analysis and its interpretation appropriate?</p>
            <p>No</p>
            <p>Are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>No</p>
            <p>Is the study design appropriate and is the work technically sound?</p>
            <p>No</p>
            <p>Are the conclusions drawn adequately supported by the results?</p>
            <p>Yes</p>
            <p>Are sufficient details of methods and analysis provided to allow replication by others?</p>
            <p>No</p>
            <p>Reviewer Expertise:</p>
            <p>Educational Measurement and Evaluation, Bias in Psychometrics</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
    </sub-article>
</article>
