Task-Specific Oral Diadochokinetic Timing Variability and Articulation Error Counts in Typically Developing Japanese Preschoolers

Shinsuke Nagami; Shiho Yamasaki; Masashi Shiomi

doi:10.12688/f1000research.183360.1

Brief Report

[version 1; peer review: 1 not approved]

Shinsuke Nagami¹, Shiho Yamasaki², Masashi Shiomi²

PUBLISHED 18 Jun 2026

¹ Department of Speech-Language-Hearing Therapy, School of Rehabilitation Science, Health Sciences University of Hokkaido, Tobetsu, Hokkaido, 061-0293, Japan
² Department of Speech-Language Pathology and Audiology, Faculty of Rehabilitation, Kawasaki University of Medical Welfare, Kurashiki, Okayama, 701-0193, Japan

Shinsuke Nagami
Roles: Conceptualization, Formal Analysis, Methodology, Software, Visualization, Writing – Original Draft Preparation, Writing – Review & Editing

Shiho Yamasaki
Roles: Data Curation, Investigation, Methodology, Resources, Writing – Review & Editing

Masashi Shiomi
Roles: Investigation, Resources, Supervision, Writing – Review & Editing

This article is included in the Japan Institutional Gateway gateway.

Abstract

Background

Oral diadochokinetic tasks using /pa/, /ta/, and /ka/ are often summarized as a single rate or timing-variability measure. If the three tasks do not behave coherently, however, a composite may obscure task-specific associations with speech-sound accuracy.

Methods

This cross-sectional observational analysis included 28 typically developing Japanese-speaking preschool children aged 55 to 78 months. Children completed monosyllabic oral diadochokinetic tasks and a standardized Japanese articulation assessment. Timing variability was quantified as the coefficient of variation (CV) of inter-peak intervals. Participant-level and task-level regression models examined associations with articulation error count, adjusting for age and site.

Results

The cross-task mean coefficient of variation was not detectably associated with articulation error count (coefficient = −0.001, p = .778). Inter-task correlations were negligible (r = −0.155 to −0.041). In task-specific models, /pa/ timing variability showed evidence of a positive association with articulation error count (coefficient = 0.020, p = .009), whereas /ta/ and /ka/ did not show comparable evidence. A task-level interaction model was consistent with task-specific slopes, although the random-intercept mixed model produced a singular-fit warning.

Conclusions

In this small cross-sectional sample, aggregating timing variability across /pa/, /ta/, and /ka/ may obscure task-specific patterns. The results support reporting oral diadochokinetic timing variability by task alongside rate as a hypothesis-generating measurement consideration, but they should be interpreted as exploratory and require replication in larger independent samples.

Keywords

oral diadochokinesis; pediatric speech assessment; articulation; speech timing; preschool children; speech motor assessment; restricted data

Corresponding author: Shinsuke Nagami

Competing interests: No competing interests were disclosed.

Grant information: This work was supported by JSPS KAKENHI (Grant No. JP21K02694; principal investigator: Shiho Yamasaki) and JSPS KAKENHI (Grant No. JP26K13069; principal investigator: Shinsuke Nagami). The funder had no role in the study design, data collection, analysis, interpretation, manuscript preparation, or decision to submit the work for publication.

Copyright: © 2026 Nagami S et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite: Nagami S, Yamasaki S and Shiomi M. Task-Specific Oral Diadochokinetic Timing Variability and Articulation Error Counts in Typically Developing Japanese Preschoolers [version 1; peer review: 1 not approved]. F1000Research 2026, 15:973 (https://doi.org/10.12688/f1000research.183360.1) First published: 18 Jun 2026, 15:973 (https://doi.org/10.12688/f1000research.183360.1) Latest published: 18 Jun 2026, 15:973 (https://doi.org/10.12688/f1000research.183360.1)

Introduction

Oral diadochokinesis (DDK), elicited by rapid repetition of syllables such as /pa/, /ta/, and /ka/, is widely used in speech-language assessment.¹–³ Rate is the most commonly reported metric, but timing variability may capture aspects of rhythmic speech-motor control that rate alone does not.²,³ Pediatric DDK studies have shown that task choice, age, and speech-sound status can affect interpretation.⁴–⁷ In pediatric work, values from the three monosyllabic tasks are sometimes averaged or interpreted as if they reflect one underlying ability. That practice is defensible only if the task measures show enough coherence to justify a composite.

The present study addressed a narrow interpretive question: when /pa/, /ta/, and /ka/ recordings are already available, should timing variability be interpreted by task or as a single mean across tasks? We examined whether coefficient-of-variation measures from typically developing Japanese preschool children were associated with articulation error count, and whether the cross-task mean captured or obscured task-level information.

Methods

Participants and ethics

This was a cross-sectional observational analysis of children recruited from two preschool settings in Japan. Data were collected between October 2022 and April 2023. Thirty-one children were initially enrolled, and three were excluded because usable DDK task data were unavailable for the present acoustic analyses. The final analyzed sample comprised 28 typically developing Japanese-speaking children aged 55 to 78 months. Children were included if they were judged to be typically developing and had no documented speech, language, hearing, neurologic, or structural oral-motor diagnosis at the time of assessment. Exact preschool names, site mapping, dates of birth, and exact assessment dates are withheld from public materials to reduce re-identification risk. No a priori sample-size calculation was performed; this secondary analysis used the available eligible preschool sample with usable DDK data. Sex information was incomplete in the source descriptive records and was not retained in the restricted analysis dataset used for modeling; sex/gender was not used as a model covariate, and the findings should not be interpreted as sex- or gender-specific estimates.

The Institutional Review Board of Kawasaki University of Medical Welfare approved the study (approval No. 21–077; approval date: 29 October 2021). The approved participant-protection materials included written informed consent from parents or guardians before data collection and a child-facing assent/explanation document for the preschool participants.

Speech-sound assessment

Speech-sound accuracy was assessed with the word version of the Japanese Articulation Test.⁸ Children named picture stimuli, and responses were scored by a certified speech-language pathologist for articulation errors. The total articulation error count served as the primary predictor.

DDK recording and acoustic processing

Each child repeated /pa/, /ta/, and /ka/ as quickly and steadily as possible. Speech was recorded with an IC recorder and a video camera with an external microphone, and the right-channel audio track was extracted at 48 kHz. The acoustic pipeline had three steps. First, a 200-Hz low-pass Butterworth filter was applied to derive an amplitude envelope. Second, a locally run OpenAI Whisper-assisted screening step located candidate task segments within longer assessment recordings;⁹ the exact Whisper model and version were not recorded in the processing log, and the model was run locally rather than through a cloud or API service. Third, amplitude peaks were detected using a median absolute deviation-based threshold applied to the envelope. Segment boundaries and retained peaks were visually reviewed in Praat before task-level values were accepted for analysis.¹⁰ Automated DDK extraction has been used in other clinical speech-motor contexts, but the present pipeline was treated as a measurement workflow requiring visual review rather than as a fully automated clinical tool.¹¹,¹²

At least six peaks were required for a task to contribute to variability analyses. When the initial automated detection did not identify enough peaks, a rescue procedure based on root mean square envelope burst detection with relaxed thresholds was applied. The final derived dataset contained task-level rate and timing-variability measures for /pa/, /ta/, and /ka/.
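The peak-picking and minimum-peak rule described above can be sketched roughly as follows. This is an illustrative outline only, not the authors' pipeline: it assumes the amplitude envelope is already available as a list of samples, and the threshold multiplier `k`, the refractory distance `min_gap`, and all function names are hypothetical choices that the text does not specify.

```python
from statistics import median

MIN_PEAKS = 6  # minimum number of peaks stated in the text


def mad_threshold(envelope, k=3.0):
    """Return median + k * MAD of the envelope (k is an assumed multiplier)."""
    med = median(envelope)
    mad = median(abs(x - med) for x in envelope)
    return med + k * mad


def detect_peaks(envelope, k=3.0, min_gap=1):
    """Return indices of local maxima that exceed the MAD-based threshold.

    min_gap (in samples) is an assumed minimum distance between kept peaks.
    """
    thr = mad_threshold(envelope, k)
    peaks = []
    for i in range(1, len(envelope) - 1):
        is_local_max = envelope[i - 1] < envelope[i] >= envelope[i + 1]
        if is_local_max and envelope[i] > thr:
            if not peaks or i - peaks[-1] >= min_gap:
                peaks.append(i)
    return peaks


def enough_peaks(peaks):
    """Check the six-peak minimum before a task enters variability analyses."""
    return len(peaks) >= MIN_PEAKS
```

In a workflow like the one described, a task failing `enough_peaks` would be passed to the rescue procedure, and every retained peak would still need visual review.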

Outcome measures and analysis

For each task, coefficient of variation (CV) was calculated as the standard deviation of inter-peak/inter-response intervals divided by the mean inter-peak/inter-response interval. Derived CV values are ratios, not percentages. The primary cross-task measure was the participant-level arithmetic mean of /pa/, /ta/, and /ka/ CV values. Task-specific analyses examined /pa/, /ta/, and /ka/ CV values separately.
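As a minimal sketch of these definitions, the per-task CV and the cross-task mean could be computed as below. The function names are hypothetical, the peak times are assumed to be in seconds, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def intervals(peak_times):
    """Inter-peak intervals from an ordered list of peak times."""
    return [b - a for a, b in zip(peak_times, peak_times[1:])]


def cv(peak_times):
    """Coefficient of variation of inter-peak intervals, as a ratio."""
    ipi = intervals(peak_times)
    # Sample SD is an assumption; the source does not specify the estimator.
    return stdev(ipi) / mean(ipi)


def cross_task_mean_cv(task_peak_times):
    """Arithmetic mean of the task-specific CVs for one participant."""
    return mean(cv(times) for times in task_peak_times.values())
```

Note that six peaks yield only five intervals, so a CV computed at the minimum rests on very few values.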

Participant-level ordinary least squares models were adjusted for age in months and anonymized site. Additional participant-level models examined rate adjustment, non-normalized variability metrics, and trimmed CV measures. Complete-case denominators for the non-normalized and trimmed-CV sensitivity models were n = 20 and n = 17, respectively. A task-level long-format ordinary least squares model examined the task-by-error-count interaction. The author-controlled analysis script fits participant-level models and task-level interaction/random-intercept models from the restricted derived numerical data using R, lme4, and lmerTest.¹³–¹⁵ The restricted-data repository record contains an analysis-workflow script and session information, but it does not contain participant-level data sufficient to rerun these models publicly. The random-intercept mixed model is retained for transparency, but it produced a singular-fit warning and should not be overinterpreted as evidence for stable between-participant random-intercept variance.
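The structure of the adjusted participant-level model (CV regressed on articulation error count, age in months, and a site indicator) can be illustrated with a small standard-library sketch. This is not the authors' R script: the helper names are hypothetical, the numbers are synthetic values constructed for illustration, and the sketch returns point estimates only, without standard errors or p-values.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(y, columns):
    """Least-squares coefficients for y ~ 1 + columns via the normal equations."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(rows[0])
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * yi for r, yi in zip(rows, y)) for j in range(p)]
    return solve(xtx, xty)


# Synthetic, illustrative data only (not study data).
errors = [0, 1, 2, 3, 4, 5]          # articulation error count
age = [55, 60, 58, 70, 66, 78]       # age in months
site = [0, 1, 0, 1, 1, 0]            # anonymized site indicator
task_cv = [0.05 + 0.01 * e + 0.002 * a + 0.03 * s
           for e, a, s in zip(errors, age, site)]

# Coefficients are ordered: intercept, error count, age, site.
coef = ols(task_cv, [errors, age, site])
```

The second element of `coef` corresponds to the articulation error count coefficient of the kind reported in the Results; fitting the same model once per task gives the task-specific estimates.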

Results

The analyzed sample included 28 children with complete derived values for the primary participant-level and task-level analyses. Age ranged from 55 to 78 months, and articulation error count ranged from 0 to 13. Sensitivity analyses using non-normalized inter-peak interval variability and trimmed CV used smaller complete-case denominators, as described in the Methods.

The cross-task mean CV was not detectably associated with articulation error count in the primary participant-level model (coefficient = −0.001, p = .778; Figure 1). The three task-specific CV measures also showed negligible correlations with one another: /pa/ versus /ta/, r = −0.155; /pa/ versus /ka/, r = −0.048; and /ta/ versus /ka/, r = −0.041 (Figure 3). These results did not support treating the three tasks as interchangeable indicators of a single timing-variability construct in this sample.
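The inter-task coherence check reported here amounts to pairwise Pearson correlations among the three task-specific CV vectors. A minimal sketch follows, assuming one CV value per participant per task held in a dictionary keyed by task; the function names and data layout are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pairwise_correlations(task_cv):
    """Correlation for each pair of tasks, keyed by (task_a, task_b)."""
    return {
        (a, b): pearson(task_cv[a], task_cv[b])
        for a, b in combinations(task_cv, 2)
    }
```

Calling `pairwise_correlations` with keys `"pa"`, `"ta"`, and `"ka"` returns the three off-diagonal values of the kind summarized in the heatmap.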

Figure 1. Articulation-error coefficient forest plot.

The plot compares the unstandardized articulation error count coefficients across the composite model and task-specific models. Horizontal lines indicate 95% confidence intervals.

Task-specific analyses suggested a pattern that was not apparent in the cross-task mean. Higher articulation error count was associated with higher /pa/ timing variability (coefficient = 0.020, p = .009), and this association remained similar after adjustment for /pa/ rate (coefficient = 0.021, p = .008; Figure 2). In contrast, /ta/ and /ka/ timing variability did not show comparable evidence of association with articulation error count. The task-level ordinary least squares interaction model was consistent with different error-count slopes across tasks. In the author-controlled mixed/random-intercept sensitivity model, the task-by-error-count term was statistically detectable (F = 4.75, p = .011), but the model was singular; the result is therefore presented as supportive sensitivity evidence rather than as a standalone primary mixed-effects result.

Figure 2. /pa/ CV partial residual plot.

The plot illustrates the adjusted association between articulation error count and /pa/ timing variability after adjustment for age in months and anonymized site. Public site labels are shown only as anonymized site_a and site_b categories to reduce disclosure of site-stratified participant-level information.

Figure 3. Task CV correlation heatmap.

The heatmap shows Pearson correlations among /pa/, /ta/, and /ka/ coefficient-of-variation measures.

Discussion

The main finding is that /pa/, /ta/, and /ka/ timing variability did not behave as interchangeable measures in this small sample. The cross-task mean did not show a detectable association with articulation error count, whereas /pa/ CV showed evidence of a positive association. This pattern suggests that aggregating across tasks may hide a task-specific association, particularly when inter-task correlations are near zero.

The mechanism remains unresolved. The /pa/ pattern could reflect task-specific speech-motor timing, differences in acoustic detectability, measurement behavior of the envelope-based peak detection pipeline, developmental differences among consonant gestures, or a combination of these.¹⁶–¹⁸ One possible explanation is that bilabial burst timing in /pa/ was more consistently captured by the envelope-based peak-detection workflow than the lingual gestures in /ta/ and /ka/; another is that anterior articulatory timing variability was more closely coupled to the articulation error counts used here. These explanations are speculative. The present data suggest an interpretable pattern, but they do not distinguish among physiological, developmental, and measurement explanations.

Several limitations constrain interpretation. The sample was small, cross-sectional, and restricted to typically developing children from two preschool sites. The findings do not establish diagnostic accuracy, clinical utility, developmental trajectories, or causal mechanisms. The mixed/random-intercept task-level model produced a singular-fit warning, so the task-level interaction should be read as supportive sensitivity evidence alongside the simpler task-specific models, not as a definitive hierarchical-model result. The analyses were reproduced locally from author-controlled de-identified derived numerical values, but the restricted-data repository record does not allow independent rerunning of participant-level or task-level models. Raw recordings are not publicly shared because they may identify preschool children and were not prepared for unrestricted public release. Replication in larger samples, including clinical samples and independent measurement pipelines, is needed.

Conclusions

In this small cross-sectional sample of typically developing Japanese preschoolers, the cross-task mean CV across /pa/, /ta/, and /ka/ was not detectably associated with articulation error count. In task-specific analyses, /pa/ CV showed evidence of a positive association with articulation error count, whereas /ta/ and /ka/ CV did not show comparable evidence. These findings suggest that aggregating DDK timing variability across tasks may obscure task-specific patterns, but the results are hypothesis-generating, exploratory, and require replication before clinical or developmental interpretation.

Ethics and consent

The study was approved by the Institutional Review Board of Kawasaki University of Medical Welfare (approval No. 21–077; approval date: 29 October 2021). The approved participant-protection materials included written informed consent from parents or guardians before data collection and a child-facing assent/explanation document for the preschool participants. No identifiable child recordings, images, videos, site names, dates of birth, assessment dates, or original identifiers are included in this article or repository record.

Data and software availability

Underlying data

The participant-level and participant-task-level derived numerical datasets underlying the reported analyses are restricted. They are not publicly available because the study involved preschool child participants, the source materials derive from speech/audio-video assessments collected under institutional review board approval No. 21–077 (approval date: 29 October 2021) and guardian-consent conditions, and the located ethics, guardian-consent, and child-facing assent/explanation materials do not pre-authorize unrestricted public release or unreviewed external transfer of participant-level derived numerical data. The derived numerical data are also restricted to mitigate re-identification risk through linkage with age, site, speech-profile, or other contextual information. Requests for access to the restricted derived numerical data may be directed to the corresponding author ([email protected]), but access is not guaranteed. Requests will be considered case by case and only if the proposed use appears compatible with the original ethics, guardian-consent, and child-facing assent/explanation materials; before any data transfer, the authors would confirm with the responsible ethics committee whether the proposed transfer is permissible and whether a data use agreement or other review is required. Requests should include the proposed research purpose, requested variables, evidence of ethics approval or exemption, data-security plan, and agreement not to attempt re-identification or redistribute the data. Raw audio/video recordings, original participant identifiers, original site identities, dates of birth, assessment dates, local file names, and linkage keys are not publicly available and are not available for unrestricted redistribution.

Software and reporting materials. Zenodo: Task-specific oral diadochokinetic timing variability and articulation error counts in typically developing Japanese preschoolers: restricted-data code and reporting materials [Software and Documentation]. DOI: https://doi.org/10.5281/zenodo.20361773, cite as Nagami et al.¹⁹

The restricted-data repository record contains code/yamasaki2_f1000_analysis.R, a restricted-data analysis-workflow script; environment/R_REQUIREMENTS.md, package requirements; environment/sessionInfo_after_analysis.txt, session information from the author-controlled restricted rerun; outputs/, aggregate descriptive tables, model summaries, model coefficients, VIF tables, task-correlation tables, and task-interaction outputs from the author-controlled restricted rerun; docs/STROBE_checklist_cross_sectional_sections.md, the completed reporting checklist; and LICENSE_CODE/LICENSE_DOCS license files. The analysis-workflow script documents the intended analysis and exits successfully with an explanatory message when the restricted input CSV files are absent. The included output CSV files provide aggregate/model-output evidence from the author-controlled restricted rerun but do not contain participant-level or participant-task-level rows. If the restricted participant-level and participant-task-level CSV files are supplied in an approved author-controlled environment, the script fits the reported participant-level and task-level models. The author-controlled restricted rerun used R version 4.6.0 (2026-04-24); package versions for lme4, lmerTest, broom, ggplot2, tidyverse, and other dependencies are listed in environment/sessionInfo_after_analysis.txt. Code is available under the MIT License; documentation and reporting materials are available under CC BY 4.0. These licenses do not apply to restricted derived data, raw recordings, original identifiers, site identities, dates, local file names, or linkage keys.

Restricted source data

Raw audio and video recordings, original participant identifiers, original site identities, dates of birth, assessment dates, local file names, and linkage keys are not publicly available. These source materials contain potentially identifiable recordings and contextual information from preschool children, and were collected under consent conditions that did not permit unrestricted public sharing of identifiable child audio/video data. Under the current consent and ethics approval, raw recordings must not be publicly deposited or redistributed unless additional ethics and consent permissions are confirmed.

Reporting guidelines

A completed STROBE checklist for this cross-sectional observational study is included in the restricted-data repository record as docs/STROBE_checklist_cross_sectional_sections.md.

Acknowledgments

The authors thank the children, parents, and staff at the participating preschools for their cooperation. OpenAI ChatGPT (GPT-5.5; accessed May-June 2026) was used only to check English wording and nuance during manuscript preparation, because the authors are not native English speakers. It was not used to generate research data, perform analyses, create figures, or determine scientific interpretations. The authors reviewed and approved all final wording.

References

1. Fletcher SG: Time-by-count measurement of diadochokinetic syllable rate. J. Speech Hear. Res. 1972; 15(4): 763–770.
2. Kent RD, Kent JF, Rosenbek JC: Maximum performance tests of speech production. J. Speech Hear. Disord. 1987; 52(4): 367–387.
3. Kent RD, Kim Y, Chen LM: Oral and laryngeal diadochokinesis across the life span: a scoping review of methods, reference data, and clinical applications. J. Speech Lang. Hear. Res. 2022; 65(2): 574–623.
4. Icht M, Ben-David BM: Evaluating rate and accuracy of real word vs. non-word diadochokinetic productions from childhood to early adulthood in Hebrew speakers. J. Commun. Disord. 2021; 92: 106112.
5. Gao R, Yuen JTW, Li XX, et al.: Oral diadochokinetic performance on perceptual and acoustic measures for typically developing Cantonese-speaking preschool children. J. Speech Lang. Hear. Res. 2023; 66(5): 1445–1466.
6. Williams P, Stackhouse J: Diadochokinetic skills: normal and atypical performance in children aged 3-5 years. Int. J. Lang. Commun. Disord. 1998; 33(suppl): 481–486.
7. Ha S: Oral diadochokinetic production in children with typical speech development and speech-sound disorders. Int. J. Lang. Commun. Disord. 2023; 58(5): 1783–1798.
8. Koon Rinsho Kenkyukai, editors: Shinpan Koon Kensa [Japanese Articulation Test, revised edition]. Chiba Test Center; 2010. In Japanese.
9. Radford A, Kim JW, Xu T, et al.: Robust speech recognition via large-scale weak supervision. Proceedings of the 40th International Conference on Machine Learning. PMLR; 2023; 28492–28518.
10. Boersma P, van Heuven V: Praat, a system for doing phonetics by computer. Glot Int. 2001; 5(9/10): 341–347.
11. Rong P: Automated acoustic analysis of oral diadochokinesis to assess bulbar motor involvement in amyotrophic lateral sclerosis. J. Speech Lang. Hear. Res. 2020; 63(1): 59–73.
12. Tanchip C, Guarin DL, McKinlay S, et al.: Validating automatic diadochokinesis analysis methods across dysarthria severity and syllable task in amyotrophic lateral sclerosis. J. Speech Lang. Hear. Res. 2022; 65(3): 940–953.
13. R Core Team: R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing; 2026.
14. Bates D, Maechler M, Bolker B, et al.: Fitting linear mixed-effects models using lme4. J. Stat. Softw. 2015; 67(1): 1–48.
15. Kuznetsova A, Brockhoff PB, Christensen RHB: lmerTest package: tests in linear mixed effects models. J. Stat. Softw. 2017; 82(13): 1–26.
16. Green JR, Moore CA, Higashikawa M, et al.: The physiologic development of speech motor control: lip and jaw coordination. J. Speech Lang. Hear. Res. 2000; 43(1): 239–255.
17. Smith A, Zelaznik HN: Development of functional synergies for speech motor coordination in childhood and adolescence. Dev. Psychobiol. 2004; 45(1): 22–33.
18. McLeod S, Crowe K: Children's consonant acquisition in 27 languages: a cross-linguistic review. Am. J. Speech Lang. Pathol. 2018; 27(4): 1546–1571.
19. Nagami S, Yamasaki S, Shiomi M: Task-specific oral diadochokinetic timing variability and articulation error counts in typically developing Japanese preschoolers: restricted-data code and reporting materials [Software and Documentation]. Zenodo. 2026.

Open Peer Review

Reviewer Report 08 Jul 2026

Raymond Kent, University of Wisconsin-Madison, Madison, Wisconsin, USA

Not Approved

https://doi.org/10.5256/f1000research.202404.r495795

The basic question addressed in this report is of interest in the clinical assessment of children’s speech production. However, confidence in the results of this study is limited by problems in methodology, as described in the following.

The independent variable was a measure of speech sound accuracy determined by the Japanese Articulation Test. The value of this measure is clouded by several factors.

First, the articulation test may not be familiar to readers from different language backgrounds. Therefore, it is essential to summarize features of this test, including its phonetic composition (vowels and consonants), details on the elicitation procedure and environment, and availability of normative data.

Second, judging from the data displayed in Figure 2, the articulation test scores are saturated at the low end of the scores, with only a few scores greater than 4. This pattern raises concerns about the sensitivity of the test as a criterion index. The clustering of data at the low end of test scores reduces the value of this test for its intended purposes in this study. The lines of fit in Figure 2 are of questionable validity because of the data distribution.

Third, there would be more confidence in the test results if more than one rater was used or if reliability data were given for the one rater. Qualifying the rater as a certified speech-language pathologist does not guarantee reliability or validity. The results of this study hinge critically on the measure of speech sound accuracy, making it essential to establish the suitability of this measure.

Several studies, including some cited in this paper, comment on methods to ensure optimum performance of the DDK task by children. It is not clear from this report if steps were taken in this regard. Although the task is relatively simple, performance can be affected by instruction, practice, motivation, reward, and other factors. The authors state that, “Each child repeated /pa/, /ta/, and /ka/ as quickly and steadily as possible.” This statement severely underestimates the vulnerability of the DDK task to the factors just noted and raises the concern that the authors are not familiar with the assessment of children’s speech.

If only six peaks were sufficient for analysis of temporal features, then the data could reflect little more than one second of speech performance, assuming a DDK rate of 4 to 6 syllables/second, which is typical for children of this age. But data were included for even fewer peaks through use of a rescue procedure. How often was such rescue used? What was the actual minimum for number of peaks? The method of DDK analysis is barely acceptable to ensure confidence in the statistics. The coefficient of variation loses value as the number of values is reduced. The report should summarize the number of peaks analyzed for individual children. Another concern is that the authors acknowledge possible problems in the process of envelope-based peak detection. Was an effort made to determine the reliability of this process across the three different syllables?

Given these limitations, I cannot recommend indexing of this paper in its current form.

Is the work clearly and accurately presented and does it cite the current literature? Yes

Is the study design appropriate and is the work technically sound? Partly

Are sufficient details of methods and analysis provided to allow replication by others? Partly

If applicable, is the statistical analysis and its interpretation appropriate? Partly

Are all the source data underlying the results available to ensure full reproducibility? Partly

Are the conclusions drawn adequately supported by the results? Partly

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Speech development in children, acoustic analysis of speech, anatomy and physiology of speech production, speech disorders

I confirm that I have read this submission and believe that I have an appropriate level of expertise to state that I do not consider it to be of an acceptable scientific standard, for reasons outlined above.

Raymond Kent, University of Wisconsin-Madison, Madison, USA

Reviewer Report

08 Jul 2026 | for Version 1

Raymond Kent, University of Wisconsin-Madison, Madison, Wisconsin, USA

Not Approved

[1] Fletcher SG: Time-by-count measurement of diadochokinetic syllable rate. J. Speech Hear. Res. 1972; 15(4): 763–770.

[2] Kent RD, Kent JF, Rosenbek JC: Maximum performance tests of speech production. J. Speech Hear. Disord. 1987; 52(4): 367–387.

[3] Kent RD, Kim Y, Chen LM: Oral and laryngeal diadochokinesis across the life span: a scoping review of methods, reference data, and clinical applications. J. Speech Lang. Hear. Res. 2022; 65(2): 574–623.

[4] Icht M, Ben-David BM: Evaluating rate and accuracy of real word vs. non-word diadochokinetic productions from childhood to early adulthood in Hebrew speakers. J. Commun. Disord. 2021; 92: 106112.

[5] Gao R, Yuen JTW, Li XX, et al.: Oral diadochokinetic performance on perceptual and acoustic measures for typically developing Cantonese-speaking preschool children. J. Speech Lang. Hear. Res. 2023; 66(5): 1445–1466.

[6] Williams P, Stackhouse J: Diadochokinetic skills: normal and atypical performance in children aged 3-5 years. Int. J. Lang. Commun. Disord. 1998; 33(suppl): 481–486.

[7] Ha S: Oral diadochokinetic production in children with typical speech development and speech-sound disorders. Int. J. Lang. Commun. Disord. 2023; 58(5): 1783–1798.

[8] Kenkyukai KR, editors. Shinpan Koon Kensa [Japanese Articulation Test, revised edition]. Chiba Test Center; 2010. In Japanese.

[9] Radford A, Kim JW, Xu T, et al.: Robust speech recognition via large-scale weak supervision. Proceedings of the 40th International Conference on Machine Learning. PMLR; 2023; 28492–28518.

[10] Boersma P, van Heuven V: Praat, a system for doing phonetics by computer. Glot Int. 2001; 5(9/10): 341–347.

[11] Rong P: Automated acoustic analysis of oral diadochokinesis to assess bulbar motor involvement in amyotrophic lateral sclerosis. J. Speech Lang. Hear. Res. 2020; 63(1): 59–73.

[12] Tanchip C, Guarin DL, McKinlay S, et al.: Validating automatic diadochokinesis analysis methods across dysarthria severity and syllable task in amyotrophic lateral sclerosis. J. Speech Lang. Hear. Res. 2022; 65(3): 940–953.

[13] R Core Team: R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing; 2026.

[14] Bates D, Maechler M, Bolker B, et al.: Fitting linear mixed-effects models using lme4. J. Stat. Softw. 2015; 67(1): 1–48.

[15] Kuznetsova A, Brockhoff PB, Christensen RHB: lmerTest package: tests in linear mixed effects models. J. Stat. Softw. 2017; 82(13): 1–26.

[16] Green JR, Moore CA, Higashikawa M, et al.: The physiologic development of speech motor control: lip and jaw coordination. J. Speech Lang. Hear. Res. 2000; 43(1): 239–255.

[17] Smith A, Zelaznik HN: Development of functional synergies for speech motor coordination in childhood and adolescence. Dev. Psychobiol. 2004; 45(1): 22–33.

[18] McLeod S, Crowe K: Children's consonant acquisition in 27 languages: a cross-linguistic review. Am. J. Speech Lang. Pathol. 2018; 27(4): 1546–1571.

[19] Nagami S, Yamasaki S, Shiomi M: Task-specific oral diadochokinetic timing variability and articulation error counts in typically developing Japanese preschoolers: restricted-data code and reporting materials [Software and Documentation]. Zenodo. 2026.
