Convergent validity and reliability of a novel repeated agility protocol in junior rugby league players [version 2; peer review: 2 approved]

Background: Rugby league involves repeated, complex, and high-intensity change-of-direction (COD) movements, yet no existing test protocols specifically assess these multiple physical fitness components simultaneously. Thus, the current study examined the convergent validity of a repeated Illinois Agility (RIA) protocol with the repeated T-agility protocol, and the repeatability of the RIA protocol in adolescent Rugby League players. Furthermore, aerobic capacity and anaerobic and COD performance were assessed to determine whether these physical qualities were important contributors to the RIA protocol.
Methods: Twenty-two junior Rugby League players completed 4 sessions, each separated by 7 days. Initially, physical fitness characteristics at baseline (i.e., Beep test, countermovement jump, 30-m sprint, single-effort COD and repeated sprint ability [RSA]) were assessed. The second session involved a familiarisation of the RIA and repeated T-agility test (RTT) protocols. During the third and fourth sessions, participants completed the RIA and RTT protocols in a randomised, counterbalanced design to examine the validity and test-retest reliability of these protocols.
Results: For convergent validity, significant correlations were identified between RIA and RTT performances (r > 0.80; p < 0.05). For contributors to RIA performance, significant correlations were identified between all baseline fitness characteristics and RIA (r > 0.71; p < 0.05). Reliability of the RIA protocol was near perfect, with excellent intra-class correlation coefficients (0.87-0.97), good ratio limits of agreement (×/÷ 1.05-1.06) and low coefficients of variation (1.8-2.0%).
Conclusions: The current study has demonstrated the RIA to be a simple, valid and reliable field test for RL athletes that can provide coaches with information about their team’s ability to sustain high intensity, multi-directional running efforts.


Introduction
Rugby League (RL) is an intermittent, invasion-type game that requires players to complete repetitive bursts of sprinting and change-of-direction (COD) movements in response to the dynamic constraints of the game 1. Traditionally, the physical component of COD has been assessed using single-bout protocols, with the COD performance measure considered a strong determinant of match performance in team sports 2-5. However, players in team sports such as RL perform repeated bursts of COD movements to defend or evade defenders during a game 6. Consequently, performance of repeated-COD activities with brief periods of rest may be an important performance component for RL athletes.
As a monitoring tool, the reliability of repeated-COD protocols has been explored in a variety of sports 7-9. Results from a study examining a Repeated T-Test (RTT) agility protocol in soccer players significantly correlated with anaerobic measures of power, speed and repeat-sprint ability (RSA), with excellent test-retest reliability 9. While a good indicator of COD performance, the Agility T-Test consists of a linear sprint, lateral shuffles and a backwards run, which are movements that are sporadic in RL 9. In fact, RL players change direction frequently and utilise evading movements 10 that are not replicated by the Agility T-Test. Therefore, the Illinois Agility test may be more reflective of the evading activities undertaken in RL, as the protocol includes vigorous changes in direction by weaving in and out of cones 11. Furthermore, the majority of studies examining the reliability of repeated-COD protocols have been conducted in adult athletes, despite a previous study reporting lower reliability in younger athletes 12,13. Thus, research examining the reliability of repeated-COD protocols in adolescent athletes is warranted.
In addition to reliability, there have been limited investigations into validating repeated-COD protocols. Indeed, the RTT protocol was significantly correlated with anaerobic measures of power, speed and repeat-sprint ability (RSA) 9, suggesting that these physical qualities were pertinent for repeated-COD performance. However, separate repeated-COD protocols have yet to be compared, which is essential as each COD protocol exhibits distinct movement demands that may be suitable for specific sports. To date, no studies have examined the validity and reliability of a repeated Illinois Agility (RIA) protocol. Reporting these properties would be essential for widespread usability in RL 4.
The aims of this study were three-fold: 1) to examine the convergent validity of a novel RIA test with the repeated Agility T-test protocol (i.e. RTT); 2) to identify contributors of RIA performance by correlating its measures to speed, anaerobic capacity and RSA; and 3) to determine the test-retest reliability of the RIA protocol. It was hypothesised that the RIA would demonstrate acceptable convergent validity and reliability as a repeated-COD test, with relationships identified between results of the RIA and the RTT, aerobic capacity, speed, and anaerobic capacity protocols. Examining the convergent validity of the RIA protocol will determine whether this novel assessment exhibits similar attributes to a standardised COD protocol (i.e., RTT). In addition, the reliability of the RIA will determine whether this test can be reliably adopted in practice by accounting for the inherent error of the test across repeated measurements. The quality of these psychometric properties will provide coaches with a tool to assist in monitoring and training RL athletes as well as in talent development and identification.

Research design
The current study was a randomised, counter-balanced study conducted across five sessions from June 2018 to August 2018 (Figure 1). During the first session, the participants completed a Multistage Shuttle test to determine predicted maximal aerobic capacity (VO2max) 14. The second session was utilised to obtain baseline assessments of speed (30-metre sprint), COD (Illinois Agility test, Agility T-Test) and repeat-sprint ability (RSA). The third session familiarised participants with the RTT and RIA tests. During the fourth and fifth sessions, participants undertook both the RIA and RTT, in randomised order, with at least 15 minutes of recovery between each protocol.
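The randomised, counter-balanced ordering of the two protocols can be illustrated with a short sketch. This is not the authors' allocation procedure: the function name, the fixed seed, the even split of participants and the rule that each participant's order is reversed between the fourth and fifth sessions are all assumptions made for the example.

```python
import random


def counterbalanced_orders(participant_ids, seed=0):
    """Assign a protocol order for sessions 4 and 5 to each participant.

    Hypothetical helper: after shuffling, the first half of the participants
    start with RIA and the second half with RTT (assumed split), and the
    order is reversed in the following session (assumed rule).
    """
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    schedule = {}
    for index, pid in enumerate(ids):
        first = ("RIA", "RTT") if index < half else ("RTT", "RIA")
        schedule[pid] = {"session_4": first, "session_5": first[::-1]}
    return schedule


# 22 participants, as in the study sample.
schedule = counterbalanced_orders(range(1, 23))
for pid in sorted(schedule)[:3]:
    print(pid, schedule[pid])
```

Splitting the shuffled list in half keeps the number of participants starting with each protocol equal, which is the property counterbalancing is meant to provide.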
At the start of each session, muscle soreness rating was collected prior to the standardised warm-up, using a 1-10 visual analogue scale, with 1 and 10 indicating 'no soreness' and 'very, very sore', respectively 15. Participants then performed a standardised warm-up consisting of jogging for 3-5 minutes and 15-metre sprints at 50%, 70% and 100% of maximal effort. A countermovement jump (CMJ) test (Yard Stick, Swift Performance, Queensland, Australia) was then conducted to assess leg power 16, which was also repeated before the second COD test to confirm recovery between the repeated-COD tests.

Amendments from Version 1
The authors thank the reviewers for their comments and input on our manuscript. The major changes that have been made based on the reviewers' comments are provided below.
Converted the terms "agility" to "change-of-direction (COD)" throughout the manuscript;
Expanded on previous work on the validity and reliability of single-effort COD protocols in the Introduction and Discussion sections;
Provided more information on the design of the research and the description of a number of protocols to improve clarity in the Methods section;
Highlighted aerobic capacity as a potential contributor to the repeated Illinois Agility (RIA) performance measures in the Discussion;
Clarified that the measures of the repeated T-agility test (RTT) were used to assess the convergent validity of the RIA protocol, whilst anaerobic capacity, aerobic capacity and COD capability were assessed as contributors to the RIA protocol.

Participants
In total, 22 adolescent, male, RL players (age 16.2 ± 0.8 yrs; body mass 80.7 ± 16.3 kg; height 1.77 ± 0.7 m) were recruited via word of mouth, flyers and liaison with sporting teams. The participants were part of the School of Athletic Excellence program, which selects and prepares students to compete at state and national competitions. The participants were injury-free with at least 2 years of RL experience. According to an a priori calculation 17 , a sample size of 22 was sufficient to identify significant differences in repeated-COD performance (power of 80%, alpha level of 0.05). Participants were instructed to avoid strenuous physical activity and caffeine for up to 12 hours before each testing session. All protocols were approved by the Institutional Human Research Ethics Committee and written informed consent was received from the participants and their parent/guardian prior to partaking in this study (Approval number H7248).

Multistage shuttle test
For the Multistage Shuttle test, participants ran back and forth on a 20-m indoor court in time with a series of audio signals 14. The time between audio signals progressively decreased during the test, resulting in an increased effort and running speed for athletes each minute. Predicted VO2max was estimated based on the level completed, using a previously developed regression equation 14.

Countermovement jump test
The countermovement jump was measured with a vertical jump apparatus, based on 1-cm increments, with the units of measure reported in cm (Yard Stick, Swift Performance, Queensland, Australia). To ensure standardisation of the countermovement jump test, participants were instructed to draw their arms backwards during the eccentric phase, then swing the arms forward during the concentric phase to gain momentum and maximise the stretch-shortening cycle mechanics 18. The participants attempted three countermovement jumps, with approximately 30-60 seconds of rest in-between, with the highest jump reported.

30-m Sprint and Agility protocols
Assessment of speed was achieved by completing 30-m maximal sprints. The Agility T-test protocol was set up within a 10-m x 10-m figure-T course (Figure 2A) 19. The Illinois Agility protocol consisted of a 10-m x 5-m course (Figure 2B) 4,20. To ensure protocol familiarity, the participants completed three trials at sub-maximal effort followed by one final maximal trial, with each trial interspersed by two minutes of recovery. Trial completion times were recorded using an electronic timing gate system (Speedlight Timing Gates, Swift Performance, Australia) positioned at the start/finishing line, and reported in seconds. The fastest time was used for later analysis.

Repeat Sprint and Agility Protocols
The RSA, RTT and RIA protocols were completed by repeating the previously described protocols (i.e. 30-m sprint, T-test and Illinois Agility, respectively) across 6 repetitions with varying recovery periods in-between each repetition. Specifically, each repetition within the RSA, RTT and RIA was separated by 20-, 35- and 60-second recovery, respectively, with work-to-rest ratios of approximately 1:3 8,9. The participant's instantaneous heart rate (HR, Polar Heart Rate Monitor, Polar H10, Finland) and rating of perceived exertion (RPE, Borg category scale 1-10) were collected at the completion of each repetition of the RSA, RTT and RIA protocols. The maximum and average HR and RPE values were then reported from the 6 repetitions 21. The following parameters were also calculated for each repeated agility protocol: total time (TT) of 6 cycles, best cycle time (BT), the average cycle time (AT) and fatigue index (FI) 8. FI was calculated as follows 9 :
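As an illustration of these derived measures, the sketch below computes TT, BT, AT and a fatigue index from six hypothetical repetition times. TT, BT and AT follow directly from the definitions above. The fatigue index is written as the percentage decrement, 100 × TT / (6 × BT) − 100, which is an assumption based on a form common in the repeated-sprint literature and may not match the equation cited above. The function name and the example times are also invented for the example.

```python
def repeated_cod_summary(times_s):
    """Summarise one repeated-COD trial from its repetition times (seconds)."""
    n = len(times_s)
    total_time = sum(times_s)       # TT: sum of all repetitions
    best_time = min(times_s)        # BT: fastest repetition
    average_time = total_time / n   # AT: mean repetition time
    # ASSUMPTION: fatigue index as percentage decrement relative to the
    # "ideal" total time (n repetitions all performed at the best time).
    fatigue_index = 100.0 * total_time / (n * best_time) - 100.0
    return {"TT": total_time, "BT": best_time, "AT": average_time, "FI": fatigue_index}


# Hypothetical repetition times for one participant (not study data).
example = repeated_cod_summary([16.0, 16.4, 16.8, 17.0, 17.2, 17.4])
print(example)
```

Because this form of FI depends on both TT and BT, small changes in the single best repetition shift the index noticeably, which is one reason such indices tend to be less stable than the time-derived measures.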

Statistical analysis
Data were analysed using statistical software (IBM SPSS version 25, Chicago, Illinois) and reported as mean ± standard deviation. Normality of the data was assessed using the Kolmogorov-Smirnov statistic. Convergent validity of the repeated-COD protocols was identified via Pearson's product moment correlation coefficients for RTT and RIA measures (i.e., TT, BT, AT and FI), and construct validity with aerobic capacity, leg power, speed and COD variables (i.e., VO2max, CMJ, 30-m sprint time, T-Test and Illinois Agility, respectively) was assessed to identify contributors to the RIA protocol. The cut-off for acceptable convergent validity and contributors to the RIA protocol was established when the association was statistically significant with an r-value of ≥ 0.70 22,23. Reliability of the repeated-COD measures was determined via a paired t-test, intraclass correlation coefficients (ICC, SPSS 2-way mixed, 95% confidence intervals), coefficient of variation (CV, 95% confidence intervals) and systematic bias/ratio with 95% limits of agreement (LOA) 24. Where significant relationships existed between the mean difference and average of test-retest values (i.e. heteroscedastic errors), variables were transformed (natural logarithm) prior to the calculation of measurement bias/ratio ×/÷ ratio LOA 25. The level of significance for all analyses was set at 0.05. Finally, effect size (ES; Cohen's d) with 95% CI was used to calculate the magnitude of differences in muscle soreness and CMJ measures between RIA and RTT protocols to determine whether the recovery periods were appropriate. The ES classifications were set as small, moderate and large with values of 0.2, 0.5 and 0.8, respectively (Cohen, 1988).
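Some of these statistics can be sketched in a few lines. The example below is a minimal illustration, not the SPSS analysis used in the study: it computes a Pearson correlation and checks it against the 0.70 cut-off, and it derives the bias ratio, the ×/÷ 95% ratio limits of agreement and a within-subject CV from log-transformed test-retest pairs. The specific formulas (agreement ratio as exp(1.96 × SD of the log differences), CV from the typical error of the log differences), the use of the absolute value of r, the function names and the data are all assumptions for the example.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson product moment correlation between two equal-length lists."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ratio_limits_of_agreement(test, retest):
    """Bias ratio and x/÷ 95% agreement ratio from log-transformed pairs."""
    diffs = [math.log(a) - math.log(b) for a, b in zip(test, retest)]
    bias_ratio = math.exp(statistics.mean(diffs))
    agreement_ratio = math.exp(1.96 * statistics.stdev(diffs))
    return bias_ratio, agreement_ratio


def within_subject_cv(test, retest):
    """Within-subject CV (%) from the typical error of the log differences."""
    diffs = [math.log(a) - math.log(b) for a, b in zip(test, retest)]
    typical_error = statistics.stdev(diffs) / math.sqrt(2)
    return 100.0 * (math.exp(typical_error) - 1.0)


# Hypothetical total times (s) for five athletes on two occasions.
session_a = [101.2, 98.7, 104.5, 99.9, 102.3]
session_b = [100.4, 99.5, 103.8, 101.0, 101.6]

r = pearson_r(session_a, session_b)
bias, ratio = ratio_limits_of_agreement(session_a, session_b)
print(f"r = {r:.2f}, meets 0.70 cut-off: {abs(r) >= 0.70}")
print(f"bias ratio = {bias:.3f}, x/÷ {ratio:.3f}")
print(f"CV = {within_subject_cv(session_a, session_b):.1f}%")
```

Working on the log scale expresses agreement as a ratio, so the limits read as a multiplicative band around the bias rather than an absolute range in seconds.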

Results
For convergent validity, significant correlations were identified between RIA and most RTT variables (Table 3).
No significant differences were found for muscle soreness (p = 0.10) and CMJ performance (p = 0.80) between the testing sessions.

Discussion
This study showed that the RIA and RTT protocols were strongly correlated with each other, particularly with respect to the time-derived measures (BT, TT and AT). In addition, strong correlations were identified between the time-derived measures of RIA with VO2max, CMJ and 30-m sprint performance. Excellent test-retest reliability was evident for the time-derived, perceptual and physiological measures of the RIA protocol, although the reliability of FI was questionable. The current findings support the RIA as a reliable and valid assessment of COD and fitness in young RL players.
The strong correlations of the time-derived measures (BT, TT and AT) between the RIA and RTT protocols highlighted that the RIA protocol was a valid assessment of repeated-COD performance, but with movement demands more representative of RL. In addition, the TT and BT of the RIA were strongly associated with the TT and BT of the RSA, indicating that the ability to maintain linear speed would result in superior performances in the RIA protocol, possibly due to similar metabolic demands 7. Comparable findings were reported by Fessi, Makni 9, with strong correlations identified between the BT and TT of their repeated agility protocol and RSA protocols in 45 team-sport athletes. The comparable measures between RIA, RTT and RSA suggest that anaerobic fitness, in conjunction with efficient recovery dynamics during short periods of rest in-between explosive activities, are essential qualities for optimal performance in an RIA protocol. Collectively, our results and others 7,9 suggest that performance of repeated-COD relies heavily upon the anaerobic system, a metabolic pathway predominant in RL 27.

Table 3. Test-retest results, intra-class correlation coefficients (ICC, 95% confidence interval (CI)), measurement bias/ratio (log-transformed data) (×/÷ 95% ratio limits of agreement (ratio-LOA)) and within-subject coefficient of variation (95% CI) of the repeated Illinois Agility (RIA) and T-test (RTT) protocols.

The current study also identified strong test-retest reliability for time-derived measures (i.e., BT, TT and AT) of the RIA, with minimal measurement error. However, the measurement error was substantially higher for FI, confirming previous studies that reported substantially stronger reliability measures for BT, TT and AT compared to that of FI from various repeated-COD protocols 7,8,28. It has been suggested that FI may exhibit weaker reproducibility as the measure is multifactorial and dependent on the stability of other variables (i.e., TT and BT) 7,29. Subsequently, we, and others 7,8,28,29, recommend that time-derived measures be primarily evaluated during repeated-COD protocols.
Another novelty of the current study was the reliability of the psychophysiological responses during both RIA and RTT protocols. The test-retest reliability values for HR and RPE ranged between questionable-to-excellent classifications according to ICC scores for both RIA and RTT. However, distinctly greater measurement error and bias were observed for RPE when compared to HR measures for both RIA and RTT. These findings were similar to previous studies with poorer reliability for RPE than HR measures during various running protocols 30-32. It has been postulated that HR has better stability across days given that it is an objective measure, compared to the highly subjective RPE 33. It has also been reported that participants' prior knowledge of the number of sprints during repeated sprint-type protocols may affect results due to pacing 34. Accordingly, HR measures may be a better physiological indicator for monitoring exercise-induced stress during repeated-COD protocols.
An additional, yet essential, finding of this study was the relationship between baseline characteristics and performance measures from the repeated-COD tests. Measures of CMJ, best-effort speed and best-effort COD performance correlated significantly with the time-derived variables of the RIA. These relationships indicated that lower limb power, linear speed and COD capabilities were contributing factors to successful repeated-COD performances. Our findings aligned with those of Haj-Sassi, Dardouri 8, who reported strong correlations between measures of jump performance and repeated-COD performance with an Agility T-test protocol. Similar findings were also reported by previous studies with muscular strength, and linear and COD speed considered strong contributors of COD performance 35,36. The significance of this finding attests to lower limb power production being a critical component of repeated-COD performance, especially within the RIA.
Finally, the current study identified significant correlations between VO2max and RIA performance measures. These findings are similar to previous studies using various repeated-COD protocols 28,29 as well as RSA protocols 37-39. Measures of VO2max have been considered essential for repeated-sprint type protocols, due to muscular reoxygenation rate 8,40, optimal capacity to remove and buffer hydrogen ions within working muscles 41 and efficient replenishment of phosphagen stores 42. The findings of the present study suggest that aerobic capacity is a strong contributor to superior repeated-COD efforts, further highlighting the need to optimise recovery capacities between high-intensity bouts for RL athletes.
In conclusion, the RIA protocol exhibited moderate-to-excellent test-retest reliability and low measurement error for the majority of time-derived measures and psychophysiological measures, and questionable reliability for FI. Further, the RIA protocol showed strong correlations with the RTT protocol, demonstrating that the RIA protocol provided a valid measure of repeated-COD performance. Finally, this study demonstrated that repeated-COD performances in the RIA rely upon contributions from both anaerobic and aerobic systems, indicating that the qualities required for optimal RIA performance may be representative of the physical demands in RL. The RIA protocol may provide practitioners with a simple, yet effective, monitoring tool to quantify athletes' ability to generate and sustain multi-directional efforts, and their ability to recover during intermittent activities.

Open Peer Review
such extensive consideration to the previous reviewer suggestions and making relevant changes where applicable. The manuscript has been strengthened, and I only have some further minor comments for consideration:
1. Abstract - thank you for adding "Beep test", but for consistency with the article text, consider changing this to "Multistage Shuttle test".
2. Introduction, paragraph 4, second aim - is RSA meant to be aerobic capacity here? In either case, aerobic capacity needs to be included in this aim specifically alongside the other fitness attributes.
3. Methods, 30-m sprint and agility protocols - be sure to check headings as well given previous suggestions on use of the term "agility". Also in this section, please indicate if the position of the timing gates was kept consistent across all participants with mention of the height (above ground) and distance apart available.
4. Statistical analysis - still use of "construct validity" on the fourth line in this paragraph as per previous revisions.
5. Discussion - there is a random equals sign on the last line of the fifth paragraph.

Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Exercise and sport science
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Version 1
Reviewer Report 03 August 2021 https://doi.org/10.5256/f1000research.25532.r90678 © 2021 Scanlan A. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Aaron T. Scanlan
Human Exercise and Training Laboratory, School of Health, Medical and Applied Sciences, Central Queensland University, Rockhampton, Qld, Australia
Thank you for the opportunity to review this manuscript. A logical study with potential practical impact was presented exploring the validity and reliability of a repeated change-of-direction speed test suited to rugby league. This aspect of fitness testing is lacking in the literature and of use for end-users, so this manuscript has merit for publication. I do however have some suggestions and queries listed below that will help strengthen some aspects of the work that should be addressed by the authors:
1. Abstract, Methods - indicate whether you are referring to the multi-stage fitness test (i.e., Beep test) here or if just an assessment of multi-stage fitness was performed.
2. Abstract - some justification of using the repeated T-agility test as the other standard repeated change-of-direction performance to assess convergent agility of the repeated Illinois Agility test is needed in the Abstract.
3. Abstract - the focus of exploring the contribution of different fitness attributes to repeated Illinois Agility test performance is not made in the Abstract. Consider including this as a secondary objective and/or making it clear why this analysis is needed.
4. Abstract, Conclusions - make it clear that this test is useful for Rugby League coaches specifically, and change "athlete" to plural form or indicate "team" here instead.
5. Introduction, opening paragraph - in this paragraph you identify "agility" and hint that it involves physical and cognitive components. So in essence, you are examining change-of-direction speed (the physical component) rather than "agility" per se. Consider making this clear and using the term "change-of-direction speed" or "change-of-direction performance" thereafter when referring to tests and attributes that are purely physical without the cognitive component.
6. Introduction, opening paragraph - stronger rationale is needed justifying the inclusion of the Agility T-test and Illinois Agility test in rugby league. Can you add a sentence or two outlining why these tests are suited to the sport and therefore the focus of your study?
7. Introduction, end of second paragraph - here indicate whether the validity and reliability of this test has not been investigated at all or just specifically in Rugby League athletes. Also, at the end of this sentence make it clear that you are referring to the usability of this test.
8. Introduction, aims - for the second aim, you are not comparing the RIA measures to other measures, but instead correlating them, so please change this aim accordingly. Also, consider including a little rationale around this aim in the previous paragraph as it is unclear as to why this is important.
9. Introduction, aims - it might pay to include a sentence stating why examining convergent validity (e.g. to show that you are assessing similar attributes with a new test that is more practical and specific to Rugby League when compared to a standard, generic test routinely used) and retest reliability (e.g. to detect the inherent error in the test and ascertain whether it can be reliably adopted in practice to assess repeated measurements in athletes) are needed for practical uptake of the test, which would strengthen the rationale of the first and third aims as well in the Introduction section.
10. Introduction, hypotheses - you only mention speed and anaerobic capacity as your fitness attributes here, but you also included aerobic fitness (Multistage Fitness test)?
11. Methods, Research design - why was the order flipped between session 4 and session 5? It seems like you would want the athletes to complete the same exact session across both when assessing retest reliability, as it is introducing a confounding factor? Also, no hyphen needed between "15" and "minutes" here.
12. Methods, Participants - can you provide some further indication as to the specific playing level of the athletes? The name of the program is great, but this is not exactly clear for all readers.
13. Figure 1 - consider changing the session numbers, as in text you identify the Multistage Fitness test as session 1, but here you indicate the other fitness testing as session 1. Consistency needed.
14. Methods, Multistage Shuttle test - stay consistent with capitalising the test names like this one as it is done inconsistently throughout. Also, hyphenate "20m" here.
15. Methods, Countermovement jump test - hyphenate "1 cm" here. Also, try to indicate what units the key outcomes from each test were reported in (e.g. mL/kg/min, cm, s).
16. Methods, 30-m Sprint and Agility protocols - separate statements on linear and change-of-direction speed here. Also, hyphenate "10m" and "5m" for the Illinois Agility test.
17. Methods, repeat protocols - to calculate average HR and RPE, was HR measured from when the test started to when the test finished in 1-second intervals? And was RPE taken after each effort or just after all efforts for each specific test? These are not quite clear.
18. Methods, Statistical analysis - you mention "construct" validity for the first time here. If this is a key aim and aspect of the study (i.e., correlating performance during the test with fitness attributes), then this needs to be established earlier (i.e., introduction and aims).
19. Methods, Statistical analysis - an r value of 0.5 seems quite low to establish convergent validity (only 25% shared variance)? In this regard, what was the cut-off for construct validity?
20. Results, first paragraph - clarify whether this is CMJ height specifically.
21. Results, second paragraph - make it clear that you are referring to the third and fourth sessions here (and make sure this is consistent as you identify these as the fourth and fifth sessions earlier).
22. Results, third paragraph - you do not indicate the maximum RPE was taken as an outcome in the methods anywhere (only average), yet it is listed here? Same for maximum HR, which appears later in this section also.
23. Table 1 - shouldn't RIA and RTT be the two key tests going from left to right rather than RIA and RSA? In fact, it is not clear why RSA is included here given you identify RIA vs. RTT for convergent validity assessment throughout the manuscript. Please adjust or rework earlier sections.
24. Discussion, first paragraph - some minor errors, but change "was" to "were" in the 7th line.
25. Discussion, first paragraph - at the end of this paragraph, some explanation is needed as to how your results (and the previous studies) support that anaerobic fitness is predominantly stressed in the repeated COD tests. I am assuming the strong correlations with mostly anaerobic fitness attributes, but this is an assumption and should be explained further to clarify this statement.
26. Discussion, fourth paragraph - be careful with your statement suggesting they are "key attributes for RL athletes". You did not show this but state it, so rework or remove this part to focus specifically on what your data show.
27. Discussion, concluding paragraph - here you highlight how anaerobic and aerobic fitness underpin test performance, but previously you focus on anaerobic fitness in other sections and in this section. Please make sure the message is consistent throughout.

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Yes
Competing Interests: No competing interests were disclosed.

Author Response 27 Sep 2021
Kenji Doma, James Cook University, Douglas, Australia
Reviewer's comment: Thank you for the opportunity to review this manuscript. A logical study with potential practical impact was presented exploring the validity and reliability of a repeated change-of-direction speed test suited to rugby league. This aspect of fitness testing is lacking in the literature and of use for end-users, so this manuscript has merit for publication. I do however have some suggestions and queries listed below that will help strengthen some aspects of the work that should be addressed by the authors:
Author's response: Dear Reviewer, thank you very much for your comments. We believe your feedback has improved the quality of our work.
Reviewer's comment: Abstract, Methods -indicate whether you are referring to the multistage fitness test (i.e., Beep test) here or if just an assessment of multi-stage fitness was performed.

Author's response: A Beep test was performed, which is now included in the Abstract.
Reviewer's comment: Abstract -some justification of using the repeated T-agility test as the other standard repeated change-of-direction performance to assess convergent agility of the repeated Illinois Agility test is needed in the Abstract.

Author's response: This is a valid point. The Background now includes, "Thus, the current study examined the convergent validity of a repeated Illinois Agility (RIA) protocol with the repeated T-agility protocol, and the repeatability of the RIA protocol in adolescent Rugby League players."
Reviewer's comment: Abstract - the focus of exploring the contribution of different fitness attributes to repeated Illinois Agility test performance is not made in the Abstract. Consider including this as a secondary objective and/or making it clear why this analysis is needed.
Author's response: Thank you for the suggestion. We have now included, "Furthermore, aerobic capacity and anaerobic and COD performance were assessed to determine whether these physical qualities were important contributors to the RIA protocol."
Reviewer's comment: Abstract, Conclusions - make it clear that this test is useful for Rugby League coaches specifically, and change "athlete" to plural form or indicate "team" here instead.

Author's response: The conclusion has been rewritten to, "The current study has demonstrated the RIA to be a simple, valid and reliable field test for RL athletes that can provide coaches with information about their team's ability to sustain high intensity, multi-directional running efforts."
Reviewer's comment: Introduction, opening paragraph - in this paragraph you identify "agility" and hint that it involves physical and cognitive components. So in essence, you are examining change-of-direction speed (the physical component) rather than "agility" per se. Consider making this clear and using the term "change-of-direction speed" or "change-of-direction performance" thereafter when referring to tests and attributes that are purely physical without the cognitive component. Author's response: All terms containing 'agility' have been replaced with 'change-of-direction' performance or protocols throughout the text.
Reviewer's comment: Introduction, opening paragraph - stronger rationale is needed justifying the inclusion of the Agility T-test and Illinois Agility test in rugby league. Can you add a sentence or two outlining why these tests are suited to the sport and therefore the focus of your study? Author's response: We have removed the Agility T-test and Illinois Agility test from the first paragraph, and focused the justification of the Illinois Agility test for RL in the second paragraph (see lines 41-47).
Reviewer's comment: Introduction, end of second paragraph - here indicate whether the validity and reliability of this test have not been investigated at all or just specifically in Rugby League athletes. Also, at the end of this sentence make it clear that you are referring to the usability of this test. Author's response: We separated the second paragraph into two paragraphs. Thus, we highlighted that no studies have examined the validity and reliability of the RIA protocol at the end of the fourth paragraph.
Reviewer's comment: Introduction, aims - for the second aim, you are not comparing the RIA measures to other measures, but instead correlating them, so please change this aim accordingly. Also, consider including a little rationale around this aim in the previous paragraph as it is unclear as to why this is important. Author's response: Changed to 'correlating' as requested. The previous paragraph expanded on the need to examine the validity of the RIA protocol, which we now hope provides further justification of our aims.
Reviewer's comment: Introduction, aims - it might pay to include a sentence stating why examining convergent validity (e.g. to show that you are assessing similar attributes with a new test that is more practical and specific to Rugby League when compared to a standard, generic test routinely used) and retest reliability (e.g. to detect the inherent error in the test and ascertain whether it can be reliably adopted in practice to assess repeated measurements in athletes) are needed for practical uptake of the test, which would strengthen the rationale of the first and third aims as well in the Introduction section. Author's response: This is a great suggestion. We have now worded the end of the Introduction to, "Examining the convergent validity of the RIA protocol will determine whether this novel assessment exhibits similar attributes to a standardised COD protocol (i.e., RTT). In addition, the reliability of the RIA will determine whether this test can be reliably adopted in practice by accounting for the inherent error of the test across repeated measurements. The quality of these psychometric properties will provide coaches with a tool to assist in monitoring and training RL athletes as well as in talent development and identification."
Reviewer's comment: Introduction, hypotheses - you only mention speed and anaerobic capacity as your fitness attributes here, but you also included aerobic fitness (Multistage Fitness test)? Author's response: Included aerobic capacity in the hypotheses.

Reviewer's comment: Methods, Research design -why was the order flipped between
Reviewer's comment: Discussion, first paragraph - at the end of this paragraph, some explanation is needed as to how your findings (and those of the previous studies) support that anaerobic fitness is predominantly stressed in the repeated COD tests. I am assuming the strong correlations with mostly anaerobic fitness attributes, but this is an assumption and should be explained further to clarify this statement. Author's response: Closer to the end of this paragraph, we have included a sentence that reads, "The comparable measures between RIA, RTT and RSA suggests that anaerobic fitness, in conjunction with efficient recovery dynamics during short periods of rest in-between explosive activities, are essential qualities for optimal performance in an RIA protocol."

The present study aimed to explore the validity and reliability of repeated "agility" tests in a sample of youth rugby players. The study provides interesting data for an applied audience and is methodologically sound and accurate in the presentation of findings. This being said, I have identified a range of areas that should be improved prior to potential resubmissions of this manuscript. For example, the introduction (and study in its entirety) is inaccurate in its use of the term "agility" (as opposed to COD, which the authors actually assess). Furthermore, the introduction needs to be further developed to support the scope of the study. The authors need to create more of a rationale for running such a large number of correlations - exploring correlational analyses needs to be warranted based upon an assumption of a relationship (or lack thereof) between variables. In its current form, I am unsure whether this manuscript provides such rationale. The methods and results are accurate, but with a large number of small errors. Finally, the discussion addresses key discussion points formulated from the findings of the study, but similar to the introduction, lacks some critical depth.

Introduction
The introduction is short and concise. However, I feel some key information that would further support the rationale for your study has been omitted. I would encourage the authors to further develop this section. A suggestion would be to expand the second paragraph into two: the first discussing the validity aspects, the second the reliability aspects. I do not expect these changes to be too onerous, but believe it would strengthen this section of your manuscript if you were to implement these changes. I have some further specific comments below.
The opening sentence would read better if it were split into two separate sentences.
The authors should attempt to clarify during opening paragraph that they are discussing change of direction (COD) performance, rather than agility. Although COD performance is a key component of agility performance, the lack of an external stimuli prevents classification of agility. Although the tests identified by the authors in this opening paragraph have titled themselves 'agility' tests, by definition they are not. The authors should be careful in their stance here as the literature has evolved substantially since the inception of tests such as these, and appropriate classification and terminology is essential for the present study.
Regarding the reliability aspect of the study, two recent papers have suggested that reliability of COD and agility may be lower in adolescent and youth athletes (as per your sample). Although these authors observed this during maximal COD/agility performance, opposed to repeated as in the present study, I believe this would be worth including and would further warrant the exploration of your research question and study.

Methods
The methods report the appropriate details relating to the study design, protocols, participants, and analyses performed. However, further clarity is required in a few places throughout this section (specific comments below). Further, there are a number of oversights throughout this section that require careful attention. None of these are major points, and they should take little time to amend. Yet they need to be addressed prior to potential resubmissions of this manuscript.
Figure 1: I am unsure how much this figure aids the interpretation of the study design. I personally do not feel the visual nature of this schematic clarifies the description of the trials from the text, as the study is relatively simplistic. I do not insist that the authors remove this figure, however, please consider my suggestion that it may not add value to the manuscript and may just be a "figure for the sake of a figure". Secondly, between the trials, the authors have stated that washout periods of > 7 days were implemented. The authors should consider including the range of days (e.g. 7-14 days), as this would provide further information to the reader and allow for more accurate interpretation of the results.
Can a citation or elaboration be provided to rationalise the >15 min rest interval between attempts?
The term 'respectively' needs to be included after the explanation of the 1-10 muscle rating scale. Also, a brief sentence as to why this was collected should accompany this methodological point.
The authors state that the warm up was the same prior to each session. I would encourage using the term 'standardised' when describing this in their methods to add certainty around their controls. Also, a citation needs to accompany the statement that the CMJ test was implemented to assess recovery. If this is what was intended by the insertion of citation 12, I would encourage moving this citation to the end of the sentence.
Participant details - participant height needs to be reported to 2dp. Can the authors further elaborate on what the "School of Athletic Excellence" is? For example, what standard is this, is it affiliated to a specific school, or region performance programme? This may not be clear to readers.
Multistage shuttle test - a brief sentence stating how performance or a score was derived is necessary (i.e. maximum number of shuttles completed, level, etc.). This information is currently not evident.
CMJ - please state the unit of measurement for this test, and to what degree (e.g. recorded in cm to the nearest 0.1cm).
30-m sprint and agility protocols - please separate each test into a separate sentence. At present, they appear too cramped together and it is difficult to read (particularly with reference to figures and citations within). Also, please provide citations for the original studies for the T-drill and Illinois agility tests. These need to be credited within the manuscript.
Repeated sprint and agility protocols - were these tests performed one after the other (what I interpret from the term 'cycles') or were they performed independently? Clarity is required to ensure the reader understands which of these interpretations is correct.
The sentence starting "Immediately after each repeated agility cycle…" does not make sense. Please rephrase this.
Statistical analysis - Reference 16 seems a weak reference for this point. I encourage the authors to consider a more suitable citation for selecting this acceptable correlation cut off point.

Results
The section of the results dedicated to the muscle soreness and CMJ differences (or lack thereof) should be condensed to a singular sentence and placed at the foot of the results section. This is a methodological control and not a key finding from your study, therefore, it should not feature so early and heavily within your results section. Stating that there were no differences, providing ES, and significance values would suffice.
Table 3 - I am interested in your rationale for not providing ES to demonstrate differences (or lack thereof) for your test-retest data? This would add value in my opinion. I would also encourage the authors to present CV% data to 1dp both within this table and throughout the manuscript.
I notice that the authors use terminology such as "excellent" or "moderate" test-retest reliability, yet they have not included these thresholds within the statistical analysis section of the methods. Please include this detail so that the reader understands your criteria. I also note that the authors say "most RIA performance measures exhibited excellent test-retest reliability…". Could the authors rephrase this to say "all RIA performance measures except… measures exhibited excellent test-retest reliability"? This would be more accurate and less ambiguous for the reader. This comment also applies later in this section where the authors refer to "a few variables…".

Discussion
I found the discussion well-written and thought the authors accurately interpreted and attempted to explain their findings. However, similar to the introduction, I felt this section lacked depth when discussing the main findings of the study. I encourage the authors to discuss their findings in greater depth, utilising some of the reading attached to this review.
I encourage the authors to utilise the first paragraph of the discussion to summarise all of their main findings, and then progress to discussing each point in turn.
The third paragraph begins "another novelty of this study", yet this appears to be the first time novelty is addressed in the discussion?
The fourth paragraph can draw upon a wealth of correlational studies to further support the findings observed here.
The conclusion should be reframed around the key objectives of the study (i.e. that it is reliable and valid). The mention of aerobic and anaerobic energy systems seems misplaced here.

References
While a variety of key texts are identified and listed within the bibliography and cited throughout the manuscript, a number of key texts are omitted. I encourage the authors to further familiarise themselves with relevant studies assessing reliability and validity of COD and agility in team sport athletes, as well as the texts identified earlier within this review.

Is the work clearly and accurately presented and does it cite the current literature? Partly

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes
Are the conclusions drawn adequately supported by the results? Yes
intend them and are useful in further strengthening this manuscript. I hope to receive the

Introduction
Reviewer's comment: The introduction is short and concise. However, I feel some key information that would further support the rationale for your study has been omitted. I would encourage the authors to further develop this section. A suggestion would be to expand the second paragraph into two: the first discussing the validity aspects, the second the reliability aspects. I do not expect these changes to be too onerous, but believe it would strengthen this section of your manuscript if you were to implement these changes. I have some further specific comments below. Author's response: The second paragraph has been further developed into two as requested, with the first focusing on reliability and the second on validity.
Reviewer's comment: The opening sentence would read better if it were split into two separate sentences. Author's response: We have simplified the first sentence by removing the context of 'agility', as we have converted all 'agility' terms to 'change-of-direction'.
Reviewer's comment: The authors should attempt to clarify during opening paragraph that they are discussing change of direction (COD) performance, rather than agility. Although COD performance is a key component of agility performance, the lack of an external stimuli prevents classification of agility. Although the tests identified by the authors in this opening paragraph have titled themselves 'agility' tests, by definition they are not. The authors should be careful in their stance here as the literature has evolved substantially since the inception of tests such as these, and appropriate classification and terminology is essential for the present study.
interpretation of the results. Author's response: Thank you for your comment. We have decided to keep the figure to ensure clarity as we used a cross-over randomized design, which is unusual for a study examining the reliability and validity of protocols. We have included 7-14 days as a washout period as suggested in the figure.
Reviewer's comment: Can a citation or elaboration be provided to rationalise the >15 min rest interval between attempts? Author's response: A citation has been included with a study that used a 15-minute recovery period in-between repeated sprint and COD protocols in the one testing session.
Reviewer's comment: The term 'respectively' needs to be included after the explanation of the 1-10 muscle rating scale. Also, a brief sentence as to why this was collected should accompany this methodological point. Author's response: Included as requested.
Reviewer's comment: The authors state that the warm up was the same prior to each session. I would encourage using the term 'standardised' when describing this in their methods to add certainty around their controls. Also, a citation needs to accompany the statement that the CMJ test was implemented to assess recovery. If this is what was intended by the insertion of citation 12, I would encourage moving this citation to the end of the sentence. Author's response: The term 'standardised' has been included when describing the warm-up as requested in the Methods section. The citation explaining the use of CMJ to determine recovery dynamics was moved to the end of the sentence as requested.
Reviewer's comment: Participant details - participant height needs to be reported to 2dp. Can the authors further elaborate on what the "School of Athletic Excellence" is? For example, what standard is this, is it affiliated to a specific school, or region performance programme? This may not be clear to readers. Author's response: The participant height is reported to 2dp (1.77m). Further information on the excellence program has been included.
Reviewer's comment: Multistage shuttle test - a brief sentence stating how performance or a score was derived is necessary (i.e. maximum number of shuttles completed, level, etc.). This information is currently not evident. Author's response: The VO2max was estimated based on the level completed, which has now been included in the Methods section.
Reviewer's comment: CMJ - please state the unit of measurement for this test, and to what degree (e.g. recorded in cm to the nearest 0.1cm). Author's response: The CMJ was recorded to the full cm, not to the nearest 0.1cm. This has now been clarified.
Reviewer's comment: 30-m sprint and agility protocols - please separate each test into a separate sentence. At present, they appear too cramped together and it is difficult to read (particularly with reference to figures and citations within). Also, please provide citations for the original studies for the T-drill and Illinois agility tests. These need to be credited within the manuscript. Author's response: The sentences for the description of the protocols have been separated as requested. The original studies have also been included as requested.
Reviewer's comment: Repeated sprint and agility protocols - were these tests performed one after the other (what I interpret from the term 'cycles') or were they performed independently? Clarity is required to ensure the reader understands which of these interpretations is correct. Author's response: We have reworded the term 'cycles' to 'repetitions' to improve clarity. As mentioned in Research Design and in Figure 1, RSA was conducted in Session 2, whilst RTT and RIA were conducted in the same session with 15-min rest in-between each protocol. The Figure reinforces this design, and rewording 'cycles' to 'repetitions' improves the clarity of the protocols.
Reviewer's comment: The sentence starting "Immediately after each repeated agility cycle…" does not make sense. Please rephrase this. Author's response: This has been reworded to, "The participant's heart rate (HR, Polar Heart Rate Monitor, Polar H10, Finland) and rating of perceived-exertion (RPE, Borg category scale 1-10) were collected at the completion of each repetition of the RSA, RTT and RIA protocols. The maximum and average HR and RPE values were then reported from the 6 repetitions."
Reviewer's comment: Statistical analysis - Reference 16 seems a weak reference for this point. I encourage the authors to consider a more suitable citation for selecting this acceptable correlation cut off point. Author's response: Additional reference included as requested.

Results
Reviewer's comment: The section of the results dedicated to the muscle soreness and CMJ differences (or lack thereof) should be condensed to a singular sentence and placed at the foot of the results section. This is a methodological control and not a key finding from your study, therefore, it should not feature so early and heavily within your results section. Stating that there were no differences, providing ES, and significance values would suffice. Author's response: Simplified and included at the end of the Results section as requested.
Reviewer's comment: Table 3 - I am interested in your rationale for not providing ES to demonstrate differences (or lack thereof) for your test-retest data? This would add value in my opinion. I would also encourage the authors to present CV% data to 1dp both within this table and throughout the manuscript. Author's response: Thank you for your comment. The ES was not reported as we based the measurement error on CV, with larger CV% values exhibiting greater measurement error and poorer test-retest reliability. In our opinion, inclusion of ES would not provide any further clarity about the reliability of the tests, beyond that of CV.