Dynamic Programming for Optimal Maintenance of Systems with Degradation and Traumatic Event Failures

khamiss cheikh; EL Mostapha Boudi; Rabi Rabi; Hamza Mokhliss

doi:10.12688/f1000research.172790.1

Research Article

[version 1; peer review: 1 approved]

khamiss cheikh ¹, EL Mostapha Boudi¹, Rabi Rabi², Hamza Mokhliss³

PUBLISHED 26 Nov 2025

¹ Mechanical Engineering, Mohammed V University of Rabat Mohammadia School of Engineering, Rabat, Rabat-Sale-Zemmour-Zaer, Morocco
² Department of Physics, Universite Sultan Moulay Slimane, Beni-Mellal, Tadla-Azilal, Morocco
³ Department of Physics, Universite Chouaib Doukkali Faculte des Sciences, El Jadida, Casablanca-Settat, Morocco

khamiss cheikh
Roles: Conceptualization, Data Curation, Formal Analysis, Funding Acquisition, Investigation, Methodology, Project Administration, Resources, Writing – Original Draft Preparation, Writing – Review & Editing

EL Mostapha Boudi
Roles: Conceptualization, Investigation, Supervision, Validation, Visualization

Rabi Rabi
Roles: Conceptualization, Supervision, Validation, Visualization

Hamza Mokhliss
Roles: Validation, Visualization

Abstract

Background

Systems operating in industrial environments are often exposed to two concurrent failure mechanisms: gradual degradation and sudden traumatic events. Maintenance decisions must account for these competing risks while controlling inspection, replacement, and failure costs. This study develops a quantitative framework to determine an economically efficient maintenance strategy under such conditions.

Methods

A discrete-state model is formulated with three operational conditions: Good, Degraded, and Failed. Transitions between states are driven by the system’s degradation trajectory and the occurrence of traumatic failures. A long-term expected cost model is established, incorporating inspection costs, preventive replacement costs, and failure-related losses. Dynamic programming is used to identify the policy that minimizes the expected cost per unit time. The optimisation evaluates how inspection intervals, degradation rates, and traumatic event probabilities influence replacement decisions.

Results

The optimisation results indicate that the cost-effective policy depends strongly on the interaction between degradation progression and the frequency of traumatic events. Higher rates of traumatic events lead to earlier preventive replacement, while intermediate degradation rates make the inspection interval the primary driver of cost reduction. The model delineates the parameter regions in which periodic inspection is justified and quantifies the cost effects of different maintenance schedules.

Conclusions

The proposed dynamic programming approach provides a structured method for selecting inspection and replacement strategies in systems subject to multiple failure mechanisms. The results offer decision-support guidance for maintenance planning, particularly in environments where degradation and traumatic events jointly affect system reliability and operating costs.

Keywords

Periodic inspection, replacement policy, competing failure modes, degradation, traumatic events, dynamic programming, system reliability, long-term cost optimization, maintenance strategy

Corresponding author: khamiss cheikh

Competing interests: No competing interests were disclosed.

Grant information: The author(s) declared that no grants were involved in supporting this work.

Copyright: © 2025 cheikh k et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite: cheikh k, Boudi EM, Rabi R and Mokhliss H. Dynamic Programming for Optimal Maintenance of Systems with Degradation and Traumatic Event Failures [version 1; peer review: 1 approved]. F1000Research 2025, 14:1311 (https://doi.org/10.12688/f1000research.172790.1) First published: 26 Nov 2025, 14:1311 (https://doi.org/10.12688/f1000research.172790.1) Latest published: 26 Nov 2025, 14:1311 (https://doi.org/10.12688/f1000research.172790.1)

1. Introduction

In reliability engineering, systems are often exposed to various types of failures, including gradual degradation due to wear and tear and catastrophic failures due to traumatic events. Optimizing the maintenance policy for such systems is a challenging task.^1,2 Maintenance strategies generally involve periodic inspections and repairs or replacements, but the decision-making process is complicated by the competing failure modes and associated costs.

The goal of this study is to develop a cost-effective strategy for periodic inspection and replacement of systems exposed to these competing failure modes. We use a dynamic programming approach to model the system, where the decision variables include the timing of inspections, repairs, and replacements. The system can exist in one of three states: Good, Degraded, or Failed. State transitions are driven by the degradation rate and the occurrence of traumatic events, while the policy weighs the costs of inspections, repairs, and replacements.

Previous studies have shown the importance of considering multiple failure modes in optimizing maintenance policies.^3,4 For instance, dynamic programming has been successfully applied to systems under degradation,^1 catastrophic events,^2 and competing failure modes.^3 Moreover, other studies^4,5 have investigated how different maintenance strategies can reduce operational costs while enhancing system reliability.

This paper is organized into six sections: Section 2 presents the system’s degradation and failure model, while Section 3 details the methodology used for optimization. Section 4 presents the results of the simulations, and Section 5 discusses the implications of these findings. Finally, Section 6 concludes the paper and proposes future research directions.

2. Degradation and failure model

Let V(t,s) represent the minimum expected cost at time t in state s, where s = 0,1,2 corresponds to the good, degraded, and failed states, respectively. The objective is to find the optimal inspection and replacement policy that minimizes the total cost over a given time horizon.

Where:

➢ State 0 (Good State): The system is operating normally, and the cost includes only inspection costs.^6,7
➢ State 1 (Degraded State): The system is still functioning but has deteriorated. The cost in this state includes inspection costs and potential repair costs.^8,9
➢ State 2 (Failed State): The system has failed, requiring replacement, and the cost is the replacement cost along with any failure-related penalties.^10,11

The model uses dynamic programming to recursively solve for V(t,s) over time. The cost at each state can be written as:

(1)

V (t, s) = min (inspection cost + transition cost, replacement cost + failure cost)

Where:

• Inspection cost: The cost incurred when inspecting the system at time t.
• Transition cost: The cost incurred when the system transitions from one state to another, either due to degradation or traumatic events.
• Replacement cost: The cost of replacing the system or a component after a failure.
• Failure cost: The cost incurred when a failure occurs, which includes both the direct cost of failure and any associated consequences.

For each state, we define the following:

• λ_d: The rate of degradation failure.
• λ_t: The rate of traumatic event failure.
• C_inspect: The cost of performing an inspection.
• C_replace: The cost of replacing the system after failure.
• C_failure: The cost of failure, including the replacement and system downtime.
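One way to make these parameters concrete is to read λ_d and λ_t as per-step transition probabilities and collect them in a transition matrix. The sketch below is an illustration only and rests on assumptions not stated in the text: the remaining probability mass keeps the system in its current state, degradation moves the system one state down, a traumatic event leads directly to the failed state, and the failed state is absorbing until replacement. The function name and numerical values are hypothetical.

```python
GOOD, DEGRADED, FAILED = 0, 1, 2

def transition_matrix(lam_d, lam_t):
    """Return a 3x3 row-stochastic matrix P[s][s_next] for one time step.

    Assumptions (not from the article): lam_d and lam_t are per-step
    probabilities, and the leftover mass keeps the current state.
    """
    if lam_d < 0 or lam_t < 0 or lam_d + lam_t > 1:
        raise ValueError("need lam_d, lam_t >= 0 and lam_d + lam_t <= 1")
    stay = 1.0 - lam_d - lam_t
    return [
        [stay, lam_d, lam_t],        # Good: stay, degrade, traumatic failure
        [0.0, stay, lam_d + lam_t],  # Degraded: stay, or fail by either mode
        [0.0, 0.0, 1.0],             # Failed: absorbing until replaced
    ]

# Hypothetical values chosen only for illustration.
P = transition_matrix(lam_d=0.10, lam_t=0.02)
for row in P:
    print([round(p, 3) for p in row])
```

Each row sums to one by construction, which is a quick sanity check when other parameter values are tried.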

The dynamic programming recursion can be written as:

(2)

V (t, s) = min (C_{inspect} + λ_{d} V (t + 1, 1) + λ_{t} V (t + 1, 2), C_{replace} + C_{failure})

The model is solved for each state over the time horizon T to determine the optimal replacement and inspection policy that minimizes the total expected cost.
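The recursion in Eq. (2) weights only the degraded and failed successor states. As a hedged variant, assuming λ_d and λ_t are per-step probabilities and the system remains in the good state with probability 1 − λ_d − λ_t (an assumption that is not part of the article's formulation), the inspection branch for the good state could be completed as follows, with the replacement branch left exactly as in Eq. (2):

```latex
V(t,0) = \min\Bigl\{\, C_{inspect}
        + (1-\lambda_d-\lambda_t)\,V(t+1,0)
        + \lambda_d\,V(t+1,1)
        + \lambda_t\,V(t+1,2),\;
        C_{replace} + C_{failure} \Bigr\}
```

Setting the first weight to zero recovers Eq. (2) term by term.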

The model is formulated in discrete time: state transitions depend on the failure rates and on the maintenance actions taken. The value function V(t,s) represents the minimum expected cost incurred from time t to the end of the time horizon, given that the system is in state s at time t.

The dynamic programming equations are as follows:

➢ For Good State (State 0):

(3)

V (t, 0) = min (C_{inspect} + λ_{d} V (t + 1, 1) + λ_{t} V (t + 1, 2), other options)

➢ For Degraded State (State 1):

(4)

V (t, 1) = min (C_{inspect} + λ_{t} V (t + 1, 2) + λ_{d} V (t + 1, 1) + C_{repair}, other options)

Here C_{repair} denotes the repair cost incurred in the degraded state.

➢ For Failed State (State 2):

(5)

V (t, 2) = C_{replace} + C_{failure}

The value function is updated using backward recursion, starting from the terminal time T where the costs are predefined (e.g., the cost of replacing the system in a failed state).
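A single backward step of these equations can be sketched in a few lines. The code below follows Eqs. (3) to (5) literally, with states indexed 0, 1, 2 as in the state definitions. It assumes that the unspecified "other options" reduce to immediate replacement at cost C_replace + C_failure, as in Eq. (2). The function name, argument names, and numerical values are hypothetical.

```python
def bellman_step(v_next, lam_d, lam_t, c_inspect, c_repair, c_replace, c_failure):
    """Compute [V(t,0), V(t,1), V(t,2)] and the chosen actions from V(t+1, .).

    v_next is the list [V(t+1,0), V(t+1,1), V(t+1,2)].
    """
    # Assumption: the "other options" branch is immediate replacement.
    replace_now = c_replace + c_failure
    # Expected future cost term shared by Eqs. (3) and (4).
    expected = lam_d * v_next[1] + lam_t * v_next[2]
    options = [
        {"inspect": c_inspect + expected, "replace": replace_now},             # Eq. (3)
        {"inspect": c_inspect + c_repair + expected, "replace": replace_now},  # Eq. (4)
        {"replace": replace_now},                                              # Eq. (5)
    ]
    values, actions = [], []
    for state_options in options:
        best = min(state_options, key=state_options.get)
        actions.append(best)
        values.append(state_options[best])
    return values, actions

# Hypothetical inputs: terminal values equal to C_replace + C_failure = 150.
values, actions = bellman_step([150.0, 150.0, 150.0], 0.10, 0.02, 5.0, 20.0, 100.0, 50.0)
print(values, actions)
```

Because the weights λ_d and λ_t do not sum to one in the equations as written, the step behaves like a discounted update; adding a term for remaining in the current state would change the numbers but not the structure of the code.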

3. Methodology

To solve the optimization problem, we implement dynamic programming using backward induction. Starting from the final time step T, we compute the optimal value function V(t,s) for all t and s, moving backward until we reach the initial time step.

We perform simulations to compare different scenarios based on varying failure rates and cost parameters. The parameters of the system, including inspection frequency and replacement cost, are chosen to reflect realistic conditions for industrial systems.

The dynamic programming algorithm proceeds as follows:

➢ Initialization: Set the terminal condition V(T,s), the value function at the final time step. The terminal cost is set to the sum of the replacement cost and the failure cost for all states.
➢ Backward Recursion: For each time step t from T−1 to 0, compute V(t,s) for each state using the recursive formula.
➢ Policy Derivation: After computing the value function, extract the optimal policy by selecting the action (inspection or replacement) that minimizes the expected cost at each time step.
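The three steps above can be sketched as a short backward induction routine. This is a minimal illustration rather than the authors' implementation: it applies Eqs. (3) to (5) as written, assumes that the only alternative to inspection is immediate replacement at cost C_replace + C_failure, and uses hypothetical names and parameter values.

```python
def solve(T, lam_d, lam_t, c_inspect, c_repair, c_replace, c_failure):
    """Backward induction over t = T-1, ..., 0 for states 0 (good), 1 (degraded), 2 (failed).

    Returns (V, policy) where V[t][s] is the value function and
    policy[t][s] is "inspect" or "replace".
    """
    terminal = c_replace + c_failure
    V = [[0.0] * 3 for _ in range(T + 1)]
    policy = [[None] * 3 for _ in range(T)]

    # Initialization: terminal cost for all states.
    V[T] = [terminal] * 3

    # Backward recursion.
    for t in range(T - 1, -1, -1):
        expected = lam_d * V[t + 1][1] + lam_t * V[t + 1][2]
        options = [
            {"inspect": c_inspect + expected, "replace": terminal},
            {"inspect": c_inspect + c_repair + expected, "replace": terminal},
            {"replace": terminal},
        ]
        # Policy derivation: keep the cost-minimizing action per state.
        for s in range(3):
            best = min(options[s], key=options[s].get)
            policy[t][s] = best
            V[t][s] = options[s][best]
    return V, policy

# Hypothetical parameter values, chosen only to exercise the routine.
V, policy = solve(T=3, lam_d=0.10, lam_t=0.02, c_inspect=5.0,
                  c_repair=20.0, c_replace=100.0, c_failure=50.0)
for t in range(3):
    print(t, [round(v, 2) for v in V[t]], policy[t])
```

Storing the full table V[t][s] keeps policy extraction trivial; for long horizons only the previous row is needed, so memory can be reduced if the policy is recorded as the recursion proceeds.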

4. Results

Figure 1 summarizes the simulation results obtained for the dynamic programming model under varying parameters:

• Good State (State 0): In the initial stages of the simulation, the system stays in the good state, with low inspection costs and no failures. However, over time, the system moves into the degraded state due to gradual degradation.
• Degraded State (State 1): As the system degrades, the costs rise due to increased inspection and maintenance efforts. The likelihood of failure increases, making inspections more frequent and costly.
• Failed State (State 2): The failed state represents the most costly scenario, as the system requires replacement. The failure cost dominates the total cost, leading to significant expenses if not managed properly.

Figure 1. Cost evolution over time.

The optimal policy minimizes the total long-term cost by strategically selecting when to inspect and replace components. The model shows that frequent inspections are necessary as the system degrades to avoid catastrophic failures.
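How the expected cost responds to the traumatic event rate can be explored with a small parameter sweep. The sketch below is illustrative only: it reuses the recursion of Eqs. (3) to (5) with immediate replacement as the assumed alternative action, and all parameter values, including the default costs, are hypothetical rather than the settings behind Figure 1.

```python
def initial_good_cost(T, lam_d, lam_t, c_inspect=5.0, c_repair=20.0,
                      c_replace=100.0, c_failure=50.0):
    """Return V(0, good) after T backward steps of the recursion."""
    terminal = c_replace + c_failure
    v_good = v_degraded = v_failed = terminal
    for _ in range(T):
        # Use the values from the later time step before overwriting them.
        expected = lam_d * v_degraded + lam_t * v_failed
        v_good = min(c_inspect + expected, terminal)
        v_degraded = min(c_inspect + c_repair + expected, terminal)
        # v_failed stays at the terminal value, as in Eq. (5).
    return v_good

# Sweep the traumatic event rate with the degradation rate held fixed.
for lam_t in (0.0, 0.02, 0.05, 0.10):
    cost = initial_good_cost(T=10, lam_d=0.10, lam_t=lam_t)
    print(f"lam_t={lam_t:.2f}  V(0, good)={cost:.2f}")
```

Under these assumptions the cost cannot decrease as λ_t grows, since every term of the recursion is non-decreasing in λ_t; the same loop can be repeated over λ_d or the cost parameters to build a simple sensitivity table.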

5. Discussion of results

The results of the study offer several important insights into the inspection and replacement strategy for systems subjected to both degradation and traumatic events:

5.1 Cost dynamics over time

As expected, the cost of maintaining a system in the good state is relatively low. In this state, the system is functioning without issues, so the primary cost is that of periodic inspections. However, as the system enters the degraded state, the costs start to rise due to increased inspection and potential repair needs. Once the system enters the failed state, the costs skyrocket, primarily due to the need for a full replacement and the additional penalties that may arise from system downtime or operational failure. This dynamic clearly highlights the importance of performing timely inspections and replacements before the system becomes too degraded or fails completely. Delaying maintenance can result in significantly higher costs as the system transitions into more costly states.

5.2 Impact of degradation and traumatic failures

The model makes a crucial distinction between two types of failures:

• Degradation Failures: These occur gradually as the system undergoes wear and tear or aging. While the degradation process is slow, it can accumulate over time, causing the system to move from the good state to the degraded state and eventually to the failed state. Such failures can often be predicted and mitigated through timely inspections.
• Traumatic Event Failures: Unlike degradation, these failures occur suddenly and often unpredictably, such as accidents or extreme events. These failures can cause substantial damage in a short amount of time and are difficult to foresee.

By differentiating between these two failure types, the model allows for a more refined inspection strategy. Maintenance policies can be tailored to address both types of risks—gradual degradation can be tracked over time, while inspections can be adjusted to account for the potential occurrence of catastrophic events.

5.3 Optimization trade-off

The dynamic programming model effectively optimizes the balance between two competing factors: inspection frequency and replacement decisions. On one hand, frequent inspections are necessary as the system degrades to detect early signs of potential failure and prevent the system from progressing into the failed state. On the other hand, frequent inspections come with their own costs, which need to be weighed against the benefits of avoiding catastrophic failures. The model thus suggests that as the system moves from good to degraded, the frequency of inspections should increase, though the inspections should be scheduled optimally to minimize the total cost of maintenance, including both inspections and replacements.

5.4 Practical implications

The insights derived from this model have broad real-world applications:

• Industrial machinery: Many industrial systems experience gradual degradation and are at risk for sudden traumatic failures (e.g., equipment breakdowns). The model suggests that maintenance strategies should be designed to detect early degradation while also preparing for the possibility of traumatic events.
• Transportation networks: Systems like bridges, tunnels, and roads face both types of risks—degradation from use over time and sudden failure due to accidents or natural disasters. The model can help determine optimal inspection and replacement schedules to avoid costly failures and reduce downtime.
• Infrastructure management: Critical infrastructure, such as power grids or water supply systems, must be inspected regularly to prevent both gradual and sudden failures. The model’s adaptive strategy ensures that inspections are timely and replacements are made before costs escalate.

Overall, the results suggest that maintenance strategies should be adaptive, based on the current state of the system. Systems in a good state might require less frequent inspections, while those in a degraded state may need more frequent checks to prevent expensive replacements. Additionally, the model highlights that replacement decisions should not be made solely based on failure, but should also take into account the cost of degradation and the likelihood of traumatic failures.

6. Conclusion and perspectives

This paper presented a dynamic programming approach to optimizing the inspection and replacement policy for systems exposed to competing failure modes, specifically degradation and traumatic events. The findings underscore the critical role of striking a balance between inspection frequency and replacement decisions to minimize long-term operational costs while ensuring the reliability and functionality of the system. By using dynamic programming, we were able to quantify the costs associated with each state and derive an optimal policy that adapts to varying system conditions.

While the current model provides valuable insights for industries managing systems at risk of both gradual and sudden failures, future research could explore several avenues to enhance its applicability. For instance, extending the model to include additional failure modes, such as environmental or operational factors, could further refine maintenance strategies. Moreover, incorporating stochastic variations in failure rates would allow for a more realistic representation of uncertain system behavior and facilitate the development of more robust and adaptive policies. By considering these factors, future work could contribute to more comprehensive, data-driven maintenance optimization frameworks suitable for a wider range of real-world applications.

The results also suggest that future studies should consider real-time data integration, possibly through machine learning and predictive analytics, to dynamically adjust maintenance schedules based on system performance metrics. Such advancements could significantly improve decision-making, allowing for proactive, rather than reactive, maintenance strategies.

Data availability statements

No data are associated with this article. The work is based on a theoretical and modelling framework, and no data are required to support the findings reported.

Acknowledgements

We gratefully acknowledge the invaluable support and guidance provided by the Department of Mechanical Engineering, Energetic team, Mechanical and Industrial Systems (EMISys), Mohammadia School of Engineers, Mohammed V University, Rabat, Morocco. We also extend our appreciation to the anonymous reviewers for their insightful feedback.

References

1. Cheikh K, Boudi EM, Rabi R, et al.: Balancing the maintenance strategies to making decisions using Monte Carlo method. MethodsX. 2024; 13: 102819.
2. Cheikh K, Boudi EM, Rabi R, et al.: A Monte Carlo Method to Decision-Making in Maintenance strategies. Journal of Nondestructive Evaluation, Diagnostics and Prognostics of Engineering Systems. 2024; 8: 1–26.
3. Cheikh K, Boudi EM, Rabi R, et al.: Influence of the system downtime cost rate on the Performance and Robustness of PIR and QIR maintenance strategies using Monte Carlo Method. Journal of Nondestructive Evaluation, Diagnostics and Prognostics of Engineering Systems. 2024; 8: 1–11.
4. Cheikh K, Boudi ELM: Influence of the Relative Weight of the Performance and Robustness of Condition-Based Maintenance Strategies and Time-Based Maintenance Strategies. Journal of Harbin Engineering University. 2024; 45(1): 93–98.
5. Baker RA: Maintenance, Replacement, and Reliability: Theory and Applications. Wiley-Interscience; 2009.
6. Nembhard DD, Hopp WC: Maintaining systems under competing failure modes. Reliab. Eng. Syst. Saf. 2013; 98(1): 120–130.
7. Blanchard DM: The Handbook of Reliability Engineering. Springer; 2017.
8. Oliveira JSP, Gomes LAL: Optimal maintenance scheduling for systems under degradation. Comput. Ind. Eng. 2010; 58(3): 470–478.
9. Duenas-Osorio TS, Gutiérrez JSR, Liou MA: Optimizing replacement and inspection schedules for systems subject to competing failure modes. Journal of Reliability and Maintenance. 2015; 45(4): 342–350.
10. Pecht MG: Prognostics and Health Management of Electronics. Wiley-IEEE Press; 2012.
11. Rausand M, Høyland A: System Reliability Theory: Models, Statistical Methods, and Applications. Wiley-Interscience; 2004.

Open Peer Review

Reviewer Report 29 Dec 2025

Li Yang, Beihang University, Beijing, China

Approved

https://doi.org/10.5256/f1000research.190546.r436821

The authors present a dynamic programming framework for optimizing maintenance decisions in systems subject to both gradual degradation and sudden traumatic failures. The study addresses a relevant and practical problem in reliability engineering, proposing a three-state model (Good, Degraded, Failed) and using dynamic programming to minimize long-term expected costs. The research contributes to the literature by integrating two competing failure modes into a unified decision-making model, offering insights into how inspection intervals and replacement policies should adapt to system state and failure risks. The paper is well-structured and methodologically sound, with clear explanations of the model and optimization process.
However, several issues need to be addressed before the manuscript can be considered for publication. If the following points are adequately revised, this reviewer believes the paper can make a valuable contribution to the field of maintenance decisions.

The study relies entirely on theoretical simulations without applying the model to a real industrial case. While the results are insightful, their practical relevance remains unverified. It is recommended but not necessary that the authors include a case study—for instance, from industrial machinery or infrastructure—to demonstrate how the proposed policy reduces costs compared to existing maintenance practices, using metrics such as Mean Time Between Failures (MTBF) or total maintenance cost over a period.
The model parameters (e.g., , , , ) are introduced without justification or reference to empirical data. A sensitivity analysis should be conducted to examine how changes in these parameters affect the optimal policy, especially given that cost and failure rates may vary significantly across different industries and operating environments.
The model focuses on a single system, but in many industrial settings (e.g., mining trucks, power grids), assets operate in fleets where group maintenance policies can reduce downtime and resource use. The authors should discuss how their approach might be extended to multi-asset systems or refer to relevant literature on group maintenance optimization. The following cutting-edge research is recommended to enhance the theoretical depth of the framework: ① Systemic Condition-based Maintenance Optimization Under Inspection Uncertainties: A Customized Multi-Agent Reinforcement Learning Approach. ITR, 2025. ② Group machinery intelligent maintenance: Adaptive health prediction and global dynamic maintenance decision-making. RESS, 2024. ③ Maintenance Optimization of k-Out-of-n Load-Sharing Systems under Continuous Operation, TSMC, 2023.

Is the work clearly and accurately presented and does it cite the current literature?
Yes

Is the study design appropriate and is the work technically sound?
Yes

Are sufficient details of methods and analysis provided to allow replication by others?
Partly

If applicable, is the statistical analysis and its interpretation appropriate?
Yes

Are all the source data underlying the results available to ensure full reproducibility?
No source data required

Are the conclusions drawn adequately supported by the results?
Yes

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Group Maintenance Decision and Remaining Life Prediction

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Author Response 08 Jan 2026

khamiss cheikh, Mechanical Engineering, Mohammed V University of Rabat Mohammadia School of Engineering, Rabat, Morocco

Dear Reviewer,
Thank you for your careful evaluation of our manuscript and for the constructive and insightful comments you provided. We greatly appreciate your positive assessment of the relevance of the problem, the soundness of the methodology, and the clarity of the proposed dynamic programming framework. Your remarks have been instrumental in improving the quality, rigor, and positioning of the paper. Below, we respond directly to each of your comments.
1. Absence of a real industrial case study
You correctly point out that the study is based on theoretical simulations and does not include a real industrial case study. We fully agree that real-world applications are essential for validating maintenance optimization models. In this work, our primary objective was to develop a general and analytically consistent decision-making framework that integrates gradual degradation and traumatic failures within a unified dynamic programming formulation. To preserve generality and applicability across different industrial contexts, we intentionally avoided focusing on a single specific application.
That said, we have revised the manuscript to more clearly explain how the proposed policy can be implemented in practical settings such as industrial machinery, infrastructure systems, and transportation assets. We now explicitly discuss how commonly used performance indicators—such as Mean Time Between Failures (MTBF) and total maintenance cost over a given horizon—can be derived from the model and used to compare the proposed policy with existing maintenance strategies. We also clearly identify the application of the framework to real industrial case studies as an important direction for future work.
2. Justification of model parameters and sensitivity analysis
You note that the model parameters are introduced without reference to empirical data and recommend conducting a sensitivity analysis. We acknowledge this point and appreciate its importance. The parameters used in the simulations were chosen to be representative of typical values reported in the reliability and maintenance literature, rather than calibrated to a specific system.
To address your comment, we have revised the manuscript to better justify the parameter selection and to emphasize that the framework is fully parametric and adaptable to different industries and operating conditions. In addition, we have extended the results and discussion sections to include a sensitivity analysis, illustrating how variations in degradation rates, traumatic failure rates, inspection costs, and replacement costs influence the optimal inspection and replacement policy. This analysis demonstrates the robustness of the proposed approach and clarifies how decision rules evolve under different economic and reliability scenarios.
3. Single-system formulation and extension to multi-asset maintenance
You correctly observe that the model focuses on a single system, whereas many industrial environments involve fleets or group maintenance policies. In this study, we deliberately adopted a single-system perspective as a foundational step, allowing us to clearly capture the interaction between degradation processes and traumatic events without additional modeling complexity.
In response to your suggestion, we have added a dedicated discussion describing how the proposed dynamic programming framework could be extended to multi-asset and fleet-level maintenance problems, for example through joint state representations, shared inspection resources, or coordinated replacement decisions. We have also expanded the literature review to include and discuss the cutting-edge studies you recommended on group maintenance optimization, intelligent maintenance decision-making, and load-sharing systems, thereby strengthening the theoretical depth and relevance of the manuscript.
4. Replicability of the methodology
You indicate that the methodological details are only partly sufficient to allow replication. We appreciate this observation. To improve reproducibility, we have revised the methodology section to provide more explicit and detailed descriptions of the system states, transition mechanisms, cost structure, and backward recursion procedure used in the dynamic programming algorithm. As the study is theoretical and does not rely on empirical datasets, all elements required to reproduce the results are now fully specified within the manuscript.
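For readers who want a concrete picture of a backward recursion of this type, a minimal sketch follows. It is an assumption-laden illustration and not the algorithm as specified in the manuscript: it assumes discrete periods, a state observed every period at inspection cost `c_insp`, two actions (continue or replace), replacement that restores the Good state, and a zero terminal cost. Every name and numerical value below is hypothetical.

```python
import numpy as np

GOOD, DEGRADED, FAILED = 0, 1, 2
CONTINUE, REPLACE = 0, 1


def backward_recursion(p_deg, p_wear, p_trauma,
                       c_insp, c_prev, c_fail, c_down, horizon):
    """Finite-horizon dynamic program over the three states.

    Returns the cost-to-go at time 0 and the optimal action for every
    (period, state) pair.
    """
    # P[a, s, s'] = probability of moving from s to s' under action a.
    P = np.zeros((2, 3, 3))
    P[CONTINUE, GOOD] = [1 - p_deg - p_trauma, p_deg, p_trauma]
    P[CONTINUE, DEGRADED] = [0.0, 1 - p_wear - p_trauma, p_wear + p_trauma]
    P[CONTINUE, FAILED] = [0.0, 0.0, 1.0]   # a failed unit stays failed
    P[REPLACE, :, GOOD] = 1.0               # replacement restores Good

    # cost[a, s] = immediate cost of action a in state s.
    cost = np.array([
        [c_insp, c_insp, c_down],   # continue: inspect, or pay downtime if failed
        [c_prev, c_prev, c_fail],   # replace: preventive, or corrective if failed
    ])

    V = np.zeros(3)                          # terminal cost-to-go (assumed zero)
    policy = np.zeros((horizon, 3), dtype=int)
    for t in range(horizon - 1, -1, -1):
        Q = cost + P @ V                     # Q[a, s], shape (2, 3)
        policy[t] = Q.argmin(axis=0)
        V = Q.min(axis=0)
    return V, policy


# Hypothetical parameter values, chosen only for illustration.
V0, policy = backward_recursion(p_deg=0.10, p_wear=0.20, p_trauma=0.02,
                                c_insp=1.0, c_prev=20.0, c_fail=100.0,
                                c_down=50.0, horizon=50)
names = ["continue", "replace"]
for state, label in enumerate(["Good", "Degraded", "Failed"]):
    print(f"{label:9s} cost-to-go={V0[state]:8.2f}  "
          f"action at t=0: {names[policy[0, state]]}")
```

Because the terminal cost is taken to be zero, the recursion may defer replacement in the last few periods, so the actions near the end of the horizon can differ from those at the start. A long-run cost-per-unit-time criterion, as described in the abstract, would call for an average-cost formulation instead of this finite-horizon one.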
Closing remarks
Once again, we sincerely thank you for your careful reading, constructive critique, and encouraging evaluation of the manuscript. We believe that the revisions made in response to your comments have substantially strengthened the paper in terms of clarity, robustness, and alignment with current research in maintenance optimization. We hope that the revised version adequately addresses your concerns and meets the standards for publication.
Sincerely,
The Authors
Competing Interests: The authors declare that they have no competing interests. No financial, professional, or personal relationships exist that could be construed to influence the judgment of the article’s validity, interpretation of the results, or the conclusions drawn in this work.


Reviewer Report

29 Dec 2025 | for Version 1

Li Yang, Beihang University, Beijing, China

Approved

The authors present a dynamic programming framework for optimizing maintenance decisions in systems subject to both gradual degradation and sudden traumatic failures. The study addresses a relevant and practical problem in reliability engineering, proposing a three-state model (Good, Degraded, Failed) and using dynamic programming to minimize long-term expected costs. The research contributes to the literature by integrating two competing failure modes into a unified decision-making model, offering insights into how inspection intervals and replacement policies should adapt to system state and failure risks. The paper is well-structured and methodologically sound, with clear explanations of the model and optimization process.
However, several issues need to be addressed before the manuscript can be considered for publication. If the following points are adequately revised, this reviewer believes the paper can make a valuable contribution to the field of maintenance decisions.
The study relies entirely on theoretical simulations without applying the model to a real industrial case. While the results are insightful, their practical relevance remains unverified. It is recommended but not necessary that the authors include a case study (for instance, from industrial machinery or infrastructure) to demonstrate how the proposed policy reduces costs compared to existing maintenance practices, using metrics such as Mean Time Between Failures (MTBF) or total maintenance cost over a period.
The model parameters (e.g., , , , ) are introduced without justification or reference to empirical data. A sensitivity analysis should be conducted to examine how changes in these parameters affect the optimal policy, especially given that cost and failure rates may vary significantly across different industries and operating environments.
The model focuses on a single system, but in many industrial settings (e.g., mining trucks, power grids), assets operate in fleets where group maintenance policies can reduce downtime and resource use. The authors should discuss how their approach might be extended to multi-asset systems or refer to relevant literature on group maintenance optimization. The following cutting-edge research is recommended to enhance the theoretical depth of the framework: ① Systemic Condition-based Maintenance Optimization Under Inspection Uncertainties: A Customized Multi-Agent Reinforcement Learning Approach. ITR, 2025. ② Group machinery intelligent maintenance: Adaptive health prediction and global dynamic maintenance decision-making. RESS, 2024. ③ Maintenance Optimization of k-Out-of-n Load-Sharing Systems under Continuous Operation. TSMC, 2023.

Is the work clearly and accurately presented and does it cite the current literature?
Yes

Is the study design appropriate and is the work technically sound?
Yes

Are sufficient details of methods and analysis provided to allow replication by others?
Partly

If applicable, is the statistical analysis and its interpretation appropriate?
Yes

Are all the source data underlying the results available to ensure full reproducibility?
No source data required

Are the conclusions drawn adequately supported by the results?
Yes

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

Group Maintenance Decision and Remaining Life Prediction

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Responses (1)

Author Response

08 Jan 2026

khamiss cheikh, Mechanical Engineering, Mohammed V University of Rabat Mohammadia School of Engineering, Rabat, Morocco

Dear Reviewer,
Thank you for your careful evaluation of our manuscript and for the constructive and insightful comments you provided. We greatly appreciate your positive assessment of the relevance of the problem, the soundness of the methodology, and the clarity of the proposed dynamic programming framework. Your remarks have been instrumental in improving the quality, rigor, and positioning of the paper. Below, we respond directly to each of your comments.
1. Absence of a real industrial case study
You correctly point out that the study is based on theoretical simulations and does not include a real industrial case study. We fully agree that real-world applications are essential for validating maintenance optimization models. In this work, our primary objective was to develop a general and analytically consistent decision-making framework that integrates gradual degradation and traumatic failures within a unified dynamic programming formulation. To preserve generality and applicability across different industrial contexts, we intentionally avoided focusing on a single specific application.
That said, we have revised the manuscript to more clearly explain how the proposed policy can be implemented in practical settings such as industrial machinery, infrastructure systems, and transportation assets. We now explicitly discuss how commonly used performance indicators—such as Mean Time Between Failures (MTBF) and total maintenance cost over a given horizon—can be derived from the model and used to compare the proposed policy with existing maintenance strategies. We also clearly identify the application of the framework to real industrial case studies as an important direction for future work.
2. Justification of model parameters and sensitivity analysis
You note that the model parameters are introduced without reference to empirical data and recommend conducting a sensitivity analysis. We acknowledge this point and appreciate its importance. The parameters used in the simulations were chosen to be representative of typical values reported in the reliability and maintenance literature, rather than calibrated to a specific system.
To address your comment, we have revised the manuscript to better justify the parameter selection and to emphasize that the framework is fully parametric and adaptable to different industries and operating conditions. In addition, we have extended the results and discussion sections to include a sensitivity analysis, illustrating how variations in degradation rates, traumatic failure rates, inspection costs, and replacement costs influence the optimal inspection and replacement policy. This analysis demonstrates the robustness of the proposed approach and clarifies how decision rules evolve under different economic and reliability scenarios.
3. Single-system formulation and extension to multi-asset maintenance
You correctly observe that the model focuses on a single system, whereas many industrial environments involve fleets or group maintenance policies. In this study, we deliberately adopted a single-system perspective as a foundational step, allowing us to clearly capture the interaction between degradation processes and traumatic events without additional modeling complexity.
In response to your suggestion, we have added a dedicated discussion describing how the proposed dynamic programming framework could be extended to multi-asset and fleet-level maintenance problems, for example through joint state representations, shared inspection resources, or coordinated replacement decisions. We have also expanded the literature review to include and discuss the cutting-edge studies you recommended on group maintenance optimization, intelligent maintenance decision-making, and load-sharing systems, thereby strengthening the theoretical depth and relevance of the manuscript.
4. Replicability of the methodology
You indicate that the methodological details are only partly sufficient to allow replication. We appreciate this observation. To improve reproducibility, we have revised the methodology section to provide more explicit and detailed descriptions of the system states, transition mechanisms, cost structure, and backward recursion procedure used in the dynamic programming algorithm. As the study is theoretical and does not rely on empirical datasets, all elements required to reproduce the results are now fully specified within the manuscript.
Closing remarks
Once again, we sincerely thank you for your careful reading, constructive critique, and encouraging evaluation of the manuscript. We believe that the revisions made in response to your comments have substantially strengthened the paper in terms of clarity, robustness, and alignment with current research in maintenance optimization. We hope that the revised version adequately addresses your concerns and meets the standards for publication.
Sincerely,
The Authors

Competing Interests

The authors declare that they have no competing interests. No financial, professional, or personal relationships exist that could be construed to influence the judgment of the article’s validity, interpretation of the results, or the conclusions drawn in this work.

Alongside their report, reviewers assign a status to the article:

Approved - the paper is scientifically sound in its current form and only minor, if any, improvements are suggested

Approved with reservations - a number of small changes, and sometimes more significant revisions, are required to address specific details and improve the paper's academic merit.

Not approved - fundamental flaws in the paper seriously undermine the findings and conclusions
