Nonparametric Survival Analysis estimation and comparison with Algorithm

Arkan J.S .AL-Majidi; Enas Abdul Hafedh Mohammed; Sada Faydh Mohammed

doi:10.12688/f1000research.177792.1


Research Article

[version 1; peer review: awaiting peer review]

Arkan J.S .AL-Majidi¹, Enas Abdul Hafedh Mohammed², Sada Faydh Mohammed ²

PUBLISHED 01 Jul 2026

Author details

¹ Al-Karkh University of Science, Baghdad, Baghdad Governorate, Iraq
² Department of Statistics, Faculty of Administration and Economics, University of Kerbala, Karbala, Karbala Governorate, Iraq

Arkan J.S .AL-Majidi
Roles: Investigation, Resources, Writing – Original Draft Preparation

Enas Abdul Hafedh Mohammed
Roles: Conceptualization, Formal Analysis, Project Administration

Sada Faydh Mohammed
Roles: Methodology, Software

This article is included in the Fallujah Multidisciplinary Science and Innovation gateway.

Abstract

Accurate estimation of survival-related probability functions on positive support domains is a fundamental problem in reliability and survival analysis, particularly when data exhibit skewness and boundary effects. This study proposes a flexible nonparametric framework based on asymmetric kernel-family estimation for density, distribution, survival, and hazard functions on (0,∞). Instead of relying on a single kernel, several positive-support kernel families derived from Log-Lindley, Birnbaum–Saunders, and Inverse-Weibull distributions are constructed and compared with benchmark kernels such as Gamma and Inverse-Gaussian kernels. Bandwidth selection is performed using likelihood cross-validation (LCV) and a Silverman-type rule adapted to positive support. The proposed framework is evaluated through simulation studies under multiple distributional scenarios and then applied to real catheterization survival data. Performance is assessed using IMSE, IAE, weighted survival discrepancy measures, and information criteria. The results indicate that asymmetric kernel families substantially reduce boundary bias and provide flexible estimation for skewed survival data. In the real-data application, kernel-based survival estimates closely matched the empirical Kaplan–Meier survival curve, while several parametric competitors exhibited larger discrepancy measures. The findings demonstrate that kernel-family estimation combined with data-driven bandwidth selection offers a robust and practical alternative for nonparametric survival and hazard estimation.

Keywords

nonparametric survival; positive-support KDE; asymmetric kernel families; hazard estimation; cross-validation

Corresponding author: Sada Faydh Mohammed

Competing interests: No competing interests were disclosed.

Grant information: The author(s) declared that no grants were involved in supporting this work.

Copyright: © 2026 J.S .AL-Majidi A et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite: J.S .AL-Majidi A, Abdul Hafedh Mohammed E and Faydh Mohammed S. Nonparametric Survival Analysis estimation and comparison with Algorithm [version 1; peer review: awaiting peer review]. F1000Research 2026, 15:1054 (https://doi.org/10.12688/f1000research.177792.1) First published: 01 Jul 2026, 15:1054 (https://doi.org/10.12688/f1000research.177792.1) Latest published: 01 Jul 2026, 15:1054 (https://doi.org/10.12688/f1000research.177792.1)

1. Introduction

Nonparametric estimation has become an important statistical approach for modeling complex data without imposing restrictive parametric assumptions. In survival and reliability analysis, observed data are frequently positively supported, skewed, and bounded below by zero, which makes flexible estimation methods particularly important. Kernel density estimation (KDE) is one of the most widely used smoothing techniques for estimating unknown probability density functions from observed samples. However, classical symmetric kernels may suffer from substantial boundary bias when applied to positive-support data, especially near zero.

To overcome these limitations, asymmetric kernel estimation methods have been developed using positively supported distributions such as Gamma, Inverse-Gaussian, and related skewed families.¹–³ These kernels improve estimation accuracy in bounded domains and provide better adaptability for skewed survival and reliability data. Recent developments in asymmetric kernel estimation have demonstrated improved performance in survival applications, hazard estimation, and density reconstruction for nonnegative random variables.⁴–⁶

In survival analysis, flexible nonparametric estimation of the survival function and hazard function is essential for accurately representing lifetime behavior without relying on restrictive parametric assumptions. Kernel-based survival estimation provides a useful alternative to classical approaches by combining smoothing flexibility with data-driven estimation.⁷,⁸ In addition, several recent studies have emphasized the importance of transformed survival models and algorithm-based estimation methods in reliability and lifetime analysis.⁹–¹¹

This study adopts a kernel-family framework rather than relying on a single asymmetric kernel. Several positive-support kernel families derived from Log-Lindley, Birnbaum–Saunders, and Inverse-Weibull distributions are constructed and evaluated under unified comparison criteria. The proposed framework extends asymmetric kernel estimation from density estimation to survival and hazard estimation while integrating likelihood cross-validation and Silverman-type bandwidth selection methods. Recent related work and applications can be found in Refs. 12–20. The main contributions of this study can be summarized as follows:

(i) proposing a flexible asymmetric kernel-family framework for positive-support survival estimation;
(ii) extending kernel estimation to density, survival, and hazard function estimation;
(iii) comparing several asymmetric kernel families under unified evaluation criteria;
(iv) integrating data-driven bandwidth selection methods;
(v) evaluating the proposed methodology through simulation studies and real survival data applications.

2. Asymmetric Kernel Families on (0, ∞)

Kernel density estimation (KDE) is one of the most widely used nonparametric techniques for estimating unknown probability density functions. It produces a smooth and flexible estimate of the underlying density by averaging localized kernel functions.²¹ Given observations x₁, x₂, …, x_n with x_i > 0 from a positive-support distribution, the kernel density estimator is defined as:

$\hat{f}_{h} (t) = \frac{1}{n} \sum_{i = 1}^{n} K (t; x_{i}, h), t > 0,$

where K (t; x_i, h) is a nonnegative asymmetric kernel centered around x_i and controlled by the bandwidth parameter h. The kernel integrates to one over (0, ∞), ensuring that the estimator remains a valid density function on positive support. In asymmetric KDE, K depends on x_i so that the kernel adapts locally to the positive support and reduces boundary bias.

The following asymmetric kernel families are considered in this study. Each kernel is defined on the positive semi-axis and parameterized locally through the observation x_i and the bandwidth parameter h.

1. Log-Lindley-based kernel (via exponential transformation)
The Log-Lindley kernel is motivated by the flexibility of the Lindley distribution in modeling skewed positive data and by its analytical tractability near the boundary region.
Start from the Lindley pdf as a baseline:
$f_{L} (z; θ) = \frac{θ^{2}}{θ + 1} (1 + z) e^{- θ z}, z > 0 .$

Using the transformation Y = exp (−Z), where Z follows the Lindley distribution, the induced Log-Lindley density on (0,1) is obtained as:

$g (y; θ) = \frac{θ^{2}}{θ + 1} (1 - \log y) y^{θ - 1}, 0 < y < 1 .$

Define $T_{i} = - x_{i} \log (Y) \in (0, \infty) .$ Since T_i = x_i Z, its density is f_L (t/x_i; θ)/x_i, which gives a convenient Log-Lindley-based kernel:

$K_{LL} (t; x_{i}, h) = \frac{θ_{i}^{2}}{θ_{i} + 1} (1 + \frac{t}{x_{i}}) \exp (- θ_{i} \frac{t}{x_{i}}) \frac{1}{x_{i}}, t > 0 .$

A practical local bandwidth parameterization is adopted through θ_i = 1/h, allowing the kernel shape to adapt according to the smoothing level.

2. Birnbaum–Saunders (fatigue-life) kernel
Let $\phi (\cdot)$ denote the standard normal pdf. For shape parameter $α_{i} > 0$ and scale parameter $β_{i} > 0$,

$K_{BS} (t; x_{i}, h) = \frac{1}{2 α_{i} t} \left( \sqrt{\frac{t}{β_{i}}} + \sqrt{\frac{β_{i}}{t}} \right) \phi \left( \frac{1}{α_{i}} \left[ \sqrt{\frac{t}{β_{i}}} - \sqrt{\frac{β_{i}}{t}} \right] \right), t > 0 .$

A practical local parameterization is $β_{i} = x_{i}, α_{i} = h .$

The Birnbaum–Saunders kernel is suitable for lifetime and fatigue-type data due to its positive support and skewness flexibility.

3. Inverse-Weibull kernel
The Inverse-Weibull kernel is particularly useful for modeling heavy-tailed lifetime behavior and decreasing hazard structures.

For parameters $β_{i} > 0, γ_{i} > 0$,

$K_{IW} (t; x_{i}, h) = β_{i} γ_{i} t^{- (γ_{i} + 1)} \exp (- β_{i} t^{- γ_{i}}), t > 0 .$

These asymmetric kernels provide flexible local smoothing mechanisms while preserving the positive support of survival data. Compared with symmetric kernels, they reduce boundary distortion and improve estimation accuracy near zero.
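The two kernels with an explicit local parameterization in the text (θ_i = 1/h for the Log-Lindley kernel; β_i = x_i, α_i = h for the Birnbaum–Saunders kernel) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names, the sample values, and the trapezoid-rule normalization check are all assumptions introduced here. The Inverse-Weibull kernel is left out because the text does not state how β_i and γ_i depend on x_i and h.

```python
import math

def k_log_lindley(t, x_i, h):
    """Log-Lindley-based kernel with theta_i = 1/h (a Lindley density rescaled by x_i)."""
    theta = 1.0 / h
    z = t / x_i
    return theta ** 2 / (theta + 1.0) * (1.0 + z) * math.exp(-theta * z) / x_i

def k_birnbaum_saunders(t, x_i, h):
    """Birnbaum-Saunders kernel with beta_i = x_i and alpha_i = h."""
    a = math.sqrt(t / x_i)
    b = math.sqrt(x_i / t)
    u = (a - b) / h
    phi = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)  # standard normal pdf
    return (a + b) * phi / (2.0 * h * t)

def kde(t, data, h, kernel):
    """Average of the kernels K(t; x_i, h) over the sample."""
    return sum(kernel(t, x_i, h) for x_i in data) / len(data)

def trapezoid(func, lo, hi, steps):
    """Composite trapezoid rule, used here only to check normalization."""
    width = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for j in range(1, steps):
        total += func(lo + j * width)
    return total * width

sample = [0.3, 0.5, 0.9, 1.2]  # hypothetical positive observations
for kernel in (k_log_lindley, k_birnbaum_saunders):
    mass = trapezoid(lambda t: kde(t, sample, 0.2, kernel), 1e-9, 10.0, 20000)
    print(kernel.__name__, round(mass, 4))
```

Both estimates should integrate to approximately one over (0, ∞), which is the property that makes the resulting estimator a valid density on positive support.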

3. Benchmark Kernel Families

To evaluate the performance of the proposed asymmetric kernel families, several benchmark kernels commonly used in positive-support density estimation are considered for comparison. These include Gamma, Inverse-Gaussian, Lindley-based, and symmetric Epanechnikov-type kernels. Consider the following kernels:

1. Gamma kernel:
$K_{G} (t; k_{i}, θ_{i}) = \frac{t^{k_{i} - 1} e^{- t / θ_{i}}}{Γ (k_{i}) θ_{i}^{k_{i}}}, t > 0$
A practical local parameterization is adopted through:
$k_{i} = \frac{x_{i}}{h} + 1, θ_{i} = h .$
2. Inverse-Gaussian kernel:
The Inverse-Gaussian kernel is suitable for positively skewed lifetime data and provides adaptive smoothing near the boundary.
$K_{IG} (t; μ, λ) = \sqrt{\frac{λ}{2 π t^{3}}} \exp \left( - \frac{λ (t - μ)^{2}}{2 μ^{2} t} \right), t > 0$
3. Symmetric Epanechnikov kernel:
For comparison purposes, a symmetric Epanechnikov kernel is adapted to positive support through a logarithmic transformation.

On the log scale, let $u = (\log t - \log x_{i}) / h$ and

$K_{E} (u) = \frac{3}{4} (1 - u^{2}) 1 (| u | \leq 1) .$

Then the positive-support version is

$K_{E+} (t; x_{i}, h) = \frac{1}{t h} K_{E} (u) .$

The multiplicative factor 1/ t arises from the logarithmic transformation Jacobian and guarantees proper normalization on the positive semi-axis.

These benchmark kernels provide reference models for evaluating the flexibility and estimation performance of the proposed asymmetric kernel-family framework.
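As a hedged illustration of two of these benchmarks, the sketch below codes the Gamma kernel with k_i = x_i/h + 1, θ_i = h and the log-scale Epanechnikov kernel with its 1/t Jacobian factor. The function names, the evaluation grid, and the values x_i = 0.5, h = 0.2 are assumptions chosen for the example. With this Gamma parameterization the kernel mode (k_i − 1)θ_i equals x_i, so each kernel peaks at its own observation, and the script checks that numerically.

```python
import math

def k_gamma(t, x_i, h):
    """Gamma kernel with shape k_i = x_i/h + 1 and scale theta_i = h."""
    k = x_i / h + 1.0
    # Work on the log scale so that large shapes (small h) do not overflow.
    log_pdf = (k - 1.0) * math.log(t) - t / h - math.lgamma(k) - k * math.log(h)
    return math.exp(log_pdf)

def k_log_epanechnikov(t, x_i, h):
    """Epanechnikov kernel on the log scale, including the 1/t Jacobian factor."""
    u = (math.log(t) - math.log(x_i)) / h
    if abs(u) > 1.0:
        return 0.0
    return 0.75 * (1.0 - u * u) / (t * h)

def trapezoid(func, lo, hi, steps):
    """Composite trapezoid rule for the normalization check."""
    width = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for j in range(1, steps):
        total += func(lo + j * width)
    return total * width

x_i, h = 0.5, 0.2  # hypothetical observation and bandwidth
grid = [0.001 * j for j in range(1, 3001)]
gamma_mode = max(grid, key=lambda t: k_gamma(t, x_i, h))
epan_mass = trapezoid(lambda t: k_log_epanechnikov(t, x_i, h), 1e-9, 10.0, 20000)
print("Gamma-kernel mode on the grid:", gamma_mode)
print("log-Epanechnikov mass:", round(epan_mass, 4))
```

The Inverse-Gaussian kernel is not included because the text gives its density in terms of μ and λ without stating how they are tied to x_i and h.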

4. Bandwidth selection

Estimator performance depends strongly on the bandwidth. We employ two complementary approaches: a Silverman-type rule adapted to positive support, used as a pilot scale estimate, and likelihood cross-validation (LCV), which chooses the h that maximizes the leave-one-out log-likelihood

$LCV (h) = \sum_{i = 1}^{n} \log \hat{f}_{- i} (x_{i}; h),$

where $\hat{f}_{- i}$ denotes the kernel estimate computed without the observation x_i.
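One way to carry out the leave-one-out maximization is a plain grid search, sketched below with the Gamma kernel from Section 3. The candidate grid, the sample, the function names, and the small floor inside the logarithm (which guards against log(0)) are assumptions made for this illustration; the paper does not say which optimizer was used.

```python
import math

def k_gamma(t, x_i, h):
    """Gamma kernel with shape x_i/h + 1 and scale h, evaluated on the log scale."""
    k = x_i / h + 1.0
    return math.exp((k - 1.0) * math.log(t) - t / h - math.lgamma(k) - k * math.log(h))

def lcv(h, data, kernel, floor=1e-300):
    """Leave-one-out log-likelihood: each x_i is scored by the other n - 1 points."""
    n = len(data)
    score = 0.0
    for i, x_i in enumerate(data):
        loo = sum(kernel(x_i, x_j, h) for j, x_j in enumerate(data) if j != i) / (n - 1)
        score += math.log(max(loo, floor))  # floor is an assumed numerical guard
    return score

def select_bandwidth(data, kernel, candidates):
    """Return the candidate bandwidth with the largest LCV score."""
    return max(candidates, key=lambda h: lcv(h, data, kernel))

# Hypothetical positive survival times, not the catheterization data.
sample = [0.21, 0.35, 0.42, 0.58, 0.66, 0.75, 0.83, 0.97, 1.08, 1.27]
candidates = [0.01 * j for j in range(1, 51)]
h_star = select_bandwidth(sample, k_gamma, candidates)
print("h* =", h_star, "LCV(h*) =", lcv(h_star, sample, k_gamma))
```

Leaving x_i out matters: if the full-sample estimate were scored at its own data points, the criterion would tend to favor ever smaller bandwidths.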

5. Nonparametric survival and hazard estimation

Once the kernel density estimator $\hat{f} (t)$ is obtained, the corresponding distribution and survival functions can be computed numerically.

\hat{F} (t) = \int_{0}^{t} \hat{f} (u) du, \hat{S} (t) = 1 - \hat{F} (t) .

The corresponding hazard function is estimated as

\hat{h} (t) = \hat{f} (t) / \hat{S} (t) .

In the presence of censored observations, the Kaplan–Meier estimator ${\hat{S}}_{KM} (t)$ is used as a benchmark nonparametric survival estimator and compared with the kernel-based survival estimate. Kernel-based survival estimation provides a smooth alternative to empirical survival estimation and allows flexible representation of lifetime behavior on positive support.
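The passage from density to distribution, survival, and hazard can be sketched as a cumulative trapezoid integration on a grid. In the sketch below the density values would normally be the kernel estimate evaluated on the grid; an exponential density with rate 2 is used instead, purely as an assumed test case whose hazard is known to be constant. The function names and the floor eps on the survival function (mirroring the max(Ŝ, ε) guard in the algorithm that follows) are illustrative choices.

```python
import math

def cumulative_trapezoid(values, grid):
    """Running trapezoid integral of values over grid, starting at 0."""
    out = [0.0]
    for j in range(1, len(grid)):
        out.append(out[-1] + 0.5 * (values[j] + values[j - 1]) * (grid[j] - grid[j - 1]))
    return out

def survival_and_hazard(density, grid, eps=1e-8):
    """Return (F, S, h) on the grid from density values f."""
    cdf = [min(c, 1.0) for c in cumulative_trapezoid(density, grid)]
    surv = [1.0 - c for c in cdf]
    hazard = [f / max(s, eps) for f, s in zip(density, surv)]  # eps avoids division by zero
    return cdf, surv, hazard

# Assumed check case: exponential density with rate 2, so the hazard should stay near 2.
grid = [0.001 * j for j in range(3001)]
density = [2.0 * math.exp(-2.0 * t) for t in grid]
cdf, surv, hazard = survival_and_hazard(density, grid)
print("S(1) =", round(surv[1000], 5), "hazard range:", round(min(hazard), 4), round(max(hazard), 4))
```

The hazard becomes numerically unstable where the survival function approaches zero, which is why the ratio is usually reported only over the range of the observed data.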

Algorithm of Asymmetric Kernel-Family Survival Estimation

Step 1: Preprocess the data and determine the minimum and maximum observed times.

Step 2: Define the evaluation grid T on (min(t), max(t)).

Step 3: For each kernel family j:

(a) Select bandwidth h_j via likelihood cross-validation (or Silverman-type rule).

(b) Compute density ${\hat{f}}_{j} (t)$ on T.

(c) Numerically integrate to obtain ${\hat{F}}_{j} (t)$ and ${\hat{S}}_{j} (t)$ = 1− ${\hat{F}}_{j} (t)$ .

(d) Compute hazard ${\hat{h}}_{j} (t)$ = ${\hat{f}}_{j} (t)$ /max( ${\hat{S}}_{j} (t)$ , ε).

Step 4: Compute Kaplan–Meier survival ${\hat{S}}_{KM} (t)$ (benchmark, when censoring exists).

Step 5: Fit parametric models M_k by MLE under censoring and compute S_k(t), h_k(t).

Step 6: Evaluate kernels and models using multiple criteria (Section 6) and select the best performer.

Step 7: Report tables and figures for $\hat{f}$ (t), $\hat{S}$ (t), and $\hat{h}$ (t).

6. Simulation study

We assess performance under two data-generating scenarios to represent different shapes and tail behaviors (e.g., Gamma-like and Lognormal-like). For each scenario, we consider several sample sizes (e.g., n = 25, 50, 100, 200) and repeat the experiment over R replications. For each replication, we compute the kernel estimates and evaluate them using integrated error measures and predictive (CV) scores.

Recommended criteria, extending beyond ISE: Integrated Absolute Error (IAE), Integrated Mean Squared Error (IMSE), Hellinger distance, and the likelihood cross-validation score (LCV).
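These integrated criteria can be approximated on a grid, as in the hedged sketch below. It assumes the usual definitions: IAE = ∫|f̂ − f|, ISE = ∫(f̂ − f)², IMSE as the average ISE over replications, and the Hellinger distance with the ½ normalization (conventions differ, and the paper does not state which one it uses). The toy densities on [0, 1] are placeholders for an estimated and a true density.

```python
import math

def integrate(values, grid):
    """Trapezoid rule for values tabulated on grid."""
    return sum(0.5 * (values[j] + values[j - 1]) * (grid[j] - grid[j - 1])
               for j in range(1, len(grid)))

def iae(f_hat, f_true, grid):
    """Integrated absolute error."""
    return integrate([abs(a - b) for a, b in zip(f_hat, f_true)], grid)

def ise(f_hat, f_true, grid):
    """Integrated squared error for a single replication."""
    return integrate([(a - b) ** 2 for a, b in zip(f_hat, f_true)], grid)

def hellinger(f_hat, f_true, grid):
    """Hellinger distance, assuming the 1/2 normalization."""
    sq = integrate([(math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(f_hat, f_true)], grid)
    return math.sqrt(0.5 * sq)

def imse(ise_values):
    """IMSE approximated by the mean ISE over R replications."""
    return sum(ise_values) / len(ise_values)

# Toy example: f_hat(t) = 2t against the uniform density on [0, 1].
grid = [0.001 * j for j in range(1001)]
f_hat = [2.0 * t for t in grid]
f_true = [1.0 for _ in grid]
print("IAE:", iae(f_hat, f_true, grid))
print("ISE:", ise(f_hat, f_true, grid))
print("Hellinger:", hellinger(f_hat, f_true, grid))
```

In a simulation study, `ise` would be computed once per replication against the known data-generating density and then averaged with `imse`.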

Table 1 reports the Integrated Mean Squared Error (IMSE) of the estimated density under Scenario A using two bandwidth selectors: a Silverman-type rule and likelihood cross-validation (LCV). For each kernel family, the selected bandwidth values are also reported. Lower IMSE indicates better estimation accuracy. Kernel-family performance under Scenario A varies across bandwidth selection methods, highlighting the impact of the bandwidth choice on estimation accuracy.

Table 1. Performance of kernel families under Scenario A (example structure).

Kernel family	IMSE (Silverman)	IMSE (LCV)	Bandwidth (Silverman)	Bandwidth (LCV)
Log-Lindley	0.00547	0.007451	0.19	0.11
Birnbaum–Saunders	0.05749	0.1749	0.25	0.198
Inverse-Weibull	0.07871	0.09745	0.15	0.1546
Inverse-Gaussian	0.04512	0.05478	0.35	0.1784
Gamma	0.0371	0.0145	0.21	0.14
Lindley	0.0457	0.0398	0.25	0.1
Epanechnikov (sym.)	0.05487	0.044	0.28	0.18

Table 2 reports the Integrated Absolute Error (IAE) under Scenario B using the same two bandwidth selectors (Silverman-type and LCV), together with the selected bandwidth values for each kernel family. Lower IAE indicates better estimation accuracy.

Table 2. Performance of kernel families under Scenario B (example structure).

Kernel family	IAE (Silverman)	IAE (LCV)	Bandwidth (Silverman)	Bandwidth (LCV)
Log-Lindley	0.124	0.0478	0.19	0.11
Birnbaum–Saunders	0.114	0.0145	0.25	0.198
Inverse-Weibull	0.111	0.0241	0.15	0.1546
Inverse-Gaussian	0.154	0.037461	0.35	0.1784
Gamma	0.174	0.0145	0.21	0.14
Lindley	0.146	0.0547	0.25	0.1
Epanechnikov (sym.)	0.117	0.0178	0.28	0.18

7. Application to real survival data

7.1 Numerical results for the real data

Censoring status: all observations correspond to events (δ_i = 1 for all i). Therefore, the Kaplan–Meier (KM) estimator reduces to the empirical survival function ${\hat{S}}_{KM} (t)$ = 1 − ECDF(t). A 95% confidence interval is computed using Greenwood’s formula with the log–log transformation.

Estimated median survival time (KM): 0.75.

KM survival probabilities $\hat{S}$ (t) at selected quantiles of the observed survival times are reported in Table 3.

Table 3. Kaplan–Meier survival estimates at selected time points (all events).

Quantile	Time (t)	KM S(t)	Lower 95%	Upper 95%
0.1	0.22	0.893333	0.831798	0.933249
0.25	0.4225	0.746667	0.66902	0.808699
0.5	0.75	0.493333	0.411118	0.570264
0.75	1.0775	0.253333	0.186895	0.324962
0.9	1.271	0.1	0.0586361	0.154242
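The log–log interval can be illustrated with a short calculation. Without censoring, Greenwood's sum for Var(log Ŝ(t)) telescopes to (1 − Ŝ)/(n Ŝ), and the interval is obtained by transforming back from the log(−log Ŝ) scale. The function name and the choice z = 1.959964 are assumptions of this sketch; applied to the first row of Table 3 (Ŝ = 0.893333, n = 150) it should give limits close to the tabulated ones.

```python
import math

def km_loglog_ci(surv, n, z=1.959964):
    """Greenwood-based 95% interval on the log(-log S) scale, no censoring, 0 < surv < 1."""
    var_log_s = (1.0 - surv) / (n * surv)             # Greenwood sum without censoring
    se = math.sqrt(var_log_s) / abs(math.log(surv))   # delta method for log(-log S)
    lower = surv ** math.exp(z * se)
    upper = surv ** math.exp(-z * se)
    return lower, upper

lower, upper = km_loglog_ci(134.0 / 150.0, 150)
print("S = 0.893333:", round(lower, 4), round(upper, 4))
```

Unlike the plain Greenwood interval, the limits from this transformation always stay inside (0, 1).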

The dataset consists of positive survival times (in the study unit) for patients who underwent catheterization. Since no censoring indicators were provided, the empirical survival is computed as 1 − ECDF, which coincides with the Kaplan–Meier estimator in the absence of censoring.
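The equivalence between Kaplan–Meier and 1 − ECDF when every observation is an event can be checked with a small product-limit sketch. The function names and the five toy survival times (including a tie) are assumptions for illustration only.

```python
def kaplan_meier(times, events):
    """Product-limit estimate as a list of (time, S(time)) at distinct event times."""
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        censored = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 0)
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= deaths + censored
    return curve

def empirical_survival(times, t):
    """1 - ECDF(t): the share of observations strictly greater than t."""
    return sum(1 for ti in times if ti > t) / len(times)

toy_times = [0.2, 0.4, 0.4, 0.9, 1.3]   # hypothetical survival times with one tie
toy_events = [1, 1, 1, 1, 1]            # all events, i.e. no censoring
for t, s in kaplan_meier(toy_times, toy_events):
    print(t, round(s, 6), round(empirical_survival(toy_times, t), 6))
```

With censored observations the two columns would no longer agree, because censored subjects leave the risk set without contributing an event.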

Summary statistics for the positive survival times (n = 150) are reported, including minimum, quartiles, mean, standard deviation, maximum, interquartile range (IQR), skewness, and coefficient of variation (CV). These statistics provide an overview of the scale and dispersion of the real survival dataset used in the application. Descriptive statistics for the real dataset are provided in Table 4.

Table 4. Descriptive statistics of the catheterization survival times.

n	min	Q1	median	mean	std	Q3	max	IQR	skewness	cv
150	0.09	0.4225	0.75	0.7496	0.383	1.0775	1.41	0.655	−0.0005	0.5115

For each asymmetric kernel family, Table 5 reports the bandwidth selected by a Silverman-type rule and by likelihood cross-validation (LCV). The maximized LCV objective value, LCV(h*), is also reported to quantify the cross-validated fit. These bandwidths are used to construct the kernel-based density and survival estimates in the real-data application.

Table 5. Bandwidth selection for asymmetric kernel families (Silverman vs LCV).

Kernel family	h (Silverman)	h (LCV)	LCV(h*)
Gamma kernel	0.149109	0.0191596	−58.0008
Inverse-Gaussian kernel	0.149109	0.137498	−60.6772
Lognormal kernel	0.149109	0.137498	−60.6902

Maximum likelihood estimates (MLEs) are reported for three non-Weibull parametric survival models (Gamma, Lognormal, Log-logistic), along with the log-likelihood (logL), Akaike Information Criterion (AIC), and Bayesian Information Criterion (BIC). Smaller AIC/BIC indicate a better trade-off between goodness-of-fit and model complexity. Parametric competitors and their information-criterion values are reported in Table 6.

Table 6. Parametric model comparison (non-Weibull) using MLE and information criteria.

Parametric model	MLE/Estimates	logL	AIC	BIC
Gamma	k = 2.89134, theta = 0.259257	−71.1432	146.286	152.308
Lognormal	mu = −0.471004, sigma = 0.673754	−82.9566	169.913	175.934
Log-logistic	alpha = 0.67575, beta = 2.62355	−83.5944	171.189	177.21
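The information criteria in Table 6 follow from the log-likelihoods under the conventional definitions AIC = 2k − 2 logL and BIC = k ln n − 2 logL, assumed here with k = 2 parameters per model and n = 150 observations. The sketch below recomputes them from the tabulated logL values; the function name is illustrative.

```python
import math

def aic_bic(log_lik, n_params, n_obs):
    """Return (AIC, BIC) from a maximized log-likelihood."""
    aic = 2.0 * n_params - 2.0 * log_lik
    bic = n_params * math.log(n_obs) - 2.0 * log_lik
    return aic, bic

# logL values taken from Table 6; k = 2 and n = 150 are assumed from the text.
table6_loglik = {"Gamma": -71.1432, "Lognormal": -82.9566, "Log-logistic": -83.5944}
for model, log_lik in table6_loglik.items():
    aic, bic = aic_bic(log_lik, 2, 150)
    print(model, round(aic, 3), round(bic, 3))
```

Because all three models have the same number of parameters, the AIC and BIC rankings coincide with the ranking by log-likelihood.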

Table 7 compares kernel-family survival estimates against the empirical survival function 1 − ECDF(t) (equivalent to KM with no censoring). For each kernel family, the LCV-selected bandwidth h*, the LCV log-likelihood, and several discrepancy measures between the estimated and empirical survival curves are reported (weighted ISE, ISE, and IAE). The mean hazard (grid average) is included as a descriptive summary of the estimated hazard level over the evaluation grid. Lower error measures indicate closer agreement with the empirical survival.

Table 7. Kernel-family survival comparison vs empirical survival (1 − ECDF).

Kernel family	h* (LCV)	LCV log-likelihood	Weighted ISE on S(t)	ISE on S(t)	IAE on S(t)	Mean hazard (grid avg)
Inverse-Gaussian kernel	0.137498	−60.6772	0.000332559	0.00036475	0.0177397	2.67995
Lognormal kernel	0.137498	−60.6902	0.000336311	0.00036852	0.0178502	2.6746
Gamma kernel	0.0191596	−58.0008	0.000418933	0.000437512	0.019186	2.70542

Table 8 compares fitted parametric survival models (Gamma, Log-logistic, Lognormal) against the empirical survival 1 − ECDF(t). Discrepancy is quantified using weighted ISE, ISE, and IAE computed over the evaluation grid. Lower values indicate improved agreement with the empirical survival curve.

Table 8. Parametric survival comparison vs empirical survival (1 − ECDF).

Parametric model	Weighted ISE on S(t)	ISE on S(t)	IAE on S(t)
Gamma	0.00525944	0.004153	0.0680224
Log-logistic	0.00666845	0.00543173	0.0760906
Lognormal	0.0093188	0.00740298	0.0905809

A real survival dataset (survival times) is used to illustrate the proposed methodology. We estimate the density and the survival function using the best-performing asymmetric kernel family and compare it with:

• Kaplan–Meier estimator (nonparametric survival benchmark).
• A selected parametric model (e.g., Lognormal or Log-logistic) fitted by MLE (non-Weibull).

Evaluation focuses on survival-level discrepancies and predictive performance rather than relying only on classical goodness-of-fit tests.

7.2 Figures

Figures 1–4 summarize the real-data application of the proposed positive-support kernel-family framework. We present kernel-based density estimates under different bandwidth selection strategies and compare the resulting fitted curves with the empirical distribution of the data. In addition, we report normalized error/predictive measures to quantify performance across kernels and bandwidth selectors, and we compare survival curves to evaluate how well the nonparametric estimators reproduce the empirical survival pattern. Together, these figures illustrate the impact of bandwidth selection (Silverman vs LCV), the differences between kernel families on positive support, and the resulting consequences for density and survival estimation.

Figure 1. Kernel-family density estimates for the catheterization data using both Silverman’s rule and likelihood cross-validation (LCV) bandwidths. The histogram represents the empirical distribution, while the solid curves correspond to asymmetric kernel estimates.

Figure 2. Kernel-family density estimates for the real data using the LCV-selected bandwidth, highlighting differences among positive-support (asymmetric) kernel families.

Figure 3. Normalized error and predictive measures across kernel families and bandwidth selectors, including IMSE and IAE (computed against a lognormal reference fit on a dense grid) and the LCV log-likelihood (higher values indicate better fit).

Figure 4. Survival function comparison for the real data. The empirical survival (1 − ECDF; equivalent to Kaplan–Meier with no censoring) is contrasted with the best-performing kernel-family survival estimate and the best parametric survival model selected by information criteria.

8. Conclusions

This paper provided a kernel-family system for positive-support nonparametric estimation and applied it to survival evaluation by estimating survival and hazard functions. In contrast to single-kernel methods, the family-based design enables practitioners to choose kernels that correspond to the data’s tail characteristics and boundary behavior. An efficient, data-driven method for choosing bandwidth is likelihood cross-validation. Comparing kernel-based survival with Kaplan-Meier and non-Weibull parametric models in real survival analysis reveals the useful trade-off between interpretability/parsimonious structure (parametric) and flexibility (nonparametric).

All tables have been labeled sequentially (Tables 1–8), cited in the text, and provided with complete.

Data availability

Underlying data (Raw data)

Repository name: Data and code for: Nonparametric Survival Analysis estimation and comparison with Algorithm. https://doi.org/10.5281/zenodo.18827908.⁶

The project contains the following underlying data:

• My data of Nonparametric Survival Analysis.xlsx (raw survival times/primary dataset and source data for the reported results; includes sheets “real data” and “Table 1”–“Table 8” containing the values behind analyses and tables).

Extended data

Repository name: Data and code for: Nonparametric Survival Analysis estimation and comparison with Algorithm. https://doi.org/10.5281/zenodo.18827908.⁶

This project contains the following extended data:

• figure 1. jpg (Figure 1).
• figure 2. jpg (Figure 2).
• figure 3. jpg (Figure 3).
• figure 4. jpg (Figure 4).

Data are available under the terms of the Creative Commons Zero “No rights reserved” data waiver (CC0 1.0 Universal) license.

References

1. Bouezmarni T, Scaillet O: Consistency of asymmetric kernel density estimators and smoothed histograms with application to income data. Econometr. Theory. 2005; 21(2): 390–412.
2. Markovich LA: Gamma kernel estimation of the density derivative on the positive semi-axis by dependent data. arXiv preprint arXiv:1502.02373. 2015.
3. Scaillet O: Density estimation using inverse and reciprocal inverse Gaussian kernels. J. Nonparametr. Stat. 2004; 16(1–2): 217–226.
4. Belzile LR, Desgagné A, Genest C, et al.: Normal approximations for the multivariate inverse Gaussian distribution and asymmetric kernel smoothing on d-dimensional half-spaces. arXiv preprint arXiv:2209.04757. 2022.
5. Hanif M: A nonparametric approach to the estimation of jump-diffusion models with asymmetric kernels. Cogent Mathematics. 2016; 3(1): 1179247.
6. AL-Majid AJS, Abdul Hafedh Mohammed E, Faydh Mohammed S: Data and code for: Nonparametric Survival Analysis estimation and comparison with Algorithm. [Data set]. Zenodo. 2026.
7. Al-Azzawi SF, Al-Kadim KA: A Transmuted Survival Model with Application. J. Phys. Conf. Ser. 2021; 1897(1): 012020.
8. Al-Azzawi SF, Al-Kadim KA: Using Survival Function and Transmuted Formula to Produce Lifetime Models with Application on Real Data Set. AIP Conf. Proc. 2023; 2457: 020017.
9. Al-Azzawi SF, Al-Kadim KA: Additive Weibull Model: An Application of Real Data Set. AIP Conf. Proc. 2023; 2414: 040019.
10. Al-Azzawi SF, Mohammed EAH, Resen IA: Simulation of estimation of transformed semicircular gamma distribution parameters with algorithm. AIP Conf. Proc. 2024; 3229(1): 080034.
11. Madloom MA, Mohammed EAH, Al-Azzawi SF, et al.: On DUS transformation Lindley distribution: An application on real data sets with algorithm and MATLAB code. AIP Conf. Proc. 2026; 3393(1): 060040.
12. Bareche A, Aïssani D: Kernel density in the study of the strong stability of the M/M/1 queueing system. Oper. Res. Lett. 2008; 36(5): 535–538.
13. Ghitany ME, Atieh B, Nadarajah S: Lindley distribution and its application. Math. Comput. Simul. 2008; 78(4): 493–506.
14. Jones MC, Marron JS, Sheather SJ: A brief survey of bandwidth selection for density estimation. J. Am. Stat. Assoc. 1996; 91(433): 401–407.
15. Maiti SS, Mukherjee I: Some estimators of the PDF and CDF of the Lindley distribution. arXiv preprint arXiv:1604.06308. 2016.
16. Maswadah M: Kernel inference on the Weibull distribution. Proc. 3rd Natl. Stat. Conf., Lahore, Pak. 2007; 14: 77–86.
17. Samiuddin M, El-Sayyad GM: On nonparametric kernel density estimates. Biometrika. 1990; 77(4): 865–874.
18. Silverman BW: Density Estimation for Statistics and Data Analysis. Routledge; 2018.
19. Al-Sabbah SAS, Al-Azzawi SF, Mezher ZK: Solving the Multi Collinearity Problem Using Inequality Constraints Ridge Regression with Algorithms. AIP Conf. Proc. 2025; 3264(1): 050069.
20. Yang F, Yue Z: Kernel density estimation of three-parameter Weibull distribution with neural network and genetic algorithm. Appl. Math. Comput. 2014; 247: 803–814.
21. Silverman BW: Density Estimation for Statistics and Data Analysis. Chapman & Hall; 1986.


