Keywords
Odd log-logistic generalized exponential distribution, maximum-likelihood estimation, generating functions, moments, simulation and order statistics.
The development of new generalized classes of distributions has attracted applied and theoretical statisticians owing to their flexibility. Generalizing a distribution aims to make it more flexible and better suited to the available data. In this decade, many authors have developed new classes of distributions that are valuable to applied researchers.
This study aims to develop the odd log-logistic generalized exponential distribution (OLLGED), a newly generated lifetime distribution. The advantage of the new distribution is its ability to model heavy-tailed lifetime data. Most of its probabilistic properties are derived, including generating functions, moments, quantiles, and order statistics.
The model parameters are estimated by the maximum likelihood method, and the performance of the estimators is studied through simulation. The flexibility of the OLLGED is illustrated using two real data sets, while the performance of the estimators is assessed on randomly simulated data sets.
The application and flexibility of the OLLGED are ensured through empirical observation using two sets of lifetime data, establishing that the proposed OLLGED can provide a better fit in comparison to existing rival models, such as odd generalized log-logistic, type-II generalized log-logistic, exponential distributions, odd exponential log-logistic, generalized exponential, and log-logistic.
The revised manuscript version contains the following changes:
Introduction
We have provided some demerits of the distribution and some more explanation about Figure 1. The physical interpretation of shape parameter is also provided.
Data Analysis
Figures 7 and 8 are reconstructed by adding PP plots as suggested by the reviewer. Moreover, we have added some more explanation about real data applications.
To meet the needs of applied statistics in fields such as economics, education, engineering, geology, health, and many others, as well as in the development of models and analyses for lifetime data, a number of probability distributions have been developed. However, these distributions have not been able to fit every kind of data. As a result, there has always been room for researchers to develop new distributions to model day-to-day lifetime data. The development of new generalized classes of distributions has attracted applied and theoretical statisticians owing to their flexibility. Generalizing a distribution aims to make it more flexible and better suited to the available data. In this decade, many authors have developed new classes of distributions that are valuable to applied researchers. Methods for developing new distributions are numerous in the literature. Generalization of probability distributions was initially introduced1 where the authors generalized the Weibull distribution; the result was named the exponentiated Weibull distribution, which is common in modeling lifetime data.2 Later, a model for failure time data was developed3 using Lehmann-type alternatives, in which the baseline distribution is raised to a power (the exponentiated form). Later on, the two-parameter generalized exponential distribution (GED) was developed,4 also called the exponentiated exponential distribution. For more details on GED, refer to Refs. 5, 6. Due to its importance in statistical inference and reliability applications, numerous authors have studied the various properties of this distribution.5,7–14 It has been shown that the GED is an excellent substitute for the gamma, log-normal and Weibull distributions.
The motive for extending distributions for modeling lifetime data is the capacity to simulate both monotonically and non-monotonically growing, decreasing, and constant failure rates, or more critically with bathtub shaped failure rates, even if the baseline failure rate is monotonic. The fundamental justifications for implementing a new distribution model in practice are as follows: to create tail weight distributions for modeling various real data sets, to generate distributions with negative, positive, and symmetric skewness, to define special models with all varieties of hazard rate functions, to make the kurtosis more flexible than the baseline distribution, and to consistently produce better fits than other generated distributions with the same underlying model.
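A random variable X is said to have the GED, hereafter referred to as the baseline distribution, with shape parameter α and scale parameter λ if its probability density function (PDF) and cumulative distribution function (CDF) are given, respectively, by

$$g(x; \alpha, \lambda) = \alpha \lambda e^{-\lambda x} \left(1 - e^{-\lambda x}\right)^{\alpha - 1}, \quad x > 0,\ \alpha, \lambda > 0, \tag{1}$$

$$G(x; \alpha, \lambda) = \left(1 - e^{-\lambda x}\right)^{\alpha}. \tag{2}$$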
On the other hand, the beta distribution was generalized under the name of the generalized beta distribution; for more details refer to Ref. 15. This was developed further into the generalized beta-generated (GBG) distribution, with a total of three parameters.16 There are many other generalization methods in the literature, depending on the nature of the data in hand.17 The researchers intend to introduce a new distribution, named the odd log-logistic generalized exponential distribution (OLLGED), to model heavy-tailed day-to-day data sets.18
The OLLGED is a generalization of the exponential distribution obtained by adding two parameters, giving a total of three: lambda (λ), the only scale parameter, and alpha (α) and gamma (γ), which are shape parameters introduced by the generalization procedures. These make the distribution more flexible, enabling the OLLGED to be applied to lifetime data and extended to acceptance sampling plans and quality control charts.19,20
This paper aims to define and study a new lifetime model, namely the OLLGED. Wide-ranging statistical properties and applications to real data sets are given. More works on OLLGED have been presented.21,22 The proposed distribution contains several lifetime distributions, such as the GED.23–25 The OLLGED is introduced here for the following reasons:
1. It comprises a number of the well-known lifetime distributions mentioned above as particular cases;
2. The OLLGED exhibits hazard rate shapes that are monotonically decreasing, increasing, J, reversed-J, bathtub, and upside-down bathtub, which makes the proposed model more flexible than other lifetime distributions in hand;
3. It provides special models that are capable of modeling skewed lifetime data and can be used in various areas of application;
4. From the studies in Section 2, the OLLGED is constructed with the GED as its baseline distribution6;
5. Asymmetric data that may not be well fitted by other regular distributions may be fitted properly by the proposed model;
6. The OLLGED outperforms numerous competitor distributions in two real data illustrations; and
7. The main drawback of this model, as of any model with several parameters, is that convergence can be a problem when estimating the parameters in simulation studies. Model validation can also be very difficult because of the larger number of parameters.
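The class of distributions called the OLL-G family (generalized log-logistic-G family), obtained by adding one more shape parameter γ > 0 to a baseline distribution, was introduced.22 For a baseline CDF G(x) with PDF g(x), the PDF and CDF of the OLL-G family are as follows:

$$f(x; \gamma) = \frac{\gamma\, g(x)\, G(x)^{\gamma - 1} \left[1 - G(x)\right]^{\gamma - 1}}{\left\{G(x)^{\gamma} + \left[1 - G(x)\right]^{\gamma}\right\}^{2}}, \tag{3}$$

$$F(x; \gamma) = \frac{G(x)^{\gamma}}{G(x)^{\gamma} + \left[1 - G(x)\right]^{\gamma}}. \tag{4}$$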
The remaining sections of this article are organized as follows. In Section 2, special models associated with the OLLGED are explained. In Section 3, useful expansions and properties of the OLLGED are derived. Section 4 discusses the estimation of the parameters. In Section 5, a simulation study is carried out for various parameter values of the proposed distribution. Data analysis is carried out using two lifetime data sets in Section 6, and Section 7 presents the discussion and conclusion.
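Using equations (1) and (2) in equations (3) and (4), we obtain the OLL-G family with the GED as baseline distribution, which is named the OLLGED. The PDF and CDF of the OLLGED are given by

$$f(x; \alpha, \gamma, \lambda) = \frac{\alpha \gamma \lambda\, e^{-\lambda x} \left(1 - e^{-\lambda x}\right)^{\alpha\gamma - 1} \left[1 - \left(1 - e^{-\lambda x}\right)^{\alpha}\right]^{\gamma - 1}}{\left\{\left(1 - e^{-\lambda x}\right)^{\alpha\gamma} + \left[1 - \left(1 - e^{-\lambda x}\right)^{\alpha}\right]^{\gamma}\right\}^{2}}, \quad x > 0, \tag{5}$$

$$F(x; \alpha, \gamma, \lambda) = \frac{\left(1 - e^{-\lambda x}\right)^{\alpha\gamma}}{\left(1 - e^{-\lambda x}\right)^{\alpha\gamma} + \left[1 - \left(1 - e^{-\lambda x}\right)^{\alpha}\right]^{\gamma}}. \tag{6}$$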
Here α and γ are shape parameters and λ is a scale parameter of the distribution. Henceforth, if a random variable X follows the OLLGED with shape parameters α and γ and scale parameter λ, it is denoted as X ~ OLLGED(α, γ, λ).
The OLLGED is a flexible distribution that yields several distributions for particular parameter values. It contains the following models:
i) When γ = 1, the resulting distribution becomes GED.6
ii) When α = 1, the resulting distribution becomes the odd log-logistic exponential distribution.
iii) When α = 1 and γ = 1, the resulting distribution becomes an ED.
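The construction and its special cases can be checked numerically. The following is a minimal Python sketch, not the authors' code (the paper reports using R). It assumes the standard OLL-G form F = G^γ / (G^γ + (1 − G)^γ) with GED baseline G(x) = (1 − e^{−λx})^α; the function names are hypothetical.

```python
import math

def ged_cdf(x, alpha, lam):
    """Baseline GED CDF, (1 - exp(-lam * x)) ** alpha, for x > 0."""
    return (1.0 - math.exp(-lam * x)) ** alpha

def ged_pdf(x, alpha, lam):
    """Baseline GED PDF."""
    z = 1.0 - math.exp(-lam * x)
    return alpha * lam * math.exp(-lam * x) * z ** (alpha - 1.0)

def ollged_cdf(x, alpha, gamma, lam):
    """Assumed OLL-G CDF with GED baseline: G^gamma / (G^gamma + (1 - G)^gamma)."""
    G = ged_cdf(x, alpha, lam)
    return G ** gamma / (G ** gamma + (1.0 - G) ** gamma)

def ollged_pdf(x, alpha, gamma, lam):
    """Assumed OLL-G PDF with GED baseline (derivative of ollged_cdf)."""
    G = ged_cdf(x, alpha, lam)
    g = ged_pdf(x, alpha, lam)
    denom = (G ** gamma + (1.0 - G) ** gamma) ** 2
    return gamma * g * G ** (gamma - 1.0) * (1.0 - G) ** (gamma - 1.0) / denom

if __name__ == "__main__":
    x = 1.3
    # gamma = 1 should reproduce the GED; alpha = gamma = 1 the exponential.
    print(ollged_cdf(x, 2.0, 1.0, 0.5), ged_cdf(x, 2.0, 0.5))
    print(ollged_cdf(x, 1.0, 1.0, 0.5), 1.0 - math.exp(-0.5 * x))
```

Keeping the baseline functions separate from the OLL-G transform makes it easy to swap in a different baseline distribution.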
Figure 1 displays the PDF and Figure 2 the CDF of the OLLGED for various parameter values. Figures 1 and 2 reveal that the OLLGED is flexible and produces distributions of different shapes, namely symmetrical, reversed-J, and left- and right-skewed, which makes it suitable for lifetime data with a variety of characteristics. The shape parameters have the greatest influence on the nature of the curve: for small values of the shape parameters the density is reversed-J shaped, whereas for larger values the curve gradually increases and then gradually decreases.
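The survival function S(x) and hazard rate function h(x) of the OLLGED are, respectively, given below:

$$S(x) = \frac{\left[1 - \left(1 - e^{-\lambda x}\right)^{\alpha}\right]^{\gamma}}{\left(1 - e^{-\lambda x}\right)^{\alpha\gamma} + \left[1 - \left(1 - e^{-\lambda x}\right)^{\alpha}\right]^{\gamma}}, \tag{7}$$

$$h(x) = \frac{\alpha \gamma \lambda\, e^{-\lambda x} \left(1 - e^{-\lambda x}\right)^{\alpha\gamma - 1}}{\left[1 - \left(1 - e^{-\lambda x}\right)^{\alpha}\right] \left\{\left(1 - e^{-\lambda x}\right)^{\alpha\gamma} + \left[1 - \left(1 - e^{-\lambda x}\right)^{\alpha}\right]^{\gamma}\right\}}. \tag{8}$$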
The survival functions and hazard rates of the OLLGED for various parameter values are presented in Figures 3 and 4. Figures 3 and 4 show that this family can generate hazard shapes such as increasing, decreasing, reversed-J, constant, and upside-down bathtub. This shows that the OLLGED could be extremely practical for fitting data sets of diverse shapes.
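A short Python sketch of the survival and hazard functions can help in exploring these shapes. It assumes S(x) = 1 − F(x) and h(x) = f(x)/S(x) under the OLL-G construction with GED baseline; the function names and the parameter values in the usage loop are illustrative choices, not values taken from the paper's figures.

```python
import math

def ollged_survival(x, alpha, gamma, lam):
    """S(x) = (1 - G)^gamma / (G^gamma + (1 - G)^gamma), with G the GED CDF."""
    G = (1.0 - math.exp(-lam * x)) ** alpha
    return (1.0 - G) ** gamma / (G ** gamma + (1.0 - G) ** gamma)

def ollged_hazard(x, alpha, gamma, lam):
    """h(x) = f(x) / S(x), simplified algebraically."""
    z = 1.0 - math.exp(-lam * x)
    G = z ** alpha
    denom = (1.0 - G) * (G ** gamma + (1.0 - G) ** gamma)
    return alpha * gamma * lam * math.exp(-lam * x) * z ** (alpha * gamma - 1.0) / denom

if __name__ == "__main__":
    # Illustrative parameter combinations only.
    for alpha, gamma in [(0.5, 0.5), (1.0, 1.0), (2.0, 3.0)]:
        grid = [0.25 * k for k in range(1, 9)]
        print(alpha, gamma, [round(ollged_hazard(x, alpha, gamma, 1.0), 3) for x in grid])
```

For α = γ = 1 the hazard reduces to the constant λ, which gives a quick sanity check.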
Using Taylor series, specifically the binomial series expansion, the PDF and CDF of the OLLGED in equations (5) and (6) can be rewritten in alternative forms. The CDF of the OLLGED can be written using the binomial expansion of its terms, as derived in Ref. 20: the components of the CDF are first expressed in the simplified form of equation (9), which is then substituted into equation (6) to obtain the CDF in equation (10):
Whereas, .
The generalized binomial expansion is considered for :
Where .
Thus, the CDFs of the OLLGED can be expressed as follows:
Where .
The following expression is for the ratio of the two-power series:
Where and the coefficients of CK for are determined from the recurrence generator which is given as:
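The recurrence for a ratio of two power series can be sketched in Python. This uses the standard identity that if Σ a_k x^k / Σ b_k x^k = Σ c_k x^k with b_0 ≠ 0, then c_k = (a_k − Σ_{r=1}^{k} b_r c_{k−r}) / b_0. The symbols a_k, b_k, c_k and the function name are assumptions, and the paper's own coefficients are not reproduced here.

```python
def power_series_ratio(a, b):
    """Coefficients c_k of sum(a_k x^k) / sum(b_k x^k), truncated to len(a) terms.

    Uses c_k = (a_k - sum_{r=1..k} b_r * c_{k-r}) / b_0, which requires b_0 != 0.
    Coefficients of b beyond its length are treated as zero.
    """
    if not b or b[0] == 0:
        raise ValueError("b[0] must be non-zero")
    c = []
    for k in range(len(a)):
        acc = a[k]
        for r in range(1, k + 1):
            if r < len(b):
                acc -= b[r] * c[k - r]
        c.append(acc / b[0])
    return c

if __name__ == "__main__":
    # 1 / (1 - x) expands to the geometric series 1 + x + x^2 + ...
    print(power_series_ratio([1, 0, 0, 0, 0], [1, -1, 0, 0, 0]))
```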
The quantile function of the OLLGED is obtained by inverting its CDF. Recall that the quantile function of a probability distribution is given by

$$Q(u) = F^{-1}(u), \quad 0 < u < 1.$$

Inserting equation (10) in equation (18) and solving for the variable x, we get

$$x_u = Q(u) = -\frac{1}{\lambda} \log\left\{1 - \left[\frac{u^{1/\gamma}}{u^{1/\gamma} + (1 - u)^{1/\gamma}}\right]^{1/\alpha}\right\}.$$

Upon substituting the appropriate value of u, we obtain the corresponding quantile x_u.
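Because the CDF can be inverted in closed form, random samples can be drawn by the inverse-transform method. The sketch below assumes the OLL-G CDF with GED baseline, so that G = u^{1/γ} / (u^{1/γ} + (1 − u)^{1/γ}) and x = −log(1 − G^{1/α}) / λ. The function names are hypothetical and the sampler is an illustration rather than the routine used in the paper.

```python
import math
import random

def ollged_cdf(x, alpha, gamma, lam):
    """Assumed OLLGED CDF: G^gamma / (G^gamma + (1 - G)^gamma), G the GED CDF."""
    G = (1.0 - math.exp(-lam * x)) ** alpha
    return G ** gamma / (G ** gamma + (1.0 - G) ** gamma)

def ollged_quantile(u, alpha, gamma, lam):
    """Closed-form inverse of ollged_cdf for 0 < u < 1."""
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie strictly between 0 and 1")
    a = u ** (1.0 / gamma)
    b = (1.0 - u) ** (1.0 / gamma)
    G = a / (a + b)  # value of the baseline CDF at the quantile
    return -math.log(1.0 - G ** (1.0 / alpha)) / lam

def ollged_sample(n, alpha, gamma, lam, seed=None):
    """Draw n variates by inverse-transform sampling."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n:
        u = rng.random()
        if u > 0.0:  # random() can return exactly 0.0, outside the open interval
            draws.append(ollged_quantile(u, alpha, gamma, lam))
    return draws

if __name__ == "__main__":
    print(ollged_quantile(0.5, 2.0, 0.7, 0.5))
    print(ollged_sample(5, 2.0, 0.7, 0.5, seed=1))
```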
The moment for the OLLGED is given as:
Since
Now consider .
Then we obtain equation (17) as follows:
Where
Moment generating function for the OLLGED is derived in the following manner:
Where .
Since the moments cannot be obtained easily in closed form, skewness and kurtosis can instead be evaluated by the quantile-based methods available in the literature. Some of the well-known methods are Galton's skewness and Moors' kurtosis,26 both of which use the octiles of the distribution.
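Galton's skewness of the distribution is defined in terms of the octiles as follows:

$$S = \frac{Q(6/8) - 2Q(4/8) + Q(2/8)}{Q(6/8) - Q(2/8)},$$

where Q(·) is the quantile function.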
Thus, by varying the distributional parameters, various values of skewness can be obtained; Figure 5 displays a 3-dimensional plot of the skewness of the distribution. From Figure 5 it is evident that the skewness decreases as both parameters increase.
For kurtosis, Moors' method is used, which is also based on octiles and is given by:

$$K = \frac{Q(7/8) - Q(5/8) + Q(3/8) - Q(1/8)}{Q(6/8) - Q(2/8)}.$$
A 3-dimensional plot for varying values of the distributional parameters is presented in Figure 6. From Figure 6 it is clear that the kurtosis decreases as both parameters increase. The moments, skewness, and kurtosis for various parameter combinations are given in Table 1. When we fix the parameter λ, the skewness and kurtosis of the OLLGED increase as α and γ increase. More specifically, as the parameter values increase, the skewness becomes negative and the kurtosis becomes mesokurtic.
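The octile-based measures are straightforward to compute once a quantile function is available. The following Python sketch uses the usual definitions of Galton's skewness and Moors' kurtosis together with the closed-form OLLGED quantile implied by the OLL-G construction. The function names and the parameter values in the usage example are hypothetical and do not reproduce Table 1.

```python
import math

def ollged_quantile(u, alpha, gamma, lam):
    """Closed-form quantile of the assumed OLLGED CDF, for 0 < u < 1."""
    a = u ** (1.0 / gamma)
    b = (1.0 - u) ** (1.0 / gamma)
    G = a / (a + b)
    return -math.log(1.0 - G ** (1.0 / alpha)) / lam

def galton_skewness(q):
    """Galton skewness from a quantile function q(u)."""
    return (q(6 / 8) - 2 * q(4 / 8) + q(2 / 8)) / (q(6 / 8) - q(2 / 8))

def moors_kurtosis(q):
    """Moors kurtosis from a quantile function q(u)."""
    return (q(7 / 8) - q(5 / 8) + q(3 / 8) - q(1 / 8)) / (q(6 / 8) - q(2 / 8))

if __name__ == "__main__":
    # Illustrative parameter values only.
    q = lambda u: ollged_quantile(u, 2.0, 1.5, 1.0)
    print(galton_skewness(q), moors_kurtosis(q))
```

Both measures are ratios of quantile differences, so they do not depend on the scale parameter λ.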
For the residual life, moment is generally given as, , which is uniquely determined for the cumulative function . Assuming X to be a random lifetime variable with then the residual life moment is obtained as .
Many other functions are derived from the residual life moments, such as the mean residual life function (MRLF), or life expectation at time t, defined by

$$m_1(t) = E(X - t \mid X > t);$$

this represents the expected additional life length for a unit that is alive at time t.
The reversed residual life moment is generally defined as, only defined for and , then, can be used to determine uniquely .
Thus, the mean inactivity time (MIT), also referred to as the mean waiting time (MWT) or mean reversed residual lifetime, is given by

$$M_1(t) = E(t - X \mid X \le t),$$

which is the waiting time elapsed since the failure of an item, given that the failure occurred in (0, t).
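When the series expressions are inconvenient, the mean residual life and mean inactivity time can be approximated numerically. The sketch below uses the standard identities MRL(t) = ∫_t^∞ S(x) dx / S(t) and MIT(t) = ∫_0^t F(x) dx / F(t), with composite Simpson integration. The truncation point `upper` is a heuristic assumption, and all function names are hypothetical.

```python
import math

def ollged_cdf(x, alpha, gamma, lam):
    """Assumed OLLGED CDF: G^gamma / (G^gamma + (1 - G)^gamma), G the GED CDF."""
    G = (1.0 - math.exp(-lam * x)) ** alpha
    return G ** gamma / (G ** gamma + (1.0 - G) ** gamma)

def simpson(func, a, b, steps=20000):
    """Composite Simpson rule on [a, b]; an odd step count is bumped to even."""
    if steps % 2:
        steps += 1
    h = (b - a) / steps
    total = func(a) + func(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def mean_residual_life(t, alpha, gamma, lam, upper=None):
    """Approximate E(X - t | X > t) by integrating the survival function."""
    surv = lambda x: 1.0 - ollged_cdf(x, alpha, gamma, lam)
    if upper is None:
        # Heuristic cut-off: the tail decays roughly like exp(-gamma * lam * x).
        upper = t + 50.0 / (lam * min(1.0, gamma))
    return simpson(surv, t, upper) / surv(t)

def mean_inactivity_time(t, alpha, gamma, lam):
    """Approximate E(t - X | X <= t) by integrating the CDF on (0, t)."""
    cdf = lambda x: ollged_cdf(x, alpha, gamma, lam)
    return simpson(cdf, 0.0, t) / cdf(t)

if __name__ == "__main__":
    print(mean_residual_life(1.0, 2.0, 1.5, 1.0))
    print(mean_inactivity_time(1.0, 2.0, 1.5, 1.0))
```

In the exponential special case (α = γ = 1) the mean residual life equals 1/λ for every t, which gives a convenient check on the integration.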
In practice, many events occur in chronological order, either ascending or descending, and properties such as the CDF and PDF can be written to take this ordering into account. Order statistics describe the ordered values of a random sample. Suppose that X1, X2, …, Xn is a random sample from the OLLGED, and let X1:n ≤ X2:n ≤ … ≤ Xn:n denote the ordered random variables in ascending order. The PDF of the jth order statistic, say Xj:n, is given in equation (24):

$$f_{j:n}(x) = \frac{1}{B(j, n - j + 1)}\, f(x)\, [F(x)]^{j - 1}\, [1 - F(x)]^{n - j}, \tag{24}$$

where B(·, ·) is the beta function.
Upon substitution of equations (9) and (10) in equation (24) we get the following expression:
Where denotes the probability density function for OLLGED having r+k+1 power parameter.
Where, , hence the quantity is obtained recursively by and for values of .
Therefore, the density function of the OLLGED order statistics is a combination of GED. Based on , it is noted that the properties of follow from the properties of . Thus, the moment of can be expressed as:
Consider moment in equation (25) for the derivation of explicit expression for L-moments of X as infinite weighted linear combinations of suitable OLLGED order statistics defined as a linear function as:
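The density of the jth order statistic can also be evaluated directly from the parent PDF and CDF, without the series representation. The Python sketch below uses the standard formula f_{j:n}(x) = f(x) F(x)^{j−1} [1 − F(x)]^{n−j} / B(j, n − j + 1), with the OLLGED PDF and CDF assumed from the OLL-G construction; the function names are hypothetical.

```python
import math

def ollged_cdf(x, alpha, gamma, lam):
    """Assumed OLLGED CDF with GED baseline."""
    G = (1.0 - math.exp(-lam * x)) ** alpha
    return G ** gamma / (G ** gamma + (1.0 - G) ** gamma)

def ollged_pdf(x, alpha, gamma, lam):
    """Assumed OLLGED PDF with GED baseline."""
    z = 1.0 - math.exp(-lam * x)
    G = z ** alpha
    g = alpha * lam * math.exp(-lam * x) * z ** (alpha - 1.0)
    denom = (G ** gamma + (1.0 - G) ** gamma) ** 2
    return gamma * g * G ** (gamma - 1.0) * (1.0 - G) ** (gamma - 1.0) / denom

def order_stat_pdf(x, j, n, alpha, gamma, lam):
    """Density of the j-th smallest of n i.i.d. OLLGED variates."""
    if not 1 <= j <= n:
        raise ValueError("need 1 <= j <= n")
    F = ollged_cdf(x, alpha, gamma, lam)
    f = ollged_pdf(x, alpha, gamma, lam)
    inv_beta = j * math.comb(n, j)  # 1 / B(j, n - j + 1) = n! / ((j-1)! (n-j)!)
    return inv_beta * f * F ** (j - 1) * (1.0 - F) ** (n - j)

if __name__ == "__main__":
    # Density of the sample median for n = 5 at x = 1, illustrative parameters.
    print(order_stat_pdf(1.0, 3, 5, 2.0, 1.5, 1.0))
```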
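The unknown parameters of the OLLGED, namely α, γ, and λ, are estimated from complete samples by maximum likelihood estimation (MLE), as is common in the literature.27 Assuming x_1, x_2, …, x_n to be a random sample from the OLLGED and writing z_i = 1 − e^{−λ x_i}, the log-likelihood function is given by:

$$\ell(\alpha, \gamma, \lambda) = n \log(\alpha\gamma\lambda) - \lambda \sum_{i=1}^{n} x_i + (\alpha\gamma - 1) \sum_{i=1}^{n} \log z_i + (\gamma - 1) \sum_{i=1}^{n} \log\left(1 - z_i^{\alpha}\right) - 2 \sum_{i=1}^{n} \log\left[z_i^{\alpha\gamma} + \left(1 - z_i^{\alpha}\right)^{\gamma}\right].$$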
Upon differentiating the log-likelihood with respect to each parameter, we obtain the following score equations:
Similarly, the second derivatives with respect to the parameters are obtained, and hence an information matrix is formed and given as:
Since the likelihood equations cannot be solved analytically, the estimates are obtained numerically using software such as R (an open source software for statistical computing and graphics) or SAS (an integrated software suite for advanced analytics, business intelligence, data management, and predictive analytics). In this work, the analysis is carried out using the R statistical software28 to obtain the MLEs of the parameters of the proposed OLLGED.
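To make the numerical step concrete, the log-likelihood can be coded as a plain function and then handed to any general-purpose optimiser. The Python sketch below is an illustration rather than the authors' R code; it assumes the OLLGED density implied by the OLL-G construction with GED baseline, and the function name and trial parameter values are hypothetical.

```python
import math

def ollged_loglik(params, data):
    """Log-likelihood of (alpha, gamma, lam) for positive observations in data.

    Returns -inf outside the parameter space so an optimiser can reject the point.
    """
    alpha, gamma, lam = params
    if min(alpha, gamma, lam) <= 0:
        return float("-inf")
    total = len(data) * math.log(alpha * gamma * lam)
    for x in data:
        z = 1.0 - math.exp(-lam * x)  # 1 - exp(-lam * x)
        w = 1.0 - z ** alpha          # 1 - baseline CDF
        total += (-lam * x
                  + (alpha * gamma - 1.0) * math.log(z)
                  + (gamma - 1.0) * math.log(w)
                  - 2.0 * math.log(z ** (alpha * gamma) + w ** gamma))
    return total

if __name__ == "__main__":
    # First five Kiama waiting times; the trial parameters are arbitrary,
    # not the fitted values reported in the paper.
    sample = [83, 51, 87, 60, 28]
    print(ollged_loglik((1.2, 0.9, 0.02), sample))
```

Minimising the negative of this function with a quasi-Newton method such as BFGS gives the numerical MLEs. Good starting values matter, since convergence problems are noted among the drawbacks of the model.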
This section assesses the behavior of the MLEs of the unknown parameters of the proposed OLLGED through simulation. The simulation study is carried out for sample sizes n = 50, 100, 150, 200, 250, and 300 from the OLLGED with 6 combinations of parameters. To evaluate the performance of the MLEs for the OLLGED model, the simulation study was performed as follows: generate B = 3000 samples of size n from the OLLGED(α, γ, λ); compute the MLEs for the B samples, say $(\hat{\alpha}_i, \hat{\gamma}_i, \hat{\lambda}_i)$ for i = 1, 2, …, B; and compute the biases and mean squared errors (MSE) based on the B samples. We repeated these steps for n = 50, 100, 150, 200, 250, and 300 with different values of (α, γ, λ). To compute the MLEs, the Broyden–Fletcher–Goldfarb–Shanno (BFGS) method in R software was used. Table 2 gives the empirical results, which reveal that the estimates are quite stable and approach the actual values of the parameters as the sample size increases. The bias and MSE of all parameters decrease as the sample size increases, as anticipated. The bias and MSE of the parameters are obtained as follows:

$$\mathrm{Bias}(\hat{\theta}) = \frac{1}{B} \sum_{i=1}^{B} \left(\hat{\theta}_i - \theta\right), \qquad \mathrm{MSE}(\hat{\theta}) = \frac{1}{B} \sum_{i=1}^{B} \left(\hat{\theta}_i - \theta\right)^{2},$$

where θ = α, γ, λ.
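The bias and MSE calculations over B replications can be sketched in a few lines of Python. To keep the example short and free of numerical optimisation, the replication loop below uses the exponential special case (α = γ = 1), whose MLE has the closed form n / Σx; this substitution is an assumption made for illustration and is not the three-parameter study reported in Table 2. The function names are hypothetical.

```python
import random

def bias_and_mse(estimates, true_value):
    """Empirical bias and mean squared error over the replications."""
    b = len(estimates)
    bias = sum(e - true_value for e in estimates) / b
    mse = sum((e - true_value) ** 2 for e in estimates) / b
    return bias, mse

def simulate_exponential_mle(n, lam, replications, seed=0):
    """MLEs of lam from repeated exponential samples (special case only)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(replications):
        sample = [rng.expovariate(lam) for _ in range(n)]
        estimates.append(n / sum(sample))  # closed-form MLE of the rate
    return estimates

if __name__ == "__main__":
    for n in (50, 100, 150):
        est = simulate_exponential_mle(n, lam=2.0, replications=500)
        print(n, bias_and_mse(est, 2.0))
```

For the full OLLGED, each replication would instead draw a sample by inverse-transform sampling and maximise the three-parameter log-likelihood numerically.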
The following two data sets were used to illustrate the flexibility and importance of the proposed OLLGED. The first data set represents waiting times (in seconds) between 65 successive eruptions of water through a hole in the cliff at the coastal town of Kiama (New South Wales, Australia), known as the Blowhole; the data can be obtained from http://www.statsci.org/data/oz/kiama.html. This data set has already been used29 (DOI: http://dx.doi.org/10.15446/rce.v42n1.66205); the observations are as follows: 83, 51, 87, 60, 28, 95, 8, 27, 15, 10, 18, 16, 29, 54, 91, 8, 17, 55, 10, 35, 47, 77, 36, 17, 21, 36, 18, 40, 10, 7, 34, 27, 28, 56, 8, 25, 68, 146, 89, 18, 73, 69, 9, 37, 10, 82, 29, 8, 60, 61, 61, 18, 169, 25, 8, 26, 11, 83, 11, 42, 17, 14, 9, 12.
The second data set consists of the survival times (in years) of a group of 46 patients treated with chemotherapy alone. This data set was reported earlier18,30 (doi: https://doi.org/10.1016/j.joems.2014.12.002). For ready reference, the survival times (years) are 0.047, 0.115, 0.121, 0.132, 0.164, 0.197, 0.203, 0.260, 0.282, 0.296, 0.334, 0.395, 0.458, 0.466, 0.501, 0.507, 0.529, 0.534, 0.540, 0.641, 0.644, 0.696, 0.841, 0.863, 1.099, 1.219, 1.271, 1.326, 1.447, 1.485, 1.553, 1.581, 1.589, 2.178, 2.343, 2.416, 2.444, 2.825, 2.830, 3.578, 3.658, 3.743, 3.978, 4.003, 4.033.
Furthermore, the fit of the OLLGED was compared with those of other models: the odd generalized exponential log-logistic distribution (OGELLD),31 the Type-II generalized log-logistic distribution (ELLD),32 the odd exponential log-logistic distribution (OELLD),33 the generalized exponential distribution (GED),6 the exponential distribution (ED), and the log-logistic distribution (LLD), studied in Refs. 25, 34. The proposed model is compared with the other models using goodness-of-fit criteria such as the maximized log-likelihood under the model, the Akaike information criterion (AIC), the Bayesian information criterion (BIC), the Anderson-Darling (A*), Cramér-von Mises (W*) and Kolmogorov-Smirnov (KS) statistics, along with the KS p-value.
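Some of these criteria are simple to compute once a model has been fitted. The Python sketch below uses the standard definitions AIC = 2k − 2ℓ and BIC = k log n − 2ℓ, where ℓ is the maximized log-likelihood and k the number of parameters, together with the one-sample KS distance. The function names are hypothetical and the numbers in the usage example are placeholders, not values from Tables 3 to 6.

```python
import math

def aic(loglik, k):
    """Akaike information criterion for k estimated parameters."""
    return 2.0 * k - 2.0 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2.0 * loglik

def ks_statistic(data, cdf):
    """Largest distance between the empirical CDF and a fitted CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        fx = cdf(x)
        d = max(d, i / n - fx, fx - (i - 1) / n)
    return d

if __name__ == "__main__":
    # Placeholder inputs for illustration only.
    print(aic(-100.0, 3), bic(-100.0, 3, 64))
    print(ks_statistic([0.2, 0.9, 1.7], lambda x: 1.0 - math.exp(-x)))
```

Smaller values of AIC, BIC, and the KS distance indicate a better fit, which is the sense in which the models are ranked.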
Tables 3 and 5 present the MLEs of the model parameters and their standard errors (SEs), together with the KS statistics and p-values, for the fitted OLLGED, OGELLD, OELLD, ELLD, LLD, GED, and ED models for the two data sets, respectively. Tables 4 and 6 show the values of the log-likelihood criterion, A*, W*, BIC, and AIC for the two data sets, respectively. As shown in Tables 3-6, the OLLGED is the best among these distributions because it has the smallest values of the KS statistic, AIC, BIC, the log-likelihood criterion, A* and W*. The histogram of the first data set with the fitted PDFs of the seven models (OLLGED, OGELLD, OELLD, ELLD, LLD, GED, and ED), their CDF plots, and the PP-plot are shown in Figure 7. The corresponding plots for the second data set are displayed in Figure 8. Figures 7 and 8 highlight that the proposed OLLGED is the best model compared with the existing rival distributions.
This article introduces a new odd log-logistic generalized exponential distribution with three parameters and studies the nature of the distribution in terms of kurtosis and skewness. The special models of the odd log-logistic generalized exponential family, namely the generalized exponential distribution, log-logistic distribution, and exponential distribution, are presented. The common mathematical properties of the OLLGED are obtained. The parameters are estimated by the maximum-likelihood approach, and simulation results are obtained to confirm the performance of these estimators. The application and flexibility of the OLLGED are demonstrated using two sets of lifetime data, establishing that the proposed OLLGED can provide a better fit than existing rival models, such as the odd generalized log-logistic, type-II generalized log-logistic, exponential, odd exponential log-logistic, generalized exponential, and log-logistic distributions. The bias and mean square error of the parameters decrease as the sample size increases. A limitation of the proposed model is that the bias and MSE are not stable for very small values. This model may not be suitable for small samples and highly peaked data.
The first data set relates to waiting times (in seconds) between 65 successive eruptions of water through a hole in the cliff at the coastal town of Kiama and was obtained from http://www.statsci.org/data/oz/kiama.html. It was used by Silva, R., Gomes-Silva, F., Ramos, M., Cordeiro, G., Marinho, P., & Andrade, T. A. N. D. (2019). The Exponentiated Kumaraswamy-G Class: General Properties and Application. Revista Colombiana de Estadística, 42, 1-33.
The second data set is drawn from Alizadeh, M., Tahir, M. H., Cordeiro, G. M., Mansoor, M., Zubair, M., & Hamedani, G. G. (2015). The Kumaraswamy Marshal-Olkin family of distributions. Journal of the Egyptian Mathematical Society, 23(3), 546-557. doi: https://doi.org/10.1016/j.joems.2014.12.002
It was also used by Bekker, A., Roux, J. J. J., & Mosteit, P. J. (2000). A generalization of the compound rayleigh distribution: using a bayesian method on cancer survival times. Communications in Statistics - Theory and Methods, 29(7), 1419-1433. doi: https://doi.org/10.1080/03610920008832554
The authors are deeply thankful to the editor and reviewers for their valuable suggestions to improve the quality of the paper. The authors would like to acknowledge the technical support received from the Office of the Deputy Vice Chancellor Academic, Research and Consultancy, The University of Dodoma, Tanzania.