Dagum Power Series Class of Distribution with Applications to Lifetime Data

Boikanyo Makubate; Broderick Oluyede; Galetlhakanelwe Motsewabagale

Dagum Power Series Class of Distribution with Applications to Lifetime Data

Boikanyo Makubate^1*, Broderick Oluyede², and Galetlhakanelwe Motsewabagale³

¹Botswana International University of Science and Technology Palapye, Botswana

²Georgia Southern University, USA

³Galetlhakanelwe Motsewabagale, Botswana International University of Science and Technology, Botswana

Corresponding Author:: Boikanyo Makubate
Botswana International University of Science and Technology Palapye, Botswana.
E-mail: makubateb@biust.ac.bw

Received date: 08/01/2019 Accepted date: 06/02/2019 Published date: 18/02/2019

Visit for more related articles at Research & Reviews: Journal of Statistics and Mathematical Sciences

Abstract

We present a new class of distributions called the Dagum-Power Series (DPS) distribution. This model is obtained by compounding Dagum distribution with the power series distribution. Statistical properties of the DPS distribution are obtained. Methods of finding estimators including maximum likelihood estimation will be discussed. A simulation study is carried out for the special case of Dagum-Poisson distribution to assess the performance of the DPS distribution. Finally, we apply the special case of Dagum geometric and Dagum Poisson in the class of (DPS) distributions to real data sets to illustrate the usefulness and applicability of the distributions.

Keywords

Dagum distribution, Dagum power series distribution, Dagum poisson distribution, Dagum geometric distribution, Dagum logarithmic distribution

Introduction

Dagum distribution [1,2] is a well-known and established three-parameters distribution for modeling empirical income and wealth data, that could accommodate both heavy tails in empirical income and wealth distributions, and also permit interior mode. A random variable X following the Dagum distribution with shape parameters β, δ>0 and scale parameter λ>0 has a Cumulative Distribution Function (CDF) given by:

equation (1)

The r^th raw or non central moments of Dagum distribution are given by:

The rth raw or non central moments of Dagum distribution are given by:

equation (2)

for δ>r, and β,δ>0, where B(.,.) is the beta function. The q^th percentile of the Dagum distribution is

equation (3)

Dagum distribution has positive asymmetry, and its hazard rate can be monotonically decreasing, upside-down bathtub and bathtub followed by upsidedown bathtub [3], thus overcoming the forms presented by the exponential, gamma and Rayleigh distributions for modeling lifetime data. Dagum distribution has been used in several research areas such as economics and lifetime data. However, in order to fit still more complex situations, a number of extensions have been proposed in recent years. For example, see the works by Domma, Huang Oluyede [4,7] . It is the purpose of this article to propose a new extension of the Dagum distribution by compounding the Dagum distribution and the power series distributions. The general class is called the Dagum Power Series (DPS) family. We employ the compounding procedure in [8]. The power series family of distributions includes binomial, Poisson, geometric and logarithmic distributions [9]. In the same way, several classes of distributions were proposed by compounding some useful lifetime and power series distributions [10-16]. Also, a primary motivation for the development of this distribution is the modeling of lifetime data with a diverse model that takes into consideration not only shape, and scale but also skewness, kurtosis and tail variation. Also, motivated by various applications of geometric, Poisson and Dagum distributions in several areas including reliability, finance and actuarial sciences, as well as economics, where Dagum distribution plays an important role in the size distribution of personal income, we construct and develop the statistical properties of this new class of generalized Dagum-type distribution called the Dagum-geometric and Dagum-Poisson distributions and apply the models to real life data in order to demonstrate the usefulness of the proposed class of distributions. In this regard, we present the special cases of the compound new four-parameter distributions, referred to as the Dagum-geometric (DG) and Dagum-Poisson (DP) distributions. This paper is organized as follows. In Section 2; we introduce the new family and present a useful representation for its CDF and pdf: We discuss in Section 3 four special models of the proposed family. In Section 4 we present a representation for the hazard and reverse hazard function for the new family.

Some statistical properties of the new distribution including the quantile function, moments, and conditional moments are presented in Section 5. Mean and median deviations, Bonferroni and Lorenz curves are derived in Section 6: The distribution of order statistics and R_enyi entropy are given in Section 7: Maximum likelihood estimates of the unknown parameters are presented in section 8: The special cases of the Dagum-geometric and Dagum-Poisson distributions are discussed in section 9: A simulation study is conducted in order to examine the bias, mean square error of the maximum likelihood estimators and width of the con_dence interval for each parameter of the DP model in section 9: Section 10 contains concluding remarks.

Dagum Power Series Distribution

Recently, Dagum distribution has been studied from a reliability point of view and used to analyze survival data [4,17]. Huang and Oluyede [17] developed the exponentiated Kumaraswamy-Dagum distribution and applied it to income and lifetime data. Kleiber [18] provided a comprehensive summary of Dagum distribution. See Kleiber and Kotz [19] for additional results on statistical size distribution with applications in economics and actuarial sciences. Suppose that the random variable X has the Dagum distribution where its CDF is given in equation (1). Given N; let X₁,…,X_N be independent and identically distributed random variables from Dagum distribution. Let N be a discrete random variable with a power series distribution (truncated at zero) and Probability Mass Function (PMF)

equation

where a_n ≥ 0 depends only on n, equation and θ ÃÆÃÂÃâÃ¢â¬Å¾ (0,s) (s can be ∞) is chosen such that C(θ) is finite and its three derivatives with respect to _ are defined and given by C’(.),C”(.) and C’”(.), respectively. The power series family of distributions includes binomial, Poisson, geometric and logarithmic distributions. See Table 1 for some useful quantities including a_n, C(θ), C(θ)^-1, C’(θ); and C”(θ) for the Poisson, geometric, logarithmic and binomial distributions.

Distribution	C(θ)	C’(θ)	C”(θ)	C(θ)-1	an	Parameter space
Poisson	e^θ -1	e^θ	e^θ	Log (1+θ)	(n!)^-1	(0,∞)
Geometric	Θ (1-θ)^-1	(1-θ)^-2	2 (1-θ)^-3	θ(1+θ)^-1	1	(0,1)
Logarithmic	-log (1-θ)	(1-θ)^-1	(1-θ)^-2	1- e^θ	n^-1	(0,1)
Binomial	(1+θ)^m-1	m(1+θ)^m-1	m(m-1)(1+θ)^m-2	(1+θ)1/^m-1		(0,1)

Table 1. Useful quantities for some power series distributions.

Let X=max(X₁,….,X_N), then the cumulative distribution function (CDF) of X|N=n is given by:

equation (4)

which is the Dagum distribution with parameters λ, δ and nβ. The Dagum Power series class of distributions denoted is defined by the marginal CDF of X.

The general form of the CDF and pdf of the Dagum-power series distribution are given by:

equation (5)

and

equation (6)

respectively, where equation with a_n>0 depends only on n. A series representation of the DPS CDF is given by:'

equation (7)

Where equation is the Dagum CDF with parameters β (n+1), δ, λ>0. Similarly, the DPS PDF can be written as:

equation (8)

where f_D(x, β (n+1), δ, λ) is the Dagum pdf with parameters β (n+1), δ, λ> 0.

On the other hand, if we consider X₍₁₎=min(X₁,…..,X_N) and conditioning upon N=n, then the conditional distribution of X₍₁₎ given N=n is obtained as:

equation

The CDF of X₍₁₎, say F_DPS, is given by:

equation

Special Cases

In this section, we present four special cases of the DPS family, namely the Dagum Poisson (DP); Dagum Geometric (DG); Dagum Logarithmic (DL) and Dagum Binomial (DB): For x>0, λ, β, δ, θ>0; the CDF of the DP is defined by equation (5) with C(θ)=e^θ-1 leading to:

equation (9)

The PDF of the DP distribution is given by:

equation (10)

Similarly, the CDF and PDF of the DG distribution are respectively, given by:

equation (11)

and

equation (12)

Whereas, the CDF and pdf of the DL distribution are given by:

equation (13)

and

equation (14)

respectively. Finally, the CDF and PDF of the DB distribution are given by:

equation (15)

and

equation (16)

respectively.

Hazard and Reverse Hazard Functions

The hazard and reverse functions of the DPS distribution are given by:

equation (17)

And,

equation (18)

respectively.

Special Cases of Hazard Functions

The hazard and reverse hazard functions for DP distribution are given by:

equation

and

equation

respectively. Likewise, the hazard and reverse hazard functions for DG distribution are given by:

equation

and

equation

respectively. The hazard and reverse hazard functions for the DL distribution are given by:

equation

and

equation

respectively. The hazard and reverse hazard functions for DB distribution are given by:

equation

and

equation

respectively.

Quantile Function, Moments and Conditional Moments

The q^th quantile of the DPS distribution is given by:

equation (19)

where U ÃÆÃÂÃâÃ¢â¬Å¾ (0,1) and C^-1(.) is the inverse function of C(.), Consequently, a random number can be generated based on equation (19).

Moments are necessary and crucial in any statistical analysis, especially in applications. Moments can be used to study the most important features and characteristics of a distribution (e.g., tendency, dispersion, skewness and kurtosis). If the random variable X has a DPS distribution, with parameter vector θ=(λ, δ, β, θ); then the r^th moment of X is given by:

equation (20)

for δ>r. Note that the r^th non-central moment follows readily from the fact that DPS pdf can be written as a linear combination of Dagum densities with parameters λ, β (n+1),δ>0.

The r^th conditional moment for DPS distribution is given by:

equation (21)

Where equation The mean residual life function is

Mean and Median Deviations, Bonferroni and Lorenz Curves

The amount of scattering in a population can be measured to some extent by the totality of deviations from the mean and the median. In this section, the mean and median deviations, as well as Bonferroni and Lorenz curves of the DPS distribution are presented. If X has the DPS distribution, we can derive the mean deviation about the mean μ=E(X) and the mean deviation about the median M from

equation

respectively. The mean μ is obtained from equation (20) with r=1, and the median M is given by equation (19) when q=1/2. The measure δ₁ and δ₂ can be calculated by the following relationships:

equation (22)

where equation is given by:

equation

and equation then Bonferroni, B(p) and Lorenz, L(p) curves are given by:

equation

respectively, The mean of the DPS distribution is obtained from equation (20) with r=1 and the quantile function is given in equation (19). Consequently,

equation (23)

equation is incomplete Beta function.

Order Statistics and R_Enyi Entropy

In this section, the distribution of the kth order statistic and Renyi entropy for the DPS distribution are presented. The entropy of a random variable is a measure of variation of the uncertainty.

Order Statistics

The pdf of the k^th order statistics from a pdf f(x) is

equation (24)

We apply the series expansion

equation (25)

for b>0 and |z<1, to obtain the series expansion of the distribution of order statistics from DPS distribution. Using equations (25), the pdf of the k^th order statistic from DPS distribution is given by:

equation (26)

Renyi Entropy

Renyi entropy of a distribution with pdf f(x) is defined as:

equation

Note that by using equation (6), we have:

equation (27)

Consequently, Renyi entropy of DPS distribution is given by:

equation

Renyi entropy for the special cases can be readily obtained for specified C(θ).

Estimation

In this section, we discuss several methods of estimation of the model parameters including maximum likelihood, ordinary least squares, weighted least squares, minimum distance and maximum product of spacing. The method of maximum likelihood is presented in detail.

Maximum Likelihood Estimation

Estimation of the unknown parameters of the DPS distribution by the method of maximum likelihood is addressed in this section. Let x₁,…, x_n be a random sample of size n from DPS distribution and θ=(β,δ,λ,θ)T be the parameter vector. The loglikelihood function for _ based on this sample becomes

equation (28)

The score components corresponding to the parameters in θ are given by:

equation

And,

equation (29)

where The Maximum Likelihood Estimator (MLE), can be obtained by solving simultaneously the nonlinear equations”.

equation

These are not linear in the parameters. Hence iterative methods are required to solve them. The Newton-Raphson method is an iterative method for solving nonlinear equations. To implement the Newton-Raphson method we require the second partial derivatives of log-likelihood function (28). The second partial derivatives are obtained as

equation (30)

Note that (30) is a matrix called the Hessian matrix. In the Newton-Raphson method, provisional estimates for the vector of parameters θ on iteration ‘i’ are improved by:

equation

These iterations continue until the changes in the parameter estimates and/or likelihood value are sufficiently small. At this point, the solution is said to have converged, and the large-sample variance-covariance matrix of the maximum likelihood estimator is then obtained as the negative inverse of the matrix of second derivatives. Standard errors of the parameter estimates are obtained as the square root values of the diagonal entries of this (negative inverse) matrix. The maximum likelihood estimates and their accompanying standard errors can be used to compute asymptotic z-statistics (i.e., Wald statistics) or construct confidence intervals.

Ordinary Least Squares and Weighted Least Squares

In this subsection, we discuss the method of both ordinary least squares and Weighted Least Squares. Let x_1:n<x_2:n<…<x_n:n denote the order statistics based on a random sample of size n from the distribution with CDF F(y), then

equation (31)

Now, let equation denote the order statistics based on a random sample of size n from the DPS distribution. The Ordinary Least Square (OLS) estimates of the DPS parameters are obtained by minimizing the function

equation (32)

The Weighted Least Square (WLS) estimates of the DPS parameters equation are obtained by minimizing the function

equation (33)

where equation

Minimum Distance Methods

The estimates of the DPS parameters can be obtained via the minimization of the well-known Anderson-Darling and Cramervon Mises goodness-of-_t statistics. This class of goodness-of-_t statistics is based on the difference between the estimates of the DPS CDF and the corresponding empirical distribution function. The Anderson-Darling (AD) estimates of the DPS model parameters θ_AD, say equation are obtained by minimizing the function

equation (34)

The Cramer-von Mises (CVM) estimates of the DPS parameters equation are obtained by minimizing the function

equation (35)

with respect to the parameters θ=(β, δ, λ, θ)^T.

Maximum Product of Spacings Method

The (n+1) uniform spacings of the first order of the sample are given by D1=F(x_1:n|θ), Dn+1=1-F(_xn:n|θ) and D_i=F(x_i:n| θ)- F(x_(i-1):n| θ); i=1,2,…, n: The Maximum Product of Spacings (MPS) method consist of finding the values of θ which maximizes the geometric mean of the spacings given by:

equation (36)

or equivalently,

equation (37)

by taking The MPS estimates of the parameters, denoted by is obtained by solving the nonlinear equations

equation

using a numerical method. See Chen and Amin [20] for additional details.

Dagum-Geometric and Dagum Poisson Distributions

In this section, we consider and study Dagum-geometric and Dagum-Poisson distributions in detail. We focus on the case in which N is a discrete random variable following a geometric distribution (truncated at zero) with the probability mass function given by

equation (38)

Note that N can also be taken to follow other discrete distributions, such as binomial, Poisson, logarithmic, where the discrete distribution need to be truncated zero because one must have N ≥ 1. Now, let X₁,X₂,….,X_N be N independent and identically distributed (iid) random variables following the Dagum distribution cumulative distribution function. If equation then the conditional cumulative distribution of X(1)|N=n is given by:

equation

and the CDF of X₍₁₎ is given by:

equation

equation (39)

Note that with equation then the conditional cumulative distribution of X(N)|N=n is given by:

equation

and the CDF of X_(N) is given by:

equation (40)

The corresponding pdf of X₍₁₎ is given by:

equation (41)

We shall refer to the distribution given by equations (40) and (41) as the Dagum geometric (DG) CDF and pdf, respectively. The Hazard Function (HF) and Reverse Hazard Functions (RHF) of the DG distribution are given by:

equation (42)

and

equation (43)

respectively. The plots of DG pdf and hazard rate function for selected values of the model parameters λ, δ, β and θ are given in Figures 1 and 2.

statistics-and-mathematical-sciences-graphs

Figure 1. Graphs of DG pdf.

statistics-and-mathematical-sciences-hazard-functions

Figure 2. Graphs of DG hazard functions.

These plots of the hazard function show various shapes including monotonically decreasing, unimodal and upside down bathtub shapes for different combinations of the values of the parameters. The density and hazard functions can exhibit different behavior depending on the values of the parameters when chosen to be positive, as shown in these plots (Figures 1 and 2). However, it is hard to analyze the shape of both the density and hazard function due to their complicated forms. The plots of the hazard rate function show various shapes including monotonic and non-monotonic shapes such as upside down bathtub shapes for the combinations of the values of the parameters. This flexibility makes the DG hazard rate function suitable for both monotonic and non-monotonic empirical hazard behaviors that are likely to be encountered in real life situations.

Let N be distributed according to the zero-truncated Poisson distribution

with PDF

equation (44)

Let X=max(Y₁,….,Y_N), then the CDF of X|N=n is given by:

equation (45)

which is the Dagum distribution with parameters λ, δ, and nβ: The Dagum-Poisson (DP) distribution denoted by DP (λ, δ, β, θ) is defined by the marginal CDF of X, that is,

equation (46)

for x>0, λ, β, δ, θ>0. The DP density function is given by:

equation (47)

The plots of DP pdf and hazard rate function for selected values of the model parameters λ, δ, β and θ are given in Figures 3 and 4.

Figure 3. Graphs of DP PDF.

Figure 4. Graphs of DP hazard functions.

The plots of the hazard function show various shapes including monotonically decreasing, unimodal and bathtub followed by upside-down bathtub, upside down bathtub shapes for different combinations of the values of the parameters.

Some Sub-Models of the Dagum-Geometric and Dagum-Poisson Distributions

The DG distribution is a very flexible model that has several different sub-models when its parameters are changed. The DG distribution contains several sub-models including the following distributions.

• If β=1, then DG distribution reduces to a new distribution called Log-Logistic Geometric (LLoGG) or Fisk Geometric (FG) distribution with CDF given by:

equation (48)

If in addition to β=1; we have θ↓0, then the resulting distribution is the log-logistic or Fisk distribution.

• When θ↓0 in the DG distribution, we obtain Dagum distribution

• The Burr-III geometric (BIIIG) distribution is obtained when λ=1: If in addition, θ↓0, the Burr-III distribution with parameter δ,β>0 is obtained

The DP distribution contains several special-models including the following distributions.

• If β=1; then DP distribution reduces to a new distribution called Log-Logistic Poisson (LLoGP) or Fisk-Poisson (FP) distribution with CDF given by:

equation (49)

If in addition to β=1; we have θ↓0, then the resulting distribution is the log-logistic or Fisk distribution.

• When θ↓0 in the DP distribution, we obtain Dagum distribution

• The Burr-III Poisson (BIIIP) distribution is obtained when λ=1: If in addition, θ↓0, Burr-III distribution with parameter δ,β>0 is obtained

Some Statistical Properties

In this subsection, some statistical properties of DG and DP distributions including quantile function, moments, conditional moments, Lorenz and Bonferroni curves are presented.

Expansion of Dg and Dp Densities

In this subsection, we provide an expansion of the DG distribution. Note that, using the fact that if |z<1 and k>0, we have the series representation

equation (50)

and the binomial expansion [1-(1+λx-^δ)^-β]ⁱ given by:

equation (51)

the DG pdf can be written as follows:

equation (52)

where equation is the Dagum pdf with parameters

λ, β (j+1),δ>0. The above equation shows that the DG density is indeed a linear combination of Dagum densities. Consequently, the mathematical and statistical properties can be immediately obtained from those of the Dagum distribution.

Similarly, applying the fact that Maclaurin series expansion of equation the DP pdf can be written as follows:

equation

(53)

where is the Dagum pdf with parameters λ, β (k+1), δ>0. The above equation shows that the DP density is indeed a linear combination of Dagum densities.

Quantile Functions

The q^th quantile of the DG distribution is obtained by solving the nonlinear equation , where U is a uniform variate on the unit interval [21]. It follows that the q^th quantile of the DG distribution is given by

equation

Similarly, the q^th quantile of the DP distribution is obtained by solving the nonlinear equation where U is a uniform variate on the unit interval [21]. It follows that the q^th quantile of the DP distribution is given by:

equation (55)

Consequently, the random number can be generated based on equation (55).

Moments

In this subsection, we present the r^th moment of DG and DP distributions. The r^th moment of the DG distribution is given by:

equation (56).

The Moment Generating Function (MGF) of X is given by:

equation

Similarly, the r^th moment of the DP distribution is given by:

equation (57)

Conditional Moments

For income and lifetime distributions, it is of interest to obtain conditional moments and mean residual life function. The r^th conditional moments for DG distribution is given by:

equation

where equation The mean residual life function is E (X|X>t)-t.

Similarly, the conditional moments for DP distribution is given by:

equation (58)

where equation The mean residual life function is E (XjX>t) –t.

Mean Deviations, Bonferroni and Lorenz Curves

If X has the DG distribution, we can derive the mean deviation about the mean μ=E(X) and the mean deviation about the median M from equation 22, that is,

equation

where the mean μ is obtained from equation (56) with r=1, the median M is given by equation (54) when q=1/2 , and

equation

Also, the measure δ₁ and δ₂ for the DP distribution can be calculated by the following relationships:

equation (59)

where follows from equation (58), that is,

equation

Bonferroni and Lorenz curves are a widely used tool for analyzing and visualizing income inequality. Lorenz curve, L(p) can be regarded as the proportion of total income volume accumulated by those units with income lower than

or equal to the volume p, and Bonferroni curve, B(p) is the scaled conditional mean curve, that is, the ratio of group mean income of the population. Let and μ=E(X), then Bonferroni and Lorenz curves are given by and respectively, for 0 ≤ p ≤ 1, and The mean of the DG distribution is obtained from equation (56) with r=1 and the quantile function is given in equation (54). Consequently,

equation

for δ>1, where equation is incomplete Beta function.

Similarly, for the DP distribution, we have:

equation

for δ>1, where equation is incomplete Beta function.

Order Statistics and Renyi Entropy

In this section, the distribution of the kth order statistic, L-moments [22] and Renyi entropy for the DG and DP distributions are presented. The entropy of a random variable is a measure of variation of the uncertainty.

Order Statistics

We apply the series expansion

equation (62)

For b>0 and |z|<1, and equation (50), to obtain the series expansion of the distribution of order statistics from DG distribution. The pdf of the kth order statistic from DG distribution (using equation (26)) is given by:

equation

where equation

The r^th moment of the distribution of the kth order statistics from the DG distribution is given by:

equation (63)

For δ>r; where equation is the complete beta function.

Similarly, the pdf of the k^th order statistic from DP distribution is given by:

equation

That is,

equation

where equation Thus, the pdf of the i^th order statistic from the DP distribution is a linear combination of Dagum pdfs with parameters λ, β (w+1) and δ>0. The rth moment of the distribution of the ith order statistic is given by:

equation

L-Moments

The L-moments [22] are expectations of some linear combinations of order statistics and they exist whenever the mean of the distribution exits, even when some higher moments may not exist. They are relatively robust to the effects of outliers and are given by:

equation (65)

The L-moments of the DG and DP distributions can be readily obtained from equation (65). The first four L-moments are given by respectively.

Renyi Entropy

Note that by using equation (62), we have,

equation (66)

Renyi entropy of DG distribution is given by:

equation

Maximum Likelihood Estimation

In this section, we consider the Maximum Likelihood Estimators (MLE's) of the parameters of the DG and DP distributions. Let x₁,…, x_n be a random sample of size n from DG or DP distribution and Θ=(λ, δ, β, θ)T be the parameter vector. The log-likelihood function for the DG distribution can be written as:

equation

equation (67)

The associated score function is given by equation where

equation

equation (69)

equation (70)

and

equation (71)

The Maximum Likelihood Estimate (MLE) of Θ, say , is obtained by solving the nonlinear system Un(Θ)=0. The solution of this nonlinear system of equation is not in a closed form. These equations cannot be solved analytically, and statistical software can be used to solve them numerically via iterative methods. We can use iterative techniques such as a Newton-Raphson type algorithm to obtain the estimate

The log-likelihood function for the DP distribution can be written as:

equation (72)

The associated score function is given by:

equation

where

equation (73)

equation (74)

equation (75)

and

equation (76)

The Maximum Likelihood Estimate (MLE) of Θ, say , is obtained by solving the nonlinear system U_n(θ)=0: The solution of this nonlinear system of equation is not in a closed form. These equations cannot be solved analytically, and statistical software can be used to solve them numerically via iterative methods. We can use iterative techniques such as a Newton-Raphson type algorithm to obtain the estimate of Θ.

Asymptotic Confidence Intervals

In this section, we present the asymptotic confidence intervals for the parameters of the DG and DP distributions. The expectations in the Fisher Information Matrix (FIM) can be obtained numerically. Let equation be the maximum likelihood estimate of Θ = (λ,α ,β ,θ ) . Under the usual regularity conditions and that the parameters are in the interior of the parameter space, but not on the boundary, we have where I(Θ) is the expected Fisher information matrix. The asymptotic behavior is still valid if I(Θ) is replaced by the observed information matrix evaluated at equation that is. The multivariate normal distribution with mean vector

0=(0, 0, 0, 0)T and covariance matrix I^-1(Θ) can be used to construct confidence intervals for the model parameters. That is, the approximate 100 (1-η)% two-sided confidence intervals for λ, δ, β and θ are given by:

equation respectively, where are diagonal elements of and Z_η/2 is the upper (η/2)^th percentile of a standard normal distribution.

We can use the Likelihood Ratio (LR) test to compare the fit of the DG or DP distribution with its sub-models for a given data set. For example, to test θ=0; the LR statistic is equation where are the unrestricted estimates, and are the restricted estimates. The LR test rejects the null hypothesis if where denote the upper 100_{ÃÆÃÂÃâÃ¢â¬Å¾}% point of the χ² distribution with 1 degrees of freedom.

Monte Carlo Simulation Study

In this section, a simulation study is conducted to assess the performance and examine the mean estimate, average bias, root mean square error of the maximum likelihood estimators and width of the confidence interval for each parameter. We study the performance of the DP distribution by conducting various simulations for different sample sizes and different parameter values. Equation (55) is used to generate random data from the DP distribution.

The simulation study is repeated for N=5,000 times each with sample size n=25, 50, 75, 100, 200, 400, 800 and parameter values I: λ=2:5, β=0:6, δ=1:5, θ=0:8 and II: λ=3:8, β=0:5, δ=0:2, θ=1:2. Five quantities are computed in this simulation study.

Mean estimate of the MLE equation of the parameter ϑ = λ,δ ,β ,θ :

equation

Average bias of the MLE equation of the parameter ϑ = λ,δ ,β ,θ :

equation .

Root mean squared error (RMSE) of the MLE equation of the parameter ϑ = λ,δ ,β ,θ :

equation

Coverage probability (CP) of 95% confidence intervals of the parameter ϑ = λ,δ ,β ,θ , i.e., the percentage of intervals that contain the true value of the parameter ϑ

Average width (AW) of 95% confidence intervals of the parameter ϑ = λ,δ ,β ,θ

Table 2 presents the Mean, Average Bias, RMSE, CP and AW values of the parameters λ, δ, β and θ for different sample sizes. From the results in Table 2, we can verify that as the sample size n increases, the RMSEs decay toward zero. We also observe that for all the parametric values, the biases decrease as the sample size n increases. The table shows that the coverage probabilities of the confidence intervals are reasonably close to the nominal level of 95% and the average confidence widths decrease as the sample size n increases. Consequently, the MLE's and their asymptotic results can be used for estimating and constructing confidence intervals even for reasonably small sample sizes.

	l						ll
Parameter	n	Mean	Average bias	RMSE	CP	AW	Mean	Average bias	RMSE	CP	AW
λ	25	6.1059	3.6059	12.8314	0.9208	57.6441	10.4999	6.6999	18.0013	0.9442	99.1421
	50	4.1516	1.6516	8.4074	0.881	29.089	7.3514	3.5514	12.0537	0.9074	53.1391
	75	3.21	0.7097	5.2219	0.85	18.9689	6.2794	2.4794	9.2801	0.8894	38.5516
	100	2.7732	0.2732	3.3678	0.8226	14.5403	5.425	1.625	7.2022	0.8694	29.5216
	200	2.3204	-0.1796	1.8215	0.8004	9.7076	4.4593	0.6593	3.9145	0.875	18.3216
	400	2.233	-0.267	1.2408	0.8102	7.3836	4.1527	0.3527	2.5355	0.8794	12.898
	800	2.2742	-0.2258	0.9722	0.8288	5.8144	4.0522	0.2522	1.8465	0.8846	9.2902
β	25	0.581	-0.019	0.7345	0.8916	3.2394	0.4635	-0.03646	0.5781	0.9178	2.3821
	50	0.5124	-0.0876	0.3648	0.8966	1.5152	0.4205	-0.0795	0.3546	0.9174	1.2299
	75	0.5076	-0.0924	0.2996	0.9064	1.2018	0.4151	-0.0849	0.2097	0.9174	0.9724
	100	0.5039	-0.0961	0.2318	0.9126	1.0504	0.4207	-0.0793	0.1729	0.9168	0.8574
	200	0.5205	-0.0795	0.1765	0.9326	0.7726	0.4395	-0.0605	0.1421	0.9316	0.6306
	400	0.5447	-0.0553	0.1333	0.9408	0.536	0.4615	-0.0385	0.1131	0.9446	0.4499
	800	0.5673	-0.0327	0.092	0.9532	0.3613	0.4797	-0.0203	0.0779	0.9682	0.3066
δ	25	1.7356	0.2356	0.5704	0.987	2.2488	0.2314	0.0314	0.0708	0.9892	0.3004
	50	1.6076	0.1076	0.3801	0.9776	1.5019	0.2155	0.0155	0.0489	0.9834	0.2054
	75	1.55	0.05	0.2914	0.9698	1.2069	0.2097	0.0097	0.0408	0.9734	0.1672
	100	1.5211	0.0211	0.2487	0.9672	1.0362	0.2064	0.0065	0.0343	0.9756	0.1446
	200	1.4842	-0.0158	0.1742	0.9466	0.7559	0.2017	0.0017	0.0242	0.963	0.1044
	400	1.4775	-0.0225	0.1294	0.9424	0.5591	0.2001	0.0001	0.01797	0.9554	0.0762
	800	1.482	-0.018	0.094	0.9506	0.4134	0.1997	-0.0003	0.0129	0.9548	0.0553
θ	25	2.0443	1.2443	1.825	0.9422	14.2042	2.1845	0.9847	1.6176	0.976	14.1843
	50	2.1779	1.3779	2.005	0.9508	12.4729	2.2855	1.0855	1.806	0.9828	12.2621
	75	2.1795	1.3795	2.085	0.948	11.5404	2.2897	1.0897	1.9271	0.988	11.4679
	100	2.2186	1.4186	2.1324	0.957	10.8672	2.2543	1.0543	1.897	0.9912	10.5053
	200	2.0282	1.2282	1.9951	0.9748	9.1258	2.0514	0.8514	1.7881	0.9942	8.3227
	400	1.7006	0.9006	1.6196	0.9804	6.8676	1.7566	0.5566	1.469	0.9938	6.1189
	800	1.3582	0.5582	1.1099	0.9682	4.9404	1.4705	0.2706	1.0325	0.9854	4.2113

Table 2. Monte carlo simulation results: mean, average bias, RMSE, CP, and AW.

Applications

In this section, we present an example to illustrate the edibility of the DG and DP distributions and its sub-models for data modeling. Estimates of the parameters of DG distribution (standard error in parentheses), Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) and Kolmogorov-Smirnov (KS) are presented for each data set. The command NLP in SAS and the R package are used here. We also compare DG distribution with other distributions including Exponentiated Weibull Geometric (EWG) distribution and Exponentiated Weibull-Poisson (EWP) (Mahmoudi and Sephadar, [23] distribution. The CDF of the EWG and EWP distributions are given by

equation (77)

and

equation (78)

We also fitted the four parameter sub-model of the beta Weibull geometric (BWG) distribution given by:

equation (79)

To the failure times, data set. Plots of the _tted densities and the histogram of the data are given in Figure 5. Probability plots [24] are also presented in Figure 5. For the probability plot, we plotted equation against j-0:375/n+0:25, j=1, 2, ….,n, where x_(j) are the ordered values of the observed data. We also computed a measure of closeness of each plot to the diagonal line. This measure of closeness is given by the sum of squares

statistics-and-mathematical-sciences-failure-times

Figure 5. Graphs of failure times.

equation (80)

Failure Times Data

This data set represents the failure times of 50 components (per 1000h) from Murthy et al. [25], Weibull models (Vol. 505), page 180. Initial value for DG model in R code are λ=1, δ=1, β=1, θ=0.1. Estimates of the parameters of DG distribution and its related sub-models (standard error in parentheses), AIC, BIC, SS and KS for failure times data are given in Table 3. The estimated variance-covariance matrix for DG model is

Estimates					Statistics
Model	λ	δ	β	θ	-2logL	AIC	BIC	SS	KS
DG	756914.7	4.9705	0.1363	0.7528	201.21	209.21	216.86	0.11	0.09
	(0.0000002)	-0.403	-0.0319	-0.1486
FG	6.9602	0.9108	1	0.8391	210.97	216.97	222 .71	0.1788	0.1354
	-0.0009	-0.1033	(-)	(0.041)
F	1.1200	0.9108	1	0	210.97	214.97	218.79	0.1788	0.1355
	(0.2858)	-0.1033	(-)	(-)
D	42293.26	4.4033	0.0949	0	206.59	212.59	218.33	0.1867	0.0986
	(0.000002)	-0.2664	-0.0134	(-)
BIIIG	1	0.8915	1.0598	0	211.03	217.03	222 .76	0.1822	0.1407
	(-)	-0.1788	-0.4826	(-)
Biil	1	0.8915	1.0598	0	211.03	215.03	218.85	0.1822	0.1407
	(-)	-0.1091	-0.1627	(-)
	α	β	θ	p
B\.YG	0.4438	0.6937	8.9834	0.8926	203.77	211.77	219.41	0.2066	0.1557
	-0.1639	-1.0563	-31.151	-0.3603
	α	β	θ	ϒ
EWG	5.9525	28.7391	0.4315	0.2698	205.41	213.41	221.06	0.1737	0.1262
	(2.2754)	-0.1522	(0.452)	-0.0241
	α	β	ϒ	θ
EWP	0.0139	0.4972	0.6020	84.9653	204.91	212.91	220 .56	0.1580	0.1031
	-0.0267	-1.4189	-0.6779	-0.0048

Table 3. Estimates of models for failure times data set.

equation

The 95% two-sided asymptotic confidence intervals for λ, δ, β and θ are given by 756914.7 ± 0.000000392, 4.9705 ± 0.7899, 0.1363 ± 0.0.0625, and 0.7528 ± 0.2913, respectively.

The LR test statistics of the hypothesis H₀: FG vs H_a: DG and H₀: BIIIG vs H_a: DG are 9.7594 (p-value=0.001784) and 9.8160 (p-value=0.00173). The DG distribution is significantly better than FG and BIIIG distributions. Also, the DG distribution is significantly better than D distribution based on the LR test. The values of the statistics AIC, BIC and KS when compared to those for the sub-models and non-nested EWG, EWP and BWG models also points to DG distribution as a very good _t for the failure times data. The graphs of the fitted densities, probability plots and survival suction also shows that the DG distribution is indeed the “best" model for the failure times data set.

Kevlar 49/Epoxy Strands Failure Times Data

This data set consists of 101 observations corresponding to the failure times of Kevlar 49/epoxy strands with pressure at 90% [26-31]. The failure times in hours was analyzed by [32]. Estimates of the parameters of DP distribution and its related submodels (standard error in parentheses) [33-38], AIC, BIC, W*, A*, SS and KS for failure times data are given in Table 4. We also use the LR test to compare the DP distribution and its sub-models [39-42].

Estimates					Statistics
Model	λ	δ	β	θ	-2logL	AIC	AICC	BIC	W*	A*	SS
DP	7.2834	3.3711	0.2015	0.3744	200.09	208.09	208.51	218.55	0.0657	0.4632	0.0699
	(6.7978)	(0.6884)	(0.0602)	(1.0496)
FP	0.0108	0.6444	1.0000	39.4781	261.40	267.40	267.82	275.24	1.1126	6.0129	1.0084
	(0.0015)	(0.0455)		(4.09E-05)
BIIIP	1.0000	2.1278	0.3085	1.7423	207.48	213.48	213.90	221.33	0.2039	1.1841	0.2319
		(0.2470)	(0.0920)	(0.8605)
D	9.0357	3.4267	0.2110		200.21	206.21	206.46	214.06	0.0742	0.4980	0.0809
	(6.9969)	(0.6904)	(0.0548)
F	0.6240	1.2705	1.0000		225.37	229.37	229.49	234.60	0.5654	3.0709	0.3893
	(0.0850)	(0.1069)
Bill	c	k
	1.1737	1.6327			217.10	221.10	221.22	226.33	0.4401	2.3866	0.4741
	(0.0983)	(0.1637)
	α	β	ω	θ
EPLP	0.7894	1.7952	0.9385	1.1684	204.44	212.44	212.86	222.90	0.1349	0.8141	0.1292
	(0.2025)	(0.6112)	(0.3829)	(1.2585)
	δ	β	ϒ	θ
EWP	0.8588	1.3030	0.8717	1.2662	204.62	212.62	213.03	223.08	0.1408	0.8415	0.1347
	(0.3679)	(0.7394)	(0.2408)	(1.2007)

Table 4. Estimates of models for kevlar strands failure times data.

The estimated variance-covariance matrix for the DP distribution is given by:

equation

and the 95% two-sided asymptotic confidence intervals for for λ, δ, β and θ are given by 7.2834 ± 13.3269; 3.3711 ± 1.3493, 0.2015 ± 0.11799, and 0.3744 ± 2.0572, respectively.

Plots of the fitted densities and the histogram, observed probability vs predicted probability are given in Figure 6.

statistics-and-mathematical-sciences-probability-plots

Figure 6. Fitted densities and probability plots for kevlar failure times.

The LR test statistics of the hypothesis H₀:FP vs H_a:DP and H0:BIIIP vs H_a:DP are 61.31 (p-value<0.0001) and 7.40 (p-value=0.0065). DP distribution is significantly better than FP and BIIIP distributions. Also, the LR test indicates that DP distribution is also significantly better than Fisk and Burr-III distributions, however, there is no difference between DP and Dagum distribution based on the LR test. DP distribution has the smallest goodness of _t statistic W* and A* values as well as the smallest SS value among all the models that were fitted [43-45]. A comparison of DP distribution with the non-nested EWP and EPLP distributions using the AIC, AICC, BIC, W*, A*, and SS statistics, clearly shows that it is the superior model. Hence, DP distribution is the “best” fit for the data when compared to all the other models that were considered.

Conclusion

A new class of distributions called the Dagum Power Series (DPS) is proposed and studied. The DPS distribution has Dagum Poisson, Dagum geometric, Dagum logarithmic, Dagum binomial and several other distributions as special cases. The DPS distribution is exible for modeling various types of lifetime and reliability data. We also obtain closed-form expressions for the moments, conditional moments, mean deviations, Lorenz and Bonferroni curves, distribution of order statistics and Renyi entropy. Methods of finding estimators such as Minimum Distance, Maximum Product of Spacing, Least Squares, Weighted Least Squares, and Maximum Likelihood were discussed. The special cases of DG and DP distributions re-discussed in details.

References

Dagum C. A new model of personal income distribution: specification and estimation. Economie Applique'e. 1977;30:413-437.
Dagum C. The generation and distribution of income, the lorenz curve and the gini ratio. Economie Applique'e. 1980;33:327-367.
Domma F. L'andamento della hazard function nel modello diDagum a tre parametri, Quaderni di Statistica. 2002;4:103-114.
Domma F, et al. The beta-dagum distribution: definition and properties. Commun in Stat Theor Methods, in press. 2013.
Huang S, et al. Exponentiated kumaraswamy dagum distribution with applications to income and lifetime data. J Stat Distrib Appl. 2014;1:1-20.
Oluyede BO, et al. A new class of generalized dagum distribution with applications to income and lifetime data. J Stat Econom Methods. 2014;3:125-151.
Oluyede BO, et al. The dagum-poisson distribution: model, properties and application. Electr J App Stat Anal. 2016;9:169-197.
Marshall NA, et al. A new method for adding a parameter to a family of distributions with applications to the exponential and weibull families. Biometrika. 1997;84:641-652.
Johnson NL, et al. Continuous univariate distributions. John Wiley and Sons, New York. 1994.
Bourguignon M, et al. A new class of fatigue life distributions. J Stat Comput Simu. 2014;12:2619-2635.
Chahkandi M, et al. On some lifetime distributions with decreasing failure rate. Comput Stat Data Anal. 2009;53:4433-4440.
Mahmoudi E, et al. Generalized exponential power series distributions. Comput Stat Data Anal. 2012;56:4047-4066.
Morais AL, et al. A compound class of weibull and power series distributions. Comput Stat Data Anal. 2011;5:1410-1425.
Silva RB, et al. The compound family of extended Weibull power series distributions, Comput Stat Data Anal. 2013;58:352-367.
Silva RB, et al. The burr xii power series distributions: a new compounding family. Braz J Probab Stat. 2014;29:565-589.
Silva RB, et al. A new compounding family of distributions: The generalized gamma power series distributions. J Comput Appl Math. 2016;303:119-139.
Huang S, et al. Exponentiated kumaraswamy dagum distribution with applications to income and lifetime data. J Stat Distrib Appl. 2014;1:1-20.
Kleiber C. A guide to the dagum distribution. In: Duangkamon, edition C. modeling income distributions and lorenz curves series: economic studies in inequality, social exclusion and well being. Springer, New York. 2008.
Kleiber C, et al. Statistical size distribution in economics and actuarial sciences. John Wiley and Sons. 2003.
Chen RCH, et al. Estimating parameters in a continuous univariate distribution with a shifted origin. J R Stat Soc. 1983;45:394-403.
Adamidis K, et al. A lifetime distribution with decreasing failure rate. Stat Probab Lett. 1998;39:35-42.
Hoskings JRM. L-Moments: Analysis and estimation of distributions using linear combinations of order statistics. J R Stat Soc. 1990;52:105-124.
Mahmoudi E, et al. Exponentiated weibull-poisson distribution: model, properties and applications. Math Comput Simul. 2013;92:76-97.
Chambers JM, et al. Graphical methods for data analysis, belmont, CA: Wadsworth. 1983.
Murthy DP, et al. Weibull models. 2004;505:180.
Adamidis K, et al. On a generalization of the exponential-geometric distribution. Stat Probab Lett. 2005;73:259-269.
Andrews, et al. Data: A collection of problems from many fields for the student and research work. Springer, New York. 1985.
Barlow, et al. A Bayesian analysis of stress-rupture life of kevlar 49/epoxy spherical pressure vessels, In: Proc. Canad. Conf. Appl. Statist. New York: Marcel Dekker. 1984.
Barreto SW, et al. A generalization of the exponential-poisson distribution. Stat Probab Lett. 2009;79:2493-2500.
Barreto SW, et al. A new distribution with decreasing, increasing and upside-down bathtub failure rate. Comput Stat Data Anal. 2010;54:935-944.
Barreto SW, et al. The Weibull-geometric distribution. J Stat Comput Simu. 2011;81:645-657.
Cooray K, et al. A generalization of the half- normal distribution with applications to lifetime data. Commun Stat Theor Methods. 2008;37:1323-1337.
Chen G,et al. A general purpose approximate goodness-of-fit test. J Qual Technol. 1995;27:154-161.
Domma F, et al. Maximum likelihood estimation in dagum distribution with censored samples. J Appl Stat. 2011;38:2971-2985.
Hoskings JRM, et al. L-Moments: Analysis and estimation of distributions using linear combinations of order statistics. J R Stat Soc. 1990;52:105-124.
Klein JP, et al. Survival Analysis: Techniques for censored and truncated data. Spring-verlag, New York. 1997.
Kus C. A new lifetime distribution. Comput Stat Data Anal. 2007;51:4497-4509.
Lu W, et al. A new compounding life distribution: the Weibull–Poisson distribution. J Appl Stat. 2012;39:21-38.
McDonald B. Some generalized functions for the size distribution of income. Econometrica. 1984;52:647-663.
McDonald B, et al., A generalization of the beta distribution with application, J Econom. 1995;69:133-152.
Marshall AW, et al. A new method for adding a parameter to a family of distributions with applications to exponential and weibull families. Biometrika. 1997;84:641-652.
Morais AL, et al. A compound class of weibull and power series distributions. Comput Stat Data Anal. 2011;55:1410-1425.
Pararai M, et al. Exponentiated power lindley-poisson distribution: properties and applications. Commun Stat-Theory Methods. 2017;46:4726-4755.
Rezaei S, et al. A new three parameter lifetime distribution. Statistics. 2011:1-26.
Swain J, et al. Least squares estimation of distribution function in Johnson's translation system. J Stat Comput Simul. 1988;29:271-297.