An Estimation of Survival and Hazard Rate Functions of Exponential Rayleigh Distribution

In this paper, we use the maximum likelihood estimation method to estimate the survival and hazard rate functions of the Exponential Rayleigh (ER) distribution. The estimates are based on samples of real data for lung cancer and stomach cancer obtained from the Iraqi Ministry of Health and Environment, Department of Medical City, Tumor Teaching Hospital, drawn from patients' diagnosis records and the number of days each patient remained in the hospital until death.

Keywords: Exponential Rayleigh (ER) distribution; Maximum Likelihood Estimation; Chi-Square Test; Real Life Data Application.

The chi-square test is used to determine whether the observed distribution of data fits the distribution that is expected (i.e., to test the goodness of fit); it is used to analyze categorical data.

The goodness of fit test checks whether a given sample of data follows a proposed distribution. The chi-square statistic is calculated as [4,6]:

\[ \chi_c^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} \]

where $k$ is the number of classes, $O_i$ is the observed frequency in class $i$, and $E_i$ is the expected frequency in class $i$.

The chi-square test is easy to calculate and can be applied to both continuous and discrete variables. It is not recommended for small samples (fewer than 25 observations). The asymptotic distribution of the test statistic is the chi-square distribution with $(k-1)$ degrees of freedom. The null and alternative hypotheses are as follows:

$H_0$: The failure time data are distributed according to the Exponential Rayleigh p.d.f.
$H_1$: The failure time data are not distributed according to the Exponential Rayleigh p.d.f.

The null hypothesis is rejected when the calculated value exceeds the tabulated value at the chosen level of significance and $(k-1)$ degrees of freedom, that is, when $\chi_c^2 > \chi^2_{\text{tab},(k-1)}$.
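The test statistic and the decision rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `chi_square_statistic` and `reject_null` and the three-class frequencies are hypothetical, and the tabulated value is assumed to be looked up separately from a chi-square table.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum over classes of (O_i - E_i)^2 / E_i."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of classes")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def reject_null(calculated, tabulated):
    """Reject H0 only when the calculated statistic exceeds the tabulated value."""
    return calculated > tabulated


# Hypothetical frequencies for three classes (illustration only, not from the paper).
observed = [10, 20, 30]
expected = [20, 20, 20]
stat = chi_square_statistic(observed, expected)  # (100 + 0 + 100) / 20 = 10.0
```

For instance, a calculated value of 17.915 against a tabulated value of 21.67 gives `reject_null(17.915, 21.67) == False`, so the null hypothesis is not rejected.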
The exponential distribution is the most commonly used distribution for lifetime data analysis. Its simplicity and mathematical feasibility made it the most widely used lifetime model in reliability (survival) theory. It is commonly used to model the time until something occurs in the process.
A continuous non-negative random variable $T$ is said to have an Exponential distribution with parameter $\alpha$ if its probability density function is given by [3]:

\[ f(t;\alpha) = \alpha e^{-\alpha t}; \quad t \ge 0;\ \alpha > 0 \tag{1} \]

and zero otherwise, where $\alpha$ is the scale parameter. The cumulative distribution function is:

\[ F(t;\alpha) = 1 - e^{-\alpha t}; \quad t \ge 0;\ \alpha > 0 \tag{2} \]

The one-parameter Rayleigh distribution is one of the most widely used distributions and is essential in statistics and operations research. Lord Rayleigh introduced it in 1880 [5]. It plays a major role in modeling and analyzing lifetime data in many fields, for instance project effort loading modeling, communication, survival and reliability theory, physical sciences, technology, diagnostic imaging, clinical subjects, and applied statistics [7].
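A continuous non-negative random variable $T$ is said to have a Rayleigh distribution with parameter $\beta$ if its probability density function is given by [2]:

\[ f(t;\beta) = \beta t\, e^{-\frac{\beta}{2} t^2}; \quad t \ge 0;\ \beta > 0 \tag{3} \]

and zero otherwise, where $\beta$ is the scale parameter. The cumulative distribution function is:

\[ F(t;\beta) = 1 - e^{-\frac{\beta}{2} t^2}; \quad t \ge 0;\ \beta > 0 \tag{4} \]

The Exponential Rayleigh (ER) distribution is obtained by mixing the cumulative distribution function of the Exponential distribution in equation (2) with the cumulative distribution function of the Rayleigh distribution in equation (4), so that the ER survival function is the product of the two component survival functions. Its cumulative distribution function, probability density function, survival function and hazard rate function are, respectively:

\[ F(t;\alpha,\beta) = 1 - e^{-\left(\alpha t + \frac{\beta}{2} t^2\right)}; \quad t \ge 0;\ \alpha, \beta > 0 \tag{5} \]

\[ f(t;\alpha,\beta) = (\alpha + \beta t)\, e^{-\left(\alpha t + \frac{\beta}{2} t^2\right)}; \quad t \ge 0;\ \alpha, \beta > 0 \tag{6} \]

\[ S(t) = e^{-\left(\alpha t + \frac{\beta}{2} t^2\right)} \tag{7} \]

\[ h(t) = \alpha + \beta t \tag{8} \]

and zero otherwise, where $\alpha$ and $\beta$ are scale parameters.

The $r$-th moment about the origin (equation (9)), for $r = 1, 2, 3, \dots$, is derived in [1]. The mean is the first moment, obtained by setting $r = 1$ in equation (9) [1]. The variance and the moment generating function of the ER distribution are also given in [1].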
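The four functions of the ER distribution can be written directly in code. The sketch below assumes the form $F(t) = 1 - e^{-(\alpha t + \beta t^2/2)}$ for the cumulative distribution function; the function names and the parameter values are hypothetical and chosen only for illustration.

```python
import math


def er_survival(t, alpha, beta):
    """S(t) = exp(-(alpha*t + beta*t^2/2))."""
    return math.exp(-(alpha * t + 0.5 * beta * t * t))


def er_cdf(t, alpha, beta):
    """F(t) = 1 - S(t)."""
    return 1.0 - er_survival(t, alpha, beta)


def er_hazard(t, alpha, beta):
    """h(t) = alpha + beta*t, a straight line in t."""
    return alpha + beta * t


def er_pdf(t, alpha, beta):
    """f(t) = h(t) * S(t) = (alpha + beta*t) * exp(-(alpha*t + beta*t^2/2))."""
    return er_hazard(t, alpha, beta) * er_survival(t, alpha, beta)


# Hypothetical parameter values and time point (illustration only).
alpha, beta, t = 0.02, 0.005, 10.0

# The ER survival function is the product of the Exponential and Rayleigh
# survival functions, which is one way to read the "mixing" of the two cdfs.
exp_survival = math.exp(-alpha * t)
rayleigh_survival = math.exp(-0.5 * beta * t * t)
product = exp_survival * rayleigh_survival
```

Because the hazard is linear with a positive slope, any positive $\beta$ gives a hazard rate that increases with time.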

Maximum Likelihood Estimation
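Let $t = (t_1, t_2, \dots, t_n)$ be a random sample of size $n$ drawn from the ER distribution with pdf given by equation (6). The complete data likelihood function $L(\alpha, \beta \mid t)$ for a given random sample can be expressed as

\[ L(\alpha, \beta \mid t) = \prod_{i=1}^{n} (\alpha + \beta t_i)\, e^{-\left(\alpha t_i + \frac{\beta}{2} t_i^2\right)} \tag{13} \]

The natural log-likelihood function is:

\[ \ln L(\alpha, \beta \mid t) = \sum_{i=1}^{n} \ln(\alpha + \beta t_i) - \alpha \sum_{i=1}^{n} t_i - \frac{\beta}{2} \sum_{i=1}^{n} t_i^2 \tag{14} \]

Differentiating the natural log-likelihood function partially with respect to $\alpha$ and $\beta$ respectively, and setting the derivatives equal to zero, yields

\[ \frac{\partial \ln L}{\partial \alpha} = \sum_{i=1}^{n} \frac{1}{\alpha + \beta t_i} - \sum_{i=1}^{n} t_i = 0 \tag{15} \]

\[ \frac{\partial \ln L}{\partial \beta} = \sum_{i=1}^{n} \frac{t_i}{\alpha + \beta t_i} - \frac{1}{2} \sum_{i=1}^{n} t_i^2 = 0 \tag{16} \]

The maximum likelihood estimators, denoted by $\hat{\alpha}_{ML}$ and $\hat{\beta}_{ML}$, are the values of $\alpha$ and $\beta$ that maximize $L(\alpha, \beta \mid t)$; they are obtained by solving equations (15) and (16). These equations have no closed-form solution; therefore, the iterative Newton-Raphson method can be applied to find the solution.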
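In the Newton-Raphson method, the solution of the likelihood equations at iteration $(h+1)$ is obtained through the following iterative process:

\[ \begin{bmatrix} \alpha^{(h+1)} \\ \beta^{(h+1)} \end{bmatrix} = \begin{bmatrix} \alpha^{(h)} \\ \beta^{(h)} \end{bmatrix} - J_h^{-1} \begin{bmatrix} g_1\!\left(\alpha^{(h)}, \beta^{(h)}\right) \\ g_2\!\left(\alpha^{(h)}, \beta^{(h)}\right) \end{bmatrix} \]

where

\[ g_1 = \frac{\partial \ln L}{\partial \alpha}, \qquad g_2 = \frac{\partial \ln L}{\partial \beta}, \qquad J_h = \begin{bmatrix} \dfrac{\partial^2 \ln L}{\partial \alpha^2} & \dfrac{\partial^2 \ln L}{\partial \alpha\, \partial \beta} \\[2ex] \dfrac{\partial^2 \ln L}{\partial \beta\, \partial \alpha} & \dfrac{\partial^2 \ln L}{\partial \beta^2} \end{bmatrix} \]

are evaluated at $\left(\alpha^{(h)}, \beta^{(h)}\right)$. The first partial derivatives are as in equations (15) and (16), and the second partial derivatives are obtained as follows:

\[ \frac{\partial^2 \ln L}{\partial \alpha^2} = -\sum_{i=1}^{n} \frac{1}{(\alpha + \beta t_i)^2}, \qquad \frac{\partial^2 \ln L}{\partial \beta^2} = -\sum_{i=1}^{n} \frac{t_i^2}{(\alpha + \beta t_i)^2}, \qquad \frac{\partial^2 \ln L}{\partial \alpha\, \partial \beta} = -\sum_{i=1}^{n} \frac{t_i}{(\alpha + \beta t_i)^2} \]

Now, based on the invariance property of the maximum likelihood estimator, the survival function at mission time $t$ of the ER distribution can be obtained by replacing $\alpha$ and $\beta$ in equation (7) by their estimators:

\[ \hat{S}_{ML}(t) = e^{-\left(\hat{\alpha}_{ML}\, t + \frac{\hat{\beta}_{ML}}{2} t^2\right)} \]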
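The iteration above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: it assumes the ER pdf $(\alpha + \beta t)e^{-(\alpha t + \beta t^2/2)}$, all function names are hypothetical, the sample is synthetic (evenly spaced quantiles of an ER distribution with made-up parameters), and a step-halving safeguard that is not described in the paper is added so that both parameters stay positive.

```python
import math


def log_likelihood(alpha, beta, times):
    """ln L for the assumed ER pdf (alpha + beta*t) * exp(-(alpha*t + beta*t^2/2))."""
    return sum(math.log(alpha + beta * t) - alpha * t - 0.5 * beta * t * t for t in times)


def score_and_hessian(alpha, beta, times):
    """First and second partial derivatives of ln L with respect to alpha and beta."""
    g_a = g_b = h_aa = h_ab = h_bb = 0.0
    for t in times:
        w = 1.0 / (alpha + beta * t)
        g_a += w - t
        g_b += t * w - 0.5 * t * t
        h_aa -= w * w
        h_ab -= t * w * w
        h_bb -= t * t * w * w
    return (g_a, g_b), (h_aa, h_ab, h_bb)


def fit_er_newton(times, alpha=0.01, beta=0.01, tol=1e-8, max_iter=200):
    """Newton-Raphson for (alpha, beta), started from the given initial values."""
    for _ in range(max_iter):
        (g_a, g_b), (h_aa, h_ab, h_bb) = score_and_hessian(alpha, beta, times)
        det = h_aa * h_bb - h_ab * h_ab
        # Newton direction d = -J^{-1} g, written out for the 2x2 case.
        d_a = -(h_bb * g_a - h_ab * g_b) / det
        d_b = -(h_aa * g_b - h_ab * g_a) / det
        if max(abs(d_a), abs(d_b)) < tol:
            break
        # Step halving (an added safeguard, not from the paper): accept the step
        # only if both parameters stay positive and ln L does not decrease.
        current = log_likelihood(alpha, beta, times)
        step = 1.0
        while step > 1e-12:
            new_a, new_b = alpha + step * d_a, beta + step * d_b
            if new_a > 0 and new_b > 0 and log_likelihood(new_a, new_b, times) >= current:
                alpha, beta = new_a, new_b
                break
            step *= 0.5
        else:
            break  # no acceptable step found; return the current iterate
    return alpha, beta


def er_quantile(u, alpha, beta):
    """Inverse cdf: solve alpha*t + beta*t^2/2 = -ln(1 - u) for t >= 0."""
    e = -math.log(1.0 - u)
    return (-alpha + math.sqrt(alpha * alpha + 2.0 * beta * e)) / beta


def er_survival_hat(t, alpha_hat, beta_hat):
    """Plug-in survival estimate, using the invariance property of the MLE."""
    return math.exp(-(alpha_hat * t + 0.5 * beta_hat * t * t))


# Synthetic, deterministic sample (hypothetical parameters, not the hospital data).
true_alpha, true_beta, n = 0.02, 0.006, 200
times = [er_quantile((i + 0.5) / n, true_alpha, true_beta) for i in range(n)]
alpha_hat, beta_hat = fit_er_newton(times)
survival_at_10 = er_survival_hat(10.0, alpha_hat, beta_hat)
```

Since $\ln(\alpha + \beta t_i)$ is concave in $(\alpha, \beta)$ and the remaining terms are linear, the log-likelihood is concave, which is why a damped Newton step is a reasonable choice here.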

Practical Application (1)
In this section, real data for lung cancer are analyzed. Because of the importance of this disease, we collected mortality data from the Iraqi Ministry of Health and Environment, Department of Medical City, Tumor Teaching Hospital, covering the period from 1/1/2015 to 1/1/2021. No data were taken during 2021 because of the spread of the Covid-19 epidemic, as patients did not stay in the hospital for more than a day or two for fear of infection. The sample consists of 100 observations. All patients died during the study period, at different times, which means that the sample is a complete data set.
Maximum likelihood estimation is applied to this sample to calculate the estimated values of the two parameters $\alpha$ and $\beta$. The calculated chi-square value is 17.915; comparing it with the tabulated value at the 0.01 level of significance and 9 degrees of freedom shows that the calculated value is less than the tabulated value (21.67). The null hypothesis $H_0$ is therefore not rejected, and the data are taken to follow the Exponential Rayleigh distribution.
Now, the estimated values of the two parameters of the Exponential Rayleigh distribution obtained by the ML method are used to find numerical values for the probability density function $\hat{f}(t)$, cumulative distribution function $\hat{F}(t)$, survival function $\hat{S}(t)$ and hazard rate function $\hat{h}(t)$. The following important notes can be made on the results in Table (1):

1. The values of the probability density function increase until $t = 6$ and then decrease for the failure times $7 \le t \le 29$, so $t = 6$ is the mode of this function. The differences between the values of the probability density function are very small, and the values are close to one another.
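The location of the mode can also be written in closed form. The short derivation below is a sketch that assumes the ER pdf has the form $(\alpha + \beta t)e^{-(\alpha t + \beta t^2/2)}$; the symbol $t^{*}$ for the mode is introduced here for illustration and does not appear in the paper.

```latex
\begin{align*}
f(t;\alpha,\beta) &= (\alpha+\beta t)\,e^{-\left(\alpha t+\frac{\beta}{2}t^{2}\right)},\\
\frac{d f}{d t} &= \left[\beta-(\alpha+\beta t)^{2}\right]e^{-\left(\alpha t+\frac{\beta}{2}t^{2}\right)},\\
\frac{d f}{d t}=0 \;&\Longrightarrow\; \alpha+\beta t^{*}=\sqrt{\beta}
\;\Longrightarrow\; t^{*}=\frac{\sqrt{\beta}-\alpha}{\beta},\qquad \beta>\alpha^{2}.
\end{align*}
```

Under this assumption the density rises up to $t^{*}$ and falls afterwards, which is the pattern described in note 1. When $\beta \le \alpha^{2}$ the density is decreasing for all $t \ge 0$ and the mode is at zero.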

Practical Application (2)
For the stomach cancer data, maximum likelihood estimation is applied to calculate the estimated values of the two parameters $\alpha$ and $\beta$ as follows:

Number of observations        81
Initial value of $\alpha$     0.01
Initial value of $\beta$      0.01
Estimated value of $\alpha$   0.0182
Estimated value of $\beta$    0.0063

The following graphic shows the frequency histogram for stomach cancer patients; the vertical axis represents the number of patients and the horizontal axis represents the number of days from the patient's admission to the hospital until death.

The calculated chi-square value is 19.526; comparing it with the tabulated value at the 0.01 level of significance and 9 degrees of freedom shows that the calculated value is less than the tabulated value (21.67). The null hypothesis $H_0$ is therefore not rejected, and the data are taken to follow the Exponential Rayleigh distribution.

Now, the estimated values of the two parameters of the Exponential Rayleigh distribution obtained by the ML method are used to find numerical values for the probability density function $\hat{f}(t)$, cumulative distribution function $\hat{F}(t)$, survival function $\hat{S}(t)$ and hazard rate function $\hat{h}(t)$. Important notes on the results in Table (2) are then discussed.
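As a rough sketch of how such a results table could be produced, the snippet below evaluates the four estimated functions on a grid of failure times. It assumes the ER forms $\hat{S}(t) = e^{-(\hat{\alpha} t + \hat{\beta} t^2/2)}$ and $\hat{h}(t) = \hat{\alpha} + \hat{\beta} t$, and it assumes that 0.0182 is the estimate of $\alpha$ and 0.0063 the estimate of $\beta$. The function name `er_table` and the grid of days 1 to 30 are hypothetical; the output is not claimed to reproduce the paper's table.

```python
import math

ALPHA_HAT = 0.0182  # assumed to be the reported estimate of alpha
BETA_HAT = 0.0063   # assumed to be the reported estimate of beta


def er_table(times, alpha, beta):
    """Evaluate pdf, cdf, survival and hazard of the assumed ER form at each time."""
    rows = []
    for t in times:
        survival = math.exp(-(alpha * t + 0.5 * beta * t * t))
        hazard = alpha + beta * t
        rows.append({
            "t": t,
            "pdf": hazard * survival,
            "cdf": 1.0 - survival,
            "survival": survival,
            "hazard": hazard,
        })
    return rows


# Hypothetical grid of failure times in days (not the observed data).
rows = er_table(range(1, 31), ALPHA_HAT, BETA_HAT)
for row in rows[:5]:
    print("{t:>3d}  {pdf:.4f}  {cdf:.4f}  {survival:.4f}  {hazard:.4f}".format(**row))
```

In a table built this way, the survival column falls and the hazard column rises as the failure time grows, for any positive parameter values.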

Conclusion
Based on the real data for lung cancer and stomach cancer, the differences between the values of the probability density function $\hat{f}(t)$ are very small. The values of the cumulative distribution function $\hat{F}(t)$ increase as the failure times of the patients in the hospital increase. The probability of survival is high at small failure times and low at large ones; that is, there is an inverse relationship between failure times and the survival function $\hat{S}(t)$. The values of the hazard function $\hat{h}(t)$ increase as the failure times of the patients in the hospital increase, which means there is a direct relationship between the failure times and the hazard function.