Thailand Statistician https://ph02.tci-thaijo.org/index.php/thaistat <p style="text-align: justify;">The main objective of Thailand Statistician is to encourage research in statistics and related fields in order to support the need for new knowledge and techniques as called upon by other subject matters. This journal is devoted to publication of original research papers, expository research and survey articles, and short research notes in pure and applied statistics, and other related fields.</p> en-US wararit@tu.ac.th (Associate Professor Dr.Wararit Panichkitkosolkul) suvimol.p@sci.kmutnb.ac.th (Assistant Professor Dr.Suvimol Phanyaem) Wed, 25 Dec 2024 20:04:40 +0700 OJS 3.3.0.8 http://blogs.law.harvard.edu/tech/rss 60 A New Class of Distributions for Modelling Continuous Positively Skewed Data Sets https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257208 <p><span class="fontstyle0">In this paper, we propose a new class of distributions by introducing a new constant into the existing model. We discuss general properties of the family, such as the density function, quantile function and hazard rate function. We then discuss a member of the family that takes the exponential distribution as the baseline distribution. Various properties of the model, such as the quantile function, moments, moment generating function, order statistics, stress-strength parameter, and mean residual life function, are discussed. We also examine the mean, variance, skewness and kurtosis of the proposed model numerically. Expressions for the Rényi and Shannon entropies are also derived. Maximum likelihood estimation, maximum product spacing and least squares estimation are used to estimate the unknown parameters of the proposed distribution.
A simulation study is performed to study the behaviour of the estimates based on their mean squared errors. Lastly, we apply our proposed model to two real data sets.</span> </p> Nishant Kumar Srivastava, Sanjay Kumar Singh, Vikas Kumar Sharma, Umesh Singh Copyright (c) 2024 http://creativecommons.org/licenses/by-nc-nd/4.0 https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257208 Wed, 25 Dec 2024 00:00:00 +0700 Setbacks of Joint Location and Dispersion Control Charts https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257210 <p><span class="fontstyle0">It is common practice to monitor or control a process with control charts. The Shewhart <img src="https://latex.codecogs.com/svg.image?\bar{X}" alt="equation">- and R- or S-charts are the most commonly used for monitoring process location and dispersion, respectively. The literature reveals that, until lately, the tradition has been to apply the charts for location and dispersion independently, but some recent works have considered them jointly. The use of three-sigma limits, estimated parameters, and multiple charting has been shown to deteriorate the performance of joint chart schemes. In the literature, works exist on joint charts for <img src="https://latex.codecogs.com/svg.image?\bar{X}" alt="equation">-<span class="fontstyle1"> and R-charts when the process parameters are known and the process is in a state of control, on <img src="https://latex.codecogs.com/svg.image?\bar{X}" alt="equation">- and R-charts when the process parameters are unknown and the process is in a state of control, and on <img src="https://latex.codecogs.com/svg.image?\bar{X}" alt="equation">- and S-charts (here, the S-chart has a one-sided control limit) for both known and unknown process parameters when the process is in a state of control. For the works so mentioned, the in-control average run length was used as the sole index for measuring the charts' performance.
Similar works on such joint charts for both the in-control and out-of-control states with estimated parameters, where the performance is evaluated in terms of the average and median run lengths, are lacking in the literature, and this work will fill the gap. Therefore, in this work, joint <img src="https://latex.codecogs.com/svg.image?\bar{X}" alt="equation">- and <img src="https://latex.codecogs.com/svg.image?%20S^2" alt="equation">-charts will be extensively considered, with both one-sided and two-sided control limits for the <img src="https://latex.codecogs.com/svg.image?%20S^2" alt="equation">-chart, using the information from the unconditional run length (RL) cumulative distribution function (cdf) and its percentiles (mainly the median). New control limit constants will be provided to guarantee the desired in-control performance for the joint chart.</span></span></p> Samson Offorma Ugwu, Akinenyene Udo Udum Copyright (c) 2024 http://creativecommons.org/licenses/by-nc-nd/4.0 https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257210 Wed, 25 Dec 2024 00:00:00 +0700 Bayesian Analysis of Generalized Linear Models for Count Data using Stan https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257211 <p><span class="fontstyle0">In this paper, a Bayesian approach is used to model count data with generalized linear models (GLMs) using the Stan language. Commonly used GLMs include logistic regression for binary data and Poisson regression or negative binomial regression for count data. The loo package is used for model selection. The Bayesian model comparison criteria LOOIC and WAIC are applied to evaluate the models.
Furthermore, parallel simulation tools are also implemented with extensive use of R.</span> </p> Firdoos Yousuf, Athar Ali Khan Copyright (c) 2024 http://creativecommons.org/licenses/by-nc-nd/4.0 https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257211 Wed, 25 Dec 2024 00:00:00 +0700 Non-mixture Cure Fraction Model in the Presence of Right Censored Survival Data and Covariates Based on Nadarajah-Haghighi Distribution with Applications to Medical Data https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257213 <p><span class="fontstyle0">In survival analysis, it is assumed that every individual in the study population will eventually experience the event of interest if followed up for a long period of time. However, there are some occasions in which a proportion of the population will never experience the event of interest. Hence, cure fraction models are used in modelling this type of data. The present paper introduces the Nadarajah-Haghighi distribution in the presence of cure fraction, right censored data and covariates. Comprehensive statistical properties of the model were explored. Inferences for the proposed model were obtained under the maximum likelihood and Bayesian approaches. A simulation study was conducted to ascertain the performance of the maximum likelihood estimates. The proposed methodology was illustrated on medical data sets using maximum likelihood and Bayesian methods. Results of the applications showed that the proposed methodology is a good competitor.</span> </p> Yakubu Aliyu, Umar Usman Copyright (c) 2024 http://creativecommons.org/licenses/by-nc-nd/4.0 https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257213 Wed, 25 Dec 2024 00:00:00 +0700 Further Statistical Properties of Size-Biased Geometric Distribution https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257214 <p><span class="fontstyle0">In this article, the size-biased geometric distribution is introduced.
Some of its statistical and reliability properties are discussed, such as the probability function, cumulative distribution and reliability functions, hazard rate function and moments. The unknown distribution parameter is estimated in closed form by the method of moments and the maximum likelihood method. A Monte Carlo simulation study is performed to investigate the performance of the maximum likelihood estimators. Additionally, this distribution is evaluated for the new count regression model. Real-life data from the statistical literature are considered for illustration, and the results are compared with the Poisson, geometric, discrete Weibull, discrete gamma and discrete Lindley distributions.</span> </p> Nuri Celik, Iklim Gedik Balay Copyright (c) 2024 http://creativecommons.org/licenses/by-nc-nd/4.0 https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257214 Wed, 25 Dec 2024 00:00:00 +0700 Modified Topp-Leone Distribution: Properties, Classical and Bayesian Estimation with Application to COVID-19 and Reliability Data https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257215 <p><span class="fontstyle0">In this article, we propose a new continuous model called the Modified Topp-Leone distribution. One of the main features of this model is that it has only one parameter but admits a variety of shapes for the density and hazard rate functions. We discuss its various properties, such as heavy-tailed behavior, mode, moments, quantile, median, skewness, kurtosis, moment generating function, mean deviation, various inequality measures, mean residual life, expected inactivity time function, weighted moments, entropies, and various other important reliability characteristics including stress-strength reliability, hazard rate, survival function, stochastic ordering, and order statistics. The parameter estimation of the proposed model is discussed in the classical and Bayesian paradigms.
In classical point estimation, we have used the method of maximum likelihood, ordinary and weighted least squares, Cramér-von Mises, and the method of maximum product of spacings. The asymptotic distribution of the maximum likelihood estimator is also provided and used to develop the asymptotic confidence interval. In Bayesian estimation, we have used informative and non-informative priors under symmetric and asymmetric loss functions to obtain the Bayes estimator of the unknown parameter. The highest posterior density interval of the parameter is also obtained. An extensive simulation study is presented to assess the different estimation procedures. Finally, three real datasets are examined to show the utility of the proposed model in the real world.</span> </p> Bhupendra Singh, Shikhar Tyagi, Ravindra Pratap Singh, Abhishek Tyagi Copyright (c) 2024 http://creativecommons.org/licenses/by-nc-nd/4.0 https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257215 Wed, 25 Dec 2024 00:00:00 +0700 An Alternative Version of Half-Logistic Distribution: Properties, Estimation and Application https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257227 <p><span class="fontstyle0">In this paper, we derived a new version of the half-logistic distribution by using the standard half-logistic distribution and a one-parameter family introduced by Zhao et al. (2020). The new model’s statistical properties were determined mathematically. The novelty of the new model is that its hazard rate function was proved to have three more shapes than the existing one in the literature, which has only an increasing shape. The efficiency of the new model's parameter estimation was checked with simulated data sets and different classical estimation methods. We also showed numerically that parameter estimation for our proposed model behaves better than for the existing one in the literature.
A real data set was analyzed, and it was found that the proposed model outperforms the half-logistic distribution and other competing models in fitting these data. Finally, we showed that the exponentiated version of our proposed model is better than the exponentiated half-logistic distribution.</span> </p> Abd-Elmonem A. M. Teamah, Ahmed A. Elbanna, Ahmed M. Gemeay Copyright (c) 2024 http://creativecommons.org/licenses/by-nc-nd/4.0 https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257227 Wed, 25 Dec 2024 00:00:00 +0700 Performance Comparison of the Quantile Regression Coefficient Estimation with Outliers https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257228 <p>This research aims to compare the coefficient estimation performance of quantile regression (QR) at different percentiles and simple regression (SR) for datasets with outliers. Five quantile positions were considered, comprising QR(10)th, QR(25)th, QR(50)th, QR(75)th and QR(90)th. These were obtained by modifying the probability density functions for the kernel function adjustment under errors. The results were then compared with the simple regression coefficient estimates. After applying a variety of simulation scenarios, the mean absolute error (MAE) was used as the criterion for comparison. The results showed that the SR model performed best for large sample sizes, and that it also performed best for small sample sizes, together with the QR(25)th and QR(50)th models. Furthermore, the QR(25)th and QR(50)th models gave the most efficient estimates of the quantile regression coefficients with outliers for moderate sample sizes. They also indicated that there were changes scattered around zero when the sample size was small. Comparing the SR model and the QR(50)th model for every sample size, it was found that their model coefficients differ slightly.
Considering kurtosis and skewness for the SR and all QR models, the results revealed that both values increased with small sample sizes, decreased with moderate sample sizes, and increased again with large sample sizes. Therefore, quantile regression coefficient estimation is effective in relationship analysis, provides estimates that are more accurate than a single answer, and is suitable as an alternative analysis for skewed data.</p> Pimpan Ampanthong, Rungruttikarn Moungmai Copyright (c) 2024 http://creativecommons.org/licenses/by-nc-nd/4.0 https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257228 Wed, 25 Dec 2024 00:00:00 +0700 Impact of COVID-19 Pandemic on Road Traffic Accident Severity in Thailand: An Application of K-Nearest Neighbor Algorithm with Feature Selection Techniques https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257229 <p>This study aims to develop road crash severity classifiers utilizing available government data from Thailand, specifically focusing on the period of the COVID-19 outbreak and possible factors. Three primary machine learning algorithms, namely logistic regression, random forest, and K-Nearest Neighbor (KNN), were utilized. Focusing on factors affecting accident severity, feature importance was analyzed by the stepwise, mean decrease in accuracy (MDA) and mean decrease in impurity (MDI) selection techniques. Combining the three ML models with the three feature selection techniques, nine different predictive models were built and evaluated based on accuracy, precision, recall, and F1-score. The results indicated that KNN with feature selection outperforms the other candidate models, particularly KNN-MDA and KNN-MDI for the pre-pandemic and pandemic periods, respectively. Among the eight features, vehicle type was the most important factor causing a higher number of fatal accidents, followed by region, crash type, weather, and time of incidence. That is, motorcycle riders and pedestrians are especially susceptible.
Therefore, this study can aid practitioners in formulating effective management policies to enhance road safety.</p> Teerawat Simmachan, Sangdao Wongsai, Rattana Lerdsuwansri, Pichit Boonkrong Copyright (c) 2024 http://creativecommons.org/licenses/by-nc-nd/4.0 https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257229 Wed, 25 Dec 2024 00:00:00 +0700 Detecting Automobile Insurance Fraud: A Novel Two-Step Strategy Using Effective Ensemble Learning Techniques https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257230 <p class="Keywords" style="margin-bottom: .0001pt; line-height: 115%;">Like other industries, insurance companies processed large volumes of data during the industrial revolution. The industry’s major concern is the increasing number of fraudulent claims. These claims not only cause financial losses but also affect the entire industry, honest policyholders, and society. Machine learning (ML) approaches have recently been utilized in insurance fraud detection to reduce such losses. To improve further, this article introduces a novel prediction framework for fraudulent claims called the two-step models. An anonymous US auto insurance dataset was used to demonstrate and evaluate the framework. Under-sampling and the synthetic minority over-sampling technique (SMOTE) were used to balance the data. Mutual information was employed as a feature selection tool. Five proposed models were built in two steps. First, eight basic ML models were implemented.
The top three effective models were chosen based on their F-measure scores. Then, their predicted values were used as components to construct the two-step models using ensemble techniques. Statistical tests were utilized to appraise all models. Numerical results indicated that the proposed models yielded significant enhancements. Moreover, the most effective model is a combination of SMOTE and improved multilayer perceptron (IMLP). This research could help insurance firms improve their fraud detection systems to prevent insurance abuse.</p> Wikanda Phaphan, Samach Sathitvudh, Tikumporn Suntornsuwan, Kamon Budsaba, Teerawat Simmachan Copyright (c) 2024 http://creativecommons.org/licenses/by-nc-nd/4.0 https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257230 Wed, 25 Dec 2024 00:00:00 +0700 Clustering Based Approach to Quantify the Unsafe Driving at Uncontrolled Median Openings due to Forced Deceleration: A Case Study in India https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257231 <p>Uncontrolled median openings with no lane discipline lead to complex driving phenomena, since approaching through vehicles (major traffic stream) generally have to slow down due to the presence of U-turns in the opposing traffic stream (minor traffic stream). It is observed that the approaching through vehicles, despite being the major stream vehicles, are forced to slow down, which sometimes also leads to evasive deceleration beyond limits of safety. In the present study, field data were collected and the speed reduction from the start of the median opening to the center of the median opening is evaluated. Further, the effect of vehicle category and travel lane on speed reduction is also examined.
Linear and quadratic probabilistic models have been developed to estimate the reduction in speed based on speed at the center of the median opening, vehicle category, and the zone/lane in which the vehicle is travelling. The safe reduction in speed is determined using the IRC 66-1976 guidelines for stopping sight distance (SSD) calculation. Thereafter, an undesirable driving index/safety index is proposed by conducting k-means clustering on the percentage reduction in speed. If the average reduction in speed increases beyond the proposed index, traffic facilities require deeper analysis for smooth flow of vehicles. The present study will help in identifying the proportion of safe and unsafe reductions in speed at median openings so that proper steps can be undertaken to improve the safety and comfort of road users.</p> Malaya Mohanty, Rachita Panda, Ritik Raj Arya, Sarthak Lenka Copyright (c) 2024 http://creativecommons.org/licenses/by-nc-nd/4.0 https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257231 Wed, 25 Dec 2024 00:00:00 +0700 A Study on Bivariate Inverse Topp-Leone Model to Counter Heterogeneous Data: Properties, Dependence Studies, Classical and Bayesian Estimation https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257232 <p>In probability and statistics, reliable modeling of bivariate continuous characteristics remains a real challenge. During the analysis of bivariate data, we have to deal with heterogeneity that is present in the data. Therefore, to deal with such a scenario, we investigate a novel technique based on the Farlie-Gumbel-Morgenstern (FGM) copula and the inverse Topp-Leone (ITL) model in this study. The idea is to use the oscillating functionalities of the FGM copula and the flexibility of the ITL model to propose a serious bivariate solution for the modeling of bivariate lifetime phenomena to counter the heterogeneity present in data. Both theory and practice are developed.
In particular, we determine the main functions related to the model, such as the cumulative distribution function, probability density function, and various useful dependence measures for bivariate modeling. The model parameters are estimated using the maximum likelihood method and a Bayesian framework based on the Markov chain Monte Carlo (MCMC) methodology. Following that, model comparison methods are used to compare models. To explain the findings and show that better models are recommended, the famous Drought and Burr data sets are used.</p> Shikhar Tyagi Copyright (c) 2024 http://creativecommons.org/licenses/by-nc-nd/4.0 https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257232 Wed, 25 Dec 2024 00:00:00 +0700 Exponentiated Arctan-X Family of Distribution: Properties, Simulation and Applications to Insurance Data https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257233 <p>A new family of distributions is investigated in this paper. The exponentiation method is used to generate the new family. The proposed family is known as the exponentiated arctan-X family of distributions, which is beneficial and plays an important role in modelling data sets. Various properties such as reliability analysis, moments, moment generating function, quantile function, and median are studied. The exponential distribution is taken as a special case to demonstrate the strength of the family. To estimate the parameters of the exponentiated arctan exponential distribution, the frequentist approach, i.e., maximum likelihood estimation, is used. A rigorous Monte Carlo simulation analysis is used to determine the efficiency of the obtained estimators. A complete percentage point study is discussed and presented. The exponentiated arctan exponential model is demonstrated using two lifetime data sets. The proposed family of distributions is compared to well-known two-, three-, and four-parameter competitors.
For model comparison, we used the most precise tests to determine whether the exponentiated arctan-exponential distribution is more useful than competing models.</p> Habibah Rahman Copyright (c) 2024 http://creativecommons.org/licenses/by-nc-nd/4.0 https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257233 Wed, 25 Dec 2024 00:00:00 +0700 Discriminating Between New Weibull Pareto Distribution and Weibull Distribution https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257234 <p class="ParagraphLevel1" style="margin-bottom: .0001pt; text-align: justify; text-justify: inter-cluster; line-height: 115%;">In this paper, test statistics based on likelihood functions, order statistics, and population quantiles are suggested for distinguishing between the new Weibull Pareto distribution and the Weibull distribution. The percentiles of the proposed test statistics are tabulated with the aid of simulated sampling distributions of the test statistics, owing to the non-tractability of their exact sampling distributions. The power of the test statistics is also tabulated, and a comparison between the powers for a given sample and level of significance is made.</p> K.N.V.R. 
Lakshmi, Boyapati Srinivasa Rao Copyright (c) 2024 http://creativecommons.org/licenses/by-nc-nd/4.0 https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257234 Wed, 25 Dec 2024 00:00:00 +0700 Designing Bayesian Modified Group Chain Sampling Plan for Quality Regions Based on Average Number of Defectives https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257236 <p class="ParagraphLevel1" style="margin-bottom: .0001pt; line-height: 115%;">Acceptance sampling provides a midway between zero and 100% inspection, which can provide a statistically reliable inference on deciding whether or not to accept the entire lot based on a sample inspection. If prior information is available then, according to experts, the Bayesian approach is the best approach to reach a correct decision. Based on the Bayesian approach, this study proposes a Bayesian modified group chain sampling plan (BMGChSP) to estimate the average number of defectives. The Poisson distribution is used to estimate the average number of defectives, and the gamma distribution is used as a prior distribution for the average probability of acceptance. Instead of basing a sampling plan on a point-wise description of quality like the conventional plans, the proposed plan uses a quality range for wider coverage and offers better protection to both consumers and producers. In this paper, by considering the consumer’s risk and producer’s risk, quality regions are estimated for the average probability of acceptance.
For all quality regions, the acceptable quality level (AQL) and limiting quality level (LQL) are used to find design parameters for BMGChSP, where AQL is associated with consumer’s risk and LQL is associated with producer’s risk. The values based on all possible combinations of design parameters for BMGChSP are tabulated and inflection points are found. Based on the minimum number of defectives, the findings show that BMGChSP is a better substitute for industrial practitioners. In a comparison study by OC curves, it is concluded that the proposed BMGChSP gives a smaller number of defectives than the existing BGChSP.</p> Waqar Hafeez, Nazrina Aziz, Said Farooq Shah, Akbar Ali Khan Copyright (c) 2024 http://creativecommons.org/licenses/by-nc-nd/4.0 https://ph02.tci-thaijo.org/index.php/thaistat/article/view/257236 Wed, 25 Dec 2024 00:00:00 +0700