A Statistical Profile of Arsenic Prevalence in the Mekong Delta Region

  • Uyen Huynh, Department of Mathematics, Faculty of Science, Mahidol University, Bangkok, Thailand
  • Nabendu Pal, Department of Mathematics, University of Louisiana at Lafayette, Lafayette, Louisiana, USA
  • Buu-Chau Truong, Faculty of Mathematics and Statistics, Ton Duc Thang University, Ho Chi Minh City, Vietnam
  • Man Nguyen, Department of Mathematics, Faculty of Science, Mahidol University, Bangkok, Thailand
Keywords: Skew-normal distribution, nonlinear regression, parameter estimation, bootstrap method, confidence interval, Akaike information criterion (AIC)


This work presents a novel approach to modeling the concentration of arsenic in groundwater in An Phu district, An Giang province, in the Mekong Delta Region (MDR) of Vietnam, based on data from a sample of water wells. Arsenic contamination is a major problem in Vietnam, especially in the MDR, where a large population depends on groundwater pumped through tubewells for daily consumption as well as irrigation. Carrying out a detailed chemical test to measure arsenic at every possible groundwater extraction site is time-consuming and expensive. However, with a suitable statistical regression model we can construct a statistical profile of arsenic concentration over a given area, which can then be used to predict arsenic concentration at a new site within the same surveyed area based solely on its geographic characteristics. First, we give a brief overview of the textbook regression model with normally distributed (Gaussian) errors. Then, we present a more general model based on the skew-normal distribution (SND) for the errors. The SND is a generalization of the normal distribution and hence gives our regression model greater flexibility. We provide a step-by-step approach to estimating all parameters of the regression model, which is new, easy to apply, and quite different from the approaches followed by other researchers. The sampling distributions of the parameter estimates are then studied using the bootstrap method, which enables one to construct interval estimates of the model parameters. The methodology of using SND errors to develop a regression model that profiles arsenic prevalence in the MDR can easily be adopted by investigators for many similar applied research problems.
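To make two ideas from the abstract concrete, the following is a minimal Python sketch, not the authors' estimation procedure. It assumes the standard (Azzalini) parameterization of the skew-normal density, f(x) = (2/ω) φ(z) Φ(αz) with z = (x − ξ)/ω, which reduces to the normal density when the shape parameter α is zero. It also shows a generic percentile bootstrap interval for an arbitrary statistic. The function names (`skew_normal_pdf`, `percentile_bootstrap_ci`) and the toy data are illustrative assumptions and do not come from the paper.

```python
import math
import random


def normal_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def skew_normal_pdf(x, loc=0.0, scale=1.0, shape=0.0):
    """Skew-normal density in the assumed (loc, scale, shape) parameterization.

    With shape == 0 the factor Phi(0) = 1/2 cancels the leading 2,
    leaving the ordinary normal density with mean loc and sd scale.
    """
    z = (x - loc) / scale
    return 2.0 / scale * normal_pdf(z) * normal_cdf(shape * z)


def percentile_bootstrap_ci(data, statistic, n_boot=2000, level=0.95, seed=0):
    """Generic percentile bootstrap interval for statistic(data).

    Resamples the data with replacement n_boot times, recomputes the
    statistic each time, and returns the empirical quantiles that
    bracket the central `level` share of the replicates.
    """
    rng = random.Random(seed)
    n = len(data)
    replicates = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower_index = int(math.floor((1.0 - level) / 2.0 * (n_boot - 1)))
    upper_index = int(math.ceil((1.0 + level) / 2.0 * (n_boot - 1)))
    return replicates[lower_index], replicates[upper_index]


def sample_mean(values):
    """Arithmetic mean, used here only as an example statistic."""
    return sum(values) / len(values)


# Toy data (hypothetical, not arsenic measurements from the study).
toy_data = [float(v) for v in range(1, 11)]
ci_lower, ci_upper = percentile_bootstrap_ci(toy_data, sample_mean)
```

In a regression setting, the same resampling idea would be applied to the fitted parameter estimates rather than to a sample mean; how the authors resample and estimate is described in the full paper, not in this sketch.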

