Document Type: Research Paper
Authors
University of Zanjan
Abstract
Keywords
[Research]
Application of multivariate statistics and geostatistical techniques to identify the spatial variability of heavy metals in groundwater resources
F. Khanduzi, A. Parizanganeh*, A. Zamani
Department of Environmental Sciences, Faculty of Science, Environmental Science Research Laboratory, University of Zanjan, Zanjan, Iran
* Corresponding author’s Email: h_zanganeh@znu.ac.ir
(Received: Feb. 29.2015 Accepted: July. 22.2015)
ABSTRACT
The performance of geostatistical and spatial interpolation techniques for estimation of spatial variability of heavy metals and water quality mapping of groundwater resources in Ramiyan district (Golestan province Iran) were investigated. 24 spring/well water samples were collected and the concentration of heavy metals (Ni, Co, Pb, Cd and Cu) was determined using Differential Pulse Polarography. Multivariate and geostatistical methods have been applied to differentiate the influences of natural processes and human activities as to the pollution of heavy metals in groundwater across the study area. The results of the Cluster Analysis and Factor Analysis show that Ni and Co are grouped in the factor F1, whereas, Pb and Cd in F2 and Zn and Cu in F3. The probability of presence of elevated levels for the three factors was predicted by utilizing the most appropriate Variogram Model, whilst the performance of methods, was evaluated by using Mean Absolute Error, Mean Bias Error and Root Mean Square Error. The spatial structure results show that the variograms and crossvalidation of the six variables can be modeled with three methods, namely,the Radial Basis Fraction, Inverse Distance Weight and Ordinary Kriging. Moreover, results illustrated that Radial Basis Fraction method was the best as it had the highest precision and lowest error. The Geographic Information System can fully display spatial patterns of heavy metal concentrations, in groundwater resources of the study area.
Key words: Groundwater resources, Heavy metals contamination, Geostatistical, Multivariate statistics, Interpolation, Spatial mapping.
INTRODUCTION
Water is the basic requirement for all life on earth and an increase in the population and urbanization necessitates growth of agricultural and industrial sectors, increasing demand for fresher water. When surface water is not available; the alternative is to depend on Groundwater (GW) (Subramani et al., 2012). A variety of natural and human factors, affects the quality and use of water resources. Heavy metals are among the major pollutants of these sources (Marcovecchio et al., 2007). Many human activities, such as agriculture, mining and the combustion of fossil fuels, release heavy metals into the environment. Thereby, with an increase in their concentration and a decrease in the capacity of soils towards heavy metals, these leach into the soil solution and GW and then they accumulate in living tissues among people through the food chain (Mantovi et al., 2003; Lei et al., 2008), in addition to being sensitive indicators for monitoring changes in the aqueous environment. In environmental monitoring, such as groundwater quality investigations, the collected data may harbor signiﬁcant uncertainty, including complex or extremely complicated variations in the observed values of measurable characteristics, of the investigated medium or pollution sources in time and space (Yeh et al., 2006). Geostatistics, is a spatial statistical technique used in environmental monitoring, which is applied to analyze and map distributions of pollutant concentrations and their spatial and temporal variations. It is more widely used to analyze the collected data from groundwater resources (Yu et al., 2003; Yeh et al., 2006; Nas & Berkta, 2006; Khodapanah & Sulaiman, 2009; Uyan & Cay, 2010; Amin et al., 2010; Belkhiri et al., 2011; Sarukkalige, 2012). Furthermore, the application of different multivariate statistical techniques helps in the interpretation of complex data matrices, for a better understanding of water quality of the studied systems. These methods allow identification of possible factors/sources which inﬂuence the water systems and offer a valuable tool for a reliable management of water (Shrestha & Kazama, 2007; Iscen et al., 2008; Ogunribido & Kehinde–Philips, 2011; Li et al., 2012; Bajpayee et al., 2012). Multivariate geostatistical methods combine the advantages of geostatistical techniques and multivariate analysis, while incorporating spatial or temporal correlations and multivariate relationships to detect and map the varied sources of spatial variation on different scales (Smyth & Istok, 1989; Einax & Soldt, 1999; Yeh et al., 2006; Zheng et al., 2008; Lin et al., 2009). Excavation of coal mines, agricultural activities and development of industrial parks in Ramiyan, in Golestan Province (Iran), provoke evaluation of contaminations resulting from these activities. The lack of a systematic investigation of the probable contamination by heavy metals in Ramiyan, urges an assessment of the quality of groundwater sources in this area.
The aquifer is the main source for drinking and irrigation critical for the local residents. 24 well/spring samples were collected and analyzed by voltametric method for determination of such heavy metals. The presence and concentration of heavy metals were determined and the results were compared to the maximum contaminant level, specified by WHO and the Institute of Standards and Industrial Research of Iran (ISIRI). This study aims at investigating the contents of Cu, Ni, Zn, Cd, Pb and Co in the groundwater resources of Ramiyan, including the analysis of their spatial distribution as well as unveiling their possible sources by integrating multivariate statistical and geostatistical methods.
MATERIAL AND METHODS
Site description
Golestan Province is located in the Southeast of the Caspian Sea in Northern Iran. The study area is Ramiyan district, with an area of 780.73 km^{2 }situated between 54˚ 45´ and 55˚ 15´ east longitude and 36˚ 48´ and 37˚ 12´ north latitude. The main activity carried out in this area is agriculture and the main crops grown are wheat, oilseeds, rice and garden products (Mosaedi & Gharib, 2008). Due to the presence of coal mines, industrial and mining activities have also been developed across the study area.
Sample collection
The samples for the assessment of groundwater pollution with heavy metals were collected from twenty four stations (wells/springs) in the study area (Fig 1, Table 1). The sampling was carried out in summer 2012 and from each station three replicate samples were selected for analysis. The glassware and vessels were treated in 10% (v/v) nitric acid solution for 24 hrs and were washed with distilled and deionized water. The samples were collected in polypropylene containers, labeled and a few drops of HNO_{3} (ultrapure grade) of pH < 2 were added immediately, to prevent the loss of metals, bacterial and fungal growth. These were then stored in a refrigerator.
Multivariate and geostatistical analysis
The multivariate analysis provides techniques, such as the Principle Component Analysis (PCA), Factor Analysis (FA) and Cluster Analysis (CA) for classifying the interrelationship of measured variables (Zamani et al., 2012). The Cluster Analysis was performed on the data, by utilizing the Ward Method and Squared Euclidean Distance characteristic. Multivariate geostatistical methods combine the advantages of geostatistical techniques and multivariate analysis, whereas, the geostatistical techniques have been applied to illustrate the incorporating spatial or temporal correlations and multivariate relationships, in order to map the various sources of spatial variation on divergent scales (Faccinelli et al., 2001). Geostatistics is presented as a collection of techniques for solving estimation problems involving spatial variables. It includes a variety of tools such as interpolation, integration and differentiation of hydrogeologic parameters to produce the prediction surface and other derived characteristics from measurements at known locations (Sahoo & Jha, 2014).
The first step in the geostatistical estimation, is a provision of a model that can facilitate the computation of semivariogram value for any possible sampling intervals. The most commonly used models are the Spherical, Exponential, Gaussian and Pure Nugget effect (Isaaks & Srivastava, 1989). The semivariogram plays a fundamental role in the analysis of geostatistical data by employing the Kriging Method. Prior to performing Kriging, a valid semivariogram model has to be selected and the model parameters have to be estimated (Pang et al., 2009). An experimental semivariogram is calculated as follows: (1)
Where, denotes the semivariogram, is the spatial interval, which is designated as lag; is the observed paired data, when the interval and are the measured values, when the Z(x) values are as xi+h, respectively. Valid models which are commonly fitted to the experimental semi variograms include the spherical, Gaussian and exponential functions. These are characterized by a sill, which represents the covariance accounted for by the model and a range that signifies the extent of spatial correlation. The value of the semi variograms is referred to as the nugget effect, where the model approaches the abscissa. These significant geostatistical parameters can indicate the spatial variation and relativity of regionalized variables under a certain scale (Yang et al., 2009).
Fig 1. Location map of Ramiyan and the sampling points.
Interpolation methods
Kriging Method was used as estimating tool in sustainable management of groundwater. It is a geostatistical interpolation technique that considers both the distance and the degree of variation between known data points when estimating values in unknown areas (Sahoo & Jha, 2014). This technique is an exact interpolation estimator, which is used to detect the best linear unbiased estimate. The optimum linear unbiased estimator must have a minimum variance of error of estimation (Einax & Soldt, 1999; Ahmadi & Sedghamiz, 2008).
In order to estimate the values of some locations which are not sampled, it is necessary to solve the following linear equation:
(2)
denotes the estimate of the unknown value and are the weights of known neighboring points .
Kriging is an estimating method that is stable on weighty mobile average coincident. This estimator is known as a best unbiased linear estimator. Spherical, circular, Gaussian and exponential functions are available models when the Kriging method is ordinary (Nas, 2009). Goovaerts describes the detail of the method (Goovaerts, 1997). Because it uses statistical models, it allows a variety of map outputs, including predictions, prediction standard errors, probability, and quantile maps. Among the various forms of Kriging, ordinary Kriging has been used widely as a reliable estimation method (Nas, 2009). In interpolation with the Inverse Distance Weighted (IDW) method, a weight is attributed to the point to be measured. In other words weight is the function of inverse distance and closer points have more influence in estimating unknown points (Eslami et al., 2013). The amount of this weight depends on the distance of the point to another unknown point.
These weights are controlled on the bases of power ten. So, with an increase of power, the effect of the points (that are farther) diminishes, whilst a lesser power distributes the weights more uniformly between neighboring points. In this method the distance between the points counts, so that, the points of equal distance have equal weights (Balakrishnan et al., 2011). The weight factor is determined based on the distance between the data points as follows:
(3)
Where designates the weight of point which is the distance between point i and the unknown point, which is the weight on the bases of power ten and n is the number of data points (Karandish & Shahnazari, 2014). Kriging in geostatistics is similar to inverse distance weighting except that the weights are based not only on the distance between the measured sampling points but also on the overall spatial arrangement among the sampling points. The basic assumption in kringing is that the sampling points that are close to each other are similar than those that are away. Kriging is regarded as an optimal spatial interpolation method, which is a type of weighted moving average (Gorai & Kumar, 2013). The Radial Basis Functions (RBF) Methods are a series of exact interpolation techniques, where the surface must go through for each measured sample value. The basis of each function has a different shape and results in a slightly different interpolation surface (Kazemi Poshtmasari et al., 2012). RBF Methods predict values that can vary above the maximum or below the minimum of the measured values. For all RBF Methods, there is a parameter that controls the smoothness of the resulting surface. The estimated values of the methods are based on a mathematical function that minimizes the overall surface curvature, generating surfaces that are quite smooth. The differences among them are slight, so the generated surfaces are almost similar. A formula f, which minimizes the following factor [eq. (4)], is an example of the RBF technique and more specifically of the exact SP line method (Karydas et al., 2009):
(4) Where signifies the source of random error, is the measured value of an attribute at point and epsilon is the associated random error. The term represents the smoothness of the function f and the second term represents its proximity to the data (Karydas et al., 2009).
Evaluation criteria
The adequacy and validity of the developed semivariogram models was tested satisfactorily by a technique called crossvalidation. The idea of crossvalidation consists of removing a datum at a time from the data set and reestimating this value from remaining data by using different variogram models. The interpolated and actual values are compared, and the model that yields the most accurate predictions is retained (Burrough & McDonnell, 1998; Karimi Nezhad et al., 2012 ;). In this paper, to compare the applied Interpolation methods, a cross validation was performed by utilizing the Mean Bias Error (MBE), Mean Absolute Error (MAE) and the Root Mean Square Error (RMSE) of the statistical parameters. When MAE and MBE shift to zero, the applied method simulates the fact well. Finally, we used the RMSE to evaluate the model performances in the crossvalidation mode. Each of these measures is such ‘dimensioned’ that, it expresses an average interpolator error in the units of the variable of interest. The smallest RMSE indicates the most accurate predictions. This method was recently adopted by many researchers (Twomey & Smith, 1996; Willmott & Matsuura, 2006; Kazemi Poshtmasari et al., 2012; Karandish & Shahnazari, 2014). These parameters are calculated according to the following equation Nos. (5 to 7):
(5)
(6)
(7)
Where Z (xi) is the observed value at point xi, Z*(xi) is the predicted value at point x and N denotes the number of samples.
RESULTS AND DISCUSSION
The extent of heavy metal contamination
The results of the analysis of target metal ions i.e., Co, Ni, Zn, Cd and Pb in samples from 24 wells/springs under study are given in Table (2). The results show that Co, Ni, Pb and Cd are evident in 100% of the samples and Zn and Cu are detected in 96% and 88% of the samples, respectively. The concentration of investigated metals (in µg/L) in the samples were found to be below their MCL and in the ranges of 5.69 92.44 for Zn, 1.23 7.06 for Pb, 0.148.40 for Cu, 0.010.99 for Cd, 1.23 21.79 for Ni and 0.49 7.79 for Co. The geographical location of the sampling stations and the average concentrations of metals at each station are shown in Table (1).
Classification survey of heavy metals by the Cluster Analysis Method
Two main groups of elements have been determined using the Cluster Analysis Method, one group includes Ni and Co and the other comprises of Pb, Cd, Zn and Cu (Fig. 2).
Principal component analysis and factor analysis
The major objective of the Factor Analysis (FA) is to reduce the contribution of less significant variables so as to further simplify even more of the data structure given by the PCA. This goal can be achieved by rotating the axis defined by the PCA and the construction of new variables, which are also called Varifactors (Shrestha & Kazama, 2007). Prior to such analysis, the raw data is commonly normalized to avoid misclassifications, due to the varied order of magnitude and range of variation of the analytical parameters (Tabachnick & Fidell, 2007). This process reduces the dimensionality of data by a linear combination of original data, to generate new latent variables which are orthogonal and uncorrelated to each other (Nkansah et al., 2010). According to the results of the Eigen values in Table (3), three factors are extracted from the available data set, which accounts for over 82.07% of all the data variation. The common factors were extracted by means of the maximumlikelihood method with the Varimaxrotation.
Nickel and cobalt, contained in the first factor, are typical emitted elements of electronic plants. The second factor includes cadmium and lead elements which are emitted by the agricultural activities and the metallurgical plant.
The third factor is loaded with zinc and copper, which are emissions of batteries, pigments and fungicides. The heavy metal grouping has been explored in plotting the first three principle components generated from these parameters (Fig. 3).
Table 1. GPS location and concentration of heavy metals in sampling stations.
Station 
X 
Y 
Zn (µg/L) 
Pb (µg/L) 
Cd (µg/L) 
Cu (µg/L) 
Ni (µg/L) 
Co (µg/L) 
W_{1} 
331881 
4103244 
21.29 
2.87 
0.04 
ND 
2.48 
6.20 
W_{2} 
318661 
4093812 
36.48 
3.64 
0.15 
4.68 
5.33 
1.72 
W_{3} 
316518 
4094590 
19.22 
7.06 
0.37 
3.41 
2.69 
1.29 
W_{4} 
318301 
4097255 
64.68 
7.03 
0.99 
ND 
2.99 
1.78 
W_{5} 
331486 
4105353 
34.60 
5.88 
0.24 
5.12 
1.71 
1.51 
W_{6} 
328781 
4105267 
12.85 
2.92 
0.09 
1.86 
2.29 
2.32 
S_{1} 
327797 
4096396 
36.77 
2.54 
0.08 
1.75 
2.28 
2.11 
W_{7} 
316718 
4091641 
6.73 
3.26 
0.08 
3.06 
7.48 
2.18 
S_{2} 
343356 
4086870 
56.52 
3.42 
0.04 
3.72 
2.00 
2.31 
W_{8} 
318492 
4091655 
15.11 
5.56 
0.08 
7.33 
6.98 
1.81 
W_{9} 
315558 
4090055 
32.65 
4.18 
0.09 
6.60 
5.07 
2.03 
W_{10} 
330180 
4107361 
52.07 
6.46 
0.2 
7.43 
1.89 
1.12 
W_{11} 
315050 
4101192 
19.68 
4.59 
0.13 
2.47 
4.64 
2.25 
W_{12} 
334472 
4103554 
18.46 
3.14 
0.03 
0.14 
4.43 
0.81 
W_{13} 
322519 
4104116 
ND 
2.98 
0.15 
1.27 
5.12 
1.77 
W_{14} 
327726 
4108520 
17.88 
5.50 
0.05 
3.66 
1.98 
0.01 
W_{15} 
334669 
4104096 
50.23 
5.49 
0.05 
2.16 
2.84 
1.19 
W_{16} 
324922 
4100196 
12.62 
3.89 
0.07 
5.79 
19.30 
3.65 
W_{17} 
316424 
4099638 
10.61 
3.06 
0.05 
2.54 
3.98 
2.77 
S_{3} 
324628 
4098775 
16.15 
5.04 
0.08 
ND 
3.29 
2.77 
W_{18} 
330192 
4107354 
62.28 
3.40 
0.09 
3.22 
4.12 
1.44 
S_{4} 
335731 
4086350 
18.93 
3.28 
0.04 
1.48 
2.81 
1.32 
S_{5} 
319249 
4092456 
44.84 
3.21 
0.07 
1.83 
3.50 
2.48 
W_{19} 
318221 
4092774 
31.49 
3.14 
0.07 
4.58 
5.50 
1.82 
Table 2. Summary of statistics of heavy metal contents in water samples (µg/L).
Metal 
Ni 
Co 
Pb 
Cd 
Zn 
Cu 
Detected (%) 
100% 
100% 
100% 
100% 
96% 
88% 
Min 
1.23 
0.49 
1.23 
0.01 
5.69 
0.14 
Max 
21.79 
7.79 
7.06 
0.99 
92.44 
8.40 
Mean 
4.38 
1.99 
4.21 
0.12 
30.55 
4.05 
Standard deviation 
3.65 
1.22 
1.94 
0.17 
18.77 
14.26 
WHO Standard 
70 
 
10 
3 
3000 
1000 
ISIRI Standard 
70 
 
10 
3 
3000 
2000 
Table 3. Rotated component matrix of threefactor model^{.}
Variable 
Component 

F_{1} 
F_{2} 
F_{3} 

Ni 
0.90 
0.07 
0.11 
Co 
0.84 
0.21 
0.10 
Pb 
0.15 
0.86 
0.21 
Cd 
0.08 
0.88 
0.04 
Zn 
0.52 
0.16 
0.72 
Cu 
0.29 
0.31 
0.82 
Eigen Value 
2.21 
1.57 
1.12 
Variance (%) 
36.99 
26.27 
18.79 
Cumulative (%) 
36.99 
63.27 
82.07 
^{†}Extraction method: Principle component analysis. Rotation method: Varimax with Kaiser Normalization.
Fig. 2. Dendrogram of heavy metal concentrations in groundwater samples.
Fig. 3. Component plot in rotated space for heavy metals (Factor loading, factor 1 vs. factor 2 vs. factor 3, Rotation: varimax normalized, extraction: principle component).
Spatial structure analysis
The geostatistical analysis is to be assumed that the distribution behavior of the metal ions in the sampling stations is normal. The random and normal distribution assumptions were checked by the (KS) (KolmogorovSmirnov) Methods. Alternatively, the homogeneity and normal distribution in the data, can be achieved by transforming the obtained data to another mathematically presentation, which lowers the difference between the data. This can be achieved by using the logarithmic form of data.
The normality of heavy metal data set was checked by the Kolmogorov–Smirnov Test. It is often observed that environmental variables are lognormal (McGrath et al., 2004), and data transformation is necessary to normalize such data sets. The normality tests of the six heavy metals for the 24 samples were performed as described by KS test. It was detected that only Cu and Zn were in accordance with the normal distribution using KS (p>0.05) before data transformation. To further normalize the data logarithmic transformation was utilized (Table 4).
After the logarithmic transformation of the original data, a normal distribution can be obtained. Thus, the following calculations must be performed on the logarithms of the data. After normalizing the data Semivariogram parameters were generated for each theoretical model.
Then, the confidence level of all variograms was evaluated using the ratio of nugget variance to sill which is regarded as a criterion for classifying the spatial dependence of ground water quality parameters. If this ratio is less than 25%, then the variable has strong spatial dependence; if the ratio is between 25 and 75%, the variable has moderate spatial dependence and the ratio greater than 75%, represents weak spatial dependence (Taghizadeh et al, 2008).
The most appropriate theoretical model was selected, which was based on highest R2 and lowest RSS (Table 5).
Table 4. Normal distribution behaviors of heavy metal concentration.
Metal 
N 
Mean 
Std. Deviation 
Kolmogorov Smirnov Z 
Asymp. Sig. (2tailed) 
Ni 
74 
4.38 
3.65 
1.90 
0.001 
Co 
74 
1.99 
1.22 
1.80 
0.003 
Pb 
74 
4.21 
1.94 
1.72 
0.005 
Cd 
59 
0.12 
0.17 
2.52 
0.00 
Zn 
72 
30.53 
18.77 
1.20 
0.11 
Cu 
48 
4.05 
2.25 
1.06 
0.20 
log Ni 
74 
0.55 
0.25 
0.67 
0.75 
log Co 
74 
0.24 
0.21 
0.90 
0.38 
log Pb 
74 
0.58 
0.17 
1.14 
0.14 
log Cd 
59 
1.70 
0.34 
1.25 
0.08 
log Zn 
72 
1.39 
0.28 
0.86 
0.43 
log Cu 
48 
0.52 
0.31 
0.76 
0.69 
Table 5. Summary of the most appropriate models for different heavy metals of GW.
Heavy metals 
Transformation 
Bestfit model 
Nugget (C_{0}) 
Sill (C_{0}+C) 
Proportion (C_{0}/ C_{0}+C)×100 
R^{2} 
RSS 

F_{1} 
Ni 
Lognormal 
Exponential 
0.024 
0.276 
8.69 
0.196 
0.100 

Co 
Lognormal 
Exponential 
0.029 
0.162 
17.90 
0.194 
0.033 

F_{2} 
Pb 
Lognormal 
Exponential 
0.015 
0.060 
25.00 
0.154 
0.0032 

Cd 
Lognormal 
Exponential 
0.095 
0.412 
23.05 
0.052 
0.256 

F_{2} 
Zn 
Lognormal 
Gaussian 
0.0910 
9.647 
0.943 
0.099 
109.6 

Cu 
Lognormal 
Gaussian 
1.195 
4.805 
24.86 
0.179 
57.23 

The attributes of the semivariograms for each factor are summarized in Table (5). Semivariograms show that the first and second factors are appropriate with the Exponential Model, whereas, the third factor fits well with the Gaussian Model. The values of R2 illustrate that the semivariogram models give good descriptions of the spatial structure of the heavy metals of groundwater. The nugget/sill ratios can be regarded as the criterion to classify the spatial dependence of data sets (Liu et al., 2009). The ratio of nugget to sill (RNS) can be used to express the extent of spatial autocorrelations of environmental factors, for example, groundwater heavy metal concentrations, in this study. A low RNS indicates the strong spatial autocorrelations of heavy metal concentrations in groundwater sources, while a high RNS indicates that random effects play an important role in spatial heterogeneity of heavy metals (Zheng et al., 2008). The RNS of six heavy metals demonstrate weak spatial correlations for all factors. Crossvalidation permits the determination as to which model provides the best predictions (Adhikary et al., 2012).
Table 6. Geostatistical analyses of heavy metals in groundwater (Ramiyan area).
Heavy Metal 
Method 
Model 
Cross validation 

MBE 
MAE 
RMSE 

Ni 
OK 
Exponential 
0.426 
2.587 
3.958 
IDW 
1 
0.240 
2.216 
3.828 

2 
0.388 
2.650 
4.377 

3 
0.268 
2.877 
4.812 

RBF 
SP line with Tension 
0.041 
2.242 
3.687 

Multiquadric 
0.308 
3.636 
4.352 

Inverse Multiquadric 
0.011 
2.218 
3.628 

Co 
OK 
Exponential 
0.018 
0744 
1.190 
IDW 
1 
0.083 
0.661 
1.125 

2 
0.030 
0.632 
1.143 

3 
0.074 
0.728 
1.176 

RBF 
SP line with Tension 
0.002 
0.727 
1.132 

Multiquadric 
0.027 
0.739 
1.173 

Inverse Multiquadric 
0.041 
0.695 
1.107 

Pb 
OK 
Exponential 
0.050 
1.258 
1.436 
IDW 
1 
0.027 
1.453 
1.715 

2 
0.013 
1.617 
1.799 

3 
0.027 
1.703 
1.843 

RBF 
SP line with Tension 
0.168 
1.382 
1.618 

Multiquadric 
0.052 
1.604 
1.808 

Inverse Multiquadric 
0.010 
1.312 
1.440 

Cd 
OK 
Exponential 
0.003 
0.103 
0.197 
IDW 
1 
0.003 
0.109 
0.199 

2 
0.015 
0.098 
0.194 

3 
0.026 
0.092 
0.191 

RBF 
SP line with Tension 
0.000 
0.111 
0.196 

Multiquadric 
0.020 
0.109 
0.199 

Inverse Multiquadric 
0.004 
0.107 
0.202 

Zn 
OK 
Gaussian 
0.008 
15.82 
18.57 
IDW 
1 
2.927 
18.01 
20.56 

2 
3.268 
19.14 
21.64 

3 
2.402 
19.33 
23.34 

RBF 
SP line with Tension 
1.039 
15.86 
18.02 

Multiquadric 
0.801 
17.73 
20.07 

Inverse Multiquadric 
0.024 
17.21 
17.15 

Cu 
OK 
Gaussian 
0.056 
1.961 
1.006 
IDW 
1 
0.225 
2.278 
2.609 

2 
0.247 
2.401 
2.878 

3 
0.426 
2.675 
3.039 

RBF 
SP line with Tension 
0.202 
2.020 
2.417 

Multiquadric 
0.228 
2.846 
3.125 

Inverse Multiquadric 
0.047 
1.944 
2.267 
The applicability of different semivariogram models is tested by crossvalidation and best model is selected (Table 6). In this study, ordinary kriging (OK), IDW and RBF were utilized to estimate six heavy metal concentrations. Comparisons between different methods were carried out by the MAE, MBE, and RMSE statistical parameters. In this research, the Radial Basis Functions Method (Inverse Multiquadric Model) was found to be the most suitable method for the estimation of Ni mapping. Whereas, statistics for the geostatistical method also show that Ordinary Kriging for Pb (Exponential Model), Zn and Cu (Gaussian Model); the Inverse Distance Weighted method for Co (power 2) and Cd (power 3) provides a much better estimation for results of concentrations, than the other methods (Table 6).
After plotting the values of heavy metal concentrations of groundwater for various sample locations, drinking water quality maps for heavy metal concentrations, can be drawn to demonstrate locations, where the water is almost clean or to some extent at risk (Fig 4).
Filled contour map of Co Filled contour map of Ni
Filled contour map of Cd Filled contour map of Pb
Filled contour map of Cu Filled contour map of Zn
Fig. 4. Filled contour maps of heavy metals in sampling groundwater.
CONCLUSIONS
Due to the complexity and a large variation of environmental data sets, the application of geostatistical and multivariate statistical methods is recommended.
The main objective of this study was to determine the best estimators for providing heavy metals maps in ground water resources in Ramyian district. The application of multivariate statistical and geostatistical methods were performed on six heavy metals and three principal components were identified, so as to represent the variability of heavy metals in groundwater sources. From the spatial distributions of 6 heavy metals, it was evident that the parent materials and anthropogenic factors played important roles in heavy metal concentrations of GW in Ramiyan. The effects of these two factors varied with that of the heavy metals. The results of the Cluster Analysis (CA) and Factor Analysis (FA) on the heavy metals, showed that Ni and Co was grouped in factor F1, Pb and Cd in F2 and Zn and Cu in F3. The probability of the presence of elevated levels of the heavy metals studied in the groundwater was predicted by using the bestfit semivariogram model. The performance of methods was evaluated by utilizing the Mean Average Error (MAE), Mean Bias Error (MBE), and Root Mean Square Error (RMSE). Moreover, results showed that Radial Basis Functions (RBF), Inverse distance weighted (IDW) and Ordinary Kriging (OK) methods were the best methods employed to estimate the Ni; Co and Cd; Pb, Zn and Cu mappings, respectively. The Geographic Information System (GIS) can fully display the spatial patterns and relationships among landscape indices and heavy metal concentrations, in the groundwater of this area of study. Application of different multivariate statistical techniques interprets complex data matrices and better understanding of water quality. Although the concentrations of investigated metals in the collected samples were found to be below their maximum contaminant level values reported by WHO and ISIRI but the source of heavy
metals contamination should be investigated specially in hot points within the studied area.
ACKNOWLEDGMENTS
Sincere gratitude to Rural Water and Wastewater Company (Golestan Province, Iran) for partial financial support (Grant Number 4987). The authors gratefully acknowledge Younes Khosravi’s contribution to this work.