In statistics, originally in geostatistics, kriging or Kriging (/ˈkriːɡɪŋ/), also known as Gaussian process regression, is a method of interpolation based on a Gaussian process governed by prior covariances. Under suitable assumptions on the prior, kriging gives the best linear unbiased prediction (BLUP) at unsampled locations. Interpolating methods based on other criteria such as smoothness (e.g., smoothing spline) may not yield the BLUP. The method is widely used in the domain of spatial analysis and computer experiments. The technique is also known as Wiener–Kolmogorov prediction, after Norbert Wiener and Andrey Kolmogorov.
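As a rough illustration of interpolation governed by a prior covariance, the following minimal sketch predicts values at new locations from a handful of observations. It rests on several assumptions that are not taken from the text: a squared-exponential covariance, a known (zero) mean as in simple kriging, one-dimensional locations, and a tiny "nugget" added for numerical stability. The names `sq_exp_kernel` and `simple_kriging` are hypothetical.

```python
import numpy as np

def sq_exp_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between 1-D location arrays a and b (assumed kernel)."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def simple_kriging(x_obs, z_obs, x_new, mean=0.0, nugget=1e-10):
    """Predict at x_new from observations (x_obs, z_obs), assuming a known mean."""
    # Covariance among observed locations; the nugget only stabilises the solve.
    K = sq_exp_kernel(x_obs, x_obs) + nugget * np.eye(len(x_obs))
    # Covariance between observed and new locations, shape (n_obs, n_new).
    k_star = sq_exp_kernel(x_obs, x_new)
    # Kriging weights: one column of weights per new location.
    weights = np.linalg.solve(K, k_star)
    # Prediction is the mean plus a weighted sum of observed deviations from it.
    z_hat = mean + weights.T @ (z_obs - mean)
    # Prediction variance: prior variance minus the part explained by the data.
    var = sq_exp_kernel(x_new, x_new).diagonal() - np.sum(k_star * weights, axis=0)
    return z_hat, np.maximum(var, 0.0)

x_obs = np.array([0.0, 1.0, 2.5, 4.0])
z_obs = np.sin(x_obs)
x_new = np.array([1.0, 1.75])   # first point coincides with an observation
z_hat, var = simple_kriging(x_obs, z_obs, x_new)
print(z_hat, var)
```

At a location that coincides with an observation, the prediction reproduces the observed value and the prediction variance is close to zero; away from the data the variance grows, which is how the method quantifies uncertainty at unsampled locations.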
GPR may refer to: Science and technology: Gaussian process regression, an interpolation method in statistics; General-purpose register of a microprocessor; G-protein coupled receptor; Ground-penetrating radar; Ground potential rise, in electrical engineering. Other: General practice residency, in dentistry in
A best linear unbiased estimator based on assumptions on covariances, make use of the Gauss–Markov theorem to prove independence of the estimate and error, and use very similar formulae. Even so, they are useful in different frameworks: kriging is made for estimation of a single realization of a random field, while regression models are based on multiple observations of a multivariate data set. The kriging estimation may also be seen as
A spline in a reproducing kernel Hilbert space, with the reproducing kernel given by the covariance function. The difference with the classical kriging approach is provided by the interpretation: while the spline is motivated by a minimum-norm interpolation based on a Hilbert-space structure, kriging is motivated by an expected squared prediction error based on a stochastic model. Kriging with polynomial trend surfaces
A uniform (improper) prior is taken for $p(\mathbf{b})$, and as $p(\boldsymbol{\varepsilon})$ is a marginal distribution, it does not depend on $\mathbf{b}$. Therefore the log-probability is $$\log p(\mathbf{b} \mid \boldsymbol{\varepsilon}) = \log p(\boldsymbol{\varepsilon} \mid \mathbf{b}) + \cdots = -\frac{1}{2}\boldsymbol{\varepsilon}^{\mathrm{T}}\boldsymbol{\Omega}^{-1}\boldsymbol{\varepsilon} + \cdots,$$ where
A black-box model built over a designed set of computer experiments. In many practical engineering problems, such as the design of a metal forming process, a single FEM simulation might be several hours or even a few days long. It is therefore more efficient to design and run a limited number of computer simulations, and then use a kriging interpolator to rapidly predict the response in any other design point. Kriging
A detailed explanation. where the literals $\left\{\operatorname{Var}_{x_i}, \operatorname{Var}_{x_0}, \operatorname{Cov}_{x_i x_0}\right\}$ stand for: Once the covariance model or variogram, $C(\mathbf{h})$ or $\gamma(\mathbf{h})$, has been defined and is valid over the whole field of analysis of $Z(x)$, we can write an expression for
A linear regression of $Z(x_0)$ on the other $z_1, \ldots, z_n$. The interpolation by simple kriging is given by: The kriging error is given by: which leads to the generalised least-squares version of the Gauss–Markov theorem (Chiles & Delfiner 1999, p. 159): See also Bayesian Polynomial Chaos. Although kriging
A quantity $Z\colon \mathbb{R}^n \to \mathbb{R}$, at an unobserved location $x_0$, is calculated from a linear combination of the observed values $z_i = Z(x_i)$ and weights $w_i(x_0),\; i = 1, \ldots, N$: $$\hat{Z}(x_0) = \sum_{i=1}^{N} w_i(x_0)\, Z(x_i).$$ The weights $w_i$ are intended to summarize two extremely important procedures in
A random function, of which only one realization is known: the set $z(x_i)$ of observed data. With only one realization of each random variable, it is theoretically impossible to determine any statistical parameter of the individual variables or the function. The proposed solution in the geostatistical formalism consists in assuming various degrees of stationarity in
A random process, but rather it allows one to build a methodological basis for the spatial inference of quantities in unobserved locations and to quantify the uncertainty associated with the estimator. A stochastic process is, in the context of this model, simply a way to approach the set of data collected from the samples. The first step in geostatistical modeling is to create a random process that best describes
A spatial inference process: When calculating the weights $w_i$, there are two objectives in the geostatistical formalism: unbiasedness and minimal variance of estimation. If the cloud of real values $Z(x_0)$ is plotted against the estimated values $\hat{Z}(x_0)$,
A vector, $$\mathbf{y} \equiv \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix},$$ and the predictor values are placed in the design matrix, $$\mathbf{X} \equiv \begin{pmatrix} 1 & x_{12} & x_{13} & \cdots & x_{1k} \\ 1 & x_{22} & x_{23} & \cdots & x_{2k} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n2} & x_{n3} & \cdots & x_{nk} \end{pmatrix},$$ where each row
Is a Lagrange multiplier used in the minimization of the kriging error $\sigma_k^2(x)$ to honor the unbiasedness condition. Simple kriging is mathematically the simplest, but the least general. It assumes the expectation of the random field is known and relies on a covariance function. However, in most applications neither
Is a quadratic programming problem. The stationary point of the objective function occurs when $$2\mathbf{X}^{\mathrm{T}}\mathbf{\Omega}^{-1}\mathbf{X}\mathbf{b} - 2\mathbf{X}^{\mathrm{T}}\mathbf{\Omega}^{-1}\mathbf{y} = 0,$$ so
Is a vector of the $k$ predictor variables (including a constant) for the $i$th data point. The model assumes the conditional mean of $\mathbf{y}$ given $\mathbf{X}$ to be a linear function of $\mathbf{X}$ and that
Is a vector of unknown constants, called "regression coefficients", which are estimated from the data. If $\mathbf{b}$ is a candidate estimate for $\boldsymbol{\beta}$, then the residual vector for $\mathbf{b}$ is $\mathbf{y} - \mathbf{X}\mathbf{b}$. The generalized least squares method estimates $\boldsymbol{\beta}$ by minimizing
Is also Gaussian, with a mean and covariance that can be simply computed from the observed values, their variance, and the kernel matrix derived from the prior. In geostatistical models, sampled data are interpreted as the result of a random process. The fact that these models incorporate uncertainty in their conceptualization does not mean that the phenomenon (the forest, the aquifer, the mineral deposit) has resulted from
Is also interpreted as a random variable located in $x_0$, a result of the linear combination of variables. Kriging seeks to minimize the mean square value of the following error in estimating $Z(x_0)$, subject to lack of bias: The two quality criteria referred to previously can now be expressed in terms of
Is equivalent to $$\hat{\boldsymbol{\beta}} = \underset{\mathbf{b}}{\operatorname{argmin}}\; \mathbf{y}^{\mathrm{T}}\mathbf{\Omega}^{-1}\mathbf{y} + \mathbf{b}^{\mathrm{T}}\mathbf{X}^{\mathrm{T}}\mathbf{\Omega}^{-1}\mathbf{X}\mathbf{b} - 2\mathbf{b}^{\mathrm{T}}\mathbf{X}^{\mathrm{T}}\mathbf{\Omega}^{-1}\mathbf{y},$$ which
Is equivalent to applying ordinary least squares (OLS) to a linearly transformed version of the data. This can be seen by factoring $\mathbf{\Omega} = \mathbf{C}\mathbf{C}^{\mathrm{T}}$ using a method such as Cholesky decomposition. Left-multiplying both sides of $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$ by $\mathbf{C}^{-1}$ yields an equivalent linear model: $$\mathbf{y}^{*} = \mathbf{X}^{*}\boldsymbol{\beta} + \boldsymbol{\varepsilon}^{*}, \quad\text{where}\quad \mathbf{y}^{*} = \mathbf{C}^{-1}\mathbf{y}, \quad \mathbf{X}^{*} = \mathbf{C}^{-1}\mathbf{X}, \quad \boldsymbol{\varepsilon}^{*} = \mathbf{C}^{-1}\boldsymbol{\varepsilon}.$$ In this model, $\operatorname{Var}[\boldsymbol{\varepsilon}^{*} \mid \mathbf{X}] = \mathbf{C}^{-1}\mathbf{\Omega}\left(\mathbf{C}^{-1}\right)^{\mathrm{T}} = \mathbf{I}$, where $\mathbf{I}$
Is important to notice that the squared residuals cannot be used in the previous expression; an estimator of the errors' variances is needed. To do so, a parametric heteroskedasticity model or nonparametric estimator can be used. Estimate $\beta_{FGLS1}$ using $\widehat{\Omega}_{\text{OLS}}$ using weighted least squares: The procedure can be iterated. The first iteration
Is interpreted as a random variable located in $x_0$, as well as the values of neighboring samples $Z(x_i),\ i = 1, \ldots, N$. The estimator $\hat{Z}(x_0)$
Is known as the precision matrix (or dispersion matrix), a generalization of the diagonal weight matrix. The GLS estimator is unbiased, consistent, efficient, and asymptotically normal with $$\operatorname{E}[\hat{\boldsymbol{\beta}} \mid \mathbf{X}] = \boldsymbol{\beta}, \quad\text{and}\quad \operatorname{Cov}[\hat{\boldsymbol{\beta}} \mid \mathbf{X}] = (\mathbf{X}^{\mathrm{T}}\boldsymbol{\Omega}^{-1}\mathbf{X})^{-1}.$$ GLS
Is mathematically identical to generalized least squares polynomial curve fitting. Kriging can also be understood as a form of Bayesian inference. Kriging starts with a prior distribution over functions. This prior takes the form of a Gaussian process: $N$ samples from a function will be normally distributed, where the covariance between any two samples
Is more efficient than OLS under heteroscedasticity (also spelled heteroskedasticity) or autocorrelation, this is not true for FGLS. The feasible estimator is asymptotically more efficient (provided the errors covariance matrix is consistently estimated), but for a small to medium-sized sample, it can actually be less efficient than OLS. This is why some authors prefer to use OLS and reformulate their inferences by simply considering an alternative estimator for
Is sometimes capitalized as Kriging in the literature. Though computationally intensive in its basic formulation, kriging can be scaled to larger problems using various approximation methods. Kriging predicts the value of a function at a given point by computing a weighted average of the known values of the function in the neighborhood of the point. The method is closely related to regression analysis. Both theories derive
Is the identity matrix. Then, $\boldsymbol{\beta}$ can be efficiently estimated by applying OLS to the transformed data, which requires minimizing the objective, $$\left(\mathbf{y}^{*} - \mathbf{X}^{*}\boldsymbol{\beta}\right)^{\mathrm{T}}(\mathbf{y}^{*} - \mathbf{X}^{*}\boldsymbol{\beta}) = (\mathbf{y} - \mathbf{X}\mathbf{b})^{\mathrm{T}}\,\mathbf{\Omega}^{-1}(\mathbf{y} - \mathbf{X}\mathbf{b}).$$ This transformation effectively standardizes
Is the covariance function (or kernel) of the Gaussian process evaluated at the spatial location of two points. A set of values is then observed, each value associated with a spatial location. Now, a new value can be predicted at any new spatial location by combining the Gaussian prior with a Gaussian likelihood function for each of the observed values. The resulting posterior distribution
Is the squared exponential, which heavily favours smooth function estimates. For this reason, it can produce poor estimates in many real-world applications, especially when the true underlying function contains discontinuities and rapid changes. The kriging weights of simple kriging have no unbiasedness condition and are given by the simple kriging equation system: This is analogous to
Is therefore used very often as a so-called surrogate model, implemented inside optimization routines. Kriging-based surrogate models may also be used in the case of mixed integer inputs.
Generalized least squares
In statistics, generalized least squares (GLS) is a method used to estimate the unknown parameters in a linear regression model. It is used when there is a non-zero amount of correlation between
Is to apply OLS but discard the classical variance estimator (which is inconsistent in this framework) and instead use a HAC (Heteroskedasticity and Autocorrelation Consistent) estimator. In the context of autocorrelation, the Newey–West estimator can be used, and in heteroscedastic contexts, the Eicker–White estimator can be used instead. This approach is much safer, and it is the appropriate path to take unless
Is to iterate; that is, to take the residuals from FGLS to update the errors' covariance estimator and then update the FGLS estimation, applying the same idea iteratively until the estimators vary less than some tolerance. However, this method does not necessarily improve the efficiency of the estimator very much if the original sample was small. A reasonable option when samples are not too large
the residuals in the regression model. GLS is employed to improve statistical efficiency and reduce the risk of drawing erroneous inferences, as compared to conventional least squares and weighted least squares methods. It was first described by Alexander Aitken in 1935. It requires knowledge of the covariance matrix for the residuals. If this is unknown, estimating the covariance matrix gives
the variogram and the covariogram: where: In this set, $(i,\;j)$ and $(j,\;i)$ denote the same element. Generally an "approximate distance" $h$ is used, implemented using a certain tolerance. Spatial inference, or estimation, of
2240-791: The United States Georgia Public Radio , in Georgia, United States Glider Pilot Regiment of the British Army GPR index , a stock index of property companies Grupa na rzecz Partii Robotniczej , the Polish section of the Committee for a Workers' International Topics referred to by the same term [REDACTED] This disambiguation page lists articles associated with the title GPR . If an internal link led you here, you may wish to change
2304-406: The actual measurements. To date kriging has been used in a variety of disciplines, including the following: Another very important and rapidly growing field of application, in engineering , is the interpolation of data coming out as response variables of deterministic computer simulations, e.g. finite element method (FEM) simulations. In this case, kriging is used as a metamodeling tool, i.e.
2368-1134: The arithmetic mean of sampled values. The hypothesis of stationarity related to the second moment is defined in the following way: the correlation between two random variables solely depends on the spatial distance between them and is independent of their location. Thus if h = x 2 − x 1 {\displaystyle \mathbf {h} =x_{2}-x_{1}} and | h | = h {\displaystyle |\mathbf {h} |=h} , then: For simplicity, we define C ( x i , x j ) = C ( Z ( x i ) , Z ( x j ) ) {\displaystyle C(x_{i},x_{j})=C{\big (}Z(x_{i}),Z(x_{j}){\big )}} and γ ( x i , x j ) = γ ( Z ( x i ) , Z ( x j ) ) {\displaystyle \gamma (x_{i},x_{j})=\gamma {\big (}Z(x_{i}),Z(x_{j}){\big )}} . This hypothesis allows one to infer those two measures –
2432-438: The cloud of estimated values versus the cloud real values is more disperse, the estimator is more imprecise. Depending on the stochastic properties of the random field and the various degrees of stationarity assumed, different methods for calculating the weights can be deduced, i.e. different types of kriging apply. Classical methods are: The unknown value Z ( x 0 ) {\displaystyle Z(x_{0})}
2496-912: The conditional variance of the error term given X {\displaystyle \mathbf {X} } is a known non-singular covariance matrix , Ω {\displaystyle \mathbf {\Omega } } . That is, y = X β + ε , E [ ε ∣ X ] = 0 , Cov [ ε ∣ X ] = Ω , {\displaystyle \mathbf {y} =\mathbf {X} {\boldsymbol {\beta }}+{\boldsymbol {\varepsilon }},\quad \operatorname {E} [{\boldsymbol {\varepsilon }}\mid \mathbf {X} ]=0,\quad \operatorname {Cov} [{\boldsymbol {\varepsilon }}\mid \mathbf {X} ]={\boldsymbol {\Omega }},} where β ∈ R k {\displaystyle {\boldsymbol {\beta }}\in \mathbb {R} ^{k}}
2560-470: The covariance of the errors Ω {\displaystyle \Omega } is unknown, one can get a consistent estimate of Ω {\displaystyle \Omega } , say Ω ^ {\displaystyle {\widehat {\Omega }}} , using an implementable version of GLS known as the feasible generalized least squares ( FGLS ) estimator. In FGLS, modeling proceeds in two stages: Whereas GLS
2624-468: The criterion for global unbias, intrinsic stationarity or wide sense stationarity of the field, implies that the mean of the estimations must be equal to mean of the real values. The second criterion says that the mean of the squared deviations ( Z ^ ( x ) − Z ( x ) ) {\displaystyle {\big (}{\hat {Z}}(x)-Z(x){\big )}} must be minimal, which means that when
2688-416: The estimation variance of any estimator in function of the covariance between the samples and the covariances between the samples and the point to estimate: Some conclusions can be asserted from this expression. The variance of estimation: Solving this optimization problem (see Lagrange multipliers ) results in the kriging system : The additional parameter μ {\displaystyle \mu }
2752-586: The estimator is β ^ = ( X T Ω − 1 X ) − 1 X T Ω − 1 y . {\displaystyle {\hat {\boldsymbol {\beta }}}=\left(\mathbf {X} ^{\mathrm {T} }\mathbf {\Omega } ^{-1}\mathbf {X} \right)^{-1}\mathbf {X} ^{\mathrm {T} }\mathbf {\Omega } ^{-1}\mathbf {y} .} The quantity Ω − 1 {\displaystyle \mathbf {\Omega } ^{-1}}
2816-454: The expectation nor the covariance are known beforehand. The practical assumptions for the application of simple kriging are: The covariance function is a crucial design choice, since it stipulates the properties of the Gaussian process and thereby the behaviour of the model. The covariance function encodes information about, for instance, smoothness and periodicity, which is reflected in the estimate produced. A very common covariance function
2880-412: The hidden terms are those that do not depend on b {\displaystyle \mathbf {b} } , and log p ( ε | b ) {\displaystyle \log p({\boldsymbol {\varepsilon }}|\mathbf {b} )} is the log-likelihood . The maximum a posteriori (MAP) estimate is then the maximum likelihood estimate (MLE), which is equivalent to
2944-409: The link to point directly to the intended article. Retrieved from " https://en.wikipedia.org/w/index.php?title=GPR&oldid=1054182221 " Category : Disambiguation pages Hidden categories: Short description is different from Wikidata All article disambiguation pages All disambiguation pages Gaussian process regression The theoretical basis for the method
3008-416: The mean and variance of the new random variable ϵ ( x 0 ) {\displaystyle \epsilon (x_{0})} : Since the random function is stationary, E [ Z ( x i ) ] = E [ Z ( x 0 ) ] = m {\displaystyle E[Z(x_{i})]=E[Z(x_{0})]=m} , the weights must sum to 1 in order to ensure that
3072-544: The method of feasible generalized least squares (FGLS). However, FGLS provides fewer guarantees of improvement. In standard linear regression models, one observes data { y i , x i j } i = 1 , … , n , j = 2 , … , k {\displaystyle \{y_{i},x_{ij}\}_{i=1,\dots ,n,j=2,\dots ,k}} on n statistical units with k − 1 predictor values and one response value each. The response values are placed in
3136-594: The model for heteroscedastic and non-autocorrelated errors. Assume that the variance-covariance matrix Ω {\displaystyle \Omega } of the error vector is diagonal, or equivalently that errors from distinct observations are uncorrelated. Then each diagonal entry may be estimated by the fitted residuals u ^ j {\displaystyle {\widehat {u}}_{j}} so Ω ^ O L S {\displaystyle {\widehat {\Omega }}_{OLS}} may be constructed by: It
3200-531: The model is unbiased. This can be seen as follows: Two estimators can have E [ ϵ ( x 0 ) ] = 0 {\displaystyle E[\epsilon (x_{0})]=0} , but the dispersion around their mean determines the difference between the quality of estimators. To find an estimator with minimum variance, we need to minimize E [ ϵ ( x 0 ) 2 ] {\displaystyle E[\epsilon (x_{0})^{2}]} . See covariance matrix for
3264-752: The optimization problem from above, β ^ = argmax b p ( b | ε ) = argmax b log p ( b | ε ) = argmax b log p ( ε | b ) , {\displaystyle {\hat {\boldsymbol {\beta }}}={\underset {\mathbf {b} }{\operatorname {argmax} }}\;p(\mathbf {b} |{\boldsymbol {\varepsilon }})={\underset {\mathbf {b} }{\operatorname {argmax} }}\;\log p(\mathbf {b} |{\boldsymbol {\varepsilon }})={\underset {\mathbf {b} }{\operatorname {argmax} }}\;\log p({\boldsymbol {\varepsilon }}|\mathbf {b} ),} where
the optimization problem has been re-written using the fact that the logarithm is a strictly increasing function and the property that the argument solving an optimization problem is independent of terms in the objective function which do not involve that argument. Substituting $\mathbf{y} - \mathbf{X}\mathbf{b}$ for $\boldsymbol{\varepsilon}$, $$\hat{\boldsymbol{\beta}} = \underset{\mathbf{b}}{\operatorname{argmin}}\; \frac{1}{2}(\mathbf{y} - \mathbf{X}\mathbf{b})^{\mathrm{T}}\boldsymbol{\Omega}^{-1}(\mathbf{y} - \mathbf{X}\mathbf{b}).$$ If
the prior is generalized to the case where errors may not be independent and may have differing variances. For given fit parameters $\mathbf{b}$, the conditional probability density function of the errors is assumed to be: $$p(\boldsymbol{\varepsilon} \mid \mathbf{b}) = \frac{1}{\sqrt{(2\pi)^{n}\det\boldsymbol{\Omega}}}\exp\left(-\frac{1}{2}\boldsymbol{\varepsilon}^{\mathrm{T}}\boldsymbol{\Omega}^{-1}\boldsymbol{\varepsilon}\right).$$ By Bayes' theorem, $$p(\mathbf{b} \mid \boldsymbol{\varepsilon}) = \frac{p(\boldsymbol{\varepsilon} \mid \mathbf{b})\,p(\mathbf{b})}{p(\boldsymbol{\varepsilon})}.$$ In GLS,
3456-430: The properties of FGLS estimators are unknown: they vary dramatically with each particular model, and as a general rule, their exact distributions cannot be derived analytically. For finite samples, FGLS may be less efficient than OLS in some cases. Thus, while GLS can be made feasible, it is not always wise to apply this method when the sample is small. A method used to improve the accuracy of the estimators in finite samples
3520-399: The random function, in order to make the inference of some statistic values possible. For instance, if one assumes, based on the homogeneity of samples in area A {\displaystyle A} where the variable is distributed, the hypothesis that the first moment is stationary (i.e. all random variables have the same mean), then one is assuming that the mean can be estimated by
3584-542: The sample is large, where "large" is sometimes a slippery issue (e.g., if the error distribution is asymmetric the required sample will be much larger). The ordinary least squares (OLS) estimator is calculated by: and estimates of the residuals u ^ j = ( Y − X β ^ OLS ) j {\displaystyle {\widehat {u}}_{j}=(Y-X{\widehat {\beta }}_{\text{OLS}})_{j}} are constructed. For simplicity, consider
3648-469: The scale of and de-correlates the errors. When OLS is used on data with homoscedastic errors, the Gauss–Markov theorem applies, so the GLS estimate is the best linear unbiased estimator for β {\displaystyle {\boldsymbol {\beta }}} . A special case of GLS, called weighted least squares (WLS), occurs when all the off-diagonal entries of Ω are 0. This situation arises when
3712-458: The set of observed data. A value from location x 1 {\displaystyle x_{1}} (generic denomination of a set of geographic coordinates ) is interpreted as a realization z ( x 1 ) {\displaystyle z(x_{1})} of the random variable Z ( x 1 ) {\displaystyle Z(x_{1})} . In the space A {\displaystyle A} , where
3776-400: The set of samples is dispersed, there are N {\displaystyle N} realizations of the random variables Z ( x 1 ) , Z ( x 2 ) , … , Z ( x N ) {\displaystyle Z(x_{1}),Z(x_{2}),\ldots ,Z(x_{N})} , correlated between themselves. The set of random variables constitutes
3840-1416: The squared Mahalanobis length of this residual vector: β ^ = argmin b ( y − X b ) T Ω − 1 ( y − X b ) = argmin b y T Ω − 1 y + ( X b ) T Ω − 1 X b − y T Ω − 1 X b − ( X b ) T Ω − 1 y , {\displaystyle {\begin{aligned}{\hat {\boldsymbol {\beta }}}&={\underset {\mathbf {b} }{\operatorname {argmin} }}\,(\mathbf {y} -\mathbf {X} \mathbf {b} )^{\mathrm {T} }\mathbf {\Omega } ^{-1}(\mathbf {y} -\mathbf {X} \mathbf {b} )\\&={\underset {\mathbf {b} }{\operatorname {argmin} }}\,\mathbf {y} ^{\mathrm {T} }\,\mathbf {\Omega } ^{-1}\mathbf {y} +(\mathbf {X} \mathbf {b} )^{\mathrm {T} }\mathbf {\Omega } ^{-1}\mathbf {X} \mathbf {b} -\mathbf {y} ^{\mathrm {T} }\mathbf {\Omega } ^{-1}\mathbf {X} \mathbf {b} -(\mathbf {X} \mathbf {b} )^{\mathrm {T} }\mathbf {\Omega } ^{-1}\mathbf {y} \,,\end{aligned}}} which
3904-566: The variance of the estimator robust to heteroscedasticity or serial autocorrelation. However, for large samples, FGLS is preferred over OLS under heteroskedasticity or serial correlation. A cautionary note is that the FGLS estimator is not always consistent. One case in which FGLS might be inconsistent is if there are individual-specific fixed effects. In general, this estimator has different properties than GLS. For large samples (i.e., asymptotically), all properties are (under appropriate conditions) common with respect to GLS, but for finite samples,
the variances of the observed values are unequal or when heteroscedasticity is present, but no correlations exist among the observed variances. The weight for unit i is proportional to the reciprocal of the variance of the response for unit i. Ordinary least squares can be interpreted as maximum likelihood estimation with the prior that the errors are independent and normally distributed with zero mean and common variance. In GLS,
Was developed by the French mathematician Georges Matheron in 1960, based on the master's thesis of Danie G. Krige, the pioneering plotter of distance-weighted average gold grades at the Witwatersrand reef complex in South Africa. Krige sought to estimate the most likely distribution of gold based on samples from a few boreholes. The English verb is to krige, and the most common noun is kriging. The word
Was developed originally for applications in geostatistics, it is a general method of statistical interpolation and can be applied within any discipline to sampled data from random fields that satisfy the appropriate mathematical assumptions. It can be used where spatially related data has been collected (in 2-D or 3-D) and estimates of "fill-in" data are desired in the locations (spatial gaps) between