In probability theory and statistics, variance is the expected value of the squared deviation from the mean of a random variable. The standard deviation (SD) is obtained as the square root of the variance. Variance is a measure of dispersion, meaning it is a measure of how far a set of numbers is spread out from their average value. It is the second central moment of a distribution, and the covariance of the random variable with itself, and it is often represented by σ², s², Var(X), V(X), or 𝕍(X).
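As a quick numeric illustration of these definitions, a minimal sketch in Python (the data set is a made-up example, not from the text):

```python
# Variance as the mean squared deviation from the mean, and the
# standard deviation as its square root. Hypothetical data set.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu = sum(data) / len(data)                               # mean: 5.0
variance = sum((x - mu) ** 2 for x in data) / len(data)  # population variance: 4.0
sd = variance ** 0.5                                     # standard deviation: 2.0
print(mu, variance, sd)  # 5.0 4.0 2.0
```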
An advantage of variance as a measure of dispersion is that it is more amenable to algebraic manipulation than other measures of dispersion such as the expected absolute deviation; for example, the variance of a sum of uncorrelated random variables is equal to the sum of their variances. A disadvantage of the variance for practical applications is that, unlike the standard deviation, its units differ from
φ(x) = ax² + b, where a > 0. This also holds in the multidimensional case. Unlike the expected absolute deviation, the variance of a variable has units that are the square of the units of the variable itself. For example, a variable measured in meters will have a variance measured in meters squared. For this reason, describing data sets via their standard deviation or root mean square deviation
a constant, the variance is scaled by the square of that constant: Var(aX) = a²Var(X). The variance of a sum of two random variables is given by Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y), where Cov(X, Y) is the covariance. In general, for the sum of N random variables {X_1, …, X_N},
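The scaling and sum rules can be checked numerically; the following sketch verifies them on a small, hypothetical set of equally likely (x, y) pairs:

```python
# Minimal sketch (hypothetical data): checks Var(aX) = a^2 Var(X) and
# Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y) for equally likely pairs.
pairs = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (6.0, 4.0)]
xs = [p[0] for p in pairs]
ys = [p[1] for p in pairs]

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return sum((t - m) ** 2 for t in v) / len(v)

def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((s - ma) * (t - mb) for s, t in zip(a, b)) / len(a)

s = [x + y for x, y in pairs]
assert abs(var(s) - (var(xs) + var(ys) + 2 * cov(xs, ys))) < 1e-12
a = 3.0
assert abs(var([a * x for x in xs]) - a * a * var(xs)) < 1e-12
```

Both identities are algebraic, so they hold exactly (up to floating-point rounding) for any data set, not just this one.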
a continuous function φ satisfies argmin_m E(φ(X − m)) = E(X) for all random variables X, then it is necessarily of the form φ(x) =
290-428: A distribution, then the sample variance calculated from that infinite set will match the value calculated using the distribution's equation for variance. Variance has a central role in statistics, where some ideas that use it include descriptive statistics , statistical inference , hypothesis testing , goodness of fit , and Monte Carlo sampling . The variance of a random variable X {\displaystyle X}
348-464: A line that is out of place. A dotted antisigma ( antisigma periestigmenon , Ͽ ) may indicate a line after which rearrangements should be made, or to variant readings of uncertain priority. In Greek inscriptions from the late first century BC onwards, Ͻ was an abbreviation indicating that a man's father's name is the same as his own name, thus Dionysodoros son of Dionysodoros would be written Διονυσόδωρος Ͻ ( Dionysodoros Dionysodorou ). In Unicode ,
406-447: A median of 1, in this case unaffected by the value of the outlier 14, so the median absolute deviation is 1. For a symmetric distribution, the median absolute deviation is equal to half the interquartile range . The maximum absolute deviation around an arbitrary point is the maximum of the absolute deviations of a sample from that point. While not strictly a measure of central tendency, the maximum absolute deviation can be found using
464-408: A sample is considered an estimate of the full population variance. There are multiple ways to calculate an estimate of the population variance, as discussed in the section below. The two kinds of variance are closely related. To see how, consider that a theoretical probability distribution can be used as a generator of hypothetical observations. If an infinite number of observations are generated using
522-611: A separate letter in the Greek alphabet, represented as Ϻ . Herodotus reports that "san" was the name given by the Dorians to the same letter called "sigma" by the Ionians . According to one hypothesis, the name "sigma" may continue that of Phoenician samekh ( [REDACTED] ), the letter continued through Greek xi , represented as Ξ . Alternatively, the name may have been a Greek innovation that simply meant 'hissing', from
580-413: A set X = { x 1 , x 2 , …, x n } is 1 n ∑ i = 1 n | x i − m ( X ) | . {\displaystyle {\frac {1}{n}}\sum _{i=1}^{n}|x_{i}-m(X)|.} The choice of measure of central tendency, m ( X ) {\displaystyle m(X)} , has a marked effect on
638-472: A set of n {\displaystyle n} equally likely values can be equivalently expressed, without directly referring to the mean, in terms of squared deviations of all pairwise squared distances of points from each other: If the random variable X {\displaystyle X} has a probability density function f ( x ) {\displaystyle f(x)} , and F ( x ) {\displaystyle F(x)}
#1732791525493696-462: A specified central point (see above). MAD has been proposed to be used in place of standard deviation since it corresponds better to real life. Because the MAD is a simpler measure of variability than the standard deviation , it can be useful in school teaching. This method's forecast accuracy is very closely related to the mean squared error (MSE) method which is just the average squared error of
is φ(E[Y]) ≤ E[φ(Y)], where φ is a convex function. Applied with φ(y) = y² and Y = |X − μ|, this implies (E|X − μ|)² ≤ E(|X − μ|²), and hence (E|X − μ|)² ≤ Var(X). Since both sides are positive, and
812-425: Is a characteristic of a set of observations. When variance is calculated from observations, those observations are typically measured from a real-world system. If all possible observations of the system are present, then the calculated variance is called the population variance. Normally, however, only a subset is available, and the variance calculated from this is called the sample variance. The variance calculated from
870-416: Is a discrete random variable assuming possible values y 1 , y 2 , y 3 … {\displaystyle y_{1},y_{2},y_{3}\ldots } with corresponding probabilities p 1 , p 2 , p 3 … , {\displaystyle p_{1},p_{2},p_{3}\ldots ,} , then in the formula for total variance,
928-401: Is a normally distributed random variable with expected value 0 then, see Geary (1935): w = E | X | E ( X 2 ) = 2 π . {\displaystyle w={\frac {E|X|}{\sqrt {E(X^{2})}}}={\sqrt {\frac {2}{\pi }}}.} In other words, for a normal distribution, mean absolute deviation is about 0.8 times
986-406: Is based on the notion of mean-unbiasedness. Each measure of location has its own form of unbiasedness (see entry on biased estimator ). The relevant form of unbiasedness here is median unbiasedness. Sigma Sigma ( / ˈ s ɪ ɡ m ə / SIG -mə ; uppercase Σ , lowercase σ , lowercase in word-final position ς ; ‹See Tfd› Greek : σίγμα ) is the eighteenth letter of
1044-552: Is given by on the interval [0, ∞) . Its mean can be shown to be Using integration by parts and making use of the expected value already calculated, we have: Thus, the variance of X is given by A fair six-sided die can be modeled as a discrete random variable, X , with outcomes 1 through 6, each with equal probability 1/6. The expected value of X is ( 1 + 2 + 3 + 4 + 5 + 6 ) / 6 = 7 / 2. {\displaystyle (1+2+3+4+5+6)/6=7/2.} Therefore,
1102-478: Is often preferred over using the variance. In the dice example the standard deviation is √ 2.9 ≈ 1.7 , slightly larger than the expected absolute deviation of 1.5. The standard deviation and the expected absolute deviation can both be used as an indicator of the "spread" of a distribution. The standard deviation is more amenable to algebraic manipulation than the expected absolute deviation, and, together with variance and its generalization covariance ,
1160-476: Is still widely used in decorative typefaces in Greece, especially in religious and church contexts, as well as in some modern print editions of classical Greek texts. A dotted lunate sigma ( sigma periestigmenon , Ͼ ) was used by Aristarchus of Samothrace (220–143 BC) as an editorial sign indicating that the line marked as such is at an incorrect position. Similarly, a reversed sigma ( antisigma , Ͻ ), may mark
1218-406: Is the expected value of the squared deviation from the mean of X {\displaystyle X} , μ = E [ X ] {\displaystyle \mu =\operatorname {E} [X]} : This definition encompasses random variables that are generated by processes that are discrete , continuous , neither , or mixed. The variance can also be thought of as
#17327915254931276-472: Is the corresponding cumulative distribution function , then or equivalently, where μ {\displaystyle \mu } is the expected value of X {\displaystyle X} given by In these formulas, the integrals with respect to d x {\displaystyle dx} and d F ( x ) {\displaystyle dF(x)} are Lebesgue and Lebesgue–Stieltjes integrals, respectively. If
1334-403: Is the expected value. That is, (When such a discrete weighted variance is specified by weights whose sum is not 1, then one divides by the sum of the weights.) The variance of a collection of n {\displaystyle n} equally likely values can be written as where μ {\displaystyle \mu } is the average value. That is, The variance of
1392-452: Is the measure of central tendency most associated with the absolute deviation. Some location parameters can be compared as follows: The mean absolute deviation of a sample is a biased estimator of the mean absolute deviation of the population. In order for the absolute deviation to be an unbiased estimator, the expected value (average) of all the sample absolute deviations must equal the population absolute deviation. However, it does not. For
1450-419: Is the point about which the mean deviation is minimized. The MAD median offers a direct measure of the scale of a random variable around its median D med = E | X − median | {\displaystyle D_{\text{med}}=E|X-{\text{median}}|} This is the maximum likelihood estimator of the scale parameter b {\displaystyle b} of
1508-426: Is used frequently in theoretical statistics; however the expected absolute deviation tends to be more robust as it is less sensitive to outliers arising from measurement anomalies or an unduly heavy-tailed distribution . Variance is invariant with respect to changes in a location parameter . That is, if a constant is added to all values of the variable, the variance is unchanged: If all values are scaled by
1566-464: The Greek alphabet . In the system of Greek numerals , it has a value of 200. In general mathematics, uppercase Σ is used as an operator for summation . When used at the end of a letter-case word (one that does not use all caps ), the final form (ς) is used. In Ὀδυσσεύς (Odysseus), for example, the two lowercase sigmas (σ) in the center of the name are distinct from the word-final sigma (ς) at
1624-476: The Laplace distribution . Since the median minimizes the average absolute distance, we have D med ≤ D mean {\displaystyle D_{\text{med}}\leq D_{\text{mean}}} . The mean absolute deviation from the median is less than or equal to the mean absolute deviation from the mean. In fact, the mean absolute deviation from the median is always less than or equal to
1682-419: The conditional variance Var ( X ∣ Y ) {\displaystyle \operatorname {Var} (X\mid Y)} may be understood as follows. Given any particular value y of the random variable Y , there is a conditional expectation E ( X ∣ Y = y ) {\displaystyle \operatorname {E} (X\mid Y=y)} given
1740-750: The covariance of a random variable with itself: The variance is also equivalent to the second cumulant of a probability distribution that generates X {\displaystyle X} . The variance is typically designated as Var ( X ) {\displaystyle \operatorname {Var} (X)} , or sometimes as V ( X ) {\displaystyle V(X)} or V ( X ) {\displaystyle \mathbb {V} (X)} , or symbolically as σ X 2 {\displaystyle \sigma _{X}^{2}} or simply σ 2 {\displaystyle \sigma ^{2}} (pronounced " sigma squared"). The expression for
1798-491: The law of total variance is: If X {\displaystyle X} and Y {\displaystyle Y} are two random variables, and the variance of X {\displaystyle X} exists, then The conditional expectation E ( X ∣ Y ) {\displaystyle \operatorname {E} (X\mid Y)} of X {\displaystyle X} given Y {\displaystyle Y} , and
1856-401: The square root is a monotonically increasing function in the positive domain: E ( | X − μ | ) ≤ Var ( X ) {\displaystyle \mathbb {E} \left(|X-\mu \right|)\leq {\sqrt {\operatorname {Var} (X)}}} For a general case of this statement, see Hölder's inequality . The median
1914-524: The 8th century BC. At that time a simplified three-stroke version, omitting the lowermost stroke, was already found in Western Greek alphabets, and was incorporated into classical Etruscan and Oscan , as well as in the earliest Latin epigraphy (early Latin S ), such as the Duenos inscription . The alternation between three and four (and occasionally more than four) strokes was also adopted into
1972-459: The above variations of lunate sigma are encoded as U+03F9 Ϲ GREEK CAPITAL LUNATE SIGMA SYMBOL ; U+03FD Ͻ GREEK CAPITAL REVERSED LUNATE SIGMA SYMBOL , U+03FE Ͼ GREEK CAPITAL DOTTED LUNATE SIGMA SYMBOL , and U+03FF Ͽ GREEK CAPITAL REVERSED DOTTED LUNATE SIGMA SYMBOL . Sigma was adopted in the Old Italic alphabets beginning in
2030-484: The absolute deviation it is necessary to specify both the measure of deviation and the measure of central tendency. The statistical literature has not yet adopted a standard notation, as both the mean absolute deviation around the mean and the median absolute deviation around the median have been denoted by their initials "MAD" in the literature, which may lead to confusion, since they generally have values considerably different from each other. The mean absolute deviation of
2088-514: The end. The Latin letter S derives from sigma while the Cyrillic letter Es derives from a lunate form of this letter. The shape (Σς) and alphabetic position of sigma is derived from the Phoenician letter [REDACTED] ( shin ). Sigma's original name may have been san , but due to the complicated early history of the Greek epichoric alphabets , san came to be identified as
2146-568: The event Y = y . This quantity depends on the particular value y ; it is a function g ( y ) = E ( X ∣ Y = y ) {\displaystyle g(y)=\operatorname {E} (X\mid Y=y)} . That same function evaluated at the random variable Y is the conditional expectation E ( X ∣ Y ) = g ( Y ) . {\displaystyle \operatorname {E} (X\mid Y)=g(Y).} In particular, if Y {\displaystyle Y}
2204-679: The first term on the right-hand side becomes where σ i 2 = Var [ X ∣ Y = y i ] {\displaystyle \sigma _{i}^{2}=\operatorname {Var} [X\mid Y=y_{i}]} . Similarly, the second term on the right-hand side becomes where μ i = E [ X ∣ Y = y i ] {\displaystyle \mu _{i}=\operatorname {E} [X\mid Y=y_{i}]} and μ = ∑ i p i μ i {\displaystyle \mu =\sum _{i}p_{i}\mu _{i}} . Thus
2262-446: The forecasts. Although these methods are very closely related, MAD is more commonly used because it is both easier to compute (avoiding the need for squaring) and easier to understand. For the normal distribution , the ratio of mean absolute deviation from the mean to standard deviation is 2 / π = 0.79788456 … {\textstyle {\sqrt {2/\pi }}=0.79788456\ldots } . Thus if X
2320-422: The formula for the average absolute deviation as above with m ( X ) = max ( X ) {\displaystyle m(X)=\max(X)} , where max ( X ) {\displaystyle \max(X)} is the sample maximum . The measures of statistical dispersion derived from absolute deviation characterize various measures of central tendency as minimizing dispersion: The median
2378-449: The function x 2 f ( x ) {\displaystyle x^{2}f(x)} is Riemann-integrable on every finite interval [ a , b ] ⊂ R , {\displaystyle [a,b]\subset \mathbb {R} ,} then where the integral is an improper Riemann integral . The exponential distribution with parameter λ is a continuous distribution whose probability density function
2436-467: The generator of random variable X {\displaystyle X} is discrete with probability mass function x 1 ↦ p 1 , x 2 ↦ p 2 , … , x n ↦ p n {\displaystyle x_{1}\mapsto p_{1},x_{2}\mapsto p_{2},\ldots ,x_{n}\mapsto p_{n}} , then where μ {\displaystyle \mu }
2494-514: The given data set. AAD includes the mean absolute deviation and the median absolute deviation (both abbreviated as MAD ). Several measures of statistical dispersion are defined in terms of the absolute deviation. The term "average absolute deviation" does not uniquely identify a measure of statistical dispersion , as there are several measures that can be used to measure absolute deviations, and there are several measures of central tendency that can be used as well. Thus, to uniquely identify
2552-429: The indicator function is I O := { 1 if x > median , 0 otherwise . {\displaystyle \mathbf {I} _{O}:={\begin{cases}1&{\text{if }}x>{\text{median}},\\0&{\text{otherwise}}.\end{cases}}} This representation allows for obtaining MAD median correlation coefficients. While in principle
2610-400: The mean absolute deviation from any other fixed number. By using the general dispersion function, Habib (2011) defined MAD about median as D med = E | X − median | = 2 Cov ( X , I O ) {\displaystyle D_{\text{med}}=E|X-{\text{median}}|=2\operatorname {Cov} (X,I_{O})} where
2668-462: The mean or any other central point could be taken as the central point for the median absolute deviation, most often the median value is taken instead. The median absolute deviation (also MAD) is the median of the absolute deviation from the median . It is a robust estimator of dispersion . For the example {2, 2, 3, 4, 14}: 3 is the median, so the absolute deviations from the median are {1, 1, 0, 1, 11} (reordered as {0, 1, 1, 1, 11}) with
2726-424: The population 1,2,3 both the population absolute deviation about the median and the population absolute deviation about the mean are 2/3. The average of all the sample absolute deviations about the mean of size 3 that can be drawn from the population is 44/81, while the average of all the sample absolute deviations about the median is 4/9. Therefore, the absolute deviation is a biased estimator. However, this argument
2784-424: The predicted score and the error score, where the latter two are uncorrelated. Similar decompositions are possible for the sum of squared deviations (sum of squares, S S {\displaystyle {\mathit {SS}}} ): The population variance for a non-negative random variable can be expressed in terms of the cumulative distribution function F using This expression can be used to calculate
2842-417: The random variable, which is why the standard deviation is more commonly reported as a measure of dispersion once the calculation is finished. Another disadvantage is that the variance is not finite for many distributions. There are two distinct concepts that are both called "variance". One, as discussed above, is part of a theoretical probability distribution and is defined by an equation. The other variance
2900-605: The root of σίζω ( sízō , from Proto-Greek *sig-jō 'I hiss'). In handwritten Greek during the Hellenistic period (4th–3rd century BC), the epigraphic form of Σ was simplified into a C-like shape, which has also been found on coins from the 4th century BC onward. This became the universal standard form of sigma during late antiquity and the Middle Ages. Today, it is known as lunate sigma (uppercase Ϲ , lowercase ϲ ), because of its crescent -like shape, and
2958-568: The same value: If a distribution does not have a finite expected value, as is the case for the Cauchy distribution , then the variance cannot be finite either. However, some distributions may not have a finite variance, despite their expected value being finite. An example is a Pareto distribution whose index k {\displaystyle k} satisfies 1 < k ≤ 2. {\displaystyle 1<k\leq 2.} The general formula for variance decomposition or
#17327915254933016-502: The standard deviation. However, in-sample measurements deliver values of the ratio of mean average deviation / standard deviation for a given Gaussian sample n with the following bounds: w n ∈ [ 0 , 1 ] {\displaystyle w_{n}\in [0,1]} , with a bias for small n . The mean absolute deviation from the mean is less than or equal to the standard deviation ; one way of proving this relies on Jensen's inequality . Jensen's inequality
3074-512: The total variance is given by A similar formula is applied in analysis of variance , where the corresponding formula is here M S {\displaystyle {\mathit {MS}}} refers to the Mean of the Squares. In linear regression analysis the corresponding formula is This can also be derived from the additivity of variances, since the total (observed) score is the sum of
3132-420: The value of the mean deviation. For example, for the data set {2, 2, 3, 4, 14}: The mean absolute deviation (MAD), also referred to as the "mean deviation" or sometimes "average absolute deviation", is the mean of the data's absolute deviations around the data's mean: the average (absolute) distance from the mean. "Average absolute deviation" can refer to either this usage, or to the general form with respect to
3190-422: The variance becomes: Average absolute deviation The average absolute deviation ( AAD ) of a data set is the average of the absolute deviations from a central point . It is a summary statistic of statistical dispersion or variability. In the general form, the central point can be a mean , median , mode , or the result of any other measure of central tendency or any reference value related to
3248-450: The variance can be expanded as follows: In other words, the variance of X is equal to the mean of the square of X minus the square of the mean of X . This equation should not be used for computations using floating point arithmetic , because it suffers from catastrophic cancellation if the two components of the equation are similar in magnitude. For other numerically stable alternatives, see algorithms for calculating variance . If
3306-585: The variance in situations where the CDF, but not the density , can be conveniently expressed. The second moment of a random variable attains the minimum value when taken around the first moment (i.e., mean) of the random variable, i.e. a r g m i n m E ( ( X − m ) 2 ) = E ( X ) {\displaystyle \mathrm {argmin} _{m}\,\mathrm {E} \left(\left(X-m\right)^{2}\right)=\mathrm {E} (X)} . Conversely, if
3364-423: The variance of X is The general formula for the variance of the outcome, X , of an n -sided die is The following table lists the variance for some commonly used probability distributions. Variance is non-negative because the squares are positive or zero: The variance of a constant is zero. Conversely, if the variance of a random variable is 0, then it is almost surely a constant. That is, it always has