Misplaced Pages

Variance

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.

In probability theory and statistics, variance is the expected value of the squared deviation from the mean of a random variable. The standard deviation (SD) is obtained as the square root of the variance. Variance is a measure of dispersion, meaning it is a measure of how far a set of numbers is spread out from their average value. It is the second central moment of a distribution, and the covariance of the random variable with itself, and it is often represented by σ², s², Var(X), V(X), or 𝕍(X).


An advantage of variance as a measure of dispersion is that it is more amenable to algebraic manipulation than other measures of dispersion such as the expected absolute deviation; for example, the variance of a sum of uncorrelated random variables is equal to the sum of their variances. A disadvantage of the variance for practical applications is that, unlike the standard deviation, its units differ from

ax² + b, where a > 0. This also holds in the multidimensional case. Unlike the expected absolute deviation, the variance of a variable has units that are the square of the units of the variable itself. For example, a variable measured in meters will have a variance measured in meters squared. For this reason, describing data sets via their standard deviation or root mean square deviation

a constant, the variance is scaled by the square of that constant: Var(aX) = a² Var(X). The variance of a sum of two random variables is given by Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y), where Cov(X, Y) is the covariance. In general, for the sum of N random variables {X₁, …, X_N}, Var(∑ᵢ Xᵢ) = ∑ᵢ,ⱼ Cov(Xᵢ, Xⱼ) = ∑ᵢ Var(Xᵢ) + 2 ∑_{i<j} Cov(Xᵢ, Xⱼ).
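These identities lend themselves to a quick numerical check. The sketch below assumes NumPy as the tooling (the article prescribes none) and verifies both the scaling rule and the sum formula on synthetic correlated data:

```python
import numpy as np

# Minimal sketch, assuming NumPy: checking the scaling and sum identities
# on synthetic correlated samples. ddof=0 / bias=True keep the sample
# moments algebraically consistent with each other.
rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)   # deliberately correlated with x

var_sum = np.var(x + y)
identity = np.var(x) + np.var(y) + 2 * np.cov(x, y, bias=True)[0, 1]

var_scaled = np.var(3 * x)               # should equal 9 * Var(x)
```

With ddof=0 throughout, both identities hold exactly up to floating-point rounding, not merely asymptotically.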

a continuous function φ satisfies argmin_m E(φ(X − m)) = E(X) for all random variables X, then it is necessarily of the form φ(x) =

a distribution, then the sample variance calculated from that infinite set will match the value calculated using the distribution's equation for variance. Variance has a central role in statistics, where some ideas that use it include descriptive statistics, statistical inference, hypothesis testing, goodness of fit, and Monte Carlo sampling. The variance of a random variable X

a line that is out of place. A dotted antisigma (antisigma periestigmenon, Ͽ) may indicate a line after which rearrangements should be made, or variant readings of uncertain priority. In Greek inscriptions from the late first century BC onwards, Ͻ was an abbreviation indicating that a man's father's name is the same as his own name; thus Dionysodoros son of Dionysodoros would be written Διονυσόδωρος Ͻ (Dionysodoros Dionysodorou). In Unicode,

a median of 1, in this case unaffected by the value of the outlier 14, so the median absolute deviation is 1. For a symmetric distribution, the median absolute deviation is equal to half the interquartile range. The maximum absolute deviation around an arbitrary point is the maximum of the absolute deviations of a sample from that point. While not strictly a measure of central tendency, the maximum absolute deviation can be found using

a sample is considered an estimate of the full population variance. There are multiple ways to calculate an estimate of the population variance, as discussed in the section below. The two kinds of variance are closely related. To see how, consider that a theoretical probability distribution can be used as a generator of hypothetical observations. If an infinite number of observations are generated using
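The population/sample distinction shows up directly in numerical libraries. A sketch, assuming NumPy (whose `ddof` parameter selects the divisor; the data values here are illustrative):

```python
import numpy as np

# Sketch of the population vs. sample variance distinction. NumPy is an
# assumed tooling choice, not something the article specifies.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # illustrative numbers

pop_var = np.var(data, ddof=0)     # divide by n: treats data as the population
sample_var = np.var(data, ddof=1)  # divide by n - 1: unbiased estimate from a sample
```

The sample variance is always the larger of the two, since n − 1 < n.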

a separate letter in the Greek alphabet, represented as Ϻ. Herodotus reports that "san" was the name given by the Dorians to the same letter called "sigma" by the Ionians. According to one hypothesis, the name "sigma" may continue that of Phoenician samekh ([REDACTED]), the letter continued through Greek xi, represented as Ξ. Alternatively, the name may have been a Greek innovation that simply meant 'hissing', from

a set X = {x₁, x₂, …, xₙ} is (1/n) ∑ᵢ₌₁ⁿ |xᵢ − m(X)|. The choice of measure of central tendency, m(X), has a marked effect on
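The formula (1/n) ∑ |xᵢ − m(X)| can be sketched directly for different choices of m(X). The helper names below are illustrative; the data set {2, 2, 3, 4, 14} reappears later in the article:

```python
# Sketch: average absolute deviation around a chosen central point m(X).
def avg_abs_dev(xs, m):
    return sum(abs(x - m) for x in xs) / len(xs)

def median(xs):
    s = sorted(xs)
    n = len(s)
    return s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2

data = [2, 2, 3, 4, 14]
around_mean = avg_abs_dev(data, sum(data) / len(data))   # mean is 5 -> 3.6
around_median = avg_abs_dev(data, median(data))          # median is 3 -> 2.8
```

The outlier 14 pulls the mean (and the deviation around it) upward, while the median-centered value stays smaller, previewing the comparison made below.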

a set of n equally likely values can be equivalently expressed, without directly referring to the mean, in terms of the pairwise squared distances of points from each other: Var(X) = (1/n²) ∑_{i<j} (xᵢ − xⱼ)². If the random variable X has a probability density function f(x), and F(x)


a specified central point (see above). MAD has been proposed to be used in place of standard deviation since it corresponds better to real life. Because the MAD is a simpler measure of variability than the standard deviation, it can be useful in school teaching. This method's forecast accuracy is very closely related to the mean squared error (MSE) method, which is just the average squared error of

is φ(E[Y]) ≤ E[φ(Y)], where φ is a convex function; applied to Y = |X − μ| this implies (E|X − μ|)² ≤ E(|X − μ|²), and hence (E|X − μ|)² ≤ Var(X). Since both sides are positive, and

is a characteristic of a set of observations. When variance is calculated from observations, those observations are typically measured from a real-world system. If all possible observations of the system are present, then the calculated variance is called the population variance. Normally, however, only a subset is available, and the variance calculated from this is called the sample variance. The variance calculated from

is a discrete random variable assuming possible values y₁, y₂, y₃, … with corresponding probabilities p₁, p₂, p₃, …, then in the formula for total variance,

is a normally distributed random variable with expected value 0 then, see Geary (1935): w = E|X| / √(E(X²)) = √(2/π). In other words, for a normal distribution, the mean absolute deviation is about 0.8 times

is based on the notion of mean-unbiasedness. Each measure of location has its own form of unbiasedness (see the entry on biased estimator). The relevant form of unbiasedness here is median unbiasedness. Sigma (/ˈsɪɡmə/ SIG-mə; uppercase Σ, lowercase σ, lowercase in word-final position ς; Greek: σίγμα) is the eighteenth letter of

is given by f(x) = λe^(−λx) on the interval [0, ∞). Its mean can be shown to be E[X] = 1/λ. Using integration by parts and making use of the expected value already calculated, we have E[X²] = 2/λ². Thus, the variance of X is given by Var(X) = E[X²] − (E[X])² = 2/λ² − (1/λ)² = 1/λ². A fair six-sided die can be modeled as a discrete random variable, X, with outcomes 1 through 6, each with equal probability 1/6. The expected value of X is (1 + 2 + 3 + 4 + 5 + 6)/6 = 7/2. Therefore,
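The die calculation can be reproduced exactly with rational arithmetic. A small sketch using the standard library's fractions module:

```python
from fractions import Fraction

# Sketch: exact expectation and variance of a fair six-sided die,
# mirroring the calculation in the text.
p = Fraction(1, 6)
outcomes = range(1, 7)

mean = sum(p * x for x in outcomes)                     # 7/2
variance = sum(p * (x - mean) ** 2 for x in outcomes)   # 35/12
```

Using Fraction rather than floats makes the result 35/12 exact rather than a decimal approximation.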

is often preferred over using the variance. In the dice example the standard deviation is √2.9 ≈ 1.7, slightly larger than the expected absolute deviation of 1.5. The standard deviation and the expected absolute deviation can both be used as an indicator of the "spread" of a distribution. The standard deviation is more amenable to algebraic manipulation than the expected absolute deviation, and, together with variance and its generalization covariance,

is still widely used in decorative typefaces in Greece, especially in religious and church contexts, as well as in some modern print editions of classical Greek texts. A dotted lunate sigma (sigma periestigmenon, Ͼ) was used by Aristarchus of Samothrace (220–143 BC) as an editorial sign indicating that the line so marked is at an incorrect position. Similarly, a reversed sigma (antisigma, Ͻ) may mark

is the expected value of the squared deviation from the mean of X, μ = E[X]: Var(X) = E[(X − μ)²]. This definition encompasses random variables that are generated by processes that are discrete, continuous, neither, or mixed. The variance can also be thought of as


is the corresponding cumulative distribution function, then Var(X) = ∫ (x − μ)² f(x) dx, or equivalently Var(X) = ∫ (x − μ)² dF(x), where μ is the expected value of X given by μ = ∫ x f(x) dx. In these formulas, the integrals with respect to dx and dF(x) are Lebesgue and Lebesgue–Stieltjes integrals, respectively. If

is the expected value. That is, Var(X) = ∑ᵢ pᵢ (xᵢ − μ)². (When such a discrete weighted variance is specified by weights whose sum is not 1, then one divides by the sum of the weights.) The variance of a collection of n equally likely values can be written as Var(X) = (1/n) ∑ᵢ₌₁ⁿ (xᵢ − μ)², where μ = (1/n) ∑ᵢ xᵢ is the average value. The variance of
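The parenthetical rule about weights that do not sum to 1 can be sketched in a few lines (the function name and data are illustrative):

```python
# Sketch: discrete weighted variance, dividing by the sum of the weights
# when they do not sum to 1, as the parenthetical remark above describes.
def weighted_variance(values, weights):
    total = sum(weights)
    mu = sum(w * v for v, w in zip(values, weights)) / total
    return sum(w * (v - mu) ** 2 for v, w in zip(values, weights)) / total

equal = weighted_variance([1, 2, 3], [1, 1, 1])      # equally likely: 2/3
rescaled = weighted_variance([1, 2, 3], [5, 5, 5])   # same after normalizing
```

Rescaling all weights by a common factor leaves the result unchanged, which is exactly why dividing by the weight total is the right normalization.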

is the measure of central tendency most associated with the absolute deviation. Some location parameters can be compared as follows: The mean absolute deviation of a sample is a biased estimator of the mean absolute deviation of the population. In order for the absolute deviation to be an unbiased estimator, the expected value (average) of all the sample absolute deviations must equal the population absolute deviation. However, it does not. For

is the point about which the mean deviation is minimized. The MAD median offers a direct measure of the scale of a random variable around its median: D_med = E|X − median|. This is the maximum likelihood estimator of the scale parameter b of

is used frequently in theoretical statistics; however, the expected absolute deviation tends to be more robust, as it is less sensitive to outliers arising from measurement anomalies or an unduly heavy-tailed distribution. Variance is invariant with respect to changes in a location parameter. That is, if a constant is added to all values of the variable, the variance is unchanged: Var(X + a) = Var(X). If all values are scaled by

the Greek alphabet. In the system of Greek numerals, it has a value of 200. In general mathematics, uppercase Σ is used as an operator for summation. When used at the end of a letter-case word (one that does not use all caps), the final form (ς) is used. In Ὀδυσσεύς (Odysseus), for example, the two lowercase sigmas (σ) in the center of the name are distinct from the word-final sigma (ς) at

the Laplace distribution. Since the median minimizes the average absolute distance, we have D_med ≤ D_mean. The mean absolute deviation from the median is less than or equal to the mean absolute deviation from the mean. In fact, the mean absolute deviation from the median is always less than or equal to

the conditional variance Var(X ∣ Y) may be understood as follows. Given any particular value y of the random variable Y, there is a conditional expectation E(X ∣ Y = y) given

the covariance of a random variable with itself: Var(X) = Cov(X, X). The variance is also equivalent to the second cumulant of the probability distribution that generates X. The variance is typically designated as Var(X), or sometimes as V(X) or 𝕍(X), or symbolically as σ²_X or simply σ² (pronounced "sigma squared"). The expression for

the law of total variance is: if X and Y are two random variables, and the variance of X exists, then Var(X) = E[Var(X ∣ Y)] + Var(E[X ∣ Y]). The conditional expectation E(X ∣ Y) of X given Y, and


the square root is a monotonically increasing function on the positive domain: E(|X − μ|) ≤ √Var(X). For a general case of this statement, see Hölder's inequality. The median

the 8th century BC. At that time a simplified three-stroke version, omitting the lowermost stroke, was already found in Western Greek alphabets, and was incorporated into classical Etruscan and Oscan, as well as into the earliest Latin epigraphy (early Latin S), such as the Duenos inscription. The alternation between three and four (and occasionally more than four) strokes was also adopted into

the above variations of lunate sigma are encoded as U+03F9 Ϲ GREEK CAPITAL LUNATE SIGMA SYMBOL; U+03FD Ͻ GREEK CAPITAL REVERSED LUNATE SIGMA SYMBOL; U+03FE Ͼ GREEK CAPITAL DOTTED LUNATE SIGMA SYMBOL; and U+03FF Ͽ GREEK CAPITAL REVERSED DOTTED LUNATE SIGMA SYMBOL. Sigma was adopted in the Old Italic alphabets beginning in

the absolute deviation, it is necessary to specify both the measure of deviation and the measure of central tendency. The statistical literature has not yet adopted a standard notation, as both the mean absolute deviation around the mean and the median absolute deviation around the median have been denoted by their initials "MAD" in the literature, which may lead to confusion, since they generally have values considerably different from each other. The mean absolute deviation of

the end. The Latin letter S derives from sigma, while the Cyrillic letter Es derives from a lunate form of this letter. The shape (Σς) and alphabetic position of sigma is derived from the Phoenician letter [REDACTED] (shin). Sigma's original name may have been san, but due to the complicated early history of the Greek epichoric alphabets, san came to be identified as

the event Y = y. This quantity depends on the particular value y; it is a function g(y) = E(X ∣ Y = y). That same function evaluated at the random variable Y is the conditional expectation E(X ∣ Y) = g(Y). In particular, if Y

the first term on the right-hand side becomes ∑ᵢ pᵢ σᵢ², where σᵢ² = Var[X ∣ Y = yᵢ]. Similarly, the second term on the right-hand side becomes ∑ᵢ pᵢ (μᵢ − μ)², where μᵢ = E[X ∣ Y = yᵢ] and μ = ∑ᵢ pᵢ μᵢ. Thus
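The within/between decomposition above can be verified on a small discrete mixture. A sketch with made-up numbers (p, mu_i, sigma2_i are illustrative, not from the article):

```python
# Sketch: law of total variance for a discrete conditioning variable Y,
# comparing E[Var(X|Y)] + Var(E[X|Y]) against a direct moment calculation.
p = [0.3, 0.7]             # P(Y = y_i)
mu_i = [0.0, 5.0]          # E[X | Y = y_i]
sigma2_i = [1.0, 4.0]      # Var(X | Y = y_i)

mu = sum(pi * m for pi, m in zip(p, mu_i))
within = sum(pi * s for pi, s in zip(p, sigma2_i))            # E[Var(X|Y)]
between = sum(pi * (m - mu) ** 2 for pi, m in zip(p, mu_i))   # Var(E[X|Y])
total = within + between

# Cross-check against Var(X) = E[X^2] - mu^2 computed directly.
ex2 = sum(pi * (s + m ** 2) for pi, m, s in zip(p, mu_i, sigma2_i))
direct = ex2 - mu ** 2
```

Both routes give the same number, which is the content of the law of total variance for a discrete Y.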

the forecasts. Although these methods are very closely related, MAD is more commonly used because it is both easier to compute (avoiding the need for squaring) and easier to understand. For the normal distribution, the ratio of mean absolute deviation from the mean to standard deviation is √(2/π) = 0.79788456…. Thus if X
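The √(2/π) ratio is easy to check by simulation. A Monte Carlo sketch (NumPy assumed; the tolerance is deliberately loose because the estimate is statistical):

```python
import numpy as np

# Monte Carlo sketch: for standard normal draws the ratio
# E|X| / sqrt(E[X^2]) should approach sqrt(2/pi) ≈ 0.7979.
rng = np.random.default_rng(42)
x = rng.normal(size=1_000_000)

ratio = np.mean(np.abs(x)) / np.sqrt(np.mean(x ** 2))
target = np.sqrt(2 / np.pi)
```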

the formula for the average absolute deviation as above with m(X) = max(X), where max(X) is the sample maximum. The measures of statistical dispersion derived from absolute deviation characterize various measures of central tendency as minimizing dispersion: the median

the function x² f(x) is Riemann-integrable on every finite interval [a, b] ⊂ ℝ, then Var(X) = ∫ x² f(x) dx − μ², where the integral is an improper Riemann integral. The exponential distribution with parameter λ is a continuous distribution whose probability density function


the generator of random variable X is discrete with probability mass function x₁ ↦ p₁, x₂ ↦ p₂, …, xₙ ↦ pₙ, then Var(X) = ∑ᵢ pᵢ (xᵢ − μ)², where μ

the given data set. AAD includes the mean absolute deviation and the median absolute deviation (both abbreviated as MAD). Several measures of statistical dispersion are defined in terms of the absolute deviation. The term "average absolute deviation" does not uniquely identify a measure of statistical dispersion, as there are several measures that can be used to measure absolute deviations, and there are several measures of central tendency that can be used as well. Thus, to uniquely identify

the indicator function is I_O := 1 if x > median, and 0 otherwise. This representation allows for obtaining MAD median correlation coefficients. While in principle

the mean absolute deviation from any other fixed number. By using the general dispersion function, Habib (2011) defined the MAD about the median as D_med = E|X − median| = 2 Cov(X, I_O), where

the mean or any other central point could be taken as the central point for the median absolute deviation, most often the median value is taken instead. The median absolute deviation (also MAD) is the median of the absolute deviation from the median. It is a robust estimator of dispersion. For the example {2, 2, 3, 4, 14}: 3 is the median, so the absolute deviations from the median are {1, 1, 0, 1, 11} (reordered as {0, 1, 1, 1, 11}) with
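The worked example can be reproduced with the standard library alone. A sketch using the statistics module:

```python
import statistics

# Sketch: median absolute deviation for the article's example data set.
data = [2, 2, 3, 4, 14]
med = statistics.median(data)                    # 3
abs_devs = sorted(abs(x - med) for x in data)    # [0, 1, 1, 1, 11]
mad = statistics.median(abs_devs)                # 1
```

Note how the outlier 14 contributes only a single large deviation, which the second median then ignores; this is the robustness the text describes.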

the population 1, 2, 3, both the population absolute deviation about the median and the population absolute deviation about the mean are 2/3. The average of all the sample absolute deviations about the mean of size 3 that can be drawn from the population is 44/81, while the average of all the sample absolute deviations about the median is 4/9. Therefore, the absolute deviation is a biased estimator. However, this argument

the predicted score and the error score, where the latter two are uncorrelated. Similar decompositions are possible for the sum of squared deviations (sum of squares, SS). The population variance for a non-negative random variable can be expressed in terms of the cumulative distribution function F using Var(X) = 2 ∫₀^∞ u (1 − F(u)) du − (∫₀^∞ (1 − F(u)) du)². This expression can be used to calculate
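The CDF-based expression can be checked numerically on a distribution whose variance is known in closed form. A sketch for the exponential distribution with rate lam, whose true variance is 1/λ² (lam, survival, and trapezoid are illustrative names; the quadrature is a plain trapezoidal rule):

```python
import math

# Sketch: evaluating the CDF-based variance formula numerically for an
# exponential distribution with rate lam (true variance 1 / lam**2 = 0.25).
lam = 2.0

def survival(x):            # 1 - F(x) for the exponential distribution
    return math.exp(-lam * x)

def trapezoid(g, a, b, n=100_000):
    h = (b - a) / n
    return h * (g(a) / 2 + sum(g(a + i * h) for i in range(1, n)) + g(b) / 2)

upper = 40.0                # the tail beyond this point is negligible here
mean = trapezoid(survival, 0.0, upper)
variance = 2 * trapezoid(lambda x: x * survival(x), 0.0, upper) - mean ** 2
```

Only the survival function 1 − F is needed, never the density, which is exactly the situation the text says this expression is useful for.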

the random variable, which is why the standard deviation is more commonly reported as a measure of dispersion once the calculation is finished. Another disadvantage is that the variance is not finite for many distributions. There are two distinct concepts that are both called "variance". One, as discussed above, is part of a theoretical probability distribution and is defined by an equation. The other variance

the root of σίζω (sízō, from Proto-Greek *sig-jō 'I hiss'). In handwritten Greek during the Hellenistic period (4th–3rd century BC), the epigraphic form of Σ was simplified into a C-like shape, which has also been found on coins from the 4th century BC onward. This became the universal standard form of sigma during late antiquity and the Middle Ages. Today, it is known as lunate sigma (uppercase Ϲ, lowercase ϲ) because of its crescent-like shape, and

the same value. If a distribution does not have a finite expected value, as is the case for the Cauchy distribution, then the variance cannot be finite either. However, some distributions may not have a finite variance despite their expected value being finite. An example is a Pareto distribution whose index k satisfies 1 < k ≤ 2. The general formula for variance decomposition or


the standard deviation. However, in-sample measurements deliver values of the ratio of mean absolute deviation to standard deviation for a given Gaussian sample of size n with the following bounds: wₙ ∈ [0, 1], with a bias for small n. The mean absolute deviation from the mean is less than or equal to the standard deviation; one way of proving this relies on Jensen's inequality. Jensen's inequality

the total variance is given by Var(X) = ∑ᵢ pᵢ σᵢ² + ∑ᵢ pᵢ (μᵢ − μ)². A similar formula is applied in analysis of variance, where the corresponding formula is MS_total = MS_between + MS_within; here MS refers to the mean of the squares. In linear regression analysis the corresponding formula is MS_total = MS_regression + MS_residual. This can also be derived from the additivity of variances, since the total (observed) score is the sum of

the value of the mean deviation. For example, for the data set {2, 2, 3, 4, 14}: The mean absolute deviation (MAD), also referred to as the "mean deviation" or sometimes "average absolute deviation", is the mean of the data's absolute deviations around the data's mean: the average (absolute) distance from the mean. "Average absolute deviation" can refer to either this usage or to the general form with respect to

the variance becomes: Average absolute deviation The average absolute deviation (AAD) of a data set is the average of the absolute deviations from a central point. It is a summary statistic of statistical dispersion or variability. In the general form, the central point can be a mean, median, mode, or the result of any other measure of central tendency or any reference value related to

the variance can be expanded as follows: Var(X) = E[X²] − (E[X])². In other words, the variance of X is equal to the mean of the square of X minus the square of the mean of X. This equation should not be used for computations using floating point arithmetic, because it suffers from catastrophic cancellation if the two components of the equation are similar in magnitude. For other numerically stable alternatives, see algorithms for calculating variance. If
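The cancellation hazard is easy to demonstrate: give the data a huge mean and a unit spread, and the two terms of the naive formula nearly cancel. A sketch assuming NumPy:

```python
import numpy as np

# Sketch of catastrophic cancellation: data with mean ~1e9 and variance ~1.
# The naive E[X^2] - E[X]^2 form subtracts two numbers near 1e18, losing
# almost all significant digits, while the mean-shifted two-pass form
# works with small deviations and stays accurate.
rng = np.random.default_rng(1)
data = 1e9 + rng.normal(0.0, 1.0, size=100_000)

naive = np.mean(data ** 2) - np.mean(data) ** 2      # cancellation-prone
two_pass = np.mean((data - np.mean(data)) ** 2)      # numerically stable
```

The two-pass form is one of the stable alternatives discussed under algorithms for calculating variance; single-pass stable updates (e.g. Welford-style) exist as well.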

the variance in situations where the CDF, but not the density, can be conveniently expressed. The second moment of a random variable attains its minimum value when taken around the first moment (i.e., the mean) of the random variable, i.e. argmin_m E((X − m)²) = E(X). Conversely, if

the variance of X is Var(X) = E[X²] − (E[X])² = 91/6 − (7/2)² = 35/12 ≈ 2.92. The general formula for the variance of the outcome, X, of an n-sided die is Var(X) = (n² − 1)/12. The following table lists the variance for some commonly used probability distributions. Variance is non-negative because the squares are positive or zero: Var(X) ≥ 0. The variance of a constant is zero. Conversely, if the variance of a random variable is 0, then it is almost surely a constant. That is, it always has
