In probability theory, the first-order second-moment (FOSM) method, also referred to as the mean value first-order second-moment (MVFOSM) method, is a probabilistic method for determining the stochastic moments of a function with random input variables. The name derives from the derivation, which uses a first-order Taylor series and the first and second moments of the input variables.
Consider the objective function $g(x)$, where the input vector $x$ is a realization of the random vector $X$ with probability density function $f_{X}(x)$. Because $X$
$$f(a)+{\frac {f'(a)}{1!}}(x-a)+{\frac {f''(a)}{2!}}(x-a)^{2}+{\frac {f'''(a)}{3!}}(x-a)^{3}+\cdots =\sum _{n=0}^{\infty }{\frac {f^{(n)}(a)}{n!}}(x-a)^{n}.$$
Here, $n!$ denotes the factorial of $n$, and $f^{(n)}(a)$ denotes the $n$th derivative of $f$ evaluated at the point $a$. The derivative of order zero of $f$ is defined to be $f$ itself, and $(x-a)^{0}$ and $0!$ are both defined to be 1.
297-541: A i = e − u ∑ j = 0 ∞ u j j ! a i + j . {\displaystyle \sum _{n=0}^{\infty }{\frac {u^{n}}{n!}}\Delta ^{n}a_{i}=e^{-u}\sum _{j=0}^{\infty }{\frac {u^{j}}{j!}}a_{i+j}.} So in particular, f ( a + t ) = lim h → 0 + e − t / h ∑ j = 0 ∞ f (
396-434: A n ( x − b ) n . {\displaystyle f(x)=\sum _{n=0}^{\infty }a_{n}(x-b)^{n}.} Differentiating by x the above formula n times, then setting x = b gives: f ( n ) ( b ) n ! = a n {\displaystyle {\frac {f^{(n)}(b)}{n!}}=a_{n}} and so the power series expansion agrees with
495-670: A ≤ X ≤ b ] = ∫ a b f X ( x ) d x . {\displaystyle \Pr[a\leq X\leq b]=\int _{a}^{b}f_{X}(x)\,dx.} Hence, if F X {\displaystyle F_{X}} is the cumulative distribution function of X {\displaystyle X} , then: F X ( x ) = ∫ − ∞ x f X ( u ) d u , {\displaystyle F_{X}(x)=\int _{-\infty }^{x}f_{X}(u)\,du,} and (if f X {\displaystyle f_{X}}
594-461: A ) h n = f ( a + t ) . {\displaystyle \lim _{h\to 0^{+}}\sum _{n=0}^{\infty }{\frac {t^{n}}{n!}}{\frac {\Delta _{h}^{n}f(a)}{h^{n}}}=f(a+t).} Here Δ h is the n th finite difference operator with step size h . The series is precisely the Taylor series, except that divided differences appear in place of differentiation: the series
693-399: A ) + f ″ ( a ) 2 ! ( x − a ) 2 + f ‴ ( a ) 3 ! ( x − a ) 3 + ⋯ = ∑ n = 0 ∞ f ( n ) ( a ) n ! ( x −
792-454: A = 1 is ( x − 1 ) − 1 2 ( x − 1 ) 2 + 1 3 ( x − 1 ) 3 − 1 4 ( x − 1 ) 4 + ⋯ , {\displaystyle (x-1)-{\tfrac {1}{2}}(x-1)^{2}+{\tfrac {1}{3}}(x-1)^{3}-{\tfrac {1}{4}}(x-1)^{4}+\cdots ,} and more generally,
891-467: A monotonic function , then the resulting density function is f Y ( y ) = f X ( g − 1 ( y ) ) | d d y ( g − 1 ( y ) ) | . {\displaystyle f_{Y}(y)=f_{X}{\big (}g^{-1}(y){\big )}\left|{\frac {d}{dy}}{\big (}g^{-1}(y){\big )}\right|.} Here g denotes
990-409: A probability density function ( PDF ), density function , or density of an absolutely continuous random variable , is a function whose value at any given sample (or point) in the sample space (the set of possible values taken by the random variable) can be interpreted as providing a relative likelihood that the value of the random variable would be equal to that sample. Probability density
1089-889: A ) and 0! are both defined to be 1 . This series can be written by using sigma notation , as in the right side formula. With a = 0 , the Maclaurin series takes the form: f ( 0 ) + f ′ ( 0 ) 1 ! x + f ″ ( 0 ) 2 ! x 2 + f ‴ ( 0 ) 3 ! x 3 + ⋯ = ∑ n = 0 ∞ f ( n ) ( 0 ) n ! x n . {\displaystyle f(0)+{\frac {f'(0)}{1!}}x+{\frac {f''(0)}{2!}}x^{2}+{\frac {f'''(0)}{3!}}x^{3}+\cdots =\sum _{n=0}^{\infty }{\frac {f^{(n)}(0)}{n!}}x^{n}.} The Taylor series of any polynomial
#17327833197421188-730: A collapsed random variable with probability density function p Z ( z ) = δ ( z ) {\displaystyle p_{Z}(z)=\delta (z)} (i.e., a constant equal to zero). Let the random vector X ~ {\displaystyle {\tilde {X}}} and the transform H {\displaystyle H} be defined as H ( Z , X ) = [ Z + V ( X ) X ] = [ Y X ~ ] . {\displaystyle H(Z,X)={\begin{bmatrix}Z+V(X)\\X\end{bmatrix}}={\begin{bmatrix}Y\\{\tilde {X}}\end{bmatrix}}.} It
1287-492: A countable set), while the PDF is used in the context of continuous random variables. Suppose bacteria of a certain species typically live 20 to 30 hours. The probability that a bacterium lives exactly 5 hours is equal to zero. A lot of bacteria live for approximately 5 hours, but there is no chance that any given bacterium dies at exactly 5.00... hours. However, the probability that the bacterium dies between 5 hours and 5.01 hours
1386-669: A density function: the distributions of discrete random variables do not; nor does the Cantor distribution , even though it has no discrete component, i.e., does not assign positive probability to any individual point. A distribution has a density function if and only if its cumulative distribution function F ( x ) is absolutely continuous . In this case: F is almost everywhere differentiable , and its derivative can be used as probability density: d d x F ( x ) = f ( x ) . {\displaystyle {\frac {d}{dx}}F(x)=f(x).} If
1485-437: A differentiable function and X {\displaystyle X} be a random vector taking values in R n {\displaystyle \mathbb {R} ^{n}} , f X {\displaystyle f_{X}} be the probability density function of X {\displaystyle X} and δ ( ⋅ ) {\displaystyle \delta (\cdot )} be
1584-487: A discrete variable can take n different values among real numbers, then the associated probability density function is: f ( t ) = ∑ i = 1 n p i δ ( t − x i ) , {\displaystyle f(t)=\sum _{i=1}^{n}p_{i}\,\delta (t-x_{i}),} where x 1 , … , x n {\displaystyle x_{1},\ldots ,x_{n}} are
1683-498: A few centuries later. In the 14th century, the earliest examples of specific Taylor series (but not the general method) were given by Indian mathematician Madhava of Sangamagrama . Though no record of his work survives, writings of his followers in the Kerala school of astronomy and mathematics suggest that he found the Taylor series for the trigonometric functions of sine , cosine , and arctangent (see Madhava series ). During
1782-1005: A general method for expanding functions in series. Newton had in fact used a cumbersome method involving long division of series and term-by-term integration, but Gregory did not know it and set out to discover a general method for himself. In early 1671 Gregory discovered something like the general Maclaurin series and sent a letter to Collins including series for arctan x , {\textstyle \arctan x,} tan x , {\textstyle \tan x,} sec x , {\textstyle \sec x,} ln sec x {\textstyle \ln \,\sec x} (the integral of tan {\displaystyle \tan } ), ln tan 1 2 ( 1 2 π + x ) {\textstyle \ln \,\tan {\tfrac {1}{2}}{{\bigl (}{\tfrac {1}{2}}\pi +x{\bigr )}}} (the integral of sec ,
1881-407: A given distribution, the parameters are constants, and terms in a density function that contain only parameters, but not variables, are part of the normalization factor of a distribution (the multiplicative factor that ensures that the area under the density—the probability of something in the domain occurring— equals 1). This normalization factor is outside the kernel of the distribution. Since
1980-479: A joint density are all independent from each other if and only if f X 1 , … , X n ( x 1 , … , x n ) = f X 1 ( x 1 ) ⋯ f X n ( x n ) . {\displaystyle f_{X_{1},\ldots ,X_{n}}(x_{1},\ldots ,x_{n})=f_{X_{1}}(x_{1})\cdots f_{X_{n}}(x_{n}).} If
2079-404: A method by Newton, Gregory never described how he obtained these series, and it can only be inferred that he understood the general method by examining scratch work he had scribbled on the back of another letter from 1671. In 1691–1692, Isaac Newton wrote down an explicit statement of the Taylor and Maclaurin series in an unpublished version of his work De Quadratura Curvarum . However, this work
#17327833197422178-490: A philosophical resolution of the paradox, but the mathematical content was apparently unresolved until taken up by Archimedes , as it had been prior to Aristotle by the Presocratic Atomist Democritus . It was through Archimedes's method of exhaustion that an infinite number of progressive subdivisions could be performed to achieve a finite result. Liu Hui independently employed a similar method
2277-408: A probability distribution admits a density, then the probability of every one-point set { a } is zero; the same holds for finite and countable sets. Two probability densities f and g represent the same probability distribution precisely if they differ only on a set of Lebesgue measure zero . In the field of statistical physics , a non-formal reformulation of the relation above between
2376-404: A radius of convergence 0 everywhere. A function cannot be written as a Taylor series centred at a singularity ; in these cases, one can often still achieve a series expansion if one allows also negative powers of the variable x ; see Laurent series . For example, f ( x ) = e can be written as a Laurent series. The generalization of the Taylor series does converge to the value of
2475-471: A random variable X is given and its distribution admits a probability density function f , then the expected value of X (if the expected value exists) can be calculated as E [ X ] = ∫ − ∞ ∞ x f ( x ) d x . {\displaystyle \operatorname {E} [X]=\int _{-\infty }^{\infty }x\,f(x)\,dx.} Not every probability distribution has
2574-410: A random variable (or vector) X is given as f X ( x ) , it is possible (but often not necessary; see below) to calculate the probability density function of some variable Y = g ( X ) . This is also called a "change of variable" and is in practice used to generate a random variable of arbitrary shape f g ( X ) = f Y using a known (for instance, uniform) random number generator. It
2673-718: A reference for a continuous random variable). Furthermore, when it does exist, the density is almost unique, meaning that any two such densities coincide almost everywhere . Unlike a probability, a probability density function can take on values greater than one; for example, the continuous uniform distribution on the interval [0, 1/2] has probability density f ( x ) = 2 for 0 ≤ x ≤ 1/2 and f ( x ) = 0 elsewhere. The standard normal distribution has probability density f ( x ) = 1 2 π e − x 2 / 2 . {\displaystyle f(x)={\frac {1}{\sqrt {2\pi }}}\,e^{-x^{2}/2}.} If
2772-524: A significantly smaller number of evaluations than performing a Monte Carlo simulation. However, when using the FOSM method as a design procedure, a lower bound shall be estimated, which is actually not given by the FOSM approach. Therefore, a type of distribution needs to be assumed for the distribution of the objective function, taking into account the approximated mean value and standard deviation. Probability density function In probability theory ,
2871-983: A whole, often called joint probability density function . This density function is defined as a function of the n variables, such that, for any domain D in the n -dimensional space of the values of the variables X 1 , ..., X n , the probability that a realisation of the set variables falls inside the domain D is Pr ( X 1 , … , X n ∈ D ) = ∫ D f X 1 , … , X n ( x 1 , … , x n ) d x 1 ⋯ d x n . {\displaystyle \Pr \left(X_{1},\ldots ,X_{n}\in D\right)=\int _{D}f_{X_{1},\ldots ,X_{n}}(x_{1},\ldots ,x_{n})\,dx_{1}\cdots dx_{n}.} If F ( x 1 , ..., x n ) = Pr( X 1 ≤ x 1 , ..., X n ≤ x n )
2970-410: Is Pr ( X > 0 , Y > 0 ) = ∫ 0 ∞ ∫ 0 ∞ f X , Y ( x , y ) d x d y . {\displaystyle \Pr \left(X>0,Y>0\right)=\int _{0}^{\infty }\int _{0}^{\infty }f_{X,Y}(x,y)\,dx\,dy.} If the probability density function of
3069-559: Is infinitely differentiable at x = 0 , and has all derivatives zero there. Consequently, the Taylor series of f ( x ) about x = 0 is identically zero. However, f ( x ) is not the zero function, so does not equal its Taylor series around the origin. Thus, f ( x ) is an example of a non-analytic smooth function . In real analysis , this example shows that there are infinitely differentiable functions f ( x ) whose Taylor series are not equal to f ( x ) even if they converge. By contrast,
3168-549: Is an upper triangular matrix with ones on the main diagonal, therefore its determinant is 1. Applying the change of variable theorem from the previous section we obtain that f Y , X ( y , x ) = f X ( x ) δ ( y − V ( x ) ) , {\displaystyle f_{Y,X}(y,x)=f_{X}(\mathbf {x} )\delta {\big (}y-V(\mathbf {x} ){\big )},} which if marginalized over x {\displaystyle x} leads to
3267-945: Is clear that H {\displaystyle H} is a bijective mapping, and the Jacobian of H − 1 {\displaystyle H^{-1}} is given by: d H − 1 ( y , x ~ ) d y d x ~ = [ 1 − d V ( x ~ ) d x ~ 0 n × 1 I n × n ] , {\displaystyle {\frac {dH^{-1}(y,{\tilde {\mathbf {x} }})}{dy\,d{\tilde {\mathbf {x} }}}}={\begin{bmatrix}1&-{\frac {dV({\tilde {\mathbf {x} }})}{d{\tilde {\mathbf {x} }}}}\\\mathbf {0} _{n\times 1}&\mathbf {I} _{n\times n}\end{bmatrix}},} which
3366-456: Is continuous at x {\displaystyle x} ) f X ( x ) = d d x F X ( x ) . {\displaystyle f_{X}(x)={\frac {d}{dx}}F_{X}(x).} Intuitively, one can think of f X ( x ) d x {\displaystyle f_{X}(x)\,dx} as being the probability of X {\displaystyle X} falling within
3465-482: Is formally similar to the Newton series . When the function f is analytic at a , the terms in the series converge to the terms of the Taylor series, and in this sense generalizes the usual Taylor series. In general, for any infinite sequence a i , the following power series identity holds: ∑ n = 0 ∞ u n n ! Δ n
3564-477: Is given by the integral Inserting the first-order Taylor series yields The variance of g {\displaystyle g} is given by the integral According to the computational formula for the variance, this can be written as Inserting the Taylor series yields The following abbreviations are introduced. In the following, the entries of the random vector X {\displaystyle X} are assumed to be independent . Considering also
3663-401: Is no more than | x | / 9! . For a full cycle centered at the origin ( −π < x < π ) the error is less than 0.08215. In particular, for −1 < x < 1 , the error is less than 0.000003. In contrast, also shown is a picture of the natural logarithm function ln(1 + x ) and some of its Taylor polynomials around a = 0 . These approximations converge to
3762-501: Is not necessarily a density) then the n variables in the set are all independent from each other, and the marginal probability density function of each of them is given by f X i ( x i ) = f i ( x i ) ∫ f i ( x ) d x . {\displaystyle f_{X_{i}}(x_{i})={\frac {f_{i}(x_{i})}{\int f_{i}(x)\,dx}}.} This elementary example illustrates
3861-459: Is possible to represent certain discrete random variables as well as random variables involving both a continuous and a discrete part with a generalized probability density function using the Dirac delta function . (This is not possible with a probability density function in the sense defined above, it may be done with a distribution .) For example, consider a binary discrete random variable having
3960-414: Is quantifiable. Suppose the answer is 0.02 (i.e., 2%). Then, the probability that the bacterium dies between 5 hours and 5.001 hours should be about 0.002, since this time interval is one-tenth as long as the previous. The probability that the bacterium dies between 5 hours and 5.0001 hours should be about 0.0002, and so on. In this example, the ratio (probability of living during an interval) / (duration of
4059-588: Is randomly distributed, g {\displaystyle g} is also randomly distributed. Following the FOSM method, the mean value of g {\displaystyle g} is approximated by The variance of g {\displaystyle g} is approximated by where n {\displaystyle n} is the length/dimension of x {\displaystyle x} and ∂ g ( μ ) ∂ x i {\textstyle {\frac {\partial g(\mu )}{\partial x_{i}}}}
4158-900: Is tempting to think that in order to find the expected value E( g ( X )) , one must first find the probability density f g ( X ) of the new random variable Y = g ( X ) . However, rather than computing E ( g ( X ) ) = ∫ − ∞ ∞ y f g ( X ) ( y ) d y , {\displaystyle \operatorname {E} {\big (}g(X){\big )}=\int _{-\infty }^{\infty }yf_{g(X)}(y)\,dy,} one may find instead E ( g ( X ) ) = ∫ − ∞ ∞ g ( x ) f X ( x ) d x . {\displaystyle \operatorname {E} {\big (}g(X){\big )}=\int _{-\infty }^{\infty }g(x)f_{X}(x)\,dx.} The values of
4257-742: Is the Radon–Nikodym derivative : f = d X ∗ P d μ . {\displaystyle f={\frac {dX_{*}P}{d\mu }}.} That is, f is any measurable function with the property that: Pr [ X ∈ A ] = ∫ X − 1 A d P = ∫ A f d μ {\displaystyle \Pr[X\in A]=\int _{X^{-1}A}\,dP=\int _{A}f\,d\mu } for any measurable set A ∈ A . {\displaystyle A\in {\mathcal {A}}.} In
4356-534: Is the cumulative distribution function of the vector ( X 1 , ..., X n ) , then the joint probability density function can be computed as a partial derivative f ( x ) = ∂ n F ∂ x 1 ⋯ ∂ x n | x {\displaystyle f(x)=\left.{\frac {\partial ^{n}F}{\partial x_{1}\cdots \partial x_{n}}}\right|_{x}} For i = 1, 2, ..., n , let f X i ( x i ) be
4455-430: Is the probability per unit length, in other words, while the absolute likelihood for a continuous random variable to take on any particular value is 0 (since there is an infinite set of possible values to begin with), the value of the PDF at two different samples can be used to infer, in any particular draw of the random variable, how much more likely it is that the random variable would be close to one sample compared to
4554-976: Is the number of solutions in x for the equation g ( x ) = y {\displaystyle g(x)=y} , and g k − 1 ( y ) {\displaystyle g_{k}^{-1}(y)} are these solutions. Suppose x is an n -dimensional random variable with joint density f . If y = G ( x ) , where G is a bijective , differentiable function , then y has density p Y : p Y ( y ) = f ( G − 1 ( y ) ) | det [ d G − 1 ( z ) d z | z = y ] | {\displaystyle p_{Y}(\mathbf {y} )=f{\Bigl (}G^{-1}(\mathbf {y} ){\Bigr )}\left|\det \left[\left.{\frac {dG^{-1}(\mathbf {z} )}{d\mathbf {z} }}\right|_{\mathbf {z} =\mathbf {y} }\right]\right|} with
4653-507: Is the partial derivative of g {\displaystyle g} at the mean vector μ {\displaystyle \mu } with respect to the i -th entry of x {\displaystyle x} . More accurate, second-order second-moment approximations are also available The objective function is approximated by a Taylor series at the mean vector μ {\displaystyle \mu } . The mean value of g {\displaystyle g}
4752-485: Is the point where the derivatives are considered, after Colin Maclaurin , who made extensive use of this special case of Taylor series in the 18th century. The partial sum formed by the first n + 1 terms of a Taylor series is a polynomial of degree n that is called the n th Taylor polynomial of the function. Taylor polynomials are approximations of a function, which become generally more accurate as n increases. Taylor's theorem gives quantitative estimates on
4851-650: Is the polynomial itself. The Maclaurin series of 1 / 1 − x is the geometric series 1 + x + x 2 + x 3 + ⋯ . {\displaystyle 1+x+x^{2}+x^{3}+\cdots .} So, by substituting x for 1 − x , the Taylor series of 1 / x at a = 1 is 1 − ( x − 1 ) + ( x − 1 ) 2 − ( x − 1 ) 3 + ⋯ . {\displaystyle 1-(x-1)+(x-1)^{2}-(x-1)^{3}+\cdots .} By integrating
4950-459: Is the probability that the bacterium dies in that window. A probability density function is most commonly associated with absolutely continuous univariate distributions . A random variable X {\displaystyle X} has density f X {\displaystyle f_{X}} , where f X {\displaystyle f_{X}} is a non-negative Lebesgue-integrable function, if: Pr [
5049-528: Is the probability that the bacterium dies within an infinitesimal window of time around 5 hours, where dt is the duration of this window. For example, the probability that it lives longer than 5 hours, but shorter than (5 hours + 1 nanosecond), is (2 hour )×(1 nanosecond) ≈ 6 × 10 (using the unit conversion 3.6 × 10 nanoseconds = 1 hour). There is a probability density function f with f (5 hours) = 2 hour . The integral of f over any window of time (not only infinitesimal windows but also large windows)
#17327833197425148-404: Is undefined at 0. More generally, every sequence of real or complex numbers can appear as coefficients in the Taylor series of an infinitely differentiable function defined on the real line, a consequence of Borel's lemma . As a result, the radius of convergence of a Taylor series can be zero. There are even infinitely differentiable functions defined on the real line whose Taylor series have
5247-541: The Borel sets as measurable subsets) has as probability distribution the pushforward measure X ∗ P on ( X , A ) {\displaystyle ({\mathcal {X}},{\mathcal {A}})} : the density of X {\displaystyle X} with respect to a reference measure μ {\displaystyle \mu } on ( X , A ) {\displaystyle ({\mathcal {X}},{\mathcal {A}})}
5346-671: The Dirac delta function. It is possible to use the formulas above to determine f Y {\displaystyle f_{Y}} , the probability density function of Y = V ( X ) {\displaystyle Y=V(X)} , which will be given by f Y ( y ) = ∫ R n f X ( x ) δ ( y − V ( x ) ) d x . {\displaystyle f_{Y}(y)=\int _{\mathbb {R} ^{n}}f_{X}(\mathbf {x} )\delta {\big (}y-V(\mathbf {x} ){\big )}\,d\mathbf {x} .} This result leads to
5445-477: The Rademacher distribution —that is, taking −1 or 1 for values, with probability 1 ⁄ 2 each. The density of probability associated with this variable is: f ( t ) = 1 2 ( δ ( t + 1 ) + δ ( t − 1 ) ) . {\displaystyle f(t)={\frac {1}{2}}(\delta (t+1)+\delta (t-1)).} More generally, if
5544-452: The Taylor series or Taylor expansion of a function is an infinite sum of terms that are expressed in terms of the function's derivatives at a single point. For most common functions, the function and the sum of its Taylor series are equal near this point. Taylor series are named after Brook Taylor , who introduced them in 1715. A Taylor series is also called a Maclaurin series when 0
5643-414: The complex plane ) containing x . This implies that the function is analytic at every point of the interval (or disk). The Taylor series of a real or complex-valued function f ( x ) , that is infinitely differentiable at a real or complex number a , is the power series f ( a ) + f ′ ( a ) 1 ! ( x −
5742-469: The continuous univariate case above , the reference measure is the Lebesgue measure . The probability mass function of a discrete random variable is the density with respect to the counting measure over the sample space (usually the set of integers , or some subset thereof). It is not possible to define a density with reference to an arbitrary measure (e.g. one can not choose the counting measure as
5841-1025: The exponential function e is ∑ n = 0 ∞ x n n ! = x 0 0 ! + x 1 1 ! + x 2 2 ! + x 3 3 ! + x 4 4 ! + x 5 5 ! + ⋯ = 1 + x + x 2 2 + x 3 6 + x 4 24 + x 5 120 + ⋯ . {\displaystyle {\begin{aligned}\sum _{n=0}^{\infty }{\frac {x^{n}}{n!}}&={\frac {x^{0}}{0!}}+{\frac {x^{1}}{1!}}+{\frac {x^{2}}{2!}}+{\frac {x^{3}}{3!}}+{\frac {x^{4}}{4!}}+{\frac {x^{5}}{5!}}+\cdots \\&=1+x+{\frac {x^{2}}{2}}+{\frac {x^{3}}{6}}+{\frac {x^{4}}{24}}+{\frac {x^{5}}{120}}+\cdots .\end{aligned}}} The above expansion holds because
5940-425: The holomorphic functions studied in complex analysis always possess a convergent Taylor series, and even the Taylor series of meromorphic functions , which might have singularities, never converge to a value different from the function itself. The complex function e , however, does not approach 0 when z approaches 0 along the imaginary axis, so it is not continuous in the complex plane and its Taylor series
6039-1357: The inverse function . This follows from the fact that the probability contained in a differential area must be invariant under change of variables. That is, | f Y ( y ) d y | = | f X ( x ) d x | , {\displaystyle \left|f_{Y}(y)\,dy\right|=\left|f_{X}(x)\,dx\right|,} or f Y ( y ) = | d x d y | f X ( x ) = | d d y ( x ) | f X ( x ) = | d d y ( g − 1 ( y ) ) | f X ( g − 1 ( y ) ) = | ( g − 1 ) ′ ( y ) | ⋅ f X ( g − 1 ( y ) ) . {\displaystyle f_{Y}(y)=\left|{\frac {dx}{dy}}\right|f_{X}(x)=\left|{\frac {d}{dy}}(x)\right|f_{X}(x)=\left|{\frac {d}{dy}}{\big (}g^{-1}(y){\big )}\right|f_{X}{\big (}g^{-1}(y){\big )}={\left|\left(g^{-1}\right)'(y)\right|}\cdot f_{X}{\big (}g^{-1}(y){\big )}.} For functions that are not monotonic,
#17327833197426138-1349: The law of the unconscious statistician : E Y [ Y ] = ∫ R y f Y ( y ) d y = ∫ R y ∫ R n f X ( x ) δ ( y − V ( x ) ) d x d y = ∫ R n ∫ R y f X ( x ) δ ( y − V ( x ) ) d y d x = ∫ R n V ( x ) f X ( x ) d x = E X [ V ( X ) ] . {\displaystyle \operatorname {E} _{Y}[Y]=\int _{\mathbb {R} }yf_{Y}(y)\,dy=\int _{\mathbb {R} }y\int _{\mathbb {R} ^{n}}f_{X}(\mathbf {x} )\delta {\big (}y-V(\mathbf {x} ){\big )}\,d\mathbf {x} \,dy=\int _{{\mathbb {R} }^{n}}\int _{\mathbb {R} }yf_{X}(\mathbf {x} )\delta {\big (}y-V(\mathbf {x} ){\big )}\,dy\,d\mathbf {x} =\int _{\mathbb {R} ^{n}}V(\mathbf {x} )f_{X}(\mathbf {x} )\,d\mathbf {x} =\operatorname {E} _{X}[V(X)].} Proof: Let Z {\displaystyle Z} be
6237-552: The mean , variance , and kurtosis ), starting from the formulas given for a continuous distribution of the probability. It is common for probability density functions (and probability mass functions ) to be parametrized—that is, to be characterized by unspecified parameters . For example, the normal distribution is parametrized in terms of the mean and the variance , denoted by μ {\displaystyle \mu } and σ 2 {\displaystyle \sigma ^{2}} respectively, giving
6336-489: The probability distribution is defined as a function over general sets of values or it may refer to the cumulative distribution function , or it may be a probability mass function (PMF) rather than the density. "Density function" itself is also used for the probability mass function, leading to further confusion. In general though, the PMF is used in the context of discrete random variables (random variables that take values on
6435-399: The square root , the logarithm , the trigonometric function tangent, and its inverse, arctan . For these functions the Taylor series do not converge if x is far from b . That is, the Taylor series diverges at x if the distance between x and b is larger than the radius of convergence . The Taylor series can be used to calculate the value of an entire function at every point, if
6534-412: The Taylor series, but higher-order moments, the third central moment is approximated by For the second-order approximations of the third central moment as well as for the derivation of all higher-order approximations see Appendix D of Ref. Taking into account the quadratic terms of the Taylor series and the third moments of the input variables is referred to as second-order third-moment method. However,
6633-482: The Taylor series. Thus a function is analytic in an open disk centered at b if and only if its Taylor series converges to the value of the function at each point of the disk. If f ( x ) is equal to the sum of its Taylor series for all x in the complex plane, it is called entire . The polynomials, exponential function e , and the trigonometric functions sine and cosine, are examples of entire functions. Examples of functions that are not entire include
6732-492: The above Maclaurin series, we find the Maclaurin series of ln(1 − x ) , where ln denotes the natural logarithm : − x − 1 2 x 2 − 1 3 x 3 − 1 4 x 4 − ⋯ . {\displaystyle -x-{\tfrac {1}{2}}x^{2}-{\tfrac {1}{3}}x^{3}-{\tfrac {1}{4}}x^{4}-\cdots .} The corresponding Taylor series of ln x at
6831-434: The above definition of multidimensional probability density functions in the simple case of a function of a set of two variables. Let us call R → {\displaystyle {\vec {R}}} a 2-dimensional random vector of coordinates ( X , Y ) : the probability to obtain R → {\displaystyle {\vec {R}}} in the quarter plane of positive x and y
6930-449: The corresponding Taylor series of ln x at an arbitrary nonzero point a is: ln a + 1 a ( x − a ) − 1 a 2 ( x − a ) 2 2 + ⋯ . {\displaystyle \ln a+{\frac {1}{a}}(x-a)-{\frac {1}{a^{2}}}{\frac {\left(x-a\right)^{2}}{2}}+\cdots .} The Maclaurin series of
7029-431: The derivative of e with respect to x is also e , and e equals 1. This leaves the terms ( x − 0) in the numerator and n ! in the denominator of each term in the infinite sum. The ancient Greek philosopher Zeno of Elea considered the problem of summing an infinite series to achieve a finite result, but rejected it as an impossibility; the result was Zeno's paradox . Later, Aristotle proposed
7128-542: The derivative of the cumulative distribution function and the probability density function is generally used as the definition of the probability density function. This alternate definition is the following: If dt is an infinitely small number, the probability that X is included within the interval ( t , t + dt ) is equal to f ( t ) dt , or: Pr ( t < X < t + d t ) = f ( t ) d t . {\displaystyle \Pr(t<X<t+dt)=f(t)\,dt.} It
7227-717: The desired probability density function. The probability density function of the sum of two independent random variables U and V , each of which has a probability density function, is the convolution of their separate density functions: f U + V ( x ) = ∫ − ∞ ∞ f U ( y ) f V ( x − y ) d y = ( f U ∗ f V ) ( x ) {\displaystyle f_{U+V}(x)=\int _{-\infty }^{\infty }f_{U}(y)f_{V}(x-y)\,dy=\left(f_{U}*f_{V}\right)(x)} Taylor series In mathematics ,
7326-1824: The differential regarded as the Jacobian of the inverse of G (⋅) , evaluated at y . For example, in the 2-dimensional case x = ( x 1 , x 2 ) , suppose the transform G is given as y 1 = G 1 ( x 1 , x 2 ) , y 2 = G 2 ( x 1 , x 2 ) with inverses x 1 = G 1 ( y 1 , y 2 ) , x 2 = G 2 ( y 1 , y 2 ) . The joint distribution for y = ( y 1 , y 2 ) has density p Y 1 , Y 2 ( y 1 , y 2 ) = f X 1 , X 2 ( G 1 − 1 ( y 1 , y 2 ) , G 2 − 1 ( y 1 , y 2 ) ) | ∂ G 1 − 1 ∂ y 1 ∂ G 2 − 1 ∂ y 2 − ∂ G 1 − 1 ∂ y 2 ∂ G 2 − 1 ∂ y 1 | . {\displaystyle p_{Y_{1},Y_{2}}(y_{1},y_{2})=f_{X_{1},X_{2}}{\big (}G_{1}^{-1}(y_{1},y_{2}),G_{2}^{-1}(y_{1},y_{2}){\big )}\left\vert {\frac {\partial G_{1}^{-1}}{\partial y_{1}}}{\frac {\partial G_{2}^{-1}}{\partial y_{2}}}-{\frac {\partial G_{1}^{-1}}{\partial y_{2}}}{\frac {\partial G_{2}^{-1}}{\partial y_{1}}}\right\vert .} Let V : R n → R {\displaystyle V:\mathbb {R} ^{n}\to \mathbb {R} } be
7425-426: The discrete values accessible to the variable and p 1 , … , p n {\displaystyle p_{1},\ldots ,p_{n}} are the probabilities associated with these values. This substantially unifies the treatment of discrete and continuous probability distributions. The above expression allows for determining statistical characteristics of such a discrete variable (such as
7524-412: The error introduced by the use of such approximations. If the Taylor series of a function is convergent , its sum is the limit of the infinite sequence of the Taylor polynomials. A function may differ from the sum of its Taylor series, even if its Taylor series is convergent. A function is analytic at a point x if it is equal to the sum of its Taylor series in some open interval (or open disk in
7623-463: The family of densities f ( x ; μ , σ 2 ) = 1 σ 2 π e − 1 2 ( x − μ σ ) 2 . {\displaystyle f(x;\mu ,\sigma ^{2})={\frac {1}{\sigma {\sqrt {2\pi }}}}e^{-{\frac {1}{2}}\left({\frac {x-\mu }{\sigma }}\right)^{2}}.} Different values of
7722-578: The following two centuries his followers developed further series expansions and rational approximations. In late 1670, James Gregory was shown in a letter from John Collins several Maclaurin series ( sin x , {\textstyle \sin x,} cos x , {\textstyle \cos x,} arcsin x , {\textstyle \arcsin x,} and x cot x {\textstyle x\cot x} ) derived by Isaac Newton , and told that Newton had developed
7821-509: The full second-order approach of the variance (given above) also includes fourth-order moments of input parameters, the full second-order approach of the skewness 6th-order moments, and the full second-order approach of the kurtosis up to 8th-order moments. There are several examples in the literature where the FOSM method is employed to estimate the stochastic distribution of the buckling load of axially compressed structures (see e.g. Ref.). For structures which are very sensitive to deviations from
7920-431: The function itself for any bounded continuous function on (0,∞) , and this can be done by using the calculus of finite differences . Specifically, the following theorem, due to Einar Hille , that for any t > 0 , lim h → 0 + ∑ n = 0 ∞ t n n ! Δ h n f (
8019-399: The function only in the region −1 < x ≤ 1 ; outside of this region the higher-degree Taylor polynomials are worse approximations for the function. The error incurred in approximating a function by its n th-degree Taylor polynomial is called the remainder or residual and is denoted by the function R n ( x ) . Taylor's theorem can be used to obtain a bound on the size of
8118-497: The ideal structure (like cylindrical shells) it has been proposed to use the FOSM method as a design approach. Often the applicability is checked by comparison with a Monte Carlo simulation . Two comprehensive application examples of the full second-order method specifically oriented towards the fatigue crack growth in a metal railway axle are discussed and checked by comparison with a Monte Carlo simulation in Ref. In engineering practice,
8217-554: The infinitesimal interval [ x , x + d x ] {\displaystyle [x,x+dx]} . ( This definition may be extended to any probability distribution using the measure-theoretic definition of probability . ) A random variable X {\displaystyle X} with values in a measurable space ( X , A ) {\displaystyle ({\mathcal {X}},{\mathcal {A}})} (usually R n {\displaystyle \mathbb {R} ^{n}} with
8316-411: The interval) is approximately constant, and equal to 2 per hour (or 2 hour ). For example, there is 0.02 probability of dying in the 0.01-hour interval between 5 and 5.01 hours, and (0.02 probability / 0.01 hours) = 2 hour . This quantity 2 hour is called the probability density for dying at around 5 hours. Therefore, the probability that the bacterium dies at 5 hours can be written as (2 hour ) dt . This
8415-525: The inverse Gudermannian function ), arcsec ( 2 e x ) , {\textstyle \operatorname {arcsec} {\bigl (}{\sqrt {2}}e^{x}{\bigr )},} and 2 arctan e x − 1 2 π {\textstyle 2\arctan e^{x}-{\tfrac {1}{2}}\pi } (the Gudermannian function). However, thinking that he had merely redeveloped
8514-530: The joint probability density function of a vector of n random variables can be factored into a product of n functions of one variable f X 1 , … , X n ( x 1 , … , x n ) = f 1 ( x 1 ) ⋯ f n ( x n ) , {\displaystyle f_{X_{1},\ldots ,X_{n}}(x_{1},\ldots ,x_{n})=f_{1}(x_{1})\cdots f_{n}(x_{n}),} (where each f i
8613-422: The objective function often is not given as analytic expression, but for instance as a result of a finite-element simulation. Then the derivatives of the objective function need to be estimated by the central differences method. The number of evaluations of the objective function equals 2 n + 1 {\displaystyle 2n+1} . Depending on the number of random variables this still can mean
8712-408: The other sample. More precisely, the PDF is used to specify the probability of the random variable falling within a particular range of values , as opposed to taking on any one value. This probability is given by the integral of this variable's PDF over that range—that is, it is given by the area under the density function but above the horizontal axis and between the lowest and greatest values of
8811-404: The parameters are constants, reparametrizing a density in terms of different parameters to give a characterization of a different random variable in the family, means simply substituting the new parameter values into the formula in place of the old ones. For continuous random variables X 1 , ..., X n , it is also possible to define a probability density function associated to the set as
8910-417: The parameters describe different distributions of different random variables on the same sample space (the same set of all possible values of the variable); this sample space is the domain of the family of random variables that this family of distributions describes. A given set of parameters describes a single distribution within the family sharing the functional form of the density. From the perspective of
9009-800: The probability density function associated with variable X i alone. This is called the marginal density function, and can be deduced from the probability density associated with the random variables X 1 , ..., X n by integrating over all values of the other n − 1 variables: f X i ( x i ) = ∫ f ( x 1 , … , x n ) d x 1 ⋯ d x i − 1 d x i + 1 ⋯ d x n . {\displaystyle f_{X_{i}}(x_{i})=\int f(x_{1},\ldots ,x_{n})\,dx_{1}\cdots dx_{i-1}\,dx_{i+1}\cdots dx_{n}.} Continuous random variables X 1 , ..., X n admitting
9108-475: The probability density function for y is ∑ k = 1 n ( y ) | d d y g k − 1 ( y ) | ⋅ f X ( g k − 1 ( y ) ) , {\displaystyle \sum _{k=1}^{n(y)}\left|{\frac {d}{dy}}g_{k}^{-1}(y)\right|\cdot f_{X}{\big (}g_{k}^{-1}(y){\big )},} where n ( y )
9207-413: The range. The probability density function is nonnegative everywhere, and the area under the entire curve is equal to 1. The terms probability distribution function and probability function have also sometimes been used to denote the probability density function. However, this use is not standard among probabilists and statisticians. In other sources, "probability distribution function" may be used when
9306-749: The remainder . In general, Taylor series need not be convergent at all. In fact, the set of functions with a convergent Taylor series is a meager set in the Fréchet space of smooth functions . Even if the Taylor series of a function f does converge, its limit need not be equal to the value of the function f ( x ) . For example, the function f ( x ) = { e − 1 / x 2 if x ≠ 0 0 if x = 0 {\displaystyle f(x)={\begin{cases}e^{-1/x^{2}}&{\text{if }}x\neq 0\\[3mu]0&{\text{if }}x=0\end{cases}}}
9405-415: The second-order terms of the Taylor expansion, the approximation of the mean value is given by The incomplete second-order approximation (ISOA) of the variance is given by The skewness of g {\displaystyle g} can be determined from the third central moment μ g , 3 {\displaystyle \mu _{g,3}} . When considering only linear terms of
9504-409: The special case of the Taylor result in the mid-18th century. If f ( x ) is given by a convergent power series in an open disk centred at b in the complex plane (or an interval in the real line), it is said to be analytic in this region. Thus for x in this region, f is given by a convergent power series f ( x ) = ∑ n = 0 ∞
9603-427: The two integrals are the same in all cases in which both X and g ( X ) actually have probability density functions. It is not necessary that g be a one-to-one function . In some cases the latter integral is computed much more easily than the former. See Law of the unconscious statistician . Let g : R → R {\displaystyle g:\mathbb {R} \to \mathbb {R} } be
9702-636: The value of the function, and of all of its derivatives, are known at a single point. Uses of the Taylor series for analytic functions include: Pictured is an accurate approximation of sin x around the point x = 0 . The pink curve is a polynomial of degree seven: sin x ≈ x − x 3 3 ! + x 5 5 ! − x 7 7 ! . {\displaystyle \sin {x}\approx x-{\frac {x^{3}}{3!}}+{\frac {x^{5}}{5!}}-{\frac {x^{7}}{7!}}.\!} The error in this approximation
9801-438: Was never completed and the relevant sections were omitted from the portions published in 1704 under the title Tractatus de Quadratura Curvarum . It was not until 1715 that a general method for constructing these series for all functions for which they exist was finally published by Brook Taylor , after whom the series are now named. The Maclaurin series was named after Colin Maclaurin , a Scottish mathematician, who published