
Conditioning (probability)

Beliefs depend on the available information. This idea is formalized in probability theory by conditioning. Conditional probabilities, conditional expectations, and conditional probability distributions are treated on three levels: discrete probabilities, probability density functions, and measure theory. Conditioning leads to a non-random result if the condition is completely specified; otherwise, if the condition is left random, the result of conditioning is also random.

Conditioning on the discrete level

Example: A fair coin is tossed 10 times; the random variable X is the number of heads in these 10 tosses, and Y is the number of heads in the first 3 tosses. Although Y emerges before X, it may happen that someone knows X but not Y.

Conditional probability

Given that X = 1, the conditional probability of the event Y = 0 is

$$\mathbb{P}(Y=0 \mid X=1) = \frac{\mathbb{P}(Y=0,\ X=1)}{\mathbb{P}(X=1)} = 0.7$$

More generally,

$$\begin{aligned} \mathbb{P}(Y=0 \mid X=x) &= \frac{\binom{7}{x}}{\binom{10}{x}} = \frac{7!\,(10-x)!}{(7-x)!\,10!}, && x = 0,1,2,3,4,5,6,7, \\ \mathbb{P}(Y=0 \mid X=x) &= 0, && x = 8,9,10. \end{aligned}$$

One may also treat the conditional probability as a random variable, a function of the random variable X, namely,

$$\mathbb{P}(Y=0 \mid X) = \begin{cases} \binom{7}{X} / \binom{10}{X} & X \leq 7, \\ 0 & X > 7. \end{cases}$$

The expectation of this random variable is equal to the (unconditional) probability,

$$\mathbb{E}(\mathbb{P}(Y=0 \mid X)) = \sum_x \mathbb{P}(Y=0 \mid X=x)\,\mathbb{P}(X=x) = \mathbb{P}(Y=0),$$

namely,

$$\sum_{x=0}^{7} \frac{\binom{7}{x}}{\binom{10}{x}} \cdot \frac{1}{2^{10}} \binom{10}{x} = \frac{1}{8},$$

which is an instance of the law of total probability E ( P ( A | X ) ) = P ( A ).

Thus, P ( Y = 0 | X = 1 ) may be treated as the value of the random variable P ( Y = 0 | X ) corresponding to X = 1. On the other hand, P ( Y = 0 | X = 1 ) is well-defined irrespective of other possible values of X.
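
The calculation above is easy to check by simulation. The following sketch (Python with NumPy; all names and sample sizes are illustrative choices, not from any source) estimates the conditional probability as an empirical frequency:

```python
import numpy as np

rng = np.random.default_rng(0)
tosses = rng.integers(0, 2, size=(1_000_000, 10))  # 1 = heads, fair coin
X = tosses.sum(axis=1)          # heads in all 10 tosses
Y = tosses[:, :3].sum(axis=1)   # heads in the first 3 tosses

print(np.mean(Y[X == 1] == 0))  # close to 0.7 = C(7,1)/C(10,1)
print(np.mean(Y == 0))          # close to P(Y=0) = 1/8
```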

Conditional expectation

Given that X = 1, the conditional expectation of the random variable Y is E ( Y | X = 1 ) = 3/10. More generally,

$$\mathbb{E}(Y \mid X=x) = \frac{3}{10}\,x, \qquad x = 0, \ldots, 10.$$

(In this example it appears to be a linear function, but in general it is nonlinear.) One may also treat the conditional expectation as a random variable, a function of the random variable X, namely,

$$\mathbb{E}(Y \mid X) = \frac{3}{10}\,X.$$

The expectation of this random variable is equal to the (unconditional) expectation of Y,

$$\mathbb{E}(\mathbb{E}(Y \mid X)) = \sum_x \mathbb{E}(Y \mid X=x)\,\mathbb{P}(X=x) = \mathbb{E}(Y),$$

namely,

$$\sum_{x=0}^{10} \frac{3}{10}\,x \cdot \frac{1}{2^{10}} \binom{10}{x} = \frac{3}{2},$$

or simply

$$\mathbb{E}\left(\frac{3}{10}\,X\right) = \frac{3}{10}\,\mathbb{E}(X) = \frac{3}{10} \cdot 5 = \frac{3}{2},$$

which is an instance of the law of total expectation E ( E ( Y | X ) ) = E ( Y ).
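
A similar simulation sketch (same setup and caveats as above) confirms both the formula E ( Y | X = x ) = 0.3 x and the law of total expectation:

```python
import numpy as np

rng = np.random.default_rng(1)
tosses = rng.integers(0, 2, size=(1_000_000, 10))
X = tosses.sum(axis=1)
Y = tosses[:, :3].sum(axis=1)

for x in (1, 4, 7):
    print(x, Y[X == x].mean())  # close to 0.3 * x
print(Y.mean())                 # close to E(Y) = 1.5
```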

The random variable E ( Y | X ) is the best predictor of Y given X. That is, it minimizes the mean square error E ( Y − f(X) )² on the class of all random variables of the form f(X). This class of random variables remains intact if X is replaced, say, with 2X. Thus, E ( Y | 2X ) = E ( Y | X ). It does not mean that E ( Y | 2X ) = (3/10) · 2X; rather, E ( Y | 2X ) = (3/20) · 2X = (3/10) X. In particular, E ( Y | 2X = 2 ) = 3/10. More generally, E ( Y | g(X) ) = E ( Y | X ) for every function g that is one-to-one on the set of all possible values of X. The values of X are irrelevant; what matters is the partition (denote it αX)

$$\Omega = \{X = x_1\} \uplus \{X = x_2\} \uplus \cdots$$

of the sample space Ω into disjoint sets {X = xn}. (Here x1, x2, ... are all possible values of X.) Given an arbitrary partition α of Ω, one may define the random variable E ( Y | α ). Still, E ( E ( Y | α ) ) = E ( Y ).

Conditional probability may be treated as a special case of conditional expectation. Namely, P ( A | X ) = E ( Y | X ) if Y is the indicator of A. Therefore the conditional probability also depends on the partition αX generated by X rather than on X itself; P ( A | g(X) ) = P (A | X) = P (A | α), α = αX = αg(X).

On the other hand, conditioning on an event B is well-defined, provided that P ( B ) ≠ 0, irrespective of any partition that may contain B as one of several parts.

Conditional distribution

Given X = x, the conditional distribution of Y is

$$\mathbb{P}(Y=y \mid X=x) = \frac{\binom{3}{y}\binom{7}{x-y}}{\binom{10}{x}} = \frac{\binom{x}{y}\binom{10-x}{3-y}}{\binom{10}{3}}$$

for 0 ≤ y ≤ min ( 3, x ). It is the hypergeometric distribution H ( x; 3, 7 ), or equivalently, H ( 3; x, 10-x ). The corresponding expectation 0.3 x, obtained from the general formula

$$n\,\frac{R}{R+W}$$

for H ( n; R, W ), is nothing but the conditional expectation E (Y | X = x) = 0.3 x.

Treating H ( X; 3, 7 ) as a random distribution (a random vector in the four-dimensional space of all measures on {0,1,2,3}), one may take its expectation, getting the unconditional distribution of Y: the binomial distribution Bin ( 3, 0.5 ). This fact amounts to the equality

$$\sum_{x=0}^{10} \mathbb{P}(Y=y \mid X=x)\,\mathbb{P}(X=x) = \mathbb{P}(Y=y) = \frac{1}{2^3}\binom{3}{y}$$

for y = 0,1,2,3; which is an instance of the law of total probability.
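
This identity can be checked numerically with standard distributions; the sketch below assumes SciPy's convention hypergeom(M, n, N) with population size M, n marked items, and N draws:

```python
from scipy.stats import hypergeom, binom

# P(Y = y | X = x): among 10 tosses, 3 are "marked" (the first three);
# the x heads form a random subset of the 10 tosses.
x = 4
for y in range(4):
    print(y, hypergeom.pmf(y, 10, 3, x))   # = C(3,y)C(7,x-y)/C(10,x)

# Mixing over the distribution of X recovers Bin(3, 0.5):
mix = sum(hypergeom.pmf(1, 10, 3, x) * binom.pmf(x, 10, 0.5)
          for x in range(11))
print(mix, binom.pmf(1, 3, 0.5))           # both 0.375
```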

Conditioning on the level of densities

Example. A point of the sphere x² + y² + z² = 1 is chosen at random according to the uniform distribution on the sphere.[1] The random variables X, Y, Z are the coordinates of the random point. The joint density of X, Y, Z does not exist (since the sphere is of zero volume), but the joint density fX,Y of X, Y exists,

$$f_{X,Y}(x,y) = \begin{cases} \dfrac{1}{2\pi\sqrt{1-x^2-y^2}} & \text{if } x^2+y^2 < 1, \\ 0 & \text{otherwise.} \end{cases}$$

(The density is non-constant because of a non-constant angle between the sphere and the plane.) The density of X may be calculated by integration,

$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\,\mathrm{d}y = \int_{-\sqrt{1-x^2}}^{\sqrt{1-x^2}} \frac{\mathrm{d}y}{2\pi\sqrt{1-x^2-y^2}}\,;$$

surprisingly, the result does not depend on x in (−1,1),

$$f_X(x) = \begin{cases} 0.5 & \text{for } -1 < x < 1, \\ 0 & \text{otherwise,} \end{cases}$$

which means that X is distributed uniformly on (−1,1). The same holds for Y and Z (and in fact, for aX + bY + cZ whenever a² + b² + c² = 1).
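
This is easy to verify numerically. A sketch using the standard normalized-Gaussian method for sampling the uniform distribution on the sphere (sample sizes are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
v = rng.normal(size=(1_000_000, 3))
v /= np.linalg.norm(v, axis=1, keepdims=True)  # uniform on the unit sphere
X = v[:, 0]

# If X ~ Uniform(-1, 1), then P(X <= t) = (t + 1) / 2.
for t in (-0.5, 0.0, 0.5):
    print(t, np.mean(X <= t))  # close to 0.25, 0.5, 0.75
```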

Example. A different method of calculating the marginal density function is shown below (here the point is chosen uniformly from the solid unit ball, with constant joint density):[2][3]

$$f_{X,Y,Z}(x,y,z) = \frac{3}{4\pi},$$

$$f_X(x) = \int_{-\sqrt{1-x^2}}^{\sqrt{1-x^2}} \int_{-\sqrt{1-x^2-y^2}}^{\sqrt{1-x^2-y^2}} \frac{3\,\mathrm{d}z\,\mathrm{d}y}{4\pi} = \frac{3\,(1-x^2)}{4}.$$

Conditional probability

Calculation

Given that X = 0.5, the conditional probability of the event Y ≤ 0.75 is the integral of the conditional density,

$$f_{Y|X=0.5}(y) = \frac{f_{X,Y}(0.5,\,y)}{f_X(0.5)} = \begin{cases} \dfrac{1}{\pi\sqrt{0.75-y^2}} & \text{for } -\sqrt{0.75} < y < \sqrt{0.75}, \\ 0 & \text{otherwise;} \end{cases}$$

$$\mathbb{P}(Y \leq 0.75 \mid X = 0.5) = \int_{-\infty}^{0.75} f_{Y|X=0.5}(y)\,\mathrm{d}y = \int_{-\sqrt{0.75}}^{0.75} \frac{\mathrm{d}y}{\pi\sqrt{0.75-y^2}} = \tfrac{1}{2} + \tfrac{1}{\pi}\arcsin\sqrt{0.75} = \tfrac{5}{6}.$$

More generally,

$$\mathbb{P}(Y \leq y \mid X = x) = \tfrac{1}{2} + \tfrac{1}{\pi}\arcsin\frac{y}{\sqrt{1-x^2}}$$

for all x and y such that −1 < x < 1 (otherwise the denominator fX(x) vanishes) and −√(1−x²) < y < √(1−x²) (otherwise the conditional probability degenerates to 0 or 1). One may also treat the conditional probability as a random variable, a function of the random variable X, namely,

$$\mathbb{P}(Y \leq y \mid X) = \begin{cases} 0 & \text{for } X^2 \geq 1-y^2 \text{ and } y < 0, \\ \frac{1}{2} + \frac{1}{\pi}\arcsin\dfrac{y}{\sqrt{1-X^2}} & \text{for } X^2 < 1-y^2, \\ 1 & \text{for } X^2 \geq 1-y^2 \text{ and } y > 0. \end{cases}$$

The expectation of this random variable is equal to the (unconditional) probability,

$$\mathbb{E}(\mathbb{P}(Y \leq y \mid X)) = \int_{-\infty}^{\infty} \mathbb{P}(Y \leq y \mid X=x)\,f_X(x)\,\mathrm{d}x = \mathbb{P}(Y \leq y),$$

which is an instance of the law of total probability E ( P ( A | X ) ) = P ( A ).

Interpretation

The conditional probability P ( Y ≤ 0.75 | X = 0.5 ) cannot be interpreted as P ( Y ≤ 0.75, X = 0.5 ) / P ( X = 0.5 ), since the latter gives 0/0. Accordingly, P ( Y ≤ 0.75 | X = 0.5 ) cannot be interpreted via empirical frequencies, since the exact value X = 0.5 has no chance to appear at random, not even once during an infinite sequence of independent trials.

The conditional probability can be interpreted as a limit,

$$\begin{aligned} \mathbb{P}(Y \leq 0.75 \mid X = 0.5) &= \lim_{\varepsilon \to 0+} \mathbb{P}(Y \leq 0.75 \mid 0.5-\varepsilon < X < 0.5+\varepsilon) \\ &= \lim_{\varepsilon \to 0+} \frac{\mathbb{P}(Y \leq 0.75,\ 0.5-\varepsilon < X < 0.5+\varepsilon)}{\mathbb{P}(0.5-\varepsilon < X < 0.5+\varepsilon)} \\ &= \lim_{\varepsilon \to 0+} \frac{\int_{0.5-\varepsilon}^{0.5+\varepsilon} \mathrm{d}x \int_{-\infty}^{0.75} f_{X,Y}(x,y)\,\mathrm{d}y}{\int_{0.5-\varepsilon}^{0.5+\varepsilon} f_X(x)\,\mathrm{d}x}. \end{aligned}$$
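
This limit is directly observable in a simulation: condition on the event {0.5 − ε < X < 0.5 + ε}, which has nonzero probability, and let ε shrink. A sketch (sample sizes and window widths are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
v = rng.normal(size=(4_000_000, 3))
v /= np.linalg.norm(v, axis=1, keepdims=True)  # uniform on the sphere
X, Y = v[:, 0], v[:, 1]

for eps in (0.1, 0.03, 0.01):
    sel = np.abs(X - 0.5) < eps
    print(eps, np.mean(Y[sel] <= 0.75))  # approaches 5/6 = 0.833...
```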

Conditional expectation

The conditional expectation E ( Y | X = 0.5 ) is of little interest; it vanishes just by symmetry. It is more interesting to calculate E ( |Z| | X = 0.5 ) treating |Z| as a function of X, Y:

$$\begin{aligned} |Z| &= h(X,Y) = \sqrt{1-X^2-Y^2}\,; \\ \mathbb{E}(|Z| \mid X=0.5) &= \int_{-\infty}^{\infty} h(0.5,\,y)\,f_{Y|X=0.5}(y)\,\mathrm{d}y = \int_{-\sqrt{0.75}}^{\sqrt{0.75}} \sqrt{0.75-y^2} \cdot \frac{\mathrm{d}y}{\pi\sqrt{0.75-y^2}} = \frac{2}{\pi}\sqrt{0.75}. \end{aligned}$$

More generally,

$$\mathbb{E}(|Z| \mid X=x) = \frac{2}{\pi}\sqrt{1-x^2}$$

for −1 < x < 1. One may also treat the conditional expectation as a random variable, a function of the random variable X, namely,

$$\mathbb{E}(|Z| \mid X) = \frac{2}{\pi}\sqrt{1-X^2}.$$

The expectation of this random variable is equal to the (unconditional) expectation of |Z|,

$$\mathbb{E}(\mathbb{E}(|Z| \mid X)) = \int_{-\infty}^{\infty} \mathbb{E}(|Z| \mid X=x)\,f_X(x)\,\mathrm{d}x = \mathbb{E}(|Z|),$$

namely,

$$\int_{-1}^{1} \frac{2}{\pi}\sqrt{1-x^2} \cdot \frac{\mathrm{d}x}{2} = \frac{1}{2},$$

which is an instance of the law of total expectation E ( E ( Y | X ) ) = E ( Y ).

The random variable E ( |Z| | X ) is the best predictor of |Z| given X. That is, it minimizes the mean square error E ( |Z| − f(X) )² on the class of all random variables of the form f(X). Similarly to the discrete case, E ( |Z| | g(X) ) = E ( |Z| | X ) for every measurable function g that is one-to-one on (−1,1).
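
The same window-conditioning sketch as above verifies the formula for E ( |Z| | X = x ):

```python
import numpy as np

rng = np.random.default_rng(4)
v = rng.normal(size=(4_000_000, 3))
v /= np.linalg.norm(v, axis=1, keepdims=True)
X, Z = v[:, 0], v[:, 2]

for x in (0.0, 0.5, 0.9):
    sel = np.abs(X - x) < 0.01
    print(x, np.abs(Z[sel]).mean(),        # empirical E(|Z| | X near x)
          2 / np.pi * np.sqrt(1 - x**2))   # the formula above
```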

Conditional distribution

Given X = x, the conditional distribution of Y, given by the density fY|X=x(y), is the (rescaled) arcsin distribution; its cumulative distribution function is

$$F_{Y|X=x}(y) = \mathbb{P}(Y \leq y \mid X=x) = \frac{1}{2} + \frac{1}{\pi}\arcsin\frac{y}{\sqrt{1-x^2}}$$

for all x and y such that x² + y² < 1. The corresponding expectation of h(x,Y) is nothing but the conditional expectation E ( h(X,Y) | X = x ). The mixture of these conditional distributions, taken for all x (according to the distribution of X), is the unconditional distribution of Y. This fact amounts to the equalities

$$\int_{-\infty}^{\infty} f_{Y|X=x}(y)\,f_X(x)\,\mathrm{d}x = f_Y(y), \qquad \int_{-\infty}^{\infty} F_{Y|X=x}(y)\,f_X(x)\,\mathrm{d}x = F_Y(y),$$

the latter being the instance of the law of total probability mentioned above.

What conditioning is not

On the discrete level, conditioning is possible only if the condition is of nonzero probability (one cannot divide by zero). On the level of densities, conditioning on X = x is possible even though P ( X = x ) = 0. This success may create the illusion that conditioning is always possible. Unfortunately, it is not, for several reasons presented below.

Geometric intuition: caution

The result P ( Y ≤ 0.75 | X = 0.5 ) = 5/6, mentioned above, is geometrically evident in the following sense. The points (x,y,z) of the sphere x² + y² + z² = 1, satisfying the condition x = 0.5, are a circle y² + z² = 0.75 of radius √0.75 on the plane x = 0.5. The inequality y ≤ 0.75 holds on an arc. The length of the arc is 5/6 of the length of the circle, which is why the conditional probability is equal to 5/6.

This successful geometric explanation may create the illusion that the following question is trivial.

A point of a given sphere is chosen at random (uniformly). Given that the point lies on a given plane, what is its conditional distribution?

It may seem evident that the conditional distribution must be uniform on the given circle (the intersection of the given sphere and the given plane). Sometimes it really is, but in general it is not. In particular, Z is distributed uniformly on (−1, +1) and independent of the ratio Y/X; thus, P ( Z ≤ 0.5 | Y/X ) = 0.75. On the other hand, the inequality z ≤ 0.5 holds on an arc of the circle x² + y² + z² = 1, y = cx (for any given c). The length of the arc is 2/3 of the length of the circle. However, the conditional probability is 3/4, not 2/3. This is a manifestation of the classical Borel paradox.[4][5]

Appeals to symmetry can be misleading if not formalized as invariance arguments.

— Pollard[6]
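
The paradox can also be seen empirically. Conditioning on the window {|Y/X − c| < ε} (one particular choice of shrinking events, which is exactly where the ambiguity lives) yields 3/4, not the 2/3 suggested by arc length. A sketch:

```python
import numpy as np

rng = np.random.default_rng(5)
v = rng.normal(size=(8_000_000, 3))
v /= np.linalg.norm(v, axis=1, keepdims=True)  # uniform on the sphere
X, Y, Z = v[:, 0], v[:, 1], v[:, 2]

c = 1.0                                # the plane y = c x
sel = np.abs(Y / X - c) < 0.01
print(np.mean(Z[sel] <= 0.5))          # close to 0.75, not 2/3
```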

Another example. A random rotation of the three-dimensional space is a rotation by a random angle around a random axis. Geometric intuition suggests that the angle is independent of the axis and distributed uniformly. However, the latter is wrong; small values of the angle are less probable.
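This claim can be checked via the quaternion representation: a uniform (Haar) random rotation corresponds to a uniform unit quaternion, and the rotation angle is 2·arccos |w|, where w is the scalar component. A sketch (the density (1 − cos θ)/π on [0, π] is the known answer; the check below only uses the quaternion fact):

```python
import numpy as np

rng = np.random.default_rng(6)
q = rng.normal(size=(1_000_000, 4))
q /= np.linalg.norm(q, axis=1, keepdims=True)  # uniform on S^3
theta = 2 * np.arccos(np.abs(q[:, 0]))         # rotation angle in [0, pi]

# Fraction of angles below 0.1*pi, relative to the uniform prediction 0.1:
print(np.mean(theta < 0.1 * np.pi) / 0.1)      # about 0.016, far below 1
```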

The limiting procedure

Given an event B of zero probability, the formula P ( A | B ) = P ( A ∩ B ) / P ( B ) is useless. However, one can try P ( A | B ) = lim P ( A ∩ Bn ) / P ( Bn ) as n → ∞, for an appropriate sequence of events Bn of nonzero probability such that Bn ↓ B (that is, B1 ⊃ B2 ⊃ ⋯ and B1 ∩ B2 ∩ ⋯ = B). One example is given above. Two more examples are the Brownian bridge and the Brownian excursion.

In the latter two examples the law of total probability is irrelevant, since only a single event (the condition) is given. By contrast, in the example above the law of total probability applies, since the event X = 0.5 is included in a family of events X = x, where x runs over (−1,1), and these events form a partition of the probability space.

In order to avoid paradoxes (such as the Borel paradox), the following important distinction should be taken into account. If a given event is of nonzero probability, then conditioning on it is well-defined (irrespective of any other events), as was noted above. By contrast, if the given event is of zero probability, then conditioning on it is ill-defined unless some additional input is provided. A wrong choice of this additional input leads to wrong conditional probabilities (expectations, distributions). In this sense, "the concept of a conditional probability with regard to an isolated hypothesis whose probability equals 0 is inadmissible." (Kolmogorov[6])

The additional input may be (a) a symmetry (invariance group); (b) a sequence of events Bn such that Bn ↓ B, P ( Bn ) > 0; (c) a partition containing the given event. Measure-theoretic conditioning (below) investigates case (c), and discloses its relation to (b) in general, and to (a) when applicable.

Some events of zero probability are beyond the reach of conditioning. An example: let Xn be independent random variables distributed uniformly on (0,1), and B the event "Xn → 0 as n → ∞"; what about P ( Xn < 0.5 | B ) ? Does it tend to 1, or not? Another example: let X be a random variable distributed uniformly on (0,1), and B the event "X is a rational number"; what about P ( X = 1/n | B ) ? The only answer is that, once again,

the concept of a conditional probability with regard to an isolated hypothesis whose probability equals 0 is inadmissible.

— Kolmogorov[6]

Conditioning on the level of measure theory

Example. Let Y be a random variable distributed uniformly on (0,1), and X = f(Y) where f is a given function. Two cases are treated below: f = f1 and f = f2, where f1 is the continuous piecewise-linear function

$$f_1(y) = \begin{cases} 3y & \text{for } 0 \leq y \leq \tfrac{1}{3}, \\ 1.5\,(1-y) & \text{for } \tfrac{1}{3} \leq y \leq \tfrac{2}{3}, \\ 0.5 & \text{for } \tfrac{2}{3} \leq y \leq 1, \end{cases}$$

and f2 is the Weierstrass function.

Geometric intuition: caution

Given X = 0.75, two values of Y are possible, 0.25 and 0.5. It may seem evident that both values are of conditional probability 0.5 just because one point is congruent to another point. However, this is an illusion; see below.

Conditional probability

The conditional probability P ( Y ≤ 1/3 | X ) may be defined as the best predictor of the indicator

$$I = \begin{cases} 1 & \text{if } Y \leq \tfrac{1}{3}, \\ 0 & \text{otherwise,} \end{cases}$$

given X. That is, it minimizes the mean square error E ( I − g(X) )² on the class of all random variables of the form g(X).

In the case f = f1 the corresponding function g = g1 may be calculated explicitly,[details 1]

$$g_1(x) = \begin{cases} 1 & \text{for } 0 < x < 0.5, \\ 0 & \text{for } x = 0.5, \\ \tfrac{1}{3} & \text{for } 0.5 < x < 1. \end{cases}$$

Alternatively, the limiting procedure may be used,

$$g_1(x) = \lim_{\varepsilon \to 0+} \mathbb{P}(Y \leq \tfrac{1}{3} \mid x-\varepsilon \leq X \leq x+\varepsilon),$$

giving the same result.

Thus, P ( Y ≤ 1/3 | X ) = g1 (X). The expectation of this random variable is equal to the (unconditional) probability, E ( P ( Y ≤ 1/3 | X ) ) = P ( Y ≤ 1/3 ), namely,

$$1 \cdot \mathbb{P}(X < 0.5) + 0 \cdot \mathbb{P}(X = 0.5) + \tfrac{1}{3} \cdot \mathbb{P}(X > 0.5) = 1 \cdot \tfrac{1}{6} + 0 \cdot \tfrac{1}{3} + \tfrac{1}{3} \cdot \left(\tfrac{1}{6} + \tfrac{1}{3}\right) = \tfrac{1}{3},$$

which is an instance of the law of total probability E ( P ( A | X ) ) = P ( A ).

In the case f = f2 the corresponding function g = g2 probably cannot be calculated explicitly. Nevertheless it exists, and can be computed numerically. Indeed, the space L2 (Ω) of all square integrable random variables is a Hilbert space; the indicator I is a vector of this space; and random variables of the form g (X) are a (closed, linear) subspace. The orthogonal projection of this vector to this subspace is well-defined. It can be computed numerically, using finite-dimensional approximations to the infinite-dimensional Hilbert space.
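
The following sketch illustrates this numerical approach in the case f = f1, where the explicit answer g1 is known for comparison; piecewise-constant functions of X on a fine grid stand in for the finite-dimensional subspace (grid size and sample size are illustrative choices):

```python
import numpy as np

def f1(y):
    return np.where(y <= 1/3, 3 * y,
           np.where(y <= 2/3, 1.5 * (1 - y), 0.5))

rng = np.random.default_rng(7)
Y = rng.random(1_000_000)
X = f1(Y)
I = (Y <= 1/3).astype(float)

# Projection onto piecewise-constant functions of X: within each bin,
# the best mean-square constant is the bin average of the indicator.
bins = np.linspace(0.0, 1.0, 201)
idx = np.digitize(X, bins)
sums = np.bincount(idx, weights=I, minlength=len(bins) + 1)
counts = np.bincount(idx, minlength=len(bins) + 1)

for x in (0.25, 0.5, 0.75):
    k = np.digitize(x, bins)
    print(x, sums[k] / counts[k])  # close to g1(x): 1, 0, 1/3
```

For x = 0.5 the bin is dominated by the atom P ( X = 0.5 ) = 1/3, so the estimate is near g1(0.5) = 0; replacing f1 by a numerical approximation of f2 would give an estimate of g2 in the same way.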

Once again, the expectation of the random variable P ( Y ≤ 1/3 | X ) = g2 (X) is equal to the (unconditional) probability, E ( P ( Y ≤ 1/3 | X ) ) = P ( Y ≤ 1/3 ), namely,

$$\int_0^1 g_2(f_2(y))\,\mathrm{d}y = \tfrac{1}{3}.$$

However, the Hilbert space approach treats g2 as an equivalence class of functions rather than an individual function. Measurability of g2 is ensured, but continuity (or even Riemann integrability) is not. The value g2 (0.5) is determined uniquely, since the point 0.5 is an atom of the distribution of X. Other values x are not atoms, thus, corresponding values g2 (x) are not determined uniquely. Once again, "the concept of a conditional probability with regard to an isolated hypothesis whose probability equals 0 is inadmissible." (Kolmogorov[6])

Alternatively, the same function g (be it g1 or g2) may be defined as the Radon–Nikodym derivative

$$g = \frac{\mathrm{d}\nu}{\mathrm{d}\mu},$$

where measures μ, ν are defined by

$$\begin{aligned} \mu(B) &= \mathbb{P}(X \in B), \\ \nu(B) &= \mathbb{P}(X \in B,\ Y \leq \tfrac{1}{3}), \end{aligned}$$

for all Borel sets B ⊂ ℝ. That is, μ is the (unconditional) distribution of X, while ν is one third of its conditional distribution,

$$\nu(B) = \mathbb{P}(X \in B \mid Y \leq \tfrac{1}{3})\,\mathbb{P}(Y \leq \tfrac{1}{3}) = \tfrac{1}{3}\,\mathbb{P}(X \in B \mid Y \leq \tfrac{1}{3}).$$

Both approaches (via the Hilbert space, and via the Radon–Nikodym derivative) treat g as an equivalence class of functions; two functions g and g′ are treated as equivalent, if g (X) = g′ (X) almost surely. Accordingly, the conditional probability P ( Y ≤ 1/3 | X ) is treated as an equivalence class of random variables; as usual, two random variables are treated as equivalent if they are equal almost surely.

Conditional expectation

The conditional expectation E ( Y | X ) may be defined as the best predictor of Y given X. That is, it minimizes the mean square error E ( Y − h(X) )² on the class of all random variables of the form h(X).

In the case f = f1 the corresponding function h = h1 may be calculated explicitly,[details 2]

$$h_1(x) = \begin{cases} \dfrac{x}{3} & \text{for } 0 < x < \tfrac{1}{2}, \\[4pt] \dfrac{5}{6} & \text{for } x = \tfrac{1}{2}, \\[4pt] \dfrac{2-x}{3} & \text{for } \tfrac{1}{2} < x < 1. \end{cases}$$

Alternatively, the limiting procedure may be used,

$$h_1(x) = \lim_{\varepsilon \to 0+} \mathbb{E}(Y \mid x-\varepsilon \leq X \leq x+\varepsilon),$$

giving the same result.

Thus, E ( Y | X ) = h1 (X). The expectation of this random variable is equal to the (unconditional) expectation, E ( E ( Y | X ) ) = E ( Y ), namely,

$$\int_0^1 h_1(f_1(y))\,\mathrm{d}y = \int_0^{1/6} \frac{3y}{3}\,\mathrm{d}y + \int_{1/6}^{1/3} \frac{2-3y}{3}\,\mathrm{d}y + \int_{1/3}^{2/3} \frac{2-\frac{3}{2}(1-y)}{3}\,\mathrm{d}y + \int_{2/3}^{1} \frac{5}{6}\,\mathrm{d}y = \frac{1}{2},$$

which is an instance of the law of total expectation E ( E ( Y | X ) ) = E ( Y ).

In the case f = f2 the corresponding function h = h2 probably cannot be calculated explicitly. Nevertheless it exists, and can be computed numerically in the same way as g2 above, as the orthogonal projection in the Hilbert space. The law of total expectation holds, since the projection does not change the scalar product with the constant 1, which belongs to the subspace.

Alternatively, the same function h (be it h1 or h2) may be defined as the Radon–Nikodym derivative

$$h = \frac{\mathrm{d}\nu}{\mathrm{d}\mu},$$

where measures μ, ν are defined by

$$\begin{aligned} \mu(B) &= \mathbb{P}(X \in B), \\ \nu(B) &= \mathbb{E}\big(Y;\ X \in B\big), \end{aligned}$$

for all Borel sets B ⊂ ℝ. Here E ( Y; A ) is the restricted expectation, not to be confused with the conditional expectation E ( Y | A ) = E ( Y; A ) / P ( A ).

Conditional distribution

In the case f = f1 the conditional cumulative distribution function may be calculated explicitly, similarly to g1. The limiting procedure gives:

$$F_{Y|X=\frac{3}{4}}(y) = \mathbb{P}\left(Y \leq y \mid X = \tfrac{3}{4}\right) = \lim_{\varepsilon \to 0+} \mathbb{P}\left(Y \leq y \mid \tfrac{3}{4}-\varepsilon \leq X \leq \tfrac{3}{4}+\varepsilon\right) = \begin{cases} 0 & \text{for } -\infty < y < \tfrac{1}{4}, \\ \tfrac{1}{6} & \text{for } y = \tfrac{1}{4}, \\ \tfrac{1}{3} & \text{for } \tfrac{1}{4} < y < \tfrac{1}{2}, \\ \tfrac{2}{3} & \text{for } y = \tfrac{1}{2}, \\ 1 & \text{for } \tfrac{1}{2} < y < \infty, \end{cases}$$

which cannot be correct, since a cumulative distribution function must be right-continuous!

This paradoxical result is explained by measure theory as follows. For a given y the corresponding FY|X=x(y) = P ( Y ≤ y | X = x ) is well-defined (via the Hilbert space or the Radon–Nikodym derivative) as an equivalence class of functions (of x). Treated as a function of y for a given x it is ill-defined unless some additional input is provided. Namely, a function (of x) must be chosen within every (or at least almost every) equivalence class. A wrong choice leads to wrong conditional cumulative distribution functions.

A right choice can be made as follows. First, FY|X=x(y) = P ( Y ≤ y | X = x ) is considered for rational numbers y only. (Any other dense countable set may be used equally well.) Thus, only a countable set of equivalence classes is used; all choices of functions within these classes are mutually equivalent, and the corresponding function of rational y is well-defined (for almost every x). Second, the function is extended from rational numbers to real numbers by right continuity.

In general the conditional distribution is defined for almost all x (according to the distribution of X), but sometimes the result is continuous in x, in which case individual values are acceptable. In the considered example this is the case; the correct result for x = 0.75,

$$F_{Y|X=\frac{3}{4}}(y) = \mathbb{P}\left(Y \leq y \mid X = \tfrac{3}{4}\right) = \begin{cases} 0 & \text{for } -\infty < y < \tfrac{1}{4}, \\ \tfrac{1}{3} & \text{for } \tfrac{1}{4} \leq y < \tfrac{1}{2}, \\ 1 & \text{for } \tfrac{1}{2} \leq y < \infty, \end{cases}$$

shows that the conditional distribution of Y given X = 0.75 consists of two atoms, at 0.25 and 0.5, of probabilities 1/3 and 2/3 respectively.
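These atoms and their weights also emerge from the limiting procedure in a simulation (a sketch; window widths are arbitrary small numbers):

```python
import numpy as np

def f1(y):
    return np.where(y <= 1/3, 3 * y,
           np.where(y <= 2/3, 1.5 * (1 - y), 0.5))

rng = np.random.default_rng(8)
Y = rng.random(5_000_000)
X = f1(Y)

near = Y[np.abs(X - 0.75) < 0.002]            # condition on X near 0.75
print(np.mean(np.abs(near - 0.25) < 0.01))    # close to 1/3
print(np.mean(np.abs(near - 0.50) < 0.01))    # close to 2/3
```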

Similarly, the conditional distribution may be calculated for all x in (0, 0.5) or (0.5, 1).

The value x = 0.5 is an atom of the distribution of X, thus, the corresponding conditional distribution is well-defined and may be calculated by elementary means (the denominator does not vanish); the conditional distribution of Y given X = 0.5 is uniform on (2/3, 1). Measure theory leads to the same result.

The mixture of all conditional distributions is the (unconditional) distribution of Y.

The conditional expectation E ( Y | X = x ) is nothing but the expectation with respect to the conditional distribution.

In the case f = f2 the corresponding FY|X=x(y) = P ( Y ≤ y | X = x ) probably cannot be calculated explicitly. For a given y it is well-defined (via the Hilbert space or the Radon–Nikodym derivative) as an equivalence class of functions (of x). The right choice of functions within these equivalence classes may be made as above; it leads to correct conditional cumulative distribution functions, thus, conditional distributions. In general, conditional distributions need not be atomic or absolutely continuous (nor mixtures of both types). Probably, in the considered example they are singular (like the Cantor distribution).

Once again, the mixture of all conditional distributions is the (unconditional) distribution, and the conditional expectation is the expectation with respect to the conditional distribution.

Technical details

  1. ^ Proof:
     $$\begin{aligned} \mathbb{E}(I - g(X))^2 &= \int_0^{1/3} (1 - g(3y))^2\,\mathrm{d}y + \int_{1/3}^{2/3} g^2(1.5(1-y))\,\mathrm{d}y + \int_{2/3}^1 g^2(0.5)\,\mathrm{d}y \\ &= \int_0^1 (1-g(x))^2\,\frac{\mathrm{d}x}{3} + \int_{0.5}^1 g^2(x)\,\frac{\mathrm{d}x}{1.5} + \frac{1}{3}\,g^2(0.5) \\ &= \frac{1}{3}\int_0^{0.5} (1-g(x))^2\,\mathrm{d}x + \frac{1}{3}\,g^2(0.5) + \frac{1}{3}\int_{0.5}^1 \left((1-g(x))^2 + 2g^2(x)\right)\mathrm{d}x\,; \end{aligned}$$
     it remains to note that (1 − a)² + 2a² is minimal at a = 1/3.
  2. ^ Proof:
     $$\begin{aligned} \mathbb{E}(Y - h_1(X))^2 &= \int_0^1 \left(y - h_1(f_1(y))\right)^2\,\mathrm{d}y \\ &= \int_0^{1/3} (y - h_1(3y))^2\,\mathrm{d}y + \int_{1/3}^{2/3} \left(y - h_1(1.5(1-y))\right)^2\,\mathrm{d}y + \int_{2/3}^1 \left(y - h_1(\tfrac{1}{2})\right)^2\,\mathrm{d}y \\ &= \int_0^1 \left(\frac{x}{3} - h_1(x)\right)^2 \frac{\mathrm{d}x}{3} + \int_{1/2}^1 \left(1 - \frac{x}{1.5} - h_1(x)\right)^2 \frac{\mathrm{d}x}{1.5} + \frac{1}{3}\,h_1^2(\tfrac{1}{2}) - \frac{5}{9}\,h_1(\tfrac{1}{2}) + \frac{19}{81} \\ &= \frac{1}{3}\int_0^{1/2} \left(h_1(x) - \frac{x}{3}\right)^2 \mathrm{d}x + \frac{1}{3}\,h_1^2(\tfrac{1}{2}) - \frac{5}{9}\,h_1(\tfrac{1}{2}) + \frac{19}{81} + \frac{1}{3}\int_{1/2}^1 \left(\left(h_1(x) - \frac{x}{3}\right)^2 + 2\left(h_1(x) - 1 + \frac{2x}{3}\right)^2\right) \mathrm{d}x\,; \end{aligned}$$
     it remains to note that
     $$\left(a - \frac{x}{3}\right)^2 + 2\left(a - 1 + \frac{2x}{3}\right)^2$$
     is minimal at a = (2 − x)/3, and (1/3) a² − (5/9) a is minimal at a = 5/6.

See also

  • Conditional probability
  • Conditional expectation
  • Conditional probability distribution
  • Joint probability distribution
  • Borel's paradox
  • Regular conditional probability
  • Disintegration theorem
  • Law of total variance
  • Law of total cumulance

Notes

  1. ^ "Mathematica/Uniform Spherical Distribution - Wikibooks, open books for an open world". en.wikibooks.org. Retrieved 2018-10-27.
  2. ^ Buchanan, K.; Huff, G. H. (July 2011). "A comparison of geometrically bound random arrays in euclidean space". 2011 IEEE International Symposium on Antennas and Propagation (APSURSI). pp. 2008–2011. doi:10.1109/APS.2011.5996900. ISBN 978-1-4244-9563-4. S2CID 10446533.
  3. ^ Buchanan, K.; Flores, C.; Wheeland, S.; Jensen, J.; Grayson, D.; Huff, G. (May 2017). "Transmit beamforming for radar applications using circularly tapered random arrays". 2017 IEEE Radar Conference (RadarConf). pp. 0112–0117. doi:10.1109/RADAR.2017.7944181. ISBN 978-1-4673-8823-8. S2CID 38429370.
  4. ^ Pollard 2002, Sect. 5.5, Example 17 on page 122.
  5. ^ Durrett 1996, Sect. 4.1(a), Example 1.6 on page 224.
  6. ^ a b c d Pollard 2002, Sect. 5.5, page 122.

References

  • Durrett, Richard (1996), Probability: theory and examples (Second ed.)
  • Pollard, David (2002), A user's guide to measure theoretic probability, Cambridge University Press
  • Draheim, Dirk (2017), Generalized Jeffrey Conditionalization (A Frequentist Semantics of Partial Conditionalization), Springer
