
EE5110: Probability Foundations for Electrical Engineers July-November 2015

Lecture 23: Conditional Expectation


Lecturer: Dr. Krishna Jagannathan
Scribe: Sudharsan Parthasarathy

Let X and Y be discrete random variables with joint probability mass function p_{X,Y}(x, y). The conditional probability mass function was defined in previous lectures as

    p_{X|Y}(x|y) = p_{X,Y}(x, y) / p_Y(y),

assuming p_Y(y) > 0. Let us define

    E[X|Y = y] = Σ_x x p_{X|Y}(x|y).

The quantity ψ(y) = E[X|Y = y] changes with y. The random variable ψ(Y) is the conditional expectation of X given Y and is denoted E[X|Y].
Let X and Y be continuous random variables with joint probability density function f_{X,Y}(x, y). Recall the conditional probability density function

    f_{X|Y}(x|y) = f_{X,Y}(x, y) / f_Y(y),

defined when f_Y(y) > 0. Define

    E[X|Y = y] = ∫ x f_{X|Y}(x|y) dx.

As before, the random variable ψ(Y), where ψ(y) = E[X|Y = y], is the conditional expectation of X given Y and is denoted E[X|Y].

Example 1: Find E[Y|X] if the joint probability density function is f_{X,Y}(x, y) = 1/x for 0 < y ≤ x ≤ 1.

Solution: The marginal density of X is

    f_X(x) = ∫_0^x (1/x) dy = 1,  0 ≤ x ≤ 1.

Hence

    f_{Y|X}(y|x) = f_{X,Y}(x, y) / f_X(x) = 1/x,  0 < y ≤ x,

    E[Y|X = x] = ∫_0^x y f_{Y|X}(y|x) dy = ∫_0^x (y/x) dy = x/2.

The conditional expectation is therefore E[Y|X] = X/2.
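As a quick numerical sanity check (a minimal Python sketch, not part of the original notes): since f_X(x) = 1 on (0, 1) and f_{Y|X}(y|x) = 1/x on (0, x), one can sample X uniformly on (0, 1), then Y given X = x uniformly on (0, x), and compare the empirical conditional mean near a few values of x with x/2. The bin width and sample size below are arbitrary choices.

```python
import numpy as np

# Monte Carlo check of Example 1: E[Y | X = x] = x / 2.
# Sampling scheme follows from f_X(x) = 1 on (0, 1) and f_{Y|X}(y|x) = 1/x on (0, x).
rng = np.random.default_rng(0)
n = 1_000_000
x = rng.uniform(0.0, 1.0, size=n)   # X ~ Uniform(0, 1)
y = rng.uniform(0.0, x)             # Y | X = x ~ Uniform(0, x)

# Estimate E[Y | X near x0] by averaging Y over a thin slab around x0.
for x0 in (0.25, 0.5, 0.9):
    mask = np.abs(x - x0) < 0.01
    print(f"x0 = {x0}: empirical {y[mask].mean():.4f} vs. theory x0/2 = {x0 / 2:.4f}")
```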

Theorem 23.1 Law of Iterated Expectation:

E[Y] = E_X[E[Y|X]].


Proof: We prove the result for discrete random variables. We have

    E_X[E[Y|X]] = Σ_x p_X(x) E[Y|X = x]
                = Σ_x p_X(x) Σ_y y p_{Y|X}(y|x)
                = Σ_x p_X(x) Σ_y y p_{X,Y}(x, y) / p_X(x)
                = Σ_{x,y} y p_{X,Y}(x, y)
                = Σ_y y Σ_x p_{X,Y}(x, y)
                = Σ_y y p_Y(y)
                = E[Y].

The law of iterated expectation for jointly continuous random variables can be proved similarly.
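To see the computation in the proof concretely, here is a minimal sketch that evaluates both sides of Theorem 23.1 exactly for a small joint PMF (the table in the code is made up for illustration, it is not from the notes).

```python
from fractions import Fraction as F

# Exact check of E[Y] = E_X[E[Y|X]] on a small, made-up joint PMF.
pmf = {  # (x, y) -> p_{X,Y}(x, y)
    (0, 0): F(1, 8), (0, 1): F(3, 8),
    (1, 0): F(1, 4), (1, 1): F(1, 4),
}

# Marginal PMF of X.
p_X = {}
for (x, y), p in pmf.items():
    p_X[x] = p_X.get(x, F(0)) + p

# Left-hand side: E[Y] directly from the joint PMF.
E_Y = sum(y * p for (x, y), p in pmf.items())

# Right-hand side: E_X[E[Y|X]] = Σ_x p_X(x) E[Y|X = x].
def cond_exp_Y_given(x):
    return sum(y * p / p_X[x] for (xx, y), p in pmf.items() if xx == x)

E_E_Y_given_X = sum(p_X[x] * cond_exp_Y_given(x) for x in p_X)

print(E_Y, E_E_Y_given_X)   # both print 5/8
```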
Application of the law of iterated expectation:

Let S_N = Σ_{i=1}^{N} X_i, where {X_1, ..., X_N} are independent and identically distributed random variables and N is a non-negative integer-valued random variable independent of X_i for all i. From the law of iterated expectation, E[S_N] = E_N[E[S_N|N]]. Consider

    E[S_N | N = n] = E[ Σ_{i=1}^{N} X_i | N = n ]          (23.1)
                   = E[ Σ_{i=1}^{n} X_i | N = n ].         (23.2)

Since N is independent of the X_i, E[ Σ_{i=1}^{n} X_i | N = n ] = E[ Σ_{i=1}^{n} X_i ] = n E[X].

Thus E[S_N|N] = N E[X], and therefore E[S_N] = E[N] E[X].
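A quick Monte Carlo sketch can confirm E[S_N] = E[N] E[X]. The particular choices below (N Poisson with mean 3, X_i exponential with mean 2, so that E[S_N] should be about 6) are arbitrary and only for illustration.

```python
import numpy as np

# Monte Carlo check of E[S_N] = E[N] E[X] for a random sum of i.i.d. terms.
# Distributions are arbitrary illustrative choices: N ~ Poisson(3), X_i ~ Exp(mean 2).
rng = np.random.default_rng(1)
trials = 100_000
mean_n, mean_x = 3.0, 2.0

n_samples = rng.poisson(mean_n, size=trials)                 # N, independent of the X_i
sums = np.array([rng.exponential(mean_x, size=n).sum() for n in n_samples])

print("empirical E[S_N]:", sums.mean())      # close to 6
print("E[N] * E[X]     :", mean_n * mean_x)  # exactly 6
```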

Theorem 23.2 Generalized form of Law of Iterated Expectation:


For any measurable function g with E[|g(X)|] < ∞,

    E[Y g(X)] = E[E[Y|X] g(X)].



Proof: We prove the result for discrete random variables. We have

    E[E[Y|X] g(X)] = Σ_x p_X(x) E[Y|X = x] g(x)
                   = Σ_x p_X(x) g(x) Σ_y y p_{Y|X}(y|x)
                   = Σ_x p_X(x) g(x) Σ_y y p_{X,Y}(x, y) / p_X(x)
                   = Σ_{x,y} y g(x) p_{X,Y}(x, y)
                   = E[Y g(X)].

Exercise: Prove E[Y g(X)] = E[E[Y|X] g(X)] when X and Y are jointly continuous random variables.

This theorem implies that

    E[(Y − E[Y|X]) g(X)] = 0.                              (23.3)

The conditional expectation E[Y|X] can be viewed as an estimator of Y given X, and Y − E[Y|X] is then the estimation error of this estimator. The above theorem implies that the estimation error is uncorrelated with every function of X.
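The orthogonality property (23.3) is easy to check numerically. The following sketch uses an arbitrarily chosen model (not from the notes): Y = X^2 + W with W independent zero-mean noise, so that E[Y|X] = X^2 is known exactly. The sample average of (Y − E[Y|X]) g(X) should then be close to zero for any choice of g.

```python
import numpy as np

# Numerical illustration of the orthogonality property (23.3):
# E[(Y - E[Y|X]) g(X)] = 0 for any (integrable) function g of X.
# Illustrative model: Y = X**2 + W, W ~ N(0, 0.25) independent of X, so E[Y|X] = X**2.
rng = np.random.default_rng(2)
n = 1_000_000
x = rng.normal(size=n)
y = x**2 + rng.normal(scale=0.5, size=n)

error = y - x**2                                   # Y - E[Y|X]
for name, g in [("g(t)=t", lambda t: t),
                ("g(t)=sin(t)", np.sin),
                ("g(t)=t^3", lambda t: t**3)]:
    print(f"{name:12s} sample mean of (Y - E[Y|X]) g(X) = {np.mean(error * g(x)):+.4f}")
```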
Observe that in this lecture we have not dealt with conditional expectations in a general framework. Instead, we have defined them separately for discrete and for jointly continuous random variables. In a more general development of the topic, (23.3) is in fact taken as the defining property of the conditional expectation. Specifically, one can prove the existence and uniqueness (up to sets of measure zero) of a σ(X)-measurable random variable ψ(X) that satisfies E[(ψ(X) − Y) g(X)] = 0 for any g(X). Such a ψ(X) is then defined as the conditional expectation E[Y|X]. For a more detailed discussion, refer to Chapter 9 of [1].
Minimum Mean Square Error Estimator:
We have seen that E[Y |X] is an estimator of Y given X. In the next theorem we will prove that this is indeed
an optimal estimate of Y given X, in the sense that the conditional expectation minimizes the mean-squared
error.

Theorem 23.3 If E[Y^2] < ∞, then for any measurable function g,

    E[(Y − E[Y|X])^2] ≤ E[(Y − g(X))^2].

Proof:

    E[(Y − g(X))^2] = E[(Y − E[Y|X])^2] + E[(E[Y|X] − g(X))^2] + 2 E[(Y − E[Y|X])(E[Y|X] − g(X))]
                    ≥ E[(Y − E[Y|X])^2].

This is because E[(Y − E[Y|X])(E[Y|X] − g(X))] = 0 (by (23.3)) and E[(E[Y|X] − g(X))^2] ≥ 0. Indeed, from (23.3) we know that E[(E[Y|X] − Y) φ(X)] = 0 for any measurable function φ(X) of X; here take φ(X) = E[Y|X] − g(X).
From (23.3) we observe that the estimation error Y − E[Y|X] is orthogonal to every measurable function of X. In the Hilbert space of square-integrable random variables, E[Y|X] can be viewed as the projection of Y onto the subspace L^2(σ(X)) of σ(X)-measurable random variables. As depicted in Figure 23.1, it is quite intuitive that the conditional expectation (which is the projection of Y onto this subspace) minimizes the mean-squared error among all random variables in the subspace L^2(σ(X)).
[Figure 23.1: Geometric interpretation of MMSE — Y is projected onto the subspace L^2(σ(X)), and the projection is E[Y|X].]
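As a numerical illustration of Theorem 23.3 (again a sketch using the same arbitrarily chosen model Y = X^2 + W as above, not part of the notes), one can compare the mean-squared error of E[Y|X] with that of a few other estimators g(X); the conditional expectation attains the smallest value, equal to the noise variance.

```python
import numpy as np

# Illustration of Theorem 23.3: E[(Y - E[Y|X])^2] <= E[(Y - g(X))^2].
# Illustrative model: Y = X**2 + W, W ~ N(0, 0.25) independent of X,
# so E[Y|X] = X**2 and the minimum achievable MSE is Var(W) = 0.25.
rng = np.random.default_rng(3)
n = 1_000_000
x = rng.normal(size=n)
y = x**2 + rng.normal(scale=0.5, size=n)

candidates = {
    "E[Y|X] = X^2 (MMSE)": x**2,
    "g(X) = X": x,
    "g(X) = |X|": np.abs(x),
    "g(X) = E[Y] (constant)": np.full(n, y.mean()),
}
for name, est in candidates.items():
    print(f"{name:24s} MSE = {np.mean((y - est)**2):.4f}")
```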

23.1 Exercises

1. Prove the law of iterated expectation for jointly continuous random variables.

2. (i) Given below is the table of the joint PMF of random variables X and Y.

              X = 0    X = 1
       Y = 0   1/5      2/5
       Y = 1   2/5       0

   Let Z = E[X|Y] and V = Var(X|Y). Find the PMFs of Z and V, and compute E[Z] and E[V].

   (ii) Consider a sequence of i.i.d. random variables {Z_i} with P(Z_i = 0) = P(Z_i = 1) = 1/2. Using this sequence, define a new sequence of random variables {X_n} as follows:

       X_0 = 0,
       X_1 = 2Z_1 − 1, and
       X_n = X_{n−1} + (1 + Z_1 + ... + Z_{n−1})(2Z_n − 1) for n ≥ 2.

   Show that E[X_{n+1} | X_0, X_1, ..., X_n] = X_n a.s. for all n.
3. (a) [MIT OCW problem set] The number of people that enter a pizzeria in a period of 15 minutes is a (non-negative integer) random variable K with known moment generating function M_K(s). Each person who comes in buys a pizza. There are n types of pizzas, and each person is equally likely to choose any type of pizza, independently of what anyone else chooses. Give a formula, in terms of M_K(·), for the expected number of different types of pizzas ordered.

   (b) John takes a taxi home every day after work. Every evening, he waits by the road for a taxi, but each taxi that comes by is occupied with probability 0.8, independently of the others. He counts the number of taxis he misses until he gets an unoccupied one. Once inside the taxi, he throws a fair six-faced die a number of times equal to the number of taxis he missed, and gives the driver a tip equal to the total of the die outcomes. Find the expected amount of the tip that John gives every day.

References

[1] D. Williams, Probability with Martingales, Cambridge University Press, Fourteenth Printing, 2011.
