See Conditioning.
Note that this is conditioned on \(Y=y\)!
But wait: for a continuous r.v., shouldn't \(P(Y=y)\) be equal to \(0\)? Indeed, \(P(Y=y) = 0\), but what we are computing here concerns the r.v. \(X\); that is, given \(Y=y\), we care about the probability that \(X \le x\) for every \(x\). This is precisely why \(F_{X\vert Y}(x\vert y)\) is not written as \(P(X\le x, Y=y)/P(Y=y)\).
See Bayes’ Rule.
Given a discrete r.v. \(K\) and a continuous r.v. \(Y\), the joint probability is
\[\begin{align*} &P(K=k, y \le Y \le y+\delta) \\ =\ &P(K=k)P(y \le Y \le y+\delta|K=k) \approx p_K(k)f_{Y|K}(y|k)\delta \\ =\ &P(y\le Y \le y+\delta)P(K=k|y\le Y \le y+\delta) \approx f_Y(y)\delta p_{K|Y}(k|y). \end{align*}\]Comparing the right-hand sides, we have
\[p_K(k)f_{Y|K}(y|k) = f_Y(y)p_{K|Y}(k|y).\]From this equation, we can derive two formulae
\[\begin{align*} p_{K|Y}(k|y) = {p_K(k)f_{Y|K}(y|k) \over f_Y(y)}, \tag{1} \\ f_{Y|K}(y|k) = {f_Y(y)p_{K|Y}(k|y) \over p_K(k)}, \tag{2} \end{align*}\]where
\[f_Y(y) = \sum_{k'} p_K(k')f_{Y|K}(y|k'), \\ p_K(k) = \int f_Y(y')p_{K|Y}(k|y')dy'.\]Notice that the summand and the integrand have exactly the form of the numerators of \((1)\) and \((2)\), respectively!
When solving \((1)\) and \((2)\), compute the numerator first, then tackle the denominator; take it one term at a time.
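As a concrete illustration of formula \((1)\) and of this "numerator first" workflow, here is a minimal sketch assuming a hypothetical model where \(K \in \{0, 1\}\) and \(Y \mid K=k\) is normal with mean \(\mu_k\) and unit variance (these distributions are my own choice for illustration, not from the text):

```python
import math

# Hypothetical setup (illustrative assumption): K ∈ {0, 1} with prior p_K,
# and Y | K=k ~ Normal(mu[k], 1).
p_K = {0: 0.5, 1: 0.5}
mu = {0: 0.0, 1: 2.0}

def f_Y_given_K(y, k):
    """Conditional density f_{Y|K}(y|k): standard normal shifted by mu[k]."""
    return math.exp(-0.5 * (y - mu[k]) ** 2) / math.sqrt(2 * math.pi)

def posterior(k, y):
    """Formula (1): compute the numerator first, then the denominator f_Y(y),
    which is a sum of terms that each look like the numerator."""
    numerator = p_K[k] * f_Y_given_K(y, k)
    f_Y = sum(p_K[kp] * f_Y_given_K(y, kp) for kp in p_K)  # denominator
    return numerator / f_Y

y_obs = 1.5
post = {k: posterior(k, y_obs) for k in p_K}
print(post)  # posterior pmf of K given Y = 1.5; the values sum to 1
```

Since \(y_{\text{obs}} = 1.5\) is closer to \(\mu_1 = 2\) than to \(\mu_0 = 0\), observing it shifts the posterior toward \(K=1\).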
Here, \(p_K(k)\) is called the prior probability: our belief about \(K\) in the absence of any additional information. In contrast, \(p_{K\vert Y}(k\vert y)\) is called the posterior probability: the belief after incorporating the extra information \(Y=y\).
The event \(Y=y\) that occurred afterwards updates our belief about \(K\).
See Conditioning.
Define:
\[E[X\vert Y] = g(Y),\]which is a function of \(Y\).
\[\begin{align*} E\big[E[X\vert Y]\big] &= E[g(Y)] \\ &= \sum_y g(y)p_Y(y) \\ &= \sum_y E[X\vert Y=y]p_Y(y) \\ &= E[X], \text{ by total expectation thm.} \end{align*}\]\(E\big[E[X\vert Y]\big] = E[X]\).
In practice, first substitute a symbolic \(y\) for \(Y\), i.e., condition on \(Y=y\); once the computation is done, replace \(y\) back with \(Y\).
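A quick numerical check of \(E\big[E[X\vert Y]\big] = E[X]\), using a small joint pmf made up for illustration:

```python
# A small hypothetical joint pmf p_{X,Y}(x, y) (made up for illustration).
p_XY = {(0, 1): 0.1, (1, 1): 0.3, (0, 2): 0.2, (1, 2): 0.4}

# Marginal pmf of Y.
p_Y = {}
for (x, y), p in p_XY.items():
    p_Y[y] = p_Y.get(y, 0.0) + p

def E_X_given_Y(y):
    """g(y) = E[X | Y=y]: fix Y=y first, compute, then treat y as a variable."""
    return sum(x * p / p_Y[y] for (x, yy), p in p_XY.items() if yy == y)

# E[E[X|Y]] = sum_y g(y) p_Y(y), which should equal E[X].
lhs = sum(E_X_given_Y(y) * p for y, p in p_Y.items())
E_X = sum(x * p for (x, _), p in p_XY.items())
print(lhs, E_X)  # the two values agree
```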
Proof
\[\begin{align*} var(X\vert Y=y) &= E[X^2\vert Y=y] - \Big(E[X\vert Y=y] \Big)^2,\text{ for all } y. \\ var(X\vert Y) &= E[X^2\vert Y] - \Big(E[X\vert Y] \Big)^2 \\ E\Big[var(X\vert Y)\Big] &= E\Big[E[X^2\vert Y]\Big] - E\bigg[\Big(E[X\vert Y] \Big)^2\bigg] \\ &= E[X^2] - E\bigg[\Big(E[X\vert Y] \Big)^2\bigg] \\ var\Big(E[X\vert Y]\Big) &= E\bigg[\Big(E[X\vert Y] \Big)^2\bigg] - \bigg(E\Big[E[X\vert Y] \Big] \bigg)^2 \\ &= E\bigg[\Big(E[X\vert Y] \Big)^2\bigg] - (E[X])^2. \end{align*}\]Hence
\[E\Big[var(X\vert Y)\Big] + var\Big(E[X\vert Y]\Big) = E[X^2] - (E[X])^2 = var(X). \tag*{$\blacksquare$}\]\(var(X) =\) (average variability within sections) \(+\) (variability between sections)
Here, a "section" means each event \(Y=y\).
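The decomposition \(var(X) = E\big[var(X\vert Y)\big] + var\big(E[X\vert Y]\big)\) can also be verified numerically; the joint pmf below is made up for illustration:

```python
# Verify var(X) = E[var(X|Y)] + var(E[X|Y]) on a small hypothetical joint pmf.
p_XY = {(0, 1): 0.1, (1, 1): 0.3, (0, 2): 0.2, (1, 2): 0.4}

# Marginal pmf of Y.
p_Y = {}
for (x, y), p in p_XY.items():
    p_Y[y] = p_Y.get(y, 0.0) + p

def cond_moments(y):
    """Return (E[X|Y=y], var(X|Y=y)) for the section Y=y."""
    m1 = sum(x * p / p_Y[y] for (x, yy), p in p_XY.items() if yy == y)
    m2 = sum(x**2 * p / p_Y[y] for (x, yy), p in p_XY.items() if yy == y)
    return m1, m2 - m1**2

# Average variability within sections: E[var(X|Y)].
E_var = sum(cond_moments(y)[1] * p for y, p in p_Y.items())

# Variability between sections: var(E[X|Y]).
means = {y: cond_moments(y)[0] for y in p_Y}
E_mean = sum(means[y] * p for y, p in p_Y.items())
var_E = sum((means[y] - E_mean) ** 2 * p for y, p in p_Y.items())

# Direct computation of var(X) for comparison.
E_X = sum(x * p for (x, _), p in p_XY.items())
E_X2 = sum(x**2 * p for (x, _), p in p_XY.items())
var_X = E_X2 - E_X**2
print(var_X, E_var + var_E)  # the two values agree
```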
A rat in a maze has only two paths it can take. If the rat goes right, it returns to the starting point after \(3\) minutes. If the rat goes left, there is a \(2\over 3\) chance of returning to the starting point after \(5\) minutes, and a \(1\over 3\) chance of exiting the maze after \(2\) minutes. Find the average time the rat spends in the maze, given that it is equally likely to go left or right.
Solution
Let \(T\) be the time that the rat is in the maze and
\[X = \begin{cases} 0, &\text{goes right}, \\ 1, &\text{goes left}. \end{cases}\]Then
\[P(X=0) = P(X=1) = {1\over 2}, \\ E(T|X=0) = 3 + E(T), \\ E(T|X=1) = {2\over 3}(5 + E(T)) + {1\over 3}\cdot 2.\]Thus we have
\[E[E(T|X)] = {1\over 2}(3+E(T)) + {1\over 2}\Big({2\over 3}(5 + E(T)) + {1\over 3}\cdot 2\Big) = E(T), \\ \therefore E(T) = 21.\]
Introduction to Probability, 2nd ed., by Dimitri P. Bertsekas and John N. Tsitsiklis, Problem 4.24 (pp. 251-252).
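The answer \(E(T) = 21\) for the rat-maze problem can be sanity-checked with a quick Monte Carlo simulation (a sketch, not part of the original solution):

```python
import random

def simulate_rat(rng):
    """Simulate one escape: follow the maze rules until the rat exits;
    return the total time spent in the maze (in minutes)."""
    t = 0.0
    while True:
        if rng.random() < 0.5:        # goes right: back to start after 3 min
            t += 3
        elif rng.random() < 2 / 3:    # goes left, returns after 5 min
            t += 5
        else:                         # goes left, exits after 2 min
            return t + 2

rng = random.Random(0)  # fixed seed for reproducibility
n = 200_000
avg = sum(simulate_rat(rng) for _ in range(n)) / n
print(avg)  # should be close to E(T) = 21
```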
Solution
Let \(T\) be the amount of time the professor will spend with the student, and \(F\) be the event that the student finds the professor. Then
\[E[T] = P(F)E[T\vert F] + P(F^c)E[T\vert F^c]. \tag{1}\]We then need to find out \(P(F)\). Let
\[\begin{align*} W &= \text{ length of time between 9 a.m. and arrival of the Ph.D. student (uniformly distributed);} \\ X &= \text{ amount of time the professor devotes to his task (exponentially distributed);} \\ Y &= \text{ length of time between 9 a.m. and arrival of the professor (uniformly distributed).} \end{align*}\]We have
\[P(F) = P(Y\le W\le X+Y).\]We know that \(W\) lies between \(0\) and \(8\), but \(X+Y\) can be arbitrarily large. That is to say, computing \(P(W\le X+Y)\) naively may overestimate it. Hence we should write
\[P(Y\le W\le X+Y) = 1 - \Big(P(W<Y) + P(W > X+Y) \Big).\]We have
\[\begin{align*} P(W < Y \mid Y=y) &= F_W(y) = \int_0^{y}f_W(w)dw. \\ P(W < Y) &= \int^4_0 f_Y(y) F_W(y) dy. \tag{2} \end{align*}\]\(Y\) is uniformly distributed between \(0\) and \(4\).
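Equation \((2)\) can be evaluated numerically under the distributions stated above, with \(W\sim\text{Uniform}(0,8)\) and \(Y\sim\text{Uniform}(0,4)\), so \(f_Y(y)=1/4\) and \(F_W(y)=y/8\):

```python
# Numerically evaluate (2): P(W < Y) = ∫_0^4 f_Y(y) F_W(y) dy,
# with Y ~ Uniform(0, 4) and W ~ Uniform(0, 8).
def f_Y(y):
    """Density of Y ~ Uniform(0, 4)."""
    return 1 / 4 if 0 <= y <= 4 else 0.0

def F_W(y):
    """CDF of W ~ Uniform(0, 8)."""
    return min(max(y / 8, 0.0), 1.0)

# Midpoint rule over [0, 4]; the exact answer is ∫_0^4 (1/4)(y/8) dy = 1/4.
n = 100_000
h = 4 / n
approx = sum(f_Y((i + 0.5) * h) * F_W((i + 0.5) * h) for i in range(n)) * h
print(approx)  # ≈ 0.25
```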
There are two key takeaways from this problem: