HW5 (Due 12/1)

(5 points) With the same setup as in Q6 in class, recompute the a) ML, b) MAP, and c) Bayesian estimates of the probability that the third toss is heads, given that the first two tosses are tails.
(5 points) With the same setup as in class, where a Beta prior is assumed, but with the prior distribution \(Beta(a=0.5,b=0.5)\) instead, compute the a) ML, b) MAP, and c) Bayesian estimates of the probability that the third toss is heads, given that the first two tosses are tails (please see slides 85-86). Note that when \(a\) or \(b\) is less than 1, the Beta density diverges at an endpoint, so the mode is at one of the two extremes (0 or 1) instead of at \(\frac{a-1}{a+b-2}\).
(10 points) For a discrete variable with more than two outcomes, the Beta prior can be generalized to the Dirichlet prior (please also see slides 87-91). Assuming a Dirichlet prior of \(Dir(\alpha_1=3,\alpha_2=3,\alpha_3=3,\alpha_4=3,\alpha_5=3,\alpha_6=3)\), compute the a) ML, b) MAP, and c) Bayesian estimates of the probability that the third toss of a die is a one, given that the first two tosses are also ones.
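As a way to check hand calculations for the Beta and Dirichlet questions, the three estimators can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `dirichlet_estimates` is hypothetical, the MAP line uses the standard interior-mode formula (valid only when every posterior parameter exceeds 1), and the example numbers are deliberately not the ones from this homework. The Beta case is the two-outcome special case.

```python
from fractions import Fraction


def dirichlet_estimates(counts, alphas):
    """Return (ML, MAP, Bayesian) estimates per outcome as exact fractions.

    counts: observed count of each outcome.
    alphas: Dirichlet prior parameters, one per outcome.
    """
    if len(counts) != len(alphas):
        raise ValueError("counts and alphas must have the same length")
    n = sum(counts)
    if n == 0:
        raise ValueError("the ML estimate is undefined without observations")
    k = len(counts)
    alpha_total = sum(Fraction(a) for a in alphas)

    # ML: relative frequencies, ignoring the prior.
    ml = [Fraction(c, n) for c in counts]

    # Posterior is Dirichlet with parameters count_i + alpha_i.
    posterior = [Fraction(c) + Fraction(a) for c, a in zip(counts, alphas)]

    # MAP: posterior mode. The interior formula below only holds when
    # every posterior parameter is > 1; otherwise the mode is on the boundary.
    if any(p <= 1 for p in posterior):
        raise ValueError("interior-mode formula needs all posterior parameters > 1")
    map_ = [(p - 1) / (n + alpha_total - k) for p in posterior]

    # Bayesian: posterior mean, which is the predictive probability.
    bayes = [p / (n + alpha_total) for p in posterior]
    return ml, map_, bayes


if __name__ == "__main__":
    # Illustrative numbers only: three outcomes, prior Dir(2, 2, 2).
    for name, est in zip(("ML", "MAP", "Bayes"),
                         dirichlet_estimates([3, 1, 0], [2, 2, 2])):
        print(name, [str(x) for x in est])
```

The function raises an error instead of returning a MAP value when a posterior parameter is at most 1, which is exactly the situation the note about \(a\) or \(b\) less than 1 describes.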
(5 points) Extra credit. Write a program that uses Monte Carlo simulation to estimate the probability that the second toss is heads given that the first toss is tails, with the same setup as in Q6 in class. You can use any programming language of your choice, but it has to be a Monte Carlo simulation. For example, your program can randomly draw a coin with the given distribution and toss it twice. Repeat this experiment, say, 10,000 times. Keep only the experiments in which the first toss is tails and discard the rest. Finally, estimate the conditional probability as the fraction of the remaining experiments in which the second toss is heads.
1. To which estimate (ML, MAP, Bayesian) is your Monte Carlo result closest?
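The procedure described above (draw a coin, toss twice, keep only runs whose first toss is tails) might be organized roughly as follows. This is a minimal sketch, not a reference solution: the function name, the coin label `"B"`, and the head probabilities in the usage example are placeholders, since the class setup is not restated here. Only \(P(X=A)=2/3\) is taken from this handout.

```python
import random


def estimate_second_head_given_first_tail(coin_prior, head_prob,
                                          trials=10_000, seed=0):
    """Rejection-style Monte Carlo estimate of P(Y2 = H | Y1 = T).

    coin_prior: dict mapping coin name -> probability of drawing that coin.
    head_prob:  dict mapping coin name -> probability that coin lands heads.
    """
    rng = random.Random(seed)
    coins = list(coin_prior)
    weights = [coin_prior[c] for c in coins]

    kept = 0    # experiments whose first toss was tails
    heads = 0   # of those, how many had heads on the second toss
    for _ in range(trials):
        coin = rng.choices(coins, weights=weights)[0]
        first_is_head = rng.random() < head_prob[coin]
        second_is_head = rng.random() < head_prob[coin]
        if first_is_head:
            continue  # discard: the conditioning event did not occur
        kept += 1
        heads += second_is_head

    if kept == 0:
        raise ValueError("no experiment had tails on the first toss")
    return heads / kept


if __name__ == "__main__":
    # Placeholder values: replace them with the setup from Q6 in class.
    prior = {"A": 2 / 3, "B": 1 / 3}
    p_head = {"A": 0.5, "B": 0.9}
    print(estimate_second_head_given_first_tail(prior, p_head))
```

Fixing the seed makes a run reproducible; increasing `trials` reduces the sampling noise of the estimate.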
(25 points) Extra credit. For the setup used in class (the one with \(P(X=A)=2/3\)), compute the following values (note that I will give points only for correct solutions to this question).
1. \(H(Y_1,Y_2)\)
2. \(H(Y_1,Y_2|X)\)
3. \(H(X|Y_1)\)
4. \(H(X|Y_1=H)\)
5. \(H(Y_1|X=A)\)
6. \(I(Y_1;Y_2)\)
7. \(I(X;Y_2|Y_1)\)
8. \(I(X;Y_2|Y_1=H)\). N.B. \(I(X;Y|Z=a)=H(X|Z=a)-H(X|Y,Z=a)\)
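Hand-computed entropies can be cross-checked numerically from the joint distribution \(P(X,Y_1,Y_2)\). The sketch below assumes that the two tosses are independent given the coin, uses hypothetical helper names, and uses placeholder head probabilities rather than the values from class. Variables are referred to by position: 0 for \(X\), 1 for \(Y_1\), 2 for \(Y_2\).

```python
from itertools import product
from math import log2


def joint_distribution(coin_prior, head_prob):
    """P(X, Y1, Y2) as a dict keyed by (coin, y1, y2); tosses i.i.d. given the coin."""
    joint = {}
    for coin, p_coin in coin_prior.items():
        p_h = head_prob[coin]
        for y1, y2 in product("HT", repeat=2):
            p1 = p_h if y1 == "H" else 1 - p_h
            p2 = p_h if y2 == "H" else 1 - p_h
            joint[(coin, y1, y2)] = p_coin * p1 * p2
    return joint


def marginal(joint, keep):
    """Sum out every variable whose position is not listed in `keep`."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out


def entropy(joint, variables):
    """Joint entropy in bits of the listed variables."""
    return -sum(p * log2(p) for p in marginal(joint, variables).values() if p > 0)


def conditional_entropy(joint, target, given):
    """H(target | given) = H(target, given) - H(given)."""
    return entropy(joint, target + given) - entropy(joint, given)


def mutual_information(joint, a, b, given=()):
    """I(A; B | C) = H(A | C) - H(A | B, C)."""
    return conditional_entropy(joint, a, given) - conditional_entropy(joint, a, b + given)


def condition(joint, index, value):
    """Renormalized joint distribution given that variable `index` equals `value`."""
    kept = {o: p for o, p in joint.items() if o[index] == value}
    total = sum(kept.values())
    return {o: p / total for o, p in kept.items()}


if __name__ == "__main__":
    # Placeholder head probabilities: replace with the setup used in class.
    joint = joint_distribution({"A": 2 / 3, "B": 1 / 3}, {"A": 0.5, "B": 0.9})
    print("H(Y1,Y2)      =", entropy(joint, (1, 2)))
    print("H(X|Y1)       =", conditional_entropy(joint, (0,), (1,)))
    print("I(X;Y2|Y1=H)  =", mutual_information(condition(joint, 1, "H"), (0,), (2,)))
```

Conditioning on a specific value, as in \(H(X|Y_1=H)\), is handled by `condition`, which restricts and renormalizes the joint distribution before the ordinary formulas are applied. This mirrors the N.B. identity in item 8.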