HW4 (Due 10/24)

  1. For convenience, we will use the natural logarithm (with unit nats) rather than \(\log_2\) (with unit bits) for this question. Consider the Beta distribution, which is very useful for modeling the prior distribution of a binary variable such as the outcome of a coin flip. The Beta distribution has support \([0,1]\); that is, the pdf \(p(x)\) is \(0\) for \(x<0\) or \(x>1\). Within \([0,1]\), the pdf is given by \(\frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha,\beta)}\), where \(B(\alpha,\beta)=\frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}\) and \(\Gamma\) is the Gamma function. Defining the digamma function \(\psi(a) \triangleq \frac{\partial \ln \Gamma(a)}{\partial a}\), we have
     \[
     \begin{aligned}
     E[\ln X] &= \int_0^1 \ln x \, p(x;\alpha,\beta)\,dx \\
     &= \int_0^1 \ln x \,\frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha,\beta)}\,dx \\
     &= \frac{1}{B(\alpha,\beta)} \int_0^1 \frac{\partial\, x^{\alpha-1}(1-x)^{\beta-1}}{\partial \alpha}\,dx \\
     &= \frac{1}{B(\alpha,\beta)} \frac{\partial}{\partial \alpha} \int_0^1 x^{\alpha-1}(1-x)^{\beta-1}\,dx \\
     &= \frac{1}{B(\alpha,\beta)} \frac{\partial B(\alpha,\beta)}{\partial \alpha} \\
     &= \frac{\partial \ln B(\alpha,\beta)}{\partial \alpha} \\
     &= \frac{\partial \ln \Gamma(\alpha)}{\partial \alpha} - \frac{\partial \ln \Gamma(\alpha+\beta)}{\partial \alpha} \\
     &= \psi(\alpha) - \psi(\alpha+\beta),
     \end{aligned}
     \]
     where the third equality uses \(\frac{\partial}{\partial \alpha} x^{\alpha-1} = x^{\alpha-1} \ln x\) and the fourth exchanges differentiation and integration.
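
     As a quick numerical sanity check of this identity (not part of the assignment), the sketch below compares the closed form against direct numerical integration; it assumes SciPy is available, and the parameter values are arbitrary.

     ```python
     import numpy as np
     from scipy.integrate import quad
     from scipy.special import betaln, digamma

     alpha, beta = 2.5, 4.0  # arbitrary test parameters

     # Beta pdf, written via betaln for numerical stability.
     def beta_pdf(x):
         return np.exp((alpha - 1) * np.log(x) + (beta - 1) * np.log(1 - x)
                       - betaln(alpha, beta))

     # E[ln X] by direct numerical integration over (0, 1).
     numeric, _ = quad(lambda x: np.log(x) * beta_pdf(x), 0, 1)

     # Closed form from the derivation above.
     closed = digamma(alpha) - digamma(alpha + beta)

     print(numeric, closed)  # both ≈ -1.0898 for these parameters
     ```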

    1. (5 points) Study and understand the derivation above. By imitating it, show that \(E[\ln(1-X)] = \psi(\beta) - \psi(\alpha+\beta)\).

    2. (5 points) Using the result given above together with Part (a), show that the differential entropy of a Beta-distributed variable is \(\ln B(\alpha,\beta) - (\alpha-1)\psi(\alpha) - (\beta-1)\psi(\beta) + (\alpha+\beta-2)\psi(\alpha+\beta)\).
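
       Not required for the solution, but here is a minimal sketch of how this closed form can be checked numerically, assuming SciPy (whose `scipy.stats.beta(...).entropy()` returns the differential entropy in nats):

       ```python
       from scipy.special import betaln, digamma
       from scipy.stats import beta as beta_dist

       a, b = 2.5, 4.0  # arbitrary test parameters

       # The closed form stated above, in nats.
       h_closed = (betaln(a, b) - (a - 1) * digamma(a) - (b - 1) * digamma(b)
                   + (a + b - 2) * digamma(a + b))

       # SciPy's differential entropy for the same Beta distribution (nats).
       h_scipy = beta_dist(a, b).entropy()

       print(h_closed, h_scipy)  # both ≈ -0.3411 for these parameters
       ```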

  2. (5 points) Consider a signal whose variance is at most \(5^2\). If we want to store the signal to a precision of one decimal place, how many bits are needed (given no other information about the signal)?
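
     For a self-check after you commit to an interpretation: one common reading (an assumption, not necessarily the intended model) bounds the differential entropy by the Gaussian maximum-entropy value \(\frac{1}{2}\ln(2\pi e\sigma^2)\) for the given variance and applies the standard quantization relation \(H \approx h(X) - \ln \Delta\) (in nats) with bin width \(\Delta = 0.1\). A sketch under that assumption:

     ```python
     import math

     sigma2 = 5 ** 2  # the stated maximum variance
     delta = 0.1      # bin width for one-decimal-place precision

     # Gaussian maximum-entropy bound on the differential entropy (nats).
     h_max = 0.5 * math.log(2 * math.pi * math.e * sigma2)

     # Entropy of the quantized signal, H ≈ h - ln(delta), converted to bits.
     bits = (h_max - math.log(delta)) / math.log(2)

     print(bits, math.ceil(bits))  # ≈ 7.69, rounded up to 8 whole bits
     ```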

  3. (5 points) Consider a binary channel with a crossover probability of 0.1 and an erasure probability of 0.1. What is the capacity of this channel? In other words, when a bit is sent through the channel, there is a probability of 0.1 that the bit is flipped, and then an additional probability of 0.1 that the bit is erased (the decoder cannot even tell whether the received symbol is 0 or 1).
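
     Once you have a closed-form answer, a numerical cross-check can be useful. The sketch below evaluates \(I(X;Y) = H(Y) - H(Y|X)\) for this channel under a uniform input; that a uniform input actually achieves capacity here follows from the channel's symmetry, which you should justify in your solution.

     ```python
     import numpy as np

     p_flip, p_erase = 0.1, 0.1

     def entropy_bits(probs):
         # Shannon entropy in bits, skipping zero entries.
         probs = np.array([p for p in probs if p > 0])
         return -np.sum(probs * np.log2(probs))

     # P(y | x) over outcomes (delivered correctly, delivered flipped, erased):
     # the erasure is applied after the possible flip, so a bit survives
     # unflipped with prob (1 - p_flip)(1 - p_erase) and arrives flipped
     # with prob p_flip * (1 - p_erase).
     p_correct = (1 - p_flip) * (1 - p_erase)
     p_wrong = p_flip * (1 - p_erase)

     # Output distribution under a uniform input: P(0) = P(1), plus erasure.
     p_y = [0.5 * (p_correct + p_wrong), 0.5 * (p_correct + p_wrong), p_erase]

     capacity = entropy_bits(p_y) - entropy_bits([p_correct, p_wrong, p_erase])
     print(capacity)  # ≈ 0.478 bits per channel use
     ```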