

Turn a Fair Coin Into a Biased Coin

It's fairly easy to simulate a fair coin with a biased coin. See Coin Tossing: the Hydrogen Atom of Probability for a simple way to do it. But how do you turn a fair coin into a biased coin?

If you have a perfectly fair coin, \(P(H)=P(T)=1/2\), you can use it to simulate a biased coin with \(P(H)=\alpha\), \(P(T)=1-\alpha\). To see how, let's first look at how to do it using a uniform random variable (URV). Let the URV have the following probability density function:

\[\begin{align} p(x) & = 1\;\;\; 0 \le x \lt 1\nonumber\\ & = 0\;\;\; \mbox{otherwise}\nonumber \end{align}\]

The cumulative distribution function then gives \(P(x<\alpha)=\alpha\) and \(P(x\ge \alpha)=1-\alpha\). So the process is simple: generate a random variate with this distribution; if \(x<\alpha\), take the result to be \(H\), otherwise take it to be \(T\).

You can mimic this process with a fair coin if you let \(H=0\), \(T=1\) and take the result of a toss sequence to be a fractional binary number. For example, the toss sequence \(HTHTT\) converts to the fractional binary number \(.01011\) which is equal to the decimal number \(1/4+1/16+1/32=11/32=0.34375\). If we were using this to simulate a biased coin with \(P(H)=1/3\), \(P(T)=2/3\) then at this point we could stop and output a \(T\) since the binary number will always stay above \(1/3\) no matter what the subsequent tosses are.

Suppose on the other hand that the toss sequence is \(HTHH\), which converts to \(.0100\) with decimal value \(1/4=0.25\). In this case we can stop and output an \(H\), since the binary number will always stay below \(1/3\) no matter what the subsequent tosses are. The same would be true if the toss sequence started with \(HH\): there is no way this can lead to a binary number greater than \(1/3\), so we can stop and output an \(H\).

This procedure can be used to simulate a biased coin with probability \(P(H)=\alpha\), \(P(T)=1-\alpha\). The basic algorithm is as follows. Let \(x\) be the probability we want to simulate, and \(y\) be the value of the toss sequence converted into a binary number. Then

1  input(x)
2  y = 0
3  i = 1
4  gettoss(t)
5  y = y + t/2^i
6  if y >= x
     output(T)
     goto 2
7  if x > y + 1/2^i
     output(H)
     goto 2
8  i = i + 1
9  goto 4
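Here is a minimal Python sketch of this algorithm (a sketch, not the linked C program; `fair_bits` is assumed to be any iterable of 0/1 fair-coin bits, and exact fractions are used to avoid floating-point edge cases):

```python
from fractions import Fraction

def biased_bits(alpha, fair_bits):
    """Convert fair 0/1 bits into biased bits with P(0) = alpha.

    Follows the post's convention H = 0, T = 1: the fair tosses are read
    as a binary fraction y, and a symbol is emitted as soon as y is
    pinned to one side of alpha.
    """
    alpha = Fraction(alpha)
    bits = iter(fair_bits)
    while True:
        y = Fraction(0)       # toss sequence read as a binary fraction
        w = Fraction(1, 2)    # weight of the current toss, 1/2^i
        for t in bits:
            y += t * w
            if y >= alpha:        # y can only grow, so it stays >= alpha: T
                yield 1
                break
            if alpha > y + w:     # future tosses add at most w, so y stays < alpha: H
                yield 0
                break
            w /= 2
        else:
            return                # fair bits exhausted mid-symbol
```

Running it on the worked examples above with \(\alpha=1/3\), the fair sequences \(HTHTT\) (\(01011\)) and \(HTHH\) (\(0100\)) produce a \(T\) and an \(H\) respectively.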

A C code implementation of this can be found here. It reads a string of \(1\)'s and \(0\)'s representing a fair coin toss sequence and converts it into a string of \(1\)'s and \(0\)'s representing a biased coin, with the probability of a \(0\) given on the command line.

Using this program, we took data from the NIST randomness beacon, which is supposed to be perfectly random, i.e. \(P(0)=P(1)=1/2\), and converted it to data with bias probabilities from \(0.0\) through \(1.0\) in increments of \(0.01\). For each bias, we took the first \(90000\) bits and created a \(300\) by \(300\) pixel image. All the images were then combined into the animated gif shown below.

Inverting a Tridiagonal Matrix

This post will show how to invert a tridiagonal matrix in spectral decomposition form. This is a convenient form if you're only interested in one particular element of the inverse and want it as a sum of terms involving the eigenvalues of the matrix.

For any matrix \(\mathbf{A}\) with eigenvalues \(x_k\) and eigenvectors \(\lvert v_k \rangle\) such that:

\[\mathbf{A}\lvert v_k \rangle=x_k\lvert v_k \rangle\]

If the eigenvectors are orthonormal, as they can be taken to be for a symmetric matrix, we can write \(\mathbf{A}\) in spectral decomposition form as:

\[\mathbf{A}=\sum_k x_k \lvert v_k \rangle \langle v_k \rvert\]

Since the inverse of \(\mathbf{A}\) has the same eigenvectors but reciprocal eigenvalues we can likewise write the inverse as:

\[\mathbf{A^{-1}}=\sum_k \frac{\lvert v_k \rangle \langle v_k \rvert}{x_k}\]
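As a quick numerical illustration (a NumPy sketch, not from the post; the matrix values are arbitrary), the inverse of a symmetric matrix can be assembled term by term from its eigen-decomposition:

```python
import numpy as np

# an arbitrary small symmetric matrix (symmetric, so eigh returns
# orthonormal eigenvectors and the outer-product form applies directly)
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])

x, V = np.linalg.eigh(A)      # eigenvalues x_k, eigenvectors V[:, k]

# A^{-1} = sum_k |v_k><v_k| / x_k
Ainv = sum(np.outer(V[:, k], V[:, k]) / x[k] for k in range(len(x)))
```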

Expressing the inverse in this form is often quite useful, especially when the eigenvalues and eigenvectors of \(\mathbf{A}\) are known or can easily be calculated. One type of matrix where they are easily calculated is a tridiagonal matrix with constant diagonals. A general matrix of this type is \(\mathbf{A}=\mathbf{tridiag}(c,a,b)\), which in the \(n=4\) dimensional case looks like:

\[\begin{pmatrix}a & b & 0 & 0\\c & a & b & 0\\0 & c & a & b\\0 & 0 & c & a\end{pmatrix}\]

The matrix can be made symmetric with a similarity transform

\[\mathbf{B} = \mathbf{S}^{-1}\mathbf{A}\mathbf{S} = \begin{pmatrix}a & \sqrt{bc} & 0 & 0\\\sqrt{bc} & a &\sqrt{bc} & 0\\0 & \sqrt{bc} & a & \sqrt{bc}\\0 & 0 & \sqrt{bc} & a\end{pmatrix}\]

where \(\mathbf{S}\) is the diagonal matrix

\[\mathbf{S} = \begin{pmatrix}1 & 0 & 0 & 0\\0 & (c/b)^{1/2} & 0 & 0\\0 & 0 & c/b & 0\\0 & 0 & 0 & (c/b)^{3/2}\end{pmatrix}\]

How does this affect the eigenvalues? We can write the eigenvalue equation as \(\mathbf{S}^{-1}\mathbf{A}\mathbf{S}\mathbf{S}^{-1}\lvert v_k \rangle=x_k\mathbf{S}^{-1}\lvert v_k \rangle\), or \(\mathbf{B}\mathbf{S}^{-1}\lvert v_k \rangle=x_k\mathbf{S}^{-1}\lvert v_k \rangle\). Therefore \(\mathbf{B}\) has the same eigenvalues as \(\mathbf{A}\), but the eigenvectors become \(\mathbf{S}^{-1}\lvert v_k \rangle\). Let \(\lvert u_k \rangle=\mathbf{S}^{-1}\lvert v_k \rangle\); then the eigenvalue equation for \(\mathbf{B}\) is \(\mathbf{B}\lvert u_k \rangle=x_k\lvert u_k \rangle\). Dividing this equation by \(\sqrt{bc}\) simplifies the matrix to

\[\mathbf{C} = \mathbf{B}/\sqrt{bc} = \begin{pmatrix}\alpha & 1 & 0 & 0\\1 & \alpha & 1 & 0\\0 & 1 & \alpha & 1\\0 & 0 & 1 & \alpha\end{pmatrix}\]

where \(\alpha=a/\sqrt{bc}\). The eigenvalues similarly become \(y_k=x_k/\sqrt{bc}\), and now we have the eigenvalue equation \(\mathbf{C}\lvert u_k \rangle=y_k\lvert u_k \rangle\). The eigenvalues \(y_k\) are the roots of the determinant \(p(y)=\mathrm{det}(y\mathbf{I}-\mathbf{C})\). Let \(p_n(y)\) be this determinant when \(\mathbf{C}\) has dimension \(n\); then the determinant can be calculated recursively using the equation

\[p_n(y)=(y-\alpha)p_{n-1}(y)-p_{n-2}(y)\]
with initial values \(p_0(y)=1\), \(p_1(y)=y-\alpha\). This equation can easily be found using the expansion theorem for determinants.

With the change of variables \(2z=y-\alpha\), the initial values become \(p_0=1\), \(p_1=2z\), and the equation becomes the recurrence for the Chebyshev polynomials of the second kind, so that \(p_n(y)=U_n(z)\):

\[U_n(z)=2zU_{n-1}(z)-U_{n-2}(z)\]
whose roots can easily be found. Letting \(z=\cos{\theta}\) we have

\[U_n(\cos{\theta})=\frac{\sin{((n+1)\theta)}}{\sin{\theta}}\]
The roots of \(U_n(\cos{\theta})\) occur at \(\theta_k=k\pi/(n+1)\), \(k=1,2,\ldots,n\); therefore the corresponding roots of \(p_n(y)\) are at:

\[y_k=\alpha+2\cos{\frac{k\pi}{n+1}},\;\;\;k=1,2,\ldots,n\]
These are the eigenvalues for the matrix \(\mathbf{C}\). The eigenvalues for the matrices \(\mathbf{A}\) and \(\mathbf{B}\) are

\[x_k=\sqrt{bc}\,y_k=a+2\sqrt{bc}\cos{\frac{k\pi}{n+1}},\;\;\;k=1,2,\ldots,n\]
The only thing still needed to construct the inverse is the eigenvectors. The nice thing about symmetric tridiagonal matrices with constant diagonals is that they all have the same eigenvectors, regardless of the values of the diagonal constants. The \(i^{\mathrm{th}}\) component of the \(k^{\mathrm{th}}\) eigenvector is given by:

\[\lvert u_k \rangle_i=u_{i,k}=\sqrt{\frac{2}{n+1}}\sin{\frac{ik\pi}{n+1}}\]

We will not derive this result because it is quite easy to verify given the formula for the eigenvalues we derived above. The eigenvectors for the nonsymmetric matrix \(\mathbf{A}\) are then equal to \(\lvert v_k \rangle=\mathbf{S}\lvert u_k \rangle\) where \(\mathbf{S}\) is the diagonal matrix defined above.
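These formulas are easy to check numerically. A NumPy sketch (the values of \(n\), \(a\), \(b\), \(c\) are arbitrary test values with \(bc>0\), not from the post):

```python
import numpy as np

n, a, b, c = 6, 2.0, 3.0, 0.75   # arbitrary test case with b*c > 0

# A = tridiag(c, a, b)
A = a*np.eye(n) + b*np.diag(np.ones(n-1), 1) + c*np.diag(np.ones(n-1), -1)

# claimed eigenvalues x_k = a + 2*sqrt(bc)*cos(k*pi/(n+1))
k = np.arange(1, n + 1)
x = a + 2*np.sqrt(b*c)*np.cos(k*np.pi/(n+1))

# symmetrized matrix B and claimed eigenvectors u_k (columns of U)
B = a*np.eye(n) + np.sqrt(b*c)*(np.diag(np.ones(n-1), 1) + np.diag(np.ones(n-1), -1))
i = np.arange(1, n + 1)
U = np.sqrt(2/(n+1)) * np.sin(np.outer(i, k)*np.pi/(n+1))   # U[:, k-1] = u_k
```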

In the next post I will show an interesting application of this.

Not Falling Behind in a Coin Toss

A coin is tossed \(n\) times. What is the probability that the number of tails in the sequence never exceeds the number of heads?

The problem can be represented as a walk on the infinite state automaton shown in the figure below.

The start state is \(0\) and all states are end states. When the automaton is in state \(k\), there have been \(k\) more heads than tails.

We will solve the problem by finding the probability generating function. For that we need the generating function for the number of walks that start at state \(0\) and end at any of the states. Since this is an infinite automaton the usual approach of using the transition matrix or a regular expression will not work. We can however use a recursive regular expression for this problem. The following pair of expressions describe walks that start at state \(0\) and end at any state including \(0\).

\[\begin{align} S & = A+AhA+AhAhA+\cdots\nonumber\\ & = A(\epsilon+hA+hAhA+\cdots)\nonumber\\ & = A(hA)^{*}\nonumber\\ A & = (hAt)^{*}\nonumber \end{align}\]

The regex \(A\) is for starting at a state, visiting any number of states to the right, and ending back at the starting state. The first term in the regex \(S\) represents starting and ending at state \(0\). The second term represents starting at \(0\) and ending at \(1\), and so on.

We get the generating function by converting these regular expressions into algebraic form. The generating function is

\[g(h,t) = \frac{A(h,t)}{1-hA(h,t)}\]

where \(A(h,t)\) is a solution of the following equation.

\[A(h,t) = \frac{1}{1-htA(h,t)}\]

The solution we want is

\[A(h,t) = \frac{1-\sqrt{1-4ht}}{2ht}\]
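This is a root of the quadratic \(htA^2-A+1=0\) obtained by clearing the denominator in the previous equation; the other root blows up as \(ht\to 0\), while this one tends to \(1\), the correct value for the empty walk. A quick numerical sanity check in plain Python (the sample values of \(h\) and \(t\) are arbitrary, with \(4ht<1\)):

```python
from math import sqrt

def A(h, t):
    """The chosen root of h*t*A**2 - A + 1 = 0, i.e. A = 1/(1 - h*t*A)."""
    return (1 - sqrt(1 - 4*h*t)) / (2*h*t)

# residual of the defining quadratic at an arbitrary point with 4*h*t < 1
h, t = 0.2, 0.3
residual = h*t*A(h, t)**2 - A(h, t) + 1
```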

Substituting this into the expression for \(g(h,t)\) and simplifying, we get the following generating function

\[g(h,t) = \frac{2}{1-2h+\sqrt{1-4ht}}\]

The coefficient of \(h^{k}t^{l}\) in the power series expansion of this function is the number of coin toss sequences of length \(n=k+l\) that have \(k\) heads and \(l\) tails where the number of tails never exceeds the number of heads. To get the probability generating function, we make the substitutions \(h\rightarrow pz\), and \(t\rightarrow qz\) where \(p\) is the probability of heads, and \(q=1-p\) is the probability of tails. The probability generating function is then

\[\begin{align} f(p,z) & = g(pz,qz)\nonumber\\ & = \frac{2}{1-2pz+\sqrt{1-4pqz^{2}}}\nonumber \end{align}\]

The coefficient of \(z^n\) in the power series expansion of \(f(p,z)\) is the probability that in a sequence of \(n\) coin tosses, the number of tails never exceeds the number of heads. If you do the power series expansion you will notice that the probability for a sequence of length \(2m\), i.e. an even length sequence, is always equal to the probability for a sequence of length \(2m-1\). This is because at the end of a valid odd length sequence the number of heads strictly exceeds the number of tails, so the sequence can be extended by either a head or a tail and remain valid. The probabilities for \(n=10,20,30\), respectively, are shown below and are plotted in the following figure.

\[\begin{align} &14p^{9}-70p^{8}+135p^{7}-120p^{6}+42p^{5}\nonumber \end{align}\]
\[\begin{align} &-4862p^{19}+48620p^{18}-217360p^{17}+570570p^{16}-969969p^{15}+\nonumber\\ &1108536p^{14}-852720p^{13}+426360p^{12}-125970p^{11}+16796p^{10}\nonumber \end{align}\]

\[\begin{align} &2674440p^{29}-40116600p^{28}+280073300p^{27}-1206469600p^{26}+\nonumber\\ &3583214712p^{25}-7763631876p^{24}+12658095450p^{23}-15781521600p^{22}+\nonumber\\ &15123958200p^{21}-11090902680p^{20}+6129183060p^{19}-2476437600p^{18}+\nonumber\\ &691945800p^{17}-119759850p^{16}+9694845p^{15}\nonumber \end{align}\]

Probability that the number of tails never exceeds the number of heads in n=10, 20, 30 tosses of a coin as a function of p = probability of heads.
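These probabilities can also be checked by brute force. Here is a small Python sketch (not the authors' code) that runs the automaton directly, tracking the probability distribution over the lead of heads minus tails; for a fair coin the result should equal \(\binom{n}{\lfloor n/2\rfloor}/2^n\), and each even-length probability should match the preceding odd length:

```python
def prob_tails_never_exceed(n, p):
    """P(tails never exceed heads in n tosses) by dynamic programming
    over the automaton's state, the lead of heads over tails."""
    q = 1 - p
    states = {0: 1.0}                  # lead -> probability
    for _ in range(n):
        nxt = {}
        for lead, pr in states.items():
            nxt[lead + 1] = nxt.get(lead + 1, 0.0) + pr * p   # toss a head
            if lead > 0:               # a tail is only allowed when lead > 0
                nxt[lead - 1] = nxt.get(lead - 1, 0.0) + pr * q
        states = nxt
    return sum(states.values())
```

For \(n=10\) and \(p=1/2\) this gives \(252/1024\approx 0.246\).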

Combinatorics Problem 1

I occasionally get emails from people asking about combinatorics problems, and I run into a lot of problems in my own research, so I thought I would start posting some of them here. I've blogged about these problems before but only sporadically. The plan is to do it on a more regular basis and to include other types of problems from subjects such as probability and information theory. I may even throw in some physics and circuit design problems. They should all be fairly simple but interesting, and understandable to people who have read some of our books. So let's start with a combinatorics problem.

The problem involves constructing strings from an alphabet of the \(m\) symbols: \(a_1,a_2,\ldots,a_m\). Each string starts with at most \(n_1\) consecutive \(a_1\)'s followed by at most \(n_2\) consecutive \(a_2\)'s and so on. All \(n_i\) values are greater than zero and each symbol appears at least once. The question is how many unique strings of length \(n\) can one construct?

Combinatorially, we are asking for the number of compositions of \(n\) into exactly \(m\) parts with part \(i\) in the range from \(1\) to \(n_i\). Suppose, for example, \(n=5\), \(m=3\), and \(n_1=n_2=n_3=3\); then the compositions and corresponding strings using the symbols \(0,1,2\) are

  • \(5 = 3+1+1\hspace{3em} 00012\)
  • \(5 = 1+3+1\hspace{3em} 01112\)
  • \(5 = 1+1+3\hspace{3em} 01222\)
  • \(5 = 2+2+1\hspace{3em} 00112\)
  • \(5 = 2+1+2\hspace{3em} 00122\)
  • \(5 = 1+2+2\hspace{3em} 01122\)
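A brute-force enumeration confirms this list. A short Python sketch (illustrative; not the authors' code):

```python
from itertools import product

def compositions(n, limits):
    """Compositions of n into len(limits) parts, with part i in 1..limits[i]."""
    ranges = (range(1, m + 1) for m in limits)
    return [parts for parts in product(*ranges) if sum(parts) == n]

comps = compositions(5, [3, 3, 3])
```

For the example above, `compositions(5, [3, 3, 3])` returns exactly the six compositions listed.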

The simplest way to solve the general problem is to use a generating function. If you look at each string as a concatenation of \(m\) substrings, each composed of a single symbol with possible lengths \(1,2,\ldots,n_i\), then it's very easy to construct the generating function. The substrings have generating functions given by

\[z+z^{2}+\cdots+z^{n_i}=\frac{z(1-z^{n_i})}{1-z}\]
and the generating function for the complete string is the product of these substring generating functions. For the example above, each substring has the g.f. \(z+z^2+z^3\), so the total g.f. is \((z+z^2+z^3)^3\), which expanded out is:

\[z^{3}+3z^{4}+6z^{5}+7z^{6}+6z^{7}+3z^{8}+z^{9}\]
This tells us there is only one string each of lengths 3 and 9: the strings are \(012\) and \(000111222\) respectively. There are three strings each of lengths 4 and 8: \(0012\), \(0112\), \(0122\) and \(00011122\), \(00011222\), \(00111222\). There are six strings each of lengths 5 and 7, and seven strings of length 6.

In the general case where we have \(m\) symbols and each can repeat at most 3 times the generating function is \((z+z^2+z^3)^m\) and the numbers of possible strings will be given by the trinomial coefficients.

Another interesting case is when the symbol \(a_i\) can repeat at most \(i\) times. With 4 symbols, for example, the generating function is

\[z(z+z^{2})(z+z^{2}+z^{3})(z+z^{2}+z^{3}+z^{4})\]
which expanded out is

\[z^{4}+3z^{5}+5z^{6}+6z^{7}+5z^{8}+3z^{9}+z^{10}\]
The coefficients of these generating functions are the Mahonian numbers which have many interesting properties and interpretations.
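Both examples can be reproduced by multiplying out the generating functions as coefficient lists. A short Python sketch (the helper names `poly_mul` and `string_gf` are my own):

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def string_gf(limits):
    """Coefficients of prod_i (z + z^2 + ... + z^{n_i}); index = string length."""
    gf = [1]
    for m in limits:
        gf = poly_mul(gf, [0] + [1] * m)   # z + z^2 + ... + z^m
    return gf
```

`string_gf([3, 3, 3])` recovers the counts quoted above (one string each of lengths 3 and 9, three each of lengths 4 and 8, six each of 5 and 7, seven of length 6), and `string_gf([1, 2, 3, 4])` gives the Mahonian coefficients \(1,3,5,6,5,3,1\).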

© 2010-2019 Stefan Hollos and Richard Hollos