This is a solution to the Riddler Classic from here.

*You play a game with four balls: One ball is red, one is blue, one is green and one is yellow. They are placed in a box. You draw a ball out of the box at random and note its color. Without replacing the first ball, you draw a second ball and then paint it to match the color of the first. Replace both balls, and repeat the process. The game ends when all four balls have become the same color. What is the expected number of turns to finish the game? *

*Extra credit: What if there are more balls and more colors? *

It was straightforward enough to get an answer by computation. Here we define a function sim that simulates the process and returns how many turns it takes. Then we take the mean of 10,000 results.

```
library(tidyverse)
library(ggthemes)
```

```
sim <- function(m){
colours <- 1:m
n = 1
while (n <= 10000){
pick1 <- sample(colours, 1)
i1 <- which(colours == pick1)[1]
pick2 <- sample(colours[-i1], 1)
i2 <- which(colours == pick2)[1]
colours[i2] <- pick1
if (length(unique(colours)) == 1){
return(n)
}
n = n + 1
}
}
vec <- c()
for (j in 1:10000){
vec <- c(vec, sim(4))
}
print(mean(vec))
```

`## [1] 8.9968`

Looks like the answer is 9.

We can approach this more analytically. Without loss of generality, let \(P_{n,m}\) be the probability that there are \(n\) red balls after \(m\) turns. So the probabilty the game ends after \(m\) turns is \(4 P_{4,m}\) and the expected number of turns is \[E = \sum_{m=0}^{\infty} m \cdot 4 P_{4,m}. \]

So we need to get \(P_{4,m}\). For there to be \(4\) red balls on turn \(m\), three things must occur:

- There must have been \(3\) red balls on turn \(m-1\).
- One of red balls was chosen in the first draw on turn \(m-1\).
- The last non red ball was chosen in the second draw on turn \(m-1\).

In probability terms we can say, \[P_{4,m} = P(\text{Red in first draw}) P(\text{Non red in second draw}) P_{3,m-1}.\] Some thought leads to \(P(\text{Red in first draw}) = \frac{3}{4}\) and \(P(\text{Non red in second draw}) = \frac{1}{3}.\) Therefore, \(P_{4,m} = \frac{1}{4} P_{3,m-1}.\)

So the problem reduces to getting \(P_{3,m}.\) By a similar rationale to above, we see \[P_{3,m} = P(\text{Red in first draw}) P(\text{Non red in second draw}) P_{2,m-1} + \] \[P(\text{Red in first draw}) P(\text{Red in second draw}) P_{3,m-1}.\] And working out these probabilities yields \[P_{3,m} = \frac{1}{3}P_{2,m-1} + \frac{1}{2}P_{3,m-1}.\] Similar arguments result in \[P_{2,m} = \frac{1}{4}P_{1,m-1} + \frac{1}{3}P_{2,m-1} \frac{1}{4}P_{3,m-1}\] and \[P_{1,m-1} = \frac{1}{2}P_{1,m-1} + \frac{1}{3}P_{2,m-1}.\]

This is a two dimensional recurrence relation, with the initial conditions \(P_{1,0} = 1\) and \(P_{n,m} = 0\) for \(n > m+1\). The best strategy I know is turn this into a one dimensional recurrence relation. To this end, I wrote up the relation as a function in R.

```
f <- function(n, m) {
if (n == 1 & m == 0){
return (1)
}
if (n > m + 1){
return (0)
}
else{
if (n == 4){
return (1/4*f(3, m-1))
}
if (n == 3){
return (1/3*f(2, m-1) + 1/2*f(3, m-1))
}
if (n == 2){
return (1/4*f(1,m-1) + 1/3*f(2,m-1) + 1/4*f(3,m-1))
}
if (n == 1){
return (1/2*f(1,m-1) + 1/3*f(2,m-1))
}
}
}
```

As a sanity check, the above simulation function should provide approximations to this function. Letâ€™s see if it does. We plot the simulations as coloumns and the function is the red line.

Looks good!

After a lot of fiddling around with fractions and getting some inspiration using the above function, I arrived at \(P_{2,m} = \frac{5^{m-1}}{4\cdot 6^{m-1}}\). This leads to the one dimensional reccurance relation \[P_{3,m} = \frac{5^{m-1}}{2^m 3^{m-1}} + \frac{1}{2} P_{3,m-1}. \]

Skipping a bunch of maths, we arrive at the explicit solution \[P_{3,m} =\frac{1}{2^{m+1}} \left( \left(\frac{5}{3}\right)^{m-1} - 1 \right). \]

So the expected number of turns is \[E = \sum_{m=2}^{\infty} \frac{m+1}{2^{m+1}} \left( \left(\frac{5}{3}\right)^{m-1} - 1 \right).\]

Further calculations (or cheating) gets us that this series converges to 9.