This is a simulation in R for the solution to Fivethirtyeight’s Riddler Express from March 24th 2017, found here. Some solutions were published a week later, along with new puzzles.

To save a click, here’s the puzzle:

Your baby is learning to walk. She begins standing, holding onto the couch. There’s a 25 percent chance she’ll take a step forward, and a 75 percent chance she’ll stay put, clutching the couch. If she’s ever a step or more away from the couch, there’s a 25 percent chance she’ll take another step forward, a 25 percent chance she’ll stay where she is, and a 50 percent chance she’ll take a step back, toward the couch. In the long run, what percentage of the time does the baby clutch the couch?

Required libraries

library(ggplot2)
library(ggthemes)
library(dplyr)

Simulation function

The following function simulates the baby’s behaviour after a given number of steps and outputs a dataframe containing the distance from the couch and the proporion of time the baby has touched the couch after taking a number of steps.

babyWalk <- function(steps) {
    dist <- 0
    count <- 0
    prop_vec <- c()
    dist_vec <- c()
    for (i in 1:steps) {
        if (dist == 0) {
            x = sample(1:4, 1)
            
            if (x == 1) {
                dist <- dist + 1
            } else {
                count <- count + 1
            }
        } else {
            x = sample(1:4, 1)
            
            if (x == 1) {
                dist <- dist + 1
            }
            
            if (x %in% 2:3) {
                dist <- dist - 1
            }
            
            if (dist == 0) {
                count <- count + 1
            }
        }
        prop_vec = c(prop_vec, count/i)
        dist_vec = c(dist_vec, dist)
    }
    df <- data.frame(Distance = dist_vec, Proportion = prop_vec)
    df$Steps <- as.numeric(rownames(df))
    df
}

Now let’s use this function to simulate one baby taking 10,000 steps and graph it.

steps <- 10000
baby <- babyWalk(steps)

ggplot(baby, aes(x = Steps, y = Proportion)) + 
  geom_point(alpha = 0.2) + 
  theme_minimal() +
  labs(x = 'Steps Taken', y = 'Proportion of Time Touching the Couch')

Looks like the answer is 0.5. This is true by the maths!

Now let’s consider a population of babies. We simulate 500 babies taking 50 steps and use dplyr to calculate the percentage which are touching the couch at a particular step.

babyWalksDf <- data.frame(Distance = numeric(0), Proportion = numeric(0), Steps = numeric(0))

steps <- 50
num_babies <- 500

for (i in 1:num_babies){
  df <- babyWalk(steps)
  df$Baby_id <- i
  babyWalksDf <- bind_rows(babyWalksDf,df)
}

babyWalksDf <- babyWalksDf %>%
  group_by(Steps) %>%
  mutate(Percentage = (sum(Distance == 0)/num_babies)*100) %>%
  select(-Baby_id) %>%
  distinct(.keep_all = T)

And now we graph again. As expected, the percentage approaches 50%.

ggplot(babyWalksDf, aes(x = Steps , y = Percentage)) + 
  geom_point(alpha = 0.6) + 
  geom_smooth(method = 'auto') +
  theme_minimal() +
  labs(x = 'Steps Taken', y = 'Percentage of Babies Touching the Couch')