Home
/index.xml
Recent content on HomeHugo -- gohugo.ioen-usVictor BibleSun, 21 Jan 2018 00:00:00 +0000Studies in Two Factor Authentication
/2018/01/21/2018-01-21-studies-in-two-factor-authentication/
Sun, 21 Jan 2018 00:00:00 +0000/2018/01/21/2018-01-21-studies-in-two-factor-authentication/<pre><code class="language-python">from IPython.display import Image
import warnings
warnings.filterwarnings('ignore')
Image("puzzle1.png")
</code></pre>
<p><img src="studies_in_two_factor_auth_files/studies_in_two_factor_auth_0_0.png" alt="png" /></p>
<p>The above is the image accompanying this week’s (very fun!) Riddler puzzle 1.
<a href="https://fivethirtyeight.com/features/can-you-stay-awake-for-50-hours-and-solve-150-puzzles/">https://fivethirtyeight.com/features/can-you-stay-awake-for-50-hours-and-solve-150-puzzles/</a></p>
<p>It came with this clue:</p>
<p><em>Puzzle 1: Studies in two-factor authentication</em></p>
<p><em>Ugh! Dad says the computer will hurt my eyes, but I doubt that’s his prime concern. Time to see what requires such complex security.</em></p>
<p>At first glance the picture is just thousands of black dots on a white background. On closer inspectin there’s a few coloured dots. A bit of googling stuff around complex primes (from the words “prime” and “complex” in the hint) lead me to reading about the Gaussian integers: <a href="https://www.wikiwand.com/en/Gaussian_integer">https://www.wikiwand.com/en/Gaussian_integer</a>.</p>
<p>By counting pixels, I figured out that this is an image of the positive Gaussian primes in black along with some Gaussian integers (non prime) coloured in. The “two-factor” part of the title led to the idea of examining the factors of the coloured numbers.</p>
<pre><code class="language-python">import numpy as np
import matplotlib.pyplot as plt
from skimage.io import imread
from utils import *
img = imread("puzzle1.png")
</code></pre>
<p>First we find out what are the pixel values in the image.</p>
<pre><code class="language-python">unique_points = []
for _ in list(list(_) for _ in img.reshape(250*250,3)):
if _ not in unique_points:
unique_points.append(_)
unique_points
</code></pre>
<pre><code>[[255, 255, 255],
[0, 0, 0],
[0, 0, 255],
[204, 204, 0],
[255, 0, 0],
[0, 255, 0],
[127, 0, 204],
[255, 153, 25]]
</code></pre>
<p>The first two are white and black. Using <a href="https://www.rapidtables.com/web/color/RGB_Color.html">https://www.rapidtables.com/web/color/RGB_Color.html</a> we get that the others are the six colours in the picture - red, blue, green, orange, yellow and purple.</p>
<pre><code class="language-python">blue = [0,0,255]
green = [0,255,0]
red = [255,0,0]
purple = [127,0,204]
yellow = [204, 204, 0]
orange = [255, 153, 25]
</code></pre>
<pre><code class="language-python">blue_pixels = col_pixels(img, blue)
green_pixels = col_pixels(img, green)
red_pixels = col_pixels(img, red)
purple_pixels = col_pixels(img, purple)
yellow_pixels = col_pixels(img, yellow)
orange_pixels = col_pixels(img, orange)
</code></pre>
<pre><code class="language-python">print("Red pixels:", red_pixels)
print("Green pixels:", green_pixels)
print("Blue pixels:", blue_pixels)
print("Yellow pixels:", yellow_pixels)
print("Purple pixels:", purple_pixels)
print("Orange pixels:", orange_pixels)
</code></pre>
<pre><code>Red pixels: [[124, 46], [146, 58], [162, 68], [169, 43], [184, 76]]
Green pixels: [[213, 31], [215, 19], [225, 11]]
Blue pixels: [[71, 57], [140, 60], [145, 15], [191, 33]]
Yellow pixels: [[109, 47], [149, 79]]
Purple pixels: [[237, 1], [243, 9]]
Orange pixels: [[238, 10], [242, 4], [246, 6], [246, 12]]
</code></pre>
<p>By closely examining the picture we see that these values are correct.</p>
<p>If these are complex numbers we’ll need to convert the pixels into Cartesian coordinates.</p>
<pre><code class="language-python">orange_complex = [pixel_to_complex(_) for _ in orange_pixels]
purple_complex = [pixel_to_complex(_) for _ in purple_pixels]
red_complex = [pixel_to_complex(_) for _ in red_pixels]
yellow_complex = [pixel_to_complex(_) for _ in yellow_pixels]
green_complex = [pixel_to_complex(_) for _ in green_pixels]
blue_complex = [pixel_to_complex(_) for _ in blue_pixels]
</code></pre>
<pre><code class="language-python">print("Orange numbers:", orange_complex)
print("Purple numbers:", purple_complex)
print("Red numbers:", red_complex)
print("Blue numbers:", blue_complex)
print("Yellow numbers:", yellow_complex)
print("Green numbers:", green_complex)
</code></pre>
<pre><code>Orange numbers: [array([ 10.+11.j]), array([ 4.+7.j]), array([ 6.+3.j]), array([ 12.+3.j])]
Purple numbers: [array([ 1.+12.j]), array([ 9.+6.j])]
Red numbers: [array([ 46.+125.j]), array([ 58.+103.j]), array([ 68.+87.j]), array([ 43.+80.j]), array([ 76.+65.j])]
Blue numbers: [array([ 57.+178.j]), array([ 60.+109.j]), array([ 15.+104.j]), array([ 33.+58.j])]
Yellow numbers: [array([ 47.+140.j]), array([ 79.+100.j])]
Green numbers: [array([ 31.+36.j]), array([ 19.+34.j]), array([ 11.+24.j])]
</code></pre>
<p>Next I factorised all the above numbers using <a href="https://www.alpertron.com.ar/GAUSSIAN.HTM">https://www.alpertron.com.ar/GAUSSIAN.HTM</a>. All of the numbers have two or three prime factors.</p>
<p>With a bit of trial and error on pen and paper I noticed by drawing lines between prime factors for some colours I could see letters.</p>
<pre><code class="language-python">red_factors = np.array([[8 + 7j, 11 + 6j],
[8 + 5j, 11 + 6j],
[8 + 5j, 11 + 4j],
[8 + 3j, 8 + 7j],
[8 + 3j, 11 + 4j]])
see_factors(red_factors, red)
</code></pre>
<p><img src="studies_in_two_factor_auth_files/studies_in_two_factor_auth_14_0.png" alt="png" /></p>
<p>The colours which had numbers with only two prime factors represented letters. The others didn’t look so good. After messing around a bit by multiplying together two of the prime factors for the numbers which had three, I arrived at the following list.</p>
<pre><code class="language-python">orange_factors = np.array([[3 + 2j, 4 + 1j],
[2 + 1j, 3 + 2j],
[3, 2 + 1j],
[3, 4 + 1j]])
yellow_factors = np.array([[8 + 7j, 12 + 7j],
[10 + 3j, 10 + 7j]])
green_factors = np.array([[6 + 1j, 6 + 5j],
[4 + 1j, 4 + 5j],
[6 + 1j, 4 + 5j]])
blue_factors = np.array([[10 + 9j, 12 + 7j],
[8 + 7j, 11 + 4j],
[6 + 5j, 10 + 9j],
[6 + 5j, 8 + 3j]])
purple_factors = np.array([[3, 3+2j],
[1 + 2j, 5 + 2j]])
</code></pre>
<p>And here’s what they look like:</p>
<pre><code class="language-python">see_factors(orange_factors, orange)
</code></pre>
<p><img src="studies_in_two_factor_auth_files/studies_in_two_factor_auth_18_0.png" alt="png" /></p>
<pre><code class="language-python">see_factors(yellow_factors, yellow)
</code></pre>
<p><img src="studies_in_two_factor_auth_files/studies_in_two_factor_auth_19_0.png" alt="png" /></p>
<pre><code class="language-python">see_factors(green_factors, green)
</code></pre>
<p><img src="studies_in_two_factor_auth_files/studies_in_two_factor_auth_20_0.png" alt="png" /></p>
<pre><code class="language-python">see_factors(blue_factors, blue)
</code></pre>
<p><img src="studies_in_two_factor_auth_files/studies_in_two_factor_auth_21_0.png" alt="png" /></p>
<pre><code class="language-python">see_factors(purple_factors, purple)
</code></pre>
<p><img src="studies_in_two_factor_auth_files/studies_in_two_factor_auth_22_0.png" alt="png" /></p>
<p>So by going in rainbow order (ROYGBIV), or using an anagram solver, we get the answer of BOTNET.</p>
<p>Neat!</p>
The Riddler - Squishyball
/puzzles/the-riddler-squishyball/
Fri, 14 Jul 2017 00:00:00 +0000/puzzles/the-riddler-squishyball/<p>This is the solution to this week’s <a href="https://fivethirtyeight.com/features/can-you-eat-more-pizza-than-your-siblings/">Riddler Classic.</a></p>
<p>Suppose we have an series that lasts <span class="math inline">\(n\)</span> games, where <span class="math inline">\(n\)</span> is some odd number between 1 and 99. Let <span class="math inline">\(k = (n+1)/2\)</span>, so <span class="math inline">\(k\)</span> is the number of games we need to win to win the series. Then the likelihood of winning the series <span class="math inline">\(k : 0\)</span> is the probability of one victory repeated <span class="math inline">\(k\)</span> times, that is <span class="math inline">\(\left( \frac{6}{10} \right) ^ k\)</span>.</p>
<p>The likelihood of winning the series <span class="math inline">\(k : 1\)</span> is the probability of <span class="math inline">\(k\)</span> victories and one loss times the number of such games, k. <span class="math inline">\(\left( \frac{6}{10} \right) ^ k \times \frac{4}{10} \times k\)</span>.</p>
<p>The likelihood of winning the series <span class="math inline">\(k : 2\)</span> is the probability of <span class="math inline">\(k\)</span> victories and 2 losses times the number of such games, $k 2 $. <span class="math inline">\(\left( \frac{6}{10} \right) ^ k \times \frac{4}{10} \times {{k + 1}\choose{2}}\)</span>.</p>
<p>The likelihood of winning the series <span class="math inline">\(k : m\)</span> is the probability of <span class="math inline">\(k\)</span> victories and <span class="math inline">\(m\)</span> losses times the number of such games, $k m $. That is, $( ) ^ k ( )^{m} $.</p>
<p>The expected winnings in a series of length <span class="math inline">\(n\)</span> is then $(1000000 - 10000k) _{m = 0}^{k} $</p>
The Riddler - Who Steals Most?
/puzzles/the-riddler-who-steals-most/
Sun, 02 Jul 2017 00:00:00 +0000/puzzles/the-riddler-who-steals-most/<p>This is a solution to this week’s <a href="https://fivethirtyeight.com/features/who-steals-the-most-in-a-town-full-of-thieves/">Riddler Classic.</a></p>
<p><em>A town of 1,000 households has a strange law intended to prevent wealth-hoarding. On January 1 of every year, each household robs one other household, selected at random, moving all of that house’s money into their own house. The order in which the robberies take place is also random and is determined by a lottery. (Note that if House A robs House B first, and then C robs A, the houses of A and B would each be empty and C would have acquired the resources of both A and B.)</em></p>
<p><em>Two questions about this fateful day:</em></p>
<p><em>1. What is the probability that a house is not robbed over the course of the day?</em></p>
<p><em>2. Suppose that every house has the same amount of cash to begin with — say $100. Which position in the lottery has the most expected cash at the end of the day, and what is that amount?</em></p>
<p>For the first question, the probability that house <span class="math inline">\(m\)</span> draws house <span class="math inline">\(n\)</span> is <span class="math inline">\(\frac{1}{999}\)</span>. Write this as <span class="math inline">\(P_{m,n} = \frac{1}{999}\)</span>. The probability house <span class="math inline">\(n\)</span> is not robbed by any house is then <span class="math display">\[\prod_{m \in \{1, \dots 1000 \} \backslash \{ n \}} (1 - P_{m,n}) = \prod_{m \in \{1, \dots 1000 \} \backslash \{ n \}} \frac{998}{999} = \left( \frac{998}{999} \right)^{999} = 0.3678.\]</span></p>
<p>Onto the second part. Consider the house in position <span class="math inline">\(n\)</span>. One of three things can happen.</p>
<ol style="list-style-type: decimal">
<li>The house never gets robbed.</li>
<li>The house gets robbed after their turn.</li>
<li>The house gets robbed before their turn and not after.</li>
</ol>
<p>Let these events have probabilites <span class="math inline">\(P_1, P_2, P_3\)</span> and expected values <span class="math inline">\(E_1, E_2, E_3\)</span> respectively.</p>
<p>We’ve seen that <span class="math inline">\(P_1 = 0.368\)</span>. Since the mean of all houses that house <span class="math inline">\(n\)</span> can steal from is <span class="math inline">\(\$100\)</span>, <span class="math inline">\(E_1 = \$200\)</span>.</p>
<p>It’s not hard to see that <span class="math inline">\(P_2 = 1 - (998/999)^{1000-n}\)</span>. In this event the house ends with no money so <span class="math inline">\(E_2 = 0\)</span>.</p>
<p>And similarly <span class="math inline">\(P_3 = (1 - (998/999))^{n-1}(998/999)^{1000-n}\)</span>. Here the expected value of all other houses is <span class="math inline">\(\frac{100000}{999}\)</span>, so $E_3 = 100.1001001. $</p>
<p>Putting all this together, <span class="math inline">\(E_n = P_1 E_1 + P_2 E_2 + P_3 E_3 = 0.368 \times 200 + 0 + (1 - (998/999))^{n-1}(998/999)^{1000-n} \times 100.1001001.\)</span> This increases strictly as <span class="math inline">\(n\)</span> increases, so the final house in the lottery has the most expected cash and the amount is <span class="math inline">\(\$136.83282\)</span>.</p>
<p>I also ran a <a href="https://github.com/vbible/the-riddler/blob/master/house-lottery/houseLottery.R">simulation</a> and graphed it with <span class="math inline">\(E_n\)</span> as the red line.</p>
<p><img src="/puzzles/2017-07-02-the-riddler-who-steals-most_files/figure-html/unnamed-chunk-1-1.png" width="672" /></p>
The Riddler- Who Steals Most?
/2017/07/02/2017-07-02-the-riddler-who-steals-most/
Sun, 02 Jul 2017 00:00:00 +0000/2017/07/02/2017-07-02-the-riddler-who-steals-most/<p>This is a solution to this week’s <a href="https://fivethirtyeight.com/features/who-steals-the-most-in-a-town-full-of-thieves/">Riddler Classic.</a></p>
<p><em>A town of 1,000 households has a strange law intended to prevent wealth-hoarding. On January 1 of every year, each household robs one other household, selected at random, moving all of that house’s money into their own house. The order in which the robberies take place is also random and is determined by a lottery. (Note that if House A robs House B first, and then C robs A, the houses of A and B would each be empty and C would have acquired the resources of both A and B.)</em></p>
<p><em>Two questions about this fateful day:</em></p>
<p><em>1. What is the probability that a house is not robbed over the course of the day?</em></p>
<p><em>2. Suppose that every house has the same amount of cash to begin with — say $100. Which position in the lottery has the most expected cash at the end of the day, and what is that amount?</em></p>
<p>For the first question, the probability that house <span class="math inline">\(m\)</span> draws house <span class="math inline">\(n\)</span> is <span class="math inline">\(\frac{1}{999}\)</span>. Write this as <span class="math inline">\(P_{m,n} = \frac{1}{999}\)</span>. The probability house <span class="math inline">\(n\)</span> is not robbed by any house is then <span class="math display">\[\prod_{m \in \{1, \dots 1000 \} \backslash \{ n \}} (1 - P_{m,n}) = \prod_{m \in \{1, \dots 1000 \} \backslash \{ n \}} \frac{998}{999} = \left( \frac{998}{999} \right)^{999} = 0.3678.\]</span></p>
<p>Onto the second part. Consider the house in position <span class="math inline">\(n\)</span>. One of three things can happen.</p>
<ol style="list-style-type: decimal">
<li>The house never gets robbed.</li>
<li>The house gets robbed after their turn.</li>
<li>The house gets robbed before their turn and not after.</li>
</ol>
<p>Let these events have probabilites <span class="math inline">\(P_1, P_2, P_3\)</span> and expected values <span class="math inline">\(E_1, E_2, E_3\)</span> respectively.</p>
<p>We’ve seen that <span class="math inline">\(P_1 = 0.368\)</span>. Since the mean of all houses that house <span class="math inline">\(n\)</span> can steal from is <span class="math inline">\(\$100\)</span>, <span class="math inline">\(E_1 = \$200\)</span>.</p>
<p>It’s not hard to see that <span class="math inline">\(P_2 = 1 - (998/999)^{1000-n}\)</span>. In this event the house ends with no money so <span class="math inline">\(E_2 = 0\)</span>.</p>
<p>And similarly <span class="math inline">\(P_3 = (1 - (998/999))^{n-1}(998/999)^{1000-n}\)</span>. Here the expected value of all other houses is <span class="math inline">\(\frac{100000}{999}\)</span>, so $E_3 = 100.1001001. $</p>
<p>Putting all this together, <span class="math inline">\(E_n = P_1 E_1 + P_2 E_2 + P_3 E_3 = 0.368 \times 200 + 0 + (1 - (998/999))^{n-1}(998/999)^{1000-n} \times 100.1001001.\)</span> This increases strictly as <span class="math inline">\(n\)</span> increases, so the final house in the lottery has the most expected cash and the amount is <span class="math inline">\(\$136.83282\)</span>.</p>
<p>I also ran a <a href="https://github.com/vbible/the-riddler/blob/master/house-lottery/houseLottery.R">simulation</a> and graphed it with <span class="math inline">\(E_n\)</span> as the red line.</p>
<p><img src="/post/2017-07-02-the-riddler-who-steals-most_files/figure-html/unnamed-chunk-1-1.png" width="672" /></p>
The Riddler - Careening Commute
/2017/05/27/2017-05-27-the-riddler-careening-commute/
Sat, 27 May 2017 00:00:00 +0000/2017/05/27/2017-05-27-the-riddler-careening-commute/<p><em>Four co-workers carpool to work each day. A driver is selected randomly for the drive to work and again randomly for the drive home. Each of the drivers has a lead foot, and each has a chance of being ticketed for speeding. Driver A has a 10 percent chance of getting a ticket each time he drives, Driver B a 15 percent chance, Driver C a 20 percent chance, and Driver D a 25 percent chance. The state will immediately revoke the license of a driver after his or her third ticket, and a driver will stop driving in the carpool once his license is revoked. Since there is only one police officer on the carpool route, a maximum of one ticket will be issued per morning and a max of one per evening.</em></p>
<p><em>Assuming that all four drivers start with no tickets, how many days can we expect the carpool to last until all the drivers have lost their licenses?</em></p>
<p>Just a simulation from me this week. I’m sure there’s a nice analytical approach.</p>
<pre class="r"><code>sim <- function(){
drivers <- 1:4
tickets <- replicate(4, 0)
n <- 0
while(length(drivers) > 0){
if (length(drivers) == 1){
driver = drivers
}
else{
driver <- sample(drivers, 1)
}
x <- runif(1)
if (x < (0.1 + (driver - 1)*0.05) ) {
tickets[driver] = tickets[driver] + 1
}
if (tickets[driver] == 3) {
drivers <- drivers[drivers != driver]
}
n = n + 1
}
return(n)
}
v <- c()
for (i in 1:10000){
v <- c(v, sim())
}
mean(v)</code></pre>
<pre><code>## [1] 76.9298</code></pre>
<p>So the carpool should end on the 38th day.</p>
<p>And here’s a histogram.</p>
<pre class="r"><code>library(ggplot2)
library(ggthemes)
df <- as.data.frame(v)
ggplot(df, aes(v)) +
geom_histogram(bins=40) +
theme_fivethirtyeight()</code></pre>
<p><img src="/post/2017-05-27-the-riddler-careening-commute_files/figure-html/unnamed-chunk-2-1.png" width="672" /></p>
The Riddler - Careening Commute
/puzzles/the-riddler-careening-commute/
Sat, 27 May 2017 00:00:00 +0000/puzzles/the-riddler-careening-commute/<p><em>Four co-workers carpool to work each day. A driver is selected randomly for the drive to work and again randomly for the drive home. Each of the drivers has a lead foot, and each has a chance of being ticketed for speeding. Driver A has a 10 percent chance of getting a ticket each time he drives, Driver B a 15 percent chance, Driver C a 20 percent chance, and Driver D a 25 percent chance. The state will immediately revoke the license of a driver after his or her third ticket, and a driver will stop driving in the carpool once his license is revoked. Since there is only one police officer on the carpool route, a maximum of one ticket will be issued per morning and a max of one per evening.</em></p>
<p><em>Assuming that all four drivers start with no tickets, how many days can we expect the carpool to last until all the drivers have lost their licenses?</em></p>
<p>Just a simulation from me this week. I’m sure there’s a nice analytical approach.</p>
<pre class="r"><code>sim <- function(){
drivers <- 1:4
tickets <- replicate(4, 0)
n <- 0
while(length(drivers) > 0){
if (length(drivers) == 1){
driver = drivers
}
else{
driver <- sample(drivers, 1)
}
x <- runif(1)
if (x < (0.1 + (driver - 1)*0.05) ) {
tickets[driver] = tickets[driver] + 1
}
if (tickets[driver] == 3) {
drivers <- drivers[drivers != driver]
}
n = n + 1
}
return(n)
}
v <- c()
for (i in 1:10000){
v <- c(v, sim())
}
mean(v)</code></pre>
<pre><code>## [1] 76.9298</code></pre>
<p>So the carpool should end on the 38th day.</p>
<p>And here’s a histogram.</p>
<pre class="r"><code>library(ggplot2)
library(ggthemes)
df <- as.data.frame(v)
ggplot(df, aes(v)) +
geom_histogram(bins=40) +
theme_fivethirtyeight()</code></pre>
<p><img src="/post/2017-05-27-the-riddler-careening-commute_files/figure-html/unnamed-chunk-2-1.png" width="672" /></p>
The Riddler - Who Will Win The Lucky Derby?
/2017/05/08/2017-05-06-the-riddle-who-will-win-the-lucky-derby/
Mon, 08 May 2017 00:00:00 +0000/2017/05/08/2017-05-06-the-riddle-who-will-win-the-lucky-derby/<script src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"></script>
<p><em>The Kentucky Derby is on Saturday, and a field of 20 horses is slated to run “the fastest two minutes in sports” in pursuit of the right to be draped with a blanket of roses. But let’s consider, instead, the Lucky Derby, where things are a little more bizarre:</em></p>
<p><em>The bugle sounds, and 20 horses make their way to the starting gate for the first annual Lucky Derby. These horses, all trained at the mysterious Riddler Stables, are special. Each second, every Riddler-trained horse takes one step. Each step is exactly one meter long. But what these horses exhibit in precision, they lack in sense of direction. Most of the time, their steps are forward (toward the finish line) but the rest of the time they are backward (away from the finish line). As an avid fan of the Lucky Derby, you’ve done exhaustive research on these 20 competitors. You know that Horse One goes forward 52 percent of the time, Horse Two 54 percent of the time, Horse Three 56 percent, and so on, up to the favorite filly, Horse Twenty, who steps forward 90 percent of the time. The horses’ steps are taken independently of one another, and the finish line is 200 meters from the starting gate.</em></p>
<p><em>Handicap this race and place your bets! In other words, what are the odds (a percentage is fine) that each horse wins?</em></p>
<p><em>Extra credit: Animate this derby. I’ll broadcast the highlights next week.</em></p>
<p>This week I only had time to do the extra credit.</p>
<pre class="r"><code>library(tidyverse)
library(gganimate)
horses <- 1:20
f <- function(horse){
weights = c(0.5 + 0.02*horse, 0.5 - 0.02*horse)
step = sample(c(1,-1), 1, prob = weights)
return(step)
}
raceDf <- function(){
pos <- replicate(20,0)
race <- list(pos)
while (max(unlist(race)) < 200){
step <- c()
for (horse in horses){
step <- c(step,f(horse))
}
pos <- pos + step
race <- c(race,list(pos))
}
names(race) <- 0:(length(race)-1)
return(data.frame(race))
}</code></pre>
<p>Here I simulate a race and plot it.</p>
<pre class="r"><code>race <- raceDf()
race <- (data.frame(t(race)))
colnames(race) <- 1:20
rownames(race) <- 1:length(rownames(race))
race$Step <- as.numeric(rownames(race))
raceGathered <- gather(race, key = 'horse', value ='Distance', -Step)
ggplot(raceGathered, aes(x = Step, y = Distance, group = factor(horse, levels = as.numeric(horse)),
color = factor(horse, levels = as.numeric(horse) ) )) +
geom_line() +
theme(legend.position = 'bottom') +
guides(colour = guide_legend(nrow = 2, title = 'Horse')) </code></pre>
<p><img src="/post/2017-05-06-the-riddle-who-will-win-the-lucky-derby_files/figure-html/unnamed-chunk-2-1.png" width="672" /></p>
<p>And we can also animate the race. <a href="https://www.youtube.com/watch?v=7m1h0Hf5uMs">Go banana!</a></p>
<pre class="r"><code>p <- ggplot(raceGathered, aes(x = factor(horse, levels = as.numeric(horse)),
y = Distance,
frame = Step,
color = factor(horse, levels = as.numeric(horse)))) +
geom_point(size = 5) +
theme(legend.position = 'none') +
labs(x = 'Horse')#, y = 'Position')
gganimate(p, interval = 0.1)</code></pre>
<div class="figure">
<img src="race.gif" />
</div>
The Riddler - Who Will Win The Lucky Derby?
/puzzles/the-riddler-who-will-win-the-lucky-derby/
Mon, 08 May 2017 00:00:00 +0000/puzzles/the-riddler-who-will-win-the-lucky-derby/<script src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"></script>
<p><em>The Kentucky Derby is on Saturday, and a field of 20 horses is slated to run “the fastest two minutes in sports” in pursuit of the right to be draped with a blanket of roses. But let’s consider, instead, the Lucky Derby, where things are a little more bizarre:</em></p>
<p><em>The bugle sounds, and 20 horses make their way to the starting gate for the first annual Lucky Derby. These horses, all trained at the mysterious Riddler Stables, are special. Each second, every Riddler-trained horse takes one step. Each step is exactly one meter long. But what these horses exhibit in precision, they lack in sense of direction. Most of the time, their steps are forward (toward the finish line) but the rest of the time they are backward (away from the finish line). As an avid fan of the Lucky Derby, you’ve done exhaustive research on these 20 competitors. You know that Horse One goes forward 52 percent of the time, Horse Two 54 percent of the time, Horse Three 56 percent, and so on, up to the favorite filly, Horse Twenty, who steps forward 90 percent of the time. The horses’ steps are taken independently of one another, and the finish line is 200 meters from the starting gate.</em></p>
<p><em>Handicap this race and place your bets! In other words, what are the odds (a percentage is fine) that each horse wins?</em></p>
<p><em>Extra credit: Animate this derby. I’ll broadcast the highlights next week.</em></p>
<p>This week I only had time to do the extra credit.</p>
<pre class="r"><code>library(tidyverse)
library(gganimate)
horses <- 1:20
f <- function(horse){
weights = c(0.5 + 0.02*horse, 0.5 - 0.02*horse)
step = sample(c(1,-1), 1, prob = weights)
return(step)
}
raceDf <- function(){
pos <- replicate(20,0)
race <- list(pos)
while (max(unlist(race)) < 200){
step <- c()
for (horse in horses){
step <- c(step,f(horse))
}
pos <- pos + step
race <- c(race,list(pos))
}
names(race) <- 0:(length(race)-1)
return(data.frame(race))
}</code></pre>
<p>Here I simulate a race and plot it.</p>
<pre class="r"><code>race <- raceDf()
race <- (data.frame(t(race)))
colnames(race) <- 1:20
rownames(race) <- 1:length(rownames(race))
race$Step <- as.numeric(rownames(race))
raceGathered <- gather(race, key = 'horse', value ='Distance', -Step)
ggplot(raceGathered, aes(x = Step, y = Distance, group = factor(horse, levels = as.numeric(horse)),
color = factor(horse, levels = as.numeric(horse) ) )) +
geom_line() +
theme(legend.position = 'bottom') +
guides(colour = guide_legend(nrow = 2, title = 'Horse')) </code></pre>
<p><img src="/post/2017-05-06-the-riddle-who-will-win-the-lucky-derby_files/figure-html/unnamed-chunk-2-1.png" width="672" /></p>
<p>And we can also animate the race. <a href="https://www.youtube.com/watch?v=7m1h0Hf5uMs">Go banana!</a></p>
<pre class="r"><code>p <- ggplot(raceGathered, aes(x = factor(horse, levels = as.numeric(horse)),
y = Distance,
frame = Step,
color = factor(horse, levels = as.numeric(horse)))) +
geom_point(size = 5) +
theme(legend.position = 'none') +
labs(x = 'Horse')#, y = 'Position')
gganimate(p, interval = 0.1)</code></pre>
<div class="figure">
<img src="race.gif" />
</div>
The Riddler - Painting Balls
/2017/04/30/2017-04-30-the-riddler-painting-balls/
Sun, 30 Apr 2017 00:00:00 +0000/2017/04/30/2017-04-30-the-riddler-painting-balls/<script src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"></script>
<p>This is a solution to the Riddler Classic from <a href="https://fivethirtyeight.com/features/can-you-solve-these-colorful-puzzles/">here.</a></p>
<p><em>You play a game with four balls: One ball is red, one is blue, one is green and one is yellow. They are placed in a box. You draw a ball out of the box at random and note its color. Without replacing the first ball, you draw a second ball and then paint it to match the color of the first. Replace both balls, and repeat the process. The game ends when all four balls have become the same color. What is the expected number of turns to finish the game? </em></p>
<p><em>Extra credit: What if there are more balls and more colors? </em></p>
<p>It was straightforward enough to get an answer by computation. Here we define a function sim that simulates the process and returns how many turns it takes. Then we take the mean of 10,000 results.</p>
<pre class="r"><code>library(tidyverse)
library(ggthemes)</code></pre>
<pre class="r"><code>sim <- function(m){
colours <- 1:m
n = 1
while (n <= 10000){
pick1 <- sample(colours, 1)
i1 <- which(colours == pick1)[1]
pick2 <- sample(colours[-i1], 1)
i2 <- which(colours == pick2)[1]
colours[i2] <- pick1
if (length(unique(colours)) == 1){
return(n)
}
n = n + 1
}
}
vec <- c()
for (j in 1:10000){
vec <- c(vec, sim(4))
}
print(mean(vec))</code></pre>
<pre><code>## [1] 8.9968</code></pre>
<p>Looks like the answer is 9.</p>
<p>We can approach this more analytically. Without loss of generality, let <span class="math inline">\(P_{n,m}\)</span> be the probability that there are <span class="math inline">\(n\)</span> red balls after <span class="math inline">\(m\)</span> turns. So the probabilty the game ends after <span class="math inline">\(m\)</span> turns is <span class="math inline">\(4 P_{4,m}\)</span> and the expected number of turns is <span class="math display">\[E = \sum_{m=0}^{\infty} m \cdot 4 P_{4,m}. \]</span></p>
<p>So we need to get <span class="math inline">\(P_{4,m}\)</span>. For there to be <span class="math inline">\(4\)</span> red balls on turn <span class="math inline">\(m\)</span>, three things must occur:</p>
<ul>
<li>There must have been <span class="math inline">\(3\)</span> red balls on turn <span class="math inline">\(m-1\)</span>.</li>
<li>One of red balls was chosen in the first draw on turn <span class="math inline">\(m-1\)</span>.</li>
<li>The last non red ball was chosen in the second draw on turn <span class="math inline">\(m-1\)</span>.</li>
</ul>
<p>In probability terms we can say, <span class="math display">\[P_{4,m} = P(\text{Red in first draw}) P(\text{Non red in second draw}) P_{3,m-1}.\]</span> Some thought leads to <span class="math inline">\(P(\text{Red in first draw}) = \frac{3}{4}\)</span> and <span class="math inline">\(P(\text{Non red in second draw}) = \frac{1}{3}.\)</span> Therefore, <span class="math inline">\(P_{4,m} = \frac{1}{4} P_{3,m-1}.\)</span></p>
<p>So the problem reduces to getting <span class="math inline">\(P_{3,m}.\)</span> By a similar rationale to above, we see <span class="math display">\[P_{3,m} = P(\text{Red in first draw}) P(\text{Non red in second draw}) P_{2,m-1} + \]</span> <span class="math display">\[P(\text{Red in first draw}) P(\text{Red in second draw}) P_{3,m-1}.\]</span> And working out these probabilities yields <span class="math display">\[P_{3,m} = \frac{1}{3}P_{2,m-1} + \frac{1}{2}P_{3,m-1}.\]</span> Similar arguments result in <span class="math display">\[P_{2,m} = \frac{1}{4}P_{1,m-1} + \frac{1}{3}P_{2,m-1} \frac{1}{4}P_{3,m-1}\]</span> and <span class="math display">\[P_{1,m-1} = \frac{1}{2}P_{1,m-1} + \frac{1}{3}P_{2,m-1}.\]</span></p>
<p>This is a two dimensional recurrence relation, with the initial conditions <span class="math inline">\(P_{1,0} = 1\)</span> and <span class="math inline">\(P_{n,m} = 0\)</span> for <span class="math inline">\(n > m+1\)</span>. The best strategy I know is turn this into a one dimensional recurrence relation. To this end, I wrote up the relation as a function in R.</p>
<pre class="r"><code>f <- function(n, m) {
if (n == 1 & m == 0){
return (1)
}
if (n > m + 1){
return (0)
}
else{
if (n == 4){
return (1/4*f(3, m-1))
}
if (n == 3){
return (1/3*f(2, m-1) + 1/2*f(3, m-1))
}
if (n == 2){
return (1/4*f(1,m-1) + 1/3*f(2,m-1) + 1/4*f(3,m-1))
}
if (n == 1){
return (1/2*f(1,m-1) + 1/3*f(2,m-1))
}
}
}</code></pre>
<p>As a sanity check, the above simulation function should provide approximations to this function. Let’s see if it does. We plot the simulations as coloumns and the function is the red line.</p>
<p><img src="/puzzles/2017-04-30-the-riddler-painting-balls_files/figure-html/unnamed-chunk-5-1.png" width="672" /></p>
<p>Looks good!</p>
<p>After a lot of fiddling around with fractions and getting some inspiration using the above function, I arrived at <span class="math inline">\(P_{2,m} = \frac{5^{m-1}}{4\cdot 6^{m-1}}\)</span>. This leads to the one dimensional reccurance relation <span class="math display">\[P_{3,m} = \frac{5^{m-1}}{2^m 3^{m-1}} + \frac{1}{2} P_{3,m-1}. \]</span></p>
<p>Skipping a bunch of maths, we arrive at the explicit solution <span class="math display">\[P_{3,m} =\frac{1}{2^{m+1}} \left( \left(\frac{5}{3}\right)^{m-1} - 1 \right). \]</span></p>
<p>So the expected number of turns is <span class="math display">\[E = \sum_{m=2}^{\infty} \frac{m+1}{2^{m+1}} \left( \left(\frac{5}{3}\right)^{m-1} - 1 \right).\]</span></p>
<p>Further calculations (or <a href="https://www.wolframalpha.com/input/?i=sum+(m%2B1)+1%2F5+2%5E(-1+-+m)+(-5+%2B+3%5E(1+-+m)+5%5Em),+m+%3D2+to+infinity">cheating</a>) gets us that this series converges to 9.</p>
The Riddler - Painting Balls
/puzzles/the-riddler-painting-balls/
Sun, 30 Apr 2017 00:00:00 +0000/puzzles/the-riddler-painting-balls/<script src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"></script>
<p>This is a solution to the Riddler Classic from <a href="https://fivethirtyeight.com/features/can-you-solve-these-colorful-puzzles/">here.</a></p>
<p><em>You play a game with four balls: One ball is red, one is blue, one is green and one is yellow. They are placed in a box. You draw a ball out of the box at random and note its color. Without replacing the first ball, you draw a second ball and then paint it to match the color of the first. Replace both balls, and repeat the process. The game ends when all four balls have become the same color. What is the expected number of turns to finish the game? </em></p>
<p><em>Extra credit: What if there are more balls and more colors? </em></p>
<p>It was straightforward enough to get an answer by computation. Here we define a function sim that simulates the process and returns how many turns it takes. Then we take the mean of 10,000 results.</p>
<pre class="r"><code>library(tidyverse)
library(ggthemes)</code></pre>
<pre class="r"><code>sim <- function(m){
colours <- 1:m
n = 1
while (n <= 10000){
pick1 <- sample(colours, 1)
i1 <- which(colours == pick1)[1]
pick2 <- sample(colours[-i1], 1)
i2 <- which(colours == pick2)[1]
colours[i2] <- pick1
if (length(unique(colours)) == 1){
return(n)
}
n = n + 1
}
}
vec <- c()
for (j in 1:10000){
vec <- c(vec, sim(4))
}
print(mean(vec))</code></pre>
<pre><code>## [1] 8.9968</code></pre>
<p>Looks like the answer is 9.</p>
<p>We can approach this more analytically. Without loss of generality, let <span class="math inline">\(P_{n,m}\)</span> be the probability that there are <span class="math inline">\(n\)</span> red balls after <span class="math inline">\(m\)</span> turns. So the probabilty the game ends after <span class="math inline">\(m\)</span> turns is <span class="math inline">\(4 P_{4,m}\)</span> and the expected number of turns is <span class="math display">\[E = \sum_{m=0}^{\infty} m \cdot 4 P_{4,m}. \]</span></p>
<p>So we need to get <span class="math inline">\(P_{4,m}\)</span>. For there to be <span class="math inline">\(4\)</span> red balls on turn <span class="math inline">\(m\)</span>, three things must occur:</p>
<ul>
<li>There must have been <span class="math inline">\(3\)</span> red balls on turn <span class="math inline">\(m-1\)</span>.</li>
<li>One of red balls was chosen in the first draw on turn <span class="math inline">\(m-1\)</span>.</li>
<li>The last non red ball was chosen in the second draw on turn <span class="math inline">\(m-1\)</span>.</li>
</ul>
<p>In probability terms we can say, <span class="math display">\[P_{4,m} = P(\text{Red in first draw}) P(\text{Non red in second draw}) P_{3,m-1}.\]</span> Some thought leads to <span class="math inline">\(P(\text{Red in first draw}) = \frac{3}{4}\)</span> and <span class="math inline">\(P(\text{Non red in second draw}) = \frac{1}{3}.\)</span> Therefore, <span class="math inline">\(P_{4,m} = \frac{1}{4} P_{3,m-1}.\)</span></p>
<p>So the problem reduces to getting <span class="math inline">\(P_{3,m}.\)</span> By a similar rationale to above, we see <span class="math display">\[P_{3,m} = P(\text{Red in first draw}) P(\text{Non red in second draw}) P_{2,m-1} + \]</span> <span class="math display">\[P(\text{Red in first draw}) P(\text{Red in second draw}) P_{3,m-1}.\]</span> And working out these probabilities yields <span class="math display">\[P_{3,m} = \frac{1}{3}P_{2,m-1} + \frac{1}{2}P_{3,m-1}.\]</span> Similar arguments result in <span class="math display">\[P_{2,m} = \frac{1}{4}P_{1,m-1} + \frac{1}{3}P_{2,m-1} \frac{1}{4}P_{3,m-1}\]</span> and <span class="math display">\[P_{1,m-1} = \frac{1}{2}P_{1,m-1} + \frac{1}{3}P_{2,m-1}.\]</span></p>
<p>This is a two dimensional recurrence relation, with the initial conditions <span class="math inline">\(P_{1,0} = 1\)</span> and <span class="math inline">\(P_{n,m} = 0\)</span> for <span class="math inline">\(n > m+1\)</span>. The best strategy I know is turn this into a one dimensional recurrence relation. To this end, I wrote up the relation as a function in R.</p>
<pre class="r"><code>f <- function(n, m) {
if (n == 1 & m == 0){
return (1)
}
if (n > m + 1){
return (0)
}
else{
if (n == 4){
return (1/4*f(3, m-1))
}
if (n == 3){
return (1/3*f(2, m-1) + 1/2*f(3, m-1))
}
if (n == 2){
return (1/4*f(1,m-1) + 1/3*f(2,m-1) + 1/4*f(3,m-1))
}
if (n == 1){
return (1/2*f(1,m-1) + 1/3*f(2,m-1))
}
}
}</code></pre>
<p>As a sanity check, the above simulation function should provide approximations to this function. Let’s see if it does. We plot the simulations as coloumns and the function is the red line.</p>
<p><img src="/puzzles/2017-04-30-the-riddler-painting-balls_files/figure-html/unnamed-chunk-5-1.png" width="672" /></p>
<p>Looks good!</p>
<p>After a lot of fiddling around with fractions and getting some inspiration using the above function, I arrived at <span class="math inline">\(P_{2,m} = \frac{5^{m-1}}{4\cdot 6^{m-1}}\)</span>. This leads to the one dimensional reccurance relation <span class="math display">\[P_{3,m} = \frac{5^{m-1}}{2^m 3^{m-1}} + \frac{1}{2} P_{3,m-1}. \]</span></p>
<p>Skipping a bunch of maths, we arrive at the explicit solution <span class="math display">\[P_{3,m} =\frac{1}{2^{m+1}} \left( \left(\frac{5}{3}\right)^{m-1} - 1 \right). \]</span></p>
<p>So the expected number of turns is <span class="math display">\[E = \sum_{m=2}^{\infty} \frac{m+1}{2^{m+1}} \left( \left(\frac{5}{3}\right)^{m-1} - 1 \right).\]</span></p>
<p>Further calculations (or <a href="https://www.wolframalpha.com/input/?i=sum+(m%2B1)+1%2F5+2%5E(-1+-+m)+(-5+%2B+3%5E(1+-+m)+5%5Em),+m+%3D2+to+infinity">cheating</a>) gets us that this series converges to 9.</p>
The Riddler - Supreme Court
/2017/04/16/2017-04-16-supreme-court/
Sun, 16 Apr 2017 00:00:00 +0000/2017/04/16/2017-04-16-supreme-court/<div id="supreme-court-puzzle" class="section level2">
<h2>Supreme Court Puzzle</h2>
<p>This is a simulation in R for the solution to Fivethirtyeight’s Riddler Classic from April 14th 2017, found <a href="https://fivethirtyeight.com/features/how-many-bingo-cards-are-there-in-the-world/">here.</a></p>
<p><em>Imagine that U.S. Supreme Court nominees are only confirmed if the same party holds the presidency and the Senate. What is the expected number of vacancies on the bench in the long run?</em></p>
<p><em>You can assume the following:</em></p>
<ul>
<li><em>You start with an empty, nine-person bench.</em></li>
<li><em>There are two parties, and each has a 50 percent chance of winning the presidency and a 50 percent chance of winning the Senate in each election.</em></li>
<li><em>The outcomes of Senate elections and presidential elections are independent.</em></li>
<li><em>The length of time for which a justice serves is uniformly distributed between zero and 40 years.</em></li>
</ul>
<p>The following code usually gives an answer between 0.68 and 0.74. Running it for longer gives something around 0.72.</p>
<pre class="r"><code>library(tidyverse)</code></pre>
<pre class="r"><code>justices <- seq(from = 0, to = 0, length.out = 9)
parties <- c('Democrat', 'Republican')
pres <- sample(parties, 1)
senate <- sample(parties, 1)
fillSeats <- function(justices, year, senate, pres){
if (year == 0){
return(justices)
}
else{
if (pres == senate){
for (num in 1:length(justices)){
if (justices[num] == 0){
justices[num] <- sample(0:40, 1)
}
}
}
justices
}
}
sim <- function(time){
justices <- seq(from = 0, to = 0, length.out = 9)
histVacancies <- c()
histMeanVacancies <- c()
period <- 0:time
for (year in period){
justices <- justices - 1
justices <- replace(justices, which(justices == -1), 0)
justices <- fillSeats(justices, year, senate, pres)
vacancies <- sum(justices == 0)
if (year %% 2 == 0){
senate <- sample(parties, 1)
}
if (year %% 4 == 0){
pres <- sample(parties, 1)
}
# print(c(senate, pres))
histVacancies <- c(histVacancies, vacancies)
meanVacancies <- mean(histVacancies)
histMeanVacancies <- c(histMeanVacancies, meanVacancies)
}
histMeanVacancies
}
vec <- sim(10000)
print(tail(vec, 1))</code></pre>
<pre><code>## [1] 0.7422258</code></pre>
<p>And here’s a graph. I cut off the limits on the y-axis for ease of view.</p>
<pre class="r"><code>df <- data.frame(year = 0:(length(vec)-1), meanVacancies = vec)
ggplot(df, aes(x = year, y = meanVacancies)) +
geom_line() +
scale_y_continuous(limits = c(0.5,1))</code></pre>
<p><img src="/puzzles/2017-04-16-supreme-court_files/figure-html/unnamed-chunk-3-1.png" width="672" /></p>
</div>
The Riddler - Supreme Court
/puzzles/supreme-court/
Sun, 16 Apr 2017 00:00:00 +0000/puzzles/supreme-court/<div id="supreme-court-puzzle" class="section level2">
<h2>Supreme Court Puzzle</h2>
<p>This is a simulation in R for the solution to Fivethirtyeight’s Riddler Classic from April 14th 2017, found <a href="https://fivethirtyeight.com/features/how-many-bingo-cards-are-there-in-the-world/">here.</a></p>
<p><em>Imagine that U.S. Supreme Court nominees are only confirmed if the same party holds the presidency and the Senate. What is the expected number of vacancies on the bench in the long run?</em></p>
<p><em>You can assume the following:</em></p>
<ul>
<li><em>You start with an empty, nine-person bench.</em></li>
<li><em>There are two parties, and each has a 50 percent chance of winning the presidency and a 50 percent chance of winning the Senate in each election.</em></li>
<li><em>The outcomes of Senate elections and presidential elections are independent.</em></li>
<li><em>The length of time for which a justice serves is uniformly distributed between zero and 40 years.</em></li>
</ul>
<p>The following code usually gives an answer between 0.68 and 0.74. Running it for longer gives something around 0.72.</p>
<pre class="r"><code>library(tidyverse)</code></pre>
<pre class="r"><code>justices <- seq(from = 0, to = 0, length.out = 9)
parties <- c('Democrat', 'Republican')
pres <- sample(parties, 1)
senate <- sample(parties, 1)
fillSeats <- function(justices, year, senate, pres){
if (year == 0){
return(justices)
}
else{
if (pres == senate){
for (num in 1:length(justices)){
if (justices[num] == 0){
justices[num] <- sample(0:40, 1)
}
}
}
justices
}
}
sim <- function(time){
justices <- seq(from = 0, to = 0, length.out = 9)
histVacancies <- c()
histMeanVacancies <- c()
period <- 0:time
for (year in period){
justices <- justices - 1
justices <- replace(justices, which(justices == -1), 0)
justices <- fillSeats(justices, year, senate, pres)
vacancies <- sum(justices == 0)
if (year %% 2 == 0){
senate <- sample(parties, 1)
}
if (year %% 4 == 0){
pres <- sample(parties, 1)
}
# print(c(senate, pres))
histVacancies <- c(histVacancies, vacancies)
meanVacancies <- mean(histVacancies)
histMeanVacancies <- c(histMeanVacancies, meanVacancies)
}
histMeanVacancies
}
vec <- sim(10000)
print(tail(vec, 1))</code></pre>
<pre><code>## [1] 0.7422258</code></pre>
<p>And here’s a graph. I cut off the limits on the y-axis for ease of view.</p>
<pre class="r"><code>df <- data.frame(year = 0:(length(vec)-1), meanVacancies = vec)
ggplot(df, aes(x = year, y = meanVacancies)) +
geom_line() +
scale_y_continuous(limits = c(0.5,1))</code></pre>
<p><img src="/puzzles/2017-04-16-supreme-court_files/figure-html/unnamed-chunk-3-1.png" width="672" /></p>
</div>
The Ridder - Baby Walks
/2017/04/01/2017-04-01-sample/
Sat, 01 Apr 2017 00:00:00 +0000/2017/04/01/2017-04-01-sample/<p>This is a simulation in R for the solution to Fivethirtyeight’s Riddler Express from March 24th 2017, found <a href="https://fivethirtyeight.com/features/will-the-baby-walk-away-will-the-troll-kill-the-dwarves/">here.</a> Some solutions were <a href="https://fivethirtyeight.com/features/what-are-the-chances-well-meet-for-lunch/">published a week later</a>, along with new puzzles.</p>
<p>To save a click, here’s the puzzle:</p>
<p><em>Your baby is learning to walk. She begins standing, holding onto the couch. There’s a 25 percent chance she’ll take a step forward, and a 75 percent chance she’ll stay put, clutching the couch. If she’s ever a step or more away from the couch, there’s a 25 percent chance she’ll take another step forward, a 25 percent chance she’ll stay where she is, and a 50 percent chance she’ll take a step back, toward the couch. In the long run, what percentage of the time does the baby clutch the couch?</em></p>
<div id="required-libraries" class="section level2">
<h2>Required libraries</h2>
<pre class="r"><code>library(ggplot2)
library(ggthemes)
library(dplyr)</code></pre>
</div>
<div id="simulation-function" class="section level2">
<h2>Simulation function</h2>
<p>The following function simulates the baby’s behaviour after a given number of steps and outputs a dataframe containing the distance from the couch and the proporion of time the baby has touched the couch after taking a number of steps.</p>
<pre class="r"><code>babyWalk <- function(steps) {
dist <- 0
count <- 0
prop_vec <- c()
dist_vec <- c()
for (i in 1:steps) {
if (dist == 0) {
x = sample(1:4, 1)
if (x == 1) {
dist <- dist + 1
} else {
count <- count + 1
}
} else {
x = sample(1:4, 1)
if (x == 1) {
dist <- dist + 1
}
if (x %in% 2:3) {
dist <- dist - 1
}
if (dist == 0) {
count <- count + 1
}
}
prop_vec = c(prop_vec, count/i)
dist_vec = c(dist_vec, dist)
}
df <- data.frame(Distance = dist_vec, Proportion = prop_vec)
df$Steps <- as.numeric(rownames(df))
df
}</code></pre>
<p>Now let’s use this function to simulate one baby taking 10,000 steps and graph it.</p>
<pre class="r"><code>steps <- 10000
baby <- babyWalk(steps)
ggplot(baby, aes(x = Steps, y = Proportion)) +
geom_point(alpha = 0.2) +
theme_minimal() +
labs(x = 'Steps Taken', y = 'Proportion of Time Touching the Couch')</code></pre>
<p><img src="/post/2017-04-01-sample_files/figure-html/unnamed-chunk-3-1.png" width="672" /></p>
<p>Looks like the answer is 0.5. This is true by the <a href="https://fivethirtyeight.com/features/what-are-the-chances-well-meet-for-lunch/">maths!</a></p>
<p>Now let’s consider a population of babies. We simulate 500 babies taking 50 steps and use dplyr to calculate the percentage which are touching the couch at a particular step.</p>
<pre class="r"><code>babyWalksDf <- data.frame(Distance = numeric(0), Proportion = numeric(0), Steps = numeric(0))
steps <- 50
num_babies <- 500
for (i in 1:num_babies){
df <- babyWalk(steps)
df$Baby_id <- i
babyWalksDf <- bind_rows(babyWalksDf,df)
}
babyWalksDf <- babyWalksDf %>%
group_by(Steps) %>%
mutate(Percentage = (sum(Distance == 0)/num_babies)*100) %>%
select(-Baby_id) %>%
distinct(.keep_all = T)</code></pre>
<p>And now we graph again. As expected, the percentage approaches 50%.</p>
<pre class="r"><code>ggplot(babyWalksDf, aes(x = Steps , y = Percentage)) +
geom_point(alpha = 0.6) +
geom_smooth(method = 'auto') +
theme_minimal() +
labs(x = 'Steps Taken', y = 'Percentage of Babies Touching the Couch')</code></pre>
<p><img src="/post/2017-04-01-sample_files/figure-html/unnamed-chunk-5-1.png" width="672" /></p>
</div>
The Ridder - Baby Walks
/puzzles/babywalks/
Sat, 01 Apr 2017 00:00:00 +0000/puzzles/babywalks/<p>This is a simulation in R for the solution to Fivethirtyeight’s Riddler Express from March 24th 2017, found <a href="https://fivethirtyeight.com/features/will-the-baby-walk-away-will-the-troll-kill-the-dwarves/">here.</a> Some solutions were <a href="https://fivethirtyeight.com/features/what-are-the-chances-well-meet-for-lunch/">published a week later</a>, along with new puzzles.</p>
<p>To save a click, here’s the puzzle:</p>
<p><em>Your baby is learning to walk. She begins standing, holding onto the couch. There’s a 25 percent chance she’ll take a step forward, and a 75 percent chance she’ll stay put, clutching the couch. If she’s ever a step or more away from the couch, there’s a 25 percent chance she’ll take another step forward, a 25 percent chance she’ll stay where she is, and a 50 percent chance she’ll take a step back, toward the couch. In the long run, what percentage of the time does the baby clutch the couch?</em></p>
<div id="required-libraries" class="section level2">
<h2>Required libraries</h2>
<pre class="r"><code>library(ggplot2)
library(ggthemes)
library(dplyr)</code></pre>
</div>
<div id="simulation-function" class="section level2">
<h2>Simulation function</h2>
<p>The following function simulates the baby’s behaviour after a given number of steps and outputs a dataframe containing the distance from the couch and the proporion of time the baby has touched the couch after taking a number of steps.</p>
<pre class="r"><code>babyWalk <- function(steps) {
dist <- 0
count <- 0
prop_vec <- c()
dist_vec <- c()
for (i in 1:steps) {
if (dist == 0) {
x = sample(1:4, 1)
if (x == 1) {
dist <- dist + 1
} else {
count <- count + 1
}
} else {
x = sample(1:4, 1)
if (x == 1) {
dist <- dist + 1
}
if (x %in% 2:3) {
dist <- dist - 1
}
if (dist == 0) {
count <- count + 1
}
}
prop_vec = c(prop_vec, count/i)
dist_vec = c(dist_vec, dist)
}
df <- data.frame(Distance = dist_vec, Proportion = prop_vec)
df$Steps <- as.numeric(rownames(df))
df
}</code></pre>
<p>Now let’s use this function to simulate one baby taking 10,000 steps and graph it.</p>
<pre class="r"><code>steps <- 10000
baby <- babyWalk(steps)
ggplot(baby, aes(x = Steps, y = Proportion)) +
geom_point(alpha = 0.2) +
theme_minimal() +
labs(x = 'Steps Taken', y = 'Proportion of Time Touching the Couch')</code></pre>
<p><img src="/post/2017-04-01-sample_files/figure-html/unnamed-chunk-3-1.png" width="672" /></p>
<p>Looks like the answer is 0.5. This is true by the <a href="https://fivethirtyeight.com/features/what-are-the-chances-well-meet-for-lunch/">maths!</a></p>
<p>Now let’s consider a population of babies. We simulate 500 babies taking 50 steps and use dplyr to calculate the percentage which are touching the couch at a particular step.</p>
<pre class="r"><code>babyWalksDf <- data.frame(Distance = numeric(0), Proportion = numeric(0), Steps = numeric(0))
steps <- 50
num_babies <- 500
for (i in 1:num_babies){
df <- babyWalk(steps)
df$Baby_id <- i
babyWalksDf <- bind_rows(babyWalksDf,df)
}
babyWalksDf <- babyWalksDf %>%
group_by(Steps) %>%
mutate(Percentage = (sum(Distance == 0)/num_babies)*100) %>%
select(-Baby_id) %>%
distinct(.keep_all = T)</code></pre>
<p>And now we graph again. As expected, the percentage approaches 50%.</p>
<pre class="r"><code>ggplot(babyWalksDf, aes(x = Steps , y = Percentage)) +
geom_point(alpha = 0.6) +
geom_smooth(method = 'auto') +
theme_minimal() +
labs(x = 'Steps Taken', y = 'Percentage of Babies Touching the Couch')</code></pre>
<p><img src="/post/2017-04-01-sample_files/figure-html/unnamed-chunk-5-1.png" width="672" /></p>
</div>
The Riddler - What are the chances?
/2017/04/01/2017-04-01-what-are-the-chances/
Sat, 01 Apr 2017 00:00:00 +0000/2017/04/01/2017-04-01-what-are-the-chances/<script src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"></script>
<p>Solution of the riddler express from <a href="https://fivethirtyeight.com/features/what-are-the-chances-well-meet-for-lunch/">here.</a></p>
<p><em>On a lovely spring day, you and I agree to meet for a lunch picnic at the fountain in the center of our favorite park. We agree that we’ll each arrive sometime from noon and 1 p.m., and that whoever arrives first will wait up to 15 minutes for the other. If the other person doesn’t show by then, the first person will abandon the plans and spend the day with a more punctual friend. If we both arrive at the fountain at an independently random time between noon and 1, what are the chances our picnic actually happens?</em></p>
<p>Let the <span class="math inline">\(x\)</span> be the number of minutes past 12 that I arrive and <span class="math inline">\(y\)</span> be the number of minutes past 12 that you arrive. Then the picnic will take place once <span class="math inline">\(|x-y| <= 15\)</span>.</p>
<p>Half the time <span class="math inline">\(x \in [15,45]\)</span>. In this scenario <span class="math inline">\(y \in [x-15, x+15]\)</span> with probability <span class="math inline">\(\frac{1}{2}\)</span>.</p>
<p>One quarter of the time <span class="math inline">\(x \in [0,15)\)</span>. If this happens <span class="math inline">\(y \in [0, x+15]\)</span> with probability <span class="math inline">\(\frac{1}{15} \times \frac{1}{60} \int_{0}^{15} \frac{x + 15}{60} = \frac{3}{8}\)</span>.</p>
<p>The final case is that <span class="math inline">\(x \in (45,60]\)</span>. This also occurs with probability <span class="math inline">\(\frac{1}{4}\)</span> and we will have <span class="math inline">\(y \in [x-15, 60]\)</span> with probability <span class="math inline">\(\frac{1}{15} \times \frac{1}{60} \int_{45}^{60} \frac{75 - x}{60} = \frac{3}{8}\)</span></p>
<p>Putting all this together, we get that <span class="math inline">\(P(|x-y| <= 15) = \frac{1}{2}\times \frac{1}{2} + \frac{1}{2} \times \frac{3}{8} = \frac{7}{16}\)</span>.</p>
<p>This can also be simulated with some code from R.</p>
<pre class="r"><code>library(ggplot2)
library(ggthemes)</code></pre>
<pre class="r"><code>prop <- function(tries){
a <- runif(tries, min = 0, max = 60)
b <- runif(tries, min = 0, max = 60)
data.frame(x=abs(a - b))
}</code></pre>
<pre class="r"><code>tries <- 10000
gaps <- prop(tries)
ggplot(gaps, aes(x = x) ) +
stat_ecdf() +
geom_vline(xintercept = 15, col = 'red') +
geom_hline(yintercept = sum(gaps <= 15)/tries, col = 'blue') +
labs(y = 'Probability that gap between arrivals <= x') +
theme_minimal()</code></pre>
<p><img src="/post/2017-04-01-what-are-the-chances_files/figure-html/unnamed-chunk-3-1.png" width="672" /></p>