# Puzzle: In A Village Where People Only Want Boys

Feb 29, 2020 #monte-carlo #dynamic

The origin of this question is an interview question from Google.

In a village in which people only want boys every family continues to have children until they have a boy. If they have a girl, they have another child. If they have a boy, they stop. What is the expected boy to girl ratio in the village?

Though this can be solved mathematically (of course it can), the question is (IMO) easier to solve by thinking in a tree/branches structure.

If we assume the village has N families, and the probability of having a boy or girl is 50/50, then

Here we see the obvious, as the depth of tree extends, the boy-to-girl disparity will ultimately become insignificant. Hence, the ratio will be very close to 50/50.

However, the question itself is not the main interest of this post. It turns out to be quite a nice practice for simulation coding. The following attaches the source code to the simulation as it validates the answer given above. I utilize both **functional (higher order function)** and **dynamic programming** here.

```
library(magrittr)
simulate_once <- function(N, verbose=FALSE) {
# nobody has anything yet...
boys = 0; girls = 0;
print_vars <- function() {
message("BGR:", boys, "/", girls)
message("==================")
}
tryonemore <- function(x, msgON) {
# browser()
if(msgON) message(x)
# terminal condition: all 1 (boys)
if (all(x)) {
boys <<- boys + sum(x)
if (msgON) print_vars()
return(1)
}
# assign to parent env so that we can keep track
boys <<- boys + sum(x)
girls <<- girls + sum(!x)
if (msgON) print_vars()
return(tryonemore(rbinom(sum(!x), 1, 0.5), msgON = msgON))
}
# init call
tryonemore(rbinom(N, 1, 0.5), msgON=verbose)
return(boys / girls)
}
set.seed(1111)
simulate_once(10, verbose = TRUE)
## 0010111010
## BGR:5/5
## ==================
## 01111
## BGR:9/6
## ==================
## 0
## BGR:9/7
## ==================
## 1
## BGR:10/7
## ==================
## [1] 1.428571
```

```
library(purrr)
library(ggplot2)
N.range = 11:1000
res <- map_dbl(N.range, ~ {
mean(replicate(1000, simulate_once(.x)))
})
data.frame(ind = N.range, ratio = res) %>%
ggplot(aes(ind, ratio)) +
geom_line() +
geom_hline(yintercept = 1, lty = 4, col = "blue") +
scale_y_continuous(limits = c(0.9, 1.2)) +
theme_minimal(base_family = "Menlo") +
labs(x = "N (# families)",
y = "Expected Boy-to-Girl Ratio",
caption = "*Simulation is ran 1000 times for each N",
title = "Simulation for Various N")
```