The birthday problem or probability paradox concerns the probability that a group of people all have different birthdays, see for instance p. 15 of the book “Statistical modelling and computation” by Dirk Kroese and Joshua Chan. The purpose of this blog post is to exemplify how tail recursion can be used to avoid mutability in the context of a simple statistical computation, thus switching from the imperative to the functional programming paradigm.

**Short recap on the solution to the birthday problem**

As a succinct recap of the solution to the birthday problem, let denote the event that the first people have different birthdays for . Notice that , whence . The probability that all people have different birthdays is then . By application of the product rule of probability,

.

Given that the first people have different birthdays, the first people have different birthdays if and only if the birthday of the -th person is chosen from the remaining birthdays. So,

.

Obviously, , which leads to

.

**Computing the probability of different birthdays via an iterative function**

To compute the probability that all people have distinct birthdays as a function of , consider the following iterative function in R:

UniqueBirthdayProb <- function(n) { p <- 1 for (k in 1:n) { p <- p * (365 - k + 1) / 365 } return(p) }

The function call

UniqueBirthdayProb(23)

gives a probability of , which means that the probability of people not exhibiting a duplicate birthday is less than .

The following snippet computes and plots for :

nseq <- 1:70 pseq <- sapply(nseq, UniqueBirthdayProb) plot( nseq, pseq, type="o", pch="*", ylim=c(0, 1), xlab="n", ylab=expression('P(A'[n]*')'), main="Probability of distinct birthdays" )

The resulting plot is shown below:

**Computing the probability of different birthdays via a recursive function**

The following R function computes recursively:

RecUniqueBirthdayProb <- function(n) { if (n == 1) { return(1) } else { return(RecUniqueBirthdayProb(n - 1) * (365 - n + 1) / 365) } }

Mutability has been avoided in that the state of the local variable *p* in the body of *UniqueBirthdayProb(n)* is not altered. The function *RecUniqueBirthdayProb(n)* calls itself recursively, and therefore avoids the need for a local variable. A downside of the recursive function calls in *RecUniqueBirthdayProb(n)* is that the stack frame is filled for increasing values of the input argument *n*.

**Computing the probability of different birthdays via a tail-recursive function**

Tail recursion reduces the number of recursive calls. This is how to define a tail-recursive function in R to compute :

TailRecUniqueBirthdayProb <- function(n) { loop <- function(acc, n) { if (n == 1) { return(acc) } else { return(loop(acc * (365 - n + 1) / 365, n - 1)) } } loop(1, n) }

A closure *loop(acc, n)* is defined inside the tail-recursive function *TailRecUniqueBirthdayProb(n)*. The number of recursive calls is reduced via the accumulator *acc*, which is the first input argument to *loop(acc, n)*.

**A comparison of stack frames between the three functions**

To compare the iterative, recursive and tail-recursive functions in terms of their stack demands, the *trace()* R function is called upon each of the three functions, yielding the following output:

> trace(UniqueBirthdayProb) > UniqueBirthdayProb(10) trace: UniqueBirthdayProb(10) [1] 0.8830518 > untrace(UniqueBirthdayProb) > > trace(RecUniqueBirthdayProb) > RecUniqueBirthdayProb(10) trace: RecUniqueBirthdayProb(10) trace: RecUniqueBirthdayProb trace: RecUniqueBirthdayProb trace: RecUniqueBirthdayProb trace: RecUniqueBirthdayProb trace: RecUniqueBirthdayProb trace: RecUniqueBirthdayProb trace: RecUniqueBirthdayProb trace: RecUniqueBirthdayProb trace: RecUniqueBirthdayProb [1] 0.8830518 > untrace(RecUniqueBirthdayProb) > > trace(TailRecUniqueBirthdayProb) > TailRecUniqueBirthdayProb(10) trace: TailRecUniqueBirthdayProb(10) [1] 0.8830518 > untrace(TailRecUniqueBirthdayProb)

It is clear from the output that *RecUniqueBirthdayProb(n)* requires more recursive function calls than *TailRecUniqueBirthdayProb(n)*. Strictly speaking, and despite the reduction in the number of recursive function calls, R does not support tail elimination. The focus of this post is on the concept of tail recursion and how it can be applied in a simple statistical problem, not on the specifics of tail elimination in R. The choice of programming language in this post has been made on pedagogical and not on performance grounds.