Sum of Uniform random variables

This is an interesting problem that has long been a favorite of mine for its surprising solution.

Problem: Let x be a positive real. Let U_i, i = 1,2,\dots be i.i.d random variables uniformly distributed between 0 and 1. Define the random variable N as N(x) \triangleq \{ \min n \colon \sum_{i=1}^{n} U_i > x \}. In plain English, N is the minimum number of uniform random variables one needs to add for the sum to exceed x. Find the expected value of N(x).

The more famous version of this problem merely asks you to find the expectation of N(1). However, a slight extension of the approach can give the answer for a general x. We shall tackle this problem using recursive expectation which is a fairly natural line of attack for these kinds of problems. Before proceeding further, let me note that it is also possible to find this expectation by actually computing the pmf of N(x) (at least when 0 < x < 1. I don’t know it if the pmf is easy to compute for x > 1).

Lets start by defining f(x) to be the expected value of N(x). For reasons that will become obvious shortly, lets for now restrict x to lie between 0 and 1. A recursion can be derived for f(x) by conditioning on the value of the first random variable U_1.

f(x) = \int_{0}^{1} \mathbb{E}(N(x) | U_1 = a) da

This integral naturally splits into two as

f(x) = \int_{0}^{x} \mathbb{E}(N(x) | U_1 = a) da + \int_{x}^{1} \mathbb{E}(N(x) | U_1 = a) da

If the value of U_1 is greater than x (as is the case in the second integral), the sum has already exceeded x which implies that \mathbb{E}(N(x) | U_1 = a) = 1 in this case. If U_1 < x, we get \mathbb{E}(N(x) | U_1 = a) = 1 + f(x-a). This is the crucial step where recursion steps in. Substituting these back into the expression for f(x) above, we get

f(x) = 1 + \int_{o}^{x} f(x-a) da

which is the same as

f(x) = 1 + \int_{o}^{x} f(a) da

This integral equation can be solved by converting it into a differential equation. Differentiating yields,

\frac{d}{dx} f(x) = f(x) \quad for \, 0 \leq x < 1

The solution in the familiar exponential function f(x) = C e^x. It is easy to see that as x \rightarrow 0, f(x) \rightarrow 1 which implies that the constant of integration must be 1. Therefore, we get that f(x) = e^x for 0 \leq x < 1. Therefore, the expected number of uniform random variables one needs to add to cross a threshold of unity is simply e!!!

Take a moment to reflect how cool this result is. Uniform random variables are everywhere around us and by simply adding them up till they cross a threshold, we can estimate one of the fundamental constants of all mathematics. This is similar to the Buffon’s needle problem which one can use to estimate \pi by simply throwing a needle onto a ruled sheet. It comes with a caveat though: The variance of N(x) is quite high (\approx 0.8) which means that, just as in the Buffon’s needle problem, this is a highly inefficient way of estimating the underlying constant.

Let us forge on and try to compute f(x) for x > 1. A similar recursive equation can be derived by conditioning on U_1 as

f(x) = \int_{0}^{1} \mathbb{E}(N(x) | U_1 = a) da

However, the difference is that when x > 1, the integral doesn’t split into two parts like it did in the case when x \leq 1. Instead the only possible simplification is by replacing \mathbb{E}(N(x) | U_1 = a) by 1+f(x-a) as before and changing the variable of integration. This gives

f(x) = 1 + \int_{x-1}^{x} f(a) da

Differentiating with respect to x gives the differential equation

\frac{d}{dx} f(x) = f(x) - f(x-1)

I don’t know if there is a easy way to solve this equation but I tackled it in the following manner. We already know the form of f(x) when 0 \leq x < 1. So, when 1 \leq x < 2, the above equation can be converted to a familiar linear first order differential equation by substituting f(x-1) = e^{(x-1)}. This results in the differential equation

\frac{d}{dx} f(x) = f(x) - e^{x-1} when 1 \leq x < 2

whose solution is (with the boundary condition that f(1) = e)

f(x) = e^x - (x-1)e^{x-1} for 1 \leq x < 2.

Armed with this, we can solve the differential equation for 2 \leq x < 3 and so on. For example, we have

\frac{d}{dx} f(x) = f(x) - e^{x-1}+(x-2)e^{x-2} when 2 \leq x < 3

whose solution is

f(x) = e^x - (x-1)e^{x-1} + \frac{(x-2)^2}{2} e^{x-2} for 2 \leq x < 3.

The form of the solution for a general x can be easily guessed by working out the first couple of terms in the above manner. Once the form is guessed, it is easy to show that it indeed satisfies the above differential equation and that the solution is unique. Cutting to the chase, the final expression is

f(x) = \sum_{k = 0}^{[x]} (-1)^k \frac{(x-k)^k}{k!} e^{x-k} where [x] is the integer part of x.

A couple of questions that I am currently trying to figure out:

  1. What is the variance of N(x)? Computer simulations suggest that it isn’t quite monotone increasing in [0,1]. Is there a deeper reason for this?
  2. As x \rightarrow \infty, we expect f(x) to be approximately 2x. Computer simulations suggest that f(x) does indeed grow linearly. However, as x \rightarrow \infty, f(x) \rightarrow 2x + \frac{2}{3}. Is this fact apriori obvious? In other words, is it possible to guess the above form of f(x) for very large x without going through this derivation? Also, how can this asymptotic form be derived from the exact expression for f(x) given above?
  3. What is the distribution of E(x) \triangleq \sum_{i=1}^{N(x)} U_i - x. As x \rightarrow 0, clearly, E(x) is uniformly distributed in [x,x+1). When x = 1, E(x) is definitely not uniformly distributed in [1,2). It seems to me that for very large x, the distribution should again be uniform in [x,x+1). Is this true? How does the procession from uniform to non-uniform to uniform again take place?
  4. What kind of results can we expect if we replace the uniform distribution of U_i with some other distribution (which only takes positive values) like the exponential distribution?
  5. One (potentially useless) side benefit of the above asymptotic is that we can use it to derive polynomial identities involving e that are almost integers. For example, f(5) \approx \frac{32}{3} which gives the approximation

24e^5 -96 e^4 + 108 e^3 -32e^2 + e \approx 256.

    The difference is only about 10^{-4}. Higher values of x yield better approximations. When x = 1, the approximation is just the first convergent (\frac{8}{3})  of the partial fraction expansion of e but higher values don’t agree with the higher convergents. Is there something to be said for such polynomial approximations (as against the linear approximations one gets through continued fraction expansions)?
Advertisement

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Connecting to %s


Follow

Get every new post delivered to your Inbox.