Sum of Uniform random variables

December 5, 2009

This is an interesting problem that has long been a favorite of mine for its surprising solution.

Problem: Let x be a positive real. Let U_i, i = 1,2,\dots be i.i.d random variables uniformly distributed between 0 and 1. Define the random variable N as N(x) \triangleq \{ \min n \colon \sum_{i=1}^{n} U_i > x \}. In plain English, N is the minimum number of uniform random variables one needs to add for the sum to exceed x. Find the expected value of N(x).

The more famous version of this problem merely asks you to find the expectation of N(1). However, a slight extension of the approach can give the answer for a general x. We shall tackle this problem using recursive expectation which is a fairly natural line of attack for these kinds of problems. Before proceeding further, let me note that it is also possible to find this expectation by actually computing the pmf of N(x) (at least when 0 < x < 1. I don’t know it if the pmf is easy to compute for x > 1).

Lets start by defining f(x) to be the expected value of N(x). For reasons that will become obvious shortly, lets for now restrict x to lie between 0 and 1. A recursion can be derived for f(x) by conditioning on the value of the first random variable U_1.

f(x) = \int_{0}^{1} \mathbb{E}(N(x) | U_1 = a) da

This integral naturally splits into two as

f(x) = \int_{0}^{x} \mathbb{E}(N(x) | U_1 = a) da + \int_{x}^{1} \mathbb{E}(N(x) | U_1 = a) da

If the value of U_1 is greater than x (as is the case in the second integral), the sum has already exceeded x which implies that \mathbb{E}(N(x) | U_1 = a) = 1 in this case. If U_1 < x, we get \mathbb{E}(N(x) | U_1 = a) = 1 + f(x-a). This is the crucial step where recursion steps in. Substituting these back into the expression for f(x) above, we get

f(x) = 1 + \int_{o}^{x} f(x-a) da

which is the same as

f(x) = 1 + \int_{o}^{x} f(a) da

This integral equation can be solved by converting it into a differential equation. Differentiating yields,

\frac{d}{dx} f(x) = f(x) \quad for \, 0 \leq x < 1

The solution in the familiar exponential function f(x) = C e^x. It is easy to see that as x \rightarrow 0, f(x) \rightarrow 1 which implies that the constant of integration must be 1. Therefore, we get that f(x) = e^x for 0 \leq x < 1. Therefore, the expected number of uniform random variables one needs to add to cross a threshold of unity is simply e!!!

Take a moment to reflect how cool this result is. Uniform random variables are everywhere around us and by simply adding them up till they cross a threshold, we can estimate one of the fundamental constants of all mathematics. This is similar to the Buffon’s needle problem which one can use to estimate \pi by simply throwing a needle onto a ruled sheet. It comes with a caveat though: The variance of N(x) is quite high (\approx 0.8) which means that, just as in the Buffon’s needle problem, this is a highly inefficient way of estimating the underlying constant.

Let us forge on and try to compute f(x) for x > 1. A similar recursive equation can be derived by conditioning on U_1 as

f(x) = \int_{0}^{1} \mathbb{E}(N(x) | U_1 = a) da

However, the difference is that when x > 1, the integral doesn’t split into two parts like it did in the case when x \leq 1. Instead the only possible simplification is by replacing \mathbb{E}(N(x) | U_1 = a) by 1+f(x-a) as before and changing the variable of integration. This gives

f(x) = 1 + \int_{x-1}^{x} f(a) da

Differentiating with respect to x gives the differential equation

\frac{d}{dx} f(x) = f(x) - f(x-1)

I don’t know if there is a easy way to solve this equation but I tackled it in the following manner. We already know the form of f(x) when 0 \leq x < 1. So, when 1 \leq x < 2, the above equation can be converted to a familiar linear first order differential equation by substituting f(x-1) = e^{(x-1)}. This results in the differential equation

\frac{d}{dx} f(x) = f(x) - e^{x-1} when 1 \leq x < 2

whose solution is (with the boundary condition that f(1) = e)

f(x) = e^x - (x-1)e^{x-1} for 1 \leq x < 2.

Armed with this, we can solve the differential equation for 2 \leq x < 3 and so on. For example, we have

\frac{d}{dx} f(x) = f(x) - e^{x-1}+(x-2)e^{x-2} when 2 \leq x < 3

whose solution is

f(x) = e^x - (x-1)e^{x-1} + \frac{(x-2)^2}{2} e^{x-2} for 2 \leq x < 3.

The form of the solution for a general x can be easily guessed by working out the first couple of terms in the above manner. Once the form is guessed, it is easy to show that it indeed satisfies the above differential equation and that the solution is unique. Cutting to the chase, the final expression is

f(x) = \sum_{k = 0}^{[x]} (-1)^k \frac{(x-k)^k}{k!} e^{x-k} where [x] is the integer part of x.

A couple of questions that I am currently trying to figure out:

  1. What is the variance of N(x)? Computer simulations suggest that it isn’t quite monotone increasing in [0,1]. Is there a deeper reason for this?
  2. As x \rightarrow \infty, we expect f(x) to be approximately 2x. Computer simulations suggest that f(x) does indeed grow linearly. However, as x \rightarrow \infty, f(x) \rightarrow 2x + \frac{2}{3}. Is this fact apriori obvious? In other words, is it possible to guess the above form of f(x) for very large x without going through this derivation? Also, how can this asymptotic form be derived from the exact expression for f(x) given above?
  3. What is the distribution of E(x) \triangleq \sum_{i=1}^{N(x)} U_i - x. As x \rightarrow 0, clearly, E(x) is uniformly distributed in [x,x+1). When x = 1, E(x) is definitely not uniformly distributed in [1,2). It seems to me that for very large x, the distribution should again be uniform in [x,x+1). Is this true? How does the procession from uniform to non-uniform to uniform again take place?
  4. What kind of results can we expect if we replace the uniform distribution of U_i with some other distribution (which only takes positive values) like the exponential distribution?
  5. One (potentially useless) side benefit of the above asymptotic is that we can use it to derive polynomial identities involving e that are almost integers. For example, f(5) \approx \frac{32}{3} which gives the approximation

24e^5 -96 e^4 + 108 e^3 -32e^2 + e \approx 256.

    The difference is only about 10^{-4}. Higher values of x yield better approximations. When x = 1, the approximation is just the first convergent (\frac{8}{3})  of the partial fraction expansion of e but higher values don’t agree with the higher convergents. Is there something to be said for such polynomial approximations (as against the linear approximations one gets through continued fraction expansions)?

Prime Spirals

May 18, 2008

You know what mathematicians do when they have some spare time? They doodle with spirals and find curious and interesting patterns of prime numbers. That is what Stanislaw Ulam did back in the 60s, I imagine while sitting through a particularly boring talk at some conference. He wrote down the integers along a rectangular grid and noticed the location of the primes along this spiral. The primes seem to concentrate along certain patterns such as diagonals.

Finding patterns among the primes is among the most important problems of mathematics. Primes occupy a central position in mathematics because they can be viewed as the building blocks of all integers through the unique factorization theorem. There has been some success in characterizing them on a large scale (such as counting the number of primes less than n for some large n) but at a smaller scale things still are murky.

The prime spirals are one among many such tantalizing patterns involving prime numbers that are still not conclusively explained. The same patterns appear with other spirals such as Archimedean and the square root spiral. One such pattern in the Archimedean spiral corresponds to Euler’s prime generating polynomial n^2 + n + 41, a famous prime rich polynomial that generates primes for n=1 to 39. If you want to generate a square spiral, try this in matlab: imshow(double(~isprime(spiral(n)))) generates a n \times n prime spiral. Look here for the Archimedean spiral.

Such patterns show how little we really understand primes and how far we still have to go. As an aside, the prime generating polynomial of Euler is very interesting. However, trying to create a polynomial that generates only primes is a doomed effort. Can you prove that any polynomial f(n) will surely produce a composite number for some integer n?

Almost Integers

April 28, 2008
e^{\pi \sqrt{163}} – looks scary, doesn’t it? This is the so called Ramanujan’s constant and is a transcendental number. But a surprise awaits any one who tries to numerically evaluate it. Normal calculators and Matlab won’t work though. One needs more precision and if you want to give it a shot, try this arbitrary precision calculator. The value is within 10^{-12} of an integer!!! Who would have thought, looking at the number, that this would be the case? Ramanujan did a lot of work on these so called “almost integers” and hence this number is named for him. There is a beautiful mathematical theory behind this proximity to an integer involving imaginary quadratic fields. Some day, I hope to understand it. There are also other examples of these almost integers, for example, the so-called Pisot-Vijayaraghavan numbers (Vijayaraghavan is another Indian mathematician who worked with G.H.Hardy in the 1920s). For example, the golden ratio \phi raised to a large power is very close to an integer. This phenomenon is also well understood. However, there are other examples which hint at mysterious and as yet undiscovered properties of e and \pi. For instance e^{\pi} - \pi is within a thousandth of an integer and no one knows whether there is a reason for it or if it is just a coincidence. Truly fascinating stuff.

Follow

Get every new post delivered to your Inbox.