Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 493: Under the Rainbow

View on Project Euler

Project Euler Problem 493 Solution

EulerSolve provides an optimized solution for Project Euler Problem 493, Under the Rainbow, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary An urn contains \(C\) colors, each represented by \(B\) balls, so the total number of balls is \(CB\). We draw \(D\) balls uniformly without replacement and ask for the expected number of distinct colors that appear in the sample. For Problem 493, the concrete parameters are \((C,B,D)=(7,10,20)\). Mathematical Approach Let \(X\) be the number of distinct colors seen in the draw. The key observation is that the expectation can be reduced to a single probability about one fixed color, and that probability is a simple hypergeometric complement. Step 1: Model the Random Variable For each color \(j\in\{1,\dots,C\}\), define an indicator variable \(I_j\) that equals \(1\) if color \(j\) appears at least once in the sample and equals \(0\) otherwise. Then the number of distinct colors is $X=\sum_{j=1}^{C} I_j.$ This reformulation turns a seemingly global counting question into a sum of very small yes-or-no events, one event for each color. Step 2: Apply Linearity of Expectation Taking expectations gives $\mathbb{E}[X]=\sum_{j=1}^{C}\mathbb{E}[I_j]=\sum_{j=1}^{C}\Pr(I_j=1).$ No independence is needed here; linearity of expectation always holds. By symmetry, every color has the same chance to appear. If we write $p=\Pr(I_1=1),$ then $\mathbb{E}[X]=Cp.$ So the whole problem is reduced to finding the appearance probability of one fixed color....

Detailed mathematical approach

Problem Summary

An urn contains \(C\) colors, each represented by \(B\) balls, so the total number of balls is \(CB\). We draw \(D\) balls uniformly without replacement and ask for the expected number of distinct colors that appear in the sample. For Problem 493, the concrete parameters are \((C,B,D)=(7,10,20)\).

Mathematical Approach

Let \(X\) be the number of distinct colors seen in the draw. The key observation is that the expectation can be reduced to a single probability about one fixed color, and that probability is a simple hypergeometric complement.

Step 1: Model the Random Variable

For each color \(j\in\{1,\dots,C\}\), define an indicator variable \(I_j\) that equals \(1\) if color \(j\) appears at least once in the sample and equals \(0\) otherwise. Then the number of distinct colors is

$X=\sum_{j=1}^{C} I_j.$

This reformulation turns a seemingly global counting question into a sum of very small yes-or-no events, one event for each color.

Step 2: Apply Linearity of Expectation

Taking expectations gives

$\mathbb{E}[X]=\sum_{j=1}^{C}\mathbb{E}[I_j]=\sum_{j=1}^{C}\Pr(I_j=1).$

No independence is needed here; linearity of expectation always holds. By symmetry, every color has the same chance to appear. If we write

$p=\Pr(I_1=1),$

then

$\mathbb{E}[X]=Cp.$

So the whole problem is reduced to finding the appearance probability of one fixed color.

Step 3: Count the Complementary Event

It is easier to compute the probability that the chosen color does not appear. The total number of equally likely samples of size \(D\) is

$\binom{CB}{D}.$

If one particular color is missing, then all \(D\) drawn balls must come from the other \(C-1\) colors, which contribute \((C-1)B\) balls altogether. Therefore

$q=\Pr(I_1=0)=\frac{\binom{(C-1)B}{D}}{\binom{CB}{D}}.$

Since \(p=1-q\), we obtain

$p=1-\frac{\binom{(C-1)B}{D}}{\binom{CB}{D}}.$

Step 4: Derive the Closed Formula

Substituting \(p\) into \(\mathbb{E}[X]=Cp\) yields the exact expectation

$\boxed{\mathbb{E}[X]=C\left(1-\frac{\binom{(C-1)B}{D}}{\binom{CB}{D}}\right).}$

For the Project Euler parameters, this becomes

$\mathbb{E}[X]=7\left(1-\frac{\binom{60}{20}}{\binom{70}{20}}\right).$

This formula is already the mathematical solution. The remaining task is simply to evaluate it numerically in a stable way.

Step 5: Rewrite the Binomial Ratio as a Product

Directly forming the two large binomial coefficients is unnecessary. Using falling products,

$\binom{n}{D}=\frac{n(n-1)\cdots(n-D+1)}{D!},$

so for any \(A\) and \(N\) with \(A\le N\) we have

$\frac{\binom{A}{D}}{\binom{N}{D}}=\frac{A(A-1)\cdots(A-D+1)}{N(N-1)\cdots(N-D+1)}=\prod_{i=0}^{D-1}\frac{A-i}{N-i}.$

With \(A=(C-1)B\) and \(N=CB\), the missing-color probability is therefore

$q=\prod_{i=0}^{D-1}\frac{(C-1)B-i}{CB-i}.$

This product form is exactly what the implementations use, because it avoids huge intermediate values and matches floating-point arithmetic naturally.

Worked Example: \((C,B,D)=(3,2,2)\)

This small case is easy to verify by hand. There are \(3\) colors with \(2\) balls each, so \(6\) balls in total. For one fixed color, the probability that it is absent from the two draws is

$q=\frac{\binom{4}{2}}{\binom{6}{2}}=\frac{6}{15}=\frac{2}{5}.$

Hence

$p=1-q=\frac{3}{5},\qquad \mathbb{E}[X]=3\cdot\frac{3}{5}=\frac{9}{5}=1.8.$

A direct enumeration confirms this. Among the \(\binom{6}{2}=15\) equally likely pairs, exactly \(3\) pairs have both balls of the same color and the remaining \(12\) pairs have two different colors, so

$\frac{3\cdot 1+12\cdot 2}{15}=\frac{27}{15}=\frac{9}{5}.$

How the Code Works

The C++, Python, and Java implementations all use the same formula. They first compute the total number of balls \(CB\) and the number of balls that remain after excluding one fixed color, namely \((C-1)B\).

They then evaluate the product

$\prod_{i=0}^{D-1}\frac{(C-1)B-i}{CB-i}$

term by term. Each iteration contributes one factor to the probability that the chosen color is missing from the sample.

After that, the implementation takes the complement of this probability and multiplies by \(C\):

$C\left(1-\prod_{i=0}^{D-1}\frac{(C-1)B-i}{CB-i}\right).$

The result is printed to nine digits after the decimal point. The C++ implementation also performs small sanity checks, including the toy example above and the edge case \(D=0\), before printing the final value.

Complexity Analysis

For general parameters, the method performs exactly \(D\) multiplicative updates, so the running time is \(O(D)\) and the extra memory usage is \(O(1)\). For the actual problem instance \(D=20\), the computation is tiny. The important point is that the algorithm never enumerates samples and never builds large combinatorial tables.

Footnotes and References

  1. Problem page: https://projecteuler.net/problem=493
  2. Hypergeometric distribution: Wikipedia — Hypergeometric distribution
  3. Indicator function and indicator variables: Wikipedia — Indicator function
  4. Expected value and linearity: Wikipedia — Expected value
  5. Binomial coefficient: Wikipedia — Binomial coefficient

Mathematical approach · C++ solution · Python solution · Java solution

Previous: Problem 492 · All Project Euler solutions · Next: Problem 494

Need help with a problem? Ask me! 💡
e
✦ Euler GLM 5.2
Hi! I'm Euler. Ask me anything about Project Euler problems, math concepts, or solution approaches! 🧮