Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 791: Average and Variance

Project Euler Problem 791 Solution

EulerSolve provides an optimized solution for Project Euler Problem 791, Average and Variance, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary Problem 791 studies nondecreasing positive integer quadruples $(a,b,c,d)$ whose average equals their variance. Writing $\mu=\frac{a+b+c+d}{4},\qquad V=\frac{(a-\mu)^2+(b-\mu)^2+(c-\mu)^2+(d-\mu)^2}{4},$ we seek all quadruples with $\mu=V$ and $\mu\le n$, and $S(n)$ is the sum of the corresponding totals $a+b+c+d=4\mu$. The direct four-variable search is far too large for $n=10^8$, so the implementations reduce the problem to a three-dimensional lattice search with closed-form summation over the innermost coordinate. Mathematical Approach The important fact encoded by the implementations is that the original average-equals-variance constraint can be rewritten in terms of three odd integers. Once that reduction is done, the remaining work is purely arithmetic. Step 1: Pass to the Reduced Odd Coordinates After ordering the quadruple, eliminating the redundant degree of freedom, and applying the linear change of variables used by the implementations, every admissible configuration is represented by odd integers $(u,v,w)$ satisfying $u\equiv v\equiv w\equiv 1\pmod 2.$ Its contribution to $S(n)$ becomes $T(u,v,w)=\frac{u^2+v^2+w^2+2u+2v+2w+3}{2}.$ This is the quantity accumulated by the implementations. The original quadruple is no longer enumerated explicitly; the whole computation is carried out in the reduced $(u,v,w)$-space....

Detailed mathematical approach

Problem Summary

Problem 791 studies nondecreasing positive integer quadruples $(a,b,c,d)$ whose average equals their variance. Writing

$\mu=\frac{a+b+c+d}{4},\qquad V=\frac{(a-\mu)^2+(b-\mu)^2+(c-\mu)^2+(d-\mu)^2}{4},$

we seek all quadruples with $\mu=V$ and $\mu\le n$, and $S(n)$ is the sum of the corresponding totals $a+b+c+d=4\mu$. The direct four-variable search is far too large for $n=10^8$, so the implementations reduce the problem to a three-dimensional lattice search with closed-form summation over the innermost coordinate.

Mathematical Approach

The important fact encoded by the implementations is that the original average-equals-variance constraint can be rewritten in terms of three odd integers. Once that reduction is done, the remaining work is purely arithmetic.

Step 1: Pass to the Reduced Odd Coordinates

After ordering the quadruple, eliminating the redundant degree of freedom, and applying the linear change of variables used by the implementations, every admissible configuration is represented by odd integers $(u,v,w)$ satisfying

$u\equiv v\equiv w\equiv 1\pmod 2.$

Its contribution to $S(n)$ becomes

$T(u,v,w)=\frac{u^2+v^2+w^2+2u+2v+2w+3}{2}.$

This is the quantity accumulated by the implementations. The original quadruple is no longer enumerated explicitly; the whole computation is carried out in the reduced $(u,v,w)$-space.

Step 2: Determine the Outer Feasible Region

Set

$R=8n.$

The outer coordinate is restricted to odd values in

$1\le u\le \left\lfloor\sqrt{R+3}\right\rfloor-2,\qquad u\equiv 1\pmod 2.$

For each fixed $u$, the second coordinate is restricted to

$-1\le v\le \min\!\left(u,\left\lfloor\sqrt{R-u^2-4u-1}\right\rfloor-2\right),\qquad v\equiv 1\pmod 2.$

These bounds already remove most of the search space. Instead of a four-dimensional brute-force enumeration, the solver now scans only the feasible odd lattice points in a curved two-dimensional outer region.

Step 3: Compute the Inner Interval and the Central Hole

For each admissible pair $(u,v)$, define

$w_{\max}=\left\lfloor\sqrt{R-u^2-v^2-4u-4v-5}\right\rfloor.$

Then $w$ must lie in the odd interval

$\max(-w_{\max},-v-2)\le w\le \min(w_{\max},v),\qquad w\equiv 1\pmod 2.$

There is one more condition. If

$11-u^2-v^2>0,$

then the middle of the interval is forbidden and we must keep only

$|w|\ge \left\lceil\sqrt{11-u^2-v^2}\right\rceil.$

So for a fixed $(u,v)$, the innermost search is never an arbitrary set: it is either one odd interval or two odd intervals separated by a central gap. That structural fact is exactly what makes the constant-time interval summation possible.

Step 4: Sum an Entire Odd Interval in Closed Form

Suppose one admissible $w$-interval is

$w=L,L+2,\dots,H,$

with $L$ and $H$ odd. Let

$m=\frac{H-L}{2}+1,\qquad w_j=L+2j\quad (0\le j\le m-1).$

Then

$\sum_{j=0}^{m-1} j=\frac{m(m-1)}{2},\qquad \sum_{j=0}^{m-1} j^2=\frac{(m-1)m(2m-1)}{6},$

and therefore

$\sum_{j=0}^{m-1} w_j=mL+2\sum_{j=0}^{m-1}j,$

$\sum_{j=0}^{m-1} w_j^2=mL^2+4L\sum_{j=0}^{m-1}j+4\sum_{j=0}^{m-1}j^2.$

If we abbreviate the part independent of $w$ by

$K(u,v)=u^2+v^2+2u+2v+3,$

then the whole interval contributes

$\sum_{j=0}^{m-1} T(u,v,w_j)=\frac{mK(u,v)+\sum w_j^2+2\sum w_j}{2}.$

This replaces a potentially long inner loop by a handful of arithmetic operations.

Step 5: Split Only When the Hole Is Active

If the central exclusion is absent, one closed-form evaluation is enough. If the exclusion is present, the allowed odd values split into a negative tail and a positive tail:

$[L,H]\cap\{w:w\equiv 1\pmod 2\}=[L,-c]\cup[c,H]$

for the smallest admissible odd threshold $c$. Each tail is summed by the same interval formula, and the two results are added modulo $433494437$.

So the algorithm never iterates over admissible $w$-values one by one. That is the decisive optimization shared by the C++, Python, and Java implementations.

Worked Example: $S(5)=48$

When $n=5$, we have $R=40$. The reduced search produces exactly the following admissible odd triples:

$\begin{aligned} (1,1,-3)&\mapsto T=6,\\ (3,-1,-1)&\mapsto T=8,\\ (3,1,-3)&\mapsto T=12,\\ (3,1,-1)&\mapsto T=10,\\ (3,1,1)&\mapsto T=12. \end{aligned}$

Therefore

$S(5)=6+8+12+10+12=48,$

which matches the checkpoint built into the overall solution strategy.

How the Code Works

The C++, Python, and Java implementations all follow the same mathematical pipeline. First they compute $R=8n$, determine the largest admissible odd outer coordinate, and loop over the feasible odd pairs $(u,v)$ using integer square roots to obtain sharp bounds instead of trial-and-error scanning.

For each outer pair, the implementation constructs the admissible odd range for the innermost coordinate. If the central hole is inactive, it evaluates one interval. If the hole is active, it evaluates a negative interval and a positive interval. In both cases it uses the closed forms for $\sum w$ and $\sum w^2$, so the innermost work is constant time.

The arithmetic is always reduced modulo $433494437$. The C++ implementation parallelizes the outer loop because different odd $u$-slices are independent; the Python and Java implementations use the same bounds and formulas in sequential form. The C++ version also checks the intermediate values $S(5)=48$ and $S(10^3)=37048340$ before computing the final target.

Complexity Analysis

The outer coordinate ranges over $O(\sqrt{n})$ odd values, and for each such value the second coordinate also spans at most $O(\sqrt{n})$ odd values. Hence the number of feasible outer pairs is $O(n)$. Since each pair is handled by at most two constant-time interval evaluations, the total running time is $O(n)$. The sequential versions use $O(1)$ auxiliary space, while the parallel C++ version uses $O(P)$ partial storage for $P$ worker threads.

Footnotes and References

Problem page: https://projecteuler.net/problem=791
Arithmetic mean: Wikipedia — Arithmetic mean
Variance: Wikipedia — Variance
Arithmetic progression: Wikipedia — Arithmetic progression
Sum of squares: Wikipedia — Square pyramidal number