Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 464: Möbius Function and Intervals

Project Euler Problem 464 Solution

EulerSolve provides an optimized solution for Project Euler Problem 464, Möbius Function and Intervals, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary Let $C(n)$ be the number of intervals $I=[l,r]\subseteq [1,n]$ for which the values $\mu(k)=1$ and $\mu(k)=-1$ stay in almost perfect balance. If we write $P_I=\#\{k\in I:\mu(k)=1\},\qquad N_I=\#\{k\in I:\mu(k)=-1\},$ then the counted intervals are exactly those satisfying $99P_I\le 100N_I,\qquad 99N_I\le 100P_I.$ Values with $\mu(k)=0$ are neutral: they lengthen the interval but do not change either count. The implementations verify the derivation with $C(10)=13,\qquad C(500)=16676,\qquad C(10000)=20155319,$ and then evaluate the same method at $n=20000000$. Mathematical Approach The key reduction is to replace interval counting by a pair-counting problem on weighted prefix sums. Step 1: Generate the Möbius Values Efficiently We need $\mu(1),\mu(2),\dots,\mu(n)$. The Möbius function takes the values $\mu(m)=\begin{cases} 1, & m=1,\\ (-1)^t, & m\text{ is a product of }t\text{ distinct primes},\\ 0, & m\text{ is divisible by }p^2\text{ for some prime }p. \end{cases}$ The implementations build these values with a linear sieve, so the preprocessing is essentially one forward pass through the integers....

Detailed mathematical approach

Problem Summary

Let $C(n)$ be the number of intervals $I=[l,r]\subseteq [1,n]$ for which the values $\mu(k)=1$ and $\mu(k)=-1$ stay in almost perfect balance. If we write

$P_I=\#\{k\in I:\mu(k)=1\},\qquad N_I=\#\{k\in I:\mu(k)=-1\},$

then the counted intervals are exactly those satisfying

$99P_I\le 100N_I,\qquad 99N_I\le 100P_I.$

Values with $\mu(k)=0$ are neutral: they lengthen the interval but do not change either count. The implementations verify the derivation with

$C(10)=13,\qquad C(500)=16676,\qquad C(10000)=20155319,$

and then evaluate the same method at $n=20000000$.

Mathematical Approach

The key reduction is to replace interval counting by a pair-counting problem on weighted prefix sums.

Step 1: Generate the Möbius Values Efficiently

We need $\mu(1),\mu(2),\dots,\mu(n)$. The Möbius function takes the values

$\mu(m)=\begin{cases} 1, & m=1,\\ (-1)^t, & m\text{ is a product of }t\text{ distinct primes},\\ 0, & m\text{ is divisible by }p^2\text{ for some prime }p. \end{cases}$

The implementations build these values with a linear sieve, so the preprocessing is essentially one forward pass through the integers.

Step 2: Express the Interval Condition with Two Counts

For an interval $I=[l,r]$, define

$P_I=\#\{k\in I:\mu(k)=1\},\qquad N_I=\#\{k\in I:\mu(k)=-1\}.$

The interval is counted precisely when neither sign dominates the other by more than the factor $100/99$:

$99P_I\le 100N_I,\qquad 99N_I\le 100P_I.$

This formulation makes clear that entries with $\mu(k)=0$ do not affect the balance at all. In particular, an interval containing only zeros is valid because then $P_I=N_I=0$.

Step 3: Pass to the Complementary Bad Families

Instead of counting good intervals directly, count the two ways an interval can fail:

$\mathcal{A}=\{I:99P_I-100N_I\gt 0\},\qquad \mathcal{B}=\{I:99N_I-100P_I\gt 0\}.$

These two families are disjoint. Indeed, if both inequalities held, then adding them would give

$-(P_I+N_I)\gt 0,$

which is impossible. Therefore

$C(n)=\frac{n(n+1)}{2}-\#\mathcal{A}-\#\mathcal{B}.$

The total number of intervals appears because there are $n(n+1)/2$ choices of $[l,r]\subseteq [1,n]$.

Step 4: Turn Each Bad Family into a Weighted Prefix-Sum Problem

Define two weight systems:

$a(k)=\begin{cases} 99, & \mu(k)=1,\\ -100, & \mu(k)=-1,\\ 0, & \mu(k)=0, \end{cases}\qquad b(k)=\begin{cases} -100, & \mu(k)=1,\\ 99, & \mu(k)=-1,\\ 0, & \mu(k)=0. \end{cases}$

Now define prefix sums

$A_j=\sum_{k=1}^{j}a(k),\qquad B_j=\sum_{k=1}^{j}b(k),\qquad A_0=B_0=0.$

For an interval $I=[l,r]$,

$A_r-A_{l-1}=99P_I-100N_I,\qquad B_r-B_{l-1}=99N_I-100P_I.$

So

$I\in\mathcal{A}\iff A_r\gt A_{l-1},\qquad I\in\mathcal{B}\iff B_r\gt B_{l-1}.$

This converts interval counting into the problem of counting ordered pairs $(i,j)$ with $0\le i\lt j\le n$ and a strict prefix inequality.

Step 5: Count Increasing Prefix Pairs

For the first bad family we must count

$\#\mathcal{A}=\#\{(i,j):0\le i\lt j\le n,\ A_j\gt A_i\},$

and similarly

$\#\mathcal{B}=\#\{(i,j):0\le i\lt j\le n,\ B_j\gt B_i\}.$

Scanning the prefixes from left to right, we only need to know how many earlier prefix values are strictly smaller than the current one. A Fenwick tree supports exactly this operation after the prefix values are shifted into a dense range or compressed into sorted coordinates.

Hence the full problem becomes a pair of standard order-statistics passes over two prefix arrays.

Worked Example: $n=10$

The Möbius values are

$\bigl(\mu(1),\dots,\mu(10)\bigr)=(1,-1,-1,0,-1,1,-1,0,0,1).$

Some representative intervals are easy to classify:

$[1,1]:\ P_I=1,\ N_I=0\ \Rightarrow\ 99P_I-100N_I=99\gt 0,$

so $[1,1]\in\mathcal{A}$.

$[2,3]:\ P_I=0,\ N_I=2\ \Rightarrow\ 99N_I-100P_I=198\gt 0,$

so $[2,3]\in\mathcal{B}$.

$[1,2]:\ P_I=1,\ N_I=1,$

hence both balance inequalities hold and $[1,2]$ is counted. Also $[8,9]$ contains only zeros, so $P_I=N_I=0$ and it is counted as well.

For the first weight system, the prefix sums are

$A=(0,99,-1,-101,-101,-201,-102,-202,-202,-202,-103),$

and for the second they are

$B=(0,-100,-1,98,98,197,97,196,196,196,96).$

The number of increasing pairs in the first sequence is $6$, and in the second sequence it is $36$. Since the total number of intervals is

$\frac{10\cdot 11}{2}=55,$

we obtain

$C(10)=55-6-36=13,$

which matches the checkpoint used by the implementations.

How the Code Works

The C++, Python, and Java implementations all follow the same pipeline. First they compute the Möbius values up to $n$ with a linear sieve. Then they run the interval-counting routine twice: once with weights $(99,-100)$ and once with weights $(-100,99)$. In each pass they build the weighted prefix sums, map those prefix values to Fenwick-tree indices, and accumulate how many earlier prefixes are strictly smaller than the current prefix. After obtaining the two bad counts, they subtract them from $n(n+1)/2$.

The C++ and Java implementations also include a coordinate-compression fallback when the raw prefix-value span would make a dense Fenwick tree unnecessarily large. The mathematical quantity being counted is the same in every language.

Complexity Analysis

Computing all Möbius values up to $n$ by linear sieve costs $O(n)$ time and $O(n)$ memory. Each weighted pass constructs one prefix array in $O(n)$ time and performs Fenwick queries and updates in $O(n\log n)$ time in the worst case; when the prefix span is already dense, the logarithm is taken over that span instead of over the number of distinct compressed values. Therefore the overall method runs in $O(n\log n)$ time with $O(n)$ memory, a dramatic improvement over naive $O(n^2)$ interval enumeration.

Footnotes and References

Problem page: https://projecteuler.net/problem=464
Möbius function: Wikipedia - Möbius function
Fenwick tree: Wikipedia - Fenwick tree
Prefix sum: Wikipedia - Prefix sum