Problem 415: Titanic Sets
View on Project EulerProject Euler Problem 415 Solution
EulerSolve provides an optimized solution for Project Euler Problem 415, Titanic Sets, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary Let \(G_N=\{(x,y)\in\mathbb{Z}^2:0\le x,y\le N\}\). A finite set \(S\subseteq G_N\) is called titanic if some line passes through exactly two points of \(S\). We must compute \(T(N)\), the number of titanic subsets of \(G_N\), modulo \(10^8\). The implementation never enumerates subsets directly. Instead, it counts the complement: empty sets, singletons, and larger subsets whose selected points are all collinear. Mathematical Approach Step 1: Reduce to the Complement The key theorem is Sylvester-Gallai: every finite non-collinear set of real points has an ordinary line, meaning a line containing exactly two of the points. Therefore a lattice subset is not titanic if and only if it is empty, a singleton, or all of its points lie on one common line. If \(\operatorname{NT}(N)\) denotes the number of non-titanic subsets, then $T(N)=2^{(N+1)^2}-\operatorname{NT}(N)\pmod{10^8}.$ Step 2: Replace Subset Counting by Line-Length Counting Suppose a grid line contains exactly \(k\) lattice points. The non-titanic subsets supported on that line are exactly the subsets of size at least \(3\), so that line contributes $\sum_{r=3}^{k}\binom{k}{r}=2^k-1-k-\binom{k}{2}.$ Now define \(F_\ell(N)\) as the number of grid lines containing at least \(\ell+1\) lattice points....
Detailed mathematical approach
Problem Summary
Let \(G_N=\{(x,y)\in\mathbb{Z}^2:0\le x,y\le N\}\). A finite set \(S\subseteq G_N\) is called titanic if some line passes through exactly two points of \(S\). We must compute \(T(N)\), the number of titanic subsets of \(G_N\), modulo \(10^8\).
The implementation never enumerates subsets directly. Instead, it counts the complement: empty sets, singletons, and larger subsets whose selected points are all collinear.
Mathematical Approach
Step 1: Reduce to the Complement
The key theorem is Sylvester-Gallai: every finite non-collinear set of real points has an ordinary line, meaning a line containing exactly two of the points. Therefore a lattice subset is not titanic if and only if it is empty, a singleton, or all of its points lie on one common line.
If \(\operatorname{NT}(N)\) denotes the number of non-titanic subsets, then
$T(N)=2^{(N+1)^2}-\operatorname{NT}(N)\pmod{10^8}.$
Step 2: Replace Subset Counting by Line-Length Counting
Suppose a grid line contains exactly \(k\) lattice points. The non-titanic subsets supported on that line are exactly the subsets of size at least \(3\), so that line contributes
$\sum_{r=3}^{k}\binom{k}{r}=2^k-1-k-\binom{k}{2}.$
Now define \(F_\ell(N)\) as the number of grid lines containing at least \(\ell+1\) lattice points. For a fixed line with \(k\) points, one has the identity
$\sum_{\ell=2}^{k-1}\bigl(2^\ell-(\ell+1)\bigr)=2^k-1-k-\binom{k}{2}.$
Therefore
$\operatorname{NT}(N)=1+(N+1)^2+\sum_{\ell=2}^{N}F_\ell(N)\bigl(2^\ell-(\ell+1)\bigr).$
The whole task is now reduced to computing \(F_\ell(N)\) efficiently for all \(\ell\ge2\).
Step 3: Primitive Directions and Segment Placements
Write \(a=N+1\). Every non-axis grid line has a primitive step vector \((u,v)\) with \(1\le v\le u\) and \(\gcd(u,v)=1\). The pair \((u,v)\) represents the slope classes \(\pm v/u\), and when \(u\ne v\), also the swapped classes \(\pm u/v\). Horizontal and vertical lines are handled separately.
For such a primitive direction, the number of placements of an \(\ell\)-step segment inside the square is
$C_\ell(u,v)=\max(a-\ell u,0)\max(a-\ell v,0).$
If a line contains \(k\) lattice points, then it contributes exactly \(k-\ell\) segments of length \(\ell\) and \(k-\ell-1\) segments of length \(\ell+1\). Their difference is \(1\) exactly when \(k\ge\ell+1\). Hence the number of distinct lines in this direction class that contain at least \(\ell+1\) points is
$C_\ell(u,v)-C_{\ell+1}(u,v).$
After doubling for the two signs and adding the \(a\) horizontal plus \(a\) vertical lines, we recover \(F_\ell(N)\).
Step 4: Totient-Based Direction Statistics
Let
$\Phi(M)=\sum_{d\le M}\varphi(d),\qquad \Psi_1(M)=\sum_{d\le M} d\,\varphi(d),\qquad \Psi_2(M)=\sum_{d\le M} d^2\varphi(d).$
The implementation stores the primitive-direction information in three cumulative quantities:
$c(M)=2\Phi(M)-1,\qquad s_{xy}(M)=3\Psi_1(M)-1,\qquad s_{\prod}(M)=\Psi_2(M).$
Why do these formulas appear? Group primitive directions by their larger coordinate \(d\). For fixed \(d>1\), there are \(\varphi(d)\) reduced numerators, their sum is \(d\varphi(d)/2\), and the swapped directions contribute the same amount. The exceptional diagonal case \(d=1\) explains the subtractive constants.
Step 5: Fast Evaluation of \(\Phi\), \(\Psi_1\), and \(\Psi_2\)
The code does not recompute these sums from scratch for every query. Using \(\sum_{d\mid n}\varphi(d)=n\), one gets the divisor-summatory identities
$\sum_{q=1}^{n}\Phi\!\left(\left\lfloor\frac{n}{q}\right\rfloor\right)=\sum_{m=1}^{n}m=\frac{n(n+1)}{2},$
$\sum_{q=1}^{n}q\,\Psi_1\!\left(\left\lfloor\frac{n}{q}\right\rfloor\right)=\sum_{m=1}^{n}m^2=\frac{n(n+1)(2n+1)}{6},$
$\sum_{q=1}^{n}q^2\,\Psi_2\!\left(\left\lfloor\frac{n}{q}\right\rfloor\right)=\sum_{m=1}^{n}m^3=\left(\frac{n(n+1)}{2}\right)^2.$
Those identities lead to memoized recurrences over quotient blocks \(\left\lfloor n/q\right\rfloor\), which is why the summatory functions can be queried many times without a linear scan up to \(n\).
Step 6: Closed Form for \(F_\ell(N)\)
Set
$M=\left\lfloor\frac{N}{\ell}\right\rfloor,\qquad M'=\left\lfloor\frac{N}{\ell+1}\right\rfloor,\qquad a=N+1.$
On any interval of \(\ell\) where \(M\) and \(M'\) stay constant, the line count becomes a quadratic polynomial in \(\ell\):
$F_\ell(N)=2a+2\left(q_2\ell^2+q_1\ell+q_0\right),$
where
$q_2=s_{\prod}(M)-s_{\prod}(M'),$
$q_1=a\bigl(s_{xy}(M')-s_{xy}(M)\bigr)-2s_{\prod}(M'),$
$q_0=a^2\bigl(c(M)-c(M')\bigr)+a\,s_{xy}(M')-s_{\prod}(M').$
This is exactly the quadratic block formula that appears in the implementation.
Step 7: Block Summation
The values of \(\left\lfloor N/\ell\right\rfloor\) change only \(O(\sqrt N)\) times, so the final summation is performed block by block. For \(\ell\in[L_1,L_2]\), the contribution is
$\sum_{\ell=L_1}^{L_2}F_\ell(N)\bigl(2^\ell-(\ell+1)\bigr).$
After substituting the quadratic form, the implementation only needs closed forms for
$\sum 2^\ell,\qquad \sum \ell 2^\ell,\qquad \sum \ell^2 2^\ell,\qquad \sum \ell,\qquad \sum \ell^2,\qquad \sum \ell^3.$
The exponential identities used are
$\sum_{j=0}^{n}2^j=2^{n+1}-1,\qquad \sum_{j=0}^{n}j2^j=(n-1)2^{n+1}+2,$
$\sum_{j=0}^{n}j^2 2^j=(n^2-2n+3)2^{n+1}-6.$
Worked Checkpoint: \(N=4\)
For the \(5\times5\) grid, the formula gives
$F_2(4)=32,\qquad F_3(4)=16,\qquad F_4(4)=12.$
Hence
$\operatorname{NT}(4)=1+25+32(1)+16(4)+12(11)=254,$
and therefore
$T(4)=2^{25}-254=33554178,$
which matches the checkpoint verified by the implementation.
How the Code Works
The C++, Python, and Java implementations follow the same mathematical pipeline. They precompute Euler totients up to a fixed cutoff with a linear sieve, build prefix tables for \(\Phi\), \(\Psi_1\), and \(\Psi_2\), and memoize larger arguments so that each summatory query is solved only once.
After that, the implementation enumerates the distinct quotient blocks of \(\lfloor N/\ell\rfloor\). Inside each block it evaluates the quadratic line-count formula, applies the closed forms for the exponential and polynomial sums, accumulates the complement count, and finally subtracts that value from \(2^{(N+1)^2}\) modulo \(10^8\).
Complexity Analysis
Let \(B\) be the totient precomputation cutoff and let \(Q\) be the number of distinct values taken by \(\lfloor N/\ell\rfloor\); one has \(Q=O(\sqrt N)\). The sieve and prefix tables cost near-linear time and \(O(B)\) memory. The outer summation uses only \(O(Q)\) quotient blocks, and the cached summatory totient queries are shared across those blocks instead of being recomputed. In practice, the heavy work is the prefix-table construction plus the \(O(\sqrt N)\)-scale block arithmetic.
References
- Problem page: https://projecteuler.net/problem=415
- Sylvester-Gallai theorem: Wikipedia — Sylvester-Gallai theorem
- Euler's totient function: Wikipedia — Euler's totient function
- Dirichlet hyperbola method: Wikipedia — Dirichlet hyperbola method
- Graham, Knuth, Patashnik, Concrete Mathematics, 2nd ed., Addison-Wesley, Chapter 2.