Problem 438: Integer Part of Polynomial Equation's Solutions
View on Project EulerProject Euler Problem 438 Solution
EulerSolve provides an optimized solution for Project Euler Problem 438, Integer Part of Polynomial Equation's Solutions, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary We consider monic degree-\(7\) polynomials $P(t)=t^7+a_1t^6+a_2t^5+a_3t^4+a_4t^3+a_5t^2+a_6t+a_7$ with integer coefficients, all roots real, and sorted roots \(r_1\le \cdots \le r_7\) satisfying $\lfloor r_i\rfloor=i \qquad (1\le i\le 7).$ For every admissible coefficient vector \(a=(a_1,\dots,a_7)\), define $S(a)=\sum_{i=1}^7 |a_i|.$ The goal is to sum \(S(a)\) over all such polynomials. Mathematical Approach Step 1: Rewrite the polynomial through its roots Because the roots are real and lie in the positive intervals \([1,2),[2,3),\dots,[7,8)\), we may write $P(t)=\prod_{i=1}^7 (t-r_i)=t^7-e_1t^6+e_2t^5-e_3t^4+e_4t^3-e_5t^2+e_6t-e_7,$ where \(e_m\) is the \(m\)-th elementary symmetric sum of the roots. Therefore $a_m=(-1)^m e_m \qquad (1\le m\le 7).$ Since every root is positive, all \(e_m\) are positive as well, and the objective becomes simply $S(a)=e_1+e_2+\cdots+e_7.$ So the search is really over the positive symmetric sums \(e_1,\dots,e_7\), not over arbitrary signed coefficients....
Detailed mathematical approach
Problem Summary
We consider monic degree-\(7\) polynomials
$P(t)=t^7+a_1t^6+a_2t^5+a_3t^4+a_4t^3+a_5t^2+a_6t+a_7$
with integer coefficients, all roots real, and sorted roots \(r_1\le \cdots \le r_7\) satisfying
$\lfloor r_i\rfloor=i \qquad (1\le i\le 7).$
For every admissible coefficient vector \(a=(a_1,\dots,a_7)\), define
$S(a)=\sum_{i=1}^7 |a_i|.$
The goal is to sum \(S(a)\) over all such polynomials.
Mathematical Approach
Step 1: Rewrite the polynomial through its roots
Because the roots are real and lie in the positive intervals \([1,2),[2,3),\dots,[7,8)\), we may write
$P(t)=\prod_{i=1}^7 (t-r_i)=t^7-e_1t^6+e_2t^5-e_3t^4+e_4t^3-e_5t^2+e_6t-e_7,$
where \(e_m\) is the \(m\)-th elementary symmetric sum of the roots. Therefore
$a_m=(-1)^m e_m \qquad (1\le m\le 7).$
Since every root is positive, all \(e_m\) are positive as well, and the objective becomes simply
$S(a)=e_1+e_2+\cdots+e_7.$
So the search is really over the positive symmetric sums \(e_1,\dots,e_7\), not over arbitrary signed coefficients.
Step 2: Immediate bounds from the root intervals
For the elementary symmetric polynomial
$E_m(b_1,\dots,b_7)=\sum_{1\le i_1<\cdots<i_m\le 7} b_{i_1}\cdots b_{i_m},$
monotonicity on positive inputs gives
$E_m(1,2,\dots,7)\le e_m < E_m(2,3,\dots,8).$
Evaluating these endpoint sums yields the integer coefficient box
$\begin{aligned} 28\le e_1\le 34,\qquad &322\le e_2\le 510,\qquad 1960\le e_3\le 4024,\\ 6769\le e_4\le 18423,\qquad &13132\le e_5\le 48859,\\ 13068\le e_6\le 69263,\qquad &5040\le e_7\le 40319. \end{aligned}$
These bounds already remove the overwhelming majority of impossible tuples.
Step 3: Convert the floor condition into sign conditions
For an integer \(k\in\{1,\dots,8\}\), the value
$P(k)=\prod_{i=1}^7 (k-r_i)$
has a prescribed sign. Indeed, when \(P(k)\neq 0\), exactly \(8-k\) factors are negative, so
$(-1)^k P(k)\ge 0 \qquad (1\le k\le 8).$
Written in the symmetric sums, this becomes the affine inequality
$(-1)^k\left(k^7-e_1k^6+e_2k^5-e_3k^4+e_4k^3-e_5k^2+e_6k-e_7\right)\ge 0.$
If all eight inequalities are strict, then the sign alternates between consecutive integers, so the intermediate value theorem forces one root in each open interval \((k,k+1)\). Since the degree is \(7\), that already accounts for all roots.
Step 4: Boundary roots require a derivative check
Equality \(P(k)=0\) means a root lies exactly on an integer boundary. The sign test alone is not enough there: the polynomial must cross the axis in the direction compatible with the interval \([k,k+1)\).
Near a simple root at \(k\),
$P(t)\approx P'(k)(t-k),$
so the admissible orientation is
$(-1)^{k+1}P'(k)>0 \qquad (1\le k\le 7).$
At \(k=8\), equality is impossible for a valid polynomial, because no root may have integer part \(8\). This is exactly why the implementation performs a final endpoint validation after the linear range computation.
Step 5: The admissible region is linear
We now have two kinds of constraints:
$\text{box bounds for } e_1,\dots,e_7,$
and
$8 \text{ affine sign constraints coming from } P(1),P(2),\dots,P(8).$
Every one of these conditions is linear in the variables \(e_1,\dots,e_7\). Therefore the feasible set is a polyhedral region inside the coefficient box.
Step 6: Eliminate later variables before the search
The implementation applies Fourier-Motzkin elimination to project away the later coefficients one by one. After these projections have been precomputed, fixing the first few coefficients immediately produces a much tighter lower and upper bound for the next one.
So the search does not walk through the entire raw box. It chooses coefficients in order, repeatedly intersects the static interval from Step 2 with the projected linear interval coming from the remaining sign conditions, and abandons the branch as soon as the interval becomes empty.
Step 7: Why the last coefficient is summed as an interval
After \(e_1,\dots,e_6\) have been chosen, the remaining valid values of \(e_7\) form a single integer interval \([L,R]\). Because all constraints are linear in \(e_7\), any equality case \(P(k)=0\) can only occur at the ends of that interval, so only the endpoints need the derivative test from Step 4.
Once the valid interval is known, its total contribution to the objective is
$\sum_{u=L}^{R}(e_1+\cdots+e_6+u)=(R-L+1)(e_1+\cdots+e_6)+\frac{(L+R)(R-L+1)}{2}.$
This arithmetic-series formula is what lets the implementation aggregate an entire final interval in constant time.
How the Code Works
The C++, Python, and Java implementations use the same mathematical pipeline. They enumerate the positive symmetric sums instead of the signed coefficients, start from the interval bounds implied by the root locations, precompute projected linear constraints, and then run a depth-first search whose branches are cut as soon as a coefficient interval is empty. For the final coefficient, they validate only boundary values that make some \(P(k)\) vanish, and then add the whole interval contribution with the closed formula above rather than iterating one integer at a time.
Complexity Analysis
Brute force over the coefficient box would be astronomically large. The implemented method is still exponential in the number of free coefficients in the worst case, because it is fundamentally a search problem, but the projected linear bounds reduce the branching factor dramatically. Memory usage stays small: the algorithm stores only a few projected linear systems, the current recursion stack, and the running total.
Footnotes and References
- Problem page: https://projecteuler.net/problem=438
- Vieta's formulas: Wikipedia — Vieta's formulas
- Elementary symmetric polynomial: Wikipedia — Elementary symmetric polynomial
- Fourier-Motzkin elimination: Wikipedia — Fourier-Motzkin elimination
- Intermediate value theorem: Wikipedia — Intermediate value theorem