Problem 630: Crossed Lines
View on Project EulerProject Euler Problem 630 Solution
EulerSolve provides an optimized solution for Project Euler Problem 630, Crossed Lines, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary The problem defines a deterministic pseudo-random sequence, uses it to generate \(n\) lattice points in the square \([-1000,999]^2\), and then forms a geometric line from every unordered pair of distinct points. Let \(M\) be the number of distinct lines that appear. For each distinct line \(L\), count how many other distinct lines cross it, and sum those counts over all \(L\). That total is \(S\), which is equivalently the number of ordered pairs of distinct non-parallel lines. The task is to compute \(S\) for \(n=2500\). Mathematical Approach The solution has two parts: represent every geometric line in a unique integer form, then count how many of those distinct lines share the same slope. Step 1: Generate the deterministic point set The points come from the recurrence $S_0=290797,\qquad S_{k+1}=S_k^2 \bmod 50515093,$ followed by $T_k=(S_k \bmod 2000)-1000,\qquad P_n=(T_{2n-1},T_{2n}).$ So every point has integer coordinates in \([-1000,999]\). The sequence is completely deterministic, which means every implementation works with exactly the same input set. Step 2: Represent every geometric line canonically Take two distinct points \((x_1,y_1)\) and \((x_2,y_2)\)....
Detailed mathematical approach
Problem Summary
The problem defines a deterministic pseudo-random sequence, uses it to generate \(n\) lattice points in the square \([-1000,999]^2\), and then forms a geometric line from every unordered pair of distinct points. Let \(M\) be the number of distinct lines that appear. For each distinct line \(L\), count how many other distinct lines cross it, and sum those counts over all \(L\). That total is \(S\), which is equivalently the number of ordered pairs of distinct non-parallel lines. The task is to compute \(S\) for \(n=2500\).
Mathematical Approach
The solution has two parts: represent every geometric line in a unique integer form, then count how many of those distinct lines share the same slope.
Step 1: Generate the deterministic point set
The points come from the recurrence
$S_0=290797,\qquad S_{k+1}=S_k^2 \bmod 50515093,$
followed by
$T_k=(S_k \bmod 2000)-1000,\qquad P_n=(T_{2n-1},T_{2n}).$
So every point has integer coordinates in \([-1000,999]\). The sequence is completely deterministic, which means every implementation works with exactly the same input set.
Step 2: Represent every geometric line canonically
Take two distinct points \((x_1,y_1)\) and \((x_2,y_2)\). Write
$\Delta x=x_2-x_1,\qquad \Delta y=y_2-y_1,\qquad g=\gcd(|\Delta x|,|\Delta y|).$
After dividing by \(g\), the primitive direction vector is
$u=\frac{\Delta x}{g},\qquad v=\frac{\Delta y}{g}.$
A primitive normal vector is then
$A=v,\qquad B=-u,$
and the line can be written as
$Ax+By=C,\qquad C=Ax_1+By_1.$
To make this representation unique, multiply \((A,B,C)\) by \(-1\) whenever \(A \lt 0\), or when \(A=0\) and \(B \lt 0\). After that sign rule, one geometric line corresponds to exactly one normalized triple \((A,B,C)\). Vertical and horizontal lines are handled automatically by the same formula.
Step 3: Count distinct lines
There are
$P=\binom{n}{2}$
unordered point pairs. Each pair produces one candidate line key, except coincident points, which are ignored because they do not determine a unique line. If three or more points are collinear, several different pairs produce the same normalized triple. Let \(\mathcal{L}\) be the set of distinct normalized triples after deduplication. Then
$M=\lvert\mathcal{L}\rvert.$
Step 4: Group parallel classes and derive the crossing formula
Two distinct normalized lines are parallel exactly when they have the same normalized normal vector \((A,B)\). Let the parallel classes have sizes
$m_1,m_2,\dots,m_r,\qquad m_1+\cdots+m_r=M.$
The number of ordered pairs of distinct lines is
$M(M-1).$
Within a single class of size \(m_t\), the ordered parallel pairs number
$m_t(m_t-1).$
Once duplicate lines have already been merged, any two distinct non-parallel lines intersect exactly once in the Euclidean plane. Hence the required total is
$\boxed{S=M(M-1)-\sum_{t=1}^{r} m_t(m_t-1).}$
Step 5: Worked example with the first three generated points
The first three generated points are
$P_1=(527,144),\qquad P_2=(-488,732),\qquad P_3=(-454,-947).$
The three normalized lines are
$L_{12}: 84x+145y=65148,$
$L_{13}: 1091x-981y=433693,$
$L_{23}: 1679x+34y=-794464.$
All three triples are different, and no two of them share the same \((A,B)\). So every parallel class has size \(1\), which gives
$M=3,\qquad S=3\cdot 2-0=6.$
This is the smallest benchmark used by the implementation, and it confirms that the ordered-pair interpretation is the correct one.
How the Code Works
The C++, Python, and Java implementations all follow the same mathematical pipeline. They generate the point sequence, iterate over every pair \(i<j\), compute the normalized line coefficients, and collect those line keys. The Python implementation stores tuple keys in a set and sorts the distinct set afterward, while the C++ and Java implementations store compact integer keys so the full list can be sorted directly and deduplicated efficiently.
After sorting, consecutive equal keys are merged to obtain \(M\). A second linear scan groups consecutive lines by their slope part \((A,B)\). If one group contains \(m\) distinct lines, the implementation adds \(m(m-1)\) to the total number of ordered parallel pairs. Subtracting that sum from \(M(M-1)\) yields the final answer.
The compiled implementation also includes deterministic sanity checks: it confirms the first three generated points and verifies the benchmark values \(M=4948\) and \(S=24477690\) for the first \(100\) generated points before evaluating the full case \(n=2500\).
Complexity Analysis
Let \(P=\binom{n}{2}\). Generating all candidate lines takes \(O(P)\) arithmetic operations, and the gcd inputs are bounded because all coordinates stay inside a fixed box. Sorting dominates the runtime, so the total time is \(O(P\log P)\), which is \(O(n^2\log n)\). The memory usage is \(O(P)\), since the algorithm stores one key per point pair before deduplication.
Footnotes and References
- Problem page: https://projecteuler.net/problem=630
- Line (geometry): Wikipedia — Line (geometry)
- Greatest common divisor: Wikipedia — Greatest common divisor
- Finding the equation of a line from two points: cp-algorithms — Finding the equation of a line for a segment
- Line arrangement: Wikipedia — Line arrangement