Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 247: Squares Under a Hyperbola

Project Euler Problem 247 Solution

EulerSolve provides an optimized solution for Project Euler Problem 247, Squares Under a Hyperbola, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary Problem 247 builds an infinite family of axis-aligned squares inside the region below the rectangular hyperbola $y=1/x$, starting from the corner $(1,0)$. From any exposed lower-left corner $(x,y)$, the next square is defined to be the unique largest square anchored at that corner whose upper-right corner still lies on the curve. Each square receives a label $(\text{left},\text{below})$. Moving to the right child increases $\text{left}$ by 1, and moving to the upper child increases $\text{below}$ by 1. All squares are then ranked globally by decreasing side length, not by tree depth. The actual target in this problem is the last square labeled $(3,3)$, so the task is to determine where the 20th such square appears in that global size order. Mathematical Approach The geometry gives a closed formula for the next square, and the recursive construction turns naturally into a best-first search on exposed corners. The crucial point is that the code never tries to list squares in depth-first or breadth-first order; it always extracts the currently largest available square. The maximal square anchored at a corner Fix a corner $(x,y)$ with $x > 0$ and $0 \le y < 1/x$. Let $t=t(x,y)$ be the side length of the largest square whose lower-left corner is $(x,y)$ and whose sides remain parallel to the axes....

Detailed mathematical approach

Problem Summary

Problem 247 builds an infinite family of axis-aligned squares inside the region below the rectangular hyperbola $y=1/x$, starting from the corner $(1,0)$. From any exposed lower-left corner $(x,y)$, the next square is defined to be the unique largest square anchored at that corner whose upper-right corner still lies on the curve.

Each square receives a label $(\text{left},\text{below})$. Moving to the right child increases $\text{left}$ by 1, and moving to the upper child increases $\text{below}$ by 1. All squares are then ranked globally by decreasing side length, not by tree depth. The actual target in this problem is the last square labeled $(3,3)$, so the task is to determine where the 20th such square appears in that global size order.

Mathematical Approach

The geometry gives a closed formula for the next square, and the recursive construction turns naturally into a best-first search on exposed corners. The crucial point is that the code never tries to list squares in depth-first or breadth-first order; it always extracts the currently largest available square.

The maximal square anchored at a corner

Fix a corner $(x,y)$ with $x > 0$ and $0 \le y < 1/x$. Let $t=t(x,y)$ be the side length of the largest square whose lower-left corner is $(x,y)$ and whose sides remain parallel to the axes. The square occupies $[x,x+t]\times[y,y+t]$, so maximality occurs exactly when its upper-right corner touches the hyperbola:

$y+t=\frac{1}{x+t}.$

Multiplying through gives a quadratic equation:

$t^2+(x+y)t+(xy-1)=0.$

The positive root is

$t(x,y)=\frac{\sqrt{(x-y)^2+4}-(x+y)}{2}.$

This is the quantity computed in all three implementations. Because every generated corner stays strictly below the curve, the expression under the square root is positive and the chosen root is the unique feasible side length.

The recursive state space of exposed corners

Once the square of side $t$ has been placed at $(x,y)$, two new exposed lower-left corners matter for the continuation:

$\text{right child}: (x+t,y),\qquad \text{upper child}: (x,y+t).$

Those two children correspond exactly to the two ways a later square can first touch the placed square: along its right side or along its top side. If a square has label $(L,B)$, then its children have labels

$ (L+1,B)\qquad\text{and}\qquad (L,B+1). $

So the whole construction is a binary tree of corners. The geometric data of a node are the corner $(x,y)$ and the induced side $t(x,y)$; the combinatorial data are the two label counters $(L,B)$.

Why the priority queue gives the true global ordering

The ranking required by the problem is global: at every step we need the largest square among all squares that could ever appear later, not just among the children of the most recently processed node. The key monotonicity fact is that moving right or upward can only shrink the next square. If $x_1 \ge x_0$ and $y_1 \ge y_0$, with at least one inequality strict, then the feasible region under $y=1/x$ above $(x_1,y_1)$ is smaller than the one above $(x_0,y_0)$, hence

$t(x_1,y_1) < t(x_0,y_0).$

Therefore every descendant of a node is strictly smaller than that node. At any moment, every unseen square is a descendant of one of the corners already sitting on the frontier, so the largest unseen square must already be present on that frontier. A max-heap over side lengths is therefore enough to reproduce the exact global order demanded by the problem.

Counting how many target squares must be seen

The label recurrence is purely combinatorial. To reach $(L,B)$, one must take $L$ right moves and $B$ up moves in some order. If $N(L,B)$ is the number of nodes with that label, then the tree structure gives Pascal's recurrence

$N(L,B)=N(L-1,B)+N(L,B-1),$

with boundary values $N(L,0)=N(0,B)=1$. Hence

$N(L,B)=\binom{L+B}{L}.$

For the actual target $(3,3)$, this means

$N(3,3)=\binom{6}{3}=20.$

So the correct stopping rule is not “stop at the first $(3,3)$ square”. The algorithm must continue until the 20th such square has been popped from the heap, because that pop index is the required answer.

Worked example: the two squares labeled $(1,1)$

The root corner is $(1,0)$, so the first square has side

$t(1,0)=\frac{\sqrt{5}-1}{2}\approx 0.6180339887.$

Its right child is anchored at $(1+t(1,0),0)\approx(1.61803,0)$, with side

$t(1.61803,0)\approx 0.4772599965.$

Its upper child is anchored at $(1,t(1,0))\approx(1,0.61803)$, with side

$t(1,0.61803)\approx 0.2090569265.$

Now look at label $(1,1)$. There are two paths to it, $RU$ and $UR$, so there are two distinct squares. Their sides are approximately

$t_{RU}\approx 0.1035877034,\qquad t_{UR}\approx 0.1292042862.$

The labels match, but the sizes do not. The $UR$ square appears earlier in the global ranking because it is larger. This small example is exactly why the code counts occurrences of a target label instead of stopping at the first match.

How the Code Works

The C++, Python, and Java implementations all follow the same best-first search, differing only in language syntax and numeric types.

State kept for each candidate square

Each heap entry stores four pieces of information: the current label $(\text{left},\text{below})$, the geometric corner $(x,y)$, and the side length obtained from the closed formula above. The priority key is simply the side length, ordered so that the largest pending square is removed first.

One pop, one rank, two pushes

The search starts with the root corner $(1,0)$. Whenever the implementation pops the current maximum from the heap, that pop number is exactly the square's global index. If the popped label is $(3,3)$, the counter of seen target squares is incremented. Then the implementation computes the right and upper child corners, evaluates their side lengths with the same hyperbola formula, and pushes both children into the heap.

Why the stopping rule is exact

Because there are exactly $\binom{6}{3}=20$ squares labeled $(3,3)$, the loop stops when the 20th one is popped. The stored pop counter at that moment is the desired index. No post-processing is needed: the heap order is already the ranking order defined by the problem statement.

Complexity Analysis

Let $A(L,B)$ denote the index of the last square with label $(L,B)$ in the global size order. To solve the target $(L,B)$, the implementation must perform exactly $A(L,B)$ heap pops. Each pop is paired with two pushes, and each heap operation costs $O(\log M)$ when the heap currently contains $M$ elements.

Therefore the running time is $O(A(L,B)\log A(L,B))$. After $M$ pops the heap contains $M+1$ nodes, so the memory usage is $O(A(L,B))$. For the concrete Project Euler target $(3,3)$, this is easily practical.

Footnotes and References

Project Euler problem page: Project Euler 247 - Squares under a hyperbola
Rectangular hyperbola: Wikipedia - Rectangular hyperbola
Quadratic equation: Wikipedia - Quadratic equation
Binomial coefficient: Wikipedia - Binomial coefficient
Priority queue: Wikipedia - Priority queue