Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 733: Ascending Subsequences

Project Euler Problem 733 Solution

EulerSolve provides an optimized solution for Project Euler Problem 733, Ascending Subsequences, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary The sequence is $a_i \equiv 153^i \pmod{10000019},\qquad 1 \le i \le 10^6.$ We must compute the sum, modulo $10^9+7$, over all index quadruples $i_1<i_2<i_3<i_4$ whose values are also strictly increasing: $S=\sum_{\substack{1\le i_1<i_2<i_3<i_4\le 10^6\\a_{i_1}<a_{i_2}<a_{i_3}<a_{i_4}}}\left(a_{i_1}+a_{i_2}+a_{i_3}+a_{i_4}\right)\pmod{10^9+7}.$ A direct $O(n^4)$ search is hopeless. The implementation instead counts increasing subsequences by length and keeps both how many there are and what their total element-sum contribution is. Mathematical Approach The key observation is that the final quantity is not only about counting valid quadruples. For every valid quadruple we also need the sum of its four values, so the dynamic program must track two pieces of information at every stage. Step 1: Define the DP states For $t\in\{1,2,3,4\}$, let $N_t(i)$ be the number of strictly increasing subsequences of length $t$ that end at position $i$. Let $W_t(i)$ be the sum of the element-sums of all those subsequences. For length $1$, the only subsequence ending at $i$ is $(a_i)$, so $N_1(i)=1,\qquad W_1(i)=a_i.$ For larger $t$, the last element is fixed at $a_i$, and the previous element must come from some earlier position $j<i$ with $a_j<a_i$....

Detailed mathematical approach

Problem Summary

The sequence is

$a_i \equiv 153^i \pmod{10000019},\qquad 1 \le i \le 10^6.$

We must compute the sum, modulo $10^9+7$, over all index quadruples $i_1<i_2<i_3<i_4$ whose values are also strictly increasing:

$S=\sum_{\substack{1\le i_1<i_2<i_3<i_4\le 10^6\\a_{i_1}<a_{i_2}<a_{i_3}<a_{i_4}}}\left(a_{i_1}+a_{i_2}+a_{i_3}+a_{i_4}\right)\pmod{10^9+7}.$

A direct $O(n^4)$ search is hopeless. The implementation instead counts increasing subsequences by length and keeps both how many there are and what their total element-sum contribution is.

Mathematical Approach

The key observation is that the final quantity is not only about counting valid quadruples. For every valid quadruple we also need the sum of its four values, so the dynamic program must track two pieces of information at every stage.

Step 1: Define the DP states

For $t\in\{1,2,3,4\}$, let $N_t(i)$ be the number of strictly increasing subsequences of length $t$ that end at position $i$. Let $W_t(i)$ be the sum of the element-sums of all those subsequences.

For length $1$, the only subsequence ending at $i$ is $(a_i)$, so

$N_1(i)=1,\qquad W_1(i)=a_i.$

For larger $t$, the last element is fixed at $a_i$, and the previous element must come from some earlier position $j<i$ with $a_j<a_i$.

Step 2: Derive the recurrence

If a valid subsequence of length $t-1$ ends at $j$, then appending $a_i$ produces a valid subsequence of length $t$ ending at $i$ whenever $j<i$ and $a_j<a_i$. Therefore

$N_t(i)=\sum_{\substack{j<i\\a_j<a_i}} N_{t-1}(j),\qquad t\ge 2.$

For the weighted version, every such extension adds $a_i$ once to each older subsequence. Hence

$W_t(i)=\sum_{\substack{j<i\\a_j<a_i}}\left(W_{t-1}(j)+a_i\,N_{t-1}(j)\right),\qquad t\ge 2.$

Rearranging gives the more useful form

$W_t(i)=\sum_{\substack{j<i\\a_j<a_i}} W_{t-1}(j)+a_i\,N_t(i).$

Once $W_4(i)$ is known for every $i$, the final answer is simply

$S=\sum_{i=1}^{10^6} W_4(i)\pmod{10^9+7}.$

Step 3: Convert value comparisons into rank comparisons

Let the distinct sequence values in sorted order be

$b_1<b_2<\dots<b_m,$

and assign each $a_i$ its compressed rank $r_i\in\{1,\dots,m\}$. Then

$a_j<a_i \iff r_j<r_i.$

This matters because the recurrence needs sums over all earlier positions whose values are smaller than the current value. After compression that becomes a prefix query over ranks $1$ through $r_i-1$. Using $r_i-1$ rather than $r_i$ enforces the strict inequality automatically, even if equal values ever appear.

Step 4: Maintain prefix totals with Fenwick trees

While scanning the sequence from left to right, define prefix accumulators

$F_t(x)=\sum_{\substack{j<i\\r_j\le x}} N_t(j),\qquad G_t(x)=\sum_{\substack{j<i\\r_j\le x}} W_t(j).$

At the moment we process position $i$, these contain information from earlier positions only, so the index condition $j<i$ is already built in. The recurrences become

$N_1(i)=1,\qquad W_1(i)=a_i,$

$N_t(i)=F_{t-1}(r_i-1),\qquad W_t(i)=G_{t-1}(r_i-1)+a_i\,N_t(i),\qquad t=2,3,4.$

Each pair $(F_t,G_t)$ can be maintained by a Fenwick tree over ranks, so every query and update costs $O(\log m)$.

Step 5: Why only three Fenwick structures are needed

To compute length $4$, the implementation only needs prefix information for lengths $1$, $2$, and $3$. After calculating $W_4(i)$, that contribution is added directly to the answer; no later step ever needs subsequences of length $4$ as input. Therefore the data structure layer stores only the first three levels.

This keeps the scan compact:

$\text{query length 1}\rightarrow \text{build length 2},$

$\text{query length 2}\rightarrow \text{build length 3},$

$\text{query length 3}\rightarrow \text{build length 4 contribution}.$

Worked Example: the first six terms

The first six generated values are

$153,\ 23409,\ 3581577,\ 7980255,\ 976697,\ 9434375.$

After sorting these six distinct values, their compressed ranks are

$1,\ 2,\ 4,\ 5,\ 3,\ 6.$

The valid increasing quadruples are

$\begin{aligned} &(1,2,3,4),\quad (1,2,3,6),\quad (1,2,4,6),\\ &(1,2,5,6),\quad (1,3,4,6),\quad (2,3,4,6). \end{aligned}$

Their individual value-sums are

$11585394,\ 13039514,\ 17438192,\ 10434634,\ 20996360,\ 21019616,$

so the total is

$94513710.$

The DP sees the same structure incrementally. At the fourth term, there is exactly one eligible length-$3$ subsequence before it, so it contributes $11585394$. The fifth term has rank $3$, so no earlier length-$3$ subsequence ends below that rank and it contributes $0$. The sixth term can extend five earlier length-$3$ subsequences, and its combined contribution is

$13039514+17438192+10434634+20996360+21019616=82928316.$

Adding the fourth and sixth term contributions gives

$11585394+82928316=94513710,$

which matches the exact small-prefix check.

How the Code Works

The C++, Python, and Java implementations follow the same pipeline. First they generate the $10^6$ sequence values iteratively modulo $10000019$. Next they sort the distinct values and replace each sequence element by its compressed rank.

During the main left-to-right pass, the implementation maintains three Fenwick structures, one for each subsequence length $1$, $2$, and $3$. Every structure stores two prefix aggregates at once: counts of increasing subsequences ending in each rank range, and the total sum of all element-sums of those subsequences.

For a new value, it queries only smaller ranks, which simultaneously enforces the value condition $a_j<a_i$ and, because the scan is left to right, the index condition $j<i$. From the queried length-$1$ data it builds the length-$2$ contribution, from length $2$ it builds length $3$, and from length $3$ it computes the length-$4$ contribution that is added straight into the final answer. After that, the newly created length-$1$, length-$2$, and length-$3$ data are inserted back into the Fenwick structures at the current rank.

All arithmetic is reduced modulo $10^9+7$ as the scan proceeds, so intermediate weighted sums never need arbitrary-precision storage.

Complexity Analysis

Let $n=10^6$ and let $m$ be the number of distinct generated values. Building the sequence costs $O(n)$. Coordinate compression requires sorting, so it costs $O(n\log n)$. The dynamic scan performs only a constant number of Fenwick prefix queries and updates per element, each in $O(\log m)$, for a total of $O(n\log m)$ after compression.

Since $m\le n$, the end-to-end running time is $O(n\log n)$. The implementations store the sequence, its sorted distinct copy, the compressed ranks, and three Fenwick layers, so memory usage is $O(n+m)$, which is $O(n)$ in this problem.

Footnotes and References

Problem page: https://projecteuler.net/problem=733
Fenwick tree: Wikipedia — Fenwick tree
Dynamic programming: Wikipedia — Dynamic programming
Subsequence: Wikipedia — Subsequence
Modular arithmetic: Wikipedia — Modular arithmetic