Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 623: Lambda Count

Project Euler Problem 623 Solution

EulerSolve provides an optimized solution for Project Euler Problem 623, Lambda Count, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary Let $\Lambda(n)$ denote the number of closed lambda-terms that can be written with at most $n$ symbols, where terms that differ only by renaming bound variables are identified as the same object. The task is to compute $\Lambda(2000)$ modulo $10^9+7$. A direct enumeration is hopeless at this scale. The effective strategy is to count terms by exact size, track how many binders are currently available, and postpone the cumulative sum until the very end. Mathematical Approach For the exact-size dynamic program, define $A_d(s)$ as the number of terms of exact size $s$ that are built in a context with $d$ available binders. The final answer is then $\Lambda(n)=\sum_{s=1}^{n} A_0(s).$ The state $d=0$ corresponds to closed terms, because at the outermost level there are no binders available for a free variable to reference. Step 1: Encode $\alpha$-equivalence by binder depth Once $\alpha$-renaming is factored out, a variable occurrence is determined only by which enclosing abstraction it refers to. If the current context contains $d$ surrounding binders, then there are exactly $d$ valid variable choices. That gives the variable contribution $d\mathbf{1}_{s=1},$ where $\mathbf{1}_{s=1}$ enforces that a variable has size $1$....

Detailed mathematical approach

Problem Summary

Let $\Lambda(n)$ denote the number of closed lambda-terms that can be written with at most $n$ symbols, where terms that differ only by renaming bound variables are identified as the same object. The task is to compute $\Lambda(2000)$ modulo $10^9+7$.

A direct enumeration is hopeless at this scale. The effective strategy is to count terms by exact size, track how many binders are currently available, and postpone the cumulative sum until the very end.

Mathematical Approach

For the exact-size dynamic program, define $A_d(s)$ as the number of terms of exact size $s$ that are built in a context with $d$ available binders. The final answer is then

$\Lambda(n)=\sum_{s=1}^{n} A_0(s).$

The state $d=0$ corresponds to closed terms, because at the outermost level there are no binders available for a free variable to reference.

Step 1: Encode $\alpha$-equivalence by binder depth

Once $\alpha$-renaming is factored out, a variable occurrence is determined only by which enclosing abstraction it refers to. If the current context contains $d$ surrounding binders, then there are exactly $d$ valid variable choices.

That gives the variable contribution

$d\mathbf{1}_{s=1},$

where $\mathbf{1}_{s=1}$ enforces that a variable has size $1$. This is the same binder-depth viewpoint that underlies de Bruijn-style representations, but here we use it purely as a counting device.

Step 2: Decompose the term by its outer constructor

The implementations encode the statement's size model in a very direct way. A variable has size $1$. Wrapping a body in one abstraction adds $5$ symbols. Forming an application of two subterms adds $2$ symbols around the pair.

Therefore the exact-size counts satisfy the recurrence

$A_d(s)=d\mathbf{1}_{s=1}+A_{d+1}(s-5)+\sum_{i=1}^{s-3} A_d(i)\,A_d(s-2-i),$

with the usual convention that terms with non-positive or impossible sizes contribute $0$. The application sum uses ordered pairs: exchanging the left and right child generally produces a different term.

Step 3: Bound the relevant depth and size window

Only finitely many depth states can matter for a target bound $n$. To reach depth $d$ from the top level, we must already have paid for $d$ abstractions, which costs $5d$ symbols. Even the cheapest completion below that point is a single variable of size $1$.

Hence any contributing state must satisfy

$5d+1\le n,$

so the maximum relevant depth is

$d_{\max}=\left\lfloor\frac{n-1}{5}\right\rfloor.$

More generally, at depth $d$ we only need exact sizes

$1\le s\le n-5d.$

This truncation is what makes the dynamic program finite and practical.

Step 4: Turn the application term into a running convolution

For fixed depth $d$, define the ordered convolution

$C_d(k)=\sum_{i+j=k} A_d(i)\,A_d(j).$

Then the application contribution to size $s$ is simply $C_d(s-2)$. Recomputing that sum from scratch for every $s$ would be wasteful, so the implementations maintain it incrementally while sweeping sizes upward.

After $A_d(s)$ is known, it contributes to every future total $C_d(s+u)$. If $u\ne s$, the pair of size buckets $(s,u)$ can appear in both left-right orders, so the off-diagonal contribution is doubled. If $u=s$, the product $A_d(s)^2$ already counts all ordered pairs of two size-$s$ terms, so no extra factor is needed.

Step 5: Worked example for the first nonzero sizes

At depth $1$, there is exactly one size-$1$ term, because the lone available binder can be referenced in exactly one way:

$A_1(1)=1.$

Therefore the first closed term appears at size $6$, obtained by placing one abstraction around that size-$1$ body:

$A_0(6)=A_1(1)=1.$

Next consider depth $1$ and size $4$. The only possibility is an application of two size-$1$ variables, so

$A_1(4)=A_1(1)\,A_1(1)=1.$

Wrapping that in one abstraction gives the next closed term at size $9$:

$A_0(9)=A_1(4)=1.$

Hence the first cumulative values are

$\Lambda(6)=1,\qquad \Lambda(9)=2.$

The same recurrence continues to produce the further small checkpoints

$\Lambda(15)=20,\qquad \Lambda(35)=3166438,$

which agree with the values used for implementation sanity checks.

How the Code Works

The C++, Python, and Java implementations all follow the same recurrence. They start from the largest relevant depth $d_{\max}=\lfloor(2000-1)/5\rfloor$ and move downward to depth $0$. At any moment, only two depth layers are needed: the counts for the current depth and the counts for the next deeper depth.

For a fixed depth $d$, the implementation sweeps the exact size $s$ from $1$ up to $2000-5d$. Each value $A_d(s)$ is assembled from three ingredients: the variable term $d\mathbf{1}_{s=1}$, the abstraction term coming from depth $d+1$ and size $s-5$, and the application term read from the running convolution buffer at index $s-2$.

After computing the count for size $s$, the implementation updates the convolution buffer so that future sizes can reuse this new information immediately. All arithmetic is reduced modulo $10^9+7$. Once depth $0$ has been completed, the exact counts are prefix-summed to obtain $\Lambda(1),\Lambda(2),\dots,\Lambda(2000)$, and the final reported value is $\Lambda(2000)$.

Complexity Analysis

At depth $d$, the active size window is $S=2000-5d$, and the incremental convolution updates cost $O(S^2)$. Summing over all relevant depths gives

$\sum_{d=0}^{\lfloor(2000-1)/5\rfloor} O\!\left((2000-5d)^2\right)=O(2000^3),$

so the asymptotic running time is cubic in the target size bound $n$. The memory usage is only linear, because the algorithm stores two exact-size arrays and one auxiliary convolution array, each of length $O(n)$.

Footnotes and References

Problem page: Project Euler 623 - Lambda Count
Lambda calculus overview: Wikipedia - Lambda calculus
Binder-depth encoding: Wikipedia - De Bruijn index
Dynamic programming: Wikipedia - Dynamic programming
Convolution: Wikipedia - Convolution