Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 520: Simbers

Project Euler Problem 520 Solution

EulerSolve provides an optimized solution for Project Euler Problem 520, Simbers, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary A positive integer is called a simber when every even digit occurs an even number of times, and every odd digit that occurs does so an odd number of times. If $Q(n)$ denotes the number of simbers below $10^n$, the task is to evaluate $\sum_{u=1}^{39} Q(2^u)\pmod{1{,}000{,}000{,}123}.$ Because $2^{39}$ is enormous, direct enumeration is impossible. The solution therefore converts the digit-parity rules into algebraic projectors, compresses each fixed-length count into a short linear combination of powers $\lambda^t$, and then sums those powers as geometric series modulo the given modulus. Mathematical Approach Let $E=\{0,2,4,6,8\}$ and $O=\{1,3,5,7,9\}$. For any decimal string, let $c_d$ be the number of occurrences of digit $d$. Step 1: Turn Each Digit Rule into a Parity Projector For an even digit, the allowed counts are $0,2,4,\dots$, so its indicator is $I_{\mathrm{even}}(c)=\frac{1+(-1)^c}{2}.$ For an odd digit, the allowed counts are $0,1,3,5,\dots$: zero is allowed, but every positive even count is forbidden. A convenient form is $I_{\mathrm{odd}}(c)=\delta_{c,0}+\frac{1-(-1)^c}{2}=\frac{2\cdot 0^c+1^c-(-1)^c}{2},$ with the usual convention $0^0=1$. This identity gives $1$ when $c=0$, $1$ when $c$ is odd, and $0$ when $c$ is a positive even number....

Detailed mathematical approach

Problem Summary

A positive integer is called a simber when every even digit occurs an even number of times, and every odd digit that occurs does so an odd number of times. If $Q(n)$ denotes the number of simbers below $10^n$, the task is to evaluate

$\sum_{u=1}^{39} Q(2^u)\pmod{1{,}000{,}000{,}123}.$

Because $2^{39}$ is enormous, direct enumeration is impossible. The solution therefore converts the digit-parity rules into algebraic projectors, compresses each fixed-length count into a short linear combination of powers $\lambda^t$, and then sums those powers as geometric series modulo the given modulus.

Mathematical Approach

Let $E=\{0,2,4,6,8\}$ and $O=\{1,3,5,7,9\}$. For any decimal string, let $c_d$ be the number of occurrences of digit $d$.

Step 1: Turn Each Digit Rule into a Parity Projector

For an even digit, the allowed counts are $0,2,4,\dots$, so its indicator is

$I_{\mathrm{even}}(c)=\frac{1+(-1)^c}{2}.$

For an odd digit, the allowed counts are $0,1,3,5,\dots$: zero is allowed, but every positive even count is forbidden. A convenient form is

$I_{\mathrm{odd}}(c)=\delta_{c,0}+\frac{1-(-1)^c}{2}=\frac{2\cdot 0^c+1^c-(-1)^c}{2},$

with the usual convention $0^0=1$. This identity gives $1$ when $c=0$, $1$ when $c$ is odd, and $0$ when $c$ is a positive even number.

Step 2: Count All Simber Strings of One Fixed Length

Let $A_t$ be the number of length-$t$ decimal strings that satisfy the simber rules when leading zero is allowed. If the count vector is $(c_0,\dots,c_9)$, then

$A_t=\sum_{\substack{c_0+\cdots+c_9=t\\ c_d\ge 0}} \frac{t!}{c_0!\cdots c_9!}\prod_{e\in E} I_{\mathrm{even}}(c_e)\prod_{o\in O} I_{\mathrm{odd}}(c_o).$

After multiplying by $2^{10}$ and expanding the projectors, each even digit contributes a choice of weight $+1$ or $-1$, while each odd digit contributes a choice of weight $0$, $+1$, or $-1$ with coefficients $2$, $1$, and $-1$. For one complete global choice of weights $w_0,\dots,w_9$, the multinomial theorem gives

$\sum_{\substack{c_0+\cdots+c_9=t\\ c_d\ge 0}} \frac{t!}{c_0!\cdots c_9!}\prod_{d=0}^{9} w_d^{\,c_d}=(w_0+\cdots+w_9)^t.$

So the exact identities of the digits no longer matter; only the total weight matters. If $a$ digits contribute $+1$ and $b$ digits contribute $-1$, then the contribution is $\lambda^t$ with $\lambda=a-b$. After aggregating equal values of $\lambda$, we obtain a small coefficient table $C(\lambda)$ such that

$A_t=\frac{1}{2^{10}}\sum_{\lambda=-10}^{10} C(\lambda)\lambda^t.$

Step 3: Subtract the Strings That Start with Zero

A positive integer with exactly $t$ digits corresponds to a length-$t$ string that does not begin with zero. Therefore we must subtract the simber strings whose first character is $0$.

If we remove that leading zero, the remaining suffix has length $t-1$. Since digit $0$ has already appeared once, its remaining count must be odd so that the total number of zeros is even. The other four nonzero even digits keep the usual even rule, and the five odd digits keep the usual “zero or odd” rule. Hence the distinguished zero digit uses

$J(c)=\frac{1-(-1)^c}{2},$

which is the indicator of an odd count. Let $B_s$ be the number of length-$s$ suffixes with that modified rule for digit $0$. By the same expansion-and-grouping argument, there is another coefficient table $D(\lambda)$ with

$B_s=\frac{1}{2^{10}}\sum_{\lambda=-10}^{10} D(\lambda)\lambda^s.$

Therefore the number of genuine $t$-digit simbers is

$N_t=A_t-B_{t-1}.$

Step 4: Worked Example for Two Digits

For $t=2$, the leading-zero-allowed simber strings are easy to classify. There are five repeated even-digit strings

$00,\ 22,\ 44,\ 66,\ 88,$

and there are $5\cdot 4=20$ ordered strings formed by two distinct odd digits. Hence $A_2=25$.

Among these, only $00$ starts with zero, so $B_1=1$ and

$N_2=A_2-B_1=25-1=24.$

The one-digit simbers are $1,3,5,7,9$, so $N_1=5$. Therefore

$Q(2)=N_1+N_2=5+24=29.$

This tiny case is exactly what the general projector method is automating for huge values of $n$.

Step 5: Sum Over All Lengths with Geometric Series

Now sum the exact-length counts from $1$ to $n$:

$Q(n)=\sum_{t=1}^{n} N_t=\frac{1}{2^{10}}\left(\sum_{\lambda} C(\lambda)\sum_{t=1}^{n}\lambda^t-\sum_{\lambda} D(\lambda)\sum_{s=0}^{n-1}\lambda^s\right).$

For $\lambda\neq 0,1$, the inner sums are

$\sum_{t=1}^{n}\lambda^t=\lambda\frac{\lambda^n-1}{\lambda-1},\qquad \sum_{s=0}^{n-1}\lambda^s=\frac{\lambda^n-1}{\lambda-1}.$

The exceptional values are handled separately:

$\sum_{t=1}^{n}1^t=n,\qquad \sum_{s=0}^{n-1}1^s=n,$

$\sum_{t=1}^{n}0^t=0,\qquad \sum_{s=0}^{n-1}0^s=1\quad (n\ge 1).$

All arithmetic is performed modulo

$M=1{,}000{,}000{,}123,$

so division by $\lambda-1$ and by $2^{10}$ becomes multiplication by modular inverses.

Step 6: Why the Whole Method Stays Small

The decimal alphabet is fixed. Each of the five odd digits contributes one of three weights $\{0,+1,-1\}$, and each of the five even digits contributes one of two weights $\{+1,-1\}$. Therefore $\lambda$ can only lie between $-10$ and $10$, so after aggregation there are at most $21$ distinct powers to evaluate. The difficult combinatorics has been compressed into two tiny coefficient tables, and the remaining work is just fast modular exponentiation together with short geometric-sum evaluations.

How the Code Works

The C++, Python, and Java implementations build the two coefficient tables directly from the projector factors. One table corresponds to all length-$t$ strings, and the other corresponds to the leading-zero correction in which digit $0$ must occur an odd number of times in the suffix. In both cases, the expansion is grouped only by the value of $\lambda$, because that is the only quantity that survives the multinomial sum.

For a requested $n$, the implementation evaluates the first table against $\sum_{t=1}^{n}\lambda^t$ and the second table against $\sum_{s=0}^{n-1}\lambda^s$. Fast modular exponentiation is used inside the closed forms for those geometric sums, and the final factor $2^{-10}$ is applied with a modular inverse.

Finally, the program computes $Q(2^u)$ for $u=1,2,\dots,39$ and accumulates the answers modulo $M$. It also checks two published checkpoints,

$Q(7)=287975,\qquad Q(100)\equiv 123864868 \pmod{M},$

and confirms the small case $Q(4)$ by direct brute force.

Complexity Analysis

With the decimal digit set fixed, constructing the two coefficient tables is constant work and constant memory. After aggregation, each table has only a small number of nonzero $\lambda$-entries, bounded by the range $-10$ to $10$.

Each evaluation of $Q(n)$ therefore needs only a short loop over those entries, and each term uses $O(\log n)$ time for modular exponentiation. So one query costs $O(L\log n)$, where $L$ is the number of distinct $\lambda$-values, and in this problem $L$ is tiny. Summing the 39 required values is easily within practical limits, and the memory usage stays $O(L)$.