Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 696: Mahjong

View on Project Euler

Project Euler Problem 696 Solution

EulerSolve provides an optimized solution for Project Euler Problem 696, Mahjong, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary We are given \(s\) suits, each suit has ranks \(1,2,\dots,n\), and each tile type appears at most four times. A valid Mahjong-style hand consists of exactly \(t\) melds together with one pair, where a meld is either a triple of identical tiles or a run of three consecutive ranks inside one suit. The goal is to count all valid hands modulo \(10^9+7\). The implementations evaluate the extremely large case \(W(10^8,10^8,30)\), so any approach based on direct enumeration is hopeless. The entire solution therefore compresses the local suit constraints into a finite automaton, builds one-suit counting polynomials, and only then combines the suits. Mathematical Approach Let \(W(n,s,t)\) denote the number of valid hands with \(n\) ranks per suit, \(s\) suits, and exactly \(t\) melds. The key observation is that suits interact only through the total meld count and the fact that exactly one suit contains the unique pair. Step 1: Split a Suit into Independent Positive Blocks Fix one suit and write its tile multiplicities as \(c_1,\dots,c_n\), where every \(c_i\) lies in \(\{0,1,2,3,4\}\). A zero count breaks all possible runs, so no meld can cross a rank with \(c_i=0\). Therefore a suit decomposes into maximal contiguous blocks of positive counts....

Detailed mathematical approach

Problem Summary

We are given \(s\) suits, each suit has ranks \(1,2,\dots,n\), and each tile type appears at most four times. A valid Mahjong-style hand consists of exactly \(t\) melds together with one pair, where a meld is either a triple of identical tiles or a run of three consecutive ranks inside one suit.

The goal is to count all valid hands modulo \(10^9+7\). The implementations evaluate the extremely large case \(W(10^8,10^8,30)\), so any approach based on direct enumeration is hopeless. The entire solution therefore compresses the local suit constraints into a finite automaton, builds one-suit counting polynomials, and only then combines the suits.

Mathematical Approach

Let \(W(n,s,t)\) denote the number of valid hands with \(n\) ranks per suit, \(s\) suits, and exactly \(t\) melds. The key observation is that suits interact only through the total meld count and the fact that exactly one suit contains the unique pair.

Step 1: Split a Suit into Independent Positive Blocks

Fix one suit and write its tile multiplicities as \(c_1,\dots,c_n\), where every \(c_i\) lies in \(\{0,1,2,3,4\}\). A zero count breaks all possible runs, so no meld can cross a rank with \(c_i=0\).

Therefore a suit decomposes into maximal contiguous blocks of positive counts. Each block can be analyzed independently, and the only remaining global question inside that suit is how many zero gaps are inserted between the blocks. This is why the solution first counts contiguous positive blocks and postpones the placement of zeros until later.

Step 2: Encode One Positive Block with a Small Automaton

Consider a positive block with counts \(d_1,\dots,d_L\), where every \(d_i\in\{1,2,3,4\}\). We process the block from left to right. Before rank \(i\), the local state is \((u,v,\pi)\):

\(u\) is the number of runs started at rank \(i-1\) that still need tiles at ranks \(i\) and \(i+1\).

\(v\) is the number of runs started at rank \(i-2\) that need one final tile at rank \(i\).

\(\pi\in\{0,1\}\) records whether the unique pair has already been used inside the current block.

At rank \(i\), at least \(u+v\) tiles are forced, because those tiles must continue or finish the already open runs. If \(d_i\ge u+v\), let

$r_i=d_i-u-v.$

From these remaining tiles we may optionally spend \(2\) tiles on the pair, start some new runs at rank \(i\), and use the rest as triples. If \(\delta\in\{0,1\}\) indicates whether the pair is used at rank \(i\), and \(q_i\) is the number of new runs started there, then the transition is valid exactly when

$0\le q_i\le r_i-2\delta,\qquad r_i-2\delta-q_i\equiv 0 \pmod 3.$

The next state becomes

$ (u,v,\pi)\longrightarrow(q_i,u,\max(\pi,\delta)). $

Because several choices of \((\delta,q_i)\) may be possible for the same input count \(d_i\), the natural local machine is nondeterministic. The implementations determinize it once and then run ordinary dynamic programming on the reachable deterministic states.

Step 3: Count Contiguous Blocks by Length and Tile Total

Let \(\mathcal{B}_0(L,M)\) be the number of positive blocks of length \(L\) and total tile count \(M\) that end with no unfinished runs and no pair. Let \(\mathcal{B}_1(L,M)\) be the analogous count with exactly one pair used.

These values are obtained by pushing the deterministic automaton through all block lengths up to \(3t+2\) and all tile totals up to \(3t+2\). That cutoff is sufficient because a complete hand with \(t\) melds and one pair contains only \(3t+2\) tiles in total, so no single suit or block can contribute more than that.

The residue classes are forced:

$\mathcal{B}_0(L,M)\neq 0 \Longrightarrow M\equiv 0 \pmod 3,$

$\mathcal{B}_1(L,M)\neq 0 \Longrightarrow M\equiv 2 \pmod 3.$

Indeed, a block with no pair is built entirely from melds and therefore uses a multiple of three tiles, while a block with one pair uses \(2\) more than a multiple of three.

Step 4: Assemble Several Blocks Inside One Suit

A suit can contain several positive blocks separated by zero-count ranks. Let \(\mathcal{D}_{0,k}(L,M)\) be the number of ways to choose exactly \(k\) blocks whose total positive length is \(L\), total tile count is \(M\), and which contain no pair overall. Let \(\mathcal{D}_{1,k}(L,M)\) be the analogous quantity with exactly one pair overall.

The block convolution is then

$\mathcal{D}_{0,k+1}(L+L',M+M')\;{+}{=}\;\mathcal{D}_{0,k}(L,M)\,\mathcal{B}_0(L',M'),$

$\mathcal{D}_{1,k+1}(L+L',M+M')\;{+}{=}\;\mathcal{D}_{1,k}(L,M)\,\mathcal{B}_0(L',M')+\mathcal{D}_{0,k}(L,M)\,\mathcal{B}_1(L',M').$

Only one block is allowed to contribute the pair, so the second recurrence has exactly two sources. Also, there can be at most \(t+1\) blocks in one suit, because each nonempty block must contribute at least one group, and a full hand has only \(t\) melds plus one pair.

Step 5: Put Those Blocks Back into the \(n\) Ranks

Suppose a one-suit configuration uses \(k\) positive blocks whose lengths sum to \(L\). The remaining \(n-L\) ranks have count zero. Write the zero gaps as

$z_0+z_1+\cdots+z_k=n-L,$

where \(z_0\) is the leading zero run, \(z_k\) is the trailing zero run, and the interior gaps \(z_1,\dots,z_{k-1}\) must each satisfy \(z_i\ge 1\) so that the blocks stay separated.

After setting \(z_i'=z_i-1\) for \(1\le i\le k-1\), we get

$z_0+z_1'+\cdots+z_{k-1}'+z_k=n-L-k+1,$

so the number of embeddings is

$\binom{n-L+1}{k}.$

This placement factor is one of the main simplifications in the algorithm. For example, if \(n=7\), the chosen blocks have total positive length \(L=4\), and there are \(k=2\) blocks, then the number of placements is \(\binom{7-4+1}{2}=\binom{4}{2}=6\).

Step 6: Convert One Suit into Generating Functions

After multiplying each \(\mathcal{D}_{\varepsilon,k}(L,M)\) by the placement factor, we obtain one-suit totals

$A_M(n)=\sum_{k,L}\mathcal{D}_{0,k}(L,M)\binom{n-L+1}{k},$

$B_M(n)=\sum_{k,L}\mathcal{D}_{1,k}(L,M)\binom{n-L+1}{k}.$

Now define \(T_u(n)=A_{3u}(n)\) and \(P_u(n)=B_{3u+2}(n)\). In words, \(T_u(n)\) counts one-suit configurations contributing exactly \(u\) melds and no pair, while \(P_u(n)\) counts one-suit configurations contributing exactly \(u\) melds together with the pair.

Introduce the truncated generating functions

$\mathcal{T}_n(x)=\sum_{u\ge 0}T_u(n)x^u,\qquad \mathcal{P}_n(x)=\sum_{u\ge 0}P_u(n)x^u.$

Exactly one suit carries the pair, so the full answer is

$\boxed{W(n,s,t)=s\cdot [x^t]\;\mathcal{P}_n(x)\,\mathcal{T}_n(x)^{\,s-1}\pmod{10^9+7}.}$

The coefficient of \(x^t\) enforces that the total number of melds across all suits is exactly \(t\).

Worked Example: \(W(4,1,1)=20\)

With one suit, four ranks, and exactly one meld, every legal hand has \(3\cdot 1+2=5\) tiles. There are two cases.

Triple plus pair: choose the triple rank in \(4\) ways and the pair rank in one of the remaining \(3\) ways, giving \(4\cdot 3=12\).

Run plus pair: the only runs are \(123\) and \(234\), so there are \(2\) possible runs. For each run, the pair can be placed at any of the \(4\) ranks, giving \(2\cdot 4=8\).

Therefore

$W(4,1,1)=12+8=20,$

which agrees with the small sanity case checked by the implementations.

How the Code Works

The C++, Python, and Java implementations all follow the same pipeline. First they build the reachable deterministic automaton corresponding to the local block rules above. Acceptance is split into two categories: blocks that finish with no pair and blocks that finish with exactly one pair.

Next they run a dynamic program over block length and tile total, but only for positive counts \(1,2,3,4\). Zero-count ranks are deliberately excluded at this stage because zeros are handled later by the explicit placement factor \(\binom{n-L+1}{k}\).

After the contiguous block counts are known, the implementations run a second convolution-style dynamic program that combines many blocks inside one suit, once for the no-pair case and once for the one-pair case. This produces the tables indexed by block count, total positive length, and total tile count.

Then, for each admissible pair \((k,L)\), they multiply by \(\binom{n-L+1}{k}\) modulo \(10^9+7\). Because \(k\le t+1\) is small even when \(n\) is enormous, the binomial coefficient is computed as a short falling product times the modular inverse of \(k!\), rather than by precomputing factorials up to \(n\).

Finally the one-suit coefficients are packed into the truncated polynomials \(\mathcal{T}_n(x)\) and \(\mathcal{P}_n(x)\). The no-pair polynomial is raised to the \((s-1)\)-st power by binary exponentiation with truncation at degree \(t\), multiplied by the pair polynomial, and the coefficient of \(x^t\) is extracted and multiplied by \(s\).

Complexity Analysis

Let \(T=3t+2\). The deterministic automaton has constant size with respect to the input parameters, so the local block precomputation runs in \(O(T^2)\) time and uses \(O(T^2)\) working memory, up to that constant automaton factor.

The dominant part is the block-combination dynamic program. There are \(O(T^2)\) possible block types \((L,M)\), each convolution touches \(O(T^2)\) table cells, and this is repeated for \(O(t)\) block counts. Since \(T=O(t)\), the full precomputation is \(O(t^5)\) time with \(O(t^3)\) stored state.

Once the one-suit tables are built, the final evaluation for given \(n\) and \(s\) takes \(O(t^3+t^2\log s)\) time: \(O(t^3)\) to aggregate the one-suit coefficients and \(O(t^2\log s)\) for truncated polynomial exponentiation. This is exactly why the method remains practical even when \(n\) and \(s\) are both \(10^8\).

Footnotes and References

  1. Problem page: https://projecteuler.net/problem=696
  2. Deterministic finite automaton: Wikipedia - Deterministic finite automaton
  3. Dynamic programming: Wikipedia - Dynamic programming
  4. Stars and bars: Wikipedia - Stars and bars
  5. Generating function: Wikipedia - Generating function
  6. Exponentiation by squaring: Wikipedia - Exponentiation by squaring

Mathematical approach · C++ solution · Python solution · Java solution

Previous: Problem 695 · All Project Euler solutions · Next: Problem 697

Need help with a problem? Ask me! 💡
e
✦ Euler GLM 5.2
Hi! I'm Euler. Ask me anything about Project Euler problems, math concepts, or solution approaches! 🧮