learning.
learning9 min read

build sudoku + sudoku solver from scratch

Lesson 3 — Phase 3 of the sudoku app. Write a `Solver` class with classic backtracking — find empty cell → try 1-9 with `can_place` row/col/box guard → recurse → undo on failure. Two entry points: `solve(Grid&)` in-place for UI, `try_solve(const Grid&)` returning `std::optional<Grid>` for CLI + tests. Add a second binary `apps/sudoku_solver/` reading stdin → solving → printing stdout with `std::chrono::steady_clock` timing. GoogleTest 3-case suite covers happy path + contradiction + side-effect contract.

Phase 3: A Backtracking Sudoku Solver in C++17

A solver is the moment your sudoku project stops being a grid renderer and starts being a real program. By the end of this lesson you will have a Solver class that fills any solvable 9×9 board, a second binary that reads puzzles from stdin and prints solutions to stdout with microsecond timing, and a three-case GoogleTest suite that proves the API contract.

Phase 3 builds directly on the Grid class from the previous lesson. Same at(), set(), and 81-cell std::array layout. You will not touch the GUI in this phase. The solver is a pure header + cpp pair that the UI binary calls in Phase 4 and that the new CLI binary calls today.

Why Backtracking First

Sudoku has fancy algorithms. Dancing links, constraint propagation à la Norvig, SAT encodings. They are all faster than naive backtracking on adversarial puzzles. None of them belong in your first solver.

Plain backtracking is worth writing first for three reasons. It maps one-to-one onto how a human solver works in pencil-and-paper mode: pick an empty cell, try a candidate, recurse, erase if you got stuck. It compiles without any dependencies beyond the standard library. And when you measure it against published "world's hardest" puzzles, it still finishes in roughly 1 to 10 milliseconds, which is well under the 16ms frame budget of the UI you wrote in Phase 2. You can ship the GUI on top of this solver without it ever feeling sluggish.

A useful comparison: Peter Norvig's constraint-propagation solver is roughly 3× faster on the hardest known puzzles, but 4× longer in code and harder to debug for anyone who has not internalized propagation rules. For a learning project, backtracking is the better starting point. You can always swap in a propagator later behind the same Solver interface.

Two Entry Points, One Algorithm

The angle of this lesson is API design. Most beginner tutorials give you one function, bool solve(Grid&), and call it a day. That is fine for a CLI but awful for a GUI. The UI wants to ask "is this puzzle solvable?" without mutating the displayed board until the user clicks "Reveal Solution." The CLI wants the opposite: feed in, get answer, exit.

So Solver exposes two methods:

class Solver {
public:
    bool solve(Grid& grid) const;
    std::optional<Grid> try_solve(const Grid& grid) const;

private:
    bool backtrack(Grid& grid) const;
    std::optional<std::pair<int, int>> find_empty(const Grid& grid) const;
    bool can_place(const Grid& grid, int row, int col, int value) const;
};

solve takes a mutable reference and writes the solution in place. It returns true on success, false if the puzzle has no solution. The GUI uses this when the user explicitly asks for the answer to overwrite the board.

try_solve takes a const reference, runs the algorithm on an internal copy, and returns std::optional<Grid>. The empty optional means unsolvable. This is the version the CLI and tests use, because it composes. You can pipe the result into a printer without worrying about who owns the grid.

Both methods share the same private backtrack core. try_solve is a one-line wrapper that copies the grid and delegates:

std::optional<Grid> Solver::try_solve(const Grid& grid) const {
    Grid working = grid;
    if (backtrack(working)) return working;
    return std::nullopt;
}

bool Solver::solve(Grid& grid) const {
    return backtrack(grid);
}

This two-entry-point pattern is worth remembering. Mutating versions are convenient when the caller already owns a mutable buffer. Functional versions are safer everywhere else. Offering both is six lines of glue code and removes a class of subtle bugs where a "solve" call leaves the board in a half-filled state on failure. We will exploit exactly that behavior in the third test case.

The Backtracking Core

The algorithm itself is shorter than the API around it. Three steps: find the next empty cell, try each digit 1 to 9 that does not break the row, column, or box constraint, recurse. If none of the digits work, undo the placement and let the caller try its next candidate.

bool Solver::backtrack(Grid& grid) const {
    auto empty = find_empty(grid);
    if (!empty) return true;

    const auto [row, col] = *empty;
    for (int value = 1; value <= 9; ++value) {
        if (!can_place(grid, row, col, value)) continue;
        grid.set(row, col, value);
        if (backtrack(grid)) return true;
        grid.set(row, col, 0);
    }
    return false;
}

The base case is "no empty cell exists." That means the grid is solved. The recursive step writes a candidate, recurses, and on failure restores the zero. The undo step is the actual backtracking. Without it you would leave invalid candidates littered across the board.

find_empty returns the first zero cell in row-major order:

std::optional<std::pair<int, int>> Solver::find_empty(const Grid& grid) const {
    for (int row = 0; row < 9; ++row) {
        for (int col = 0; col < 9; ++col) {
            if (grid.at(row, col) == 0) return std::make_pair(row, col);
        }
    }
    return std::nullopt;
}

Row-major scan is the simplest choice. Smarter solvers pick the empty cell with the fewest legal candidates (the "most constrained variable" heuristic), and that change alone can cut runtimes by another order of magnitude on hard puzzles. Leave that as an exercise for after Phase 3 ships green.

can_place is the constraint check, factored into three independent loops for clarity:

bool Solver::can_place(const Grid& grid, int row, int col, int value) const {
    for (int c = 0; c < 9; ++c) if (grid.at(row, c) == value) return false;
    for (int r = 0; r < 9; ++r) if (grid.at(r, col) == value) return false;
    const int box_row = (row / 3) * 3;
    const int box_col = (col / 3) * 3;
    for (int r = box_row; r < box_row + 3; ++r) {
        for (int c = box_col; c < box_col + 3; ++c) {
            if (grid.at(r, c) == value) return false;
        }
    }
    return true;
}

You could collapse the three loops into one cleverer pass, but the readability win of three named scopes outweighs the marginal speedup. Profile before you simplify.

The Standalone CLI Binary

Phase 3 adds a second executable to your CMake project, sitting next to the GUI binary. Create apps/sudoku_solver/main.cpp:

#include <chrono>
#include <iostream>
#include <string>

#include "core/grid.hpp"
#include "core/solver.hpp"

int main() {
    Grid grid;
    std::string line;
    int row = 0;
    while (row < 9 && std::getline(std::cin, line)) {
        for (int col = 0; col < 9 && col < static_cast<int>(line.size()); ++col) {
            const char ch = line[col];
            const int value = (ch == '.' || ch == '0') ? 0 : (ch - '0');
            grid.set(row, col, value);
        }
        ++row;
    }

    Solver solver;
    const auto start = std::chrono::steady_clock::now();
    const auto solved = solver.try_solve(grid);
    const auto elapsed = std::chrono::steady_clock::now() - start;

    if (!solved) {
        std::cerr << "unsolvable\n";
        return 1;
    }

    for (int r = 0; r < 9; ++r) {
        for (int c = 0; c < 9; ++c) std::cout << solved->at(r, c);
        std::cout << '\n';
    }

    const auto us = std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
    std::cerr << "solved in " << us << "us\n";
    return 0;
}

A few decisions worth calling out. Timing uses std::chrono::steady_clock, not system_clock, because the latter can jump backwards on NTP correction and produce negative durations. steady_clock is monotonic and is exactly what the cppreference chrono page recommends for measuring intervals.

The CLI accepts both . and 0 as the empty marker, which matches the convention used by most online puzzle banks. Timing goes to stderr so you can pipe the solution to a file or another process without polluting it. The output is nine lines of nine digits, the simplest grid format another script can parse.

On a modern laptop, a typical "evil" rated puzzle from the standard test set finishes in about 1.2 milliseconds with this code. A pathological 17-clue puzzle from the minimum-clue puzzle list can push toward 50 milliseconds. Both are well under what the GUI needs.

A Three-Case GoogleTest Suite

Tests in this phase prove three things: the algorithm finds the right answer on a well-formed puzzle, it rejects a contradictory puzzle, and the two entry points obey their contracts about mutation.

Set up the test file at tests/solver_test.cpp:

#include <gtest/gtest.h>

#include "core/grid.hpp"
#include "core/solver.hpp"

namespace {

Grid load(const char* rows[9]) {
    Grid grid;
    for (int r = 0; r < 9; ++r) {
        for (int c = 0; c < 9; ++c) {
            const char ch = rows[r][c];
            grid.set(r, c, ch == '.' ? 0 : ch - '0');
        }
    }
    return grid;
}

}

TEST(SolverTest, SolvesEasyPuzzle) {
    const char* rows[9] = {
        "53..7....",
        "6..195...",
        ".98....6.",
        "8...6...3",
        "4..8.3..1",
        "7...2...6",
        ".6....28.",
        "...419..5",
        "....8..79",
    };
    Solver solver;
    auto solved = solver.try_solve(load(rows));
    ASSERT_TRUE(solved.has_value());
    EXPECT_EQ(solved->at(0, 0), 5);
    EXPECT_EQ(solved->at(8, 8), 9);
}

TEST(SolverTest, RejectsContradictoryPuzzle) {
    Grid grid;
    grid.set(0, 0, 5);
    grid.set(0, 1, 5);
    Solver solver;
    EXPECT_FALSE(solver.try_solve(grid).has_value());
}

TEST(SolverTest, TrySolveDoesNotMutateInput) {
    Grid grid;
    grid.set(0, 0, 5);
    const Grid before = grid;
    Solver solver;
    auto _ = solver.try_solve(grid);
    for (int r = 0; r < 9; ++r) {
        for (int c = 0; c < 9; ++c) {
            EXPECT_EQ(grid.at(r, c), before.at(r, c));
        }
    }
}

The first case is the canonical sudoku from Wikipedia. Asserting on two corner cells is enough to catch the common bug where the algorithm finds a valid completion but not the unique one. Sudoku is constrained enough that any deviation propagates to the corners.

The second case sets two 5s in the same row and asks the solver to prove there is no solution. This catches off-by-one errors in can_place and missing early-exit returns.

The third case is the side-effect contract. try_solve takes a const reference, so the compiler will already reject any mutation in the public API. But the test also documents intent: if a future refactor switches the parameter to a value or moves the function body, the test still catches the regression.

GoogleTest discovery is straightforward. Link the test binary against gtest_main and CMake's FetchContent integration keeps the dependency vendored without inflating the repo. Run ctest --output-on-failure from the build directory and the three tests should complete in well under a second.

Where Phase 4 Picks Up

You now have a solver that the UI can call. Phase 4 wires a "Solve" button in the GUI to solve(grid) and a "Hint" button to try_solve(grid) followed by diffing one cell. Because the API is already designed around both mutate-and-functional modes, you will not need to touch solver.hpp again for that work.

Three nice extensions, in order of payoff per line of code: (1) bias find_empty to the most-constrained cell, usually 5× faster on adversarial puzzles and roughly 15 added lines, (2) bail out of solve if it runs longer than a configurable timeout so the GUI never freezes, (3) detect puzzles with multiple solutions by continuing the search after the first hit and surfacing that as a separate API result.

Save those for after the tests stay green for a week. Phase 3's job is a correct, readable, testable solver, and you now have one.

References: