01.01.11 · foundations / linear-algebra

Jordan canonical form and minimal polynomial

shipped · 3 tiers · Lean: partial

Anchor (Master): Apostol Calculus Vol. 2 Ch. 4 §4.20; Axler — Linear Algebra Done Right Ch. 8; Hoffman-Kunze — Linear Algebra Ch. 6–7; Lang — Algebra Ch. III §10 + Ch. XIV §2

Intuition [Beginner]

Some linear maps are pure stretching. Pick the right axes and the map multiplies each axis by its own number; the matrix in those axes is diagonal, with the stretch factors on the diagonal. This is the simplest possible behaviour, and it is what eigenvectors and eigenvalues record 01.01.08.

Not every map is pure stretching. The map that sends the arrow $(0, 1)$ to $(1, 1)$ and leaves $(1, 0)$ alone has only one direction it stretches — the horizontal one — and along the vertical direction it shears, sliding everything sideways. No choice of axes makes the matrix diagonal. There is one stretching direction, and the map mixes the other direction with it.

The Jordan canonical form is the next-best matrix. Where the map is pure stretching, the diagonal entries record the stretch factors. Where the map shears, the matrix records a stretch factor on the diagonal together with a $1$ just above the diagonal that names the shear. Every linear map on a finite-dimensional complex vector space looks like this — pure stretches glued together with stretch-plus-shear pieces — once you choose the right basis. The pieces are the Jordan blocks.

Visual [Beginner]

The picture shows a four-by-four matrix in Jordan canonical form. Two Jordan blocks sit along the diagonal: a three-by-three block with the number $2$ on the diagonal and $1$ s on the superdiagonal, and a one-by-one block holding the number $5$ . To the right is a Jordan chain — three arrows pointing successively from $v_{3}$ to $v_{2}$ to $v_{1}$ and finally to $0$ , each labelled by the nilpotent shift. The chain is the basis in which the three-by-three block acts.

The two block sizes tell the whole story. The block of size $3$ at eigenvalue $2$ means there is one chain of length $3$ — one direction the map stretches by $2$ , plus two more directions glued to it by shears. The block of size $1$ at eigenvalue $5$ means there is one ordinary eigenvector with eigenvalue $5$ , and nothing further glued to it.

Worked example [Beginner]

Take the three-by-three matrix

$$ A = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix}. $$

Compute the characteristic polynomial by expanding $\det (t I - A) = (t - 2)^{3}$, so the only eigenvalue is $2$, with algebraic multiplicity $3$. Find the eigenvectors: solve $(A - 2 I) v = 0$. The matrix $A - 2 I$ is the nilpotent shift, sending $(x, y, z)$ to $(y, z, 0)$. The kernel is $\{(x, 0, 0)\}$, a one-dimensional eigenspace spanned by $(1, 0, 0)$.

So the matrix has one eigenvalue, $2$ , with one eigendirection along $(1, 0, 0)$ . The matrix is not diagonalisable because the eigenspace has dimension $1$ instead of $3$ .

Build the Jordan chain. Start with the eigenvector $v_{1} = (1, 0, 0)$ . Look for $v_{2}$ with $(A - 2 I) v_{2} = v_{1}$ . Solving $(y, z, 0) = (1, 0, 0)$ gives $v_{2} = (0, 1, 0)$ . Look for $v_{3}$ with $(A - 2 I) v_{3} = v_{2}$ . Solving $(y, z, 0) = (0, 1, 0)$ gives $v_{3} = (0, 0, 1)$ . The three vectors $v_{1}, v_{2}, v_{3}$ are a basis of the space, and in this basis $A$ is its own Jordan canonical form: a single Jordan block $J_{3} (2)$ of size $3$ at the eigenvalue $2$ .

What this tells us. A matrix can be a single Jordan block already, with no work to do. The defect from diagonal — the size of the largest Jordan block minus $1$ — measures how far the map is from being pure stretching. For this matrix, the defect is $2$ : there is one stretching direction, and two further directions glued on by shears.
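The chain relations can be checked by direct multiplication. The following is a minimal sketch in plain Python; the helper name `shift` is an illustrative choice, not part of the text above.

```python
# The matrix of the worked example: a single Jordan block J_3(2).
A = [[2, 1, 0],
     [0, 2, 1],
     [0, 0, 2]]

def shift(v, lam=2):
    """Apply (A - lam*I) to the vector v."""
    n = len(A)
    return [sum(A[i][j] * v[j] for j in range(n)) - lam * v[i] for i in range(n)]

v1, v2, v3 = [1, 0, 0], [0, 1, 0], [0, 0, 1]

# The shift walks down the chain: v3 -> v2 -> v1 -> 0.
print(shift(v3), shift(v2), shift(v1))  # [0, 1, 0] [1, 0, 0] [0, 0, 0]
```

Each application of the shift moves one step down the chain, and the third application kills every vector, which is the statement that the block has size $3$.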

Check your understanding [Beginner]

Formal definition [Intermediate+]

Let $K$ be a field, $V$ a finite-dimensional $K$-vector space, and $T : V \to V$ a linear operator. The minimal polynomial records part of the similarity class of $T$; the Jordan canonical form, available when $K$ contains every eigenvalue of $T$, records all of it.

Jordan block. For $k \geq 1$ and $λ \in K$ , the Jordan block of size $k$ at $λ$ is the matrix

$$ J_{k} (λ) = \begin{pmatrix} λ & 1 & & & \\ & λ & 1 & & \\ & & \ddots & \ddots & \\ & & & λ & 1 \\ & & & & λ \end{pmatrix} \in M_{k} (K), $$

with $λ$ on every diagonal entry, $1$ on every entry of the superdiagonal, and $0$ elsewhere. Equivalently $J_{k} (λ) = λ I_{k} + N_{k}$ where $N_{k}$ is the nilpotent shift with $1$s on the superdiagonal; $N_{k}$ satisfies $N_{k}^{k} = 0$ and $N_{k}^{k - 1} \neq 0$. The matrix $J_{k} (λ)$ has characteristic polynomial $(t - λ)^{k}$ and minimal polynomial $(t - λ)^{k}$, eigenvalue $λ$ with algebraic multiplicity $k$ and geometric multiplicity $1$ (the eigenspace is the span of the first standard basis vector).

Minimal polynomial. The set of polynomials $p \in K [t]$ with $p (T) = 0$ is an ideal of $K [t]$, the kernel of the algebra evaluation map $K [t] \to End_{K} (V)$ at $T$. Since $K [t]$ is a principal ideal domain and $End_{K} (V)$ is finite-dimensional over $K$, the ideal is non-zero and has a unique monic generator. This monic generator is the minimal polynomial $m_{T} (t)$ of $T$. Equivalently, $m_{T} (t)$ is the unique monic polynomial of smallest positive degree satisfying $m_{T} (T) = 0$, and every polynomial $p$ with $p (T) = 0$ is a multiple of $m_{T} (t)$ in $K [t]$ ^{[Apostol, T., Calculus Vol. 2]}.

Relation to the characteristic polynomial. The Cayley-Hamilton identity from 01.01.08 gives $χ_{T} (T) = 0$, hence $χ_{T}$ lies in the kernel ideal and $m_{T} (t) ∣ χ_{T} (t)$ in $K [t]$. The two polynomials have the same roots: every root of $m_{T}$ is a root of $χ_{T}$ by divisibility, and every eigenvalue $λ$ of $T$ satisfies $m_{T} (λ) = 0$ because $T v = λ v$ with $v \neq 0$ gives $m_{T} (T) v = m_{T} (λ) v = 0$, forcing $m_{T} (λ) = 0$. The multiplicities differ in general: the multiplicity of $λ$ in $m_{T}$ is the size of the largest Jordan block of $T$ at $λ$, while the multiplicity in $χ_{T}$ is the algebraic multiplicity of $λ$ (the sum of the sizes of all Jordan blocks at $λ$).
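The definition suggests a direct computation: the minimal polynomial has degree $d$ exactly when $A^{d}$ is the first power that is a linear combination of $I, A, \dots, A^{d - 1}$. The following is a minimal sketch over the rationals using only the standard library; the function names are illustrative, and the approach (Gaussian elimination on flattened matrix powers) is one simple option rather than an efficient algorithm.

```python
from fractions import Fraction

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def solve_combination(vectors, target):
    """Coefficients c with sum(c[i] * vectors[i]) == target, or None if none exist."""
    rows, cols = len(target), len(vectors)
    # Augmented system: one row per coordinate, one column per vector.
    M = [[vectors[c][r] for c in range(cols)] + [target[r]] for r in range(rows)]
    pivots, r = [], 0
    for c in range(cols):
        p = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        piv = M[r][c]
        M[r] = [x / piv for x in M[r]]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    if any(M[i][cols] != 0 for i in range(r, rows)):
        return None  # target is not in the span of the vectors
    coeffs = [Fraction(0)] * cols
    for row, c in enumerate(pivots):
        coeffs[c] = M[row][cols]
    return coeffs

def minimal_polynomial(A):
    """Monic coefficients [c_0, ..., c_{d-1}, 1] of the minimal polynomial of A."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    flat = lambda X: [x for row in X for x in row]
    powers = [[[Fraction(int(i == j)) for j in range(n)] for i in range(n)]]
    while True:
        nxt = mat_mul(powers[-1], A)
        c = solve_combination([flat(P) for P in powers], flat(nxt))
        if c is not None:
            # A^d = sum c_i A^i, so m(t) = t^d - sum c_i t^i.
            return [-x for x in c] + [Fraction(1)]
        powers.append(nxt)

# J_2(2) has minimal polynomial (t - 2)^2 = t^2 - 4t + 4.
print([int(c) for c in minimal_polynomial([[2, 1], [0, 2]])])  # [4, -4, 1]
```

The loop terminates after at most $n$ steps because Cayley-Hamilton guarantees a dependence among $I, A, \dots, A^{n}$.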

Diagonalisability criterion. The operator $T$ is diagonalisable iff $m_{T} (t)$ factors as a product of distinct linear factors in $K [t]$ — equivalently iff $m_{T}$ is square-free and splits over $K$ . The forward direction: a diagonalisable $T$ with eigenvalues $λ_{1}, \dots, λ_{r}$ satisfies $\prod_{i} (T - λ_{i} I) = 0$ on each eigenspace, hence on $V = ⨁_{i} E_{λ_{i}}$ , so $m_{T} ∣ \prod_{i} (t - λ_{i})$ , which is square-free. The reverse direction is the content of the primary-decomposition argument in the key theorem below.

Generalised eigenspace. For $λ \in K$ and $m \geq 1$ , the generalised eigenspace of $T$ at $λ$ of order $m$ is the kernel $ker (T - λ I)^{m} \subseteq V$ . The kernels form an ascending chain $E_{λ} = ker (T - λ I) \subseteq ker (T - λ I)^{2} \subseteq \dots$ , which stabilises at some index $m_{λ}$ called the index of $λ$ . The stable subspace $G_{λ} = ker (T - λ I)^{m_{λ}}$ is the (full) generalised eigenspace at $λ$ ; its dimension equals the algebraic multiplicity of $λ$ when $K$ contains every root of $χ_{T}$ , and the index $m_{λ}$ equals the size of the largest Jordan block of $T$ at $λ$ .

Segre characteristic. Suppose $T$ is in Jordan canonical form with Jordan blocks at the eigenvalue $λ$ of sizes $k_{1} \geq k_{2} \geq \dots \geq k_{r}$ . The decreasing partition $(k_{1}, k_{2}, \dots, k_{r})$ of the algebraic multiplicity of $λ$ is the Segre characteristic of $T$ at $λ$ . The Segre characteristic determines the similarity class of $T$ on the generalised eigenspace $G_{λ}$ , and the Segre characteristics across the eigenvalues collectively determine the similarity class of $T$ on $V$ .

Counterexamples to common slips

The minimal polynomial divides the characteristic polynomial but is not always equal to it. For the identity matrix $I_{n}$ , $χ_{I} (t) = (t - 1)^{n}$ but $m_{I} (t) = t - 1$ . Equality $m_{T} = χ_{T}$ holds precisely for cyclic operators (those admitting a single vector whose $T$ -orbit spans $V$ ).
Algebraic multiplicity of $λ$ in $χ_{T}$ is the total number of $λ$ entries on the Jordan-form diagonal; geometric multiplicity is the number of Jordan blocks at $λ$; index is the size of the largest Jordan block at $λ$. These three numbers differ in general. The algebraic and geometric multiplicities coincide exactly when each Jordan block at $λ$ has size $1$ (the diagonalisable case at $λ$), in which case the index is $1$.
The Jordan canonical form classification requires the ground field to contain every eigenvalue. Over $R$ , the planar rotation by an angle other than $0$ or $π$ has no real Jordan form because its eigenvalues are non-real complex numbers; the classification over $R$ uses real Jordan blocks with two-by-two rotational pieces in place of complex-conjugate eigenvalue pairs.
The Segre characteristic is the decreasing partition of the algebraic multiplicity; its conjugate partition (the Weyr characteristic) records the successive jumps $\dim \ker (T - λ I)^{j} - \dim \ker (T - λ I)^{j - 1}$ in the kernel dimensions, and is sometimes used in place of the Segre data. The two partitions encode the same information by Young-diagram conjugation.

Key theorem with proof [Intermediate+]

Theorem (existence and uniqueness of Jordan canonical form). Let $T : V \to V$ be a linear operator on a finite-dimensional vector space over an algebraically closed field $K$ . There exists a basis of $V$ in which the matrix of $T$ is block-diagonal,

$$ [T] = \operatorname{diag} (J_{k_{1}} (λ_{1}), J_{k_{2}} (λ_{2}), \dots, J_{k_{r}} (λ_{r})), $$

with each $J_{k_{i}} (λ_{i})$ a Jordan block. The block data — the multiset of pairs $(k_{i}, λ_{i})$ — is uniquely determined by $T$ up to reordering. ^{[Axler, S. — Linear Algebra Done Right (3rd ed.)]}

Proof. The argument has three steps: a primary decomposition that splits $V$ into generalised eigenspaces, a nilpotent-operator classification on each generalised eigenspace, and a counting argument that secures uniqueness.

Step 1: Primary decomposition. Factor the characteristic polynomial over $K$ (possible because $K$ is algebraically closed) as $χ_{T} (t) = \prod_{i = 1}^{s} (t - λ_{i})^{m_{i}}$ with $λ_{1}, \dots, λ_{s}$ the distinct eigenvalues and $m_{i} \geq 1$ their algebraic multiplicities. The polynomials $q_{i} (t) = (t - λ_{i})^{m_{i}}$ are pairwise coprime in $K [t]$ because the roots are distinct. By the Chinese remainder theorem applied to $K [t] / (χ_{T} (t))$, the quotient ring decomposes as $K [t] / (χ_{T}) ≅ \prod_{i} K [t] / (q_{i})$. The $K [t]$-module structure on $V$ defined by letting $t$ act as $T$ is annihilated by $χ_{T} (t)$ by Cayley-Hamilton, so $V$ is a module over $K [t] / (χ_{T})$. The Chinese remainder isomorphism induces a $K [t]$-module decomposition

$$ V = \bigoplus_{i = 1}^{s} G_{λ_{i}}, \qquad G_{λ_{i}} = \ker (T - λ_{i} I)^{m_{i}}. $$

Each $G_{λ_{i}}$ is $T$ -invariant (the operator $T$ and $(T - λ_{i} I)^{m_{i}}$ commute, so $T$ preserves the kernel), and on $G_{λ_{i}}$ the restricted operator $N_{i} = T - λ_{i} I$ is nilpotent of index at most $m_{i}$ because $(T - λ_{i} I)^{m_{i}}$ vanishes on $G_{λ_{i}}$ by construction.

Step 2: Jordan basis for a nilpotent operator. Fix $λ = λ_{i}$ and let $W = G_{λ}$ with nilpotent operator $N = T - λ I$ on $W$, of index $ℓ$ (so $N^{ℓ} = 0$ but $N^{ℓ - 1} \neq 0$). Form the kernel chain $\{0\} = \ker N^{0} ⊊ \ker N^{1} ⊊ \ker N^{2} ⊊ \dots ⊊ \ker N^{ℓ} = W$. The aim is to produce a basis of $W$ consisting of Jordan chains: sequences $\{v, N v, N^{2} v, \dots, N^{k - 1} v\}$ with $N^{k} v = 0$ and $N^{k - 1} v \neq 0$.

Build the basis top-down. Write $d_{j} = \dim \ker N^{j} - \dim \ker N^{j - 1}$ for the dimension of the $j$-th quotient. Pick a basis of the top quotient $W / \ker N^{ℓ - 1}$; lift these basis elements to vectors $v_{1}, \dots, v_{d_{ℓ}} \in W$. By construction, $N^{ℓ - 1} v_{j} \neq 0$ and $N^{ℓ} v_{j} = 0$, so each $v_{j}$ generates a Jordan chain $\{v_{j}, N v_{j}, \dots, N^{ℓ - 1} v_{j}\}$ of length $ℓ$. The collected chains contribute $d_{ℓ}$ vectors at each level of the kernel filtration. Pass to the next quotient $\ker N^{ℓ - 1} / \ker N^{ℓ - 2}$; the images of $N v_{1}, \dots, N v_{d_{ℓ}}$ form a linearly independent subset of this quotient (because the $v_{j}$ were chosen linearly independent modulo $\ker N^{ℓ - 1}$ and $N$ induces an injection from each successive quotient into the next one down), which extends to a basis by adjoining $d_{ℓ - 1} - d_{ℓ}$ new vectors $w_{1}, w_{2}, \dots \in \ker N^{ℓ - 1}$. Each $w_{k}$ generates a Jordan chain of length $ℓ - 1$. Continue inductively down the kernel filtration, at each step extending the previously-built chains by new top elements.

The collected Jordan chains form a basis of $W$ because at each level $j$ of the filtration the chains contribute a basis of $ker N^{j} / ker N^{j - 1}$ . In this basis the matrix of $N$ is block-diagonal with nilpotent-shift blocks $N_{k}$ of various sizes $k$ , and the matrix of $T = λ I + N$ is block-diagonal with Jordan blocks $J_{k} (λ) = λ I_{k} + N_{k}$ . Assembling over $i$ gives the full block-diagonal Jordan form for $T$ on $V$ .

Step 3: Uniqueness. The block sizes at a fixed eigenvalue $λ$ are recovered from the dimensions of the kernel-filtration quotients by a counting identity. Let $d_{j} = dim ker (T - λ I)^{j} - dim ker (T - λ I)^{j - 1}$ for $j \geq 1$ ; this is the number of independent Jordan chains at $λ$ of length at least $j$ . The number of Jordan blocks at $λ$ of size exactly $k$ is $d_{k} - d_{k + 1}$ . Both $d_{k}$ and $d_{k + 1}$ are intrinsic invariants of $T$ — they depend only on the operator, not on the basis used to exhibit it — so the multiset of block sizes at $λ$ is intrinsic. Across the eigenvalues, the multiset of pairs $(k, λ)$ is therefore an invariant of the similarity class of $T$ . $□$
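The counting identity in Step 3 is effectively an algorithm: compute the kernel dimensions of the powers of $A - λ I$, take successive differences to get the $d_{j}$, and difference once more to get the block counts. The following is a minimal sketch over the rationals; the function names are illustrative, and the test matrix is the four-by-four example from the Visual section, $\operatorname{diag} (J_{3} (2), J_{1} (5))$.

```python
from fractions import Fraction

def rank(M):
    """Rank by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        p = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, rows):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def jordan_block_counts(A, lam):
    """Map block size k -> number of Jordan blocks of size k at the eigenvalue lam."""
    n = len(A)
    B = [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    # kernel_dims[j] = dim ker (A - lam I)^j, computed until the chain stabilises.
    kernel_dims = [0]
    P = [[int(i == j) for j in range(n)] for i in range(n)]
    while True:
        P = mat_mul(P, B)
        dim = n - rank(P)
        if dim == kernel_dims[-1]:
            break
        kernel_dims.append(dim)
    # d[k] holds d_{k+1}, the number of chains of length at least k+1; pad with 0.
    d = [kernel_dims[j] - kernel_dims[j - 1] for j in range(1, len(kernel_dims))] + [0]
    return {k + 1: d[k] - d[k + 1] for k in range(len(d) - 1) if d[k] - d[k + 1] > 0}

A = [[2, 1, 0, 0],
     [0, 2, 1, 0],
     [0, 0, 2, 0],
     [0, 0, 0, 5]]
print(jordan_block_counts(A, 2))  # {3: 1}
print(jordan_block_counts(A, 5))  # {1: 1}
```

Because the routine uses only ranks of powers of $A - λ I$, it returns the same answer for every matrix similar to $A$, which is the uniqueness claim in computational form.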

Corollary (similarity classification over $C$ ). Two matrices in $M_{n} (C)$ are similar iff they have the same Jordan canonical form up to permutation of blocks. This is the complete conjugacy classification of $GL_{n} (C) ↷ M_{n} (C)$ by conjugation, with parameter space the disjoint union over partitions $μ ⊢ n$ of conjugacy classes labelled by Segre-characteristic data.

Bridge. The Jordan canonical form theorem is the load-bearing structural result of finite-dimensional linear algebra. The proof packages four ideas; each gives a separate downstream synthesis.

First synthesis. The primary decomposition $V = ⨁_{i} G_{λ_{i}}$ is the linear-algebra incarnation of the Chinese remainder theorem in $K [t] / (χ_{T})$ . The same decomposition reappears as the primary decomposition of finitely generated modules over a PID: every such module decomposes as a direct sum of cyclic modules $K [t] / (p_{i} (t)^{e_{i}})$ with $p_{i}$ irreducible. Over $C$ the irreducibles are linear and the decomposition recovers Jordan form; over a general field the irreducibles can have higher degree and the decomposition recovers the rational canonical form via companion matrices.

Second synthesis. The classification of nilpotent operators by partitions — Jordan chains organised into Young-diagram-shaped data — is the linear-algebra core of the representation theory of the symmetric group: nilpotent conjugacy classes in $gl_{n} (C)$ are parametrised by partitions of $n$ , exactly as irreducible representations of $S_{n}$ are. The bridge between the two parametrisations is the Springer correspondence, which sends a nilpotent orbit to an irreducible $S_{n}$ -representation via the cohomology of Springer fibres.

Third synthesis. The Jordan form gives explicit functional calculus on operators. For any holomorphic function $f$ defined on a neighbourhood of the spectrum, $f (T)$ acts on each Jordan block $J_{k} (λ)$ as $f (J_{k} (λ)) = \sum_{j = 0}^{k - 1} \frac{f ^{(j)} ( λ )}{j !} N_{k}^{j}$ , with $N_{k}$ the nilpotent shift. This computes the matrix exponential $e^{t A}$ , the matrix logarithm, the resolvent $(T - z I)^{- 1}$ as a Laurent series at each $λ$ , and the holomorphic functional calculus for bounded operators on a Banach space; in the bounded-operator setting the Jordan-form identity is replaced by the Riesz functional calculus $f (T) = \frac{1}{2 π i} \oint_{Γ} f (z) (z I - T)^{- 1} d z$ .

Fourth synthesis. The Jordan canonical form parametrises the orbits of $GL_{n} (C)$ acting on $M_{n} (C)$ by conjugation, while the GIT quotient $M_{n} (C) // GL_{n} (C)$ records only the coarser characteristic-polynomial data. The geometric invariant theory of this action is the prototype of more elaborate quotient constructions in algebraic geometry and representation theory: closures of nilpotent orbits are singular affine varieties (the full nilpotent cone is resolved by Springer's resolution), and the Jordan partition is the data parametrising the strata. The whole package of moduli, orbits, and stratifications stems from the partition-of-$n$ classification at the heart of Jordan form.

Exercises [Intermediate+]

Exercise 3 (medium, symbolic).

Determine the Jordan canonical form of the matrix $$ A = \begin{pmatrix} 5 & 4 & 2 & 1 \\ 0 & 1 & -1 & -1 \\ -1 & -1 & 3 & 0 \\ 1 & 1 & -1 & 2 \end{pmatrix}. $$

Hint

The characteristic polynomial factors as $χ_{A} (t) = (t - 1) (t - 2) (t - 4)^{2}$. For each eigenvalue, compute the dimensions of $\ker (A - λ I)$ and $\ker (A - λ I)^{2}$ and read off the Jordan block sizes from the differences.

Answer

The characteristic polynomial is $χ_{A} (t) = (t - 1) (t - 2) (t - 4)^{2}$, so the eigenvalues are $1$, $2$, and $4$, of algebraic multiplicities $1$, $1$, and $2$. As a check, the eigenvalues counted with multiplicity sum to $1 + 2 + 4 + 4 = 11$, which is the trace $5 + 1 + 3 + 2$ of $A$.

At $λ = 1$ and $λ = 2$: each eigenvalue is simple, so $\dim \ker (A - λ I) = 1$ and the block is $J_{1} (λ)$. Eigenvectors are $(- 1, 1, 0, 0)$ at $1$ and $(1, - 1, 0, 1)$ at $2$.

At $λ = 4$: $\dim \ker (A - 4 I) = 1$ (spanned by $(1, 0, - 1, 1)$), $\dim \ker (A - 4 I)^{2} = 2$. One Jordan chain of length $2$ at $4$, so the block at $4$ is $J_{2} (4)$.

The Jordan canonical form is $J = \operatorname{diag} (J_{1} (1), J_{1} (2), J_{2} (4))$, a four-by-four block-diagonal matrix with two one-by-one blocks and one $J_{2} (4)$ block. The minimal polynomial is $m_{A} (t) = (t - 1) (t - 2) (t - 4)^{2} = χ_{A} (t)$, so $A$ is a cyclic operator.
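The Jordan data of this matrix can be confirmed by exhibiting a Jordan basis and checking it by multiplication. In the sketch below the four vectors were found by hand and are offered as candidates to verify, not as output of a general algorithm; the helper names are illustrative.

```python
A = [[5, 4, 2, 1],
     [0, 1, -1, -1],
     [-1, -1, 3, 0],
     [1, 1, -1, 2]]

def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def shifted(lam, v):
    """Compute (A - lam*I) v."""
    return [a - lam * x for a, x in zip(apply(A, v), v)]

trace = sum(A[i][i] for i in range(4))

u = [-1, 1, 0, 0]   # candidate eigenvector at 1
v = [1, -1, 0, 1]   # candidate eigenvector at 2
w = [1, 0, -1, 1]   # candidate eigenvector at 4
w2 = [1, 0, 0, 0]   # candidate generalised eigenvector at 4

assert trace == 11                     # equals 1 + 2 + 4 + 4
assert shifted(1, u) == [0, 0, 0, 0]   # A u = u
assert shifted(2, v) == [0, 0, 0, 0]   # A v = 2 v
assert shifted(4, w) == [0, 0, 0, 0]   # A w = 4 w
assert shifted(4, w2) == w             # (A - 4I) w2 = w: a chain of length 2 at 4
```

In the basis $(u, v, w, w_{2})$ the matrix of $A$ has the entries $1, 2, 4, 4$ on the diagonal and a single $1$ above the diagonal linking $w$ and $w_{2}$, so the eigenvalues are $1$, $2$, $4$ with one chain of length $2$ at $4$.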

Exercise 5 (medium, proof).

Let $T : V \to V$ be a linear operator on a finite-dimensional vector space over an algebraically closed field. Show that $T$ is diagonalisable iff the minimal polynomial $m_{T} (t)$ is square-free.

Hint

Combine the primary decomposition with the observation that on the generalised eigenspace at $λ$ , the operator $T - λ I$ is nilpotent.

Answer

Forward direction: if $T$ is diagonalisable with eigenvalues $λ_{1}, \dots, λ_{r}$ (distinct), pick an eigenbasis. The polynomial $p (t) = \prod_{i} (t - λ_{i})$ annihilates each eigenvector, hence $p (T) = 0$ on each eigenspace, hence on $V = ⨁_{i} E_{λ_{i}}$ . Therefore $m_{T} ∣ p$ in $K [t]$ . Since $m_{T}$ and $p$ have the same roots (the eigenvalues of $T$ ) and $p$ is monic of degree $r$ with distinct linear factors, $m_{T} = p$ is square-free.

Reverse direction: suppose $m_{T} (t) = \prod_{i = 1}^{r} (t - λ_{i})$ with the $λ_{i}$ distinct. Consider the Lagrange interpolation polynomials $p_{i} (t) = \prod_{j \neq i} (t - λ_{j}) / \prod_{j \neq i} (λ_{i} - λ_{j}) \in K [t]$, satisfying $p_{i} (λ_{j}) = δ_{ij}$ and $\sum_{i} p_{i} (t) = 1$ as a polynomial identity (the difference $\sum_{i} p_{i} - 1$ has degree less than $r$ and vanishes at the $r$ points $λ_{j}$). Set $P_{i} = p_{i} (T) \in End_{K} (V)$. The identity $\sum_{i} p_{i} (t) = 1$ gives $\sum_{i} P_{i} = I_{V}$. The identity $(t - λ_{i}) p_{i} (t) \cdot \prod_{j \neq i} (λ_{i} - λ_{j}) = \prod_{j} (t - λ_{j}) = m_{T} (t)$, applied at $T$, gives $(T - λ_{i} I) P_{i} = 0$. Hence the image of $P_{i}$ lies in $E_{λ_{i}}$. From $\sum_{i} P_{i} = I_{V}$, every $v \in V$ decomposes as $v = \sum_{i} P_{i} v$ with $P_{i} v \in E_{λ_{i}}$, so $V = \sum_{i} E_{λ_{i}}$. The sum is direct, since eigenvectors for distinct eigenvalues are linearly independent (exercise 7 of 01.01.08). So $V = ⨁_{i} E_{λ_{i}}$ and $T$ is diagonalisable.
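The projectors $P_{i} = p_{i} (T)$ from the reverse direction can be computed explicitly for a small diagonalisable matrix. The following is a minimal sketch with exact rational arithmetic; the function names and the test matrix (eigenvalues $1$ and $2$, minimal polynomial $(t - 1) (t - 2)$) are illustrative choices.

```python
from fractions import Fraction

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def shifted(A, lam):
    """The matrix A - lam*I."""
    n = len(A)
    return [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]

def spectral_projectors(A, eigenvalues):
    """P_i = p_i(A) for the Lagrange polynomials p_i at the distinct eigenvalues."""
    n = len(A)
    projectors = []
    for i, li in enumerate(eigenvalues):
        P = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]
        for j, lj in enumerate(eigenvalues):
            if j != i:
                # Multiply by the factor (A - lj I) / (li - lj) of p_i(A).
                factor = [[x / Fraction(li - lj) for x in row] for row in shifted(A, lj)]
                P = mat_mul(P, factor)
        projectors.append(P)
    return projectors

A = [[1, 1, 0],
     [0, 2, 0],
     [0, 0, 2]]
P1, P2 = spectral_projectors(A, [1, 2])

identity = [[int(r == c) for c in range(3)] for r in range(3)]
zero = [[0] * 3 for _ in range(3)]

assert [[P1[r][c] + P2[r][c] for c in range(3)] for r in range(3)] == identity
assert mat_mul(shifted(A, 1), P1) == zero   # image of P1 lies in the eigenspace at 1
assert mat_mul(shifted(A, 2), P2) == zero   # image of P2 lies in the eigenspace at 2
```

The two vanishing products rely on $(t - 1) (t - 2)$ annihilating $A$; for a matrix whose minimal polynomial has a repeated root, the same construction would fail these checks.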

Exercise 6 (medium, proof).

Let $T : V \to V$ be a linear operator on a finite-dimensional $C$ -vector space with Jordan canonical form $diag (J_{k_{1}} (λ_{1}), \dots, J_{k_{r}} (λ_{r}))$ . Show that the trace of $T$ equals the sum $\sum_{i} k_{i} λ_{i}$ .

Hint

The trace is invariant under similarity, so compute it on the Jordan form directly.

Answer

The trace of a Jordan block $J_{k} (λ)$ is $k \cdot λ$ — the sum of the $k$ diagonal entries, each equal to $λ$ . The trace of a block-diagonal matrix is the sum of the traces of the blocks, so $tr J = \sum_{i} k_{i} λ_{i}$ . Since $T$ is similar to $J$ and trace is similarity-invariant ( $tr (P A P^{- 1}) = tr A$ for invertible $P$ ), $tr T = tr J = \sum_{i} k_{i} λ_{i}$ . Equivalently, the trace is the sum of the eigenvalues of $T$ counted with algebraic multiplicity, recovered also from the next-to-leading coefficient of $χ_{T}$ .

Exercise 7 (hard, proof).

Let $N : W \to W$ be a nilpotent operator on a finite-dimensional vector space with $N^{ℓ} = 0$ and $N^{ℓ - 1} \neq 0$. Show that the number of Jordan blocks of $N$ of size exactly $k$ equals $2 \dim \ker N^{k} - \dim \ker N^{k - 1} - \dim \ker N^{k + 1}$; equivalently, it equals $d_{k} - d_{k + 1}$, where $d_{j} = \dim \ker N^{j} - \dim \ker N^{j - 1}$ for $j \geq 1$ (with $\ker N^{0} = \{0\}$).

Hint

A Jordan chain of length $m$ at $λ = 0$ contributes one new vector to each of $ker N^{1}, ker N^{2}, \dots, ker N^{m}$ and contributes nothing to $ker N^{j}$ for $j > m$ .

Answer

Decompose $W$ into Jordan chains via the Jordan-form construction. Let $b_{m}$ be the number of Jordan chains (i.e., Jordan blocks at the eigenvalue $0$) of length exactly $m$, for $m = 1, 2, \dots, ℓ$. A chain of length $m$ has the form $\{v, N v, \dots, N^{m - 1} v\}$ with $N^{j} v \in \ker N^{m - j}$. So the chain contributes exactly $\min (m, j)$ vectors to $\ker N^{j}$. Total dimension of $\ker N^{j}$:

$$ \dim \ker N^{j} = \sum_{m = 1}^{ℓ} b_{m} \cdot \min (m, j) = \sum_{m = 1}^{j} m \cdot b_{m} + j \cdot \sum_{m = j + 1}^{ℓ} b_{m}. $$

So $d_{j} = dim ker N^{j} - dim ker N^{j - 1} = \sum_{m = j}^{ℓ} b_{m}$ (the number of chains of length at least $j$ ). Then $d_{k} - d_{k + 1} = \sum_{m = k}^{ℓ} b_{m} - \sum_{m = k + 1}^{ℓ} b_{m} = b_{k}$ , which is the number of Jordan blocks of size exactly $k$ , as claimed.

This formula gives the uniqueness of the Jordan canonical form: the block-size data is recovered from the operator-intrinsic kernel-filtration dimensions.

Exercise 8 (hard, proof).

Let $A \in M_{n} (C)$ have all eigenvalues with absolute value strictly less than $1$ . Show that $A^{m} \to 0$ as $m \to \infty$ entrywise.

Hint

Reduce to a single Jordan block via the Jordan canonical form, then compute $J_{k} (λ)^{m}$ explicitly.

Answer

Let $A = P J P^{- 1}$ with $J$ the Jordan canonical form. Then $A^{m} = P J^{m} P^{- 1}$ and the entrywise convergence of $A^{m}$ to $0$ is equivalent to the convergence of $J^{m}$ to $0$ . The matrix $J^{m}$ is block-diagonal with the $m$ th powers of the Jordan blocks of $J$ . So it suffices to show $J_{k} (λ)^{m} \to 0$ for every $k \geq 1$ and every $λ$ with $∣ λ ∣ < 1$ .

Write $J_{k} (λ) = λ I_{k} + N_{k}$ with $N_{k}$ the nilpotent shift, so $N_{k}^{k} = 0$. Because $λ I_{k}$ and $N_{k}$ commute, the binomial theorem applies to $(λ I_{k} + N_{k})^{m}$, giving, for $m \geq k - 1$,

$$ J_{k} (λ)^{m} = \sum_{j = 0}^{k - 1} \binom{m}{j} λ^{m - j} N_{k}^{j}, $$

where the sum cuts off at $j = k - 1$ because $N_{k}^{j} = 0$ for $j \geq k$. Since $N_{k}^{j}$ has its $1$s in the positions $(i, i + j)$, each non-zero entry of $J_{k} (λ)^{m}$ is a single term $\binom{m}{j} λ^{m - j}$ with $0 \leq j \leq k - 1$. This term is bounded in absolute value by $m^{k - 1} \cdot |λ|^{m - k + 1}$, because $\binom{m}{j} \leq m^{j} \leq m^{k - 1}$ and $|λ|^{m - j} \leq |λ|^{m - k + 1}$ when $|λ| < 1$. Finally $m^{k - 1} |λ|^{m - k + 1} \to 0$ as $m \to \infty$ because exponential decay $|λ|^{m}$ beats polynomial growth $m^{k - 1}$ when $|λ| < 1$. So every entry of $J_{k} (λ)^{m}$ tends to $0$.
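The binomial formula for powers of a Jordan block can be compared against repeated multiplication. The following is a minimal sketch with exact rational arithmetic; the function names and the sample values $λ = 1/2$, $k = 3$ are illustrative.

```python
from fractions import Fraction
from math import comb

def power_by_formula(lam, k, m):
    """J_k(lam)^m from the binomial formula: entry (i, i+j) is C(m, j) * lam^(m-j)."""
    lam = Fraction(lam)
    J = [[Fraction(0)] * k for _ in range(k)]
    for i in range(k):
        # j <= m keeps the exponent m - j non-negative; C(m, j) = 0 beyond that anyway.
        for j in range(min(k - i, m + 1)):
            J[i][i + j] = comb(m, j) * lam ** (m - j)
    return J

def power_by_multiplication(lam, k, m):
    """J_k(lam)^m by multiplying the block by itself m times."""
    lam = Fraction(lam)
    J = [[lam if i == j else Fraction(int(j == i + 1)) for j in range(k)]
         for i in range(k)]
    P = [[Fraction(int(i == j)) for j in range(k)] for i in range(k)]
    for _ in range(m):
        P = [[sum(P[i][l] * J[l][j] for l in range(k)) for j in range(k)]
             for i in range(k)]
    return P

half = Fraction(1, 2)
assert power_by_formula(half, 3, 10) == power_by_multiplication(half, 3, 10)

# With |lam| < 1 the largest entry is already tiny at m = 50.
largest = max(abs(x) for row in power_by_formula(half, 3, 50) for x in row)
assert largest < Fraction(1, 10**9)
```

For contrast, `power_by_formula(1, 2, 7)` has the entry $7$ above the diagonal: at an eigenvalue on the unit circle, a block of size $2$ produces linear growth instead of decay.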

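Spectral-radius corollary: $A^{m} \to 0$ iff the spectral radius of $A$ is strictly less than $1$. The argument above gives the "if" direction; conversely, if $A v = λ v$ with $v \neq 0$ and $|λ| \geq 1$, then $A^{m} v = λ^{m} v$ does not tend to $0$.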

Lean formalization [Intermediate+]

Mathlib packages the minimal polynomial of an endomorphism as minpoly K T, the divisibility minpoly K A ∣ Matrix.charpoly A through Matrix.aeval_self_charpoly together with minpoly.dvd, the generalised eigenspaces as Module.End.generalizedEigenspace, and the primary-decomposition theorem that generalised eigenspaces sum to the whole module over an algebraically closed field as Module.End.iSup_generalizedEigenspace_eq_top. The companion file Codex.Foundations.LinearAlgebra.JordanForm records the statements used above.
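As an illustration of how the ingredients named above combine, here is a short Lean sketch of the divisibility statement. It assumes a recent Mathlib in which `minpoly.dvd` takes the base field and the element explicitly; lemma names and argument conventions drift between Mathlib versions, so treat this as a sketch and not as the contents of the companion file.

```lean
import Mathlib

-- Cayley-Hamilton (`Matrix.aeval_self_charpoly`) says the characteristic
-- polynomial annihilates `A`; `minpoly.dvd` says the minimal polynomial
-- divides every annihilating polynomial.
example {K : Type*} [Field K] {n : Type*} [Fintype n] [DecidableEq n]
    (A : Matrix n n K) : minpoly K A ∣ A.charpoly :=
  minpoly.dvd K A (Matrix.aeval_self_charpoly A)
```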


This unit is marked lean_status: partial because Mathlib supplies every constituent ingredient — the minimal polynomial, the divisibility theorem, the generalised eigenspaces, the primary-decomposition theorem — but no single named Matrix.jordanForm result packaging the block-diagonal form with Segre-characteristic uniqueness data. The corresponding statement in the companion module is left as a sorry-gated alias pending that packaging.

Advanced results [Master]

Rational canonical form. Over an arbitrary field $K$ (not necessarily algebraically closed), every linear operator $T : V \to V$ on a finite-dimensional $K$ -vector space is similar to a block-diagonal matrix of companion matrices. The companion matrix of a monic polynomial $p (t) = t^{d} + a_{d - 1} t^{d - 1} + \dots + a_{0} \in K [t]$ is

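$$ C (p) = \begin{pmatrix} 0 & 0 & \cdots & 0 & - a_{0} \\ 1 & 0 & \cdots & 0 & - a_{1} \\ 0 & 1 & \cdots & 0 & - a_{2} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & - a_{d - 1} \end{pmatrix} \in M_{d} (K), $$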

with characteristic and minimal polynomial both equal to $p (t)$ . The rational canonical form of $T$ is the block-diagonal matrix $diag (C (p_{1}), C (p_{2}), \dots, C (p_{s}))$ with $p_{1} ∣ p_{2} ∣ \dots ∣ p_{s}$ in $K [t]$ — the invariant factors of $T$ , uniquely determined. Over an algebraically closed field $K$ each $p_{i}$ factors as a product of $(t - λ)^{k}$ pieces, recovering the Jordan blocks via $C ((t - λ)^{k}) \sim J_{k} (λ)$ . The rational canonical form is therefore the field-of-definition-agnostic version of the Jordan form; the Smith normal form for matrices over $K [t]$ (below) produces it algorithmically ^{[Frobenius, F. G. — Theorie der linearen Formen mit ganzen Coefficienten]}.
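The defining property of the companion matrix, $p (C (p)) = 0$, is easy to test numerically. The following is a minimal sketch with integer arithmetic; the function names and the sample cubic $p (t) = t^{3} - 6 t^{2} + 11 t - 6 = (t - 1) (t - 2) (t - 3)$ are illustrative.

```python
def companion(coeffs):
    """Companion matrix of t^d + a_{d-1} t^{d-1} + ... + a_0, given [a_0, ..., a_{d-1}]."""
    d = len(coeffs)
    C = [[0] * d for _ in range(d)]
    for i in range(1, d):
        C[i][i - 1] = 1             # ones on the subdiagonal
    for i in range(d):
        C[i][d - 1] = -coeffs[i]    # last column holds -a_0, ..., -a_{d-1}
    return C

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def evaluate_monic(coeffs, C):
    """Evaluate t^d + a_{d-1} t^{d-1} + ... + a_0 at the matrix C by Horner's rule."""
    d = len(C)
    identity = [[int(i == j) for j in range(d)] for i in range(d)]
    result = identity                # the leading coefficient 1
    for a in reversed(coeffs):
        prod = mat_mul(result, C)
        result = [[prod[i][j] + a * identity[i][j] for j in range(d)] for i in range(d)]
    return result

coeffs = [-6, 11, -6]
C = companion(coeffs)
print(C)  # [[0, 0, 6], [1, 0, -11], [0, 1, 6]]
assert evaluate_monic(coeffs, C) == [[0] * 3 for _ in range(3)]
```

The trace of $C$ is $- a_{d - 1} = 6$, the sum of the roots $1 + 2 + 3$, consistent with the characteristic polynomial being $p$.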

Smith normal form. For a matrix $A \in M_{m \times n} (R)$ over a principal ideal domain $R$ , there exist invertible matrices $P \in GL_{m} (R)$ and $Q \in GL_{n} (R)$ such that $P A Q$ is a diagonal matrix $diag (d_{1}, d_{2}, \dots, d_{r}, 0, \dots, 0)$ with $d_{1} ∣ d_{2} ∣ \dots ∣ d_{r}$ in $R$ . The $d_{i}$ are the elementary divisors (also called invariant factors) of $A$ , uniquely determined up to units. For $R = Z$ this is the structure theorem for finitely generated abelian groups: an abelian group $G$ presented by an integer matrix $A$ decomposes as $Z^{k} \oplus ⨁_{i} Z / d_{i} Z$ with $k = n - r$ and $d_{i}$ the Smith elementary divisors. For $R = K [t]$ applied to $t I - T$ on $K^{n}$ , the Smith form produces the rational canonical form: the elementary divisors of $t I - T$ are the invariant factors of $T$ ^{[Smith, H. J. S. — On systems of linear indeterminate equations and congruences]}.

Module-theoretic packaging. A linear operator $T : V \to V$ on a finite-dimensional $K$ -vector space is the same datum as a finitely generated module structure on $V$ over $K [t]$ : the polynomial $t$ acts as $T$ , the scalars $K$ act as scalar multiplication, and $V$ is finitely generated because it is finite-dimensional over $K$ . The structure theorem for finitely generated modules over a PID applied to $R = K [t]$ gives

$$ V ≅ \bigoplus_{i = 1}^{s} K [t] / (q_{i} (t)), $$

with $q_{1} ∣ q_{2} ∣ \dots ∣ q_{s}$ the invariant factors. Equivalently, using primary decomposition,

$$ V ≅ \bigoplus_{j} K [t] / (p_{j} (t)^{e_{j}}), $$

with $p_{j}$ monic irreducible. Over $K = C$ the irreducibles are linear, $p_{j} = t - λ_{j}$, and the summand $K [t] / ((t - λ)^{k})$ is isomorphic as a $K [t]$-module to the cyclic $T$-module generated by the top of a Jordan chain of length $k$ at $λ$. Since multiplication by $t$ sends $(t - λ)^{j}$ to $λ (t - λ)^{j} + (t - λ)^{j + 1}$, the matrix of $T$ on $K [t] / ((t - λ)^{k})$ in the ordered basis $\{(t - λ)^{k - 1}, \dots, (t - λ)^{2}, (t - λ), 1\}$ is exactly the Jordan block $J_{k} (λ)$; listing the basis in the opposite order gives the transpose, with the $1$s below the diagonal. The Jordan form, the rational canonical form, and the elementary-divisor decomposition are three equivalent encodings of the same structure-theorem output, viewed at three different levels of abstraction ^{[Lang, S., Algebra (3rd ed., revised)]}.

Holomorphic functional calculus. For $f$ holomorphic on a neighbourhood of the spectrum of $T \in M_{n} (C)$ , the operator $f (T)$ is defined unambiguously by the Jordan-form formula

$$ f (J_{k} (λ)) = \sum_{j = 0}^{k - 1} \frac{f^{(j)} (λ)}{j !} N_{k}^{j} = \begin{pmatrix} f (λ) & f' (λ) & \frac{f'' (λ)}{2 !} & \cdots & \frac{f^{(k - 1)} (λ)}{(k - 1) !} \\ 0 & f (λ) & f' (λ) & \cdots & \frac{f^{(k - 2)} (λ)}{(k - 2) !} \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & f (λ) & f' (λ) \\ 0 & 0 & \cdots & 0 & f (λ) \end{pmatrix}. $$

Applied block-by-block and reassembled, this defines $f (T) = P f (J) P^{- 1}$ for $T = P J P^{- 1}$ the Jordan decomposition. The matrix exponential $e^{t A}$ used to solve linear ODEs $\dot{x} = A x$ is the case $f (z) = e^{t z}$: on a Jordan block, $\exp (t J_{k} (λ)) = e^{t λ} \sum_{j = 0}^{k - 1} \frac{(t N_{k})^{j}}{j !}$, producing the polynomial-in-$t$-times-exponential mode shapes that characterise solutions of constant-coefficient linear ODEs.
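The closed form for the exponential of a Jordan block can be compared with a truncated Taylor series. The following is a minimal floating-point sketch; the function names, the truncation at 60 terms, and the sample values $λ = - 1$, $k = 3$, $t = 0.5$ are illustrative assumptions.

```python
import math

def exp_jordan_block(lam, k, t):
    """exp(t*J_k(lam)) from the closed form: entry (i, i+j) is e^{t*lam} * t^j / j!."""
    E = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k - i):
            E[i][i + j] = math.exp(t * lam) * t ** j / math.factorial(j)
    return E

def exp_by_series(M, terms=60):
    """Truncated Taylor series: the sum of M^n / n! for n below `terms`."""
    k = len(M)
    total = [[float(i == j) for j in range(k)] for i in range(k)]
    term = [row[:] for row in total]
    for n in range(1, terms):
        # term becomes M^n / n! by multiplying the previous term by M / n.
        term = [[sum(term[i][l] * M[l][j] for l in range(k)) / n for j in range(k)]
                for i in range(k)]
        total = [[total[i][j] + term[i][j] for j in range(k)] for i in range(k)]
    return total

lam, k, t = -1.0, 3, 0.5
tJ = [[t * (lam if i == j else (1.0 if j == i + 1 else 0.0)) for j in range(k)]
      for i in range(k)]

closed = exp_jordan_block(lam, k, t)
series = exp_by_series(tJ)
error = max(abs(closed[i][j] - series[i][j]) for i in range(k) for j in range(k))
assert error < 1e-12
```

The entries above the diagonal are $e^{t λ} t$ and $e^{t λ} t^{2} / 2$, the polynomial-times-exponential modes described in the text.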

Conjugacy classes of $GL_{n}$. The Jordan canonical form is the complete parameter for the conjugation action $GL_{n} (C) ↷ M_{n} (C)$. The orbit space is parametrised by Jordan-form data: pairs $(λ_{1}, μ_{1}), \dots, (λ_{s}, μ_{s})$ where $λ_{i} \in C$ are distinct eigenvalues and $μ_{i} = (k_{i, 1} \geq k_{i, 2} \geq \dots)$ is the Segre partition at $λ_{i}$, with the global constraint $\sum_{i} |μ_{i}| = n$. Over $C$, conjugacy classes are not closed in the Zariski topology in general: the closure of a nilpotent orbit contains other nilpotent orbits, and the union of all nilpotent orbits is the nilpotent cone, a singular affine variety whose stratification by Jordan type is studied in geometric representation theory (Springer correspondence, Lusztig's perverse-sheaf parametrisation of nilpotent orbits). The GIT quotient $M_{n} (C) // GL_{n} (C)$ (the adjoint quotient, defined as $Spec$ of the ring of conjugation-invariant polynomials) is isomorphic to $A_{C}^{n}$, parametrising matrices by the coefficients of their characteristic polynomial.

Connection to dynamical systems. The Jordan canonical form is the linear-algebra core of stability analysis. For a linear discrete-time system $v_{m + 1} = T v_{m}$ on $C^{n}$, the iterates are $v_{m} = T^{m} v_{0}$, and $T^{m}$ is computed block-by-block on the Jordan form: $J_{k} (λ)^{m} = \sum_{j = 0}^{k - 1} \binom{m}{j} λ^{m - j} N_{k}^{j}$. Every orbit $\{v_{m}\}$ tends to $0$ iff every eigenvalue satisfies $|λ| < 1$; every orbit remains bounded iff $|λ| \leq 1$ for every eigenvalue and Jordan blocks at eigenvalues on the unit circle have size $1$; some orbit is unbounded iff some eigenvalue has $|λ| > 1$ or some Jordan block at an eigenvalue on the unit circle has size $\geq 2$. The continuous-time system $\dot{v} = T v$ has analogous dichotomies controlled by the real parts of the eigenvalues.

Synthesis. First synthesis — Jordan form as the complete invariant of similarity over $C$ : every coordinate-free property of an operator over an algebraically closed field is recovered from its Jordan-form data; the characteristic polynomial, the minimal polynomial, the trace, the determinant, the rank of every polynomial in $T$ , the dimension of every generalised eigenspace, the index of nilpotency on each generalised eigenspace, the answer to "is $T$ diagonalisable", "is $T$ semisimple", "is $T$ nilpotent" — all read off from the Jordan-form Segre-characteristic data.

Second synthesis — the unification through $K [t]$ -module structure: the Jordan canonical form, the rational canonical form, and the elementary-divisor decomposition are three packagings of the same fact. A linear operator on a finite-dimensional $K$ -vector space is the same datum as a finitely generated torsion $K [t]$ -module; the structure theorem for finitely generated modules over a PID classifies these modules by invariant factors or by primary decomposition; the matrix expressions of these classifications in suitable bases are exactly the three canonical forms. The choice of basis is the choice of presentation.

Third synthesis — function calculus and dynamics: the Jordan form computes $f (T)$ for any holomorphic $f$ on a neighbourhood of the spectrum, via the derivative-formula on each Jordan block. The matrix exponential, the resolvent, and the Riesz functional calculus on bounded operators on a Banach space all factor through the finite-dimensional Jordan case as the building-block computation. The dynamical behaviour of linear evolution — stability, oscillation, growth — is therefore encoded in the Jordan-form data of the generator.

Fourth synthesis — geometric and arithmetic generalisations: the Jordan form is the prototype for more elaborate classifications. The Springer correspondence parametrises irreducible representations of the symmetric group by Jordan-type data on the nilpotent cone of $gl_{n}$ ; the Lusztig–Spaltenstein parametrisation extends this to nilpotent orbits in semisimple Lie algebras. Smith normal form for $Z$ -matrices is the integer analogue, classifying finitely generated abelian groups. Frobenius eigenvalues on $ℓ$ -adic étale cohomology — Galois Jordan forms — encode the same partition data in number-theoretic context. The Jordan-Segre partition reappears wherever a single linear operator carries the structural data of a more elaborate object.

Full proof set [Master]

Existence of primary decomposition. Let $T : V \to V$ be a linear operator on a finite-dimensional vector space over a field $K$ whose characteristic polynomial $χ_{T}(t) = \prod_{i=1}^{s} (t - λ_{i})^{m_{i}}$ factors into powers of distinct linear factors over an algebraic closure $\overline{K}$; assume the eigenvalues $λ_{i}$ lie in $K$ (so $K = \overline{K}$ or the eigenvalues all happen to be defined over $K$). The polynomials $q_{i}(t) = (t - λ_{i})^{m_{i}}$ are pairwise coprime: $\gcd(q_{i}, q_{j}) = 1$ for $i \neq j$ because they have disjoint root sets. Hence $q_{i}$ is also coprime to $\prod_{j \neq i} q_{j}$, and by Bezout's lemma in $K[t]$, for each $i$ there exist polynomials $a_{i}, b_{i} \in K[t]$ with $a_{i} q_{i} + b_{i} \prod_{j \neq i} q_{j} = 1$. Set $r_{i}(t) = b_{i}(t) \prod_{j \neq i} q_{j}(t) \in K[t]$; then $r_{i}$ is divisible by $q_{j}$ for every $j \neq i$ and $r_{i} \equiv 1 \pmod{q_{i}}$. Consequently $\sum_{i} r_{i} - 1$ is divisible by every $q_{k}$, and since the $q_{k}$ are pairwise coprime, $\sum_{i} r_{i}(t) \equiv 1 \pmod{χ_{T}(t)}$ (the Chinese remainder theorem).

Substitute $T$ in place of $t$. The relation $\sum_{i} r_{i}(T) = I_{V}$ holds because $χ_{T}(T) = 0$ by Cayley-Hamilton. Set $P_{i} = r_{i}(T) \in \operatorname{End}_{K}(V)$. The image of $P_{i}$ is contained in $G_{λ_{i}} = \ker (T - λ_{i} I)^{m_{i}}$: we have $(T - λ_{i} I)^{m_{i}} P_{i} = q_{i}(T) r_{i}(T)$, and $q_{i} \cdot r_{i}$ is divisible by $q_{i} \cdot \prod_{j \neq i} q_{j} = χ_{T}$, so $q_{i}(T) r_{i}(T) = 0$. Hence $\operatorname{im}(P_{i}) \subseteq G_{λ_{i}}$, and the identity $\sum_{i} P_{i} = I_{V}$ gives $V = \sum_{i} G_{λ_{i}}$. The $P_{i}$ form a system of orthogonal idempotents: for $i \neq j$ the product $r_{i} r_{j}$ is divisible by every $q_{k} = (t - λ_{k})^{m_{k}}$, hence by $χ_{T}$, so $P_{i} P_{j} = 0$, and then $P_{i} = P_{i} \sum_{j} P_{j} = P_{i}^{2}$. The sum is direct: if $v \in G_{λ_{i}}$ then $P_{j} v = 0$ for every $j \neq i$ (because $q_{i}$ divides $r_{j}$), so $v = P_{i} v$; a relation $\sum_{i} v_{i} = 0$ with $v_{i} \in G_{λ_{i}}$ therefore gives $v_{k} = P_{k} \sum_{i} v_{i} = 0$ for each $k$. The $P_{i}$ are thus the projections onto the $G_{λ_{i}}$ along the other summands, and $V = ⨁_{i} G_{λ_{i}}$.
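A small worked instance may make the projectors concrete. The sketch below uses a hypothetical $3 \times 3$ matrix with $χ_{T}(t) = (t - 2)^{2} (t - 5)$, for which the Bezout identity $\tfrac{1}{9}(t - 2)^{2} - \tfrac{1}{9}(t + 1)(t - 5) = 1$ gives $r_{1}(t) = -(t + 1)(t - 5)/9$ and $r_{2}(t) = (t - 2)^{2}/9$. The matrix and helper names are assumptions chosen for illustration; exact rational arithmetic comes from the standard `fractions` module.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    """Multiply every entry of a by the scalar c."""
    return [[c * x for x in row] for row in a]

def shift(a, lam):
    """Return a - lam * I."""
    return [[x - (lam if i == j else 0) for j, x in enumerate(row)]
            for i, row in enumerate(a)]

# Hypothetical example with chi_T(t) = (t - 2)^2 (t - 5).
T = [[2, 1, 1],
     [0, 2, 1],
     [0, 0, 5]]

# P_1 = r_1(T) = -(T + I)(T - 5I)/9 and P_2 = r_2(T) = (T - 2I)^2/9.
P1 = scale(Fraction(-1, 9), mat_mul(shift(T, -1), shift(T, 5)))
P2 = scale(Fraction(1, 9), mat_mul(shift(T, 2), shift(T, 2)))

IDENTITY = [[int(i == j) for j in range(3)] for i in range(3)]
ZERO = [[0] * 3 for _ in range(3)]

print(mat_add(P1, P2) == IDENTITY)   # projectors sum to the identity
print(mat_mul(P1, P2) == ZERO)       # orthogonality
print(mat_mul(mat_mul(shift(T, 2), shift(T, 2)), P1) == ZERO)  # im P1 in G_2
print(mat_mul(shift(T, 5), P2) == ZERO)                        # im P2 in G_5
```

Each check mirrors one step of the argument: the partition of the identity, the orthogonality $P_{1} P_{2} = 0$, and the containment of each image in the corresponding generalised eigenspace.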

Restriction is nilpotent on each generalised eigenspace. On $G_{λ_{i}}$ , the operator $T - λ_{i} I$ is nilpotent by definition: $(T - λ_{i} I)^{m_{i}}$ vanishes on $G_{λ_{i}}$ . The index of nilpotency on $G_{λ_{i}}$ — the smallest $k$ with $(T - λ_{i} I)^{k}$ identically zero on $G_{λ_{i}}$ — is the multiplicity of $λ_{i}$ in the minimal polynomial $m_{T}$ and equals the size of the largest Jordan block of $T$ at $λ_{i}$ .

Jordan basis for a nilpotent operator (full construction). Let $N : W \to W$ be a nilpotent operator on a finite-dimensional vector space, with $N^{ℓ} = 0$ and $N^{ℓ-1} \neq 0$. The kernel filtration is $\{0\} = K_{0} ⊊ K_{1} ⊊ K_{2} ⊊ \dots ⊊ K_{ℓ} = W$, where $K_{j} = \ker N^{j}$. The successive quotients are linked by $N$: for each $j \geq 2$, the operator $N$ sends $K_{j}$ to $K_{j-1}$ and $K_{j-1}$ to $K_{j-2}$, and so induces a linear injection $\overline{N}_{j} : K_{j} / K_{j-1} ↪ K_{j-1} / K_{j-2}$ (injective because $w + K_{j-1} \in K_{j} / K_{j-1}$ with $\overline{N}_{j}(w + K_{j-1}) = 0$ means $N w \in K_{j-2}$, i.e., $N^{j-1} w = 0$, i.e., $w \in K_{j-1}$).

Set $d_{j} = dim (K_{j} / K_{j - 1})$ ; the injections give $d_{ℓ} \leq d_{ℓ - 1} \leq \dots \leq d_{1}$ . The Young diagram with $d_{j}$ boxes in row $j$ (from the top) is the conjugate partition to the Segre partition.
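The conjugacy between the kernel-jump sequence $(d_{1}, d_{2}, \dots)$ and the Segre partition of block sizes is easy to compute. A minimal sketch, assuming partitions are given as weakly decreasing lists of positive integers; the function name `conjugate` is illustrative.

```python
def conjugate(partition):
    """Conjugate of a partition given as a weakly decreasing list of positive ints.

    Row `row` of the conjugate counts the parts strictly larger than `row`,
    i.e. the number of Jordan blocks of size at least row + 1.
    """
    if not partition:
        return []
    return [sum(1 for part in partition if part > row)
            for row in range(partition[0])]

# Block sizes (3, 1) correspond to kernel jumps d_1 = 2, d_2 = 1, d_3 = 1.
print(conjugate([3, 1]))
```

Applying `conjugate` twice returns the original partition, which reflects the fact that the block sizes and the kernel jumps determine each other.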

Build the basis top-down. Pick a basis $\bar{v}_{1}, \dots, \bar{v}_{d_{ℓ}}$ of the top quotient $K_{ℓ} / K_{ℓ-1} = W / K_{ℓ-1}$, and lift to vectors $v_{1}, \dots, v_{d_{ℓ}} \in W$. Each $v_{a}$ generates a Jordan chain $\{v_{a}, N v_{a}, N^{2} v_{a}, \dots, N^{ℓ-1} v_{a}\}$ of length $ℓ$, with $N^{ℓ-1} v_{a} \neq 0$ because $v_{a} \notin K_{ℓ-1}$ and $N^{ℓ} v_{a} = 0$ because $N^{ℓ} = 0$.

Descend to the next level. The vectors $N v_{1}, \dots, N v_{d_{ℓ}}$ lie in $K_{ℓ-1}$ and their images in $K_{ℓ-1} / K_{ℓ-2}$ are linearly independent (by the injectivity of $\overline{N}_{ℓ}$). Extend to a basis $\overline{N v_{1}}, \dots, \overline{N v_{d_{ℓ}}}, \bar{w}_{1}, \dots, \bar{w}_{d_{ℓ-1} - d_{ℓ}}$ of $K_{ℓ-1} / K_{ℓ-2}$, and lift the new vectors to $w_{1}, \dots, w_{d_{ℓ-1} - d_{ℓ}} \in K_{ℓ-1}$. Each $w_{b}$ generates a Jordan chain $\{w_{b}, N w_{b}, \dots, N^{ℓ-2} w_{b}\}$ of length $ℓ - 1$.

Continue inductively for $j = ℓ - 2, ℓ - 3, \dots, 1$. At each level $j$, the previously built chains contribute $d_{j+1}$ vectors to $K_{j} / K_{j-1}$, namely the images under $N$ of the basis of $K_{j+1} / K_{j}$ assembled at the previous level, and these are linearly independent in the quotient by the injectivity of $\overline{N}_{j+1}$; extend to a basis with $d_{j} - d_{j+1}$ new tops at level $j$. Each new top generates a Jordan chain of length $j$.

The total construction produces $b_{j} = d_{j} - d_{j+1}$ Jordan chains of length exactly $j$, for $j = 1, 2, \dots, ℓ$ (with the convention $d_{ℓ+1} = 0$). The chains together form a basis of $W$: they contain $\sum_{j} j \, b_{j} = \sum_{j} d_{j} = \dim W$ vectors, and by construction the chain vectors at level $j$ project to a basis of $K_{j} / K_{j-1}$. In this basis $N$ has the block-diagonal matrix $⨁_{j} (N_{j}^{\oplus b_{j}})$ with $b_{j}$ copies of the size-$j$ nilpotent shift. Translating by $λ I$ on a generalised eigenspace $G_{λ}$ gives the Jordan block decomposition of $T$ on $G_{λ}$.

Uniqueness via kernel-filtration dimensions. The numbers $d_{j} = dim ker N^{j} - dim ker N^{j - 1}$ are intrinsic invariants of the operator $N$ , independent of the choice of basis. The number $b_{j} = d_{j} - d_{j + 1}$ of Jordan blocks of size $j$ is therefore intrinsic. Assembling over the eigenvalues $λ$ , the multiset of pairs $(j, λ)$ of Jordan-block sizes and corresponding eigenvalues is intrinsic to $T$ . Two operators with the same Jordan-form data are similar (the bases constructed above conjugate one to the other), and two operators with different Jordan-form data are not similar (because the data is intrinsic to similarity).
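The kernel-filtration invariants can be computed directly from ranks, without constructing a Jordan basis. The sketch below applies the formulas $d_{j} = \dim \ker N^{j} - \dim \ker N^{j-1}$ and $b_{j} = d_{j} - d_{j+1}$ to $N = T - λ I$, assuming rational entries and an exactly known eigenvalue; the function names `rank` and `jordan_block_counts` are illustrative, and the rank routine is plain Gaussian elimination rather than a numerically robust method.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rank(matrix):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def jordan_block_counts(matrix, lam):
    """Return {j: b_j}, the number of Jordan blocks of size j at eigenvalue lam."""
    n = len(matrix)
    shifted = [[Fraction(matrix[i][j]) - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    kernel_dims = [0]                      # dim ker (T - lam I)^0 = 0
    power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    while True:
        power = mat_mul(power, shifted)
        dim = n - rank(power)
        if dim == kernel_dims[-1]:         # the kernel chain has stabilised
            break
        kernel_dims.append(dim)
    d = [kernel_dims[j] - kernel_dims[j - 1] for j in range(1, len(kernel_dims))]
    d.append(0)                            # convention d_{l+1} = 0
    return {j + 1: d[j] - d[j + 1] for j in range(len(d) - 1) if d[j] > d[j + 1]}

# The 4x4 example from the Visual section: one block of size 3 at 2, one of size 1 at 5.
T = [[2, 1, 0, 0],
     [0, 2, 1, 0],
     [0, 0, 2, 0],
     [0, 0, 0, 5]]
print(jordan_block_counts(T, 2), jordan_block_counts(T, 5))
```

Because the function only uses ranks of powers of $T - λ I$, it returns the same counts for any matrix similar to $T$, which is the computational content of the uniqueness statement.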

Diagonalisability iff $m_{T}$ square-free (full proof). Suppose $m_{T}(t) = \prod_{i=1}^{r} (t - λ_{i})$ is a product of distinct linear factors. The Lagrange interpolation polynomials $p_{i}(t) = \prod_{j \neq i} (t - λ_{j}) / \prod_{j \neq i} (λ_{i} - λ_{j})$ satisfy $\sum_{i} p_{i}(t) = 1$ and $p_{i}(λ_{j}) = δ_{ij}$. From the identity $(t - λ_{i}) \cdot \prod_{j \neq i} (λ_{i} - λ_{j}) \cdot p_{i}(t) = m_{T}(t)$, applying at $T$ gives $(T - λ_{i} I) p_{i}(T) = 0$, so $\operatorname{im} p_{i}(T) \subseteq E_{λ_{i}}$. From $\sum_{i} p_{i}(T) = I_{V}$, every $v \in V$ decomposes as $\sum_{i} p_{i}(T) v$ with each summand an eigenvector or zero. The decomposition is direct (eigenvectors for distinct eigenvalues are linearly independent), so $V = ⨁_{i} E_{λ_{i}}$ and $T$ is diagonalisable. Conversely, if $T = ⨁_{i} λ_{i} I_{E_{λ_{i}}}$ is diagonalisable, the polynomial $p(t) = \prod_{i} (t - λ_{i})$ annihilates $T$, so $m_{T} \mid p$, and since $m_{T}$ shares the roots of $p$ and $p$ is square-free of degree $r$, $m_{T} = p$.
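The Lagrange projectors in this proof can be evaluated on a concrete matrix. A minimal sketch, assuming the distinct eigenvalues are supplied exactly and the minimal polynomial is their product; the matrix and the name `lagrange_projectors` are illustrative.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lagrange_projectors(matrix, eigenvalues):
    """Return [p_1(T), ..., p_r(T)] for the distinct eigenvalues supplied."""
    n = len(matrix)
    projectors = []
    for lam_i in eigenvalues:
        p = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
        for lam_j in eigenvalues:
            if lam_j == lam_i:
                continue
            # Multiply by the factor (T - lam_j I) / (lam_i - lam_j).
            factor = [[(Fraction(matrix[i][j]) - (lam_j if i == j else 0))
                       / (lam_i - lam_j)
                       for j in range(n)] for i in range(n)]
            p = mat_mul(p, factor)
        projectors.append(p)
    return projectors

# Hypothetical example with minimal polynomial (t - 1)(t - 2).
T = [[1, 1],
     [0, 2]]
P1, P2 = lagrange_projectors(T, [1, 2])
print(P1, P2)
print(mat_mul(T, P1) == P1)   # columns of P1 are eigenvectors for 1 (or zero)
```

The columns of each projector span the corresponding eigenspace, so collecting one nonzero column from each gives a diagonalising basis in this example.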

Connections [Master]

Eigenvalue, eigenvector, characteristic polynomial 01.01.08 — supplies the spectral framework on which the Jordan canonical form is built. The characteristic polynomial $χ_{T} (t) = det (t I - T)$ from 01.01.08 is the polynomial whose root multiset, with Segre-characteristic refinements, is the Jordan-form data. The Cayley-Hamilton identity $χ_{T} (T) = 0$ from the Master tier of 01.01.08 underlies the minimal-polynomial development of this unit, and the primary decomposition $V = ⨁_{λ} G_{λ}$ refines the eigendecomposition $V = ⨁_{λ} E_{λ}$ of 01.01.08 for non-diagonalisable operators. The Jordan form is the structural completion of the spectral picture begun in 01.01.08.
Determinant: axiomatic + expansion + properties 01.01.07 — supplies the multiplicativity $det (B A B^{- 1}) = det A$ that makes the characteristic polynomial — hence the entire Jordan-form data — a similarity invariant, and the explicit Leibniz / cofactor formulae used to compute characteristic polynomials in concrete examples. The block-diagonal nature of the Jordan form makes its characteristic polynomial straightforward to compute: $χ_{J} (t) = \prod_{i} (t - λ_{i})^{k_{i}}$ for $J = diag (J_{k_{1}} (λ_{1}), \dots)$ . The block-diagonal determinant identity is itself a consequence of the multiplicativity axioms of 01.01.07.
Inner-product space: orthogonality, Gram-Schmidt, spectral theorem 01.01.09 pending — refines the Jordan canonical form for self-adjoint operators on a finite-dimensional inner-product space. Every self-adjoint operator is diagonalisable (so has Jordan form with only size- $1$ blocks) and its eigenvalues are real; the spectral theorem of 01.01.09 pending is the inner-product-space sharpening of the diagonalisability criterion of this unit. The non-self-adjoint, non-diagonalisable operators on an inner-product space — those with Jordan blocks of size $\geq 2$ — fail the spectral theorem but admit a generalised polar decomposition and Schur upper-triangular form, an inner-product-space relative of the Jordan canonical form.
Matrix exponential and constant-coefficient linear ODE systems 02.06.03 pending — the matrix exponential $e^{tA}$ used to solve $\dot{x} = A x$ is computed block-by-block on the Jordan form of $A$: on a Jordan block $J_{k}(λ)$, the exponential is $e^{t J_{k}(λ)} = e^{tλ} \sum_{j=0}^{k-1} \frac{t^{j} N_{k}^{j}}{j!}$, producing the characteristic mode shapes $t^{j} e^{tλ}$. The Jordan canonical form is the linear-algebra core of the constant-coefficient ODE theory, and the explicit polynomial-in-$t$ growth rates of solutions are read off directly from the block sizes.
Bounded and unbounded operators on Hilbert space 02.11.03 — the infinite-dimensional generalisation in which the Jordan canonical form is replaced by the spectral measure of a self-adjoint operator (Stone-von Neumann) and the holomorphic functional calculus of bounded operators on a Banach space (Riesz). The Jordan-form derivative-on-each-block formula $f (J_{k} (λ)) = \sum_{j} \frac{f ^{(j)} ( λ )}{j !} N_{k}^{j}$ generalises to the contour-integral formula $f (T) = \frac{1}{2 π i} \oint_{Γ} f (z) (z I - T)^{- 1} d z$ for bounded operators with spectrum enclosed by $Γ$ ; the finite-dimensional Jordan-form formula is the residue-sum case.
Structure theorem for finitely generated modules over a PID and rational canonical form [01.02.*] — the Jordan canonical form is the algebraically-closed-field instance of the structure theorem for finitely generated modules over a PID applied to $R = K [t]$ . The same theorem applied to $R = Z$ gives the structure theorem for finitely generated abelian groups; applied to $R = K [t]$ over a non-algebraically-closed field gives the rational canonical form via companion matrices of invariant factors. The unification through the PID structure theorem is the conceptual completion of the Jordan-form picture, situating it as a single instance of a general module-theoretic classification.
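The closed form for the exponential of a Jordan block quoted in the matrix-exponential connection above can be compared against a truncated power series. This is a minimal floating-point sketch; the function names and the choice of 60 series terms are assumptions made for illustration, adequate only for small $|t λ|$.

```python
from math import exp, factorial

def jordan_block(lam, k):
    """J_k(lam) with float entries."""
    return [[lam if i == j else 1.0 if j == i + 1 else 0.0 for j in range(k)]
            for i in range(k)]

def exp_jordan_block(lam, k, t):
    """Closed form: entry (i, i + j) of exp(t J_k(lam)) is exp(t lam) t^j / j!."""
    scale = exp(t * lam)
    return [[scale * t ** (j - i) / factorial(j - i) if j >= i else 0.0
             for j in range(k)] for i in range(k)]

def exp_by_series(matrix, t, terms=60):
    """Truncated series sum over n < terms of (t A)^n / n!, as an independent check."""
    n = len(matrix)
    term = [[float(i == j) for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for step in range(1, terms):
        # Update term from (tA)^(step-1)/(step-1)! to (tA)^step/step!.
        term = [[sum(term[i][p] * matrix[p][j] for p in range(n)) * t / step
                 for j in range(n)] for i in range(n)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

closed = exp_jordan_block(-0.5, 3, 2.0)
series = exp_by_series(jordan_block(-0.5, 3), 2.0)
max_gap = max(abs(closed[i][j] - series[i][j]) for i in range(3) for j in range(3))
print(max_gap < 1e-9)
```

The entries above the diagonal carry the factors $t^{j} / j!$, which is where the polynomial-in-$t$ mode shapes $t^{j} e^{tλ}$ come from.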

Historical & philosophical context [Master]

The classification problem for linear operators up to similarity originated in the study of bilinear and quadratic forms in the nineteenth century. Karl Weierstrass, in Zur Theorie der bilinearen und quadratischen Formen (Monatsberichte der Königlich Preussischen Akademie der Wissenschaften zu Berlin, 1868, 311–338), introduced the elementary divisor theory, classifying pencils of forms $A + λ B$ over $C$ by polynomial invariants — the elementary divisors of the matrix $A - λ B$ in modern language — and obtained what would now be called the Jordan canonical form for $A$ when $B$ is nondegenerate, two years before Jordan's own treatment ^{[Weierstrass, K. — Zur Theorie der bilinearen und quadratischen Formen]}. The earlier work of Henry John Stephen Smith, On systems of linear indeterminate equations and congruences (Philosophical Transactions of the Royal Society 151, 1861, 293–326), gave the analogue for matrices over the integers: every $Z$ -matrix is equivalent under integer-invertible row and column operations to a diagonal matrix with successive entries dividing the next — the Smith normal form — proving the structure theorem for finitely generated abelian groups in the same stroke ^{[Smith, H. J. S. — On systems of linear indeterminate equations and congruences]}.

Camille Jordan, in Traité des substitutions et des équations algébriques (Gauthier-Villars, Paris, 1870, Livre II, Note D), gave the canonical-form decomposition that bears his name, in the context of substitutions of variables in polynomial equations and the structure of finite groups ^{[Jordan, C. — Traité des substitutions et des équations algébriques]}. Ferdinand Georg Frobenius, in Theorie der linearen Formen mit ganzen Coefficienten (Journal für die reine und angewandte Mathematik 86, 1879, 146–208) and subsequent papers, developed the systematic theory of invariant factors of a linear substitution, giving what is now called the rational canonical form by companion matrices of invariant factors over an arbitrary field ^{[Frobenius, F. G. — Theorie der linearen Formen mit ganzen Coefficienten]}. The four threads (Weierstrass elementary divisors over $C$, Smith normal form over $Z$, Jordan canonical form over $C$, Frobenius rational canonical form over a general field) were unified in the twentieth century by the structure theorem for finitely generated modules over a principal ideal domain, in the form codified in Serge Lang's Algebra (Addison-Wesley, 1965, Ch. III §10 and Ch. XIV §2).

The vocabulary eigenvalue and the systematic treatment of generalised eigenspaces date to the early-twentieth-century reformulation of linear algebra in the language of operator theory, driven by Hilbert's work on integral equations (1904–1910) and the formalisation of finite-dimensional spectral theory in the textbooks of Schreier-Sperner (Vorlesungen über Matrizen, 1932) and Halmos (Finite-Dimensional Vector Spaces, 1942, drawing on von Neumann's lecture notes). The module-theoretic packaging, in which a linear operator on a finite-dimensional vector space is a finitely generated torsion module over $K[t]$, is implicit in van der Waerden's Moderne Algebra (Springer, Berlin, 1930–1931) and explicit in Bourbaki's Algèbre (Hermann, 1942 onward, Ch. VII on the structure theorem). The Jordan canonical form, the rational canonical form, and the Smith normal form are now routinely packaged together as instances of the structure theorem for finitely generated modules over a PID, with the three forms corresponding to three choices of summand presentation: invariant factors, elementary divisors over $C$ (Jordan), or elementary divisors over a general field (rational).

Bibliography [Master]

Nineteenth-century origins.

Weierstrass, K., "Zur Theorie der bilinearen und quadratischen Formen", Monatsberichte der Königlich Preussischen Akademie der Wissenschaften zu Berlin, 1868, 311–338.
Smith, H. J. S., "On systems of linear indeterminate equations and congruences", Philosophical Transactions of the Royal Society of London 151 (1861), 293–326.
Jordan, C., Traité des substitutions et des équations algébriques, Gauthier-Villars, Paris, 1870.
Frobenius, F. G., "Theorie der linearen Formen mit ganzen Coefficienten", Journal für die reine und angewandte Mathematik 86 (1879), 146–208.
Frobenius, F. G., "Über lineare Substitutionen und bilineare Formen", Journal für die reine und angewandte Mathematik 84 (1878), 1–63.

Twentieth-century systematisation.

Schreier, O. & Sperner, E., Einführung in die analytische Geometrie und Algebra, Vol. II (Vorlesungen über Matrizen), Teubner, Leipzig, 1932.
van der Waerden, B. L., Moderne Algebra, two volumes, Springer, Berlin, 1930–1931.
Halmos, P. R., Finite-Dimensional Vector Spaces, Annals of Mathematics Studies 7, Princeton University Press, 1942.
Bourbaki, N., Algèbre, Ch. VII (Modules sur les anneaux principaux), Hermann, Paris, 1952.

Modern textbook treatments.

Apostol, T. M., Calculus, Vol. 2: Multi-Variable Calculus and Linear Algebra with Applications, 2nd ed., John Wiley & Sons, 1969, Ch. 4 §4.20.
Hoffman, K. & Kunze, R., Linear Algebra, 2nd ed., Prentice-Hall, 1971, Ch. 6–7.
Lang, S., Algebra, 3rd ed. revised, Graduate Texts in Mathematics 211, Springer, 2002, Ch. III §10 and Ch. XIV §2.
Axler, S., Linear Algebra Done Right, 3rd ed., Springer, 2015, Ch. 8.

Geometric representation theory follow-ups.

Springer, T. A., "Trigonometric sums, Green functions of finite groups and representations of Weyl groups", Inventiones Mathematicae 36 (1976), 173–207.
Lusztig, G., "Green polynomials and singularities of unipotent classes", Advances in Mathematics 42 (1981), 169–178.

Prerequisites

01.01.08

Tier anchors

beginner: Stretch-plus-shear blocks — Apostol Calculus Vol. 2 Ch. 4 §4.20 (informal opening)
intermediate: Apostol Calculus Vol. 2 Ch. 4 §4.20; Axler — Linear Algebra Done Right §8.A–§8.D
master: Apostol Calculus Vol. 2 Ch. 4 §4.20; Axler — Linear Algebra Done Right Ch. 8; Hoffman-Kunze — Linear Algebra Ch. 6–7; Lang — Algebra Ch. III §10 + Ch. XIV §2

References

textbooks-extra
Apostol, T. M. — Calculus, Vol. 2: Multi-Variable Calculus and Linear Algebra with Applications (2nd ed.) · Ch. 4 §4.20, partial Jordan-form discussion and minimal polynomial
Axler, S. — Linear Algebra Done Right (3rd ed.) · Ch. 8, generalised eigenspaces, nilpotent operators, the Jordan-form theorem
Hoffman, K. & Kunze, R. — Linear Algebra (2nd ed.) · Ch. 6–7, annihilating polynomials, minimal polynomial, primary decomposition, rational and Jordan canonical forms
Lang, S. — Algebra (3rd ed., revised) · Ch. III §10 and Ch. XIV §2, structure of finitely generated modules over a PID and the Jordan / rational canonical forms
Jordan, C. — Traité des substitutions et des équations algébriques · Gauthier-Villars, Paris, 1870 — original statement and proof of the Jordan canonical form
Weierstrass, K. — Zur Theorie der bilinearen und quadratischen Formen · Monatsberichte der Königlich Preussischen Akademie der Wissenschaften zu Berlin, 1868, 311–338 — elementary divisor theory
Smith, H. J. S. — On systems of linear indeterminate equations and congruences · Philosophical Transactions of the Royal Society 151 (1861), 293–326 — Smith normal form for integer matrices
Frobenius, F. G. — Theorie der linearen Formen mit ganzen Coefficienten · Journal für die reine und angewandte Mathematik 86 (1879), 146–208 — rational canonical form and invariant factors

Lean module

Codex.Foundations.LinearAlgebra.JordanForm

Mathlib gap

Mathlib packages the minimal polynomial of an endomorphism as minpoly K T together with the divisibility minpoly K A ∣ Matrix.charpoly A through Matrix.aeval_self_charpoly and minpoly.dvd, the generalised eigenspaces as Module.End.generalizedEigenspace, and the primary-decomposition statement that generalised eigenspaces sum to the whole module over an algebraically closed field as Module.End.iSup_generalizedEigenspace_eq_top in LinearAlgebra.Eigenspace.Triangularizable. What is not packaged as a single self-contained module is the Jordan canonical form theorem in its textbook block-diagonal statement: there is no Matrix.jordanForm constructor returning a block-diagonal matrix with explicit Jordan blocks and an accompanying uniqueness lemma keyed off the Segre characteristic at each eigenvalue. The Smith normal form for matrices over an arbitrary principal ideal domain is similarly partial in Mathlib: the integer case Matrix.smithNormal exists but the general PID version remains in development as of the 2026 Mathlib snapshot used by Codex.

Reviewer

TBD

Estimated time

beginner: 20m
intermediate: 45m
master: 90m