Birkhoff normal form
Anchor (Master): Birkhoff 1927 *Dynamical Systems* (AMS Colloquium Publications IX, originator); Arnold-Kozlov-Neishtadt *Mathematical Aspects of Classical and Celestial Mechanics* Ch. 7; Pöschel 1989 (Gevrey refinement); Siegel 1942 (small divisors); Moser 1956
Intuition [Beginner]
Imagine a marble resting at the bottom of a bowl. Push it a little and it rocks back and forth. Push it a tiny bit harder and the rocking becomes a slightly more elaborate dance — but only by a little, because the bowl is almost perfectly parabolic near its bottom. The Birkhoff normal form is a recipe for organising that almost-parabolic motion: how to find coordinates in which the dance is as close to a sum of independent circular oscillations as the geometry allows.
The starting point is an equilibrium of a frictionless mechanical system. Near such a point the motion is governed by a small set of natural frequencies — one frequency per oscillation direction. The lowest-order behaviour is a clean superposition of these oscillations, and the Birkhoff procedure pushes the higher-order corrections into a form that depends only on the energies of each oscillator, not on their phases.
There is a catch. When the natural frequencies stand in a simple integer ratio — say one direction oscillates exactly twice as fast as another — the simplification breaks down and the directions exchange energy. These resonances are the seeds of chaos in mechanical systems and the reason the simplification is only partial.
Visual [Beginner]
A schematic phase portrait near an elliptic equilibrium of a two-degree-of-freedom Hamiltonian. On the left the linearised motion: nested ellipses, products of two independent oscillators, each at its own frequency. On the right the same system after a Birkhoff transformation: nearly-elliptical level sets of the action variables, with a few thin resonance regions where the simple picture breaks.
The picture captures the headline result: away from frequency resonances, the motion near an equilibrium reorganises into action-only level sets to any prescribed order in the distance from the equilibrium.
Worked example [Beginner]
Take a particle of unit mass moving in a two-dimensional well with potential . The point is a minimum, so it is a stable equilibrium. The quadratic part of the energy is , an oscillator with frequencies in the -direction and in the -direction.
The ratio is a simple integer — a resonance — so the cubic correction cannot be removed: it has the same frequency as the linear motion in the -direction. The Birkhoff procedure says this term survives as a resonant coupling between the two oscillators.
Change the potential to instead. Now and , a ratio that is far from any integer relation. The Birkhoff procedure removes the cubic and quartic angle-dependence by a small change of coordinates, leaving an action-only Hamiltonian to order four.
Takeaway: the integer arithmetic of the natural frequencies decides whether the equilibrium can be simplified or whether resonance forces the directions to talk to each other.
Check your understanding [Beginner]
Formal definition [Intermediate+]
Let $M$ be a symplectic manifold of dimension $2n$ and let $H : M \to \mathbb{R}$ be a smooth Hamiltonian with an elliptic equilibrium at a point $x_0 \in M$. Choose Darboux coordinates $(q, p)$ centred at $x_0$ in which $x_0$ is the origin and $H(0) = 0$, $dH(0) = 0$. The Hessian quadratic form is, after a linear symplectic change of coordinates, the standard elliptic quadratic $$ H_2(q, p) = \tfrac{1}{2} \sum_{j=1}^{n} \omega_j (q_j^2 + p_j^2), $$ with linear frequencies $\omega_1, \ldots, \omega_n$, all non-zero and (without loss of generality) positive. Expand $H$ in homogeneous parts: $$ H = H_2 + H_3 + H_4 + \cdots, \qquad H_k \text{ homogeneous of degree } k. $$
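Resonance and non-resonance. A frequency vector $\omega = (\omega_1, \ldots, \omega_n)$ is non-resonant up to order $N$ (or $N$-non-resonant) if $$ \langle k, \omega \rangle \neq 0 \quad \text{for all } k \in \mathbb{Z}^n \text{ with } 0 < |k| \leq N, $$ where $|k| = |k_1| + \cdots + |k_n|$. The resonance module is the lattice $\mathcal{R}(\omega) = \{k \in \mathbb{Z}^n : \langle k, \omega \rangle = 0\}$. In the complex variables $z_j = (q_j + i p_j)/\sqrt{2}$, $\bar z_j = (q_j - i p_j)/\sqrt{2}$, a homogeneous polynomial of degree $m$ is a sum of monomials $z^\alpha \bar z^\beta$ of total degree $|\alpha| + |\beta| = m$; such a monomial is resonant if $\alpha - \beta \in \mathcal{R}(\omega)$ and non-resonant otherwise.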
Birkhoff normal form to order $N$. A Hamiltonian $H$ on $\mathbb{R}^{2n}$ near $0$ is in Birkhoff normal form to order $N$ if $$ H(q, p) = H_2(q, p) + Z_4(I) + Z_6(I) + \cdots + Z_{2 \lfloor N/2 \rfloor}(I) + R_{N+1}(q, p), $$ where $I_j = \tfrac{1}{2}(q_j^2 + p_j^2)$ are the actions of the linearised system, each $Z_{2k}$ is a polynomial in the actions, and $R_{N+1} = O(|(q, p)|^{N+1})$ is the remainder. Equivalently in complex coordinates $z_j = (q_j + i p_j)/\sqrt{2}$, the normal-form polynomials contain only resonant monomials $z^\alpha \bar z^\beta$ with $\alpha = \beta$ when $\omega$ is non-resonant of every order, and more generally the resonant monomials allowed by $\mathcal{R}(\omega)$ when finite-order resonance is present.
The Birkhoff theorem states that if $\omega$ is $N$-non-resonant, then a sequence of canonical transformations brings $H$ to this form to order $N$.
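The non-resonance condition is a finite check for each order, so it can be sketched directly. The following is a minimal illustration, not part of the source: the function names `resonant_vectors` and `is_nonresonant` and the tolerance `tol` are assumptions, and a floating-point test can only flag near-resonances, not certify that a frequency ratio is irrational.

```python
import math
from itertools import product

def resonant_vectors(omega, order, tol=1e-9):
    """Integer vectors k with 0 < |k|_1 <= order and |<k, omega>| < tol.

    The tolerance stands in for the exact test <k, omega> = 0 (an assumption
    needed because the frequencies are floats).
    """
    hits = []
    for k in product(range(-order, order + 1), repeat=len(omega)):
        norm1 = sum(abs(kj) for kj in k)
        if 0 < norm1 <= order:
            pairing = sum(kj * wj for kj, wj in zip(k, omega))
            if abs(pairing) < tol:
                hits.append(k)
    return hits

def is_nonresonant(omega, order, tol=1e-9):
    """True when no resonance of order <= `order` is detected."""
    return not resonant_vectors(omega, order, tol)

# Frequencies (1, 2): first resonance at order 3, from k = (2, -1) and its negative.
print(resonant_vectors((1.0, 2.0), 3))
# Frequencies (1, sqrt 2): no resonance detected through order 6.
print(is_nonresonant((1.0, math.sqrt(2.0)), 6))
```

The brute-force search visits $(2N+1)^n$ lattice points, which is adequate for the small $n$ and $N$ of hand computations.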
Key theorem with proof [Intermediate+]
Theorem (Birkhoff normal form, finite order; Birkhoff 1927). Let $H$ be a smooth Hamiltonian on $\mathbb{R}^{2n}$ with an elliptic equilibrium at the origin and linear frequencies $\omega = (\omega_1, \ldots, \omega_n)$. Suppose $\omega$ is non-resonant up to order $N$ for some integer $N \geq 3$. Then there exists a smooth canonical transformation $\Phi$, defined on a neighbourhood of the origin, such that $$ H \circ \Phi(q, p) = H_2(q, p) + \sum_{k=2}^{\lfloor N/2 \rfloor} Z_{2k}(I_1, \ldots, I_n) + R_{N+1}(q, p), $$ where each $Z_{2k}$ is a polynomial of degree $k$ in the actions $I_j = \tfrac{1}{2}(q_j^2 + p_j^2)$, $R_{N+1} = O(|(q, p)|^{N+1})$, and $\Phi$ is the identity to first order at the origin [Birkhoff 1927; ref: TODO_REF Arnold].
Proof (Lie-series method). Order the homogeneous parts of $H$ by degree and remove the angle-dependence one degree at a time. The transformations are time-one maps of polynomial Hamiltonian flows, which respect the polynomial-degree filtration cleanly.
Step 1: the cohomological operator at degree $k$. Define $\operatorname{ad}_{H_2}(F) = \{H_2, F\}$, the Poisson-bracket action of $H_2$ on functions. In complex coordinates $z_j = (q_j + i p_j)/\sqrt{2}$, the quadratic Hamiltonian becomes $H_2 = \sum_j \omega_j z_j \bar z_j$, using $z_j \bar z_j = \tfrac{1}{2}(q_j^2 + p_j^2)$, so a monomial $z^\alpha \bar z^\beta$ is an eigenfunction: $$ \operatorname{ad}_{H_2}(z^\alpha \bar z^\beta) = \{H_2, z^\alpha \bar z^\beta\} = i \langle \alpha - \beta, \omega \rangle \, z^\alpha \bar z^\beta. $$ The eigenvalues are zero for resonant monomials and bounded away from zero by the $N$-non-resonance hypothesis for non-resonant monomials of total degree $\leq N$.
Step 2: kill the non-resonant part of $H_3$. Decompose $H_3 = H_3^{\text{res}} + H_3^{\text{nr}}$ into its resonant and non-resonant components in the eigenbasis of $\operatorname{ad}_{H_2}$. Define $F_3$ as the unique zero-resonant-mean polynomial of degree $3$ satisfying $$ \operatorname{ad}_{H_2}(F_3) = H_3^{\text{nr}}, $$ which is solvable by inverting $\operatorname{ad}_{H_2}$ on the non-resonant monomials: $F_3 = \sum_{(\alpha, \beta) : \alpha - \beta \notin \mathcal{R}(\omega), |\alpha| + |\beta| = 3} h_3^{\alpha\beta} / (i\langle \alpha - \beta, \omega\rangle) \cdot z^\alpha \bar z^\beta$, where $h_3^{\alpha\beta}$ are the coefficients of $H_3$. Let $\Phi_3$ be the time-one map of the Hamiltonian flow of $-F_3$. Then to leading order $$ H \circ \Phi_3 = H_2 + H_3 - \operatorname{ad}_{H_2}(F_3) + (\text{terms of degree} \geq 4) = H_2 + H_3^{\text{res}} + (\text{deg} \geq 4), $$ since $\operatorname{ad}_{H_2}(F_3)$ exactly cancels the non-resonant cubic. The new degree-$\geq 4$ terms are explicit polynomial expressions in the $H_k$ and $F_3$.
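Step 3: iterate. Apply the same construction at degree $4$ to the new Hamiltonian: solve $\operatorname{ad}_{H_2}(F_4) = H_4^{\text{nr}}$, define $\Phi_4$ as the time-one map of the flow of $-F_4$. Continue through degree $N$. Each $F_k$ is a polynomial of degree $k$, each $\Phi_k$ is a canonical transformation, defined near the origin as the time-one map of a polynomial Hamiltonian flow, with $\Phi_k = \mathrm{id} + O(|(q, p)|^{k-1})$, and the composition $\Phi = \Phi_3 \circ \Phi_4 \circ \cdots \circ \Phi_N$ is a canonical transformation that is the identity to first order at $0$.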
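Step 4: identify the resonant terms with action polynomials. A monomial $z^\alpha \bar z^\beta$ of total degree $\leq N$ has $|\alpha - \beta| \leq N$, so under $N$-non-resonance (and a fortiori when $\omega$ is non-resonant to all orders) it is resonant only if $\alpha = \beta$. The only resonant monomials at total even degree $2k$ are therefore products $z^\alpha \bar z^\alpha$ with $|\alpha| = k$, i.e., polynomials in the actions $I_j = z_j \bar z_j$. At odd degree there are no resonant monomials at all (since $|\alpha| + |\beta|$ odd forces $\alpha - \beta$ to be a non-zero element of $\mathcal{R}(\omega)$ with odd $\ell^1$-norm, ruled out by non-resonance). The resonant pieces $H_k^{\text{res}}$ at odd $k$ vanish, and at even $k$ they are polynomials in the actions, denoted $Z_k$. The remainder $R_{N+1}$ collects everything of degree $\geq N + 1$.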
The proof is complete.
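Step 4 can be illustrated by listing the resonant monomials degree by degree. The sketch below is an illustration under stated assumptions, not the source's construction: the name `resonant_monomials` and the floating-point tolerance `tol` are hypothetical, and exponents are stored as pairs of tuples `(alpha, beta)`.

```python
import math
from itertools import product

def resonant_monomials(omega, degree, tol=1e-9):
    """List (alpha, beta) with |alpha| + |beta| = degree and <alpha - beta, omega> = 0.

    The exact resonance test is replaced by a tolerance because omega is
    given in floating point (an assumption of this sketch).
    """
    n = len(omega)
    found = []
    for exps in product(range(degree + 1), repeat=2 * n):
        if sum(exps) != degree:
            continue
        alpha, beta = exps[:n], exps[n:]
        pairing = sum((a - b) * w for a, b, w in zip(alpha, beta, omega))
        if abs(pairing) < tol:
            found.append((alpha, beta))
    return found

generic = (1.0, math.sqrt(2.0))   # no resonance detected at these orders
resonant = (1.0, 2.0)             # 1:2 resonance, k = (2, -1)

for degree in (3, 4):
    print(degree, resonant_monomials(generic, degree))
print(3, resonant_monomials(resonant, 3))
```

For the generic frequencies the odd degree is empty and the even degree contains only $\alpha = \beta$ terms, while the $1{:}2$ frequencies keep the cubic monomials $z_1^2 \bar z_2$ and $\bar z_1^2 z_2$.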
Bridge. The Birkhoff procedure builds toward the full perturbative apparatus of Hamiltonian dynamics, and its cohomological equation appears again in the KAM theorem 05.09.01, where the same Lie-series step is iterated infinitely many times under a Diophantine bound on the linear frequencies. The bridge is exactly the eigenvalue calculation $\operatorname{ad}_{H_2}(z^\alpha \bar z^\beta) = i \langle \alpha - \beta, \omega \rangle z^\alpha \bar z^\beta$: at each step the small-divisor pattern of KAM is the finite-order non-resonance pattern of Birkhoff continued past every order, and the foundational reason both apparatuses share a single technical core is that they invert the same operator on different function spaces. The same identification underlies adiabatic invariants 05.09.02, where the angle-average that survives in the slow-time limit is exactly the resonant subspace of $\operatorname{ad}_{H_2}$ in disguise. Putting these together, the Birkhoff normal form is the polynomial-formal infrastructure that the analytic and smooth perturbation theorems of the rest of the chapter rest upon.
Exercises [Intermediate+]
Lean formalization [Intermediate+]
lean_status: none — Mathlib lacks the symplectic-vector-space, Lie-series, and resonance-module infrastructure needed for the Birkhoff normal form. A formal statement would look like the following pseudocode, with each axiom replaced by a real definition once the prerequisites are in Mathlib.
A formal route would assemble: symplectic vector spaces with their canonical Poisson algebra of polynomial functions; the eigenvalue decomposition of $\operatorname{ad}_{H_2}$ on degree-$k$ homogeneous polynomials in $(z, \bar z)$; the resonance module of an elliptic frequency vector; the Lie-series machinery acting on truncated polynomial Hamiltonians; bookkeeping for the polynomial-degree filtration; finally, induction on the order of the normal form. The Pöschel-Gevrey refinement requires additional work: Cauchy estimates on holomorphic-strip norms, factorial bounds on Birkhoff coefficients under Diophantine hypotheses, and the optimal-truncation principle for asymptotic series. Both finite-order and Gevrey statements remain Mathlib-roadmap items.
Advanced results [Master]
The Birkhoff normal form is the headline of a structural circle of ideas concerning local analysis of Hamiltonian systems near equilibria. Six refinements deepen, generalise, or sit beside the basic theorem.
Birkhoff-Gustavson algorithm (1966). Gustavson's 1966 paper On constructing formal integrals of a Hamiltonian system near an equilibrium point in Astron. J. 71 [Gustavson 1966] gave the first systematic computer-algebra implementation of the Birkhoff procedure. Gustavson applied the algorithm to the Hénon-Heiles Hamiltonian $H = \tfrac{1}{2}(p_x^2 + p_y^2 + x^2 + y^2) + x^2 y - \tfrac{1}{3} y^3$, computing the normal form to high orders and demonstrating that the truncated Hamiltonian was integrable while the full system was chaotic. Gustavson's computation is the modern starting point for celestial-mechanics applications of Birkhoff's theorem and for stability estimates near Lagrange points in spacecraft orbit design.
Resonant normal forms in two degrees of freedom. When the linear frequencies satisfy a low-order resonance with , the Birkhoff procedure cannot eliminate the resonant cubic or quartic monomials; the resonant normal form at order classifies the local dynamics into focus / saddle / centre types according to the sign and structure of the resonant coefficient. The resonance gives an additional rotation symmetry, yielding the integrable normal form whose moment-map level sets are concentric two-spheres in a three-dimensional reduced space. Cushman-Bates Global Aspects of Classical Integrable Systems gives the encyclopaedic treatment of the resonance classification, including the so-called , , and resonances and the bifurcation diagrams of their action-level dynamics.
Pöschel-Gevrey theory (1989). Pöschel 1989 [Pöschel 1989] proved that under a Diophantine condition on the linear frequencies, the Birkhoff coefficients in suitable holomorphic-strip norms grow no faster than — a Gevrey-class bound. Optimal-truncation balancing then gives an exponentially-small remainder for . The statement upgrades the Birkhoff normal form from a finite-order asymptotic expansion to an exponentially precise local model, and yields a Nekhoroshev-style perpetual-stability estimate on a polynomial-in- time scale near the equilibrium even when the full series diverges. Giorgilli and Locatelli have refined the constants for celestial-mechanics applications.
Siegel-Moser convergence in special cases. Siegel 1942 [Siegel 1942] established that for a single complex variable the Birkhoff-style series for the conjugacy of an analytic map to its linear part converges under a Diophantine small-divisor condition (later relaxed to the Brjuno condition); Moser 1956 [Moser 1956] extended the result to area-preserving maps near a hyperbolic fixed point. The convergent cases are special: there is no full-dimensional convergent Birkhoff theorem in $n \geq 2$ degrees of freedom, since Poincaré non-integrability obstructs convergence on a full neighbourhood. The KAM theorem 05.09.01 is the appropriate replacement, recovering convergence on a Cantor set of positive measure rather than a full neighbourhood.
Bifurcation theory of equilibria. When the frequencies of the linearised system depend continuously on a parameter, low-order resonances cross transversely as the parameter varies, and the Birkhoff normal form provides the local model in which the bifurcation can be analysed. The Hopf bifurcation, the period-doubling cascade in area-preserving maps, and the Krein collision of elliptic eigenvalues all correspond to specific resonant Birkhoff structures. The theory underlies the universal bifurcation analysis of Arnold Geometrical Methods in the Theory of Ordinary Differential Equations and the entire industry of Hamiltonian bifurcation analysis in mathematical physics.
Hyperbolic equilibria and the stable/unstable manifolds. When the linear part has hyperbolic eigenvalues (eigenvalues with non-zero real part) a different normal-form theorem applies: Sternberg's 1957 theorem in the smooth category and the Poincaré-Dulac normal form in the analytic category give a conjugacy to a polynomial normal form determined by the resonance structure of the eigenvalues. The hyperbolic case rests on the same eigenvalue calculation with the full spectrum, but with a different small-divisor character (eigenvalue resonances rather than frequency resonances), and has the Hartman-Grobman theorem in the topological category as a coarser predecessor. The smooth-vs-analytic distinction is central: Sternberg gives smooth conjugacy in the absence of resonances, while analytic conjugacy demands a Brjuno-type Diophantine bound.
Synthesis. The Birkhoff normal form is the polynomial-formal scaffolding on which the analytic and smooth perturbation theorems of Hamiltonian dynamics are built. The foundational reason the same eigenvalue calculation underlies Birkhoff, KAM 05.09.01, and adiabatic invariants 05.09.02 is exactly that all three apparatuses solve the same cohomological equation, the Lie-bracket-with-$H_2$ inversion problem, on different function spaces and with different convergence schemes. Putting these together, Birkhoff inverts the operator order by order in polynomial degree on a full neighbourhood, producing a generically divergent series; KAM inverts it on a Cantor set of Diophantine actions with a Newton iteration whose quadratic gain dominates the loss of regularity; the adiabatic theorem inverts it once on a single frequency under slow time variation. The bridge between Birkhoff's polynomial-formal theorem and KAM's Cantor-convergent theorem is the same identification of the resonant subspace with the action-only polynomials, and this is exactly the central insight that organises the entire perturbative apparatus: the resonant subspace of $\operatorname{ad}_{H_2}$ is the one place the inversion fails, and the geometry of that failure (finite-order non-resonance, Diophantine measure-positivity, single-frequency reducibility) selects which theorem applies. The Birkhoff normal form generalises the action-angle theory of integrable systems 05.02.04 to a perturbative neighbourhood of an equilibrium, and is dual to the Hamilton-Jacobi inversion that produces action-angle coordinates in the first place.
Full proof set [Master]
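Lemma (eigenvalue decomposition of $\operatorname{ad}_{H_2}$). Let $H_2 = \tfrac{1}{2} \sum_j \omega_j (q_j^2 + p_j^2)$ with $\omega_j > 0$. In the complex coordinates $z_j = (q_j + i p_j)/\sqrt{2}$, $\bar z_j = (q_j - i p_j)/\sqrt{2}$ (canonical with $\{z_j, \bar z_k\} = -i \delta_{jk}$), the Poisson-bracket operator $\operatorname{ad}_{H_2} = \{H_2, \cdot\}$ is diagonal on monomials: $$ \operatorname{ad}_{H_2}(z^\alpha \bar z^\beta) = i \langle \alpha - \beta, \omega\rangle z^\alpha \bar z^\beta. $$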
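Proof. Direct computation: $H_2 = \sum_j \omega_j z_j \bar z_j$ and $\{z_j, \bar z_k\} = -i \delta_{jk}$. More carefully: $\{z_j \bar z_j, z_j\} = i z_j$ and $\{z_j \bar z_j, \bar z_j\} = -i \bar z_j$ in the convention $\{q_j, p_k\} = \delta_{jk}$, $\{f, g\} = \sum_j (\partial_{q_j} f \, \partial_{p_j} g - \partial_{p_j} f \, \partial_{q_j} g)$. Therefore, by the Leibniz rule, $\{z_j \bar z_j, z^\alpha \bar z^\beta\} = i (\alpha_j - \beta_j) z^\alpha \bar z^\beta$, and summing with weights $\omega_j$ gives the eigenvalue $i \langle \alpha - \beta, \omega \rangle$.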
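The eigenvalue identity can be spot-checked numerically in the real coordinates. The sketch below assumes the conventions $\{f, g\} = \sum_j (\partial_{q_j} f\, \partial_{p_j} g - \partial_{p_j} f\, \partial_{q_j} g)$ and $z_j = (q_j + i p_j)/\sqrt{2}$; the helper names, the step size `h`, and the sample point are illustrative choices, not taken from the source.

```python
import math

def monomial(q, p, alpha, beta):
    """Evaluate z^alpha * zbar^beta with z_j = (q_j + i p_j) / sqrt(2)."""
    value = 1 + 0j
    for qj, pj, a, b in zip(q, p, alpha, beta):
        z = (qj + 1j * pj) / math.sqrt(2.0)
        value *= z ** a * z.conjugate() ** b
    return value

def partial(f, q, p, variable, j, h=1e-5):
    """Central finite difference of f with respect to q_j or p_j."""
    q_plus, q_minus, p_plus, p_minus = list(q), list(q), list(p), list(p)
    if variable == "q":
        q_plus[j] += h
        q_minus[j] -= h
    else:
        p_plus[j] += h
        p_minus[j] -= h
    return (f(q_plus, p_plus) - f(q_minus, p_minus)) / (2.0 * h)

def poisson(f, g, q, p):
    """{f, g} = sum_j (f_qj g_pj - f_pj g_qj), evaluated at the point (q, p)."""
    return sum(
        partial(f, q, p, "q", j) * partial(g, q, p, "p", j)
        - partial(f, q, p, "p", j) * partial(g, q, p, "q", j)
        for j in range(len(q))
    )

def eigen_check(omega, alpha, beta, q, p):
    """Return ({H_2, m}, i <alpha - beta, omega> m) for m = z^alpha zbar^beta."""
    h2 = lambda qq, pp: sum(w * (a * a + b * b) / 2.0 for w, a, b in zip(omega, qq, pp))
    m = lambda qq, pp: monomial(qq, pp, alpha, beta)
    eigenvalue = 1j * sum((a - b) * w for a, b, w in zip(alpha, beta, omega))
    return poisson(h2, m, q, p), eigenvalue * m(q, p)

lhs, rhs = eigen_check((1.0, math.sqrt(2.0)), (2, 0), (0, 1), (0.3, -0.2), (0.1, 0.4))
print(abs(lhs - rhs))  # small, limited by the finite-difference error
```

A numerical check at one point is not a proof, but it is a quick guard against sign and normalisation slips in the bracket convention.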
Lemma (cohomological equation at degree $k$). Let $\omega$ be non-resonant up to order $N$ and let $R$ be a homogeneous polynomial of degree $k \leq N$ in $(z, \bar z)$ with no resonant component (i.e., $R^{\text{res}} = 0$). Then there is a unique homogeneous polynomial $F$ of degree $k$ with no resonant component such that $\operatorname{ad}_{H_2}(F) = R$, given by $$ F = \sum_{|\alpha| + |\beta| = k,\, \alpha - \beta \notin \mathcal{R}(\omega)} \frac{R^{\alpha\beta}}{i\langle \alpha - \beta, \omega\rangle} z^\alpha \bar z^\beta, $$ where $R^{\alpha\beta}$ are the coefficients of $R$.
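Proof. The previous lemma diagonalises $\operatorname{ad}_{H_2}$ with eigenvalues $i \langle \alpha - \beta, \omega \rangle$. For non-resonant multi-indices the eigenvalues are non-zero, so the operator is invertible on the non-resonant subspace and the explicit formula provides the unique inverse. The result is real (closed under complex conjugation) because $R$ is real, i.e. $R^{\beta\alpha} = \overline{R^{\alpha\beta}}$, and the divisors satisfy $i \langle \beta - \alpha, \omega \rangle = \overline{i \langle \alpha - \beta, \omega \rangle}$, so the coefficients of $F$ satisfy $F^{\beta\alpha} = \overline{F^{\alpha\beta}}$ as well.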
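The explicit inverse in the lemma amounts to a coefficient-wise division. The following sketch assumes a polynomial is stored as a dictionary from exponent pairs `(alpha, beta)` to complex coefficients; this representation, the name `solve_cohomological`, and the tolerance `tol` are illustrative assumptions rather than anything fixed by the source.

```python
import math

def solve_cohomological(r_coeffs, omega, tol=1e-12):
    """Solve ad_{H_2}(F) = R coefficient by coefficient.

    r_coeffs maps (alpha, beta) to the coefficient of z^alpha zbar^beta.
    A (numerically) resonant monomial has no preimage, so it raises an error.
    """
    f_coeffs = {}
    for (alpha, beta), value in r_coeffs.items():
        divisor = sum((a - b) * w for a, b, w in zip(alpha, beta, omega))
        if abs(divisor) < tol:
            raise ValueError(f"resonant monomial {alpha}, {beta}: cannot invert")
        f_coeffs[(alpha, beta)] = value / (1j * divisor)
    return f_coeffs

omega = (1.0, math.sqrt(2.0))
c = 0.5 + 0.25j
# A real cubic polynomial: c z_1^2 zbar_2 plus its complex conjugate.
r = {((2, 0), (0, 1)): c, ((0, 1), (2, 0)): c.conjugate()}
f = solve_cohomological(r, omega)

# Applying ad_{H_2} multiplies each coefficient by i <alpha - beta, omega>.
recovered = {
    key: 1j * sum((a - b) * w for a, b, w in zip(key[0], key[1], omega)) * value
    for key, value in f.items()
}
print(recovered)
```

Raising on a vanishing divisor mirrors the hypothesis of the lemma: the formula is only defined on the non-resonant subspace.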
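Lemma (one Lie-series step). Let $H$ be a smooth Hamiltonian on $\mathbb{R}^{2n}$ with elliptic equilibrium at $0$. Suppose $\omega$ is non-resonant up to order $N$. For any $3 \leq k \leq N$, suppose $H$ has been transformed so that the homogeneous parts of degree $< k$ are already in resonant normal form (i.e., $H_j = H_j^{\text{res}}$ for $3 \leq j < k$). Then there is a canonical transformation $\Phi_k$, the time-one map of a polynomial Hamiltonian flow, that is the identity through degree $k - 2$ at $0$ and brings $H$ into the form $H \circ \Phi_k = H_2 + \sum_{j=3}^{k} H_j^{\text{res}} + O(\text{deg} \geq k + 1)$.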
Proof. Decompose $H_k = H_k^{\text{res}} + H_k^{\text{nr}}$. By the previous lemma applied to $H_k^{\text{nr}}$ (a non-resonant homogeneous polynomial of degree $k$), there is a unique non-resonant $F_k$ of degree $k$ with $\operatorname{ad}_{H_2}(F_k) = H_k^{\text{nr}}$. Let $\Phi_k$ be the time-one map of the Hamiltonian flow of $-F_k$; this flow is well-defined in a neighbourhood of $0$ because $F_k$ is polynomial and vanishes to order $k \geq 3$ at $0$. Compute the pull-back via Lie series: $$ H \circ \Phi_k = H + \{H, -F_k\} + \tfrac{1}{2!}\{ \{H, -F_k\}, -F_k\} + \cdots = H - \{H, F_k\} + O((\text{deg} \geq k + 1)). $$ Splitting $H = H_2 + (H - H_2)$, the leading correction is $-\{H_2, F_k\} = -H_k^{\text{nr}}$, exactly cancelling the non-resonant part of $H_k$. The remaining corrections are of degree $\geq k + 1$ since $F_k$ vanishes to order $k$ at $0$ and the Poisson bracket reduces total polynomial degree by $2$ (each factor brings one derivative; two factors of degree $a$ and $k$ combine to total degree $a + k - 2 \geq k + 1$ when $a \geq 3$). Hence $H \circ \Phi_k$ has the stated form. The transformation is canonical because it is the time-one map of a Hamiltonian flow.
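One Lie-series step can be carried out by hand on the one-degree-of-freedom Hamiltonian $H = \tfrac{\omega}{2}(q^2 + p^2) + a q^3$, an example chosen here for illustration and not taken from the source. The sketch below assumes the bracket convention $\{z, \bar z\} = -i$ and stores polynomials in $(z, \bar z)$ as dictionaries; all function names are hypothetical. Collecting degree-4 terms of $H + \{H, -F_3\} + \tfrac{1}{2}\{\{H, -F_3\}, -F_3\}$ gives $-\tfrac{1}{2}\{H_3, F_3\}$, whose $z^2 \bar z^2$ coefficient is the $I^2$ term of the normal form.

```python
import math
from collections import defaultdict

# A polynomial in (z, zbar) is a dict {(a, b): coeff} for coeff * z**a * zbar**b.

def combine(f, g, scale=1.0):
    """Return f + scale * g."""
    out = defaultdict(complex)
    for key, value in f.items():
        out[key] += value
    for key, value in g.items():
        out[key] += scale * value
    return dict(out)

def multiply(f, g):
    out = defaultdict(complex)
    for (a, b), u in f.items():
        for (c, d), v in g.items():
            out[(a + c, b + d)] += u * v
    return dict(out)

def d_z(f):
    return {(a - 1, b): a * v for (a, b), v in f.items() if a > 0}

def d_zbar(f):
    return {(a, b - 1): b * v for (a, b), v in f.items() if b > 0}

def bracket(f, g):
    """{f, g} = -i (f_z g_zbar - f_zbar g_z), i.e. {z, zbar} = -i (assumed)."""
    inner = combine(multiply(d_z(f), d_zbar(g)), multiply(d_zbar(f), d_z(g)), -1.0)
    return {key: -1j * value for key, value in inner.items()}

def solve_cohomological(r, omega):
    """Invert ad_{H_2} on the non-resonant monomials (a != b) of r."""
    return {(a, b): v / (1j * (a - b) * omega) for (a, b), v in r.items() if a != b}

def quartic_coefficient(a_cubic, omega=1.0):
    """Coefficient of I**2 = (z * zbar)**2 after removing the cubic a_cubic * q**3."""
    c = a_cubic / (2.0 * math.sqrt(2.0))   # q**3 = (z + zbar)**3 / (2 sqrt 2)
    h2 = {(1, 1): omega}
    h3 = {(3, 0): c, (2, 1): 3 * c, (1, 2): 3 * c, (0, 3): c}
    f3 = solve_cohomological(h3, omega)
    # Degree 3 of the transformed Hamiltonian: H3 - {H2, F3} must vanish.
    degree3 = combine(h3, bracket(h2, f3), -1.0)
    assert all(abs(v) < 1e-12 for v in degree3.values())
    # Degree 4: {H3, -F3} + (1/2){{H2, -F3}, -F3} = -(1/2){H3, F3}.
    degree4 = {key: -0.5 * value for key, value in bracket(h3, f3).items()}
    return degree4.get((2, 2), 0.0).real

print(quartic_coefficient(1.0))
```

For this example the computed coefficient is $-15 a^2 / (4\omega)$ up to rounding, and for $\omega = 1$ this matches the classical amplitude-dependent frequency shift of the cubic anharmonic oscillator. The degree-4 part still contains non-resonant monomials, which the next step with $F_4$ would remove without changing the $z^2 \bar z^2$ coefficient.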
Theorem (Birkhoff, finite order). Under the hypotheses stated, there is a canonical transformation $\Phi$, a composition of time-one maps of polynomial Hamiltonian flows, such that $$ H \circ \Phi = H_2 + H_3^{\text{res}} + H_4^{\text{res}} + \cdots + H_N^{\text{res}} + R_{N+1}, $$ with $R_{N+1} = O(|(q, p)|^{N+1})$. When $\omega$ is non-resonant of every order, the resonant homogeneous polynomials $H_k^{\text{res}}$ are all polynomials in the actions $I_j$, and the odd-degree $H_k^{\text{res}}$ vanish.
Proof. Iterate the previous lemma from $k = 3$ through $k = N$. At each step the transformation leaves the lower-degree resonant content unchanged (since $\Phi_k$ is the identity through degree $k - 2$) and brings the degree-$k$ piece into resonant normal form. The composition is a canonical transformation, identity to first order at $0$. The action-only characterisation when $\omega$ is non-resonant of every order: a resonant monomial $z^\alpha \bar z^\beta$ requires $\langle \alpha - \beta, \omega \rangle = 0$, hence $\alpha = \beta$ and $z^\alpha \bar z^\alpha = I^\alpha$. At odd total degree $|\alpha| + |\beta|$, $\alpha = \beta$ forces $|\alpha| + |\beta| = 2|\alpha|$ even, contradicting odd, so no resonant monomial of odd degree exists.
Theorem (Pöschel-Gevrey, statement only). Under a Diophantine condition on the linear frequencies and analyticity of on a polydisc of radius around the equilibrium, the Birkhoff coefficients satisfy in the strip-norm sense, and the optimal-truncation residual at distance from the equilibrium is bounded by . Stated without proof; Pöschel 1989 [Pöschel 1989] and Giorgilli-Locatelli are the canonical references. The proof iterates the one-step lemma with quantitative Cauchy estimates, balances , and absorbs growth via Stirling.
Connections [Master]
Hamiltonian vector field 05.02.01: the polynomial flows that compose into the Birkhoff transformation are exactly Hamiltonian vector fields generated by polynomial functions, and the symplecticity of the resulting transformation reduces to the symplecticity of each individual flow.

Action-angle coordinates 05.02.04: the Birkhoff normal form generalises the action-angle theory from integrable systems to a perturbative neighbourhood of an equilibrium of any (not necessarily integrable) Hamiltonian, with the truncated normal form being integrable and the actions playing the canonical role.

Symplectic manifold 05.01.02: the ambient category in which the equilibrium and its neighbourhood live; the Darboux normal form for the symplectic structure is the linearised version of the Birkhoff normal form for the Hamiltonian, and both rest on the same Lie-derivative bookkeeping.

KAM theorem 05.09.01: the convergent counterpart of the divergent Birkhoff scheme: KAM iterates the same cohomological equation with a Newton-iteration scheme on a Cantor set of Diophantine action levels, replacing the linear-step divergence of Birkhoff with quadratic-step convergence on a measure-positive subset.

Adiabatic invariants 05.09.02: the slow-time-perturbation counterpart: where Birkhoff inverts $\operatorname{ad}_{H_2}$ in polynomial degree, the adiabatic theorem inverts the analogous operator in the angle variable on a single frequency under slow time variation, with the same resonant-versus-non-resonant decomposition.

Poisson bracket 05.02.02: the operator $\operatorname{ad}_{H_2}$ is the Poisson-bracket action; the resonant subspace of $\operatorname{ad}_{H_2}$ is the kernel of this action on polynomials, identifying it with the algebra of action-polynomial first integrals of the linearised system.

Generating functions 05.05.03: the Lie-series transformation is generated by a polynomial Hamiltonian in the modern Lie-series convention; in the original Birkhoff formulation the same transformation is described through a type-II generating function, and the equivalence of the two viewpoints is the polynomial-time-one-flow correspondence.

Restricted three-body problem and celestial mechanics: the foundational application: at triangular Lagrange points the Birkhoff normal form furnishes Lyapunov families of stable periodic orbits, and the Pöschel-Gevrey refinement gives Nekhoroshev-style stability bounds on astronomically long time scales for these orbits.

Hénon-Heiles and chaotic Hamiltonian systems: Gustavson's 1966 computer-algebra implementation [Gustavson 1966] of the Birkhoff procedure produced the first quantitative comparison between truncated normal-form integrability and full-system chaos, establishing the Birkhoff series as a diagnostic tool for the onset of non-integrable behaviour.
The bridge from the polynomial input (the cohomological equation at each order) to the geometric output (invariant action-only level sets to order $N$) is the foundational reason the Birkhoff procedure unifies the local theory of Hamiltonian dynamics. Putting these connections together, the same eigenvalue calculation organises the perturbative apparatus of the entire chapter, and the resonance module $\mathcal{R}(\omega)$ of the linear frequencies controls precisely how much of the integrable structure of $H_2$ survives the higher-order terms.
Historical & philosophical context [Master]
George David Birkhoff introduced the normal-form construction in his 1927 monograph Dynamical Systems, AMS Colloquium Publications IX [Birkhoff 1927], building on his earlier work on the restricted three-body problem and on Poincaré's Méthodes Nouvelles de la Mécanique Céleste (1892-1899). Birkhoff's motivation was the local stability analysis of periodic orbits in celestial mechanics: by reducing the dynamics near an elliptic periodic orbit to a polynomial integrable model plus a small remainder, Birkhoff hoped to extract perpetual-stability statements. The 1927 monograph contains the first systematic statement of the theorem, the construction of the normalising canonical transformations, and the remark that the formal series is generally divergent, a remark that anticipates the entire post-war small-divisor literature.
Carl Ludwig Siegel proved in his 1942 paper Iteration of analytic functions, Annals of Math. 43 [Siegel 1942], that for a single complex variable the Birkhoff conjugacy of an analytic map to its linear part converges under a Diophantine small-divisor condition, later relaxed to the Brjuno condition. Siegel's paper introduced the analytic toolkit (Cauchy estimates on holomorphic-strip norms, Diophantine analysis of small divisors, induction over polynomial degrees) that became the standard framework for the post-war theory of small divisors and ultimately for KAM. Jürgen Moser extended Siegel's result to area-preserving maps in his 1956 paper The analytic invariants of an area-preserving mapping near a hyperbolic fixed point, Comm. Pure Appl. Math. 9 [Moser 1956], establishing a convergent normal-form theorem in the analytic two-dimensional case that complemented the divergent Birkhoff theorem in higher dimensions.
The post-war Russian school, led by Vladimir Arnold and Anatoly Neishtadt, integrated the Birkhoff normal form into a comprehensive perturbation theory of Hamiltonian dynamics. Arnold's Mathematical Methods of Classical Mechanics §22 and Appendix 7 [Arnold] gave the canonical pedagogical treatment, and the Arnold-Kozlov-Neishtadt encyclopaedia [Arnold-Kozlov-Neishtadt] consolidated the classical theory together with the Nekhoroshev programme that derives polynomial-time-stability estimates from finite-order Birkhoff truncation. Fred Gustavson's 1966 paper On constructing formal integrals of a Hamiltonian system near an equilibrium point, Astron. J. 71 [Gustavson 1966], implemented the Birkhoff procedure as a computer-algebra algorithm and applied it to the Hénon-Heiles Hamiltonian, an application that became the standard demonstration of the diagnostic power of the truncated normal form for detecting the onset of chaos.
Jürgen Pöschel's 1989 paper On invariant manifolds of complex analytic mappings near non-resonant fixed points, Expositiones Math. 7 [Pöschel 1989], established the optimal Gevrey-class regularity of the Birkhoff coefficients under a Diophantine hypothesis on the frequencies, sharpening the Siegel-Moser convergence theorems and connecting the Birkhoff normal form to the Nekhoroshev exponential-stability programme. Antonio Giorgilli and his collaborators in Milan developed the quantitative Birkhoff theory throughout the 1990s and 2000s for celestial-mechanics applications, providing rigorous numerical bounds on the stability of triangular Lagrange points in the Sun-Jupiter system. The Birkhoff normal form remains the primary local-analytic tool of perturbative Hamiltonian mechanics and the polynomial-formal infrastructure underlying the convergent KAM and Nekhoroshev theorems.