Taylor's theorem and extrema in several variables
Anchor (Master): Apostol *Calculus* Vol. 2 Ch. 9 (originator pedagogical presentation); Taylor 1715 *Methodus incrementorum directa et inversa* (originator of the single-variable formula); Lagrange 1797 *Théorie des fonctions analytiques* (originator of the remainder form); Hesse 1857 *Über die Determinanten und ihre Anwendung in der Geometrie* (J. reine angew. Math. 53) — originator of the Hessian determinant; Morse 1925 *Relations between the critical points of a real function of n independent variables* (Transactions AMS 27) — originator of Morse theory; Thom 1972 *Stabilité structurelle et morphogénèse* (originator of catastrophe theory)
Intuition [Beginner]
A smooth function near a chosen point looks like a polynomial. The first piece of that polynomial is the constant value — the function's height at the point. The second piece is linear — the tangent plane, which captures how the function rises and falls along each coordinate direction. The third piece is quadratic — it captures how the function curves. Together, these three pieces are the Taylor expansion of the function at the chosen point.
At a point where the tangent plane is flat (meaning every directional rate of change is zero) the function has a critical point. The local behaviour is then controlled by the quadratic piece. That quadratic piece is encoded in a square symmetric table of second-rate-of-change values called the Hessian matrix. The table has one row per coordinate direction and one column per coordinate direction, and the entry in row $i$, column $j$ records how the rate of change in direction $i$ shifts as you move slightly in direction $j$.
The second-derivative test reads the curvature off this table. If the function curves upward in every direction at the critical point — every test direction shows the function bending up — the critical point is a local minimum, a bowl. If the function curves downward in every direction, it is a local maximum, an upside-down bowl. If some directions curve up and others curve down, the critical point is a saddle — like the middle of a horse's saddle, low along one direction and high along another. The Hessian's eigenvalues are the curvature numbers along the principal directions; their signs decide which case the critical point falls into.
Visual [Beginner]
A three-panel diagram. The leftmost panel shows the surface $z = x^2 + y^2$, a bowl opening upward, with a small dot at the origin and a flat plane tangent to the surface there. A short caption reads "all positive eigenvalues: local minimum". The middle panel shows $z = -x^2 - y^2$, a bowl opening downward, with the same flat tangent plane at the origin and a caption "all negative eigenvalues: local maximum". The rightmost panel shows $z = x^2 - y^2$, a saddle surface, with the tangent plane crossing the surface at the origin along two diagonal lines and a caption "mixed signs: saddle point".
The visual signature: the Hessian's eigenvalue signs read off the shape of the surface at the critical point. Three pictures, three sign patterns.
Worked example [Beginner]
Take $f(x, y) = x^2 + y^2$. The rate of change with respect to $x$ is $2x$, and with respect to $y$ is $2y$. Both rates are zero only at $(x, y) = (0, 0)$, so the single critical point is the origin.
The Hessian matrix at the origin is the $2 \times 2$ matrix whose four entries are the second rates of change: the entry in row $i$, column $j$ is the rate of change with respect to coordinate $i$ of the rate of change with respect to coordinate $j$. For this $f$, the four entries are $f_{xx} = 2$, $f_{xy} = 0$, $f_{yx} = 0$, $f_{yy} = 2$, giving the diagonal matrix with diagonal entries $2$ and $2$. The eigenvalues are read off the diagonal as $2$ and $2$, both positive. The second-derivative test says: local minimum at the origin. Direct check: $f(x, y) = x^2 + y^2 \geq 0$ with equality only at the origin, so the origin is in fact a global minimum.
Now take $f(x, y) = x^2 - y^2$. The rates of change are $2x$ and $-2y$. Both vanish at $(0, 0)$. The Hessian's four entries are $f_{xx} = 2$, $f_{xy} = 0$, $f_{yx} = 0$, $f_{yy} = -2$, giving the diagonal matrix with diagonal entries $2$ and $-2$. The eigenvalues are $2$ and $-2$, one positive and one negative. The second-derivative test says: saddle point. Direct check: along the $x$-axis $f$ rises away from the origin; along the $y$-axis $f$ falls away. The two directions disagree on which way is "up", and that disagreement is precisely a saddle.
What this tells us: at a critical point, the Hessian eigenvalue signs decide the local behaviour. All positive means a local min; all negative means a local max; mixed signs means a saddle.
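The sign check can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original presentation: the function name `classify_2x2` and the tolerance `tol` are hypothetical choices, and the closed-form eigenvalue formula used here applies only to symmetric $2 \times 2$ matrices.

```python
import math


def classify_2x2(fxx, fxy, fyy, tol=1e-12):
    """Classify a critical point from the entries of a symmetric 2x2 Hessian.

    The Hessian is [[fxx, fxy], [fxy, fyy]]. `tol` is an assumed threshold
    below which an eigenvalue is treated as zero.
    """
    # Closed-form eigenvalues of a symmetric 2x2 matrix: mean -/+ radius.
    mean = 0.5 * (fxx + fyy)
    radius = math.hypot(0.5 * (fxx - fyy), fxy)
    smallest, largest = mean - radius, mean + radius

    if smallest > tol:
        return "local minimum"   # both eigenvalues positive
    if largest < -tol:
        return "local maximum"   # both eigenvalues negative
    if smallest < -tol and largest > tol:
        return "saddle"          # mixed signs
    return "inconclusive"        # at least one eigenvalue is (numerically) zero


if __name__ == "__main__":
    print(classify_2x2(2.0, 0.0, 2.0))    # Hessian with diagonal entries 2, 2
    print(classify_2x2(2.0, 0.0, -2.0))   # Hessian with diagonal entries 2, -2
```

The function only inspects signs, so rescaling the Hessian by a positive constant never changes the answer.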
Check your understanding [Beginner]
Formal definition [Intermediate+]
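Throughout this section $U \subseteq \mathbb{R}^n$ denotes an open set. A map $f : U \to \mathbb{R}$ is $C^k$ for $k \geq 1$ when all partial derivatives of $f$ of orders $\leq k$ exist and are continuous on $U$.
Multi-index notation. A multi-index is a tuple $\alpha = (\alpha_1, \ldots, \alpha_n)$ of non-negative integers. Set $|\alpha| = \alpha_1 + \cdots + \alpha_n$ and $\alpha! = \alpha_1! \cdots \alpha_n!$. For $h \in \mathbb{R}^n$ write $h^\alpha = h_1^{\alpha_1} \cdots h_n^{\alpha_n}$. The partial-derivative operator of multi-degree $\alpha$ is $$ D^\alpha f = \frac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}. $$ Equality of mixed partials (Clairaut-Schwarz for $C^k$ functions) means $D^\alpha f$ depends only on the multi-index $\alpha$, not on the order in which the partials are taken.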
Taylor's theorem (multi-variable, Lagrange form). Let $f : U \to \mathbb{R}$ be $C^{k+1}$, let $a \in U$, and let $h \in \mathbb{R}^n$ be such that the line segment $\{a + th : 0 \leq t \leq 1\}$ lies inside $U$. Then there exists $c$ on that segment with $$ f(a + h) = \sum_{|\alpha| \leq k} \frac{D^\alpha f(a)}{\alpha!} h^\alpha + \sum_{|\alpha| = k+1} \frac{D^\alpha f(c)}{\alpha!} h^\alpha. $$ The first sum is the Taylor polynomial of $f$ at $a$ of order $k$; the second sum is the remainder in Lagrange form.
Following Apostol [Apostol Ch. 9 §9.4–9.7]. For convex $U$, the segment hypothesis is automatic for $a, a + h \in U$.
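As a worked instance of the multi-index notation (an illustration under the assumption $n = 2$, $k = 2$, with $h = (h_1, h_2)$ and subscripts on $f$ denoting partial derivatives evaluated at $a$): the multi-indices with $|\alpha| = 2$ are $(2, 0)$, $(1, 1)$, $(0, 2)$, with $\alpha! = 2, 1, 2$ respectively, so the order-2 Taylor polynomial unpacks to $$ \sum_{|\alpha| \leq 2} \frac{D^\alpha f(a)}{\alpha!} h^\alpha = f + f_x h_1 + f_y h_2 + \tfrac{1}{2} f_{xx} h_1^2 + f_{xy} h_1 h_2 + \tfrac{1}{2} f_{yy} h_2^2. $$ The mixed term carries coefficient $1$ rather than $\tfrac{1}{2}$ because the single multi-index $(1, 1)$ accounts for both orderings of the mixed partial.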
Hessian matrix. The Hessian of a $C^2$ function $f : U \to \mathbb{R}$ at $a \in U$ is the $n \times n$ matrix $$ H_f(a) = \left( \frac{\partial^2 f}{\partial x_i \partial x_j}(a) \right)_{i, j = 1}^n. $$ By Clairaut-Schwarz, $H_f(a)$ is symmetric. The Hessian is the matrix representing the second-order term in the Taylor expansion: for $f$ of class $C^2$ on a convex open $U$ and $a, a + h \in U$, $$ f(a + h) = f(a) + \nabla f(a) \cdot h + \tfrac{1}{2} h^T H_f(a) h + o(|h|^2), $$ where $\nabla f(a)$ is the gradient row and $h^T H_f(a) h$ is the quadratic form on $\mathbb{R}^n$ associated to $H_f(a)$.
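When second partials are awkward to compute by hand, the Hessian can be approximated numerically. The sketch below is one possible approach, assuming central finite differences with a hypothetical step `eps`; the names `hessian_fd` and `quad_form` are illustrative and not taken from any library.

```python
def hessian_fd(f, a, eps=1e-4):
    """Approximate the Hessian of f at the point a by central differences.

    f takes a list of floats and returns a float; a is a list of floats.
    `eps` is an assumed step size; the result is only an approximation.
    """
    n = len(a)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                # Evaluate f at a + si*eps*e_i + sj*eps*e_j.
                p = list(a)
                p[i] += si * eps
                p[j] += sj * eps
                return f(p)
            # Four-point central-difference stencil for d^2 f / dx_i dx_j.
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * eps * eps)
    return H


def quad_form(H, h):
    """Evaluate the quadratic form h^T H h."""
    return sum(h[i] * H[i][j] * h[j]
               for i in range(len(h)) for j in range(len(h)))


if __name__ == "__main__":
    # Illustrative quadratic: exact Hessian is [[2, 3], [3, -2]].
    g = lambda p: p[0] ** 2 + 3 * p[0] * p[1] - p[1] ** 2
    print(hessian_fd(g, [0.5, -0.25]))
```

For a quadratic function the stencil is exact up to rounding, which makes it a convenient sanity check before applying the approximation to less tame functions.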
Critical point. A point $a \in U$ is a critical point of a $C^1$ function $f : U \to \mathbb{R}$ when $\nabla f(a) = 0$. A critical point $a$ is a local minimum when $f(x) \geq f(a)$ for all $x$ in some neighbourhood of $a$, a local maximum when $f(x) \leq f(a)$ similarly, and a saddle when $a$ is neither: every neighbourhood of $a$ contains points $x$ with $f(x) > f(a)$ and points $y$ with $f(y) < f(a)$.
Definiteness classes for a symmetric matrix. A symmetric matrix $A \in \mathbb{R}^{n \times n}$ is positive definite when $h^T A h > 0$ for all $h \neq 0$; equivalently, all eigenvalues of $A$ are strictly positive. Negative definite is the same with the inequality reversed; equivalently, all eigenvalues strictly negative. Positive semidefinite allows $h^T A h \geq 0$ with possible equality on a nonzero subspace; some eigenvalues may vanish. Indefinite when both positive and negative eigenvalues are present; equivalently, $h^T A h$ takes both positive and negative values.
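Positive definiteness can be checked without computing eigenvalues. The sketch below uses the standard fact that a symmetric matrix is positive definite exactly when its Cholesky factorisation succeeds with strictly positive pivots; the function names and the tolerance `tol` are hypothetical, and the input is assumed to be symmetric.

```python
import math


def is_positive_definite(A, tol=1e-12):
    """Return True when the symmetric matrix A (list of lists) is positive definite.

    Attempts a Cholesky factorisation A = L L^T; a pivot that is not
    strictly positive (beyond the assumed tolerance `tol`) means failure.
    """
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    return False          # non-positive pivot
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True


def is_negative_definite(A, tol=1e-12):
    """A is negative definite exactly when -A is positive definite."""
    return is_positive_definite([[-x for x in row] for row in A], tol)


if __name__ == "__main__":
    print(is_positive_definite([[2.0, 1.0], [1.0, 2.0]]))
    print(is_positive_definite([[1.0, 2.0], [2.0, 1.0]]))
```

A matrix that fails both checks is either indefinite or semidefinite; distinguishing those two cases needs more information than this sketch extracts.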
Counterexamples to common slips
The second-derivative test is silent at degenerate critical points. A zero eigenvalue of the Hessian leaves the test inconclusive. The classical witnesses on $\mathbb{R}^2$: $f(x, y) = x^4 + y^4$ has Hessian zero at the origin (both eigenvalues zero) but has a strict local minimum there; $f(x, y) = x^4 - y^4$ has Hessian zero at the origin but has a saddle there; $f(x, y) = x^2 + y^4$ has one zero eigenvalue at the origin but has a strict local minimum there. Higher-order Taylor terms decide.
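A quick numerical probe makes the degenerate cases concrete. The sketch below assumes the three witnesses are $x^4 + y^4$, $x^4 - y^4$, and $x^2 + y^4$ (these formulas are an assumption for illustration), and samples each on a small circle around the origin; the helper name `range_on_circle` is hypothetical. Sampling gives evidence about the local shape, not a proof.

```python
import math


def range_on_circle(f, radius=1e-2, samples=360):
    """Return (min, max) of f over evenly spaced points on a circle about the origin."""
    values = []
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        values.append(f(radius * math.cos(theta), radius * math.sin(theta)))
    return min(values), max(values)


# Assumed witnesses; each has a degenerate Hessian at the origin and value 0 there.
quartic_bowl = lambda x, y: x ** 4 + y ** 4
quartic_saddle = lambda x, y: x ** 4 - y ** 4
mixed_bowl = lambda x, y: x ** 2 + y ** 4

if __name__ == "__main__":
    for name, f in [("x^4 + y^4", quartic_bowl),
                    ("x^4 - y^4", quartic_saddle),
                    ("x^2 + y^4", mixed_bowl)]:
        lo, hi = range_on_circle(f)
        print(name, "min > 0:", lo > 0, "max > 0:", hi > 0)
```

A strictly positive minimum on every small circle is consistent with a strict local minimum at the origin, while a sign change on the circle indicates a saddle.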
The gradient must actually vanish. A test applied at a non-critical point reports nothing about extrema. At a non-critical point the function strictly increases along the gradient direction and strictly decreases along its negative; no extremum is possible.
Symmetry of the Hessian assumes $C^2$. Equality of mixed partials requires continuity of the second partials. A function with discontinuous second partials can have asymmetric mixed-partial values; the classical $f(x, y) = xy(x^2 - y^2)/(x^2 + y^2)$ extended by $f(0, 0) = 0$ has $\partial_x \partial_y f \neq \partial_y \partial_x f$ at the origin because $f$ is not $C^2$ there.
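The asymmetry can be observed numerically. The sketch below assumes the classical example meant here is $f(x, y) = xy(x^2 - y^2)/(x^2 + y^2)$ with $f(0, 0) = 0$, and uses nested central differences; the inner step `e` must be much smaller than the outer step `d`, and both values are illustrative assumptions.

```python
def f(x, y):
    """Assumed classical example with unequal mixed partials at the origin."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x * x - y * y) / (x * x + y * y)


def fx(x, y, e=1e-7):
    """Central-difference approximation of the partial derivative in x."""
    return (f(x + e, y) - f(x - e, y)) / (2 * e)


def fy(x, y, e=1e-7):
    """Central-difference approximation of the partial derivative in y."""
    return (f(x, y + e) - f(x, y - e)) / (2 * e)


def dy_of_fx_at_origin(d=1e-3):
    """Differentiate f_x along the y-axis at the origin."""
    return (fx(0.0, d) - fx(0.0, -d)) / (2 * d)


def dx_of_fy_at_origin(d=1e-3):
    """Differentiate f_y along the x-axis at the origin."""
    return (fy(d, 0.0) - fy(-d, 0.0)) / (2 * d)


if __name__ == "__main__":
    print(dy_of_fx_at_origin(), dx_of_fy_at_origin())
```

Along the $y$-axis $f_x(0, y) = -y$ and along the $x$-axis $f_y(x, 0) = x$, so the two iterated derivatives at the origin come out near $-1$ and $+1$ respectively.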
Eigenvalue signs, not eigenvalue magnitudes, decide. A tiny positive eigenvalue is still positive; the test treats it the same as a very large one. The conclusion depends on signs, not on quantitative magnitudes.
Definiteness is an open-cone condition. Positive definiteness is preserved under small symmetric perturbations (the eigenvalues vary continuously and stay positive), so a strict local minimum at a critical point with positive definite Hessian persists under small smooth perturbations of $f$, with the critical point itself moving slightly. This is a stability statement at the heart of Morse theory.
Key theorem with proof [Intermediate+]
Theorem (second-derivative test for extrema). Let $U \subseteq \mathbb{R}^n$ be open, $f : U \to \mathbb{R}$ a $C^2$ function, and $a \in U$ a critical point with $\nabla f(a) = 0$. Let $H = H_f(a)$ be the Hessian at $a$.
- If $H$ is positive definite, then $a$ is a strict local minimum of $f$.
- If $H$ is negative definite, then $a$ is a strict local maximum of $f$.
- If $H$ is indefinite (has both a positive and a negative eigenvalue), then $a$ is a saddle.
- If $H$ is positive semidefinite or negative semidefinite with at least one zero eigenvalue, the test is inconclusive.
Following Apostol [Apostol Ch. 9 §9.8–9.9].
Proof. Cases (1), (2), and (3) admit a direct argument; case (4) is the negation, witnessed by the examples in the counterexamples list above.
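Set $\varphi(h) = f(a + h) - f(a)$ for $|h|$ small enough that the segment from $a$ to $a + h$ lies in $U$. Apply Taylor's theorem to $f$ at $a$ with $k = 1$: there exists $c(h)$ on the segment from $a$ to $a + h$ with $$ f(a + h) = f(a) + \nabla f(a) \cdot h + \tfrac{1}{2} \sum_{|\alpha| = 2} \frac{2}{\alpha!} D^\alpha f(c(h)) h^\alpha = f(a) + 0 + \tfrac{1}{2} h^T H_f(c(h)) h, $$ where the gradient term vanishes by the critical-point hypothesis and the second sum equals $h^T H_f(c(h)) h$ since, in multi-index notation, the multi-indices with $|\alpha| = 2$ enumerate the entries of the Hessian with the coefficient $2/\alpha!$ matching the row/column duplication in the quadratic form. Hence $$ \varphi(h) = \tfrac{1}{2} h^T H_f(c(h)) h. $$
Write $H = H_f(a)$ and $H(t) = H_f(a + th)$ at the point $c(h) = a + th$ on the segment, with $t \in [0, 1]$ depending on $h$ and $|h|$ small. By continuity of the second partials, $H(t) \to H$ entrywise as $h \to 0$, so $\|H(t) - H\| \to 0$ in operator norm.
Case (1): $H$ positive definite. Let $\lambda_{\min} > 0$ be the smallest eigenvalue of $H$. The Rayleigh-quotient bound gives $h^T H h \geq \lambda_{\min} |h|^2$ for all $h \in \mathbb{R}^n$. By continuity of $H_f$, choose $\delta > 0$ so $|h| < \delta$ implies $\|H(t) - H\| \leq \tfrac{1}{2} \lambda_{\min}$. Then $$ h^T H(t) h = h^T H h + h^T (H(t) - H) h \geq \lambda_{\min} |h|^2 - \tfrac{1}{2} \lambda_{\min} |h|^2 = \tfrac{1}{2} \lambda_{\min} |h|^2. $$ For $0 < |h| < \delta$, $\varphi(h) = \tfrac{1}{2} h^T H(t) h \geq \tfrac{1}{4} \lambda_{\min} |h|^2 > 0$. So $f(a + h) > f(a)$ for $a + h$ in a punctured ball about $a$: $a$ is a strict local minimum.
Case (2): $H$ negative definite. Apply case (1) to $-f$, whose Hessian at $a$ is $-H$, positive definite. The conclusion that $-f$ has a strict local minimum at $a$ translates to $f$ having a strict local maximum at $a$.
Case (3): $H$ indefinite. Let $v_+$ be a unit eigenvector of $H$ with eigenvalue $\lambda_+ > 0$ and $v_-$ a unit eigenvector with eigenvalue $\lambda_- < 0$. Restrict $f$ along the line $s \mapsto a + s v_+$: along this line $\varphi(s v_+) = \tfrac{1}{2} s^2 v_+^T H(t) v_+$. By continuity, $v_+^T H(t) v_+ \to v_+^T H v_+ = \lambda_+ > 0$ as $s \to 0$, so for small $s \neq 0$, $\varphi(s v_+) > 0$, hence $f(a + s v_+) > f(a)$. Similarly along the line $s \mapsto a + s v_-$, $f(a + s v_-) < f(a)$ for small $s \neq 0$. Every neighbourhood of $a$ contains points with $f > f(a)$ and points with $f < f(a)$, so $a$ is a saddle.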
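Case (4): semidefinite with zero eigenvalue. The witnesses in the counterexamples list show that a strict local minimum and a saddle both occur with identically zero Hessian, and negating a witness that has a strict local minimum produces a strict local maximum with the same Hessian. So all three behaviours (local min, local max, saddle) can occur with the same Hessian signature, and the second-derivative test alone cannot decide. Higher-order Taylor terms, or direct inspection, must resolve.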
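A worked instance with two critical points of different types (an illustrative example, not taken from the cited source): let $f(x, y) = x^3 - 3x + y^2$. Then $$ \nabla f(x, y) = (3x^2 - 3, \; 2y), \qquad H_f(x, y) = \begin{pmatrix} 6x & 0 \\ 0 & 2 \end{pmatrix}. $$ The gradient vanishes exactly at $(1, 0)$ and $(-1, 0)$. At $(1, 0)$ the Hessian has eigenvalues $6$ and $2$, so case (1) gives a strict local minimum. At $(-1, 0)$ the eigenvalues are $-6$ and $2$, so case (3) gives a saddle. Neither point is a global extremum, since $f(x, 0) = x^3 - 3x$ is unbounded in both directions; the test is local.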
Bridge. Five threads run from the second-derivative test into the rest of the curriculum, and each one is a refinement of the same eigenvalue-counting machine. First, the test is local: the Hessian eigenvalue signature at one point decides the local-shape question without any global information about $f$. The proof shows that strict definiteness is what powers the conclusion: the quadratic form dominates the Taylor remainder. Second, the test connects to Morse theory: a Morse function is one whose critical points are all non-degenerate (Hessian invertible). The number of negative eigenvalues at a critical point is its Morse index, and the topology of the sublevel set $\{f \leq c\}$ changes by attaching a cell of dimension equal to the index as $c$ passes a critical value. The eigenvalue signature is the topological invariant.
Third, the test connects to Lagrange multipliers via the implicit function theorem 02.05.04: constrained extrema of $f$ on $\{g = 0\}$ are critical points of the Lagrangian $L(x, \lambda) = f(x) - \lambda g(x)$, and the bordered-Hessian version of the second-derivative test classifies them. Fourth, the test connects to asymptotic analysis through Laplace's method: integrals of the form $\int e^{-N f(x)} \, dx$ concentrate as $N \to \infty$ around the minima of $f$, with Hessian-determinant prefactors governing the Gaussian-integral leading term. Fifth, the test connects to catastrophe theory through Thom's classification of generic degenerate critical points: when the Hessian is degenerate, the third-order or higher Taylor terms still classify the critical point up to local diffeomorphism into a finite list of catastrophe normal forms (fold, cusp, swallowtail, butterfly, and so on). Together, these threads identify the Hessian eigenvalue structure as the foundational mechanism for local-shape classification in differential calculus and its downstream applications.
Exercises [Intermediate+]
Lean formalization [Intermediate+]
lean_status: partial. Mathlib provides the multi-variable Taylor expansion through `taylorWithinEval`, the iterated Fréchet derivative `iteratedFDeriv`, and Taylor's theorem in Lagrange and integral remainder forms. The Hessian appears via `iteratedFDeriv ℝ 2 f x` paired with the bilinear-form interpretation through `ContinuousMultilinearMap.toBilinearMap`. The second-derivative test for extrema is partially packaged through definiteness results on quadratic forms in `LinearAlgebra.QuadraticForm.Basic`. The textbook-style packaging in Apostol notation under one named result is the Codex-facing gap.
The companion module at Codex.Analysis.MultiVariable.TaylorExtrema re-exports these statements and records the unification gap.
Advanced results [Master]
Taylor's theorem, integral remainder form. Let $f$ be $C^{k+1}$ on a convex open $U$, $a \in U$, $a + h \in U$. Then $$ f(a + h) = \sum_{|\alpha| \leq k} \frac{D^\alpha f(a)}{\alpha!} h^\alpha + \sum_{|\alpha| = k+1} \frac{k+1}{\alpha!} h^\alpha \int_0^1 (1 - t)^k D^\alpha f(a + th) \, dt. $$ The integral form gives a remainder that is continuous in $a$ and $h$, in contrast to the Lagrange form whose remainder depends on a non-explicit intermediate point [Apostol Ch. 9 §9.4–9.7]. Proof in Exercise 7 by reduction to the single-variable integral form along the segment $t \mapsto a + th$. The integral form is the version preferred in modern analysis because of its uniformity properties.
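Specialising the formula to $k = 1$ shows how the Hessian enters (a worked instance of the statement above). The multi-indices with $|\alpha| = 2$ carry coefficients $2/\alpha!$, and $\sum_{|\alpha| = 2} \frac{2}{\alpha!} D^\alpha f \, h^\alpha = h^T H_f h$, so $$ f(a + h) = f(a) + \nabla f(a) \cdot h + h^T \left( \int_0^1 (1 - t) \, H_f(a + th) \, dt \right) h. $$ At a critical point the gradient term drops out and $f(a + h) - f(a)$ is an exact quadratic form in $h$ whose matrix depends continuously on $h$ and equals $\tfrac{1}{2} H_f(a)$ at $h = 0$, since $\int_0^1 (1 - t) \, dt = \tfrac{1}{2}$.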
Morse theory and the index of a critical point. A smooth function $f : M \to \mathbb{R}$ on a smooth manifold $M$ is a Morse function when every critical point of $f$ has invertible Hessian. The Morse index of a non-degenerate critical point $p$ is the number of negative eigenvalues of $H_f(p)$. The Morse lemma states: near a non-degenerate critical point $p$ of index $\lambda$, there exist local coordinates $(y_1, \ldots, y_n)$ centred at $p$ in which $f$ takes the canonical form $f = f(p) - y_1^2 - \cdots - y_\lambda^2 + y_{\lambda+1}^2 + \cdots + y_n^2$. The topology of the sublevel set $\{f \leq c\}$ changes by attaching a $\lambda$-cell as $c$ passes a critical value of index $\lambda$. Originator: Marston Morse [Morse 1925]; standard modern reference Milnor *Morse Theory* (1963). The Morse lemma is a quadratic-form normal-form result whose proof is a slick parametric application of the inverse function theorem 02.05.04 to a smoothly varying square-root of the Hessian; the cellular-attachment statement is the foundation for the proof of the h-cobordism theorem (Smale 1961), Floer homology (Floer 1988), and gradient-flow constructions across geometric analysis.
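Lagrange multipliers. Let $f, g_1, \ldots, g_m : U \to \mathbb{R}$ be $C^1$. If $a$ is a local extremum of $f$ on the constraint set $\{g_1 = \cdots = g_m = 0\}$ and the constraint gradients $\nabla g_1(a), \ldots, \nabla g_m(a)$ are linearly independent (the constraint qualification), then there exist scalars $\lambda_1, \ldots, \lambda_m$ with $\nabla f(a) = \sum_j \lambda_j \nabla g_j(a)$. The $\lambda_j$ are the Lagrange multipliers. The Lagrangian $L(x, \lambda) = f(x) - \sum_j \lambda_j g_j(x)$ has its critical points in $(x, \lambda)$-space precisely at the constraint-satisfying points where this multiplier identity holds, and every regular constrained extremum is among them. Originator: Joseph-Louis Lagrange 1788, *Méchanique analytique*. The proof is a direct corollary of the implicit function theorem 02.05.04 applied to $g = (g_1, \ldots, g_m)$ at $a$, with the chain rule 02.05.03 producing the gradient identity (Exercise 7 of 02.05.04). The constrained second-derivative test uses the bordered Hessian (the Hessian of $L$ in $x$ restricted to the tangent space $\ker Dg(a)$) and classifies $a$ as a constrained local minimum, maximum, or saddle by the definiteness signature on this subspace (Exercise 6).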
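A small worked instance (an illustrative example under the sign convention $L = f - \lambda g$, which is assumed here): extremise $f(x, y) = x + y$ on the circle $g(x, y) = x^2 + y^2 - 2 = 0$. The multiplier identity $\nabla f = \lambda \nabla g$ reads $$ (1, 1) = \lambda (2x, 2y), $$ so $x = y = 1/(2\lambda)$, and the constraint forces $x = y = \pm 1$ with $\lambda = \pm \tfrac{1}{2}$. The Hessian of $L$ in $(x, y)$ is $-2\lambda I$. At $(1, 1)$, $\lambda = \tfrac{1}{2}$ gives $-I$, negative definite on the tangent line, so $(1, 1)$ is a constrained maximum with $f = 2$. At $(-1, -1)$, $\lambda = -\tfrac{1}{2}$ gives $+I$, positive definite on the tangent line, so $(-1, -1)$ is a constrained minimum with $f = -2$. The constraint gradient $(2x, 2y)$ is nonzero on the circle, so the constraint qualification holds at both points.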
Catastrophe theory. Thom's classification theorem states: a generic family $f_u(x)$, smooth and depending on $r \leq 4$ control parameters $u$, has critical points whose local normal forms, after a smooth change of coordinates in $x$, are one of seven types: the fold ($x^3$), cusp ($x^4$), swallowtail ($x^5$), butterfly ($x^6$), and three umbilics (hyperbolic, elliptic, parabolic). Originator: René Thom [Thom 1972]; rigorous proof by John Mather (1968–1971). The classification extends the second-derivative test from non-degenerate critical points (where the Hessian classifies completely) to degenerate critical points in low-codimension families. Each catastrophe is a bifurcation: as the control parameters cross a critical surface, the structure of critical points of $f_u$ changes qualitatively. Applications run from optics (caustics are fold and cusp catastrophes) to fluid mechanics (vortex shedding) to phase transitions in statistical mechanics. The classification is a finite list because $r \leq 4$ bounds the codimension of the universal unfolding.
Laplace's method. Let $f$ be $C^2$ on a compact $U$ with a unique minimum at $a$ in the interior, $\nabla f(a) = 0$, and $H_f(a)$ positive definite. Then as $N \to \infty$, $$ \int_U e^{-N f(x)} \, dx \sim e^{-N f(a)} \left( \frac{2\pi}{N} \right)^{n/2} \frac{1}{\sqrt{\det H_f(a)}}. $$ The asymptotic expansion is to leading order; higher-order corrections come from higher Taylor coefficients of $f$ at $a$. The proof substitutes the Taylor expansion of $f$ at $a$, completes the square in the quadratic term, and identifies the leading Gaussian integral. Laplace's method is the deterministic ancestor of the saddle-point method in complex analysis (06.* future units), the partition-function asymptotic in statistical mechanics (08.* future units), and the WKB approximation in quantum mechanics. Originator: Laplace 1774 *Mémoire sur la probabilité des causes par les évènemens*; modern treatment in Bleistein & Handelsman *Asymptotic Expansions of Integrals* (1975).
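The leading-order formula can be sanity-checked numerically in one variable. The sketch below is an illustration under stated assumptions: the test function $f(x) = x^2 + x^4$ on $[-1, 1]$ (minimum at $0$, $f(0) = 0$, $f''(0) = 2$) and the helper names `simpson` and `laplace_ratio` are hypothetical choices, and the integral is approximated by composite Simpson quadrature.

```python
import math


def simpson(g, lo, hi, n=20000):
    """Composite Simpson rule on [lo, hi] with n subintervals (n must be even)."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3.0


def laplace_ratio(N):
    """Ratio of the numerically integrated value to the Laplace leading term.

    Uses the assumed test function f(x) = x^2 + x^4 on [-1, 1], for which
    n = 1, f(a) = 0 and det H_f(a) = f''(0) = 2.
    """
    f = lambda x: x * x + x ** 4
    numeric = simpson(lambda x: math.exp(-N * f(x)), -1.0, 1.0)
    leading = math.sqrt(2.0 * math.pi / N) / math.sqrt(2.0)
    return numeric / leading


if __name__ == "__main__":
    for N in (50, 400):
        print(N, laplace_ratio(N))
```

The ratio approaches $1$ as $N$ grows; the residual gap at finite $N$ reflects the quartic Taylor coefficient, which is exactly the kind of higher-order correction the statement mentions.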
Higher-order tests via the multi-jet. When the Hessian is degenerate, the $k$-jet of $f$ at $a$ (the equivalence class of $f$ under $f \sim g$ iff $D^\alpha f(a) = D^\alpha g(a)$ for $|\alpha| \leq k$) controls the local-shape classification at higher order. The space of $k$-jets at $a$, $J^k_a$, is a finite-dimensional vector space of dimension $\binom{n+k}{k}$, and the question "is $a$ a local minimum?" depends only on the $k$-jet for the smallest $k$ that breaks the tie. For positive semidefinite Hessian with a single zero eigenvalue, the answer at order three or four typically suffices: the third-jet decides if it has a nonzero component in the kernel direction (saddle-like behaviour at order three is the same as a fold catastrophe), and the fourth-jet decides if the third vanishes (cusp catastrophe at order four). The full catastrophe classification systematises this.
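The dimension of the jet space is the number of multi-indices $\alpha$ in $n$ variables with $|\alpha| \leq k$, which by a standard stars-and-bars count is $\binom{n+k}{k}$. The sketch below enumerates them directly as a check; the function name `multi_indices` is a hypothetical helper.

```python
from itertools import product
from math import comb


def multi_indices(n, k):
    """List all multi-indices alpha in n variables with |alpha| <= k."""
    return [alpha for alpha in product(range(k + 1), repeat=n)
            if sum(alpha) <= k]


if __name__ == "__main__":
    for n, k in [(2, 2), (2, 4), (3, 4)]:
        # Enumerated count next to the closed-form binomial coefficient.
        print(n, k, len(multi_indices(n, k)), comb(n + k, k))
```

For $n = 2$ and $k = 2$ this lists the six coefficients of the order-2 Taylor polynomial: one constant, two first-order, and three second-order.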
Synthesis. Five observations organise the unit. First, the second-derivative test rests on Taylor's theorem and the spectral theorem for symmetric matrices: the quadratic form $h \mapsto h^T H h$ attains its extreme values on the unit sphere $S^{n-1}$ at eigenvectors of $H$, so the eigenvalue signature controls the sign across all directions simultaneously. Second, the proof reduces to two ingredients: the Rayleigh-quotient lower bound $h^T H h \geq \lambda_{\min} |h|^2$ for positive definite $H$, and the continuity of the Hessian, which lets the small-$h$ behaviour of $H_f(c(h))$ be controlled by $H_f(a)$. Third, the test extends to constrained extrema through the implicit function theorem 02.05.04: a constrained local extremum of $f$ on $\{g = 0\}$ is a critical point of the Lagrangian, and the second-derivative test on the bordered Hessian gives the constrained classification. Fourth, the test generalises to Morse theory on smooth manifolds: a non-degenerate critical point has a Morse index equal to the number of negative Hessian eigenvalues, and the index controls the cellular topology of sublevel sets. The signature is a topological invariant, not merely a local-shape invariant. Fifth, the test fails on degenerate critical points but is rescued by higher-order Taylor jets and catastrophe theory: Thom's classification gives a finite list of normal forms for generic degenerate critical points in low-codimension families, and the eigenvalue-counting machine extends to a jet-counting machine on the higher Taylor terms. The Hessian eigenvalue structure is the foundational mechanism that gives critical points their topologically distinct flavours.
Full proof set [Master]
Second-derivative test (cases 1–3). Proved in §"Key theorem with proof" above by the Taylor expansion with Lagrange remainder, the Rayleigh-quotient bound on $h^T H h$, and the continuity of $H_f$ near $a$.
Single-variable second-derivative test. Proved as Exercise 5 by specialising the multi-variable proof to $n = 1$.
Integral remainder form of Taylor's theorem. Proved as Exercise 7 by composing $f$ with the line $t \mapsto a + th$ and applying the single-variable integral form to $g(t) = f(a + th)$, with the multinomial theorem identifying $g^{(k)}(t)$ as a sum over multi-indices.
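Morse lemma. Statement above. Proof sketch. At a non-degenerate critical point $p$ of $f$ on $M$, choose local coordinates $x$ centred at $p$ in which $\tfrac{1}{2} H_f(p)$ is the diagonal matrix $D = \operatorname{diag}(-1, \ldots, -1, +1, \ldots, +1)$ with $\lambda$ negative entries. Write $f(x) = f(p) + x^T A(x) x$ for a smooth symmetric matrix-valued function $A$ with $A(0) = D$, by the integral form of Taylor's theorem. Apply a smoothly varying linear change of coordinates $y = Q(x) x$ where $Q(x)$ is a smooth invertible matrix with $Q(0) = I$ chosen so that $Q(x)^T D Q(x) = A(x)$; such $Q$ exists by a parametric application of the inverse function theorem 02.05.04 to the smooth map $Q \mapsto Q^T D Q$ in a neighbourhood of $Q = I$. In the $y$-coordinates, $f = f(p) + y^T D y = f(p) - y_1^2 - \cdots - y_\lambda^2 + y_{\lambda+1}^2 + \cdots + y_n^2$.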
Lagrange multipliers (regular case). Statement above. The proof is Exercise 7 of 02.05.04 for the single-constraint case, generalised by the rank-$m$ form of the implicit function theorem applied to the constraint map $g = (g_1, \ldots, g_m) : U \to \mathbb{R}^m$. Constraint qualification (linear independence of the constraint gradients) is the surjectivity hypothesis that $Dg(a)$ has rank $m$; the implicit function theorem produces a local parametrisation of the constraint surface $\{g = 0\}$, and the unconstrained second-derivative test on the parametrisation gives the constrained second-derivative test.
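Laplace's method. Statement above. Proof sketch. Translate so $a = 0$ and $f(a) = 0$. By Taylor, $f(x) = \tfrac{1}{2} x^T H x + R(x)$ with $H = H_f(0)$ positive definite and $R(x) = o(|x|^2)$. Split the integral into a neighbourhood $B_\delta$ of $0$ and its complement. On $U \setminus B_\delta$, $f$ is bounded below by a positive constant $m$ on compact $U$, so $\int_{U \setminus B_\delta} e^{-N f(x)} \, dx \leq \operatorname{vol}(U) \, e^{-N m}$ is exponentially small. On $B_\delta$, Taylor's theorem and the Rayleigh bound give $f(x) \geq \tfrac{1}{4} \lambda_{\min} |x|^2$ for $\delta$ small, and the rescaling $y = \sqrt{N} x$ converts the integral into a Gaussian integral with $H^{-1}$ as covariance matrix: $\int_{\mathbb{R}^n} e^{-\frac{1}{2} y^T H y} \, dy = (2\pi)^{n/2} / \sqrt{\det H}$. Subleading terms vanish as $N \to \infty$.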
Catastrophe classification. Statement above. Proof outline. The Thom-Mather classification proceeds in three stages. Stage 1: classify singularities of smooth functions $f : \mathbb{R}^n \to \mathbb{R}$ up to $\mathcal{A}$-equivalence (compose with smooth diffeomorphisms on source and target). For $n$-dimensional critical points of codimension $r$, the universal unfolding has $r$ moduli parameters; finite codimension reduces to a finite list. Stage 2: identify the universal unfoldings by Mather's preparation theorem (the smooth-function analogue of the Weierstrass preparation theorem) and the Malgrange division theorem. Stage 3: realise each algebraic normal form as a polynomial family of explicit form. The seven elementary catastrophes for $r \leq 4$ are the result. Full proofs in Mather *Stability of mappings* (1968–1971) and Thom 1972 *Stabilité structurelle et morphogénèse* [Thom 1972].
Connections [Master]
Implicit and inverse function theorems 02.05.04 — the Lagrange multiplier rule descends from the implicit function theorem applied to the constraint map. The bordered Hessian classification of constrained extrema combines the second-derivative test of this unit with the implicit-function parametrisation. The Morse lemma's smooth coordinate change is itself a parametric application of the inverse function theorem to a quadratic-form normalisation, and the constant-rank theorem's local-normal-form construction shares the same proof structure.
Chain rule for multi-variable functions 02.05.03: the multi-variable Taylor expansion is built by iterating the chain rule on $g(t) = f(a + th)$, with the multinomial theorem identifying $g^{(k)}(t)$ as a sum over multi-indices. The bordered Hessian in the Lagrange-multiplier setting is computed by chain rule applied to $f \circ \psi$ for the implicit-function parametrisation $\psi$ of the constraint surface.
Smooth manifold 03.02.01: Morse functions live on smooth manifolds and the Hessian matrix at a critical point depends on a choice of local chart, but the Morse index is chart-independent because change of coordinates transforms the Hessian by congruence, which preserves the signature. The Morse-lemma local normal form is the manifold-level statement that a Morse function near a critical point looks like the canonical quadratic form $-y_1^2 - \cdots - y_\lambda^2 + y_{\lambda+1}^2 + \cdots + y_n^2$ up to an additive constant.
Riemannian metric and curvature [03.05.*] — sectional curvature of a Riemannian manifold is the second derivative of the metric structure along a geodesic, computed via a Jacobi-field equation whose solution structure is governed by the Hessian of a distance function. Bochner's formulas in Riemannian geometry are integral identities controlling the Hessian of a function against the Ricci curvature, used in Lichnerowicz-type bounds and the proof of positive scalar curvature constraints.
Morse theory and topology of manifolds [03.12.*, pending unit]: the Morse-theoretic cellular decomposition assembles the manifold from cells whose attaching maps are determined by critical-point indices of a Morse function. Smale's h-cobordism theorem (1961) and Milnor's exotic sphere construction (1956) both rest on Morse theory. Floer homology (Andreas Floer, 1988) extends the construction to infinite-dimensional gradient flows on loop spaces of symplectic manifolds.
Symplectic geometry and Hamiltonian dynamics [05.*]: the symplectic structure on phase space pairs with the Hessian of a Hamiltonian to produce the linearised flow at a fixed point, with eigenvalue signature controlling stability (KAM persistence vs hyperbolic instability). The second-derivative test is the bridge between local critical-point geometry and global Hamiltonian dynamics.
Asymptotic analysis and stationary phase [06.*, pending unit]: Laplace's method and the saddle-point method in complex analysis use the Hessian-determinant prefactor to extract the leading Gaussian-integral contribution from oscillatory integrals $\int e^{i N f(x)} \, dx$ and exponential integrals $\int e^{-N f(x)} \, dx$. The WKB approximation in quantum mechanics is the same mechanism applied to wave-function asymptotics. Edge-of-the-wedge theorems and Borel resummation continue the asymptotic-analysis thread into complex analysis.
Statistical mechanics and the partition function [08.*, pending unit]: partition-function expansions at low temperature ($\beta \to \infty$) are governed by Laplace's method applied to the Hamiltonian $H$ in the Boltzmann weight $e^{-\beta H}$: the leading contribution comes from the ground state with Hessian-determinant fluctuation prefactor. Phase transitions are bifurcations of critical points of an effective potential, classified by catastrophe theory in the mean-field regime.
Historical & philosophical context [Master]
Brook Taylor's 1715 *Methodus incrementorum directa et inversa* [Taylor 1715] introduced the single-variable Taylor formula in the finite-difference form characteristic of early eighteenth-century calculus. Joseph-Louis Lagrange in his 1797 *Théorie des fonctions analytiques* [Lagrange 1797] gave the remainder form now bearing his name, with the explicit intermediate-point formula $R_k = \frac{f^{(k+1)}(\xi)}{(k+1)!} (x - a)^{k+1}$ for some $\xi$ between $a$ and $x$. Cauchy in his 1821 *Cours d'analyse* and 1823 *Résumé des leçons* gave the first rigorous $\varepsilon$-$\delta$ proof. The multi-variable version emerged through nineteenth-century pedagogical practice, with the multi-index notation crystallising in the early twentieth century. Apostol's 1969 *Calculus* Vol. 2 Ch. 9 [Apostol Ch. 9] gave the canonical undergraduate presentation, with the Hessian-based second-derivative test in the form taught today.
Otto Hesse introduced the determinant of the matrix of second partial derivatives in his 1857 paper *Über die Determinanten und ihre Anwendung in der Geometrie* [Hesse 1857] in the Journal für die reine und angewandte Mathematik, in the context of algebraic geometry: the singular points of a projective hypersurface are precisely the zeros of what is now called the Hessian determinant. The matrix itself acquired Hesse's name through James Joseph Sylvester's later usage. Marston Morse in his 1925 Transactions paper [Morse 1925] introduced what is now called Morse theory, with the index of a non-degenerate critical point as the central invariant; the Morse-theoretic decomposition of manifolds powered Smale's 1961 proof of the high-dimensional Poincaré conjecture and Milnor's 1956 construction of exotic seven-spheres. René Thom in his 1972 monograph *Stabilité structurelle et morphogénèse* [Thom 1972] introduced catastrophe theory, classifying generic degenerate critical points of smooth functions in families of up to four control parameters into seven elementary types (fold, cusp, swallowtail, butterfly, hyperbolic umbilic, elliptic umbilic, parabolic umbilic); the rigorous proofs were given by John Mather (1968–1971) using his preparation theorem for smooth functions, and the theory found applications in optics, fluid mechanics, and mathematical biology.