02.05.05 · analysis / multivariable-differentiation

Taylor's theorem and extrema in several variables

shipped3 tiersLean: partial

Anchor (Master): Apostol *Calculus* Vol. 2 Ch. 9 (originator pedagogical presentation); Taylor 1715 *Methodus incrementorum directa et inversa* (originator of the single-variable formula); Lagrange 1797 *Théorie des fonctions analytiques* (originator of the remainder form); Hesse 1857 *Über die Determinanten und ihre Anwendung in der Geometrie* (J. reine angew. Math. 53) — originator of the Hessian determinant; Morse 1925 *Relations between the critical points of a real function of n independent variables* (Transactions AMS 27) — originator of Morse theory; Thom 1972 *Stabilité structurelle et morphogénèse* (originator of catastrophe theory)

Intuition [Beginner]

A smooth function near a chosen point looks like a polynomial. The first piece of that polynomial is the constant value — the function's height at the point. The second piece is linear — the tangent plane, which captures how the function rises and falls along each coordinate direction. The third piece is quadratic — it captures how the function curves. Together, these three pieces are the Taylor expansion of the function at the chosen point.

At a point where the tangent plane is flat — meaning every directional rate of change is zero — the function has a critical point. The local behaviour is then controlled by the quadratic piece. That quadratic piece is encoded in a square symmetric table of second-rate-of-change values called the Hessian matrix. The table has one row per coordinate direction and one column per coordinate direction, and the entry in row $i$ , column $j$ records how the rate of change in direction $i$ shifts as you move slightly in direction $j$ .

The second-derivative test reads the curvature off this table. If the function curves upward in every direction at the critical point — every test direction shows the function bending up — the critical point is a local minimum, a bowl. If the function curves downward in every direction, it is a local maximum, an upside-down bowl. If some directions curve up and others curve down, the critical point is a saddle — like the middle of a horse's saddle, low along one direction and high along another. The Hessian's eigenvalues are the curvature numbers along the principal directions; their signs decide which case the critical point falls into.

Visual [Beginner]

A three-panel diagram. The leftmost panel shows the surface $z = x^{2} + y^{2}$ — a bowl opening upward — with a small dot at the origin and a flat plane tangent to the surface there. A short caption reads "all positive eigenvalues: local minimum". The middle panel shows $z = - x^{2} - y^{2}$ , a bowl opening downward, with the same flat tangent plane at the origin and a caption "all negative eigenvalues: local maximum". The rightmost panel shows $z = x^{2} - y^{2}$ , a saddle surface, with the tangent plane crossing the surface at the origin along two diagonal lines and a caption "mixed signs: saddle point".

The visual signature: the Hessian's eigenvalue signs read off the shape of the surface at the critical point. Three pictures, three sign patterns.

Worked example [Beginner]

Take $f (x, y) = x^{2} + y^{2}$ . The rate of change with respect to $x$ is $2 x$ , and with respect to $y$ is $2 y$ . Both rates are zero only at $(0, 0)$ , so the single critical point is the origin.

The Hessian matrix at the origin is the $2 \times 2$ matrix whose four entries are the second rates of change: the entry in row $i$ , column $j$ is the rate of change with respect to coordinate $j$ of the rate of change with respect to coordinate $i$ . For this $f$ , the four entries are $2, 0, 0, 2$ , giving the diagonal matrix with diagonal entries $2$ and $2$ . The eigenvalues are read off the diagonal as $2$ and $2$ , both positive. The second-derivative test says: local minimum at the origin. Direct check: $f (x, y) \geq 0$ with equality only at the origin, so the origin is in fact a global minimum.

Now $g (x, y) = x^{2} - y^{2}$ . The rates of change are $2 x$ and $- 2 y$ . Both vanish at $(0, 0)$ . The Hessian's four entries are $2, 0, 0, - 2$ , giving the diagonal matrix with diagonal entries $2$ and $- 2$ . The eigenvalues are $2$ and $- 2$ , one positive and one negative. The second-derivative test says: saddle point. Direct check: along the $x$ -axis $g (x, 0) = x^{2}$ rises away from the origin; along the $y$ -axis $g (0, y) = - y^{2}$ falls away. The two directions disagree on which way is "up", and that disagreement is precisely a saddle.

What this tells us: at a critical point, the Hessian eigenvalue signs decide the local behaviour. All positive means a local min; all negative means a local max; mixed signs means a saddle.

Check your understanding [Beginner]

Formal definition [Intermediate+]

Throughout this section $U \subseteq R^{n}$ denotes an open set. A map $f : U \to R$ is $C^{k}$ for $k \geq 1$ when all partial derivatives of $f$ of orders $\leq k$ exist and are continuous on $U$ .

Multi-index notation. A multi-index is a tuple $α = (α_{1}, \dots, α_{n}) \in N^{n}$ of non-negative integers. Set $∣ α ∣ = α_{1} + \dots + α_{n}$ and $α! = α_{1}! \dots α_{n}!$ . For $h \in R^{n}$ write $h^{α} = h_{1}^{α_{1}} \dots h_{n}^{α_{n}}$ . The partial-derivative operator of multi-degree $α$ is $$ D^\alpha f = \frac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}. $$ Equality of mixed partials (Clairaut-Schwarz for $C^{∣ α ∣}$ functions) means $D^{α} f$ depends only on the multi-index $α$ , not on the order in which the partials are taken.

Taylor's theorem (multi-variable, Lagrange form). Let $f : U \to R$ be $C^{k + 1}$ , let $a \in U$ , and let $h \in R^{n}$ be such that the line segment ${a + t h : t \in [0, 1]}$ lies inside $U$ . Then there exists $c$ on that segment with $$ f(a + h) = \sum_{|\alpha| \leq k} \frac{D^\alpha f(a)}{\alpha!} h^\alpha + \sum_{|\alpha| = k+1} \frac{D^\alpha f(c)}{\alpha!} h^\alpha. $$ The first sum is the Taylor polynomial of $f$ at $a$ of order $k$ ; the second sum is the remainder in Lagrange form.

Following Apostol ^{[Apostol Ch. 9 §9.4–9.7]}. For convex $U$ , the segment hypothesis is automatic for $a + h \in U$ .

Hessian matrix. The Hessian of a $C^{2}$ function $f : U \to R$ at $a \in U$ is the $n \times n$ matrix $$ H_f(a) = \left( \frac{\partial^2 f}{\partial x_i \partial x_j}(a) \right)_{i, j = 1}^n. $$ By Clairaut-Schwarz, $H_{f} (a)$ is symmetric. The Hessian is the matrix representing the second-order term in the Taylor expansion: for $f$ of class $C^{2}$ on a convex open $U$ and $a + h \in U$ , $$ f(a + h) = f(a) + \nabla f(a) \cdot h + \tfrac{1}{2} h^T H_f(a) h + o(|h|^2), $$ where $\nabla f (a) = (\partial f / \partial x_{1} (a), \dots, \partial f / \partial x_{n} (a))$ is the gradient row and $h^{T} H_{f} (a) h$ is the quadratic form on $R^{n}$ associated to $H_{f} (a)$ .

Critical point. A point $a \in U$ is a critical point of a $C^{1}$ function $f : U \to R$ when $\nabla f (a) = 0$ . A critical point $a$ is a local minimum when $f (a) \leq f (x)$ for all $x$ in some neighbourhood of $a$ , a local maximum when $f (a) \geq f (x)$ similarly, and a saddle when $a$ is neither: every neighbourhood of $a$ contains points $x$ with $f (x) > f (a)$ and points $x$ with $f (x) < f (a)$ .

Definiteness classes for a symmetric matrix. A symmetric matrix $H \in R^{n \times n}$ is positive definite when $h^{T} H h > 0$ for all $h \neq = 0$ ; equivalently, all eigenvalues of $H$ are strictly positive. Negative definite is the same with the inequality reversed; equivalently, all eigenvalues strictly negative. Positive semidefinite allows $h^{T} H h \geq 0$ with possible equality on a nonzero subspace; some eigenvalues may vanish. Indefinite when both positive and negative eigenvalues are present; equivalently, $h^{T} H h$ takes both positive and negative values.

Counterexamples to common slips

The second-derivative test is silent at degenerate critical points. A zero eigenvalue of the Hessian leaves the test inconclusive. The classical witnesses on $R^{2}$ : $f (x, y) = x^{4} + y^{4}$ has Hessian zero at the origin (both eigenvalues zero) but is a strict local minimum; $g (x, y) = x^{4} - y^{4}$ has Hessian zero at the origin but is a saddle; $h (x, y) = x^{2} + y^{4}$ has one zero eigenvalue at the origin but is a strict local minimum. Higher-order Taylor terms decide.
The gradient must actually vanish. A test applied at a non-critical point reports nothing about extrema. At a non-critical point the function strictly increases along the gradient direction and strictly decreases along its negative; no extremum is possible.
Symmetry of the Hessian assumes $C^{2}$ . Equality of mixed partials requires continuity of the second partials. A function with discontinuous second partials can have asymmetric mixed-partial values; the classical $f (x, y) = x y (x^{2} - y^{2}) / (x^{2} + y^{2})$ extended by $f (0, 0) = 0$ has $\partial^{2} f / \partial x \partial y \neq = \partial^{2} f / \partial y \partial x$ at the origin because $f$ is not $C^{2}$ there.
Eigenvalue signs, not eigenvalue magnitudes, decide. An eigenvalue of $1 0^{- 12}$ is positive; the test treats it the same as an eigenvalue of $1 0^{12}$ . The conclusion depends on signs, not on quantitative magnitudes.
Definiteness is a closed-cone condition. Positive definiteness is preserved under small symmetric perturbations (the eigenvalues vary continuously and stay positive), so a strict local minimum at one critical point persists under small smooth perturbations of $f$ — a stability statement at the heart of Morse theory.

Key theorem with proof [Intermediate+]

Theorem (second-derivative test for extrema). Let $U \subseteq R^{n}$ be open, $f : U \to R$ a $C^{2}$ function, and $a \in U$ a critical point with $\nabla f (a) = 0$ . Let $H = H_{f} (a)$ be the Hessian at $a$ .

If $H$ is positive definite, then $a$ is a strict local minimum of $f$ .
If $H$ is negative definite, then $a$ is a strict local maximum of $f$ .
If $H$ is indefinite (has both a positive and a negative eigenvalue), then $a$ is a saddle.
If $H$ is positive semidefinite or negative semidefinite with at least one zero eigenvalue, the test is inconclusive.

Following Apostol ^{[Apostol Ch. 9 §9.8–9.9]}.

Proof. Cases (1), (2), and (3) admit a direct argument; case (4) is the negation, witnessed by the examples in the counterexamples list above.

Set $φ (h) = f (a + h) - f (a)$ for $h$ small enough that $a + h \in U$ . Apply Taylor's theorem to $f$ at $a$ with $k = 1$ : there exists $c (h)$ on the segment from $a$ to $a + h$ with $$ f(a + h) = f(a) + \nabla f(a) \cdot h + \tfrac{1}{2} \sum_{|\alpha| = 2} \frac{2}{\alpha!} D^\alpha f(c(h)) h^\alpha = f(a) + 0 + \tfrac{1}{2} h^T H_f(c(h)) h, $$ where the gradient term vanishes by the critical-point hypothesis and the second sum equals $\frac{1}{2} h^{T} H_{f} (c (h)) h$ since, in multi-index notation, the multi-indices $α$ with $∣ α ∣ = 2$ enumerate the entries of the Hessian with the coefficient $2/ α!$ matching the row/column duplication in the quadratic form. Hence $$ \varphi(h) = \tfrac{1}{2} h^T H_f(c(h)) h. $$

Write $H = H_{f} (a)$ and $H (t) = H_{f} (c (h))$ at the point $c (h)$ on the segment, with $t = ∥ h ∥$ small. By continuity of the second partials, $H (t) \to H$ entrywise as $t \to 0$ , so $∥ H (t) - H ∥_{op} \to 0$ .

Case (1): $H$ positive definite. Let $λ_{m i n} > 0$ be the smallest eigenvalue of $H$ . The Rayleigh-quotient bound gives $h^{T} H h \geq λ_{m i n} ∥ h ∥^{2}$ for all $h \in R^{n}$ . By continuity of $H_{f}$ , choose $δ > 0$ so $∥ h ∥ < δ$ implies $∥ H (t) - H ∥_{op} < λ_{m i n} /2$ . Then $$ h^T H(t) h = h^T H h + h^T (H(t) - H) h \geq \lambda_{\min} |h|^2 - \tfrac{1}{2} \lambda_{\min} |h|^2 = \tfrac{1}{2} \lambda_{\min} |h|^2. $$ For $0 < ∥ h ∥ < δ$ , $φ (h) \geq \frac{1}{4} λ_{m i n} ∥ h ∥^{2} > 0$ . So $f (a + h) > f (a)$ for $h$ in a punctured ball about $0$ : $a$ is a strict local minimum.

Case (2): $H$ negative definite. Apply case (1) to $- f$ , whose Hessian at $a$ is $- H$ , positive definite. The conclusion that $- f$ has a strict local minimum at $a$ translates to $f$ having a strict local maximum at $a$ .

Case (3): $H$ indefinite. Let $v$ be a unit eigenvector of $H$ with eigenvalue $λ_{+} > 0$ and $w$ a unit eigenvector with eigenvalue $λ_{-} < 0$ . Restrict $f$ along the line $a + t v$ : along this line $φ (t v) = \frac{1}{2} t^{2} v^{T} H (t v) v$ . By continuity, $v^{T} H (s) v \to v^{T} H v = λ_{+} > 0$ as $s \to 0$ , so for small $∣ t ∣$ , $v^{T} H (t v) v \geq λ_{+} /2 > 0$ , hence $φ (t v) > 0$ . Similarly along the line $a + tw$ , $φ (tw) < 0$ for small $∣ t ∣ > 0$ . Every neighbourhood of $a$ contains points $a + t v$ with $f > f (a)$ and points $a + tw$ with $f < f (a)$ , so $a$ is a saddle.

Case (4): $H$ semidefinite with zero eigenvalue. The three witnesses in the counterexamples list show that all three behaviours (local min, local max, saddle) can occur with the same Hessian signature, so the second-derivative test alone cannot decide. Higher-order Taylor terms — or direct inspection — must resolve. $□$

Bridge. Five threads run from the second-derivative test into the rest of the curriculum, and each one is a refinement of the same eigenvalue-counting machine. First, the test is local: the Hessian eigenvalue signature at one point decides the local-shape question without any global information about $f$ . The proof shows that strict definiteness is what powers the conclusion — the quadratic form dominates the Taylor remainder. Second, the test connects to Morse theory: a Morse function is one whose critical points are all non-degenerate (Hessian invertible). The number of negative eigenvalues at a critical point is its Morse index, and the topology of the sublevel set $f^{- 1} ((- \infty, c])$ changes by attaching a cell of dimension equal to the index as $c$ passes a critical value. The eigenvalue signature is the topological invariant.

Third, the test connects to Lagrange multipliers via the implicit function theorem 02.05.04: constrained extrema of $f$ on ${g = 0}$ are critical points of the Lagrangian $L (x, λ) = f (x) - λ g (x)$ , and the bordered-Hessian version of the second-derivative test classifies them. Fourth, the test connects to asymptotic analysis through Laplace's method: integrals of the form $\int e^{- N f (x)} d x$ concentrate as $N \to \infty$ around the minima of $f$ , with Hessian-determinant prefactors $(2 π)^{n} / N^{n} det H_{f} (a^{*})$ governing the Gaussian-integral leading term. Fifth, the test connects to catastrophe theory through Thom's classification of generic degenerate critical points: when the Hessian is degenerate, the fourth-order or higher Taylor terms still classify the critical point up to local diffeomorphism into a finite list of catastrophe normal forms — fold, cusp, swallowtail, butterfly, and so on. Together, these threads identify the Hessian eigenvalue structure as the foundational mechanism for all local-shape classification in differential calculus and its downstream applications.

Exercises [Intermediate+]

Exercise 4 (medium, numeric).

For $f (x, y, z) = x^{2} + 2 y^{2} + 3 z^{2} - 2 x y - 2 y z$ , compute the smallest eigenvalue of the Hessian at the origin. (The function is a quadratic form, so the Hessian is constant and the answer determines the global behaviour at the origin.)

Hint

The Hessian of a quadratic form $\frac{1}{2} x^{T} A x$ is the symmetric matrix $A$ . Read off the entries of $A$ and compute its smallest eigenvalue.

Answer

The Hessian is the matrix of second partials: $\partial^{2} f / \partial x^{2} = 2$ , $\partial^{2} f / \partial y^{2} = 4$ , $\partial^{2} f / \partial z^{2} = 6$ , $\partial^{2} f / \partial x \partial y = - 2$ , $\partial^{2} f / \partial y \partial z = - 2$ , $\partial^{2} f / \partial x \partial z = 0$ . So $$ H = \begin{pmatrix} 2 & -2 & 0 \ -2 & 4 & -2 \ 0 & -2 & 6 \end{pmatrix}. $$ The characteristic polynomial $det (H - λ I)$ expands as $(2 - λ) [(4 - λ) (6 - λ) - 4] - (- 2) [- 2 (6 - λ) - 0] = (2 - λ) (λ^{2} - 10 λ + 20) - 4 (6 - λ)$ . Expand: $(2 - λ) (λ^{2} - 10 λ + 20) = 2 λ^{2} - 20 λ + 40 - λ^{3} + 10 λ^{2} - 20 λ = - λ^{3} + 12 λ^{2} - 40 λ + 40$ . Subtract $4 (6 - λ) = 24 - 4 λ$ : characteristic polynomial $- λ^{3} + 12 λ^{2} - 36 λ + 16$ . Roots numerically: $λ \approx 0.539, 3.305, 8.156$ . Smallest eigenvalue is approximately $0.539$ , positive. So $H$ is positive definite and the origin is a strict local (in fact global) minimum. Rubric: full credit for the Hessian, characteristic polynomial, and the numerical smallest eigenvalue (within $\pm 0.01$ ).

Exercise 5 (medium, short-answer).

State and prove the single-variable case of the second-derivative test: if $f : R \to R$ is $C^{2}$ , $f^{'} (a) = 0$ , and $f^{''} (a) > 0$ , then $a$ is a strict local minimum of $f$ .

Hint

Apply the proof of the multi-variable test in dimension $n = 1$ where the Hessian is the single number $f^{''} (a)$ .

Answer

In dimension $n = 1$ , the Hessian at $a$ is the $1 \times 1$ matrix whose single entry is $f^{''} (a)$ . Positive definiteness of a $1 \times 1$ matrix is the same as its single entry being positive. The case (1) argument from the multi-variable proof specialises: Taylor expansion gives $f (a + h) - f (a) = \frac{1}{2} f^{''} (c (h)) h^{2}$ for some $c (h)$ between $a$ and $a + h$ . By continuity of $f^{''}$ at $a$ , choose $δ > 0$ so $∣ h ∣ < δ$ implies $∣ f^{''} (c (h)) - f^{''} (a) ∣ < f^{''} (a) /2$ , hence $f^{''} (c (h)) > f^{''} (a) /2 > 0$ . Then $f (a + h) - f (a) \geq \frac{1}{4} f^{''} (a) h^{2} > 0$ for $0 < ∣ h ∣ < δ$ , proving $a$ is a strict local minimum. Rubric: full credit for the Taylor expansion, the continuity step, and the strict-inequality conclusion.

Exercise 6 (hard, short-answer).

Derive the Lagrange multiplier rule with second-derivative classification: let $f, g : R^{n} \to R$ be $C^{2}$ with $\nabla g (a) \neq = 0$ and $g (a) = 0$ . Suppose $\nabla f (a) = λ \nabla g (a)$ for some $λ \in R$ . Sketch how the bordered Hessian — the Hessian of the Lagrangian $L (x, λ) = f (x) - λ g (x)$ restricted to the tangent space ${v : \nabla g (a) \cdot v = 0}$ — classifies $a$ as a constrained local minimum, maximum, or saddle of $f$ subject to $g = 0$ .

Hint

Apply the implicit function theorem 02.05.04 to $g$ near $a$ to parametrise the constraint surface locally. The restricted function $F (t) = f (γ (t))$ on a chart $γ$ then has $0$ as a critical point, and the second-derivative test in the parametrised variables uses the bordered Hessian.

Answer

Since $\nabla g (a) \neq = 0$ , the implicit function theorem 02.05.04 applies to $g$ at $a$ . Pick coordinates so $\partial g / \partial x_{n} (a) \neq = 0$ ; solve $g (x^{'}, x_{n}) = 0$ locally for $x_{n} = h (x^{'})$ with $h$ a $C^{2}$ function of $x^{'} = (x_{1}, \dots, x_{n - 1})$ . The constraint surface ${g = 0}$ is locally the graph $x_{n} = h (x^{'})$ , and the restricted function $F (x^{'}) = f (x^{'}, h (x^{'}))$ has gradient $\nabla F (a^{'}) = (\nabla f - λ \nabla g) ∣_{a}$ projected onto the first $n - 1$ coordinates by the chain rule 02.05.03 — zero by the Lagrange-multiplier hypothesis. The Hessian of $F$ at $a^{'}$ computes to the restriction of the Hessian of $L = f - λ g$ to the tangent space ${v : \nabla g (a) \cdot v = 0}$ , by differentiating the chain-rule expression once more. The second-derivative test from this unit then applies to $F$ : positive definiteness of the restricted Hessian gives a strict constrained local minimum, negative definiteness a strict constrained local maximum, and indefiniteness a constrained saddle. The "bordered Hessian" of textbook tradition is the matrix encoding the same restricted quadratic form via the constraint gradient as a Lagrange multiplier border. Rubric: full credit for the implicit-function parametrisation, the chain-rule derivation of the restricted Hessian, and the application of the second-derivative test.

Exercise 7 (hard, short-answer).

Prove the integral form of the Taylor remainder: for $f : U \to R$ of class $C^{k + 1}$ on a convex open set $U$ with $a \in U$ and $a + h \in U$ , $$ R_k(a, h) = \sum_{|\alpha| = k+1} \frac{k+1}{\alpha!} h^\alpha \int_0^1 (1 - t)^k D^\alpha f(a + th) , dt. $$ Derive it from the single-variable integral remainder applied to $ϕ (t) = f (a + t h)$ on $[0, 1]$ .

Hint

Compute $ϕ^{(j)} (t)$ via repeated chain rule 02.05.03 and recognise the multinomial expansion of $(h_{1} \partial_{1} + \dots + h_{n} \partial_{n})^{j}$ acting on $f$ .

Answer

Set $ϕ (t) = f (a + t h)$ on $[0, 1]$ . By repeated chain-rule application 02.05.03, $ϕ^{'} (t) = \nabla f (a + t h) \cdot h = \sum_{i} h_{i} \partial_{i} f (a + t h)$ , and inductively $ϕ^{(j)} (t) = \sum_{∣ α ∣ = j} (α j) h^{α} D^{α} f (a + t h)$ where $(α j) = j! / α!$ is the multinomial coefficient (the multinomial theorem applied to the differential operator $h \cdot \nabla$ raised to the $j$ -th power). The single-variable integral form of Taylor's theorem applied to $ϕ$ on $[0, 1]$ reads $$ \phi(1) = \sum_{j = 0}^k \frac{\phi^{(j)}(0)}{j!} + \frac{1}{k!} \int_0^1 (1 - t)^k \phi^{(k+1)}(t) , dt. $$ The left side is $f (a + h)$ . Substituting the multi-index expression for $ϕ^{(j)} (0)$ and $ϕ^{(k + 1)} (t)$ and simplifying $(α j) / j! = 1/ α!$ gives the multi-index Taylor polynomial plus the remainder $$ R_k(a, h) = \frac{1}{k!} \int_0^1 (1 - t)^k \sum_{|\alpha| = k+1} \frac{(k+1)!}{\alpha!} h^\alpha D^\alpha f(a + th) , dt = \sum_{|\alpha| = k+1} \frac{k+1}{\alpha!} h^\alpha \int_0^1 (1 - t)^k D^\alpha f(a + th) , dt. $$ Rubric: full credit for the chain-rule derivation of $ϕ^{(j)}$ , the multinomial coefficient, and the substitution into the single-variable integral form.

Exercise 8 (hard, short-answer).

Show by example that the second-derivative test fails on the boundary of its hypothesis: construct a $C^{2}$ function $f : R^{2} \to R$ with $\nabla f (0, 0) = 0$ and positive semidefinite Hessian (one zero eigenvalue, one positive eigenvalue) for which the origin is not a local minimum.

Hint

Take $f (x, y) = x^{2} - y^{3}$ or $f (x, y) = x^{2} + y^{3}$ .

Answer

Take $f (x, y) = x^{2} - y^{3}$ . Compute $\nabla f (x, y) = (2 x, - 3 y^{2})$ , which vanishes at $(0, 0)$ . The Hessian $H_{f} (0, 0) = (2000)$ has eigenvalues $2$ and $0$ , positive semidefinite. Yet $(0, 0)$ is not a local minimum: along the line $x = 0$ , $f (0, y) = - y^{3}$ , which takes negative values for $y > 0$ arbitrarily close to the origin. So $f$ has values strictly less than $f (0, 0) = 0$ in every neighbourhood of the origin. The semidefinite case is genuinely inconclusive: the same Hessian signature can correspond to a strict local minimum (replace $- y^{3}$ with $y^{4}$ ) or to neither min nor max (as in this example). Higher-order Taylor terms decide. Rubric: full credit for the example, the verification that the gradient vanishes and the Hessian is semidefinite, and the demonstration that the origin is not a local minimum.

Lean formalization [Intermediate+]

lean_status: partial — Mathlib provides the multi-variable Taylor expansion through taylorWithinEval, the iterated Fréchet derivative iteratedFDeriv, and Taylor's theorem in Lagrange and integral remainder forms. The Hessian appears via iteratedFDeriv ℝ 2 f x paired with the bilinear-form interpretation through ContinuousMultilinearMap.toBilinearMap. The second-derivative test for extrema is partially packaged through definiteness results on quadratic forms in LinearAlgebra.QuadraticForm.Basic. The textbook-style packaging in Apostol notation under one named result is the Codex-facing gap.

[object Promise]

The companion module at Codex.Analysis.MultiVariable.TaylorExtrema re-exports these statements and records the unification gap.

Advanced results [Master]

Taylor's theorem, integral remainder form. Let $f : U \to R$ be $C^{k + 1}$ on a convex open $U \subseteq R^{n}$ , $a \in U$ , $a + h \in U$ . Then $$ f(a + h) = \sum_{|\alpha| \leq k} \frac{D^\alpha f(a)}{\alpha!} h^\alpha + \sum_{|\alpha| = k+1} \frac{k+1}{\alpha!} h^\alpha \int_0^1 (1 - t)^k D^\alpha f(a + th) , dt. $$ The integral form gives a remainder that is continuous in $a$ and $h$ , in contrast to the Lagrange form whose remainder depends on a non-explicit intermediate point ^{[Apostol Ch. 9 §9.4–9.7]}. Proof in Exercise 7 by reduction to the single-variable integral form along the segment $t \mapsto f (a + t h)$ . The integral form is the version preferred in modern analysis because of its uniformity properties.

Morse theory and the index of a critical point. A $C^{2}$ function $f : M \to R$ on a smooth manifold is a Morse function when every critical point of $f$ has invertible Hessian. The Morse index of a non-degenerate critical point $a$ is the number of negative eigenvalues of $H_{f} (a)$ . The Morse lemma states: near a non-degenerate critical point $a$ of index $k$ , there exist local coordinates $(y_{1}, \dots, y_{n})$ centred at $a$ in which $f$ takes the canonical form $f = f (a) - y_{1}^{2} - \dots - y_{k}^{2} + y_{k + 1}^{2} + \dots + y_{n}^{2}$ . The topology of the sublevel set $M_{c} = f^{- 1} ((- \infty, c])$ changes by attaching a $k$ -cell as $c$ passes a critical value of index $k$ . Originator: Marston Morse ^{[Morse 1925]}; standard modern reference Milnor Morse Theory (1963). The Morse lemma is a quadratic-form normal-form result whose proof is a slick parametric application of the inverse function theorem 02.05.04 to a smoothly varying square-root of the Hessian; the cellular-attachment statement is the foundation for the proof of the h-cobordism theorem (Smale 1961), Floer homology (Floer 1988), and gradient-flow constructions across geometric analysis.

Lagrange multipliers. Let $f, g_{1}, \dots, g_{m} : R^{n} \to R$ be $C^{1}$ . If $a$ is a local extremum of $f$ on the constraint set ${x : g_{1} (x) = \dots = g_{m} (x) = 0}$ and the constraint gradients $\nabla g_{1} (a), \dots, \nabla g_{m} (a)$ are linearly independent (the constraint qualification), then there exist scalars $λ_{1}, \dots, λ_{m} \in R$ with $\nabla f (a) = \sum_{i} λ_{i} \nabla g_{i} (a)$ . The $λ_{i}$ are the Lagrange multipliers. The Lagrangian $L (x, λ) = f (x) - \sum_{i} λ_{i} g_{i} (x)$ has critical points in $(x, λ)$ -space precisely at constraint-satisfying extrema with multipliers. Originator: Joseph-Louis Lagrange 1788, Méchanique analytique. The proof is a direct corollary of the implicit function theorem 02.05.04 applied to $g$ at $a$ , with the chain rule 02.05.03 producing the gradient identity (Exercise 7 of 02.05.04). The constrained second-derivative test uses the bordered Hessian — the Hessian of $L$ restricted to the tangent space ${v : \nabla g_{i} (a) \cdot v = 0 for all i}$ — and classifies $a$ as a constrained local minimum, maximum, or saddle by the definiteness signature on this subspace (Exercise 6).

Catastrophe theory. Thom's classification theorem states: a generic family $F : R^{n} \times R^{r} \to R$ , smooth and depending on $r \leq 4$ control parameters, has critical points whose local normal forms — after a smooth change of coordinates in $R^{n}$ — are one of seven types: the fold ( $x^{3}$ ), cusp ( $x^{4} + a x^{2}$ ), swallowtail ( $x^{5}$ ), butterfly ( $x^{6}$ ), and three umbilics (hyperbolic, elliptic, parabolic). Originator: René Thom ^{[Thom 1972]}; rigorous proof by John Mather (1968–1971). The classification extends the second-derivative test from non-degenerate critical points (where the Hessian classifies completely) to degenerate critical points in low-codimension families. Each catastrophe is a bifurcation: as the control parameters cross a critical surface, the structure of critical points of $F (\cdot, λ)$ changes qualitatively. Applications run from optics (caustics are fold and cusp catastrophes) to fluid mechanics (vortex shedding) to phase transitions in statistical mechanics. The classification is a finite list because $r \leq 4$ bounds the codimension of the universal unfolding.

Laplace's method. Let $f : U \to R$ be $C^{2}$ on a compact $U \subseteq R^{n}$ with a unique minimum at $a$ in the interior, $\nabla f (a) = 0$ , and $H_{f} (a)$ positive definite. Then as $N \to \infty$ , $$ \int_U e^{-N f(x)} , dx \sim e^{-N f(a)} \left( \frac{2\pi}{N} \right)^{n/2} \frac{1}{\sqrt{\det H_f(a)}}. $$ The asymptotic expansion is to leading order; higher-order corrections come from higher Taylor coefficients of $f$ at $a$ . The proof substitutes the Taylor expansion of $f$ at $a$ , completes the square in the quadratic term, and identifies the leading Gaussian integral. Laplace's method is the deterministic ancestor of the saddle-point method in complex analysis (06.* future units), the partition-function asymptotic in statistical mechanics (08.* future units), and the WKB approximation in quantum mechanics. Originator: Laplace 1774 Mémoire sur la probabilité des causes par les évènemens; modern treatment in Bleistein & Handelsman Asymptotic Expansions of Integrals (1975).

Higher-order tests via the multi-jet. When the Hessian is degenerate, the $k$ -jet of $f$ at $a$ — the equivalence class of $f$ under $f \sim g$ iff $D^{α} f (a) = D^{α} g (a)$ for $∣ α ∣ \leq k$ — controls the local-shape classification at higher order. The space of $k$ -jets at $a$ , $J_{a}^{k} (R^{n}, R)$ , is a finite-dimensional vector space of dimension $(k n + k)$ , and the question "is $a$ a local minimum?" depends only on the $k$ -jet for the smallest $k$ that breaks the tie. For positive semidefinite Hessian with a single zero eigenvalue, the answer at order three or four typically suffices: the third-jet decides if it has a nonzero component in the kernel direction (saddle-like behaviour at order three is the same as a fold catastrophe), and the fourth-jet decides if the third vanishes (cusp catastrophe at order four). The full catastrophe classification systematises this.

Synthesis. Five observations organise the unit. First, the second-derivative test rests on Taylor's theorem and the spectral theorem for symmetric matrices: the quadratic form $h^{T} H h$ achieves its sign on $∥ h ∥ = 1$ at eigenvectors of $H$ , so the eigenvalue signature controls the sign across all directions simultaneously. Second, the proof reduces to two ingredients: the Rayleigh-quotient lower bound $h^{T} H h \geq λ_{m i n} ∥ h ∥^{2}$ for positive definite $H$ , and the continuity of the Hessian, which lets the small- $h$ behaviour of $H_{f} (c (h))$ be controlled by $H_{f} (a)$ . Third, the test extends to constrained extrema through the implicit function theorem 02.05.04: a constrained local extremum of $f$ on ${g = 0}$ is a critical point of the Lagrangian, and the second-derivative test on the bordered Hessian gives the constrained classification. Fourth, the test generalises to Morse theory on smooth manifolds: a non-degenerate critical point has a Morse index equal to the number of negative Hessian eigenvalues, and the index controls the cellular topology of sublevel sets. The signature is a topological invariant, not merely a local-shape invariant. Fifth, the test fails on degenerate critical points but is rescued by higher-order Taylor jets and catastrophe theory: Thom's classification gives a finite list of normal forms for generic degenerate critical points in low-codimension families, and the eigenvalue-counting machine extends to a jet-counting machine on the higher Taylor terms. The Hessian eigenvalue structure is the foundational mechanism that gives critical points their topologically distinct flavours.

Full proof set [Master]

Second-derivative test (cases 1–3). Proved in §"Key theorem with proof" above by the Taylor expansion with Lagrange remainder, the Rayleigh-quotient bound on $h^{T} H h$ , and the continuity of $H_{f}$ near $a$ .

Single-variable second-derivative test. Proved as Exercise 5 by specialising the multi-variable proof to $n = 1$ .

Integral remainder form of Taylor's theorem. Proved as Exercise 7 by composing $f$ with the line $t \mapsto a + t h$ and applying the single-variable integral form to $ϕ (t) = f (a + t h)$ , with the multinomial theorem identifying $ϕ^{(j)}$ as a sum over multi-indices.

Morse lemma. Statement above. Proof sketch. At a non-degenerate critical point $a$ of $f$ on $M$ , choose local coordinates $(x_{1}, \dots, x_{n})$ centred at $a$ in which $H_{f} (a)$ is the diagonal matrix $diag (- 1, \dots, - 1, + 1, \dots, + 1)$ with $k$ negative entries. Write $f (x) - f (a) = \frac{1}{2} x^{T} A (x) x$ for a smooth symmetric matrix-valued function $A$ with $A (0) = H_{f} (a)$ , by the integral form of Taylor's theorem. Apply a smoothly varying linear change of coordinates $y = B (x) x$ where $B (x)$ is a smooth invertible matrix with $B (0) = I$ chosen so that $B (x)^{T} A (x) B (x) = A (0)$ ; such $B$ exists by a parametric application of the inverse function theorem 02.05.04 to the smooth map $B \mapsto B^{T} A B$ in a neighbourhood of $B = I$ . In the $y$ -coordinates, $f (y) - f (a) = \frac{1}{2} y^{T} A (0) y = - y_{1}^{2} - \dots - y_{k}^{2} + y_{k + 1}^{2} + \dots + y_{n}^{2}$ . $□$

Lagrange multipliers (regular case). Statement above. The proof is Exercise 7 of 02.05.04 for the single-constraint case, generalised by the rank- $m$ form of the implicit function theorem applied to the constraint map $g = (g_{1}, \dots, g_{m}) : R^{n} \to R^{m}$ . Constraint qualification (linear independence of the constraint gradients) is the surjectivity hypothesis $D g (a)$ has rank $m$ ; the implicit function theorem produces a local parametrisation of the constraint surface ${g = 0}$ , and the unconstrained second-derivative test on the parametrisation gives the constrained second-derivative test. $□$

Laplace's method. Statement above. Proof sketch. Translate so $a = 0$ and $f (0) = 0$ . By Taylor, $f (x) = \frac{1}{2} x^{T} H x + r (x)$ with $H = H_{f} (0)$ positive definite and $r (x) = o (∥ x ∥^{2})$ . Split the integral into a neighbourhood $B_{δ}$ of $0$ and its complement. On $B_{δ}^{c}$ , $f$ is bounded below by a positive constant on compact $U$ , so $\int_{U ∖ B_{δ}} e^{- N f}$ is exponentially small. On $B_{δ}$ , Taylor's theorem and the Rayleigh bound give $f (x) \geq \frac{1}{4} λ_{m i n} (H) ∥ x ∥^{2}$ for $δ$ small, and the rescaling $x = y / N$ converts the integral into a Gaussian integral with $H$ as covariance matrix: $\int_{B_{δ}} e^{- N f (x)} d x \sim N^{- n /2} \int_{R^{n}} e^{- \frac{1}{2} y^{T} H y} d y = N^{- n /2} (2 π)^{n /2} / det H$ . Subleading terms vanish as $N \to \infty$ . $□$

Catastrophe classification. Statement above. Proof outline. The Thom-Mather classification proceeds in three stages. Stage 1: classify singularities of smooth functions $R^{n} \to R$ up to $C^{\infty}$ -equivalence (compose with smooth diffeomorphisms on source and target). For $n$ -dimensional critical points of codimension $\leq r$ , the universal unfolding has $r + 1$ moduli parameters; finite codimension reduces to a finite list. Stage 2: identify the universal unfoldings by Mather's preparation theorem (the smooth-function analogue of the Weierstrass preparation theorem) and the Malgrange division theorem. Stage 3: realise each algebraic normal form as a polynomial family $F (x, λ)$ of explicit form. The seven elementary catastrophes for $r \leq 4$ are the result. Full proofs in Mather Stability of $C^{\infty}$ mappings (1968–1971) and Thom 1972 Stabilité structurelle et morphogénèse ^{[Thom 1972]}. $□$

Connections [Master]

Implicit and inverse function theorems 02.05.04 — the Lagrange multiplier rule descends from the implicit function theorem applied to the constraint map. The bordered Hessian classification of constrained extrema combines the second-derivative test of this unit with the implicit-function parametrisation. The Morse lemma's smooth coordinate change is itself a parametric application of the inverse function theorem to a quadratic-form normalisation, and the constant-rank theorem's local-normal-form construction shares the same proof structure.

Chain rule for multi-variable functions 02.05.03 — the multi-variable Taylor expansion is built by iterating the chain rule on $ϕ (t) = f (a + t h)$ , with the multinomial theorem identifying $ϕ^{(j)}$ as a sum over multi-indices. The bordered Hessian in the Lagrange-multiplier setting is computed by chain rule applied to $F (x^{'}) = f (x^{'}, h (x^{'}))$ for the implicit-function parametrisation $h$ .

Smooth manifold 03.02.01 — Morse functions live on smooth manifolds and their critical-point classification depends on a choice of local chart, but the Morse index is chart-independent because change of coordinates preserves Hessian definiteness up to congruence. The Morse-lemma local normal form is the manifold-level statement that critical points of Morse functions look locally Euclidean, with the canonical quadratic form $- y_{1}^{2} - \dots - y_{k}^{2} + y_{k + 1}^{2} + \dots + y_{n}^{2}$ .

Riemannian metric and curvature [03.05.*] — sectional curvature of a Riemannian manifold is the second derivative of the metric structure along a geodesic, computed via a Jacobi-field equation whose solution structure is governed by the Hessian of a distance function. Bochner's formulas in Riemannian geometry are integral identities controlling the Hessian of a function against the Ricci curvature, used in Lichnerowicz-type bounds and the proof of positive scalar curvature constraints.

Morse theory and topology of manifolds [03.12. — pending unit]* — the Morse-theoretic cellular decomposition assembles the manifold from cells whose attaching maps are determined by critical-point indices of a Morse function. Smale's h-cobordism theorem (1961) and Milnor's exotic sphere construction (1956) both rest on Morse theory. Floer homology (Andreas Floer, 1988) extends the construction to infinite-dimensional gradient flows on loop spaces of symplectic manifolds.

Symplectic geometry and Hamiltonian dynamics [05.*] — the symplectic structure on $T^{*} M$ pairs with the Hessian of a Hamiltonian $H : T^{*} M \to R$ to produce the linearised flow at a fixed point, with eigenvalue signature controlling stability (KAM persistence vs hyperbolic instability). The second-derivative test is the bridge between local critical-point geometry and global Hamiltonian dynamics.

Asymptotic analysis and stationary phase [06. — pending unit]* — Laplace's method and the saddle-point method in complex analysis use the Hessian-determinant prefactor to extract the leading Gaussian-integral contribution from oscillatory integrals $\int e^{i N f (x)} d x$ and exponential integrals $\int e^{- N f (x)} d x$ . The WKB approximation in quantum mechanics is the same mechanism applied to wave-function asymptotics. Edge-of-the-wedge theorems and Borel resummation continue the asymptotic-analysis thread into complex analysis.

Statistical mechanics and the partition function [08. — pending unit]* — partition-function expansions $Z = \int e^{- β H (x)} d x$ at low temperature ( $β \to \infty$ ) are governed by Laplace's method applied to the Hamiltonian $H$ : the leading contribution comes from the ground state with Hessian-determinant fluctuation prefactor. Phase transitions are bifurcations of critical points of an effective potential, classified by catastrophe theory in the mean-field regime.

Historical & philosophical context [Master]

Brook Taylor's 1715 Methodus incrementorum directa et inversa ^{[Taylor 1715]} introduced the single-variable Taylor formula in the finite-difference form characteristic of early eighteenth-century calculus. Joseph-Louis Lagrange in his 1797 Théorie des fonctions analytiques ^{[Lagrange 1797]} gave the remainder form now bearing his name, with the explicit intermediate-point formula $f (a + h) = \sum_{j = 0}^{k} f^{(j)} (a) h^{j} / j! + f^{(k + 1)} (c) h^{k + 1} / (k + 1)!$ for some $c \in (a, a + h)$ . Cauchy in his 1821 Cours d'analyse and 1823 Résumé des leçons gave the first rigorous $ε$ - $δ$ proof. The multi-variable version emerged through nineteenth-century pedagogical practice, with the multi-index notation crystallising in the early twentieth century. Apostol's 1969 Calculus Vol. 2 Ch. 9 ^{[Apostol Ch. 9]} gave the canonical undergraduate presentation, with the Hessian-based second-derivative test in the form taught today.

Otto Hesse introduced the determinant of the matrix of second partial derivatives in his 1857 paper Über die Determinanten und ihre Anwendung in der Geometrie ^{[Hesse 1857]} in the Journal für die reine und angewandte Mathematik, in the context of algebraic geometry — the singular points of a projective hypersurface are precisely the zeros of what is now called the Hessian determinant. The matrix itself acquired Hesse's name through Jacob Sylvester's later usage. Marston Morse in his 1925 Transactions paper ^{[Morse 1925]} introduced what is now called Morse theory, with the index of a non-degenerate critical point as the central invariant; the Morse-theoretic decomposition of manifolds powered Smale's 1961 proof of the high-dimensional Poincaré conjecture and Milnor's 1956 construction of exotic seven-spheres. René Thom in his 1972 monograph Stabilité structurelle et morphogénèse ^{[Thom 1972]} introduced catastrophe theory, classifying generic degenerate critical points of smooth functions in families of up to four control parameters into seven elementary types (fold, cusp, swallowtail, butterfly, hyperbolic umbilic, elliptic umbilic, parabolic umbilic); the rigorous proofs were given by John Mather (1968–1971) using his preparation theorem for smooth functions, and the theory found applications in optics, fluid mechanics, and mathematical biology.

Bibliography [Master]

[object Promise]

Prerequisites

02.05.04

Tier anchors

beginner: 3Blue1Brown style 'tangent paraboloid' framing; Strogatz informal 'cup, cap, saddle' picture for critical points
intermediate: Apostol *Calculus* Vol. 2 Ch. 9 §9.4–9.9 (Taylor's theorem in several variables, Hessian matrix, sufficient conditions for extrema); Rudin *Principles of Mathematical Analysis* Ch. 9 §9.39–9.41; Spivak *Calculus on Manifolds* Ch. 2
master: Apostol *Calculus* Vol. 2 Ch. 9 (originator pedagogical presentation); Taylor 1715 *Methodus incrementorum directa et inversa* (originator of the single-variable formula); Lagrange 1797 *Théorie des fonctions analytiques* (originator of the remainder form); Hesse 1857 *Über die Determinanten und ihre Anwendung in der Geometrie* (J. reine angew. Math. 53) — originator of the Hessian determinant; Morse 1925 *Relations between the critical points of a real function of n independent variables* (Transactions AMS 27) — originator of Morse theory; Thom 1972 *Stabilité structurelle et morphogénèse* (originator of catastrophe theory)

References

TODO_REF
Apostol — Calculus Vol. 2 · Ch. 9 §9.4–9.9, Taylor's theorem in several variables, Hessian matrix, and sufficient conditions for extrema via the second-derivative test
TODO_REF
Rudin — Principles of Mathematical Analysis · Ch. 9 §9.39–9.41, Taylor's theorem on $\mathbb{R}^n$ and the second-derivative test for extrema
TODO_REF
Spivak — Calculus on Manifolds · Ch. 2 §2-12 to §2-14, Taylor's theorem and the classification of critical points via the Hessian
TODO_REF
Taylor 1715 — Methodus incrementorum directa et inversa · originator of the single-variable Taylor formula in finite-difference form
TODO_REF
Lagrange 1797 — Théorie des fonctions analytiques · originator of the Lagrange form of the remainder term
TODO_REF
Hesse 1857 — Über die Determinanten und ihre Anwendung in der Geometrie · Journal für die reine und angewandte Mathematik 53, originator of the Hessian determinant and its role in classifying critical points
TODO_REF
Morse 1925 — Relations between the critical points of a real function of n independent variables · Transactions of the American Mathematical Society 27, originator of Morse theory and the index of a non-degenerate critical point
TODO_REF
Thom 1972 — Stabilité structurelle et morphogénèse · originator of catastrophe theory, the classification of generic degenerate critical points up to local diffeomorphism

Lean module

Codex.Analysis.MultiVariable.TaylorExtrema

Mathlib gap

Mathlib provides the multi-variable Taylor expansion through
`taylorWithinEval`, `Polynomial.taylor`, and the iterated Fréchet
derivative `iteratedFDeriv`. The Hessian appears via
`iteratedFDeriv ℝ 2 f x` paired with the bilinear-form interpretation
through `ContinuousMultilinearMap.toBilinearMap`. The second-derivative
test for extrema is partially available through `IsLocalMin.of_deriv_*`
in the single-variable case and through positive-definiteness results
on quadratic forms in `LinearAlgebra.QuadraticForm.Basic`, but a
unified statement that names the multi-variable test in Apostol notation
— gradient vanishing, Hessian eigenvalue sign classification, strict
local min / max / saddle conclusion — is not packaged. What is also not
packaged is the Morse-theoretic interpretation of the Hessian index and
the higher-order extension via the catastrophe-theory normal forms. The
Codex module collects the second-derivative test into the textbook
presentation and records the unification gap.

Reviewer

TBD

Estimated time

beginner: 15m
intermediate: 35m
master: 65m