02.05.04 · analysis / multivariable-differentiation

Implicit and inverse function theorems

shipped3 tiersLean: partial

Anchor (Master): Apostol *Calculus* Vol. 2 Ch. 13 (originator pedagogical presentation); Spivak *Calculus on Manifolds* Ch. 2; Dieudonné *Foundations of Modern Analysis* Ch. X §10.2 (Banach-space form); Cauchy 1831 *Sur la mécanique céleste et sur un nouveau calcul* (analytic implicit-function tradition); Goursat 1903 *Sur la théorie des fonctions implicites* (Bulletin de la Société Mathématique de France 31) — originator of the modern $C^1$-and-Banach formulation; Dini 1877 *Lezioni di analisi infinitesimale* (originator of the modern implicit-function theorem on $\mathbb{R}^n$); Nash 1956 *The imbedding problem for Riemannian manifolds* (Annals of Mathematics 63); Moser 1966 *A rapidly convergent iteration method and non-linear differential equations* (Annali Scuola Norm. Sup. Pisa 20)

Intuition [Beginner]

Two questions sit at the heart of multi-variable calculus, and they have the same answer. First, when does a smooth map have a smooth inverse — at least near a given point? Second, when does an equation like pin one of its variables down as a function of the other? Both questions are local: they ask what happens in a small neighbourhood of a chosen point, not across the whole space.

The first answer is the inverse function theorem. Zoom in close enough on a smooth map at a point, and the map looks like its derivative, which is a linear map. If that linear map is invertible — that is, if its determinant is nonzero — then the smooth map is invertible too in some small neighbourhood. The smooth inverse exists and is itself smooth.

The second answer is the implicit function theorem. An equation with smooth carves out a curve or surface in the plane or space. Near a point where the rate of change of with respect to is nonzero, you can solve for as a smooth function of — the level set of looks locally like a graph.

The two answers are the same. The implicit theorem is the inverse theorem in disguise. Both rest on a single linear-algebra check: a derivative, somewhere in the picture, is invertible.

Visual [Beginner]

Two panels side by side. The left panel shows a smooth map drawn as a deformed grid superimposed on a coordinate grid: tiny squares near a chosen point have been stretched, rotated, and sheared but not collapsed. An arrow above the left panel says "zoom in" and a dashed inset shows the squares becoming a parallelogram tiling — the local picture is a linear map, the derivative . Because the parallelograms have nonzero area, the local map is invertible, and the smooth map has a smooth local inverse.

Two panels. Left panel: a smooth map shown as a deformed grid near a point a, with an inset zoomed view showing the deformation become a parallelogram tiling — the linear approximation Df(a). A label notes that nonzero determinant of Df(a) means an invertible local inverse exists. Right panel: the unit circle in the plane drawn through the point (1 over root 2, 1 over root 2). A dashed horizontal segment shows a small neighbourhood of that x-coordinate, with the corresponding piece of the circle lying directly above as a graph y = h(x). A label says the rate of change of g equals x squared plus y squared minus one with respect to y is two y, nonzero at the point.

The right panel shows the unit circle and a chosen point on it. A short horizontal interval beneath the point picks out a tiny range of values; the piece of the circle directly above that interval is a function . The implicit theorem says: where the rate of change of with respect to is nonzero, that local graph exists and is smooth.

Worked example [Beginner]

Take the smooth map at the point . The output is . The derivative is a matrix of partial derivatives. The four entries are , , , at , so the derivative matrix is the diagonal matrix with entries and . Its determinant is , which is nonzero.

The inverse function theorem then says: a small neighbourhood of in the input plane maps one-to-one onto a small neighbourhood of in the output plane, and the map has a smooth local inverse. (Globally, is the squaring map on the complex plane, which is 2-to-1; the local inverse only exists in a small neighbourhood.)

Now an implicit example. Take , whose zero set is the unit circle. Pick the point on the circle. The rate of change of with respect to , written , equals , so at this point , which is nonzero. The implicit function theorem says: near this point, the equation defines as a smooth function of . In fact for near , the upper half of the circle.

What this tells us: the single condition "some derivative is nonzero / invertible" controls both local invertibility of a map and the local solvability of an equation. The same check answers both questions.

Check your understanding [Beginner]

Formal definition [Intermediate+]

Throughout this section , , and denote open subsets of finite-dimensional Euclidean spaces. A map is for when all partial derivatives of of orders exist and are continuous on .

Inverse function theorem. Let be open and let be with . If and the derivative is invertible as a linear map, then there exist open neighbourhoods of and of such that the restriction is a bijection with inverse . The derivative of the inverse satisfies .

The notation for the Jacobian matrix continues from 02.05.03; invertibility of is equivalent to . Following Apostol [Apostol Ch. 13 §13.4–13.7].

Implicit function theorem. Let be open and let be with . Suppose with , , and . Write the derivative as the block matrix , where is the Jacobian of in the first variables and is the Jacobian in the last variables. If is invertible, then there exist open neighbourhoods of and a map with and for all . The derivative of at satisfies $$ Dh(a) = -D_2 g(a, b)^{-1} \cdot D_1 g(a, b). $$

The local solution is the unique map satisfying the level-set equation on a sufficiently small neighbourhood of . Following Apostol [Apostol Ch. 13 §13.4–13.7] and Spivak [Spivak Ch. 2 Theorem 2-12].

Counterexamples to common slips

  • continuity is not enough. A continuous map with no differentiability assumption admits no invertibility test on a derivative — the hypothesis is essential.
  • Pointwise invertibility, not constancy of . The hypothesis is a check at the single point . The theorem produces a neighbourhood on which the inverse exists; the determinant stays nonzero there by continuity, but the theorem does not assume invertibility on the whole domain.
  • Local, not global. The map has invertible derivative at every nonzero point yet is 2-to-1 across the plane. The inverse theorem produces a local inverse near each nonzero point, not a global one.
  • The implicit hypothesis is on , not on . To solve for the last block of variables in terms of the first, the invertibility test is on the partial Jacobian in the last block. The first block can have any rank.
  • Wrong partial Jacobian, wrong conclusion. At on the unit circle , the partial vanishes. The implicit theorem does not apply in that orientation; the circle is not a graph near , but it is a graph since .

Key theorem with proof [Intermediate+]

Theorem (inverse implies implicit). Let be open and let be with , at . If is invertible, then the implicit function theorem conclusion holds: there exist a neighbourhood of in and a map with and for all , and .

Proof. Define the auxiliary map by $$ F(x, y) = (x, g(x, y)), $$ sending the input pair to the pair . The map is because is and the first component is the projection. Compute the derivative as a block matrix in the form. The first rows come from the first component : the rate of change of with respect to is the identity block of shape , and the rate of change of with respect to is the zero block of shape . The last rows come from the second component : the rate of change of with respect to is of shape , and the rate of change of with respect to is of shape . The block-matrix form is $$ DF(a, b) = \begin{pmatrix} I_n & 0 \ D_1 g(a, b) & D_2 g(a, b) \end{pmatrix}. $$ This is a block lower-triangular matrix; its determinant equals the product of the determinants of its diagonal blocks, . By hypothesis is invertible, so , and is invertible as a linear map on .

Apply the inverse function theorem to at . There exist open neighbourhoods of in and of in such that is a bijection with inverse . By the form of , the first components of are just , so the inverse preserves the first components: for some map .

Define on a neighbourhood of by $$ h(x) = G_2(x, 0), $$ where is open since is open and the map is continuous. The map is as the composition of maps. Check . The inverse identity becomes , hence , hence .

Check for . The defining identity unpacks to , which by definition of is , hence .

The derivative formula. Differentiate the identity in at using the chain rule 02.05.03: the derivative of in is the sum of the partial derivatives in the two argument blocks, multiplied by the derivatives of those arguments. The first block argument is , with derivative , so its contribution is . The second block argument is , with derivative , so its contribution is . The sum is : $$ D_1 g(a, b) + D_2 g(a, b) \cdot Dh(a) = 0. $$ Solve for by left-multiplying by : $$ Dh(a) = -D_2 g(a, b)^{-1} \cdot D_1 g(a, b). \qquad \square $$

The proof reduces the implicit function theorem to the inverse function theorem through the lifting trick . The remaining content is the inverse function theorem itself, proved via the Banach contraction principle.

Theorem (inverse function theorem; Apostol Ch. 13). Let be open, a map, and with invertible. Then there are open neighbourhoods , with a bijection whose inverse is on .

Proof sketch (contraction principle). Reduce to the case , , by composing with two affine maps; the general case follows from this normalised case by undoing the composition. With this reduction, define for each target value in a small neighbourhood of the auxiliary map $$ T_y : x \mapsto x - (f(x) - y) = y + x - f(x). $$ A fixed point of — a point with — is a preimage of under . The map has derivative at . Since and is continuous, on a small closed ball . The mean-value inequality gives for , so is a contraction with constant . For small enough, maps into itself. The Banach contraction principle gives a unique fixed point with , and this fixed point depends continuously on . The construction defines the local inverse on a small neighbourhood of .

Smoothness of the inverse comes from the chain rule 02.05.03: differentiating the identity in at gives , hence . The same computation at any nearby with gives . The right-hand side is continuous in because is continuous and the matrix inverse is a continuous operation on the open set of invertible matrices. Hence is . Iterating the chain-rule argument shows is when is .

The full Apostol proof carries the contraction estimate at the level of operator norms with explicit bounds. The structure is invariant: a single contraction-mapping argument plus chain rule for smoothness.

Bridge. Four threads run from the inverse and implicit function theorems into the rest of the curriculum, and each one identifies the linear-algebra check as the foundational mechanism. First, both theorems are local: they trade the global question of invertibility for a pointwise derivative check. The trade is exact — the proof shows that invertibility of the linear approximation is the only ingredient needed, because the contraction-mapping argument absorbs the higher-order error terms. Second, the two theorems are equivalent: the implicit theorem follows from the inverse theorem by the auxiliary map , and the inverse theorem follows from the implicit theorem applied to . The single content is local linearisation; the two statements are two faces of it. Third, the theorems unlock the manifold structure on level sets: when has surjective derivative at every point of the zero set , the implicit theorem makes that zero set locally a graph at every point — the regular-level-set theorem, foundational for 03.02.01 smooth manifolds. Fourth, the contraction-mapping engine generalises far beyond Euclidean space: the same argument runs on Banach spaces (the Banach-space inverse theorem) and, with a loss-of-derivatives correction, on tame Fréchet spaces (the Nash-Moser hard implicit theorem, used in KAM theory and the Riemannian embedding problem). The linear-algebra invertibility check is the load-bearing piece in every case.

Exercises [Intermediate+]

Lean formalization [Intermediate+]

lean_status: partial — Mathlib provides the inverse function theorem in Fréchet-derivative form through HasFDerivAt.localInverse, HasStrictFDerivAt.to_localInverse, and the contraction-mapping driver. The implicit function theorem is packaged as HasStrictFDerivAt.implicitFunction with the split-domain map and the derivative formula. The textbook-style packaging in Apostol notation and the constant-rank theorem under one named result is the Codex-facing gap.

[object Promise]

The companion module at Codex.Analysis.MultiVariable.ImplicitInverse re-exports these statements and records the unification gap.

Advanced results [Master]

Banach-space inverse function theorem. Let , be Banach spaces, open, a Fréchet-differentiable map, and with a topological linear isomorphism (a continuous linear bijection with continuous inverse). Then there are open neighbourhoods and with a bijection whose inverse is on [Dieudonné Ch. X §10.2]. The proof transcribes the Euclidean contraction-mapping argument: the construction is a contraction on a closed ball in , and completeness of delivers the fixed point. The hypothesis that is a topological isomorphism — not just an algebraic bijection — is essential: on Banach spaces, a continuous linear bijection has continuous inverse iff and are complete and the open mapping theorem applies, but a Fréchet derivative that is merely algebraically bijective without topological inverse can fail to deliver a local inverse.

Holomorphic inverse function theorem. Let be open and a holomorphic map. If and the complex Jacobian is invertible (equivalently, ), then there exist open neighbourhoods and with a biholomorphism — a holomorphic bijection with holomorphic inverse. The proof runs the same contraction-mapping argument with holomorphicity preserved at each step, and the resulting local inverse is automatically holomorphic by the Cauchy-Riemann equations applied to the inverse relation . The holomorphic version powers the local theory of complex manifolds and underlies the definition of an étale morphism in algebraic geometry: a morphism of smooth schemes over is étale at a point iff the induced map on the algebraic tangent space is an isomorphism, which by the holomorphic IFT is equivalent to being a local biholomorphism.

Real-analytic inverse function theorem. If is real-analytic and is invertible at , the local inverse is real-analytic on a neighbourhood of . Proved via the Cauchy-Kovalevskaya majorant-series technique: bound the Taylor coefficients of by a geometric series, conclude convergence on a small ball. Originator: Cauchy 1831 in his Turin lectures on celestial mechanics [Cauchy 1831].

Constant rank theorem. Let be open and a map with constant rank on — that is, for every . Then for each there exist local diffeomorphisms near in and near in such that is the linear projection on a neighbourhood of the origin. The proof iterates the inverse function theorem on a rearrangement of coordinates that makes the top-left block of the Jacobian invertible; the implicit function theorem then absorbs the remaining output coordinates as functions of the first inputs, and a final coordinate change linearises the structure. The constant rank theorem is the geometric statement that subsumes both inverse (case ) and implicit (case ) function theorems: every map of constant rank is locally a linear projection up to diffeomorphism.

Nash-Moser hard implicit function theorem. Let , be tame Fréchet spaces (a class of locally convex topological vector spaces with a graded sequence of seminorms and a smoothing-operator family). Let be a smooth map with admitting a tame right inverse — meaning the inverse loses a finite, controlled number of derivatives but remains tame. Under appropriate quantitative hypotheses on the loss of derivatives and the smoothing operators, there is a neighbourhood of in and a smooth map with for and . The proof runs a Newton-style iteration with smoothing at each step to compensate the loss of derivatives. Originator: John Nash 1956 [Nash 1956] in solving the Riemannian embedding problem; refined by Jürgen Moser 1966 [Moser 1966] for the KAM theorem on persistence of quasi-periodic orbits in Hamiltonian systems. The hard IFT extends the classical inverse function theorem from Banach spaces, where the linearised inverse has zero loss of derivatives, to the much wider class of nonlinear PDE problems where Sobolev-space estimates show a loss but only by a bounded amount.

Failure on Banach spaces with non-topological-isomorphism derivatives. On infinite-dimensional Banach spaces, a continuous Fréchet derivative that is algebraically a bijection but lacks a continuous inverse is insufficient to conclude has a local inverse. The standard witness: take (the Banach space of sequences converging to ), and let send . The map is a continuous linear bijection with set-theoretic inverse , but is unbounded. A map with cannot be inverted locally as a map. The topological-isomorphism hypothesis in the Banach-space IFT is essential.

Synthesis. Five observations organise the unit. First, the inverse and implicit function theorems are two presentations of one content: local linearisation. The auxiliary map converts each into the other, and the linear-algebra check on or is the same condition in different coordinates. Second, the proof rests on the Banach contraction principle applied to the displacement map . The contraction estimate uses the operator-norm bound on , which is small near because is continuous. Third, smoothness of the inverse is a chain-rule 02.05.03 consequence: differentiate to obtain , then iterate to push regularity through. Fourth, the theorems generalise smoothly to Banach spaces with the same contraction-mapping argument, holomorphically with the Cauchy-Riemann equations preserving complex-differentiability through the contraction, real-analytically with majorant series, and to tame Fréchet spaces with the Nash-Moser smoothing iteration. The single argument re-runs across four function-space categories with appropriate adjustments. Fifth, the theorems are the foundational mechanism connecting infinitesimal invertibility (the linear-algebra check at one point) to local invertibility (a smooth inverse on a neighbourhood). The bridge is the contraction-mapping argument; the linear-algebra check is the only ingredient. Both theorems unlock the manifold structure on regular level sets and the constant-rank theorem, and together they form the backbone of the local theory of smooth maps.

Full proof set [Master]

Inverse implies implicit. Proved in §"Key theorem with proof" above by the auxiliary map , its block-triangular derivative with invertible diagonal blocks, the inverse function theorem applied to , and the chain rule 02.05.03 applied to the identity to extract the derivative formula.

Inverse function theorem (contraction-mapping proof). Sketched in §"Key theorem with proof" above and detailed here. Statement above. Normalise to , , by left-composing with and translating. Define ; fixed points of are preimages of under . The derivative tends to in operator norm as by continuity of . Choose so on . The mean-value inequality gives for . Choose so forces , hence by the contraction estimate. Banach contraction principle on the complete metric space gives a unique fixed point depending continuously on . Set on the open ball . Smoothness of comes from chain rule 02.05.03 applied to : , hence , a continuous function of . Iterate to push regularity through.

Regular value theorem. Stated as Exercise 5. Surjectivity of at gives an invertible submatrix of . Permute coordinates so this submatrix sits in the last columns. The implicit function theorem produces a local graph. The zero set is locally an -dimensional graph at every — the defining property of an -dimensional submanifold of .

Banach-space inverse function theorem. Statement above. The Euclidean proof transcribes with now a map on a closed ball in . The contraction estimate uses the operator norm on bounded linear maps between Banach spaces; the mean-value inequality holds for maps on Banach spaces by an integration of the derivative along a line segment. Completeness of is what makes the closed ball complete in the induced metric and powers the Banach contraction principle. The smoothness of the inverse is again a chain-rule consequence; the matrix inverse is continuous on the open set of topological isomorphisms in the operator-norm topology.

Holomorphic inverse function theorem. Statement above. The contraction-mapping argument runs verbatim with complex differentiability replacing real differentiability throughout. The Cauchy-Riemann equations for pass through the contraction to the limit . The matrix inverse is a polynomial-rational function of the matrix entries, hence holomorphic in the entries; the local inverse is holomorphic as the composition of holomorphic operations.

Constant rank theorem. Statement above. By rearranging coordinates on the source and target, place the invertible submatrix of in the top-left position. The first output coordinates , viewed as a map from to , have full-rank Jacobian at in the first input coordinates; the inverse function theorem applied to the augmented map yields a local diffeomorphism in . In the new -coordinates, the first components of are simply the first coordinates. Constant rank then forces the remaining components of to depend only on the first coordinates (since the rank in the last coordinates is already saturated by the first output components). A final target-side diffeomorphism kills the dependence of the trailing components on the first , yielding the canonical projection.

Nash-Moser hard IFT (sketch). Statement above. The proof runs a smoothed Newton iteration where are smoothing operators that compress the function space onto its low-frequency component up to scale , and is the approximate right inverse of . The loss of derivatives in is compensated by the smoothing with at a geometrically rapid rate. Quantitative tame estimates on the smoothing operators and the inverse are what makes the iteration converge in the Fréchet topology. Full proof in Hamilton's 1982 survey [Hamilton 1982] (under the bibliography below) building on Nash 1956 and Moser 1966.

Connections [Master]

Chain rule for multi-variable functions 02.05.03 — the chain rule supplies both the proof structure for the inverse function theorem (smoothness of is recovered by differentiating and applying the chain rule) and the derivative formula for the implicit function theorem (differentiating and solving for ). Every quantitative output of this unit is a chain-rule consequence with a contraction-mapping kernel.

Multi-variable limit and continuity 02.05.01 — the contraction-mapping argument that powers the inverse function theorem uses the completeness of closed balls in as a complete metric space, and the operator-norm continuity of inherits from the multi-variable continuity framework. The Banach contraction principle is the topological engine of the proof; metric-space continuity is its scaffolding.

Smooth manifold 03.02.01 — the regular level set theorem (Exercise 5) gives the implicit-function-theorem certificate that the zero set of a submersion is a submanifold. The constant rank theorem extends this: a map of constant rank has fibres and image both submanifolds. These are the two foundational constructions of submanifolds in and on abstract manifolds.

Banach spaces (pending unit 02.11.04) — the Banach-space inverse function theorem packages the contraction-mapping engine on complete normed spaces. The Picard-Lindelöf existence theorem for ODEs and Newton's method on Banach spaces share the same engine. The Nash-Moser hard IFT extends the construction to tame Fréchet spaces, used in KAM theory and the Riemannian embedding problem.

Étale morphisms in algebraic geometry [04.*] — a morphism of smooth schemes over a field is étale at iff the induced map on Zariski tangent spaces is an isomorphism. Over with the analytic topology, étale at means the holomorphic IFT applies, hence is a local biholomorphism near . The two characterisations coincide.

KAM theorem and the Nash-Moser machinery [05.09. — pending unit]* — Kolmogorov-Arnold-Moser theory establishes the persistence of quasi-periodic orbits under small Hamiltonian perturbations. The Nash-Moser hard IFT is the analytic engine: the linearised perturbation equation loses derivatives, the smoothing operators compensate, and the Newton iteration converges in the Fréchet topology. KAM was the original application that motivated the hard IFT.

Riemann surfaces and Riemannian embedding [06., 03.05.] — the holomorphic IFT powers the local theory of Riemann surfaces, where charts are biholomorphisms. The Nash-Moser hard IFT was developed to solve the isometric Riemannian embedding problem: every Riemannian manifold for sufficiently large embeds isometrically into some . The proof is a smoothed Newton iteration on a quadratic functional with a loss-of-derivatives linearised inverse.

Historical & philosophical context [Master]

The implicit function theorem in its modern form has a long lineage. Single-variable forerunners go back to Newton's Method of Fluxions (1671, published 1736), where solving an implicit equation for one variable in terms of another was treated via series expansion. In multi-variable form, Lagrange in his Théorie des fonctions analytiques (1797) gave a series-based version, and Cauchy in his Turin lectures of 1831 [Cauchy 1831] gave the first rigorous analytic-function form via majorant series — the technique that survives in the modern real-analytic IFT. Ulisse Dini's 1877 Lezioni di analisi infinitesimale [Dini 1877] established the modern form on under the now-standard hypothesis of continuous partial derivatives, and the theorem carries his name in Italian and French mathematical traditions.

Édouard Goursat's 1903 Sur la théorie des fonctions implicites [Goursat 1903] in the Bulletin de la Société Mathématique de France gave the contraction-mapping proof under the modern hypothesis, packaging the theorem in essentially the form taught today. The Banach-space generalisation came with Stefan Banach's 1922 introduction of complete normed spaces and the contraction principle, and was developed in textbook form by Jean Dieudonné in his 1960 Foundations of Modern Analysis [Dieudonné Ch. X §10.2]. Apostol's 1969 Calculus Vol. 2 Ch. 13 [Apostol Ch. 13] gave the canonical undergraduate pedagogical presentation, with the inverse and implicit theorems proved together via the contraction-mapping reduction. Spivak's 1965 Calculus on Manifolds Ch. 2 [Spivak] gave the parallel honours-undergraduate presentation in the modern coordinate-free language.

The hard implicit function theorem came from a different tradition. John Nash in his 1956 paper The imbedding problem for Riemannian manifolds [Nash 1956] in the Annals of Mathematics introduced a smoothed Newton iteration to solve the isometric embedding problem, in which the linearised equation loses derivatives. Jürgen Moser in 1966 [Moser 1966] generalised and clarified the technique, applying it to the KAM theorem on persistence of quasi-periodic orbits and to nonlinear differential equations on the torus. Richard Hamilton's 1982 survey The inverse function theorem of Nash and Moser (Bulletin of the American Mathematical Society 7) packaged the technique into the modern tame-Fréchet-space framework, in which the Nash-Moser IFT is a tool used routinely in geometric analysis, nonlinear PDE, and dynamical systems.

Bibliography [Master]

[object Promise]