Inequalities (linear and quadratic)
Anchor (Master): Cauchy 1821 Cours d'analyse; Schwarz 1885; Hölder 1889; Minkowski 1896; Jensen 1906; Hardy-Littlewood-Pólya Inequalities 1934; Tarski 1948 A Decision Method for Elementary Algebra and Geometry; Lang Algebra Ch. XV
Intuition [Beginner]
An inequality is what you get when you replace the equals sign in an equation with one of <, >, ≤, or ≥. The equation x + 1 = 3 asks for the single number x = 2. The inequality x + 1 > 3 asks for every number x that makes x + 1 bigger than 3. The answer is not a point but a region of the number line: all numbers larger than 2. Inequalities describe ranges and bounds — the natural language of "at least", "at most", "near", and "within".
You solve an inequality with the same moves as an equation: add the same thing to both sides, subtract the same thing, multiply both sides by the same number. There is one twist. If you multiply or divide both sides by a negative number, the inequality flips direction. The reason is the orientation of the number line: negation reflects the line through zero, swapping which side a number sits on. So −2x < 6 becomes x > −3 after dividing by −2, not x < −3.
The picture is a number line marked with shaded regions. A linear inequality such as x > 3 shades a half-line — everything to the right of 3. A quadratic inequality such as x² − 5x + 6 < 0 shades a bounded interval (2, 3). The same problem can also have the outside of an interval as its answer, or the whole line, or nothing at all. The shape of the shaded region depends on the leading sign and on how many places the corresponding parabola crosses the axis.
Visual [Beginner]
Picture the number line with the two roots x = 2 and x = 3 of the equation x² − 5x + 6 = 0 marked as open circles. The two roots split the line into three regions: the part to the left of 2, the part between 2 and 3, and the part to the right of 3. Above the middle region a label reads x² − 5x + 6 < 0; above the outer two regions a label reads x² − 5x + 6 > 0. The middle region is shaded.
The picture explains the method. The product (x − 2)(x − 3) is negative exactly when the two factors have opposite signs. To the left of 2 both factors are negative, so the product is positive. Between the roots one factor is positive and the other negative, so the product is negative — that is the answer to the inequality x² − 5x + 6 < 0. To the right of 3 both factors are positive, so the product is positive again. Two open circles at the roots record that the strict inequality excludes the boundary, where the product equals zero.
Worked example [Beginner]
Solve 3x + 1 > 10. Subtract 1 from both sides: 3x > 9. Divide both sides by 3, a positive number, so the inequality does not flip: x > 3. The solution is every real number larger than 3, the half-line (3, ∞). Check: at x = 4 the left side is 13 and the right side is 10, so 13 > 10 holds; at x = 2 the left side is 7 and 7 > 10 fails, as expected since 2 lies outside the solution region.
Solve 5 − 2x ≥ 1. Subtract 5 from both sides: −2x ≥ −4. Divide both sides by −2, a negative number, so the inequality flips: x ≤ 2. The solution is the half-line (−∞, 2], closed at 2 because the inequality is non-strict. Check: at x = 0 the left side is 5 and the right side is 1, so 5 ≥ 1 holds; at x = 3 the left side is −1 and −1 ≥ 1 fails.
Solve x² − 5x + 6 < 0 by factoring and sign analysis. Factor: x² − 5x + 6 = (x − 2)(x − 3). The roots are x = 2 and x = 3. A product of two real numbers is negative when one factor is positive and the other is negative. The factor x − 2 is negative when x < 2 and positive when x > 2. The factor x − 3 is negative when x < 3 and positive when x > 3. The two factors have opposite signs exactly between the two roots, 2 < x < 3. The solution is the open interval (2, 3).
What this tells us: linear inequalities give half-lines, quadratic inequalities give intervals or their complements, and the sign of a product is read off from the signs of its factors. The number-line picture records every case in a single diagram.
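The solution sets described above can be spot-checked numerically. The sketch below samples a grid of points and confirms the predicted regions for three representative inequalities of the kinds just solved (the concrete coefficients here are illustrative examples, one per case: linear with a positive divisor, linear with a negative divisor, and a factorable quadratic):

```python
# Numerical spot-check of three representative inequalities against their
# predicted solution sets.

def solves_linear_pos(x):      # 3x + 1 > 10  <=>  x > 3
    return 3 * x + 1 > 10

def solves_linear_neg(x):      # 5 - 2x >= 1  <=>  x <= 2  (sign flips)
    return 5 - 2 * x >= 1

def solves_quadratic(x):       # x^2 - 5x + 6 < 0  <=>  2 < x < 3
    return x * x - 5 * x + 6 < 0

# Sample a grid of points and compare against the predicted solution sets.
xs = [i / 10 for i in range(-100, 101)]   # -10.0 to 10.0 in steps of 0.1
for x in xs:
    assert solves_linear_pos(x) == (x > 3)
    assert solves_linear_neg(x) == (x <= 2)
    assert solves_quadratic(x) == (2 < x < 3)
print("all checks passed")
```

The grid deliberately includes the boundary points themselves, where the strict and non-strict cases differ.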
Check your understanding [Beginner]
Formal definition [Intermediate+]
Let F be a linearly ordered field — F = ℝ for the purposes of this unit — with the order relation < and the induced relations ≤, >, ≥. An inequality in one variable is a relation of the form p(x) < q(x) (or with ≤, >, ≥ in place of <), where p and q are polynomial expressions in x with coefficients in F [Lang — Basic Mathematics Ch. 4]. The solution set is {x ∈ F : p(x) < q(x)}.
A linear inequality in one variable is ax + b < 0 with a, b ∈ ℝ and a ≠ 0. By the order axioms of ℝ — for any a, b, c ∈ ℝ, a < b implies a + c < b + c, and a < b together with 0 < c implies ac < bc — the solution set is the half-line (−∞, −b/a) when a > 0 and the half-line (−b/a, ∞) when a < 0, with analogous closed half-lines for the non-strict variants. The sign of a controls the direction of the half-line, and the same order axiom multiplied through by a positive (respectively negative) scalar preserves (respectively reverses) the inequality.
A quadratic inequality in one variable is ax² + bx + c < 0 (or with ≤, >, ≥) with a, b, c ∈ ℝ and a ≠ 0. Let Δ = b² − 4ac be the discriminant of the underlying quadratic 00.03.02. The solution set depends on the sign of Δ and on the sign of a:
- If Δ > 0 the quadratic factors over ℝ as a(x − r₁)(x − r₂) with two distinct real roots r₁ < r₂. For ax² + bx + c ≤ 0 with a > 0, the solution is the closed interval [r₁, r₂]; with a < 0 it is (−∞, r₁] ∪ [r₂, ∞).
- If Δ = 0 the quadratic factors as a(x − r)² with a single repeated root r = −b/(2a). The strict inequality ax² + bx + c < 0 has solution ∅ when a > 0 and ℝ \ {r} when a < 0.
- If Δ < 0 the quadratic does not factor over ℝ and has constant sign on all of ℝ: positive when a > 0, negative when a < 0. The solution sets of ax² + bx + c < 0 are correspondingly the empty set or the entire line.
The sign-analysis method extends this to a polynomial inequality p(x) < 0 for arbitrary p with k distinct real roots r₁ < ⋯ < rₖ. The roots partition ℝ into k + 1 open intervals; on each interval p has constant sign, determined by the parity of the count of linear factors that are negative on that interval. The solution set is the union of those intervals on which the sign matches the inequality, with the roots included or excluded according to whether the inequality is non-strict or strict.
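The parity rule can be made mechanical. The following sketch (an illustrative implementation, not part of the unit's formal apparatus) reads off the sign of a fully factored polynomial on each of the k + 1 intervals directly from the count of negative factors:

```python
# Sign analysis for p(x) = lead * (x - r1) * ... * (x - rk) with distinct
# real roots: on each of the k+1 open intervals the sign is constant, and
# it is determined by the parity of the number of negative factors there.

def sign_on_intervals(lead, roots):
    """Return the sign of p on each open interval determined by the
    sorted distinct roots, listed left to right."""
    roots = sorted(roots)
    k = len(roots)
    signs = []
    for i in range(k + 1):
        # On the i-th interval, exactly the factors (x - r_j) with j >= i
        # are negative: that is k - i negative factors.
        neg_factors = k - i
        s = 1 if lead > 0 else -1
        if neg_factors % 2 == 1:
            s = -s
        signs.append(s)
    return signs

# p(x) = (x - 2)(x - 3): positive, negative, positive.
print(sign_on_intervals(1, [2, 3]))       # -> [1, -1, 1]
# p(x) = -(x - 1)(x - 2)(x - 4): alternating, starting positive.
print(sign_on_intervals(-1, [1, 2, 4]))   # -> [1, -1, 1, -1]
```

Reading the solution set of p(x) < 0 off the returned list amounts to collecting the intervals whose entry is −1.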
Counterexamples to common slips
- Multiplying both sides of an inequality by an expression of unknown sign is invalid. The step from 1/x < 1 to 1 < x does not hold for all x: when x < 0 the multiplication by x reverses the inequality, and when x = 0 the original expression is undefined. Solve by sign cases on x, or rewrite it as the single rational inequality (1 − x)/x < 0.
- A non-strict quadratic inequality ax² + bx + c ≥ 0 with a > 0 and Δ < 0 never has the empty set as its solution: the quadratic has constant nonzero sign on all of ℝ, so x² + 1 ≥ 0, with a = 1 > 0 and Δ = −4 < 0, is satisfied by every x ∈ ℝ, not by none.
- "Take the square root of both sides" is not a single move on an inequality. From x² < 9 one obtains |x| < 3, which unpacks as −3 < x < 3 — two one-sided conditions, not just x < 3. The same care applies to extracting roots in any inequality involving an even power.
Key theorem with proof [Intermediate+]
Theorem (Cauchy-Schwarz inequality). Let V be a real inner-product space with inner product ⟨·,·⟩ and induced norm ‖v‖ = √⟨v, v⟩. For every u, v ∈ V,
|⟨u, v⟩| ≤ ‖u‖ ‖v‖,
with equality iff u and v are linearly dependent [Rudin — Principles of Mathematical Analysis Ch. 1].
Proof. If v = 0 then ⟨u, v⟩ = 0 and ‖v‖ = 0, so both sides of the inequality are zero and equality holds; also u and v are linearly dependent because the zero vector is a scalar multiple of every vector. Assume from here that v ≠ 0, so ‖v‖² > 0.
For any real t, expand the squared norm of u + tv using bilinearity and symmetry of the inner product:
‖u + tv‖² = ‖v‖² t² + 2⟨u, v⟩ t + ‖u‖².
The left side is a squared norm, hence non-negative for every t. The right side, viewed as a polynomial in t, is therefore a non-negative quadratic with positive leading coefficient ‖v‖² > 0. By the discriminant trichotomy for quadratics with positive leading coefficient 00.03.02, a non-negative real-coefficient quadratic has discriminant Δ ≤ 0. Computing the discriminant of ‖v‖² t² + 2⟨u, v⟩ t + ‖u‖²:
Δ = 4⟨u, v⟩² − 4 ‖u‖² ‖v‖² ≤ 0.
The condition Δ ≤ 0 rearranges to ⟨u, v⟩² ≤ ‖u‖² ‖v‖². Taking square roots, which preserves the inequality since both sides are non-negative, yields |⟨u, v⟩| ≤ ‖u‖ ‖v‖.
The equality case |⟨u, v⟩| = ‖u‖ ‖v‖ holds iff Δ = 0, that is, iff the quadratic in t has a (repeated) real root, iff there exists t with ‖u + tv‖² = 0. Since the norm is positive-definite, this is equivalent to u + tv = 0, the linear dependence of u on v. Conversely, if u and v are linearly dependent and v ≠ 0, then u = λv for some λ ∈ ℝ, and substitution gives |⟨u, v⟩| = |λ| ‖v‖² and ‖u‖ ‖v‖ = |λ| ‖v‖², so equality holds.
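The discriminant argument can be traced numerically in ℝⁿ. The sketch below (illustrative, with a small floating-point tolerance) computes the three coefficients of the quadratic in t, checks that its discriminant is non-positive, and checks that a linearly dependent pair makes the discriminant vanish:

```python
# Trace of the discriminant argument for Cauchy-Schwarz in R^n: the
# quadratic q(t) = ||u + t v||^2 = ||v||^2 t^2 + 2<u,v> t + ||u||^2 is
# non-negative for every t, so its discriminant is <= 0, which rearranges
# to <u,v>^2 <= ||u||^2 ||v||^2.

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def check_cauchy_schwarz(u, v):
    a = dot(v, v)               # leading coefficient ||v||^2
    b = 2 * dot(u, v)           # linear coefficient 2<u,v>
    c = dot(u, u)               # constant term ||u||^2
    disc = b * b - 4 * a * c    # = 4(<u,v>^2 - ||u||^2 ||v||^2)
    assert disc <= 1e-12        # non-negative quadratic => disc <= 0
    return dot(u, v) ** 2 <= dot(u, u) * dot(v, v) + 1e-12

print(check_cauchy_schwarz([1.0, 2.0, 3.0], [4.0, -1.0, 0.5]))  # -> True

# Equality case: linearly dependent vectors make the discriminant zero.
u = [2.0, -4.0, 6.0]; v = [1.0, -2.0, 3.0]   # u = 2v
disc = (2 * dot(u, v)) ** 2 - 4 * dot(v, v) * dot(u, u)
print(disc)  # -> 0.0
```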
Corollary (triangle inequality on inner-product spaces). For every u, v ∈ V, ‖u + v‖ ≤ ‖u‖ + ‖v‖.
Proof of corollary. Expand ‖u + v‖² = ‖u‖² + 2⟨u, v⟩ + ‖v‖² and apply Cauchy-Schwarz to bound the cross term: ‖u + v‖² ≤ ‖u‖² + 2 ‖u‖ ‖v‖ + ‖v‖² = (‖u‖ + ‖v‖)². Take square roots.
Corollary (AM-GM at two points). For non-negative reals a, b: √(ab) ≤ (a + b)/2, with equality iff a = b.
Proof of corollary. Apply Cauchy-Schwarz in ℝ² to u = (√a, √b) and v = (√b, √a): ⟨u, v⟩ = 2√(ab) and ‖u‖ ‖v‖ = a + b. Hence 2√(ab) ≤ a + b, equivalently √(ab) ≤ (a + b)/2. Equality in Cauchy-Schwarz forces linear dependence of (√a, √b) and (√b, √a), which after squaring components forces a = b. (A direct derivation also works: (√a − √b)² ≥ 0 expands to a + b ≥ 2√(ab), with equality iff a = b.)
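The two vectors in this derivation can be built explicitly and the chain of inequalities checked on sample values (a small illustrative sketch with a floating-point tolerance):

```python
# Check of the two-point AM-GM derivation: with u = (sqrt(a), sqrt(b)) and
# v = (sqrt(b), sqrt(a)) one gets <u,v> = 2 sqrt(ab) and ||u|| ||v|| = a + b,
# so Cauchy-Schwarz yields 2 sqrt(ab) <= a + b.
import math

def amgm_via_cs(a, b):
    u = (math.sqrt(a), math.sqrt(b))
    v = (math.sqrt(b), math.sqrt(a))
    inner = u[0] * v[0] + u[1] * v[1]            # = 2 sqrt(ab)
    norm_prod = math.hypot(*u) * math.hypot(*v)  # = a + b
    assert inner <= norm_prod + 1e-12            # Cauchy-Schwarz
    return math.sqrt(a * b) <= (a + b) / 2 + 1e-12

for a, b in [(1, 9), (4, 4), (0, 7), (2.5, 0.1)]:
    assert amgm_via_cs(a, b)
print("AM-GM via Cauchy-Schwarz verified on samples")
```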
Bridge. Cauchy-Schwarz is the load-bearing instance of a wider family of inequalities, and the discriminant-of-a-non-negative-quadratic argument used in its proof recurs across the analysis strand. A first generalisation is the AM-GM inequality in n variables: for non-negative reals a₁, …, aₙ, (a₁ ⋯ aₙ)^(1/n) ≤ (a₁ + ⋯ + aₙ)/n, with equality iff all aᵢ coincide. This sharpens AM-GM at two points to an inequality on an arbitrarily long list and is a consequence of the concavity of the logarithm together with Jensen's inequality (1906). A second is Hölder's inequality (Hölder 1889): for conjugate exponents p, q > 1 with 1/p + 1/q = 1 and sequences a, b, Σ|aᵢbᵢ| ≤ (Σ|aᵢ|^p)^(1/p) (Σ|bᵢ|^q)^(1/q). The case p = q = 2 recovers Cauchy-Schwarz; the limiting case p = 1, q = ∞ is the elementary Σ|aᵢbᵢ| ≤ (supᵢ |bᵢ|) Σ|aᵢ|. A third is Minkowski's inequality (Minkowski 1896): ‖a + b‖_p ≤ ‖a‖_p + ‖b‖_p, the triangle inequality on the ℓ^p norm. A fourth is Jensen's inequality: for a convex function φ and a probability measure μ, φ(∫ f dμ) ≤ ∫ φ(f) dμ, the inequality from which AM-GM, Hölder, and Cauchy-Schwarz can all be derived as special cases.
Putting these together, the foundational role of Cauchy-Schwarz is twofold. It supplies the angle in an inner-product space: defining cos θ = ⟨u, v⟩ / (‖u‖ ‖v‖) produces a number in [−1, 1] exactly because |⟨u, v⟩| ≤ ‖u‖ ‖v‖, and this is the reason an inner-product space carries a geometry of angles in the first place. It also supplies the triangle inequality on the inner-product norm, hence the metric on the space, hence the topology on which all of analysis on inner-product spaces is built — the metric-space machinery whose triangle inequality on ℝ 00.01.02 is its one-dimensional case. The discriminant argument from the quadratic-formula unit thereby propagates from a fact about polynomials in one real variable to the defining geometric inequality of every Hilbert and Banach space in the analysis strand. The narrower lesson — that the rules for manipulating inequalities are constrained by the sign of the multiplier — generalises, in real algebraic geometry, to the Tarski-Seidenberg theorem (1948): the first-order theory of the real numbers is decidable, and the sets defined by systems of polynomial inequalities (the semi-algebraic sets) form a class closed under polynomial maps and Boolean operations.
Exercises [Intermediate+]
Lean formalization [Intermediate+]
The named statements compile against Mathlib's Algebra.Order.Ring and Analysis.InnerProductSpace.Basic. Mathlib supplies the sign-flip rule via mul_lt_mul_of_neg_left, the sign-product characterisation via mul_neg_iff, and the Cauchy-Schwarz inequality on a real inner-product space via inner_mul_le_norm_mul_norm. The human reviewer named in the frontmatter signs off on the coverage claim.
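As a sketch of what the elementary statements look like in Lean — assuming a recent Mathlib, and with the caveat that exact lemma names, import paths, and signatures drift between versions — the order lemmas can be invoked directly:

```lean
-- Sketch only: lemma names follow the ones cited above; exact import
-- paths and signatures may differ between Mathlib versions.
import Mathlib.Algebra.Order.Ring.Lemmas
import Mathlib.Analysis.InnerProductSpace.Basic

-- Sign-flip rule: multiplying a < b through by a negative c reverses
-- the order.
example (a b c : ℝ) (hab : a < b) (hc : c < 0) : c * b < c * a :=
  mul_lt_mul_of_neg_left hab hc

-- Sign-product fact used in the sign-analysis method: a positive factor
-- times a negative factor is negative.
example (x : ℝ) (h1 : 0 < x - 2) (h2 : x - 3 < 0) : (x - 2) * (x - 3) < 0 :=
  mul_neg_of_pos_of_neg h1 h2
```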
Advanced results [Master]
The Cauchy-Schwarz inequality at the Intermediate tier is the simplest non-degenerate instance of a hierarchy of inequalities that organises classical analysis and its applications. The generalisations go in coordinated directions: from finite sums to integrals, from squares to arbitrary powers, from inner-product spaces to L^p spaces, and from real-variable inequalities to probabilistic and geometric ones.
Hölder's inequality. For exponents p, q > 1 with 1/p + 1/q = 1 — conjugate exponents — and any real or complex sequences a, b (or measurable functions on a measure space),
Σ|aᵢbᵢ| ≤ (Σ|aᵢ|^p)^(1/p) (Σ|bᵢ|^q)^(1/q),
with equality iff |a|^p and |b|^q are proportional sequences [Hardy-Littlewood-Pólya — Inequalities Ch. 6]. The case p = q = 2 is Cauchy-Schwarz. The standard proof uses Young's inequality xy ≤ x^p/p + y^q/q for x, y ≥ 0 and conjugate exponents p, q, itself a consequence of the concavity of the logarithm: log(x^p/p + y^q/q) ≥ (1/p) log(x^p) + (1/q) log(y^q) = log(xy), where the inequality is the concavity inequality for log with weights 1/p, 1/q summing to 1. Hölder's inequality first appears in Hölder's 1889 Über einen Mittelwertssatz in the context of mean-value comparisons and is the load-bearing tool in the duality theory of L^p spaces: the topological dual of L^p is L^q via the pairing (f, g) ↦ ∫ fg exactly because Hölder's inequality bounds this pairing by ‖f‖_p ‖g‖_q.
Minkowski's inequality. For p ≥ 1 and any real or complex sequences a, b,
(Σ|aᵢ + bᵢ|^p)^(1/p) ≤ (Σ|aᵢ|^p)^(1/p) + (Σ|bᵢ|^p)^(1/p),
with equality iff a and b are non-negative proportional sequences (or one is zero) [Hardy-Littlewood-Pólya — Inequalities Ch. 6]. The case p = 1 is the ordinary triangle inequality applied term-by-term; the case p = 2 is the triangle inequality on the Euclidean norm; the case p = ∞ is the analogous statement supᵢ|aᵢ + bᵢ| ≤ supᵢ|aᵢ| + supᵢ|bᵢ|. Minkowski first published this in Geometrie der Zahlen (1896) in the context of the geometry of convex bodies in ℝⁿ. The inequality supplies the triangle inequality for the ℓ^p and L^p norms — without Minkowski's inequality, ‖·‖_p would not be a norm, and the whole edifice of L^p spaces would collapse to a quasi-normed setting.
Jensen's inequality. For a convex function φ on an interval I, a probability space (Ω, ℱ, μ), and a random variable X with values in I and E|X| < ∞,
φ(E[X]) ≤ E[φ(X)].
For a strictly convex φ equality holds iff X is almost-surely constant [Hardy-Littlewood-Pólya — Inequalities Ch. 3]. The discrete case φ(Σ wᵢxᵢ) ≤ Σ wᵢφ(xᵢ) for non-negative weights wᵢ summing to 1 is the convexity inequality. Specialising to φ = −log on (0, ∞) recovers the AM-GM inequality with weights: −log(Σ wᵢxᵢ) ≤ −Σ wᵢ log xᵢ, equivalently Π xᵢ^(wᵢ) ≤ Σ wᵢxᵢ. Specialising further to equal weights wᵢ = 1/n recovers the standard AM-GM. Jensen's 1906 paper was the formal definition of convex function and its inequality together with applications to mean-value comparisons; the probabilistic restatement was made explicit later by Khintchine and Kolmogorov.
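The discrete Jensen inequality and its AM-GM specialisation are easy to check on a concrete weight/value list; the sketch below (illustrative values, not from the text) computes the Jensen gap for two convex functions:

```python
# Discrete Jensen check: for a convex phi and weights w_i >= 0 summing
# to 1, phi(sum w_i x_i) <= sum w_i phi(x_i), i.e. the gap is >= 0.
# Specialising phi = -log gives the weighted AM-GM.
import math

def jensen_gap(phi, weights, xs):
    mean = sum(w * x for w, x in zip(weights, xs))
    return sum(w * phi(x) for w, x in zip(weights, xs)) - phi(mean)

ws = [0.2, 0.3, 0.5]
xs = [1.0, 4.0, 9.0]
assert jensen_gap(lambda t: t * t, ws, xs) >= 0          # x^2 is convex
assert jensen_gap(lambda t: -math.log(t), ws, xs) >= 0   # -log is convex

# The -log case is weighted AM-GM: prod x_i^{w_i} <= sum w_i x_i.
geom = math.prod(x ** w for w, x in zip(ws, xs))
arith = sum(w * x for w, x in zip(ws, xs))
assert geom <= arith
print(round(geom, 4), round(arith, 4))
```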
The isoperimetric inequality. Among all bounded regions in the plane of fixed area A, the disk has the minimum perimeter — equivalently L² ≥ 4πA for the perimeter L of any sufficiently regular plane region, with equality iff the region is a disk. The inequality dates to Greek antiquity (folklore claim, with rigorous proofs by Steiner 1838 and Schwarz 1884) and has profound generalisations: in ℝⁿ, the ball minimises surface area among regions of fixed volume; on a Riemannian manifold with a positive lower bound on the Ricci curvature, the Lévy-Gromov isoperimetric inequality gives a sharp comparison with the sphere of the same dimension. The inequality is the prototype of geometric inequalities — those linking a region's content to its boundary content — and supplies the sharp constants in many functional inequalities on manifolds (Sobolev, Poincaré, log-Sobolev).
Concentration of measure. A modern descendant of the classical inequalities is the family of concentration inequalities. Markov's inequality P(X ≥ t) ≤ E[X]/t for non-negative X and t > 0 is the elementary case; Chebyshev's inequality P(|X − E[X]| ≥ t) ≤ Var(X)/t² is the second-moment version, an immediate consequence of Markov applied to (X − E[X])². Sharper bounds for sums of independent random variables (Hoeffding 1963, Bernstein 1924, Bennett 1962) give exponential rather than polynomial tail decay, and the geometric viewpoint (Talagrand 1995) generalises this to functions of independent random variables satisfying a Lipschitz condition. The unifying perspective is that an inequality on the expectation of a non-negative function of X translates into a tail bound on X.
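Markov and Chebyshev can be verified exactly on a finite distribution, where the tail probabilities are finite sums; the sketch below uses an illustrative five-point distribution:

```python
# Markov and Chebyshev on a small finite distribution: tail probabilities
# are computed exactly and compared against the bounds E[X]/t and
# Var(X)/t^2.
values = [0, 1, 2, 3, 10]
probs  = [0.3, 0.3, 0.2, 0.1, 0.1]

mean = sum(p * v for p, v in zip(probs, values))
var = sum(p * (v - mean) ** 2 for p, v in zip(probs, values))

def tail(t):             # P(X >= t)
    return sum(p for p, v in zip(probs, values) if v >= t)

def tail_dev(t):         # P(|X - mean| >= t)
    return sum(p for p, v in zip(probs, values) if abs(v - mean) >= t)

for t in [1, 2, 5]:
    assert tail(t) <= mean / t            # Markov (X is non-negative)
    assert tail_dev(t) <= var / t ** 2    # Chebyshev
print("Markov/Chebyshev bounds verified; mean =", mean)
```

The heavy point at 10 makes the bounds loose but never violated, which is the general picture: Markov and Chebyshev trade sharpness for complete generality.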
Tarski-Seidenberg and polynomial inequalities. A semi-algebraic subset of ℝⁿ is one defined by a finite Boolean combination of polynomial inequalities p(x₁, …, xₙ) > 0 (or ≥ 0) with p ∈ ℝ[x₁, …, xₙ]. The class of semi-algebraic sets is closed under finite union, finite intersection, complement, and projection onto a coordinate subspace [Tarski — A Decision Method for Elementary Algebra and Geometry]. The substantive closure under projection — equivalent to quantifier elimination in the first-order theory of ℝ as a real-closed field — was proved by Tarski in unpublished work in 1948 and independently by Seidenberg in 1954, with an effective algorithm due to Tarski. As a consequence, the elementary theory of ℝ is decidable: there is an algorithm that, given any first-order sentence in the language of ordered fields built from polynomial inequalities, decides whether it is true of ℝ. The complexity is doubly-exponential in the worst case (Davenport-Heintz 1988), and an improved algorithm due to Collins 1975 (cylindrical algebraic decomposition) achieves the same decidability with practically computable bounds on moderately-sized inputs. Polynomial inequalities, the unit's central object at the Intermediate tier, thereby sit at the foundation of real algebraic geometry and computational real algebra.
Synthesis. The bridge between elementary inequalities on ℝ and the broader theory of inequalities is the recognition that an inequality is a statement about an ordered algebraic structure, and that the manipulations preserving the inequality are constrained by the interaction of order and arithmetic. Once one has the order axioms, the discriminant-of-a-non-negative-quadratic argument runs in any linearly ordered field; once one has an inner product, the same argument gives Cauchy-Schwarz; once one has convexity, the same argument gives Jensen. The named inequalities — Cauchy-Schwarz, Hölder, Minkowski, Jensen, AM-GM, Young — form a family whose members are interderivable in pairs and which collectively encode the quantitative content of analysis: where equations fix points and equalities determine values, inequalities supply the slack within which limits, convergence, approximation, and error estimates take place. Without inequalities there is no ε-δ, no Banach contraction, no Hilbert-space angle, no L^p norm; with them, analysis acquires its characteristic apparatus of bounds and estimates.
This unit identifies the basic linear and quadratic inequalities on ℝ as the prototype ordered-field manipulations, the sign-analysis method as the prototype technique for solving polynomial inequalities, Cauchy-Schwarz as the prototype inner-product-space inequality and the prototype consequence of the discriminant trichotomy for non-negative quadratics, AM-GM as the prototype mean-comparison inequality and the simplest consequence of the concavity of the logarithm, and the triangle inequality as the prototype norm inequality. Each of these prototype roles motivates a downstream generalisation that runs through the analysis, probability, geometry, and logic strands of the curriculum.
Full proof set [Master]
Proposition (sign-flip rule for inequalities). Let F be a linearly ordered field and let a, b, c ∈ F with a < b and c < 0. Then cb < ca.
Proof. From c < 0 and the order axioms, 0 < −c. From a < b and order compatibility with multiplication by positive elements, (−c)a < (−c)b, that is, −ca < −cb. Adding ca + cb to both sides yields cb < ca.
Proposition (quadratic inequality by sign analysis). Let a, b, c ∈ ℝ with a ≠ 0 and let Δ = b² − 4ac. The set {x ∈ ℝ : ax² + bx + c < 0} is: (i) the open interval (r₁, r₂) when a > 0 and Δ > 0, where r₁ < r₂ are the two real roots; (ii) the complement (−∞, r₁) ∪ (r₂, ∞) when a < 0 and Δ > 0; (iii) empty when a > 0 and Δ ≤ 0; (iv) ℝ \ {r} when a < 0 and Δ = 0, where r is the repeated root; (v) ℝ when a < 0 and Δ < 0.
Proof. Factor ax² + bx + c = a(x − r₁)(x − r₂) when Δ ≥ 0 (with r₁ = r₂ in the repeated-root case), where r₁ and r₂ are real by the quadratic formula 00.03.02. When Δ < 0 the factorisation is over ℂ but the polynomial has constant sign on ℝ, equal to the sign of a, since ax² + bx + c = a[(x + b/(2a))² + (4ac − b²)/(4a²)] and the bracketed factor is positive everywhere by the non-negativity of the square and the positivity of (4ac − b²)/(4a²).
Case a > 0, Δ > 0: with r₁ < r₂, the product (x − r₁)(x − r₂) is positive when x < r₁ (both factors negative), negative when r₁ < x < r₂ (opposite signs), and positive when x > r₂ (both positive). Multiplication by a > 0 preserves signs, so the inequality ax² + bx + c < 0 has solution (r₁, r₂).
Case a < 0, Δ > 0: multiplication by a < 0 reverses signs, so the inequality holds outside [r₁, r₂], giving (−∞, r₁) ∪ (r₂, ∞).
Cases Δ ≤ 0: the polynomial has constant sign equal to the sign of a on ℝ \ {r} when Δ = 0 and on all of ℝ when Δ < 0. The solution sets follow.
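The five-way case analysis in the proposition translates directly into code; the sketch below (an illustrative implementation) returns a symbolic description of the solution set of ax² + bx + c < 0:

```python
# Direct implementation of the case analysis in the proposition: solve
# a x^2 + b x + c < 0 over R, returning a description of the solution set.
import math

def solve_quadratic_lt(a, b, c):
    assert a != 0
    disc = b * b - 4 * a * c
    if disc > 0:
        r1 = (-b - math.sqrt(disc)) / (2 * a)
        r2 = (-b + math.sqrt(disc)) / (2 * a)
        r1, r2 = min(r1, r2), max(r1, r2)
        if a > 0:
            return ("interval", r1, r2)          # (r1, r2)
        return ("complement", r1, r2)            # (-inf, r1) u (r2, inf)
    if disc == 0:
        r = -b / (2 * a)
        return ("empty",) if a > 0 else ("all_but", r)  # R \ {r}
    return ("empty",) if a > 0 else ("all",)

print(solve_quadratic_lt(1, -5, 6))   # x^2-5x+6 < 0 -> ('interval', 2.0, 3.0)
print(solve_quadratic_lt(-1, 0, -1))  # -x^2-1 < 0   -> ('all',)
print(solve_quadratic_lt(1, 0, 1))    # x^2+1 < 0    -> ('empty',)
```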
Proposition (Cauchy-Schwarz in ℝⁿ). For real vectors u = (u₁, …, uₙ) and v = (v₁, …, vₙ), (Σ uᵢvᵢ)² ≤ (Σ uᵢ²)(Σ vᵢ²), with equality iff u and v are linearly dependent.
Proof. Apply the inner-product-space proof from the Key theorem section to the standard inner product ⟨u, v⟩ = Σ uᵢvᵢ on ℝⁿ.
Proposition (AM-GM at n points, weighted form). For non-negative reals x₁, …, xₙ and positive weights w₁, …, wₙ with Σ wᵢ = 1, x₁^(w₁) ⋯ xₙ^(wₙ) ≤ Σ wᵢxᵢ, with equality iff all xᵢ are equal.
Proof. If any xᵢ = 0 the left side is zero and the right side is non-negative, with equality iff the right side is also zero, i.e. all xᵢ are zero (and the equality case condition is satisfied). Otherwise, apply Jensen's inequality to the strictly concave function log: log(Σ wᵢxᵢ) ≥ Σ wᵢ log xᵢ = log(x₁^(w₁) ⋯ xₙ^(wₙ)). Exponentiating gives Σ wᵢxᵢ ≥ x₁^(w₁) ⋯ xₙ^(wₙ). Equality in Jensen requires the xᵢ to be almost-surely constant — here, all equal.
Proposition (Young's inequality). For x, y ≥ 0 and conjugate exponents p, q > 1 with 1/p + 1/q = 1, xy ≤ x^p/p + y^q/q, with equality iff x^p = y^q.
Proof. If x = 0 or y = 0 the left side is zero and the right side is non-negative, so the inequality holds (with equality iff the corresponding x^p or y^q is also zero). Otherwise apply the weighted AM-GM with n = 2, weights 1/p and 1/q, and arguments x^p and y^q: applied to the powers this gives (x^p)^(1/p) (y^q)^(1/q) = xy, while the convex combination is x^p/p + y^q/q. The equality case from weighted AM-GM forces x^p = y^q.
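Both the inequality and its equality case x^p = y^q can be checked on a grid; the sketch below (illustrative sample values) computes the Young gap x^p/p + y^q/q − xy:

```python
# Check of Young's inequality xy <= x^p/p + y^q/q for conjugate exponents,
# together with the equality case x^p = y^q.
def conj(p):
    return p / (p - 1)   # the conjugate exponent q with 1/p + 1/q = 1

def young_gap(x, y, p):
    q = conj(p)
    return x ** p / p + y ** q / q - x * y

for p in [1.5, 2, 3]:
    for x in [0.0, 0.5, 1.0, 2.0]:
        for y in [0.0, 0.7, 1.3, 3.0]:
            assert young_gap(x, y, p) >= -1e-12

# Equality case: choose y so that y^q = x^p, i.e. y = x^(p-1).
p, x = 3, 1.7
y = x ** (p - 1)
assert abs(young_gap(x, y, p)) < 1e-9
print("Young's inequality verified on the sample grid")
```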
Proposition (Hölder's inequality). For real or complex sequences a, b and conjugate exponents p, q, Σ|aᵢbᵢ| ≤ (Σ|aᵢ|^p)^(1/p) (Σ|bᵢ|^q)^(1/q).
Proof sketch. Reduce to the case where ‖a‖_p and ‖b‖_q are both nonzero (otherwise the inequality is immediate). Rescale to ‖a‖_p = ‖b‖_q = 1. Apply Young's inequality term-by-term to |aᵢ| |bᵢ| and sum: Σ|aᵢbᵢ| ≤ Σ(|aᵢ|^p/p + |bᵢ|^q/q) = 1/p + 1/q = 1, the desired conclusion under the normalisation.
Proposition (Minkowski's inequality). For p ≥ 1 and real or complex sequences, (Σ|aᵢ + bᵢ|^p)^(1/p) ≤ (Σ|aᵢ|^p)^(1/p) + (Σ|bᵢ|^p)^(1/p).
Proof sketch. The case p = 1 is the term-by-term triangle inequality summed. For p > 1, write |aᵢ + bᵢ|^p ≤ (|aᵢ| + |bᵢ|) |aᵢ + bᵢ|^(p−1). Sum and apply Hölder's inequality to each of the two resulting sums with exponents p and q = p/(p − 1). Rearranging gives Minkowski.
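Both propositions are straightforward to verify numerically on sample sequences; the sketch below (illustrative data, with a floating-point tolerance) checks Hölder and Minkowski for several exponents:

```python
# Numeric check of Holder and Minkowski on sample sequences: the sums are
# computed directly and compared against the p-norm bounds.
def p_norm(a, p):
    return sum(abs(x) ** p for x in a) ** (1 / p)

a = [1.0, -2.0, 3.0, 0.5]
b = [2.0, 1.0, -1.0, 4.0]

for p in [1.5, 2, 3]:
    q = p / (p - 1)   # conjugate exponent
    # Holder: sum |a_i b_i| <= ||a||_p ||b||_q
    lhs = sum(abs(x * y) for x, y in zip(a, b))
    assert lhs <= p_norm(a, p) * p_norm(b, q) + 1e-9
    # Minkowski: ||a + b||_p <= ||a||_p + ||b||_p
    s = [x + y for x, y in zip(a, b)]
    assert p_norm(s, p) <= p_norm(a, p) + p_norm(b, p) + 1e-9
print("Holder and Minkowski verified on samples")
```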
Connections [Master]
The triangle inequality on ℝ, proved as the key theorem of 00.01.02 (absolute value and the triangle inequality), is the one-dimensional case of the Cauchy-Schwarz-derived triangle inequality on a real inner-product space established here. Both rest on the same underlying observation that the squared norm of a sum bounds the sum's size by the norms of its summands. The generalisation runs through metric-space theory (the triangle inequality is one of the three axioms of a metric in metric-spaces.metric-space), the L^p-norm theory (Minkowski's inequality supplies the triangle inequality for ‖·‖_p, hence the metric structure, in functional-analysis.lp-spaces), and the inner-product-space angle theory (the quantity ⟨u, v⟩/(‖u‖ ‖v‖) lies in [−1, 1] by Cauchy-Schwarz, giving the geometry of functional-analysis.hilbert-space).
The discriminant trichotomy from 00.03.02 (quadratic equations and the quadratic formula) is the load-bearing tool in the proof of Cauchy-Schwarz here: the non-negativity of the quadratic t ↦ ‖u + tv‖² is converted into the discriminant inequality 4⟨u, v⟩² − 4‖u‖²‖v‖² ≤ 0, which rearranges to |⟨u, v⟩| ≤ ‖u‖ ‖v‖. The same pattern — express a non-negativity statement as a discriminant inequality, then rearrange — recurs in the proof of the operator-norm inequality in functional-analysis.bounded-linear-operators (where a discriminant argument extracts the operator-norm bound on ‖Tx‖ in terms of ‖x‖), in the proof of the Hermitian-form positivity criterion in linalg.bilinear-quadratic-form, and in the proof of the Bessel and Parseval inequalities for orthonormal bases in Hilbert spaces.
The AM-GM inequality at two points, proved here as a corollary of Cauchy-Schwarz in ℝ², is the simplest mean-comparison inequality and the prototype of the entire family ordering the harmonic, geometric, arithmetic, and quadratic means of non-negative reals. The general AM-GM at n points and its weighted form follow from Jensen's inequality applied to the concave logarithm, as proved in the Full proof set. The mean-comparison hierarchy appears later in convex-analysis units (convex-analysis.jensen), in information-theory units (the entropy inequality via Jensen on −log), and in concentration-of-measure units in probability.concentration.
The sign-analysis method for polynomial inequalities is the prototype technique for semi-algebraic set computations, which generalise via the Tarski-Seidenberg theorem (1948) to the entire first-order theory of ℝ. The same method underlies real-algebraic-geometry algorithms (cylindrical algebraic decomposition, Collins 1975), the computer-algebra implementation of decision procedures for real polynomial systems (logic.decidability-of-real-closed-fields), and the o-minimal structures framework (van den Dries) that organises tame geometry. At the elementary level the method handles a quadratic with two real roots; at the research level the same idea handles arbitrary finite Boolean combinations of polynomial inequalities in many variables.
The proof of Cauchy-Schwarz via non-negative quadratics is the discrete one-variable shadow of a recurring move in functional analysis: extracting an operator estimate from the non-negativity of a quadratic form. In functional-analysis.hilbert-space this becomes the Bessel inequality and the existence of orthogonal projections; in functional-analysis.bounded-linear-operators it becomes the operator norm; in pde.energy-estimates it becomes the Friedrichs and Poincaré inequalities controlling Sobolev norms by their derivatives. The shared anatomy is: a non-negative quadratic q(t) = αt² + βt + γ with α > 0 has β² − 4αγ ≤ 0, and the rearrangement is the desired inequality.
Historical & philosophical context [Master]
The systematic use of inequalities as objects of study, separate from equations, is largely a development of the eighteenth and nineteenth centuries. Augustin-Louis Cauchy, in his 1821 Cours d'analyse de l'École Royale Polytechnique, proved what is now called the Cauchy inequality on finite sums of real numbers: (Σ aᵢbᵢ)² ≤ (Σ aᵢ²)(Σ bᵢ²) [Cauchy — Cours d'analyse]. Cauchy's setting was finite sequences with a view to bilinear estimates in analysis; the inequality appears in Note II (the "Notes" appendix) and is proved by writing the difference of the two sides as a sum of squares Σ_{i<j} (aᵢbⱼ − aⱼbᵢ)², a derivation independent of the discriminant argument used in the modern proof. The 1821 Cours d'analyse is also the textbook in which Cauchy introduced the ε-δ definition of limit and continuity, anchoring the analysis programme on careful inequality estimates rather than infinitesimals.
Hermann Amandus Schwarz, in his 1885 paper Über ein die Flächen kleinsten Flächeninhalts betreffendes Problem der Variationsrechnung (On a problem of the calculus of variations concerning surfaces of least area), generalised Cauchy's inequality from finite sums to the integral inner product on a function space, in the context of his work on minimal surfaces and the Dirichlet problem [Schwarz 1885]. Schwarz's inequality reads (∫ fg)² ≤ (∫ f²)(∫ g²) and is the inner-product-space form that Hilbert later abstracted to arbitrary inner-product spaces in his 1906 lectures on integral equations. The name Cauchy-Schwarz inequality fuses the two contributions; the older name Cauchy-Bunyakovsky-Schwarz additionally credits the Russian mathematician Viktor Bunyakovsky, who proved the integral form independently of Schwarz in 1859.
Otto Hölder, in his 1889 Über einen Mittelwertssatz (On a mean-value theorem), introduced the inequality bearing his name and the notion of conjugate exponents p, q with 1/p + 1/q = 1 [Hölder 1889]. The motivating problem was the comparison of p-th-power means with arithmetic means; the inequality is the load-bearing tool. Hermann Minkowski, in his 1896 Geometrie der Zahlen (Geometry of Numbers), proved the triangle inequality in ℝⁿ that bears his name, in the context of the geometric study of lattices and convex bodies in ℝⁿ [Minkowski 1896]. Johan Jensen, in his 1906 Sur les fonctions convexes et les inégalités entre les valeurs moyennes (On convex functions and inequalities between mean values), gave the first formal definition of a convex function and proved the inequality φ(Σ wᵢxᵢ) ≤ Σ wᵢφ(xᵢ), generalising AM-GM and unifying the earlier inequalities under a single convex-function principle [Jensen 1906]. Alfred Tarski, in his 1948 RAND-published A Decision Method for Elementary Algebra and Geometry, established quantifier elimination for the first-order theory of real-closed fields, identifying the semi-algebraic sets as the definable subsets of ℝⁿ in this theory and giving an algorithmic decision procedure for any statement built from polynomial inequalities [Tarski 1948]. The cumulative effect, by the mid-twentieth century, was the recognition of inequalities as a first-class subject of mathematics: Hardy-Littlewood-Pólya's 1934 monograph Inequalities was the first systematic treatise on the subject and remains a standard reference.