Inequalities (linear and quadratic)
Anchor (Master): Cauchy 1821 Cours d'analyse; Schwarz 1885; Hölder 1889; Minkowski 1896; Jensen 1906; Hardy-Littlewood-Pólya Inequalities 1934; Tarski 1948 A Decision Method for Elementary Algebra and Geometry; Lang Algebra Ch. XV
Intuition [Beginner]
An inequality is what you get when you replace the equals sign in an equation with one of <, >, ≤, or ≥. The equation x + 1 = 3 asks for the single number x = 2. The inequality x + 1 > 3 asks for every number x that makes x + 1 bigger than 3. The answer is not a point but a region of the number line: all numbers larger than 2. Inequalities describe ranges and bounds — the natural language of "at least", "at most", "near", and "within".
You solve an inequality with the same moves as an equation: add the same thing to both sides, subtract the same thing, multiply both sides by the same number. There is one twist. If you multiply or divide both sides by a negative number, the inequality flips direction. The reason is the orientation of the number line: negation reflects the line through zero, swapping which side a number sits on. So −2x < 6 becomes x > −3 after dividing by −2, not x < −3.
The picture is a number line marked with shaded regions. A linear inequality such as x > 3 shades a half-line — everything to the right of 3. A quadratic inequality such as x² − 5x + 6 < 0 shades a bounded interval (2, 3). The same problem can also have the outside of an interval as its answer, or the whole line, or nothing at all. The shape of the shaded region depends on the leading sign and on how many places the corresponding parabola crosses the axis.
Visual [Beginner]
Picture the number line with the two roots x = 2 and x = 3 of the equation x² − 5x + 6 = 0 marked as open circles. The two roots split the line into three regions: the part to the left of 2, the part between 2 and 3, and the part to the right of 3. Above the middle region a label reads x² − 5x + 6 < 0; above the outer two regions a label reads x² − 5x + 6 > 0. The middle region is shaded.
The picture explains the method. The product (x − 2)(x − 3) is negative exactly when the two factors have opposite signs. To the left of 2 both factors are negative, so the product is positive. Between the roots one factor is positive and the other negative, so the product is negative — that is the answer to the inequality x² − 5x + 6 < 0. To the right of 3 both factors are positive, so the product is positive again. Two open circles at the roots record that the strict inequality excludes the boundary, where the product equals zero.
Worked example [Beginner]
Solve 3x + 1 > 10. Subtract 1 from both sides: 3x > 9. Divide both sides by 3, a positive number, so the inequality does not flip: x > 3. The solution is every real number larger than 3, the half-line (3, ∞). Check: at x = 4 the left side is 13 and the right side is 10, so 13 > 10 holds; at x = 2 the left side is 7 and 7 > 10 fails, as expected since 2 lies outside the solution region.
Solve 5 − 2x ≥ 1. Subtract 5 from both sides: −2x ≥ −4. Divide both sides by −2, a negative number, so the inequality flips: x ≤ 2. The solution is the half-line (−∞, 2], closed at 2 because the inequality is non-strict. Check: at x = 0 the left side is 5 and the right side is 1, so 5 ≥ 1 holds; at x = 3 the left side is −1 and −1 ≥ 1 fails.
Solve x² − 5x + 6 < 0 by factoring and sign analysis. Factor: x² − 5x + 6 = (x − 2)(x − 3). The roots are x = 2 and x = 3. A product of two real numbers is negative when one factor is positive and the other is negative. The factor x − 2 is negative when x < 2 and positive when x > 2. The factor x − 3 is negative when x < 3 and positive when x > 3. The two factors have opposite signs exactly between the two roots, 2 < x < 3. The solution is the open interval (2, 3).
What this tells us: linear inequalities give half-lines, quadratic inequalities give intervals or their complements, and the sign of a product is read off from the signs of its factors. The number-line picture records every case in a single diagram.
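The solution sets described above can be spot-checked numerically. The sketch below samples a grid of points and confirms the predicted regions for three representative inequalities of the kinds just solved (the concrete coefficients here are illustrative examples, one per case: linear with a positive divisor, linear with a negative divisor, and a factorable quadratic):

```python
# Numerical spot-check of three representative inequalities against their
# predicted solution sets.

def solves_linear_pos(x):      # 3x + 1 > 10  <=>  x > 3
    return 3 * x + 1 > 10

def solves_linear_neg(x):      # 5 - 2x >= 1  <=>  x <= 2  (sign flips)
    return 5 - 2 * x >= 1

def solves_quadratic(x):       # x^2 - 5x + 6 < 0  <=>  2 < x < 3
    return x * x - 5 * x + 6 < 0

# Sample a grid of points and compare against the predicted solution sets.
xs = [i / 10 for i in range(-100, 101)]   # -10.0 to 10.0 in steps of 0.1
for x in xs:
    assert solves_linear_pos(x) == (x > 3)
    assert solves_linear_neg(x) == (x <= 2)
    assert solves_quadratic(x) == (2 < x < 3)
print("all checks passed")
```

The grid deliberately includes the boundary points themselves, where the strict and non-strict cases differ.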
Check your understanding [Beginner]
Formal definition [Intermediate+]
Let F be a linearly ordered field — F = ℝ for the purposes of this unit — with the order relation < and the induced relations ≤, >, ≥. An inequality in one variable is a relation of the form p(x) < q(x) (or with ≤, >, ≥ in place of <), where p and q are polynomial expressions in x with coefficients in F [Lang — Basic Mathematics Ch. 4]. The solution set is {x ∈ F : p(x) < q(x)}.
A linear inequality in one variable is ax + b < 0 with a, b ∈ ℝ and a ≠ 0. By the order axioms of ℝ — for any a, b, c ∈ ℝ, a < b implies a + c < b + c, and a < b together with 0 < c implies ac < bc — the solution set is the half-line (−∞, −b/a) when a > 0 and the half-line (−b/a, ∞) when a < 0, with analogous closed half-lines for the non-strict variants. The sign of a controls the direction of the half-line, and the same order axiom multiplied through by a positive (respectively negative) scalar preserves (respectively reverses) the inequality.
A quadratic inequality in one variable is ax² + bx + c < 0 (or with ≤, >, ≥) with a, b, c ∈ ℝ and a ≠ 0. Let Δ = b² − 4ac be the discriminant of the underlying quadratic 00.03.02. The solution set depends on the sign of Δ and on the sign of a:
- If Δ > 0 the quadratic factors over ℝ as a(x − r₁)(x − r₂) with two distinct real roots r₁ < r₂. For ax² + bx + c ≤ 0 with a > 0, the solution is the closed interval [r₁, r₂]; with a < 0 it is (−∞, r₁] ∪ [r₂, ∞).
- If Δ = 0 the quadratic factors as a(x − r)² with a single repeated root r = −b/(2a). The strict inequality ax² + bx + c < 0 has solution ∅ when a > 0 and ℝ \ {r} when a < 0.
- If Δ < 0 the quadratic does not factor over ℝ and has constant sign on all of ℝ: positive when a > 0, negative when a < 0. The solution sets of ax² + bx + c < 0 are correspondingly the empty set or the entire line.
The sign-analysis method extends this to a polynomial inequality p(x) < 0 for arbitrary p with k distinct real roots r₁ < ⋯ < rₖ. The roots partition ℝ into k + 1 open intervals; on each interval p has constant sign, determined by the parity of the count of linear factors that are negative on that interval. The solution set is the union of those intervals on which the sign matches the inequality, with the roots included or excluded according to whether the inequality is non-strict or strict.
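The parity rule can be made mechanical. The following sketch (an illustrative implementation, not part of the unit's formal apparatus) reads off the sign of a fully factored polynomial on each of the k + 1 intervals directly from the count of negative factors:

```python
# Sign analysis for p(x) = lead * (x - r1) * ... * (x - rk) with distinct
# real roots: on each of the k+1 open intervals the sign is constant, and
# it is determined by the parity of the number of negative factors there.

def sign_on_intervals(lead, roots):
    """Return the sign of p on each open interval determined by the
    sorted distinct roots, listed left to right."""
    roots = sorted(roots)
    k = len(roots)
    signs = []
    for i in range(k + 1):
        # On the i-th interval, exactly the factors (x - r_j) with j >= i
        # are negative: that is k - i negative factors.
        neg_factors = k - i
        s = 1 if lead > 0 else -1
        if neg_factors % 2 == 1:
            s = -s
        signs.append(s)
    return signs

# p(x) = (x - 2)(x - 3): positive, negative, positive.
print(sign_on_intervals(1, [2, 3]))       # -> [1, -1, 1]
# p(x) = -(x - 1)(x - 2)(x - 4): alternating, starting positive.
print(sign_on_intervals(-1, [1, 2, 4]))   # -> [1, -1, 1, -1]
```

Reading the solution set of p(x) < 0 off the returned list amounts to collecting the intervals whose entry is −1.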
Counterexamples to common slips
- Multiplying both sides of an inequality by an expression of unknown sign is invalid. The step from 1/x < 1 to 1 < x does not hold for all x: when x < 0 the multiplication by x reverses the inequality, and when x = 0 the original expression is undefined. Solve by sign cases on x, or rewrite it as the single rational inequality (1 − x)/x < 0.
- A non-strict quadratic inequality ax² + bx + c ≥ 0 with a > 0 and Δ < 0 never has the empty set as its solution: the quadratic has constant nonzero sign on all of ℝ, so x² + 1 ≥ 0, with a = 1 > 0 and Δ = −4 < 0, is satisfied by every x ∈ ℝ, not by none.
- "Take the square root of both sides" is not a single move on an inequality. From x² < 9 one obtains |x| < 3, which unpacks as −3 < x < 3 — two one-sided conditions, not just x < 3. The same care applies to extracting roots in any inequality involving an even power.
Key theorem with proof [Intermediate+]
Theorem (Cauchy-Schwarz inequality). Let V be a real inner-product space with inner product ⟨·,·⟩ and induced norm ‖v‖ = √⟨v, v⟩. For every u, v ∈ V,
|⟨u, v⟩| ≤ ‖u‖ ‖v‖,
with equality iff u and v are linearly dependent [Rudin — Principles of Mathematical Analysis Ch. 1].
Proof. If v = 0 then ⟨u, v⟩ = 0 and ‖v‖ = 0, so both sides of the inequality are zero and equality holds; also u and v are linearly dependent because the zero vector is a scalar multiple of every vector. Assume from here that v ≠ 0, so ‖v‖² > 0.
For any real t, expand the squared norm of u + tv using bilinearity and symmetry of the inner product:
‖u + tv‖² = ‖v‖² t² + 2⟨u, v⟩ t + ‖u‖².
The left side is a squared norm, hence non-negative for every t. The right side, viewed as a polynomial in t, is therefore a non-negative quadratic with positive leading coefficient ‖v‖² > 0. By the discriminant trichotomy for quadratics with positive leading coefficient 00.03.02, a non-negative real-coefficient quadratic has discriminant Δ ≤ 0. Computing the discriminant of ‖v‖² t² + 2⟨u, v⟩ t + ‖u‖²:
Δ = 4⟨u, v⟩² − 4 ‖u‖² ‖v‖² ≤ 0.
The condition Δ ≤ 0 rearranges to ⟨u, v⟩² ≤ ‖u‖² ‖v‖². Taking square roots, which preserves the inequality since both sides are non-negative, yields |⟨u, v⟩| ≤ ‖u‖ ‖v‖.
The equality case |⟨u, v⟩| = ‖u‖ ‖v‖ holds iff Δ = 0, that is, iff the quadratic in t has a (repeated) real root, iff there exists t with ‖u + tv‖² = 0. Since the norm is positive-definite, this is equivalent to u + tv = 0, the linear dependence of u on v. Conversely, if u and v are linearly dependent and v ≠ 0, then u = λv for some λ ∈ ℝ, and substitution gives |⟨u, v⟩| = |λ| ‖v‖² and ‖u‖ ‖v‖ = |λ| ‖v‖², so equality holds.
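The discriminant argument can be traced numerically in ℝⁿ. The sketch below (illustrative, with a small floating-point tolerance) computes the three coefficients of the quadratic in t, checks that its discriminant is non-positive, and checks that a linearly dependent pair makes the discriminant vanish:

```python
# Trace of the discriminant argument for Cauchy-Schwarz in R^n: the
# quadratic q(t) = ||u + t v||^2 = ||v||^2 t^2 + 2<u,v> t + ||u||^2 is
# non-negative for every t, so its discriminant is <= 0, which rearranges
# to <u,v>^2 <= ||u||^2 ||v||^2.

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def check_cauchy_schwarz(u, v):
    a = dot(v, v)               # leading coefficient ||v||^2
    b = 2 * dot(u, v)           # linear coefficient 2<u,v>
    c = dot(u, u)               # constant term ||u||^2
    disc = b * b - 4 * a * c    # = 4(<u,v>^2 - ||u||^2 ||v||^2)
    assert disc <= 1e-12        # non-negative quadratic => disc <= 0
    return dot(u, v) ** 2 <= dot(u, u) * dot(v, v) + 1e-12

print(check_cauchy_schwarz([1.0, 2.0, 3.0], [4.0, -1.0, 0.5]))  # -> True

# Equality case: linearly dependent vectors make the discriminant zero.
u = [2.0, -4.0, 6.0]; v = [1.0, -2.0, 3.0]   # u = 2v
disc = (2 * dot(u, v)) ** 2 - 4 * dot(v, v) * dot(u, u)
print(disc)  # -> 0.0
```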
Corollary (triangle inequality on inner-product spaces). For every u, v ∈ V, ‖u + v‖ ≤ ‖u‖ + ‖v‖.
Proof of corollary. Expand ‖u + v‖² = ‖u‖² + 2⟨u, v⟩ + ‖v‖² and apply Cauchy-Schwarz to bound the cross term: ‖u + v‖² ≤ ‖u‖² + 2 ‖u‖ ‖v‖ + ‖v‖² = (‖u‖ + ‖v‖)². Take square roots.
Corollary (AM-GM at two points). For non-negative reals a, b: √(ab) ≤ (a + b)/2, with equality iff a = b.
Proof of corollary. Apply Cauchy-Schwarz in ℝ² to u = (√a, √b) and v = (√b, √a): ⟨u, v⟩ = 2√(ab) and ‖u‖ ‖v‖ = a + b. Hence 2√(ab) ≤ a + b, equivalently √(ab) ≤ (a + b)/2. Equality in Cauchy-Schwarz forces linear dependence of (√a, √b) and (√b, √a), which after squaring components forces a = b. (A direct derivation also works: (√a − √b)² ≥ 0 expands to a + b ≥ 2√(ab), with equality iff a = b.)
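The two vectors in this derivation can be built explicitly and the chain of inequalities checked on sample values (a small illustrative sketch with a floating-point tolerance):

```python
# Check of the two-point AM-GM derivation: with u = (sqrt(a), sqrt(b)) and
# v = (sqrt(b), sqrt(a)) one gets <u,v> = 2 sqrt(ab) and ||u|| ||v|| = a + b,
# so Cauchy-Schwarz yields 2 sqrt(ab) <= a + b.
import math

def amgm_via_cs(a, b):
    u = (math.sqrt(a), math.sqrt(b))
    v = (math.sqrt(b), math.sqrt(a))
    inner = u[0] * v[0] + u[1] * v[1]            # = 2 sqrt(ab)
    norm_prod = math.hypot(*u) * math.hypot(*v)  # = a + b
    assert inner <= norm_prod + 1e-12            # Cauchy-Schwarz
    return math.sqrt(a * b) <= (a + b) / 2 + 1e-12

for a, b in [(1, 9), (4, 4), (0, 7), (2.5, 0.1)]:
    assert amgm_via_cs(a, b)
print("AM-GM via Cauchy-Schwarz verified on samples")
```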
Bridge. Cauchy-Schwarz is the load-bearing instance of a wider family of inequalities, and the discriminant-of-a-non-negative-quadratic argument used in its proof recurs across the analysis strand. A first generalisation is the AM-GM inequality in n variables: for non-negative reals a₁, …, aₙ, (a₁ ⋯ aₙ)^(1/n) ≤ (a₁ + ⋯ + aₙ)/n, with equality iff all aᵢ coincide. This sharpens AM-GM at two points to an inequality on an arbitrarily long list and is a consequence of the concavity of the logarithm together with Jensen's inequality (1906). A second is Hölder's inequality (Hölder 1889): for conjugate exponents p, q > 1 with 1/p + 1/q = 1 and sequences a, b, Σ|aᵢbᵢ| ≤ (Σ|aᵢ|^p)^(1/p) (Σ|bᵢ|^q)^(1/q). The case p = q = 2 recovers Cauchy-Schwarz; the limiting case p = 1, q = ∞ is the elementary Σ|aᵢbᵢ| ≤ (supᵢ |bᵢ|) Σ|aᵢ|. A third is Minkowski's inequality (Minkowski 1896): ‖a + b‖_p ≤ ‖a‖_p + ‖b‖_p, the triangle inequality on the ℓ^p norm. A fourth is Jensen's inequality: for a convex function φ and a probability measure μ, φ(∫ f dμ) ≤ ∫ φ(f) dμ, the inequality from which AM-GM, Hölder, and Cauchy-Schwarz can all be derived as special cases.
Putting these together, the foundational role of Cauchy-Schwarz is twofold. It supplies the angle in an inner-product space: defining cos θ = ⟨u, v⟩ / (‖u‖ ‖v‖) produces a number in [−1, 1] exactly because |⟨u, v⟩| ≤ ‖u‖ ‖v‖, and this is the reason an inner-product space carries a geometry of angles in the first place. It also supplies the triangle inequality on the inner-product norm, hence the metric on the space, hence the topology on which all of analysis on inner-product spaces is built — the metric-space machinery whose triangle inequality on ℝ 00.01.02 is its one-dimensional case. The discriminant argument from the quadratic-formula unit thereby propagates from a fact about polynomials in one real variable to the defining geometric inequality of every Hilbert and Banach space in the analysis strand. The narrower lesson — that the rules for manipulating inequalities are constrained by the sign of the multiplier — generalises, in real algebraic geometry, to the Tarski-Seidenberg theorem (1948): the first-order theory of the real numbers is decidable, and the sets defined by systems of polynomial inequalities (the semi-algebraic sets) form a class closed under polynomial maps and Boolean operations.
Exercises [Intermediate+]
Lean formalization [Intermediate+]
The named statements compile against Mathlib's Algebra.Order.Ring and Analysis.InnerProductSpace.Basic. Mathlib supplies the sign-flip rule via mul_lt_mul_of_neg_left, the sign-product characterisation via mul_neg_iff, and the Cauchy-Schwarz inequality on a real inner-product space via inner_mul_le_norm_mul_norm. The human reviewer named in the frontmatter signs off on the coverage claim.
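As a sketch of what the elementary statements look like in Lean — assuming a recent Mathlib, and with the caveat that exact lemma names, import paths, and signatures drift between versions — the order lemmas can be invoked directly:

```lean
-- Sketch only: lemma names follow the ones cited above; exact import
-- paths and signatures may differ between Mathlib versions.
import Mathlib.Algebra.Order.Ring.Lemmas
import Mathlib.Analysis.InnerProductSpace.Basic

-- Sign-flip rule: multiplying a < b through by a negative c reverses
-- the order.
example (a b c : ℝ) (hab : a < b) (hc : c < 0) : c * b < c * a :=
  mul_lt_mul_of_neg_left hab hc

-- Sign-product fact used in the sign-analysis method: a positive factor
-- times a negative factor is negative.
example (x : ℝ) (h1 : 0 < x - 2) (h2 : x - 3 < 0) : (x - 2) * (x - 3) < 0 :=
  mul_neg_of_pos_of_neg h1 h2
```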
Advanced results [Master]
The Cauchy-Schwarz inequality at the Intermediate tier is the simplest non-degenerate instance of a hierarchy of inequalities that organises classical analysis and its applications. The generalisations go in coordinated directions: from finite sums to integrals, from squares to arbitrary powers, from inner-product spaces to L^p spaces, and from real-variable inequalities to probabilistic and geometric ones.
Hölder's inequality. For exponents p, q > 1 with 1/p + 1/q = 1 — conjugate exponents — and any real or complex sequences a, b (or measurable functions on a measure space),
Σ|aᵢbᵢ| ≤ (Σ|aᵢ|^p)^(1/p) (Σ|bᵢ|^q)^(1/q),
with equality iff |a|^p and |b|^q are proportional sequences [Hardy-Littlewood-Pólya — Inequalities Ch. 6]. The case p = q = 2 is Cauchy-Schwarz. The standard proof uses Young's inequality xy ≤ x^p/p + y^q/q for x, y ≥ 0 and conjugate exponents p, q, itself a consequence of the concavity of the logarithm: log(x^p/p + y^q/q) ≥ (1/p) log(x^p) + (1/q) log(y^q) = log(xy), where the inequality is the concavity inequality for log with weights 1/p, 1/q summing to 1. Hölder's inequality first appears in Hölder's 1889 Über einen Mittelwertssatz in the context of mean-value comparisons and is the load-bearing tool in the duality theory of L^p spaces: the topological dual of L^p is L^q via the pairing (f, g) ↦ ∫ fg exactly because Hölder's inequality bounds this pairing by ‖f‖_p ‖g‖_q.
Minkowski's inequality. For p ≥ 1 and any real or complex sequences a, b,
(Σ|aᵢ + bᵢ|^p)^(1/p) ≤ (Σ|aᵢ|^p)^(1/p) + (Σ|bᵢ|^p)^(1/p),
with equality iff a and b are non-negative proportional sequences (or one is zero) [Hardy-Littlewood-Pólya — Inequalities Ch. 6]. The case p = 1 is the ordinary triangle inequality applied term-by-term; the case p = 2 is the triangle inequality on the Euclidean norm; the case p = ∞ is the analogous statement supᵢ|aᵢ + bᵢ| ≤ supᵢ|aᵢ| + supᵢ|bᵢ|. Minkowski first published this in Geometrie der Zahlen (1896) in the context of the geometry of convex bodies in ℝⁿ. The inequality supplies the triangle inequality for the ℓ^p and L^p norms — without Minkowski's inequality, ‖·‖_p would not be a norm, and the whole edifice of L^p spaces would collapse to a quasi-normed setting.
Jensen's inequality. For a convex function φ on an interval I, a probability space (Ω, ℱ, μ), and a random variable X with values in I and E|X| < ∞,
φ(E[X]) ≤ E[φ(X)].
For a strictly convex φ equality holds iff X is almost-surely constant [Hardy-Littlewood-Pólya — Inequalities Ch. 3]. The discrete case φ(Σ wᵢxᵢ) ≤ Σ wᵢφ(xᵢ) for non-negative weights wᵢ summing to 1 is the convexity inequality. Specialising to φ = −log on (0, ∞) recovers the AM-GM inequality with weights: −log(Σ wᵢxᵢ) ≤ −Σ wᵢ log xᵢ, equivalently Π xᵢ^(wᵢ) ≤ Σ wᵢxᵢ. Specialising further to equal weights wᵢ = 1/n recovers the standard AM-GM. Jensen's 1906 paper was the formal definition of convex function and its inequality together with applications to mean-value comparisons; the probabilistic restatement was made explicit later by Khintchine and Kolmogorov.
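The discrete Jensen inequality and its AM-GM specialisation are easy to check on a concrete weight/value list; the sketch below (illustrative values, not from the text) computes the Jensen gap for two convex functions:

```python
# Discrete Jensen check: for a convex phi and weights w_i >= 0 summing
# to 1, phi(sum w_i x_i) <= sum w_i phi(x_i), i.e. the gap is >= 0.
# Specialising phi = -log gives the weighted AM-GM.
import math

def jensen_gap(phi, weights, xs):
    mean = sum(w * x for w, x in zip(weights, xs))
    return sum(w * phi(x) for w, x in zip(weights, xs)) - phi(mean)

ws = [0.2, 0.3, 0.5]
xs = [1.0, 4.0, 9.0]
assert jensen_gap(lambda t: t * t, ws, xs) >= 0          # x^2 is convex
assert jensen_gap(lambda t: -math.log(t), ws, xs) >= 0   # -log is convex

# The -log case is weighted AM-GM: prod x_i^{w_i} <= sum w_i x_i.
geom = math.prod(x ** w for w, x in zip(ws, xs))
arith = sum(w * x for w, x in zip(ws, xs))
assert geom <= arith
print(round(geom, 4), round(arith, 4))
```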
The isoperimetric inequality. Among all bounded regions in the plane of fixed area A, the disk has the minimum perimeter — equivalently L² ≥ 4πA for the perimeter L of any sufficiently regular plane region, with equality iff the region is a disk. The inequality dates to Greek antiquity (folklore claim, with rigorous proofs by Steiner 1838 and Schwarz 1884) and has profound generalisations: in ℝⁿ, the ball minimises surface area among regions of fixed volume; on a Riemannian manifold with a positive lower bound on the Ricci curvature, the Lévy-Gromov isoperimetric inequality gives a sharp comparison with the sphere of the same dimension. The inequality is the prototype of geometric inequalities — those linking a region's content to its boundary content — and supplies the sharp constants in many functional inequalities on manifolds (Sobolev, Poincaré, log-Sobolev).
Concentration of measure. A modern descendant of the classical inequalities is the family of concentration inequalities. Markov's inequality P(X ≥ t) ≤ E[X]/t for non-negative X and t > 0 is the elementary case; Chebyshev's inequality P(|X − E[X]| ≥ t) ≤ Var(X)/t² is the second-moment version, an immediate consequence of Markov applied to (X − E[X])². Sharper bounds for sums of independent random variables (Hoeffding 1963, Bernstein 1924, Bennett 1962) give exponential rather than polynomial tail decay, and the geometric viewpoint (Talagrand 1995) generalises this to functions of independent random variables satisfying a Lipschitz condition. The unifying perspective is that an inequality on the expectation of a non-negative function of X translates into a tail bound on X.
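Markov and Chebyshev can be verified exactly on a finite distribution, where the tail probabilities are finite sums; the sketch below uses an illustrative five-point distribution:

```python
# Markov and Chebyshev on a small finite distribution: tail probabilities
# are computed exactly and compared against the bounds E[X]/t and
# Var(X)/t^2.
values = [0, 1, 2, 3, 10]
probs  = [0.3, 0.3, 0.2, 0.1, 0.1]

mean = sum(p * v for p, v in zip(probs, values))
var = sum(p * (v - mean) ** 2 for p, v in zip(probs, values))

def tail(t):             # P(X >= t)
    return sum(p for p, v in zip(probs, values) if v >= t)

def tail_dev(t):         # P(|X - mean| >= t)
    return sum(p for p, v in zip(probs, values) if abs(v - mean) >= t)

for t in [1, 2, 5]:
    assert tail(t) <= mean / t            # Markov (X is non-negative)
    assert tail_dev(t) <= var / t ** 2    # Chebyshev
print("Markov/Chebyshev bounds verified; mean =", mean)
```

The heavy point at 10 makes the bounds loose but never violated, which is the general picture: Markov and Chebyshev trade sharpness for complete generality.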
Tarski-Seidenberg and polynomial inequalities. A semi-algebraic subset of ℝⁿ is one defined by a finite Boolean combination of polynomial inequalities p(x₁, …, xₙ) > 0 (or ≥ 0) with p ∈ ℝ[x₁, …, xₙ]. The class of semi-algebraic sets is closed under finite union, finite intersection, complement, and projection onto a coordinate subspace [Tarski — A Decision Method for Elementary Algebra and Geometry]. The substantive closure under projection — equivalent to quantifier elimination in the first-order theory of ℝ as a real-closed field — was proved by Tarski in unpublished work in 1948 and independently by Seidenberg in 1954, with an effective algorithm due to Tarski. As a consequence, the elementary theory of ℝ is decidable: there is an algorithm that, given any first-order sentence in the language of ordered fields built from polynomial inequalities, decides whether it is true of ℝ. The complexity is doubly-exponential in the worst case (Davenport-Heintz 1988), and an improved algorithm due to Collins 1975 (cylindrical algebraic decomposition) achieves the same decidability with practically computable bounds on moderately-sized inputs. Polynomial inequalities, the unit's central object at the Intermediate tier, thereby sit at the foundation of real algebraic geometry and computational real algebra.
Synthesis. The bridge between elementary inequalities on ℝ and the broader theory of inequalities is the recognition that an inequality is a statement about an ordered algebraic structure, and that the manipulations preserving the inequality are constrained by the interaction of order and arithmetic. Once one has the order axioms, the discriminant-of-a-non-negative-quadratic argument runs in any linearly ordered field; once one has an inner product, the same argument gives Cauchy-Schwarz; once one has convexity, the same argument gives Jensen. The named inequalities — Cauchy-Schwarz, Hölder, Minkowski, Jensen, AM-GM, Young — form a family whose members are interderivable in pairs and which collectively encode the quantitative content of analysis: where equations fix points and equalities determine values, inequalities supply the slack within which limits, convergence, approximation, and error estimates take place. Without inequalities there is no ε-δ, no Banach contraction, no Hilbert-space angle, no L^p norm; with them, analysis acquires its characteristic apparatus of bounds and estimates.
This unit identifies the basic linear and quadratic inequalities on ℝ as the prototype ordered-field manipulations, the sign-analysis method as the prototype technique for solving polynomial inequalities, Cauchy-Schwarz as the prototype inner-product-space inequality and the prototype consequence of the discriminant trichotomy for non-negative quadratics, AM-GM as the prototype mean-comparison inequality and the simplest consequence of the concavity of the logarithm, and the triangle inequality as the prototype norm inequality. Each of these prototype roles motivates a downstream generalisation that runs through the analysis, probability, geometry, and logic strands of the curriculum.
Full proof set [Master]
Proposition (sign-flip rule for inequalities). Let F be a linearly ordered field and let a, b, c ∈ F with a < b and c < 0. Then cb < ca.
Proof. From c < 0 and the order axioms, 0 < −c. From a < b and order compatibility with multiplication by positive elements, (−c)a < (−c)b, that is, −ca < −cb. Adding ca + cb to both sides yields cb < ca.
Proposition (quadratic inequality by sign analysis). Let a, b, c ∈ ℝ with a ≠ 0 and let Δ = b² − 4ac. The set {x ∈ ℝ : ax² + bx + c < 0} is: (i) the open interval (r₁, r₂) when a > 0 and Δ > 0, where r₁ < r₂ are the two real roots; (ii) the complement (−∞, r₁) ∪ (r₂, ∞) when a < 0 and Δ > 0; (iii) empty when a > 0 and Δ ≤ 0; (iv) ℝ \ {r} when a < 0 and Δ = 0, where r is the repeated root; (v) ℝ when a < 0 and Δ < 0.
Proof. Factor ax² + bx + c = a(x − r₁)(x − r₂) when Δ ≥ 0 (with r₁ = r₂ in the repeated-root case), where r₁ and r₂ are real by the quadratic formula 00.03.02. When Δ < 0 the factorisation is over ℂ but the polynomial has constant sign on ℝ, equal to the sign of a, since ax² + bx + c = a[(x + b/(2a))² + (4ac − b²)/(4a²)] and the bracketed factor is positive everywhere by the non-negativity of the square and the positivity of (4ac − b²)/(4a²).
Case a > 0, Δ > 0: with r₁ < r₂, the product (x − r₁)(x − r₂) is positive when x < r₁ (both factors negative), negative when r₁ < x < r₂ (opposite signs), and positive when x > r₂ (both positive). Multiplication by a > 0 preserves signs, so the inequality ax² + bx + c < 0 has solution (r₁, r₂).
Case a < 0, Δ > 0: multiplication by a < 0 reverses signs, so the inequality holds outside [r₁, r₂], giving (−∞, r₁) ∪ (r₂, ∞).
Cases Δ ≤ 0: the polynomial has constant sign equal to the sign of a on ℝ \ {r} when Δ = 0 and on all of ℝ when Δ < 0. The solution sets follow.
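The five-way case analysis in the proposition translates directly into code; the sketch below (an illustrative implementation) returns a symbolic description of the solution set of ax² + bx + c < 0:

```python
# Direct implementation of the case analysis in the proposition: solve
# a x^2 + b x + c < 0 over R, returning a description of the solution set.
import math

def solve_quadratic_lt(a, b, c):
    assert a != 0
    disc = b * b - 4 * a * c
    if disc > 0:
        r1 = (-b - math.sqrt(disc)) / (2 * a)
        r2 = (-b + math.sqrt(disc)) / (2 * a)
        r1, r2 = min(r1, r2), max(r1, r2)
        if a > 0:
            return ("interval", r1, r2)          # (r1, r2)
        return ("complement", r1, r2)            # (-inf, r1) u (r2, inf)
    if disc == 0:
        r = -b / (2 * a)
        return ("empty",) if a > 0 else ("all_but", r)  # R \ {r}
    return ("empty",) if a > 0 else ("all",)

print(solve_quadratic_lt(1, -5, 6))   # x^2-5x+6 < 0 -> ('interval', 2.0, 3.0)
print(solve_quadratic_lt(-1, 0, -1))  # -x^2-1 < 0   -> ('all',)
print(solve_quadratic_lt(1, 0, 1))    # x^2+1 < 0    -> ('empty',)
```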
Proposition (Cauchy-Schwarz in ℝⁿ). For real vectors u = (u₁, …, uₙ) and v = (v₁, …, vₙ), (Σ uᵢvᵢ)² ≤ (Σ uᵢ²)(Σ vᵢ²), with equality iff u and v are linearly dependent.
Proof. Apply the inner-product-space proof from the Key theorem section to the standard inner product ⟨u, v⟩ = Σ uᵢvᵢ on ℝⁿ.
Proposition (AM-GM at n points, weighted form). For non-negative reals x₁, …, xₙ and positive weights w₁, …, wₙ with Σ wᵢ = 1, x₁^(w₁) ⋯ xₙ^(wₙ) ≤ Σ wᵢxᵢ, with equality iff all xᵢ are equal.
Proof. If any xᵢ = 0 the left side is zero and the right side is non-negative, with equality iff the right side is also zero, i.e. all xᵢ are zero (and the equality case condition is satisfied). Otherwise, apply Jensen's inequality to the strictly concave function log: log(Σ wᵢxᵢ) ≥ Σ wᵢ log xᵢ = log(x₁^(w₁) ⋯ xₙ^(wₙ)). Exponentiating gives Σ wᵢxᵢ ≥ x₁^(w₁) ⋯ xₙ^(wₙ). Equality in Jensen requires the xᵢ to be almost-surely constant — here, all equal.
Proposition (Young's inequality). For x, y ≥ 0 and conjugate exponents p, q > 1 with 1/p + 1/q = 1, xy ≤ x^p/p + y^q/q, with equality iff x^p = y^q.
Proof. If x = 0 or y = 0 the left side is zero and the right side is non-negative, so the inequality holds (with equality iff the corresponding x^p or y^q is also zero). Otherwise apply the weighted AM-GM with n = 2, weights 1/p and 1/q, and arguments x^p and y^q: applied to the powers this gives (x^p)^(1/p) (y^q)^(1/q) = xy, while the convex combination is x^p/p + y^q/q. The equality case from weighted AM-GM forces x^p = y^q.
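Both the inequality and its equality case x^p = y^q can be checked on a grid; the sketch below (illustrative sample values) computes the Young gap x^p/p + y^q/q − xy:

```python
# Check of Young's inequality xy <= x^p/p + y^q/q for conjugate exponents,
# together with the equality case x^p = y^q.
def conj(p):
    return p / (p - 1)   # the conjugate exponent q with 1/p + 1/q = 1

def young_gap(x, y, p):
    q = conj(p)
    return x ** p / p + y ** q / q - x * y

for p in [1.5, 2, 3]:
    for x in [0.0, 0.5, 1.0, 2.0]:
        for y in [0.0, 0.7, 1.3, 3.0]:
            assert young_gap(x, y, p) >= -1e-12

# Equality case: choose y so that y^q = x^p, i.e. y = x^(p-1).
p, x = 3, 1.7
y = x ** (p - 1)
assert abs(young_gap(x, y, p)) < 1e-9
print("Young's inequality verified on the sample grid")
```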
Proposition (Hölder's inequality). For real or complex sequences a, b and conjugate exponents p, q, Σ|aᵢbᵢ| ≤ (Σ|aᵢ|^p)^(1/p) (Σ|bᵢ|^q)^(1/q).
Proof sketch. Reduce to the case where ‖a‖_p and ‖b‖_q are both nonzero (otherwise the inequality is immediate). Rescale to ‖a‖_p = ‖b‖_q = 1. Apply Young's inequality term-by-term to |aᵢ| |bᵢ| and sum: Σ|aᵢbᵢ| ≤ Σ(|aᵢ|^p/p + |bᵢ|^q/q) = 1/p + 1/q = 1, the desired conclusion under the normalisation.
Proposition (Minkowski's inequality). For p ≥ 1 and real or complex sequences, (Σ|aᵢ + bᵢ|^p)^(1/p) ≤ (Σ|aᵢ|^p)^(1/p) + (Σ|bᵢ|^p)^(1/p).
Proof sketch. The case p = 1 is the term-by-term triangle inequality summed. For p > 1, write |aᵢ + bᵢ|^p ≤ (|aᵢ| + |bᵢ|) |aᵢ + bᵢ|^(p−1). Sum and apply Hölder's inequality to each of the two resulting sums with exponents p and q = p/(p − 1). Rearranging gives Minkowski.
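Both propositions are straightforward to verify numerically on sample sequences; the sketch below (illustrative data, with a floating-point tolerance) checks Hölder and Minkowski for several exponents:

```python
# Numeric check of Holder and Minkowski on sample sequences: the sums are
# computed directly and compared against the p-norm bounds.
def p_norm(a, p):
    return sum(abs(x) ** p for x in a) ** (1 / p)

a = [1.0, -2.0, 3.0, 0.5]
b = [2.0, 1.0, -1.0, 4.0]

for p in [1.5, 2, 3]:
    q = p / (p - 1)   # conjugate exponent
    # Holder: sum |a_i b_i| <= ||a||_p ||b||_q
    lhs = sum(abs(x * y) for x, y in zip(a, b))
    assert lhs <= p_norm(a, p) * p_norm(b, q) + 1e-9
    # Minkowski: ||a + b||_p <= ||a||_p + ||b||_p
    s = [x + y for x, y in zip(a, b)]
    assert p_norm(s, p) <= p_norm(a, p) + p_norm(b, p) + 1e-9
print("Holder and Minkowski verified on samples")
```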
Connections [Master]
The triangle inequality on ℝ, proved as the key theorem of 00.01.02 (absolute value and the triangle inequality), is the one-dimensional case of the Cauchy-Schwarz-derived triangle inequality on a real inner-product space established here. Both rest on the same underlying observation that the squared norm of a sum bounds the sum's size by the norms of its summands. The generalisation runs through metric-space theory (the triangle inequality is one of the three axioms of a metric in metric-spaces.metric-space), the L^p-norm theory (Minkowski's inequality supplies the triangle inequality for ‖·‖_p, hence the metric structure, in functional-analysis.lp-spaces), and the inner-product-space angle theory (the quantity ⟨u, v⟩/(‖u‖ ‖v‖) lies in [−1, 1] by Cauchy-Schwarz, giving the geometry of functional-analysis.hilbert-space).
The discriminant trichotomy from 00.03.02 (quadratic equations and the quadratic formula) is the load-bearing tool in the proof of Cauchy-Schwarz here: the non-negativity of the quadratic t ↦ ‖u + tv‖² is converted into the discriminant inequality 4⟨u, v⟩² − 4‖u‖²‖v‖² ≤ 0, which rearranges to |⟨u, v⟩| ≤ ‖u‖ ‖v‖. The same pattern — express a non-negativity statement as a discriminant inequality, then rearrange — recurs in the proof of the operator-norm inequality in functional-analysis.bounded-linear-operators (where a discriminant argument extracts the operator-norm bound on ‖Tx‖ in terms of ‖x‖), in the proof of the Hermitian-form positivity criterion in linalg.bilinear-quadratic-form, and in the proof of the Bessel and Parseval inequalities for orthonormal bases in Hilbert spaces.
The AM-GM inequality at two points, proved here as a corollary of Cauchy-Schwarz in ℝ², is the simplest mean-comparison inequality and the prototype of the entire family ordering the harmonic, geometric, arithmetic, and quadratic means of non-negative reals. The general AM-GM at n points and its weighted form follow from Jensen's inequality applied to the concave logarithm, as proved in the Full proof set. The mean-comparison hierarchy appears later in convex-analysis units (convex-analysis.jensen), in information-theory units (the entropy inequality via Jensen on −log), and in concentration-of-measure units in probability.concentration.
The sign-analysis method for polynomial inequalities is the prototype technique for semi-algebraic set computations, which generalise via the Tarski-Seidenberg theorem (1948) to the entire first-order theory of ℝ. The same method underlies real-algebraic-geometry algorithms (cylindrical algebraic decomposition, Collins 1975), the computer-algebra implementation of decision procedures for real polynomial systems (logic.decidability-of-real-closed-fields), and the o-minimal structures framework (van den Dries) that organises tame geometry. At the elementary level the method handles a quadratic with two real roots; at the research level the same idea handles arbitrary finite Boolean combinations of polynomial inequalities in many variables.
The proof of Cauchy-Schwarz via non-negative quadratics is the discrete one-variable shadow of a recurring move in functional analysis: extracting an operator estimate from the non-negativity of a quadratic form. In functional-analysis.hilbert-space this becomes the Bessel inequality and the existence of orthogonal projections; in functional-analysis.bounded-linear-operators it becomes the operator norm; in pde.energy-estimates it becomes the Friedrichs and Poincaré inequalities controlling Sobolev norms by their derivatives. The shared anatomy is: a non-negative quadratic q(t) = αt² + βt + γ with α > 0 has β² − 4αγ ≤ 0, and the rearrangement is the desired inequality.
Historical & philosophical context [Master]
The systematic use of inequalities as objects of study, separate from equations, is largely a development of the eighteenth and nineteenth centuries. Augustin-Louis Cauchy, in his 1821 Cours d'analyse de l'École Royale Polytechnique, proved what is now called the Cauchy inequality on finite sums of real numbers: (Σ aᵢbᵢ)² ≤ (Σ aᵢ²)(Σ bᵢ²) [Cauchy — Cours d'analyse]. Cauchy's setting was finite sequences with a view to bilinear estimates in analysis; the inequality appears in Note II (the "Notes" appendix) and is proved by writing the difference of the two sides as a sum of squares Σ_{i<j} (aᵢbⱼ − aⱼbᵢ)², a derivation independent of the discriminant argument used in the modern proof. The 1821 Cours d'analyse is also the textbook in which Cauchy introduced the ε-δ definition of limit and continuity, anchoring the analysis programme on careful inequality estimates rather than infinitesimals.
Hermann Amandus Schwarz, in his 1885 paper Über ein die Flächen kleinsten Flächeninhalts betreffendes Problem der Variationsrechnung (On a problem of the calculus of variations concerning surfaces of least area), generalised Cauchy's inequality from finite sums to the integral inner product on a function space, in the context of his work on minimal surfaces and the Dirichlet problem [Schwarz 1885]. Schwarz's inequality reads (∫ fg)² ≤ (∫ f²)(∫ g²) and is the inner-product-space form that Hilbert later abstracted to arbitrary inner-product spaces in his 1906 lectures on integral equations. The name Cauchy-Schwarz inequality fuses the two contributions; the older name Cauchy-Bunyakovsky-Schwarz additionally credits the Russian mathematician Viktor Bunyakovsky, who proved the integral form independently of Schwarz in 1859.
Otto Hölder, in his 1889 Über einen Mittelwertssatz (On a mean-value theorem), introduced the inequality bearing his name and the notion of conjugate exponents p, q with 1/p + 1/q = 1 [Hölder 1889]. The motivating problem was the comparison of p-th-power means with arithmetic means; the inequality is the load-bearing tool. Hermann Minkowski, in his 1896 Geometrie der Zahlen (Geometry of Numbers), proved the triangle inequality in ℝⁿ that bears his name, in the context of the geometric study of lattices and convex bodies in ℝⁿ [Minkowski 1896]. Johan Jensen, in his 1906 Sur les fonctions convexes et les inégalités entre les valeurs moyennes (On convex functions and inequalities between mean values), gave the first formal definition of a convex function and proved the inequality φ(Σ wᵢxᵢ) ≤ Σ wᵢφ(xᵢ), generalising AM-GM and unifying the earlier inequalities under a single convex-function principle [Jensen 1906]. Alfred Tarski, in his 1948 RAND-published A Decision Method for Elementary Algebra and Geometry, established quantifier elimination for the first-order theory of real-closed fields, identifying the semi-algebraic sets as the definable subsets of ℝⁿ in this theory and giving an algorithmic decision procedure for any statement built from polynomial inequalities [Tarski 1948]. The cumulative effect, by the mid-twentieth century, was the recognition of inequalities as a first-class subject of mathematics: Hardy-Littlewood-Pólya's 1934 monograph Inequalities was the first systematic treatise on the subject and remains a standard reference.