Gradients & directional derivatives
The idea
Partial derivatives give slopes along the axes, but you can walk across a surface in any direction. The gradient ∇f = (f_x, f_y) packages the partials into a single vector that answers every directional question at once: the directional derivative of f in the direction of a unit vector u is the dot product ∇f · u. You already know dot products measure alignment, so the conclusion falls out — the rate of change is largest when you walk exactly along ∇f.
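The alignment claim can be made explicit. Writing θ for the angle between ∇f and the unit vector u (θ is a symbol introduced here for illustration, not part of the original notation), the dot-product formula unpacks as:

```latex
D_{\mathbf{u}} f
  = \nabla f \cdot \mathbf{u}
  = |\nabla f|\,|\mathbf{u}|\cos\theta
  = |\nabla f|\cos\theta ,
\qquad \text{since } |\mathbf{u}| = 1 .
```

Because cos θ ranges over [-1, 1], the rate peaks at |∇f| when θ = 0 (walking along ∇f), is zero when θ = 90°, and reaches its most negative value, -|∇f|, when θ = 180°.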
Internalize three facts as one picture. The gradient points in the direction of steepest ascent; its length |∇f| is that steepest rate; and it is perpendicular to the level curve through the point, because walking along a contour changes nothing, forcing ∇f · u = 0 there. This is why streams cross elevation contours at right angles and why gradient descent in machine learning steps against ∇f.
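One way to see the perpendicularity numerically is to take a small step along the gradient and a small step at right angles to it, then compare how much f changes. The sketch below assumes an illustrative bowl-shaped surface f(x, y) = x² + y² (whose level curves are circles), the point (3, 4), and a step length h = 0.001; all three are choices made for this example only.

```python
import math

def f(x, y):
    # Illustrative surface chosen for this sketch: a bowl with circular contours
    return x**2 + y**2

def grad_f(x, y):
    # Hand-computed partials: f_x = 2x, f_y = 2y
    return (2 * x, 2 * y)

px, py = 3.0, 4.0
gx, gy = grad_f(px, py)
norm = math.hypot(gx, gy)

# Unit vector along the gradient, and the same vector rotated by 90 degrees
along = (gx / norm, gy / norm)
tangent = (-gy / norm, gx / norm)

h = 1e-3  # small step length (arbitrary choice)
change_along = f(px + h * along[0], py + h * along[1]) - f(px, py)
change_tangent = f(px + h * tangent[0], py + h * tangent[1]) - f(px, py)

print(f"change stepping along the gradient: {change_along:.3e}")
print(f"change stepping along the contour:  {change_tangent:.3e}")
```

The step along the gradient should change f by roughly h × |∇f|, while the step along the contour changes it only by an amount on the order of h², which is what a zero first-order rate looks like in practice.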
The recurring slip is using a direction vector without normalizing it. The directional derivative measures change per unit of distance, so the formula demands a unit vector; feeding it a length-5 vector silently multiplies the answer by 5.
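A small helper that normalizes internally makes the slip hard to commit. The sketch below is illustrative only: the function name `directional_derivative`, the gradient (3, 4), and the direction (5, 0) are all hypothetical values picked to make the inflation easy to see.

```python
import math

def directional_derivative(grad, v):
    """Rate of change per unit distance in the direction of v (any nonzero length)."""
    length = math.hypot(v[0], v[1])
    if length == 0:
        raise ValueError("direction vector must be nonzero")
    u = (v[0] / length, v[1] / length)  # normalize before taking the dot product
    return grad[0] * u[0] + grad[1] * u[1]

grad = (3.0, 4.0)  # hypothetical gradient at some point
v = (5.0, 0.0)     # length-5 vector pointing along the x-axis

raw_dot = grad[0] * v[0] + grad[1] * v[1]  # the slip: dot product with no normalization
correct = directional_derivative(grad, v)

print(raw_dot)  # 15.0, inflated by |v| = 5
print(correct)  # 3.0, which is just f_x, as expected when walking along the x-axis
```

A quick sanity check follows from the same picture: no directional derivative can exceed |∇f| (here 5), so a result of 15 signals a missing normalization.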
Worked example
Let f(x, y) = x²y + y³. At the point (1, 2), find the directional derivative of f in the direction of v = (3, 4), and the maximum possible rate of increase at that point.
- Compute the gradient: f_x = 2xy and f_y = x² + 3y², so ∇f = (2xy, x² + 3y²).
- Evaluate at (1, 2): ∇f(1, 2) = (2 × 1 × 2, 1 + 3 × 4) = (4, 13).
- Normalize the direction: |v| = √(9 + 16) = 5, so the unit vector is u = (3/5, 4/5) — skipping this step would inflate the answer fivefold.
- Take the dot product: D_u f = (4)(3/5) + (13)(4/5) = 12/5 + 52/5 = 64/5 = 12.8, in units of f per unit of distance traveled.
- For the maximum rate, no direction beats the gradient itself: the top rate is |∇f| = √(4² + 13²) = √185 ≈ 13.6, achieved by walking in the direction of (4, 13). Note that 12.8 falls below 13.6, as it must: v points close to, but not exactly along, the gradient (about 20° off, since cos θ = 12.8/13.6 ≈ 0.94).
Answer. The directional derivative toward (3, 4) is 64/5 = 12.8, and the maximum rate of increase is √185 ≈ 13.6, in the direction of ∇f = (4, 13).
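The hand computation can be cross-checked without using the symbolic partials at all. The sketch below approximates the gradient with central differences and then repeats the normalize-and-dot steps; the helper name `numerical_gradient` and the step size 1e-6 are assumptions made for this illustration.

```python
import math

def f(x, y):
    # The function from the worked example
    return x**2 * y + y**3

def numerical_gradient(func, x, y, h=1e-6):
    # Central differences approximate the two partial derivatives
    fx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    fy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return fx, fy

gx, gy = numerical_gradient(f, 1.0, 2.0)

v = (3.0, 4.0)
length = math.hypot(v[0], v[1])
u = (v[0] / length, v[1] / length)  # unit vector (3/5, 4/5)

d_u = gx * u[0] + gy * u[1]    # directional derivative toward v
max_rate = math.hypot(gx, gy)  # length of the gradient

print(f"gradient ~ ({gx:.4f}, {gy:.4f})")
print(f"D_u f    ~ {d_u:.4f}")
print(f"max rate ~ {max_rate:.4f}")
```

The printed values should agree with (4, 13), 12.8, and √185 to several decimal places; a large mismatch would point to an algebra error in the partials or a forgotten normalization.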
Check your understanding
- Why must the gradient be perpendicular to level curves — what would a nonzero rate along a contour contradict?
- How does the dot product formula encode the idea that walking partly along the gradient earns part of the maximum rate?
- What goes wrong numerically when you forget to normalize the direction vector, and how could you catch the mistake?
- How would you use the gradient to find the direction of fastest decrease, and what rate goes with it?