Random variables & distributions
The idea
A random variable attaches a number to every outcome of a random process (the number of defects in a batch, the sum of two dice, the wait time for a bus) so that uncertainty becomes something you can do arithmetic on. For a discrete variable, the distribution lists each possible value with its probability, and from the distribution flow the two headline summaries: the expected value E[X], a probability-weighted average, and the variance Var(X) = E[(X − E[X])²], the expected squared deviation from that average, which works out to E[X²] − (E[X])².
Read E[X] as a long-run average over many repetitions, not a prediction for one trial: a variable taking values 0, 1, 2 can have mean 1.1 even though 1.1 never occurs. The computational shortcut worth memorizing is the variance identity — compute E[X²] by weighting squared values, then subtract the square of the mean.
The order of operations there is the classic trap: E[X²] is not (E[X])², and the gap between them is exactly the variance. Squaring first emphasizes large values, so E[X²] always meets or exceeds (E[X])², with equality only for a constant — a useful built-in sanity check, since a negative variance always signals an arithmetic slip.
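The long-run reading of E[X] can be illustrated with a small simulation. This is a minimal sketch, not a definitive method: the values 0, 1, 2 and the probabilities 0.2, 0.5, 0.3 are borrowed from the worked example that follows, and the function names and the fixed seed are assumptions made here for illustration.

```python
import random

# Assumed setup: the distribution from the worked example in this lesson.
VALUES = [0, 1, 2]
PROBS = [0.2, 0.5, 0.3]


def simulate_days(n_days, seed=0):
    """Draw n_days independent values of X (seed fixed so runs are repeatable)."""
    rng = random.Random(seed)
    return rng.choices(VALUES, weights=PROBS, k=n_days)


def sample_mean(draws):
    """Plain average of the simulated values."""
    return sum(draws) / len(draws)


def sample_second_moment(draws):
    """Average of the squared simulated values, an estimate of E[X^2]."""
    return sum(x * x for x in draws) / len(draws)


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        draws = simulate_days(n)
        m = sample_mean(draws)
        gap = sample_second_moment(draws) - m * m
        # Every draw is 0, 1, or 2, yet the average drifts toward 1.1.
        print(f"n={n:>7}  mean={m:.3f}  E[X^2] - mean^2 = {gap:.3f}")
```

With more simulated days the average should settle near 1.1 and the gap between the two quantities near the variance, although no individual draw ever equals 1.1.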
Worked example
A help desk replaces X laptop batteries per day, where X takes the value 0 with probability 0.2, the value 1 with probability 0.5, and the value 2 with probability 0.3. Find the mean, variance, and standard deviation of X.
- Check the distribution is legitimate: each probability lies between 0 and 1, and 0.2 + 0.5 + 0.3 = 1, so the probabilities account for every possibility exactly once.
- Compute the mean as a weighted average: E[X] = 0(0.2) + 1(0.5) + 2(0.3) = 0 + 0.5 + 0.6 = 1.1 batteries per day.
- Compute the second moment by weighting the squared values: E[X²] = 0²(0.2) + 1²(0.5) + 2²(0.3) = 0 + 0.5 + 1.2 = 1.7.
- Apply the variance identity: Var(X) = E[X²] − (E[X])² = 1.7 − (1.1)² = 1.7 − 1.21 = 0.49. The result is positive, as it must be, and notice how E[X²] = 1.7 differs from (E[X])² = 1.21.
- Take the square root for the standard deviation: σ = √0.49 = 0.7 batteries, a spread measure in the same units as X. Interpretation: over many days the average settles near 1.1 with typical day-to-day deviations of about 0.7 — even though no single day can ever produce 1.1 batteries.
Answer. E[X] = 1.1 batteries per day, Var(X) = 0.49 (in squared batteries), and the standard deviation is 0.7 batteries.
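The same steps can be sketched in code as a way to check the hand arithmetic. The helper name `summarize` and the dictionary representation of the distribution are assumptions chosen for this illustration, not part of the lesson itself.

```python
import math


def summarize(pmf):
    """Return (mean, variance, std) for a discrete distribution {value: probability}."""
    probs = list(pmf.values())
    # Legitimacy check: nonnegative probabilities that sum to 1.
    if any(p < 0 for p in probs) or not math.isclose(sum(probs), 1.0):
        raise ValueError("probabilities must be nonnegative and sum to 1")
    mean = sum(x * p for x, p in pmf.items())               # E[X]
    second_moment = sum(x * x * p for x, p in pmf.items())  # E[X^2]
    variance = second_moment - mean ** 2                    # variance identity
    # Clamp at zero so float rounding cannot produce a tiny negative value.
    std = math.sqrt(max(variance, 0.0))
    return mean, variance, std


# The battery distribution from the worked example.
batteries = {0: 0.2, 1: 0.5, 2: 0.3}
mean, variance, std = summarize(batteries)
print(round(mean, 4), round(variance, 4), round(std, 4))
```

After rounding, the printed values should agree with the hand computation: 1.1, 0.49, and 0.7. Floating-point arithmetic can leave a tiny error in the last digits, which is why the sketch rounds before printing.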
Check your understanding
- Why can the expected value be a number the random variable never actually takes, and what does it really describe?
- How does the identity Var(X) = E[X²] − (E[X])² follow from the definition of variance as expected squared deviation?
- What would happen to the mean and the variance if every value of X were doubled, and why do they scale by different factors?
- When would you prefer reporting a standard deviation over a variance, and what do its units have to do with it?
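Once you have reasoned through the doubling question, one way to test your answer numerically is sketched below. It assumes the battery distribution from the worked example, and the helper names `mean_and_variance` and `scale_values` are hypothetical; exact fractions are used so rounding does not blur the comparison.

```python
from fractions import Fraction


def mean_and_variance(pmf):
    """Return (E[X], Var(X)) for a discrete distribution {value: probability}."""
    mean = sum(x * p for x, p in pmf.items())
    second_moment = sum(x * x * p for x, p in pmf.items())
    return mean, second_moment - mean ** 2


def scale_values(pmf, c):
    """Distribution of c*X for a nonzero constant c: same probabilities, scaled values."""
    return {c * x: p for x, p in pmf.items()}


# Battery distribution from the worked example, written as exact fractions.
batteries = {0: Fraction(1, 5), 1: Fraction(1, 2), 2: Fraction(3, 10)}

mean_x, var_x = mean_and_variance(batteries)
mean_2x, var_2x = mean_and_variance(scale_values(batteries, 2))

print("mean ratio:", mean_2x / mean_x)
print("variance ratio:", var_2x / var_x)
```

Compare the two printed ratios with your prediction, then try another constant in place of 2 to see whether the pattern holds.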