Normal distributions are used to model variables that are evenly spread on either side of their mean values.
The further a value is from the mean, the less likely that value is to occur.
The curve of a normal probability density function appears to be bell-shaped. These curves are often referred to as bell curves.
Bell curves are symmetrical about the mean value \(x=\mu\), so that \(50\%\) of the area enclosed by the curve and the horizontal axis lies on each side of the vertical line of symmetry \(x=\mu\).
An example could be the distribution of the heights in cm of the entire male population of a country. There would be a mean height (say 175cm) and all heights in the population would be evenly distributed on either side of this mean, with a standard deviation of a few cm.
Another example could be the time it takes members of a given gender to run \(100\)m. There would be a mean value and all other values would be evenly spread on either side of that mean.
To say that a continuous random variable, \(X\), follows a normal distribution with mean \(\mu \) and variance \(\sigma^2\) we write: \[X \sim N\begin{pmatrix} \mu , \ \sigma^2 \end{pmatrix}\] Remember: \(\text{Variance}=\sigma^2\), where \(\sigma \) is the standard deviation.
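The notation \(N\begin{pmatrix} \mu , \ \sigma^2 \end{pmatrix}\) quotes the variance, whereas software usually asks for the standard deviation. As a minimal sketch, assuming we work with Python's standard-library `statistics.NormalDist` (our choice of tool, not something this page prescribes) and using made-up values \(\mu = 175\) and \(\sigma^2 = 49\):

```python
from math import sqrt
from statistics import NormalDist

# Assumed illustrative values: X ~ N(175, 49), i.e. mean 175 and variance 49.
mu = 175.0
variance = 49.0

# NormalDist expects the standard deviation, so we take the square root first.
X = NormalDist(mu, sqrt(variance))

print(X.mean)      # 175.0
print(X.stdev)     # 7.0
print(X.variance)  # 49.0
```

Entering \(49\) instead of \(7\) as the second argument would describe a much wider distribution, so it is worth checking which of the two a tool expects.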
Given a continuous random variable that follows a normal distribution, \(X \sim N\begin{pmatrix} \mu , \ \sigma^2 \end{pmatrix}\), its probability density function, \(f(x)\), is defined as: \[f(x)=\frac{1}{\sigma \sqrt{2 \pi }}e^{- \frac{(x-\mu )^2}{2\sigma^2}}, \quad x\in \mathbb{R}\] Note: we'll rarely need to manipulate this function. In general, any work we do with normal distributions is done with a calculator or a computer that has the normal distribution function built in.
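To see the formula at work, here is a small sketch that types it out term by term in Python. The function name `normal_pdf` and the values \(\mu = 175\), \(\sigma = 7\) are our own assumptions, chosen only for illustration.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Density f(x) of N(mu, sigma^2), written directly from the formula."""
    coefficient = 1.0 / (sigma * sqrt(2.0 * pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * exp(exponent)

# Assumed illustrative values: mean 175, standard deviation 7.
mu, sigma = 175.0, 7.0

peak = normal_pdf(mu, mu, sigma)       # height of the curve at the mean
left = normal_pdf(mu - 5, mu, sigma)   # 5 units below the mean
right = normal_pdf(mu + 5, mu, sigma)  # 5 units above the mean
far = normal_pdf(mu + 20, mu, sigma)   # 20 units above the mean

print(peak, left, right, far)
```

The values `left` and `right` agree, which reflects the symmetry of the curve, and the heights shrink as we move away from the mean: `peak > left > far`. Keep in mind that these are heights of the curve, not probabilities.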
Normal distributions are such that the mean, median and mode are all equal to each other. So given a normal distribution we can state: \[\text{mean} = \mu\] \[\text{median} = \mu\] \[\text{mode} = \mu\]
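This can be checked quickly on a computer. The sketch below assumes Python's standard-library `statistics.NormalDist` and made-up values \(\mu = 175\), \(\sigma = 7\). It also confirms that half of the probability lies at or below the mean.

```python
from statistics import NormalDist

# Assumed illustrative values: mean 175, standard deviation 7.
X = NormalDist(175.0, 7.0)

# The three measures of centre coincide for a normal distribution.
print(X.mean, X.median, X.mode)  # 175.0 175.0 175.0

# Half of the area under the curve lies to the left of the mean.
half = X.cdf(X.mean)
print(half)
```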
A few properties of this function should always be kept in mind:
Just as with other continuous probability distributions, this function isn't used to calculate probabilities directly.
We calculate probabilities by calculating areas enclosed by the curve and the horizontal axis.
To do this we use the cumulative distribution function.
The cumulative distribution function (cdf) of a normal distribution, used to find the probability \(P\begin{pmatrix}X \leq x \end{pmatrix}\), is defined as: \[F(x) = \int_{-\infty}^x \frac{1}{\sigma \sqrt{2 \pi }}e^{- \frac{(t-\mu )^2}{2\sigma^2}} dt \] The value of \(F(x)\) is equal to the probability \(P\begin{pmatrix}X\leq x \end{pmatrix}\), that is: \[F(x)=P\begin{pmatrix}X\leq x\end{pmatrix}\] So, we often simply write: \[P\begin{pmatrix}X \leq x \end{pmatrix} = \int_{-\infty}^x \frac{1}{\sigma \sqrt{2 \pi }}e^{- \frac{(t-\mu )^2}{2\sigma^2}} dt \]
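This integral has no elementary antiderivative, but it can be written in terms of the error function \(\text{erf}\) as \(F(x)=\frac{1}{2}\begin{pmatrix}1+\text{erf}\begin{pmatrix}\frac{x-\mu}{\sigma\sqrt{2}}\end{pmatrix}\end{pmatrix}\), which most programming languages provide. The sketch below uses Python's `math.erf`. The function name `normal_cdf` and the values \(\mu = 6\), \(\sigma = 2\), \(x = 8\) are assumptions made for illustration.

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    """F(x) = P(X <= x) for X ~ N(mu, sigma^2), via the error function."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

# Assumed illustrative values: mean 6, standard deviation 2.
p = normal_cdf(8.0, 6.0, 2.0)
print(round(p, 4))  # 0.8413
```

Here \(x = 8\) is one standard deviation above the mean, so roughly \(84\%\) of the area lies to its left.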
To calculate the probability \(P\begin{pmatrix} X\leq b \end{pmatrix}\), we calculate \(F(b)\), that is, the integral:
\[P\begin{pmatrix}X \leq b \end{pmatrix} = \int_{-\infty}^b \frac{1}{\sigma \sqrt{2 \pi }}e^{- \frac{(t-\mu )^2}{2\sigma^2}} dt \]
This represents the area enclosed between the bell-shaped curve and the horizontal axis, from \(- \infty\) up to \(b\).
This is illustrated here:
The shaded area is equal to the probability \(P\begin{pmatrix} X\leq b \end{pmatrix}\).
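To make the link between area and probability concrete, the sketch below approximates the shaded area with the trapezium rule. It is an approximation under stated assumptions: we start the sum at \(\mu - 10\sigma\) instead of \(-\infty\) (the area further left is negligible), and the names `normal_pdf`, `area_up_to` and the values \(\mu = 6\), \(\sigma = 2\), \(b = 8\) are our own.

```python
from math import exp, pi, sqrt

def normal_pdf(t, mu, sigma):
    """Density of N(mu, sigma^2) at t."""
    return exp(-((t - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * sqrt(2.0 * pi))

def area_up_to(b, mu, sigma, steps=10_000):
    """Trapezium-rule estimate of the area under the curve to the left of b."""
    lower = mu - 10.0 * sigma  # stands in for -infinity (assumption)
    if b <= lower:
        return 0.0
    width = (b - lower) / steps
    # End points count half, interior points count fully.
    total = 0.5 * (normal_pdf(lower, mu, sigma) + normal_pdf(b, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(lower + i * width, mu, sigma)
    return total * width

# Assumed illustrative values: mean 6, standard deviation 2, upper limit 8.
area = area_up_to(8.0, 6.0, 2.0)
print(round(area, 4))  # 0.8413
```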
Calculating such integrals is very tricky.
Luckily for us, our calculators have this integral built in, and all we have to enter is:
A random variable \(X\) follows a normal distribution with mean \(\mu = 6\) and standard deviation \(\sigma = 2\).
Find each of the following probabilities:
Just as for other continuous probability distributions: \[P\begin{pmatrix}X \leq k \end{pmatrix} = P\begin{pmatrix}X < k \end{pmatrix}\] \[P\begin{pmatrix}X \geq k \end{pmatrix} = P\begin{pmatrix}X > k \end{pmatrix}\] Consequently: \[\begin{aligned}P\begin{pmatrix}a \leq X \leq b \end{pmatrix} & = P\begin{pmatrix} a < X < b \end{pmatrix} \\ & = P\begin{pmatrix} a \leq X < b \end{pmatrix} \\ & = P\begin{pmatrix} a < X \leq b \end{pmatrix} \end{aligned}\]
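Because the end points carry no probability, every one of these interval probabilities equals \(F(b)-F(a)\), and \(P\begin{pmatrix}X \geq k \end{pmatrix} = 1 - F(k)\). The sketch below assumes Python's standard-library `statistics.NormalDist`, with \(X \sim N\begin{pmatrix} 6 , \ 2^2 \end{pmatrix}\) and thresholds \(a = 4\), \(b = 8\) chosen purely for illustration.

```python
from statistics import NormalDist

# X ~ N(6, 2^2); NormalDist takes the standard deviation (2), not the variance.
X = NormalDist(6.0, 2.0)

# Assumed illustrative thresholds.
a, b = 4.0, 8.0

# P(a <= X <= b) = P(a < X < b) = F(b) - F(a)
p_between = X.cdf(b) - X.cdf(a)

# P(X >= b) = P(X > b) = 1 - F(b)
p_above = 1.0 - X.cdf(b)

print(round(p_between, 4))  # 0.6827
print(round(p_above, 4))    # 0.1587
```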
Answer each of the following: