We have learned the following measures of central tendency:
- arithmetic mean – the sum of all values divided by the total number of items in a dataset,
- median – the middle value of a dataset once its values are sorted,
- mode – the most common value in a dataset.
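As a quick illustration, the three measures can be computed with Python's standard `statistics` module. This is only a minimal sketch: the `values` list is made-up sample data, not taken from the course.

```python
from statistics import mean, median, mode

# Hypothetical sample data, chosen only for illustration.
values = [2, 3, 3, 5, 7]

average = mean(values)      # sum of all values divided by their count: 20 / 5
middle = median(values)     # middle value of the sorted data
most_common = mode(values)  # value that occurs most often

print(average, middle, most_common)
```

Here the mean is 4, while the median and the mode are both 3.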
Keep in mind that all these measures are human-made. Most measures you know from school (like the formula for the area of a triangle or the circumference of a circle) describe how the world works: the formulas can be verified. Statistical measures are different: they are just heuristics, or experimental tools, that people found useful for summarizing data.
Let's recap what we learned. The arithmetic mean is the most commonly used measure of the three. It is important because it can be reliably estimated from a sample with inferential statistics methods. Unlike the arithmetic mean, the median is a good measure of central tendency for datasets with extreme values, because a few very large or very small values barely move it. Finally, the mode is the least frequently used measure. It is mostly used for non-numerical data, where you can't compute the arithmetic mean or median.
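The sketch below shows both points from the recap on made-up data: how a single extreme value affects the mean far more than the median, and how the mode still works for non-numerical values. The names `salaries` and `colors` and their contents are assumptions used only for this example.

```python
from statistics import mean, median, mode

# Hypothetical salaries (in thousands); the last list adds one extreme value.
salaries = [30, 32, 35, 38, 40]
with_outlier = salaries + [1000]

mean_before, median_before = mean(salaries), median(salaries)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print(mean_before, median_before)  # both are 35
print(mean_after, median_after)    # the mean jumps to about 195.8, the median only to 36.5

# Non-numerical data: a mean or median is not defined, but the mode is.
colors = ["red", "blue", "red", "green"]
favorite = mode(colors)
print(favorite)
```

One extreme salary moves the mean from 35 to roughly 196, while the median shifts only from 35 to 36.5.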
Looking at the mean and median of a dataset, you can often infer the shape of its distribution without actually seeing the histogram. As a rule of thumb:
- MEDIAN = MEAN – the histogram is roughly symmetric (the left and right sides approximately mirror each other).
- MEDIAN < MEAN – the histogram is skewed right: a long tail of large values pulls the mean up, so the median lies to the left of the mean.
- MEDIAN > MEAN – the histogram is skewed left: a long tail of small values pulls the mean down, so the median lies to the right of the mean.
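The comparison of the median and the mean can be sketched as a small helper. The function name `describe_skew` and its `tol` parameter are hypothetical, introduced only for this example, and the result is a rule-of-thumb guess about the shape, not a formal test of skewness.

```python
import math
from statistics import mean, median

def describe_skew(data, tol=1e-9):
    """Guess the histogram shape by comparing the median with the mean.

    `tol` (an assumption of this sketch) is how close the two values
    must be to count as equal.
    """
    m, med = mean(data), median(data)
    if math.isclose(m, med, abs_tol=tol):
        return "roughly symmetric"
    if med < m:
        return "skewed right"  # median lies to the left of the mean
    return "skewed left"       # median lies to the right of the mean

# Made-up datasets, one for each case.
print(describe_skew([1, 2, 3, 4, 5]))       # mean 3, median 3
print(describe_skew([1, 2, 2, 3, 12]))      # mean 4, median 2
print(describe_skew([1, 10, 11, 11, 12]))   # mean 9, median 11
```

With real data the mean and median are rarely exactly equal, so a larger `tol` suited to the scale of the data would usually be a more practical choice.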