In Naked Statistics, Charles Wheelan does a great, and often very funny, job of not only explaining statistics in simple terms, but also explaining why you should understand statistics. Statistics can be used to distill complex situations into a small set of indexes or metrics, many of which are meaningful only for relative comparisons. Statistical calculations can then be used to better understand those situations, draw inferences, and make informed decisions. A statistics-unaware gambler and his or her money are soon parted.
A working knowledge of statistics is incredibly valuable in everyday life, yet most people understand little more than how to compute averages. Once you learn the difference between the mean and the median and when to use each, you’ll realize how often you may have been misled by looking only at averages, i.e., means. Unless you know the data is roughly normally distributed with few major outliers, the median is the more informative of the two. When the data is roughly normal, the mean and the median will be approximately the same, and the mean is computationally easier to update as more data is added. Then again, the median can sometimes hide the impact of important outliers (e.g., in the results of drug trials).
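The mean-versus-median point is easy to see in a few lines of code. Here is a minimal sketch (the income figures are made up for illustration) showing how a single outlier drags the mean while barely moving the median:

```python
from statistics import mean, median

# Hypothetical household incomes, in dollars.
incomes = [32_000, 38_000, 41_000, 45_000, 52_000, 60_000, 75_000]
print(mean(incomes), median(incomes))   # mean 49,000; median 45,000

# Add one extreme outlier.
incomes.append(5_000_000)
print(mean(incomes), median(incomes))   # mean jumps to ~667,875; median moves only to 48,500
```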
Wheelan also demonstrates ways to mislead yourself or others, intentionally or not, with statistics. For example: comparing dollar amounts over long periods of time without correcting for inflation (i.e., using nominal rather than real figures), using the mean when you should use the median, confusing percentage points with percentage change, using unwarranted precision to imply high accuracy, cherry-picking time windows, assuming that correlated events are actually independent (and vice versa), not understanding regression to the mean, not using a representative sample, and everyone’s favorite, assuming that correlation implies causation.
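The percentage-point confusion, in particular, is worth a toy example (the rates here are invented): a rate that rises from 4% to 6% has gone up only 2 percentage points, but that is a 50% increase.

```python
# Toy illustration of percentage points vs. percent change (numbers invented).
old_rate = 0.04   # 4%
new_rate = 0.06   # 6%

point_change = (new_rate - old_rate) * 100                 # 2 percentage points
percent_change = (new_rate - old_rate) / old_rate * 100    # a 50% increase
print(f"{point_change:.0f} percentage points, {percent_change:.0f}% increase")
```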
Wheelan spends a lot of time on the central limit theorem, which states that “a large, properly drawn sample will resemble the population from which it is drawn”. This theorem explains why properly conducted exit polls can usually predict election outcomes correctly, even when the samples intuitively seem very small.
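You can convince yourself of this with a quick simulation. This is only a sketch with invented parameters (52% support for the winner, samples of 1,000 voters), but it shows how often a modest, properly drawn sample calls the race correctly:

```python
import random

random.seed(1)
true_share = 0.52      # assumed population support for candidate A
sample_size = 1_000
trials = 10_000

correct = 0
for _ in range(trials):
    # Poll sample_size random voters from the population.
    votes_for_a = sum(random.random() < true_share for _ in range(sample_size))
    if votes_for_a > sample_size / 2:
        correct += 1

# With these parameters, the sample picks the winner in roughly 90% of trials.
print(f"Sample predicted the winner in {correct / trials:.1%} of trials")
```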
Also, if you compute certain statistics, e.g., the mean and standard deviation, on a sample and on the population from which it was allegedly drawn, the central limit theorem can tell you the likelihood that the sample actually came from that population. Similarly, you can compute the probability that two samples came from the same population. And, valuably, even if the population data is not normally distributed, the means of the samples will be normally distributed about the population mean. While the standard deviation measures dispersion in the population data, the standard error measures dispersion in the sample means; in fact, the standard error is just the standard deviation of the sample means, which for samples of size n is approximately the population standard deviation divided by √n.
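A short simulation makes this concrete. The sketch below (sample sizes and counts are arbitrary) draws many samples from a decidedly non-normal, exponential population and looks at the distribution of their means:

```python
import random
import statistics

random.seed(1)
sample_size = 50
num_samples = 5_000

# Exponential population with rate 1: mean = standard deviation = 1, and skewed.
sample_means = [
    statistics.mean(random.expovariate(1.0) for _ in range(sample_size))
    for _ in range(num_samples)
]

# The sample means cluster around the population mean of 1.0...
print(statistics.mean(sample_means))
# ...and their spread (the standard error) is roughly sigma / sqrt(n) = 1 / sqrt(50) ~ 0.14.
print(statistics.stdev(sample_means))
```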
A nice metaphor he uses for population sampling is tasting a spoonful from a pot of soup, stirring the pot, and tasting again. The two samples should taste similar. The pot of soup is a large population and the spoon is a sample containing a large number of organic molecules. Of course, if you put the same spoon back into the pot, you are a horrible person.