Plots for two variables

Rafael Irizarry

In general, you should use scatterplots to visualize the relationship between two variables. Every time we have examined such a relationship, including total murders versus population size, life expectancy versus fertility rates, and infant mortality versus income, we have used a scatterplot. However, there are some exceptions, and we describe two alternative plots here: the slope chart and the Bland-Altman plot.

Slope charts

One exception where another type of plot may be more informative is when you are comparing variables of the same type at different time points and for a relatively small number of comparisons, for example, life expectancy in 2010 and in 2015. In this case, we might recommend a slope chart. Below is an example comparing 2010 to 2015 for large western countries:

A slope chart comparing life expectancies between different countries. Spain is listed as having the longest life expectancy, which has increased since 2010. The United States appears to have the lowest, staying at about 79 from 2010 to 2015.
Slope charts use angles to communicate values.
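A slope chart like the one above can be sketched in a few lines. The following is a minimal illustration, not the code used to produce the figure: the function names `slope_segments` and `draw_slope_chart` are hypothetical, matplotlib is assumed to be installed for the drawing step, and the numbers are made-up placeholders rather than real life expectancy data.

```python
def slope_segments(before, after):
    """Return one segment per label: ((0, before_value), (1, after_value))."""
    # Keep only labels measured at both time points.
    shared = [name for name in before if name in after]
    return {name: ((0, before[name]), (1, after[name])) for name in shared}


def draw_slope_chart(before, after, left_label, right_label):
    """Draw one line per label; the slope of each line encodes the change."""
    # Imported here so slope_segments can be used without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for name, ((x0, y0), (x1, y1)) in slope_segments(before, after).items():
        ax.plot([x0, x1], [y0, y1], marker="o")
        # Label both ends so exact values can be read from position.
        ax.text(x0 - 0.05, y0, name, ha="right", va="center")
        ax.text(x1 + 0.05, y1, name, ha="left", va="center")
    ax.set_xticks([0, 1])
    ax.set_xticklabels([left_label, right_label])
    ax.set_xlim(-0.5, 1.5)
    ax.set_ylabel("value")
    return fig


# Made-up placeholder values (assumption), keyed by hypothetical labels.
before = {"A": 80.0, "B": 78.5}
after = {"A": 81.0, "B": 79.0}
segments = slope_segments(before, after)
```

Calling `draw_slope_chart(before, after, "2010", "2015")` would return a figure with one line per label. With many labels the lines overlap, which is the limitation noted for slope charts in this section.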

An advantage of the slope chart is that it permits us to quickly get an idea of changes based on the slope of the lines. Although we are using angle as the visual cue, we also have position to determine the exact values. Comparing the improvements is a bit harder with a scatterplot:

The same life expectancy data from the previous graph, now represented as a scatterplot. The scatterplot shows the average life expectancy of several countries from 2010 to 2015.
The life expectancy graph, now shown as a scatterplot.

In the scatterplot, we have followed the principle of using common axes, since we are comparing the same quantity before and after. However, if we have many points, slope charts stop being useful because it becomes hard to see all the lines.
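The common-axes principle can be made concrete by computing a single range from both time points and applying it to both axes. This is a sketch under assumptions: `common_limits` and `draw_before_after_scatter` are hypothetical names, the padding value is arbitrary, the dashed identity line is an added reading aid rather than something the text specifies, and the data are made-up placeholders.

```python
def common_limits(before, after, pad=0.5):
    """Return one (low, high) range covering the values at both time points."""
    values = list(before.values()) + list(after.values())
    return (min(values) - pad, max(values) + pad)


def draw_before_after_scatter(before, after, x_label, y_label):
    """Scatter the 'after' values against the 'before' values on shared axes."""
    # Imported here so common_limits can be used without matplotlib.
    import matplotlib.pyplot as plt

    names = [name for name in before if name in after]
    low, high = common_limits(before, after)
    fig, ax = plt.subplots()
    ax.scatter([before[n] for n in names], [after[n] for n in names])
    for n in names:
        ax.annotate(n, (before[n], after[n]))
    # Identity line (an addition for illustration): points above it increased.
    ax.plot([low, high], [low, high], linestyle="--")
    # The same limits on both axes keep the two time points comparable.
    ax.set_xlim(low, high)
    ax.set_ylim(low, high)
    ax.set_aspect("equal")
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
    return fig


# Made-up placeholder values (assumption), not real data.
limits = common_limits({"A": 80.0, "B": 78.5}, {"A": 81.0, "B": 79.0})
```

With the same limits on both axes, the improvement for each point is its vertical distance from the identity line, which is harder to judge by eye than the slope of a line in a slope chart.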

Bland-Altman plot

Since we are primarily interested in the difference, it makes sense to dedicate one of our axes to it. The Bland-Altman plot, also known as the Tukey mean-difference plot and the MA-plot, shows the difference versus the average:

A Bland-Altman plot showing the difference in life expectancy between 2010 and 2015 for each country, with the difference axis running from 0.3 to 0.9.
The life expectancy data, now showing differences and averages using a Bland-Altman plot.

Here, by simply looking at the y-axis, we quickly see which countries have shown the most improvement. We also get an idea of the overall value from the x-axis.
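The two quantities on the axes are simple to compute: for each country, the average of the two values goes on the x-axis and their difference on the y-axis. The sketch below illustrates this with hypothetical function names (`mean_difference`, `draw_bland_altman`) and made-up placeholder numbers. It takes the difference as the later value minus the earlier one, which is an assumption about sign convention.

```python
def mean_difference(before, after):
    """Return {label: (average, difference)} for labels present at both times."""
    names = [name for name in before if name in after]
    return {
        name: ((before[name] + after[name]) / 2, after[name] - before[name])
        for name in names
    }


def draw_bland_altman(before, after):
    """Plot the difference (y-axis) against the average (x-axis)."""
    # Imported here so mean_difference can be used without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for name, (average, difference) in mean_difference(before, after).items():
        ax.scatter(average, difference)
        ax.annotate(name, (average, difference))
    ax.set_xlabel("average")
    ax.set_ylabel("difference")
    return fig


# Made-up placeholder values (assumption), not real data.
before = {"A": 80.0, "B": 78.5}
after = {"A": 81.0, "B": 79.0}
result = mean_difference(before, after)

# Rank labels by improvement, largest first: this is what the y-axis shows.
ranking = sorted(result, key=lambda name: result[name][1], reverse=True)
```

Sorting by the second element of each pair reproduces what the eye does on the y-axis: it orders the labels by how much they changed.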

License


Business Analytics Copyright © by Di Shang is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.
