Encoding a third variable

Rafael Irizarry

Below is a scatterplot showing the relationship between infant survival and average income. Beyond the two axes, it encodes three additional variables: OPEC membership, region, and population.

A scatter plot titled "Infant survival proportion". It uses colors and shapes to encode information.
A color and shape coded scatter plot on the correlation between infant survival and different variables.

The plot encodes categorical variables with color and shape. The shapes can be controlled with the shape argument. Below are the shapes available for use in R. For the last five, the color fills the inside of the shape.

Three rows of shapes usable in R. Any of them can be used to encode different categories of a variable.
Shapes and symbols available for encoding data in R.

For continuous variables, we can use color, intensity, or size. We now show an example of how we do this with a case study.

When selecting colors to represent a numeric variable, we choose between two options: sequential and diverging palettes. Sequential palettes are suited for data that goes from high to low, so that high values are clearly distinguished from low values. Here are some examples.

Color palettes that can be used to represent changing data, typically using less to more intensity or saturation.
Typical color palettes used for encoding information about data.
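To make the idea of a sequential palette concrete, here is a minimal sketch in Python that maps a numeric value to a color by interpolating linearly between a light and a dark color. The function names, the two endpoint colors, and the example values are assumptions chosen for illustration; they are not the palettes shown in the figure above.

```python
def lerp_color(low, high, t):
    """Linearly interpolate between two RGB tuples; t is expected in [0, 1]."""
    return tuple(round(a + (b - a) * t) for a, b in zip(low, high))


def to_hex(rgb):
    """Format an (r, g, b) tuple of 0-255 integers as a hex color string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def sequential_color(value, vmin, vmax,
                     low=(255, 255, 255),   # assumed light end: white
                     high=(0, 0, 139)):     # assumed dark end: dark blue
    """Map value in [vmin, vmax] to a color that darkens as value grows."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    t = (value - vmin) / (vmax - vmin)
    t = min(1.0, max(0.0, t))  # clamp values outside the range
    return to_hex(lerp_color(low, high, t))


# Hypothetical values on a 0-100 scale, lowest to highest
for v in (0, 25, 50, 75, 100):
    print(v, sequential_color(v, 0, 100))
```

Because only one end of the ramp is dark, the eye reads darker as "more", which is what we want when the data simply runs from low to high.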

Diverging colors are used to represent values that diverge from a center. We put equal emphasis on both ends of the data range: higher than the center and lower than the center. For example, we would use a diverging palette to show height in standard deviations away from the average. Here are some examples of diverging palettes:

A range of color palettes with values ranging from color to color, but become light and unsaturated in the center.
Color palettes that work well for data that deviates in either direction from a central value.
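The height example can be sketched in Python as follows. We convert each height to standard deviations away from the average and then pick a color that moves from a neutral center toward one of two contrasting ends. The heights, the three anchor colors, the cutoff of three standard deviations, and the function names are all assumptions made for illustration.

```python
from statistics import mean, stdev


def lerp_color(start, end, t):
    """Linearly interpolate between two RGB tuples; t is expected in [0, 1]."""
    return tuple(round(a + (b - a) * t) for a, b in zip(start, end))


def to_hex(rgb):
    """Format an (r, g, b) tuple of 0-255 integers as a hex color string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def diverging_color(z, zmax=3.0,
                    low=(33, 102, 172),    # assumed color for values below center (blue)
                    mid=(247, 247, 247),   # assumed neutral center (near white)
                    high=(178, 24, 43)):   # assumed color for values above center (red)
    """Map a centered value z to a color; both directions get equal emphasis."""
    t = max(-1.0, min(1.0, z / zmax))  # clamp to [-1, 1]
    if t < 0:
        return to_hex(lerp_color(mid, low, -t))
    return to_hex(lerp_color(mid, high, t))


# Hypothetical heights in centimeters
heights = [150, 160, 170, 180, 190]
avg, sd = mean(heights), stdev(heights)

for h in heights:
    z = (h - avg) / sd  # standard deviations away from the average
    print(h, round(z, 2), diverging_color(z))
```

A height equal to the average gets the neutral center color, and heights equally far above and below the average get equally saturated colors of opposite hue.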

License

Business Analytics Copyright © by Rafael Irizarry is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.
