Visual Interpretation
Christoph Molnar
Various visualizations make the linear regression model easy and quick to grasp for humans.
Weight Plot
The information in the weight table (weight estimates and their variance) can be visualized in a weight plot. The following plot shows the results from the previous linear regression model.
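As a rough illustration of what a weight plot encodes, the sketch below turns a small weight table into one text row per feature, with the point estimate and an interval around it. The feature names and numbers are made up for illustration, and the 95% interval uses the normal approximation (estimate ± 1.96 × standard error), which is an assumption rather than something stated in the text.

```python
# Hypothetical weight table: (feature name, weight estimate, standard error).
# All values are invented for illustration only.
weight_table = [
    ("feature_a", 2.0, 0.5),
    ("feature_b", -1.0, 0.25),
    ("feature_c", 0.3, 0.4),
]

Z_95 = 1.96  # assumed normal-approximation multiplier for a 95% interval


def confidence_interval(weight, std_error, z=Z_95):
    """Return (lower, upper) bounds of the interval around a weight estimate."""
    return weight - z * std_error, weight + z * std_error


def text_weight_plot(table, width=41):
    """Render one row per feature: '-' spans the interval, 'o' is the estimate, '|' is zero."""
    intervals = [(name, w, *confidence_interval(w, se)) for name, w, se in table]
    lo = min(min(low, 0.0) for _, _, low, _ in intervals)
    hi = max(max(high, 0.0) for _, _, _, high in intervals)
    span = (hi - lo) or 1.0  # guard against a zero-width axis

    def col(value):
        # Map a value on the weight axis to a character column.
        return round((value - lo) / span * (width - 1))

    rows = []
    for name, w, low, high in intervals:
        cells = [" "] * width
        for c in range(col(low), col(high) + 1):
            cells[c] = "-"
        cells[col(0.0)] = "|"
        cells[col(w)] = "o"
        rows.append(f"{name:>10} {''.join(cells)}")
    return rows


for row in text_weight_plot(weight_table):
    print(row)
```

An interval that crosses the zero marker, as for the hypothetical `feature_c`, indicates a weight that cannot be distinguished from zero at the chosen level.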
Effect Plot
The weights of the linear regression model can be analyzed more meaningfully when they are multiplied by the actual feature values. The weights depend on the scale of the features: if a feature measures, e.g., a person’s height and you switch from meters to centimeters, the weight changes, but the actual effects in your data do not. It is also important to know the distribution of the feature in the data, because if its variance is very low, almost all instances receive a similar contribution from this feature. The effect plot can help you understand how much the combination of weight and feature contributes to the predictions in your data. Start by calculating the effects, which are the weight per feature times the feature value of an instance:
$$\text{effect}_{j}^{(i)} = w_{j} x_{j}^{(i)}$$
where $w_{j}$ is the weight of feature $j$ and $x_{j}^{(i)}$ is the value of feature $j$ for instance $i$.
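A minimal sketch of this calculation, using the height example from the paragraph above: the weight differs by a factor of 100 between meters and centimeters, yet the effect for the same person is unchanged. The function name `effects` and all numbers are hypothetical and chosen only for illustration.

```python
import math


def effects(weights, instance):
    """Return the per-feature effect (weight times feature value) for one instance."""
    return {name: weights[name] * value for name, value in instance.items()}


# Hypothetical model: the prediction rises by 50 units per meter of height.
weights_m = {"height": 50.0}
# The same model with height measured in centimeters: the weight shrinks by 100.
weights_cm = {"height": 0.5}

# The same person, expressed in both units.
person_m = {"height": 1.8}
person_cm = {"height": 180.0}

effect_m = effects(weights_m, person_m)["height"]
effect_cm = effects(weights_cm, person_cm)["height"]

# The weights differ, the effects agree (up to floating-point rounding).
print(weights_m["height"], weights_cm["height"])
print(math.isclose(effect_m, effect_cm))
```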
The effects can be visualized with boxplots. The box in a boxplot contains the effect range for half of the data (25% to 75% effect quantiles). The vertical line in the box is the median effect, i.e. 50% of the instances have a lower and the other half a higher effect on the prediction. The dots are outliers, defined as points that lie more than 1.5 * IQR (interquartile range, i.e. the difference between the third and first quartiles) above the third quartile or more than 1.5 * IQR below the first quartile. The two horizontal lines, called the lower and upper whiskers, extend from the box to the most extreme points below the first quartile and above the third quartile that are not outliers. If there are no outliers, the whiskers extend to the minimum and maximum values.
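The boxplot quantities described above can be computed directly. The following is a sketch under one assumption: quartiles are taken with the "inclusive" interpolation method of Python's `statistics.quantiles`, whereas plotting libraries may use slightly different quartile conventions. The function name `boxplot_stats` and the sample values are hypothetical.

```python
from statistics import quantiles


def boxplot_stats(values):
    """Compute quartiles, whisker ends, and outliers using the 1.5 * IQR rule."""
    data = sorted(values)
    # Quartile convention is an assumption; other tools may interpolate differently.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Points beyond the fences are drawn as individual dots.
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    # Whiskers end at the most extreme points that are not outliers.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "lower_whisker": inside[0],
        "upper_whisker": inside[-1],
        "outliers": outliers,
    }


# Made-up effect values with one extreme point.
stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 100])
print(stats)
```

With no points beyond the fences, `lower_whisker` and `upper_whisker` coincide with the minimum and maximum of the data, matching the description in the text.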
Unlike the weight plot, where each category has its own row, the effect plot can summarize the effects of a categorical feature in a single boxplot.
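One way to see why a single boxplot suffices: assuming the categorical feature is one-hot encoded, each instance has the value 1 for its own category and 0 for all others, so its effect reduces to the weight of its category. The sketch below uses hypothetical categories and weights, and for simplicity lists the reference category with a weight of 0.0.

```python
# Hypothetical weights for a categorical feature; "A" is the assumed reference category.
category_weights = {"A": 0.0, "B": 2.5, "C": -1.0}

# Made-up category values observed for six instances.
observed = ["A", "B", "B", "C", "A", "B"]


def categorical_effect(category, weights):
    """Sum weight * indicator over all categories; only the instance's own category contributes."""
    return sum(w * (1 if category == level else 0) for level, w in weights.items())


# One effect per instance: together they form a single distribution for one boxplot.
category_effects = [categorical_effect(c, category_weights) for c in observed]
print(category_effects)
```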