3.2.2 Model-Agnostic
Last updated
Was this helpful?
Last updated
Was this helpful?
The great advantage of model-agnostic interpretation methods over model-specific ones is their flexibility. Machine learning developers are free to use any machine learning model they like when the interpretation methods can be applied to any model. Anything that builds on an interpretation of a machine learning model, such as a graphic or user interface, also becomes independent of the underlying machine learning model. Typically, not just one, but many types of machine learning models are evaluated to solve a task, and when comparing models in terms of interpretability, it is easier to work with model-agnostic explanations, because the same method can be used for any type of model. Below are the methods we will be addressing in this book:
Variable Importance
Partial Dependence Plot (PDP)
Individual Conditional Expectation (ICE)
Variable Importance represents the statistical significance of each variable in the data with respect to its effect on the generated model. This procedure breaks the relationship between the feature and the target, thus the drop in the model score is indicative of how much the model depends on the feature.
We called varimp_plot() API of H2O to output a bar type plot, this plot shows the top ten most important features in turn for the model. The degree of each feature’s importance is the length of the bar after it has been scaled between 0 and 1.
The calculation of variable importance for the tree-based model is determined by calculating the relative influence of each variable: whether that variable was selected to split on during the tree building process, and how much the squared error (overall trees) improved (decreased) as a result.
And for the non-tree-based model, we could fit the model using the standardized independent variables and compare the standardized coefficients as the variable importance.
The partial dependence plot (short PDP or PD plot) shows the marginal effect one or two features have on the predicted outcome of a machine learning model. A partial dependence plot can show the relationship between the target and a feature. The effect of a variable is measured in change in the mean response.
The partial dependence of the response Æ’ at a point Xs is defined as:
The difference with the partial dependence plot (PDP) is that the Individual Conditional Expectation (ICE) plots to display one line per instance that shows how the instance's prediction changes when a feature changes. An ICE plot visualizes the dependence of the prediction on a feature for each instance separately, resulting in one line per instance, compared to one line overall in partial dependence plots. ACE plots can help us discover the heteroscedasticity from the dataset which is hard to find out only through partial dependence plots.