4.2.3 Model Agnostic
Model-agnostic explanation methods do not depend on the internals of the model being explained; they treat it as a black box. To assess the interpretability of model-agnostic methods, three dimensions need to be considered:
| Type of Explanation | Properties of Explanations | Explanation Methods |
| --- | --- | --- |
| Feature summary | Detailed | Expressive power |
| Model internals | Correctness | Portability |
| Datapoint | Consistency | Algorithmic complexity |
| A surrogate intrinsically interpretable model | Stability | |
|  | Certainty | |
|  | Importance | |
|  | Novelty | |
|  | Representativeness | |
4.2.3.1 Type of Explanation
The first dimension is the type of explanation, that is, the kind of result each model-agnostic method produces.
Feature summary: Some explanation methods provide summary statistics for each feature, such as feature importance values. Most feature summary statistics can also be visualized.
Model internals: This is the explanation output of all intrinsically interpretable models, such as the weights of a linear model or the learned tree structure.
Datapoint: Some methods return data points (already existing or newly created) to make a model interpretable, for example counterfactual explanations or prototypes. To be useful, explanation methods that output data points require that the data points themselves are meaningful and can be interpreted.
Surrogate intrinsically interpretable model: Some methods approximate the black-box model with an intrinsically interpretable surrogate model. The questions then are how faithfully the surrogate reproduces the black-box model's predictions and how accurate the surrogate itself is (a minimal sketch follows this list).
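The surrogate idea is easy to make concrete. The sketch below is a minimal illustration, assuming scikit-learn and arbitrary dataset and model choices: a shallow decision tree is fitted to the *predictions* of a random-forest black box, then scored both for fidelity (agreement with the black box) and accuracy (agreement with the true labels).

```python
# A minimal global-surrogate sketch. Dataset, models, and the tree depth
# are illustrative assumptions, not part of any fixed recipe.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import r2_score

X, y = make_regression(n_samples=1000, n_features=5, noise=10.0, random_state=0)

# The black-box model we want to explain.
black_box = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
y_bb = black_box.predict(X)

# The intrinsically interpretable surrogate is trained on the black box's output.
surrogate = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y_bb)

# Fidelity: how well the surrogate reproduces the black-box predictions.
print("fidelity (R^2 vs. black box):", r2_score(y_bb, surrogate.predict(X)))
# Accuracy: how well the surrogate predicts the actual labels.
print("accuracy (R^2 vs. true y):   ", r2_score(y, surrogate.predict(X)))
```

If fidelity is high, the tree's structure can stand in for the black box as an explanation; if it is low, the surrogate should not be trusted as an interpretation, no matter how readable it is.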
4.2.3.2 Properties of Explanations
The second dimension is the properties of individual explanations. To measure interpretability correctly and judge how useful a method is for a specific use case, we need to evaluate the properties of each model-agnostic method's explanations.
Detailed: How well does an explanation predict unseen data?
Correctness: Is the explanation fully correct with respect to the dataset and the prediction?
Consistency: How much does an explanation differ between models that have been trained on the same task and that produce similar predictions?
Stability: How similar are the explanations for similar instances? (A check of this property is sketched after this list.)
Certainty: Does the explanation reflect the certainty of the machine learning model? Many ML models provide only point predictions, without any statement about the model’s confidence that the prediction is correct.
Importance: It is associated with how well the explanation reflects the importance of features or of parts of the explanation.
Novelty: It describes if the explanation reflects whether an instance, of which the prediction is to be explained, comes from a region in the feature space that is far away from the distribution of the training data.
Representativeness: It describes how many instances are covered by the explanation. Explanations can cover the entire model or represent only an individual prediction.
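Some of these properties can be probed directly. The sketch below checks stability by explaining two nearly identical instances with a simple LIME-like local surrogate written from scratch; the perturbation scale, kernel width, and model choices are all illustrative assumptions.

```python
# A minimal stability check: similar instances should get similar explanations.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X, y = make_regression(n_samples=500, n_features=4, noise=5.0, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

def local_explanation(x, predict, n_samples=2000, kernel_width=1.0):
    """Fit a proximity-weighted linear surrogate around x; return its coefficients."""
    Z = x + rng.normal(scale=0.5, size=(n_samples, x.shape[0]))      # perturbations
    weights = np.exp(-np.sum((Z - x) ** 2, axis=1) / kernel_width**2) # proximity kernel
    lin = Ridge(alpha=1.0).fit(Z, predict(Z), sample_weight=weights)
    return lin.coef_

x1 = X[0]
x2 = x1 + 0.01  # a very similar instance
e1 = local_explanation(x1, model.predict)
e2 = local_explanation(x2, model.predict)

# For a stable method, these two explanations should be close.
print("explanation for x1:", np.round(e1, 2))
print("explanation for x2:", np.round(e2, 2))
print("max coefficient difference:", np.abs(e1 - e2).max())
```

A large difference between the two coefficient vectors would signal an unstable explanation method, since the model's behavior around the two instances is essentially the same.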
4.2.3.3 Explanation Methods
The third dimension is the explanation methods themselves. The properties in this list can be used to assess and compare different explanation methods.
Expressive power: It is the language or structure of the explanations the method is able to generate, for example IF-THEN rules, decision trees, weighted sums, or natural language.
Portability: It describes the range of ML models to which the explanation method can be applied (illustrated in the sketch after this list).
Algorithmic complexity: It is related to the computational complexity or time complexity of the explanation method.
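Portability follows from the fact that a model-agnostic method only needs a prediction function. The sketch below, a from-scratch permutation-importance routine with an illustrative dataset and arbitrarily chosen models, applies one unchanged explanation routine to two very different model families.

```python
# A minimal portability sketch: the same explanation routine works for any
# model that exposes a predict function. Dataset and models are illustrative.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X, y = make_regression(n_samples=500, n_features=4, noise=5.0, random_state=0)

def permutation_importance(predict, X, y):
    """Feature importance = error increase when that feature is shuffled."""
    base = mean_squared_error(y, predict(X))
    scores = []
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] = rng.permutation(Xp[:, j])
        scores.append(mean_squared_error(y, predict(Xp)) - base)
    return np.array(scores)

# One unchanged routine, two different model families.
for model in (RandomForestRegressor(random_state=0), SVR()):
    model.fit(X, y)
    print(type(model).__name__,
          np.round(permutation_importance(model.predict, X, y), 1))
```

Note the trade-off with algorithmic complexity: this routine calls the model once per feature, so its cost grows linearly with the number of features and with the cost of a single prediction pass.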