publish eli5 blog post

James Ravenscroft 2022-03-14 09:16:03 +00:00
parent bb96371e4f
commit e6eb376200
11 changed files with 27 additions and 6 deletions


@@ -5,10 +5,9 @@ description: An introduction to LIME ML model explainability in the context of N
 resources:
 - name: feature
   src: images/scrabble.jpg
-date: 2022-01-17T13:01:11+00:00
-url: /2022/01/17/painless-explainability-for-text-models-with-eli5
+date: 2022-03-14T08:01:11+00:00
+url: /2022/03/14/painless-explainability-for-text-models-with-eli5
 toc: true
-draft: true
 tags:
 - machine-learning
 - work
@@ -108,7 +107,7 @@ If you look at the results in Jupyter you'll get red and green highlights over t

{{<figure src="images/explanation_svm.png" caption="An example explanation from LIME">}}

The `<BIAS>` contribution is the model's underlying bias towards or against a particular class - again **within this neighbourhood**. The most intuitive way to think about this parameter is that it describes the model's perception that other examples, similar to this one, belong to the given class. It is based on the ***a priori*** probability that a randomly perturbed sample in the current neighbourhood belongs to each of those classes, since in a well-trained model with good features, different parts of the neighbourhood are more likely to correspond to different classes. The bias is usually a much smaller contributing factor than the actual features, as we see in the example above.

We can also inspect the weights/feature importances that the model has generated ***for the current local neighbourhood*** and see, for each class, which words or phrases the model thinks are predictive of a particular class.
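As a minimal sketch of what that inspection might look like - assuming a fitted `TextExplainer` instance called `te`, as used elsewhere in this post, with placeholder class names, and treating the intercept lookup as an illustration of where `<BIAS>` comes from rather than a definitive account of eli5's internals:

```python
# Feature weights of the local (neighbourhood) model for each class,
# rather than an explanation of one specific prediction.
te.explain_weights(target_names=["negative", "positive"])

# The local surrogate is an ordinary scikit-learn estimator, so its
# per-class intercept - which is what <BIAS> reflects - is also available:
print(te.clf_.intercept_)
```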
@@ -414,6 +413,18 @@ def remote_model_adapter(texts: List[str]):
    return softmax(np.array(all_scores), axis=1)
```
Next we simply run the text explainer using our new model adapter. I artificially limit the number of samples so that I don't blow through that 30k token limit, but remember that doing this will also limit the local fidelity of the model - ideally we want to keep the number quite high.
```python
from eli5.lime import TextExplainer

# Keep n_samples small to stay within the API quota - at the cost of local fidelity
te = TextExplainer(n_samples=100, random_state=42)
te.fit("""The restaurant was amazing, the quality of their
food was exceptional. The waiters were so polite.""", remote_model_adapter)
te.explain_prediction(target_names=list(model.config.id2label.values()))
```
## Checking whether the explanation is trustworthy
How do we know if our explanations are good? Like any other ML model, the models produced by LIME should be evaluated on a held-out test set of perturbed examples that the local model has not seen before. If the local model does well at predicting the black-box model's outputs for other local examples it hasn't seen yet, then we can assume it is a good fit (at least within the specific 'locality' under analysis).
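A minimal sketch of pulling these metrics out of the fitted explainer (this assumes the `TextExplainer` instance `te` from above; the dict printed below is the kind of output it returns):

```python
# After te.fit(...), eli5 records how well the local white-box model
# reproduces the black-box behaviour on held-out perturbed samples:
# 'score' (higher is better) and 'mean_KL_divergence' (lower is better).
print(te.metrics_)
```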
@@ -426,10 +437,20 @@ ELI5 provides this functionality all for free (generates a test set of perturbed
{'mean_KL_divergence': 0.01961629150756376, 'score': 0.9853027527973619}
```
The `score` metric is our local model's accuracy, which is 98.5% - that's quite reassuring. The mean KL divergence is low at 0.0196 - this can be interpreted as a mean difference/divergence in the predictions of about 2% across the whole dataset, which seems acceptable (although in your use case that might not be acceptable depending on scale, regulatory requirements and so on - that's a call you or your business sponsor have to make).
If the KL divergence is high or the score is low then you have a bad local model; it's worth checking why that might be the case, and probably best not to trust the results. The [ELI5 Documentation](https://eli5.readthedocs.io/en/latest/tutorials/black-box-text-classifiers.html#should-we-trust-the-explanation) has some excellent information on specific cases where your NLP model might fail and how you might go about diagnosing these issues.
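To get a feel for what a given KL divergence number means, here is a small, purely illustrative sketch (not eli5's internal implementation) comparing the black-box and local-model probability estimates for a single hypothetical perturbed sample:

```python
import numpy as np
from scipy.stats import entropy  # entropy(p, q) computes KL(p || q)

# Hypothetical class probabilities for one perturbed sample
black_box = np.array([0.90, 0.10])   # from the black-box / remote model
surrogate = np.array([0.85, 0.15])   # from the local LIME model

# KL divergence between the two distributions; 0 means they agree exactly.
# The reported mean_KL_divergence is the average of this quantity over the
# held-out perturbed samples.
print(entropy(black_box, surrogate))
```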
## Using Explanations in Practice
Where is it appropriate to use these explanations? Certainly, if you're mid data-science workflow and you want to do some error analysis to see why you're getting surprising false positives or false negatives, then these explanations might be a great place to start.
One elephant in the room is the question of how and when to show these explanations to end users and how to set appropriate expectations. There is a lot of nuance to interpreting these explanations that should prevent one from hastily drawing far-reaching conclusions (as we discussed above). Therefore, if you are planning to allow users to see explanations, some level of training and an informative, clear user experience are going to be key. Given this nuance, I don't think I'd recommend using explanations in an automated capacity (e.g. workflows that change depending on the outcome of a LIME model) unless it is just to flag something for manual review.
Finally, we should briefly touch on the compute requirements. Every time we ask for an explanation we generate and score a few thousand perturbed samples (depending on the sample size parameter) and fit a local model over them, which takes 20-30 seconds on my 2020 laptop. Getting an explanation for one data point is therefore not prohibitive, but generating one for every decision ever made by a model quickly becomes impractical in production environments with 10k+ rows of data (more likely millions of rows). A compromise could be to allow end users to request an explanation for specific decisions they're interested in on a case-by-case basis - a 30 second wait can be managed through user experience.
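To make the case-by-case idea concrete, here is a rough sketch of what an on-demand explanation endpoint could look like. FastAPI is an arbitrary choice on my part, `remote_model_adapter` is the adapter we defined earlier, and the endpoint shape is hypothetical rather than a recommendation:

```python
from eli5.formatters import format_as_html
from eli5.lime import TextExplainer
from fastapi import FastAPI
from fastapi.responses import HTMLResponse

app = FastAPI()

@app.get("/explain")
def explain(text: str) -> HTMLResponse:
    # Fit a fresh local model for just this text - this is the slow
    # (tens of seconds) part, so it only runs when a user asks for it.
    # remote_model_adapter is the adapter function defined earlier in the post.
    te = TextExplainer(n_samples=1000, random_state=42)
    te.fit(text, remote_model_adapter)

    # Render the explanation as HTML that the UI can display directly.
    return HTMLResponse(format_as_html(te.explain_prediction()))
```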
## Conclusion
In this post I have given you an insight into how LIME works under the covers and how it uses simple local models to offer explanations of more powerful black-box models. I've discussed some of the limitations of this approach and given some practical code examples for how you could apply LIME to commonly used frameworks in Python as well as a remote model API.