update eli5 post

2022-01-17 12:02:53 +00:00 · 2022-01-17 12:02:53 +00:00 · 046e037d6b
parent 797ab81b4a
commit 046e037d6b
2 changed files with 71 additions and 49 deletions
--- a/brainsteam/content/posts/2022/01/13-01-painless-explainability-for-text-models-with-eli5/images/weights.png
+++ b/brainsteam/content/posts/2022/01/13-01-painless-explainability-for-text-models-with-eli5/images/weights.png
--- a/brainsteam/content/posts/2022/01/13-01-painless-explainability-for-text-models-with-eli5/index.md
+++ b/brainsteam/content/posts/2022/01/13-01-painless-explainability-for-text-models-with-eli5/index.md
@ -1,46 +1,21 @@
 ---
 title: Painless Explainability for NLP/Text Models with LIME and ELI5
-type: post
+type: draft
 description: An introduction to LIME ML model explainability in the context of NLP usage and how to use ELI5 library - a painless way to use LIME local explainability for almost any model.
 resources:
 - name: feature
   src: images/scrabble.jpg
 date: 2022-01-13T07:47:11+00:00
 url: /2022/01/13/painless-explainability-for-text-models-with-eli5
+toc: true
 tags:
  - machine-learning
  - work
  - explainability
 ---

-# Contents

- [Contents](#contents)
- [Introduction](#introduction)
- [Understanding LIME](#understanding-lime)
-  - [Local](#local)
-  - [Interpretable](#interpretable)
-  - [Model-Agnostic](#model-agnostic)
-  - [Explanation](#explanation)
- [Usage Examples](#usage-examples)
-  - [Requirements and Setup](#requirements-and-setup)
-  - [ELI5 and Sci-kit Learn](#eli5-and-sci-kit-learn)
-    - [Why SVM and LSA?](#why-svm-and-lsa)
-    - [Training the Model](#training-the-model)
-    - [Getting Some Predictions](#getting-some-predictions)
-    - [Getting an Explanation](#getting-an-explanation)
-  - [ELI5 and Transformers/Huggingface](#eli5-and-transformershuggingface)
-    - [Why Transformers?](#why-transformers)
-    - [Loading The Model](#loading-the-model)
-    - [Defining the Interface with ELI5](#defining-the-interface-with-eli5)
-    - [Getting an Explanation](#getting-an-explanation-1)
-  - [ELI5 and a Remotely Hosted Model / API](#eli5-and-a-remotely-hosted-model--api)
-    - [Setting up](#setting-up)
-    - [Building a Remote Model Adapter](#building-a-remote-model-adapter)
-
-
-
-# Introduction
+## Introduction


 Explainability of machine learning models is a hot topic right now - particularly in deep learning where models are that bit harder to reason about and understand. These models are often called 'black boxes' because you put something in, you get something out and you don't really know how that outcome was achieved. The ability to explain a machine learning model's decisions in terms of the features passed in is useful from a debugging standpoint (identifying features with weird weights), and with legislation like [GDPR's Right to an Explanation](https://www.privacy-regulation.eu/en/r71.htm) it is also becoming important in a commercial setting to be able to explain why models behave a certain way.
@ -52,7 +27,7 @@ In this post I will give a simplified overview of how LIME works (I may take som



-# Understanding LIME
+## Understanding LIME

 LIME stands for **L**ocal, **I**nterpretable **M**odel-agnostic **E**xplanations and is a technique proposed by [Ribeiro et al.](https://arxiv.org/abs/1602.04938) in 2016. The basic premise is that for a given input example (in an image classifier we're talking 1 image, in a text classifier we're talking 1 unit of text e.g. a paragraph or a sentence, in a numerical model trained on tabular data we're talking 1 row from that table), LIME can approximate how much of an effect each of the features extracted from the input has on the final output (i.e. how important is a cluster of pixels in an image? How important are specific words/phrases in a sentence? How important is each column in that row of numbers?).

@ -63,7 +38,7 @@ For a given example both contributing and negating features are highlighted (rea



-## Local
+### Local

 The local aspect of LIME is described in [the paper](https://arxiv.org/abs/1602.04938):

@ -103,7 +78,7 @@ The point I'm trying to make is that it's very difficult to come up with good ge
 Therefore when we're using LIME, we should avoid saying things like "The model seems to consider the words 'million' and 'usd' spammy" and we should say things like "in cases similar to the widow email, it looks like the words 'million' and 'usd' contributed to the decision that this email was spam in the absence of any other redeeming words".


-## Interpretable
+### Interpretable

 Some machine learning models like [linear models](https://scikit-learn.org/stable/modules/linear_model.html) and [Decision Trees](https://scikit-learn.org/stable/modules/tree.html) are inherently interpretable through being able to measure parameter coefficients (how big the weight of the feature is when calculating the decision boundary line) in the case of the former and how early on a feature appears in a decision tree (since decision trees use [information gain](https://en.wikipedia.org/wiki/Information_gain_in_decision_trees) to put features that tell us most about the final classification/decision near the top of the tree so that they impact more data points) in the case of the latter.

@ -119,20 +94,32 @@ For text models, LIME uses [Bag-of-Words](https://en.wikipedia.org/wiki/Bag-of-w
 We can then use the interpretable information (parameter coefficients/feature position in decision tree) for the local model to approximately interpret the effect that the different words have on the bigger model since each word in the local BoW vocabulary will have an associated coefficient.
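
To make the idea concrete, here is a minimal sketch of the perturb-and-fit loop, assuming only `numpy`. The `black_box` function is a hypothetical stand-in for a real classifier and `explain_locally` is an illustrative helper, not part of LIME or ELI5. Real LIME also weights perturbed samples by their similarity to the original example, which this sketch leaves out.

```python
import numpy as np


def black_box(texts):
    """Hypothetical classifier: P(spam) rises when 'million' or 'usd' appear."""
    scores = np.array([
        2.0 * ("million" in t.split()) + 1.5 * ("usd" in t.split()) - 1.0
        for t in texts
    ])
    return 1.0 / (1.0 + np.exp(-scores))  # squash scores into probabilities


def explain_locally(text, predict, n_samples=500, seed=0):
    """Fit a linear BoW model on perturbed copies of one example.

    Assumes every word in `text` is unique, to keep the sketch short.
    """
    rng = np.random.default_rng(seed)
    words = text.split()
    # Each row is a binary mask: 1 keeps the word, 0 drops it.
    masks = rng.integers(0, 2, size=(n_samples, len(words)))
    masks[0] = 1  # make sure the unperturbed example is included
    perturbed = [
        " ".join(w for w, keep in zip(words, row) if keep) for row in masks
    ]
    target = predict(perturbed)  # ask the black box about every perturbation
    # Least-squares fit of a linear model with an extra bias column.
    design = np.hstack([masks, np.ones((n_samples, 1))])
    coefs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return dict(zip(words + ["<BIAS>"], coefs))


weights = explain_locally("i have 5 million usd for you", black_box)
for word, weight in sorted(weights.items(), key=lambda kv: -abs(kv[1])):
    print(f"{word:>8} {weight:+.3f}")
```

The coefficients for 'million' and 'usd' should come out clearly positive while the filler words stay close to zero, which is the kind of per-word signal the local model exposes.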


-## Model-Agnostic
+### Model-Agnostic

 LIME's model agnosticism is one of its most useful attributes. As long as you know how to encode the input data and your model has the ability to provide probability distributions over its outputs, you can provide local explanations for any type of model. This is because the explanation comes from the local model and the BoW features therein rather than the black box model.

 In the section below I've provided some examples of how to use ELI5 with some different types of models.

-## Explanation
+### Explanation

-As we saw at the beginning of the post, the explanations that are produced by LIME for NLP models are usually 
+Explanations that are produced by LIME for NLP models are expressed in terms of which words/phrases were considered as the biggest contributing factors towards a class decision by the model. 
+
+If you look at the results in Jupyter you'll get green and red highlights over the text input showing the degree to which each word increased (green) or reduced (red) the likelihood that the input example is from the class under the microscope. In the example below you can see that kidney stones and medication are keywords that the model has learned can be used to classify examples in this neighbourhood (remember these explanations don't apply globally) as medical and that the presence of these words detracts from the likelihood that the email is about religion or graphic design.
+
+{{<figure src="images/explanation_svm.png" caption="An example explanation from LIME">}}
+
+The `<BIAS>` contribution is the model's underlying bias towards or against a particular class - again ***within this neighbourhood***. The most intuitive way to think about this parameter is that it describes the model's perception that other examples, similar to this one, belong to the given class. The bias is usually a much smaller contributing factor than the actual features as we see in the example above.
+
+We can also inspect the weights/feature importances that the model has generated ***for the current local neighbourhood*** and see, for each class, what words or phrases the model thinks are predictive of a particular class.
+
+{{<figure src="images/weights.png" caption="Feature importances example">}}
+
+This table can also be useful as it can highlight surprising/incorrect results, for example that "to be" or "do anything" might signal a post about atheism. It's always worth having a look and if you see anything weird then also [check whether the explanation is trustworthy](#checking-whether-the-explanation-is-trustworthy) or whether your black-box model might be doing something strange.


-# Usage Examples
+## Usage Examples

-## Requirements and Setup
+### Requirements and Setup

 In order to get any of the examples below running you will need a relatively recent version of Python 3 and the [eli5](https://eli5.readthedocs.io/en/latest/autodocs/lime.html#eli5.lime.lime.TextExplainer) library installed too. You will probably want to run the example code in a [Jupyter Notebook](https://jupyter.org/) so that you can see the pretty graphical explanations.

@ -140,7 +127,7 @@ If you're not sure about which version of Python to install, you might want to h

 All of these examples will work fine on machines without GPUs although the [transformer model](#eli5-and-transformershuggingface) is a little slow running on CPU (it takes about 60 seconds to run on my 2020 Dell XPS w/ i7, 16GB RAM).

-## ELI5 and Sci-kit Learn
+### ELI5 and Sci-kit Learn

 [Scikit-Learn](https://scikit-learn.org/stable/) is one of the most widely used machine learning libraries among data scientists. In this first example we're going to train a model in sci-kit learn and then use ELI5 to get an explanation for it. Make sure you have your python environment set up and [scikit-learn](https://scikit-learn.org/stable/) installed.

@ -148,7 +135,7 @@ If you recognise the following example that's because it is also the example tha

 We are going to train a [Support Vector Machine (SVM)](https://en.wikipedia.org/wiki/Support-vector_machine) model to predict which newsgroup an email came from thanks to the [20 newsgroup](https://scikit-learn.org/stable/datasets/real_world.html#newsgroups-dataset) dataset. SVMs with a linear kernel do have feature coefficients which could be used to provide global feature importance. However, to make it harder we will be using an [RBF](https://en.wikipedia.org/wiki/Radial_basis_function_kernel) kernel and we will use [Latent Semantic Analysis](https://en.wikipedia.org/wiki/Latent_semantic_analysis) because that's the setup used in the example and it's a combination that cannot be explained simply without LIME. 

-### Why SVM and LSA?
+#### Why SVM and LSA?

 So why do they use RBF and LSA? Is it a contrived example just to show off LIME?

@ -157,7 +144,7 @@ Well LSA is often used as a way to get more performance from an underlying [BoW]
 RBF is an SVM kernel that can separate data that is not linearly separable and there's a great explanation of this [here](https://www.kdnuggets.com/2016/06/select-support-vector-machine-kernels.html). RBF is often cited as a [reasonable first choice](https://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf) of kernel for SVMs. However, NLP practitioners will generally [recommend a linear kernel for text classification](https://www.svm-tutorial.com/2014/10/svm-linear-kernel-good-text-classification/) as in practice, and in my experience, text is usually linearly separable. It will always depend on the dataset though, so do some visualisation during exploratory analysis to see if an RBF kernel is appropriate.


-### Training the Model
+#### Training the Model

 First we are going to use scikit-learn's built in [fetch_20newsgroups](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_20newsgroups.html#sklearn.datasets.fetch_20newsgroups) helper function to download some example emails from 4 newsgroups. There could reasonably be some serious overlap between the atheism and christian boards so this might be where LSA and our RBF kernel come in handy.

@ -219,7 +206,7 @@ pipe.score(twenty_test.data, twenty_test.target)

 ```

-### Getting Some Predictions
+#### Getting Some Predictions

 Now that the model is trained it is possible to run it on unseen data and get a prediction. In the tutorial
 the ELI5 authors provide a pretty printing function that shows the probability distribution of the labels for
@ -237,7 +224,7 @@ print_prediction(doc)

 This is basically just predicting the classes for the given document, which is the first doc in the test set, and then combining the probabilities in the prediction (`y_pred`) with the class names (`twenty_train.target_names`).

-### Getting an Explanation
+#### Getting an Explanation

 Getting an explanation out of this model is relatively simple at this point. We simply import the [TextExplainer](https://eli5.readthedocs.io/en/latest/autodocs/lime.html#eli5.lime.lime.TextExplainer) class from ELI5 and `fit()` it to the document (the first one in the test set as per the above snippet). The TextExplainer will use the SVC pipeline `pipe` to make predictions for a bunch of perturbed examples and train its own model. The `show_prediction` function will then give a visualisation of the explanation. The `target_names=` parameter is used to pass the class names from our dataset to the text explainer so that they can be displayed nicely.

@ -254,20 +241,28 @@ Et voila! Hopefully you will get some output that looks like the below:

 {{<figure src="images/explanation_svm.png" caption="The output of the explain function should look something like this">}}

+Finally we can look at the model weights too:
 
-## ELI5 and Transformers/Huggingface
+ ```python
+te.explain_weights(target_names=twenty_train.target_names)
+```
+
+{{<figure src="images/weights.png" caption="Model feature weights">}}
+
+
+### ELI5 and Transformers/Huggingface

 [Transformers](https://huggingface.co/docs/transformers/index) is an open source library provided by HuggingFace which provides an easy to use wrapper around PyTorch and Tensorflow specifically to make it easy to use transformer-based NLP models like BERT, RoBERTa etc. In order to use ELI5 with Transformers from huggingface, we need to have Python3, [transformers](https://huggingface.co/docs/transformers/index) and a recent version of [pytorch](https://pytorch.org/) installed.

 This example will work on a machine without a GPU provided you aren't planning on training your transformer model from scratch. I am using [this sentiment model](https://huggingface.co/nlptown/bert-base-multilingual-uncased-sentiment) which evaluates the sentiment/rating of reviews from 1 to 5 in English, Dutch, German, French or Spanish.

-### Why Transformers?
+#### Why Transformers?

 Transformer-based models are, at the time of writing, **the in thing** for NLP models - they are a type of deep neural network that has contextual understanding of full sentences. If you're not familiar with them  [this article](https://towardsdatascience.com/transformers-89034557de14) offers a fairly good introduction.

 There are good reasons for not using transformers - first and foremost is that they are very computationally expensive to train and somewhat computationally expensive during inference (as you will see if you run both the above SVM experiment and the below transformer experiment). If you find that a less powerful (both in terms of understanding and in terms of power consumption) model works for your use case then using that instead is probably a good move - it'll save you headaches later if you need to scale up your inference operation.

-### Loading The Model
+#### Loading The Model

 The following snippet of code simply loads the model into memory and sets up the tokenizer ready for use with new text examples.

@ -287,7 +282,7 @@ tokenizer = AutoTokenizer.from_pretrained(MODEL)
 model = AutoModelForSequenceClassification.from_pretrained(MODEL)
 ```

-### Defining the Interface with ELI5
+#### Defining the Interface with ELI5

 This snippet of code defines the all important `model_adapter` function which we use to interface between PyTorch and ELI5.

@ -328,7 +323,7 @@ def model_adapter(texts: List[str]):

 ```

-### Getting an Explanation
+#### Getting an Explanation

 The last piece in the puzzle is to actually run the model and get our explanation. Firstly we initialize our explainer object. 
 `n_samples` gives the number of perturbed examples that LIME should generate in order to train the local model (more samples 
@ -357,19 +352,25 @@ Et voila! Hopefully you will get some output that looks like the below:

 {{<figure src="images/explanation_example.png" caption="The output of the explain function should look something like this">}}

+You might also want to check the model weights with:

-## ELI5 and a Remotely Hosted Model / API
+```python
+te.explain_weights(target_names=list(model.config.id2label.values()))
+```
+
+
+### ELI5 and a Remotely Hosted Model / API

 This one is quite fun and exciting. Since LIME is model agnostic, we can get an explanation for a remotely hosted model assuming we have access to 
 the full probability distribution over its labels (and assuming you have enough API credits to train your local model).

 In this example I'm using Huggingface's [inference api](https://api-inference.huggingface.co/docs/python/html/quicktour.html) where they host transformer models on your behalf - you can pay to have your models run on GPUs for higher throughput. I made this guide with the free tier allowance which gives you 30k tokens per month - if you are using LIME with default settings you could easily eat through this whilst generating a single explanation so this is yet again a contrived example that gives you a taster of what is possible.
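
A quick back-of-the-envelope check shows why the free allowance disappears so fast. This is a sketch under stated assumptions: `estimate_tokens` is an illustrative helper, the 5000-sample figure is assumed to be the explainer's default (check your installed version), and 20 tokens per perturbed text is a made-up average.

```python
FREE_TIER_TOKENS = 30_000  # monthly allowance quoted above


def estimate_tokens(n_samples: int, avg_tokens_per_text: int) -> int:
    """Rough token cost of one explanation: one API call per perturbed text."""
    return n_samples * avg_tokens_per_text


# Assumed values: 5000 perturbed samples, roughly 20 tokens each.
needed = estimate_tokens(n_samples=5000, avg_tokens_per_text=20)
print(f"tokens needed: {needed}, over budget: {needed > FREE_TIER_TOKENS}")

# Largest n_samples that would still fit inside the allowance.
print(f"max samples within budget: {FREE_TIER_TOKENS // 20}")
```

Under those assumptions a single explanation costs more than three times the monthly allowance, so it is worth lowering `n_samples` when experimenting against a metered API.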

-### Setting up 
+#### Setting up 

 For this part of the tutorial you will need the Python [requests](https://docs.python-requests.org/en/latest/) library and we are also going to make use of [scipy](https://docs.scipy.org). You will also need a huggingface account and you will need to set up your API key as described in the [documentation](https://api-inference.huggingface.co/docs/python/html/quicktour.html).

-### Building a Remote Model Adapter
+#### Building a Remote Model Adapter

 Firstly we need to build a model adapter function that allows ELI5 to interface with huggingface's models. 

@ -413,3 +414,24 @@ def remote_model_adapter(texts: List[str]):
    return softmax(np.array(all_scores), axis=1)
 ```

+## Checking whether the explanation is trustworthy
+
+How do we know if our explanations are good? Like any other ML model, the models produced by LIME should be evaluated using a held-out test set of perturbed examples that the local model has not seen during training. If the local model does well at predicting the black-box model's outputs for these unseen local examples, then we can assume that it is a good fit (at least within the specific 'locality' under analysis).
+
+When we evaluate the local model against the black box model we want to know that, at the very least, the local model is making the same class predictions as the parent black-box model (do both the child model and parent model predict the same most likely class). However, it is also useful to know precisely how similar those outputs are (given that both models predict the same 'most likely' class, what is the percentage difference in probability between the two predictions). A good local model should produce a very similar probability distribution to the parent black-box model for the same inputs. Therefore we use [KL-Divergence](https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence) as our performance metric in order to evaluate how well the model is performing. In a nutshell KL-Divergence tells you how similar 2 probability distributions are - and we want this number to be as small as possible (i.e. the probability distributions are pretty much the same).
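
As a rough illustration of what the metric measures, here is a minimal sketch of KL-Divergence for two discrete distributions using only the standard library. The probabilities are made up for the example, and the direction used here (black box as the reference distribution) is an assumption rather than a statement about ELI5's internals.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of probabilities.

    `eps` guards against log(0) and division by zero when a probability is 0.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


black_box_probs = [0.70, 0.20, 0.10]  # hypothetical black-box output
good_local = [0.68, 0.21, 0.11]       # local model that agrees closely
bad_local = [0.10, 0.20, 0.70]        # local model that picks another class

print("close fit:", kl_divergence(black_box_probs, good_local))
print("poor fit: ", kl_divergence(black_box_probs, bad_local))
```

The close fit gives a value near zero while the poor fit gives a much larger one, which is why a small mean KL-Divergence over the test set is reassuring.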
+
+ELI5 provides this functionality for free (it generates a test set of perturbed examples and evaluates the final model automatically) so all we need is to look at the metrics and interpret them. For any of the above examples you should be able to run `te.metrics_` in Jupyter to get an output similar to the one below:
+
+```
+{'mean_KL_divergence': 0.01961629150756376, 'score': 0.9853027527973619}
+```
+
+The `score` metric is our local model accuracy which is 98.5% - that's quite reassuring. The mean KL Divergence is low at 0.0196 - this can be interpreted as a mean difference/divergence in the predictions of about 2% across the whole dataset which seems acceptable.
+
+If the KL divergence is high or the score is low then you have a bad local model and it's worth checking to see why that might be the case and probably best not to trust the results. The [ELI5 Documentation](https://eli5.readthedocs.io/en/latest/tutorials/black-box-text-classifiers.html#should-we-trust-the-explanation) has some excellent information on specific cases where your NLP model might fail and how you might go about diagnosing these issues.
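
If you generate a lot of explanations it can help to automate this sanity check. The sketch below assumes a metrics dictionary shaped like the output above. `looks_trustworthy` is an illustrative helper, not part of ELI5, and the thresholds are hypothetical values that you should tune for your own use case.

```python
def looks_trustworthy(metrics, min_score=0.9, max_kl=0.05):
    """Return True when the local model tracks the black box closely enough.

    `min_score` and `max_kl` are hypothetical thresholds, not ELI5 defaults.
    """
    return (
        metrics["score"] >= min_score
        and metrics["mean_KL_divergence"] <= max_kl
    )


# Metrics copied from the example output above.
metrics = {"mean_KL_divergence": 0.01961629150756376, "score": 0.9853027527973619}
print(looks_trustworthy(metrics))

# A made-up poor fit for comparison.
print(looks_trustworthy({"mean_KL_divergence": 0.4, "score": 0.6}))
```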
+
+## Conclusion
+
+In this post I have given you an insight into how LIME works under the covers and how it uses simple local models to offer explanations of more powerful black-box models. I've discussed some of the limitations of this approach and given some practical code examples for how you could apply LIME to commonly used frameworks in Python as well as a remote model API.
+
+If you enjoyed this article please take a moment to tweet, toot or send me a webmention.