This is especially important in tasks that are highly context dependent (like text classification). Here's a contrived example of a spam detection use case. Take the words "7 million usd" as in:
In the first example, the words "7 million usd" contribute to the suspicion that this is a scam because they appear alongside "wealthy widow" and "help me". In the second example the words "7 million usd" aren't as important; they're words that you'd probably expect in a legitimate email about an investment opportunity from your colleague in Mergers.
The point I'm trying to make is that it's very difficult to come up with good general rules about which words are important without any context (and indeed, if you can do that then you probably don't need machine learning; you can just build a rule-based system that checks for the presence or absence of words on a list). The overall decision function of "spam or not spam" is much more complicated than "these words are good and these words are bad", but for a certain set of "spammy" examples we can certainly say which words are more spammy and which words are less spammy. This is analogous to the concepts at play in LIME too.
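To make the limitation of context-free rules concrete, here is a minimal sketch of the word-list approach. The two emails, the `SPAMMY_WORDS` list and the `flagged_words` helper are all hypothetical stand-ins invented for illustration; they are not the examples from this post.

```python
# Hypothetical word list and emails, invented purely for illustration.
SPAMMY_WORDS = {"million", "usd"}

def flagged_words(text, word_list):
    """Return the listed words that occur in the text, ignoring case and punctuation."""
    tokens = {token.strip(".,!?").lower() for token in text.split()}
    return sorted(tokens & word_list)

scam = "I am a wealthy widow, please help me move 7 million usd."
legit = "The Mergers team expects the deal to close at 7 million usd."

scam_hits = flagged_words(scam, SPAMMY_WORDS)
legit_hits = flagged_words(legit, SPAMMY_WORDS)

# Both emails trigger exactly the same rule, so the list alone
# cannot tell the scam apart from the legitimate message.
print(scam_hits, legit_hits)
```

A fixed list treats both messages identically, which is why the importance of a word has to be judged relative to the other words around it.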
Therefore when we're using LIME, we should avoid saying things like "The model seems to consider the words 'million' and 'usd' spammy" and instead say things like "in cases similar to the widow email, it looks like the words 'million' and 'usd' contributed to the decision that this email was spam in the absence of any other redeeming words".
LIME exploits these explainable models in order to explain the local context around a given input example. We perturb (slightly change) the input example and use the black-box model under analysis to make predictions. As words are added or removed from the input, the output from the black box model changes slightly (in the [contrived again] example below, removing the word 'love' from the movie review reduces the probability that the review is positive.)
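As a rough illustration of the perturbation idea, the sketch below removes one word at a time from a review and records how a toy black box's output changes. `toy_black_box`, `TOY_WEIGHTS` and `leave_one_out` are hypothetical names made up for this sketch, and LIME itself removes random subsets of words rather than exactly one word at a time.

```python
import math

# Hypothetical per-word weights for a toy sentiment "black box".
TOY_WEIGHTS = {"love": 2.0, "great": 0.5, "boring": -1.5}

def toy_black_box(text: str) -> float:
    """Return a made-up probability that the review is positive."""
    score = -1.0 + sum(TOY_WEIGHTS.get(word, 0.0) for word in text.lower().split())
    return 1.0 / (1.0 + math.exp(-score))

def leave_one_out(text: str):
    """Yield (removed_word, perturbed_text) pairs, dropping one word at a time."""
    words = text.split()
    for i, word in enumerate(words):
        yield word, " ".join(words[:i] + words[i + 1:])

review = "i love this great movie"
baseline = toy_black_box(review)

# Change in P(positive) caused by removing each word.
changes = {
    word: toy_black_box(perturbed) - baseline
    for word, perturbed in leave_one_out(review)
}
for word, delta in changes.items():
    print(f"{word:>6}: {delta:+.3f}")
```

In this toy setup, removing "love" causes the largest drop, removing "great" a smaller one, and removing the neutral words changes nothing.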
These perturbed inputs, together with the outputs that the 'black box' model we're analysing produces for them, are then used as a training set to train the local, interpretable model.
For text models, LIME uses Bag-of-Words (BoW) representations of the perturbed input as the features for the local model.
We can then use the interpretable information (parameter coefficients/feature position in decision tree) for the local model to approximately interpret the effect that the different words have on the bigger model since each word in the local BoW vocabulary will have an associated coefficient.
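The whole loop (perturb, query the black box, build BoW features, fit a local model, read off coefficients) can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not how ELI5 is implemented: `toy_black_box` and `explain` are hypothetical, the surrogate is an unweighted least-squares fit trained by gradient descent, and the proximity weighting that LIME applies to samples is left out for brevity.

```python
import math
import random

def toy_black_box(text: str) -> float:
    """Hypothetical stand-in for the model under analysis: P(review is positive)."""
    weights = {"love": 2.0, "great": 0.5}
    score = -1.0 + sum(weights.get(word, 0.0) for word in text.split())
    return 1.0 / (1.0 + math.exp(-score))

def explain(text: str, n_samples: int = 200, seed: int = 0) -> dict:
    """Fit a linear surrogate on random word-removal perturbations of `text`.

    Assumes every word in `text` is unique, to keep the sketch short.
    """
    rng = random.Random(seed)
    vocab = text.split()
    # Binary bag-of-words rows: 1 means the word was kept, 0 means removed.
    masks = [[rng.randint(0, 1) for _ in vocab] for _ in range(n_samples)]
    # Ask the black box for a prediction on every perturbed text.
    targets = [
        toy_black_box(" ".join(w for w, keep in zip(vocab, mask) if keep))
        for mask in masks
    ]
    # Least-squares fit by batch gradient descent:
    # one intercept plus one coefficient per word.
    intercept, coefs = 0.0, [0.0] * len(vocab)
    for _ in range(500):
        errors = [
            intercept + sum(c * x for c, x in zip(coefs, mask)) - y
            for mask, y in zip(masks, targets)
        ]
        intercept -= 0.3 * sum(errors) / n_samples
        for j in range(len(vocab)):
            grad = sum(e * mask[j] for e, mask in zip(errors, masks)) / n_samples
            coefs[j] -= 0.3 * grad
    return dict(zip(vocab, coefs))

coefficients = explain("i love this great movie")
top_word = max(coefficients, key=coefficients.get)
print(top_word, coefficients)
```

Each coefficient approximates how much keeping that word raises the black box's output near this particular review, which is the "local" reading of importance discussed above.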
Model-Agnostic
LIME's model agnosticism is one of its most useful attributes. As long as you know how to encode the input data and your model can provide probability distributions over its outputs, you can produce local explanations for any type of model. This is because the explanation comes from the local model and its BoW features rather than from the black box model itself.
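In practice, "model agnostic" boils down to a small contract: a callable that takes a list of texts and returns one row of class probabilities per text. The sketch below illustrates that contract with a deliberately non-neural, hypothetical `keyword_black_box`; `check_contract` is also an invented helper, and plain lists are used instead of a NumPy array to keep the example dependency-free.

```python
from typing import Callable, List

# The contract: a list of texts in, one probability row per text out.
PredictFn = Callable[[List[str]], List[List[float]]]

def keyword_black_box(texts: List[str]) -> List[List[float]]:
    """Hypothetical 'model': returns [P(negative), P(positive)] per text."""
    rows = []
    for text in texts:
        p_positive = 0.9 if "love" in text.lower().split() else 0.2
        rows.append([1.0 - p_positive, p_positive])
    return rows

def check_contract(predict: PredictFn, texts: List[str], num_classes: int) -> bool:
    """Verify shape [len(texts), num_classes] and that each row sums to 1."""
    rows = predict(texts)
    return (
        len(rows) == len(texts)
        and all(len(row) == num_classes for row in rows)
        and all(abs(sum(row) - 1.0) < 1e-9 for row in rows)
    )

ok = check_contract(keyword_black_box, ["i love it", "not for me"], num_classes=2)
print(ok)
```

Anything that satisfies this shape, whether it is a transformer, a scikit-learn pipeline or a remote API, can be plugged in the same way.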
In the section below I've provided some examples of how to use ELI5 with some different types of models.
Explanation
As we saw at the beginning of the post, the explanations that are produced by LIME for NLP models are usually
Usage Examples
ELI5 and scikit-learn
ELI5 and Transformers/Huggingface
Transformers is an open-source library from HuggingFace that provides an easy-to-use wrapper around PyTorch and TensorFlow, designed to make it easy to work with transformer-based NLP models like BERT and RoBERTa. In order to use ELI5 with Transformers, we need Python 3, transformers, a recent version of PyTorch and the eli5 library installed. You will probably want to run this code in a Jupyter Notebook so that you can see the pretty graphical explanations.
This example will work on a machine without a GPU provided you aren't planning on training your transformer model from scratch. I am using this sentiment model which evaluates the sentiment/rating of reviews from 1 to 5 in English, Dutch, German, French or Spanish.
Loading The Model
The following snippet of code simply loads the model into memory and sets up the tokenizer, ready for use with new text examples.
from transformers import AutoModelForSequenceClassification
from transformers import AutoTokenizer
import numpy as np
import pandas as pd
from typing import List

# this is the name of the model we want to evaluate on
# huggingface.com/models or alternatively you could train your own
MODEL = "nlptown/bert-base-multilingual-uncased-sentiment"

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)
Defining the Interface with ELI5
This snippet of code defines the all-important model_adapter function, which we use to interface between PyTorch and ELI5.
ELI5 expects to be able to pass in a list of perturbed texts and get back a set of probability distributions (a matrix in the shape [NUM_EXAMPLES, NUM_CLASSES]).
In our function we first encode the text into a BERT-compatible input format using the tokenizer.
Then we pass the encoded input to the model and receive some predictions.
Finally we use softmax(), which converts the raw logits generated by the model into the probability distributions that LIME is expecting to see.
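For readers who haven't met softmax before, here is a small standalone sketch of what it does to a row of logits. The logit values are invented for illustration, and this pure-Python `softmax` is only a stand-in for the tensor method used in the adapter.

```python
import math
from typing import List

def softmax(logits: List[float]) -> List[float]:
    """Convert raw scores into probabilities that are positive and sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for a five-class (1 to 5 star) model.
probs = softmax([-1.2, -0.5, 0.3, 1.1, 2.4])
print([round(p, 3) for p in probs])
```

The ordering of the classes is preserved (the largest logit still gets the largest probability), but the values now form a valid probability distribution.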
You may be wondering about the for loop and the batches. ELI5 tries to get results for 5000 samples at a time (by default). That might be fine with a smaller, less powerful model, but with a transformer we can't fit all of those examples into memory at once. Therefore we split the samples into batches of 64 so that we don't end up running out of RAM.
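The batching arithmetic can be checked in isolation. The `batches` helper below is a hypothetical name for the same slicing pattern that the adapter uses inline, applied to 5000 dummy strings.

```python
from typing import Iterator, List

def batches(items: List[str], batch_size: int = 64) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `batch_size` items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# 5000 dummy texts standing in for the perturbed samples.
texts = [f"sample {i}" for i in range(5000)]
sizes = [len(batch) for batch in batches(texts)]

print(len(sizes), sizes[0], sizes[-1])
```

With 5000 samples and a batch size of 64 this gives 78 full batches plus a final batch of 8, so the model only ever sees 64 texts at once.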
def model_adapter(texts: List[str]):
    all_scores = []
    for i in range(0, len(texts), 64):
        batch = texts[i:i + 64]
        # use bert encoder to tokenize text
        encoded_input = tokenizer(batch,
                                  return_tensors='pt',
                                  padding=True,
                                  truncation=True,
                                  max_length=model.config.max_position_embeddings - 2)
        # run the model
        output = model(**encoded_input)
        # by default this model gives raw logits rather
        # than a nice smooth softmax so we apply it ourselves here
        scores = output[0].softmax(1).detach().numpy()
        all_scores.extend(scores)
    return np.array(all_scores)
Getting an explanation
The last piece of the puzzle is to actually run the model and get our explanation. First we initialize our explainer object, then we pass in the text that we'd like to get an explanation for.
n_samples gives the number of perturbed examples that LIME should generate in order to train the local model (more samples should give a more faithful local explanation, at the cost of more compute and a longer run time).
random_state is simply a number used to seed the pseudo-random number generator that LIME uses to randomly decide which samples to generate. Setting the random state explicitly is a good habit to get into in order to keep your explanations reproducible.
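The effect of fixing the seed can be seen with a tiny standalone sampler. `sample_perturbations` is a hypothetical helper written for this illustration, not ELI5's own sampler; it only shows that the same seed yields the same perturbed texts.

```python
import random
from typing import List

def sample_perturbations(text: str, n_samples: int, random_state: int) -> List[str]:
    """Randomly drop words from `text`, using a seeded generator."""
    rng = random.Random(random_state)
    words = text.split()
    return [
        " ".join(w for w in words if rng.random() < 0.5)
        for _ in range(n_samples)
    ]

first = sample_perturbations("the waiters were so polite", 5, random_state=42)
second = sample_perturbations("the waiters were so polite", 5, random_state=42)

# Same seed, same perturbed samples, so the local model sees identical data.
print(first == second)
```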
from eli5.lime import TextExplainer

te = TextExplainer(n_samples=5000, random_state=42)
te.fit("""The restaurant was amazing, the quality of their
food was exceptional. The waiters were so polite.""", model_adapter)
te.explain_prediction(target_names=list(model.config.id2label.values()))
ELI5 and a Remotely Hosted Model / API