working on rationale post
continuous-integration/drone/push Build is passing

James Ravenscroft 2021-01-02 18:22:23 +00:00
parent 99b8dddf77
commit c45c30dc5c
4 changed files with 34 additions and 2 deletions


@@ -0,0 +1 @@
<mxfile host="app.diagrams.net" modified="2021-01-02T18:15:11.052Z" agent="5.0 (X11)" etag="sLTzhic4Rizf62juCqSs" version="14.1.7" type="device"><diagram id="F5Ue9vwdzeswKv2T4yi_" name="Page-1">zVdZc9owEP41nmkfYIyFOR45Qo9MaTpkpvRRsYWtibBcWYDpr+9Klm8TkrTTFBiQPmkP7X6rNRZa7NMPAsfhF+4TZjm2n1poaTnOYOg4lvrY/jlDps4wAwJBfbOpBDb0FzGgbdAD9UlS2yg5Z5LGddDjUUQ8WcOwEPxU37bjrG41xgFpARsPszb6nfoyzNCJMy7xj4QGYW55MJpmK3ucbzYnSULs81MFQjcWWgjOZTbapwvCVPDyuGRyqwurhWOCRPI5Ar31dH+0v63RaTW5jZJ1GvPbntFyxOxgDmw5Iwb65uFARZDRINL46OdBOTr3wBoR5RxGgfrd5nIPIodyBFxSyupgnAOfEpXSkAii7EVnGdIogOEDkdoQrOEIftbU4wyrzQuVMseercEt+zMP1eq9wEdgBbbQCmYr7KkdX3c7df5EK1ZfnqQ80kHzHonf7/efdDouMZ0/ec5JIUmqQyT3DIABDBNQqfxGS7ec3fMYgB7kBc1PIZVkEyvH0PIEtQIYPxKxY5oTIfV9ApGeC36IfKJSpqR2lLEFZ1xou2jnqrcyIAV/JJWVkX4pCR7JCp69ijOAQfD9IoEGBS2hngnfEynOsMUIjAyRTSU7QzM/lXVR7AkrNTExGDalGBSaS7bCwBD2BeR1OsjbyBSJ/Jm6BWDmAXsS6tXzlgkQv3UJXA1K5dRux6FzTBCGJT3W1XdFwli44xQMFzFHTiPozWAm/CA8YqSq1d9UNL6iSGIRENlSpBNTHPv1uXJbuVq9275v5aurTCrpwkmcXfI7mqoi0XTfGPFBPs+aCDSav0L7VuQ6eD/5l7Qf/TntSUrltjL+oe6bvmtmy9RcP3pyNpPnlgo4ojl5jQxvVlLTej5d+5UlNZy8bUmNL/Zunx6b/UwVRo0jeQNXC71El8wMNjjDOG1393OlLWaaWt2yw+ZlN7yiQ5WWkG0vFioXTeN3PKE630/70NnDay5c7eL54455yEFzdXFQeBicmYU9tGkl3tnOG627eTP9T628Rd2h03dbt9oAdVTh2H3xtQbT8jE3Y3/5ZwHd/AY=</diagram></mxfile>

Binary file not shown (image added, 50 KiB).

Binary file not shown (image added, 11 KiB).


@@ -2,7 +2,7 @@
title: Explain Yourself! Self-Rationalizing NLP Models
author: James
type: post
draft: false
resources:
- name: feature
  src: images/feature.jpg
@@ -22,6 +22,37 @@ tags:
---
## Introduction
The ability to understand and rationalise about automated decisions is becoming particularly important as more and more businesses adopt AI into their core processes, especially in light of legislation like GDPR, which gives the subjects of automated decisions the right to an explanation of why a decision was made. There have been a number of breakthroughs in explainable models in the last few years as academic teams in the machine learning space focus their attention on the why and the how.
## Recent Progress in Model Explainability
Significant breakthroughs in model explainability were seen in the likes of [LIME](https://towardsdatascience.com/understanding-model-predictions-with-lime-a582fdff3a3b) and [SHAP](https://towardsdatascience.com/explain-your-model-with-the-shap-values-bc36aac4de3d), where local surrogate models, which are explainable but only for the small number of data samples under observation, are used to approximate the importance/contribution of features to a particular decision. These approaches are powerful when input features are meaningful in their own right (e.g. bag-of-words representations where a feature may be the presence or absence of a specific word) but are less helpful when input features are too abstract or are the output of some other black box (e.g. multi-dimensional word vectors or RGB values from pixels).
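
To make the local-surrogate idea concrete, here is a minimal sketch (not the actual LIME or SHAP implementation) of fitting a weighted linear model around a single prediction. `black_box_predict` is a hypothetical stand-in for whatever classifier is being explained, and the masking scheme and similarity kernel are illustrative choices.

```python
import numpy as np
from sklearn.linear_model import Ridge

def explain_locally(text, black_box_predict, num_samples=1000, rng=None):
    """Approximate a black-box classifier around one document with a
    weighted linear surrogate (the core idea behind LIME-style explainers)."""
    if rng is None:
        rng = np.random.default_rng(0)
    tokens = text.split()
    n = len(tokens)

    # Perturb the document by randomly masking words (1 = keep, 0 = drop).
    masks = rng.integers(0, 2, size=(num_samples, n))
    masks[0] = 1  # keep the original document as the first sample
    perturbed = [" ".join(t for t, keep in zip(tokens, m) if keep) for m in masks]

    # Query the black box for its (assumed scalar) score on each perturbation.
    preds = np.array([black_box_predict(doc) for doc in perturbed])

    # Weight perturbations by how close they stay to the original document.
    distances = 1.0 - masks.mean(axis=1)        # fraction of words removed
    weights = np.exp(-(distances ** 2) / 0.25)  # simple exponential kernel

    # Fit an interpretable linear model on the (mask -> prediction) pairs.
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(masks, preds, sample_weight=weights)

    # Each coefficient approximates that word's local contribution.
    return sorted(zip(tokens, surrogate.coef_), key=lambda x: -abs(x[1]))
```

Real implementations add proper distance kernels, feature selection and (for SHAP) game-theoretic weighting, but the weighted linear fit over perturbed neighbours is the essential mechanism.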
[Transformer](https://ai.googleblog.com/2017/08/transformer-novel-neural-network.html)-based models like [BERT](https://github.com/google-research/bert), which use the concept of neural attention to learn contextual relationships between words, can also be interrogated by [visualising attention patterns inside the model](https://towardsdatascience.com/deconstructing-bert-part-2-visualizing-the-inner-workings-of-attention-60a16d86b5c1). However, these visualisations are still quite complex (especially for transformer-based models, which typically have multiple parallel attention mechanisms to examine) and do not provide a concise or intuitive rationalisation of model behaviour.
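
As an illustration of the raw signal those visualisations are built on, the sketch below pulls the attention matrices out of a pre-trained BERT model using the Hugging Face transformers library. The model name and the particular layer/head inspected are arbitrary example choices, not anything prescribed by the visualisation tools.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

sentence = "The movie was surprisingly good"
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions is a tuple with one tensor per layer,
# each shaped (batch, num_heads, seq_len, seq_len).
attentions = outputs.attentions
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())

layer, head = 5, 3  # arbitrary layer/head to inspect
for i, row in enumerate(attentions[layer][0, head]):
    top = row.argmax().item()
    print(f"{tokens[i]:>12} attends most to {tokens[top]}")
```

With 12 layers and 12 heads in the base model, that is 144 separate attention maps per sentence to inspect, which is exactly why these views are hard to turn into a single human-readable rationale.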
## Rationalization of Neural Predictions
In 2016, [Lei, Barzilay and Jaakkola](https://people.csail.mit.edu/taolei/papers/emnlp16_rationale.pdf) described a new architecture for rationale extraction from NLP models. The aim was to build a model that could extract a "short and coherent" justification for why it made a particular prediction.
{{<figure src="images/figure1.png" caption="An example of a review ranked in two categories. The rationale for the Look prediction is shown in bold. From [Lei, Barzilay and Jaakkola 2016](https://people.csail.mit.edu/taolei/papers/emnlp16_rationale.pdf)">}}
The idea is actually quite simple. Firstly, let's assume we're starting with a classification problem where we want to take a document **X** and train a classifier function **F(X)** to predict a label **y** based on the text in the document (e.g. **X** is a movie review and **y** is positive or negative sentiment).
{{<figure src="images/simple_classifier.png" caption="A simple text classifier that tries to learn to predict y based on text in X">}}
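
As a concrete (and deliberately simple) stand-in for **F(X)**, here is a minimal sketch using a TF-IDF bag-of-words and logistic regression; the tiny review dataset is made up purely for illustration, and the rationale approach applies just as well to neural encoders.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-ins for (X, y): movie reviews and sentiment labels.
X = [
    "a charming, beautifully acted film",
    "utterly tedious and far too long",
    "the plot is thin but the cast is wonderful",
    "I walked out halfway through",
]
y = ["positive", "negative", "positive", "negative"]

# F(X): a plain classifier that maps the whole document straight to a label.
F = make_pipeline(TfidfVectorizer(), LogisticRegression())
F.fit(X, y)

print(F.predict(["a wonderful cast saves a thin plot"]))  # likely ['positive'] on this toy data
```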
What Lei, Barzilay and Jaakkola propose is that we add a new step to this process. We're going to introduce **G(X)**, a generator, which aims to generate a rationale **R** for the document. Then we're going to train our classifier **F(X)** to predict **y** not from the document representation **X** but from the rationale **R**. Our new process looks something like this:
<!-- which was followed up and improved upon in 2019 by [Yu, Chang, Zhang and Jaakola](http://people.csail.mit.edu/tommi/papers/YCZJ_EMNLP2019.pdf). The idea is surprisingly simple: take an existing neural classification model, tack on a new -->
<script type="text/javascript"
src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
</script>
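
To make the generator/encoder split above concrete, here is a rough PyTorch sketch of the setup; it is not the authors' code. The generator scores each token and samples a keep/drop mask (the rationale **R**), the encoder classifies using only the kept tokens, and the loss adds sparsity and coherence penalties so rationales stay short and contiguous. The Gumbel-Softmax relaxation, layer sizes and regulariser weights are all illustrative choices (the original paper samples hard masks and trains with REINFORCE-style gradients).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F_nn

class Generator(nn.Module):
    """G(X): scores every token and samples a binary keep/drop mask (the rationale R)."""
    def __init__(self, vocab_size, emb_dim=100, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.score = nn.Linear(2 * hidden, 2)  # per-token logits for drop / keep

    def forward(self, tokens):
        h, _ = self.rnn(self.emb(tokens))
        logits = self.score(h)
        # Gumbel-Softmax gives a differentiable approximation to sampling hard 0/1 choices.
        mask = F_nn.gumbel_softmax(logits, tau=0.5, hard=True)[..., 1]
        return mask  # shape (batch, seq_len), entries ~0 or 1

class Encoder(nn.Module):
    """F(R): predicts the label from the rationale only (non-selected tokens are zeroed out)."""
    def __init__(self, vocab_size, num_classes, emb_dim=100, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, num_classes)

    def forward(self, tokens, mask):
        x = self.emb(tokens) * mask.unsqueeze(-1)   # hide everything outside the rationale
        _, (h, _) = self.rnn(x)
        return self.out(h[-1])

def rationale_loss(logits, labels, mask, sparsity=0.01, coherence=0.01):
    """Task loss plus 'short and coherent' regularisers (weights are illustrative)."""
    task = F_nn.cross_entropy(logits, labels)
    short = mask.sum(dim=1).mean()                                    # prefer few selected tokens
    coherent = (mask[:, 1:] - mask[:, :-1]).abs().sum(dim=1).mean()   # prefer contiguous spans
    return task + sparsity * short + coherence * coherent
```

The two modules are trained jointly, so the generator is pushed to select exactly the spans the encoder needs in order to predict **y**, while the sparsity and coherence terms keep those spans short and contiguous.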