wzuidema committed on
Commit 28caaa6
1 Parent(s): d271155

Update description.md

Files changed (1)
  1. description.md +1 -8
description.md CHANGED
@@ -3,12 +3,5 @@
  In this demo, we use the RoBERTa language model (optimized for masked language modelling and finetuned for sentiment analysis).
  The model predicts for a given sentences whether it expresses a positive, negative or neutral sentiment.
  But how does it arrive at its classification? This is, surprisingly perhaps, very difficult to determine.
- A range of so-called "attribution methods" have been developed that attempt to determine the importance of the words in the input for the final prediction;
- they provide a very limited form of "explanation" -- and often disagree -- but sometimes provide good initial hypotheses nevertheless that can be further explored with other methods.
 
- Abnar & Zuidema (2020) proposed a method for Transformers called "Attention Rollout", which was further refined by Chefer et al. (2021) into Gradient-weighted Rollout.
- Here we compare it to another popular method called Integrated Gradients.
-
- * Gradient-weighted attention rollout, as defined by [Hila Chefer](https://github.com/hila-chefer)
- [(Transformer-MM_explainability)](https://github.com/hila-chefer/Transformer-MM-Explainability/), with rollout recursion upto selected layer
- * Layer IG, as implemented in [Captum](https://captum.ai/)(LayerIntegratedGradients), based on gradient w.r.t. selected layer.
+ Abnar & Zuidema (2020) proposed a method for Transformers called "Attention Rollout", which was further refined by Chefer et al. (2021) into **Gradient-weighted Rollout**. Here we compare it to another popular method called **Integrated Gradients**.
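
For readers unfamiliar with the rollout recursion mentioned in the description, here is a minimal sketch of plain Attention Rollout in the spirit of Abnar & Zuidema (2020). It is not the demo's code and not the gradient-weighted variant of Chefer et al. (2021): the function name `attention_rollout`, the `upto_layer` parameter, and the random toy attention maps are all assumptions made for illustration.

```python
import numpy as np

def attention_rollout(attentions, upto_layer=None):
    """Plain attention rollout (illustrative sketch, hypothetical helper).

    attentions: list with one array per layer, each of shape
                (heads, tokens, tokens), rows summing to 1.
    upto_layer: number of layers to include in the recursion
                (assumed parameter; defaults to all layers).
    """
    if upto_layer is None:
        upto_layer = len(attentions)
    n_tokens = attentions[0].shape[-1]
    identity = np.eye(n_tokens)
    rollout = identity.copy()
    for layer_attention in attentions[:upto_layer]:
        # Average the attention maps over heads.
        a = layer_attention.mean(axis=0)
        # Mix in the identity to account for the residual connection,
        # then renormalise so each row sums to 1 again.
        a = 0.5 * a + 0.5 * identity
        a = a / a.sum(axis=-1, keepdims=True)
        # Recursion: compose this layer with the rollout of the layers below.
        rollout = a @ rollout
    return rollout

# Toy input: 3 layers, 2 heads, 4 tokens of random row-normalised attention.
rng = np.random.default_rng(0)
raw = rng.random((3, 2, 4, 4))
toy_attentions = [layer / layer.sum(axis=-1, keepdims=True) for layer in raw]

# Roll out up to the second layer only.
relevance = attention_rollout(toy_attentions, upto_layer=2)
```

Each row of `relevance` can be read as how much every input token contributes to the corresponding position after the selected layer. The gradient-weighted variant additionally weights each head's attention map by its gradient before averaging, which this sketch omits.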