Spaces: Runtime error

Dusan Svilarkovic committed
Commit fc5ecba
Parent(s): 7872a22

Adding Fudge
- naacl-2021-fudge-controlled-generation/LICENSE +21 -0
- naacl-2021-fudge-controlled-generation/README.md +155 -0
- naacl-2021-fudge-controlled-generation/clickbait_classifier.py +128 -0
- naacl-2021-fudge-controlled-generation/constants.py +32 -0
- naacl-2021-fudge-controlled-generation/data.py +415 -0
- naacl-2021-fudge-controlled-generation/eval_formality_metrics.py +73 -0
- naacl-2021-fudge-controlled-generation/eval_poetry_metrics.py +135 -0
- naacl-2021-fudge-controlled-generation/eval_topic_metrics.py +134 -0
- naacl-2021-fudge-controlled-generation/evaluate_clickbait.py +200 -0
- naacl-2021-fudge-controlled-generation/evaluate_formality.py +104 -0
- naacl-2021-fudge-controlled-generation/evaluate_poetry.py +115 -0
- naacl-2021-fudge-controlled-generation/evaluate_topic.py +143 -0
- naacl-2021-fudge-controlled-generation/formality_data/README.md +2 -0
- naacl-2021-fudge-controlled-generation/formality_data/fisher_test_oracle.es +0 -0
- naacl-2021-fudge-controlled-generation/formality_data/test.noid.cleaned_0 +0 -0
- naacl-2021-fudge-controlled-generation/formality_data/test.noid.cleaned_1 +0 -0
- naacl-2021-fudge-controlled-generation/main.py +192 -0
- naacl-2021-fudge-controlled-generation/model.py +182 -0
- naacl-2021-fudge-controlled-generation/poetry_data/README.md +1 -0
- naacl-2021-fudge-controlled-generation/poetry_data/couplet_ends.txt +154 -0
- naacl-2021-fudge-controlled-generation/poetry_data/couplet_prefixes.txt +154 -0
- naacl-2021-fudge-controlled-generation/poetry_util.py +83 -0
- naacl-2021-fudge-controlled-generation/predict_clickbait.py +199 -0
- naacl-2021-fudge-controlled-generation/predict_formality.py +404 -0
- naacl-2021-fudge-controlled-generation/predict_poetry.py +219 -0
- naacl-2021-fudge-controlled-generation/predict_topic.py +126 -0
- naacl-2021-fudge-controlled-generation/requirements.txt +7 -0
- naacl-2021-fudge-controlled-generation/topic_data/README.md +3 -0
- naacl-2021-fudge-controlled-generation/topic_data/test_wordlists/computers.txt +163 -0
- naacl-2021-fudge-controlled-generation/topic_data/test_wordlists/legal.txt +108 -0
- naacl-2021-fudge-controlled-generation/topic_data/test_wordlists/military.txt +136 -0
- naacl-2021-fudge-controlled-generation/topic_data/test_wordlists/politics.txt +40 -0
- naacl-2021-fudge-controlled-generation/topic_data/test_wordlists/religion.txt +207 -0
- naacl-2021-fudge-controlled-generation/topic_data/test_wordlists/science.txt +47 -0
- naacl-2021-fudge-controlled-generation/topic_data/test_wordlists/space.txt +16 -0
- naacl-2021-fudge-controlled-generation/topic_data/topic_prefixes.txt +20 -0
- naacl-2021-fudge-controlled-generation/topic_data/val_wordlists/fantasy.txt +26 -0
- naacl-2021-fudge-controlled-generation/topic_data/wordlists/computers.txt +176 -0
- naacl-2021-fudge-controlled-generation/topic_data/wordlists/legal.txt +131 -0
- naacl-2021-fudge-controlled-generation/topic_data/wordlists/military.txt +149 -0
- naacl-2021-fudge-controlled-generation/topic_data/wordlists/politics.txt +47 -0
- naacl-2021-fudge-controlled-generation/topic_data/wordlists/religion.txt +232 -0
- naacl-2021-fudge-controlled-generation/topic_data/wordlists/science.txt +48 -0
- naacl-2021-fudge-controlled-generation/topic_data/wordlists/space.txt +18 -0
- naacl-2021-fudge-controlled-generation/transcript.txt +415 -0
- naacl-2021-fudge-controlled-generation/util.py +110 -0
naacl-2021-fudge-controlled-generation/LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2022 Kevin Yang

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
naacl-2021-fudge-controlled-generation/README.md
ADDED
@@ -0,0 +1,155 @@
# FUDGE: Controlled Text Generation With Future Discriminators

This repo contains code corresponding to the paper FUDGE: Controlled Text Generation With Future Discriminators (https://arxiv.org/abs/2104.05218) by Kevin Yang and Dan Klein, published at NAACL 2021.

You can also find a video presentation at http://somup.com/crhlVPFKN7 and the corresponding slides in `slides.pptx`.

## Setup/Installation

We tested on Python 3.8.5, but earlier versions of Python 3 are almost certainly fine. To install the required packages (other package versions are likely to work too):

```
pip install -r requirements.txt
```

Additionally, to get our pre-trained predictor checkpoints and training data, run:

```
wget https://naacl2021-fudge-files.s3.amazonaws.com/large_files.zip
```

and extract the zip to the top-level `lm-prediction/` folder. (There should be three folders, `ckpt/`, `train_data/`, and `topic_human_evals/`. The zip is 7GB.) Note: the zip does not work for some people; if that happens, you can get the files directly from https://drive.google.com/drive/folders/1GZfOGqpQxDmIfD2RvuhUQla9eX2OHUXU?usp=sharing (13GB).

`ckpt/` contains predictor checkpoints for each task, if you are just interested in running inference. (Note that for the paper results, we used predictors trained with an older version of the code; the new checkpoints get similar results, so you are fine using the new predictors provided here if, e.g., you just want to use FUDGE as a baseline. You can just run the evaluation commands provided below; each should take roughly 5-60 minutes depending on the task and your compute, assuming you have a GPU.)

`train_data/` contains our GPT2-generated training data for the poetry and topic tasks' predictors. See https://github.com/raosudha89/GYAFC-corpus for instructions on gaining access to the GYAFC data used for the machine translation formality task; replace our dummy folders with the corresponding folders/files if you want to train our formality predictor.

## Clickbait

To generate outputs, run:

```
python -u evaluate_clickbait.py --ckpt ckpt/topic/future_word_predictor/model.pth.tar --dataset_info ckpt/topic/future_word_predictor/dataset_info --in_file topic_data/topic_prefixes.txt --condition_lambda 4.0 --verbose --precondition_topk 200 --length_cutoff 80 --device cpu

python -u evaluate_clickbait.py --ckpt ckpt/formality/predictor_gyafc_entertainment_music/model.pth.tar --dataset_info ckpt/formality/predictor_gyafc_entertainment_music/dataset_info --in_file formality_data/fisher_test_oracle.es

python -u evaluate_clickbait.py --ckpt ckpt/topic/future_word_predictor/model.pth.tar --dataset_info ckpt/topic/future_word_predictor/dataset_info --in_file topic_data/topic_prefixes.txt --condition_lambda 4.0 --verbose --precondition_topk 200 --sample_size 3 --max_sample_batch 1 --length_cutoff 80 --log_file clickbait_preds.log
```

Then evaluate metrics using:

```
python eval_topic_metrics.py --log_file topic_preds.log --tw_dir topic_data/test_wordlists
```

## Poetry Couplet Completion

### Evaluation

To generate outputs, run:

```
python -u evaluate_poetry.py --iambic_ckpt ckpt/poetry/iambic_predictor/model.pth.tar --rhyme_ckpt ckpt/poetry/rhyme_predictor/model.pth.tar --newline_ckpt ckpt/poetry/newline_predictor/model.pth.tar --dataset_info ckpt/poetry/rhyme_predictor/dataset_info --rhyme_info ckpt/poetry/rhyme_predictor/rhyme_info --prefix_file poetry_data/couplet_prefixes.txt --precondition_topk 200 > poetry_preds.log
```

Then evaluate metrics using:

```
python eval_poetry_metrics.py --pred_file poetry_preds.log --prefix_file poetry_data/couplet_prefixes.txt
```

### Training your own predictors

Example commands for all three predictors used in the poetry task are below. (You probably don't need this many epochs for iambic and rhyme; in any case, the commands save intermediate checkpoints, so you can stop them early if needed by inspecting the log.)

Iambic predictor:

```
python -u main.py --task iambic --data_dir train_data/gpt2_generations --save_dir ckpt/poetry/iambic_retrain_predictor --num_workers 20 --batch_size 128 --epoch_max_len 100000 --validation_freq 10 --lr 2e-4 --epochs 1500 > iambic_retrain_predictor.log
```

Rhyme predictor:

```
python -u main.py --task rhyme --data_dir train_data/gpt2_generations --save_dir ckpt/poetry/rhyme_retrain_predictor --num_workers 20 --batch_size 128 --epoch_max_len 100000 --validation_freq 10 --lr 2e-4 --epochs 1500 > rhyme_retrain_predictor.log
```

End of sentence predictor (referred to as "newline" in the code; 50 epochs is more than enough for this one):

```
python -u main.py --task newline --data_dir train_data/gpt2_generations --save_dir ckpt/poetry/newline_retrain_predictor --num_workers 20 --batch_size 128 --epoch_max_len 100000 --validation_freq 10 --lr 2e-4 --epochs 50 > newline_retrain_predictor.log
```

The same evaluation commands as before will work; just modify the paths in the command to point to `model_best.pth.tar`, `dataset_info`, and `rhyme_info` from your newly trained ckpt folders.

## Topic Control

### Evaluation

To generate outputs, run:

```
python -u evaluate_topic.py --ckpt ckpt/topic/future_word_predictor/model.pth.tar --dataset_info ckpt/topic/future_word_predictor/dataset_info --prefix_file topic_data/topic_prefixes.txt --wordlist_dir topic_data/wordlists --condition_lambda 4.0 --verbose --precondition_topk 200 --topk 10 --sample_size 3 --max_sample_batch 1 --length_cutoff 80 --log_file topic_preds.log
```

Then evaluate metrics using:

```
python eval_topic_metrics.py --log_file topic_preds.log --tw_dir topic_data/test_wordlists
```

You can also find our original generations and baselines in `topic_human_evals/`.

### Training your own predictors

Example command below.

```
python -u main.py --task topic --data_dir train_data/gpt2_generations --save_dir ckpt/topic/future_word_retrain_predictor --num_workers 20 --batch_size 128 --epoch_max_len 100000 --validation_freq 10 --lr 2e-4 --epochs 500 --glove_file train_data/glove.840B.300d.txt > future_word_retrain_predictor.log
```

The same evaluation commands as before will work; just modify the paths in the command to point to `model_best.pth.tar`, `dataset_info`, and `rhyme_info` from your newly trained ckpt folders.

## Machine Translation Formality

### Evaluation

To generate outputs, run:

```
python -u evaluate_formality.py --ckpt ckpt/formality/predictor_gyafc_entertainment_music/model.pth.tar --dataset_info ckpt/formality/predictor_gyafc_entertainment_music/dataset_info --in_file formality_data/fisher_test_oracle.es --model_path ckpt/formality/marian_finetune_fisher > formality_preds.log
```

The above command generates predictions using the Marian model finetuned on the Fisher dataset; remove the `--model_path` argument to get predictions with the un-finetuned Marian model from HuggingFace (referred to as 0-shot in the paper).

Then evaluate metrics using:

```
python eval_formality_metrics.py --pred formality_preds.log --ref formality_data/test.noid.cleaned_0 formality_data/test.noid.cleaned_1 --ckpt ckpt/formality/test_evaluator_gyafc_family_relationships/model.pth.tar --dataset_info ckpt/formality/test_evaluator_gyafc_family_relationships/dataset_info
```

### Training your own predictors

Example command below. (Reminder: you need to get the GYAFC dataset by following the instructions at https://github.com/raosudha89/GYAFC-corpus.)

```
python -u main.py --task formality --data_dir train_data/GYAFC_Corpus/Entertainment_Music --save_dir ckpt/formality/formality_retrain_predictor --num_workers 20 --batch_size 32 --epoch_max_len 1000000 --validation_freq 1 --lr 2e-5 --epochs 20 > formality_retrain_predictor.log
```

(The test-time formality evaluator is trained in the same way, just using the Family/Relationships half of the GYAFC dataset.)

The same evaluation commands as before will work; just modify the paths in the command to point to `model_best.pth.tar`, `dataset_info`, and `rhyme_info` from your newly trained ckpt folders.

## Running FUDGE on your own data

The code has been refactored so that the iambic (poetry), rhyme (poetry), newline (poetry), future word (topic), and formality (machine translation) tasks are controlled by the `--task` flag to `main.py`. You should add your task as another option here, then modify the data processing in `data.py` and the model in `model.py` as needed for your task. (In `data.py` you probably won't need all the entries of the tuple that the loader is expected to return; you can just put dummy entries in the ones you don't need.) You might also need to modify the loss computation in the `train` and `validate` functions in `main.py`. You'll probably want to write new evaluation scripts, though the existing poetry/topic/formality ones are hopefully helpful as references.

Alternatively, the general FUDGE framework is pretty simple, so you could always try reimplementing things yourself. A few additional details based on questions I've received:

(1) The formality task setup is likely closest to what you want if you're just trying to run the simplest form of FUDGE (take a language model, and use a classifier to optimize toward a single attribute), although you may need to swap out the Marian translation model/tokenizer we use.

(2) When you construct your training data, if you have an example in your data, e.g. "This movie is great!" for positive sentiment, you want to learn on all the pairs (This, +), (This movie, +), (This movie is, +), etc., as that's one of the main points of our approach. A sketch of this expansion follows below.
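To make point (2) concrete, here is a minimal sketch of that prefix expansion, assuming a GPT-2 tokenizer and a simple `(text, label)` input format for illustration; the repo's actual data pipeline lives in `data.py`.

```python
# Minimal sketch of the prefix expansion described in (2): each labeled
# sentence becomes one (prefix, label) training pair per prefix length,
# so the predictor learns to judge incomplete prefixes too.
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')

def expand_prefixes(text, label):
    token_ids = tokenizer.encode(text)
    # one (prefix, label) pair per prefix length
    return [(token_ids[:i], label) for i in range(1, len(token_ids) + 1)]

for prefix_ids, label in expand_prefixes("This movie is great!", 1):  # 1 = positive
    print(repr(tokenizer.decode(prefix_ids)), label)
# prints 'This' 1, 'This movie' 1, ..., up to the full sentence
```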

(3) For computational efficiency, we first filter the base model's next-token probabilities down to the top 200 (Sec. 3.1 in the paper) before adding the classifier logits. This way you only need to evaluate your classifier on 200 continuations. Afterward, you filter down again to whatever top-k/greedy/nucleus sampling you're using for evaluation (we use top-k with k=10 for poetry and topic, greedy for formality).
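Below is a minimal sketch of this two-stage filtering for a single decoding step. The `classifier_log_prob_fn` is a hypothetical stand-in that returns the attribute log-probability for each candidate continuation; the actual implementations are in the `predict_*.py` and `evaluate_*.py` scripts.

```python
# Sketch of one FUDGE decoding step (simplified; not the repo's exact code).
import torch
import torch.nn.functional as F

def fudge_step(lm_logits, classifier_log_prob_fn, prefix_ids,
               condition_lambda=1.0, precondition_topk=200, topk=10):
    # lm_logits: (vocab,) next-token logits; prefix_ids: 1-D LongTensor of tokens so far
    # 1) keep only the base LM's top `precondition_topk` next-token candidates
    top_logits, top_indices = lm_logits.topk(precondition_topk)        # (200,)
    # 2) score each of the 200 candidate continuations with the attribute
    #    classifier and add its (lambda-weighted) log-probs to the LM logits
    candidates = torch.cat([prefix_ids.repeat(precondition_topk, 1),
                            top_indices.unsqueeze(1)], dim=1)          # (200, len+1)
    combined = top_logits + condition_lambda * classifier_log_prob_fn(candidates)
    # 3) filter again to the final top-k and sample (use argmax for greedy)
    final_logits, final_pos = combined.topk(topk)
    choice = torch.multinomial(F.softmax(final_logits, dim=0), 1)
    return top_indices[final_pos[choice]].item()
```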

(4) You can use a pretrained LM backbone instead of a simple LSTM backbone for the predictor as well. This should work better when your dataset is smaller.
naacl-2021-fudge-controlled-generation/clickbait_classifier.py
ADDED
@@ -0,0 +1,128 @@
import torch
from transformers import BertModel, BertConfig, PretrainedConfig, PreTrainedModel, AutoModel, AutoConfig
from typing import List, Optional, Tuple, Union
from transformers.modeling_outputs import TokenClassifierOutput, SequenceClassifierOutput
from torch.nn import BCEWithLogitsLoss, CrossEntropyLoss, MSELoss, BCELoss
import torch.nn as nn
# from modeling_mpnet import MPNetModel, MPnetConfig

class ClickbaitConfig(PretrainedConfig):
    def __init__(
        self,
        model_type: str = "bert",
        pretrained_model: str = "bert-base-uncased",
        num_labels: int = 1,
        dropout: float = 0.1,
        inner_dim1: int = 256,
        inner_dim2: int = 32,
        max_length: int = 512,
        load_pretrained: bool = True,
        freeze_bert: bool = True,
        **kwargs
    ):
        super(ClickbaitConfig, self).__init__(num_labels=num_labels, **kwargs)
        self.model_type = model_type
        self.pretrained_model = pretrained_model
        self.dropout = dropout
        self.inner_dim1 = inner_dim1
        self.inner_dim2 = inner_dim2
        self.max_length = max_length
        self.load_pretrained = load_pretrained
        self.freeze_bert = freeze_bert


class BertClickbaitClassifier(PreTrainedModel):
    """
    Taken and extended from BertForSequenceClassification: https://github.com/huggingface/transformers/blob/v4.19.2/src/transformers/models/bert/modeling_bert.py#L1508
    """
    config_class = ClickbaitConfig

    def __init__(self, config: ClickbaitConfig):
        super(BertClickbaitClassifier, self).__init__(config)
        self.num_labels = config.num_labels
        self.config = config
        # self.bert_config = BertConfig.from_pretrained(config.pretrained_model)
        self.bert_config = AutoConfig.from_pretrained(config.pretrained_model)

        # self.bert = BertModel(self.bert_config)
        self.bert = AutoModel.from_pretrained(config.pretrained_model, config=self.bert_config)
        # self.bert = SentenceTransformer(config.pretrained_model, config=self.bert_config)
        # self.bert = MPNetModel(config.pretrained_model, config=self.bert_config)
        if config.load_pretrained:
            print("Load pretrained weights from {}".format(config.pretrained_model))
            self.bert = self.bert.from_pretrained(config.pretrained_model)
        if config.freeze_bert:
            print("Freeze weights in the BERT model. Just the classifier will be trained")
            for param in self.bert.parameters():
                param.requires_grad = False

        self.linear_1 = nn.Linear(self.bert.config.hidden_size, config.inner_dim1)
        self.dropout_1 = nn.Dropout(config.dropout)
        self.relu_1 = nn.ReLU()
        self.dropout_2 = nn.Dropout(config.dropout)
        self.linear_2 = nn.Linear(config.inner_dim1, config.inner_dim2)
        self.relu_2 = nn.ReLU()
        self.dropout_3 = nn.Dropout(config.dropout)
        self.classifier = nn.Linear(config.inner_dim2, config.num_labels)
        self.sigmoid = nn.Sigmoid()


    def forward(
        self,
        input_ids: Optional[torch.Tensor] = None,
        attention_mask: Optional[torch.Tensor] = None,
        token_type_ids: Optional[torch.Tensor] = None,
        position_ids: Optional[torch.Tensor] = None,
        head_mask: Optional[torch.Tensor] = None,
        inputs_embeds: Optional[torch.Tensor] = None,
        labels: Optional[torch.Tensor] = None,
        output_attentions: Optional[bool] = None,
        output_hidden_states: Optional[bool] = None,
        return_dict: Optional[bool] = None,
    ) -> Union[Tuple[torch.Tensor], SequenceClassifierOutput]:
        r"""
        labels (`torch.LongTensor` of shape `(batch_size,)`, *optional*):
            Labels for computing the sequence classification/regression loss. Indices should be in `[0, ...,
            config.num_labels - 1]`. If `config.num_labels == 1` a regression loss is computed (Mean-Square loss), If
            `config.num_labels > 1` a classification loss is computed (Cross-Entropy).
        """

        return_dict = return_dict if return_dict is not None else self.config.use_return_dict

        outputs = self.bert(
            input_ids,
            attention_mask=attention_mask,
            token_type_ids=token_type_ids,
            position_ids=position_ids,
            head_mask=head_mask,
            inputs_embeds=inputs_embeds,
            output_attentions=output_attentions,
            output_hidden_states=output_hidden_states,
            return_dict=return_dict,
        )

        # [CLS] token representation
        output = outputs[0][:, 0, :]

        x = self.dropout_1(output)
        x = self.linear_1(x)
        x = self.relu_1(x)
        x = self.dropout_2(x)
        x = self.linear_2(x)
        x = self.relu_2(x)
        x = self.dropout_3(x)

        logits = self.classifier(x)
        logits = self.sigmoid(logits)

        loss = None
        if labels is not None:
            loss_fct = BCELoss()  # the original line passed weight=WEIGHT, but WEIGHT is undefined in this file
            labels = 1.0 * labels
            loss = loss_fct(logits.view(-1), labels.view(-1))
        if not return_dict:
            output = (logits,) + outputs[2:]
            return ((loss,) + output) if loss is not None else output

        return SequenceClassifierOutput(
            loss=loss,
            logits=logits
        )
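For orientation, a minimal usage sketch of the classifier above (hypothetical, not part of the committed files; it assumes `clickbait_classifier.py` is importable and downloads `bert-base-uncased`):

```python
# Hypothetical usage sketch for BertClickbaitClassifier; not in the commit.
import torch
from transformers import AutoTokenizer
from clickbait_classifier import ClickbaitConfig, BertClickbaitClassifier

config = ClickbaitConfig(pretrained_model='bert-base-uncased', num_labels=1)
model = BertClickbaitClassifier(config).eval()
tokenizer = AutoTokenizer.from_pretrained(config.pretrained_model)

inputs = tokenizer("You won't believe what happened next!", return_tensors='pt')
with torch.no_grad():
    out = model(**inputs)
print(out.logits)  # already sigmoided: probability of clickbait, shape (1, 1)
```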
naacl-2021-fudge-controlled-generation/constants.py
ADDED
@@ -0,0 +1,32 @@
PAD_TOKEN = '[PAD]'
EOT_TOKEN = '<|endoftext|>'
SEP = 50256 # just use the weird eot token

TOPIC_MODEL_STRING = 'gpt2-medium'
FORMALITY_MODEL_STRING = 'Helsinki-NLP/opus-mt-es-en'

DIR_END_SPLIT_POSITIONS = 32

TOPIC_VAL_SIZE = 100000
FORMALITY_VAL_SIZE = 2000
VOCAB_SIZE = 50000

FORMALITY_MAX_LEN = 200

GLOVE_PRINT_PROGRESS_FREQ = 1000000
GLOVE_DIM = 300
HIDDEN_DIM = 300
RNN_DIM = 150

MIN_SENTENCE_LENGTH = 3

POETRY_LINE_SYLLABLES = 10
MAX_SYLLABLES_PER_WORD = 10 # no way anything is more
MAX_COUNT_SYLLABLE_DIST = 10
MAX_COUNT_SYLLABLE_INPUT_LENGTH = 25 # for just a couplet, shouldn't need more
COUNT_SYLLABLE_DIM = 100
UNKNOWN_RHYME_GROUP = 'UNKNOWN_RHYME_GROUP'
PHRASE_ENDS = '.?!'

POETRY_BANNED_TOKENS = [198, 50256, 628, 220] # newlines and eos and such
naacl-2021-fudge-controlled-generation/data.py
ADDED
@@ -0,0 +1,415 @@
import random
import math
import os
import pickle
from collections import defaultdict, namedtuple
import string

os.environ['TOKENIZERS_PARALLELISM'] = 'false' # turn off since we're using multiple threads for loading anyway

from transformers import AutoTokenizer, AutoModelWithLMHead, pipeline, set_seed, GPT2Tokenizer, GPT2Model
import numpy as np
from tqdm import tqdm
import torch

from util import suppress_stdout
from poetry_util import is_iambic, count_syllables, get_rhymes, get_rhyme_group
from constants import *

DatasetInfo = namedtuple('DatasetInfo',
        ['index2word', 'word2index', 'total_words', 'vocab', 'glove_embeddings'])
RhymeInfo = namedtuple('RhymeInfo',
        ['word2rhyme_group', 'rhyme_group_counts', 'rhyme_groups', 'index2rhyme_group', 'rhyme_group2index', 'total_rhyme_groups'])

def collate(batch):
    pad_id = batch[0][4]
    inputs = [b[0] for b in batch]
    lengths = torch.LongTensor([b[1] for b in batch])
    max_length = lengths.max()
    for i in range(len(inputs)):
        if len(inputs[i]) < max_length:
            inputs[i] = torch.cat([inputs[i], torch.zeros(max_length - len(inputs[i])).long()], dim=0) # actually 0 is fine as pad since it's masked out
    inputs = torch.stack(inputs, dim=0)
    future_words = torch.LongTensor([b[2] for b in batch]).unsqueeze(0).expand(len(batch), -1).clone() # batch x N=batch
    labels = torch.zeros_like(future_words).long()
    labels = labels.scatter(1, torch.arange(len(batch)).unsqueeze(1), torch.ones(len(batch)).long().unsqueeze(1)).clone()
    log_probs = torch.Tensor([b[3] for b in batch])
    classification_labels = [b[5] for b in batch] # batch
    if type(classification_labels[0]) == list:
        for i in range(len(classification_labels)):
            assert len(classification_labels[i]) == lengths[i]
            if len(classification_labels[i]) < max_length:
                classification_labels[i] = torch.cat([torch.LongTensor(classification_labels[i]), -1 + torch.zeros(max_length - len(classification_labels[i])).long()], dim=0)
            else:
                classification_labels[i] = torch.LongTensor(classification_labels[i])
        classification_labels = torch.stack(classification_labels, dim=0) # batch x seq
    else:
        assert type(classification_labels[0]) == int
        classification_labels = torch.LongTensor(classification_labels) # they're just int labels
    syllables_to_go = torch.LongTensor([b[6] for b in batch])
    future_word_num_syllables = torch.LongTensor([b[7] for b in batch])
    rhyme_group_index = torch.LongTensor([b[8] for b in batch])
    return (inputs, lengths, future_words, log_probs, labels, classification_labels, syllables_to_go, future_word_num_syllables, rhyme_group_index)


def load_rhyme_info(index2word, vocab):
    word2rhyme_group = defaultdict(lambda: UNKNOWN_RHYME_GROUP)
    rhyme_group_counts = defaultdict(lambda: 0)
    rhyme_groups = set()
    for word in index2word:
        try:
            rhyme_group = get_rhyme_group(word)
            word2rhyme_group[word] = rhyme_group
            rhyme_group_counts[rhyme_group] += (vocab[word] if word in vocab else 1) # for rare words not in vocab, just use 1
            rhyme_groups.add(rhyme_group)
        except:
            rhyme_group_counts[UNKNOWN_RHYME_GROUP] += (vocab[word] if word in vocab else 1)
    index2rhyme_group = [UNKNOWN_RHYME_GROUP] + sorted(list(rhyme_groups))
    rhyme_group2index = {s: i for i, s in enumerate(index2rhyme_group)}
    total_rhyme_groups = sum(rhyme_group_counts.values())

    return RhymeInfo(word2rhyme_group=dict(word2rhyme_group),
                     rhyme_group_counts=dict(rhyme_group_counts),
                     rhyme_groups=rhyme_groups,
                     index2rhyme_group=index2rhyme_group,
                     rhyme_group2index=rhyme_group2index,
                     total_rhyme_groups=total_rhyme_groups)


class Dataset:
    def __init__(self, args):
        print('loading data')
        random.seed(args.seed)
        self.batch_size = args.batch_size
        self.data_dir = args.data_dir
        self.topic = args.task == 'topic'
        self.formality = args.task == 'formality'
        self.iambic = args.task == 'iambic'
        self.rhyme = args.task == 'rhyme'
        self.newline = args.task == 'newline'

        self.tokenizer = AutoTokenizer.from_pretrained(FORMALITY_MODEL_STRING if self.formality else TOPIC_MODEL_STRING)
        self.tokenizer.add_special_tokens({'pad_token': PAD_TOKEN})
        self.gpt_pad_id = self.tokenizer.encode(PAD_TOKEN)[0] # actually just the vocab size
        sentences = []
        self.vocab = defaultdict(lambda: 0)
        if self.formality:
            self.vocab['placeholder'] = 1 # anything so we don't crash
            train, val, test = [], [], []
            for category, label in [('formal', 1), ('informal', 0)]:
                with open(os.path.join(args.data_dir, 'train', category), 'r') as rf:
                    for i, line in enumerate(rf):
                        if len(line) > FORMALITY_MAX_LEN:
                            line = ' '.join(line.strip()[:FORMALITY_MAX_LEN].split()[:-1]) # cut off words until below max len; chosen so only ~20 examples affected in dataset
                        if i < FORMALITY_VAL_SIZE // 2:
                            val.append((line.strip(), label))
                        else:
                            train.append((line.strip(), label))
                with open(os.path.join(args.data_dir, 'test', category), 'r') as rf:
                    for line in rf:
                        if len(line) > FORMALITY_MAX_LEN:
                            line = ' '.join(line.strip()[:FORMALITY_MAX_LEN].split()[:-1]) # cut off words until below max len
                        test.append((line.strip(), label))
            self.splits = {}
            self.splits['train'], self.splits['val'], self.splits['test'] = train, val, test
        else: # topic / poetry
            for root, _, filenames in os.walk(args.data_dir):
                for fname in filenames:
                    with open(os.path.join(root, fname), 'r') as rf:
                        for line in rf:
                            sentences.append(line.strip())
                            for word in line.strip().split(' '):
                                self.vocab[word] += 1
            random.shuffle(sentences)
            self.splits = {}
            if args.debug:
                self.splits['val'] = sentences
                self.splits['test'] = sentences
                self.splits['train'] = sentences
            else:
                self.splits['val'] = sentences[:TOPIC_VAL_SIZE]
                self.splits['test'] = sentences[TOPIC_VAL_SIZE:2*TOPIC_VAL_SIZE]
                self.splits['train'] = sentences[2*TOPIC_VAL_SIZE:]

        if args.dataset_info is not None:
            print('loading dataset info from file')
            with open(args.dataset_info, 'rb') as rf:
                dataset_info = pickle.load(rf)
            self.vocab, self.total_words, self.index2word, self.word2index, self.glove_embeddings = \
                dataset_info.vocab, dataset_info.total_words, dataset_info.index2word, dataset_info.word2index, dataset_info.glove_embeddings
            self.dataset_info = dataset_info
        else:
            print('generating dataset info from scratch')
            words_values = list(self.vocab.items())
            words_values = sorted(words_values, key=lambda x: x[1], reverse=True)
            if args.glove_file is None:
                print('no glove embeddings given')
                for word, _ in words_values[VOCAB_SIZE:]: # only use somewhat common tokens
                    del self.vocab[word]
                glove_embeddings = None
            else:
                print('loading glove embeddings')
                glove_embeddings = {}
                with open(args.glove_file, 'r') as rf:
                    for i, line in enumerate(rf):
                        if i % GLOVE_PRINT_PROGRESS_FREQ == 0:
                            print(i)
                        line = line.strip().split()
                        if len(line) != GLOVE_DIM + 1:
                            continue # skip multi-word embeddings which are rare anyway
                        glove_embeddings[line[0]] = [float(x) for x in line[1:]]
                for word, _ in words_values:
                    if word not in glove_embeddings:
                        del self.vocab[word]
            self.total_words = sum(self.vocab.values())
            self.index2word = [PAD_TOKEN] + sorted(list(self.vocab.keys()))
            self.word2index = {s: i for i, s in enumerate(self.index2word)}
            self.vocab = dict(self.vocab) # so we can pickle later
            if glove_embeddings is None:
                self.glove_embeddings = None
            else:
                self.glove_embeddings = torch.stack([torch.zeros(GLOVE_DIM)] + [torch.Tensor(glove_embeddings[word]) for word in self.index2word[1:]], dim=0)

            self.dataset_info = DatasetInfo(index2word=self.index2word,
                                            word2index=self.word2index,
                                            total_words=self.total_words,
                                            vocab=self.vocab,
                                            glove_embeddings=self.glove_embeddings)

        if self.rhyme:
            if args.rhyme_info is not None:
                print('loading rhyme info from file')
                with open(args.rhyme_info, 'rb') as rf:
                    self.rhyme_info = pickle.load(rf)
            else:
                self.rhyme_info = load_rhyme_info(self.index2word, self.vocab)
            self.word2rhyme_group, self.rhyme_group_counts, self.rhyme_groups, self.index2rhyme_group, self.rhyme_group2index, self.total_rhyme_groups = \
                defaultdict(lambda: UNKNOWN_RHYME_GROUP, self.rhyme_info.word2rhyme_group), self.rhyme_info.rhyme_group_counts, self.rhyme_info.rhyme_groups, self.rhyme_info.index2rhyme_group, self.rhyme_info.rhyme_group2index, self.rhyme_info.total_rhyme_groups

        print('done loading data')
        print('split sizes:')
        for key in ['train', 'val', 'test']:
            print(key, len(self.splits[key]))
        if not self.formality:
            print('total words', self.total_words)
            print('vocab size', len(self.index2word))


    def shuffle(self, split, seed=None):
        assert split in ['train', 'val', 'test']
        if seed is not None:
            random.seed(seed)
        random.shuffle(self.splits[split])


    def loader(self, split, num_workers=20, indices=None):
        assert split in ['train', 'val', 'test']
        data = self.splits[split] if indices is None else [self.splits[split][i] for i in indices]
        return torch.utils.data.DataLoader(SplitLoader(data, self), batch_size=self.batch_size, pin_memory=True, collate_fn=collate, num_workers=num_workers)


class SplitLoader(torch.utils.data.IterableDataset):
    def __init__(self, data, parent):
        super(SplitLoader).__init__()
        self.data = data
        self.pos = 0
        self.parent = parent


    def __len__(self):
        return len(self.data)


    def __iter__(self):
        return self


    def __next__(self):
        increment = 1
        worker_info = torch.utils.data.get_worker_info()
        if worker_info is not None: # in a worker process
            increment = worker_info.num_workers
            worker_id = worker_info.id
            if self.pos == 0:
                self.pos = worker_id
        valid = False
        while not valid:
            if self.pos >= len(self):
                raise StopIteration
            if self.parent.topic:
                failed = False
                future_word_num_syllables, rhyme_group_index, syllables_to_go = -1, -1, -1
                raw_sentence, classification_label = self.data[self.pos], -1
                original_sentence = raw_sentence.split()
                sentence = self.parent.tokenizer.encode(raw_sentence, return_tensors='pt')[0]
                length = len(sentence)
                min_sentence_length = MIN_SENTENCE_LENGTH
                if len(sentence) > min_sentence_length: # set to 3. well, everything in data is > 3 for the bag of words task
                    pos_to_split = random.randint(1, length - 1) # for lm, learn all positions at once
                    inp = sentence[:pos_to_split]
                    length = len(inp)
                    num_words_in_input = len(self.parent.tokenizer.decode(inp).split())
                    if not failed and num_words_in_input < len(original_sentence):
                        future_word_position_max = len(original_sentence) - 1
                        future_word_position = random.randint(num_words_in_input-1, future_word_position_max) # allow the last possibly partial word though
                        future_word = original_sentence[future_word_position]
                        unstripped_future_word = future_word
                        future_word = future_word.strip().strip(string.punctuation) # NOTE: we didn't strip punctuation for the topic bag of words paper experiments for our method. it doesn't make much difference, though.
                        if not failed and future_word in self.parent.word2index.keys():
                            word_log_prob = math.log(self.parent.vocab[future_word] / self.parent.total_words) # roughly baseline prob of word under noise model
                            future_word = self.parent.word2index[future_word]
                            pad_id = self.parent.gpt_pad_id
                            example = (inp, length, future_word, word_log_prob, pad_id, classification_label, syllables_to_go, future_word_num_syllables, rhyme_group_index)
                            valid = not failed
            elif self.parent.formality:
                future_word_num_syllables, rhyme_group_index, syllables_to_go = -1, -1, -1
                raw_sentence, classification_label = self.data[self.pos]
                original_sentence = raw_sentence.split()
                sentence = self.parent.tokenizer.encode(raw_sentence, return_tensors='pt')[0]
                length = len(sentence)
                min_sentence_length = MIN_SENTENCE_LENGTH
                if len(sentence) > min_sentence_length: # set to 3. well, everything in data is > 3 for the bag of words task
                    pos_to_split = length # no need to split; we're going to train on all possible prefixes simultaneously for efficiency
                    inp = sentence[:pos_to_split]
                    length = len(inp)
                    num_words_in_input = len(self.parent.tokenizer.decode(inp).split())
                    # only look up to 10 words ahead if we're doing count syllables, since we'll filter out anything more than 10 syllables ahead anyway
                    future_word_position_max = len(original_sentence) - 1
                    future_word_position = 0
                    future_word = 'placeholder'
                    unstripped_future_word = future_word
                    future_word = future_word.strip().strip(string.punctuation) # NOTE: we didn't strip punctuation for the topic bag of words paper experiments for our method. it doesn't make much difference, though.
                    word_log_prob, future_word = 0, 0
                    pad_id = self.parent.gpt_pad_id
                    example = (inp, length, future_word, word_log_prob, pad_id, classification_label, syllables_to_go, future_word_num_syllables, rhyme_group_index)
                    valid = True
            elif self.parent.iambic:
                failed = False
                future_word_num_syllables, rhyme_group_index, syllables_to_go = -1, -1, -1
                raw_sentence, classification_label = self.data[self.pos], -1
                original_sentence = raw_sentence.split()
                sentence = self.parent.tokenizer.encode(raw_sentence, return_tensors='pt')[0]
                length = len(sentence)
                min_sentence_length = MIN_SENTENCE_LENGTH
                if len(sentence) > min_sentence_length: # set to 3. well, everything in data is > 3 for the bag of words task
                    pos_to_split = random.randint(0, length - 1)
                    # try to get a subseq of exactly 10 syllables
                    inp = sentence[pos_to_split:]
                    num_syllables = 0
                    checked = False
                    for i in range(1, len(inp)):
                        decoded = self.parent.tokenizer.decode(inp[:i])
                        num_syllables = count_syllables(decoded)
                        if num_syllables > POETRY_LINE_SYLLABLES:
                            inp = inp[:i-1] # might get a few data points where the split is in the middle of a word, but it should be ok for learning.
                            last_line_length = i-1
                            decoded = self.parent.tokenizer.decode(inp)
                            num_syllables = count_syllables(decoded)
                            checked = True
                            break
                    if not checked or num_syllables != POETRY_LINE_SYLLABLES:
                        failed = True
                    length = len(inp)
                    num_words_in_input = len(self.parent.tokenizer.decode(inp).split())
                    classification_label = [is_iambic(self.parent.tokenizer.decode(inp)) for _ in range(length)] # predict for whole seq including future
                    # only look up to 10 words ahead if we're doing count syllables, since we'll filter out anything more than 10 syllables ahead anyway
                    future_word_position_max = len(original_sentence) - 1
                    future_word_position = 0
                    future_word = 'placeholder'
                    unstripped_future_word = future_word
                    future_word = future_word.strip().strip(string.punctuation) # NOTE: we didn't strip punctuation for the topic bag of words paper experiments for our method. it doesn't make much difference, though.
                    if not failed:
                        word_log_prob, future_word = 0, 0
                        pad_id = self.parent.gpt_pad_id
                        example = (inp, length, future_word, word_log_prob, pad_id, classification_label, syllables_to_go, future_word_num_syllables, rhyme_group_index)
                        valid = not failed
            elif self.parent.rhyme:
                failed = False
                future_word_num_syllables, rhyme_group_index = -1, -1
                raw_sentence, classification_label = self.data[self.pos], -1
                original_sentence = raw_sentence.split()
                sentence = self.parent.tokenizer.encode(raw_sentence, return_tensors='pt')[0]
                length = len(sentence)
                min_sentence_length = MIN_SENTENCE_LENGTH
                if len(sentence) > min_sentence_length: # set to 3. well, everything in data is > 3 for the bag of words task
                    pos_to_split = random.randint(1, length - 1) # for lm, learn all positions at once
                    inp = sentence[:pos_to_split]
                    length = len(inp)
                    num_words_in_input = len(self.parent.tokenizer.decode(inp).split())
                    if not failed and num_words_in_input < len(original_sentence):
                        # only look up to 10 words ahead if we're doing count syllables, since we'll filter out anything more than 10 syllables ahead anyway
                        future_word_position_max = min(len(original_sentence) - 1, num_words_in_input + MAX_COUNT_SYLLABLE_DIST)
                        future_word_position = random.randint(num_words_in_input-1, future_word_position_max) # allow the last possibly partial word though
                        future_word = original_sentence[future_word_position]
                        unstripped_future_word = future_word
                        future_word = future_word.strip().strip(string.punctuation) # NOTE: we didn't strip punctuation for the topic bag of words paper experiments for our method. it doesn't make much difference, though.

                        words_in_between = original_sentence[num_words_in_input-1:future_word_position+1]
                        syllables_to_go = count_syllables(' '.join(words_in_between))
                        if syllables_to_go > MAX_COUNT_SYLLABLE_DIST:
                            failed = True
                        future_word_num_syllables = count_syllables(future_word)
                        rhyme_group = self.parent.word2rhyme_group[future_word]
                        rhyme_group_index = self.parent.rhyme_group2index[rhyme_group]
                        # truncate context a bit since we're just doing couplets. random length from 1 to max desired length for this purpose.
                        desired_length = random.randint(1, MAX_COUNT_SYLLABLE_INPUT_LENGTH)
                        inp = inp[-desired_length:]
                        length = len(inp)

                        if not failed and future_word in self.parent.word2index.keys():
                            word_log_prob = math.log(self.parent.rhyme_group_counts[rhyme_group] / self.parent.total_rhyme_groups)
                            future_word = rhyme_group_index # future conditioning is just the rhyme group in this case
                            pad_id = self.parent.gpt_pad_id
                            example = (inp, length, future_word, word_log_prob, pad_id, classification_label, syllables_to_go, future_word_num_syllables, rhyme_group_index)
                            valid = not failed
            elif self.parent.newline:
                failed = False
                future_word_num_syllables, rhyme_group_index = -1, -1
                raw_sentence, classification_label = self.data[self.pos], -1
                original_sentence = raw_sentence.split()
                sentence = self.parent.tokenizer.encode(raw_sentence, return_tensors='pt')[0]
                length = len(sentence)
                min_sentence_length = MIN_SENTENCE_LENGTH
                if len(sentence) > min_sentence_length: # set to 3. well, everything in data is > 3 for the bag of words task
                    pos_to_split = random.randint(1, length - 1) # for lm, learn all positions at once
                    inp = sentence[:pos_to_split]
                    while pos_to_split < len(sentence):
                        if len(self.parent.tokenizer.decode(inp).split()) == len(self.parent.tokenizer.decode(sentence[:pos_to_split + 1]).split()):
                            pos_to_split += 1
                            inp = sentence[:pos_to_split]
                        else:
                            break
                    length = len(inp)
                    num_words_in_input = len(self.parent.tokenizer.decode(inp).split())
                    if not failed and num_words_in_input < len(original_sentence):
                        # only look up to 10 words ahead if we're doing count syllables, since we'll filter out anything more than 10 syllables ahead anyway
                        future_word_position_max = len(original_sentence) - 1
                        future_word_position = random.randint(num_words_in_input-1, future_word_position_max) # allow the last possibly partial word though
                        future_word = original_sentence[future_word_position]
                        unstripped_future_word = future_word
                        future_word = future_word.strip().strip(string.punctuation) # NOTE: we didn't strip punctuation for the topic bag of words paper experiments for our method. it doesn't make much difference, though.

                        # future_word = original_sentence[-1] # useful for debugging
                        words_in_between = original_sentence[num_words_in_input-1:future_word_position+1]
                        syllables_to_go = count_syllables(' '.join(words_in_between))
                        if syllables_to_go > MAX_COUNT_SYLLABLE_DIST:
                            failed = True
                        # truncate context a bit since we're just doing couplets. random length from 1 to max desired length for this purpose.
                        desired_length = random.randint(1, MAX_COUNT_SYLLABLE_INPUT_LENGTH)
                        # desired_length = 10 # useful for debugging
                        inp = inp[-desired_length:]
                        length = len(inp)
                        true_label = 1 if unstripped_future_word.strip()[-1] in PHRASE_ENDS else 0 # common ways to end a phrase
                        classification_label = [-1 for _ in range(length)]
                        classification_label[-1] = true_label # only learn at the last position
                        if not failed and future_word in self.parent.word2index.keys():
                            word_log_prob = math.log(self.parent.vocab[future_word] / self.parent.total_words) # roughly baseline prob of word under noise model
                            future_word = self.parent.word2index[future_word]
                            pad_id = self.parent.gpt_pad_id
                            example = (inp, length, future_word, word_log_prob, pad_id, classification_label, syllables_to_go, future_word_num_syllables, rhyme_group_index)
                            valid = not failed
            else:
                raise NotImplementedError

            self.pos += increment
        return example
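For orientation, a hypothetical sketch of how this dataset and loader are consumed (not part of the commit; the argparse fields mirror what `Dataset.__init__` reads above, and the tuple order matches what `collate` returns):

```python
# Hypothetical usage sketch for Dataset/SplitLoader; not in the commit.
from types import SimpleNamespace
from data import Dataset

args = SimpleNamespace(task='topic', data_dir='train_data/gpt2_generations',
                       batch_size=128, seed=0, debug=False,
                       dataset_info=None, glove_file=None, rhyme_info=None)
dataset = Dataset(args)
loader = dataset.loader('train', num_workers=0)

# unpack one batch in the order collate() returns
(inputs, lengths, future_words, log_probs, labels, classification_labels,
 syllables_to_go, future_word_num_syllables, rhyme_group_index) = next(iter(loader))
print(inputs.shape)  # (batch_size, max_seq_len)
```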
naacl-2021-fudge-controlled-generation/eval_formality_metrics.py
ADDED
@@ -0,0 +1,73 @@
from argparse import ArgumentParser
import pickle
import os
import math

import sacrebleu
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModelWithLMHead, pipeline, set_seed, GPT2Tokenizer, GPT2Model, MarianTokenizer, MarianMTModel

from constants import *
from model import Model
from util import save_checkpoint, ProgressMeter, AverageMeter, num_params

def avg_formality(preds, model, tokenizer, device='cuda'):
    probs = []
    for sent in preds:
        encoded_input = tokenizer.encode(sent, return_tensors='pt').to(device)
        lengths = torch.LongTensor([encoded_input.shape[1]]).to(device)
        scores = model(encoded_input, lengths=lengths) # batch x seq
        score = scores.flatten()[-1].item()
        probs.append(math.exp(score) / (1 + math.exp(score))) # sigmoided score = prob
    return np.mean(probs)

if __name__=='__main__':
    parser = ArgumentParser()
    parser.add_argument('--pred', type=str)
    parser.add_argument('--ref', type=str, nargs='*', help='bleu refs')
    parser.add_argument('--ckpt', type=str, help='formality classifier')
    parser.add_argument('--dataset_info', type=str)
    parser.add_argument('--device', type=str, default='cuda', choices=['cpu', 'cuda'])
    parser.add_argument('--model_string', type=str, default='Helsinki-NLP/opus-mt-es-en')

    args = parser.parse_args()

    # refs = [['The dog bit the man.', 'It was not unexpected.', 'The man bit him first.'],
    #         ['The dog had bit the man.', 'No one was surprised.', 'The man had bitten the dog.']]
    # sys = ['The dog bit the man.', "It wasn't surprising.", 'The man had just bitten him.']
    print('num ref files', len(args.ref))
    pred = []
    with open(args.pred, 'r') as rf:
        for line in rf:
            pred.append(line.strip())
    refs = []
    for ref_file in args.ref:
        ref = []
        with open(ref_file, 'r') as rf:
            for line in rf:
                ref.append(line.strip())
        assert len(ref) == len(pred)
        refs.append(ref)
    bleu = sacrebleu.corpus_bleu(pred, refs)
    print('BLEU score:', bleu.score)

    with open(args.dataset_info, 'rb') as rf:
        dataset_info = pickle.load(rf)

    tokenizer = MarianTokenizer.from_pretrained(args.model_string)
    tokenizer.add_special_tokens({'pad_token': PAD_TOKEN})
    pad_id = tokenizer.encode(PAD_TOKEN)[0]

    checkpoint = torch.load(args.ckpt, map_location=args.device)
    model_args = checkpoint['args']
    conditioning_model = Model(model_args, pad_id, len(dataset_info.index2word)) # no need to get the glove embeddings when reloading since they're saved in model ckpt anyway
    conditioning_model.load_state_dict(checkpoint['state_dict'])
    conditioning_model = conditioning_model.to(args.device)
    conditioning_model.eval()
    print("=> loaded checkpoint '{}' (epoch {})"
          .format(args.ckpt, checkpoint['epoch']))
    print('num params', num_params(conditioning_model))

    print('avg formality prob according to model', avg_formality(pred, conditioning_model, tokenizer, device=args.device))
naacl-2021-fudge-controlled-generation/eval_poetry_metrics.py
ADDED
@@ -0,0 +1,135 @@
from argparse import ArgumentParser
import math
import string

from tqdm import tqdm
import numpy as np
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelWithLMHead, AutoModelForSequenceClassification

from poetry_util import is_iambic, perfect_rhyme_end, count_syllables
from constants import *


def conditional_perplexity(prefix, pred, tokenizer, model, device='cuda', sep_losses=False):
    # calculate perplexity on pred only, conditioned on prefix
    sentence = prefix + pred
    sos_token = tokenizer.decode([0])
    prefix_tensor_input = tokenizer.encode(sos_token + prefix.replace(EOT_TOKEN, ' ').strip(), return_tensors='pt').to(device)
    full_tensor_input = tokenizer.encode(sos_token + sentence.replace(EOT_TOKEN, ' ').strip(), return_tensors='pt').to(device)
    if sep_losses:
        prefix_loss = model(prefix_tensor_input, labels=prefix_tensor_input)[0].sum()
        full_loss = model(full_tensor_input, labels=full_tensor_input)[0].sum()
    else:
        prefix_loss = model(prefix_tensor_input, labels=prefix_tensor_input)[0] * (prefix_tensor_input.shape[1]-1) # neg log prob of prefix
        full_loss = model(full_tensor_input, labels=full_tensor_input)[0] * (full_tensor_input.shape[1]-1) # neg log prob of full seq
    pred_loss = full_loss - prefix_loss # neg log prob of preds given prefix
    avg_pred_loss = pred_loss / (full_tensor_input.shape[1] - prefix_tensor_input.shape[1])
    return math.exp(avg_pred_loss.item())


def grammaticality(sentences, tokenizer, model, device='cuda'):
    with torch.no_grad():
        total_good = 0
        for sent in tqdm(sentences, total=len(sentences)):
            good_prob = F.softmax(model(tokenizer.encode(sent, return_tensors='pt').to(device))[0].flatten(), dim=0)[1]
            total_good += good_prob
        return total_good / len(sentences) # avg probability of grammaticality according to model


def distinctness(sentences):
    d1 = set()
    d2 = set()
    d3 = set()
    total_words = 0
    for sentence in sentences:
        o = sentence.split(' ')
        total_words += len(o)
        d1.update(o)
        for i in range(len(o) - 1):
            d2.add(o[i] + '_' + o[i+1])
        for i in range(len(o) - 2):
            d3.add(o[i] + '_' + o[i+1] + '_' + o[i+2])
    return len(d1) / total_words, len(d2) / total_words, len(d3) / total_words


if __name__=='__main__':
    parser = ArgumentParser()
    parser.add_argument('--pred_file', type=str)
    parser.add_argument('--prefix_file', type=str)
    parser.add_argument('--device', type=str, default='cuda', choices=['cpu', 'cuda'])
    args = parser.parse_args()

    preds = []
    with open(args.pred_file, 'r') as rf:
        for line in rf:
            preds.append(line[:-1]) # drop \n but not beginning spaces if any
    prefixes = []
    with open(args.prefix_file, 'r') as rf:
        for line in rf:
            prefixes.append(line.strip())
    assert len(prefixes) == len(preds)
    rhymes = 0
    iambic = 0
    ten_syllables = 0
    end = 0
    diff_rhymes = 0
    all_success = 0
    total = len(prefixes)
    for prefix, pred in zip(prefixes, preds):
        if is_iambic(pred):
            iambic += 1
        if perfect_rhyme_end(prefix, pred):
            rhymes += 1
            if prefix.split()[-1].strip(string.punctuation) != pred.split()[-1].strip(string.punctuation):
                diff_rhymes += 1
        if count_syllables(pred) == 10:
            ten_syllables += 1
        if pred.strip()[-1] in PHRASE_ENDS:
            end += 1
        if is_iambic(pred) and perfect_rhyme_end(prefix, pred) and count_syllables(pred) == 10 and pred.strip()[-1] in PHRASE_ENDS:
            all_success += 1
    print('iambic', iambic, 'out of', total, ', frac', iambic / total)
    print('rhymes', rhymes, 'out of', total, ', frac', rhymes / total)
    print('end sentence', end, 'out of', total, ', frac', end / total)
    print('10 syllables', ten_syllables, 'out of', total, ', frac', ten_syllables / total)
    print('all success', all_success, 'out of', total, ', frac', all_success / total)
    print('rhymes with diff word', diff_rhymes, 'out of', total, ', frac', diff_rhymes / total)

    print('distinctness', distinctness(preds))

    grammar_tokenizer = AutoTokenizer.from_pretrained('textattack/roberta-base-CoLA')
    grammar_model = AutoModelForSequenceClassification.from_pretrained('textattack/roberta-base-CoLA').to(args.device)
    grammar_model.eval()
    print('grammaticality', grammaticality(preds, grammar_tokenizer, grammar_model, device=args.device))

    perplexities = []
    eval_tokenizer = AutoTokenizer.from_pretrained('transfo-xl-wt103')
    eval_model = AutoModelWithLMHead.from_pretrained('transfo-xl-wt103').to(args.device)
    eval_model.eval()
    for prefix, pred in zip(prefixes, preds):
        perplexities.append(conditional_perplexity(prefix, pred, eval_tokenizer, eval_model, device=args.device, sep_losses=True))
    print('transformer xl perplexity', np.mean(perplexities), '+/-', np.std(perplexities))

    perplexities = []
    eval_tokenizer = AutoTokenizer.from_pretrained('openai-gpt')
    eval_model = AutoModelWithLMHead.from_pretrained('openai-gpt').to(args.device)
    eval_model.eval()
    for prefix, pred in zip(prefixes, preds):
        perplexities.append(conditional_perplexity(prefix, pred, eval_tokenizer, eval_model, device=args.device))
    print('gpt perplexity', np.mean(perplexities), '+/-', np.std(perplexities))
+
|
123 |
+
# NOTE: uncomment this section with the path to the Shakespeare-finetuned GPT to evaluate this metric. it's in ckpt/poetry/gpt_finetune_shakespeare.pth.tar.
|
124 |
+
# eval_tokenizer = AutoTokenizer.from_pretrained('openai-gpt')
|
125 |
+
# eval_model = AutoModelWithLMHead.from_pretrained('openai-gpt').to(args.device)
|
126 |
+
# checkpoint = torch.load('***PATH_TO_SHAKESPEARE_FINETUNED_GPT***', map_location=args.device)
|
127 |
+
# mod_dict = {}
|
128 |
+
# for key in checkpoint['state_dict']:
|
129 |
+
# mod_dict[key.replace('classifier.', '')] = checkpoint['state_dict'][key]
|
130 |
+
# eval_model.load_state_dict(mod_dict)
|
131 |
+
# eval_model.eval()
|
132 |
+
# perplexities = []
|
133 |
+
# for prefix, pred in zip(prefixes, preds):
|
134 |
+
# perplexities.append(conditional_perplexity(prefix, pred, eval_tokenizer, eval_model, device=args.device))
|
135 |
+
# print('shakespeare finetuned perplexity', np.mean(perplexities), '+/-', np.std(perplexities))
|
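The non-sep_losses branch works because the Hugging Face LM returns the token-averaged loss, so multiplying by (length - 1) recovers the summed negative log-likelihood, and full_loss - prefix_loss is then the NLL of the continuation alone. A minimal usage sketch, run in this file's context (the couplet strings are placeholders):

from transformers import AutoTokenizer, AutoModelWithLMHead

eval_tokenizer = AutoTokenizer.from_pretrained('openai-gpt')
eval_model = AutoModelWithLMHead.from_pretrained('openai-gpt')
eval_model.eval()
# Perplexity of the second line given the first; lower means the completion
# is less surprising to the evaluation LM.
ppl = conditional_perplexity("Shall I compare thee to a summer's day?\n",
                             'Thou art more lovely and more temperate.',
                             eval_tokenizer, eval_model, device='cpu')
print(ppl)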
naacl-2021-fudge-controlled-generation/eval_topic_metrics.py
ADDED
@@ -0,0 +1,134 @@
import os
import random
import time
import pickle
import math
from argparse import ArgumentParser
from collections import defaultdict
import string
import csv

from tqdm import tqdm
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelWithLMHead, AutoModelForSequenceClassification

from data import Dataset
from model import Model
from util import save_checkpoint, ProgressMeter, AverageMeter, num_params, pad_mask
from predict_topic import predict # this repo ships predict_topic.py; there is no top-level predict module
from constants import *

def tw_topic_eval(sentences, category, tw_dir, cap=None):
    # num matches of distinct words
    words = []
    with open(os.path.join(tw_dir, category + '.txt'), 'r') as rf:
        for line in rf:
            words.append(line.strip().lower())
    num_match = 0
    for sent in sentences:
        sent_match = 0
        sent = sent.strip().lower().split()
        sent = [tok.strip(string.punctuation) for tok in sent]
        for word in words:
            if word in sent:
                sent_match += 1
        if cap is None:
            num_match += sent_match
        else:
            num_match += min(cap, sent_match)
    return num_match


def perplexity(sentences, tokenizer, model, device='cuda'):
    # calculate perplexity
    with torch.no_grad():
        ppl = []
        sos_token = tokenizer.decode([0])
        for sentence in tqdm(sentences, total=len(sentences)):
            full_tensor_input = tokenizer.encode(sos_token + sentence.replace(EOT_TOKEN, ' ').strip(), return_tensors='pt').to(device)
            full_loss = model(full_tensor_input, labels=full_tensor_input)[0].mean()
            ppl.append(torch.exp(full_loss).flatten().cpu().item())
    return np.mean(ppl), np.std(ppl)


def grammaticality(sentences, tokenizer, model, device='cuda'):
    with torch.no_grad():
        total_good = 0
        for sent in tqdm(sentences, total=len(sentences)):
            good_prob = F.softmax(model(tokenizer.encode(sent, return_tensors='pt').to(device))[0].flatten(), dim=0)[1]
            total_good += good_prob
        return total_good / len(sentences) # avg probability of grammaticality according to model


def distinctness(results):
    d1, d2, d3 = defaultdict(lambda: set()), defaultdict(lambda: set()), defaultdict(lambda: set())
    total_words = defaultdict(lambda: 0)
    for cw, outputs in results.items():
        for o in outputs:
            o = o.replace(EOT_TOKEN, ' ').strip().split(' ')
            o = [str(x) for x in o]
            total_words[cw] += len(o)
            d1[cw].update(o)
            for i in range(len(o) - 1):
                d2[cw].add(o[i] + ' ' + o[i+1])
            for i in range(len(o) - 2):
                d3[cw].add(o[i] + ' ' + o[i+1] + ' ' + o[i+2])
    return_info = []
    avg_d1, avg_d2, avg_d3 = 0, 0, 0
    for cw in total_words.keys():
        return_info.append((cw, 'DISTINCTNESS', len(d1[cw]) / total_words[cw], len(d2[cw]) / total_words[cw], len(d3[cw]) / total_words[cw]))
        avg_d1 += len(d1[cw]) / total_words[cw]
        avg_d2 += len(d2[cw]) / total_words[cw]
        avg_d3 += len(d3[cw]) / total_words[cw]
    avg_d1, avg_d2, avg_d3 = avg_d1 / len(total_words.keys()), avg_d2 / len(total_words.keys()), avg_d3 / len(total_words.keys())
    return return_info, (avg_d1, avg_d2, avg_d3)


if __name__=='__main__':
    parser = ArgumentParser()
    parser.add_argument('--log_file', type=str, required=True, help='where to load results from')
    parser.add_argument('--tw_dir', type=str, default='test_wordlists', help='test wordlists')
    parser.add_argument('--batch_size', type=int, default=8, help='max samples at a time')
    parser.add_argument('--cap_per_example', type=int, default=None, help='max matches to count per sentence')
    parser.add_argument('--device', type=str, default='cuda', choices=['cpu', 'cuda'])
    args = parser.parse_args()

    tw_topic_match_c_total = 0
    category_totals_c = defaultdict(lambda:0)
    results = defaultdict(lambda: [])
    with open(args.log_file, 'r') as rf:
        data = list(csv.DictReader(rf))
        for line in data:
            results[line['category']].append(line['generation'])

    all_c_sents = []
    for category, condition_results in results.items():
        tw_topic_match_c = tw_topic_eval(condition_results, category, args.tw_dir, cap=args.cap_per_example)
        tw_topic_match_c_total += tw_topic_match_c
        category_totals_c[category] += tw_topic_match_c
        all_c_sents += condition_results

    print('Test wordlist matches (divide by num outputs to get the Success metric):', tw_topic_match_c_total)
    print('per category:', category_totals_c)

    dist_info_by_category, dist_overall = distinctness(results)
    print('Overall avg distinctness:', dist_overall)
    print('per category:', dist_info_by_category)

    grammar_tokenizer = AutoTokenizer.from_pretrained('textattack/roberta-base-CoLA')
    grammar_model = AutoModelForSequenceClassification.from_pretrained('textattack/roberta-base-CoLA').to(args.device)
    grammar_model.eval()
    print('grammaticality:', grammaticality(all_c_sents, grammar_tokenizer, grammar_model, device=args.device))

    eval_tokenizer = AutoTokenizer.from_pretrained('openai-gpt')
    eval_model = AutoModelWithLMHead.from_pretrained('openai-gpt').to(args.device)
    eval_model.eval()
    print('GPT perplexity:', perplexity(all_c_sents, eval_tokenizer, eval_model, device=args.device))

    eval_tokenizer = AutoTokenizer.from_pretrained('transfo-xl-wt103')
    eval_model = AutoModelWithLMHead.from_pretrained('transfo-xl-wt103').to(args.device)
    eval_model.eval()
    print('TFXL perplexity:', perplexity(all_c_sents, eval_tokenizer, eval_model, device=args.device))
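A toy round-trip through distinctness, whose per-category dict input matches the CSV grouping above; run it in this file's context, since distinctness references EOT_TOKEN from constants, and the example generations are made up:

toy_results = {'science': ['the lab ran the experiment',
                           'the lab ran another experiment']}
per_category, (d1, d2, d3) = distinctness(toy_results)
print(per_category)  # [('science', 'DISTINCTNESS', dist-1, dist-2, dist-3)]
print(d1, d2, d3)    # averaged fractions of unique uni-/bi-/trigrams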
naacl-2021-fudge-controlled-generation/evaluate_clickbait.py
ADDED
@@ -0,0 +1,200 @@
import os
import random
import time
import pickle
import math
from argparse import ArgumentParser

from typing import Iterable, List, Optional, Tuple

from tqdm import tqdm
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelWithLMHead
from torch import Tensor

from data import Dataset
from model import Model
from util import num_params
from constants import *



tokenizer = AutoTokenizer.from_pretrained('google/pegasus-xsum')
classifier_tokenizer = AutoTokenizer.from_pretrained('sentence-transformers/all-mpnet-base-v2')


def main(args):
    with open(args.dataset_info, 'rb') as rf:
        dataset_info = pickle.load(rf)

    article_content = """Australian actor Guy Pearce will return for the iconic soap Neighbours finale on August 1 to reprise his role as Mike Young.
Guy, 54, played the troubled Mike from 1986 to 1989, and is now set to make a comeback on the show after 33 years, Metro.co.uk reports.
The star's character arcs explored the implications of domestic abuse, student-teacher relationships and dealing with loss of loved ones.
Speaking to Metro.co.uk, Guy said: 'It is very exciting and surreal at the same time being back on set again, however it feels like coming home.
'It's where it all started for me professionally. I've been asked to come back on occasions over the years and wondered if it was the right thing
to do, but once I knew the show was finishing, I knew I had to do it.'He added that there is 'nothing like being here all together again'
, even though he's had a chance to catch-up with other cast members."""

    tokenizer.add_special_tokens({'pad_token': PAD_TOKEN})
    pad_id = tokenizer.encode(PAD_TOKEN)[0]

    # For loading Clickbait summarizer
    model = AutoModelWithLMHead.from_pretrained(args.model_string, return_dict=True).to(args.device)

    model.eval()

    checkpoint = torch.load(args.ckpt, map_location=args.device)
    model_args = checkpoint['args']
    conditioning_model = Model(model_args, pad_id, len(dataset_info.index2word)) # no need to get the glove embeddings when reloading since they're saved in model ckpt anyway
    conditioning_model.load_state_dict(checkpoint['state_dict'])
    conditioning_model = conditioning_model.to(args.device)
    conditioning_model.eval()
    print("=> loaded checkpoint '{}' (epoch {})"
            .format(args.ckpt, checkpoint['epoch']))
    print('num params', num_params(conditioning_model))

    while True:
        results = generate_clickbait(model,
                                     tokenizer,
                                     conditioning_model,
                                     [args.input_text],
                                     dataset_info,
                                     precondition_topk=args.precondition_topk,
                                     do_sample=args.do_sample,
                                     length_cutoff=args.length_cutoff,
                                     condition_lambda=args.condition_lambda,
                                     article_content=article_content,
                                     device=args.device)
        # print(results)
        import pdb; pdb.set_trace()


def generate_clickbait(model,
                       tokenizer,
                       conditioning_model,
                       input_text,
                       dataset_info,
                       precondition_topk,
                       length_cutoff,
                       do_sample=False, # accepted because main() passes it; generation below always samples from the renormalized top-k via torch.multinomial
                       condition_lambda=1.0,
                       article_content=None,
                       device='cuda'):
    with torch.no_grad():
        batch_size = len(input_text)
        # encoded_input_article = [tokenizer.encode(article_content, return_tensors='pt',add_special_tokens=False).to(device)] # batch x seq
        encoded_input_article = tokenizer(article_content, return_tensors='pt',add_special_tokens=False, max_length=512).to(device) # batch x seq
        # encoded_input_article = torch.cat(encoded_input_article, dim=0)
        # attention_mask = encoded_input_article.new_ones(encoded_input_article.shape).to(device)

        # CHANGE=ko
        encoded_input = tokenizer('<pad>', return_tensors='pt',add_special_tokens=False).to(device) # batch x seq
        # encoded_input = tokenizer('<pad>'+ input_text[0], return_tensors='pt',add_special_tokens=False).to(device) # batch x seq
        # encoded_input = torch.cat(encoded_input, dim=0)
        encoded_input = encoded_input['input_ids']


        lengths = torch.LongTensor([encoded_input.shape[1]]).to(device)
        # lengths = 1

        past = None
        use_cache = True

        # CHANGE
        # model_kwargs = {'encoder_outputs': model.get_encoder()(encoded_input_article, attention_mask=attention_mask)}
        # print(encoded_input_article)
        # print(encoded_input_article['input_ids'].shape, encoded_input_article['attention_mask'].shape)
        model_kwargs = {'encoder_outputs': model.get_encoder()(input_ids=encoded_input_article['input_ids'],
                                                               attention_mask=encoded_input_article['attention_mask'],
                                                               return_dict=True,
                                                               output_attentions=False,
                                                               output_hidden_states=False),
                        }

        while lengths.max() < length_cutoff:
            model_inputs = model.prepare_inputs_for_generation(
                input_ids = encoded_input_article['input_ids'],
                decoder_input_ids=encoded_input,
                # past=past,
                attention_mask=encoded_input_article['attention_mask'],
                use_cache=use_cache,
                **model_kwargs
            )

            outputs = model(**model_inputs, return_dict=True)
            logits = outputs.logits[:, -1, :]

            if "past_key_values" in outputs:
                model_kwargs["past"] = outputs.past_key_values

            # logits = model(encoded_input)[0][:, -1, :] # batch x vocab
            top_logits, top_indices = logits.topk(precondition_topk, dim=1) # batch x topk
            new_input_candidates = torch.cat([encoded_input.unsqueeze(1).expand(-1, precondition_topk, -1), top_indices.unsqueeze(2)], dim=2) # batch x topk x seq+1
            expanded_lengths = (lengths + 1).unsqueeze(1).expand(batch_size, precondition_topk) # batch x topk

            if condition_lambda == 0:
                condition_logits = torch.zeros_like(top_logits).float()
                condition_logits = condition_logits.view(batch_size, precondition_topk, -1) # batch x topk x N
            else:
                decoded_outputs = tokenizer.batch_decode(new_input_candidates.view(-1, new_input_candidates.size(-1)), clean_up_tokenization_spaces=False)
                resulting_tokenization = classifier_tokenizer(decoded_outputs, add_special_tokens=False, padding='longest')
                encoded_with_classifier = resulting_tokenization['input_ids']
                attention_mask = torch.tensor(resulting_tokenization['attention_mask']).to(model.device)
                tplus1_candidates_classifier = torch.tensor(encoded_with_classifier).view(batch_size, precondition_topk, -1).to(model.device)

                condition_logits = conditioning_model(tplus1_candidates_classifier.flatten(0, 1), # batch*topk x seq+1
                                                      expanded_lengths.flatten(0, 1), # batch*topk
                                                      None,
                                                      None,
                                                      None,
                                                      attention_mask=attention_mask
                                                      )
                condition_logits = condition_logits.view(batch_size, precondition_topk, -1) # batch x topk x N
                condition_logits = condition_logits - torch.log(1 + torch.exp(condition_logits)) # get correct log probs

            condition_logits = torch.mean(condition_logits, dim=2)
            full_logits = top_logits + condition_logits * condition_lambda # batch x topk
            post_logits, post_indices = full_logits.topk(precondition_topk, dim=1)
            post_probs = F.softmax(post_logits, dim=1)
            # index_into_top_indices = post_indices[torch.arange(batch_size).to(post_indices.device), torch.multinomial(post_probs, 1).flatten()] # batch
            index_into_top_indices = post_indices[:, torch.multinomial(post_probs, 1).flatten()] # batch

            # next_indices = top_indices[torch.arange(batch_size).to(top_indices.device), index_into_top_indices] # batch
            next_indices = top_indices[:, index_into_top_indices] # batch

            # encoded_input = torch.cat([encoded_input, next_indices.unsqueeze(1)], dim=1) # batch x seq+1
            encoded_input = torch.cat([encoded_input, next_indices.squeeze(1)], dim=1)
            lengths = lengths + 1 # batch

        # print(tokenizer.decode(encoded_input[0], add_special_tokens=False))
        return [tokenizer.decode(s) for s in encoded_input]


if __name__=='__main__':
    parser = ArgumentParser()

    # DATA
    parser.add_argument('--ckpt', type=str, required=True)
    parser.add_argument('--dataset_info', type=str, required=True, help='saved dataset info')
    parser.add_argument('--model_string', type=str, default='Helsinki-NLP/opus-mt-es-en')

    parser.add_argument('--input_text', type=str, default=None, required=True, help='text to run pred on') # consumed as args.input_text in main() above

    parser.add_argument('--precondition_topk', type=int, default=200, help='consider top k outputs from text generation at each step before conditioning and re-pruning')
    parser.add_argument('--do_sample', action='store_true', default=False, help='sample instead of greedy')
    parser.add_argument('--condition_lambda', type=float, default=1.0, help='lambda weight on conditioning model')
    parser.add_argument('--length_cutoff', type=int, default=512, help='max length')

    parser.add_argument('--seed', type=int, default=1, help='random seed')
    parser.add_argument('--device', type=str, default='cuda', choices=['cpu', 'cuda'])
    parser.add_argument('--debug', action='store_true', default=False)

    args = parser.parse_args()

    random.seed(args.seed)
    np.random.seed(args.seed)
    torch.manual_seed(args.seed)

    main(args)
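The core FUDGE step above is full_logits = top_logits + lambda * log P(attribute | prefix + candidate); the in-file expression x - log(1 + exp(x)) is a numerically explicit log sigmoid(x). A self-contained sketch of that re-ranking step with made-up logits for three candidates:

import torch
import torch.nn.functional as F

top_logits = torch.tensor([[2.0, 1.5, 0.3]])     # base model logits for top-k candidates
cond_logits = torch.tensor([[-3.0, 0.5, 2.0]])   # discriminator logits per candidate
cond_log_probs = F.logsigmoid(cond_logits)       # log P(attribute | prefix + candidate)
condition_lambda = 1.0
full_logits = top_logits + condition_lambda * cond_log_probs
next_token_dist = F.softmax(full_logits, dim=1)  # renormalized over the k candidates
print(next_token_dist)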
naacl-2021-fudge-controlled-generation/evaluate_formality.py
ADDED
@@ -0,0 +1,104 @@
import os
import random
import time
import pickle
import math
from argparse import ArgumentParser
from collections import namedtuple

from tqdm import tqdm
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelWithLMHead, pipeline, set_seed, GPT2Tokenizer, GPT2Model, MarianTokenizer, MarianMTModel

from data import Dataset
from model import Model
from util import save_checkpoint, ProgressMeter, AverageMeter, num_params
from constants import *
from predict_formality import predict_formality

def main(args):
    with open(args.dataset_info, 'rb') as rf:
        dataset_info = pickle.load(rf)
    tokenizer = MarianTokenizer.from_pretrained(args.model_string)
    tokenizer.add_special_tokens({'pad_token': PAD_TOKEN})
    pad_id = tokenizer.encode(PAD_TOKEN)[0]
    model = MarianMTModel.from_pretrained(args.model_string, return_dict=True).to(args.device)
    if args.model_path is not None:
        if os.path.isdir(args.model_path):
            for _, _, files in os.walk(args.model_path):
                for fname in files:
                    if fname.endswith('.ckpt'):
                        args.model_path = os.path.join(args.model_path, fname)
                        break
        ckpt = torch.load(args.model_path, map_location=torch.device(args.device))
        try:
            model.load_state_dict(ckpt['state_dict'], strict=False)
        except Exception: # Lightning-style checkpoints prefix every key with 'model.'; strip it and retry
            state_dict = {}
            for key in ckpt['state_dict'].keys():
                assert key.startswith('model.')
                state_dict[key[6:]] = ckpt['state_dict'][key]
            model.load_state_dict(state_dict)
    model.eval()

    checkpoint = torch.load(args.ckpt, map_location=args.device)
    model_args = checkpoint['args']
    conditioning_model = Model(model_args, pad_id, len(dataset_info.index2word)) # no need to get the glove embeddings when reloading since they're saved in model ckpt anyway
    conditioning_model.load_state_dict(checkpoint['state_dict'])
    conditioning_model = conditioning_model.to(args.device)
    conditioning_model.eval()
    if args.verbose:
        print("=> loaded checkpoint '{}' (epoch {})"
                .format(args.ckpt, checkpoint['epoch']))
        print('num params', num_params(conditioning_model))

    inputs = []
    with open(args.in_file, 'r') as rf:
        for line in rf:
            inputs.append(line.strip())

    for inp in tqdm(inputs, total=len(inputs)):
        results = predict_formality(model,
                                    tokenizer,
                                    conditioning_model,
                                    [inp],
                                    dataset_info,
                                    precondition_topk=args.precondition_topk,
                                    do_sample=args.do_sample,
                                    length_cutoff=args.length_cutoff,
                                    condition_lambda=args.condition_lambda,
                                    device=args.device)
        print(results[0])


if __name__=='__main__':
    parser = ArgumentParser()

    # DATA
    parser.add_argument('--ckpt', type=str, required=True)
    parser.add_argument('--dataset_info', type=str, required=True, help='saved dataset info')
    parser.add_argument('--model_string', type=str, default='Helsinki-NLP/opus-mt-es-en')
    parser.add_argument('--model_path', type=str, default=None)

    parser.add_argument('--in_file', type=str, default=None, required=True, help='file containing text to run pred on')

    parser.add_argument('--precondition_topk', type=int, default=200, help='consider top k outputs from gpt at each step before conditioning and re-pruning')
    parser.add_argument('--do_sample', action='store_true', default=False, help='sample or greedy; only greedy implemented')
    parser.add_argument('--condition_lambda', type=float, default=1.0, help='lambda weight on conditioning model')
    parser.add_argument('--length_cutoff', type=int, default=512, help='max length')

    parser.add_argument('--seed', type=int, default=1, help='random seed')
    parser.add_argument('--device', type=str, default='cuda', choices=['cpu', 'cuda'])
    parser.add_argument('--debug', action='store_true', default=False)
    parser.add_argument('--verbose', action='store_true', default=False)

    args = parser.parse_args()

    random.seed(args.seed)
    np.random.seed(args.seed)
    torch.manual_seed(args.seed)

    main(args)
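The fallback branch above assumes a Lightning-style checkpoint whose keys carry a 'model.' prefix; a tiny standalone illustration of that key rewrite (dict values are stand-ins for weight tensors):

ckpt = {'state_dict': {'model.shared.weight': 'w0', 'model.lm_head.weight': 'w1'}}
state_dict = {key[len('model.'):]: value
              for key, value in ckpt['state_dict'].items()}
print(sorted(state_dict))  # ['lm_head.weight', 'shared.weight']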
naacl-2021-fudge-controlled-generation/evaluate_poetry.py
ADDED
@@ -0,0 +1,115 @@
import os
import random
import time
import pickle
import math
from argparse import ArgumentParser
import string
from collections import defaultdict

from tqdm import tqdm
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelWithLMHead, pipeline, set_seed, GPT2Tokenizer, GPT2Model

from data import Dataset, load_rhyme_info
from model import Model
from util import save_checkpoint, ProgressMeter, AverageMeter, num_params
from constants import *
from poetry_util import get_rhymes, count_syllables
from predict_poetry import predict_couplet

def main(args):
    with open(args.dataset_info, 'rb') as rf:
        dataset_info = pickle.load(rf)
    gpt_tokenizer = AutoTokenizer.from_pretrained(args.model_string)
    gpt_tokenizer.add_special_tokens({'pad_token': PAD_TOKEN})
    gpt_pad_id = gpt_tokenizer.encode(PAD_TOKEN)[0]
    gpt_model = AutoModelWithLMHead.from_pretrained(args.model_string).to(args.device)
    gpt_model.eval()

    checkpoint = torch.load(args.iambic_ckpt, map_location=args.device)
    model_args = checkpoint['args']
    iambic_model = Model(model_args, gpt_pad_id, len(dataset_info.index2word)) # no need to get the glove embeddings when reloading since they're saved in model ckpt anyway
    iambic_model.load_state_dict(checkpoint['state_dict'])
    iambic_model = iambic_model.to(args.device)
    iambic_model.eval()
    if args.verbose:
        print("=> loaded checkpoint '{}' (epoch {})"
                .format(args.iambic_ckpt, checkpoint['epoch']))
        print('iambic model num params', num_params(iambic_model))

    with open(args.rhyme_info, 'rb') as rf:
        rhyme_info = pickle.load(rf)
    checkpoint = torch.load(args.rhyme_ckpt, map_location=args.device)
    model_args = checkpoint['args']
    rhyme_model = Model(model_args, gpt_pad_id, len(dataset_info.index2word), rhyme_group_size=len(rhyme_info.index2rhyme_group), verbose=args.verbose) # no need to get the glove embeddings when reloading since they're saved in model ckpt anyway
    rhyme_model.load_state_dict(checkpoint['state_dict'])
    rhyme_model = rhyme_model.to(args.device)
    rhyme_model.eval()
    if args.verbose:
        print("=> loaded checkpoint '{}' (epoch {})"
                .format(args.rhyme_ckpt, checkpoint['epoch']))
        print('rhyme model num params', num_params(rhyme_model))

    checkpoint = torch.load(args.newline_ckpt, map_location=args.device)
    model_args = checkpoint['args']
    newline_model = Model(model_args, gpt_pad_id, len(dataset_info.index2word)) # no need to get the glove embeddings when reloading since they're saved in model ckpt anyway
    newline_model.load_state_dict(checkpoint['state_dict'])
    newline_model = newline_model.to(args.device)
    newline_model.eval()
    if args.verbose:
        print("=> loaded checkpoint '{}' (epoch {})"
                .format(args.newline_ckpt, checkpoint['epoch']))
        print('newline model num params', num_params(newline_model))

    with open(args.prefix_file, 'r') as rf:
        lines = rf.readlines()
    for line in tqdm(lines, total=len(lines)):
        couplet = predict_couplet(gpt_model,
                                  gpt_tokenizer,
                                  iambic_model,
                                  rhyme_model,
                                  newline_model,
                                  [line],
                                  dataset_info,
                                  rhyme_info,
                                  args.precondition_topk,
                                  args.topk,
                                  condition_lambda=args.condition_lambda,
                                  device=args.device)
        assert len(couplet) == 2
        print(couplet[1].strip().replace('\n', ''))


if __name__=='__main__':
    parser = ArgumentParser()

    # DATA
    parser.add_argument('--iambic_ckpt', type=str, required=True)
    parser.add_argument('--rhyme_ckpt', type=str, required=True)
    parser.add_argument('--newline_ckpt', type=str, required=True)
    parser.add_argument('--dataset_info', type=str, required=True, help='saved dataset info')
    parser.add_argument('--rhyme_info', type=str, required=True, help='saved rhyme info')
    parser.add_argument('--model_string', type=str, default='gpt2-medium')

    parser.add_argument('--prefix_file', type=str, default=None, required=True, help='file of prefix lines for couplets')

    parser.add_argument('--precondition_topk', type=int, default=200, help='consider top k outputs from gpt at each step before conditioning and re-pruning')
    parser.add_argument('--topk', type=int, default=10, help='consider top k outputs from gpt at each step')
    parser.add_argument('--condition_lambda', type=float, default=1.0, help='lambda weight on conditioning model')

    parser.add_argument('--seed', type=int, default=1, help='random seed')
    parser.add_argument('--device', type=str, default='cuda', choices=['cpu', 'cuda'])
    parser.add_argument('--debug', action='store_true', default=False)
    parser.add_argument('--verbose', action='store_true', default=False)

    args = parser.parse_args()

    random.seed(args.seed)
    np.random.seed(args.seed)
    torch.manual_seed(args.seed)

    main(args)
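The same restore pattern appears three times above (iambic, rhyme, newline); a condensed sketch of it as a helper, where the helper name is hypothetical and gpt_pad_id / dataset_info come from the surrounding setup:

def load_fudge_predictor(ckpt_path, gpt_pad_id, dataset_info, device='cuda'):
    # Illustrative helper mirroring the three checkpoint-loading blocks above.
    checkpoint = torch.load(ckpt_path, map_location=device)
    predictor = Model(checkpoint['args'], gpt_pad_id, len(dataset_info.index2word))
    predictor.load_state_dict(checkpoint['state_dict'])
    return predictor.to(device).eval()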
naacl-2021-fudge-controlled-generation/evaluate_topic.py
ADDED
@@ -0,0 +1,143 @@
import os
import random
import time
import pickle
import math
from argparse import ArgumentParser
from collections import defaultdict
import string
import csv

from tqdm import tqdm
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelWithLMHead, pipeline, set_seed, GPT2Tokenizer, GPT2Model

from data import Dataset
from model import Model
from util import save_checkpoint, ProgressMeter, AverageMeter, num_params, pad_mask
from predict_topic import predict
from constants import *


def main(args):
    with open(args.dataset_info, 'rb') as rf:
        dataset_info = pickle.load(rf)
    gpt_tokenizer = AutoTokenizer.from_pretrained(args.model_string)
    gpt_tokenizer.add_special_tokens({'pad_token': PAD_TOKEN})
    gpt_pad_id = gpt_tokenizer.encode(PAD_TOKEN)[0]
    gpt_model = AutoModelWithLMHead.from_pretrained(args.model_string).to(args.device)
    gpt_model.eval()

    checkpoint = torch.load(args.ckpt, map_location=args.device)
    model_args = checkpoint['args']
    conditioning_model = Model(model_args, gpt_pad_id, len(dataset_info.index2word)) # no need to get the glove embeddings when reloading since they're saved in model ckpt anyway
    conditioning_model.load_state_dict(checkpoint['state_dict'])
    conditioning_model = conditioning_model.to(args.device)
    conditioning_model.eval()
    if args.verbose:
        print("=> loaded checkpoint '{}' (epoch {})"
                .format(args.ckpt, checkpoint['epoch']))
        print('num params', num_params(conditioning_model))

    input_texts, conditions, categories = [], [], []

    if args.condition_file is not None:
        with open(args.condition_file, 'r') as rf:
            for line in rf:
                input_texts.append(line.strip().split('\t')[0])
                conditions.append(line.strip().split('\t')[1])
                categories.append(None)
                for cw in conditions[-1].split():
                    assert cw in dataset_info.word2index
    else:
        prefixes = []
        with open(args.prefix_file, 'r') as rf:
            for line in rf:
                prefixes.append(line.strip())
        condition_wordlists = []
        for root, _, files in os.walk(args.wordlist_dir):
            for fname in files:
                words = []
                with open(os.path.join(root, fname), 'r') as rf:
                    for line in rf:
                        word = line.strip()
                        if word in dataset_info.word2index:
                            words.append(word)
                        else:
                            if args.verbose:
                                print('word not found:', word)
                condition_wordlists.append((' '.join(words), fname.split('.')[0]))
        for p in prefixes:
            for c, category in condition_wordlists:
                input_texts.append(p)
                conditions.append(c)
                categories.append(category)

    all_cr = []
    pair_num = 0
    for input_text, condition_words, category in tqdm(zip(input_texts, conditions, categories), total=len(conditions)):
        predict_function = predict
        condition_results = []
        for i in range(0, args.sample_size, args.max_sample_batch):
            num_samples = min(args.max_sample_batch, args.sample_size - i)
            condition_results += predict_function(gpt_model,
                                                  gpt_tokenizer,
                                                  conditioning_model,
                                                  [input_text for _ in range(num_samples)],
                                                  condition_words,
                                                  dataset_info,
                                                  args.precondition_topk,
                                                  args.topk,
                                                  args.length_cutoff,
                                                  condition_lambda=args.condition_lambda,
                                                  device=args.device)
        all_cr.append((input_text, category, condition_results))
        pair_num += 1
        if args.max_pairs > 0 and pair_num >= args.max_pairs:
            break
    with open(args.log_file, 'w') as wf:
        writer = csv.DictWriter(wf, fieldnames=['category', 'input_text', 'generation'])
        writer.writeheader()
        for cr_group in all_cr:
            for cr in cr_group[2]:
                writer.writerow({'category': cr_group[1], 'input_text': cr_group[0], 'generation': cr})


if __name__=='__main__':
    parser = ArgumentParser()

    # DATA
    parser.add_argument('--ckpt', type=str, required=True)
    parser.add_argument('--log_file', type=str, required=True, help='file to write outputs to (csv format)')
    parser.add_argument('--dataset_info', type=str, required=True, help='saved dataset info')
    parser.add_argument('--model_string', type=str, default='gpt2-medium')

    parser.add_argument('--condition_file', type=str, default=None, help='file of inputs and conditions')
    parser.add_argument('--prefix_file', type=str, default=None, help='prefix set')
    parser.add_argument('--wordlist_dir', type=str, default=None, help='dir of bow wordlists for categories')
    parser.add_argument('--sample_size', type=int, default=3, help='samples per input text-condition pair')
    parser.add_argument('--max_sample_batch', type=int, default=3, help='max samples at a time')
    parser.add_argument('--max_pairs', type=int, default=-1, help='max input-condition pairs, for debugging quickly')

    parser.add_argument('--precondition_topk', type=int, default=200, help='consider top k outputs from gpt at each step before conditioning and re-pruning')
    parser.add_argument('--topk', type=int, default=10, help='consider top k outputs from gpt at each step')
    parser.add_argument('--condition_lambda', type=float, default=1.0, help='lambda weight on conditioning model')
    parser.add_argument('--length_cutoff', type=int, default=80, help='max length')

    parser.add_argument('--seed', type=int, default=1, help='random seed')
    parser.add_argument('--device', type=str, default='cuda', choices=['cpu', 'cuda'])
    parser.add_argument('--debug', action='store_true', default=False)
    parser.add_argument('--verbose', action='store_true', default=False)

    args = parser.parse_args()

    assert (args.condition_file is not None) != (args.prefix_file is not None and args.wordlist_dir is not None) # one of two interfaces for specifying

    random.seed(args.seed)
    np.random.seed(args.seed)
    torch.manual_seed(args.seed)

    main(args)
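This script writes the CSV that eval_topic_metrics.py consumes; a self-contained round-trip showing that contract (row values are made up):

import csv, io

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=['category', 'input_text', 'generation'])
writer.writeheader()
writer.writerow({'category': 'science', 'input_text': 'The news',
                 'generation': 'The news about the experiment spread quickly'})
buf.seek(0)
rows = list(csv.DictReader(buf))
print(rows[0]['category'], rows[0]['generation'])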
naacl-2021-fudge-controlled-generation/formality_data/README.md
ADDED
@@ -0,0 +1,2 @@
`fisher_test_oracle.es` is the source-side Spanish test set.
`test.noid.cleaned_0` and `test.noid.cleaned_1` are Salesky et al. (2019)'s fluent English test-time references.
naacl-2021-fudge-controlled-generation/formality_data/fisher_test_oracle.es
ADDED
The diff for this file is too large to render.
See raw diff
naacl-2021-fudge-controlled-generation/formality_data/test.noid.cleaned_0
ADDED
The diff for this file is too large to render.
See raw diff
naacl-2021-fudge-controlled-generation/formality_data/test.noid.cleaned_1
ADDED
The diff for this file is too large to render.
See raw diff
naacl-2021-fudge-controlled-generation/main.py
ADDED
@@ -0,0 +1,192 @@
import os
import random
import time
import pickle
import math
from argparse import ArgumentParser

from tqdm import tqdm
import numpy as np
import torch
import torch.nn as nn

from data import Dataset
from model import Model
from util import save_checkpoint, ProgressMeter, AverageMeter, num_params, pad_mask
from constants import *


def train(model, dataset, optimizer, criterion, epoch, args, data_start_index):
    model.train()
    if data_start_index == 0:
        dataset.shuffle('train', seed=epoch + args.seed)
    if args.epoch_max_len is not None:
        data_end_index = min(data_start_index + args.epoch_max_len, len(dataset.splits['train']))
        loader = dataset.loader('train', num_workers=args.num_workers, indices=list(range(data_start_index, data_end_index)))
        data_start_index = data_end_index if data_end_index < len(dataset.splits['train']) else 0
    else:
        loader = dataset.loader('train', num_workers=args.num_workers)
    loss_meter = AverageMeter('loss', ':6.4f')
    total_length = len(loader)
    progress = ProgressMeter(total_length, [loss_meter], prefix='Training: ')
    for batch_num, batch in enumerate(tqdm(loader, total=len(loader))):
        batch = [tensor.to(args.device) for tensor in batch]
        inputs, lengths, future_words, log_probs, labels, classification_targets, syllables_to_go, future_word_num_syllables, rhyme_group_index = batch
        if args.task not in ['formality', 'iambic']:
            if not args.debug and len(inputs) != args.batch_size: # it'll screw up the bias...?
                continue
        scores = model(inputs, lengths, future_words, log_probs, syllables_to_go, future_word_num_syllables, rhyme_group_index, run_classifier=True)
        if args.task == 'formality': # we're learning for all positions at once. scores are batch x seq
            expanded_labels = classification_targets.unsqueeze(1).expand(-1, scores.shape[1]) # batch x seq
            length_mask = pad_mask(lengths).permute(1, 0) # batch x seq
            loss = criterion(scores.flatten()[length_mask.flatten()==1], expanded_labels.flatten().float()[length_mask.flatten()==1])
        elif args.task in ['iambic', 'newline']:
            use_indices = classification_targets.flatten() != -1
            loss = criterion(scores.flatten()[use_indices], classification_targets.flatten().float()[use_indices])
        else: # topic, rhyme
            loss = criterion(scores.flatten(), labels.flatten().float())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        loss_meter.update(loss.detach(), len(labels))
        if batch_num % args.train_print_freq == 0:
            progress.display(batch_num)
    progress.display(total_length)
    return data_start_index


def validate(model, dataset, criterion, epoch, args):
    model.eval()
    random.seed(0)
    loader = dataset.loader('val', num_workers=args.num_workers)
    loss_meter = AverageMeter('loss', ':6.4f')
    total_length = len(loader)
    progress = ProgressMeter(total_length, [loss_meter], prefix='Validation: ')
    with torch.no_grad():
        for batch_num, batch in enumerate(tqdm(loader, total=len(loader))):
            batch = [tensor.to(args.device) for tensor in batch]
            inputs, lengths, future_words, log_probs, labels, classification_targets, syllables_to_go, future_word_num_syllables, rhyme_group_index = batch
            if args.task not in ['formality', 'iambic']: # topic predictor
                if not args.debug and len(inputs) != args.batch_size:
                    continue
            scores = model(inputs, lengths, future_words, log_probs, syllables_to_go, future_word_num_syllables, rhyme_group_index, run_classifier=True)
            if args.task == 'formality': # we're learning for all positions at once. scores are batch x seq
                expanded_labels = classification_targets.unsqueeze(1).expand(-1, scores.shape[1]) # batch x seq
                length_mask = pad_mask(lengths).permute(1, 0) # batch x seq
                loss = criterion(scores.flatten()[length_mask.flatten()==1], expanded_labels.flatten().float()[length_mask.flatten()==1])
            elif args.task in ['iambic', 'newline']:
                use_indices = classification_targets.flatten() != -1
                loss = criterion(scores.flatten()[use_indices], classification_targets.flatten().float()[use_indices])
            else: # topic, rhyme
                loss = criterion(scores.flatten(), labels.flatten().float())
            loss_meter.update(loss.detach(), len(labels))
            if batch_num % args.train_print_freq == 0:
                progress.display(batch_num)
    progress.display(total_length)
    return loss_meter.avg


def main(args):
    dataset = Dataset(args)
    os.makedirs(args.save_dir, exist_ok=True)
    with open(os.path.join(args.save_dir, 'dataset_info'), 'wb') as wf:
        pickle.dump(dataset.dataset_info, wf)
    if args.task == 'rhyme':
        with open(os.path.join(args.save_dir, 'rhyme_info'), 'wb') as wf:
            pickle.dump(dataset.rhyme_info, wf)
    if args.ckpt:
        checkpoint = torch.load(args.ckpt, map_location=args.device)
        start_epoch = checkpoint['epoch'] + 1
        best_val_metric = checkpoint['best_metric']
        model_args = checkpoint['args']
        model = Model(model_args, dataset.gpt_pad_id, len(dataset.index2word), rhyme_group_size=len(dataset.index2rhyme_group) if args.task == 'rhyme' else None) # no need to get the glove embeddings when reloading since they're saved in model ckpt anyway
        model.load_state_dict(checkpoint['state_dict'])
        model = model.to(args.device)
        optimizer = torch.optim.Adam(model.parameters(), lr=model_args.lr)
        optimizer.load_state_dict(checkpoint['optimizer'])
        data_start_index = checkpoint['data_start_index']
        print("=> loaded checkpoint '{}' (epoch {})"
                .format(args.ckpt, checkpoint['epoch']))
        # NOTE: just import pdb after loading the model here if you want to play with it, it's easy
        # model.eval()
        # import pdb; pdb.set_trace()
    else:
        model = Model(args, dataset.gpt_pad_id, len(dataset.index2word), rhyme_group_size=len(dataset.index2rhyme_group) if args.task == 'rhyme' else None, glove_embeddings=dataset.glove_embeddings)
        model = model.to(args.device)
        optimizer = torch.optim.Adam(model.parameters(), lr=args.lr)
        best_val_metric = 1e8 # lower is better for BCE
        data_start_index = 0
    print('num params', num_params(model))
    criterion = nn.BCEWithLogitsLoss().to(args.device)

    if args.evaluate:
        epoch = 0
        validate(model, dataset, criterion, epoch, args)
        return
    for epoch in range(args.epochs):
        print("TRAINING: Epoch {} at {}".format(epoch, time.ctime()))
        data_start_index = train(model, dataset, optimizer, criterion, epoch, args, data_start_index)
        if epoch % args.validation_freq == 0:
            print("VALIDATION: Epoch {} at {}".format(epoch, time.ctime()))
            metric = validate(model, dataset, criterion, epoch, args)

            if not args.debug:
                if metric < best_val_metric:
                    print('new best val metric', metric)
                    best_val_metric = metric
                    save_checkpoint({
                        'epoch': epoch,
                        'state_dict': model.state_dict(),
                        'best_metric': best_val_metric,
                        'optimizer': optimizer.state_dict(),
                        'data_start_index': data_start_index,
                        'args': args
                    }, os.path.join(args.save_dir, 'model_best.pth.tar'))
                save_checkpoint({
                    'epoch': epoch,
                    'state_dict': model.state_dict(),
                    'best_metric': metric,
                    'optimizer': optimizer.state_dict(),
                    'data_start_index': data_start_index,
                    'args': args
                }, os.path.join(args.save_dir, 'model_epoch' + str(epoch) + '.pth.tar'))


if __name__=='__main__':
    parser = ArgumentParser()

    # DATA
    parser.add_argument('--task', type=str, required=True, choices=['iambic', 'rhyme', 'newline', 'topic', 'formality', 'clickbait'])
    parser.add_argument('--data_dir', type=str, required=True)
    parser.add_argument('--glove_file', type=str, help='glove embedding init, for topic task')

    # SAVE/LOAD
    parser.add_argument('--save_dir', type=str, required=True, help='where to save ckpts')
    parser.add_argument('--ckpt', type=str, default=None, help='load ckpt from file if given')
    parser.add_argument('--dataset_info', type=str, help='saved dataset info')
    parser.add_argument('--rhyme_info', type=str, help='saved dataset rhyme info, for a ckpt with task==rhyme')

    # TRAINING
    parser.add_argument('--batch_size', type=int, default=128)
    parser.add_argument('--epochs', type=int, default=100)
    parser.add_argument('--epoch_max_len', type=int, default=None, help='max batches per epoch if set, for more frequent validation')
    parser.add_argument('--validation_freq', type=int, default=1, help='validate every X epochs')
    parser.add_argument('--lr', type=float, default=1e-3, help='Adam learning rate')
    parser.add_argument('--seed', type=int, default=1, help='random seed')
    parser.add_argument('--device', type=str, default='cuda', choices=['cpu', 'cuda'])
    parser.add_argument('--num_workers', type=int, default=20, help='num workers for data loader')
    parser.add_argument('--evaluate', action='store_true', default=False)
    parser.add_argument('--debug', action='store_true', default=False)

    # PRINTING
    parser.add_argument('--train_print_freq', type=int, default=100, help='how often to print metrics (every X batches)')

    args = parser.parse_args()

    random.seed(args.seed)
    np.random.seed(args.seed)
    torch.manual_seed(args.seed)
    if args.evaluate:
        assert args.ckpt is not None

    main(args)
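The formality branch trains one logit per prefix length and masks out pad positions before the BCE loss. A toy version of that computation, with made-up shapes and the mask built inline instead of via util.pad_mask:

import torch
import torch.nn as nn

scores = torch.randn(2, 5)                     # batch x seq logits
classification_targets = torch.tensor([1, 0])  # one label per sentence
lengths = torch.tensor([5, 3])
length_mask = (torch.arange(5).unsqueeze(0) < lengths.unsqueeze(1)).float()  # batch x seq
expanded_labels = classification_targets.unsqueeze(1).expand(-1, 5).float()
criterion = nn.BCEWithLogitsLoss()
loss = criterion(scores.flatten()[length_mask.flatten() == 1],
                 expanded_labels.flatten()[length_mask.flatten() == 1])
print(loss)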
naacl-2021-fudge-controlled-generation/model.py
ADDED
@@ -0,0 +1,182 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
import math

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.utils.rnn import pad_sequence, pad_packed_sequence, pack_padded_sequence
from transformers import AutoTokenizer, AutoModelWithLMHead, pipeline, set_seed, GPT2Tokenizer, GPT2Model, GPT2LMHeadModel, GPT2Config, GPT2ForSequenceClassification, MarianTokenizer

from constants import *
from util import pad_mask
from clickbait_classifier import BertClickbaitClassifier, ClickbaitConfig

class Model(nn.Module):
    def __init__(self, args, gpt_pad_id, vocab_size, rhyme_group_size=None, glove_embeddings=None, verbose=True):
        super(Model, self).__init__()

        # self.topic = args.task == 'topic'
        self.formality = args.task == 'formality'
        self.iambic = args.task == 'iambic'
        self.rhyme = args.task == 'rhyme'
        self.newline = args.task == 'newline'
        self.clickbait = args.task == 'clickbait'
        # if self.topic:
        #     self.gpt_embed = nn.Embedding(gpt_pad_id + 1, HIDDEN_DIM, padding_idx=gpt_pad_id) # these are subwords, not words
        #     if glove_embeddings is None:
        #         if verbose:
        #             print('initializing word embeddings from scratch')
        #         self.word_embed = nn.Embedding(vocab_size, GLOVE_DIM, padding_idx=0)
        #     else:
        #         if verbose:
        #             print('initializing word embeddings from glove')
        #         self.word_embed = nn.Embedding.from_pretrained(glove_embeddings, padding_idx=0)
        #     self.rnn = nn.LSTM(HIDDEN_DIM, RNN_DIM, num_layers=3, bidirectional=True)
        #     self.attention_linear = nn.Linear(HIDDEN_DIM, HIDDEN_DIM)
        #     large_hidden_dim = HIDDEN_DIM
        #     self.embed_key_linear = nn.Linear(large_hidden_dim, HIDDEN_DIM)
        #     self.attention_value_linear = nn.Linear(HIDDEN_DIM, HIDDEN_DIM)
        #     self.out_embed_linear = nn.Linear(HIDDEN_DIM, HIDDEN_DIM)
        #     self.out_linear = nn.Linear(HIDDEN_DIM, HIDDEN_DIM)
        #     self.out_linear2 = nn.Linear(HIDDEN_DIM + large_hidden_dim, HIDDEN_DIM)
        #     self.out_linear3 = nn.Linear(HIDDEN_DIM, 1)
        #     self.nonlinear = nn.ReLU()
        # elif self.formality:
        if self.formality:
            self.marian_embed = nn.Embedding(gpt_pad_id + 1, HIDDEN_DIM, padding_idx=0) # 0 in marian is ''
            self.rnn = nn.LSTM(HIDDEN_DIM, HIDDEN_DIM, num_layers=3, bidirectional=False, dropout=0.5) # want it to be causal so we can learn all positions
            self.out_linear = nn.Linear(HIDDEN_DIM, 1)
        elif self.iambic:
            self.gpt_embed = nn.Embedding(gpt_pad_id + 1, HIDDEN_DIM, padding_idx=gpt_pad_id)
            self.rnn = nn.LSTM(HIDDEN_DIM, HIDDEN_DIM, num_layers=3, bidirectional=False, dropout=0) # want it to be causal so we can learn all positions
            self.out_linear = nn.Linear(HIDDEN_DIM, 1)
        elif self.rhyme:
            self.gpt_embed = nn.Embedding(gpt_pad_id + 1, HIDDEN_DIM, padding_idx=gpt_pad_id) # these are subwords, not words
            self.word_embed = nn.Embedding(rhyme_group_size+1, GLOVE_DIM, padding_idx=0) # this embedding for future words will actually embed the rhyme group idx
            self.rnn = nn.LSTM(HIDDEN_DIM, RNN_DIM, num_layers=3, bidirectional=True)
            self.attention_linear = nn.Linear(HIDDEN_DIM, HIDDEN_DIM)
            large_hidden_dim = HIDDEN_DIM + COUNT_SYLLABLE_DIM
            self.embed_key_linear = nn.Linear(large_hidden_dim, HIDDEN_DIM)
            self.attention_value_linear = nn.Linear(HIDDEN_DIM, HIDDEN_DIM)
            self.out_embed_linear = nn.Linear(HIDDEN_DIM, HIDDEN_DIM)
            self.out_linear = nn.Linear(HIDDEN_DIM, HIDDEN_DIM)
            self.out_linear2 = nn.Linear(HIDDEN_DIM + large_hidden_dim, HIDDEN_DIM)
            self.out_linear3 = nn.Linear(HIDDEN_DIM, 1)
            self.count_syllable_embed = nn.Embedding(MAX_COUNT_SYLLABLE_DIST+1, COUNT_SYLLABLE_DIM)
            self.nonlinear = nn.ReLU()
        elif self.newline:
            self.gpt_embed = nn.Embedding(gpt_pad_id + 1, HIDDEN_DIM, padding_idx=gpt_pad_id) # these are subwords, not words
            self.rnn = nn.LSTM(HIDDEN_DIM, HIDDEN_DIM, num_layers=3, bidirectional=False)
            self.count_syllable_embed = nn.Embedding(MAX_COUNT_SYLLABLE_DIST+1, COUNT_SYLLABLE_DIM)
            self.out_linear = nn.Linear(HIDDEN_DIM + COUNT_SYLLABLE_DIM, HIDDEN_DIM)
            self.out_linear2 = nn.Linear(HIDDEN_DIM, HIDDEN_DIM)
            self.out_linear3 = nn.Linear(HIDDEN_DIM, 1)
            self.nonlinear = nn.ReLU()
        elif self.clickbait:
            # mpnet_config = ClickbaitConfig(
            #     model_type="mpnet",
            #     pretrained_model="sentence-transformers/all-mpnet-base-v2",
            #     num_labels=1,
            #     dropout=0.2,
            #     inner_dim1=256,
            #     inner_dim2=32,
            #     max_length=25,
            #     load_pretrained=True,
            #     freeze_bert=False,
            # )
            # TODO add a checkpoint to Classifier
            # print('add a checkpoint to Classifier')
            checkpoint = args.checkpoint # e.g. 'ckpt/clickbait_classifier/checkpoint-1464'
            # self.classifier = BertClickbaitClassifier(config=mpnet_config).to(torch.device(args.device))
            self.classifier = BertClickbaitClassifier.from_pretrained(checkpoint).to(torch.device(args.device))
        else:
            raise NotImplementedError # TODO honestly this can/should be refactored into different models


    def forward(self, inputs, lengths=None, future_words=None, log_probs=None, syllables_to_go=None, future_word_num_syllables=None, rhyme_group_index=None, run_classifier=False, attention_mask=None):
        """
        inputs: token ids, batch x seq, right-padded with 0s
        lengths: lengths of inputs; batch
        future_words: batch x N words to check if not predict next token, else batch
        log_probs: N
        syllables_to_go: batch
        """
        # if self.topic:
        #     inputs = self.gpt_embed(inputs) # batch x seq x 300
        #     inputs = pack_padded_sequence(inputs.permute(1, 0, 2), lengths.cpu(), enforce_sorted=False)
        #     rnn_output, _ = self.rnn(inputs)
        #     rnn_output, _ = pad_packed_sequence(rnn_output)
        #     rnn_output = rnn_output.permute(1, 0, 2) # batch x seq x 300
        #     hidden = rnn_output
        #     attention_mask = pad_mask(lengths).permute(1, 0) # batch x seq
        #     embed = self.word_embed(future_words) # batch x N x 300
        #     embed_query = self.embed_key_linear(embed)
        #     attention_tensor = self.attention_linear(hidden).unsqueeze(2) * embed_query.unsqueeze(1) # batch x seq x N x 300
        #     attention_weights = F.softmax(attention_tensor.sum(dim=3), dim=1) # batch x seq x N
        #     attention_weights = attention_weights * attention_mask.unsqueeze(2)
        #     hidden = self.attention_value_linear(hidden)
        #     weighted_hidden = (hidden.unsqueeze(2) * attention_weights.unsqueeze(3)).sum(dim=1) # batch x seq x N x 768 -> batch x N x 768
        #     unnormalized_scores = (self.out_linear(weighted_hidden) * self.out_embed_linear(embed)) # batch x N x 300
        #     unnormalized_scores = torch.cat([unnormalized_scores, embed], dim=2)
        #     unnormalized_scores = self.nonlinear(self.out_linear2(self.nonlinear(unnormalized_scores)))
        #     unnormalized_scores = self.out_linear3(unnormalized_scores)
        #     scores = unnormalized_scores.squeeze(2) - log_probs.unsqueeze(0)
        #     return scores # batch x N of normalized scores or batch x
        # elif self.formality:
        if self.formality:
            inputs = self.marian_embed(inputs)
            inputs = pack_padded_sequence(inputs.permute(1, 0, 2), lengths.cpu(), enforce_sorted=False)
            rnn_output, _ = self.rnn(inputs)
            rnn_output, _ = pad_packed_sequence(rnn_output)
            rnn_output = rnn_output.permute(1, 0, 2) # batch x seq x 300
            return self.out_linear(rnn_output).squeeze(2)
        elif self.iambic:
            inputs = self.gpt_embed(inputs)
            inputs = pack_padded_sequence(inputs.permute(1, 0, 2), lengths.cpu(), enforce_sorted=False)
            rnn_output, _ = self.rnn(inputs)
            rnn_output, _ = pad_packed_sequence(rnn_output)
            rnn_output = rnn_output.permute(1, 0, 2) # batch x seq x 300
            return self.out_linear(rnn_output).squeeze(2)
        elif self.rhyme:
            inputs = self.gpt_embed(inputs) # batch x seq x 300
            inputs = pack_padded_sequence(inputs.permute(1, 0, 2), lengths.cpu(), enforce_sorted=False)
            rnn_output, _ = self.rnn(inputs)
            rnn_output, _ = pad_packed_sequence(rnn_output)
            rnn_output = rnn_output.permute(1, 0, 2) # batch x seq x 300
            hidden = rnn_output
            attention_mask = pad_mask(lengths).permute(1, 0) # batch x seq
            embed = self.word_embed(future_words) # batch x N x 300
            embedded_syllables_to_go = self.count_syllable_embed(syllables_to_go).unsqueeze(1).expand(-1, embed.shape[1], -1) # batch x N x 100
            auxiliary_embed = embedded_syllables_to_go
            embed_query = self.embed_key_linear(torch.cat([embed, auxiliary_embed], dim=2))
            attention_tensor = self.attention_linear(hidden).unsqueeze(2) * embed_query.unsqueeze(1) # batch x seq x N x 300
            attention_weights = F.softmax(attention_tensor.sum(dim=3), dim=1) # batch x seq x N
            attention_weights = attention_weights * attention_mask.unsqueeze(2)
            hidden = self.attention_value_linear(hidden)
            weighted_hidden = (hidden.unsqueeze(2) * attention_weights.unsqueeze(3)).sum(dim=1) # batch x seq x N x 768 -> batch x N x 768
            unnormalized_scores = (self.out_linear(weighted_hidden) * self.out_embed_linear(embed)) # batch x N x 300
            unnormalized_scores = torch.cat([unnormalized_scores, embed, auxiliary_embed], dim=2)
            unnormalized_scores = self.nonlinear(self.out_linear2(self.nonlinear(unnormalized_scores)))
            unnormalized_scores = self.out_linear3(unnormalized_scores)
            scores = unnormalized_scores.squeeze(2) - log_probs.unsqueeze(0)
            return scores # batch x N of normalized scores or batch x
        elif self.newline:
            inputs = self.gpt_embed(inputs) # batch x seq x 300
            inputs = pack_padded_sequence(inputs.permute(1, 0, 2), lengths.cpu(), enforce_sorted=False)
            rnn_output, _ = self.rnn(inputs)
            rnn_output, _ = pad_packed_sequence(rnn_output)
            rnn_output = rnn_output.permute(1, 0, 2) # batch x seq x 300
            hidden = torch.cat([rnn_output, self.count_syllable_embed(syllables_to_go).unsqueeze(1).expand(-1, rnn_output.shape[1], -1)], dim=2)
            return self.out_linear3(self.nonlinear(self.out_linear2(self.nonlinear(self.out_linear(hidden))))).squeeze(2)
        elif self.clickbait:
            input_ids = inputs if torch.is_tensor(inputs) else torch.tensor(inputs) # inputs may already be a tensor
            classifier_output = self.classifier(input_ids=input_ids, attention_mask=attention_mask).logits
            classifier_output = classifier_output[None, :, :] # add a leading batch dim: 1 x batch*topk x num_labels
            # return self.out_linear(rnn_output).squeeze(2)
            return classifier_output.squeeze(2)
        else:
            raise NotImplementedError
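
The prediction scripts later in this commit (`predict_clickbait.py`, `predict_formality.py`) plug these conditioning models into the same FUDGE decoding step: take the base model's top-k next-token logits, score each candidate continuation with the conditioning model, map those scores to attribute log-probabilities, and add them with weight `condition_lambda` before renormalizing. Below is a minimal self-contained sketch of that step, with random tensors standing in for the real model outputs; the names `precondition_topk` and `condition_lambda` mirror the script flags, and this is illustrative rather than the scripts' exact code.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
batch_size, vocab_size = 2, 1000
precondition_topk, condition_lambda = 200, 1.0

# stand-ins for real model outputs
lm_logits = torch.randn(batch_size, vocab_size)                     # base LM next-token logits
top_logits, top_indices = lm_logits.topk(precondition_topk, dim=1)  # batch x topk
discriminator_logits = torch.randn(batch_size, precondition_topk)   # conditioning model score per candidate

# log p(attribute | candidate prefix): binary-classifier log-sigmoid
condition_logprobs = F.logsigmoid(discriminator_logits)

full_logits = top_logits + condition_lambda * condition_logprobs    # batch x topk
probs = F.softmax(full_logits, dim=1)                               # renormalize over the top-k
next_token = top_indices[torch.arange(batch_size), torch.multinomial(probs, 1).flatten()]
print(next_token.shape)  # torch.Size([2])
```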
naacl-2021-fudge-controlled-generation/poetry_data/README.md
ADDED
@@ -0,0 +1 @@
`couplet_prefixes.txt` contains the 13th line of each of Shakespeare's sonnets. `couplet_ends.txt` contains the 14th. (Each 14-line sonnet ends with a couplet in the last two lines.) The prefixes are our test set prefixes for the couplet completion task; the ends are Shakespeare's own completions, which serve as the reference outputs.
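
A minimal sketch of how the two files pair up, assuming it is run from this directory with `poetry_util.py` (added later in this commit) importable; the file names are the ones above, everything else is illustrative:

```python
from poetry_util import perfect_rhyme_end  # helper added in this commit

with open('couplet_prefixes.txt') as f:
    prefixes = [line.strip() for line in f]
with open('couplet_ends.txt') as f:
    ends = [line.strip() for line in f]
assert len(prefixes) == len(ends) == 154  # one couplet per sonnet

# count how many of Shakespeare's own endings the perfect-rhyme check recognizes
rhymed = sum(perfect_rhyme_end(p, e) for p, e in zip(prefixes, ends))
print(f'{rhymed}/154 gold couplets pass the perfect-rhyme check')
```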
naacl-2021-fudge-controlled-generation/poetry_data/couplet_ends.txt
ADDED
@@ -0,0 +1,154 @@
To eat the world's due, by the grave and thee.
And see thy blood warm when thou feel'st it cold.
Die single, and thine image dies with thee.
Which, used, lives th' executor to be.
Leese but their show; their substance still lives sweet.
To be death's conquest and make worms thine heir.
Unlook'd on diest, unless thou get a son.
Sings this to thee: 'thou single wilt prove none.'
That on himself such murderous shame commits.
That beauty still may live in thine or thee.
Thou shouldst print more, not let that copy die.
Save breed, to brave him when he takes thee hence.
You had a father: let your son say so.
Thy end is truth's and beauty's doom and date.
As he takes from you, I engraft you new.
And you must live, drawn by your own sweet skill.
You should live twice; in it and in my rhyme.
So long lives this and this gives life to thee.
My love shall in my verse ever live young.
Mine be thy love and thy love's use their treasure.
I will not praise that purpose not to sell.
Thou gavest me thine, not to give back again.
To hear with eyes belongs to love's fine wit.
They draw but what they see, know not the heart.
Where I may not remove nor be removed.
Till then not show my head where thou mayst prove me.
For thee and for myself no quiet find.
And night doth nightly make grief's strength seem stronger.
That then I scorn to change my state with kings.
All losses are restored and sorrows end.
And thou, all they, hast all the all of me.
Theirs for their style I'll read, his for his love.'
Suns of the world may stain when heaven's sun staineth.
And they are rich and ransom all ill deeds.
To that sweet thief which sourly robs from me.
As, thou being mine, mine is thy good report.
This wish I have; then ten times happy me!
The pain be mine, but thine shall be the praise.
By praising him here who doth hence remain!
Kill me with spites; yet we must not be foes.
Thine, by thy beauty being false to me.
Sweet flattery! then she loves but me alone.
And nights bright days when dreams do show thee me.
But heavy tears, badges of either's woe.
I send them back again and straight grow sad.
And my heart's right thy inward love of heart.
Awakes my heart to heart's and eye's delight.
For truth proves thievish for a prize so dear.
Since why to love I can allege no cause.
My grief lies onward and my joy behind.
Towards thee I'll run, and give him leave to go.
Being had, to triumph, being lack'd, to hope.
But you like none, none you, for constant heart.
When that shall fade, my verse distills your truth.
You live in this, and dwell in lover's eyes.
Makes summer's welcome thrice more wish'd, more rare.
Though you do any thing, he thinks no ill.
Not blame your pleasure, be it ill or well.
To subjects worse have given admiring praise.
Praising thy worth, despite his cruel hand.
From me far off, with others all too near.
Painting my age with beauty of thy days.
And they shall live, and he in them still green.
But weep to have that which it fears to lose.
That in black ink my love may still shine bright.
Save that, to die, I leave my love alone.
In days long since, before these last so bad.
To show false Art what beauty was of yore.
The solve is this, that thou dost common grow.
Then thou alone kingdoms of hearts shouldst owe.
And mock you with me after I am gone.
And so should you, to love things nothing worth.
To love that well which thou must leave ere long.
And that is this, and this with thee remains.
Or gluttoning on all, or all away.
So is my love still telling what is told.
Shall profit thee and much enrich thy book.
As high as learning my rude ignorance.
Since what he owes thee thou thyself dost pay.
The worst was this; my love was my decay.
Where breath most breathes, even in the mouths of men.
Where cheeks need blood; in thee it is abused.
Than both your poets can in praise devise.
Being fond on praise, which makes your praises worse.
Me for my dumb thoughts, speaking in effect.
Then lack'd I matter; that enfeebled mine.
In sleep a king, but waking no such matter.
That for thy right myself will bear all wrong.
For I must ne'er love him whom thou dost hate.
Compared with loss of thee will not seem so.
All this away and me most wretched make.
Thou mayst be false, and yet I know it not.
if thy sweet virtue answer not thy show!
Lilies that fester smell far worse than weeds.
The hardest knife ill-used doth lose his edge.
As, thou being mine, mine is thy good report.
That leaves look pale, dreading the winter's near.
As with your shadow I with these did play:
But sweet or colour it had stol'n from thee.
So thou prevent'st his scythe and crooked knife.
To make him seem long hence as he shows now.
Because I would not dull you with my song.
Your own glass shows you when you look in it.
Ere you were born was beauty's summer dead.
Which three till now never kept seat in one.
Had eyes to wonder, but lack tongues to praise.
When tyrants' crests and tombs of brass are spent.
Where time and outward form would show it dead.
Save thou, my rose; in it thou art my all.
Even to thy pure and most most loving breast.
Even that your pity is enough to cure me.
That all the world besides methinks are dead.
My most true mind thus makes mine eye untrue.
That mine eye loves it and doth first begin.
To give full growth to that which still doth grow?
I never writ, nor no man ever loved.
The constancy and virtue of your love.
Drugs poison him that so fell sick of you.
And gain by ill thrice more than I have spent.
Mine ransoms yours, and yours must ransom me.
All men are bad, and in their badness reign.
Were to import forgetfulness in me.
I will be true, despite thy scythe and thee.
Which die for goodness, who have lived for crime.
When most impeach'd stands least in thy control.
And her quietus is to render thee.
That every tongue says beauty should look so.
Give them thy fingers, me thy lips to kiss.
To shun the heaven that leads men to this hell.
As any she belied with false compare.
And thence this slander, as I think, proceeds.
And all they foul that thy complexion lack.
Perforce am thine, and all that is in me.
He pays the whole, and yet am I not free.
Think all but one, and me in that one 'Will.'
And then thou lovest me, for my name is 'Will.'
And to this false plague are they now transferr'd.
And in our faults by lies we flatter'd be.
Kill me outright with looks and rid my pain.
Bear thine eyes straight, though thy proud heart go wide.
That she that makes me sin awards me pain.
By self-example mayst thou be denied!
If thou turn back, and my loud crying still.
Till my bad angel fire my good one out.
And saved my life, saying 'not you.'
And Death once dead, there's no more dying then.
Who art as black as hell, as dark as night.
Lest eyes well-seeing thy foul faults should find.
Those that can see thou lovest, and I am blind.
More worthy I to be beloved of thee.
Her 'love' for whose dear love I rise and fall.
To swear against the truth so foul a lie!
Where Cupid got new fire--my mistress' eyes.
Love's fire heats water, water cools not love.
naacl-2021-fudge-controlled-generation/poetry_data/couplet_prefixes.txt
ADDED
@@ -0,0 +1,154 @@
Pity the world, or else this glutton be,
This were to be new made when thou art old,
But if thou live, remember'd not to be,
Thy unused beauty must be tomb'd with thee,
But flowers distill'd though they with winter meet,
Be not self-will'd, for thou art much too fair
So thou, thyself out-going in thy noon,
Whose speechless song, being many, seeming one,
No love toward others in that bosom sits
Make thee another self, for love of me,
She carved thee for her seal, and meant thereby
And nothing 'gainst Time's scythe can make defence
O, none but unthrifts! Dear my love, you know
Or else of thee this I prognosticate:
And all in war with Time for love of you,
To give away yourself keeps yourself still,
But were some child of yours alive that time,
So long as men can breathe or eyes can see,
Yet, do thy worst, old Time: despite thy wrong,
But since she prick'd thee out for women's pleasure,
Let them say more than like of hearsay well;
Presume not on thy heart when mine is slain;
O, learn to read what silent love hath writ:
Yet eyes this cunning want to grace their art;
Then happy I, that love and am beloved
Then may I dare to boast how I do love thee;
Lo! thus, by day my limbs, by night my mind,
But day doth daily draw my sorrows longer
For thy sweet love remember'd such wealth brings
But if the while I think on thee, dear friend,
Their images I loved I view in thee,
But since he died and poets better prove,
Yet him for this my love no whit disdaineth;
Ah! but those tears are pearl which thy love sheds,
That I an accessary needs must be
But do not so; I love thee in such sort
Look, what is best, that best I wish in thee:
If my slight Muse do please these curious days,
And that thou teachest how to make one twain,
Lascivious grace, in whom all ill well shows,
Hers by thy beauty tempting her to thee,
But here's the joy; my friend and I are one;
All days are nights to see till I see thee,
Receiving nought by elements so slow
This told, I joy; but then no longer glad,
As thus; mine eye's due is thy outward part,
Or, if they sleep, thy picture in my sight
And even thence thou wilt be stol'n, I fear,
To leave poor me thou hast the strength of laws,
For that same groan doth put this in my mind;
Since from thee going he went wilful-slow,
Blessed are you, whose worthiness gives scope,
In all external grace you have some part,
And so of you, beauteous and lovely youth,
So, till the judgment that yourself arise,
Else call it winter, which being full of care
So true a fool is love that in your will,
I am to wait, though waiting so be hell;
O, sure I am, the wits of former days
And yet to times in hope my verse shall stand,
For thee watch I whilst thou dost wake elsewhere,
'Tis thee, myself, that for myself I praise,
His beauty shall in these black lines be seen,
This thought is as a death, which cannot choose
O, none, unless this miracle have might,
Tired with all these, from these would I be gone,
O, him she stores, to show what wealth she had
And him as for a map doth Nature store,
But why thy odour matcheth not thy show,
If some suspect of ill mask'd not thy show,
Lest the wise world should look into your moan
For I am shamed by that which I bring forth,
This thou perceivest, which makes thy love more strong,
The worth of that is that which it contains,
Thus do I pine and surfeit day by day,
For as the sun is daily new and old,
These offices, so oft as thou wilt look,
But thou art all my art and dost advance
Then thank him not for that which he doth say,
Then if he thrive and I be cast away,
You still shall live--such virtue hath my pen--
And their gross painting might be better used
There lives more life in one of your fair eyes
You to your beauteous blessings add a curse,
Then others for the breath of words respect,
But when your countenance fill'd up his line,
Thus have I had thee, as a dream doth flatter,
Such is my love, to thee I so belong,
For thee against myself I'll vow debate,
And other strains of woe, which now seem woe,
Wretched in this alone, that thou mayst take
But what's so blessed-fair that fears no blot?
How like Eve's apple doth thy beauty grow,
For sweetest things turn sourest by their deeds;
Take heed, dear heart, of this large privilege;
But do not so; I love thee in such sort
Or, if they sing, 'tis with so dull a cheer
Yet seem'd it winter still, and, you away,
More flowers I noted, yet I none could see
Give my love fame faster than Time wastes life;
Then do thy office, Muse; I teach thee how
Therefore like her I sometime hold my tongue,
And more, much more, than in my verse can sit
For fear of which, hear this, thou age unbred;
'Fair, kind, and true,' have often lived alone,
For we, which now behold these present days,
And thou in this shalt find thy monument,
Finding the first conceit of love there bred
For nothing this wide universe I call,
Then give me welcome, next my heaven the best,
Pity me then, dear friend, and I assure ye
You are so strongly in my purpose bred
Incapable of more, replete with you,
If it be poison'd, 'tis the lesser sin
Love is a babe; then might I not say so,
If this be error and upon me proved,
Since my appeal says I did strive to prove
But thence I learn, and find the lesson true,
So I return rebuked to my content
But that your trespass now becomes a fee;
Unless this general evil they maintain,
To keep an adjunct to remember thee
This I do vow and this shall ever be;
To this I witness call the fools of time,
Hence, thou suborn'd informer! a true soul
Her audit, though delay'd, answer'd must be,
Yet so they mourn, becoming of their woe,
Since saucy jacks so happy are in this,
All this the world well knows; yet none knows well
And yet, by heaven, I think my love as rare
In nothing art thou black save in thy deeds,
Then will I swear beauty herself is black
And yet thou wilt; for I, being pent in thee,
Him have I lost; thou hast both him and me:
Let no unkind, no fair beseechers kill;
Make but my name thy love, and love that still,
In things right true my heart and eyes have erred,
Therefore I lie with her and she with me,
Yet do not so; but since I am near slain,
That I may not be so, nor thou belied,
Only my plague thus far I count my gain,
If thou dost seek to have what thou dost hide,
So will I pray that thou mayst have thy 'Will,'
Yet this shall I ne'er know, but live in doubt,
'I hate' from hate away she threw,
So shalt thou feed on Death, that feeds on men,
For I have sworn thee fair and thought thee bright,
O cunning Love! with tears thou keep'st me blind,
But, love, hate on, for now I know thy mind;
If thy unworthiness raised love in me,
No want of conscience hold it that I call
For I have sworn thee fair; more perjured I,
But found no cure: the bath for my help lies
Came there for cure, and this by that I prove,
naacl-2021-fudge-controlled-generation/poetry_util.py
ADDED
@@ -0,0 +1,83 @@
import string

import pronouncing
from Phyme import Phyme
phyme = Phyme()

from constants import *

def is_iambic(phrase):
    """
    check that we satisfy iambic meter.
    return 1 if so, otherwise 0.
    definitely an imperfect check...
    if we end up needing to check a word that's not in the CMU dictionary, just return 0.
    """
    meter = ''
    for word in phrase.split():
        word = word.strip().strip(string.punctuation).lower()
        try:
            phones_list = pronouncing.phones_for_word(word)
            stresses = pronouncing.stresses(phones_list[0])
            if len(stresses) == 1:
                if stresses == '1':
                    stresses = '2' # allow ambiguity for 1-syllable words with stress 1
            meter += stresses # just default to the first pronunciation if > 1 given
        except Exception:
            return 0 # word not found
    meter = [int(x) for x in meter]
    even_stresses_full = [meter[i] for i in range(0, len(meter), 2)]
    odd_stresses_full = [meter[i] for i in range(1, len(meter), 2)]
    even_stresses = set(even_stresses_full)
    odd_stresses = set(odd_stresses_full)
    if 0 in odd_stresses:
        return 0
    if 1 in even_stresses:
        return 0
    return 1


def count_syllables(words):
    syllables = 0
    for word in words.split():
        word = word.strip().strip(string.punctuation)
        try:
            phones_list = pronouncing.phones_for_word(word)
            stresses = pronouncing.stresses(phones_list[0])
            syllables += min(MAX_SYLLABLES_PER_WORD, len(stresses))
        except Exception:
            # if we don't know, just do a quick approximation here; it shouldn't come up too often
            syllables += min(MAX_SYLLABLES_PER_WORD, round(len(word) / 3))
    return syllables


def get_rhymes(word):
    # throws an exception if word not in the rhyme dict (rare)
    rhymes = []
    rhyme_dict = phyme.get_perfect_rhymes(word)
    for length_dict in rhyme_dict.values():
        for word in length_dict:
            if '(' in word: # sometimes you have stuff like preferred(1) where they indicate a particular pronunciation
                rhymes.append(word.split('(')[0])
            else:
                rhymes.append(word)
    return sorted(list(set(rhymes)))


def get_rhyme_group(word):
    sorted_rhyme_list = get_rhymes(word)
    return ' '.join(sorted_rhyme_list)


def perfect_rhyme_end(s1, s2):
    ending_word1 = s1.split()[-1].strip(string.punctuation)
    ending_word2 = s2.split()[-1].strip(string.punctuation)
    try:
        return get_rhyme_group(ending_word1) == get_rhyme_group(ending_word2)
    except Exception:
        return False # unknown words

if __name__=='__main__':
    result = is_iambic('Shall I compare thee to a summer day')
    result2 = count_syllables('Shall I compare thee to a summer day')
    import pdb; pdb.set_trace()
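
The `__main__` block above drops into `pdb` for interactive inspection. A non-interactive smoke test along the same lines might look like this (assuming `pronouncing`, `Phyme`, and `constants.py` are available; the expected values in the comments are what these checks should produce with standard CMU dictionary entries, not guaranteed outputs):

```python
from poetry_util import is_iambic, count_syllables, perfect_rhyme_end

print(is_iambic('Shall I compare thee to a summer day'))        # expect 1 with standard CMU stresses
print(count_syllables('Shall I compare thee to a summer day'))  # expect 10
print(perfect_rhyme_end('by the grave and thee', 'and this gives life to thee'))
# same ending word, so True whenever 'thee' is in the rhyme dict
```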
naacl-2021-fudge-controlled-generation/predict_clickbait.py
ADDED
@@ -0,0 +1,199 @@
import os
import random
import time
import pickle
import math
from argparse import ArgumentParser

from typing import Iterable, List, Optional, Tuple

from tqdm import tqdm
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelWithLMHead
from torch import Tensor

from data import Dataset
from model import Model
from util import num_params
from constants import *



tokenizer = AutoTokenizer.from_pretrained('google/pegasus-xsum')
classifier_tokenizer = AutoTokenizer.from_pretrained('sentence-transformers/all-mpnet-base-v2')


def main(args):
    with open(args.dataset_info, 'rb') as rf:
        dataset_info = pickle.load(rf)

    article_content = """Australian actor Guy Pearce will return for the iconic soap Neighbours finale on August 1 to reprise his role as Mike Young.
Guy, 54, played the troubled Mike from 1986 to 1989, and is now set to make a comeback on the show after 33 years, Metro.co.uk reports.
The star's character arcs explored the implications of domestic abuse, student-teacher relationships and dealing with loss of loved ones.
Speaking to Metro.co.uk, Guy said: 'It is very exciting and surreal at the same time being back on set again, however it feels like coming home.
'It's where it all started for me professionally. I've been asked to come back on occasions over the years and wondered if it was the right thing
to do, but once I knew the show was finishing, I knew I had to do it.'He added that there is 'nothing like being here all together again'
, even though he's had a chance to catch-up with other cast members."""

    tokenizer.add_special_tokens({'pad_token': PAD_TOKEN})
    pad_id = tokenizer.encode(PAD_TOKEN)[0]

    # for loading the clickbait summarizer
    model = AutoModelWithLMHead.from_pretrained(args.model_string, return_dict=True).to(args.device)

    model.eval()

    checkpoint = torch.load(args.ckpt, map_location=args.device)
    model_args = checkpoint['args']
    conditioning_model = Model(model_args, pad_id, len(dataset_info.index2word)) # no need to get the glove embeddings when reloading since they're saved in model ckpt anyway
    conditioning_model.load_state_dict(checkpoint['state_dict'])
    conditioning_model = conditioning_model.to(args.device)
    conditioning_model.eval()
    print("=> loaded checkpoint '{}' (epoch {})"
          .format(args.ckpt, checkpoint['epoch']))
    print('num params', num_params(conditioning_model))

    while True:
        results = generate_clickbait(model,
                                     tokenizer,
                                     conditioning_model,
                                     [args.input_text],
                                     dataset_info,
                                     precondition_topk=args.precondition_topk,
                                     do_sample=args.do_sample,
                                     length_cutoff=args.length_cutoff,
                                     condition_lambda=args.condition_lambda,
                                     article_content=article_content,
                                     device=args.device)
        # print(results)
        import pdb; pdb.set_trace()


def generate_clickbait(model,
                       tokenizer,
                       conditioning_model,
                       input_text,
                       dataset_info,
                       precondition_topk,
                       length_cutoff,
                       condition_lambda=1.0,
                       do_sample=False, # accepted for API symmetry with the other predict scripts; this implementation always samples from the renormalized top-k
                       article_content=None,
                       device='cuda'):
    with torch.no_grad():
        batch_size = len(input_text)
        # encoded_input_article = [tokenizer.encode(article_content, return_tensors='pt', add_special_tokens=False).to(device)] # batch x seq
        max_input_length = 512
        encoded_input_article = tokenizer(article_content, return_tensors='pt', add_special_tokens=False, max_length=max_input_length, truncation=True).to(device) # batch x seq
        # encoded_input_article = torch.cat(encoded_input_article, dim=0)
        # attention_mask = encoded_input_article.new_ones(encoded_input_article.shape).to(device)

        # CHANGE
        encoded_input = tokenizer('<pad>', return_tensors='pt', add_special_tokens=False).to(device) # batch x seq
        # encoded_input = tokenizer('<pad>' + input_text[0], return_tensors='pt', add_special_tokens=False).to(device) # batch x seq
        # encoded_input = torch.cat(encoded_input, dim=0)
        encoded_input = encoded_input['input_ids']


        lengths = torch.LongTensor([encoded_input.shape[1]]).to(device)
        # lengths = 1

        past = None
        use_cache = True

        # CHANGE
        # model_kwargs = {'encoder_outputs': model.get_encoder()(encoded_input_article, attention_mask=attention_mask)}
        model_kwargs = {'encoder_outputs': model.get_encoder()(input_ids=encoded_input_article['input_ids'],
                                                               attention_mask=encoded_input_article['attention_mask'],
                                                               return_dict=True,
                                                               output_attentions=False,
                                                               output_hidden_states=False),
                        }

        while lengths.max() < length_cutoff:
            model_inputs = model.prepare_inputs_for_generation(
                input_ids=encoded_input_article['input_ids'],
                decoder_input_ids=encoded_input,
                # past=past,
                attention_mask=encoded_input_article['attention_mask'],
                use_cache=use_cache,
                **model_kwargs
            )

            outputs = model(**model_inputs, return_dict=True)
            logits = outputs.logits[:, -1, :]

            if "past_key_values" in outputs:
                model_kwargs["past"] = outputs.past_key_values

            # logits = model(encoded_input)[0][:, -1, :] # batch x vocab
            top_logits, top_indices = logits.topk(precondition_topk, dim=1) # batch x topk
            new_input_candidates = torch.cat([encoded_input.unsqueeze(1).expand(-1, precondition_topk, -1), top_indices.unsqueeze(2)], dim=2) # batch x topk x seq+1
            expanded_lengths = (lengths + 1).unsqueeze(1).expand(batch_size, precondition_topk) # batch x topk

            if condition_lambda == 0:
                condition_logits = torch.zeros_like(top_logits).float()
                condition_logits = condition_logits.view(batch_size, precondition_topk, -1) # batch x topk x N
            else:
                decoded_outputs = tokenizer.batch_decode(new_input_candidates.view(-1, new_input_candidates.size(-1)), clean_up_tokenization_spaces=False)
                resulting_tokenization = classifier_tokenizer(decoded_outputs, add_special_tokens=False, padding='longest')
                encoded_with_classifier = resulting_tokenization['input_ids']
                attention_mask = torch.tensor(resulting_tokenization['attention_mask']).to(model.device)
                tplus1_candidates_classifier = torch.tensor(encoded_with_classifier).view(batch_size, precondition_topk, -1).to(model.device)

                condition_logits = conditioning_model(tplus1_candidates_classifier.flatten(0, 1), # batch*topk x seq+1
                                                      expanded_lengths.flatten(0, 1), # batch*topk
                                                      None,
                                                      None,
                                                      None,
                                                      attention_mask=attention_mask
                                                      )
                condition_logits = condition_logits.view(batch_size, precondition_topk, -1) # batch x topk x N
                condition_logits = condition_logits - torch.log(1 + torch.exp(condition_logits)) # log(sigmoid(logits)), i.e. correct log probs

            condition_logits = torch.mean(condition_logits, dim=2)
            full_logits = top_logits + condition_logits * condition_lambda # batch x topk
            post_logits, post_indices = full_logits.topk(precondition_topk, dim=1)
            post_probs = F.softmax(post_logits, dim=1)
            # index_into_top_indices = post_indices[torch.arange(batch_size).to(post_indices.device), torch.multinomial(post_probs, 1).flatten()] # batch
            index_into_top_indices = post_indices[:, torch.multinomial(post_probs, 1).flatten()] # batch

            # next_indices = top_indices[torch.arange(batch_size).to(top_indices.device), index_into_top_indices] # batch
            next_indices = top_indices[:, index_into_top_indices] # batch

            # encoded_input = torch.cat([encoded_input, next_indices.unsqueeze(1)], dim=1) # batch x seq+1
            encoded_input = torch.cat([encoded_input, next_indices.squeeze(1)], dim=1)
            lengths = lengths + 1 # batch

        # print(tokenizer.decode(encoded_input[0], add_special_tokens=False))
        return [tokenizer.decode(s) for s in encoded_input]


if __name__=='__main__':
    parser = ArgumentParser()

    # DATA
    parser.add_argument('--ckpt', type=str, required=True)
    parser.add_argument('--dataset_info', type=str, required=True, help='saved dataset info')
    parser.add_argument('--model_string', type=str, default='Helsinki-NLP/opus-mt-es-en')

    parser.add_argument('--input_text', type=str, default=None, required=True, help='text to run pred on') # referenced as args.input_text in main

    parser.add_argument('--precondition_topk', type=int, default=200, help='consider top k outputs from text generation at each step before conditioning and re-pruning')
    parser.add_argument('--do_sample', action='store_true', default=False, help='sample instead of greedy')
    parser.add_argument('--condition_lambda', type=float, default=1.0, help='lambda weight on conditioning model')
    parser.add_argument('--length_cutoff', type=int, default=512, help='max length')

    parser.add_argument('--seed', type=int, default=1, help='random seed')
    parser.add_argument('--device', type=str, default='cuda', choices=['cpu', 'cuda'])
    parser.add_argument('--debug', action='store_true', default=False)

    args = parser.parse_args()

    random.seed(args.seed)
    np.random.seed(args.seed)
    torch.manual_seed(args.seed)

    main(args)
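
Note that this script and `predict_formality.py` below both convert the conditioning model's raw logits to log-probabilities via `logits - log(1 + exp(logits))`. That expression is exactly `log(sigmoid(logits))`, the log-probability a binary classifier assigns to the positive class; a quick check of the identity (`F.logsigmoid` is the numerically stabler spelling):

```python
import torch
import torch.nn.functional as F

x = torch.randn(5)
manual = x - torch.log(1 + torch.exp(x))  # as written in the scripts
assert torch.allclose(manual, F.logsigmoid(x), atol=1e-6)
# log(sigmoid(x)) = log(e^x / (1 + e^x)) = x - log(1 + e^x)
```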
naacl-2021-fudge-controlled-generation/predict_formality.py
ADDED
@@ -0,0 +1,404 @@
import os
import random
import time
import pickle
import math
from argparse import ArgumentParser

from typing import Iterable, List, Optional, Tuple

from tqdm import tqdm
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelWithLMHead, pipeline, set_seed, GPT2Tokenizer, GPT2Model, MarianTokenizer, MarianMTModel
from torch import Tensor

from data import Dataset
from model import Model
from util import save_checkpoint, ProgressMeter, AverageMeter, num_params
from constants import *

def main(args):
    with open(args.dataset_info, 'rb') as rf:
        dataset_info = pickle.load(rf)
    tokenizer = MarianTokenizer.from_pretrained(args.model_string)
    tokenizer.add_special_tokens({'pad_token': PAD_TOKEN})
    pad_id = tokenizer.encode(PAD_TOKEN)[0]
    model = MarianMTModel.from_pretrained(args.model_string, return_dict=True).to(args.device)
    model.eval()

    checkpoint = torch.load(args.ckpt, map_location=args.device)
    model_args = checkpoint['args']
    conditioning_model = Model(model_args, pad_id, len(dataset_info.index2word)) # no need to get the glove embeddings when reloading since they're saved in model ckpt anyway
    conditioning_model.load_state_dict(checkpoint['state_dict'])
    conditioning_model = conditioning_model.to(args.device)
    conditioning_model.eval()
    print("=> loaded checkpoint '{}' (epoch {})"
          .format(args.ckpt, checkpoint['epoch']))
    print('num params', num_params(conditioning_model))

    while True:
        results = predict_formality(model,
                                    tokenizer,
                                    conditioning_model,
                                    [args.input_text],
                                    dataset_info,
                                    precondition_topk=args.precondition_topk,
                                    do_sample=args.do_sample,
                                    length_cutoff=args.length_cutoff,
                                    condition_lambda=args.condition_lambda,
                                    device=args.device)
        print(results)
        import pdb; pdb.set_trace()


def predict_formality(model, tokenizer, conditioning_model, input_text, dataset_info, precondition_topk=200, do_sample=False, length_cutoff=512, condition_lambda=1.0, device='cuda'):
    with torch.no_grad():
        batch_size = len(input_text)

        # assumes initially all same length.
        # encode each x_i, i \in [seq], to token ids
        encoded_input = [tokenizer.encode(it, return_tensors='pt').to(device) for it in input_text] # batch x seq
        encoded_input = torch.cat(encoded_input, dim=0)

        input_ids = torch.LongTensor([[58100]]).to(device) # 58100 = Marian pad id (see pad_token_id below), used as the decoder start token
        cur_len = 1
        max_length = length_cutoff
        min_length = 0
        temperature = 1.0
        top_k = 50
        top_p = 1.0
        repetition_penalty = 1.0
        no_repeat_ngram_size = 0
        bad_words_ids = [[58100]]
        pad_token_id = 58100
        eos_token_id = 0
        effective_batch_size = batch_size
        attention_mask = encoded_input.new_ones(encoded_input.shape)
        use_cache = True
        model_specific_kwargs = {'encoder_outputs': model.get_encoder()(encoded_input, attention_mask=attention_mask)}

        output = _generate_no_beam_search(model,
                                          conditioning_model,
                                          condition_lambda,
                                          precondition_topk,
                                          input_ids,
                                          cur_len,
                                          max_length,
                                          min_length,
                                          do_sample,
                                          temperature,
                                          top_k,
                                          top_p,
                                          repetition_penalty,
                                          no_repeat_ngram_size,
                                          bad_words_ids,
                                          pad_token_id,
                                          eos_token_id,
                                          batch_size,
                                          attention_mask,
                                          use_cache,
                                          model_specific_kwargs)

        return [tokenizer.decode(s[1:]) for s in output] # 1: to drop the pad token


# adapted from transformers/generation_utils.py
# to get our conditioning
def postprocess_next_token_scores(
    model,
    scores,
    input_ids,
    no_repeat_ngram_size,
    bad_words_ids,
    cur_len,
    min_length,
    max_length,
    eos_token_id,
    repetition_penalty,
    batch_size,
    num_beams,
):
    # repetition penalty (from CTRL paper https://arxiv.org/abs/1909.05858)
    if repetition_penalty != 1.0:
        model.enforce_repetition_penalty_(
            scores,
            batch_size,
            num_beams,
            input_ids,
            repetition_penalty,
        )

    # set eos token prob to zero if min_length is not reached
    if eos_token_id is not None and cur_len < min_length:
        scores[:, eos_token_id] = -float("inf")

    if no_repeat_ngram_size > 0:
        # calculate a list of banned tokens to prevent repetitively generating the same ngrams
        num_batch_hypotheses = batch_size * num_beams
        # from fairseq: https://github.com/pytorch/fairseq/blob/a07cb6f40480928c9e0548b737aadd36ee66ac76/fairseq/sequence_generator.py#L345
        banned_batch_tokens = calc_banned_ngram_tokens(
            input_ids, num_batch_hypotheses, no_repeat_ngram_size, cur_len
        )
        for i, banned_tokens in enumerate(banned_batch_tokens):
            scores[i, banned_tokens] = -float("inf")

    if bad_words_ids is not None:
        # Exclude EOS token (already processed)
        bad_words_ids = list(filter(lambda bad_token_seq: bad_token_seq != [eos_token_id], bad_words_ids))
        # calculate a list of banned tokens according to bad words
        banned_tokens = calc_banned_bad_words_ids(input_ids.tolist(), bad_words_ids)
        # Modify the scores in place by setting the banned tokens logits to `-inf`
        set_scores_to_inf_for_banned_tokens(scores, banned_tokens)

    return scores

def calc_banned_ngram_tokens(prev_input_ids: Tensor, num_hypos: int, no_repeat_ngram_size: int, cur_len: int) -> List[List[int]]:
    """Copied from fairseq for no_repeat_ngram in beam_search"""
    if cur_len + 1 < no_repeat_ngram_size:
        # return no banned tokens if we haven't generated no_repeat_ngram_size tokens yet
        return [[] for _ in range(num_hypos)]
    generated_ngrams = [{} for _ in range(num_hypos)]
    for idx in range(num_hypos):
        gen_tokens = prev_input_ids[idx].tolist()
        generated_ngram = generated_ngrams[idx]
        for ngram in zip(*[gen_tokens[i:] for i in range(no_repeat_ngram_size)]):
            prev_ngram_tuple = tuple(ngram[:-1])
            generated_ngram[prev_ngram_tuple] = generated_ngram.get(prev_ngram_tuple, []) + [ngram[-1]]

    def _get_generated_ngrams(hypo_idx):
        # Before decoding the next token, prevent decoding of ngrams that have already appeared
        start_idx = cur_len + 1 - no_repeat_ngram_size
        ngram_idx = tuple(prev_input_ids[hypo_idx, start_idx:cur_len].tolist())
        return generated_ngrams[hypo_idx].get(ngram_idx, [])

    banned_tokens = [_get_generated_ngrams(hypo_idx) for hypo_idx in range(num_hypos)]
    return banned_tokens


def calc_banned_bad_words_ids(prev_input_ids: Iterable[int], bad_words_ids: Iterable[int]) -> Iterable[int]:
    banned_tokens = []

    def _tokens_match(prev_tokens, tokens):
        if len(tokens) == 0:
            # if bad word tokens is just one token always ban it
            return True
        if len(tokens) > len(prev_tokens):
            # if bad word tokens are longer than prev tokens they can't be equal
            return False

        if prev_tokens[-len(tokens):] == tokens:
            # if tokens match
            return True
        else:
            return False

    for prev_input_ids_slice in prev_input_ids:
        banned_tokens_slice = []

        for banned_token_seq in bad_words_ids:
            assert len(banned_token_seq) > 0, "Banned words token sequences {} cannot have an empty list".format(
                bad_words_ids
            )

            if _tokens_match(prev_input_ids_slice, banned_token_seq[:-1]) is False:
                # if tokens do not match continue
                continue

            banned_tokens_slice.append(banned_token_seq[-1])

        banned_tokens.append(banned_tokens_slice)

    return banned_tokens

def set_scores_to_inf_for_banned_tokens(scores: torch.Tensor, banned_tokens: List[List[int]]) -> None:
    """Modifies the scores in place by setting the banned token positions to `-inf`. Banned token is expected to be
    a list of list of banned tokens to ban in the format [[batch index, vocabulary position],...]
    Args:
        scores: logits distribution of shape (batch size, vocabulary size)
        banned_tokens: list of list of tokens to ban of length (batch_size)
    """
    banned_mask_list = []
    for idx, batch_banned_tokens in enumerate(banned_tokens):
        for token in batch_banned_tokens:
            banned_mask_list.append([idx, token])
    if not banned_mask_list:
        return
    banned_mask = torch.LongTensor(banned_mask_list)
    indices = torch.ones(len(banned_mask))
    # A sparse tensor is generated from a list of coordinates: [[0, 1], [0, 2], [2, 0]]. A conversion to dense tensor generates:
    # [ 0 1 1 ]
    # [ 0 0 0 ]
    # [ 1 0 0 ]

    banned_mask = torch.sparse.LongTensor(banned_mask.t(), indices, scores.size()).to(scores.device).to_dense().bool()
    scores.masked_fill_(banned_mask, -float("inf"))

def _generate_no_beam_search(
    model,
    conditioning_model,
    condition_lambda,
    precondition_topk,
    input_ids,
    cur_len,
    max_length,
    min_length,
    do_sample,
    temperature,
    top_k,
    top_p,
    repetition_penalty,
    no_repeat_ngram_size,
    bad_words_ids,
    pad_token_id,
    eos_token_id,
    batch_size,
    attention_mask,
    use_cache,
    model_kwargs,
):
    """Generate sequences for each example without beam search (num_beams == 1).
    All returned sequences are generated independently.
    """
    # length of generated sentences / unfinished sentences
    unfinished_sents = input_ids.new(batch_size).fill_(1)
    sent_lengths = input_ids.new(batch_size).fill_(max_length)
    past = None
    while cur_len < max_length:
        model_inputs = model.prepare_inputs_for_generation(
            input_ids, past=past, attention_mask=attention_mask, use_cache=use_cache, **model_kwargs
        )

        outputs = model(**model_inputs, return_dict=True)
        next_token_logits = outputs.logits[:, -1, :]

        # scores = model.postprocess_next_token_scores(
        #     scores=next_token_logits,
        #     input_ids=input_ids,
        #     no_repeat_ngram_size=no_repeat_ngram_size,
        #     bad_words_ids=bad_words_ids,
        #     cur_len=cur_len,
        #     min_length=min_length,
        #     max_length=max_length,
        #     eos_token_id=eos_token_id,
        #     repetition_penalty=repetition_penalty,
        #     batch_size=batch_size,
        #     num_beams=1,
        # )

        scores = postprocess_next_token_scores(
            model=model,
            scores=next_token_logits,
            input_ids=input_ids,
            no_repeat_ngram_size=no_repeat_ngram_size,
            bad_words_ids=bad_words_ids,
            cur_len=cur_len,
            min_length=min_length,
            max_length=max_length,
            eos_token_id=eos_token_id,
            repetition_penalty=repetition_penalty,
            batch_size=batch_size,
            num_beams=1,
        )

        # if model has past, then set the past variable to speed up decoding
        if "past_key_values" in outputs:
            past = outputs.past_key_values
        elif "mems" in outputs:
            past = outputs.mems

        top_logits, top_indices = scores.topk(precondition_topk, dim=1) # batch x topk
        tplus1_candidates = torch.cat([input_ids.unsqueeze(1).expand(-1, precondition_topk, -1), top_indices.unsqueeze(2)], dim=2)[:, :, 1:] # batch x topk x seq+1, with pad dropped
        expanded_lengths = torch.LongTensor([[cur_len for _ in range(precondition_topk)] for _ in range(batch_size)]).to(scores.device)
        if condition_lambda == 0:
            condition_logits = torch.zeros_like(top_logits).float()
        else:
            condition_logits = conditioning_model(tplus1_candidates.flatten(0, 1), # batch*topk x seq+1
                                                  expanded_lengths.flatten(0, 1), # batch*topk
                                                  None,
                                                  None,
                                                  None)
            condition_logits = condition_logits.view(batch_size, precondition_topk, -1)[:, :, -1] # batch x topk of last formality pred
            condition_logits = condition_logits - torch.log(1 + torch.exp(condition_logits)) # log(sigmoid(logits)), i.e. correct log probs
            # condition_logits = - torch.log(1 + torch.exp(condition_logits)) # for informal
        full_logits = top_logits + condition_lambda * condition_logits
        if do_sample:
            raise NotImplementedError
        else:
            # Greedy decoding
            next_token = top_indices[torch.arange(batch_size).to(top_indices.device), torch.argmax(full_logits, dim=-1)]

        # if do_sample:
        #     # Temperature (higher temperature => more likely to sample low probability tokens)
        #     if temperature != 1.0:
        #         scores = scores / temperature
        #     # Top-p/top-k filtering
        #     next_token_logscores = top_k_top_p_filtering(scores, top_k=top_k, top_p=top_p)
        #     # Sample
        #     probs = F.softmax(next_token_logscores, dim=-1)
        #     next_token = torch.multinomial(probs, num_samples=1).squeeze(1)
        # else:
        #     # Greedy decoding
        #     next_token = torch.argmax(next_token_logits, dim=-1)

        # update generations and finished sentences
        if eos_token_id is not None:
            # pad finished sentences if eos_token_id exists
            tokens_to_add = next_token * unfinished_sents + (pad_token_id) * (1 - unfinished_sents)
        else:
            tokens_to_add = next_token

        # add token and increase length by one
        input_ids = torch.cat([input_ids, tokens_to_add.unsqueeze(-1)], dim=-1)
        cur_len = cur_len + 1

        if eos_token_id is not None:
            eos_in_sents = tokens_to_add == eos_token_id
            # if sentence is unfinished and the token to add is eos, sent_lengths is filled with the current length
            is_sents_unfinished_and_token_to_add_is_eos = unfinished_sents.mul(eos_in_sents.long()).bool()
            sent_lengths.masked_fill_(is_sents_unfinished_and_token_to_add_is_eos, cur_len)
            # unfinished_sents is set to zero if eos in sentence
            unfinished_sents.mul_((~eos_in_sents).long())

        # stop when there is a </s> in each sentence, or if we exceed the maximum length
        if unfinished_sents.max() == 0:
            break

        # extend attention_mask for new generated input if only decoder
        if model.config.is_encoder_decoder is False:
            attention_mask = torch.cat(
                [attention_mask, attention_mask.new_ones((attention_mask.shape[0], 1))], dim=-1
            )

    return input_ids

if __name__=='__main__':
    parser = ArgumentParser()

    # DATA
    parser.add_argument('--ckpt', type=str, required=True)
    parser.add_argument('--dataset_info', type=str, required=True, help='saved dataset info')
    parser.add_argument('--model_string', type=str, default='Helsinki-NLP/opus-mt-es-en')

    parser.add_argument('--input_text', type=str, default=None, required=True, help='text to run pred on')

    parser.add_argument('--precondition_topk', type=int, default=200, help='consider top k outputs from gpt at each step before conditioning and re-pruning')
    parser.add_argument('--do_sample', action='store_true', default=False, help='sample instead of greedy')
    parser.add_argument('--condition_lambda', type=float, default=1.0, help='lambda weight on conditioning model')
|
390 |
+
parser.add_argument('--length_cutoff', type=int, default=512, help='max length')
|
391 |
+
|
392 |
+
parser.add_argument('--seed', type=int, default=1, help='random seed')
|
393 |
+
parser.add_argument('--device', type=str, default='cuda', choices=['cpu', 'cuda'])
|
394 |
+
parser.add_argument('--debug', action='store_true', default=False)
|
395 |
+
|
396 |
+
args = parser.parse_args()
|
397 |
+
|
398 |
+
random.seed(args.seed)
|
399 |
+
np.random.seed(args.seed)
|
400 |
+
torch.manual_seed(args.seed)
|
401 |
+
|
402 |
+
main(args)
|
403 |
+
|
404 |
+
|
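The loop above is the core FUDGE decoding step: take the top `precondition_topk` logits from the base model, score each one-token extension with the conditioning classifier, and add `condition_lambda` times its log-probability before picking the next token. The sketch below isolates just that reranking step under simplified assumptions: `classifier_logits_fn` is a hypothetical stand-in for the trained conditioning model (the real one also takes lengths and future-word arguments), and `F.logsigmoid(x)` is mathematically the same quantity as the `x - torch.log(1 + torch.exp(x))` used above.

# Minimal, self-contained sketch of the FUDGE reranking step (illustrative only,
# not the repo's actual decoding code).
import torch
import torch.nn.functional as F

def fudge_rerank(next_token_logits, input_ids, classifier_logits_fn,
                 precondition_topk=200, condition_lambda=1.0):
    batch_size = next_token_logits.shape[0]
    top_logits, top_indices = next_token_logits.topk(precondition_topk, dim=1)
    # append each candidate token to the current prefix: batch x topk x seq+1
    candidates = torch.cat(
        [input_ids.unsqueeze(1).expand(-1, precondition_topk, -1),
         top_indices.unsqueeze(2)], dim=2)
    cond_logits = classifier_logits_fn(candidates.flatten(0, 1))  # batch*topk
    cond_logprobs = F.logsigmoid(cond_logits).view(batch_size, precondition_topk)
    full_logits = top_logits + condition_lambda * cond_logprobs
    choice = torch.argmax(full_logits, dim=-1)
    return top_indices[torch.arange(batch_size), choice]  # next token per example

# toy usage with random tensors and a random stand-in classifier
logits = torch.randn(2, 1000)
prefix = torch.randint(0, 1000, (2, 5))
next_tok = fudge_rerank(logits, prefix, lambda c: torch.randn(c.shape[0]),
                        precondition_topk=50)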
naacl-2021-fudge-controlled-generation/predict_poetry.py
ADDED
@@ -0,0 +1,219 @@
import os
import random
import time
import pickle
import math
from argparse import ArgumentParser
import string
from collections import defaultdict

from tqdm import tqdm
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelWithLMHead, pipeline, set_seed, GPT2Tokenizer, GPT2Model

from data import Dataset, load_rhyme_info
from model import Model
from util import save_checkpoint, ProgressMeter, AverageMeter, num_params
from constants import *
from poetry_util import get_rhymes, count_syllables

def main(args):
    with open(args.dataset_info, 'rb') as rf:
        dataset_info = pickle.load(rf)
    gpt_tokenizer = AutoTokenizer.from_pretrained(args.model_string)
    gpt_tokenizer.add_special_tokens({'pad_token': PAD_TOKEN})
    gpt_pad_id = gpt_tokenizer.encode(PAD_TOKEN)[0]
    gpt_model = AutoModelWithLMHead.from_pretrained(args.model_string).to(args.device)
    gpt_model.eval()

    checkpoint = torch.load(args.iambic_ckpt, map_location=args.device)
    model_args = checkpoint['args']
    iambic_model = Model(model_args, gpt_pad_id, len(dataset_info.index2word)) # no need to get the glove embeddings when reloading since they're saved in model ckpt anyway
    iambic_model.load_state_dict(checkpoint['state_dict'])
    iambic_model = iambic_model.to(args.device)
    iambic_model.eval()
    print("=> loaded checkpoint '{}' (epoch {})"
          .format(args.iambic_ckpt, checkpoint['epoch']))
    print('iambic model num params', num_params(iambic_model))

    with open(args.rhyme_info, 'rb') as rf:
        rhyme_info = pickle.load(rf)
    checkpoint = torch.load(args.rhyme_ckpt, map_location=args.device)
    model_args = checkpoint['args']
    rhyme_model = Model(model_args, gpt_pad_id, len(dataset_info.index2word), rhyme_group_size=len(rhyme_info.index2rhyme_group)) # no need to get the glove embeddings when reloading since they're saved in model ckpt anyway
    rhyme_model.load_state_dict(checkpoint['state_dict'])
    rhyme_model = rhyme_model.to(args.device)
    rhyme_model.eval()
    print("=> loaded checkpoint '{}' (epoch {})"
          .format(args.rhyme_ckpt, checkpoint['epoch']))
    print('rhyme model num params', num_params(rhyme_model))

    checkpoint = torch.load(args.newline_ckpt, map_location=args.device)
    model_args = checkpoint['args']
    newline_model = Model(model_args, gpt_pad_id, len(dataset_info.index2word)) # no need to get the glove embeddings when reloading since they're saved in model ckpt anyway
    newline_model.load_state_dict(checkpoint['state_dict'])
    newline_model = newline_model.to(args.device)
    newline_model.eval()
    print("=> loaded checkpoint '{}' (epoch {})"
          .format(args.newline_ckpt, checkpoint['epoch']))
    print('newline model num params', num_params(newline_model))

    while True:
        results = predict_couplet(gpt_model,
                                  gpt_tokenizer,
                                  iambic_model,
                                  rhyme_model,
                                  newline_model,
                                  [args.input_text],
                                  dataset_info,
                                  rhyme_info,
                                  args.precondition_topk,
                                  args.topk,
                                  condition_lambda=args.condition_lambda,
                                  device=args.device)
        for line in results:
            print(line)
        import pdb; pdb.set_trace()


def predict_couplet(gpt_model, gpt_tokenizer, iambic_model, rhyme_model, newline_model, input_text, dataset_info, rhyme_info, precondition_topk, postcondition_topk, condition_lambda=1.0, device='cuda'):
    assert len(input_text) == 1 # only do one at a time for now
    current_text = input_text[0]
    current_line_text = ''
    all_lines = [current_text]
    ending_word = current_text.split()[-1].strip(string.punctuation)
    word2rhyme_group = defaultdict(lambda: UNKNOWN_RHYME_GROUP, rhyme_info.word2rhyme_group)
    rhyme_group = word2rhyme_group[ending_word]

    line = predict_iambic_pentameter_line(gpt_model,
                                          gpt_tokenizer,
                                          iambic_model,
                                          rhyme_model,
                                          newline_model,
                                          current_text,
                                          current_line_text,
                                          rhyme_group,
                                          dataset_info,
                                          rhyme_info,
                                          precondition_topk,
                                          postcondition_topk,
                                          condition_lambda=condition_lambda,
                                          device=device)
    all_lines.append(line)

    return all_lines


def predict_iambic_pentameter_line(gpt_model, gpt_tokenizer, iambic_model, rhyme_model, newline_model, current_text, current_line_text, rhyme_group, dataset_info, rhyme_info, precondition_topk, postcondition_topk, banned_tokens=POETRY_BANNED_TOKENS, condition_lambda=1.0, device='cuda', length_cutoff=30):
    # TODO(poetry) delete banned tokens?
    with torch.no_grad():
        batch_size = 1

        rhyme_group_index = rhyme_info.rhyme_group2index[rhyme_group]
        future_words = torch.LongTensor([rhyme_group_index]).to(device) # 1
        log_probs = torch.Tensor([math.log(rhyme_info.rhyme_group_counts[rhyme_group] / rhyme_info.total_rhyme_groups)]).to(device) # 1

        # assumes initially all same length.
        previous_encoded_text = [gpt_tokenizer.encode(it, return_tensors='pt').to(device) for it in [current_text]]
        previous_enc_len = previous_encoded_text[0].shape[1]
        encoded_input = [gpt_tokenizer.encode(it, return_tensors='pt').to(device) for it in [current_text + current_line_text]] # batch x seq
        encoded_input = torch.cat(encoded_input, dim=0)
        lengths = torch.LongTensor([encoded_input.shape[1]]).to(device)

        line_syllable_count = count_syllables(current_line_text)
        assert line_syllable_count < POETRY_LINE_SYLLABLES # assume we started with less than one full line
        syllables_to_go = POETRY_LINE_SYLLABLES - line_syllable_count

        for _ in range(length_cutoff): # really shouldn't have a line this long anyway
            gpt_logits = gpt_model(encoded_input)[0][:, -1, :] # batch x vocab
            gpt_logits[:, banned_tokens] = -1e8
            top_logits, top_indices = gpt_logits.topk(precondition_topk, dim=1)

            new_input_candidates = torch.cat([encoded_input.unsqueeze(1).expand(-1, precondition_topk, -1), top_indices.unsqueeze(2)], dim=2) # batch x topk x seq+1
            expanded_lengths = (lengths + 1).unsqueeze(1).expand(batch_size, precondition_topk) # batch x topk
            expanded_future_words = future_words.unsqueeze(0).unsqueeze(1).expand(batch_size, precondition_topk, -1) # batch x topk x N
            candidate_syllables_to_go = []
            for candidate in new_input_candidates[0]:
                candidate_until_last_word_text = ' '.join(gpt_tokenizer.decode(candidate[previous_enc_len:]).split()[:-1])
                candidate_syllables_to_go.append(10 - count_syllables(candidate_until_last_word_text))
            # usually these are all the same, but run them all for correctness. could do more efficiently but it's not too slow anyway.
            expanded_syllables_to_go = torch.LongTensor(candidate_syllables_to_go).to(device).view(1, precondition_topk)

            if condition_lambda == 0:
                iambic_logits = torch.zeros_like(expanded_lengths).float()
            else:
                # truncate prefix because we trained on single lines
                iambic_logits = iambic_model(new_input_candidates[:, :, previous_enc_len:].flatten(0, 1), expanded_lengths.flatten(0, 1) - previous_enc_len, None, None, None)[:, -1] # batch*topk x seq+1 -> batch*topk
                iambic_logits = iambic_logits.view(batch_size, precondition_topk)
                iambic_logits = iambic_logits - torch.log(1 + torch.exp(iambic_logits))
            if condition_lambda == 0:
                rhyme_logits = torch.zeros_like(expanded_lengths).float()
            else:
                rhyme_logits = rhyme_model(new_input_candidates.flatten(0, 1), # batch*topk x seq+1
                                           expanded_lengths.flatten(0, 1), # batch*topk
                                           expanded_future_words.flatten(0, 1), # batch*topk x N
                                           log_probs, # N
                                           expanded_syllables_to_go.flatten(0, 1)) # batch*topk
                rhyme_logits = rhyme_logits.view(batch_size, precondition_topk, -1) # batch x topk x N
                rhyme_logits = rhyme_logits - torch.log(1 + torch.exp(rhyme_logits)) # batch x topk x N
                rhyme_logits = rhyme_logits.squeeze(2) # batch x topk
            if condition_lambda == 0:
                newline_logits = torch.zeros_like(expanded_lengths).float()
            else:
                newline_logits = newline_model(new_input_candidates.flatten(0, 1), # batch*topk x seq+1
                                               expanded_lengths.flatten(0, 1), # batch*topk
                                               expanded_future_words.flatten(0, 1), # batch*topk x N
                                               log_probs, # N
                                               expanded_syllables_to_go.flatten(0, 1)) # batch*topk
                newline_logits = newline_logits[:, -1].view(batch_size, precondition_topk, -1) # batch x topk x N
                newline_logits = newline_logits - torch.log(1 + torch.exp(newline_logits)) # batch x topk x N
                newline_logits = newline_logits.squeeze(2) # batch x topk

            full_logits = top_logits + condition_lambda * iambic_logits + condition_lambda * rhyme_logits + condition_lambda * newline_logits
            post_logits, post_indices = full_logits.topk(postcondition_topk, dim=1)
            post_probs = F.softmax(post_logits, dim=1)
            index_into_top_indices = post_indices[torch.arange(batch_size).to(post_indices.device), torch.multinomial(post_probs, 1).flatten()] # batch
            next_indices = top_indices[torch.arange(batch_size).to(top_indices.device), index_into_top_indices] # batch
            encoded_input = torch.cat([encoded_input, next_indices.unsqueeze(1)], dim=1) # batch x seq+1
            lengths = lengths + 1
            syllables_to_go = POETRY_LINE_SYLLABLES - count_syllables(gpt_tokenizer.decode(encoded_input[0][previous_enc_len:])) # if we get very unlucky with a partial word that the syllable counter doesn't recognize we might end early, but it's unlikely
            if syllables_to_go <= 0 and [gpt_tokenizer.decode(s) for s in encoded_input][0][-1] in PHRASE_ENDS:
                break
            if syllables_to_go < 0:
                # encoded_input = encoded_input[:, :-1]
                break

        return [gpt_tokenizer.decode(s) for s in encoded_input][0][len(current_text):]


if __name__=='__main__':
    parser = ArgumentParser()

    # DATA
    parser.add_argument('--iambic_ckpt', type=str, required=True)
    parser.add_argument('--rhyme_ckpt', type=str, required=True)
    parser.add_argument('--newline_ckpt', type=str, required=True)
    parser.add_argument('--dataset_info', type=str, required=True, help='saved dataset info')
    parser.add_argument('--rhyme_info', type=str, required=True, help='saved rhyme info')
    parser.add_argument('--model_string', type=str, default='gpt2-medium')

    parser.add_argument('--input_text', type=str, default=None, required=True, help='initial text')

    parser.add_argument('--precondition_topk', type=int, default=200, help='consider top k outputs from gpt at each step before conditioning and re-pruning')
    parser.add_argument('--topk', type=int, default=10, help='consider top k outputs from gpt at each step')
    parser.add_argument('--condition_lambda', type=float, default=1.0, help='lambda weight on conditioning model')

    parser.add_argument('--seed', type=int, default=1, help='random seed')
    parser.add_argument('--device', type=str, default='cuda', choices=['cpu', 'cuda'])
    parser.add_argument('--debug', action='store_true', default=False)

    args = parser.parse_args()

    random.seed(args.seed)
    np.random.seed(args.seed)
    torch.manual_seed(args.seed)

    main(args)
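Unlike the greedy formality decoder above, predict_iambic_pentameter_line samples: the combined logits over the precondition top-k are re-pruned to `postcondition_topk`, softmaxed, and sampled with torch.multinomial, then mapped back through two levels of indices to a vocabulary id. Below is a minimal sketch of just that sampling step, assuming `full_logits` and the matching `top_indices` have already been computed as in the code above:

# Illustrative sketch of the precondition/postcondition sampling pattern
# (not the repo's actual code).
import torch
import torch.nn.functional as F

def sample_from_topk(full_logits, top_indices, postcondition_topk=10):
    batch_size = full_logits.shape[0]
    post_logits, post_indices = full_logits.topk(postcondition_topk, dim=1)
    post_probs = F.softmax(post_logits, dim=1)
    pick = torch.multinomial(post_probs, 1).flatten()               # batch
    # position within the re-pruned set -> index into top-k -> vocab id
    index_into_top = post_indices[torch.arange(batch_size), pick]   # batch
    return top_indices[torch.arange(batch_size), index_into_top]    # batch

# toy usage: 1 example, 200 precondition candidates
full = torch.randn(1, 200)                      # combined FUDGE scores
vocab_ids = torch.randint(0, 50257, (1, 200))   # matching vocabulary ids
next_token = sample_from_topk(full, vocab_ids)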
naacl-2021-fudge-controlled-generation/predict_topic.py
ADDED
@@ -0,0 +1,126 @@
import os
import random
import time
import pickle
import math
from argparse import ArgumentParser

from tqdm import tqdm
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelWithLMHead, pipeline, set_seed, GPT2Tokenizer, GPT2Model

from data import Dataset
from model import Model
from util import save_checkpoint, ProgressMeter, AverageMeter, num_params
from constants import *

def main(args):
    with open(args.dataset_info, 'rb') as rf:
        dataset_info = pickle.load(rf)
    for cw in args.condition_words.split():
        assert cw in dataset_info.word2index
    gpt_tokenizer = AutoTokenizer.from_pretrained(args.model_string)
    gpt_tokenizer.add_special_tokens({'pad_token': PAD_TOKEN})
    gpt_pad_id = gpt_tokenizer.encode(PAD_TOKEN)[0]
    gpt_model = AutoModelWithLMHead.from_pretrained(args.model_string).to(args.device)
    gpt_model.eval()

    checkpoint = torch.load(args.ckpt, map_location=args.device)
    model_args = checkpoint['args']
    conditioning_model = Model(model_args, gpt_pad_id, len(dataset_info.index2word)) # no need to get the glove embeddings when reloading since they're saved in model ckpt anyway
    conditioning_model.load_state_dict(checkpoint['state_dict'])
    conditioning_model = conditioning_model.to(args.device)
    conditioning_model.eval()
    print("=> loaded checkpoint '{}' (epoch {})"
          .format(args.ckpt, checkpoint['epoch']))
    print('num params', num_params(conditioning_model))

    while True:
        results = predict(gpt_model,
                          gpt_tokenizer,
                          conditioning_model,
                          [args.input_text],
                          args.condition_words,
                          dataset_info,
                          args.precondition_topk,
                          args.topk,
                          args.length_cutoff,
                          condition_lambda=args.condition_lambda,
                          device=args.device)
        print(results)
        import pdb; pdb.set_trace()

def predict(gpt_model, gpt_tokenizer, conditioning_model, input_text, condition_words, dataset_info, precondition_topk, postcondition_topk, length_cutoff, condition_lambda=1.0, device='cuda'):
    with torch.no_grad():
        batch_size = len(input_text)

        condition_words = condition_words.split()
        future_words = torch.LongTensor([dataset_info.word2index[cw] for cw in condition_words]).to(device) # N
        log_probs = torch.Tensor([math.log(dataset_info.vocab[cw] / dataset_info.total_words) for cw in condition_words]).to(device) # N

        # assumes initially all same length.
        encoded_input = [gpt_tokenizer.encode(it, return_tensors='pt').to(device) for it in input_text] # batch x seq
        encoded_input = torch.cat(encoded_input, dim=0)
        lengths = torch.LongTensor([encoded_input.shape[1]]).to(device)

        gpt_encoded_future_words = [gpt_tokenizer.encode(' ' + cw, return_tensors='pt')[0].to(device) for cw in condition_words]
        while lengths.max() < length_cutoff:
            tokens_left = torch.LongTensor([length_cutoff - lengths.max() for _ in range(batch_size)]).to(device)
            gpt_logits = gpt_model(encoded_input)[0][:, -1, :] # batch x vocab
            top_logits, top_indices = gpt_logits.topk(precondition_topk, dim=1) # batch x topk
            new_input_candidates = torch.cat([encoded_input.unsqueeze(1).expand(-1, precondition_topk, -1), top_indices.unsqueeze(2)], dim=2) # batch x topk x seq+1
            expanded_lengths = (lengths + 1).unsqueeze(1).expand(batch_size, precondition_topk) # batch x topk
            expanded_future_words = future_words.unsqueeze(0).unsqueeze(1).expand(batch_size, precondition_topk, -1) # batch x topk x N
            expanded_tokens_left = tokens_left.unsqueeze(1).expand(-1, precondition_topk) # batch x topk
            if condition_lambda == 0:
                condition_logits = torch.zeros_like(expanded_future_words).float()
            else:
                condition_logits = conditioning_model(new_input_candidates.flatten(0, 1), # batch*topk x seq+1
                                                      expanded_lengths.flatten(0, 1), # batch*topk
                                                      expanded_future_words.flatten(0, 1), # batch*topk x N
                                                      log_probs, # N
                                                      expanded_tokens_left.flatten(0, 1)) # batch*topk
                condition_logits = condition_logits.view(batch_size, precondition_topk, -1) # batch x topk x N
                condition_logits = condition_logits - torch.log(1 + torch.exp(condition_logits)) # get correct log probs

            condition_logits = torch.mean(condition_logits, dim=2)
            full_logits = top_logits + condition_logits * condition_lambda # batch x topk
            post_logits, post_indices = full_logits.topk(postcondition_topk, dim=1)
            post_probs = F.softmax(post_logits, dim=1)
            index_into_top_indices = post_indices[torch.arange(batch_size).to(post_indices.device), torch.multinomial(post_probs, 1).flatten()] # batch
            next_indices = top_indices[torch.arange(batch_size).to(top_indices.device), index_into_top_indices] # batch
            encoded_input = torch.cat([encoded_input, next_indices.unsqueeze(1)], dim=1) # batch x seq+1
            lengths = lengths + 1 # batch
        return [gpt_tokenizer.decode(s) for s in encoded_input]


if __name__=='__main__':
    parser = ArgumentParser()

    # DATA
    parser.add_argument('--ckpt', type=str, required=True)
    parser.add_argument('--dataset_info', type=str, required=True, help='saved dataset info')
    parser.add_argument('--model_string', type=str, default='gpt2-medium')

    parser.add_argument('--input_text', type=str, default=None, required=True, help='initial text')
    parser.add_argument('--condition_words', type=str, default=None, required=True, help='word(s) to optimize for')

    parser.add_argument('--precondition_topk', type=int, default=200, help='consider top k outputs from gpt at each step before conditioning and re-pruning')
    parser.add_argument('--topk', type=int, default=10, help='consider top k outputs from gpt at each step')
    parser.add_argument('--condition_lambda', type=float, default=1.0, help='lambda weight on conditioning model')
    parser.add_argument('--length_cutoff', type=int, default=80, help='max length')

    parser.add_argument('--seed', type=int, default=1, help='random seed')
    parser.add_argument('--device', type=str, default='cuda', choices=['cpu', 'cuda'])
    parser.add_argument('--debug', action='store_true', default=False)

    args = parser.parse_args()

    random.seed(args.seed)
    np.random.seed(args.seed)
    torch.manual_seed(args.seed)

    main(args)
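The topic predictor conditions on a bag of words: each condition word is mapped to its index and paired with its unigram log-probability under the dataset, and the conditioning model's per-word log-probabilities are averaged before being scaled by `condition_lambda`. A small sketch of that input construction, using a toy stand-in for `dataset_info` (the field names mirror the ones used above; the counts are made up):

# Illustrative sketch: building future_words / log_probs as in predict().
import math
import torch

class ToyDatasetInfo:  # hypothetical stand-in with the fields predict() relies on
    word2index = {'science': 0, 'physics': 1}
    vocab = {'science': 120, 'physics': 80}  # word counts (made-up numbers)
    total_words = 10000

def conditioning_inputs(condition_words, dataset_info, device='cpu'):
    words = condition_words.split()
    future_words = torch.LongTensor(
        [dataset_info.word2index[w] for w in words]).to(device)   # N
    # unigram log-prior of each topic word, consumed by the conditioning model
    log_probs = torch.Tensor(
        [math.log(dataset_info.vocab[w] / dataset_info.total_words)
         for w in words]).to(device)                              # N
    return future_words, log_probs

fw, lp = conditioning_inputs('science physics', ToyDatasetInfo())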
naacl-2021-fudge-controlled-generation/requirements.txt
ADDED
@@ -0,0 +1,7 @@
Phyme==0.0.9
pronouncing==0.2.0
pytorch-lightning==1.0.6
torch==1.7.0
tqdm==4.49.0
sacrebleu==1.4.14
sacremoses==0.0.43
naacl-2021-fudge-controlled-generation/topic_data/README.md
ADDED
@@ -0,0 +1,3 @@
`topic_prefixes.txt` contains the 20 prefixes used at test time for starting the generations.

`wordlists/` contains the wordlists for each of the 7 topics used during testing. The heldout bags used to evaluate the generalization of the topic words to other related words are in `test_wordlists/`. `val_wordlists/` contains just one extra wordlist used for tuning.
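Every file under topic_data/ is plain text with one entry per line, so loading one is a one-liner; this sketch (the path is just an example) shows the assumed format:

# Illustrative loader for the one-entry-per-line files below.
def load_wordlist(path):
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

words = load_wordlist('topic_data/wordlists/science.txt')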
naacl-2021-fudge-controlled-generation/topic_data/test_wordlists/computers.txt
ADDED
@@ -0,0 +1,163 @@
sailor
memories
article
phishing
crucial
interactive
capabilities
ISP
query
signal
computation
detect
compiling
workstation
barcode
XP
cake
counterfeiting
decimal
back-up
reasoning
DSL
C++
DVD
Frequently
wifi
deleting
paper
DNS
CyanogenMod
overflow
Android
latency
creating
redirect
sites
sidebar
Jacket
prev
connections
PDF
torrent
original
gmail
rename
coder
mainboard
parasite
casing
lurks
pixels
touchpad
update
visuals
encyclopedia
mice
Solaris
caching
copies
usb
chew
fixes
house
operand
input
pull
iterative
educational
autocomplete
on-line
confidentiality
decrypt
beach
mails
rectangular
jQuery
Excel
point-in-time
Ubuntu
decryption
dialup
profit
off-line
developing
choice
notebook
storing
typeface
little
customer
step
text
run-time
interview
layout
computing
chairs
infected
must
tools
search
pane
gamepad
disc
initialize
display
button
Firefox
automatically
garbage
512MB
cyber
logon
elements
restoring
writer
saving
parsing
execute
configuring
telephoto
popup
utilities
packet
pasting
guest
edit
glass
e-mail
components
binaries
subdirectory
restart
XSLT
inkjet
allows
functionality
debian
change
click
dialog
GPU
stored
attribute
deflate
cheat
direction
camera
hats
topic
journalists
taxi
console
identifier
VPN
flames
spyware
secure
shoe
Macs
php
demo
extract
naacl-2021-fudge-controlled-generation/topic_data/test_wordlists/legal.txt
ADDED
@@ -0,0 +1,108 @@
waived
homicide
repress
statutory
sentencing
respondent
maintain
legislative
prosecution
whether
forgive
mandamus
democratic
treasurer
acquittal
offender
sued
edict
malpractice
debatable
criminal
injunctive
appellant
convicted
admit
proxies
aggrieved
enforcement
second-degree
ethical
knowing
liability
event
property
conviction
deposited
immune
assertion
assualt
regulations
exams
pixels
prosecuting
insolvent
felonies
families
mediator
rulings
heard
wrongs
wrongful
folder
federal
widget
restaurant
incarcerated
burglary
pants
land-use
quash
sitting
rescind
dispute
leave
requesting
appearing
testify
discoveries
championship
police
judgment
purchase
revelation
solicitor
disagree
judicial
reversing
jurors
decision
negligent
mutual
track
objecting
major
amendment
alleging
agreement
investment
custodial
accusation
passageways
asserted
authority
deputies
insolvency
sworn
defensive
embezzlement
disputes
findings
reservation
litem
inmates
step-by-step
innocence
parties
transcribed
inept
naacl-2021-fudge-controlled-generation/topic_data/test_wordlists/military.txt
ADDED
@@ -0,0 +1,136 @@
team
threat
sloop
offensively
guerilla
invading
samurai
propel
sunk
concern
persuade
Maj.
wear
fatigues
subsidiary
glider
advancing
ICBM
won
cargo
groan
knowledge
proposal
terms
deputy
taken
bricks
operation
Iraq
zoning
offices
fought
detonated
adjutant
skipper
batteries
medical
strategic
armistice
rocket
enemies
tensions
forming
inundate
engaging
dormitories
flying
allies
cursor
casing
zone
scouts
stationed
pistol
paragraph
highest
tribute
strategy
pump
decoding
argue
public
policeman
lob
sword
bleeding
civilians
rifles
airmen
freedom
explosion
capturing
skirmish
conquered
frigate
armour
leaving
customer
expert
armies
aviation
armoury
rifleman
lace
khaki
barrage
civilian
secluded
casualties
injuries
academies
hires
dead
ATL
late
relinquish
naval
riflemen
seige
sonar
aboard
longtime
bottom
gatling
militia
clandestine
execute
assets
significant
personnel
escorting
manoeuvre
Sgt.
rear
shoulders
rescuing
hand-to-hand
howitzer
committee
rifle
victory
defensive
forcing
honour
companies
pirate
evacuating
sabotaging
citadel
cadre
camera
launchers
flames
encoding
visor
ship
naacl-2021-fudge-controlled-generation/topic_data/test_wordlists/politics.txt
ADDED
@@ -0,0 +1,40 @@
credibility
Nazism
imported
remember
progressivism
legislative
communist
gender
democratic
immediate
capitalist
purchase
energy
referenda
ratify
lengthy
authorisation
aristocrats
jurisdiction
judge
socialist
excise
fascist
secondary
subsidies
autocratic
shortfall
appropriated
uphold
income
federated
federal
efforts
diplomatic
freedom
properties
ideologies
exporting
minority
cultural
naacl-2021-fudge-controlled-generation/topic_data/test_wordlists/religion.txt
ADDED
@@ -0,0 +1,207 @@
Elegant
Catholicism
Metatron
Mind
Empires
SWF
Secular
Judas
Prime
Terrier
Preview
Existence
Silent
sanctuaries
Answer
Balancing
Mutual
Constantinople
Scrolls
Network
Almighty
Attorney
Liberation
Database
Practicing
St.
Eucharist
Glorious
Catholic
Compassion
Volume
Saviour
Meditation
Testament
Morality
Heart
Aramaic
Court
Baskets
Fervor
Date
Curriculum
Liberal
Creativity
Everlasting
PDF
Rev.
Thank
Nanak
Dangerous
Shari'a
Policy
Talmud
Best
Supply
Oneness
Punishment
Reincarnation
TransCanada
Forums
VoIP
Factors
Assistance
Charities
Calculator
Shadows
Him
Natural
Lamp
Thyme
Templar
Muhammad
Venue
Hell
Bunyan
Songs
Epistle
Suites
Economic
Intel
Spanish
Lives
Married
Hypothesis
Cosmic
Injunction
Involvement
Leviticus
Self
Truth
Mystical
Melody
Pure
Sermon
Atlantic
Excel
Sonata
SPCA
Saturday
Adventure
Honour
Resurrection
Emanuel
Connery
Rites
United
Pope
Mary
Chen
Lisa
ODST
Videos
Modernity
Sculpture
Jewish
Heavy
Remote
Praise
Foods
Merrell
Safety
Influencing
Tie
Outreach
Kenichi
Criminal
Stevie
Judgement
SQL
Basilica
Piano
Reiki
Understanding
Cognition
Maker
Diocese
Marital
Masjid
Militant
Methodist
Political
Appeals
Deities
Purchase
Rallies
Testing
Contemporary
Help
Sweet
Fallen
Spangled
Renewable
Laughter
Provider
Charitable
Ethical
Families
Cure
Significance
Communities
Cost
Demon
Motivation
Calvary
Double
Mysteries
Determining
Baptist
Mandir
Qi
Loss
Lust
Echoes
Lord
Vote
Glad
Dharma
Kombat
Prostitute
Wetlands
Queries
Always
Focus
EOS
Worship
Implications
Wiccan
Invitations
Theology
Hospital
Freedom
Mirror
Uncharted
Radiance
Serving
Buddhist
Kiss
Mother
Death
Episcopal
Impact
Shinto
Crisis
Secure
Learning
Dreams
Association
naacl-2021-fudge-controlled-generation/topic_data/test_wordlists/science.txt
ADDED
@@ -0,0 +1,47 @@
astronomical
evolved
tests
reason
idea
component
jug
rain
renewable
scaling
phone
action
studies
humidity
siphon
warming
compounds
genomics
electrons
mathematics
clinical
physiology
hypotheses
stored
statutes
magnesium
measuring
fuels
scientific
bone
molecular
microscopy
observing
parameter
transition
system
bacterium
ligand
increasing
theories
physicist
flow
pounds
nothing
observatory
gravitational
electron
naacl-2021-fudge-controlled-generation/topic_data/test_wordlists/space.txt
ADDED
@@ -0,0 +1,16 @@
cosmos
mothership
flyby
broadband
aeronautics
fireball
Romulan
room
cosmonaut
actress
worlds
heavens
lunar
interstellar
galaxies
lander
naacl-2021-fudge-controlled-generation/topic_data/topic_prefixes.txt
ADDED
@@ -0,0 +1,20 @@
In summary
This essay discusses
Views on
The connection
Foundational to this is
To review,
In brief,
An illustration of
Furthermore,
The central theme
To conclude,
The key aspect
Prior to this
Emphasised are
To summarise
The relationship
More importantly,
It has been shown
The issue focused on
In this essay
naacl-2021-fudge-controlled-generation/topic_data/val_wordlists/fantasy.txt
ADDED
@@ -0,0 +1,26 @@
beast
Cerberus
demon
dragon
fairy
Frankenstein
ghost
Godzilla
giant
horror
hydra
imp
monster
mummy
ogre
orc
savage
spirit
sprite
titan
troll
undead
unicorn
vampire
witch
zombie
naacl-2021-fudge-controlled-generation/topic_data/wordlists/computers.txt
ADDED
@@ -0,0 +1,176 @@
algorithm
analog
app
application
array
backup
bandwidth
binary
bit
bite
blog
blogger
bookmark
boot
broadband
browser
buffer
bug
bus
byte
cache
caps
captcha
CD
client
command
compile
compress
computer
configure
cookie
copy
CPU
dashboard
data
database
debug
delete
desktop
development
digital
disk
document
domain
dot
download
drag
dynamic
email
encrypt
encryption
enter
FAQ
file
firewall
firmware
flaming
flash
folder
font
format
frame
graphics
hack
hacker
hardware
home
host
html
icon
inbox
integer
interface
Internet
IP
iteration
Java
joystick
kernel
key
keyboard
keyword
laptop
link
Linux
logic
login
lurking
Macintosh
macro
malware
media
memory
mirror
modem
monitor
motherboard
mouse
multimedia
net
network
node
offline
online
OS
option
output
page
password
paste
path
piracy
pirate
platform
podcast
portal
print
printer
privacy
process
program
programmer
protocol
RAM
reboot
resolution
restore
ROM
root
router
runtime
save
scan
scanner
screen
screenshot
script
scroll
security
server
shell
shift
snapshot
software
spam
spreadsheet
storage
surf
syntax
table
tag
template
thread
toolbar
trash
undo
Unix
upload
URL
user
UI
username
utility
version
virtual
virus
web
website
widget
wiki
window
Windows
wireless
worm
XML
Zip
naacl-2021-fudge-controlled-generation/topic_data/wordlists/legal.txt
ADDED
@@ -0,0 +1,131 @@
affidavit
allegation
appeal
appearance
argument
arrest
assault
attorney
bail
bankrupt
bankruptcy
bar
bench
warrant
bond
booking
capital
crime
case
chambers
claim
complainant
complaint
confess
confession
constitution
constitutional
contract
counsel
court
custody
damages
decree
defendant
defense
deposition
discovery
equity
estate
ethics
evidence
examination
family
law
felony
file
fraud
grievance
guardian
guilty
hearing
immunity
incarceration
incompetent
indictment
injunction
innocent
instructions
jail
judge
judiciary
jurisdiction
jury
justice
law
lawsuit
lawyer
legal
legislation
liable
litigation
manslaughter
mediation
minor
misdemeanor
moot
murder
negligence
oath
objection
opinion
order
ordinance
pardon
parole
party
perjury
petition
plaintiff
plea
precedent
prison
probation
prosecute
prosecutor
proxy
record
redress
resolution
reverse
revoke
robbery
rules
sentence
settlement
sheriff
sidebar
standing
state
statute
stay
subpoena
suit
suppress
sustain
testimony
theft
title
tort
transcript
trial
trust
trustee
venue
verdict
waiver
warrant
will
witness
writ
zoning
naacl-2021-fudge-controlled-generation/topic_data/wordlists/military.txt
ADDED
@@ -0,0 +1,149 @@
academy
advance
aircraft
ally
ammo
ammunition
armor
arms
army
arrow
arsenal
artillery
attack
attention
ballistic
barracks
base
battalion
battery
battle
battlefield
bomb
bombard
bombardment
brig
brigade
bullet
camouflage
camp
cannon
captain
capture
carrier
casualty
catapult
cavalry
colonel
combat
command
commander
commission
company
conflict
conquest
convoy
corps
covert
crew
decode
defeat
defend
defense
destroyer
division
draft
encode
enemy
engage
enlist
evacuate
explosive
fight
fire
fleet
force
formation
fort
front
garrison
general
grenade
grunt
guerrilla
gun
headquarters
helmet
honor
hospital
infantry
injury
intelligence
invade
invasion
jet
kill
leave
lieutenant
major
maneuver
marines
MIA
mid
military
mine
missile
mortar
navy
neutral
offense
officer
ordinance
parachute
peace
plane
platoon
private
radar
rank
recruit
regiment
rescue
reserves
retreat
ribbon
sabotage
sailor
salute
section
sergeant
service
shell
shoot
shot
siege
sniper
soldier
spear
specialist
squad
squadron
staff
submarine
surrender
tactical
tactics
tank
torpedo
troops
truce
uniform
unit
veteran
volley
war
warfare
warrior
weapon
win
wound
naacl-2021-fudge-controlled-generation/topic_data/wordlists/politics.txt
ADDED
@@ -0,0 +1,47 @@
affirm
appropriation
aristocracy
authoritarian
authority
authorization
brief
capitalism
communism
constitution
conservatism
court
deficit
diplomacy
direct
democracy
equality
exports
fascism
federation
government
ideology
imports
initiative
legislature
legitimacy
liberalism
liberty
majority
order
political
culture
politics
power
primary
property
ratification
recall
referendum
republic
socialism
state
subsidy
tariff
imports
tax
totalitarian
naacl-2021-fudge-controlled-generation/topic_data/wordlists/religion.txt
ADDED
@@ -0,0 +1,232 @@
absolute
affect
aid
angel
anthem
apostle
archangel
Archbishop
balance
ban
belief
benefit
Bible
bishop
bless
blessing
bliss
bond
bow
Buddhism
canon
Cantor
cathedral
celestial
chapel
charity
choice
Christianity
church
comfort
community
conflict
connection
conquest
conservative
control
conversion
convert
core
counsel
courage
Covenant
creative
Creator
creed
cross
Crusade
Darkness
decision
deity
destiny
Devil
disciple
discipline
discussion
divine
divinity
doctrine
duty
effect
elder
energy
essence
eternal
ethics
event
evidence
exile
Exodus
faith
family
fate
Father
favor
fundamental
gift
glory
God
gospel
grace
growth
guru
habit
hallow
halo
happiness
harmony
healing
Heaven
Hebrew
holy
honor
hope
host
humane
immortal
influence
insight
instruction
issue
Jesuit
Jesus
joy
Judaism
judgment
justice
karma
keen
Keystone
Kingdom
Latin
life
light
love
loving
marriage
meaning
mercy
Messiah
minister
miracle
mission
mortal
mosque
movement
music
mystery
nature
nun
official
oracle
order
organ
Orthodox
outlook
pacific
pagan
parish
participation
pastor
patriarch
peace
perception
personal
perspective
petition
pilgrim
politics
power
practice
prayer
prelude
presence
priest
principle
privacy
prophet
protection
purpose
query
quest
question
quiet
radiant
radical
rally
rebirth
redemption
refuge
relationship
relative
religion
religious
Revelation
ritual
role
Sacrament
sacred
sacrifice
sage
saint
salvation
sanctuary
savior
scripture
scriptures
sect
security
sense
serious
serve
service
Sharia
shepherd
shrine
silence
sin
society
soul
source
spirit
spiritual
split
statue
Sunday
support
Supreme
teaching
temple
tests
text
Torah
tradition
traditional
trust
unique
unity
unknown
value
vanity
virtue
vision
voice
voices
watch
weight
whole
wisdom
wonder
yang
yin
zeal
naacl-2021-fudge-controlled-generation/topic_data/wordlists/science.txt
ADDED
@@ -0,0 +1,48 @@
astronomy
atom
biology
cell
chemical
chemistry
climate
control
data
electricity
element
energy
evolution
experiment
fact
flask
fossil
funnel
genetics
gravity
hypothesis
lab
laboratory
laws
mass
matter
measure
microscope
mineral
molecule
motion
observe
organism
particle
phase
physics
research
scale
science
scientist
telescope
temperature
theory
tissue
variable
volume
weather
weigh
naacl-2021-fudge-controlled-generation/topic_data/wordlists/space.txt
ADDED
@@ -0,0 +1,18 @@
planet
galaxy
space
universe
orbit
spacecraft
earth
moon
comet
star
astronaut
aerospace
asteroid
spaceship
starship
galactic
satellite
meteor
naacl-2021-fudge-controlled-generation/transcript.txt
ADDED
@@ -0,0 +1,415 @@
(Sorry, the slide numbers got a bit misaligned as I added slides. Not an exact transcript for the video but roughly correct.)

1:
Hi! I'm Kevin from UC Berkeley, and today I'll be presenting my paper FUDGE: Controlled Text Generation with Future Discriminators, by me and my advisor Dan Klein.

2:
So first, a quick overview.

3:
I'll start by explaining the problem of controlled text generation with some examples,

4:
then describe our method, FUDGE, Future Discriminators for Generation,

5:
and in doing so I'll also show experimental results and example model outputs on three diverse controlled generation tasks.

6:
So what's controlled text generation?

7:
Well, let's start with the autoregressive language model that we use for text generation, without the controlled part.

8:
The language model models a distribution over next tokens x_{i+1} given the prefix x_1 to x_i. For example, you might tell it to

9:
generate text according to a prompt like

9:
THIS, the issue focused on.

10:
and then it'll chug along and generate text,

11:
and these days language models are pretty good.
But in controlled generation, you have an additional attribute constraint,

12:
like wanting the output to be about politics.

13:
Specifically, we have an attribute function a(X) which says whether or not the attribute a is true for your output X, in this case whether or not the output is on topic. There are no probabilities involved in a(X), since it operates on the completed generation output, not on partial sequences.

14:
More precisely, the task of controlled text generation is to sample from the distribution P(X | a = True), that is, the distribution of outputs X which satisfy a.

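(As a worked form of this objective, matching the factorization the talk derives later around slide 68: at each step FUDGE samples the next token from

    P(x_{i+1} | x_{1:i}, a = True) ∝ P(a = True | x_{1:i+1}) * P(x_{i+1} | x_{1:i}),

i.e., the base language model's next-token probability reweighted by a classifier's estimate that the eventual completed output will satisfy a.)
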
15:
By default the language model isn't equipped to handle this additional constraint a, so its output is not going to pass.
So we need a method for *controlled* text generation.

15:
For example, our method FUDGE.

16:
Given the same prompt with the politics topic,

16:
here's what FUDGE says.

17:
It worked pretty well in this example. It's talking about institutions and constitutions, which seems clearly on topic.

18:
And I'll point out here that controlled generation makes sense in addition to the usual conditioning on the input that you might see in translation or summarization.

19:
Say we're translating Spanish to English. There's input conditioning on the original Spanish, but we're also imposing the additional constraint that the output be formal, which is where controlled text generation comes in.

20:
So say we have this Spanish input,

20:
and let me just move it to the corner so we can still see it.

20:
If you ask your off-the-shelf translation model, it'll get the meaning right,
but it copies some ungrammatical parts of the original Spanish,

21:
like these repeated words in bold.

22:
So at the end, when we ask our formality classifier,

23:
it might not be super happy.

24:
But if you use a controlled text generation approach like FUDGE,

25:
you can get this translation, which preserves the meaning while also better matching the formal style constraint.

26:

27:
You might wonder, why don't we just do rejection sampling?

28:
Just sample a bunch of times from the translator

29:
until you get one that passes.
That might work for some simpler constraints like topics, but it's going to be totally intractable when you use constraints that are very rarely satisfied by your generator distribution.

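(A minimal sketch of the rejection-sampling baseline being dismissed here; generate and attribute_fn are hypothetical stand-ins for the base generator and the attribute check a(X).)

def rejection_sample(generate, attribute_fn, max_tries=1000):
    # Keep drawing complete outputs until one satisfies a(X).
    # If only a fraction p of samples satisfy the constraint, this
    # takes 1/p attempts in expectation: hopeless for rare attributes.
    for _ in range(max_tries):
        x = generate()          # one full output from the base model
        if attribute_fn(x):     # a(X) operates on the completed output
            return x
    raise RuntimeError("constraint too rarely satisfied by the generator")
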
30:
What are some more difficult attribute constraints?

31:
Consider this task: effectively, complete the poem.

32:
Let's see what the language model says when we give it this input from Shakespeare. And even thence thou wilt be stol'n I fear

32:
and thou art a good friend of mine. The king's guard.

33:
This is terrible! It doesn't roll off the tongue, it doesn't rhyme, it doesn't even end the sentence properly at the end.

34:
Shakespeare hates it. You could generate any number of poems using your language model, and Shakespeare is gonna hate every last one.

35:
But if you ask FUDGE, you get this. And even thence thou wilt be stol'n I fear, for this shall be the end. That's pretty clear.

36:
So it's not Shakespeare, but it gets the meter, or rhythm, right, it rhymes, and it ends the sentence in about the right place at the end. Not too bad.

37:
Ok. So how does controlled generation work anyway? Let me give an incredibly oversimplified summary of some ideas in this line of work to put FUDGE in context.

38:
First, you can finetune.

39:
We'll use the politics topic as an example.

39:
You can train on a bunch of text about politics. Depending on how good your data is, this can work great! Or it could be rather bad. It also might be annoying to have to finetune again next time, when you want to write about science instead.

40:
Another idea is to use a classifier.

41:
We're already using a classifier to evaluate.

42:
We can use a classifier to help us generate too. There are many different ways to do this.

43:
For example, you might propagate gradients to modify the model's activations,

44:
or you could just directly modify the model's output probabilities. One advantage of the latter method is that you don't need to access the original language model's gradients at all, which is nice if you're using something like GPT3. You can also swap the generator out as better models become available, like GPT4. Our approach FUDGE falls in this category of just modifying the output logits.

45:
Ok, so what's FUDGE?

46:
FUDGE at its core learns a lightweight classifier for the attribute constraint, and then follows a Bayesian factorization to combine it with the original generator, like the pretrained language model.

47:
A key difference from prior work is that we plan for the future, not the immediate present.

48:
And finally, FUDGE can easily and flexibly compose multiple constraints.

49:
Let's start with the classifier and Bayesian factorization.

50:
Since FUDGE builds off the base language model, let's review:

51:
You feed whatever tokens you have so far

52:
into your model,

53:
which models the distribution over possible next tokens.

54:
And then you sample from this distribution to pick your continuation.

55:
Now, we completely ignored the formal style constraint.

56:
So it's gonna be unhappy.

57:
So what do you want to do instead?

58:
Well, what you really want is to use your classifier to judge continuations

59:
and mark which ones are acceptable given your constraint. So the classifier looks at each possible next continuation, Do you want, Do you prefer, Do you thus, and so on, maybe up to some limit, and judges each one individually to decide which it's ok with.

60:
So putting it together, we throw out whatever the classifier didn't like,

61:
and then we select from whatever the classifier is ok with, according to the base generator's probabilities.
And this gets you "Do you prefer" instead of "Do you want",

62:
which sounds a bit more formal.

63:
But there's a subtle problem in this diagram.
The classifier is supposed to judge the finished sentence, not the prefixes,

64:
but here we've shoved it into our generation procedure, where it's gonna operate on prefixes.
What we actually need is

65:
kind of a future-looking crystal ball version of the classifier, which judges whether the whole sentence will eventually be formal, given the current prefix.

65:
And in practice, we implement the judge as a learned binary classifier, which runs on each possible continuation, and for each one outputs the probability that in the end the desired attribute a will be True, or in this case whether the finished sentence would be formal, given just the current prefix plus the next token.
So in the red table, this 0.2 by "want" means it thinks there's a 20% chance that the eventual sentence would be formal if we started with Do you want, whereas it assigns a much higher probability to Do you prefer and Do you thus, because those are more formal.

68:
And then we sample proportionally from the probabilities in the purple table,
which are now just the elementwise product of the blue and red tables' probabilities.
This corresponds exactly to a Bayesian factorization for the probability distribution over sentences generated by the language model that possess the desired attribute, and you can check the math in the paper.
But the Bayesian motivation is not new.

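(A minimal sketch of the decoding step just described, assuming a next-token logits vector from the base LM and a hypothetical future_classifier returning log P(a = True | prefix + candidate) for each candidate token; the repo's prediction scripts implement the real version.)

import torch

def fudge_step(lm_logits, prefix_ids, future_classifier, topk=200):
    # One FUDGE decoding step: reweight the base LM's next-token
    # distribution by the future classifier, then sample.
    base_logprobs = torch.log_softmax(lm_logits, dim=-1)     # "blue table"
    top_logprobs, top_ids = base_logprobs.topk(topk)         # candidate continuations
    cond_logprobs = future_classifier(prefix_ids, top_ids)   # "red table"
    combined = top_logprobs + cond_logprobs                  # "purple table": product in log space
    next_idx = torch.multinomial(torch.softmax(combined, dim=-1), 1)
    return top_ids[next_idx]
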
70:
What's really new in FUDGE is that we explicitly distinguish the final classifier from the crystal ball future-predicting version that we use during the generation procedure, and making this distinction is critical for performance.

71:
Let's see FUDGE in action.

72:
If you recall our Spanish to English formal translation example,

73:
let's backtrack FUDGE to this step.

74:
Again we have the repeated Spanish "que que" in bold, which the base model translated verbatim as "that, that".

75:
But by having our classifier judge the formality of possible continuations, FUDGE is able to modify its continuation so that it doesn't repeat the words here.

76:
And the end result preserves the meaning while also being a bit more formal.

77:
And finally, this all holds up in our experiments. So we have a classifier trained on a heldout dataset of formality, and it indeed judges FUDGE's outputs to be significantly more formal than those of the best prior method.

78:
At the same time, FUDGE is able to preserve the content, based on measuring BLEU against cleaned reference translations.

79:
Ok, great. So next I'll elaborate more on planning for the future vs. the present,

80:
and I'll try to show more clearly *why* we really need this crystal ball classifier.

81:
Let's go back to our politics topic constraint.

82:
For simplicity, let's pretend just for this talk that the politics topic just means whether or not you use the word "constitution."

83:
So the constraint that we check at the end of generation is literally just grep for constitution.

84:
The crystal ball classifier has a much harder task. For a given prefix, it needs to predict whether each possible word makes "constitution" more likely to appear later.

85:
So how do we learn this?

86:
Say you have this example in your training data containing "constitution".

87:
The crystal ball classifier takes this and makes a bunch of prefix examples, labeled with the attribute function a(X)=True, because we saw those prefixes led to the word "constitution" later.

88:
And similarly, if you have this example without the word "constitution",

89:
it'll label those prefixes as False.

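(A minimal sketch of this training-data construction, with tokenize as a hypothetical tokenizer; every prefix of a training sentence becomes one binary example whose label is the attribute value of the full sentence.)

def make_prefix_examples(sentence, attribute_fn, tokenize):
    # The label is a(X) on the *complete* sentence, so a prefix is
    # positive iff it eventually led to the attribute being satisfied.
    tokens = tokenize(sentence)
    label = attribute_fn(sentence)  # e.g. 'constitution' in sentence
    return [(tokens[:i + 1], label) for i in range(len(tokens))]

# For example:
# make_prefix_examples("The constitution was ratified.",
#                      lambda s: "constitution" in s, str.split)
# labels every prefix True.
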
90:
Ok.

91:
So let's examine what FUDGE generates.

92:
After a couple of steps, we have It has been shown whether the two

93:
What if you hypothetically use the non-crystal-ball classifier to guide generation?

94:
The issue focused on whether the two constitution
(pause) Maybe not. We don't really want to sacrifice fluency. But this classifier is too shortsighted. It's all or nothing: you either have to use constitution immediately, or bust.

95:
Ok.

96:
Good thing FUDGE is actually using the future-looking classifier.

97:
So instead, FUDGE is going to generate something which is still reasonably likely under the original language model, but makes constitution more likely to be generated later on. This classifier doesn't care whether constitution is generated now or later, as long as it shows up eventually.

98:
So here it's going to write about institutions, so it's on the right topic,

99:
which eventually leads it to write about the constitution.

100:
Great.

101:
And indeed, in our experiments FUDGE is great according to human evaluations too. It substantially beats the best prior method in pairwise evaluations of being on topic,

102:
while also beating it in fluency.

103:
Cool. So I've now demonstrated the importance of planning for the future through this topic control task.
And finally, I'll highlight FUDGE's compositional potential, using a third task.

104:
Ok.

105:
So remember our schematic diagram where we have the judge of formality.

106:
This works great when we have just one attribute we care about.

107:
Now, what if you have another attribute? Maybe you want it to be formal but also about math.

108:
Now our old crystal ball classifier of just formality isn't good enough anymore.
Of course, you could construct a classifier which predicts both attributes simultaneously, but FUDGE lets you do something more scalable and also, I think, a bit more elegant.

109:
Just reuse the formality predictor, while adding a second crystal ball for the math topic.
So now your generation is guided by one classifier for each constraint,

110:
and it picks something which it thinks sounds more mathy.

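(A minimal sketch of this composition, reusing the candidate scoring from the earlier decoding sketch; treating the attributes as conditionally independent, each classifier's log probability simply adds to the candidates' scores.)

def compose_scores(base_logprobs, classifiers, prefix_ids, top_ids):
    # classifiers is a list like [formality_clf, math_topic_clf]; each
    # returns log P(a_k = True | prefix + candidate) per candidate.
    combined = base_logprobs.clone()
    for clf in classifiers:
        combined = combined + clf(prefix_ids, top_ids)
    return combined  # sample the next token proportionally to exp(combined)
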
111:
So let's see this in practice.

112:
Remember our poetry examples, where FUDGE's example isn't quite Shakespeare but is at least pretty well-formed?
This task actually uses three separate constraints:

113:
We want iambic meter, which means that every other syllable should be a stressed syllable when we're reading it,

114:
we want the two lines to rhyme, and since the first line is 10 syllables, that means the second line should be 10 syllables too,

115:
and the second line that we generate should end the sentence afterward too.

116:
So let's backtrack to halfway through FUDGE's generation, before it's generated the last couple of words, pretty clear.

117:
FUDGE is using its crystal ball poetry classifier, which is a combination of three classifiers, one for each of the three constraints.

118:
It would be perfectly grammatical to just directly say "clear". This works for the iambic meter constraint. But this is only the 8th syllable, so you'd still have to rhyme and end a new sentence in just two more syllables.

119:
Then we're probably back to angry Shakespeare.

120:
So FUDGE first generates pretty

121:
before finishing with clear and a period,

122:
and this shows how FUDGE is able to compose multiple attributes using multiple classifiers, while simultaneously planning for the future as I described previously.

123:
Finally, if we look at the experiments, FUDGE's performance holds up, with the success rate on simultaneously satisfying all three constraints being more than double that of the best prior method.

124:
So that wraps things up. The takeaways are that FUDGE is a simple, flexible method for controlled text generation.

125:
To reiterate our three main points from earlier: FUDGE learns a classifier in a Bayesian factorization to guide the generation,
it plans for the future rather than the present,
and it can easily and flexibly compose different constraints as needed while maintaining strong performance.

126:
And our code is all publicly available.

127:
Thanks for watching! And please check out our paper for the full details.
naacl-2021-fudge-controlled-generation/util.py
ADDED
@@ -0,0 +1,110 @@
import os
import time
import sys
from contextlib import contextmanager

import torch

from constants import *

@contextmanager
def suppress_stdout():
    with open(os.devnull, "w") as devnull:
        old_stdout = sys.stdout
        sys.stdout = devnull
        try:
            yield
        finally:
            sys.stdout = old_stdout


def save_checkpoint(state, save_path):
    os.makedirs(os.path.dirname(save_path), exist_ok=True)
    torch.save(state, save_path)


def freeze(module):
    for param in module.parameters():
        param.requires_grad = False


def num_params(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


def clamp(x, limit):
    return max(-limit, min(x, limit))


def pad_to_length(tensor, length, dim, value=0):
    """
    Pad tensor to given length in given dim using given value (value should be numeric)
    """
    assert tensor.size(dim) <= length
    if tensor.size(dim) < length:
        zeros_shape = list(tensor.shape)
        zeros_shape[dim] = length - tensor.size(dim)
        zeros_shape = tuple(zeros_shape)
        return torch.cat([tensor, torch.zeros(zeros_shape).type(tensor.type()).to(tensor.device).fill_(value)], dim=dim)
    else:
        return tensor


def pad_mask(lengths: torch.LongTensor) -> torch.ByteTensor:
    """
    Create a mask of seq x batch where seq = max(lengths), with 0 in padding locations and 1 otherwise.
    """
    # lengths: bs. Ex: [2, 3, 1]
    max_seqlen = torch.max(lengths)
    expanded_lengths = lengths.unsqueeze(0).repeat((max_seqlen, 1))  # [[2, 3, 1], [2, 3, 1], [2, 3, 1]]
    indices = torch.arange(max_seqlen).unsqueeze(1).repeat((1, lengths.size(0))).to(lengths.device)  # [[0, 0, 0], [1, 1, 1], [2, 2, 2]]

    return expanded_lengths > indices  # pad locations are 0. [[1, 1, 1], [1, 1, 0], [0, 1, 0]]. seqlen x bs


class ProgressMeter(object):
    """
    Display meter
    """
    def __init__(self, num_batches, meters, prefix=""):
        self.batch_fmtstr = self._get_batch_fmtstr(num_batches)
        self.meters = meters
        self.prefix = prefix

    def display(self, batch):
        entries = [self.prefix + self.batch_fmtstr.format(batch)]
        entries.append(time.ctime(time.time()))
        entries += [str(meter) for meter in self.meters]
        print('\t'.join(entries))

    def _get_batch_fmtstr(self, num_batches):
        num_digits = len(str(num_batches // 1))
        fmt = '{:' + str(num_digits) + 'd}'
        return '[' + fmt + '/' + fmt.format(num_batches) + ']'


class AverageMeter(object):
    """
    Display meter
    Computes and stores the average and current value
    """
    def __init__(self, name, fmt=':f'):
        self.name = name
        self.fmt = fmt
        self.reset()

    def reset(self):
        self.val = 0
        self.avg = 0
        self.sum = 0
        self.count = 0

    def update(self, val, n=1):
        self.val = val
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count

    def __str__(self):
        fmtstr = '{name} {val' + self.fmt + '} ({avg' + self.fmt + '})'
        return fmtstr.format(**self.__dict__)
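
(A quick hypothetical usage of the two padding helpers above, not taken from the repo itself:)

import torch
x = torch.tensor([[3, 7]])
print(pad_to_length(x, 5, dim=1, value=-1))  # tensor([[ 3,  7, -1, -1, -1]])
print(pad_mask(torch.LongTensor([2, 3, 1])))
# tensor([[ True,  True,  True],
#         [ True,  True, False],
#         [False,  True, False]])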