DElIteraTeR-RoBERTa-Intent-Span-Detector

This model was obtained by fine-tuning roberta-large on the IteraTeR+ multi_sent dataset. Given a source sentence wrapped in `<bos>` and `<eos>` tokens, it performs token classification, labeling each token with a predicted edit intent (see the intent classes in the usage example below).

Paper: Improving Iterative Text Revision by Learning Where to Edit from Other Revision Tasks
Authors: Zae Myung Kim, Wanyu Du, Vipul Raheja, Dhruv Kumar, and Dongyeop Kang

Usage

import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("zaemyung/DElIteraTeR-RoBERTa-Intent-Span-Detector")

# update tokenizer with special tokens
INTENT_CLASSES = ['none', 'clarity', 'fluency', 'coherence', 'style', 'meaning-changed']  # `meaning-changed` is not used
INTENT_OPENED_TAGS = [f'<{intent_class}>' for intent_class in INTENT_CLASSES]
INTENT_CLOSED_TAGS = [f'</{intent_class}>' for intent_class in INTENT_CLASSES]
INTENT_TAGS = set(INTENT_OPENED_TAGS + INTENT_CLOSED_TAGS)
special_tokens_dict = {'additional_special_tokens': ['<bos>', '<eos>'] + list(INTENT_TAGS)}
tokenizer.add_special_tokens(special_tokens_dict)

model = AutoModelForTokenClassification.from_pretrained("zaemyung/DElIteraTeR-RoBERTa-Intent-Span-Detector")

# mapping from class indices to intent labels
id2label = {0: "none", 1: "clarity", 2: "fluency", 3: "coherence", 4: "style", 5: "meaning-changed"}

# wrap the input sentence with the special <bos> and <eos> tokens
before_text = '<bos>I likes coffee?<eos>'
model_input = tokenizer(before_text, return_tensors='pt')
with torch.no_grad():  # inference only; no gradients needed
    model_output = model(**model_input)
softmax_scores = torch.softmax(model_output.logits, dim=-1)
pred_ids = torch.argmax(softmax_scores, dim=-1)[0].tolist()
pred_intents = [id2label[_id] for _id in pred_ids]

# align each token with its predicted intent
tokens = tokenizer.convert_ids_to_tokens(model_input['input_ids'][0])

for token, pred_intent in zip(tokens, pred_intents):
    print(f"{token}: {pred_intent}")

"""
<s>: none
<bos>: none
I: fluency
Ġlikes: fluency
Ġcoffee: none
?: none
<eos>: none
</s>: none
"""