---
metrics:
- mse
- r_squared
- mae
datasets:
- google_wellformed_query
model-index:
- name: distilroberta-query-wellformedness
  results:
  - task:
      type: text-classification
      name: Text Classification
    metrics:
    - type: loss
      value: 0.06214376166462898
    - type: mse
      value: 0.06214376166462898
      name: Validation Mean Squared Error
    - type: r2
      value: 0.5705611109733582
      name: Validation R-Squared
    - type: mae
      value: 0.1838676631450653
      name: Validation Mean Absolute Error
language:
- en
---
# DistilRoBERTa-query-wellformedness

This model fine-tunes the DistilRoBERTa base architecture for a regression task on Google's query-wellformedness dataset, which comprises 25,100 queries drawn from the Paralex corpus. Each query was annotated by five raters, who each provided a continuous rating of how well-formed the query is.
## Model description

The model evaluates a query for completeness and grammatical correctness, producing a score between 0 and 1, where 1 indicates a complete, grammatically correct query.
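Downstream applications often need a discrete decision rather than a raw score. A minimal sketch of such post-processing, assuming an illustrative cutoff of 0.5 (the threshold is not part of this model card and should be tuned on validation data):

```python
def label_from_score(score: float, threshold: float = 0.5) -> str:
    """Map the model's continuous well-formedness score (0..1) to a label.

    The 0.5 threshold is an illustrative assumption, not a documented default.
    """
    return "well-formed" if score >= threshold else "malformed"

print(label_from_score(0.93))  # well-formed
print(label_from_score(0.08))  # malformed
```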
## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned model and tokenizer. This assumes the checkpoint is a
# sequence-classification head with a single regression output; adjust the
# repository id to wherever the checkpoint is hosted.
model_name = "distilroberta-query-wellformedness"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Sentences
sentences = [
    "The cat and dog in the yard.",                     # Incorrect - it should be "The cat and dog are in the yard."
    "she don't like apples.",                           # Incorrect - it should be "She doesn't like apples."
    "Is rain sunny days sometimes?",                    # Incorrect - it should be "Do sunny days sometimes have rain?"
    "She enjoys reading books and playing chess.",      # Correct
    "How many planets are there in our solar system?",  # Correct
]

# Tokenizing the sentences
inputs = tokenizer(sentences, truncation=True, padding=True, return_tensors='pt')

# Getting the model's predictions
model.eval()  # Setting the model to evaluation mode
with torch.no_grad():  # Disabling gradient calculation as we are only doing inference
    outputs = model(
        input_ids=inputs['input_ids'],
        attention_mask=inputs['attention_mask'],
    )

# The logits form a (batch, 1) tensor; squeeze it into a list of Python floats
predicted_ratings = outputs.logits.squeeze().tolist()

# Printing the predicted ratings
for sentence, rating in zip(sentences, predicted_ratings):
    print(f'Sentence: {sentence}')
    print(f'Predicted Rating: {rating}\n')
```
Output:

```text
Sentence: The cat and dog in the yard.
Predicted Rating: 0.3482873737812042

Sentence: she don't like apples.
Predicted Rating: 0.07787154614925385

Sentence: Is rain sunny days sometimes?
Predicted Rating: 0.19854165613651276

Sentence: She enjoys reading books and playing chess.
Predicted Rating: 0.9327691793441772

Sentence: How many planets are there in our solar system?
Predicted Rating: 0.9746372103691101
```
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 450
- num_epochs: 5
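The linear schedule with 450 warmup steps ramps the learning rate up to its 2e-05 peak and then decays it linearly to zero. A hand-rolled sketch of that shape (`total_steps` is illustrative, since the actual step count is not stated in this card):

```python
def lr_at_step(step: int, peak_lr: float = 2e-5,
               warmup_steps: int = 450, total_steps: int = 5000) -> float:
    """Linear warmup to peak_lr over warmup_steps, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))
```

This mirrors the behavior of the linear scheduler in Transformers; in practice one would use the library's built-in scheduler rather than this sketch.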
### Training results

Metrics: Mean Squared Error, R-Squared, Mean Absolute Error

```python
{'test_loss': 0.06214376166462898,
 'test_mse': 0.06214376166462898,
 'test_r2': 0.5705611109733582,
 'test_mae': 0.1838676631450653}
```
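For reference, these metrics follow the standard regression formulas and can be computed directly from predictions and gold ratings. A self-contained sketch (toy values below, not the actual validation set):

```python
def regression_metrics(y_true, y_pred):
    """Compute MSE, R-squared, and MAE for a regression task."""
    n = len(y_true)
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot  # fraction of variance explained
    return mse, r2, mae

# Toy example: near-perfect predictions yield a high R-squared
mse, r2, mae = regression_metrics([0.2, 0.8, 0.9], [0.25, 0.75, 0.9])
```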
## Framework versions
- Transformers 4.34.1
- PyTorch Lightning 2.1.0
- Tokenizers 0.14.1
If you want to support me, you can here.