tscholak/2e826ioa

Fine-tuned weights for PICARD - Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models based on T5-3B.

Training Data

The model has been fine-tuned on the 2,164 training dialogues in the CoSQL SQL-grounded dialogue state tracking dataset and the 7,000 training examples in the Spider text-to-SQL dataset. The model solves both CoSQL's zero-shot text-to-SQL dialogue state tracking task and Spider's zero-shot text-to-SQL translation task. Zero-shot here means that the model can generalize to SQL databases that were not seen during training.

Training Objective

This model was initialized with T5-3B and fine-tuned with the text-to-text generation objective.

A question is always grounded in both a database schema and the preceding questions in the dialogue. The model is trained to predict the SQL query that would be used to answer the user's current natural language question. The input to the model is composed of the user's current question, the database identifier, a list of tables and their columns, and the sequence of previous questions in reverse chronological order.

[current question] | [db_id] | [table] : [column] ( [content] , [content] ) , [column] ( ... ) , [...] | [table] : ... | ... || [previous question] | ... | [first question]

The sequence of previous questions is separated by || from the linearized schema. In the absence of previous questions (for example, for the first question in a dialogue or for Spider questions), this separator is omitted.
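As a rough illustration of the input format described above, the following Python sketch builds the input string from a question, a schema, and an optional dialogue history. The function name `serialize_input`, its parameters, and the choice to write columns without matched content as a bare column name are assumptions made for this example; the official repository contains the serialization actually used in training, and details such as casing or whitespace may differ.

```python
from typing import Dict, List, Optional, Sequence


def serialize_input(
    question: str,
    db_id: str,
    schema: Dict[str, List[str]],
    contents: Optional[Dict[str, Dict[str, List[str]]]] = None,
    previous_questions: Sequence[str] = (),
) -> str:
    """Hypothetical helper that linearizes one model input.

    schema maps table name -> list of column names.
    contents maps table name -> column name -> matched database values.
    previous_questions is given in chronological order (oldest first).
    """
    contents = contents or {}
    tables = []
    for table, columns in schema.items():
        cols = []
        for column in columns:
            values = contents.get(table, {}).get(column, [])
            if values:
                # [column] ( [content] , [content] )
                cols.append(f"{column} ( {' , '.join(values)} )")
            else:
                # Assumption: columns without content are written bare.
                cols.append(column)
        tables.append(f"{table} : {' , '.join(cols)}")

    text = " | ".join([question, db_id] + tables)

    # The "||" separator is only added when there is a dialogue history,
    # and the history is emitted newest question first.
    if previous_questions:
        text += " || " + " | ".join(reversed(previous_questions))
    return text
```

For a Spider-style question without history, the result contains no `||` separator; for a CoSQL turn, the earlier questions follow the separator with the most recent one first.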

The model outputs the database identifier followed by the SQL query that will be executed on the database to answer the user's current question in the dialogue.

[db_id] | [sql]
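A prediction in this format can be split back into its two parts. The sketch below is only an illustration: `parse_output` is a hypothetical helper, not part of the official code. It splits on the first `|` only, so that a SQL query that itself contains `|` characters (for example the `||` concatenation operator) is kept intact.

```python
from typing import Tuple


def parse_output(prediction: str) -> Tuple[str, str]:
    """Hypothetical helper that splits '[db_id] | [sql]' into (db_id, sql)."""
    # partition splits on the first occurrence of the separator only.
    db_id, sep, sql = prediction.partition("|")
    if not sep:
        raise ValueError(f"expected '[db_id] | [sql]', got: {prediction!r}")
    return db_id.strip(), sql.strip()
```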

Performance

Out of the box, this model achieves 53.8 % question match accuracy and 21.8 % interaction match accuracy on the CoSQL development set. On the CoSQL test set, the model achieves 51.4 % question match accuracy and 21.7 % interaction match accuracy.

Using the PICARD constrained decoding method (see the official PICARD implementation), the model's performance can be improved to 56.9 % question match accuracy and 24.2 % interaction match accuracy on the CoSQL development set. On the CoSQL test set and with PICARD, the model achieves 54.6 % question match accuracy and 23.7 % interaction match accuracy.
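The two metrics differ in granularity: question match accuracy is computed over individual questions, whereas interaction match accuracy counts a dialogue as correct only if every question in it is predicted correctly, which is why it is much lower. The sketch below shows the aggregation under the assumption that per-question correctness has already been determined by the official evaluation script; the function name `match_accuracies` and the input layout are made up for this example.

```python
from typing import List, Tuple


def match_accuracies(interactions: List[List[bool]]) -> Tuple[float, float]:
    """Hypothetical aggregation of per-question results.

    interactions holds one list per dialogue, with one boolean per question
    that says whether the predicted SQL matched the reference.
    Assumes at least one dialogue and one question.
    """
    n_questions = sum(len(turns) for turns in interactions)
    # Fraction of all questions that were predicted correctly.
    question_match = sum(sum(turns) for turns in interactions) / n_questions
    # Fraction of dialogues in which every question was predicted correctly.
    interaction_match = sum(all(turns) for turns in interactions) / len(interactions)
    return question_match, interaction_match
```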

Usage

Please see the official repository for scripts and docker images that support evaluation and serving of this model.

References

Citation

@inproceedings{Scholak2021:PICARD,
  author = {Torsten Scholak and Nathan Schucher and Dzmitry Bahdanau},
  title = "{PICARD}: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models",
  booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
  month = nov,
  year = "2021",
  publisher = "Association for Computational Linguistics",
  url = "https://aclanthology.org/2021.emnlp-main.779",
  pages = "9895--9901",
}
