import streamlit as st
from transformers import AutoTokenizer, AutoModelForQuestionAnswering, pipeline

st.title('Question-Answering NLU')
st.sidebar.title('Navigation')
menu = st.sidebar.radio("", options=["Demo", "Parsing NLU data into SQuAD 2.0", "Training",
                                     "Evaluation"], index=0)

if menu == "Demo":
    st.markdown('''
Question Answering NLU (QANLU) is an approach that maps the NLU task into question answering,
leveraging pre-trained question-answering models to perform well in few-shot settings. Instead of
training an intent classifier or a slot tagger, for example, we can ask the model intent- and
slot-related questions in natural language:
```
Context : I'm looking for a cheap flight to Boston.
Question: Is the user looking to book a flight?
Answer  : Yes
Question: Is the user asking about departure time?
Answer  : No
Question: What price is the user looking for?
Answer  : cheap
Question: Where is the user flying from?
Answer  : (empty)
```
Thus, by asking questions for each intent and slot in natural language, we can effectively construct an NLU hypothesis. For more details,
please read the paper:
[Language model is all you need: Natural language understanding as question answering](https://assets.amazon.science/33/ea/800419b24a09876601d8ab99bfb9/language-model-is-all-you-need-natural-language-understanding-as-question-answering.pdf).

In this Space, we will see how to transform an example
NLU dataset (e.g. utterances and intent / slot annotations) into [SQuAD 2.0 format](https://rajpurkar.github.io/SQuAD-explorer/explore/v2.0/dev/)
question-answering data that can be used by QANLU.
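
Concretely, one way to ask several intent and slot questions with the `transformers`
question-answering pipeline looks like this (a minimal sketch; the questions and their labels
below are only examples, not a fixed schema):

```
# Sketch: build an NLU hypothesis by asking one question per intent / slot.
from transformers import pipeline

qa = pipeline('question-answering', model='AmazonScience/qanlu')

# QANLU contexts are prefixed with "Yes. No. " so yes/no questions have answer spans.
context = "Yes. No. I'm looking for a cheap flight to Boston."
questions = {
    'intent: flight': 'Is the user looking to book a flight?',
    'slot: price': 'What price is the user looking for?',
    'slot: origin': 'Where is the user flying from?',
}

for name, question in questions.items():
    # handle_impossible_answer=True lets the model return an empty answer for absent slots.
    result = qa(question=question, context=context, handle_impossible_answer=True)
    print(name, '->', repr(result['answer']), round(result['score'], 3))
```
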
### Demo
Feel free to query the pre-trained QA-NLU model using the buttons below.
*Please note that this model has been trained on ATIS and may need to be further fine-tuned to support intents and slots that are not covered in ATIS.*
    ''')
    # Load the pre-trained QA-NLU model and tokenizer from the Hugging Face Hub.
    tokenizer = AutoTokenizer.from_pretrained("AmazonScience/qanlu")
    model = AutoModelForQuestionAnswering.from_pretrained("AmazonScience/qanlu")
    qa_pipeline = pipeline('question-answering', model=model, tokenizer=tokenizer)

    context = st.text_input(
        'Please enter the context (remember to include "Yes. No. " in the beginning):',
        value="Yes. No. I want a cheap flight to Boston."
    )
    question = st.text_input(
        'Please enter the intent question:',
        value="Are they looking for a flight?"
    )
    qa_input = {
        'context': context,
        'question': question
    }

    if st.button('Ask QANLU'):
        answer = qa_pipeline(qa_input)
        st.write(answer)
elif menu == "Parsing NLU data into SQuAD 2.0": | |
st.header('QA-NLU Data Parsing') | |
st.markdown(''' | |
Here, we show a small example of how NLU data can be transformed into QANLU data. | |
The same method can be used to transform [MATIS++](https://github.com/amazon-research/multiatis) | |
NLU data (e.g. utterances and intent / slot annotations) into [SQuAD 2.0 format](https://rajpurkar.github.io/SQuAD-explorer/explore/v2.0/dev/) | |
question-answering data that can be used by QANLU. | |
Here is an example dataset with three intents and two examples per intent: | |
````
restaurant, I am looking for some Vietnamese food
restaurant, What is there to eat around here?
music, Play my workout playlist
music, Can you find Bob Dylan songs?
flight, Show me flights from Oakland to Dallas
flight, I want two economy tickets from Miami to Chicago
````
Now we need to define some questions per intent. We can use free-form questions or templates.
````
{
    'restaurant': [
        'Did they ask for a restaurant?',
        'Did they mention a restaurant?'
    ],
    'music': [
        'Did they ask for music?',
        'Do they want to play music?'
    ],
    'flight': [
        'Did they ask for a flight?',
        'Do they want to book a flight?'
    ]
}
````
The next step is to run the `atis.py` script from the [QA-NLU Amazon Research repository](https://github.com/amazon-research/question-answering-nlu).
That script will produce a JSON file that looks like this:
````
{
  "version": 1.0,
  "data": [
    {
      "title": "MultiATIS++",
      "paragraphs": [
        {
          "context": "yes. no. i am looking for some vietnamese food",
          "qas": [
            {
              "question": "did they ask for a restaurant?",
              "id": "49f1180cb9ce4178a8a90f76c21f69b4",
              "is_impossible": false,
              "answers": [
                {
                  "text": "yes",
                  "answer_start": 0
                }
              ],
              "slot": "",
              "intent": "restaurant"
            },
            {
              "question": "did they ask for music?",
              "id": "a7ffe039fb3e4843ae16d5a68194f45e",
              "is_impossible": false,
              "answers": [
                {
                  "text": "no",
                  "answer_start": 5
                }
              ],
              "slot": "",
              "intent": "restaurant"
            },
            ... <More questions>
            ... <More paragraphs>
````
There are many tunable parameters when generating the above file, such as how many negative examples to include per question. The same process is followed to create the data for training a slot-tagging model.
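
As a rough sketch of what that conversion involves (this is not the actual `atis.py` script, and the
toy data below is only for illustration), each utterance becomes a SQuAD-style paragraph whose
context is prefixed with "yes. no.", and every intent question gets a "yes" or "no" answer span:

````
import json
import uuid

# Toy conversion sketch: one SQuAD-style paragraph per utterance,
# with a yes/no question per intent drawn from the question templates above.
utterances = [('restaurant', 'I am looking for some Vietnamese food')]
intent_questions = {
    'restaurant': ['Did they ask for a restaurant?'],
    'music': ['Did they ask for music?'],
}

paragraphs = []
for intent, text in utterances:
    # The "yes. no. " prefix provides the answer spans for yes/no intent questions.
    context = 'yes. no. ' + text.lower()
    qas = []
    for target_intent, questions in intent_questions.items():
        for question in questions:
            answer = 'yes' if target_intent == intent else 'no'
            qas.append({
                'question': question.lower(),
                'id': uuid.uuid4().hex,
                'is_impossible': False,
                'answers': [{'text': answer, 'answer_start': context.index(answer)}],
                'slot': '',
                'intent': intent,
            })
    paragraphs.append({'context': context, 'qas': qas})

squad_data = {'version': 1.0, 'data': [{'title': 'MultiATIS++', 'paragraphs': paragraphs}]}
print(json.dumps(squad_data, indent=2))
````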
    ''')
elif menu == "Training": | |
st.header('QA-NLU Training') | |
st.markdown(''' | |
To train a QA-NLU model on the data we created, we use the `run_squad.py` script from [huggingface](https://github.com/huggingface/transformers/blob/master/examples/legacy/question-answering/run_squad.py) and a SQuAD-trained QA model as our base. As an example, we can use `deepset/roberta-base-squad2` model from [here](https://huggingface.co/deepset/roberta-base-squad2) (assuming 8 GPUs are present): | |
''') | |
    st.code('''
mkdir models

python -m torch.distributed.launch --nproc_per_node=8 run_squad.py \\
    --model_type roberta \\
    --model_name_or_path deepset/roberta-base-squad2 \\
    --do_train \\
    --do_eval \\
    --do_lower_case \\
    --train_file data/matis_en_train_squad.json \\
    --predict_file data/matis_en_test_squad.json \\
    --learning_rate 3e-5 \\
    --num_train_epochs 2 \\
    --max_seq_length 384 \\
    --doc_stride 64 \\
    --output_dir models/qanlu/ \\
    --per_gpu_train_batch_size 8 \\
    --overwrite_output_dir \\
    --version_2_with_negative \\
    --save_steps 100000 \\
    --gradient_accumulation_steps 8 \\
    --seed $RANDOM
    ''')
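
    st.markdown('''
Once training finishes, the fine-tuned model can be loaded directly from the output directory with
the `transformers` pipeline. The snippet below is a minimal sketch; the paths simply follow the
command above, so adjust them to your own setup:
    ''')
    st.code('''
from transformers import pipeline

# Load the fine-tuned QA-NLU model from the training output directory.
qa_pipeline = pipeline('question-answering', model='models/qanlu/', tokenizer='models/qanlu/')

# Remember to prefix the context with "Yes. No. " so intent questions have answer spans.
print(qa_pipeline(
    question='Are they looking for a flight?',
    context='Yes. No. I want a cheap flight to Boston.'
))
    ''')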
elif menu == "Evaluation": | |
st.header('QA-NLU Evaluation') | |
st.markdown(''' | |
To assess the performance of the trained model, we can use the `calculate_pr.py` script from the [QA-NLU Amazon Research repository](https://github.com/amazon-research/question-answering-nlu). | |
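
As a rough illustration of what such an evaluation computes (this is not the `calculate_pr.py`
script itself, and the flat prediction format below is only assumed for the example), intent-level
precision and recall can be derived from predicted versus gold intents:

```
# Toy sketch: precision and recall for a single intent from flat prediction / gold lists.
def precision_recall(predictions, gold, intent):
    tp = sum(1 for p, g in zip(predictions, gold) if p == intent and g == intent)
    fp = sum(1 for p, g in zip(predictions, gold) if p == intent and g != intent)
    fn = sum(1 for p, g in zip(predictions, gold) if p != intent and g == intent)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

predictions = ['flight', 'restaurant', 'flight']
gold = ['flight', 'flight', 'flight']
print(precision_recall(predictions, gold, 'flight'))  # (1.0, 0.6666666666666666)
```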
Feel free to query the pre-trained QA-NLU model in the Demo section.
    ''')