frankliu666 committed
Commit 6adea4d
Parent(s): 248e75a
Upload model
- README.md +48 -0
- adapter_config.json +25 -0
- adapter_model.bin +3 -0
README.md
ADDED
@@ -0,0 +1,48 @@
---
license: mit
language:
- en
---

# TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data

Paper: https://arxiv.org/abs/2401.13223

Code: https://github.com/fengbinzhu/TAT-LLM


## Introduction

We present TAT-LLM, a specialized language model for question answering (QA) over combined tabular and textual data. TAT-LLM is built with a step-wise pipeline that decomposes the task into three phases: Extraction, Reasoning, and Execution. The model is obtained by fine-tuning LLaMA 2 on a dataset generated automatically from expert-annotated resources. In our experiments, TAT-LLM outperforms both prior fine-tuned models and far larger language models such as GPT-4 on a suite of demanding QA benchmarks, including FinQA, TAT-QA, and TAT-DQA. These results set a new standard for task-specific language models and open a path toward optimizing smaller models for highly specialized functions.

| Model | Size | FinQA | TAT-QA | TAT-DQA |
| --- | --- | --- | --- | --- |
| GPT-3.5-Turbo | - | 58.00 | 59.47 | 52.74 |
| GPT-4 | - | 63.91 | 71.92 | 64.46 |
| TAT-LLM-7B | 7B | 65.13 | 76.49 | 71.38 |
| TAT-LLM-13B | 13B | 71.93 | 77.51 | 72.22 |
| TAT-LLM-70B | 70B | **76.81** | **81.42** | **76.55** |


## Training

We train TAT-LLM at three sizes (7B, 13B, and 70B) by fine-tuning LLaMA 2 with Low-Rank Adaptation (LoRA) on the combined training sets of FinQA, TAT-QA, and TAT-DQA. To further improve accuracy, we add an External Executor that processes the model's intermediate outputs to derive the final answer. Please refer to the [paper](https://arxiv.org/abs/2401.13223) for more details.
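As a rough illustration of the recipe above, here is a minimal LoRA setup sketch using Hugging Face `transformers` and `peft`. It is not the official training script; the hyperparameters simply mirror the `adapter_config.json` shipped in this commit, and the training loop itself is elided.

```python
# Minimal LoRA fine-tuning setup sketch (illustrative, not the official script).
# Hyperparameters mirror the adapter_config.json included in this commit.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = "meta-llama/Llama-2-13b-hf"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable
# ... train with your preferred trainer on the FinQA/TAT-QA/TAT-DQA data ...
```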
## Inference & Evaluation

Please refer to the code [here](https://github.com/fengbinzhu/TAT-LLM).
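For quick experimentation, the adapter can also be loaded on top of the base model with `peft` directly. The sketch below is illustrative: the repository id is a placeholder (substitute this model's actual Hub id), and prompts should follow the step-wise template from the official TAT-LLM code.

```python
# Minimal inference sketch (illustrative). REPO_ID is a placeholder,
# and the prompt format should follow the official TAT-LLM template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

REPO_ID = "<this-adapter-repo-id>"  # placeholder, not a real id
base_model = "meta-llama/Llama-2-13b-hf"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(model, REPO_ID)  # attach the LoRA adapter
model.eval()

inputs = tokenizer("<step-wise prompt here>", return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```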
## Citation

If you find this repository helpful, please consider citing our paper:

```
@misc{zhu2024tatllm,
  title={TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data},
  author={Fengbin Zhu and Ziyang Liu and Fuli Feng and Chao Wang and Moxin Li and Tat-Seng Chua},
  year={2024},
  eprint={2401.13223},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```
adapter_config.json
ADDED
@@ -0,0 +1,25 @@
{
  "alpha_pattern": {},
  "auto_mapping": null,
  "base_model_name_or_path": "meta-llama/Llama-2-13b-hf",
  "bias": "none",
  "fan_in_fan_out": false,
  "inference_mode": true,
  "init_lora_weights": true,
  "layers_pattern": null,
  "layers_to_transform": null,
  "lora_alpha": 16,
  "lora_dropout": 0.05,
  "modules_to_save": null,
  "peft_type": "LORA",
  "r": 16,
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
    "q_proj",
    "k_proj",
    "v_proj",
    "o_proj"
  ],
  "task_type": "CAUSAL_LM"
}
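As a side note, `peft` reads this file automatically when the adapter is loaded; it can also be inspected on its own. A small sketch, with a placeholder path:

```python
# Sketch: inspect the adapter configuration with peft (path is a placeholder).
from peft import PeftConfig

config = PeftConfig.from_pretrained("<path-or-repo-of-this-adapter>")
print(config.base_model_name_or_path)  # meta-llama/Llama-2-13b-hf
print(config.peft_type)                # PeftType.LORA
```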
adapter_model.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3c9a1bb3c6eda19f0e0a388721ad3f6aa638852e6aeccde50a66b3c65346ca9e
size 52540109
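This is a Git LFS pointer rather than the weights themselves; the roughly 52 MB `adapter_model.bin` is stored in LFS and resolved on checkout. When fetching programmatically, the Hub serves the resolved file; a sketch with a placeholder repo id:

```python
# Sketch: download the resolved adapter weights (repo id is a placeholder).
from huggingface_hub import hf_hub_download

path = hf_hub_download(repo_id="<this-adapter-repo-id>", filename="adapter_model.bin")
print(path)  # local path to the 52,540,109-byte binary
```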