---
license: apache-2.0
language:
- en
---

## Empowering Character-level Text Infilling by Eliminating Sub-Tokens

<p align="center">
<a href="https://arxiv.org/abs/2405.17103">Paper</a> •
<a href="https://github.com/SenseLLM/FIM-SE">Repo</a> •
<a href="https://huggingface.co/SenseLLM/FIM-SE-CL-13B">🤗 Models</a>
</p>

## Introduction

FIM-SE stands for Fill-In-the-Middle with both Starting and Ending character constraints. The method handles character-level infilling by casting the task in a line-level format, so the model never has to predict a sub-token at inference time.

![](method.png)

<hr>

## Models

| Model | Checkpoint | Size | License |
|:------|:-----------|:-----|:--------|
| FIM-SE-CL-7B | 🤗 [HF Link](https://huggingface.co/SenseLLM/FIM-SE-CL-7B) | 7B | [Llama2](https://ai.meta.com/llama/license/) |
| FIM-SE-CL-13B | 🤗 [HF Link](https://huggingface.co/SenseLLM/FIM-SE-CL-13B) | 13B | [Llama2](https://ai.meta.com/llama/license/) |
| FIM-SE-SC-1B | 🤗 [HF Link](https://huggingface.co/SenseLLM/FIM-SE-SC-1B) | 1B | [StarCoder](https://github.com/bigcode-project/starcoder/blob/main/LICENSE) |
| FIM-SE-SC-15B | 🤗 [HF Link](https://huggingface.co/SenseLLM/FIM-SE-SC-15B) | 15B | [StarCoder](https://github.com/bigcode-project/starcoder/blob/main/LICENSE) |

## How to Use

#### Prompt Format

As shown in the figure above, the prompt is organized as:

```text
<PRE>R-Prefix<SUF>R-Suffix<START>L-Prefix<END>F-Suffix<MID>
```
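
To make the format concrete, here is a minimal sketch that builds such a prompt from a character-level prefix and suffix. It assumes that L-Prefix is the incomplete last line of the prefix, F-Suffix is the incomplete first line of the suffix, and R-Prefix / R-Suffix are whatever remains on each side. The helper names `split_context` and `build_prompt` are hypothetical, and which part keeps the newline is also an assumption; the scripts in the GitHub repo remain the reference.

```python
from typing import Tuple


def split_context(prefix: str, suffix: str) -> Tuple[str, str, str, str]:
    """Split a character-level prefix/suffix at line boundaries.

    Returns (r_prefix, l_prefix, f_suffix, r_suffix).
    Assumption: the newline before L-Prefix stays in R-Prefix, and the
    newline after F-Suffix stays in R-Suffix.
    """
    # Everything up to and including the last newline goes to R-Prefix.
    cut = prefix.rfind("\n") + 1
    r_prefix, l_prefix = prefix[:cut], prefix[cut:]

    # Everything from the first newline onward goes to R-Suffix.
    nl = suffix.find("\n")
    if nl == -1:
        f_suffix, r_suffix = suffix, ""
    else:
        f_suffix, r_suffix = suffix[:nl], suffix[nl:]
    return r_prefix, l_prefix, f_suffix, r_suffix


def build_prompt(prefix: str, suffix: str) -> str:
    """Arrange the four pieces in the order given by the prompt format."""
    r_prefix, l_prefix, f_suffix, r_suffix = split_context(prefix, suffix)
    return f"<PRE>{r_prefix}<SUF>{r_suffix}<START>{l_prefix}<END>{f_suffix}<MID>"


if __name__ == "__main__":
    # The cursor sits in the middle of the word "return".
    prompt = build_prompt("def add(a, b):\n    retu", "a + b\nprint(add(1, 2))\n")
    print(repr(prompt))
```

In this example the incomplete line pieces `    retu` and `a + b` end up after `<START>` and `<END>`, while the surrounding complete lines stay after `<PRE>` and `<SUF>`.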

#### Inference Code

Please refer to our [GitHub Repo](https://github.com/SenseLLM/FIM-SE) for the inference code and more technical details.
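
As a rough illustration of the post-processing step, the sketch below recovers the character-level middle from a line-level completion. It assumes that the generated text starts with L-Prefix and ends with F-Suffix, as the starting and ending constraints suggest; the function name `extract_middle` is hypothetical and is not taken from the repo.

```python
def extract_middle(generated: str, l_prefix: str, f_suffix: str) -> str:
    """Strip L-Prefix and F-Suffix from a line-level completion.

    Assumption: the model output is the complete line(s), i.e.
    L-Prefix + middle + F-Suffix.
    """
    if not generated.startswith(l_prefix):
        raise ValueError("completion does not start with L-Prefix")
    if not generated.endswith(f_suffix):
        raise ValueError("completion does not end with F-Suffix")

    end = len(generated) - len(f_suffix)
    if end < len(l_prefix):
        # The two constraints would have to share characters.
        raise ValueError("L-Prefix and F-Suffix overlap in the completion")
    return generated[len(l_prefix):end]


if __name__ == "__main__":
    # The completed line is "    return a + b"; only "rn " was missing.
    print(repr(extract_middle("    return a + b", "    retu", "a + b")))
```

Raising an error when a constraint is violated, rather than silently returning the raw text, makes it easier to spot completions that ignored the starting or ending characters.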

## Citation

If you find this repo useful for your research, please kindly cite our paper:

```
@misc{ren2024empowering,
  title={Empowering Character-level Text Infilling by Eliminating Sub-Tokens},
  author={Houxing Ren and Mingjie Zhan and Zhongyuan Wu and Hongsheng Li},
  year={2024},
  eprint={2405.17103},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```

## Acknowledgments

We thank the following amazing projects that truly inspired us:

- [FIM](https://arxiv.org/abs/2207.14255)