---
license: apache-2.0
language:
- en
---

## Empowering Character-level Text Infilling by Eliminating Sub-Tokens

## Introduction

FIM-SE stands for Fill-In-the-Middle with both Starting and Ending character constraints. The method addresses character-level infilling by using a line-level format, so the model never has to predict a sub-token at inference time.

![](method.png)

The format is built from four segments: R-Prefix, R-Suffix, L-Prefix, and F-Suffix.

#### Inference Code

Please refer to our [GitHub Repo](https://github.com/SenseLLM/FIM-SE) for more technical details.

## Citation

If you find this repo useful for your research, please kindly cite our paper:

```
@misc{ren2024empowering,
      title={Empowering Character-level Text Infilling by Eliminating Sub-Tokens},
      author={Houxing Ren and Mingjie Zhan and Zhongyuan Wu and Hongsheng Li},
      year={2024},
      eprint={2405.17103},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```

## Acknowledgments

We thank the following amazing projects that truly inspired us:

- [FIM](https://arxiv.org/abs/2207.14255)
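
## Example: Line-Level Splitting

The Introduction describes turning a character-level infilling request into a line-level format. The sketch below illustrates one way that split could look. It assumes, based on the segment names only, that L-Prefix is the incomplete last line of the prefix, F-Suffix is the incomplete first line of the suffix, and R-Prefix and R-Suffix are whatever remains on each side. The function `split_line_level` and the `Segments` type are hypothetical names introduced for illustration. They are not part of the FIM-SE code base, and the special tokens used to assemble the final prompt are not shown here; see the GitHub repo for the actual implementation.

```python
from typing import NamedTuple


class Segments(NamedTuple):
    """Hypothetical container for the four line-level segments."""
    r_prefix: str  # complete lines before the cursor line (assumed meaning)
    l_prefix: str  # partial last line of the prefix (assumed meaning)
    f_suffix: str  # partial first line of the suffix (assumed meaning)
    r_suffix: str  # everything after the first suffix line (assumed meaning)


def split_line_level(prefix: str, suffix: str) -> Segments:
    """Split a character-level (prefix, suffix) pair at line boundaries."""
    # Cut the prefix right after its last newline; 0 if there is no newline.
    p_cut = prefix.rfind("\n") + 1
    # Cut the suffix right before its first newline; end of string if none.
    nl = suffix.find("\n")
    s_cut = len(suffix) if nl == -1 else nl
    return Segments(
        r_prefix=prefix[:p_cut],
        l_prefix=prefix[p_cut:],
        f_suffix=suffix[:s_cut],
        r_suffix=suffix[s_cut:],
    )


if __name__ == "__main__":
    # The cursor sits in the middle of the word "return".
    seg = split_line_level(
        "def add(a, b):\n    ret",
        "urn a + b\nprint(add(1, 2))\n",
    )
    print(seg)
```

Under these assumptions, the model would only ever be asked to produce whole lines: the partial fragments `l_prefix` and `f_suffix` act as the starting and ending character constraints on the generated middle, while `r_prefix` and `r_suffix` end and begin on line boundaries.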