These models were used in the context of the master's thesis "ADDI: AI-driven data interoperability using zero-shot learners". The thesis was written in 2021 in collaboration with EPFL and the University of Oulu.
A summary of the thesis is available in the following YouTube videos:
English: https://youtu.be/1YDzvx9Gmg4
French: https://youtu.be/4BFLa_8Xo_w
German: https://youtu.be/FuDX8KcS9TM
Abstract:
In 2017, a new network architecture called the Transformer was introduced; it leverages a self-attention mechanism to achieve state-of-the-art results on sequence-to-sequence tasks in Natural Language Processing (NLP). In this thesis, three deep learning language models based on the Transformer architecture (GPT-2, RoBERTa, and XLM-R) were fine-tuned to extract information, such as name and date of birth, from ID cards. The models were trained on Swiss ID cards and are shown to perform well on out-of-sample data such as German or even Finnish ID cards. The models can adapt to unseen datasets through unsupervised fine-tuning. These unseen datasets not only contain different label names and value representations, but can also contain novel labels with no equivalent seen during training. The best model extracts information with an average accuracy of 97% on out-of-sample datasets after unsupervised fine-tuning on 200 examples; zero-shot accuracy is roughly 50%. In addition, the models are resistant to spelling mistakes, and fine-tuning on new datasets does not result in catastrophic forgetting.
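
Example usage:
The following is a minimal sketch, not the thesis code: it assumes the extraction task is framed as extractive question answering over OCR text from an ID card, using the Hugging Face transformers library. The checkpoint deepset/roberta-base-squad2 and the sample card text are placeholders for illustration only; substitute the actual fine-tuned weights.

```python
from transformers import pipeline

# Hypothetical checkpoint standing in for the fine-tuned thesis models;
# replace with your own model path or hub identifier.
extractor = pipeline("question-answering", model="deepset/roberta-base-squad2")

# Invented OCR output from an ID card, flattened to plain text.
card_text = (
    "IDENTITY CARD  Surname: Muster  Given name: Anna  "
    "Date of birth: 01.04.1990  Nationality: CHE"
)

# Query each field of interest; novel label names can be queried the same way,
# which is how unseen datasets with new labels could be handled.
for field in ["surname", "given name", "date of birth"]:
    answer = extractor(question=f"What is the {field}?", context=card_text)
    print(field, "->", answer["answer"])
```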