---
title: README
emoji: 🥷
colorFrom: blue
colorTo: pink
sdk: static
pinned: false
---
<div align="center">

# 🥷 Mission: Impossible Language Models 🥷

<img src="https://cdn-uploads.huggingface.co/production/uploads/6268bc06adb1c6525b3d5157/GfEHK3X6bD_5u4etaJY8c.png" alt="drawing" width="400"/>

</div>

This page hosts the models trained and used in the paper "[Mission: Impossible Language Models](https://arxiv.org/abs/2401.06416)" (Kallini et al., 2024).
If you use our code or models, please cite our ACL paper:

```bibtex
@inproceedings{kallini-etal-2024-mission,
    title = "Mission: Impossible Language Models",
    author = "Kallini, Julie  and
      Papadimitriou, Isabel  and
      Futrell, Richard  and
      Mahowald, Kyle  and
      Potts, Christopher",
    editor = "Ku, Lun-Wei  and
      Martins, Andre  and
      Srikumar, Vivek",
    booktitle = "Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = aug,
    year = "2024",
    address = "Bangkok, Thailand",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.acl-long.787",
    doi = "10.18653/v1/2024.acl-long.787",
    pages = "14691--14714",
}
```
## Impossible Languages

Our paper includes 15 impossible languages, grouped into three language classes:
1. **\*Shuffle languages** involve different shuffles of tokenized English sentences.
2. **\*Reverse languages** involve reversals of all or part of input sentences.
3. **\*Hop languages** perturb verb inflection with counting rules.

A simplified sketch of each perturbation class appears after the figure below.
![languages.png](https://cdn-uploads.huggingface.co/production/uploads/6268bc06adb1c6525b3d5157/pBt38YYQL1gj8DqjyorWS.png)
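To make the three classes concrete, here is an illustrative Python sketch of one possible perturbation per class. These are simplified stand-ins, *not* the exact definitions from the paper (which specifies, for example, deterministic shuffle variants and particular hop rules); the marker token and hop distance below are assumptions.

```python
import random


def shuffle_language(tokens: list[str], seed: int = 0) -> list[str]:
    # *Shuffle: permute the tokens of a sentence. (Simplified; the paper
    # defines several shuffle variants, including deterministic ones.)
    rng = random.Random(seed)
    shuffled = tokens[:]
    rng.shuffle(shuffled)
    return shuffled


def reverse_language(tokens: list[str]) -> list[str]:
    # *Reverse: reverse the sentence. (Some variants in the paper reverse
    # only part of the input.)
    return tokens[::-1]


def hop_language(tokens: list[str], verb_index: int,
                 marker: str = "S", hop: int = 4) -> list[str]:
    # *Hop: place a verb-inflection marker a fixed number of tokens after
    # the verb instead of on it. (The marker token and hop distance here
    # are illustrative assumptions.)
    out = tokens[:]
    out.insert(min(verb_index + 1 + hop, len(out)), marker)
    return out


sentence = ["the", "cat", "chase", "the", "small", "brown", "mouse"]
print(reverse_language(sentence))            # ['mouse', 'brown', 'small', ...]
print(hop_language(sentence, verb_index=2))  # marker inserted 4 tokens later
```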
## Models

For each language, we provide two models:

1. A [**standard GPT-2 Small model**](https://huggingface.co/collections/mission-impossible-lms/gpt-2-models-67270160d99170620f5a27f6).
2. A [**GPT-2 Small model trained without positional encodings**](https://huggingface.co/collections/mission-impossible-lms/gpt-2-models-no-positional-encodings-6727286b3d1650b1b374fdeb).
Each model is trained *from scratch* exclusively on data from one impossible language. This makes a total of 30 models: 15 standard GPT-2 models and 15 GPT-2 models without positional encodings. We separate these models into two collections below for ease of navigation.
Model names follow the pattern:

`mission-impossible-lms/{language_name}-{model_architecture}`

where `language_name` is the name of an impossible language from the table above, converted from PascalCase to kebab-case (e.g., NoShuffle → `no-shuffle`), and `model_architecture` is either `gpt2` (for the standard GPT-2 architecture) or `gpt2-no-pos` (for the GPT-2 architecture without positional encodings).
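As a quick start, here is a minimal sketch of loading one of these models with the Hugging Face `transformers` library (assuming the repos expose standard `transformers`-compatible weights; the repo ID below is constructed from the naming pattern above):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo ID built from the pattern: {language_name}-{model_architecture}.
repo_id = "mission-impossible-lms/no-shuffle-gpt2"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# Score a sentence under the model; lower loss means higher probability.
inputs = tokenizer("The quick brown fox jumps over the lazy dog.",
                   return_tensors="pt")
outputs = model(**inputs, labels=inputs["input_ids"])
print(outputs.loss.item())
```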
### Model Checkpoints

On the main revision of each model, we provide the final model artefact we trained (checkpoint 3000). We also provide 29 intermediate checkpoints over the course of training, from checkpoint 100 to 3000 in increments of 100 steps. These checkpoints are provided as separate revisions in each model repo and can help you replicate the experiments in the paper.
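To load an intermediate checkpoint, pass the corresponding revision to `from_pretrained`. A minimal sketch follows; note that the revision name `checkpoint-500` is an assumption about the branch naming, so check a model repo's list of revisions for the exact names:

```python
from transformers import AutoModelForCausalLM

# Intermediate checkpoints live on separate revisions of each model repo.
# NOTE: "checkpoint-500" is an assumed revision name; verify it against
# the revisions listed in the repo before use.
model = AutoModelForCausalLM.from_pretrained(
    "mission-impossible-lms/no-shuffle-gpt2",
    revision="checkpoint-500",
)
```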