---
datasets:
- Locutusque/TM-DATA-V2
- LLM360/TxT360
- mlfoundations/dclm-baseline-1.0
- Skylion007/openwebtext
language:
- en
license: apache-2.0
---

Still in training.