900M-parameter models trained on 25B tokens each to compare data processing choices (filtering, sentence deduplication, MinHash deduplication, etc.)