Dataset used to train SantaCoder
#43, opened by nihaljn
Which of the two datasets, The Stack (v1.1) or The Stack Dedup (v1.1), was used to train SantaCoder?
The SantaCoder repo links to the former, but can this be confirmed?