Pietro Lesci
pietrolesci
AI & ML interests
I am passionate about the science of language models: developing and applying causal methods—drawing from econometrics—to study the effect of training choices on models’ behaviour, including memorisation, shortcut learning, and tokenisation.
pietrolesci's activity
Domain and provenance annotation
9
#1 opened about 1 year ago by haukur
Trapezoidal scheduler with cooldown phase
3
#4 opened 4 months ago by maveriq
Bias annotation
#2 opened 6 months ago by pietrolesci
Tokenizer `merges.txt` files
3
#5 opened 6 months ago by pietrolesci
Sequence "packing" logic
2
#2 opened 10 months ago by pietrolesci
Pad-only sequences from mmap'ed dataset after a certain index
#1 opened 11 months ago by pietrolesci
Add full sequences (beyond the first 64 tokens)
3
#1 opened 11 months ago by pietrolesci
Fix swapped start and exclusive_end fields
1
#3 opened about 2 years ago by pietrolesci
App down
#1 opened about 2 years ago by pietrolesci
`start` and `exclusive_end` seem swapped
1
#1 opened about 2 years ago by pietrolesci