---
arxiv: 2103.06333
license: mit
language:
- code
---
This is an unofficial reupload of `uclanlp/plbart-large` in the SafeTensors format, produced with `transformers` 4.40.1. The goal of this reupload is to prevent older models that are still relevant baselines from becoming stale as a result of changes in HuggingFace. Additionally, I may include minor corrections, such as the model max length configuration.
Please see the original repo for more information.
Original model card below:
PLBART is a Transformer model.
- PLBART is a sequence-to-sequence model pre-trained on a large collection of Java and Python functions and natural language descriptions collected from GitHub and StackOverflow, respectively.
- PLBART is pre-trained via denoising autoencoding (DAE) and uses three noising strategies: token masking, token deletion, and token infilling (shown below in the three examples, with an illustrative sketch after the table).
| Noisy Input | Original Sequence |
| --- | --- |
| Is 0 the [MASK] Fibonacci [MASK] ? <En> | <En> Is 0 the first Fibonacci number ? |
| public static main ( String args [ ] ) { date = Date ( ) ; System . out . ( String . format ( " Current Date : % tc " , ) ) ; } <java> | <java> public static void main ( String args [ ] ) { Date date = new Date ( ) ; System . out . printf ( String . format ( " Current Date : % tc " , date ) ) ; } |
| def addThreeNumbers ( x , y , z ) : NEW_LINE INDENT return [MASK] <python> | <python> def addThreeNumbers ( x , y , z ) : NEW_LINE INDENT return x + y + z |
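For intuition, the sketch below shows roughly what the three noising strategies do to a token sequence. It is an illustrative assumption, not PLBART's actual preprocessing code: the corruption rates, span length, and helper names are made up for the example.

```python
# Illustrative sketch of the three DAE noising strategies (not PLBART's code).
import random

MASK = "[MASK]"

def token_masking(tokens, rate=0.35):
    # Replace a fraction of tokens with the mask token.
    return [MASK if random.random() < rate else t for t in tokens]

def token_deletion(tokens, rate=0.35):
    # Drop a fraction of tokens entirely.
    return [t for t in tokens if random.random() >= rate]

def token_infilling(tokens, span=3):
    # Replace a contiguous span of tokens with a single mask token.
    start = random.randrange(max(1, len(tokens) - span))
    return tokens[:start] + [MASK] + tokens[start + span:]

original = "def addThreeNumbers ( x , y , z ) : return x + y + z".split()
print(token_masking(original))
print(token_deletion(original))
print(token_infilling(original))
```

During pre-training, the model receives the corrupted sequence as input and is trained to reconstruct the original sequence, as in the table above.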