louis030195 committed
Commit 965a67c
1 Parent(s): 76b4323

first commit

Files changed (4)
  1. README.md +12 -101
  2. metadata.yaml +0 -24
  3. train.csv +0 -0
  4. validation.csv +0 -0
README.md CHANGED
@@ -1,107 +1,18 @@
  ---
- pipeline_tag: sentence-similarity
- license: apache-2.0
+ language:
+ - en # Example: en
+ license: MIT # Example: apache-2.0 or any license from https://hf.co/docs/hub/model-repos#list-of-license-identifiers
  tags:
- - sentence-transformers
- - feature-extraction
- - sentence-similarity
- - transformers
+ - text-generation
+ datasets:
+ - waiting-messages # Example: common_voice. Use dataset id from https://hf.co/datasets
+ widget:
+ - text: 'List of funny waiting messages:'
+   example_title: 'Funny waiting messages'
  ---

- # sentence-transformers/paraphrase-multilingual-mpnet-base-v2
+ # Langame/gpt2-waiting

- This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 768 dimensional dense vector space and can be used for tasks like clustering or semantic search.
+ This fine-tuned model can generate funny waiting messages.

-
-
- ## Usage (Sentence-Transformers)
-
- Using this model becomes easy when you have [sentence-transformers](https://www.SBERT.net) installed:
-
- ```
- pip install -U sentence-transformers
- ```
-
- Then you can use the model like this:
-
- ```python
- from sentence_transformers import SentenceTransformer
- sentences = ["This is an example sentence", "Each sentence is converted"]
-
- model = SentenceTransformer('sentence-transformers/paraphrase-multilingual-mpnet-base-v2')
- embeddings = model.encode(sentences)
- print(embeddings)
- ```
-
-
-
- ## Usage (HuggingFace Transformers)
- Without [sentence-transformers](https://www.SBERT.net), you can use the model like this: First, you pass your input through the transformer model, then you have to apply the right pooling-operation on-top of the contextualized word embeddings.
-
- ```python
- from transformers import AutoTokenizer, AutoModel
- import torch
-
-
- #Mean Pooling - Take attention mask into account for correct averaging
- def mean_pooling(model_output, attention_mask):
-     token_embeddings = model_output[0] #First element of model_output contains all token embeddings
-     input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
-     return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)
-
-
- # Sentences we want sentence embeddings for
- sentences = ['This is an example sentence', 'Each sentence is converted']
-
- # Load model from HuggingFace Hub
- tokenizer = AutoTokenizer.from_pretrained('sentence-transformers/paraphrase-multilingual-mpnet-base-v2')
- model = AutoModel.from_pretrained('sentence-transformers/paraphrase-multilingual-mpnet-base-v2')
-
- # Tokenize sentences
- encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
-
- # Compute token embeddings
- with torch.no_grad():
-     model_output = model(**encoded_input)
-
- # Perform pooling. In this case, max pooling.
- sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
-
- print("Sentence embeddings:")
- print(sentence_embeddings)
- ```
-
-
-
- ## Evaluation Results
-
-
-
- For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name=sentence-transformers/paraphrase-multilingual-mpnet-base-v2)
-
-
-
- ## Full Model Architecture
- ```
- SentenceTransformer(
-   (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
-   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
- )
- ```
-
- ## Citing & Authors
-
- This model was trained by [sentence-transformers](https://www.sbert.net/).
-
- If you find this model helpful, feel free to cite our publication [Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks](https://arxiv.org/abs/1908.10084):
- ```bibtex
- @inproceedings{reimers-2019-sentence-bert,
-     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
-     author = "Reimers, Nils and Gurevych, Iryna",
-     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
-     month = "11",
-     year = "2019",
-     publisher = "Association for Computational Linguistics",
-     url = "http://arxiv.org/abs/1908.10084",
- }
- ```
+ [Langame](https://langa.me) uses these within its platform 😛.
metadata.yaml DELETED
@@ -1,24 +0,0 @@
- language:
- - "List of ISO 639-1 code for your language"
- - lang1
- - lang2
- thumbnail: "url to a thumbnail used in social sharing"
- tags:
- - tag1
- - tag2
- license: "any valid license identifier"
- datasets:
- - dataset1
- - dataset2
- metrics:
- - metric1
- - metric2
- widget:
- - text: "Is this review positive or negative? Review: Best cast iron skillet you will every buy."
-   example_title: "Sentiment analysis"
- - text: "Barack Obama nominated Hilary Clinton as his secretary of state on Monday. He chose her because she had ..."
-   example_title: "Coreference resolution"
- - text: "On a shelf, there are five books: a gray book, a red book, a purple book, a blue book, and a black book ..."
-   example_title: "Logic puzzles"
- - text: "The two men running to become New York City's next mayor will face off in their first debate Wednesday night ..."
-   example_title: "Reading comprehension"
train.csv ADDED
The diff for this file is too large to render. See raw diff
 
validation.csv ADDED
The diff for this file is too large to render. See raw diff