Update README.md
README.md
CHANGED
@@ -48,7 +48,7 @@ license: apache-2.0
 # SentenceTransformer based on HooshvareLab/bert-base-parsbert-uncased

-This

 ## Model Details
@@ -99,68 +99,57 @@ similarities = model.similarity(embeddings, embeddings)
 print(similarities.shape)
 # [3, 3]
 ```
-*Clearly define terms in order to be accessible across audiences.*
--->
-
-<!--
-## Model Card Authors
-
-*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
--->
-
-<!--
-## Model Card Contact
-
-*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
--->
 # SentenceTransformer based on HooshvareLab/bert-base-parsbert-uncased

+This [sentence-transformers](https://www.SBERT.net) model is fine-tuned from [HooshvareLab/bert-base-parsbert-uncased](https://huggingface.co/HooshvareLab/bert-base-parsbert-uncased) with a focus on Retrieval-Augmented Generation (RAG) systems. It maps sentences and paragraphs to a 768-dimensional dense vector space, so it can retrieve contextually relevant passages that a generator then uses to produce responses in applications such as QA systems, chatbots, and content generation.

 ## Model Details
 print(similarities.shape)
 # [3, 3]
 ```
+### Usage in Retrieval-Augmented Generation (RAG) Systems
+Retrieval-Augmented Generation (RAG) systems pair a retriever with a generator: the retriever selects passages relevant to a query, and the generator conditions its response on them. This model can serve as the retriever, embedding a corpus and a query so that the closest passages can be handed to the generator. The steps below show how to integrate it into a RAG system:

+Install Necessary Libraries:
+Ensure you have the required libraries:

+```bash
+pip install -U sentence-transformers transformers
+```
+
+```python
+from sentence_transformers import SentenceTransformer, util
+import torch
+
+# Load the model
+model = SentenceTransformer("myrkur/sentence-transformer-parsbert-fa")
+
+# Example corpus (Persian passages; English glosses in the comments)
+corpus = [
+    # "Portuguese, in its homeland Portugal, is spoken by a population of about 10 million..."
+    'پرتغالی، در وطن اصلی خود، پرتغال، تقریباً توسط ۱۰ میلیون نفر جمعیت صحبت میشود...',
+    # "The Parthians ruled Iran for about two centuries..."
+    'اشکانیان حدود دو قرن بر ایران حکومت کردند...',
+    # "Abbas Jadidi is a former Iranian wrestler..."
+    'عباس جدیدی، کشتیگیر سابق ایرانی است...',
+    # ... (more documents)
+]
+
+# Encode the corpus into one embedding tensor (one row per document)
+corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
+```
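Ranking documents against a query comes down to comparing embedding vectors, and `util.semantic_search` scores with cosine similarity by default. The sketch below reproduces that ranking in plain Python on toy 3-dimensional vectors standing in for the model's 768-dimensional embeddings. The helper names `cosine` and `top_k_by_cosine` and the toy values are assumptions made for illustration; they are not part of sentence-transformers.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_by_cosine(query_vec, corpus_vecs, k):
    """Return up to k hits as {'corpus_id', 'score'} dicts, best match first."""
    scored = [
        {"corpus_id": i, "score": cosine(query_vec, vec)}
        for i, vec in enumerate(corpus_vecs)
    ]
    scored.sort(key=lambda hit: hit["score"], reverse=True)
    return scored[:k]


# Toy vectors (assumed values, for illustration only)
toy_corpus = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.6, 0.8, 0.0]]
toy_query = [0.0, 1.0, 0.0]

toy_hits = top_k_by_cosine(toy_query, toy_corpus, k=2)
for hit in toy_hits:
    print(hit["corpus_id"], round(hit["score"], 4))
```

The hit dictionaries deliberately mirror the `corpus_id` / `score` shape that `util.semantic_search` returns, so the toy ranking can be read side by side with the library call.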
+
+Retrieve Relevant Information:
+Given a user query, retrieve the most relevant documents from the corpus:
+
+```python
+# User query ("Who was Abbas Jadidi?")
+query = "عباس جدیدی که بود؟"
+query_embedding = model.encode(query, convert_to_tensor=True)
+
+# Retrieve the top-k most similar documents (cosine similarity by default)
+top_k = 5
+hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=top_k)
+hits = hits[0]  # results for the first (and only) query
+
+# Print the retrieved documents
+for hit in hits:
+    print(f"Score: {hit['score']:.4f}")
+    print(corpus[hit['corpus_id']])
+```
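Retrieval is only half of a RAG pipeline: the generator still has to receive the retrieved passages as context. Here is a minimal sketch of that handoff, assuming hits shaped like the `util.semantic_search` output (dicts with `corpus_id` and `score`). The function `build_rag_prompt`, the `min_score` cutoff, the prompt template, and the stand-in data are all hypothetical illustrations, not part of this model or of sentence-transformers.

```python
def build_rag_prompt(query, corpus, hits, min_score=0.0):
    """Join retrieved passages into a numbered context block, then append the question."""
    # Keep only passages whose similarity score clears the (assumed) cutoff
    passages = [
        corpus[hit["corpus_id"]]
        for hit in hits
        if hit["score"] >= min_score
    ]
    context = "\n".join(f"[{i}] {text}" for i, text in enumerate(passages, start=1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


# Stand-in data shaped like the retrieval output (assumed values)
example_corpus = [
    "doc about Portuguese",
    "doc about the Parthians",
    "doc about Abbas Jadidi",
]
example_hits = [
    {"corpus_id": 2, "score": 0.83},
    {"corpus_id": 0, "score": 0.12},
]

prompt = build_rag_prompt(
    "Who was Abbas Jadidi?", example_corpus, example_hits, min_score=0.5
)
print(prompt)
```

The resulting string would be passed to whichever generator the system uses. A score cutoff is one possible way to keep weak matches out of the context; a suitable threshold depends on the corpus and would need to be tuned.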
+## Conclusion
+This sentence-transformer model targets Persian text and is intended chiefly as the retriever in retrieval-augmented generation systems, where it supplies contextually relevant passages to the generator.
+
+## Contact
+For questions or further information, please contact:
+
+- Amir Masoud Ahmadi: [amirmasoud.ahkol@gmail.com](mailto:amirmasoud.ahkol@gmail.com)