SamagraDataGov committed on
Commit
cd51fcb
1 Parent(s): d5b9711

pytorch_model.bin upload/update

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 384,
+   "pooling_mode_cls_token": true,
+   "pooling_mode_mean_tokens": false,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
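
The config above enables only `pooling_mode_cls_token`, so the sentence embedding is taken from the first token position rather than averaged over all tokens. The following is a minimal pure-Python sketch of that difference, not the library's implementation; the helper names `cls_pool` and `mean_pool` and the toy 4-dimensional vectors are assumptions made for illustration (the real model uses 384 dimensions).

```python
from typing import List

Vector = List[float]


def cls_pool(token_embeddings: List[Vector]) -> Vector:
    """Return the first token's vector (the [CLS] position in BERT-style models)."""
    if not token_embeddings:
        raise ValueError("expected at least one token embedding")
    return list(token_embeddings[0])


def mean_pool(token_embeddings: List[Vector]) -> Vector:
    """Average all token vectors; shown only for contrast with CLS pooling."""
    if not token_embeddings:
        raise ValueError("expected at least one token embedding")
    count = len(token_embeddings)
    dim = len(token_embeddings[0])
    return [sum(tok[i] for tok in token_embeddings) / count for i in range(dim)]


# Hypothetical 3-token sequence with 4-dimensional token vectors.
tokens = [
    [1.0, 0.0, 2.0, 0.0],  # position 0: the [CLS] token
    [3.0, 1.0, 0.0, 2.0],
    [2.0, 2.0, 1.0, 1.0],
]

cls_vector = cls_pool(tokens)    # what pooling_mode_cls_token selects
mean_vector = mean_pool(tokens)  # what pooling_mode_mean_tokens would produce
print(cls_vector, mean_vector)
```

With CLS pooling the other token vectors influence the result only through the transformer's attention layers, not through the pooling step itself.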
README.md ADDED
@@ -0,0 +1,833 @@
+ ---
+ base_model: BAAI/bge-small-en-v1.5
+ datasets: []
+ language: []
+ library_name: sentence-transformers
+ metrics:
+ - cosine_accuracy@1
+ - cosine_accuracy@5
+ - cosine_accuracy@10
+ - cosine_precision@1
+ - cosine_precision@5
+ - cosine_precision@10
+ - cosine_recall@1
+ - cosine_recall@5
+ - cosine_recall@10
+ - cosine_ndcg@5
+ - cosine_ndcg@10
+ - cosine_ndcg@100
+ - cosine_mrr@5
+ - cosine_mrr@10
+ - cosine_mrr@100
+ - cosine_map@100
+ - dot_accuracy@1
+ - dot_accuracy@5
+ - dot_accuracy@10
+ - dot_precision@1
+ - dot_precision@5
+ - dot_precision@10
+ - dot_recall@1
+ - dot_recall@5
+ - dot_recall@10
+ - dot_ndcg@5
+ - dot_ndcg@10
+ - dot_ndcg@100
+ - dot_mrr@5
+ - dot_mrr@10
+ - dot_mrr@100
+ - dot_map@100
+ pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:900
+ - loss:GISTEmbedLoss
+ widget:
48
+ - source_sentence: How does the Equity Grant contribute to the creditworthiness of
49
+ FPOs?
50
+ sentences:
51
+ - ''' Date……………………………… ……………………………… Signature of Branch Manager with branch seal Name……………………………………
52
+ … Designation …………………………………… ……………………………… ……………………………… Signature of Authorized
53
+ Person in zonal office Name………………………………… Designation …………………………………… 5. Promoter''s
54
+ request letter List of Enclosures 1. Recommendation 9. List of shareholders addressed
55
+ to the Bank Manager on original letter head of FPO confirmed by promoter and
56
+ bank with amount of CGC sought on Bank''s Original letterhead with date and
57
+ dispatch number duly signed by the Branch Manager on each page. 2. Sanction letter
58
+ of 6. Implementation Schedule 10. Affidavit of promoters that confirmed by
59
+ the bank. they have not availed CGC from any other institution for sanctioned
60
+ Credit Facility. sanctioning authority addressed to recommending branch. 3.
61
+ Bank''s approved 7. Up-to-date statement of account of 11. Field inspection
62
+ report of Term loan and Cash Credit (if Sanctioned). Bank official as on recent
63
+ date. Appraisal/Process note bearing signature of sanctioning authority. 4.
64
+ Potential Impact on 8. a).Equity Certificate, C.A/CS * Pin Code at Column No.
65
+ 1. a), certificate/RCS certificate 2. b), 2. c), 4. a) and 9. a) is Mandatory b).
66
+ FORM-2, FORM-5 and FORM-23 filed with ROC for Company/RCS. small farmer producers 1.
67
+ Social Impact, 2. Environmental Impact 3.'''
68
+ - '''i. Shareholder List and Share Capital contribution by each Member verified
69
+ and certified by a Chartered Accountant (CA) prior to submission (Format attached,
70
+ Annexure I- Enclosure-I). ii. Resolution of FPO Board/Governing Council to seek
71
+ Equity Grant for Members (Format attached, Annexure I- Enclosure-II). iii. Consent
72
+ of Shareholders, stating name of shareholder, gender, number of shares held, face
73
+ value of shares, land holding, and signature, signifying consent for Implementing
74
+ Agency to directly transfer the Equity Grant sanctioned to the FPC on their behalf,
75
+ to FPC Bank account, against the consideration of additional shares of equivalent
76
+ value to be issued to them by FPC and on exit- transfer of the shares as per rules
77
+ (Format attached, Annexure I-Enclosure-III). iv. Audited Financials of FPO for
78
+ a minimum 1 year/for all years of existence of the FPO if formed less than three
79
+ years prior to application/ for the last 3 years for FPO in existence for 3 years
80
+ or more, verified and certified by a Chartered Accountant (CA) prior to submission.
81
+ v. Photocopy of FPO Bank Account Statement for last six months authenticated by
82
+ Branch Manager. vi. Business plan and budget for next 18 months. vii. Names, photographs,
83
+ and identity proof (one from among ration card, Aadhaar card, election identification
84
+ card, and passport of Representatives/ Directors authorized by the Board for executing
85
+ and signing all documents under the Scheme. viii. Each page of Application Form and
86
+ accompanying documents should be signed by a minimum of two Board Member Authorised
87
+ Representatives of FPO;'''
88
+ - '''11.1 Producer members'' own equity supplemented by a matching Equity Grant
89
+ from Government, which is required to strengthen financial base of FPOs and help
90
+ them to get credit from financial institutions for their projects and working
91
+ capital requirements for business development. Equity Grant shall be in the form
92
+ of matching grant upto Rs. 2,000 per farmer member of FPO subject to maximum limit
93
+ of Rs. 15.00 lakh fixed per FPO. This Equity Grant is not in the form of government
94
+ participation in equity, but only as a matching grant to the FPOs as farmer members''
95
+ equity. Therefore, Rs.1,500 crore with DAC&FW is proposed in the scheme to cover
96
+ all the 10,000 FPOs, if maximum permissible equity is contributed to all 10,000
97
+ FPOs. 11.2 **Objectives of Equity Grant:** The objectives of Equity Grant are to
98
+ (i) enhance viability and sustainability of FPOs; (ii) increase credit worthiness
99
+ of FPOs; and (iii) enhance shareholding of members to increase their ownership
100
+ and participation in their FPO. 11.3 **Eligibility Criteria for FPOs:** An FPO
101
+ fulfilling following criteria shall be eligible to apply for Equity Grant under
102
+ the Scheme- (i) It shall be a legal entity as per para 2 of this guidelines.
103
+ (ii) It has raised equity from its Members as laid down in its Articles of Association/
104
+ Bye laws, as the case may be. (iii) The number of its Individual Shareholders
105
+ is in accordance with the terms hereto read together with the Scheme. (iv) Minimum
106
+ 50% of its shareholders are small, marginal and landless tenant farmers as defined
107
+ by the Agriculture Census carried out periodically by the Ministry of Agriculture,
108
+ GoI. Women farmers'' participation as its shareholders is to be preferred. (v)
109
+ Maximum shareholding by any one member shall not be more than 10% of total equity
110
+ of the FPO.'''
111
+ - source_sentence: What is the purpose of the National Crop Insurance Portal?
112
+ sentences:
113
+ - '''i. Shareholder List and Share Capital contribution by each Member verified
114
+ and certified by a Chartered Accountant (CA) prior to submission (Format attached,
115
+ Annexure I- Enclosure-I). ii. Resolution of FPO Board/Governing Council to seek
116
+ Equity Grant for Members (Format attached, Annexure I- Enclosure-II). iii. Consent
117
+ of Shareholders, stating name of shareholder, gender, number of shares held, face
118
+ value of shares, land holding, and signature, signifying consent for Implementing
119
+ Agency to directly transfer the Equity Grant sanctioned to the FPC on their behalf,
120
+ to FPC Bank account, against the consideration of additional shares of equivalent
121
+ value to be issued to them by FPC and on exit- transfer of the shares as per rules
122
+ (Format attached, Annexure I-Enclosure-III). iv. Audited Financials of FPO for
123
+ a minimum 1 year/for all years of existence of the FPO if formed less than three
124
+ years prior to application/ for the last 3 years for FPO in existence for 3 years
125
+ or more, verified and certified by a Chartered Accountant (CA) prior to submission.
126
+ v. Photocopy of FPO Bank Account Statement for last six months authenticated by
127
+ Branch Manager. vi. Business plan and budget for next 18 months. vii. Names, photographs,
128
+ and identity proof (one from among ration card, Aadhaar card, election identification
129
+ card, and passport of Representatives/ Directors authorized by the Board for executing
130
+ and signing all documents under the Scheme. viii. Each page of Application Form and
131
+ accompanying documents should be signed by a minimum of two Board Member Authorised
132
+ Representatives of FPO;'''
133
+ - '''i. \''Credit Facility\'' means any fund based credit facility extended by
134
+ an Eligible Lending Institution (ELI) to an Eligible Borrower without any Collateral
135
+ Security or Third Party Guarantee ; ii. \''Credit Guarantee Fund\'' means the
136
+ Credit Guarantee Fund for FPOs created with NABARD and NCDC respectively under
137
+ the Scheme with matching grant from DAC&FW for the purpose of extending guarantee
138
+ to the eligible lending institution(s) against their collateral free lending to eligible
139
+ FPOs; iii. \''Eligible Lending Institution (ELI)\'' means a Scheduled Commercial
140
+ Bank for the time being included in the second Schedule to the Reserve Bank of
141
+ India Act, 1934, Regional Rural Banks, Co-operative Banks, Cooperative Credit Society,
142
+ NEDFI, or any other institution (s) as may be decided by the NABARD and/or NCDC,
143
+ as the case may be, in consultation with Government of India from time to time.
144
+ NABARD and NCDC can also finance, if they so desire with the approval of DAC&FW/N-PMFSC.
145
+ NBFCs and such other financing institutions with required net worth and track
146
+ record may also serve as Eligible Lending Institutions (ELIs), for lending to
147
+ FPOs with a moderate spread between their cost of capital and lending rate. However,
148
+ Standard Financial Sector Rating Agency should have rated NBFC **to be AAA**
149
+ to be considered as ELI; iv. \''Guarantee Cover\'' means maximum cover available
150
+ per eligible FPO borrower; v. \''Guarantee Fee\'' means the onetime fee at
151
+ a specified rate of the eligible credit facility sanctioned by the ELI, payable
152
+ by the ELI to NABARD or NCDC, as the case may be; and vi.'''
153
+ - ''' 2.7 Secured credential/login, preferably linked with Aadhaar Number and
154
+ mobile OTP based, for all Stakeholders viz, Central Government, State Governments,
155
+ Banks, empanelled Insurance Companies and their designated field functionaries
156
+ will be provided on the Portal to enable them to enter/upload/download the
157
+ requisite information. 2.8 Insurance Companies shall not distribute/collect/allow
158
+ any other proforma/utility/web Portal etc for collecting details of insured
159
+ farmers separately. However they may provide all requisite support to facilitate
160
+ Bank Branches/PACS for uploading the farmer''s details on the Portal well within
161
+ the prescribed cut-off dates. 2.9 Only farmers whose data is uploaded on
162
+ the National Crop Insurance Portal shall be eligible for Insurance coverage
163
+ and the premium subsidy from State and Central Govt. will be released accordingly. 2.10 All
164
+ data pertaining to crop-wise, area-wise historical yield data, weather data, sown
165
+ area, coverage and claims data, calamity years and actual yield shall be made
166
+ available on the National Crop Insurance Portal for the purpose of premium
167
+ rating, claim calculation etc. 2.11 Banks/Financial Institutions/other intermediaries
168
+ need to compulsorily transfer the individual farmer''s data electronically
169
+ to the National Crop Insurance Portal. Accordingly Banks/FIs may endeavour to undertake
170
+ CBS integration in a time bound manner for real time transfer of information/data. 2.12 It
171
+ is also proposed to develop an integrated platform/portal for both PMFBY and Interest
172
+ Subvention Scheme. The data/information of both the Schemes shall be auto synchronized
173
+ to enable real time sharing of information and better program monitoring. 2.13 Insurance
174
+ Companies shall compulsorily use technology/mobile applications for monitoring
175
+ of crop health/Crop Cutting Experiments (CCEs) in coordination with concerned
176
+ States. States shall also facilitate Insurance Companies with Satellite Imagery/Usage
177
+ of Drones by way of prior approval of agency from which such data can be sourced.
178
+ This is required for better monitoring and ground- truthing.'''
179
+ - source_sentence: What should the business plan of an FPO be based on?
180
+ sentences:
181
+ - '''First installment due on (date) : ii). Last Installment due on (date)
182
+ : 6. b). Cash Credit : Limit: Drawing Power: Outstanding: Comments
183
+ on Irregularity ( if any): Any adverse comments on the unit by inspecting
184
+ official in last inspection report: 7. A. Cost of Project (as accepted by
185
+ sanctioning authority)(In Rs. Lakh) B. Means of Finance (as accepted by sanctioning
186
+ authority)(In Rs. Lakh) Give component wise details a. Term loan of Bank: b.
187
+ Promoter Equity c. Unsecured loan : d. Others if any Total Total 8. A.
188
+ Forward Linkages: B. Backward Linkages with Small/Marginal farmers: 1 No.
189
+ of members: 2 Details of Primary and Collateral Securities taken by the
190
+ bank (if any) 3 a. Primary Securities b. Collateral Securities 4 5 6 (Please
191
+ enclose details separately) 9 NameoftheConsortium(ifany)associatedwithCreditFacilitywithcompleteaddress,contac
192
+ t details and email: 9 a) Address (*with pin-code) : 9 b) Contact Details
193
+ : 9 c) Email Address : Request of Branch head for Credit Guarantee:- In
194
+ view of the above information, we request Credit Guarantee Cover against Credit
195
+ Facility of Rs.....................(in Rupees ) to FPO(copy of sanction letter
196
+ along with appraisal/process note of competent authority is enclosed for your
197
+ perusal and record ). Further we confirm that : 1. The KYC norms in respect of
198
+ the Promoters have been complied by us. 2. Techno-feasibility and economic viability
199
+ aspect of the project has been taken care of by the sanctioning authority and
200
+ the branch. 3. On quarterly basis, bank will apprise the ........................(Name
201
+ of Implementing Agency)about progress of unit, recovery of bank''s dues and present
202
+ status of account to........................(Name of Implementing Agency) 4.
203
+ We undertake to abide by the Terms & Conditions of the Scheme.'''
204
+ - '''19.1 It has been seen, during first two years of implementation of PMFBY,
205
+ there are various types of yield disputes, which unnecessarily delays the claim
206
+ settlement. Following figure shows the procedures to be adopted in various cases. Figure.
207
+ Procedures to be followed in different yield dispute cases 19.2 Wherever
208
+ the yield estimates reported at IU level are abnormally low or high vis-à-vis
209
+ the general crop condition the Insurance Company in consultation with State Govt.
210
+ can make use of various products (e.g. Satellite based Vegetation Index, Weather
211
+ parameters, etc.) or other technologies (including statistical test, crop models
212
+ etc.) to confirm yield estimates. If Insurance Company witnesses any anomaly/deficiency
213
+ in the actual yield data(partial /consolidated) received from the State Govt.,
214
+ the same shall be brought into the notice of concerned State department within
215
+ 7 days from date of receipt of yield data with specific observations/remarks under
216
+ intimation to Govt. of India and anomaly, if any, may be resolved in next 7 days
217
+ by the State Level Coordination Committee (SLCC) headed by Additional Chief
218
+ Secretary/Principal Secretary/Secretary of the concerned department. This committee
219
+ shall be authorized to decide all such cases and the decision in such cases shall
220
+ be final. The SLCC may refer the case to State Level Technical Advisory Committee
221
+ (STAC) for dispute resolution (Constitution of STAC is defined in Para 19.5).
222
+ In case the matter stands unresolved even after examination by STAC, it may be
223
+ escalated to TAC along with all relevant documents including minutes of meetings/records
224
+ of discussion and report of the STAC and SLCC. Reference to TAC can be made thereafter
225
+ only in conditions specified in Para 19.7.1 However, data with anomalies which
226
+ is not reported within 7 days will be treated as accepted to insurance company.'''
227
+ - ''' (vi) A farmer can be member in more than one FPO with different produce clusters
228
+ but he/she will be eligible only once(for any one FPO that he/she is a member)
229
+ for the matching equity grant up to his/her share. (vii) In the Board of Directors
230
+ (BoD) and Governing Body (GB), as the case may be, there shall be adequate representation
231
+ of women farmer member(s) and there should be minimum one woman member. (viii) It
232
+ has a duly constituted Management Committee responsible for the business of the
233
+ FPO. (ix) It has a business plan and budget for next 18 months that is based
234
+ on a sustainable, revenue model as may be determined by the Implementing Agency.'''
235
+ - source_sentence: How often does DAC&FW release advances to Implementing Agencies?
236
+ sentences:
237
+ - '''| Picking 1 | Picking 2 |
238
+ Picking 4 |\n|-------------------------------------------------------|----------------|--------------|\n|
239
+ Total Yield Kg) | | |\n|
240
+ Picking 3 | | |\n|
241
+ Yield (Kg) | | |\n|
242
+ Crop | Experiment no. | |\n|
243
+ Yield | | |\n|
244
+ (Kg) | | |\n|
245
+ Yield | | |\n|
246
+ (Kg) | | |\n|
247
+ Yield | | |\n|
248
+ (Kg) | | |\n|
249
+ P1 | P2 | P3 |\n|
250
+ Well Conducted CCEs in the Taluka with 4 pickings | | |\n|
251
+ Cotton | E1 | 1 |\n|
252
+ Cotton | E2 | 1 |\n|
253
+ Cotton | E3 | 0.75 |\n|
254
+ Cotton | E4 | 0.8 |\n|
255
+ Cotton | E5 | 0.95 |\n| |
256
+ Average | 0.9 |\n| 6.373 |
257
+ 2.128 | 1.282 |\n| (1 | | |\n|
258
+ st | | |\n|
259
+ + 2 | | |\n|
260
+ nd | | |\n|
261
+ +3 | | |\n|
262
+ rd | | |\n| | | |\n|
263
+ Factor (Total yield/ | | |\n|
264
+ Picking Yield) | | |\n| | | |\n|
265
+ (1 | | |\n|
266
+ st | | |\n|
267
+ ) | (1 | |\n|
268
+ st | | |\n|
269
+ + | | |\n|
270
+ 2 | | |\n|
271
+ nd | | |\n|
272
+ ) | ) | |\n|
273
+ CCEs with Less Pickings in any IU within that Taluka | | |\n|
274
+ Cotton | E6 (only 1 | |\n|
275
+ st | | |\n|
276
+ Picking) | 1 | |\n|
277
+ Cotton | E7 (1 | |\n|
278
+ st | | |\n|
279
+ and 2 | | |\n|
280
+ nd | | |\n|
281
+ Picking) | 1.2 | 1.75 |\n|
282
+ Cotton | E8 (1 | |\n|
283
+ st | | |\n|
284
+ , 2 | | |\n|
285
+ nd | | |\n|
286
+ & 3 | | |\n|
287
+ rd | | |\n|
288
+ Picking) | 1.1 | 1.85 |'''
289
+ - '''8.2.1 DAC&FW will make the advance release to the Implementing Agencies (IAs)
290
+ on six monthly basis based on recommendation of N-PMAFSC, Annual Action Plan
291
+ (AAP) of IAs and the due utilization certificate submitted to meet out the expenses
292
+ for engaging NPMA, FPO formation & incubation cost to CBBO and also meeting out
293
+ the cost of FPO management cost direct to concerned FPOs account on recommendation
294
+ of concerned CBBO and Equity Grant etc. for effective and timely implementation
295
+ of the programme. The Implementing Agencies will develop the payment schedule
296
+ based on their various stages and component of payment involved. The Implementing
297
+ Agencies will raise the demand to DAC&FW for release of payment. The Implementing
298
+ Agencies will submit utilization certificate of last payment released as per GFR
299
+ for releasing the next payment to them. In case of training, NABARD and NCDC will
300
+ submit to N- PMAFSC the training schedule for a year with tentative expenditure
301
+ for training through specialised training institutes organised through their
302
+ respective nodal training Institute. DAC&FW will make due payment to NABARD and
303
+ NCDC for training through specialised Institutions based on the demand raised
304
+ by NABARD and NCDC respectively and utilisation certificate will be submitted
305
+ to DAC&FW by both as due. Further, as regards DAC&FW''s share towards Credit Guarantee
306
+ Fund (CGF) to be maintained and managed by NABARD and NCDC, the DAC&FW will provide
307
+ its matching share to NABARD and NCDC, as the case may be, which in turn will
308
+ submit detailed status of utilization to DAC&FW before raising the further demand
309
+ for next installment of CGF.'''
310
+ - '''7.5.1 Only those AWS/ARGs of IMD/State Govt. /private agencies should be
311
+ considered and notified which are as per standards defined by IMD/WMO and are
312
+ certified and approved by IMD/any agency to be notified by the State/Central
313
+ govt. These must be optimally operational and be able to provide real time weather
314
+ data. AWS/ARG of private agencies should only be considered in absence of properly functioning
315
+ AWS/ARGs of IMD/ State Govt. AWS /ARG data sourced for crop insurance should be transferred
316
+ on real time basis to National Portal. The detailed guidelines for sharing of
317
+ weather data on the Portal will be circulated separately. 7.5.2 State govt
318
+ can explore the possibility to create dense AWS/ARG network on PPP Mode for which
319
+ GOI will provide 50% of the viability gap funding. 7.5.3 The following data
320
+ sources may be used for validation of on account claims and claims for prevented sowing:'''
321
+ - source_sentence: Who is considered as the nodal agency for engagement with the Ministry
322
+ of Agriculture and Farmers Welfare and Insurance Companies?
323
+ sentences:
324
+ - '''8.2.1 DAC&FW will make the advance release to the Implementing Agencies (IAs)
325
+ on six monthly basis based on recommendation of N-PMAFSC, Annual Action Plan
326
+ (AAP) of IAs and the due utilization certificate submitted to meet out the expenses
327
+ for engaging NPMA, FPO formation & incubation cost to CBBO and also meeting out
328
+ the cost of FPO management cost direct to concerned FPOs account on recommendation
329
+ of concerned CBBO and Equity Grant etc. for effective and timely implementation
330
+ of the programme. The Implementing Agencies will develop the payment schedule
331
+ based on their various stages and component of payment involved. The Implementing
332
+ Agencies will raise the demand to DAC&FW for release of payment. The Implementing
333
+ Agencies will submit utilization certificate of last payment released as per GFR
334
+ for releasing the next payment to them. In case of training, NABARD and NCDC will
335
+ submit to N- PMAFSC the training schedule for a year with tentative expenditure
336
+ for training through specialised training institutes organised through their
337
+ respective nodal training Institute. DAC&FW will make due payment to NABARD and
338
+ NCDC for training through specialised Institutions based on the demand raised
339
+ by NABARD and NCDC respectively and utilisation certificate will be submitted
340
+ to DAC&FW by both as due. Further, as regards DAC&FW''s share towards Credit Guarantee
341
+ Fund (CGF) to be maintained and managed by NABARD and NCDC, the DAC&FW will provide
342
+ its matching share to NABARD and NCDC, as the case may be, which in turn will
343
+ submit detailed status of utilization to DAC&FW before raising the further demand
344
+ for next installment of CGF.'''
345
+ - ''' 13.4 Laxmanrao Imandar National Academy for Co-operative Research & Development
346
+ (LINAC), Gurugram promoted by NCDC is designated as Nodal Training Institution
347
+ at central level for FPOs registered under Co-operative Societies Act and promoted
348
+ by NCDC. The LINAC will work in partnership with other reputed national and regional
349
+ training institutions like NIAM, VAMNICOM, MANAGE, NIRD, NCCT, IRMA, ASCI, State
350
+ and Central Agriculture Universities, KVK, very reputed National level Management
351
+ and Skill Development Institutions/Universities etc. The LINAC in consultation
352
+ with NCDC and DAC&FW will prepare a training module and training schedule for
353
+ the ensuing year, which will be got approved by N-PMAFSC. As regards training
354
+ expenses, in case of LINAC being nodal agency, the LINAC through NCDC will claim
355
+ the expenses from DAC&FW and will also submit the utilization certificate through
356
+ NCDC after the training programme is over. 13.5 DAC&FW in due course may also
357
+ identify and designate other training institute(s) as additional Nodal Training
358
+ Institute at central level, which will undertake training and skill development
359
+ partnering with other national and regional level institutes. 13.6 The central
360
+ Nodal Training Institutes will ensure that training programme be held preferably
361
+ in same State/UT wherein FPO trainees located are proposed to participate to reduce
362
+ the burden on transportation(TA/DA) cost. While formulating the training schedule,
363
+ Nodal Training Institutes will ensure that BoDs, CEOs/Managers and other stakeholders
364
+ etc. are trained twice in a year. Nodal Training Institutes will have to make
365
+ boarding and lodging arrangements for the trainees and will also reimburse to
366
+ and fro journey tickets to the extent of sleeper class train tickets and/or ordinary
367
+ bus fare. Nodal Training Institutions will also evolve methodology to monitor
368
+ and track the performance of trainees and their FPO organization to ensure effectiveness
369
+ of training being provided.'''
370
+ - '''8.1 CSCs under Ministry of Electronics and Information Technology (MeITY)
371
+ have been engaged to enrol non-loanee farmers. The Insurance Companies are
372
+ required to enter into a separate agreement with CSC and pay service charges
373
+ as fixed by DAC&FW, GOI per farmer per village per season. No other agreement
374
+ or payment is required to be made for this purpose. Nodal agency for engagement
375
+ with Ministry of Agriculture and Farmers Welfare and Insurance Companies will
376
+ be CSC-SPV, a company established under MeITY for carrying out e-governance
377
+ initiatives of GoI. 8.2 No charges/fee shall be borne or paid by the farmers
378
+ being enrolled through CSCs i.e. CSC-SPV and CSC-VLE 8.3 As per IRDA circular,
379
+ no separate qualification/certification will be required for the VLEs of CSCs
380
+ to facilitate enrolment of non-loanee farmers. 8.4 All empanelled Insurance
381
+ Companies will compulsorily be required to enter into an agreement with CSC
382
+ for enrolment of non-loanee farmers and for provision of other defined services
383
+ to farmers. 8.5 Other designated intermediaries may be linked with the Portal
384
+ in due course. 8.6 Empanelled Insurance Companies have to necessarily register
385
+ on the portal and submit list and details of agents/intermediaries engaged
386
+ for enrolment of non-loanee farmers in the beginning of each season within
387
+ 10 days of award of work in the State. Further all agents/intermediaries have
388
+ to work strictly as per the provisions of the Scheme and IRDA regulations'''
+ model-index:
+ - name: SentenceTransformer based on BAAI/bge-small-en-v1.5
+   results:
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: val evaluator
+       type: val_evaluator
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.51
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@5
+       value: 0.9
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.96
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.51
+       name: Cosine Precision@1
+     - type: cosine_precision@5
+       value: 0.17999999999999997
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.096
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.51
+       name: Cosine Recall@1
+     - type: cosine_recall@5
+       value: 0.9
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.96
+       name: Cosine Recall@10
+     - type: cosine_ndcg@5
+       value: 0.7319026681359824
+       name: Cosine Ndcg@5
+     - type: cosine_ndcg@10
+       value: 0.7503025597337694
+       name: Cosine Ndcg@10
+     - type: cosine_ndcg@100
+       value: 0.7590365063330959
+       name: Cosine Ndcg@100
+     - type: cosine_mrr@5
+       value: 0.6745
+       name: Cosine Mrr@5
+     - type: cosine_mrr@10
+       value: 0.6815000000000002
+       name: Cosine Mrr@10
+     - type: cosine_mrr@100
+       value: 0.6834441946057421
+       name: Cosine Mrr@100
+     - type: cosine_map@100
+       value: 0.6834441946057419
+       name: Cosine Map@100
+     - type: dot_accuracy@1
+       value: 0.51
+       name: Dot Accuracy@1
+     - type: dot_accuracy@5
+       value: 0.9
+       name: Dot Accuracy@5
+     - type: dot_accuracy@10
+       value: 0.96
+       name: Dot Accuracy@10
+     - type: dot_precision@1
+       value: 0.51
+       name: Dot Precision@1
+     - type: dot_precision@5
+       value: 0.17999999999999997
+       name: Dot Precision@5
+     - type: dot_precision@10
+       value: 0.096
+       name: Dot Precision@10
+     - type: dot_recall@1
+       value: 0.51
+       name: Dot Recall@1
+     - type: dot_recall@5
+       value: 0.9
+       name: Dot Recall@5
+     - type: dot_recall@10
+       value: 0.96
+       name: Dot Recall@10
+     - type: dot_ndcg@5
+       value: 0.7319026681359824
+       name: Dot Ndcg@5
+     - type: dot_ndcg@10
+       value: 0.7503025597337692
+       name: Dot Ndcg@10
+     - type: dot_ndcg@100
+       value: 0.7590365063330959
+       name: Dot Ndcg@100
+     - type: dot_mrr@5
+       value: 0.6745
+       name: Dot Mrr@5
+     - type: dot_mrr@10
+       value: 0.6815000000000002
+       name: Dot Mrr@10
+     - type: dot_mrr@100
+       value: 0.6834441946057421
+       name: Dot Mrr@100
+     - type: dot_map@100
+       value: 0.6834441946057419
+       name: Dot Map@100
+ ---
+
+ # SentenceTransformer based on BAAI/bge-small-en-v1.5
+
+ This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5). It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
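
As a reading aid for the validation metrics in this card's `model-index` metadata: the reported recall@k values equal the accuracy@k values (0.51, 0.9, 0.96), which is what one would expect if every validation query has exactly one relevant passage. Under that assumption (it is not stated in the card), precision@k is simply accuracy@k divided by k. The sketch below checks this arithmetic against the reported numbers; the helper name `precision_from_accuracy` is hypothetical.

```python
import math

# Values copied from the model-index metadata of this card.
accuracy_at = {1: 0.51, 5: 0.9, 10: 0.96}
reported_precision_at = {1: 0.51, 5: 0.17999999999999997, 10: 0.096}


def precision_from_accuracy(accuracy: float, k: int) -> float:
    """Precision@k assuming each query has exactly one relevant passage."""
    return accuracy / k


derived = {k: precision_from_accuracy(acc, k) for k, acc in accuracy_at.items()}

# Compare with a tolerance, since the reported values carry float rounding noise.
matches = all(
    math.isclose(derived[k], reported_precision_at[k], rel_tol=1e-9)
    for k in derived
)
print(derived, matches)
```

Read this way, a precision@10 of 0.096 is not a weak result: with a single relevant passage per query, precision@10 cannot exceed 0.1.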
500
+
501
+ ## Model Details
502
+
503
+ ### Model Description
504
+ - **Model Type:** Sentence Transformer
505
+ - **Base model:** [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) <!-- at revision 5c38ec7c405ec4b44b94cc5a9bb96e735b38267a -->
506
+ - **Maximum Sequence Length:** 512 tokens
507
+ - **Output Dimensionality:** 384 dimensions
508
+ - **Similarity Function:** Cosine Similarity
509
+ <!-- - **Training Dataset:** Unknown -->
510
+ <!-- - **Language:** Unknown -->
511
+ <!-- - **License:** Unknown -->
512
+
513
+ ### Model Sources
514
+
515
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
516
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
517
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
518
+
519
+ ### Full Model Architecture
520
+
521
+ ```
522
+ SentenceTransformer(
523
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
524
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
525
+ (2): Normalize()
526
+ )
527
+ ```
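
The last two modules in the stack can be illustrated without loading the model. The following is a minimal sketch, assuming toy 2-dimensional token vectors in place of the real 384-dimensional BERT outputs; the helper names (`cls_pool`, `l2_normalize`, `dot`, `cosine`) are hypothetical and are not part of the Sentence Transformers API. It shows CLS pooling followed by L2 normalization, and why dot-product and cosine scores coincide once embeddings are normalized.

```python
import math

def cls_pool(token_embeddings):
    """CLS pooling: keep only the first token's vector (the [CLS] position)."""
    return list(token_embeddings[0])

def l2_normalize(vector):
    """Scale a vector to unit length, as the Normalize() module does."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Toy token embeddings (illustrative values only, not model outputs).
tokens_a = [[3.0, 4.0], [1.0, 1.0]]
tokens_b = [[1.0, 0.0], [5.0, 5.0]]

emb_a = l2_normalize(cls_pool(tokens_a))
emb_b = l2_normalize(cls_pool(tokens_b))

# For unit-length vectors the two scores agree up to floating-point error.
print(dot(emb_a, emb_b), cosine(emb_a, emb_b))
```

This is consistent with the `dot_*` and `cosine_*` metrics reported in the Evaluation section being identical.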
528
+
529
+ ## Usage
530
+
531
+ ### Direct Usage (Sentence Transformers)
532
+
533
+ First install the Sentence Transformers library:
534
+
535
+ ```bash
536
+ pip install -U sentence-transformers
537
+ ```
538
+
539
+ Then you can load this model and run inference.
540
+ ```python
541
+ from sentence_transformers import SentenceTransformer
542
+
543
+ # Download from the 🤗 Hub
544
+ model = SentenceTransformer("SamagraDataGov/embedding_finetuned_test")
545
+ # Run inference
546
+ sentences = [
547
+ 'Who is considered as the nodal agency for engagement with the Ministry of Agriculture and Farmers Welfare and Insurance Companies?',
548
+ "'8.1 CSCs under Ministry of Electronics and Information Technology (MeITY) have been engaged to enrol non-loanee farmers. The Insurance Companies are required to enter into a separate agreement with CSC and pay service charges as fixed by DAC&FW, GOI per farmer per village per season. No other agreement or payment is required to be made for this purpose. Nodal agency for engagement with Ministry of Agriculture and Farmers Welfare and Insurance Companies will be CSC-SPV, a company established under MeITY for carrying out e-governance initiatives of GoI. 8.2 No charges/fee shall be borne or paid by the farmers being enrolled through CSCs i.e. CSC-SPV and CSC-VLE 8.3 As per IRDA circular, no separate qualification/certification will be required for the VLEs of CSCs to facilitate enrolment of non-loanee farmers. 8.4 All empanelled Insurance Companies will compulsorily be required to enter into an agreement with CSC for enrolment of non-loanee farmers and for provision of other defined services to farmers. 8.5 Other designated intermediaries may be linked with the Portal in due course. 8.6 Empanelled Insurance Companies have to necessarily register on the portal and submit list and details of agents/intermediaries engaged for enrolment of non-loanee farmers in the beginning of each season within 10 days of award of work in the State. Further all agents/intermediaries have to work strictly as per the provisions of the Scheme and IRDA regulations'",
549
+ "' 13.4 Laxmanrao Imandar National Academy for Co-operative Research & Development (LINAC), Gurugram promoted by NCDC is designated as Nodal Training Institution at central level for FPOs registered under Co-operative Societies Act and promoted by NCDC. The LINAC will work in partnership with other reputed national and regional training institutions like NIAM, VAMNICOM, MANAGE, NIRD, NCCT, IRMA, ASCI, State and Central Agriculture Universities, KVK, very reputed National level Management and Skill Development Institutions/Universities etc. The LINAC in consultation with NCDC and DAC&FW will prepare a training module and training schedule for the ensuing year, which will be got approved by N-PMAFSC. As regards training expenses, in case of LINAC being nodal agency, the LINAC through NCDC will claim the expenses from DAC&FW and will also submit the utilization certificate through NCDC after the training programme is over. 13.5 DAC&FW in due course may also identify and designate other training institute(s) as additional Nodal Training Institute at central level, which will undertake training and skill development partnering with other national and regional level institutes. 13.6 The central Nodal Training Institutes will ensure that training programme be held preferably in same State/UT wherein FPO trainees located are proposed to participate to reduce the burden on transportation(TA/DA) cost. While formulating the training schedule, Nodal Training Institutes will ensure that BoDs, CEOs/Managers and other stakeholders etc. are trained twice in a year. Nodal Training Institutes will have to make boarding and lodging arrangements for the trainees and will also reimburse to and fro journey tickets to the extent of sleeper class train tickets and/or ordinary bus fare. Nodal Training Institutions will also evolve methodology to monitor and track the performance of trainees and their FPO organization to ensure effectiveness of training being provided.'",
550
+ ]
551
+ embeddings = model.encode(sentences)
552
+ print(embeddings.shape)
553
+ # (3, 384)
554
+
555
+ # Get the similarity scores for the embeddings
556
+ similarities = model.similarity(embeddings, embeddings)
557
+ print(similarities.shape)
558
+ # torch.Size([3, 3])
559
+ ```
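
Because the first entry in `sentences` is a query and the other two are candidate passages, a natural next step is to rank the passages by their similarity to the query. The sketch below is one way to do that in plain Python; `rank_passages` is a hypothetical helper, and the scores are placeholder values chosen for illustration, not outputs of this model.

```python
def rank_passages(query_scores, top_k=5):
    """Return (passage_index, score) pairs sorted by descending score."""
    ranked = sorted(enumerate(query_scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]

# Placeholder scores for illustration only; these are NOT outputs of this model.
example_scores = [0.12, 0.87, 0.45]

top = rank_passages(example_scores, top_k=2)
for index, score in top:
    print(f"passage {index}: {score:.2f}")
```

With the usage snippet above, `similarities[0, 1:].tolist()` would supply the query's scores against the two passages in place of `example_scores`.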
560
+
561
+ <!--
562
+ ### Direct Usage (Transformers)
563
+
564
+ <details><summary>Click to see the direct usage in Transformers</summary>
565
+
566
+ </details>
567
+ -->
568
+
569
+ <!--
570
+ ### Downstream Usage (Sentence Transformers)
571
+
572
+ You can finetune this model on your own dataset.
573
+
574
+ <details><summary>Click to expand</summary>
575
+
576
+ </details>
577
+ -->
578
+
579
+ <!--
580
+ ### Out-of-Scope Use
581
+
582
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
583
+ -->
584
+
585
+ ## Evaluation
586
+
587
+ ### Metrics
588
+
589
+ #### Information Retrieval
590
+ * Dataset: `val_evaluator`
591
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
592
+
593
+ | Metric | Value |
594
+ |:--------------------|:-----------|
595
+ | cosine_accuracy@1 | 0.51 |
596
+ | cosine_accuracy@5 | 0.9 |
597
+ | cosine_accuracy@10 | 0.96 |
598
+ | cosine_precision@1 | 0.51 |
599
+ | cosine_precision@5 | 0.18 |
600
+ | cosine_precision@10 | 0.096 |
601
+ | cosine_recall@1 | 0.51 |
602
+ | cosine_recall@5 | 0.9 |
603
+ | cosine_recall@10 | 0.96 |
604
+ | cosine_ndcg@5 | 0.7319 |
605
+ | cosine_ndcg@10 | 0.7503 |
606
+ | cosine_ndcg@100 | 0.759 |
607
+ | cosine_mrr@5 | 0.6745 |
608
+ | cosine_mrr@10 | 0.6815 |
609
+ | cosine_mrr@100 | 0.6834 |
610
+ | **cosine_map@100** | **0.6834** |
611
+ | dot_accuracy@1 | 0.51 |
612
+ | dot_accuracy@5 | 0.9 |
613
+ | dot_accuracy@10 | 0.96 |
614
+ | dot_precision@1 | 0.51 |
615
+ | dot_precision@5 | 0.18 |
616
+ | dot_precision@10 | 0.096 |
617
+ | dot_recall@1 | 0.51 |
618
+ | dot_recall@5 | 0.9 |
619
+ | dot_recall@10 | 0.96 |
620
+ | dot_ndcg@5 | 0.7319 |
621
+ | dot_ndcg@10 | 0.7503 |
622
+ | dot_ndcg@100 | 0.759 |
623
+ | dot_mrr@5 | 0.6745 |
624
+ | dot_mrr@10 | 0.6815 |
625
+ | dot_mrr@100 | 0.6834 |
626
+ | dot_map@100 | 0.6834 |
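
The values in the table satisfy precision@k = accuracy@k / k (0.9 / 5 = 0.18, 0.96 / 10 = 0.096), recall@k = accuracy@k, and MAP@100 = MRR@100. That pattern is what one would expect if every query has exactly one relevant passage, although the card does not state this, so treat it as an assumption. The sketch below computes the same family of metrics under that assumption; `retrieval_metrics` is a hypothetical helper, not the `InformationRetrievalEvaluator` implementation, and the ranks are toy values.

```python
def retrieval_metrics(ranks, k):
    """Compute accuracy@k, precision@k, recall@k and MRR@k.

    `ranks` holds, per query, the 1-based rank of its single relevant
    passage, or None if the passage was not retrieved at all.
    Assumes exactly one relevant passage per query.
    """
    n = len(ranks)
    hits = [r for r in ranks if r is not None and r <= k]
    accuracy = len(hits) / n
    return {
        "accuracy": accuracy,
        # At most one of the k retrieved passages can be relevant.
        "precision": len(hits) / (n * k),
        # With one relevant passage, recall equals the hit rate.
        "recall": accuracy,
        "mrr": sum(1.0 / r for r in hits) / n,
    }

# Toy ranks for four queries (illustrative only, not evaluation data).
toy_ranks = [1, 2, 6, None]
metrics = retrieval_metrics(toy_ranks, k=5)
print(metrics)
```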
627
+
628
+ <!--
629
+ ## Bias, Risks and Limitations
630
+
631
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
632
+ -->
633
+
634
+ <!--
635
+ ### Recommendations
636
+
637
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
638
+ -->
639
+
640
+ ## Training Details
641
+
642
+ ### Training Hyperparameters
643
+ #### Non-Default Hyperparameters
644
+
645
+ - `eval_strategy`: steps
646
+ - `per_device_train_batch_size`: 32
647
+ - `per_device_eval_batch_size`: 32
648
+ - `learning_rate`: 1e-05
649
+ - `weight_decay`: 0.01
650
+ - `num_train_epochs`: 1.0
651
+ - `warmup_ratio`: 0.1
652
+ - `load_best_model_at_end`: True
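
To make the interaction of `learning_rate`, `warmup_ratio`, and the linear scheduler concrete, the sketch below evaluates a linear warmup and decay schedule over the 29 optimizer steps reported in the Training Logs. It is an approximation written from scratch: `linear_lr` is a hypothetical helper, and rounding the warmup length up (here 29 × 0.1 → 3 steps) is an assumption about how the trainer derives warmup steps from the ratio.

```python
import math

def linear_lr(step, base_lr=1e-5, total_steps=29, warmup_ratio=0.1):
    """Learning rate at a given optimizer step for linear warmup + linear decay."""
    # Assumption: warmup length is the ratio times total steps, rounded up.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr down to 0 at the final step.
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, remaining)

for step in (0, 1, 3, 15, 29):
    print(step, linear_lr(step))
```

Under these assumptions the peak rate of 1e-05 is reached at step 3 and roughly half of it remains near the midpoint of training.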
653
+
654
+ #### All Hyperparameters
655
+ <details><summary>Click to expand</summary>
656
+
657
+ - `overwrite_output_dir`: False
658
+ - `do_predict`: False
659
+ - `eval_strategy`: steps
660
+ - `prediction_loss_only`: True
661
+ - `per_device_train_batch_size`: 32
662
+ - `per_device_eval_batch_size`: 32
663
+ - `per_gpu_train_batch_size`: None
664
+ - `per_gpu_eval_batch_size`: None
665
+ - `gradient_accumulation_steps`: 1
666
+ - `eval_accumulation_steps`: None
667
+ - `torch_empty_cache_steps`: None
668
+ - `learning_rate`: 1e-05
669
+ - `weight_decay`: 0.01
670
+ - `adam_beta1`: 0.9
671
+ - `adam_beta2`: 0.999
672
+ - `adam_epsilon`: 1e-08
673
+ - `max_grad_norm`: 1.0
674
+ - `num_train_epochs`: 1.0
675
+ - `max_steps`: -1
676
+ - `lr_scheduler_type`: linear
677
+ - `lr_scheduler_kwargs`: {}
678
+ - `warmup_ratio`: 0.1
679
+ - `warmup_steps`: 0
680
+ - `log_level`: passive
681
+ - `log_level_replica`: warning
682
+ - `log_on_each_node`: True
683
+ - `logging_nan_inf_filter`: True
684
+ - `save_safetensors`: True
685
+ - `save_on_each_node`: False
686
+ - `save_only_model`: False
687
+ - `restore_callback_states_from_checkpoint`: False
688
+ - `no_cuda`: False
689
+ - `use_cpu`: False
690
+ - `use_mps_device`: False
691
+ - `seed`: 42
692
+ - `data_seed`: None
693
+ - `jit_mode_eval`: False
694
+ - `use_ipex`: False
695
+ - `bf16`: False
696
+ - `fp16`: False
697
+ - `fp16_opt_level`: O1
698
+ - `half_precision_backend`: auto
699
+ - `bf16_full_eval`: False
700
+ - `fp16_full_eval`: False
701
+ - `tf32`: None
702
+ - `local_rank`: 0
703
+ - `ddp_backend`: None
704
+ - `tpu_num_cores`: None
705
+ - `tpu_metrics_debug`: False
706
+ - `debug`: []
707
+ - `dataloader_drop_last`: False
708
+ - `dataloader_num_workers`: 0
709
+ - `dataloader_prefetch_factor`: None
710
+ - `past_index`: -1
711
+ - `disable_tqdm`: False
712
+ - `remove_unused_columns`: True
713
+ - `label_names`: None
714
+ - `load_best_model_at_end`: True
715
+ - `ignore_data_skip`: False
716
+ - `fsdp`: []
717
+ - `fsdp_min_num_params`: 0
718
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
719
+ - `fsdp_transformer_layer_cls_to_wrap`: None
720
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
721
+ - `deepspeed`: None
722
+ - `label_smoothing_factor`: 0.0
723
+ - `optim`: adamw_torch
724
+ - `optim_args`: None
725
+ - `adafactor`: False
726
+ - `group_by_length`: False
727
+ - `length_column_name`: length
728
+ - `ddp_find_unused_parameters`: None
729
+ - `ddp_bucket_cap_mb`: None
730
+ - `ddp_broadcast_buffers`: False
731
+ - `dataloader_pin_memory`: True
732
+ - `dataloader_persistent_workers`: False
733
+ - `skip_memory_metrics`: True
734
+ - `use_legacy_prediction_loop`: False
735
+ - `push_to_hub`: False
736
+ - `resume_from_checkpoint`: None
737
+ - `hub_model_id`: None
738
+ - `hub_strategy`: every_save
739
+ - `hub_private_repo`: False
740
+ - `hub_always_push`: False
741
+ - `gradient_checkpointing`: False
742
+ - `gradient_checkpointing_kwargs`: None
743
+ - `include_inputs_for_metrics`: False
744
+ - `eval_do_concat_batches`: True
745
+ - `fp16_backend`: auto
746
+ - `push_to_hub_model_id`: None
747
+ - `push_to_hub_organization`: None
748
+ - `mp_parameters`:
749
+ - `auto_find_batch_size`: False
750
+ - `full_determinism`: False
751
+ - `torchdynamo`: None
752
+ - `ray_scope`: last
753
+ - `ddp_timeout`: 1800
754
+ - `torch_compile`: False
755
+ - `torch_compile_backend`: None
756
+ - `torch_compile_mode`: None
757
+ - `dispatch_batches`: None
758
+ - `split_batches`: None
759
+ - `include_tokens_per_second`: False
760
+ - `include_num_input_tokens_seen`: False
761
+ - `neftune_noise_alpha`: None
762
+ - `optim_target_modules`: None
763
+ - `batch_eval_metrics`: False
764
+ - `eval_on_start`: False
765
+ - `eval_use_gather_object`: False
766
+ - `batch_sampler`: batch_sampler
767
+ - `multi_dataset_batch_sampler`: proportional
768
+
769
+ </details>
770
+
771
+ ### Training Logs
772
+ | Epoch | Step | Training Loss | loss | val_evaluator_cosine_map@100 |
773
+ |:----------:|:------:|:-------------:|:---------:|:----------------------------:|
774
+ | **0.5172** | **15** | **2.0908** | **1.0080** | **0.6834** |
775
+ | 1.0 | 29 | - | 1.0080 | 0.6834 |
776
+
777
+ * The bold row denotes the saved checkpoint.
778
+
779
+ ### Framework Versions
780
+ - Python: 3.10.14
781
+ - Sentence Transformers: 3.0.1
782
+ - Transformers: 4.43.4
783
+ - PyTorch: 2.4.1+cu121
784
+ - Accelerate: 0.33.0
785
+ - Datasets: 2.21.0
786
+ - Tokenizers: 0.19.1
787
+
788
+ ## Citation
789
+
790
+ ### BibTeX
791
+
792
+ #### Sentence Transformers
793
+ ```bibtex
794
+ @inproceedings{reimers-2019-sentence-bert,
795
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
796
+ author = "Reimers, Nils and Gurevych, Iryna",
797
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
798
+ month = "11",
799
+ year = "2019",
800
+ publisher = "Association for Computational Linguistics",
801
+ url = "https://arxiv.org/abs/1908.10084",
802
+ }
803
+ ```
804
+
805
+ #### GISTEmbedLoss
806
+ ```bibtex
807
+ @misc{solatorio2024gistembed,
808
+ title={GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning},
809
+ author={Aivin V. Solatorio},
810
+ year={2024},
811
+ eprint={2402.16829},
812
+ archivePrefix={arXiv},
813
+ primaryClass={cs.LG}
814
+ }
815
+ ```
816
+
817
+ <!--
818
+ ## Glossary
819
+
820
+ *Clearly define terms in order to be accessible across audiences.*
821
+ -->
822
+
823
+ <!--
824
+ ## Model Card Authors
825
+
826
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
827
+ -->
828
+
829
+ <!--
830
+ ## Model Card Contact
831
+
832
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
833
+ -->
config.json ADDED
@@ -0,0 +1,31 @@
1
+ {
2
+ "_name_or_path": "BAAI/bge-small-en-v1.5",
3
+ "architectures": [
4
+ "BertModel"
5
+ ],
6
+ "attention_probs_dropout_prob": 0.1,
7
+ "classifier_dropout": null,
8
+ "hidden_act": "gelu",
9
+ "hidden_dropout_prob": 0.1,
10
+ "hidden_size": 384,
11
+ "id2label": {
12
+ "0": "LABEL_0"
13
+ },
14
+ "initializer_range": 0.02,
15
+ "intermediate_size": 1536,
16
+ "label2id": {
17
+ "LABEL_0": 0
18
+ },
19
+ "layer_norm_eps": 1e-12,
20
+ "max_position_embeddings": 512,
21
+ "model_type": "bert",
22
+ "num_attention_heads": 12,
23
+ "num_hidden_layers": 12,
24
+ "pad_token_id": 0,
25
+ "position_embedding_type": "absolute",
26
+ "torch_dtype": "float32",
27
+ "transformers_version": "4.43.4",
28
+ "type_vocab_size": 2,
29
+ "use_cache": true,
30
+ "vocab_size": 30522
31
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
1
+ {
2
+ "__version__": {
3
+ "sentence_transformers": "3.0.1",
4
+ "transformers": "4.43.4",
5
+ "pytorch": "2.4.1+cu121"
6
+ },
7
+ "prompts": {},
8
+ "default_prompt_name": null,
9
+ "similarity_fn_name": null
10
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4622c440c264399c23ac0e14ec6e14d6fb96180d6e6a6113d5ce008dcd5ef3f3
3
+ size 133462128
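
As a sanity check, the reported file size can be compared against a parameter count derived from the values in `config.json`. The sketch below assumes the standard BERT layout (word, position, and token-type embeddings, 12 encoder layers, and a pooler head) stored as float32; `bert_param_count` is a hypothetical helper, not a library function.

```python
def bert_param_count(vocab_size=30522, hidden=384, intermediate=1536,
                     layers=12, max_positions=512, type_vocab=2):
    """Count parameters of a standard BERT encoder with a pooler head."""
    layer_norm = 2 * hidden  # weight + bias
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm
    # Query, key, value and output projections, each with a bias.
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    feed_forward = ((hidden * intermediate + intermediate)
                    + (intermediate * hidden + hidden)
                    + layer_norm)
    pooler = hidden * hidden + hidden
    return embeddings + layers * (attention + feed_forward) + pooler

params = bert_param_count()
weight_bytes = params * 4  # float32, per "torch_dtype" in config.json
print(params, weight_bytes, 133462128 - weight_bytes)
```

Under these assumptions the count comes to 33,360,000 parameters, or 133,440,000 bytes of weights. The remaining 22,128 bytes would be consistent with the safetensors metadata header, though that attribution is a guess.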
modules.json ADDED
@@ -0,0 +1,20 @@
1
+ [
2
+ {
3
+ "idx": 0,
4
+ "name": "0",
5
+ "path": "",
6
+ "type": "sentence_transformers.models.Transformer"
7
+ },
8
+ {
9
+ "idx": 1,
10
+ "name": "1",
11
+ "path": "1_Pooling",
12
+ "type": "sentence_transformers.models.Pooling"
13
+ },
14
+ {
15
+ "idx": 2,
16
+ "name": "2",
17
+ "path": "2_Normalize",
18
+ "type": "sentence_transformers.models.Normalize"
19
+ }
20
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "max_seq_length": 512,
3
+ "do_lower_case": true
4
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
1
+ {
2
+ "cls_token": {
3
+ "content": "[CLS]",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "mask_token": {
10
+ "content": "[MASK]",
11
+ "lstrip": false,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "pad_token": {
17
+ "content": "[PAD]",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "sep_token": {
24
+ "content": "[SEP]",
25
+ "lstrip": false,
26
+ "normalized": false,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ },
30
+ "unk_token": {
31
+ "content": "[UNK]",
32
+ "lstrip": false,
33
+ "normalized": false,
34
+ "rstrip": false,
35
+ "single_word": false
36
+ }
37
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "[PAD]",
5
+ "lstrip": false,
6
+ "normalized": false,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": true
10
+ },
11
+ "100": {
12
+ "content": "[UNK]",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "101": {
20
+ "content": "[CLS]",
21
+ "lstrip": false,
22
+ "normalized": false,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": true
26
+ },
27
+ "102": {
28
+ "content": "[SEP]",
29
+ "lstrip": false,
30
+ "normalized": false,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": true
34
+ },
35
+ "103": {
36
+ "content": "[MASK]",
37
+ "lstrip": false,
38
+ "normalized": false,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": true
42
+ }
43
+ },
44
+ "clean_up_tokenization_spaces": true,
45
+ "cls_token": "[CLS]",
46
+ "do_basic_tokenize": true,
47
+ "do_lower_case": true,
48
+ "mask_token": "[MASK]",
49
+ "model_max_length": 512,
50
+ "never_split": null,
51
+ "pad_token": "[PAD]",
52
+ "sep_token": "[SEP]",
53
+ "strip_accents": null,
54
+ "tokenize_chinese_chars": true,
55
+ "tokenizer_class": "BertTokenizer",
56
+ "unk_token": "[UNK]"
57
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff