---
tags:
- fastai
- text-translation
language: ml
widget:
- text: "കേൾക്കുന്ന എല്ലാ കാര്യങ്ങളും എനിക്കു മനസിലായില്ല"
example_title: "Malayalam Seq2Seq translation"
---
# Malayalam-English ULMFiT translation model (Work in Progress)
[![മലയാളം: kaggle notebook](https://img.shields.io/badge/മലയാളം%20-notebook-green.svg)](https://www.kaggle.com/code/rajeshradhakrishnan/ml-ulmfit-seq2seq-translation)
---
# malayalam-ULMFit-Seq2Seq (Translation model)
The malayalam-ULMFit-Seq2Seq model builds on the pre-trained [Malyalam_Language_Model_ULMFiT](https://github.com/goru001/nlp-for-malyalam/blob/master/language-model/Malyalam_Language_Model_ULMFiT.ipynb) language model, trained with [fastai](https://docs.fast.ai/text.data.html).
The text is tokenized with SentencePiece using a vocabulary size of 10,000, and the resulting language model is uploaded to a [kaggle dataset](https://www.kaggle.com/datasets/rajeshradhakrishnan/ulmfit-fastai).
## Usage
```python
!pip install -Uqq huggingface_hub["fastai"]

from huggingface_hub import from_pretrained_fastai

# Set repo_id to this model's Hugging Face repo id ("<user>/<model-name>")
learner = from_pretrained_fastai(repo_id)

original_xtext = 'കേൾക്കുന്ന എല്ലാ കാര്യങ്ങളും എനിക്കു മനസിലായില്ല'
original_ytext = "I didn't understand all this"
predicted_text = learner.predict(original_xtext)

print(f'original text: {original_xtext}')
print(f'original answer: {original_ytext}')
print(f'predicted text: {predicted_text}')
```
## Intended uses & limitations
The model is not fine-tuned to state-of-the-art accuracy; treat it as a work-in-progress baseline rather than a production translation system.
## Training and evaluation data
[Malayalam Samanantar dataset (English-Malayalam pairs, uploaded to kaggle)](https://www.kaggle.com/datasets/rajeshradhakrishnan/ulmfit-fastai)