---
license: creativeml-openrail-m
language:
- my
tags:
- Myanmar
- Burmese
- GPT2
- MyanmarGPT
- Natural Language Processing
widget:
  - text: "အီတလီ"
    example_title: "Example 1"
  - text: "အနုပညာ"
    example_title: "Example 2"
  - text: "တရုတ်"
    example_title: "Example 3"
  - text: "ကျောက်ခေတ်"
    example_title: "Example 4"
  - text: "မြန်မာနိုင်ငံ"
    example_title: "Example 5"
---

# Myanmar-GPT

A GPT that understands Myanmar (Burmese): Myanmar GPT

MyanmarGPT is a GPT-2 model trained on a private Myanmar-language dataset created by MinSiThu.
The project aims to bring Myanmar language support to the GPT-2 model.

Fine-tuning MyanmarGPT is an easier way to build a custom Myanmar language model than starting from an alternative language model.
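
As a rough illustration of what fine-tuning could look like, the sketch below runs a plain causal language modelling loop over a list of Burmese strings. The function name `fine_tune`, the learning rate, and the one-text-per-step loop are assumptions made for illustration; this is not the procedure used to train MyanmarGPT, and a real project would more likely add batching, a validation split, and checkpointing.

```python
def fine_tune(texts, model_name="jojo-ai-mst/MyanmarGPT", epochs=1, lr=5e-5):
    """Minimal fine-tuning sketch: one optimizer step per training text."""
    # Imported inside the function so that defining it does not load the libraries.
    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained(model_name)
    model = GPT2LMHeadModel.from_pretrained(model_name)
    model.train()
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)

    for _ in range(epochs):
        for text in texts:
            if not text.strip():
                continue  # skip empty strings, which would produce no tokens
            batch = tokenizer(
                text,
                return_tensors="pt",
                truncation=True,
                max_length=model.config.n_positions,  # the model's context length
            )
            # Passing the input ids as labels makes the model return the LM loss.
            loss = model(**batch, labels=batch["input_ids"]).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
    return model
```

Calling `fine_tune(["..."])` with your own list of texts downloads the checkpoint and returns the updated model, which can then be saved with `save_pretrained`.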

Reports on training the MyanmarGPT model are visualized at [MyanmarGPT Report](https://api.wandb.ai/links/minsithu/wn8yul90).

Variants of the Burmese Language-Enabled Models can be found at [https://github.com/MinSiThu/MyanmarGPT](https://github.com/MinSiThu/MyanmarGPT).

There is also a 1.42-billion-parameter MyanmarGPT-Big model with multilingual support.
You can find [MyanmarGPT-Big here](https://huggingface.co/jojo-ai-mst/MyanmarGPT-Big).

Currently, MyanmarGPT has four main variants.

- [MyanmarGPT](https://huggingface.co/jojo-ai-mst/MyanmarGPT)
- [MyanmarGPT-Big](https://huggingface.co/jojo-ai-mst/MyanmarGPT-Big)
- [MyanmarGPT-Chat](https://huggingface.co/jojo-ai-mst/MyanmarGPT-Chat)
- [MyanmarGPTX](https://huggingface.co/jojo-ai-mst/MyanmarGPTX)


## How to use in your project

```
!pip install transformers
```

```python
from transformers import pipeline

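# Load the model into a text-generation pipeline.
generator = pipeline("text-generation", model="jojo-ai-mst/MyanmarGPT")
# do_sample=False selects greedy decoding, so the output is deterministic.
outputs = generator("အီတလီ", do_sample=False)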

print(outputs)
# [{'generated_text': 'အီတလီနိုင်ငံသည် ဥရောပတိုက်၏ တောင်ဘက်တွင် မြေထဲပင်လယ်ထဲသို့ ထိုးထွက်နေသော ကျွန်းဆွယ်ကြီးတစ်ခုဖြစ်၍ ပုံသဏ္ဌာန်အားဖြင့် မြင်းစီးဖိနပ်နှင့် တူလေသည်။ မြောက်ဘက်မှ တောင်ဘက်အငူစွန်းအထိ မိုင်ပေါင်း ၇၅ဝ ခန့် ရှည်လျား၍၊ ပျမ်းမျှမိုင် ၁ဝဝ မှ ၁၂ဝ ခန့်ကျယ်သည်။ အီတလီနိုင်ငံ၏ အကျယ်အဝန်းမှာ ဆာဒင်းနီးယားကျွန်း၊ စစ္စလီကျွန်းနှင့် အနီးပတ်ဝန်းကျင်ရှိ ကျွန်းကလေးများ အပါအဝင် ၁၁၆,၃၅၀ စတုရန်းမိုင်ရှိသည်။ '}]
```
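
The pipeline returns a list of dictionaries, each with a `generated_text` key, as the printed output above shows. Below is a minimal sketch for pulling out only the newly generated continuation. The helper name `extract_continuations` is hypothetical and not part of `transformers`; it assumes the generated text begins with the prompt, which is the case in the output above.

```python
def extract_continuations(prompt, outputs):
    """Return the text generated after the prompt for each pipeline result.

    `outputs` is assumed to be shaped like the pipeline output shown above:
    a list of dicts, each with a "generated_text" key.
    """
    continuations = []
    for item in outputs:
        text = item["generated_text"]
        # Strip the prompt only when the generated text actually starts with it.
        if text.startswith(prompt):
            text = text[len(prompt):]
        continuations.append(text.strip())
    return continuations

# Stand-in data: a shortened copy of the output above, so no model download is needed.
sample_outputs = [{"generated_text": "အီတလီနိုင်ငံသည် ဥရောပတိုက်၏ တောင်ဘက်တွင် "}]
print(extract_continuations("အီတလီ", sample_outputs))
```

With a loaded pipeline, the same helper can be called as `extract_continuations("အီတလီ", outputs)`.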

### Alternative ways

```python
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

# Use the GPU when one is available; otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

model = GPT2LMHeadModel.from_pretrained("jojo-ai-mst/MyanmarGPT").to(device)
tokenizer = GPT2Tokenizer.from_pretrained("jojo-ai-mst/MyanmarGPT")

def generate_text(prompt, max_length=300, temperature=0.8, top_k=50):
    # Encode the prompt and place it on the same device as the model.
    input_ids = tokenizer.encode(prompt, return_tensors="pt").to(device)
    output = model.generate(
        input_ids,
        max_length=max_length,
        temperature=temperature,
        top_k=top_k,
        pad_token_id=tokenizer.eos_token_id,
        do_sample=True,  # sample so that temperature and top_k take effect
    )
    # Decode and print every returned sequence.
    for result in output:
        generated_text = tokenizer.decode(result, skip_special_tokens=True)
        print(generated_text)

generate_text("အီတလီ ")
```
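
The `temperature` and `top_k` arguments above control how each next token is sampled. The following is a conceptual sketch in plain Python, not the `transformers` implementation: the function name `sample_top_k` and the toy scores are made up for illustration. It keeps the `top_k` highest-scoring tokens, rescales their scores by the temperature, and draws one token from the resulting softmax distribution.

```python
import math
import random

def sample_top_k(logits, temperature=0.8, top_k=50, rng=random):
    """Sample one token index from raw scores using temperature and top-k filtering."""
    # Keep only the indices of the top_k highest-scoring tokens.
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    kept = ranked[:top_k]
    # Divide by the temperature: values below 1 sharpen the distribution,
    # values above 1 flatten it.
    scaled = [logits[i] / temperature for i in kept]
    # Softmax weights over the surviving tokens (subtract the max for stability).
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    # Draw one index in proportion to its weight.
    return rng.choices(kept, weights=weights, k=1)[0]

toy_logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for a four-token vocabulary
print(sample_top_k(toy_logits, temperature=0.8, top_k=2))  # prints 0 or 1
```

Lowering `top_k` or `temperature` makes the output more predictable, while raising them makes it more varied.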

## Roadmap for Burmese Language and Artificial Intelligence

Since I started MyanmarGPT, it has had a huge impact in Myanmar, so I am continuing the project as a movement called the [MyanmarGPT Movement](https://github.com/MyanmarGPT-Movement).
The MyanmarGPT Movement is open to everyone who wants to initiate AI projects in Myanmar.

## Guidelines for using the MyanmarGPT license
- MyanmarGPT is free for everyone to use.

- **Must do**
  - Any project that is derived or fine-tuned from MyanmarGPT, uses MyanmarGPT internally,
  - modifies MyanmarGPT, or is otherwise related to MyanmarGPT **must include the citation below** on the corresponding project's page.
- The citation:
```latex
@software{MyanmarGPT,
  author = {{MinSiThu}},
  title = {MyanmarGPT},
  version = {1.1-SweptWood},
  url = {https://huggingface.co/jojo-ai-mst/MyanmarGPT},
  urldate = {2023-12-14},
  date = {2023-12-14},
}
```

To get in touch, reach me via [https://www.linkedin.com/in/min-si-thu/](https://www.linkedin.com/in/min-si-thu/).