# Changelog

## v2.3 (2024-02-25)

### Major changes

#### User dictionary

You can now register proper nouns in a dictionary in advance. The registered readings are applied wherever readings are obtained, both during training and during speech synthesis. Add and edit dictionary entries through the editor described below.

The dictionary [implementation](/text/user_dict/) is taken from [VOICEVOX Editor](https://github.com/VOICEVOX/voicevox), as noted in the README in that directory, and the code in that part is licensed under LGPL-3.

#### Dedicated speech synthesis editor

Added an editor dedicated to speech synthesis. In addition to everything the existing WebUI could do, it offers the following features (in other words, it imitates the editors of existing Japanese speech synthesis software):

- Build a script line by line while changing the character and settings per line, generate all lines in one batch, and save or load the script
- Easy-to-follow accent adjustment from the GUI
- Adding and editing words in the user dictionary

Launch it by double-clicking `Editor.bat` or by running `python server_editor.py --inbrowser`. The editor frontend lives in [a separate repository](https://github.com/litagin02/Style-Bert-VITS2-Editor). I am a frontend beginner, so pull requests and suggestions for improvement are welcome.

### Improvements

- Added an option to freeze the decoder during training. This might improve quality.

## v2.2 (2024-02-09)

### Changes and new features

- The bfloat16 option appears to have only downsides, so training now always runs with it turned off
- Changed the default batch size from 4 to 2. If training is slow, try lowering the batch size; if you have VRAM to spare, raise it. With JP-Extra, approximate VRAM usage per batch size appears to be 1: 6GB, 2: 8GB, 3: 10GB, 4: 12GB.
- Changed the default number of validation samples used during training to 0, and made that number configurable from the training WebUI
- Made the Tensorboard logging interval configurable from the training WebUI
- Made the UI theme configurable through `GRADIO_THEME` in `common/constants.py`

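The VRAM figures in the batch size note can be turned into a small lookup for picking a starting value. This is only a sketch: `JP_EXTRA_VRAM_GB` and `suggest_batch_size` are hypothetical names that do not exist in the project, and the numbers are just the rough estimates quoted in this changelog.

```python
# Rough JP-Extra VRAM estimates quoted in the changelog: batch size -> GB.
# These are approximate figures, not measured guarantees.
JP_EXTRA_VRAM_GB = {1: 6, 2: 8, 3: 10, 4: 12}


def suggest_batch_size(available_vram_gb: float) -> int:
    """Return the largest listed batch size whose estimate fits (hypothetical helper)."""
    fitting = [bs for bs, gb in JP_EXTRA_VRAM_GB.items() if gb <= available_vram_gb]
    if not fitting:
        raise ValueError("less VRAM available than the smallest estimate (6GB)")
    return max(fitting)
```

Treat the result as a starting point and adjust it after watching actual VRAM usage during training.
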
### Bug fixes

- Fixed a bug that raised an error during training when the batch size was 1 with JP-Extra
- Fixed a bug that caused errors in training and speech synthesis when symbols such as exclamation marks appeared consecutively, as in 「こんにちは!?!?!?!?」
- Dash and hyphen variants such as `—` (em dash, U+2014) and `―` (quotation dash, U+2015) were normalized to `-` (the ordinary half-width hyphen) for some variants but not for others; all of them are now normalized

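The dash normalization described in the last fix can be sketched with a translation table. This is an illustration, not the project's code: `normalize_dashes` and `DASH_VARIANTS` are hypothetical names, and only U+2014 and U+2015 are named in this changelog; the other code points in the table are assumed examples of dash-like characters.

```python
# Dash and hyphen variants to fold into the ordinary ASCII hyphen.
# Only U+2014 and U+2015 come from the changelog; the rest are assumptions.
DASH_VARIANTS = "\u2010\u2011\u2012\u2013\u2014\u2015\u2212\uff0d"
_DASH_TABLE = str.maketrans({ch: "-" for ch in DASH_VARIANTS})


def normalize_dashes(text: str) -> str:
    """Replace every listed dash variant with '-' (hypothetical helper)."""
    return text.translate(_DASH_TABLE)
```

Using a single table for all variants avoids the inconsistency mentioned above, where some variants were normalized and others slipped through.
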
## v2.1 (2024-02-07)

### Changes

- Training no longer uses the bfloat16 option by default (it appears to sometimes make training diverge or lower quality)
- Made an effort to reduce memory usage during training

### Bug fixes and improvements

- Tensorboard logs can now be viewed from the training WebUI
- Fixed an error in speech synthesis (and its API) that occurred when synthesis requests selecting different speakers arrived at the same time
- Model merging now saves its recipe to a `recipe.json` file
- Minor improvements to the explanatory text, such as stating explicitly that "generate split by line breaks" tends to carry more emotion
- In input such as 「`ーーそれは面白い`」 or 「`なるほど。ーーーそういうことか。`」, where what precedes the long vowel mark is not a vowel, the long vowel mark `ー` is presumably a mistake for the dash `―`, so it is now processed as a dash

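The long vowel mark heuristic above can be approximated at the character level. The changelog describes the check in terms of vowels, which the project presumably applies to phonemes; the sketch below instead assumes that "preceded by a vowel" can be approximated as "preceded by a kana other than ん or っ". The function names are hypothetical.

```python
import re

# Kana that do not end in a vowel sound, so a following "ー" cannot lengthen them.
NON_VOWEL_KANA = set("んンっッ")


def is_vowel_kana(ch: str) -> bool:
    """True for hiragana/katakana that end in a vowel (simplified assumption)."""
    is_kana = "\u3041" <= ch <= "\u3096" or "\u30a1" <= ch <= "\u30fa"
    return is_kana and ch not in NON_VOWEL_KANA


def fix_misused_long_vowel_marks(text: str) -> str:
    """Turn runs of "ー" into dashes when no vowel precedes them (hypothetical helper)."""

    def replace(match: re.Match) -> str:
        start = match.start()
        if start > 0 and is_vowel_kana(text[start - 1]):
            return match.group()  # genuine long vowel, e.g. "ラーメン"
        return "\u2015" * len(match.group())  # treat the whole run as dashes

    return re.sub("ー+", replace, text)
```

With this sketch, `ラーメン` is left unchanged, while the leading `ーー` in `ーーそれは面白い` becomes two `―` characters.
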
## v2.0.1 (2024-02-05)

Minor bug fixes and improvements

- Style vectors that contain `NaN` (which happens mainly when an audio file is extremely short) are now excluded from the training list
- Added merging to the colab notebook
- Fixed the broken progress bar display during training
- Updated the default jvnv model to the JP-Extra version. To use the new model, either download it manually from [here](https://huggingface.co/litagin/style_bert_vits2_jvnv/tree/main), run `python initialize.py`, or place [this bat file](https://github.com/litagin02/Style-Bert-VITS2/releases/download/2.0.1/Update-to-JP-Extra.bat) in the location that contains the `Style-Bert-VITS2` folder (where the install bat file and similar files were) and double-click it.

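The `NaN` exclusion in the first item can be pictured as a simple filter over per-file style vectors. This is a minimal sketch under assumptions: the mapping from audio path to vector and the name `drop_nan_style_vectors` are hypothetical, and the project's actual data structures may differ.

```python
import math
from typing import Dict, Sequence


def drop_nan_style_vectors(
    vectors: Dict[str, Sequence[float]],
) -> Dict[str, Sequence[float]]:
    """Keep only entries whose style vector has no NaN component (hypothetical helper)."""
    return {
        path: vec
        for path, vec in vectors.items()
        if not any(math.isnan(x) for x in vec)
    }
```

Files dropped this way are simply left out of training, which matches the behavior described above for extremely short audio files.
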
## v2.0 (2024-02-03)

### Major changes

The model can now use a structure that incorporates [JP-Extra, the Japanese-specialized Bert-VITS2 model](https://github.com/fishaudio/Bert-VITS2/releases/tag/JP-Exta). The [pretrained model](https://huggingface.co/litagin/Style-Bert-VITS2-2.0-base-JP-Extra) is likewise adapted from the [Bert-VITS2 JP-Extra](https://huggingface.co/Stardust-minus/Bert-VITS2-Japanese-Extra) one so that it works with Style-Bert-VITS2. (Thanks to [@Stardust-minus](https://github.com/Stardust-minus) for reworking the model structure and training it on Japanese.)

- As a result, Japanese pronunciation, accent, intonation, and naturalness tend to improve
- Style control through style vectors works as before
- However, JP-Extra cannot (currently) synthesize English or Chinese speech
- Old models remain usable, and you can still train with the old model structure
- The default JVNV model currently remains the old version

### Improvements

- `Merge.bat` can now merge the voice in finer detail, separately in terms of "voice quality" and "voice pitch"

### Bug fixes

- Fixed a bug caused by the PyTorch version (torch is now pinned to version 2.1.2)
- Fixed a bug that caused errors in training and speech synthesis when `―` (a dash, not the long vowel mark) appeared twice in a row
- Fixed an issue where the kana accent notation for "ん + vowel" words such as 「三円」 came out as 「サネン」 and the like, and occasionally raised an error (the phoneme for 「ん」 is now unified internally as 「N」)

## v1.3 (2024-01-09)

### Major changes

- Fixed and refactored bugs in the Japanese pronunciation and accent processing that existed in the original Bert-VITS2
- `車両` was pronounced and trained as `シャリヨオ`, `思う` as `オモオ`, `見つける` as `ミッケル`, and so on, and all accent information after such a word was lost
- The accent of `私はそれを見る` was `ワ↑タシ↓ワ　ソ↑レ↓オ　ミ↓ル` and is now corrected to `ワ↑タシワ　ソ↑レオ　ミ↓ル`
- Latin alphabet and Greek characters, which used to be ignored in training and speech synthesis, are no longer ignored (they are basically read letter by letter, though simple words can apparently be read; for training it is safer to convert them to katakana or similar)
- As a side effect of these fixes, preprocessing now stops on unreadable kanji and similar characters that were previously ignored. When that happens, check and correct the transcription.
- Speech can now be synthesized with adjusted accents (this does not give complete control, but it sometimes improves the result).

Existing models remain usable as before, and their accent and pronunciation may improve. Retraining with the new version may improve them further, but it is unclear whether the improvement would be dramatic.

### Improvements

- The audio slicing and transcription in `Dataset.bat` are now more customizable (slice length in seconds, Whisper model used for transcription, language, and so on)
- Style separation in `Style.bat` can now play a specified number of sample audio clips per style. Also added a new dimensionality reduction method (UMAP) and a new style separation method (DBSCAN); UMAP may separate styles better
- Speech synthesis in `App.bat` now lets you choose the speaker for multi-speaker models
- Added an optional section to the colab [notebook](http://colab.research.google.com/github/litagin02/Style-Bert-VITS2/blob/master/colab.ipynb) that builds a dataset from audio files alone
- Paths can now be specified on the user side for cloud execution and similar cases: path settings are consolidated in `configs/paths.yml` (the colab [notebook](http://colab.research.google.com/github/litagin02/Style-Bert-VITS2/blob/master/colab.ipynb) has been updated accordingly). The defaults are `dataset_root: Data` and `assets_root: model_assets`, so change them if you run in the cloud or a similar environment.
- Added a script that uses [SpeechMOS](https://github.com/tarepan/SpeechMOS) as *one* indicator of which step count produces the best output:

  ```bash
  python speech_mos.py -m <model_name>
  ```

  It displays a naturalness score for each step and saves the results to `mos_{model_name}.csv` and `mos_{model_name}.png` in the `mos_results` folder. To change the sentences that are read aloud, edit the script yourself. The score ignores accent, emotional expression, and intonation entirely and is only a rough guide, so actually listening to the output and choosing by ear is probably the best approach.
- The warmup option during training now works (a PR by [@kale4eat](https://github.com/kale4eat), thank you!). Change `warmup_epochs` under `train` in the `config.json` generated during preprocessing to set the number of warmup epochs. The default is `0`, which keeps the same learning-rate behavior as before.

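Editing `warmup_epochs` by hand works fine, but the change can also be scripted. The sketch below assumes only what the entry above states, namely that `config.json` has a `train` section holding `warmup_epochs`; the helper name `set_warmup_epochs` and the path passed to it are hypothetical, since the location of the generated file depends on your setup.

```python
import json
from pathlib import Path


def set_warmup_epochs(config_path: str, epochs: int) -> None:
    """Set train.warmup_epochs in a config.json file (hypothetical helper)."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    # The changelog places `warmup_epochs` under the `train` section.
    config.setdefault("train", {})["warmup_epochs"] = epochs
    path.write_text(
        json.dumps(config, ensure_ascii=False, indent=2), encoding="utf-8"
    )
```

Call it after preprocessing and before starting training, for example `set_warmup_epochs("path/to/config.json", 2)`, where the path is a placeholder for wherever your model's `config.json` was generated.
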
### Other

- Removed the normalization feature from audio slicing in `Dataset.bat` (it can be done during training preprocessing)
- Volume normalization and silence trimming in `Train.bat` are now off by default
- Training progress is now displayed against the total number of epochs, making overall progress easier to follow (a PR by [@RedRayz](https://github.com/RedRayz), thank you!)
- Other bug fixes and the like (thank you, [@tinjyuu](https://github.com/@tinjyuu) and [@darai0512](https://github.com/darai0512)!)
- Added a `freeze_style` option to `config.json` that leaves the style embedding untrained (default `false`)

### TIPS

- When training on Japanese, setting `freeze_bert` and `freeze_en_bert` to `true` in `config.json` may keep the model's English and Chinese speaking ability from degrading during training. This has not been compared much, though, so it is uncertain.

## v1.2 (2023-12-31)

- Speech synthesis is now supported for users without a GPU; install with `Install-Style-Bert-VITS2-CPU.bat`.
- Training on Google Colab is now supported; added a [notebook](http://colab.research.google.com/github/litagin02/Style-Bert-VITS2/blob/master/colab.ipynb)
- Added an API server for speech synthesis; start it with `python server_fastapi.py`. Check the API specification at `/docs` after startup. (A PR by [@darai0512](https://github.com/darai0512), thank you!)
- A default style, Neutral, is now generated automatically during training. If you have no particular need to specify styles, you can try speech synthesis right after training. You can still create styles yourself as before.
- New merge feature: `Merge.bat`, `webui_merge.py`
- Added an option to remove silence at the start and end of audio files during preprocessing resampling (on by default)
- `スタイルテキスト (style text)` was easily confused with style specification, so it was renamed to `アシストテキスト (assist text)`
- Other code refactoring

## v1.1 (2023-12-29)

- Improved and adjusted the Train and Dataset WebUIs (batch preprocessing button, etc.)
- Added an option to normalize volume during preprocessing resampling (on by default)

## v1.0 (2023-12-27)

- Initial release