---
license: mit
datasets:
- Wenetspeech4TTS/WenetSpeech4TTS
language:
- zh
pipeline_tag: text-to-speech
---

## Vanilla VALL-E trained on WenetSpeech4TTS using the Amphion toolkit

Training follows Amphion's training code, except that the text-to-phoneme step is slightly different.

### Checkpoints

- **base_model.bin**: VALL-E trained on the WenetSpeech4TTS Basic subset
- **38sft_model.bin**: the Basic model fine-tuned on the WenetSpeech4TTS Standard subset
- **4sft_model.bin**: the Standard model fine-tuned on the WenetSpeech4TTS Premium subset

### Usage
Inference code and more details: [ISCSLP2024_CoVoC_baseline](https://github.com/xkx-hub/ISCSLP2024_CoVoC_baseline).
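
To fetch one of the checkpoints listed above before running the inference code, something like the following may work. This is a minimal sketch, not part of the baseline repository: the stage keys (`basic`, `standard`, `premium`) and the helper names `checkpoint_filename` and `download_checkpoint` are illustrative assumptions, and `repo_id` must be set to this repository's id on the Hub, which is not stated here. Only `hf_hub_download` from `huggingface_hub` is an existing library call.

```python
from pathlib import Path

# Checkpoint file names as listed in this model card, keyed by the
# WenetSpeech4TTS subset used in the final training stage.
# The keys themselves are illustrative, not an official naming scheme.
CHECKPOINTS = {
    "basic": "base_model.bin",
    "standard": "38sft_model.bin",
    "premium": "4sft_model.bin",
}


def checkpoint_filename(stage: str) -> str:
    """Return the checkpoint file name for a training stage (hypothetical helper)."""
    key = stage.strip().lower()
    if key not in CHECKPOINTS:
        raise ValueError(
            f"Unknown stage {stage!r}; expected one of {sorted(CHECKPOINTS)}"
        )
    return CHECKPOINTS[key]


def download_checkpoint(repo_id: str, stage: str = "premium") -> Path:
    """Download one checkpoint from the Hub and return its local path.

    `repo_id` is an assumption: pass the Hub id of this model repository.
    """
    # Imported lazily so the mapping above is usable without huggingface_hub.
    from huggingface_hub import hf_hub_download

    local_path = hf_hub_download(
        repo_id=repo_id,
        filename=checkpoint_filename(stage),
    )
    return Path(local_path)
```

Calling `download_checkpoint("<this-repo-id>", "premium")` would return the cached local path of `4sft_model.bin`, which can then be passed to the inference scripts in the linked baseline repository.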