---
license: mit
language:
- ja
library_name: fairseq
---

# hubert-base-jtube
This repository provides the weights of a [HuBERT Base model](https://arxiv.org/abs/2106.07447) trained on the [JTubeSpeech](https://github.com/sarulab-speech/jtubespeech) corpus.

## Dataset
We extracted approximately 2,720 hours of Japanese speech from the single-speaker subset of the JTubeSpeech corpus.
The training data comprises approximately 6,000,000 utterances from about 55,000 speakers.

## How to use
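The following is a minimal sketch of how a fairseq HuBERT checkpoint is typically loaded and used for feature extraction. It is not taken from this repository: the checkpoint path is a placeholder for whichever weight file you download from here, the helper names (`expected_num_frames`, `load_hubert`, `extract_features`) are hypothetical, and the 16 kHz mono input and the default HuBERT Base convolutional front end are assumptions.

```python
# Assumption: default HuBERT Base convolutional front end, (kernel, stride) per layer.
CONV_LAYERS = [(10, 5)] + [(3, 2)] * 4 + [(2, 2)] * 2


def expected_num_frames(num_samples):
    """Number of output frames for a waveform of `num_samples` samples."""
    length = num_samples
    for kernel, stride in CONV_LAYERS:
        length = (length - kernel) // stride + 1
    return length


def load_hubert(checkpoint_path):
    """Load a fairseq checkpoint and return the model in eval mode."""
    # Imported lazily so the helpers above work without fairseq installed.
    from fairseq import checkpoint_utils

    models, _cfg, _task = checkpoint_utils.load_model_ensemble_and_task(
        [str(checkpoint_path)]
    )
    model = models[0]
    model.eval()
    return model


def extract_features(model, waveform, output_layer=None):
    """Return frame-level features for a (batch, samples) float tensor.

    Assumption: `waveform` is mono audio sampled at 16 kHz.
    """
    import torch

    with torch.no_grad():
        features, _padding_mask = model.extract_features(
            source=waveform,
            padding_mask=None,
            mask=False,
            output_layer=output_layer,
        )
    return features


# Example usage (the path is a placeholder, not a file name from this repo):
#   model = load_hubert("path/to/checkpoint.pt")
#   features = extract_features(model, torch.zeros(1, 16000))
```

Under these assumptions, one second of 16 kHz audio yields `expected_num_frames(16000)` frames, so the model produces roughly 50 feature vectors per second. Passing `output_layer` selects an intermediate transformer layer instead of the final one.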

# Contributors
* [Wataru Nakata/中田 亘](https://wataru-nakata.github.io)
* [Kentaro Seki/関 健太郎](https://trgkpc.github.io/)
* [Hitomi Yanaka/谷中 瞳](https://hitomiyanaka.mystrikingly.com/)
* [Takaaki Saeki/佐伯 高明](https://takaaki-saeki.github.io/)
* [Yuki Saito/齋藤 佑樹](https://sython.org/)
* [Shinnosuke Takamichi/高道 慎之介](https://sites.google.com/site/shinnosuketakamichi/home)

# Acknowledgements
This work was supported by the FY2023 KAKUSEI project of the National Institute of Advanced Industrial Science and Technology (AIST).