---
library_name: transformers
license: apache-2.0
pipeline_tag: automatic-speech-recognition
tags:
- audio
---

# Cascaded Japanese Speech2Text Translation

This is a pipeline for speech-to-text translation from Japanese speech to text in any target language supported by the translation model. It follows the cascaded approach, which chains automatic speech recognition (ASR) and text translation.

## Usage

Here is an example of translating Japanese speech into English text. First, download a sample audio file.

```bash
wget https://huggingface.co/datasets/japanese-asr/ja_asr.jsut_basic5000/resolve/main/sample.flac -O sample_ja.flac
```

Then, run the pipeline as below.

```python3
from transformers import pipeline

# Load the cascaded pipeline (ASR followed by translation)
pipe = pipeline(
    model="japanese-asr/ja-cascaded-s2t-translation",
    model_kwargs={"attn_implementation": "sdpa"},
    model_translation="facebook/nllb-200-distilled-600M",
    tgt_lang="eng_Latn",
    chunk_length_s=15,
    trust_remote_code=True,
)

# Translate the sample audio
output = pipe("./sample_ja.flac")
```
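
The `tgt_lang` argument in the example above is a FLORES-200 language code, which is the convention NLLB models use. To target another language, it should be enough to change that code. The sketch below is one way to organize this, not part of the pipeline itself: `FLORES_CODES`, `build_pipeline_kwargs`, and `load_pipeline` are hypothetical names introduced here for illustration, and the code table covers only a few languages.

```python3
# A few FLORES-200 codes as used by NLLB models; extend as needed.
FLORES_CODES = {
    "english": "eng_Latn",
    "french": "fra_Latn",
    "german": "deu_Latn",
    "spanish": "spa_Latn",
    "korean": "kor_Hang",
    "chinese_simplified": "zho_Hans",
}


def build_pipeline_kwargs(language: str) -> dict:
    """Hypothetical helper: build the `pipeline(...)` arguments for a target language."""
    key = language.strip().lower()
    if key not in FLORES_CODES:
        raise ValueError(
            f"Unsupported language {language!r}; known: {sorted(FLORES_CODES)}"
        )
    # Same arguments as the usage example; only `tgt_lang` varies.
    return {
        "model": "japanese-asr/ja-cascaded-s2t-translation",
        "model_kwargs": {"attn_implementation": "sdpa"},
        "model_translation": "facebook/nllb-200-distilled-600M",
        "tgt_lang": FLORES_CODES[key],
        "chunk_length_s": 15,
        "trust_remote_code": True,
    }


def load_pipeline(language: str):
    """Hypothetical helper: load the cascaded pipeline (downloads the models)."""
    # Imported lazily so build_pipeline_kwargs works without transformers installed.
    from transformers import pipeline

    return pipeline(**build_pipeline_kwargs(language))
```

For instance, `load_pipeline("french")("./sample_ja.flac")` would run the same sample through the pipeline with French as the target language.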