docs: update input requirements #5
opened by AricGamma

app.py CHANGED
@@ -91,23 +91,6 @@ with gr.Blocks(css=css) as demo:
         ''', elem_id="warning-duplicate")
     gr.Markdown("# Demo for Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation")
     gr.Markdown("Generate talking head avatars driven from audio. **5 seconds of audio takes >10 minutes to generate on an L4** - duplicate the space for private use or try for free on Google Colab")
-    gr.Markdown("""
-    Hallo has a few simple requirements for input data:
-
-    For the source image:
-
-    1. It should be cropped into squares.
-    2. The face should be the main focus, making up 50%-70% of the image.
-    3. The face should be facing forward, with a rotation angle of less than 30° (no side profiles).
-
-    For the driving audio:
-
-    1. It must be in WAV format.
-    2. It must be in English since our training datasets are only in this language.
-    3. Ensure the vocals are clear; background music is acceptable.
-
-    We have provided some [samples](https://huggingface.co/datasets/fudan-generative-ai/hallo_inference_samples) for your reference.
-    """)
     with gr.Row():
         with gr.Column():
             avatar_face = gr.Image(type="filepath", label="Face")
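For reference, the requirements removed above lend themselves to a lightweight pre-check before submission. The sketch below is not part of this PR or of app.py; it only illustrates how one might warn users about obviously unsuitable inputs, assuming Pillow is installed for image inspection. The helper names check_source_image and check_driving_audio are hypothetical.

# Hypothetical pre-check for Hallo inputs, based on the requirements removed above.
# Assumes Pillow (PIL) is available; these helpers are illustrative, not part of app.py.
import wave

from PIL import Image


def check_source_image(path: str) -> list[str]:
    """Return warnings for a source portrait image."""
    warnings = []
    with Image.open(path) as img:
        w, h = img.size
    if w != h:
        warnings.append(f"Image is {w}x{h}; it should be cropped into a square.")
    # Face proportion (50%-70% of the image) and head rotation (< 30 degrees)
    # would require a face detector and are not checked here.
    return warnings


def check_driving_audio(path: str) -> list[str]:
    """Return warnings for a driving audio file."""
    warnings = []
    if not path.lower().endswith(".wav"):
        warnings.append("Audio must be in WAV format.")
    else:
        try:
            with wave.open(path, "rb") as wav:
                wav.getnframes()  # raises wave.Error if the file is not a valid WAV
        except wave.Error:
            warnings.append("File has a .wav extension but is not a readable WAV.")
    # Language (English) and vocal clarity cannot be verified without extra models.
    return warnings


if __name__ == "__main__":
    print(check_source_image("face.png"))
    print(check_driving_audio("speech.wav"))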