metadata
title: Audio Abstract42
emoji: 😻
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.7.1
app_file: app.py
pinned: false

PDF Audio Summarizer

This application summarizes PDF documents and converts the summary to audio.

How it works

The core logic is in the audio_pdf function. It:

  1. Extracts raw text from the uploaded PDF using PyPDF2
  2. Summarizes the text with an LED (Longformer Encoder-Decoder) summarization model from Hugging Face Transformers, using AutoTokenizer and AutoModelForSeq2SeqLM to load the model and generate the summary
  3. Converts the text summary to an audio file using gTTS (Google Text-to-Speech)

The summary and audio file are returned and displayed in the Gradio web interface.
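
The three steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in app.py: the checkpoint name allenai/led-base-16384, the generation settings, the output path summary.mp3, and the helper names extract_text, summarize, and text_to_speech are all chosen for the example. It also assumes a PyPDF2 release that provides PdfReader.

```python
from typing import Callable, Tuple


def extract_text(pdf_path: str) -> str:
    """Step 1: read the raw text of every page with PyPDF2."""
    from PyPDF2 import PdfReader  # lazy import: only needed when this step runs

    reader = PdfReader(pdf_path)
    # extract_text() may return an empty value for image-only pages
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def summarize(text: str, model_name: str = "allenai/led-base-16384") -> str:
    """Step 2: summarize with a seq2seq model (the checkpoint is an assumption)."""
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    # Truncate long documents so the input fits the model (limit is illustrative)
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=4096)
    output_ids = model.generate(**inputs, max_length=256, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def text_to_speech(text: str, out_path: str = "summary.mp3") -> str:
    """Step 3: write the spoken summary to an MP3 file with gTTS."""
    from gtts import gTTS

    gTTS(text=text, lang="en").save(out_path)
    return out_path


def audio_pdf(
    pdf_path: str,
    extractor: Callable[[str], str] = extract_text,
    summarizer: Callable[[str], str] = summarize,
    speaker: Callable[[str], str] = text_to_speech,
) -> Tuple[str, str]:
    """Run the three steps and return (summary text, audio file path)."""
    text = extractor(pdf_path)
    if not text.strip():
        raise ValueError("No extractable text found in the PDF")
    summary = summarizer(text)
    return summary, speaker(summary)
```

The step functions are passed in as parameters only so that each one can be swapped or stubbed out in isolation. Calling audio_pdf("paper.pdf") with the defaults downloads the model on first use.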

Interface

The interface is created using Gradio. The key components are:

  • File input to upload a PDF
  • Text output to display the text summary
  • Audio output to play the audio file

The interface is launched with iface.launch().
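
A minimal sketch of how such an interface might be wired up, assuming the Gradio 4.x components gr.File, gr.Textbox, and gr.Audio, and a function that takes a PDF path and returns a (summary, audio path) pair. The labels, the title argument, and the helper name build_interface are illustrative and not taken from app.py.

```python
def build_interface(audio_pdf):
    """Wrap a (pdf_path) -> (summary, audio_path) function in a Gradio UI."""
    import gradio as gr  # lazy import so this module loads without Gradio

    return gr.Interface(
        fn=audio_pdf,
        # type="filepath" hands the function a path string to the uploaded PDF
        inputs=gr.File(label="Upload a PDF", file_types=[".pdf"], type="filepath"),
        outputs=[
            gr.Textbox(label="Summary"),
            # type="filepath" lets the function return the path of the audio file
            gr.Audio(label="Audio summary", type="filepath"),
        ],
        title="PDF Audio Summarizer",
    )
```

With this helper, iface = build_interface(audio_pdf) followed by iface.launch() would start the app. The order of the outputs list has to match the order of the values the function returns.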

Dependencies

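  • PyPDF2: extracts text from the uploaded PDF
  • Transformers: loads the summarization model and tokenizer
  • gTTS: converts the summary to speech
  • Gradio: provides the web interface
  • torch: backend for neural network computations in Transformers
  • numpy: numerical processing
  • scipy: scientific computing
  • io: buffers the audio data (part of the Python standard library, so no installation is needed)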
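
A quick way to check that the third-party packages listed above are available in the current environment is sketched below. The names REQUIRED and missing_packages are hypothetical helpers, not part of the app. The mapping reflects that gTTS is imported as gtts, and io is left out because it ships with Python.

```python
import importlib.util

# pip package name -> importable module name
REQUIRED = {
    "PyPDF2": "PyPDF2",
    "transformers": "transformers",
    "gTTS": "gtts",
    "gradio": "gradio",
    "torch": "torch",
    "numpy": "numpy",
    "scipy": "scipy",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose module cannot be found in this environment."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```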

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference