---
title: Audio Abstract42
emoji: 😻
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.7.1
app_file: app.py
pinned: false
---
# PDF Audio Summarizer

This application summarizes PDF documents and converts the summary to audio.

## How it works

The core logic is in the `audio_pdf` function. It:

1. Extracts raw text from the uploaded PDF using `PyPDF2`.
2. Summarizes the text with the [LED-based summarization model](https://huggingface.co/pszemraj/led-base-book-summary) (`pszemraj/led-base-book-summary`) from the Hugging Face Hub. The model and its tokenizer are loaded with `AutoModelForSeq2SeqLM` and `AutoTokenizer` from Transformers, and the model generates the summary.
3. Converts the text summary to an audio file using `gTTS` (Google Text-to-Speech).

The summary and audio file are returned and displayed in the Gradio web interface.
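
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: the helper names (`extract_text`, `summarize`, `text_to_speech`), the `*_fn` parameters, the token limits, and the `summary.mp3` output path are all assumptions made for the example. The real app may buffer the audio in memory instead of writing a file.

```python
def extract_text(pdf_path):
    """Concatenate the text of every page in the PDF (hypothetical helper)."""
    from PyPDF2 import PdfReader  # imported lazily so the sketch loads without it

    reader = PdfReader(pdf_path)
    # extract_text() can return None or "" for image-only pages
    return "\n".join((page.extract_text() or "") for page in reader.pages)


def summarize(text, model_name="pszemraj/led-base-book-summary"):
    """Summarize text with the LED model (limits below are assumed values)."""
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=4096)
    summary_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=256,  # assumed cap on summary length
        num_beams=4,
    )
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)


def text_to_speech(text, out_path="summary.mp3"):
    """Write the spoken summary to an MP3 file and return its path."""
    from gtts import gTTS

    gTTS(text=text, lang="en").save(out_path)
    return out_path


def audio_pdf(pdf_path, extract_fn=extract_text, summarize_fn=summarize,
              speak_fn=text_to_speech):
    """Run extract -> summarize -> speak and return (summary, audio_path)."""
    text = extract_fn(pdf_path)
    if not text.strip():
        # Nothing to summarize, e.g. a scanned PDF without a text layer
        return "No extractable text found in the PDF.", None
    summary = summarize_fn(text)
    return summary, speak_fn(summary)
```

Passing the three stages as parameters is only a convenience for this sketch: it lets each stage be swapped out or stubbed without downloading the model.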

## Interface 

The interface is created using Gradio. The key components are:

- `File` input to upload a PDF
- `Text` output to display the text summary 
- `Audio` output to play the audio file

The interface is launched via `iface.launch()`.
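
A minimal sketch of how these components might be wired together is shown below. The `build_interface` helper, the labels, and the title are assumptions for illustration; the actual `app.py` may construct the interface differently.

```python
def build_interface(fn):
    """Wrap a PDF-to-(summary, audio) function in a Gradio interface.

    `fn` is expected to take the uploaded PDF and return a
    (summary_text, audio_file_path) pair.
    """
    import gradio as gr  # imported lazily so the sketch loads without Gradio

    return gr.Interface(
        fn=fn,
        inputs=gr.File(label="Upload a PDF"),      # assumed label
        outputs=[
            gr.Text(label="Summary"),               # text summary
            gr.Audio(label="Audio summary"),        # playable audio file
        ],
        title="PDF Audio Summarizer",
    )
```

With this helper, the app would be started with `iface = build_interface(audio_pdf)` followed by `iface.launch()`.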

## Dependencies

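- `PyPDF2`: To extract text from the PDF
- `transformers`: To load the summarization model and tokenizer
- `gTTS`: To convert the summary to speech
- `gradio`: To build the web interface
- `torch`: For neural network computations in Transformers
- `numpy`: For numerical processing
- `scipy`: For scientific computing
- `io`: To buffer the audio data (part of the Python standard library, so it needs no installation)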

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference