---
title: Audio Abstract42
emoji: 😻
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.7.1
app_file: app.py
pinned: false
---
# PDF Audio Summarizer

This application summarizes PDF documents and converts the summary to audio.

## How it works

The core logic is in the `audio_pdf` function. It:

1. Extracts raw text from the uploaded PDF using `PyPDF2`.
2. Summarizes the text with the [LED-based summarization model](https://huggingface.co/pszemraj/led-base-book-summary) `pszemraj/led-base-book-summary`, loaded through Hugging Face Transformers with `AutoTokenizer` and `AutoModelForSeq2SeqLM`.
3. Converts the text summary to an audio file using `gTTS` (Google Text-to-Speech).

The summary and audio file are returned and displayed in the Gradio web interface.
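
The three steps above can be sketched roughly as follows. This is a minimal illustration rather than the actual contents of `app.py`: the helper names `extract_text`, `summarize`, and `speak`, the injectable parameters on `audio_pdf`, the output file name `summary.mp3`, and the generation settings (`max_length=4096`, `max_new_tokens=256`, `num_beams=4`) are all assumptions chosen for the example.

```python
from typing import Callable, Tuple


def extract_text(pdf_path: str) -> str:
    """Concatenate the text of every page in the PDF using PyPDF2."""
    from PyPDF2 import PdfReader  # third-party, imported lazily

    reader = PdfReader(pdf_path)
    # extract_text() may return None or "" for pages without a text layer
    return "\n".join((page.extract_text() or "") for page in reader.pages)


def summarize(text: str, model_name: str = "pszemraj/led-base-book-summary") -> str:
    """Summarize text with a seq2seq model from Hugging Face Transformers."""
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    # Truncate long documents; 4096 tokens is an assumed limit for this sketch
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=4096)
    summary_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=256,  # assumed summary length cap
        num_beams=4,         # assumed beam width
    )
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)


def speak(text: str, out_path: str = "summary.mp3") -> str:
    """Write the text as spoken English to an MP3 file with gTTS."""
    from gtts import gTTS

    gTTS(text=text, lang="en").save(out_path)
    return out_path


def audio_pdf(
    pdf_path: str,
    extract: Callable[[str], str] = extract_text,
    summarise: Callable[[str], str] = summarize,
    tts: Callable[[str], str] = speak,
) -> Tuple[str, str]:
    """Run extract -> summarize -> speak and return (summary, audio_path)."""
    text = extract(pdf_path)
    if not text.strip():
        raise ValueError("No extractable text found in the PDF")
    summary = summarise(text)
    return summary, tts(summary)
```

Passing the three steps in as parameters is a design choice for this sketch only: it lets the pipeline be exercised with stand-in functions, without downloading the model or calling the text-to-speech service.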

## Interface 

The interface is created using Gradio. The key components are:

- `File` input to upload a PDF
- `Text` output to display the text summary 
- `Audio` output to play the audio file

The interface is launched with `iface.launch()`.
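
As an illustration, the components listed above could be wired together along these lines. This is a sketch under assumptions, not the Space's actual code: `resolve_path` and `build_interface` are hypothetical helpers, the labels are invented, and `audio_pdf` is assumed to take a file path and return a `(summary, audio_path)` pair.

```python
def resolve_path(uploaded) -> str:
    """Return a filesystem path for an uploaded file.

    Depending on the Gradio version and the File component's settings, the
    callback may receive a plain path string or a temp-file-like object
    exposing the path as `.name`; this hypothetical helper accepts both.
    """
    return uploaded if isinstance(uploaded, str) else uploaded.name


def build_interface(audio_pdf):
    """Build a Gradio interface around an audio_pdf(path) -> (summary, audio_path) callable."""
    import gradio as gr  # third-party, imported lazily

    def run(uploaded):
        return audio_pdf(resolve_path(uploaded))

    return gr.Interface(
        fn=run,
        inputs=gr.File(label="PDF", file_types=[".pdf"]),
        outputs=[gr.Text(label="Summary"), gr.Audio(label="Audio summary")],
        title="PDF Audio Summarizer",
    )
```

With these helpers, `iface = build_interface(audio_pdf)` followed by `iface.launch()` would start the app.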

## Dependencies

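- `PyPDF2`: Extracts text from the PDF
- `transformers`: Loads the summarization model and tokenizer
- `gTTS`: Converts the summary to speech
- `gradio`: Provides the web interface
- `torch`: For neural network computations in Transformers
- `numpy`: For numerical processing
- `scipy`: For scientific computing
- `io`: To buffer the audio data (part of the Python standard library, so no installation is needed)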

For the Space settings in the YAML block at the top of this file, see the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference