# PDF Audio Summarizer
README.md
CHANGED
@@ -1,12 +1,53 @@
 ---
 title: Audio Abstract42
-emoji:
-colorFrom:
+emoji: 😻
+colorFrom: blue
 colorTo: green
 sdk: gradio
 sdk_version: 4.7.1
 app_file: app.py
 pinned: false
 ---
+# PDF Audio Summarizer
 
-
+This application summarizes PDF documents and converts the summary to audio.
+
+## How it works
+
+The core logic is in the `audio_pdf` function. It:
+
+1. Extracts raw text from the uploaded PDF using `PyPDF2`
+2. Summarizes the text using a BART model from HuggingFace Transformers. This uses `AutoTokenizer` and `AutoModelForSeq2SeqLM` to load the model and generate a summary
+3. Converts the text summary to an audio file using `gTTS` (Google Text-to-Speech)
+
+The summary and audio file are returned and displayed in the Gradio web interface.
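
The three steps this hunk attributes to `audio_pdf` could be sketched roughly as follows. This is not the code from `app.py`, which the diff does not show: the helper names (`extract_text`, `summarize`, `synthesize`), the injectable `extractor`/`summarizer`/`speaker` parameters, the `facebook/bart-large-cnn` checkpoint, and the `summary.mp3` output path are all assumptions made for illustration. The third-party imports are deferred into the functions so the module can be loaded without them.

```python
from typing import Callable, Tuple


def extract_text(pdf_path: str) -> str:
    """Step 1: concatenate the text of every page in the PDF."""
    from PyPDF2 import PdfReader  # lazy import; PdfReader needs PyPDF2 >= 2.0

    reader = PdfReader(pdf_path)
    # extract_text() may return nothing for image-only pages, hence `or ""`.
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def summarize(text: str, model_name: str = "facebook/bart-large-cnn") -> str:
    """Step 2: summarize with a BART checkpoint (checkpoint name is assumed)."""
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    # Truncate long documents to the 1024-token input limit of this checkpoint.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
    summary_ids = model.generate(**inputs, num_beams=4, max_length=150)
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)


def synthesize(summary: str, out_path: str = "summary.mp3") -> str:
    """Step 3: write the spoken summary to an MP3 file and return its path."""
    from gtts import gTTS

    gTTS(text=summary, lang="en").save(out_path)
    return out_path


def audio_pdf(
    pdf_path: str,
    extractor: Callable[[str], str] = extract_text,
    summarizer: Callable[[str], str] = summarize,
    speaker: Callable[[str], str] = synthesize,
) -> Tuple[str, str]:
    """Run the three steps in order and return (summary text, audio file path)."""
    text = extractor(pdf_path)
    if not text.strip():
        raise ValueError("no extractable text found in the PDF")
    summary = summarizer(text)
    return summary, speaker(summary)
```

Passing the three steps in as parameters is a design choice of this sketch only; it lets the control flow be exercised with stand-in functions, without downloading a model or calling the text-to-speech service.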
+
+## Interface
+
+The interface is created using Gradio. The key components are:
+
+- `File` input to upload a PDF
+- `Text` output to display the text summary
+- `Audio` output to play the audio file
+
+The interface is launched via `iface.launch()`
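
A minimal sketch of how the three components listed here might be wired together, assuming Gradio 4.x (the front matter pins `sdk_version: 4.7.1`). The `build_interface` helper, the labels, and the `type="filepath"` choices are assumptions of this sketch, and `gr.Textbox` stands in for the `Text` component the README names. The actual wiring in `app.py` is not shown in the diff.

```python
from typing import Callable, Tuple


def build_interface(fn: Callable[[str], Tuple[str, str]]):
    """Build a Gradio interface around a (pdf path) -> (summary, audio path) function."""
    import gradio as gr  # lazy import so this module loads without Gradio installed

    return gr.Interface(
        fn=fn,
        # With type="filepath", the uploaded PDF reaches `fn` as a path string.
        inputs=gr.File(label="PDF", file_types=[".pdf"], type="filepath"),
        outputs=[
            gr.Textbox(label="Summary"),
            # With type="filepath", `fn` returns the path of the audio file to play.
            gr.Audio(label="Audio summary", type="filepath"),
        ],
        title="PDF Audio Summarizer",
    )
```

With a function such as `audio_pdf` in scope, `iface = build_interface(audio_pdf)` followed by `iface.launch()` would correspond to the launch call the README mentions.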
+
+## Dependencies
+
+- PyPDF2
+- Transformers
+- gTTS
+- Gradio
+- torch
+- numpy
+- scipy
+- io
+
+Additional dependencies:
+
+- `torch`: For neural network computations in Transformers
+- `numpy`: For numerical processing
+- `scipy`: For scientific computing
+- `io`: To buffer the audio data
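
One way to check that the modules behind this list are importable is sketched below. Two things are worth noting: `io` is part of the Python standard library, so it needs no installation, and the gTTS package is imported as `gtts`. The `REQUIRED_MODULES` list and the `missing_modules` helper are hypothetical names introduced for this sketch and are not part of the project.

```python
import importlib.util
from typing import Iterable, List

# Import names, which can differ from the package names listed in the README:
# gTTS is imported as `gtts`, and `io` ships with Python itself.
REQUIRED_MODULES = [
    "PyPDF2", "transformers", "gtts", "gradio", "torch", "numpy", "scipy", "io",
]


def missing_modules(names: Iterable[str] = REQUIRED_MODULES) -> List[str]:
    """Return the top-level modules from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Calling `missing_modules()` with no arguments checks the full list and returns whatever still needs to be installed in the current environment.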
+
+Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference