QueryYourDocs/examples/techniques/pymupdf_loader.py
Commit 67138c2 by LVKinyanjui: Attempted to add hf hub login for gated repo; cleaned up house
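import pymupdf

# Open the PDF and collect each page's text as UTF-8 encoded bytes.
# The context manager closes the document once extraction is finished.
with pymupdf.open("data/State Machines.pdf") as doc:
    texts = [page.get_text().encode("utf-8") for page in doc]
print("Done")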
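# NOTE: left disabled. As written it would fail, because binary mode ("wb") does not
# accept an encoding argument, and the path would overwrite the source PDF.
# with open("data/State Machines.pdf", "wb", encoding="utf-8") as out: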
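
The commented-out line above suggests the extracted text was meant to be written to disk. A minimal sketch of one way to do that follows, assuming the pages should go to a separate output file. The helper name `save_page_texts`, the output path in the usage note, and the form-feed page separator are all hypothetical choices, not part of the original script.

```python
from pathlib import Path


def save_page_texts(texts, out_path):
    """Write UTF-8 encoded page texts to out_path, one form feed between pages.

    Hypothetical helper: the name and the form-feed separator are
    illustrative assumptions, not taken from the original script.
    """
    out_path = Path(out_path)
    # Create the target directory if it does not exist yet.
    out_path.parent.mkdir(parents=True, exist_ok=True)
    # The items are bytes (from .encode("utf-8")), so the file is opened in
    # binary mode, which must not be combined with an encoding argument.
    with open(out_path, "wb") as out:
        out.write(b"\f".join(texts))
    return out_path
```

With the `texts` list from the script, a call might look like `save_page_texts(texts, "data/State Machines.txt")`, where the `.txt` path is an assumed name chosen so that the source PDF is left untouched.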