openai
langchain
chromadb
beautifulsoup4
unstructured
poppler-utils
tiktoken
pytesseract
gradio