import os
import streamlit as st
from src.st_helpers import st_setup
# Test runner now runs parallel, so need to flag this to the tokenizer
os.environ['TOKENIZERS_PARALLELISM'] = 'true'
if st_setup("LLM Architecture Assessment", skip_login=True):
    st.write("""
# LLM Architecture Assessment
This application is an interactive element of the LLM Architecture Assessment project prepared by [Alisdair Fraser](https://www.linkedin.com/in/alisdairfraser/) (alisdairfraser (at) gmail (dot) com), in submission for the final research project for the [Online MSc in Artificial Intelligence](https://info.online.bath.ac.uk/msai/) with the University of Bath. This application allows users to browse a synthetic set of "private data" and to interact with systems built to represent different architectural prototypes.
The goal of the project is to assess the architectural patterns for deploying LLMs in conjunction with private data stores. The target audience is technology leaders, and the aim is to provide key considerations for choosing one architecture over another.
All the source code for this application and the associated tooling and data can be found in the [project repository on Hugging Face](https://huggingface.co/spaces/alfraser/llm-arch/tree/main).
""")
    # Centre the video between two padding columns, as a workaround for
    # not being able to specify its size directly
    left, center, right = st.columns([2, 3, 2])
    try:
        with center:
            with open('img/overview_presentation.m4v', 'rb') as f:
                video_bytes = f.read()
            st.video(video_bytes)
    except OSError:
        # The video file may be absent (e.g. omitted to keep a submission small)
        st.info("Overview presentation video not available")
    st.write("""
## Tools
This web application serves as the management console to run different elements required to test the architectures. Specifically:
- **LLM Architectures**: The LLMs are wrapped in "architectures", which are the systems under test. This area allows users to see those configurations and to chat manually with an architecture, as opposed to directly with the model.
- **Data Browser**: Underlying this architectural assessment is a synthetic "closed" data set, which has been generated offline to simulate a closed enterprise style dataset for testing purposes. This data browser element allows users to view that data directly.
- **Test Runner**: This tool allows you to select a number of questions and a set of architectures. The same questions will then be sent to each of the architectures and the results logged for analysis.
- **Test Reporter**: As interactions are taking place with the architectures under test, the results are being logged for analysis. This area allows users to view those log records and see some simple results.
- **System Status**: This area provides some basic system controls. It allows the test logs to be wiped clean, and also allows users to see the status of the LLM endpoints which the demo uses and pause/resume them as applicable.
## Credits
- This project predominantly uses [Llama 2](https://ai.meta.com/llama/) and derivative models for language inference. Models are made available under the [Meta Llama license](https://ai.meta.com/llama/license/).
- This application is built on [Streamlit](https://streamlit.io).
""")