XThomasBU committed on
Commit 7a233a3 • 1 Parent(s): d95aad5

updated README

Files changed (1)
  1. README.md +51 -26
README.md CHANGED
@@ -1,36 +1,61 @@
- ---
- title: Dl4ds Tutor
- emoji: 🏃
- colorFrom: green
- colorTo: red
- sdk: docker
- pinned: false
- hf_oauth: true
- ---

- DL4DS Tutor
- ===========

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

- You can find an implementation of the Tutor at https://dl4ds-dl4ds-tutor.hf.space/, which is hosted on Hugging Face [here](https://huggingface.co/spaces/dl4ds/dl4ds_tutor)

- To run locally,

- Clone the repository from: https://github.com/DL4DS/dl4ds_tutor

- Put your data under the `storage/data` directory. Note: You can add urls in the urls.txt file, and other pdf files in the `storage/data` directory.

- To create the Vector Database, run the following command:
- ```cd code```
- ```python -m modules.vectorstore.store_manager```
- (Note: You would need to run the above when you add new data to the `storage/data` directory, or if the ``storage/data/urls.txt`` file is updated. Or you can set ``["vectorstore"]["embedd_files"]`` to True in the `code/modules/config/config.yaml` file, which would embed files from the storage directory everytime you run the below chainlit command.)
-
- To run the chainlit app, run the following command:
- ```chainlit run main.py```

  See the [docs](https://github.com/DL4DS/dl4ds_tutor/tree/main/docs) for more information.

- ## Contributing
-
- Please create an issue if you have any suggestions or improvements, and start working on it by creating a branch and by making a pull request to the main branch.

+ # DL4DS Tutor 🏃

+ Check out the configuration reference at [Hugging Face Spaces Config Reference](https://huggingface.co/docs/hub/spaces-config-reference).

+ You can find an implementation of the Tutor at [DL4DS Tutor on Hugging Face](https://dl4ds-dl4ds-tutor.hf.space/), which is hosted on Hugging Face [here](https://huggingface.co/spaces/dl4ds/dl4ds_tutor).

+ ## Running Locally

+ 1. **Clone the Repository**
+ ```bash
+ git clone https://github.com/DL4DS/dl4ds_tutor
+ ```

+ 2. **Put your data under the `storage/data` directory**
+ - Add URLs in the `urls.txt` file.
+ - Add other PDF files in the `storage/data` directory.

+ 3. **Create the Vector Database**
+ ```bash
+ cd code
+ python -m modules.vectorstore.store_manager
+ ```
+ - Note: You need to run the above command when you add new data to the `storage/data` directory, or if the `storage/data/urls.txt` file is updated.
+ - Alternatively, you can set `["vectorstore"]["embedd_files"]` to `True` in the `code/modules/config/config.yaml` file, which will embed files from the storage directory every time you run the below chainlit command.

+ 4. **Run the Chainlit App**
+ ```bash
+ chainlit run main.py
+ ```

  See the [docs](https://github.com/DL4DS/dl4ds_tutor/tree/main/docs) for more information.

+ ## File Structure
+
+ ```plaintext
+ code/
+ ├── modules
+ │   ├── chat                 # Contains the chatbot implementation
+ │   ├── chat_processor       # Contains the implementation to process and log the conversations
+ │   ├── config               # Contains the configuration files
+ │   ├── dataloader           # Contains the implementation to load the data from the storage directory
+ │   ├── retriever            # Contains the implementation to create the retriever
+ │   └── vectorstore          # Contains the implementation to create the vector database
+ ├── public
+ │   ├── logo_dark.png        # Dark theme logo
+ │   ├── logo_light.png       # Light theme logo
+ │   └── test.css             # Custom CSS file
+ └── main.py
+
+
+ docs/                        # Contains the documentation to the codebase and methods used
+
+ storage/
+ ├── data                     # Store files and URLs here
+ ├── logs                     # Logs directory, includes logs on vector DB creation, tutor logs, and chunks logged in JSON files
+ └── models                   # Local LLMs are loaded from here
+
+ vectorstores/                # Stores the created vector databases
+
+ .env                         # This needs to be created, store the API keys here
+ ```
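
The file structure above notes that `.env` has to be created by hand and that data lives under `storage/`. As a rough illustration only, the layout could be scaffolded with a small script like the one below. The helper `scaffold_storage` is hypothetical and not part of the dl4ds_tutor repository; the directory names and the `storage/data/urls.txt` path come from the README, and `.env` is left empty because the README does not list the key names it expects.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

# Directory names are taken from the README's file-structure listing.
# The helper itself is a hypothetical convenience, not part of dl4ds_tutor.
STORAGE_DIRS = ("storage/data", "storage/logs", "storage/models")
PLACEHOLDER_FILES = ("storage/data/urls.txt", ".env")


def scaffold_storage(root: Path) -> list[Path]:
    """Create the storage directories plus empty urls.txt and .env under root.

    Anything that already exists is left untouched.
    Returns the listed paths that were newly created.
    """
    created = []
    for rel in STORAGE_DIRS:
        directory = root / rel
        if not directory.exists():
            directory.mkdir(parents=True)
            created.append(directory)
    for rel in PLACEHOLDER_FILES:
        file_path = root / rel
        if not file_path.exists():
            file_path.touch()  # empty placeholder; fill in by hand
            created.append(file_path)
    return created


if __name__ == "__main__":
    # Demonstrate in a throwaway directory rather than a real checkout.
    with tempfile.TemporaryDirectory() as tmp:
        for path in scaffold_storage(Path(tmp)):
            print(path.relative_to(tmp))
```

Running it a second time on the same root creates nothing, so existing data and keys are not overwritten. The vector database still has to be built with `python -m modules.vectorstore.store_manager` after data is added.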