Commit 9cc6120 by burtenshaw · 1 parent: e9484c6

first refactored commit

README.md CHANGED
@@ -1,39 +1,58 @@
- ![Auto Assign](https://github.com/HF-RLHF-Platform/demo-repository/actions/workflows/auto-assign.yml/badge.svg)
-
- ![Proof HTML](https://github.com/HF-RLHF-Platform/demo-repository/actions/workflows/proof-html.yml/badge.svg)
-
- # Welcome to your organization's demo respository
- This code repository (or "repo") is designed to demonstrate the best GitHub has to offer with the least amount of noise.
-
- The repo includes an `index.html` file (so it can render a web page), two GitHub Actions workflows, and a CSS stylesheet dependency.
- # Model-Improvement-Platform-With-RLHF
- Platform being developed at MIT in collaboration with HuggingFace. Aimed at improving performance of existing Large Language Models through real time human feedback loop.
- # HF-RLHF-Platform
  Platform being developed at MIT in collaboration with HuggingFace, aimed at improving the performance of existing Large Language Models through a real-time human feedback loop.

  This repository hosts the development of an automated RLHF platform for Hugging Face, where the community can provide real-time feedback on language models. The feedback is automatically integrated into an RLHF pipeline to continuously fine-tune and improve the models.

- # The Feedback Collective
-
- **Open RLHF on VLMs for Students**
-
- A community-driven project to improve Vision-Language Models (VLMs) for student-focused tasks.
- Leverages feedback from users and automated RLHF pipelines to continuously improve model performance.

- ## Dataset Schema for Project
-
- ### KTO Dataset Structure
-
- The dataset should be organized into two splits: `train` and `test`.
-
- Each split contains the following features:
-
- | **Feature** | **Type** | **Description** |
- |---------------|-----------|--------------------------------------------------------------------------------------|
- | `prompt` | `string` | The input text for the model. This should be a natural language query or input. |
- | `completion` | `string` | The output text generated by the model in response to the `prompt`. |
- | `label` | `bool` | A binary value (`True` or `False`) indicating whether the `completion` is desirable. |
+ ---
+ title: Feel
+ emoji: 🚀
+ colorFrom: blue
+ colorTo: gray
+ sdk: gradio
+ sdk_version: 5.8.0
+ app_file: app/app.py
+ pinned: false
+ ---
+
+ # Feel
+
+ This is a project to create a continuous training application.

  Platform being developed at MIT in collaboration with HuggingFace, aimed at improving the performance of existing Large Language Models through a real-time human feedback loop.
+
  This repository hosts the development of an automated RLHF platform for Hugging Face, where the community can provide real-time feedback on language models. The feedback is automatically integrated into an RLHF pipeline to continuously fine-tune and improve the models.
+
+ ## What is Feel?
+
+ A community-driven project to improve Multilingual Vision-Language Models (VLMs). It leverages feedback from users and automated RLHF pipelines to continuously improve model performance.
+
+ ## Why Feel?
+
+ Feel is a platform that enables the community to provide real-time feedback on language models. The feedback is automatically integrated into an RLHF pipeline to continuously fine-tune and improve the models.
+
+ ## Repository Structure
+
+ The repository is organized as follows:
+
+ ```
+ ml/                # Machine learning code
+ ├── README.md      # Dataset schema and project structure
+ ├── data/          # Dataset files
+ └── models/        # Model files
+ app/               # Application code
+ └── app.py         # Main application file
+ ```
+
+ ## Installation
+
+ The repository uses `uv` to manage virtual environments and dependencies. To install `uv`, follow the instructions [here](https://docs.astral.sh/uv/getting-started/installation/).
+
+ To install the required dependencies, run the following commands:
+
+ ### ML Dependencies
+
+ ```bash
+ uv sync --group ml
+ ```
+
+ ### App Dependencies
+
+ ```bash
+ uv sync --group app
+ ```
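With a dependency group synced, the app can also be launched through `uv`'s runner — a minimal sketch, assuming the `app/app.py` entry point declared in the front matter above:

```bash
uv sync --group app        # create or refresh the project venv with the app dependencies
uv run python app/app.py   # launch the Gradio app inside that venv
```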
app/README.md ADDED
@@ -0,0 +1,13 @@
+ # Config
+
+ ```
+ export HF_TOKEN=<your-token>
+ export MODEL=<your-model-id>    # e.g. a warm model from https://huggingface.co/models?inference=warm&pipeline_tag=image-text-to-text&sort=trending
+ export BASE_URL=<your-base-url> # optional, e.g. https://hf-mirror.com/
+ ```
+
+ # Run
+
+ ```
+ python app.py
+ ```
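For instance (the token and model id below are placeholders, not project defaults), a session against a hosted vision-language model might look like:

```bash
export HF_TOKEN=hf_xxxxxxxxxxxx
export MODEL=meta-llama/Llama-3.2-11B-Vision-Instruct
python app.py   # serves the Gradio UI on http://localhost:7860 by default
```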
app/app.py ADDED
@@ -0,0 +1,205 @@
+ import os
+ import uuid
+ from base64 import b64encode
+ from datetime import datetime
+ from mimetypes import guess_type
+ from pathlib import Path
+
+ import gradio as gr
+ from huggingface_hub import InferenceClient
+ from pandas import DataFrame
+
+ from feedback import save_feedback
+
+ # Use the hosted model unless BASE_URL points at an OpenAI-compatible endpoint.
+ client = InferenceClient(
+     token=os.getenv("HF_TOKEN"),
+     model=(
+         os.getenv("MODEL", "meta-llama/Llama-3.2-11B-Vision-Instruct")
+         if not os.getenv("BASE_URL")
+         else None
+     ),
+     base_url=os.getenv("BASE_URL"),
+ )
+
+
+ def add_user_message(history, message):
+     """Append uploaded files and text from the multimodal box to the history."""
+     for x in message["files"]:
+         history.append({"role": "user", "content": {"path": x}})
+     if message["text"] is not None:
+         history.append({"role": "user", "content": message["text"]})
+     return history, gr.MultimodalTextbox(value=None, interactive=False)
+
+
+ def _format_history_as_messages(history: list):
+     """Collapse consecutive same-role history entries into OpenAI-style messages."""
+     messages = []
+     current_role = None
+     current_message_content = []
+
+     for entry in history:
+         content = entry["content"]
+
+         if entry["role"] != current_role:
+             if current_role is not None:
+                 messages.append(
+                     {"role": current_role, "content": current_message_content}
+                 )
+             current_role = entry["role"]
+             current_message_content = []
+
+         if isinstance(content, tuple):  # Handle file paths
+             for path in content:
+                 data_uri = _convert_path_to_data_uri(path)
+                 current_message_content.append(
+                     {"type": "image_url", "image_url": {"url": data_uri}}
+                 )
+         elif isinstance(content, str):  # Handle text
+             current_message_content.append({"type": "text", "text": content})
+
+     if current_role is not None:
+         messages.append({"role": current_role, "content": current_message_content})
+
+     return messages
+
+
+ def _convert_path_to_data_uri(path) -> str:
+     """Encode a local file as a base64 data URI."""
+     mime_type, _ = guess_type(path)
+     with open(path, "rb") as image_file:
+         data = image_file.read()
+     data_uri = f"data:{mime_type};base64," + b64encode(data).decode("utf-8")
+     return data_uri
+
+
+ def _is_file_safe(path) -> bool:
+     """Return True only if the path points at an existing regular file."""
+     try:
+         return Path(path).is_file()
+     except Exception:
+         return False
+
+
+ def _process_content(content) -> str | list[str]:
+     """Replace local file paths in message content with data URIs."""
+     if isinstance(content, str) and _is_file_safe(content):
+         return _convert_path_to_data_uri(content)
+     elif isinstance(content, list):
+         return _convert_path_to_data_uri(content[0])
+     return content
+
+
+ def respond_system_message(history: list) -> list:
+     """Generate an assistant response to the latest user message."""
+     messages = _format_history_as_messages(history)
+     response = client.chat.completions.create(
+         messages=messages,
+         max_tokens=2000,
+         stream=False,
+     )
+     content = response.choices[0].message.content
+
+     message = gr.ChatMessage(role="assistant", content=content)
+     history.append(message)
+     return history
+
+
+ def wrangle_like_data(x: gr.LikeData, history) -> tuple[list, DataFrame]:
+     """Record like/dislike feedback on a message and mirror the conversation into a DataFrame."""
+     liked_index = x.index[0]
+
+     output_data = []
+     for idx, message in enumerate(history):
+         if idx == liked_index:
+             message["metadata"] = {"title": "liked" if x.liked else "disliked"}
+         rating = (message.get("metadata") or {}).get("title")
+         if rating == "liked":
+             message["rating"] = 1
+         elif rating == "disliked":
+             message["rating"] = -1
+         else:
+             message["rating"] = None
+
+         output_data.append({k: v for k, v in message.items() if k != "metadata"})
+
+     return history, DataFrame(data=output_data)
+
+
+ def submit_conversation(dataframe, session_id):
+     """Submit the conversation to the dataset repo."""
+     if dataframe.empty:
+         gr.Info("No messages to submit because the conversation was empty")
+         return (gr.Dataframe(value=None, interactive=False), [])
+
+     dataframe["content"] = dataframe["content"].apply(_process_content)
+     conversation_data = {
+         "conversation": dataframe.to_dict(orient="records"),
+         "timestamp": datetime.now().isoformat(),
+         "session_id": session_id,
+         "conversation_id": str(uuid.uuid4()),
+     }
+     save_feedback(input_object=conversation_data)
+     gr.Info(f"Submitted {len(dataframe)} messages to the dataset")
+     return (gr.Dataframe(value=None, interactive=False), [])
+
+
+ with gr.Blocks() as demo:
+     ##############################
+     # Chatbot
+     ##############################
+     session_id = gr.Textbox(
+         interactive=False,
+         value=str(uuid.uuid4()),
+         visible=False,
+     )
+
+     chatbot = gr.Chatbot(
+         elem_id="chatbot",
+         bubble_full_width=False,
+         type="messages",
+     )
+
+     chat_input = gr.MultimodalTextbox(
+         interactive=True,
+         file_count="multiple",
+         placeholder="Enter message or upload file...",
+         show_label=False,
+         submit_btn=True,
+     )
+
+     chat_msg = chat_input.submit(
+         fn=add_user_message, inputs=[chatbot, chat_input], outputs=[chatbot, chat_input]
+     )
+
+     bot_msg = chat_msg.then(
+         respond_system_message, chatbot, chatbot, api_name="bot_response"
+     )
+
+     # Re-enable the input box once the bot has answered.
+     bot_msg.then(lambda: gr.MultimodalTextbox(interactive=True), None, [chat_input])
+
+     ##############################
+     # Deal with feedback
+     ##############################
+
+     dataframe = gr.DataFrame()
+
+     chatbot.like(
+         fn=wrangle_like_data,
+         inputs=[chatbot],
+         outputs=[chatbot, dataframe],
+         like_user_message=False,
+     )
+
+     gr.Button(
+         value="Submit conversation",
+     ).click(
+         fn=submit_conversation,
+         inputs=[dataframe, session_id],
+         outputs=[dataframe, chatbot],
+     )
+
+     # Give each page load its own session id.
+     demo.load(
+         lambda: str(uuid.uuid4()),
+         inputs=[],
+         outputs=[session_id],
+     )
+
+ demo.launch()
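For reference, `_format_history_as_messages` builds the OpenAI-style payload that `client.chat.completions.create` consumes. An illustrative (not captured) example for a single user turn with one attached image — files precede the text, mirroring `add_user_message`:

```python
messages = [
    {
        "role": "user",
        "content": [
            # Images travel inline as base64 data URIs from _convert_path_to_data_uri.
            {"type": "image_url", "image_url": {"url": "data:image/png;base64,iVBORw0..."}},
            {"type": "text", "text": "What is shown in this image?"},
        ],
    },
]
```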
app/feedback.py ADDED
@@ -0,0 +1,28 @@
+ import json
+ import uuid
+ from pathlib import Path
+
+ from huggingface_hub import CommitScheduler
+
+ APP_INSTANCE_ID = str(uuid.uuid4())
+
+ feedback_file = Path("user_feedback/") / f"data_{APP_INSTANCE_ID}.json"
+ feedback_folder = feedback_file.parent
+
+ # Push the feedback folder to the dataset repo roughly once a minute.
+ scheduler = CommitScheduler(
+     repo_id="ohp-test-conversation",
+     repo_type="dataset",
+     folder_path=feedback_folder,
+     path_in_repo="data",
+     every=1,
+ )
+
+
+ def save_feedback(input_object: dict) -> None:
+     """
+     Append inputs/outputs and user feedback to a JSON Lines file, using the
+     scheduler's thread lock to avoid concurrent writes from different users.
+     """
+     with scheduler.lock:
+         with feedback_file.open(mode="a") as f:
+             f.write(json.dumps(obj=input_object))
+             f.write("\n")
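As a usage sketch (field values invented for illustration), `submit_conversation` in `app.py` calls `save_feedback` with one record per conversation; `CommitScheduler` then uploads the pending file to the `ohp-test-conversation` dataset repo on its one-minute schedule:

```python
from feedback import save_feedback

save_feedback(
    input_object={
        "conversation": [
            {"role": "user", "content": "Hello!", "rating": None},
            {"role": "assistant", "content": "Hi! How can I help?", "rating": 1},
        ],
        "timestamp": "2024-12-10T12:00:00",
        "session_id": "0f8f8a9e-example",
        "conversation_id": "7c2d1b4a-example",
    }
)
```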
ml/README.md ADDED
@@ -0,0 +1,13 @@
+ ## Dataset Schema for Project
+
+ ### KTO Dataset Structure
+
+ The dataset should be organized into two splits: `train` and `test`.
+
+ Each split contains the following features:
+
+ | **Feature** | **Type** | **Description** |
+ |---------------|-----------|--------------------------------------------------------------------------------------|
+ | `prompt` | `string` | The input text for the model. This should be a natural language query or input. |
+ | `completion` | `string` | The output text generated by the model in response to the `prompt`. |
+ | `label` | `bool` | A binary value (`True` or `False`) indicating whether the `completion` is desirable. |
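A minimal sketch of assembling a dataset in this shape with the `datasets` library (the rows are invented examples):

```python
from datasets import Dataset, DatasetDict

# Invented rows following the prompt/completion/label schema above.
train = Dataset.from_list(
    [
        {"prompt": "What is the capital of France?", "completion": "Paris.", "label": True},
        {"prompt": "What is the capital of France?", "completion": "Berlin.", "label": False},
    ]
)
test = Dataset.from_list(
    [{"prompt": "Name a primary color.", "completion": "Blue.", "label": True}]
)

dataset = DatasetDict({"train": train, "test": test})
```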
ml/eval/kto_generations.json DELETED
The diff for this file is too large to render. See raw diff
 
ml/eval/sft_generations.json DELETED
The diff for this file is too large to render. See raw diff
 
pyproject.toml ADDED
@@ -0,0 +1,18 @@
+ [project]
+ name = "ohp"
+ version = "0.1.0"
+ description = "A human feedback project"
+ readme = "README.md"
+ requires-python = ">=3.11"
+ dependencies = [
+     "datasets>=3.1.0",
+ ]
+
+ [dependency-groups]
+ ml = [
+     "trl>=0.12.2",
+ ]
+ app = [
+     "gradio>=5.8.0",
+     "huggingface-hub>=0.26.5",
+ ]