davanstrien (HF staff) committed on
Commit 1d82c63 • 1 Parent(s): d23d711
app.py CHANGED
@@ -140,7 +140,7 @@ To train or fine-tune a ColPali model, we need a dataset of image-text pairs whi
 To make the ColPali models work even better we might want a dataset of query/image document pairs related to our domain or task.
 
 One way in which we might go about generating such a dataset is to use an VLM to generate synthetic queries for us.
-This space uses the [Qwen/Qwen2-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct) to generate queries for a document, based on an input document image.
+This space uses the [Qwen/Qwen2-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct) VLM model to generate queries for a document, based on an input document image.
 
 
 This [blog post](https://danielvanstrien.xyz/posts/post-with-code/colpali/2024-09-23-generate_colpali_dataset.html) gives an overview of how you can use this kind of approach to generate a full dataset for fine-tuning ColPali models.
@@ -149,11 +149,17 @@ If you want to convert a PDF(s) to a dataset of page images you can try out the
 
 """
 
+examples = [
+    "examples/Approche_no_13_1977.pdf_page_22.jpg",
+    "examples/SRCCL_Technical-Summary.pdf_page_7.jpg",
+]
+
 demo = gr.Interface(
     fn=generate_response,
     inputs=gr.Image(type="pil"),
     outputs=gr.Json(),
     title=title,
     description=description,
+    examples=examples,
 )
 demo.launch()
examples/Approche_no_13_1977.pdf_page_22.jpg ADDED
examples/SRCCL_Technical-Summary.pdf_page_7.jpg ADDED
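
Note on the change: the commit hard-codes the two example image paths in app.py. As a possible alternative, the list could be built from whatever images are present in the examples/ folder, so that adding a file does not require touching the code. The sketch below is not part of the commit; `collect_examples` and `IMAGE_SUFFIXES` are hypothetical names introduced here for illustration, and the set of accepted suffixes is an assumption.

```python
from pathlib import Path

# Hypothetical helper, not part of the commit.
# Assumption: example images use one of these file extensions.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def collect_examples(folder="examples"):
    """Return the sorted paths (as strings) of image files directly inside `folder`."""
    root = Path(folder)
    if not root.is_dir():
        # No examples folder: return an empty list so the app can still start.
        return []
    return sorted(
        str(path)
        for path in root.iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


# Would stand in for the hard-coded `examples = [...]` list in app.py.
examples = collect_examples()
```

With this in place, the `gr.Interface(...)` call could pass `examples=examples or None`, which leaves the argument at its default of `None` when the folder is empty or missing.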