Spaces:

pierreguillou
/

Inference-APP-Document-Understanding-at-linelevel-v1

Runtime error

App Files Files Community

pierreguillou commited on Feb 13, 2023

Commit

d159597

•

1 Parent(s): 08d0375

Update app.py

Browse files

Files changed (1) hide show

app.py +8 -8

app.py CHANGED Viewed

@@ -81,15 +81,15 @@ def app_outputs(uploaded_pdf):
             df[i].to_csv(csv_file, encoding="utf-8", index=False)
     else:
-        img_files, images, csv_files = [""]*3,[""]*3,[""]*3
-        img_files[0], img_files[1], img_files[2] = image_blank, image_blank, image_blank
-        images[0], images[1], images[2] = Image.open(image_blank), Image.open(image_blank), Image.open(image_blank)
         csv_file = "csv_wo_content.csv"
-        csv_files[0], csv_files[1], csv_files[2] = gr.File.update(value=csv_file, visible=True), gr.File.update(value=csv_file, visible=True), gr.File.update(value=csv_file, visible=True)
         df, df_empty = dict(), pd.DataFrame()
-        df[0], df[1], df[2] = df_empty.to_csv(csv_file, encoding="utf-8", index=False), df_empty.to_csv(csv_file, encoding="utf-8", index=False), df_empty.to_csv(csv_file, encoding="utf-8", index=False)
-    return msg, img_files[0], img_files[1], img_files[2], images[0], images[1], images[2], csv_files[0], csv_files[1], csv_files[2], df[0], df[1], df[2]
 # gradio APP
 with gr.Blocks(title="Inference APP for Document Understanding at line level (v1)", css=".gradio-container") as demo:
@@ -99,7 +99,7 @@ with gr.Blocks(title="Inference APP for Document Understanding at line level (v1
     <div><p><a style="text-decoration: none; border-bottom: #64b5f6 0.125em solid; color: #64b5f6" href="https://arxiv.org/abs/2202.13669" target="_blank">LiLT (Language-Independent Layout Transformer)</a> is a Document Understanding model that uses both layout and text in order to detect labels of bounding boxes. Combined with the model <a style="text-decoration: none; border-bottom: #64b5f6 0.125em solid; color: #64b5f6" href="https://huggingface.co/xlm-roberta-base" target="_blank">XML-RoBERTa base</a>, this finetuned model has the capacity to understand any language. Finetuned on the dataset <a style="text-decoration: none; border-bottom: #64b5f6 0.125em solid; color: #64b5f6" href="https://huggingface.co/datasets/pierreguillou/DocLayNet-base" target="_blank">DocLayNet base</a>, it can classifly any bounding box (and its OCR text) to 11 labels (Caption, Footnote, Formula, List-item, Page-footer, Page-header, Picture, Section-header, Table, Text, Title).</p></div>
     <div><p>It relies on an external OCR engine to get words and bounding boxes from the document image. Thus, let's run in this APP an OCR engine ourselves (<a style="text-decoration: none; border-bottom: #64b5f6 0.125em solid; color: #64b5f6" href="https://github.com/madmaze/pytesseract#python-tesseract" target="_blank">PyTesseract</a>) as we'll need to do it in real life to get the bounding boxes, then run LiLT (already fine-tuned on the dataset DocLayNet base at line level) on the individual tokens and then, visualize the result at line level!</p></div>
     <div><p>From any PDF (of any language), it allows to get all pages with bounding boxes labeled at line level and the associated dataframes with labeled data (bounding boxes, texts, labels).</p></div>
-    <div><p>To avoid running this APP for too long, <b>only the first 3 pages are processed by this APP</b>. If you want to update this limit, you can either clone this APP and change the value of the parameter <code>max_imgboxes</code>, or run the corresponding notebook "<a style="text-decoration: none; border-bottom: #64b5f6 0.125em solid; color: #64b5f6" href="https://github.com/piegu/language-models/blob/master/inference_on_LiLT_model_finetuned_on_DocLayNet_base_in_any_language_at_levellines_ml384.ipynb" target="_blank">Document AI | Inference at line level with a Document Understanding model (LiLT fine-tuned on DocLayNet dataset)</a>" which does not have this limit.</p></div>
     <div style="margin-top: 20px"><p>More information about the DocLayNet datasets, the finetuning of the model and this APP in the following blog posts:</p>
     <ul><li><a style="text-decoration: none; border-bottom: #64b5f6 0.125em solid; color: #64b5f6" href="https://medium.com/@pierre_guillou/document-ai-document-understanding-model-at-line-level-with-lilt-tesseract-and-doclaynet-dataset-347107a643b8" target="_blank">(02/10/2023) Document AI | Document Understanding model at line level with LiLT, Tesseract and DocLayNet dataset</a></li><li><a style="text-decoration: none; border-bottom: #64b5f6 0.125em solid; color: #64b5f6" href="https://medium.com/@pierre_guillou/document-ai-doclaynet-image-viewer-app-3ac54c19956" target="_blank"> (01/31/2023) Document AI | DocLayNet image viewer APP</a></li><li><a style="text-decoration: none; border-bottom: #64b5f6 0.125em solid; color: #64b5f6" href="https://medium.com/@pierre_guillou/document-ai-processing-of-doclaynet-dataset-to-be-used-by-layout-models-of-the-hugging-face-hub-308d8bd81cdb" target="_blank">(01/27/2023) Document AI | Processing of DocLayNet dataset to be used by layout models of the Hugging Face hub (finetuning, inference)</a></li></ul></div>
     """)
@@ -155,4 +155,4 @@ with gr.Blocks(title="Inference APP for Document Understanding at line level (v1
         cache_examples=True,
     )
-demo.launch(debug=True)

             df[i].to_csv(csv_file, encoding="utf-8", index=False)
     else:
+        img_files, images, csv_files = [""]*max_imgboxes, [""]*max_imgboxes, [""]*max_imgboxes
+        img_files[0], img_files[1] = image_blank, image_blank
+        images[0], images[1] = Image.open(image_blank), Image.open(image_blank)
         csv_file = "csv_wo_content.csv"
+        csv_files[0], csv_files[1] = gr.File.update(value=csv_file, visible=True), gr.File.update(value=csv_file, visible=True)
         df, df_empty = dict(), pd.DataFrame()
+        df[0], df[1] = df_empty.to_csv(csv_file, encoding="utf-8", index=False), df_empty.to_csv(csv_file, encoding="utf-8", index=False)
+    return msg, img_files[0], img_files[1], images[0], images[1], csv_files[0], csv_files[1], df[0], df[1]
 # gradio APP
 with gr.Blocks(title="Inference APP for Document Understanding at line level (v1)", css=".gradio-container") as demo:
     <div><p><a style="text-decoration: none; border-bottom: #64b5f6 0.125em solid; color: #64b5f6" href="https://arxiv.org/abs/2202.13669" target="_blank">LiLT (Language-Independent Layout Transformer)</a> is a Document Understanding model that uses both layout and text in order to detect labels of bounding boxes. Combined with the model <a style="text-decoration: none; border-bottom: #64b5f6 0.125em solid; color: #64b5f6" href="https://huggingface.co/xlm-roberta-base" target="_blank">XML-RoBERTa base</a>, this finetuned model has the capacity to understand any language. Finetuned on the dataset <a style="text-decoration: none; border-bottom: #64b5f6 0.125em solid; color: #64b5f6" href="https://huggingface.co/datasets/pierreguillou/DocLayNet-base" target="_blank">DocLayNet base</a>, it can classifly any bounding box (and its OCR text) to 11 labels (Caption, Footnote, Formula, List-item, Page-footer, Page-header, Picture, Section-header, Table, Text, Title).</p></div>
     <div><p>It relies on an external OCR engine to get words and bounding boxes from the document image. Thus, let's run in this APP an OCR engine ourselves (<a style="text-decoration: none; border-bottom: #64b5f6 0.125em solid; color: #64b5f6" href="https://github.com/madmaze/pytesseract#python-tesseract" target="_blank">PyTesseract</a>) as we'll need to do it in real life to get the bounding boxes, then run LiLT (already fine-tuned on the dataset DocLayNet base at line level) on the individual tokens and then, visualize the result at line level!</p></div>
     <div><p>From any PDF (of any language), it allows to get all pages with bounding boxes labeled at line level and the associated dataframes with labeled data (bounding boxes, texts, labels).</p></div>
+    <div><p>To avoid running this APP for too long, <b>only the first 2 pages are processed by this APP</b>. If you want to update this limit, you can either clone this APP and change the value of the parameter <code>max_imgboxes</code>, or run the corresponding notebook "<a style="text-decoration: none; border-bottom: #64b5f6 0.125em solid; color: #64b5f6" href="https://github.com/piegu/language-models/blob/master/inference_on_LiLT_model_finetuned_on_DocLayNet_base_in_any_language_at_levellines_ml384.ipynb" target="_blank">Document AI | Inference at line level with a Document Understanding model (LiLT fine-tuned on DocLayNet dataset)</a>" which does not have this limit.</p></div>
     <div style="margin-top: 20px"><p>More information about the DocLayNet datasets, the finetuning of the model and this APP in the following blog posts:</p>
     <ul><li><a style="text-decoration: none; border-bottom: #64b5f6 0.125em solid; color: #64b5f6" href="https://medium.com/@pierre_guillou/document-ai-document-understanding-model-at-line-level-with-lilt-tesseract-and-doclaynet-dataset-347107a643b8" target="_blank">(02/10/2023) Document AI | Document Understanding model at line level with LiLT, Tesseract and DocLayNet dataset</a></li><li><a style="text-decoration: none; border-bottom: #64b5f6 0.125em solid; color: #64b5f6" href="https://medium.com/@pierre_guillou/document-ai-doclaynet-image-viewer-app-3ac54c19956" target="_blank"> (01/31/2023) Document AI | DocLayNet image viewer APP</a></li><li><a style="text-decoration: none; border-bottom: #64b5f6 0.125em solid; color: #64b5f6" href="https://medium.com/@pierre_guillou/document-ai-processing-of-doclaynet-dataset-to-be-used-by-layout-models-of-the-hugging-face-hub-308d8bd81cdb" target="_blank">(01/27/2023) Document AI | Processing of DocLayNet dataset to be used by layout models of the Hugging Face hub (finetuning, inference)</a></li></ul></div>
     """)
         cache_examples=True,
     )
+demo.launch()