srimanth-d committed on
Commit de3b21e · verified · 1 Parent(s): 1811609

Delete README.md

Files changed (1)
  1. README.md +0 -81
README.md DELETED
@@ -1,81 +0,0 @@
- ## NOTE:
- - Before you look into the project: I tried my best to meet the requirement of an OCR model that extracts both English and Hindi text from an image. However, with limited compute resources and data, I was unable to fine-tune an effective model for Hindi OCR (loss value of 0.8), so only the English version is implemented.
- - This model takes a long time per image, 5-7 minutes in my observation during inference, because it runs on CPU. GOT OCR2.0 is only available on GPU, and I had no GPU compute resources, so I implemented a CPU version and uploaded it to the Hugging Face Model Hub at [https://huggingface.co/srimanth-d/GOT_CPU](https://huggingface.co/srimanth-d/GOT_CPU). This contributes to the CPU version of GOT mentioned in the official repository of [GOT OCR2.0](https://github.com/ElvisClaros/GOT-OCR2.0).
- - The model was fine-tuned with [ms-swift](https://github.com/modelscope/ms-swift) on this dataset: [https://huggingface.co/datasets/damerajee/hindi-ocr](https://huggingface.co/datasets/damerajee/hindi-ocr). Fine-tuned weights: [https://drive.google.com/file/d/1qbupBRk8yIgiD3WzIwKP54-Fn4wpgpg1/view?usp=sharing](https://drive.google.com/file/d/1qbupBRk8yIgiD3WzIwKP54-Fn4wpgpg1/view?usp=sharing)
- - I could have quantized the model, but that reduces accuracy greatly, so I refrained from doing so.
- - I also noticed that a few other students who did this task used my GOT CPU implementation from Hugging Face.
-
-
- # Web-Based Optical Character Recognition (OCR) Prototype
-
- ## Objective
- This web application demonstrates the ability to perform Optical Character Recognition (OCR) on an uploaded image containing text in both **Hindi** and **English**. It also implements a basic keyword search functionality based on the extracted text. The prototype is deployed and accessible online via a live URL.
-
- ## Features
- - **Image Upload**: Supports common image formats such as JPEG, JPG and PNG.
- - **OCR Processing**: Extracts text in Hindi and English from the uploaded image.
- - **Keyword Search**: Allows users to search within the extracted text and highlights matching sections.
- - **Live Deployment**: Accessible via a public URL.
-
- ## Technology Stack
- - **Backend**: Python
- - **OCR Models**:
-    - Hugging Face Transformers (General OCR Theory model, GOT OCR2.0)
-    - PyTorch for model execution
- - **Web Framework**: Streamlit
- - **Deployment**: Hugging Face Spaces, Streamlit Sharing
-
- ## Tasks Overview
-
- ### Task 1: Environment Setup and OCR Implementation
- 1. **Environment Setup**:
-    - Set up the environment with the required dependencies:
-      ```bash
-      pip install -r requirements.txt
-      ```
-      This command may fail if it is run before the repository has been cloned. Clone the repository first and change your current directory to `GOT-OCR2.0`; the commands are given in step 1 of the [Running the Application Locally](#running-the-application-locally) section below.
-
- 2. **OCR Model Implementation**:
-    - General OCR Theory (GOT), a 580M-parameter end-to-end OCR model, was used to build this application.
-
- ### Task 2: Web Application Development
- 1. **Image Upload**:
-    - Allow users to upload a single image for OCR.
-
- 2. **Text Extraction**:
-    - Use the chosen OCR model to extract text and display it on the page.
-
- 3. **Keyword Search**:
-    - Implement a basic search feature where users can input a keyword.
-    - Highlight the matching text within the extracted content.
-
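The keyword search step in Task 2 could be sketched roughly as follows. This is a minimal illustration, not the app's actual implementation (the deleted README does not show it): the function name `highlight_keyword` is hypothetical, and the sketch assumes case-insensitive, literal matching with matches wrapped in Markdown bold markers.

```python
import re
from typing import Tuple


def highlight_keyword(text: str, keyword: str) -> Tuple[str, int]:
    """Wrap each case-insensitive match of `keyword` in Markdown bold markers.

    Hypothetical helper for illustration only.
    Returns the highlighted text and the number of matches found.
    """
    keyword = keyword.strip()
    if not keyword:
        # Nothing to search for: return the text unchanged.
        return text, 0
    # re.escape makes the keyword match literally, even if it
    # contains regex metacharacters such as "." or "(".
    pattern = re.compile(re.escape(keyword), flags=re.IGNORECASE)
    # subn returns (new_string, number_of_substitutions); the lambda
    # keeps the original casing of each match.
    return pattern.subn(lambda m: f"**{m.group(0)}**", text)


highlighted, count = highlight_keyword("Hello world, hello OCR", "hello")
print(count, highlighted)
```

In a Streamlit page, the highlighted string could then be passed to `st.markdown` so that the bold markers render; other highlight styles would need different wrapping.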
- ### Task 3: Deployment
- 1. **Deploy the Web Application**:
-    - Deploy the web app using platforms like Hugging Face Spaces, Streamlit Sharing, or any other suitable platform.
-    - Ensure it is accessible via a public URL.
-
- ## Running the Application Locally
- To run the application on your local machine:
-
- 1. **Clone the repository**:
-    ```bash
-    git clone https://github.com/AISpaceXDragon/GOT-OCR2.0.git
-    ```
-
-    ```bash
-    cd GOT-OCR2.0
-    ```
-
- 2. **Run the Streamlit app locally**:
-    ```bash
-    streamlit run app.py
-    ```
-    Run this command only after completing step 1 (cloning the repository) above.
-
- ## Deployment
- This application is deployed on Streamlit Sharing and Hugging Face Spaces. The links for both are given below.
-
- 1. Link (Streamlit Sharing): To be posted soon.
-
- 2. Link (Hugging Face Spaces): To be posted soon.