---
tags:
- geology
- geophysics
- geoscience
license: apache-2.0
language:
- en
metrics:
- accuracy
base_model:
- google/gemma-2-2b
new_version: ShebMichel/geobot_teacher-v0
pipeline_tag: question-answering
library_name: keras
---

# Model Card for Model ID and Description
This model was fine-tuned from Gemma_2b_en. The training data is a synthetic set of 253 question-answer pairs covering a wide range of geoscience topics, generated with ChatGPT.
The model performs well when trained for 75 epochs. The training is available on my Kaggle repo:

![image/png](https://cdn-uploads.huggingface.co/production/uploads/66d9cc4b75b3337a8532ed56/-WSeP07VwELvX1SGIlCMW.png) or my GitHub repo

![image/png](https://cdn-uploads.huggingface.co/production/uploads/66d9cc4b75b3337a8532ed56/Dmrey9DCRFL4srpQqNuQM.png)

The general idea is to have a bot that grades geoscience student assessments as quickly as possible, with a pass or fail as the result.
When an exam is submitted, the bot reports the student's answers and the predicted answers, along with an evaluation metric comparing the two.
Finally, that metric will be used to decide whether the result is a pass or a fail (coming soon).
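
The comparison step described above can be sketched as follows. This is only an illustration: the card does not say which metric the bot uses, so the token-overlap F1 score, the `pass_threshold` value of 0.5, and the function names `token_f1` and `grade` are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def token_f1(student_answer, predicted_answer):
    """Token-overlap F1 between two answers (assumed metric, not the card's)."""
    student = Counter(tokenize(student_answer))
    predicted = Counter(tokenize(predicted_answer))
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((student & predicted).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(student.values())
    recall = overlap / sum(predicted.values())
    return 2 * precision * recall / (precision + recall)


def grade(student_answer, predicted_answer, pass_threshold=0.5):
    """Report both answers, the score, and a pass/fail decision.

    pass_threshold is a hypothetical cut-off chosen for illustration.
    """
    score = token_f1(student_answer, predicted_answer)
    return {
        "student": student_answer,
        "predicted": predicted_answer,
        "score": score,
        "result": "pass" if score >= pass_threshold else "fail",
    }


if __name__ == "__main__":
    report = grade(
        "Sedimentary rocks form when sediments accumulate.",
        "Sedimentary rocks form from the accumulation of sediments.",
    )
    print(report["score"], report["result"])
```

A lexical metric like this is cheap but ignores paraphrases, so an embedding-based similarity may suit free-text exam answers better.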


- **Developed by:** Dr. Michel M. Nzikou
- **Funded by [optional]:** KaggleX Fellow cohort 4 and Google (GCP credits)
- **Model type:** Text generation model (chatbot)
- **Language(s) (NLP):** English. Software: Python 3.10, keras==3.6.0, keras_nlp==0.15.1
- **License:** Apache 2.0
- **Finetuned from model [optional]:** Gemma_2b_en 

# Sample Data

Please download the file `geology-exam-test_for_gemma_model_2b_en_253_75.json` to test the UI.
If you are using a Kaggle notebook, use the two questions below instead. Otherwise, create a JSON file with the same structure as the downloaded file.

"Question": "How do sedimentary rocks form?", "Response": "Sedimentary rocks form from the accumulation of sediments."
"Question": "What is igneous rock formation?", "Response": "Igneous rocks form when molten rock cools and solidifies."

To test or evaluate the model, try tweaking the questions and see how it responds. Please do not hesitate to contact me about further development or collaboration.
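
As a rough sketch of preparing your own test file, the snippet below writes the two sample entries above to disk and reads them back. The exact layout of the downloadable file is not shown in this card, so the list-of-objects structure, the helper names `write_exam_file` and `read_exam_file`, and the output file name are assumptions. Check them against the downloaded file before relying on this.

```python
import json

# The two sample entries from this card. The surrounding list is an
# assumed layout; the real file may wrap the entries differently.
SAMPLES = [
    {
        "Question": "How do sedimentary rocks form?",
        "Response": "Sedimentary rocks form from the accumulation of sediments.",
    },
    {
        "Question": "What is igneous rock formation?",
        "Response": "Igneous rocks form when molten rock cools and solidifies.",
    },
]


def write_exam_file(path, qa_pairs):
    """Write question/response pairs to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(qa_pairs, f, indent=2, ensure_ascii=False)


def read_exam_file(path):
    """Load the pairs and check that each one has both expected keys."""
    with open(path, encoding="utf-8") as f:
        qa_pairs = json.load(f)
    for i, pair in enumerate(qa_pairs):
        missing = {"Question", "Response"} - set(pair)
        if missing:
            raise ValueError(f"entry {i} is missing keys: {sorted(missing)}")
    return qa_pairs


if __name__ == "__main__":
    # Hypothetical file name, chosen for this example only.
    write_exam_file("my-geology-exam.json", SAMPLES)
    print(len(read_exam_file("my-geology-exam.json")), "entries loaded")
```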

## Bias, Risks, and Limitations

- The small fine-tuning dataset is a major limitation. However, the pipeline is ready, and if you have a small dataset of your own, you could use the GitHub repo (to be filled soon) to train your model.
- Bias may be inherited from generating the data with an existing LLM. However, the samples were pre-processed before being used for fine-tuning.


### How to Get Started with the Model

Test the app with your own questions, or download the model and fine-tune on top of it. If you do so, please share your variant's model card.


### Environmental Impact

Paper-based assessment consumes paper, and therefore trees, so an automated grader like this one can be seen as a greener alternative.
- **Hardware Type:** 2 x T4 GPU
- **Hours used:** 5 hours
- **Cloud Provider:** Kaggle
- **Compute Region:** AU
- **Carbon Emitted:** To be filled in


 
#### Model Card Authors [optional]
Dr. Michel M. Nzikou, Research Fellow, Center of Exploration Targeting, UWA, Perth, Australia

#### Model Card Contact
michel.nzikou@alumni.uleth.ca/michel.nzikoumamboukou@uwa.edu.au