---
library_name: transformers
tags:
- chess
license: mit
language:
- en
---

# Model Card for OpenSI/cognitive_AI_finetune_3

The base model, Mistral-7B-v0.1, has been fine-tuned to improve its reasoning, game analysis, and chess understanding, including proficiency in Algebraic Notation and FEN (Forsyth-Edwards Notation). The aim is a robust AI system architecture that integrates multiple tools seamlessly and boosts cognitive abilities within the controlled environment of chess.  
The full work is described in the paper [Unleashing Artificial Cognition: Integrating Multiple AI Systems](https://arxiv.org/abs/2408.04910).


### Model Description

- **Developed by:** Danny Xu, Carlos Kuhn, Muntasir Adnan 
- **Funded by:** OpenSI
- **Model type:** Transformer-based
- **License:** MIT
- **Finetuned from model:** Mistral-7B-v0.1

### Model Sources


- **Repository:** https://github.com/TheOpenSI/cognitive_AI_experiments
- **Paper:** [Unleashing Artificial Cognition: Integrating Multiple AI Systems](https://arxiv.org/abs/2408.04910)

## Uses

### Direct Use

- Chess analysis
- Measuring cognitive qualities in a controlled environment

### Downstream Use

- AGI research
- Studying the cognitive capabilities of AI systems


## How to Get Started with the Model

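This repository contains only the LoRA adapter. To use it, load the base Mistral model first and then attach the adapter:
```
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Hub ID of the base model (assumed to be the official Mistral-7B-v0.1 checkpoint)
base_model = "mistralai/Mistral-7B-v0.1"
lora_repo = "OpenSI/cognitive_AI_finetune_3"

# 4-bit quantization settings, matching the Training Hyperparameters section
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the quantized base model, then attach the LoRA adapter to it
model = AutoModelForCausalLM.from_pretrained(base_model, quantization_config=bnb_config)
openSI_chess = PeftModel.from_pretrained(model, lora_repo)
```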
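
Once the adapter is loaded, text can be generated with the standard `transformers` generation API. The following is a minimal sketch, not the authors' inference code: `build_prompt` and `generate_reply` are hypothetical helper names, and the prompt wording is an assumption, since this card does not document the prompt format used during fine-tuning.

```python
def build_prompt(fen: str, question: str) -> str:
    """Combine a FEN position and a question into one prompt (assumed format)."""
    return f"Position (FEN): {fen}\nQuestion: {question}\nAnswer:"


def generate_reply(model, tokenizer, prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a continuation and return only the newly generated text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so that only the model's answer is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Standard chess starting position in FEN.
START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"

# Example call (requires the loaded adapter and a matching tokenizer):
# tokenizer = AutoTokenizer.from_pretrained(base_model)
# prompt = build_prompt(START_FEN, "What is a sound first move for White?")
# print(generate_reply(openSI_chess, tokenizer, prompt))
```

Slicing off the prompt tokens before decoding keeps the returned string limited to the model's answer rather than echoing the input position.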

## Training Details

### Training Data

The fine-tuning data covers the following chess tasks:
- Analysis
- Probable winner
- Next move prediction
- FEN parsing
- Capture analysis
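
For readers unfamiliar with FEN, the sketch below shows the kind of input the FEN parsing task deals with. A FEN record has six space-separated fields: piece placement, side to move, castling rights, en passant square, halfmove clock, and fullmove number. This is an illustrative pure-Python parser, not the tooling used to build the training data, and `parse_fen` is a hypothetical name.

```python
def parse_fen(fen: str) -> dict:
    """Split a FEN record into its six fields and expand the board to an 8x8 grid."""
    fields = fen.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 FEN fields, got {len(fields)}")
    placement, side, castling, en_passant, halfmove, fullmove = fields

    board = []
    for rank in placement.split("/"):  # ranks are listed from 8 down to 1
        row = []
        for ch in rank:
            if ch.isdigit():
                row.extend(["."] * int(ch))  # a digit encodes that many empty squares
            else:
                row.append(ch)  # uppercase = White piece, lowercase = Black piece
        if len(row) != 8:
            raise ValueError(f"rank {rank!r} does not describe 8 squares")
        board.append(row)
    if len(board) != 8:
        raise ValueError("piece placement must contain 8 ranks")

    return {
        "board": board,
        "side_to_move": side,
        "castling": castling,
        "en_passant": en_passant,
        "halfmove_clock": int(halfmove),
        "fullmove_number": int(fullmove),
    }


# Position after 1. e4 from the standard starting position.
position = parse_fen("rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1")
print(position["side_to_move"])        # b
print("".join(position["board"][4]))   # ....P...
```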


#### Training Hyperparameters

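- **Training regime:** bf16 mixed precision on a 4-bit NF4-quantized base model, configured as follows:
```
import torch
from transformers import BitsAndBytesConfig, TrainingArguments

# 4-bit NF4 quantization with double quantization; computation runs in bfloat16
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)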


model_args = TrainingArguments(
    output_dir="mistral_7b",
    num_train_epochs=3,
    # max_steps=50,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,
    gradient_checkpointing=True,
    optim="paged_adamw_32bit",
    logging_steps=20,
    save_strategy="epoch",
    learning_rate=2e-4,
    bf16=True,
    tf32=True,
    max_grad_norm=0.3,
    warmup_ratio=0.03,
    lr_scheduler_type="constant",
    disable_tqdm=False
)
```
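
To make the batch settings concrete: with `per_device_train_batch_size=4` and `gradient_accumulation_steps=2`, each optimizer step sees 4 × 2 = 8 examples per GPU. The sketch below works through that arithmetic and the resulting approximate number of optimizer steps. The dataset size is a made-up placeholder, since this card does not state it, and the single-GPU count is an assumption based on the Hardware section.

```python
import math

per_device_train_batch_size = 4   # from TrainingArguments above
gradient_accumulation_steps = 2   # from TrainingArguments above
num_train_epochs = 3              # from TrainingArguments above
num_gpus = 1                      # assumption: a single RTX 3090
num_examples = 10_000             # hypothetical placeholder, not the real dataset size

# Number of examples contributing to each optimizer update.
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps * num_gpus
steps_per_epoch = math.ceil(num_examples / effective_batch_size)
total_steps = steps_per_epoch * num_train_epochs

print(effective_batch_size)  # 8
print(steps_per_epoch)       # 1250
print(total_steps)           # 3750
```

With `lr_scheduler_type="constant"`, the learning rate stays at `2e-4` for all of these steps. The `warmup_ratio` setting is likely not applied under this scheduler; `constant_with_warmup` is the variant that uses it.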

## Evaluation

#### Testing Data
The test dataset is available at [OpenSI Cognitive_AI](https://github.com/TheOpenSI/cognitive_AI_experiments/tree/master/data/test_framework).

#### Metrics
- Memory
- Perception
- Attention
- Reasoning
- Anticipation
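
One simple way to turn test results into a score per metric is to compute accuracy per category, which is also the shape of data a radar plot needs. The sketch below assumes each test item is tagged with one of the five categories and marked correct or incorrect. This scoring scheme and the name `score_by_category` are assumptions for illustration, and the sample results are made up; the evaluation procedure actually used is described in the paper.

```python
from collections import defaultdict

CATEGORIES = ["Memory", "Perception", "Attention", "Reasoning", "Anticipation"]


def score_by_category(results):
    """Return accuracy in [0, 1] per category from (category, is_correct) pairs."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for category, is_correct in results:
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        total[category] += 1
        correct[category] += int(is_correct)
    # Categories with no test items are reported as None rather than 0.
    return {c: (correct[c] / total[c] if total[c] else None) for c in CATEGORIES}


# Made-up results, for illustration only.
sample = [("Memory", True), ("Memory", False), ("Reasoning", True)]
scores = score_by_category(sample)
print(scores["Memory"])     # 0.5
print(scores["Reasoning"])  # 1.0
print(scores["Attention"])  # None
```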


### Results

<table>
    <thead>
        <tr>
            <th>Evaluation</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>
                <img src="./radar_plot.PNG" alt="Radar plot of the evaluation results">
            </td>
        </tr>
    </tbody>
</table>


#### Hardware

NVIDIA RTX 3090


## Citation
```
@misc{Adnan2024,
    title         = {Unleashing Artificial Cognition: Integrating Multiple AI Systems},
    author        = {Muntasir Adnan and Buddhi Gamage and Zhiwei Xu and Damith Herath and Carlos C. N. Kuhn},
    year          = {2024},
    eprint        = {2408.04910},
    archivePrefix = {arXiv}
}
```