
# Model Card: Labrador Transformer Model

## Model Overview
Labrador is a transformer-based model pre-trained with a masked language modeling (MLM) objective on clinical laboratory data, specifically routine morning lab values from the MIMIC-IV dataset. It is intended to model and predict laboratory test results for clinical informatics applications.

## Intended Use
- **Primary Application:** Research and analysis in clinical informatics, with a focus on laboratory data interpretation and prediction.
- **Target Users:** Researchers, data scientists, and healthcare professionals with expertise in machine learning and clinical data.

## Model/Data Specifications
- **Input Data:** Laboratory values including Bicarbonate (Bic), Creatinine (Crt), Potassium (Pot), Sodium (Sod), Urea (Ure), Hemoglobin (Hgb), Platelets (Plt), and White Blood Cell count (Wbc).
- **Model Outputs:** Predicted laboratory values, provided in both categorical and continuous form.
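
As a rough illustration of the input side, the sketch below arranges a set of measurements into a fixed order using the abbreviations listed above. The helper name `to_ordered_panel`, the ordering, and the use of `None` for unmeasured labs are assumptions made for this example; the card does not describe the model's actual input encoding.

```python
# Lab abbreviations as listed in this card; the fixed order is an assumption.
LAB_CODES = ("Bic", "Crt", "Pot", "Sod", "Ure", "Hgb", "Plt", "Wbc")


def to_ordered_panel(measurements):
    """Return values in LAB_CODES order, with None for labs not measured.

    Hypothetical helper for illustration; not part of the Labrador codebase.
    """
    unknown = set(measurements) - set(LAB_CODES)
    if unknown:
        raise ValueError(f"unsupported lab codes: {sorted(unknown)}")
    return [measurements.get(code) for code in LAB_CODES]


# Illustrative values only, not real patient data.
panel = to_ordered_panel({"Sod": 140.0, "Pot": 4.1, "Hgb": 13.5})
print(panel)
```

Rejecting unknown codes up front keeps a mistyped abbreviation from being silently dropped.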

## Training Data
The model was trained on de-identified data from the MIMIC-IV dataset, restricted to routine morning lab values from patients at Beth Israel Deaconess Medical Center.

## Model Architecture & Parameters
- **Embedding Dimension:** 756
- **Hidden Dimension:** 756
- **Transformer Heads:** 4
- **Number of Blocks:** 10
- **Feedforward Dimension:** 1024
- **Dropout Rate:** 0.3
- **Activation:** ReLU
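
The hyperparameters above can be collected into a small configuration object, sketched below. The class name `LabradorConfig` and its field names are hypothetical. The divisibility check assumes the conventional multi-head layout in which the embedding is split evenly across heads, which this card does not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabradorConfig:
    """Hypothetical container for the values listed in this card."""

    embedding_dim: int = 756
    hidden_dim: int = 756
    num_heads: int = 4
    num_blocks: int = 10
    feedforward_dim: int = 1024
    dropout_rate: float = 0.3
    activation: str = "relu"

    def __post_init__(self):
        # Assumption: the embedding is divided evenly among attention heads.
        if self.embedding_dim % self.num_heads != 0:
            raise ValueError("embedding_dim must be divisible by num_heads")
        if not 0.0 <= self.dropout_rate < 1.0:
            raise ValueError("dropout_rate must be in [0, 1)")

    @property
    def head_dim(self):
        """Per-head width under the even-split assumption."""
        return self.embedding_dim // self.num_heads


config = LabradorConfig()
print(config.head_dim)
```

Under that assumption, each of the 4 heads would operate on 756 / 4 = 189 dimensions.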

## Training Details
- **Optimizer:** Adam
- **Epochs:** 12
- **Learning Rate:** 8e-6
- **Batch Size:** 512
- **Masking Ratio:** 40%
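
To make the 40% masking ratio concrete, the sketch below hides a fixed fraction of the values in one panel and keeps the hidden values as prediction targets. The function `mask_values`, the `None` placeholder, and the choice of a fixed count per panel (rather than independent per-value sampling) are assumptions for illustration; the card does not specify how masking was implemented.

```python
import random

MASK = None  # hypothetical placeholder standing in for a masked value


def mask_values(values, ratio=0.4, seed=0):
    """Mask about `ratio` of the entries; return (inputs, targets).

    `targets` maps each masked position to its original value.
    """
    rng = random.Random(seed)
    # Assumption: mask a fixed count per panel, at least one when non-empty.
    n_mask = min(len(values), max(1, round(ratio * len(values))))
    masked_idx = set(rng.sample(range(len(values)), n_mask))
    inputs = [MASK if i in masked_idx else v for i, v in enumerate(values)]
    targets = {i: values[i] for i in sorted(masked_idx)}
    return inputs, targets


# Illustrative values only, not real patient data.
example = [24.0, 1.0, 4.1, 140.0, 15.0, 13.5, 250.0, 7.2]
inputs, targets = mask_values(example)
print(inputs)
print(targets)
```

With eight values per panel, a 40% ratio rounds to three masked positions in this sketch.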

## Limitations & Bias
- **Data Source Bias:** The training data from a single healthcare institution may not be representative of broader populations.
- **Analytical Bias:** The focus on specific lab values may not capture the full spectrum of patient health.
- **Generalization:** The model's performance may vary across different healthcare settings and patient demographics.

## Ethical Considerations
- **Data Privacy:** Users must adhere to ethical standards and privacy laws when applying the model to sensitive health information.
- **Clinical Decision Making:** The model's predictions should complement, not replace, clinical judgment and patient-specific considerations.

## Acknowledgements
This work was supported by MIT Critical Data and uses the MIMIC-IV dataset. We thank all contributors to the MIMIC project and acknowledge the patients and healthcare providers who made this research possible.

## Model Details
- **Name:** Labrador
- **Version:** 1.0
- **Release Date:** January 28, 2024
- **Developer:** David Restrepo
- **Affiliation:** MIT Critical Data
- **Contact:** davidres@mit.edu

## License
This model is released under the MIT License.
