Update README.md
See our paper at https://arxiv.org/abs/2405.17743

GitHub repo: https://github.com/Cardinal-Operations/ORLM

## Model Details

LLaMA-3-8B-ORLM is fully fine-tuned on the OR-Instruct data and built on Meta's [LLaMA-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) model.

More training details can be seen at https://arxiv.org/abs/2405.17743
## Model Usage

Prompting Template:

```text
Below is an operations research question. Build a mathematical model and corresponding python code using `coptpy` that appropriately addresses the question.

# Question:
{Question}

# Response:
```

Please replace `{Question}` with any natural-language OR question.
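As a minimal sketch, the template above can be assembled in Python like this; the exact whitespace (single blank lines between sections, a trailing newline after `# Response:`) is an assumption read off the template block, and the sample question is a placeholder:

```python
# Assemble the ORLM prompt from the template shown above.
# The blank-line placement mirrors the template block; treat it as an
# assumption about the exact whitespace the model expects.
TEMPLATE = (
    "Below is an operations research question. Build a mathematical model "
    "and corresponding python code using `coptpy` that appropriately "
    "addresses the question.\n"
    "\n"
    "# Question:\n"
    "{question}\n"
    "\n"
    "# Response:\n"
)

# Placeholder question for illustration only.
question = "A factory makes two products and wants to maximize profit."
prompt = TEMPLATE.format(question=question)
print(prompt)
```

The resulting string is what you would pass to the tokenizer/generation call of your inference stack.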
## Python Code Solution Using `coptpy`:

Here is a Python script using the `coptpy` library to solve the problem:

```python
import coptpy as cp
from coptpy import COPT

# ... (model construction elided in this excerpt) ...

if model.status == COPT.OPTIMAL:
    # ... (objective value and large-pill count printed here, elided) ...
    print("Number of small pills to be made: {:.0f}".format(y.x))
else:
    print("No optimal solution found.")
```

In this script, we first create a `COPT` environment and model. Then, we add two integer decision variables `x` and `y`, representing the number of large and small pills to be made, respectively.
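Because the model-building middle of the script is elided in this excerpt, here is a solver-free cross-check of what such a small integer program computes, by brute-force enumeration. All coefficients below are invented for illustration (they are not the problem's actual data), and no COPT license is needed:

```python
# Brute-force check of a small integer program of the same shape as the
# pill problem. ALL coefficients are HYPOTHETICAL: each large pill uses
# 3 units of medicine and each small pill 1 unit, 100 units are
# available, packaging allows at most 60 small pills, and we maximize an
# assumed benefit of 2 per large pill and 1 per small pill.
best = None
for x in range(101):          # number of large pills
    for y in range(101):      # number of small pills
        if 3 * x + y <= 100 and y <= 60:   # feasibility
            value = 2 * x + y              # objective
            if best is None or value > best[0]:
                best = (value, x, y)

value, x, y = best
print(f"Best objective: {value} (large pills: {x}, small pills: {y})")
```

For instances of realistic size this enumeration is infeasible, which is exactly why the script delegates to an integer-programming solver.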
## Performances

Below is the comparison of performance on the NL4OPT, MAMO, and IndustryOR benchmarks. Values marked with a <sup>*</sup> are directly copied from original papers, with blanks where data were not reported. The highest results are highlighted in bold.

| **Method** | **NL4OPT** | **MAMO EasyLP** | **MAMO ComplexLP** | **IndustryOR** | **Micro Avg** | **Macro Avg** |
|------------------------------------------------|-------------------------|-----------------------|----------------------|-------------------|-----------------|-----------------|
| *Methods based on PLMs* | | | | | | |
| `tag-BART` | 47.9%<sup>*</sup> | - | - | - | - | - |
| *Methods based on GPT-3.5* | | | | | | |
| `Standard` | 42.4%<sup>*</sup> | - | - | - | - | - |
| `Reflexion` | 50.7%<sup>*</sup> | - | - | - | - | - |
| `Chain-of-Experts` | 58.9%<sup>*</sup> | - | - | - | - | - |
| *Methods based on GPT-4* | | | | | | |
| `Standard` | 47.3%<sup>*</sup> | 66.5%<sup>*</sup> | 14.6%<sup>*</sup> | 28.0% | 50.2% | 39.1% |
| `Reflexion` | 53.0%<sup>*</sup> | - | - | - | - | - |
| `Chain-of-Experts` | 64.2%<sup>*</sup> | - | - | - | - | - |
| `OptiMUS` | 78.8%<sup>*</sup> | - | - | - | - | - |
| *ORLMs based on open-source LLMs* | | | | | | |
| `ORLM-Mistral-7B` | 84.4% | 81.4% | 32.0% | 27.0% | 68.8% | 56.2% |
| `ORLM-Deepseek-Math-7B-Base` | **86.5%** | 82.2% | **37.9%** | 33.0% | 71.2% | 59.9% |
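The two average columns differ in weighting: the macro average is the unweighted mean of the four benchmark accuracies, while the micro average weights each benchmark by its number of test instances. A sketch using the `ORLM-Mistral-7B` row; the per-benchmark test-set sizes below are assumed placeholders, not the paper's actual counts, so only the macro value is expected to reproduce the table:

```python
# Micro vs. macro average over the four benchmarks, using the
# ORLM-Mistral-7B accuracies from the table. The test-set sizes are
# HYPOTHETICAL placeholders; the paper's actual counts determine the
# exact micro average.
accuracies = {"NL4OPT": 84.4, "MAMO EasyLP": 81.4,
              "MAMO ComplexLP": 32.0, "IndustryOR": 27.0}
sizes = {"NL4OPT": 245, "MAMO EasyLP": 652,
         "MAMO ComplexLP": 211, "IndustryOR": 100}  # assumed counts

macro = sum(accuracies.values()) / len(accuracies)   # unweighted mean
micro = (sum(accuracies[b] * sizes[b] for b in accuracies)
         / sum(sizes.values()))                      # instance-weighted

print(f"macro = {macro:.1f}%")   # reproduces the table's 56.2%
print(f"micro = {micro:.1f}%")
```

The gap between the two (macro well below micro here) reflects that the harder, smaller benchmarks (MAMO ComplexLP, IndustryOR) count equally in the macro average but contribute fewer instances to the micro average.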