File size: 5,285 Bytes
c951cca
e61b199
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
c951cca
e61b199
 
 
bd05070
ec2689f
0383f32
e61b199
c951cca
e61b199
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
493441c
 
107414b
 
 
0383f32
493441c
c951cca
493441c
 
 
 
 
 
 
 
c951cca
493441c
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
# Towards Neuro-Symbolic Language Understanding

![alt text](https://www.flexudy.com/wp-content/uploads/2021/09/conceptor.png "Flexudy's conceptor")

At [Flexudy](https://flexudy.com), we look for ways to unify symbolic and sub-symbolic methods to improve model interpretation and inference.

## Problem

1. Word embeddings are awesome 🚀. However, no one really knows what an array of 768 numbers means?
2. Text/Token classification is also awesome ❤️‍. Still, classifying things into a finite set of concepts is rather limited.
3. Last but not least, how do I know that the word *cat* is a **mammal** and also an **animal** if my neural network is only trained to predict whether something is an animal or not?

## Solution

1. It would be cool if my neural network would just know that **cat** is an **animal** right? *∀x.Cat(x) ⇒ Animal(x)*.
Or for example, (*∀x.SchöneBlumen(x) ⇒ Blumen(x)*) -- English meaning: For all x, If x is a beautiful flower, then x is still a flower. --

2. All of a sudden, tasks like **Question Answering**, **Summarization**, **Named Entity Recognition** or even **Intent Classification** etc become easier right?

Well, one might probably still need time to build a good and robust solution that is not as large as **GPT3**.

Like [Peter Gärdenfors, author of conceptual spaces](https://www.goodreads.com/book/show/1877443.Conceptual_Spaces), we are trying to find ways to navigate between the symbolic and the sub-symbolic by thinking in concepts.

Should such a solution exist, one could easily leverage true logical reasoning engines on natural language.

How awesome would that be? 💡

## Flexudy's Conceptor

1. We developed a poor man's implementation of the ideal solution described above.
2. Though it is a poor man's model, **it is still a useful one** 🤗.

### Usage

No library should anyone suffer. Especially not if it is built on top of 🤗 **HF Transformers**.


Go to the [Github repo](https://github.com/flexudy/natural-language-logic)

`pip install git+https://github.com/flexudy/natural-language-logic.git@v0.0.1`

```python
from flexudy.conceptor.start import FlexudyConceptInferenceMachineFactory

# Load me only once
concept_inference_machine = FlexudyConceptInferenceMachineFactory.get_concept_inference_machine()

# A list of terms.
terms = ["cat", "dog", "economics and sociology", "public company"]

# If you don't pass the language, a language detector will attempt to predict it for you
# If any error occurs, the language defaults to English.
language = "en"

# Predict concepts
# You can also pass the batch_size=2 and the beam_size=4
concepts = concept_inference_machine.infer_concepts(terms, language=language)
```

Output:

```python
{'cat': ['mammal', 'animal'], 'dog': ['hound', 'animal'], 'economics and sociology': ['both fields of study'], 'public company': ['company']}
```


### How was it trained?

1. Using Google's T5-base and T5-small. Both models are released on the Hugging Face Hub.
2. T5-base was trained for only two epochs while T5-small was trained for 5 epochs.

## Where did you get the data?

1. I extracted and curated a fragment of [Conceptnet](https://conceptnet.io/)
2. In particular, only the IsA relation was used.
3. Note that one term can belong to multiple concepts (which is pretty cool if you think about [Fuzzy Description Logics](https://lat.inf.tu-dresden.de/~stefborg/Talks/QuantLAWorkshop2013.pdf)).
Multiple inheritances however mean some terms belong to so many concepts. Hence, I decided to randomly throw away some due to the **maximum length limitation**.

### Setup
1. I finally allowed only `2` to `4` concepts at random for each term. This means, there is still great potential to make the models generalise better 🚀.
3. I used a total of `279884` training examples and `1260` for testing. Edges -- i.e `IsA(concept u, concept v)` -- in both sets are disjoint.
4. Trained for `15K` steps with learning rate linear decay during each step. Starting at `0.001`
5. Used `RAdam Optimiser` with weight_decay =`0.01` and batch_size =`36`.
6. Source and target max length were both `64`.

### Multilingual Models

1. The "conceptor" model is multilingual. English, German and French is supported.
2. [Conceptnet](https://conceptnet.io/) supports many languages, but I just chose those three because those are the ones I speak.

### Metrics for flexudy-conceptor-t5-base

| Metric        |         Score |
| ------------- |:-------------:|
| Exact Match   | 36.67         |
| F1            | 43.08         |
| Loss smooth   | 1.214         |

Unfortunately, we no longer have the metrics for flexudy-conceptor-t5-small. If I recall correctly, base was just slightly better on the test set (ca. `2%` F1).

## Why not just use the data if you have it structured already?

Conceptnet is very large. Even if you just consider loading a fragment into your RAM, say with only 100K edges, this is still a large graph.
Especially, if you think about how you will save the node embeddings efficiently for querying.
If you prefer this approach, [Milvus](https://github.com/milvus-io/pymilvus) can be of great help.
You can compute query embeddings and try to find the best match. From there (after matching), you can navigate through the graph at `100%` precision.