flair/ner-english · Doubts around NER

Hi,

I have pre-trained FastText as well as bert-base-uncased on a custom corpus.
How can I use them for NER training?

Specifically, is something like this possible in Flair?

from flair.embeddings import FlairEmbeddings, WordEmbeddings

embedding_types = [
    WordEmbeddings('path to custom fasttext vectors'),  # .txt file generated from fasttext
    FlairEmbeddings('path to custom flair-forward vectors'),
    FlairEmbeddings('path to custom flair-backward vectors'),
]
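
To make the question concrete, here is a minimal sketch of how I imagine wiring these together. It rests on assumptions I have not verified: that the FastText .txt file is in word2vec text format (a header line with vocabulary size and dimension) and needs converting to a gensim KeyedVectors file before WordEmbeddings can load it from a path, and that FlairEmbeddings accepts a path to a custom language model. The names gensim_path_for and build_stacked_embeddings are hypothetical helpers of mine, not part of Flair.

```python
from pathlib import Path


def gensim_path_for(txt_path: str) -> str:
    """Hypothetical helper: file name for the converted KeyedVectors file."""
    return str(Path(txt_path).with_suffix(".gensim"))


def build_stacked_embeddings(fasttext_txt: str, flair_forward: str, flair_backward: str):
    """Convert the text-format vectors, then stack them with the two Flair LMs."""
    # Imported lazily so the sketch loads even where flair/gensim are not installed.
    from gensim.models import KeyedVectors
    from flair.embeddings import FlairEmbeddings, StackedEmbeddings, WordEmbeddings

    # Assumption: the .txt file is in word2vec text format.
    vectors = KeyedVectors.load_word2vec_format(fasttext_txt, binary=False)
    converted = gensim_path_for(fasttext_txt)
    vectors.save(converted)

    # Concatenate the word vectors with the forward and backward Flair embeddings.
    return StackedEmbeddings([
        WordEmbeddings(converted),
        FlairEmbeddings(flair_forward),
        FlairEmbeddings(flair_backward),
    ])
```

My understanding is that the returned StackedEmbeddings would then be passed as the embeddings argument of the sequence tagger, but please correct me if this is not the intended route.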

Furthermore, when we use this as the embedder, what happens behind the scenes?

from flair.embeddings import TransformerWordEmbeddings

embeddings = TransformerWordEmbeddings(model='xlm-roberta-large',
                                       layers="-1",
                                       subtoken_pooling="first",
                                       fine_tune=True,
                                       use_context=True,
                                       )

1. Does it simply take the token embeddings from 'xlm-roberta-large' and put a linear layer for NER on top of them?
2. Where exactly are FLERT's features used? Are they handled automatically behind the scenes?
   - If yes, how can I turn off FLERT's features so that I can compare the results with and without FLERT?
3. I have pre-trained "bert-base-uncased" on a custom dataset. How should I use it instead of "xlm-roberta-large"?
4. Do you have any suggestions on whether we should use a CRF layer on top of BERT embeddings for NER tasks?
5. What exactly does the "use_rnn" flag do? If I switch it off, does it disable the char-RNN layer or the word-RNN layer?
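
For the FLERT comparison, this is the kind of experiment I have in mind, as a sketch only. I am assuming that use_context is the switch controlling FLERT's document-level context, that a local path to my custom BERT checkpoint can be passed as model, and that SequenceTagger accepts the keyword arguments shown. The helper names embedding_kwargs and build_tagger, and the constant TAGGER_KWARGS, are hypothetical and mine.

```python
def embedding_kwargs(model_name: str, use_flert_context: bool) -> dict:
    """Keyword arguments for TransformerWordEmbeddings; only use_context varies."""
    return {
        "model": model_name,  # e.g. 'xlm-roberta-large' or a path to a custom checkpoint
        "layers": "-1",
        "subtoken_pooling": "first",
        "fine_tune": True,
        "use_context": use_flert_context,  # assumption: this toggles FLERT context
    }


# Tagger settings held constant in both runs so that only the context flag differs.
TAGGER_KWARGS = {
    "hidden_size": 256,
    "use_crf": False,
    "use_rnn": False,
    "reproject_embeddings": False,
}


def build_tagger(model_name: str, use_flert_context: bool, label_dictionary, tag_type: str = "ner"):
    """Build one tagger variant; flair is imported lazily inside the function."""
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger

    embeddings = TransformerWordEmbeddings(**embedding_kwargs(model_name, use_flert_context))
    return SequenceTagger(
        embeddings=embeddings,
        tag_dictionary=label_dictionary,
        tag_type=tag_type,
        **TAGGER_KWARGS,
    )
```

The idea would be to train build_tagger(..., use_flert_context=True, ...) and build_tagger(..., use_flert_context=False, ...) on the same corpus and compare the scores, if that is indeed the right way to isolate FLERT's contribution.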

I know these are a lot of questions, but the simplicity of Flair let me run experiments quickly, hence the curiosity :)

-Nitesh