
Nesta, the UK's innovation agency, has been scraping online job adverts since 2021 and building algorithms to extract and structure information as part of the Open Jobs Observatory project.

Although we are unable to share the raw data openly, we aim to open source our models, algorithms and tools so that anyone can use them for their own research and analysis.

📟 About

This model was pre-trained, with a masked-language-modelling objective, from a distilbert-base-uncased checkpoint on 100k sentences from online job postings scraped as part of the Open Jobs Observatory.
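
The training script is not part of this card, but continued masked-language-model pre-training of this kind can be reproduced with the Hugging Face Trainer roughly as sketched below. The file name job_ad_sentences.txt and the hyperparameters are illustrative assumptions, not the exact settings used for this model (only the 3 epochs are suggested by the metrics reported further down).

from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)
from datasets import load_dataset

# Illustrative only: a text file with one job-advert sentence per line.
dataset = load_dataset("text", data_files={"train": "job_ad_sentences.txt"})

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

# Randomly mask tokens at the standard 15% BERT/DistilBERT rate.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(output_dir="ojobert", num_train_epochs=3)

trainer = Trainer(model=model, args=args,
                  train_dataset=tokenized["train"],
                  data_collator=collator)
trainer.train()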

πŸ–¨οΈ Use

To use the model:

from transformers import pipeline

model = pipeline('fill-mask', model='ihk/ojobert', tokenizer='ihk/ojobert')

An example use is as follows:

text = "Would you like to join a major [MASK] company?"
results = model(text, top_k=3)
results

>> [{'score': 0.1886572688817978,
  'token': 13859,
  'token_str': 'pharmaceutical',
  'sequence': 'would you like to join a major pharmaceutical company?'},
 {'score': 0.07436735928058624,
  'token': 5427,
  'token_str': 'insurance',
  'sequence': 'would you like to join a major insurance company?'},
 {'score': 0.06400047987699509,
  'token': 2810,
  'token_str': 'construction',
  'sequence': 'would you like to join a major construction company?'}]

βš–οΈ Training results

The fine-tuning metrics are as follows:

  • eval_loss: 2.5871026515960693
  • eval_runtime: 134.4452 seconds
  • eval_samples_per_second: 14.281
  • eval_steps_per_second: 0.223
  • epoch: 3.0
  • perplexity: 13.29
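
The reported perplexity follows directly from the evaluation loss, since masked-language-model perplexity is conventionally the exponential of the mean cross-entropy loss:

import math

eval_loss = 2.5871026515960693
perplexity = math.exp(eval_loss)
print(round(perplexity, 2))  # 13.29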