DIY AI For Journalists
Compiling resources useful for journalists building prototypes with AI
Note This Space provides a version of Whisper (a speech-to-text model) with speaker diarization, so you can transcribe audio containing speech and also record who is speaking.
pyannote/speaker-diarization
Automatic Speech Recognition • Updated • 5.36M • 854
Note This model performs diarization: it identifies who is speaking, and when, in an audio recording.
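A transcription model and a diarization model each produce their own time-stamped output, so a prototype usually has to join the two. The sketch below shows one common way to do that: give each transcript segment the speaker whose turn overlaps it most. The `Span` class, the `assign_speakers` helper, and the sample data are hypothetical and hand-written for illustration; they are not the output format of Whisper or pyannote.

```python
from dataclasses import dataclass


@dataclass
class Span:
    start: float  # seconds
    end: float    # seconds
    label: str    # transcript text, or a speaker ID for a diarization turn


def overlap(a: Span, b: Span) -> float:
    """Length in seconds of the time shared by two spans (0 if disjoint)."""
    return max(0.0, min(a.end, b.end) - max(a.start, b.start))


def assign_speakers(segments, turns):
    """Pair each transcript segment with the speaker whose turn overlaps it most."""
    labelled = []
    for seg in segments:
        best = max(turns, key=lambda t: overlap(seg, t), default=None)
        if best is None or overlap(seg, best) == 0.0:
            speaker = "UNKNOWN"  # no diarization turn covers this segment
        else:
            speaker = best.label
        labelled.append((speaker, seg.label))
    return labelled


# Hand-written illustrative data, not real model output.
segments = [
    Span(0.0, 2.5, "Thanks for joining us."),
    Span(2.5, 6.0, "Happy to be here."),
    Span(9.0, 10.0, "(inaudible)"),
]
turns = [
    Span(0.0, 2.4, "SPEAKER_00"),
    Span(2.4, 6.2, "SPEAKER_01"),
]

for speaker, text in assign_speakers(segments, turns):
    print(f"{speaker}: {text}")
```

Maximum overlap is a simple heuristic; it can mislabel a segment in which two people talk over each other, so attributed quotes should still be checked against the audio.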
copenlu/scientific-exaggeration-detection
Text Classification • Updated • 18 • 3
Note This model measures the causal claim strength of a scientific sentence. Scoring two sentences, such as one from a paper and one from a news report about it, lets you check whether the second exaggerates the first.
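Once each sentence has a claim strength, the comparison itself is a small piece of logic. The sketch below assumes a four-level ordinal scale; the `ClaimStrength` names and the `compare_claims` helper are hypothetical, and the labels the model actually emits should be taken from its model card.

```python
from enum import IntEnum


class ClaimStrength(IntEnum):
    # Hypothetical ordinal scale for illustration only; check the model
    # card for the labels the model actually produces.
    NO_RELATION = 0
    CORRELATIONAL = 1
    CONDITIONAL_CAUSAL = 2
    DIRECT_CAUSAL = 3


def compare_claims(source: ClaimStrength, report: ClaimStrength) -> str:
    """Describe how a report's claim strength relates to its source's."""
    if report > source:
        return "exaggerates"
    if report < source:
        return "downplays"
    return "same"


# Example: the paper reports a correlation, the headline states causation.
verdict = compare_claims(ClaimStrength.CORRELATIONAL, ClaimStrength.DIRECT_CAUSAL)
print(verdict)
```

Keeping the comparison separate from the model call makes it easy to swap in a different classifier or to review borderline cases by hand.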
PDF OCR
Note A Space that lets you run OCR (optical character recognition) on PDF documents.
Grobid CRF only
Note GROBID is a machine learning library for extracting, parsing and re-structuring raw documents such as PDF into structured XML/TEI encoded documents with a particular focus on technical and scientific publications.
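Because the output is TEI XML, it can be read with any XML parser. The sketch below uses Python's standard library to pull a title and author names from a small hand-written sample. The sample is heavily simplified and is not real GROBID output, and the `extract_metadata` helper is hypothetical; real documents contain many more elements and may omit some of these.

```python
import xml.etree.ElementTree as ET

# TEI documents use this XML namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hand-written, simplified sample for illustration only.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title level="a" type="main">An Example Study</title>
      </titleStmt>
      <sourceDesc>
        <biblStruct>
          <analytic>
            <author><persName><forename>Ada</forename><surname>Example</surname></persName></author>
            <author><persName><forename>Ben</forename><surname>Sample</surname></persName></author>
          </analytic>
        </biblStruct>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
</TEI>
"""


def extract_metadata(tei_xml: str) -> dict:
    """Return the title and author names found in a TEI header."""
    root = ET.fromstring(tei_xml)
    title = root.findtext(".//tei:titleStmt/tei:title", default="", namespaces=TEI_NS)
    authors = []
    for pers in root.iterfind(".//tei:analytic/tei:author/tei:persName", TEI_NS):
        forename = pers.findtext("tei:forename", default="", namespaces=TEI_NS)
        surname = pers.findtext("tei:surname", default="", namespaces=TEI_NS)
        authors.append(f"{forename} {surname}".strip())
    return {"title": title, "authors": authors}


metadata = extract_metadata(SAMPLE_TEI)
print(metadata)
```

Passing the namespace map on every lookup matters: without it, the element names in a TEI file will not match.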
Coconut
Note Coconut Library Tool is an all-in-one data mining and textual analysis tool.
tomaarsen/span-marker-bert-base-uncased-keyphrase-inspec
Token Classification • Updated • 13 • 11
Note This is a named entity recognition model trained to extract keyphrases (keywords) from a text.
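Keyphrase extractors often return the same phrase several times with different casing or confidence, so a prototype typically cleans the predictions before showing them. The sketch below assumes each prediction is a dict with a `span` string and a `score` float; that shape, the `top_keyphrases` helper, and the sample values are assumptions made for illustration, so check the model's documentation for the real output format.

```python
def top_keyphrases(predictions, min_score=0.5, limit=5):
    """Deduplicate predictions case-insensitively and return the best spans."""
    best = {}
    for pred in predictions:
        text = pred["span"].strip()
        key = text.lower()
        score = pred["score"]
        # Keep only confident predictions, and only the best-scoring
        # variant of each phrase.
        if score >= min_score and score > best.get(key, (0.0, ""))[0]:
            best[key] = (score, text)
    ranked = sorted(best.values(), key=lambda item: item[0], reverse=True)
    return [text for _, text in ranked[:limit]]


# Hand-written illustrative predictions, not real model output.
predictions = [
    {"span": "city council", "score": 0.91},
    {"span": "City Council", "score": 0.84},
    {"span": "budget deficit", "score": 0.77},
    {"span": "meeting", "score": 0.32},
]

print(top_keyphrases(predictions))
```

The `min_score` threshold of 0.5 is an arbitrary starting point; tuning it on a handful of your own articles is usually worthwhile.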
Argilla Space
Note Sometimes it may be useful to create your own training data for training or evaluating machine learning models. Tools like Argilla can help with the process of creating these annotations.