<!DOCTYPE html>
<html>
<head>
<title>NLPre-PL</title>
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0/css/bootstrap.min.css">
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.6.0/jquery.min.js"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0/js/bootstrap.min.js"></script>
</head>
<body>
<div class="container">
<h2 style="text-align: center;">NLPre-PL Dataset</h2>
<p>The official NLPre-PL dataset is a version of the NKJP1M corpus, the 1-million-token balanced subcorpus of the National Corpus of Polish (Narodowy Korpus Języka Polskiego), divided uniformly at the paragraph level.
</p>
<p>
The NLPre-PL dataset aims to divide the paragraphs fairly, both length-wise and topic-wise, into train, development, and test sets. This keeps the distribution of the number of segments per paragraph similar across the sets and avoids a situation in which paragraphs with a small (or large) number of segments appear in only one of them, for example only at test time.
</p>
<p>
<a style="text-align: center;" href="https://huggingface.co/datasets/ipipan/nlprepl" class="btn btn-primary btn-lg active" role="button" aria-pressed="true">&#129303; NLPre-PL Dataset</a>
<a style="text-align: center;" href="http://git.nlp.ipipan.waw.pl/alina/PDBUD" class="btn btn-primary btn-lg active" role="button" aria-pressed="true">&#129303; PDB-UD Dataset</a>
</p>
<div><p></p></div>
<div class="container">
<h2 style="text-align: center;">NLPre-PL Trained models</h2>
<p>The models listed below were trained for the purpose of creating the NLPre-PL benchmark.</p>
<div class="alert alert-primary" role="alert">
COMBO
</div>
<p><b>UD TAGSET</b></p>
<p>
<a href="https://huggingface.co/ipipan/nlpre_combo_ud_herBERT_pdb" class="btn btn-secondary btn-lg active">COMBO + HerBERT + PDB-UD</a>
<a href="https://huggingface.co/ipipan/nlpre_combo_ud_fasttext_pdb" class="btn btn-secondary btn-lg active">COMBO + fasttext + PDB-UD</a>
<a href="https://huggingface.co/ipipan/nlpre_combo_ud_herBERT_nkjp-by-name" class="btn btn-secondary btn-lg active">COMBO + HerBERT + NLPrePL-fair-by-name</a>
<a href="https://huggingface.co/ipipan/nlpre_combo_ud_herBERT_nkjp-by-type" class="btn btn-secondary btn-lg active">COMBO + HerBERT + NLPrePL-fair-by-type</a>
<a href="https://huggingface.co/ipipan/nlpre_combo_ud_fasttext_nkjp-by-name" class="btn btn-secondary btn-lg active">COMBO + fasttext + NLPrePL-fair-by-name</a>
<a href="https://huggingface.co/ipipan/nlpre_combo_ud_fasttext_nkjp-by-type" class="btn btn-secondary btn-lg active">COMBO + fasttext + NLPrePL-fair-by-type</a>
</p>
<p><b>NKJP TAGSET</b></p>
<p>
<a href="https://huggingface.co/ipipan/nlpre_combo_nkjp_herBERT_nkjp-by-name" class="btn btn-secondary btn-lg active">COMBO + HerBERT + NLPrePL-fair-by-name</a>
<a href="https://huggingface.co/ipipan/nlpre_combo_nkjp_herBERT_nkjp-by-type" class="btn btn-secondary btn-lg active">COMBO + HerBERT + NLPrePL-fair-by-type</a>
<a href="https://huggingface.co/ipipan/nlpre_combo_nkjp_fasttext_nkjp-by-name" class="btn btn-secondary btn-lg active">COMBO + fasttext + NLPrePL-fair-by-name</a>
<a href="https://huggingface.co/ipipan/nlpre_combo_nkjp_fasttext_nkjp-by-type" class="btn btn-secondary btn-lg active">COMBO + fasttext + NLPrePL-fair-by-type</a>
</p>
<div class="alert alert-primary" role="alert">
spaCy
</div>
<div class="alert alert-primary" role="alert">
Stanza
</div>
<div class="alert alert-primary" role="alert">
Trankit
</div>
</div>
</div>
</body>
</html>