0 votes
1 answer
87 views

I'm trying to reproduce the training of one of the spaCy pipelines for the Italian language: it_core_news_sm. This pipeline is trained on 2 datasets: UD_Italian-ISDT for the conllu tasks, WikiNer for NER ...
1 vote
0 answers
855 views

I am trying to export the result of the file that I imported to Label Studio. This is my labeling interface: <View> <Labels name="label" toName="text"> <Label ...
1 vote
0 answers
32 views

I am working with Italian Universal Dependency data in CONLLU format, like this: sent_id = VIT-4006 text = "grazie dell'informazione, la metterò nella memoria del mio Macintosh". 1 " ...
4 votes
1 answer
1k views

I'm working on a named entity recognition (NER) project and would like to create my own dataset based on the CoNLL2003 dataset (link: https://huggingface.co/datasets/conll2003). I've been looking at ...
1 vote
1 answer
453 views

I have been searching for a while now but haven't found any solution to my problem. For a relation classification task I have annotated several news-like text documents with Prodigy annotation ...
0 votes
1 answer
92 views

This is my first time posting here, so be gentle, please. I have written the following code: import pandas as pd import spacy df = pd.read_csv('../../../Data/conll2003.dev.conll', sep='\t', ...
Ayro
  • 3
1 vote
3 answers
821 views

I was planning to train a Spark NLP custom NER model, which uses the CoNLL 2003 format to do so (this blog even leaves some training sample data to speed up the follow-up). This "sample data" ...
0 votes
0 answers
469 views

I have a text file that contains data for the NER model, the data is in CoNLL format. The CoNLL format is a text file with one word per line with sentences separated by an empty line. The first word ...
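A minimal sketch of reading the layout described in the excerpt above (one word per line, sentences separated by an empty line). The helper name `read_conll`, the whitespace-separated columns, and the sample data are assumptions for illustration only; the excerpt is truncated and does not say what each column holds, and CoNLL-U style `#` comment lines are not handled here.

```python
def read_conll(text):
    """Split CoNLL-style text into sentences.

    Each sentence is a list of rows; each row is the list of
    whitespace-separated columns found on one line (assumed layout).
    """
    sentences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # An empty line closes the current sentence, if any.
            if current:
                sentences.append(current)
                current = []
            continue
        current.append(line.split())
    # Keep the last sentence when the text does not end with an empty line.
    if current:
        sentences.append(current)
    return sentences


# Hypothetical two-sentence sample with a token column and a tag column.
sample = "EU B-ORG\nrejects O\n\nGerman B-MISC\ncall O\n"
sentences = read_conll(sample)
```

To read from a file instead, something like `read_conll(open(path, encoding="utf-8").read())` should work, with `path` pointing at your own data.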
1 vote
0 answers
142 views

I am working on a Portuguese Digital Humanities project using R. I created a CONLLU-style dataframe with the corpus data, using the UDPipe library: textAnnotated <- udpipe::udpipe_annotate(m_port, ...
-1 votes
1 answer
140 views

I'm trying to implement ML models with Amazon SageMaker Studio. The model that I want to implement is from Hugging Face, and it uses a dataset from CONLL Corpora. Following the ...
0 votes
2 answers
632 views

I need to preprocess XML files for a NER task and I am struggling with the conversion of the XML files. I guess there is a nice and easy way to solve the following problem. Given an annotated text in ...
1 vote
1 answer
2k views

I am working on an NER application where I have data annotated in the following data format. [('The F15 aircraft uses a lot of fuel', {'entities': [(4, 7, 'aircraft')]}), ('did you see the F16 landing?',...
0 votes
1 answer
94 views

I was working with the conll2003 dataset. It contains articles from various news sources, among other things. It contains sentences, part of speech tags for each word in those sentences, chunk ids for those ...
Rnj
  • 1,199
0 votes
1 answer
118 views

From my IOB corpus such as: mention Tag 170 171 467 O 172 173 Vincennes B-LOCATION 174 . O 175 176 Confirmation O 177 des O 178 privilèges O 179 de O 180 la O 181 ville ...
Lter
  • 85
1 vote
1 answer
449 views

The goal is to train BERT SRL on another data set. According to configuration, it requires conll-formatted-ontonotes-5.0. Natively, my data comes in a CoNLL format and I converted it to the conll-...
Chiarcos
  • 364
