3,794 questions
Advice
0
votes
1
replies
43
views
Organisation/Person tagging using Spacy
We’re working on a problem where our master dataset contains names of organizations and individuals, but some entries are untagged. We only have the names (no additional details such as email or ...
1
vote
1
answer
67
views
Output of for loop filling down in dataframe instead of returning corresponding values for each row
I'm using SpaCy to process a series of sentences and return the five most common words in each sentence. My goal is to store the output of that frequency analysis (using Counter) in a column beside ...
0
votes
0
answers
53
views
Training data format for SpanCategorizer when using custom suggester function
I'm taking a stab at building my own claim extraction pipeline (first time spaCy user).
Upstream in my pipeline, I feed n amount of docs to NER in the en_core_web_sm pretrained model in order to ...
0
votes
0
answers
56
views
Training with spaCy from command line, don't know why gpu-id not recognized
I am having the hardest of times getting my training session to use my GPU 0, which by every measure is present and correctly set up with CUDA 12.2.
When I try to do python -m spacy train base_config....
0
votes
1
answer
180
views
How to make Microsoft Presidio detect and mask Indian names and unusual text patterns in banking data?
I’m working on anonymizing PII in banking text using Microsoft Presidio. The built-in PERSON recognizer (which uses spaCy under the hood) works for some Western names and when the sentence is clear ...
2
votes
1
answer
80
views
How can I extract symptoms/diseases from a running transcription?
I'm working on a project where I'm attempting to extract medical symptoms from a running transcription. I'm using SocketIO to get mic audio and then using Whisper to transcribe the audio into text ...
1
vote
2
answers
101
views
how to efficiently use spacy for pos tagging and ner
I have 200 documents and I want to do NER and POS tagging. However, I find spaCy to be too slow (I am running this code in Google Colab):
for doc in nlp.pipe(dataset["text"], batch_size=...
0
votes
0
answers
64
views
spaCy spancat won’t learn (zero F-score) while NER on same data scores 0.40 — Prodigy-generated KPI/target corpus
I am trying to train a spaCy v3.8.7 spancat model on ~100 sustainability reports (annotated with Prodigy) to extract KPIs and targets.
An NER pipeline trained on the same data reaches F≈0.40, but ...
0
votes
1
answer
192
views
Unable to install spacy on MacOS 15.5 (M2) with Python 3.13.3 [duplicate]
Having created a new venv I am attempting to install spacy strictly in accordance with the documentation
Specifically:
pip install -U pip setuptools wheel
pip install -U 'spacy[apple]'
This fails (...
-2
votes
2
answers
606
views
pip install spacy errors with Python 3.13
I'm new to Python and I was given this code by my professor which includes "import spacy" and when I run the code I get the line: ModuleNotFoundError: No module named 'spacy'
That's where I ...
0
votes
0
answers
24
views
spaCy DependencyMatcher: One head for multiple children
How can I extract a single noun that is the head of multiple children?
I'm facing an issue with dependency matching in spaCy. I want to extract the nouns describing the named entities (identified by ...
4
votes
0
answers
63
views
Can older spaCy models be ported to future spaCy versions?
The latest spaCy versions have better performance and compatibility for GPU acceleration on Apple devices, but I have an existing project that depends on spaCy 3.1.4 and some of the specific behavior ...
0
votes
0
answers
32
views
Retrieving spaCy transformer tokenization ids
While using the spaCy transformer pipeline en_core_web_trf, how can I retrieve the transformer tokenization (often roberta-base)? It can be the tokenizer ids, tokenizer strings, or both (preferably).
Actual ...
0
votes
1
answer
332
views
Accessing Docling features from within spaCy Layout in Python
Currently I'm using spacy-layout as part of a pipeline to OCR and analyse documents. However, I also need to access other features of Docling such as counting the number of images in each ...
0
votes
1
answer
165
views
Problems with installing spacy on windows laptop
Hi, I'm trying to install spaCy on my Windows 11 laptop. I have Python (3+) and pip (latest) already installed. However, when I run the install command as indicated on the website -
pip install -U spacy
the ...