Stack Overflow
Advice
0 votes
0 replies
41 views

I’m currently working on extracting / segmenting text lines from handwritten documents. Most of the input images are camera-captured, which introduces several challenges: Lines may be curved or ...
1 vote
2 answers
247 views

I want to use Benepar with a French model to do syntactic segmentation. I followed the tutorial, but I always get this error: RuntimeError: Error(s) in loading state_dict for ChartParser: ...
0 votes
1 answer
220 views

Given an image of text, I need to break the text into segments using the OpenCV library. Let's say the image has 4 lines of text; I need to write a function that breaks down and cuts the lines and ...
3 votes
0 answers
99 views

The icu4x icu_segmenter::WordSegmenter seems like the best word segmenter out there. I don't understand how data providers work with word segmentation at all. It seems very complicated to me and I ...
1 vote
1 answer
734 views

I'm trying to write a method to count the number of words when the content is in Chinese or Japanese. The count should exclude special characters, punctuation, and whitespace. I tried creating a regex ...
1 vote
0 answers
42 views

I am currently working on a problem that requires segmenting a video lecture transcript based on the topics present within the video. My dataset consists of sentence-wise labels where 1 indicates the ...
0 votes
1 answer
600 views

OriginalImage1 BinarizedImage1 OriginalImage2 BinarizedImage2 OriginalImage3 BinarizedImage3 OriginalImage4 BinarizedImage4 I'm preparing images for OCR by Tesseract (pre-trained for this custom font) ...
2 votes
1 answer
478 views

I want to split a large corpus (.txt) into sentences with a custom rule, i.e. {SENT}, using spaCy 3.1. My main issue is that I want to "disable" the segmentation from the pretrained spaCy ...
1 vote
0 answers
47 views

Is it possible to segment a bs4.element.Tag into several bs4.element.Tag objects? Consider the following application: 1. The original bs4.element.Tag contains a paragraph. 2. We want to segment ...
1 vote
1 answer
2k views

The following code uses SymSpell in Python; see the symspellpy guide on word_segmentation. It uses the "de-100k.txt" and "en-80k.txt" frequency dictionaries from a GitHub repo, you ...
9 votes
1 answer
3k views

What is the difference between tokenization and segmentation in NLP? I searched for both terms but didn't really find any differences.
0 votes
1 answer
89 views

So the question revolves around character segmentation. My problem is the following: I want to segment characters based on y-axis pixel numbers, following this (in Python): source What I already ...
0 votes
1 answer
203 views

How can I obtain a whole word within a string-type sentence? For instance, if the given string was: The app has been updated to 88.0.1234.141 which contains a number of fixes and improvements. And ...
-1 votes
2 answers
720 views

Is there a simple way to convert plain text into a segmented array of chunks in Python? Each chunk should be, for example, 16 bytes. If the last part of the plain text is smaller than 16 bytes it should ...
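A minimal sketch of the chunking this question describes, assuming the text is encoded as UTF-8 before splitting. The function name `chunk_bytes` is hypothetical. The excerpt is cut off before it says how a short final chunk should be handled, so this sketch simply leaves the last chunk shorter than the chunk size.

```python
def chunk_bytes(text: str, size: int = 16, encoding: str = "utf-8") -> list[bytes]:
    """Split text into consecutive byte chunks of at most `size` bytes.

    Assumption: the last chunk is left short when the data does not
    divide evenly, because the excerpt does not show the desired handling.
    """
    data = text.encode(encoding)
    # Step through the byte string in strides of `size`.
    return [data[i:i + size] for i in range(0, len(data), size)]


chunks = chunk_bytes("a" * 40)
print([len(c) for c in chunks])
```

Note that slicing encoded bytes can split a multi-byte character across two chunks, which only matters if each chunk is decoded on its own.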
2 votes
3 answers
437 views

I'd like to remove all the timestamps in parentheses in the sample text data below. Input: Agent: Can I help you? ( 3s ) Customer: Thank you( 40s ) Customer: I have a question about X. ( 8m 1s ) ...
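One possible approach to the timestamp removal described here, sketched with Python's standard `re` module. The pattern assumes each timestamp is a parenthesized run of number-plus-unit pairs; the sample shows only `s` and `m` units, so the `h` unit is an assumption, and `strip_timestamps` is a hypothetical helper name.

```python
import re

# Optional leading whitespace, then "(", then one or more "<digits><unit>"
# pairs separated by optional spaces, then ")". The "h" unit is an assumption.
TIMESTAMP = re.compile(r"\s*\(\s*(?:\d+\s*[hms]\s*)+\)")


def strip_timestamps(text: str) -> str:
    """Remove parenthesized duration markers such as "( 3s )" or "( 8m 1s )"."""
    return TIMESTAMP.sub("", text)


sample = (
    "Agent: Can I help you? ( 3s ) Customer: Thank you( 40s ) "
    "Customer: I have a question about X. ( 8m 1s )"
)
print(strip_timestamps(sample))
```

Parenthesized text that is not a duration, such as "(see above)", does not match the pattern and is left in place.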
