367 questions
Best practices
0
votes
2
replies
63
views
Recommended way to create abstracted text embeddings from large text data?
I would like to use an LLM encoder model to create vector embeddings for certain texts in my dataset. The texts are written as technical problem descriptions by experts who are trying to repair a ...
0
votes
1
answer
141
views
'NoneType' object has no attribute 'encode' when loading tokenizer
Error occurs when trying to load Pegasus model for text summarization
from transformers import pipeline, set_seed
pipe = pipeline("summarization", model="google/pegasus-cnn_dailymail" ...
1
vote
0
answers
34
views
Is it possible to replicate a Power BI matrix in DAX to find the maximum value in a column?
I have a matrix visual in a Power BI report, covering three academic years, that finds the percentage of people in a quintile who are not completing a qualification.
I need to pull out the ...
1
vote
0
answers
38
views
Text summarization of comments: replace duplicates with the first occurrence if the meaning of the comment is the same
Context: I am doing an NLP project to analyze the comments column in a data frame. I want to replace duplicates with the first occurrence if the meaning of the comments is the same.
I want to compare all ...
0
votes
1
answer
198
views
BadRequestError when Summarizing with MapReduceDocumentsChain as a tool within AgentExecutor in langchain
I am trying to combine three models within LangChain so that an OpenAI tools-calling agent can call the correct model based on a question. I made StructuredTools from the chains and made sure ...
1
vote
1
answer
118
views
How do non-LLM models compare to LLMs for Abstractive Summaries of HTML content?
I'm interested in utilizing an NLP model to provide short (one sentence in length) abstractive summaries of web pages, providing the model a set of commonly occurring HTML content from each web page (...
0
votes
1
answer
499
views
Increase summary length using MS Azure-AI services
Recently, I have been using Azure AI cognitive services to summarize text with document summarization and conversation summarization. But the summary length using both document summarization ...
0
votes
1
answer
161
views
NLP - make summarization from each subtitle
I am very new to NLP. I'm trying to build a simple text summarization model that takes 1-2 important sentences from each subtitle in an article. For example, in the image I want to take 1 sentence from "...
1
vote
1
answer
615
views
How to Find Positional embeddings from BARTTokenizer?
The objective is to add token embeddings (customized, obtained using a different model) and the positional embeddings.
Is there a way I can find out the positional embeddings along with the token embeddings ...
1
vote
1
answer
1k
views
Summarization and Topic Extraction with LLMs (private) and LangChain or LlamaIndex using flan-t5-small
Has anyone used LangChain or LlamaIndex imports to deal with single documents that amount to >512 tokens? Yes, I know there are other approaches to dealing with it, but it is difficult to find ...
0
votes
0
answers
754
views
Max Length error while using Huggingface Transformer model for SHAP Explanation
I am using SHAP Explanation to explain the output of a pretrained model. It works for documents with a token length of less than 1024. It throws the error below if I provide a sequence with token ...
1
vote
1
answer
169
views
speed up PyTextRank for summarizing a document
I need to summarize documents with spacy-pytextrank. What is the best approach to make it faster without increasing the resources of the machine?
I was thinking of parallelizing the computation using ...
2
votes
0
answers
141
views
Stack size errors when fine-tuning t5 with xsum using PyTorch
I am trying to fine-tune t5-small with the xsum dataset using PyTorch on Windows 10 (CUDA 12.1).
Unfortunately the Trainer (or Seq2SeqTrainer) class from bitsandbytes is not available for Windows, so it was ...
1
vote
3
answers
154
views
Get unique values from rows with comma separated values based on a specific category column in R
Let's say I have:
group X Y Z
A cat, dog dog, fox
A fox, chicken dog, fox, chicken
A
B fox, dog
B fox
B ...
1
vote
1
answer
103
views
Detecting additions/removals from the string difference between texts
I have two versions of a short text, e.g.:
old = "(a) The provisions of this article apply to machinery of class 6."
new = "(a) The provisions of this article apply to machinery of ...