Stack Overflow
0 votes
2 replies
63 views

I would like to use an LLM encoder model to create vector embeddings for certain texts in my dataset. The texts are written as technical problem descriptions by experts who are trying to repair a ...
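A common pattern here is to embed each description with an encoder model and compare the resulting vectors by cosine similarity. A minimal sketch of the comparison step, with toy vectors standing in for real model output (the names and numbers are illustrative, not from any actual encoder):

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for embeddings produced by an encoder model.
emb_pump_leak = [0.9, 0.1, 0.3]
emb_pump_drip = [0.8, 0.2, 0.4]
emb_wifi_down = [0.1, 0.9, 0.0]

print(cosine_similarity(emb_pump_leak, emb_pump_drip))  # high: similar problems
print(cosine_similarity(emb_pump_leak, emb_wifi_down))  # low: unrelated problems
```

With real embeddings the vectors would be several hundred dimensions long, but the similarity step is the same.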
0 votes
1 answer
141 views

An error occurs when trying to load the Pegasus model for text summarization:
from transformers import pipeline, set_seed
pipe = pipeline("summarization", model="google/pegasus-cnn_dailymail"...
1 vote
0 answers
34 views

I have a matrix visual in a Power BI report that shows the percentage of people in each quintile who are not completing a qualification, reported across three academic years. I need to pull out the ...
1 vote
0 answers
38 views

Context: I am doing an NLP project to analyze the comments column in a data frame. I want to replace duplicates with the first occurrence if the comments have the same meaning. I want to compare all ...
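One way to sketch the replace-with-first-occurrence step: compare each comment against the ones already kept, and reuse the first kept comment that exceeds a similarity threshold. The sketch below uses difflib's character-level ratio as a stand-in for a real semantic-similarity score (e.g. from sentence embeddings); the threshold and sample comments are illustrative:

```python
from difflib import SequenceMatcher

def dedupe_comments(comments, threshold=0.85):
    """Replace near-duplicate comments with their first occurrence.

    SequenceMatcher's character ratio stands in for a semantic
    similarity score here; swap in an embedding-based score for
    true meaning-level comparison.
    """
    kept = []     # first occurrences seen so far
    result = []   # output column, duplicates replaced
    for comment in comments:
        match = next(
            (k for k in kept
             if SequenceMatcher(None, comment.lower(), k.lower()).ratio() >= threshold),
            None,
        )
        if match is None:
            kept.append(comment)
            result.append(comment)
        else:
            result.append(match)   # replace duplicate with first occurrence
    return result

comments = ["Screen is broken", "screen is broken!", "Battery drains fast"]
print(dedupe_comments(comments))
```

Comparing every comment against every kept comment is O(n²), so for large frames an embedding index (e.g. nearest-neighbour search) is the usual next step.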
0 votes
1 answer
198 views

I am trying to combine three models within LangChain so that an OpenAI tools-calling agent can call the correct model based on a question. I made StructuredTools from the chains and made sure ...
1 vote
1 answer
118 views

I'm interested in utilizing an NLP model to provide short (one sentence in length) abstractive summaries of web pages, providing the model a set of commonly occurring HTML content from each web page (...
0 votes
1 answer
499 views

Recently, I have been using Azure AI Cognitive Services to summarize text with both document summarization and conversation summarization. But the summary length using both document summarization ...
0 votes
1 answer
161 views

I am very new to NLP. I'm trying to build a simple text summarization model that takes 1-2 important sentences from each subtitle in an article. For example, in the image I want to take 1 sentence from ...
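A minimal extractive baseline for "pick the most important sentence from a section" is to score each sentence by the frequency of its words in the whole text and keep the top scorer. A sketch, with a hypothetical helper and illustrative sample text (not a trained model):

```python
import re
from collections import Counter

def top_sentence(text):
    """Return the sentence with the highest word-frequency score —
    a minimal frequency-based extractive summarizer."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = re.findall(r"\w+", text.lower())
    freq = Counter(words)
    # Score a sentence as the sum of its words' document frequencies.
    return max(sentences,
               key=lambda s: sum(freq[w] for w in re.findall(r"\w+", s.lower())))

section = "The pump leaks oil badly. The pump motor overheats. Replace the filter."
print(top_sentence(section))
```

Applying this per subtitle section (rather than to the whole article) gives the 1-sentence-per-section behaviour described; stop-word removal would be a sensible next refinement.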
1 vote
1 answer
615 views

The objective is to add token embeddings (customized, obtained using a different model) and the positional embeddings. Is there a way I can find out the positional embeddings along with the token embeddings ...
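If the positions follow the fixed sinusoidal scheme from "Attention Is All You Need", they can be computed independently of any model and summed element-wise with the custom token embeddings. A pure-Python sketch (the dimensions and toy values are illustrative):

```python
import math

def sinusoidal_positional_embeddings(seq_len, d_model):
    """Sinusoidal positional embeddings:
    PE[pos, 2i]   = sin(pos / 10000**(2i / d_model))
    PE[pos, 2i+1] = cos(pos / 10000**(2i / d_model))
    """
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe

def add_positional(token_embeddings):
    """Element-wise sum of custom token embeddings and positional embeddings."""
    seq_len, d_model = len(token_embeddings), len(token_embeddings[0])
    pe = sinusoidal_positional_embeddings(seq_len, d_model)
    return [[t + p for t, p in zip(tok, pos)]
            for tok, pos in zip(token_embeddings, pe)]

# Toy token embeddings (normally produced by your own model).
tokens = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]
combined = add_positional(tokens)
```

If the target model instead uses learned positional embeddings, those have to be read from that model's embedding layer rather than computed this way.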
1 vote
1 answer
1k views

Has anyone used LangChain or LlamaIndex imports to deal with single documents that amount to >512 tokens? Yes, I know there are other approaches to dealing with it, but it is difficult to find ...
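Independent of which framework handles it, the usual workaround is to split the document into overlapping windows that each fit under the 512-token limit and process the windows separately. A sketch of that chunking step (the window and overlap sizes are illustrative defaults):

```python
def chunk_tokens(tokens, max_len=512, overlap=50):
    """Split a token list into overlapping windows of at most max_len
    tokens — a common workaround for 512-token encoder limits."""
    if max_len <= overlap:
        raise ValueError("max_len must exceed overlap")
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break
        start += max_len - overlap   # step back by `overlap` for context
    return chunks

doc = list(range(1200))  # stand-in for 1200 token ids
chunks = chunk_tokens(doc)
print(len(chunks), [len(c) for c in chunks])
```

LangChain's text splitters and LlamaIndex's node parsers implement the same idea with more options (sentence-aware boundaries, metadata per chunk).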
0 votes
0 answers
754 views

I am using SHAP explanations to explain the output of a pretrained model. It works for documents with a token length of less than 1024. It throws the error below if I provide a sequence with token ...
1 vote
1 answer
169 views

I need to summarize documents with spacy-pytextrank; what is the best approach to make it faster without increasing the machine's resources? I was thinking of parallelizing the computation using ...
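Without adding machine resources, the main lever is to spread independent documents across workers. The sketch below parallelizes a placeholder summarizer with a thread pool; for CPU-bound spaCy pipelines, processes are usually the better fit, e.g. spaCy's own nlp.pipe(..., n_process=N). The summarize function here is a stand-in, not the actual pytextrank call:

```python
from concurrent.futures import ThreadPoolExecutor

def summarize(doc_text):
    # Placeholder for the real spacy-pytextrank summarization;
    # here we just return the first "sentence" of the document.
    return doc_text.split(".")[0].strip() + "."

docs = [f"Document {i} first sentence. More text follows." for i in range(8)]

# Map the summarizer over documents in a worker pool.
with ThreadPoolExecutor(max_workers=4) as pool:
    summaries = list(pool.map(summarize, docs))

print(summaries[0])
```

Batching through nlp.pipe also amortizes per-document overhead even with a single process, which is often the first thing to try.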
2 votes
0 answers
141 views

I am trying to fine-tune t5-small with the xsum dataset on PyTorch, Windows 10 (CUDA 12.1). Unfortunately, the Trainer (or Seq2SeqTrainer) class with bitsandbytes is not available for Windows, so it was ...
1 vote
3 answers
154 views

Let's say I have: group X Y Z A cat, dog dog, fox A fox, chicken dog, fox, chicken A B fox, dog B fox B ...
1 vote
1 answer
103 views

I have two versions of a short text, e.g.: old = "(a) The provisions of this article apply to machinery of class 6." new = "(a) The provisions of this article apply to machinery of ...
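For spotting what changed between the two versions, the standard-library difflib works well at word level. In the sketch below the "new" string is a hypothetical completion (the original excerpt is truncated), used only to show the mechanics:

```python
import difflib

old = "(a) The provisions of this article apply to machinery of class 6."
# Hypothetical new version, assumed for illustration only.
new = "(a) The provisions of this article apply to machinery of class 7."

# ndiff prefixes removed words with "- ", added words with "+ ",
# and unchanged words with "  "; keep only the changes.
diff = [d for d in difflib.ndiff(old.split(), new.split())
        if d.startswith(("-", "+"))]
print(diff)
```

SequenceMatcher (also in difflib) gives a similarity ratio instead of a token-level diff, which is handy when you only need to know how much changed.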
