How to handle very long prompts? #26

Unanswered
Selbi182 asked this question in Q&A

I'm writing a Discord bot meant to summarize the last n messages. The idea is that large group chats often accumulate hundreds of unread messages while someone else is offline and it's a pain to read it all back by hand.

Now, so far I've got it set up to summarize the last ~200 messages just fine. Beyond that, however, I get an error because my prompt used too many tokens (more than 4096).

What's the approach to passing larger prompts (like 1000 messages)?

I tried splitting the messages into partitions of 200 each, which sorta works, but GPT treats each partition individually rather than as part of a bigger picture, so it doesn't keep context.

Replies: 1 comment 5 replies

So you are limited by token counts. Here are three ideas to consider:

Increasing the Token Count

Use a 16k or 32k context model, e.g. gpt-3.5-turbo-16k (for the 32k context model, I believe you need to apply for beta access).

Note that using these models is much more expensive.
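For illustration, the only code-level change is the model name on the request. The ChatRequest shape below is an assumption (check the library's README for the real constructor or builder), not its confirmed API:

```java
// Sketch only: the constructor shape is an assumption. The point is that the
// context limit is a property of whichever model you name in the request.
ChatRequest request = new ChatRequest("gpt-3.5-turbo-16k", messages); // 16k context instead of 4k
```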

Segmenting the work

In one ChatUser.USER message, you can provide, say, 100 messages and ask it to summarize them; then the next 100, and so on.

Then you can either append these summaries together or use another ChatRequest to smoothly combine them, as in the sketch below.
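Here's a minimal sketch of that summarize-then-combine flow. summarize() is a hypothetical placeholder for "send one ChatRequest containing a single ChatUser.USER message and return the reply"; wire it to the actual client however you normally do:

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkedSummarizer {

    // Hypothetical placeholder: send one ChatRequest with a single
    // ChatUser.USER message holding this prompt, return the model's reply.
    static String summarize(String prompt) {
        throw new UnsupportedOperationException("wire this to the chat client");
    }

    static String summarizeAll(List<String> discordMessages, int chunkSize) {
        // Step 1: summarize each chunk on its own, so every individual
        // request stays safely under the model's token limit.
        List<String> partials = new ArrayList<>();
        for (int i = 0; i < discordMessages.size(); i += chunkSize) {
            List<String> chunk = discordMessages.subList(
                    i, Math.min(i + chunkSize, discordMessages.size()));
            partials.add(summarize(
                    "Summarize these chat messages:\n" + String.join("\n", chunk)));
        }
        // Step 2: one final request merges the partial summaries, which is
        // what keeps the end result coherent instead of N disjoint fragments.
        return summarize("Combine these partial summaries into one summary:\n"
                + String.join("\n\n", partials));
    }
}
```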

Choosing another AI

I hate to recommend this at all, but OpenAI isn't really built for long context windows. Their larger-context models are much more expensive and are still too small for certain tasks. If you are willing to integrate another API yourself, some models out there can handle large amounts of text and are good at summaries (I believe there are models used to summarize entire books).

Thanks for the extensive feedback! Yeah, the first option isn't financially viable for me, but I also don't want to switch to a different AI altogether. I like the second approach the most.

However, there's one problem with that as well. It should happen rarely, but sometimes people dump half a novel's worth of content into a single message, which has already caused the token limit to be exceeded multiple times. If I knew beforehand how many tokens a given prompt would need, I'd limit it programmatically, but so far I haven't found a reliable solution. It's not just the word count.

That's an API I need to add, actually; it's called the tokenizer. I don't know if OpenAI has a tokenizer API yet, though...

You could also try-catch the error and check to see if the issue is tokens.

An even hackier approach is to get the number of words, multiply by 0.75, and that's roughly how many tokens you have.
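And a sketch of the try-catch route. The exception type and the sendChatRequest() helper are assumptions here (catch whatever your client actually throws), but OpenAI's context-length error message does contain the phrase "maximum context length", which you can match on:

```java
// Sketch: on a context-length error, drop the oldest messages and retry.
// Both the broad Exception catch and sendChatRequest() are placeholders.
static String summarizeWithRetry(List<String> messages) throws Exception {
    while (!messages.isEmpty()) {
        try {
            return sendChatRequest(String.join("\n", messages)); // hypothetical helper
        } catch (Exception e) {
            String msg = e.getMessage();
            if (msg != null && msg.contains("maximum context length")) {
                // Prompt too big: drop the oldest ~10% of messages, try again.
                messages = messages.subList(messages.size() / 10 + 1, messages.size());
            } else {
                throw e; // some other failure, don't swallow it
            }
        }
    }
    return "";
}
```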

I did try the Tokenizer library before, but it doesn't seem to be really accurate either.

```java
// OpenNLP's SimpleTokenizer splits text on character classes, so this counts
// word-like tokens, not the BPE tokens OpenAI actually limits by.
int tokenCount = SimpleTokenizer.INSTANCE.tokenize(prompt).length;
```

It's only off by a few tokens, but that doesn't stop GPT from complaining. Try-catching is my current approach, but obviously that's bug-fixing with a band-aid.

The last part is weird to me, because so far it seems like the token count is always greater than the word count. Do you mean divide by 0.75 instead perhaps?

Oh yeah, it's words per token, not tokens per word. So divide the word count by 0.75 rather than multiplying.
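So the rough estimate looks like this. The 0.75 figure is just OpenAI's rule of thumb for English text, so leave yourself some safety margin below the real limit:

```java
// Rule of thumb: ~0.75 words per token, i.e. tokens ≈ words / 0.75
// (roughly 1.33 tokens per word). An estimate, not an exact count.
static int estimateTokens(String prompt) {
    String trimmed = prompt.trim();
    int words = trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    return (int) Math.ceil(words / 0.75);
}
```

If you ever need exact counts, it has to be the same BPE tokenizer OpenAI uses (tiktoken); JTokkit is a Java port of it, though that's outside anything this library ships, just an option to be aware of.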

Alright, thanks!
