Video DB Scene Index with LLM. #26
Hi VideoDB Team,
I was following the example at https://docs.videodb.io/adding-ai-generated-voiceovers-with-videodb-and-lovo-70
Here are a few questions that are unclear from the document.
- Indexing timeline and scene description + LLM response.
- I see that shot-based indexing created 85 scenes from 2.3 minutes of video. But the prompt is sent to the LLM as a single prompt, and the response I got by following the doc has only 41 shots.
- Why don't we iterate over each scene and ask the LLM to generate a description that fills just that scene's part of the timeline?
- How can we be sure that the response given by the LLM fills the entire timeline of the video? It would be great if you could provide an explanation of this.
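One way to check the coverage question is to compare the time ranges in the LLM response against the video length. This is only a minimal sketch: it assumes the response has already been parsed into (start, end) pairs in seconds, and `find_gaps`, `tolerance`, and the sample values are hypothetical, not part of the VideoDB SDK or the tutorial.

```python
def find_gaps(segments, video_length, tolerance=0.05):
    """Return (start, end) ranges of the video not covered by any segment.

    segments: iterable of (start, end) pairs in seconds (assumed format).
    tolerance: gaps shorter than this many seconds are ignored.
    """
    gaps = []
    cursor = 0.0  # end of the region covered so far
    for start, end in sorted(segments):
        if start - cursor > tolerance:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if video_length - cursor > tolerance:
        gaps.append((cursor, video_length))
    return gaps


# Hypothetical values for illustration only
script_segments = [(0.0, 4.0), (4.0, 9.5), (12.0, 20.0)]
print(find_gaps(script_segments, video_length=25.0))
# → [(9.5, 12.0), (20.0, 25.0)]
```

An empty list would mean the response covers the whole timeline; a gap at the end would be consistent with the response having been cut short.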
Replies: 2 comments
Just pointing out one more thing: if the audio shifts even slightly relative to the scene, the changed position could create a whole different meaning.
Hi @dineshkumar181094, great observations!
- Could the LLM be stopping due to a token limit? That might be one of the reasons, as nothing in the prompt instructs the LLM to restrict or stop its output. Ideally it should cover the whole input (85 scenes in your case).
- In shot-based indexing, where the scene duration is very short (1-2 seconds), the output might not sound coherent. For better precision, a better way might be to club scenes up to a certain threshold (x minutes) and generate audio for those clubbed chunks instead of the character-based chunks given in the tutorial.
- In our experimentation, the prompt "generate a synced script based on the description" writes a script with sentences that are roughly the same length as the timestamp of the scene in the description.
"Just pointing out one more thing: if the audio shifts even slightly relative to the scene, the changed position could create a whole different meaning."
Can you please share an example of this if you have one handy? Good chunking should probably smooth out cases like this.
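The clubbing idea from the second point can be sketched roughly as below. This is an illustration under assumptions, not the tutorial's code: it takes scenes as (start, end, description) tuples in seconds, and `merge_scenes` and `min_duration` are hypothetical names. The threshold is given in seconds here for readability.

```python
def merge_scenes(scenes, min_duration=10.0):
    """Group consecutive scenes into chunks lasting at least min_duration seconds.

    scenes: list of (start, end, description) tuples, ordered by time (assumed format).
    Returns a list of dicts with "start", "end", and "descriptions" keys.
    """
    chunks = []
    current = None
    for start, end, description in scenes:
        if current is None:
            current = {"start": start, "end": end, "descriptions": [description]}
        else:
            current["end"] = end
            current["descriptions"].append(description)
        # Close the chunk once it is long enough
        if current["end"] - current["start"] >= min_duration:
            chunks.append(current)
            current = None
    if current is not None:
        if chunks:
            # Fold a short trailing remainder into the previous chunk
            chunks[-1]["end"] = current["end"]
            chunks[-1]["descriptions"].extend(current["descriptions"])
        else:
            chunks.append(current)
    return chunks


# Hypothetical scenes for illustration only
scenes = [(0.0, 1.5, "a"), (1.5, 3.0, "b"), (3.0, 6.0, "c"), (6.0, 7.0, "d")]
for chunk in merge_scenes(scenes, min_duration=3.0):
    print(chunk["start"], chunk["end"], chunk["descriptions"])
```

Each chunk keeps its own start and end time, so the audio generated for a chunk can still be placed at the right position on the timeline.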