InputContainer.seek(backward=True, any_frame=False) overshoots #1982

Unanswered
nikonikolov asked this question in 1. Help

I am trying to decode the frame at a specific index from a video as fast as possible. I follow the strategy that the docs of InputContainer.seek suggest: I call InputContainer.seek(backward=True, any_frame=False) to land on the closest preceding keyframe, then decode forward sequentially until I reach the desired frame. I am using the following code:

import av
import PIL.Image


def decode_frame_from_video(video_path: str, frame_index: int) -> PIL.Image.Image:
    with av.open(video_path) as container:  # av.container.input.InputContainer
        stream: av.video.stream.VideoStream = container.streams.video[0]
        # Calculate timestamp from frame index for seeking
        fps = float(stream.average_rate)
        target_timestamp_sec: float = frame_index / fps
        container.seek(int(target_timestamp_sec / stream.time_base), backward=True, any_frame=False, stream=stream)
        # Starting from the keyframe found by seek, decode frames until we reach the desired frame index
        for frame in container.decode(stream):  # av.video.frame.VideoFrame
            index = round(frame.pts * stream.time_base * fps)
            if index == frame_index:
                image: PIL.Image.Image = frame.to_image()
                return image
        raise ValueError(
            f"Could not find frame at index {frame_index} in video of length "
            f"{stream.frames} frames at {video_path}"
        )
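The index-to-timestamp conversion used above can be checked in isolation, without opening a video. The following is a minimal sketch that does the same arithmetic with exact fractions instead of floats, assuming a constant frame rate and a first frame at pts 0. The helper names `frame_index_to_pts` and `pts_to_frame_index` are hypothetical and not part of PyAV. Since `int()` truncates, a float result that ends up slightly below the exact value would land one tick early; exact arithmetic removes that variable, though it is not established here that rounding has anything to do with the overshoot.

```python
from fractions import Fraction


def frame_index_to_pts(frame_index: int, fps: Fraction, time_base: Fraction) -> int:
    """Convert a frame index to a pts in units of time_base.

    Assumes a constant frame rate and that frame 0 has pts == 0.
    """
    # index / fps is the time in seconds; dividing by time_base gives ticks.
    return round(Fraction(frame_index) / fps / time_base)


def pts_to_frame_index(pts: int, fps: Fraction, time_base: Fraction) -> int:
    """Inverse of frame_index_to_pts, same arithmetic as in the decode loop."""
    return round(pts * time_base * fps)


# Values reported for the test video: 5 FPS, time_base = 1 / 10240.
fps = Fraction(5)
time_base = Fraction(1, 10240)

for i in (19, 20, 36, 40):
    print(i, frame_index_to_pts(i, fps, time_base))
```

With PyAV, `stream.average_rate` and `stream.time_base` are already `Fraction` objects, so they could be passed in directly without converting to `float`.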

I am testing this with a 5 FPS video that has keyframes at indices 0, 20, 40 and 60. All frame indices from 0 to 35 work fine, but when I pass frame_index=36, the first frame that container.decode(stream) yields in the for loop after the call to container.seek is the keyframe at index 40, rather than the expected keyframe at index 20. If I instead seek in av.time_base units without passing a stream, with container.seek(int(target_timestamp_sec * av.time_base), backward=True, any_frame=False), the error happens at index 19 and the keyframe returned is 20. The video is encoded with x265 and stream.time_base = 1 / 10240. Decoding from the start shows that the frames have the expected time and pts:

ipdb> for frame in container.decode(stream):
          index = round(frame.pts * stream.time_base * fps)
          print(index, frame.time, frame.pts)
 
0 0.0 0 
1 0.2 2048 
2 0.4 4096 
3 0.6 6144 
4 0.8 8192 
5 1.0 10240 
6 1.2 12288 
7 1.4 14336 
8 1.6 16384 
9 1.8 18432 
10 2.0 20480 
11 2.2 22528 
12 2.4 24576 
13 2.6 26624 
14 2.8 28672 
15 3.0 30720 
16 3.2 32768 
17 3.4 34816 
18 3.6 36864 
19 3.8 38912 
20 4.0 40960 
21 4.2 43008 
22 4.4 45056 
23 4.6 47104 
24 4.8 49152 
25 5.0 51200 
26 5.2 53248 
27 5.4 55296 
28 5.6 57344 
29 5.8 59392 
30 6.0 61440 
31 6.2 63488 
32 6.4 65536 
33 6.6 67584 
34 6.8 69632 
35 7.0 71680 
36 7.2 73728 
37 7.4 75776 
38 7.6 77824 
39 7.8 79872
40 8.0 81920
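Until the cause of the overshoot is understood, one possible workaround is to check the first frame decoded after the seek and, if it is already past the target, seek again from an earlier offset. The sketch below is only that, a workaround under assumptions: the function name `decode_from_keyframe_before` and the `step_pts` and `max_retries` parameters are hypothetical, and it relies only on `container.seek(...)`, `container.decode(stream)` and `frame.pts` as used in the code above. It does not explain or fix the seek behavior itself.

```python
from typing import Any, Iterator


def decode_from_keyframe_before(
    container: Any,
    stream: Any,
    target_pts: int,
    step_pts: int,
    max_retries: int = 8,
) -> Iterator[Any]:
    """Yield frames starting from a keyframe whose pts is <= target_pts.

    Hypothetical workaround: if the first frame decoded after a seek is
    already past the target, retry the seek from an earlier offset.
    """
    offset = target_pts
    for _ in range(max_retries):
        container.seek(max(offset, 0), backward=True, any_frame=False, stream=stream)
        frames = container.decode(stream)
        first = next(frames, None)
        # Accept this seek only if decoding starts at or before the target.
        if first is not None and first.pts is not None and first.pts <= target_pts:
            yield first
            yield from frames
            return
        if offset <= 0:
            break  # already tried from the very start of the stream
        offset -= step_pts  # back off and try again
    raise ValueError(f"No keyframe at or before pts={target_pts} found")
```

A caller would iterate over the returned frames and stop at the one whose pts matches the target. For the video above, `step_pts` could be one frame (2048 ticks) or one keyframe interval (20 frames, 40960 ticks); a larger step means fewer retries at the cost of decoding more frames.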