Reimplement video utils #1929

Open
@SkalskiP

Description

The functionality currently available in supervision.utils.video should be reimplemented and consolidated in a new Video class. Every feature supported by the old video API must remain available in the new implementation.

  • get video info (works for files, RTSP, webcams)

    import supervision as sv
     
    # static video
    sv.Video("source.mp4").info
    # video stream
    sv.Video("rtsp://...").info
    # webcam
    sv.Video(0).info
  • simple frame iteration (object is iterable)

    import supervision as sv

    video = sv.Video("source.mp4")
    for frame in video:
        ...
  • advanced frame iteration (stride, sub-clip, on-the-fly resize)

    import supervision as sv

    video = sv.Video("source.mp4")
    for frame in video.frames(stride=5, start=100, end=500, resolution_wh=(1280, 720)):
        ...
  • process the video

    import cv2
    import supervision as sv

    def blur(frame, i):
        # `i` is assumed to be the frame index; it is unused here
        return cv2.GaussianBlur(frame, (11, 11), 0)

    sv.Video("source.mp4").save(
        "blurred.mp4",
        callback=blur,
        show_progress=True,
    )
  • overwrite target video parameters

    import supervision as sv

    sv.Video("source.mp4").save(
        "timelapse.mp4",
        fps=60,
        callback=lambda f, i: f,
        show_progress=True,
    )
  • complete manual control with explicit VideoInfo

    import cv2
    from supervision import Video, VideoInfo

    source = Video("source.mp4")
    target_info = VideoInfo(width=800, height=800, fps=24)

    with source.sink("square.mp4", info=target_info) as sink:
        for frame in source.frames():
            # resize every frame to the target resolution before writing
            frame = cv2.resize(frame, target_info.resolution_wh)
            sink.write(frame)
  • multi-backend support for decoding and encoding; implement PyAV and OpenCV backends

    import supervision as sv
    video = sv.Video("source.mkv", backend="pyav")
    video = sv.Video("source.mkv", backend="opencv")

    Suggested minimal protocol:

    from typing import Any, Protocol

    import numpy as np
    from supervision import VideoInfo

    # Writer is defined first so that Backend.writer can reference it
    class Writer(Protocol):
        def write(self, frame: np.ndarray) -> None: ...
        def close(self) -> None: ...

    class Backend(Protocol):
        def open(self, path: str) -> Any: ...
        def info(self, handle: Any) -> VideoInfo: ...
        def read(self, handle: Any) -> tuple[bool, np.ndarray]: ...
        def grab(self, handle: Any) -> bool: ...
        def seek(self, handle: Any, frame_idx: int) -> None: ...
        def writer(self, path: str, info: VideoInfo, codec: str) -> Writer: ...
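
To make the relationship between `frames(stride=..., start=..., end=...)` and the suggested protocol more concrete, here is a rough sketch of how strided iteration might be layered on `open`, `seek`, `read`, and `grab`. It is not an implementation from the library. `FakeBackend` and `iter_frames` are hypothetical names, frames are plain integers instead of `np.ndarray`, and treating `end` as exclusive is an assumption, since the issue does not say.

```python
from __future__ import annotations

from typing import Any, Iterator, Optional


class FakeBackend:
    """Hypothetical in-memory stand-in for a real decoder (not part of supervision)."""

    def __init__(self, frame_count: int) -> None:
        self.frame_count = frame_count
        self.decoded = 0  # number of frames fully decoded through read()

    def open(self, path: str) -> dict:
        # The "handle" is just a dict that tracks the current position.
        return {"pos": 0}

    def read(self, handle: dict) -> tuple[bool, Any]:
        if handle["pos"] >= self.frame_count:
            return False, None
        frame = handle["pos"]  # a real backend would return an np.ndarray
        handle["pos"] += 1
        self.decoded += 1
        return True, frame

    def grab(self, handle: dict) -> bool:
        # Advance one frame without decoding it.
        if handle["pos"] >= self.frame_count:
            return False
        handle["pos"] += 1
        return True

    def seek(self, handle: dict, frame_idx: int) -> None:
        handle["pos"] = frame_idx


def iter_frames(
    backend: FakeBackend,
    path: str,
    stride: int = 1,
    start: int = 0,
    end: Optional[int] = None,  # assumed exclusive
) -> Iterator[Any]:
    if stride < 1:
        raise ValueError("stride must be >= 1")
    handle = backend.open(path)
    if start:
        backend.seek(handle, start)
    idx = start
    while end is None or idx < end:
        ok, frame = backend.read(handle)
        if not ok:
            return
        yield frame
        # Skip the next stride - 1 frames without paying the decode cost.
        for _ in range(stride - 1):
            if not backend.grab(handle):
                return
        idx += stride


backend = FakeBackend(frame_count=20)
print(list(iter_frames(backend, "clip.mp4", stride=5, start=2, end=15)))  # [2, 7, 12]
print(backend.decoded)  # 3
```

Using `grab` for skipped frames is one possible reason to keep it in the protocol: only the frames that are actually yielded go through `read`.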

Additional

  • Please share a Google Colab with minimal code to test the new feature. We know it's additional work, but it will speed up the review process. The reviewer must test each change. Setting up a local environment to do this is time-consuming. Please ensure that Google Colab can be accessed without any issues (make it public). Thank you! 🙏🏻
  • Mark all methods from the old video API as deprecated. Find examples of already deprecated methods or classes in the current codebase and use the same approach to mark the old video API methods. A deprecation period of at least 5 releases is required.
  • Reimplement the internals of the old video API using the new video API.
  • Provide full unit-test coverage matching existing test style.
  • Update the library docs (docstrings, mkdocs) to reflect the new API.
  • Take into account the comments in "Issue with sv.VideoInfo FPS Handling for Precise Video Metadata Retrieval" (#1687).
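
As a rough illustration of the two deprecation-related points above (marking old functions as deprecated and reimplementing them on top of the new API), the sketch below wraps a legacy-style function so that it warns and then delegates to a new class. It uses only the standard `warnings` module and is not the deprecation mechanism already used in the codebase, which the issue asks contributors to look up and follow. `deprecated_alias`, `NewVideo`, and `legacy_frames_generator` are hypothetical names.

```python
import functools
import warnings
from typing import Any, Callable, Iterator


def deprecated_alias(replacement: str) -> Callable:
    """Hypothetical decorator: emit a DeprecationWarning, then call through."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            warnings.warn(
                f"{func.__name__} is deprecated; use {replacement} instead.",
                category=DeprecationWarning,
                stacklevel=2,  # point the warning at the caller, not the wrapper
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator


class NewVideo:
    """Hypothetical stand-in for the proposed Video class."""

    def __init__(self, source: str) -> None:
        self.source = source

    def frames(self, stride: int = 1) -> Iterator[int]:
        # Pretend the video has 10 frames; yield frame indices only.
        return iter(range(0, 10, stride))


@deprecated_alias("NewVideo(source).frames(stride=...)")
def legacy_frames_generator(source_path: str, stride: int = 1) -> Iterator[int]:
    # The old entry point keeps its signature but delegates to the new API.
    return NewVideo(source_path).frames(stride=stride)
```

Calling `legacy_frames_generator("source.mp4", stride=4)` still returns frames, but emits a `DeprecationWarning` that names the replacement.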
