zhanghengdev/awesome-video-object-detection

Awesome Video-Object-Detection

Intro

This is a list of awesome articles about object detection from video.

Datasets

ImageNet VID Challenge

VisDrone Challenge

Paper list

2016

Seq-NMS for Video Object Detection

[Arxiv]

  • Date: Feb 2016
  • Motivation: Smoothing the final bounding box predictions across time.
  • Summary: Constructing a temporal graph from overlapping bounding-box detections in adjacent frames, then using dynamic programming to select the bounding-box sequences with the highest overall detection score.
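
The sequence-selection step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `iou` and `best_sequence`, the 0.5 IoU threshold, and the input layout (one list of `(box, score)` pairs per frame) are assumptions made for the example. The full method also rescores the selected boxes, suppresses overlapping ones, and repeats; those steps are omitted here.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Detection = Tuple[Box, float]            # (box, detection score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def best_sequence(frames: List[List[Detection]], iou_thresh: float = 0.5):
    """Return (total_score, path) of the highest-scoring linked box sequence.

    path is a list of (frame_index, box_index) pairs. Boxes in adjacent
    frames are linked when their IoU exceeds iou_thresh (assumed value).
    Scores are assumed to be non-negative.
    """
    # best[t][i]: best total score of any sequence ending at box i of frame t.
    best = [[score for _, score in dets] for dets in frames]
    # prev[t][i]: index of the linked box in frame t - 1, or -1 if none.
    prev = [[-1] * len(dets) for dets in frames]

    for t in range(1, len(frames)):
        for i, (box_i, score_i) in enumerate(frames[t]):
            for j, (box_j, _) in enumerate(frames[t - 1]):
                if iou(box_i, box_j) > iou_thresh:
                    candidate = best[t - 1][j] + score_i
                    if candidate > best[t][i]:
                        best[t][i] = candidate
                        prev[t][i] = j

    # Locate the end of the best sequence.
    end = None
    for t, row in enumerate(best):
        for i, total in enumerate(row):
            if end is None or total > best[end[0]][end[1]]:
                end = (t, i)
    if end is None:
        return 0.0, []

    # Walk the back-pointers to recover the sequence.
    total = best[end[0]][end[1]]
    path = []
    t, i = end
    while i != -1:
        path.append((t, i))
        i = prev[t][i]
        t -= 1
    path.reverse()
    return total, path
```

For example, with three frames in which one object drifts slightly while a second one disappears, `best_sequence` links the drifting boxes across all three frames because their summed score is highest.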

T-CNN: Tubelets with Convolutional Neural Networks for Object Detection from Videos

[Arxiv] [Code]

  • Date: Apr 2016
  • Summary: Using a video object detection pipeline that first predicts optical flow, then propagates image-level predictions according to the flow, and finally uses a tracking algorithm to select temporally consistent, high-confidence detections.
  • Performance: 73.8% mAP on ImageNet VID validation.
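
The flow-based propagation step in the summary above might look roughly like the sketch below. It is only an illustration under stated assumptions: the function name `propagate_box`, the flow layout (`flow[y][x]` holding a `(dx, dy)` displacement per pixel), and the choice to shift a box by the mean flow inside it are all hypothetical and not taken from the T-CNN code.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Flow = List[List[Tuple[float, float]]]   # flow[y][x] = (dx, dy), assumed layout


def propagate_box(box: Box, flow: Flow) -> Box:
    """Shift a box into the next frame by the mean optical flow inside it."""
    x1, y1, x2, y2 = box
    height = len(flow)
    width = len(flow[0]) if height else 0

    # Pixels covered by the box, clipped to the flow field.
    xs = range(max(0, int(x1)), min(width, int(x2)))
    ys = range(max(0, int(y1)), min(height, int(y2)))
    count = len(xs) * len(ys)
    if count == 0:
        return box  # Box lies outside the flow field: leave it unchanged.

    mean_dx = sum(flow[y][x][0] for y in ys for x in xs) / count
    mean_dy = sum(flow[y][x][1] for y in ys for x in xs) / count
    return (x1 + mean_dx, y1 + mean_dy, x2 + mean_dx, y2 + mean_dy)
```

A propagated box would keep the score of the detection it came from, so a confident detection in one frame can fill in a missed detection in a neighbouring frame.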

Object Detection from Video Tubelets with Convolutional Neural Networks

[Arxiv] [Code]

  • Date: Apr 2016

Deep Feature Flow for Video Recognition

[Arxiv] [Code]

  • Date: Nov 2016
  • Performance: 73.0% mAP on ImageNet VID validation at 29 fps on a Titan X GPU.

2017

Object Detection in Videos with Tubelet Proposal Networks

[Arxiv]

  • Date: Feb 2017

Flow-Guided Feature Aggregation for Video Object Detection

[Arxiv] [Code]

  • Date: Mar 2017
  • Motivation: Producing powerful spatiotemporal features.
  • Performance: 76.3% mAP at 1.4 fps or 78.4% (combined with Seq-NMS) at 1.1 fps on ImageNet VID validation on a Titan X GPU.

Detect to Track and Track to Detect

[Arxiv] [Summary] [Code]

  • Date: Oct 2017
  • Motivation: Smoothing the final bounding box predictions across time.
  • Summary: Proposing a ConvNet architecture that solves detection and tracking problems jointly and applying a Viterbi algorithm to link the detections across time.
  • Performance: 79.8% mAP on ImageNet VID validation.

Towards High Performance Video Object Detection

[Arxiv]

  • Date: Nov 2017
  • Motivation: Producing powerful spatiotemporal features.
  • Performance: 78.6% mAP on ImageNet VID validation at 13 fps on a Titan X GPU.

Video Object Detection with an Aligned Spatial-Temporal Memory

[Arxiv] [Summary] [Code] [Demo]

  • Date: Dec 2017
  • Motivation: Producing powerful spatiotemporal features.
  • Performance: 80.5% mAP on ImageNet VID validation.

2018

Object Detection in Videos by High Quality Object Linking

[Arxiv]

  • Date: Jan 2018

Towards High Performance Video Object Detection for Mobiles

[Arxiv]

  • Date: Apr 2018
  • Motivation: Producing powerful spatiotemporal features.
  • Performance: 60.2% mAP on ImageNet VID validation at 25.6 fps on mobiles.

Optimizing Video Object Detection via a Scale-Time Lattice

[Arxiv] [Summary] [Code]

  • Date: Apr 2018
  • Performance: 79.6% mAP at 20 fps or 79.0% at 62 fps on ImageNet VID validation on a Titan X GPU.

Object Detection in Video with Spatiotemporal Sampling Networks

[Arxiv] [Summary]

  • Date: Mar 2018
  • Motivation: Producing powerful spatiotemporal features.
  • Performance: 78.9% mAP or 80.4% (combined with Seq-NMS) on ImageNet VID validation.

Fully Motion-Aware Network for Video Object Detection

[Paper] [Summary]

  • Date: Sep 2018
  • Motivation: Producing powerful spatiotemporal features.
  • Performance: 78.1% mAP or 80.3% (combined with Seq-NMS) on ImageNet VID validation.

Integrated Object Detection and Tracking with Tracklet-Conditioned Detection

[Arxiv] [Summary]

  • Date: Nov 2018
  • Motivation: Smoothing the final bounding box predictions across time.
  • Performance: 83.5% mAP (combined with FGFA and Deformable ConvNets v2) on ImageNet VID validation.

2019

AdaScale: Towards Real-time Video Object Detection Using Adaptive Scaling

[Arxiv]

  • Date: Feb 2019
  • Motivation: Adaptively rescaling the input image resolution to improve both the accuracy and the speed of video object detection.
  • Performance: 75.5% mAP on ImageNet VID validation with multi-scale training at four scales (600, 480, 360, 240).

Improving Video Object Detection by Seq-Bbox Matching

[pdf]

  • Date: Feb 2019
  • Motivation: Smoothing the final bounding box predictions across time (box-level method).
  • Performance: 80.9% mAP (offline detection) and 78.2% mAP (online detection), both at 38 fps on a Titan X GPU.

Comparison table

| Paper | Date | Base detector | Backbone | Tracking? | Optical flow? | Online? | mAP (%) | FPS (Titan X) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Seq-NMS | Feb 2016 | R-FCN | ResNet101 | no | no | no | 76.8 | 2.3 |
| T-CNN | Apr 2016 | RCNN | DeepIDNet+CRAFT | yes | no | no | 73.8 | - |
| DFF | Nov 2016 | R-FCN | ResNet101 | no | yes | yes | 73.0 | 29 |
| TPN | Feb 2017 | TPN | GoogLeNet | yes | no | no | 68.4 | - |
| FGFA | Mar 2017 | R-FCN | ResNet101 | no | yes | yes | 76.3 | 1.4 |
| FGFA + Seq-NMS | Mar 2017 | R-FCN | ResNet101 | no | yes | no | 78.4 | 1.14 |
| D&T | Oct 2017 | R-FCN (15 anchors) | ResNet101 | yes | no | no | 79.8 | 7.09 |
| STMN | Dec 2017 | R-FCN | ResNet101 | no | no | no | 80.5 | - |
| Scale-time-lattice | Apr 2018 | Faster RCNN (15 anchors) | ResNet101 | no | no | no | 79.6 | 20 |
| Scale-time-lattice | Apr 2018 | Faster RCNN (15 anchors) | ResNet101 | no | no | no | 79.0 | 62 |
| SSN (per-frame baseline for STSN) | Mar 2018 | R-FCN | Deformable ResNet101 | no | no | yes | 76.0 | - |
| STSN | Mar 2018 | R-FCN | Deformable ResNet101 | no | no | yes | 78.9 | - |
| STSN + Seq-NMS | Mar 2018 | R-FCN | Deformable ResNet101 | no | no | no | 80.4 | - |
| MANet | Sep 2018 | R-FCN | ResNet101 | no | yes | yes | 78.1 | 5 |
| MANet + Seq-NMS | Sep 2018 | R-FCN | ResNet101 | no | yes | no | 80.3 | - |
| Tracklet-Conditioned Detection | Nov 2018 | R-FCN | ResNet101 | yes | no | yes | 78.1 | - |
| Tracklet-Conditioned Detection + DCNv2 | Nov 2018 | R-FCN | ResNet101 | yes | no | yes | 82.0 | - |
| Tracklet-Conditioned Detection + DCNv2 + FGFA | Nov 2018 | R-FCN | ResNet101 | yes | no | yes | 83.5 | - |
| Seq-Bbox Matching | Feb 2019 | YOLOv3 | darknet53 | no | no | no | 80.9 | 38 |
| Seq-Bbox Matching | Feb 2019 | YOLOv3 | darknet53 | no | no | yes | 78.2 | 38 |
