LaSOT-based benchmark for trackers #2516

Open

ieliz wants to merge 35 commits into opencv:4.x from ieliz:trackers

Conversation

ieliz (Contributor) commented Apr 27, 2020 (edited)

Added a Python 3 script with a benchmark for trackers.
LaSOT paper: https://arxiv.org/abs/1809.07845
TrackingNet paper: https://arxiv.org/abs/1803.10794
TrackingNet repo: https://github.com/SilvioGiancola/TrackingNet-devkit/blob/master/metrics.py
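
The reported IoU, Precision, and N.Precision numbers are areas under threshold curves (success and precision plots), as in the TrackingNet toolkit; a minimal sketch of that reduction, matching what the script below computes (the function name is illustrative):

import numpy as np

def auc_over_thresholds(values, thresholds, higher_is_better=True):
    # Fraction of frames passing each threshold, then the normalized
    # area under the resulting curve; this is the reduction applied to
    # per-frame IoU, precision, and normalized precision values.
    values = np.asarray(values, dtype=float)
    if higher_is_better:   # IoU / success: higher is better
        curve = np.array([(values >= t).mean() for t in thresholds])
    else:                  # center-error precision: lower is better
        curve = np.array([(values <= t).mean() for t in thresholds])
    return np.trapz(curve, x=thresholds) / thresholds[-1]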

For each tracker, a particular re-initialization rate (measured in frames) was used.
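
A minimal sketch of that pattern, assuming the 250-frame rate that the DaSiamRPN script below uses (the rates for the other trackers are not listed in this description):

def needs_reinit(frame_index, rate=250):
    # True on every rate-th frame; the benchmark then rebuilds the
    # tracker from the current ground-truth box.
    return (frame_index + 1) % rate == 0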

UPD 07.07.2020:

Current values for the LaSOT dataset (testing part) on Ubuntu 18.04:
Names:              |IoU:                |Precision:          |N.Precision:
-----------------------------------------------------------------------------------
Boosting            |0.2911              |0.2463              |0.3036
MIL                 |0.2801              |0.2459              |0.2897
KCF                 |0.2298              |0.1907              |0.2456
MedianFlow          |0.2443              |0.2100              |0.2366
CSRT                |0.3316              |0.3158              |0.3755
MOSSE               |0.2329              |0.1845              |0.2364

GOTURN still has some memory issues; the issue has been reported.

DaSiamRPN results for the LaSOT dataset (testing part) on Ubuntu 18.04:
Names:              |IoU:                |Precision:          |N.Precision:
-----------------------------------------------------------------------------------
DaSiamRPN           |0.2337              |0.1701              |0.1950
Version of the benchmark for DaSiamRPN:
import numpy as np
import cv2 as cv
import argparse
import warnings
import os

class DaSiamRPNTracker:
    # DaSiamRPN wrapper around cv.dnn: the constructor runs the template
    # branch once and bakes the resulting kernels into the main network.
    def __init__(self, im, target_pos, target_sz, net, kernel_r1, kernel_cls1):
        self.windowing = "cosine"
        self.exemplar_size = 127
        self.instance_size = 271
        self.total_stride = 8
        self.score_size = (self.instance_size -
                           self.exemplar_size) // self.total_stride + 1
        self.context_amount = 0.5
        self.ratios = [0.33, 0.5, 1, 2, 3]
        self.scales = [8, ]
        self.anchor_num = len(self.ratios) * len(self.scales)
        self.penalty_k = 0.055
        self.window_influence = 0.42
        self.lr = 0.295
        self.im_h = im.shape[0]
        self.im_w = im.shape[1]
        self.target_pos = target_pos
        self.target_sz = target_sz
        self.avg_chans = np.mean(im, axis=(0, 1))
        self.net = net
        self.score = []
        if ((self.target_sz[0] * self.target_sz[1]) / float(self.im_h * self.im_w)) < 0.004:
            warnings.warn(
                "Using an initializing bounding box of that size may cause inaccurate predictions!",
                category=None, stacklevel=1, source=None)
        self.anchor = self.__generate_anchor()
        wc_z = self.target_sz[0] + self.context_amount * sum(self.target_sz)
        hc_z = self.target_sz[1] + self.context_amount * sum(self.target_sz)
        s_z = round(np.sqrt(wc_z * hc_z))
        z_crop = self.__get_subwindow_tracking(im, self.exemplar_size, s_z)
        z_crop = z_crop.transpose(2, 0, 1).reshape(
            1, 3, 127, 127).astype(np.float32)
        self.net.setInput(z_crop)
        z_f = self.net.forward('63')
        kernel_r1.setInput(z_f)
        r1 = kernel_r1.forward()
        kernel_cls1.setInput(z_f)
        cls1 = kernel_cls1.forward()
        r1 = r1.reshape(20, 256, 4, 4)
        cls1 = cls1.reshape(10, 256, 4, 4)
        # Bake the template-dependent kernels into the main network.
        self.net.setParam(self.net.getLayerId('65'), 0, r1)
        self.net.setParam(self.net.getLayerId('68'), 0, cls1)
        if self.windowing == "cosine":
            self.window = np.outer(np.hanning(self.score_size),
                                   np.hanning(self.score_size))
        elif self.windowing == "uniform":
            self.window = np.ones((self.score_size, self.score_size))
        self.window = np.tile(self.window.flatten(), self.anchor_num)

    def __generate_anchor(self):
        self.anchor = np.zeros((self.anchor_num, 4), dtype=np.float32)
        size = self.total_stride * self.total_stride
        count = 0
        for ratio in self.ratios:
            ws = int(np.sqrt(size / ratio))
            hs = int(ws * ratio)
            for scale in self.scales:
                wws = ws * scale
                hhs = hs * scale
                self.anchor[count] = [0, 0, wws, hhs]
                count += 1
        score_sz = int(self.score_size)
        self.anchor = np.tile(self.anchor, score_sz * score_sz).reshape((-1, 4))
        ori = -(score_sz / 2) * self.total_stride
        xx, yy = np.meshgrid([ori + self.total_stride * dx for dx in range(score_sz)],
                             [ori + self.total_stride * dy for dy in range(score_sz)])
        xx, yy = np.tile(xx.flatten(), (self.anchor_num, 1)).flatten(), \
            np.tile(yy.flatten(), (self.anchor_num, 1)).flatten()
        self.anchor[:, 0], self.anchor[:, 1] = xx.astype(np.float32), yy.astype(np.float32)
        return self.anchor

    def track(self, im):
        wc_z = self.target_sz[1] + self.context_amount * sum(self.target_sz)
        hc_z = self.target_sz[0] + self.context_amount * sum(self.target_sz)
        s_z = np.sqrt(wc_z * hc_z)
        scale_z = self.exemplar_size / s_z
        d_search = (self.instance_size - self.exemplar_size) / 2
        pad = d_search / scale_z
        s_x = round(s_z + 2 * pad)
        x_crop = self.__get_subwindow_tracking(im, self.instance_size, s_x)
        x_crop = x_crop.transpose(2, 0, 1).reshape(
            1, 3, 271, 271).astype(np.float32)
        self.score = self.__tracker_eval(x_crop, scale_z)
        # Clamp the prediction to the image bounds.
        self.target_pos[0] = max(0, min(self.im_w, self.target_pos[0]))
        self.target_pos[1] = max(0, min(self.im_h, self.target_pos[1]))
        self.target_sz[0] = max(10, min(self.im_w, self.target_sz[0]))
        self.target_sz[1] = max(10, min(self.im_h, self.target_sz[1]))

    def __tracker_eval(self, x_crop, scale_z):
        target_size = self.target_sz * scale_z
        self.net.setInput(x_crop)
        outNames = ['66', '68']  # regression and classification heads
        delta, score = self.net.forward(outNames)
        delta = np.transpose(delta, (1, 2, 3, 0))
        delta = np.ascontiguousarray(delta, dtype=np.float32)
        delta = np.reshape(delta, (4, -1))
        score = np.transpose(score, (1, 2, 3, 0))
        score = np.ascontiguousarray(score, dtype=np.float32)
        score = np.reshape(score, (2, -1))
        score = self.__softmax(score)[1, :]
        # Decode anchor offsets into box centers and sizes.
        delta[0, :] = delta[0, :] * self.anchor[:, 2] + self.anchor[:, 0]
        delta[1, :] = delta[1, :] * self.anchor[:, 3] + self.anchor[:, 1]
        delta[2, :] = np.exp(delta[2, :]) * self.anchor[:, 2]
        delta[3, :] = np.exp(delta[3, :]) * self.anchor[:, 3]

        def __change(r):
            return np.maximum(r, 1. / r)

        def __sz(w, h):
            pad = (w + h) * 0.5
            sz2 = (w + pad) * (h + pad)
            return np.sqrt(sz2)

        def __sz_wh(wh):
            pad = (wh[0] + wh[1]) * 0.5
            sz2 = (wh[0] + pad) * (wh[1] + pad)
            return np.sqrt(sz2)

        # Penalize large changes in scale and aspect ratio, then mix in
        # the cosine window to favor locations near the previous one.
        s_c = __change(__sz(delta[2, :], delta[3, :]) / __sz_wh(target_size))
        r_c = __change((target_size[0] / target_size[1]) / (delta[2, :] / delta[3, :]))
        penalty = np.exp(-(r_c * s_c - 1.) * self.penalty_k)
        pscore = penalty * score
        pscore = pscore * (1 - self.window_influence) + \
            self.window * self.window_influence
        best_pscore_id = np.argmax(pscore)
        target = delta[:, best_pscore_id] / scale_z
        target_size /= scale_z
        lr = penalty[best_pscore_id] * score[best_pscore_id] * self.lr
        res_x = target[0] + self.target_pos[0]
        res_y = target[1] + self.target_pos[1]
        res_w = target_size[0] * (1 - lr) + target[2] * lr
        res_h = target_size[1] * (1 - lr) + target[3] * lr
        self.target_pos = np.array([res_x, res_y])
        self.target_sz = np.array([res_w, res_h])
        return score[best_pscore_id]

    def __softmax(self, x):
        x_max = x.max(0)
        e_x = np.exp(x - x_max)
        y = e_x / e_x.sum(axis=0)
        return y

    def __get_subwindow_tracking(self, im, model_size, original_sz):
        im_sz = im.shape
        c = (original_sz + 1) / 2
        context_xmin = round(self.target_pos[0] - c)
        context_xmax = context_xmin + original_sz - 1
        context_ymin = round(self.target_pos[1] - c)
        context_ymax = context_ymin + original_sz - 1
        left_pad = int(max(0., -context_xmin))
        top_pad = int(max(0., -context_ymin))
        right_pad = int(max(0., context_xmax - im_sz[1] + 1))
        bottom_pad = int(max(0., context_ymax - im_sz[0] + 1))
        context_xmin += left_pad
        context_xmax += left_pad
        context_ymin += top_pad
        context_ymax += top_pad
        r, c, k = im.shape
        if any([top_pad, bottom_pad, left_pad, right_pad]):
            # Pad with the mean color where the crop leaves the frame.
            te_im = np.zeros((r + top_pad + bottom_pad,
                              c + left_pad + right_pad, k), np.uint8)
            te_im[top_pad:top_pad + r, left_pad:left_pad + c, :] = im
            if top_pad:
                te_im[0:top_pad, left_pad:left_pad + c, :] = self.avg_chans
            if bottom_pad:
                te_im[r + top_pad:, left_pad:left_pad + c, :] = self.avg_chans
            if left_pad:
                te_im[:, 0:left_pad, :] = self.avg_chans
            if right_pad:
                te_im[:, c + left_pad:, :] = self.avg_chans
            im_patch_original = te_im[int(context_ymin):int(context_ymax + 1),
                                      int(context_xmin):int(context_xmax + 1), :]
        else:
            im_patch_original = im[int(context_ymin):int(context_ymax + 1),
                                   int(context_xmin):int(context_xmax + 1), :]
        if not np.array_equal(model_size, original_sz):
            im_patch_original = cv.resize(im_patch_original, (model_size, model_size))
        return im_patch_original


def get_iou(new, gt):
    # Intersection over union of two (xmin, ymin, w, h) boxes.
    new_xmin, new_ymin, new_w, new_h = new
    gt_xmin, gt_ymin, gt_w, gt_h = gt

    def get_max_coord(coord, size):
        return coord + size - 1.0

    new_xmax, new_ymax = get_max_coord(new_xmin, new_w), get_max_coord(new_ymin, new_h)
    gt_xmax, gt_ymax = get_max_coord(gt_xmin, gt_w), get_max_coord(gt_ymin, gt_h)
    dx = max(0, min(new_xmax, gt_xmax) - max(new_xmin, gt_xmin))
    dy = max(0, min(new_ymax, gt_ymax) - max(new_ymin, gt_ymin))
    area_of_overlap = dx * dy
    area_of_union = (new_xmax - new_xmin) * (new_ymax - new_ymin) + \
        (gt_xmax - gt_xmin) * (gt_ymax - gt_ymin) - area_of_overlap
    iou = area_of_overlap / area_of_union if area_of_union != 0 else 0
    return iou


def get_pr(new, gt, is_norm):
    # Center error between prediction and ground truth; optionally
    # normalized by the ground-truth box size (N.Precision).
    new_x, new_y, new_w, new_h = new
    gt_x, gt_y, gt_w, gt_h = gt

    def get_center(coord, size):
        return coord + (size + 1.0) / 2

    new_cx, new_cy = get_center(new_x, new_w), get_center(new_y, new_h)
    gt_cx, gt_cy = get_center(gt_x, gt_w), get_center(gt_y, gt_h)
    dx = new_cx - gt_cx
    dy = new_cy - gt_cy
    if is_norm:
        dx /= gt_w
        dy /= gt_h
    return np.sqrt(dx ** 2 + dy ** 2)


def main():
    parser = argparse.ArgumentParser(
        description="Run LaSOT-based benchmark for DaSiamRPN tracker")
    parser.add_argument("--net", type=str, default="dasiamrpn_model.onnx",
                        help="Full path to onnx model of net")
    parser.add_argument("--kernel_r1", type=str, default="dasiamrpn_kernel_r1.onnx",
                        help="Full path to onnx model of kernel_r1")
    parser.add_argument("--kernel_cls1", type=str, default="dasiamrpn_kernel_cls1.onnx",
                        help="Full path to onnx model of kernel_cls1")
    parser.add_argument("--dataset", type=str,
                        help="Full path to LaSOT folder")
    parser.add_argument("--v", dest="visualization", action='store_true',
                        help="Showing process of tracking")
    args = parser.parse_args()
    trackers = ["DaSiamRPN"]
    cx, cy, w, h = 0.0, 0.0, 0, 0
    net = cv.dnn.readNet(args.net)
    kernel_r1 = cv.dnn.readNet(args.kernel_r1)
    kernel_cls1 = cv.dnn.readNet(args.kernel_cls1)
    video_names = os.path.join(args.dataset, "testing_set.txt")
    with open(video_names, 'rt') as f:
        list_of_videos = f.read().rstrip('\n').split('\n')
    iou_avg = []
    pr_avg = []
    n_pr_avg = []
    for tracker_name in trackers:
        print("Tracker name: ", tracker_name)
        number_of_thresholds = 21
        iou_video = np.zeros(number_of_thresholds)
        pr_video = np.zeros(number_of_thresholds)
        n_pr_video = np.zeros(number_of_thresholds)
        iou_thr = np.linspace(0, 1, number_of_thresholds)
        pr_thr = np.linspace(0, 50, number_of_thresholds)
        n_pr_thr = np.linspace(0, 0.5, number_of_thresholds)
        for video_name in list_of_videos:
            init_once = False
            print("\tVideo name: " + str(video_name))
            gt_file = open(os.path.join(args.dataset, video_name,
                                        "groundtruth.txt"), "r")
            gt_bb = gt_file.readline().rstrip("\n").split(",")
            init_bb = tuple([float(b) for b in gt_bb])
            video_sequence = sorted(os.listdir(os.path.join(
                args.dataset, video_name, "img")))
            iou_values = []
            pr_values = []
            n_pr_values = []
            frame_counter = len(video_sequence)
            for number_of_the_frame, image in enumerate(video_sequence):
                frame = cv.imread(os.path.join(
                    args.dataset, video_name, "img", image))
                gt_bb = tuple([float(x) for x in gt_bb])
                # Skip frames with a degenerate ground-truth box.
                if gt_bb[2] == 0 or gt_bb[3] == 0:
                    gt_bb = gt_file.readline().rstrip("\n").split(",")
                    frame_counter -= 1
                    continue
                # Re-initialize from the ground truth every 250 frames.
                if (number_of_the_frame + 1) % 250 == 0:
                    init_once = False
                    init_bb = gt_bb
                if not init_once:
                    target_pos = np.array([init_bb[0], init_bb[1]])
                    target_sz = np.array([init_bb[2], init_bb[3]])
                    tracker = DaSiamRPNTracker(
                        frame, target_pos, target_sz, net, kernel_r1, kernel_cls1)
                    init_once = True
                tracker.track(frame)
                w, h = tracker.target_sz
                cx, cy = tracker.target_pos
                new_bb = (cx, cy, w, h)
                if args.visualization:
                    new_x, new_y, new_w, new_h = list(map(int, new_bb))
                    cv.rectangle(frame, (new_x, new_y),
                                 (new_x + new_w, new_y + new_h), (200, 0, 0))
                    cv.imshow("Tracking", frame)
                    cv.waitKey(1)
                iou_values.append(get_iou(new_bb, gt_bb))
                pr_values.append(get_pr(new_bb, gt_bb, is_norm=False))
                n_pr_values.append(get_pr(new_bb, gt_bb, is_norm=True))
                gt_bb = gt_file.readline().rstrip("\n").split(",")
            # Per-video curves: the fraction of frames passing each threshold.
            iou_video += np.fromiter(
                (sum(i >= thr for i in iou_values) / frame_counter
                 for thr in iou_thr), dtype=float)
            pr_video += np.fromiter(
                (sum(i <= thr for i in pr_values) / frame_counter
                 for thr in pr_thr), dtype=float)
            n_pr_video += np.fromiter(
                (sum(i <= thr for i in n_pr_values) / frame_counter
                 for thr in n_pr_thr), dtype=float)
        iou_mean_avg = np.array(iou_video) / len(list_of_videos)
        pr_mean_avg = np.array(pr_video) / len(list_of_videos)
        n_pr_mean_avg = np.array(n_pr_video) / len(list_of_videos)
        # Reduce each averaged curve to the normalized area under it.
        iou = np.trapz(iou_mean_avg, x=iou_thr) / iou_thr[-1]
        pr = np.trapz(pr_mean_avg, x=pr_thr) / pr_thr[-1]
        n_pr = np.trapz(n_pr_mean_avg, x=n_pr_thr) / n_pr_thr[-1]
        iou_avg.append('%.4f' % iou)
        pr_avg.append('%.4f' % pr)
        n_pr_avg.append('%.4f' % n_pr)
    titles = ["Names:", "IoU:", "Precision:", "N.Precision:"]
    data = [titles] + list(zip(trackers, iou_avg, pr_avg, n_pr_avg))
    for number, for_tracker in enumerate(data):
        line = '|'.join(str(x).ljust(20) for x in for_tracker)
        print(line)
        if number == 0:
            print('-' * len(line))


if __name__ == "__main__":
    main()
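
A typical invocation, assuming the script above is saved as dasiamrpn_benchmark.py (the name is illustrative) and the three ONNX models sit in the working directory:

python3 dasiamrpn_benchmark.py --dataset /path/to/LaSOTTesting --v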

UPD 29.07.2020:

Results for GOTURN:

Names:              |IoU:                |Precision:          |N.Precision:
-----------------------------------------------------------------------------------
GOTURN              |0.2259              |0.1789              |0.2243

Links to fixes for GOTURN will be provided here soon.

UPD 14.08.2020:

Link to PR with fixes for GOTURN tracker.
Table with all results:

Names:              |IoU:                |Precision:          |N.Precision:
-----------------------------------------------------------------------------------
Boosting            |0.2911              |0.2463              |0.3036
MIL                 |0.2801              |0.2459              |0.2897
KCF                 |0.2298              |0.1907              |0.2456
MedianFlow          |0.2443              |0.2100              |0.2366
CSRT                |0.3316              |0.3158              |0.3755
MOSSE               |0.2329              |0.1845              |0.2364
GOTURN              |0.2259              |0.1789              |0.2243
DaSiamRPN           |0.2337              |0.1701              |0.1950

UPD 07.09.2020:

The pull request with fixes for GOTURN has been merged, together with a test.

@dkurt dkurt changed the base branch from 3.4 to master April 27, 2020 18:49
dkurt (Member) commented May 21, 2020

@l-bat, please join the review

@dkurt dkurt requested a review from l-bat May 21, 2020 12:50
ieliz (Contributor, Author) commented Jun 18, 2020

I noticed one thing while comparing the benchmark results on Ubuntu and Windows 10:
the results differ for the same methods, trackers, and videos (tested with one video and one tracker). I am going to run some more experiments, but I want to ask: could this be related to the OS, or is it something else?

dkurt (Member) commented Jun 22, 2020

@ieliz, first you need to check what exactly differs; before you get the final metric there are many intermediate steps: dataset loading, tracker execution, metric calculation.
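
One way to narrow that down is to dump the raw per-frame metrics on each OS and diff the files before any averaging takes place; a minimal sketch (function names and file layout are illustrative, not part of the benchmark):

import numpy as np

def dump_per_frame_metrics(path, iou_values, pr_values, n_pr_values):
    # Save the raw per-frame metrics so runs from two machines can be
    # compared value by value before any averaging happens.
    np.savez(path, iou=np.asarray(iou_values, dtype=float),
             pr=np.asarray(pr_values, dtype=float),
             n_pr=np.asarray(n_pr_values, dtype=float))

def compare_dumps(path_a, path_b, atol=1e-6):
    # Report the largest difference and the first mismatching frame
    # for each metric stored by dump_per_frame_metrics.
    a, b = np.load(path_a), np.load(path_b)
    for key in ("iou", "pr", "n_pr"):
        diff = np.abs(a[key] - b[key])
        bad = np.flatnonzero(diff > atol)
        first = int(bad[0]) if bad.size else -1
        print(key, "max diff:", float(diff.max()),
              "first mismatch frame:", first)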

# The default argument is the name of the original dataset folder
parser.add_argument("--dataset", type=str,
                    default="LaSOTTesting", help="Full path to LaSOT")
parser.add_argument("--v", dest="visualization", action='store_true',
Member:

Just -v

Contributor (Author):

OK

ieliz mentioned this pull request Jul 8, 2020
ieliz (Contributor, Author) commented Aug 5, 2020

May I change some parts of the benchmark to add DaSiamRPN to the list of trackers, given the changes in the DaSiamRPN tracker sample? For example:

from dasiamrpn_tracker import DaSiamRPNTracker

dkurt (Member) commented Aug 7, 2020 (edited)

> May I change some parts of the benchmark to add DaSiamRPN to the list of trackers, given the changes in the DaSiamRPN tracker sample? For example:
> from dasiamrpn_tracker import DaSiamRPNTracker

Sure! That's exactly what we wanted to do (unify both benchmarks, the one for DaSiamRPN and the one for the rest of the trackers, into one).

dkurt (Member) commented Aug 10, 2020

Please unite all the numbers into a single table (it seems to me that we need to run DaSiamRPN again as a part of the new script).

ieliz (Contributor, Author) commented Aug 10, 2020

I ran the DaSiamRPN tracker as a part of the new benchmark and compared the results with those of the integrated benchmark (in the description of this PR). The results for the 'coin-3' video from the LaSOT dataset are equal.

import argparse
import os

from opencv.samples.dnn.dasiamrpn_tracker import DaSiamRPNTracker
Member:

I'd like to suggest using this construction:

import sys
path = cv.samples.findFile('samples/dnn/dasiamrpn_tracker.py')
sys.path.append(os.path.dirname(path))
from dasiamrpn_tracker import DaSiamRPNTracker
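
For completeness, a self-contained version of that construction (it assumes OpenCV's samples are discoverable by cv.samples.findFile, e.g. via the OPENCV_SAMPLES_DATA_PATH environment variable):

import os
import sys

import cv2 as cv

# Locate the sample file through OpenCV's sample search paths, make its
# directory importable, then import the tracker class from it.
path = cv.samples.findFile('samples/dnn/dasiamrpn_tracker.py')
sys.path.append(os.path.dirname(path))
from dasiamrpn_tracker import DaSiamRPNTracker  # noqa: E402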

Contributor (Author):

OK

asenyaev (Contributor) commented Apr 9, 2021

jenkins cn please retry a build


Reviewers

l-bat left review comments
dkurt left review comments
