Learn Object Tracking in OpenCV Python with Code Examples

Introduction

In this article, we will implement and compare algorithms for object tracking with the OpenCV Python library. We will first understand what object tracking is and then look at code examples of a few object tracking algorithms available in OpenCV Python, namely the KCF, CSRT, Mean Shift, and Cam Shift algorithms.

What is Object Tracking?

Object tracking is a computer vision task that refers to the process of finding and following the position of a predefined object as it moves across the frames of a video.

Object Tracking vs Object Detection

Beginners sometimes confuse object tracking with object detection and use the two terms interchangeably, but there is a clear difference between the two –

In object detection, we identify an object in a specific frame or scene, which may be just a static image. In object tracking, we follow an object that is in continuous motion across a video. In fact, performing object detection on every frame of a video effectively amounts to object tracking.

Applications of Object Tracking

Object tracking has many interesting and useful applications, some of which are given below –

  • Human-computer interaction
  • Security and surveillance
  • Augmented reality
  • Traffic control
  • Medical imaging
  • Video editing

Types of Object Tracking Algorithms

i) Single Object Tracking

Single object tracking refers to the process of selecting a region of interest (in the initial frame of a video) and tracking the position (i.e. coordinates) of the object in the upcoming frames of the video. We will be covering some of the algorithms used for single object tracking in this article.

Single object tracking example

ii) Multiple Object Tracking (MOT)

Multiple object tracking is the task of tracking more than one object in a video. In this case, the algorithm assigns a unique identifier (ID) to each object detected in a video frame. Subsequently, it identifies and tracks all of these objects across the consecutive frames of the video.

Multiple Object Tracking is a difficult task: a video may contain a large number of objects, the footage itself may be unclear, and the direction of an object’s motion can be ambiguous. It therefore typically relies on single-frame object detection.

Multiple object tracking example
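
Because multiple object tracking builds on per-frame detection, the core extra step is associating each new detection with an existing track ID. Below is a minimal, illustrative sketch of that ID-assignment step using nearest-centroid matching; the function name and threshold are our own and not part of OpenCV.

```python
import numpy as np

def assign_ids(tracks, detections, max_dist=50.0):
    """Match detections to existing tracks by nearest centroid.

    tracks: dict of track id -> last known (x, y) centroid
    detections: list of (x, y) centroids found in the current frame
    Returns the updated tracks; unmatched detections get fresh ids.
    """
    updated = {}
    next_id = max(tracks, default=-1) + 1
    unclaimed = dict(tracks)            # tracks still available to match
    for det in detections:
        if unclaimed:
            # closest remaining track to this detection
            tid = min(unclaimed, key=lambda t: np.hypot(
                det[0] - unclaimed[t][0], det[1] - unclaimed[t][1]))
            if np.hypot(det[0] - unclaimed[tid][0],
                        det[1] - unclaimed[tid][1]) <= max_dist:
                updated[tid] = det      # same object, keep its id
                del unclaimed[tid]
                continue
        updated[next_id] = det          # no close track: new object
        next_id += 1
    return updated

# Frame 1: two objects appear and receive ids 0 and 1
tracks = assign_ids({}, [(10, 10), (100, 100)])
# Frame 2: both moved slightly, so their ids are preserved
tracks = assign_ids(tracks, [(14, 12), (97, 103)])
```

Real MOT systems add motion models and appearance features on top of this, but the principle of carrying IDs between frames is the same.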

Installing the libraries

i) Installing OpenCV

For our purpose we install the opencv-contrib-python library, a community-maintained OpenCV Python package that contains extra modules and implementations beyond the regular OpenCV Python package.

pip install opencv-contrib-python

ii) Installing Numpy

Numpy is an important prerequisite for any computer vision task, and it can be installed as shown below.

pip install numpy

iii) Importing the libraries

Let us import these libraries as shown below.

import cv2
import numpy as np

i) KCF Object Tracking

KCF stands for Kernelized Correlation Filter. It builds on the ideas of two earlier tracking algorithms (the BOOSTING and MIL trackers) and translates the bounding box (the position of the object) using circular shifts. In simple words, the KCF tracker focuses on the direction of change in an image (which could be motion, scaling, or orientation) and generates a probabilistic estimate of the position of the object being tracked.

KCF Object Tracking in OpenCV Python

The KCF object tracker is created with the TrackerKCF_create() function of OpenCV Python. Below is the code along with the explanation.

tracker = cv2.TrackerKCF_create()
video = cv2.VideoCapture('video.mp4')
ok, frame = video.read()

bbox = cv2.selectROI(frame)

tracker.init(frame, bbox)

while True:
    ok, frame = video.read()
    if not ok:
        break
    ok, bbox = tracker.update(frame)
    if ok:
        (x, y, w, h) = [int(v) for v in bbox]
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2, 1)
    else:
        cv2.putText(frame, 'Error', (100, 80), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)
    cv2.imshow('Tracking', frame)
    if cv2.waitKey(1) & 0xFF == 27:
        break
cv2.destroyAllWindows()

Line 1-3: We first initialize the ‘KCF’ tracker object. Next, we initialize the video and then use the ‘read()’ function to fetch the first frame of the video.

Line 5: We call the ‘selectROI’ function on the first frame of the video (fetched above) and store the selected bounding box in the ‘bbox’ variable.

Line 7: We initialize the tracker (using ‘init’) with the frame (in which we selected our region of interest) and position (bbox) of the object to be tracked.

Line 9: Initialize a while loop that loops through the frames of our video.

Line 10: Use the ‘read()’ function on the video object to fetch the next frame of the video along with a flag (‘ok’) that indicates whether the frame was fetched successfully.

Line 11-12: If the flag is false, i.e. a frame could not be fetched (for example, because the video has ended), the execution stops.

Line 13: We call the tracker’s ‘update’ function with each new frame of the loop. It returns two values: a flag that indicates whether tracking succeeded, and the new position of the tracked object in the frame, which is valid only if the flag is true.

Line 14-16: If the ‘ok’ flag is true this block is executed. The position of the object was fetched into the ‘bbox’ variable; here we extract the x, y coordinates and the width and height values. Next, we use the OpenCV ‘rectangle’ function to draw a bounding box around the tracked object in each frame of the video.

Line 17-18: If the tracker is unable to track the selected ROI or runs into an error, this block of code prints ‘Error’ on the video frame.

Line 19: Showing the video frames on a separate window using the ‘cv2.imshow’ function.

Line 20-21: If the user presses the Escape key, execution stops.

Line 22: Use the OpenCV ‘destroyAllWindows()’ function to close all lingering windows if there are any.

Output

KCF implementation

ii) CSRT Object Tracking

CSRT is the OpenCV implementation of CSR-DCF (Channel and Spatial Reliability of Discriminative Correlation Filter). It is a more advanced algorithm that handles changes such as scaling and non-rectangular objects. Essentially, it uses HOG features along with spatial reliability maps (SRM) for object localization and tracking.

CSRT Object Tracking in OpenCV Python

The CSRT object tracker is created with the TrackerCSRT_create() function of OpenCV Python. It can be used with videos just like in the previous section: change the tracker variable to the CSRT one and you are good to go.

tracker = cv2.TrackerCSRT_create()
video = cv2.VideoCapture('video.mp4')
ok, frame = video.read()

bbox = cv2.selectROI(frame)

tracker.init(frame, bbox)

while True:
    ok, frame = video.read()
    if not ok:
        break
    ok, bbox = tracker.update(frame)
    if ok:
        (x, y, w, h) = [int(v) for v in bbox]
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2, 1)
    else:
        cv2.putText(frame, 'Error', (100, 80), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)
    cv2.imshow('Tracking', frame)
    if cv2.waitKey(1) & 0xFF == 27:
        break
cv2.destroyAllWindows()

 

Output

CSRT implementation

iii) Other Object Tracking Algorithms in OpenCV

The OpenCV object tracking API provides a variety of trackers. You can try some of the other tracking algorithms by simply changing the value of the tracker variable.

GOTURN Algorithm

GOTURN is a deep-learning-based tracker; note that it additionally requires its pretrained model files (goturn.prototxt and goturn.caffemodel) to be present in the working directory.

tracker = cv2.TrackerGOTURN_create()

MIL algorithm

tracker = cv2.TrackerMIL_create()

iv) Histogram Density Algorithms for Object Tracking

i) Object tracking using Mean Shift algorithm

Mean Shift is an object tracking algorithm that uses the pixel density of image histograms to track objects. Starting from an initial window, it iteratively shifts toward the region of highest density (the cluster center) until the shift falls below a threshold or the iteration limit is reached. In other words, it runs repeatedly on an image, comparing pixel distributions, until the tracked object (the cluster) is found; if an exact match cannot be found, the area with the maximum match is selected.

MeanShift implementation example
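
Before the OpenCV version, this hill-climbing behavior can be seen in isolation with a toy one-dimensional mean shift; the sample data and bandwidth below are purely illustrative.

```python
import numpy as np

def mean_shift_1d(samples, start, bandwidth=2.0, max_iter=20, eps=1e-3):
    """Toy 1-D mean shift: repeatedly move a point to the mean of the
    samples within `bandwidth`, climbing toward the densest region."""
    x = float(start)
    for _ in range(max_iter):
        nearby = samples[np.abs(samples - x) <= bandwidth]
        new_x = float(nearby.mean())
        if abs(new_x - x) < eps:   # shift below threshold: converged
            return new_x
        x = new_x
    return x

# A dense cluster around 10 plus a few stray samples; starting at 8,
# the window is pulled onto the cluster rather than onto the outliers.
samples = np.array([9.0, 9.5, 10.0, 10.2, 10.8, 20.0, 25.0])
mode = mean_shift_1d(samples, start=8.0)   # converges near 9.9
```

The OpenCV tracker applies the same idea in two dimensions, using the back-projected color histogram as the density being climbed.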

Mean Shift Object Tracker Implementation in OpenCV Python

cap = cv2.VideoCapture('video.mp4')
ret, frame = cap.read()
x, y, w, h = cv2.selectROI(frame)
track_window = (x, y, w, h)
roi = frame[y:y + h, x:x + w]
hsv_roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv_roi, np.array((0., 60., 32.)), np.array((180., 255., 255.)))
roi_hist = cv2.calcHist([hsv_roi], [0], mask, [180], [0, 180])
cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)
term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
while True:
    ret, frame = cap.read()
    if ret:
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        dst = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)
        ret, track_window = cv2.meanShift(dst, track_window, term_crit)
        x, y, w, h = track_window
        img2 = cv2.rectangle(frame, (x, y), (x + w, y + h), 255, 2)
        cv2.imshow('img2', img2)
        k = cv2.waitKey(30) & 0xFF
        if k == 27:
            break
    else:
        break
cv2.destroyAllWindows()

Line 1-3: Load the video (on which tracking is to be performed) and fetch its first frame into the ‘frame’ variable. Then use the ‘selectROI’ function with the first frame and store the position of the object in the x, y, w, h variables. This lets us select our ROI manually instead of hard-coding it.

Line 4-5: Here we set up the initial location of the tracking window by using the values of the ROI that the user provided us. We initialize the ‘roi’ variable which holds the part of the image to be tracked.

Line 6-9: First we convert our ROI to the HSV colorspace, then build a mask that keeps only the pixels whose HSV values fall inside the given range. We then compute the hue histogram of the ROI with ‘calcHist’ and, lastly, normalize the histogram values to the range 0–255.

Line 10: We define a termination criterion (passed as an argument to the ‘meanShift’ function): stop after at most 10 iterations, or earlier if the window center moves by less than 1 pixel between iterations.

Line 11: Run a while loop to loop through the video.

Line 12: Use the ‘read()’ function on the video object to fetch the next frame.

Line 13: If the flag variable ‘ret’ is true execute the code inside the ‘if’ statement.

Line 14-15: As we did for the ROI, we convert each consecutive frame to the HSV colorspace before tracking. Next, as discussed in the definition, the algorithm uses back-projection, so we call the ‘calcBackProject()’ function to turn the frame into a probability map based on the ROI histogram.

Line 16: We call the ‘meanShift’ function, which takes the back-projection image, the current window position, and the termination criteria. It returns the number of iterations performed (‘ret’) and the new position of the window (‘track_window’), which we use as the bounding box.

Line 17-19: First we update the position of the tracked object, storing it in x, y, w, h. Next, we draw a bounding box around the object (using the ‘rectangle’ function) and finally show the image (using ‘imshow’).

Line 20-22: This code block makes sure that if the user presses the Escape key the execution stops.

Line 23-24: If the flag variable ‘ret’ is false, the execution flow breaks out of the loop.

Line 25: Closes all windows.

Output

Meanshift implementation

ii) Object Tracking using Cam Shift algorithm

The problem with the Mean Shift algorithm is that the size of the bounding box always remains the same, even when the object approaches the camera and grows in the frame (or vice versa). Continuously Adaptive Mean Shift, or CamShift, solves this problem: it applies mean shift until convergence and then updates the size of the search window as

s = 2 × √(M00 / 256)

where M00 is the zeroth moment (the sum of the back-projected probability values inside the window). It also finds the best-fitting ellipse for the object and returns its minor and major axes along with its orientation.
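
As a quick worked example of that scaling rule (taking M to be the zeroth moment M00 of the back-projection inside the window, as in OpenCV's CamShift):

```python
import math

def camshift_side(m00):
    """CamShift window side length from the zeroth moment of the
    back-projection inside the current window: s = 2 * sqrt(M00 / 256)."""
    return 2 * math.sqrt(m00 / 256)

# A larger object covers more high-probability pixels, so its moment grows
# and the search window expands accordingly:
small = camshift_side(65536)    # 2 * sqrt(256)  = 32.0
large = camshift_side(262144)   # 2 * sqrt(1024) = 64.0
```

Quadrupling the moment doubles the window side, which is what lets the box grow and shrink with the object.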

CamShift implementation example

Cam Shift Object Tracker Implementation in OpenCV Python

cap = cv2.VideoCapture('video.mp4')
ret, frame = cap.read()
x, y, w, h = cv2.selectROI(frame)
track_window = (x, y, w, h)
roi = frame[y:y + h, x:x + w]
hsv_roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv_roi, np.array((0., 60., 32.)), np.array((180., 255., 255.)))
roi_hist = cv2.calcHist([hsv_roi], [0], mask, [180], [0, 180])
cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)
term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
while True:
    ret, frame = cap.read()
    if ret:
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        dst = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)
        ret, track_window = cv2.CamShift(dst, track_window, term_crit)
        pts = cv2.boxPoints(ret)
        pts = np.intp(pts)
        img2 = cv2.polylines(frame, [pts], True, 255, 2)
        cv2.imshow('img2', img2)
        k = cv2.waitKey(30) & 0xFF
        if k == 27:
            break
    else:
        break
cv2.destroyAllWindows()

This implementation of CamShift is similar to the Mean Shift implementation; we just swap the ‘meanShift’ call for ‘CamShift’ and draw the result differently.

Line 16-19: This time the tracked region need not be an upright rectangle: ‘CamShift’ returns a rotated rectangle in ‘ret’. We convert it to its four corner points with ‘boxPoints’, cast them to integers, and draw them with the ‘polylines’ function.

Output

Camshift implementation

 

  • Gaurav Maindola