Review — GIoU: Generalized Intersection over Union

Generalized IoU for Object Detection

Two sets of examples, (a) and (b), in which the IoU and GIoU values are very different.

Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression
GIoU, by Stanford University, The University of Adelaide, and Aibee Inc.
2019 CVPR, Over 1200 Citations (Sik-Ho Tsang @ Medium)
Object Detection

  • The weakness of Intersection over Union (IoU) is addressed, and Generalized IoU (GIoU) is proposed.
  • YOLOv3, Faster R-CNN, and Mask R-CNN trained using GIoU obtain better performance.

Outline

  1. Intersection over Union (IoU)
  2. Generalized IoU (GIoU)
  3. Experimental Results

1. Intersection over Union (IoU)

  • Intersection over Union (IoU), for comparing the similarity between two arbitrary shapes (volumes) A, B ⊆ S ∈ ℝ^n, is attained by: IoU = |A ∩ B| / |A ∪ B|.
  • IoU can be converted into a distance, e.g. L_IoU = 1 − IoU, which is a metric [9].
  • L_IoU fulfills all properties of a metric such as non-negativity, identity of indiscernibles, symmetry and triangle inequality.
  • IoU is invariant to the scale of the problem. This means that the similarity between two arbitrary shapes A and B is independent of the scale of their space S.

However, IoU has a major weakness: if |A ∩ B| = 0, then IoU(A, B) = 0. In this case, IoU does not reflect whether the two shapes are in the vicinity of each other or very far from each other.
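This weakness is easy to demonstrate with a few lines of code. The sketch below (my own illustrative function, not the authors' code) computes IoU for axis-aligned boxes given as (x1, y1, x2, y2) tuples: any two disjoint boxes score exactly 0, whether they nearly touch or are far apart.

```python
def iou(a, b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Intersection rectangle (empty if the boxes are disjoint)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union

# A nearby miss and a distant miss both give IoU = 0.0,
# so IoU provides no gradient signal about how far off a box is.
near = iou((0, 0, 2, 2), (3, 0, 5, 2))    # 0.0
far = iou((0, 0, 2, 2), (30, 0, 32, 2))   # 0.0
```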

  • A general extension to IoU, namely Generalized Intersection over Union (GIoU), is proposed.

2. Generalized IoU (GIoU)

Generalized Intersection over Union

For two arbitrary convex shapes (volumes) A, B ⊆ S ∈ ℝ^n, we first find the smallest convex shape C ⊆ S ∈ ℝ^n enclosing both A and B.

Then we compute the ratio between the volume (area) occupied by C excluding A and B and the total volume (area) occupied by C, and subtract this ratio from the IoU value: GIoU(A, B) = IoU(A, B) − |C \ (A ∪ B)| / |C|.

  • Similar to IoU, GIoU can be used as a distance, e.g. L_GIoU = 1 − GIoU, which holds all properties of a metric such as non-negativity, identity of indiscernibles, symmetry and triangle inequality.
  • Similar to IoU, GIoU is invariant to the scale of the problem.
  • GIoU is always a lower bound for IoU, i.e. GIoU(A, B) ≤ IoU(A, B). (Please read the paper for the proof.)
  • Similar to IoU, the value 1 occurs only when two objects overlap perfectly.
  • The GIoU value asymptotically converges to −1 when the ratio between the region occupied by the two shapes, |A ∪ B|, and the volume (area) of the enclosing shape, |C|, tends to zero.
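The definition above translates directly into code. The sketch below (again my own illustrative function, for axis-aligned boxes only) shows how GIoU fixes the weakness demonstrated earlier: for two disjoint boxes it is negative, and it becomes more negative the farther apart the boxes are.

```python
def giou(a, b):
    """Generalized IoU of two axis-aligned boxes (x1, y1, x2, y2)."""
    # IoU part
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # Smallest enclosing box C
    cx1, cy1 = min(a[0], b[0]), min(a[1], b[1])
    cx2, cy2 = max(a[2], b[2]), max(a[3], b[3])
    area_c = (cx2 - cx1) * (cy2 - cy1)
    # GIoU = IoU - |C \ (A U B)| / |C|
    return inter / union - (area_c - union) / area_c

# Unlike IoU (0 in both cases), GIoU distinguishes the two misses:
near = giou((0, 0, 2, 2), (3, 0, 5, 2))    # -0.2
far = giou((0, 0, 2, 2), (30, 0, 32, 2))   # -0.875
```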

In summary, this generalization keeps the major properties of IoU while rectifying its weakness. Therefore, GIoU can be a proper substitute for IoU in all performance measures used in 2D/3D computer vision tasks.

3. Experimental Results

3.1. YOLOv3

  • To train YOLOv3 using the IoU and GIoU losses, the bounding box regression MSE loss is simply replaced with the L_IoU and L_GIoU losses.
  • Only a very minimal effort was made to regularize these new regression losses against the MSE classification loss.
Results are reported on the test set of PASCAL VOC 2007, 5K images from the 2014 validation set of MS COCO, and the test set of MS COCO 2018.

The results show a consistent improvement in performance for YOLOv3 when it is trained using L_GIoU as the regression loss.

The classification loss and accuracy (average IoU) against training iterations when YOLOv3 was trained using its standard (MSE) loss as well as the L_IoU and L_GIoU losses:
  • (a): The localization accuracy for YOLOv3 significantly improves when the L_GIoU loss is used.
  • (b): However, with the current naïve tuning of the regularization parameters balancing the bounding box loss against the classification loss, the classification scores may not be optimal compared to the baseline.

3.2. Faster R-CNN and Mask R-CNN

  • To train Faster R-CNN and Mask R-CNN using the IoU and GIoU losses, their ℓ1-smooth loss in the final bounding box refinement stage is replaced with the L_IoU and L_GIoU losses.
  • The L_IoU and L_GIoU losses are simply multiplied by a factor of 10 for all experiments.
mAP value against different IoU thresholds

Incorporating L_IoU as the regression loss can slightly improve the performance of Faster R-CNN on this benchmark.

Faster R-CNN on Test set of MS COCO 2018
Mask R-CNN on Test set of MS COCO 2018

Training Faster R-CNN and Mask R-CNN using L_GIoU as the bounding box regression loss consistently improves their performance compared to their own regression loss (ℓ1-smooth).

3.3. Qualitative Results

Example results from COCO validation using YOLOv3 trained with (left to right) the L_GIoU, L_IoU, and MSE losses. Ground truth is shown with a solid line and predictions with dashed lines.
Two example results from COCO validation using Mask R-CNN trained with (left to right) the L_GIoU, L_IoU, and ℓ1-smooth losses. Ground truth is shown with a solid line and predictions with dashed lines.
