Brief Review — Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation

Copy-Paste Augments Images for Training

3 min read · Aug 2, 2022

**Data-efficiency on the COCO benchmark**

Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation, Copy-Paste, by Google Research, Brain Team, UC Berkeley, and Cornell University
2021 CVPR, Over 200 Citations (Sik-Ho Tsang @ Medium)
Instance Segmentation, Data Augmentation, Object Detection

Prior studies on Copy-Paste relied on modeling the surrounding visual context for pasting the objects.
This paper finds that the simple mechanism of pasting objects randomly, without modeling the surrounding context, is good enough and provides solid gains.

Outline

1. Copy-Paste
2. Experimental Results

1. Copy-Paste

1.1. Copy-Paste Overall Procedures

**Simple copy and paste method to create new images for training instance segmentation models.**

Two images are randomly selected. Random scale jittering and random horizontal flipping are applied on each of them.
Then, a random subset of objects is selected from one of the images and pasted onto the other image.
Lastly, the ground-truth annotations are adjusted accordingly: Fully occluded objects are removed. The masks and bounding boxes of partially occluded objects are updated.

Because objects are pasted without regard to context, giraffes and soccer players at very different scales can appear next to each other.
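The pasting and annotation-update steps above can be sketched in NumPy. This is a minimal illustration, not the authors' implementation: the names `copy_paste`, `mask_to_box`, and `p_select` are hypothetical, both images are assumed to already have the same size (i.e. after jittering and flipping), and the compositing is hard-edged with no blending of the pasted boundaries.

```python
import numpy as np


def mask_to_box(mask):
    """Tight (x1, y1, x2, y2) box around a non-empty boolean mask."""
    ys, xs = np.nonzero(mask)
    return xs.min(), ys.min(), xs.max() + 1, ys.max() + 1


def copy_paste(src_img, src_masks, dst_img, dst_masks, rng, p_select=0.5):
    """Paste a random subset of source instances onto the destination image.

    src_img, dst_img: (H, W, 3) arrays of the same size.
    src_masks, dst_masks: (N, H, W) boolean instance masks.
    p_select is an assumed per-object selection probability.
    """
    # Select a random subset of source objects.
    keep = rng.random(len(src_masks)) < p_select
    pasted = src_masks[keep]

    # Union of the selected masks: pixels that get overwritten.
    alpha = pasted.any(axis=0)
    out_img = np.where(alpha[..., None], src_img, dst_img)

    # Update destination masks: occluded pixels are no longer visible.
    dst_visible = dst_masks & ~alpha
    masks = np.concatenate([dst_visible, pasted], axis=0)

    # Remove fully occluded objects, then recompute boxes from the masks.
    masks = masks[masks.any(axis=(1, 2))]
    boxes = np.array([mask_to_box(m) for m in masks]).reshape(-1, 4)
    return out_img, masks, boxes
```

Calling `copy_paste(src_img, src_masks, dst_img, dst_masks, np.random.default_rng(0))` returns the composited image together with the surviving masks and their recomputed boxes. Deriving the boxes from the updated masks keeps the two annotations consistent for partially occluded objects.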

1.2. Scale Jittering

**Notation and visualization of the two scale jittering augmentation methods used throughout the paper**

Standard scale jittering (SSJ) and large scale jittering (LSJ) are used. These methods randomly resize and crop images.
Standard Scale Jittering (SSJ) resizes and crops an image with a resize range of 0.8 to 1.25 of the original image size.
The resize range in Large Scale Jittering (LSJ) is from 0.1 to 2.0 of the original image size. LSJ yields significant performance improvements over the standard scale jittering used in much prior work.
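As a rough illustration of the resize-and-crop idea, the sketch below samples a scale factor from the given range, resizes the image, and then crops or pads it back to the original size. The function name `scale_jitter`, the nearest-neighbour resize, the top-left placement of padded images, and the `pad_value` parameter are all assumptions made to keep the example NumPy-only; they are not taken from the paper.

```python
import numpy as np


def scale_jitter(img, rng, scale_range=(0.1, 2.0), pad_value=0):
    """Randomly resize img, then crop or pad it back to its original size.

    scale_range=(0.8, 1.25) corresponds to SSJ and (0.1, 2.0) to LSJ.
    Returns the jittered image and the sampled scale factor.
    """
    h, w = img.shape[:2]
    s = rng.uniform(*scale_range)
    new_h = max(1, int(round(h * s)))
    new_w = max(1, int(round(w * s)))

    # Nearest-neighbour resize by index lookup (assumed for simplicity).
    rows = np.minimum((np.arange(new_h) / s).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / s).astype(int), w - 1)
    resized = img[rows][:, cols]

    # If the resized image is larger than the target, take a random crop.
    top = rng.integers(0, max(new_h - h, 0) + 1)
    left = rng.integers(0, max(new_w - w, 0) + 1)
    crop = resized[top:top + h, left:left + w]

    # If it is smaller, pad the remainder with a constant value.
    out = np.full_like(img, pad_value)
    out[:crop.shape[0], :crop.shape[1]] = crop
    return out, s
```

For instance segmentation, the same scale factor and crop offsets would also have to be applied to the instance masks; that bookkeeping is left out here to keep the sketch short.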

1.3. Self-training Copy-Paste

A supervised model with Copy-Paste augmentation is trained on labeled data.
This model then generates pseudo labels on unlabeled data.
Ground-truth instances are pasted into both the pseudo-labeled and the supervised labeled images, and a new model is trained on this combined data.
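The three steps can be laid out as a small pipeline skeleton. Everything here is a hypothetical scaffold: `train`, `predict`, and `paste` are caller-supplied callables standing in for a real training loop, a real inference call, and a Copy-Paste routine, and choosing one random labeled source per target image is an assumption of this sketch.

```python
import random


def augment(dataset, sources, paste, rng):
    """Paste instances from a randomly chosen source sample onto each target."""
    return [paste(rng.choice(sources), target) for target in dataset]


def self_training_copy_paste(labeled, unlabeled, train, predict, paste, seed=0):
    """labeled: list of (image, annotations); unlabeled: list of images.

    train(data) -> model, predict(model, image) -> annotations, and
    paste(source, target) -> sample are hypothetical callables.
    """
    rng = random.Random(seed)

    # Step 1: train a supervised model with Copy-Paste on the labeled data.
    teacher = train(augment(labeled, labeled, paste, rng))

    # Step 2: generate pseudo labels on the unlabeled images.
    pseudo = [(image, predict(teacher, image)) for image in unlabeled]

    # Step 3: paste ground-truth instances into both pseudo-labeled and
    # labeled images, then train a new model on the combined data.
    return train(augment(pseudo + labeled, labeled, paste, rng))
```

Note that only ground-truth instances from the labeled set are used as paste sources, while both pseudo-labeled and labeled images serve as paste targets, matching the description above.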