Brief Review — Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation

Copy-Paste Augments Images for Training

Sik-Ho Tsang
3 min read · Aug 2, 2022


Data-efficiency on the COCO benchmark

Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation, Copy-Paste, by Google Research, Brain Team, UC Berkeley, and Cornell University
2021 CVPR, Over 200 Citations (Sik-Ho Tsang @ Medium)
Instance Segmentation, Data Augmentation, Object Detection

  • Prior studies on Copy-Paste relied on modeling the surrounding visual context for pasting the objects.
  • In this paper, it is found that the simple mechanism of pasting objects randomly is good enough and can provide solid gains.


  1. Copy-Paste
  2. Experimental Results

1. Copy-Paste

1.1. Copy-Paste Overall Procedures

Simple copy and paste method to create new images for training instance segmentation models.
  1. Two images are randomly selected. Random scale jittering and random horizontal flipping are applied on each of them.
  2. Then, a random subset of objects is selected from one of the images and pasted onto the other image.
  3. Lastly, the ground-truth annotations are adjusted accordingly: Fully occluded objects are removed. The masks and bounding boxes of partially occluded objects are updated.
  • Because objects are pasted at random, objects with very different scales, such as giraffes and soccer players, can appear next to each other.
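
The pasting and annotation-update steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `copy_paste` and `mask_to_box`, the `paste_prob` parameter, and the assumption that both images already have the same size (i.e. jittering and flipping were applied beforehand) are all hypothetical choices made for this sketch.

```python
import numpy as np

def mask_to_box(mask):
    """Tight (x_min, y_min, x_max, y_max) box around a non-empty boolean mask."""
    ys, xs = np.nonzero(mask)
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

def copy_paste(src_img, src_masks, dst_img, dst_masks, rng, paste_prob=0.5):
    """Paste a random subset of source objects onto the destination image.

    src_img, dst_img: arrays of shape (H, W, 3) with identical H and W (assumed).
    src_masks, dst_masks: lists of boolean arrays of shape (H, W), one per object.
    paste_prob: hypothetical per-object probability of being pasted.
    Returns the new image, its instance masks, and its bounding boxes.
    """
    # Step 2: select a random subset of the source objects.
    chosen = [m for m in src_masks if rng.random() < paste_prob]

    # The union of the pasted masks decides which pixels come from the source.
    alpha = np.zeros(dst_img.shape[:2], dtype=bool)
    for m in chosen:
        alpha |= m
    out_img = np.where(alpha[..., None], src_img, dst_img)

    # Step 3: adjust the destination annotations.
    kept = []
    for m in dst_masks:
        visible = m & ~alpha          # part of the object that is still visible
        if visible.any():             # fully occluded objects are removed
            kept.append(visible)

    out_masks = kept + chosen
    out_boxes = [mask_to_box(m) for m in out_masks]  # boxes follow updated masks
    return out_img, out_masks, out_boxes
```

Calling `copy_paste(src_img, src_masks, dst_img, dst_masks, np.random.default_rng(0))` returns the composited image together with masks and boxes that already reflect occlusion, so the result can be fed to a training pipeline like any other annotated sample.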

1.2. Scale Jittering

Notation and visualization of the two scale jittering augmentation methods used throughout the paper
  • Standard scale jittering (SSJ) and large scale jittering (LSJ) are used. These methods randomly resize and crop images.
  • Standard Scale Jittering (SSJ) resizes and crops an image with a resize range of 0.8 to 1.25 of the original image size.
  • Large Scale Jittering (LSJ) uses a much wider resize range, from 0.1 to 2.0 of the original image size. LSJ yields significant performance improvements over the standard scale jittering used in many prior works.
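
To make the difference between the two ranges concrete, here is a rough NumPy sketch of scale jittering. Only the two resize ranges come from the text above. The function name `scale_jitter`, the nearest-neighbour resize, the choice to keep the output at the input size, and padding with a constant `pad_value` when the resized image is smaller are assumptions made to keep the example self-contained.

```python
import numpy as np

SSJ_RANGE = (0.8, 1.25)  # standard scale jittering
LSJ_RANGE = (0.1, 2.0)   # large scale jittering

def scale_jitter(image, scale_range, rng, pad_value=0):
    """Randomly resize an (H, W, C) image, then crop or pad back to (H, W, C)."""
    h, w = image.shape[:2]
    scale = float(rng.uniform(*scale_range))
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))

    # Nearest-neighbour resize by index sampling (assumption: keeps the sketch
    # dependency-free; a real pipeline would use a proper interpolation routine).
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    resized = image[rows][:, cols]

    # If the resized image is larger, take a random crop; if it is smaller,
    # place it at the top-left corner and fill the rest with pad_value.
    out = np.full_like(image, pad_value)
    crop_h, crop_w = min(new_h, h), min(new_w, w)
    y0 = int(rng.integers(0, new_h - crop_h + 1))
    x0 = int(rng.integers(0, new_w - crop_w + 1))
    out[:crop_h, :crop_w] = resized[y0:y0 + crop_h, x0:x0 + crop_w]
    return out, scale
```

With `LSJ_RANGE`, an object can shrink to a tenth of its size or double, which is a far more aggressive change than `SSJ_RANGE` allows. In practice the same scale and crop offsets would also have to be applied to the instance masks and boxes.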

1.3. Self-training Copy-Paste

  1. A supervised model with Copy-Paste augmentation is trained on labeled data.
  2. Pseudo labels are generated on unlabeled data.
  3. Ground-truth instances are pasted into both the pseudo-labeled images and the labeled images, and a model is trained on this new data.
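
Steps 2 and 3 can be outlined as a small data-building routine. This is only a sketch of the data flow under stated assumptions: `build_self_training_set`, `teacher_predict`, and `paste_fn` are hypothetical names, the teacher is whatever model step 1 produced, and `paste_fn` stands in for any Copy-Paste routine. Training itself is left out.

```python
import random

def build_self_training_set(labeled, unlabeled_images, teacher_predict, paste_fn, rng):
    """Build the training data for the final model.

    labeled: list of (image, annotations) pairs with ground-truth annotations.
    unlabeled_images: list of images without annotations.
    teacher_predict: hypothetical callable, image -> pseudo annotations.
    paste_fn: hypothetical callable, (src_sample, dst_sample) -> new sample.
    rng: a random.Random instance used to pick paste sources.
    """
    # Step 2: generate pseudo labels on the unlabeled images.
    pseudo_labeled = [(img, teacher_predict(img)) for img in unlabeled_images]

    # Step 3: paste ground-truth instances into pseudo-labeled and labeled images.
    # The paste source is always drawn from the ground-truth set.
    new_data = []
    for dst in pseudo_labeled + labeled:
        src = rng.choice(labeled)
        new_data.append(paste_fn(src, dst))
    return new_data
```

Drawing the pasted instances only from ground truth follows the wording of step 3; how pseudo labels are filtered before use is not described here, so the sketch does not filter them.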

2. Experimental Results

Copy-Paste provides gains that are robust to training configurations


Copy-Paste is additive to large scale jittering augmentation

Copy-Paste outperforms mixup.

Copy-paste works well across a variety of different model architectures, model sizes and image resolutions

With Copy-Paste, the model with an EfficientNet-B7 backbone and FPN achieves a higher AP.

  • Self-training Copy-Paste obtains even better results by using pseudo labels on unlabeled data in the semi-supervised setting.
Comparison with the state-of-the-art models on COCO object detection and instance segmentation


[2021 CVPR] [Copy-Paste]
Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation

Instance Segmentation

2014 … 2021 [PVT, PVTv1] [Copy-Paste] 2022 [PVTv2]

My Other Previous Paper Readings


