[Paper] P3D: Pseudo-3D Residual Networks (Video Classification & Action Recognition)

Factorized 3D Convolutions, Outperforming Deep Video & C3D

Outline

1. Pseudo-3D (P3D) Convolution

Pseudo-3D (P3D) Convolution blocks.
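
To make the factorization concrete, here is a minimal sketch that counts weights, assuming the standard P3D decomposition of a 3×3×3 convolution into a 1×3×3 spatial convolution followed by a 3×1×1 temporal convolution. The function names and the 64-channel setting are illustrative choices, not taken from the paper, and biases are ignored.

```python
def conv3d_weights(c_in, c_out, kt, kh, kw):
    """Weight count (bias ignored) of a 3D conv with a kt x kh x kw kernel."""
    return c_in * c_out * kt * kh * kw


def full_3d_weights(c_in, c_out, k=3):
    """Weights of a single full k x k x k 3D convolution."""
    return conv3d_weights(c_in, c_out, k, k, k)


def pseudo_3d_weights(c_in, c_out, k=3):
    """Weights of a 1 x k x k spatial conv followed by a k x 1 x 1 temporal conv."""
    spatial = conv3d_weights(c_in, c_out, 1, k, k)
    temporal = conv3d_weights(c_out, c_out, k, 1, 1)
    return spatial + temporal


channels = 64  # illustrative channel width, an assumption for this example
full = full_3d_weights(channels, channels)      # 64 * 64 * 27
pseudo = pseudo_3d_weights(channels, channels)  # 64 * 64 * (9 + 3)
print(full, pseudo, round(pseudo / full, 3))
```

With equal input and output channels, the factorized pair uses 12 weights per channel pair instead of 27, which is under half the size of the full 3D kernel. The spatial part also has the same shape as a 2D 3×3 filter.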

2. P3D ResNet Block Variants

P3D ResNet Block
Model size, speed, and accuracy on UCF101 (split1).
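
As a rough sketch of how the block variants differ, the snippet below writes the three residual connections (P3D-A, P3D-B, P3D-C, as I recall them from the paper) as plain function compositions. `S` stands for the spatial 1×3×3 convolution and `T` for the temporal 3×1×1 convolution. The toy scalar `S` and `T` used here are hypothetical stand-ins only, and the 1×1×1 bottleneck convolutions and ReLUs of the real blocks are omitted.

```python
def p3d_a(x, S, T):
    """Serial: spatial conv feeds the temporal conv, then the residual add."""
    return x + T(S(x))


def p3d_b(x, S, T):
    """Parallel: spatial and temporal convs both act on the input."""
    return x + S(x) + T(x)


def p3d_c(x, S, T):
    """Compromise: serial path plus a shortcut from the spatial output."""
    s = S(x)
    return x + s + T(s)


# Toy stand-ins for the two convolutions (assumptions for illustration only).
def S(v):
    return 2 * v


def T(v):
    return v + 1


x = 1
print(p3d_a(x, S, T), p3d_b(x, S, T), p3d_c(x, S, T))
```

The only difference between the variants is whether the temporal filter sees the input directly, the spatial output, or the spatial output alongside a shortcut from it.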

3. Experimental Results

Top-1 clip-level accuracy and Top-1&5 video-level accuracy on Sports-1M.
Performance comparisons with the state-of-the-art methods on UCF101 (3 splits).
Performance comparisons in terms of Top-1 & Top-3 classification accuracy and mean AP on ActivityNet.
Action similarity labeling performance on the ASLAN benchmark.
Scene recognition accuracy on the Dynamic Scene and YUPENN sets.

PhD, Researcher. I share what I learn. :) Linktree: https://linktr.ee/shtsang for Twitter, LinkedIn, etc.