Training From Scratch Not Worse Than ImageNet Pre-Training

A ResNet-50-FPN model using GN, trained from random initialization, needs more iterations to converge, but converges to a solution that is no worse than its fine-tuned counterpart.
  • Training from random initialization is surprisingly robust; the results hold even when (i) using only 10% of the training data, (ii) for deeper and wider models, and (iii) for multiple tasks and metrics.
  • ImageNet pre-training speeds up convergence early in training, but does not necessarily provide regularization or improve final target task accuracy.


Video Frame Interpolation Using ConvLSTM for Camera Tampering Detection

Surveillance Cameras on Moving Train
  • Video frame interpolation using a ConvLSTM is used to predict a video frame.
  • The predicted frame is compared with the current frame to decide whether a (cyber or physical) tamper has occurred.
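
The compare-and-decide step in the second bullet can be sketched in a few lines. This is only an illustrative sketch: the function name, the use of mean squared error, and the threshold value are assumptions for the example, not the paper's actual decision rule.

```python
import numpy as np

def detect_tamper(predicted, current, mse_threshold=500.0):
    """Flag a possible tamper when the current frame deviates strongly
    from the predicted frame. `mse_threshold` is an illustrative value,
    not taken from the paper."""
    diff = predicted.astype(np.float64) - current.astype(np.float64)
    return np.mean(diff ** 2) > mse_threshold

# Toy usage: an all-dark "covered" frame vs. a frame close to the prediction.
pred = np.full((4, 4), 128.0)
normal = pred + np.random.default_rng(0).normal(0, 2, (4, 4))
covered = np.zeros((4, 4))
print(detect_tamper(pred, normal))   # small error -> False
print(detect_tamper(pred, covered))  # large error -> True
```

In practice the predicted frame would come from the ConvLSTM interpolator, and the decision rule would be tuned on labelled tampering data rather than a fixed constant.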

Outline

  1. Cyber Attacks and Physical Tampering Against Surveillance Cameras
  2. Proposed Video Frame Interpolation Using ConvLSTM
  3. Experimental Results

1. Cyber Attacks and Physical Tampering Against Surveillance Cameras

  • Surveillance cameras, like all Internet of Things (IoT) devices, are also at risk from a…


Attention Branch for Various Kernel Sizes, Outperforms SENet

  • A building block called Selective Kernel (SK) unit is designed, in which multiple branches with different kernel sizes are fused using softmax attention that is guided by the information in…
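
The fusion step of the bullet above can be sketched numerically: per-channel softmax attention across branches decides how much of each kernel's output to keep. This is a minimal sketch, not the full SK unit — a real SK unit computes the attention logits from the globally pooled, fused features with fully-connected layers; here they are passed in directly.

```python
import numpy as np

def sk_fuse(branches, logits):
    """Fuse per-branch features (e.g. outputs of 3x3 and 5x5 kernels,
    globally pooled to one vector per branch) with a softmax over
    branches, computed independently for each channel."""
    branches = np.stack(branches)            # (num_branches, C)
    logits = np.stack(logits)                # (num_branches, C)
    e = np.exp(logits - logits.max(axis=0))  # numerically stable softmax
    attn = e / e.sum(axis=0)                 # softmax across branches
    return (attn * branches).sum(axis=0)     # weighted sum -> (C,)

# Two branches over 4 channels, with equal (zero) attention logits.
u3 = np.array([1.0, 2.0, 3.0, 4.0])
u5 = np.array([4.0, 3.0, 2.0, 1.0])
fused = sk_fuse([u3, u5], [np.zeros(4), np.zeros(4)])
print(fused)  # equal logits -> simple average: [2.5, 2.5, 2.5, 2.5]
```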


Training Autoencoder by Pretraining Restricted Boltzmann Machine (RBM) for Data Visualization

Happy Chinese New Year of the Ox 2021!!

  • An autoencoder is trained to reduce the data dimensions for data visualization.

Outline

  1. Pretraining by Training Restricted Boltzmann Machine (RBM)
  2. Unrolling to Form an Autoencoder
  3. Fine-tuning the Autoencoder
  4. Experimental Results

1. Pretraining by Training Restricted Boltzmann Machine (RBM)
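
As a rough sketch of this pretraining step, one binary RBM layer can be trained with one-step contrastive divergence (CD-1). The code below is illustrative only — biases are omitted, the learning rate is arbitrary, and it is not the paper's exact training schedule.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(W, v0, lr=0.1):
    """One CD-1 update for a binary RBM (biases omitted for brevity).
    v0: batch of visible vectors, shape (batch, visible)."""
    h0 = sigmoid(v0 @ W)                          # up-pass probabilities
    h0_sample = (rng.random(h0.shape) < h0) * 1.0  # sample hidden states
    v1 = sigmoid(h0_sample @ W.T)                 # down-pass reconstruction
    h1 = sigmoid(v1 @ W)                          # up-pass again
    W += lr * (v0.T @ h0 - v1.T @ h1) / len(v0)   # approximate gradient
    return W

W = rng.normal(0, 0.01, (6, 3))        # 6 visible -> 3 hidden units
v = (rng.random((8, 6)) < 0.5) * 1.0   # toy binary batch
for _ in range(5):
    W = cd1_step(W, v)
```

After each layer is pretrained this way, the stacked weights are unrolled into an encoder–decoder (step 2 of the outline) and fine-tuned end-to-end with backpropagation (step 3).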


Camera Tampering Detection Using AlexNet, ResNet & DenseNet

  • A synthetic dataset with over 6.8 million annotated images is proposed.
  • The problem of tampering detection is formulated as a classification problem. Deep learning architectures, such as AlexNet, ResNet, and DenseNet, are used for evaluation.

Outline

  1. Camera Tampering
  2. UHCTD: University of Houston Camera Tampering Detection Dataset
  3. Tampering Synthesis
  4. Tampering Detection as a Classification Problem

1. Camera Tampering

  • There are many types of camera tampering. For example:
  • Covered tampering occurs when the view…


Introducing A Background Map to Solve the Crowding Problem

  • Aspect Maps are introduced for Data with a Mixture of Maps.
  • A Background Map is used to solve the crowding problem.
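
The background-map idea can be sketched by blending a small uniform probability into the pairwise affinities, so that distant pairs keep a floor probability instead of crowding into the centre of the map. The function name, the mixing weight `lam`, and the toy affinities below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def background_q(q, lam=0.1):
    """Mix a uniform 'background' distribution into pairwise affinities q
    (off-diagonal entries of q sum to 1). `lam` is an illustrative value."""
    n_pairs = q.size - len(q)  # number of off-diagonal entries
    u = np.where(np.eye(len(q), dtype=bool), 0.0, 1.0 / n_pairs)
    return (1 - lam) * q + lam * u

# Toy symmetric affinities for 3 points, summing to 1 off the diagonal.
q = np.array([[0.0, 0.3, 0.2],
              [0.3, 0.0, 0.0],
              [0.2, 0.0, 0.0]])
qb = background_q(q)
print(qb[1, 2])  # previously 0, now has a small uniform floor
```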

Outline

  1. Brief Review of SNE & Symmetric SNE
  2. Aspect Maps
  3. UNI-SNE: A Background Map

1. Brief Review of SNE & Symmetric SNE

1.1. SNE

  • To visualize the high dimensional data, we need to map those data to a low dimensional…


Visualizing High-Dimensional Data in Low-Dimensional Space

t-SNE on the MNIST dataset (from a Google TechTalk by the first author, Laurens van der Maaten, https://www.youtube.com/watch?v=RJVL80Gg3lA)
  • t-SNE is proposed; compared with SNE, it is much easier to optimize.
  • t-SNE reduces the crowding problem compared with SNE.
  • t-SNE has been used in various fields for data visualization.
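
The part of t-SNE that eases the crowding problem is its heavy-tailed low-dimensional distribution: a Student t-distribution with one degree of freedom lets moderately dissimilar points sit far apart in the map. A minimal sketch of those low-dimensional affinities (only this one ingredient of t-SNE, not the full algorithm):

```python
import numpy as np

def tsne_q(Y):
    """Low-dimensional affinities q_ij of t-SNE: a Student t-distribution
    with one degree of freedom over pairwise squared distances,
    normalized over all pairs."""
    d2 = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    num = 1.0 / (1.0 + d2)     # heavy-tailed kernel
    np.fill_diagonal(num, 0.0)  # q_ii is defined as 0
    return num / num.sum()

Y = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
Q = tsne_q(Y)
print(Q.sum())  # ~1.0, since q is normalized over all pairs
```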

Outline

  1. Dimensionality Reduction for Data Visualization
  2. Brief Review of SNE
  3. t-Distributed Stochastic Neighbor Embedding (t-SNE)
  4. Experimental…


High-Dimensional Data Mapping to Low-Dimensional Space

  • A probabilistic approach, based on Gaussian-weighted pairwise dissimilarities, is used to map the high-dimensional data distribution onto a low-dimensional space, e.g. a 2D space, for data visualization.
  • The cost function, a sum of Kullback-Leibler divergences, is minimized using gradient descent.
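
The two bullets above can be sketched in a few lines: Gaussian conditional probabilities in the high-dimensional space, and a sum of KL divergences as the cost. A single fixed σ is used here for brevity — real SNE chooses a per-point σᵢ to match a target perplexity — so treat this as an illustrative sketch only.

```python
import numpy as np

def sne_p(X, sigma=1.0):
    """Conditional probabilities p_{j|i} of SNE, with a fixed Gaussian
    bandwidth for simplicity (SNE tunes sigma_i per point)."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    P = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(P, 0.0)                 # p_{i|i} = 0
    return P / P.sum(axis=1, keepdims=True)  # normalize each row

def kl_cost(P, Q, eps=1e-12):
    """Sum of Kullback-Leibler divergences KL(P_i || Q_i), the SNE cost
    that gradient descent minimizes over the low-dimensional points."""
    return np.sum(P * np.log((P + eps) / (Q + eps)))

X = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 5.0, 0.0]])
P = sne_p(X)
print(kl_cost(P, P))  # identical distributions -> cost 0.0
```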

Outline

  1. Basic Stochastic Neighbor Embedding (SNE)
  2. Experimental Results

1. Basic Stochastic Neighbor Embedding (SNE)


Some challenging cases for defocus blur detection
  • A novel recurrent residual refinement branch embedded with multiple residual refinement modules (RRMs) is designed.
  • The deep features from different layers are aggregated to learn the residual between the intermediate prediction and the ground truth for each recurrent step in each residual refinement branch.
  • The side outputs of the branches are fused to obtain the final blur detection map.


Examples of Blur Detection
  • A Deep Pyramid Network (DPN) with recurrent Distinction Enhanced Block modules is designed.
  • A new Distinction Enhanced Block (DEB) is introduced to merge the high-level semantic information with the low-level details effectively.
  • A new blur detection dataset (SZU-DB) is constructed.

Outline

  1. DPN: Network Architecture
  2. Distinction Enhanced Block (DEB)
  3. Loss Function
  4. Dataset
  5. Ablation Study
  6. Experimental Results

1. DPN: Network Architecture

Sik-Ho Tsang

PhD, Researcher. I share what I've learnt and done. :) My LinkedIn: https://www.linkedin.com/in/sh-tsang/, My Paper Reading List: https://bit.ly/33TDhxG
