# Brief Review — Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting

## ConvLSTM, Convolutions Used Within LSTM

Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting,, by Hong Kong University of Science and Technology, and Hong Kong Observatory

ConvLSTM2015 NIPS, Over 5500 Citations(Sik-Ho Tsang @ Medium)

Forecast Prediction

**ConvLSTM**is proposed where**convolution is added into LSTM**for forecast prediction.

# Outline

**ConvLSTM****Results**

# 1. ConvLSTM

## 1.1. Standard FC-LSTM

**LSTM**as a special RNN structure has proven**stable**and powerful for**modeling long-range dependencies**in various previous studies.**FC-LSTM**may be seen as a multivariate version of LSTM where the input, cell output and states are**all 1D vectors:**

- (Please visit LSTM if interested.)

## 1.2. Proposed ConvLSTM

- An extension of FC-LSTM,
**ConvLSTM**, is proposed, which has**convolutional structures**in both the input-to-state and state-to-state transitions:

- where * is the convolutional operator.
- In this sense,
**FC-LSTM**is actually**a special case of ConvLSTM**with**all features standing on a single cell**. - Zero padding is used in the hidden state.

## 1.3. Encoding-Forecasting Structure

- For
**precipitation nowcasting, the observation at every timestamp**is**a 2D radar echo map**. - If we
**divide the map into tiled non-overlapping patches**and view the pixels inside a patch as its measurements (see the above figure), the nowcasting problem naturally becomes**a spatiotemporal sequence forecasting problem.**

- For the proposed spatiotemporal sequence forecasting problem, the model consists of two networks, an
**encoding network**and a**forecasting network.** - The encoding LSTM
**compresses the whole input sequence into a hidden state tensor**and the forecasting LSTM**unfolds this hidden state to give the final prediction:**

# 2. Results

## 2.1. Moving-MNIST Dataset

**Two handwritten digits bouncing inside a 64×64 patch**. The starting position and velocity direction are chosen uniformly at random and the velocity amplitude is chosen randomly in [3; 5).

ConvLSTM obtains lower loss than FC-LSTM.

The model can

separate the overlapping digits successfullyandpredict the overall motion although the predicted digits are quite blurred.

## 2.2. Radar Echo Dataset

- The radar echo dataset used in this paper is a subset of the
**three-year weather radar intensities collected in Hong Kong from 2011 to 2013**. Since not every day is rainy and our nowcasting target is precipitation, the top 97 rainy days are selected to form the dataset.

- The performance of the
**FC-LSTM**network is**not so good**for this task, which is mainly caused by the**strong spatial correlation in the radar maps**, i.e., the**motion of clouds**is highly consistent in a local region.

ConvLSTM outperforms the optical flow based ROVER algorithm, since it is trained end-to-end for this task andsome complex spatiotemporal patternsin the datasetcan be learned by the nonlinear and convolutional structure of the network.

- Some prediction results of ROVER2 and ConvLSTM are shown above.

ConvLSTM gains high citation. It is amazing that it is partially done by HK Observatory! I always watch their weather forecast!

## Reference

[2015 NIPS] [ConvLSTM]

Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting

## Language Model / Sequence Model

**2007 **… **2015** … [ConvLSTM] **2016 …** **2020 **[ALBERT] [GPT-3] [T5] [Pre-LN Transformer] [MobileBERT] [TinyBERT]