After a general introduction to neural networks, we explain different recurrent neural network architectures, such as the GRU and the LSTM, and how to train them with Backpropagation Through Time (BPTT).
In this lecture, we provide a general overview of deep learning and neural network architectures, with a focus on Long Short-Term Memory (LSTM) neural networks, which can model sequential data such as time series.
We begin by showing how to train a feed-forward neural network with stochastic gradient descent, deriving the gradient updates with backpropagation.
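To make this concrete, here is a minimal NumPy sketch (our own illustration, not code from the lecture) of one stochastic-gradient-descent step on a one-hidden-layer network, with the backpropagation gradients derived by hand. The variable names (W1, b1, W2, b2), shapes, and learning rate are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression example: input x in R^3, scalar target y.
x = rng.normal(size=(3, 1))
y = np.array([[1.0]])

# One hidden layer with tanh activation, linear scalar output.
W1 = rng.normal(scale=0.1, size=(4, 3)); b1 = np.zeros((4, 1))
W2 = rng.normal(scale=0.1, size=(1, 4)); b2 = np.zeros((1, 1))
lr = 0.1  # learning rate

# Forward pass.
z1 = W1 @ x + b1
h1 = np.tanh(z1)
y_hat = W2 @ h1 + b2
loss = 0.5 * ((y_hat - y) ** 2).item()
print(f"loss before update: {loss:.4f}")

# Backward pass (backpropagation): apply the chain rule layer by layer.
dy = y_hat - y                        # dL/dy_hat for squared-error loss
dW2 = dy @ h1.T
db2 = dy
dh1 = W2.T @ dy                       # propagate the error to the hidden layer
dz1 = dh1 * (1.0 - np.tanh(z1) ** 2)  # tanh'(z) = 1 - tanh(z)^2
dW1 = dz1 @ x.T
db1 = dz1

# Stochastic gradient descent update.
W1 -= lr * dW1; b1 -= lr * db1
W2 -= lr * dW2; b2 -= lr * db2
```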
We then explain how, due to the vanishing gradient problem, this type of neural network has difficulty learning from sequential data. Finally, we demonstrate how recurrent neural networks, and the LSTM architecture in particular, address this issue.
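The following scalar sketch (again our own illustration, with all numbers chosen for the example) contrasts how the backpropagated gradient behaves over many time steps in a vanilla RNN versus along the additive cell-state path of an LSTM: each vanilla-RNN step multiplies the gradient by w * tanh'(z), typically less than 1, so it shrinks exponentially, while the LSTM cell-state path only multiplies by a forget gate that can stay close to 1.

```python
import numpy as np

T = 50    # sequence length
w = 0.9   # recurrent weight of a scalar vanilla RNN h_t = tanh(w * h_{t-1})
f = 0.98  # an assumed forget-gate activation in an LSTM, close to 1

rnn_grad, lstm_grad, h = 1.0, 1.0, 0.5
for t in range(1, T + 1):
    z = w * h
    h = np.tanh(z)
    # Vanilla RNN: one chain-rule factor per step, w * tanh'(z) = w * (1 - h^2).
    rnn_grad *= w * (1.0 - h ** 2)
    # LSTM cell state c_t = f_t * c_{t-1} + ...: along this additive path the
    # gradient is only multiplied by the forget gate f_t.
    lstm_grad *= f
    if t % 10 == 0:
        print(f"t={t:2d}  vanilla RNN grad: {rnn_grad:.2e}   LSTM cell path: {lstm_grad:.2e}")
```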