CRM: Centro De Giorgi
Mathematical and Computational Aspects of Machine Learning

Linear Autoencoder Pretraining for Recurrent Neural Networks

speaker: Antonio Carta (University of Pisa)

abstract: Orthogonal recurrent neural networks address the vanishing gradient problem by parameterizing the recurrent connections with an orthogonal matrix. This class of models is particularly effective at solving tasks that require the memorization of long sequences. We propose an alternative solution based on explicit memorization using linear autoencoders for sequences. We propose an initialization scheme that sets the weights of a recurrent architecture to approximate a linear autoencoder of the input sequences, which can be found with a closed-form solution. We argue that this approach is superior to a random orthogonal initialization because the autoencoder allows the memorization of long sequences even before training. The empirical analysis shows that our approach achieves competitive results against orthogonal models and the LSTM.
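
As a rough illustration of what a closed-form linear autoencoder for sequences can look like, the sketch below uses one SVD-based construction for a single sequence. It is not taken from the talk: the function names (fit_linear_autoencoder, encode, decode), the single-sequence setting, and the choice of NumPy are all assumptions made for the example. The state update is y_t = A x_t + B y_{t-1}, and the matrices A and B are the kind of weights that could then be used to initialize the input and recurrent connections of a recurrent network.

```python
import numpy as np

def fit_linear_autoencoder(x, p):
    """Closed-form fit for one sequence x of shape (n, a); p is the state size.

    Hypothetical helper: builds the matrix whose row t holds the reversed
    history [x_t, x_{t-1}, ..., x_1, 0, ...] and keeps its top-p right
    singular vectors.
    """
    n, a = x.shape
    xi = np.zeros((n, n * a))
    for t in range(n):
        for k in range(t + 1):
            xi[t, k * a:(k + 1) * a] = x[t - k]
    # Rows of vt are the right singular vectors of the history matrix.
    _, _, vt = np.linalg.svd(xi, full_matrices=False)
    u = vt[:p].T                            # shape (n * a, p)
    P = np.zeros((n * a, a))
    P[:a] = np.eye(a)                       # writes the new input into the first block
    R = np.zeros((n * a, n * a))
    R[a:, :-a] = np.eye((n - 1) * a)        # shifts the stored history by one step
    A = u.T @ P                             # input-to-state weights, shape (p, a)
    B = u.T @ R @ u                         # state-to-state weights, shape (p, p)
    return A, B

def encode(x, A, B):
    """Run the linear recurrence y_t = A x_t + B y_{t-1} from a zero state."""
    y = np.zeros(B.shape[0])
    for x_t in x:
        y = A @ x_t + B @ y
    return y

def decode(y, A, B, n):
    """Unroll the final state backwards: x_t = A^T y_t, y_{t-1} = B^T y_t."""
    out = []
    for _ in range(n):
        out.append(A.T @ y)
        y = B.T @ y
    return np.array(out[::-1])

# Illustrative usage on random data (not data from the talk).
rng = np.random.default_rng(0)
x = rng.standard_normal((6, 2))
A, B = fit_linear_autoencoder(x, p=6)
x_hat = decode(encode(x, A, B), A, B, n=6)
print("max reconstruction error:", np.abs(x_hat - x).max())
```

In this sketch the reconstruction is exact only when p equals the rank of the history matrix; a smaller p gives a truncated, approximate memory. Fitting several sequences at once would require stacking their history matrices, which is left out here.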

Wed 9 Oct, 16:20 - 16:40, Aula Dini