Info coming from Bengio
Recurrent Nets
[1] Learning long-term dependencies with gradient descent is difficult
[2] Advances in Optimizing Recurrent Networks
[3] Learning recurrent neural networks with Hessian-free optimization
[4] On the importance of momentum and initialization in deep learning
[5] Long short-term memory (Hochreiter & Schmidhuber)
[6] Generating Sequences With Recurrent Neural Networks
[7] Long Short-Term Memory in Echo State Networks: Details of a Simulation Study
[8] The "echo state" approach to analysing and training recurrent neural networks
[9] Backpropagation-Decorrelation: online recurrent learning with O(N) complexity
[10] New results on recurrent network training: Unifying the algorithms and accelerating convergence
[11] Audio Chord Recognition with Recurrent Neural Networks
Convolutional Nets
[1] Generalization and Network Design Strategies (LeCun)
[2] ImageNet Classification with Deep Convolutional Neural Networks, Alex Krizhevsky, Ilya Sutskever, Geoffrey E Hinton, NIPS 2012.
[3] On Random Weights and Unsupervised Feature Learning
Optimization issues with Deep Learning
[2] Evolving Culture vs Local Minima
[3] Knowledge Matters: Importance of Prior Information for Optimization
[5] Practical recommendations for gradient-based training of deep architectures
[6] Natural Gradient Works Efficiently in Learning (Amari 1998)
[7] Hessian Free
[8] Natural Gradient (TONGA)