Written as part of an invited talk at the Aspects of Neuroscience BrainHack 2017, Warsaw, Poland.
Sebastian Ruder has written an excellent blog post [1] and paper [2] covering many different optimisation algorithms, but I couldn't find any accompanying code. I wanted to try the algorithms on other cost functions, tweak the parameters and observe first-hand how the different methods compare; hence this notebook.
By Charl Linssen, Nov 19th, 2017.
Released under the CC0 1.0 Universal ("public domain") licence.
[1]: "An overview of gradient descent optimization algorithms", Sebastian Ruder, URL: http://ruder.io/optimizing-gradient-descent/ (retrieved Nov 19th, 2017)
[2]: Sebastian Ruder (2016). An overview of gradient descent optimization algorithms. arXiv preprint arXiv:1609.04747