Skip to content

epfml/grad-norm-smooth

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Supplementary code for "Gradient-Normalized Smoothness for Optimization with Approximate Hessians"

arXiv

This code comes jointly with reference:

Andrei Semenov, Martin Jaggi, Nikita Doikov.

Date: June 2025

Structure

src/
    methods.py         # Algorithm 1 from the paper, algorithms with other adaptive search schemes, gradient methods
    oracles.py         # LogSumExp, Nonlinear Equations with linear operator and Chebyshev polynomials,  Rosenbrock function, etc.
    approximations.py  # code for Hessian approximations for different oracles
    utils.py           # code for plotting graphs
    data/
        mushrooms.txt  # example of a dataset; you can add here more
notebooks/
    examples.ipynb     # examples of approximations and comparison of methods

Quickstart

Simply run the examples.ipynb notebook. At the beginning of the notebook, we provide practical approximations for each oracle. All of them are compatible with our theory. In particular, we investigated the following approximations.

Problem Naming in the paper Approximation Code reference in src/approximations.py
LogSumExp Weighted Gauss-Newton $\frac{1}{\mu}\mathbf{A}^\top Diag\left(\mathrm{softmax}\left(\mathbf{A}, x\right)\right)\mathbf{A}$ approx_hess_fn_logsumexp
Equations with linear operator Fisher Term of $\mathbf{H}$ $\frac{p-2}{\lVert u(x) \rVert^p} \nabla f(x) \nabla f(x)^\top$ approx_hess_fn_fisher_term
Nonlinear Equations & Rosenbrock Inexact Hessian $\lVert u(x)\rVert^{p - 2} \nabla u(x)^\top \mathbf{B} \nabla u(x) + \frac{p - 2}{\lVert u(x) \rVert^p} \nabla f(x) \nabla f(x)^{\top}$ approx_hess_nonlinear_equations
Nonlinear Equations & Chebyshev polynomials Inexact Hessian $\lVert u(x) \rVert^{p - 2} \nabla u(x)^\top \mathbf{B} \nabla u(x) + \frac{p - 2}{\lVert u(x) \rVert^p} \nabla f(x) \nabla f(x)^{\top}$ approx_hess_fn_chebyshev

You can also use a fast implementation of our algorithm, which corresponds to grad_norm_smooth_for_rank_one function in examples.py. Thus, you could obtain the following nice examples:

Fisher Fisher time

We believe the details provided are clear enough to reproduce the main findings of our paper.

@misc{semenov2025gradientnormalizedsmoothnessoptimizationapproximate,
      title={Gradient-Normalized Smoothness for Optimization with Approximate Hessians}, 
      author={Andrei Semenov and Martin Jaggi and Nikita Doikov},
      year={2025},
      eprint={2506.13710},
      archivePrefix={arXiv},
      primaryClass={math.OC},
      url={https://arxiv.org/abs/2506.13710}, 
}

About

Official implementation of "Gradient-Normalized Smoothness for Optimization with Approximate Hessians"

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •