Optimization¶
Root finding, unconstrained minimization, and nonlinear least squares.
Requires the optim Cargo feature:
Algorithm Summary¶
| Algorithm | Fixed-size | Dynamic-size | Use case |
|---|---|---|---|
| Brent's method | brent |
— | Bracketed scalar root finding — superlinear convergence |
| Newton 1D | newton_1d |
— | Scalar root finding with analytic derivative |
| BFGS | minimize_bfgs |
minimize_bfgs_dyn |
Unconstrained smooth minimization |
| Gauss-Newton | least_squares_gn |
least_squares_gn_dyn |
Nonlinear least squares (QR-based, full-rank Jacobian) |
| Levenberg-Marquardt | least_squares_lm |
least_squares_lm_dyn |
Nonlinear least squares (damped, more robust) |
The fixed-size routines work on Vector<T, N> / Matrix<T, M, N> and are
no-alloc (suitable for embedded). The _dyn variants take DynVector<T> /
DynMatrix<T> and pick the parameter / residual dimension at runtime — they
require the alloc feature.
Root Finding¶
Brent's Method¶
Bracketed root finding — guaranteed to converge given a sign change.
use numeris::optim::{brent, RootSettings};
// Solve x² - 2 = 0 on [0, 2]
let result = brent(
|x| x * x - 2.0,
0.0_f64, 2.0,
&RootSettings::default(),
).unwrap();
assert!((result.x - std::f64::consts::SQRT_2).abs() < 1e-12);
println!("root = {}, f(root) = {}, iters = {}", result.x, result.fx, result.iterations);
Newton 1D¶
Scalar Newton's method with analytic derivative — quadratic convergence near the root.
use numeris::optim::{newton_1d, RootSettings};
// Solve cos(x) = 0 near x = 1.5
let result = newton_1d(
|x| x.cos(),
|x| -x.sin(),
1.5_f64,
&RootSettings::default(),
).unwrap();
assert!((result.x - std::f64::consts::FRAC_PI_2).abs() < 1e-12);
Settings¶
use numeris::optim::RootSettings;
let settings = RootSettings {
tol: 1e-12, // convergence tolerance on |f(x)| and bracket width
max_iter: 200,
..RootSettings::default()
};
BFGS Minimization¶
Quasi-Newton unconstrained minimization with Armijo backtracking line search. Requires both a function and its gradient. Each step is
where \(\alpha_k\) is the line-search step length and \(H_k \approx (\nabla^2 f)^{-1}\) is the BFGS inverse-Hessian estimate, updated from \(s_k = x_{k+1}-x_k\), \(y_k = \nabla f_{k+1} - \nabla f_k\), and \(\rho_k = 1/(y_k^\top s_k)\):
use numeris::optim::{minimize_bfgs, BfgsSettings};
use numeris::Vector;
// Minimize the Rosenbrock function: (1-x)² + 100(y-x²)²
let result = minimize_bfgs(
|v: &Vector<f64, 2>| {
let x = v[0]; let y = v[1];
(1.0 - x).powi(2) + 100.0 * (y - x * x).powi(2)
},
|v: &Vector<f64, 2>| {
let x = v[0]; let y = v[1];
Vector::from_array([
-2.0 * (1.0 - x) - 400.0 * x * (y - x * x),
200.0 * (y - x * x),
])
},
&Vector::from_array([-1.0, 1.0]), // initial guess
&BfgsSettings::default(),
).unwrap();
assert!((result.x[0] - 1.0).abs() < 1e-5);
assert!((result.x[1] - 1.0).abs() < 1e-5);
Using Finite-Difference Gradient¶
When an analytic gradient is unavailable:
use numeris::optim::{minimize_bfgs, finite_difference_gradient, BfgsSettings};
use numeris::Vector;
let f = |v: &Vector<f64, 2>| (v[0] - 1.0).powi(2) + (v[1] - 2.0).powi(2);
let grad = |v: &Vector<f64, 2>| finite_difference_gradient(f, v);
let result = minimize_bfgs(f, grad, &Vector::from_array([0.0, 0.0]), &BfgsSettings::default()).unwrap();
BfgsSettings¶
use numeris::optim::BfgsSettings;
let settings = BfgsSettings {
grad_tol: 1e-8, // gradient norm convergence criterion
max_iter: 1000,
..BfgsSettings::default()
};
Gauss-Newton Least Squares¶
QR-based Gauss-Newton for nonlinear least squares: minimizes \(\tfrac12\|r(x)\|^2\) for residuals \(r(x) \in \mathbb{R}^m\) with Jacobian \(J = \partial r/\partial x\). Each step solves the normal equations
via a QR factorization of \(J\) (without forming \(J^\top J\) explicitly). Works best when the Jacobian is full rank and the residual is small at the solution.
use numeris::optim::{least_squares_gn, GnSettings};
use numeris::{Matrix, Vector};
// Fit y = a * exp(b * x) to noisy data
let t = [0.0_f64, 1.0, 2.0, 3.0, 4.0];
let y = [2.0_f64, 2.7, 3.65, 4.95, 6.7];
let result = least_squares_gn(
// Residual function: r_i = model(x) - y_i
|x: &Vector<f64, 2>| {
let mut r = Vector::<f64, 5>::zeros();
for i in 0..5 { r[i] = x[0] * (x[1] * t[i]).exp() - y[i]; }
r
},
// Jacobian ∂r/∂x
|x: &Vector<f64, 2>| {
let mut j = Matrix::<f64, 5, 2>::zeros();
for i in 0..5 {
let e = (x[1] * t[i]).exp();
j[(i, 0)] = e;
j[(i, 1)] = x[0] * t[i] * e;
}
j
},
&Vector::from_array([1.0, 0.5]), // initial guess [a, b]
&GnSettings::default(),
).unwrap();
println!("a = {:.4}, b = {:.4}", result.x[0], result.x[1]);
println!("cost = {:.6}", result.cost);
Levenberg-Marquardt¶
Damped Gauss-Newton with adaptive regularization — more robust than pure GN when the Jacobian is ill-conditioned or the initial guess is far from the solution. The step adds a damping term \(\mu \ge 0\) to the normal equations:
\(\mu\) is adapted each iteration — increased when a step fails (toward gradient descent), decreased when it succeeds (toward Gauss-Newton).
use numeris::optim::{least_squares_lm, LmSettings};
use numeris::{Matrix, Vector};
let t = [0.0_f64, 1.0, 2.0, 3.0, 4.0];
let y = [2.0_f64, 2.7, 3.65, 4.95, 6.7];
let result = least_squares_lm(
|x: &Vector<f64, 2>| {
let mut r = Vector::<f64, 5>::zeros();
for i in 0..5 { r[i] = x[0] * (x[1] * t[i]).exp() - y[i]; }
r
},
|x: &Vector<f64, 2>| {
let mut j = Matrix::<f64, 5, 2>::zeros();
for i in 0..5 {
let e = (x[1] * t[i]).exp();
j[(i, 0)] = e;
j[(i, 1)] = x[0] * t[i] * e;
}
j
},
&Vector::from_array([1.0, 0.1]),
&LmSettings::default(),
).unwrap();
assert!(result.cost < 0.1);
LmSettings¶
use numeris::optim::LmSettings;
let settings = LmSettings {
grad_tol: 1e-8, // ∥Jᵀr∥ convergence criterion
step_tol: 1e-8, // step size convergence criterion
cost_tol: 1e-8, // cost reduction convergence criterion
lambda0: 1e-3, // initial damping factor
max_iter: 200,
..LmSettings::default()
};
Finite Difference Utilities¶
use numeris::optim::{finite_difference_gradient, finite_difference_jacobian};
use numeris::{Matrix, Vector};
// Gradient of a scalar function ℝⁿ → ℝ
let f = |x: &Vector<f64, 3>| x[0].powi(2) + x[1].powi(2) + x[2].powi(2);
let x = Vector::from_array([1.0_f64, 2.0, 3.0]);
let g: Vector<f64, 3> = finite_difference_gradient(f, &x);
// g ≈ [2, 4, 6]
// Jacobian of a vector function ℝⁿ → ℝᵐ (M residuals, N parameters)
let r = |x: &Vector<f64, 2>| Vector::from_array([x[0].powi(2) - 1.0, x[1].powi(2) - 4.0]);
let x2 = Vector::from_array([1.0_f64, 2.0]);
let j: Matrix<f64, 2, 2> = finite_difference_jacobian(r, &x2);
Result Types¶
use numeris::optim::{OptimResult, LsqResult};
// Scalar root finding
// result.x — solution scalar
// result.fx — f(x) at solution
// result.iterations — iterations taken
// BFGS minimization
// result.x — solution Vector<T, N>
// result.fx — f(x) at solution
// result.grad_norm — ∥∇f∥ at solution
// result.iterations — iterations taken
// Least squares
// result.x — solution Vector<T, N>
// result.cost — ½‖r‖² at solution
// result.iterations — iterations taken
Dynamic-Dimension Variants¶
When the parameter or residual dimension is only known at runtime, use the
_dyn routines. They mirror the fixed-size API but accept DynVector<T> /
DynMatrix<T> and require the alloc feature.
use numeris::optim::{least_squares_lm_dyn, LmSettings};
use numeris::{DynMatrix, DynVector};
let t = vec![0.0_f64, 1.0, 2.0, 3.0, 4.0];
let y = vec![2.0, 2.7, 3.65, 4.95, 6.7];
let m = t.len();
let result = least_squares_lm_dyn(
|x: &DynVector<f64>| {
let mut r = DynVector::zeros(m);
for i in 0..m {
r[i] = x[0] * (x[1] * t[i]).exp() - y[i];
}
r
},
|x: &DynVector<f64>| {
let mut j = DynMatrix::zeros(m, 2);
for i in 0..m {
let e = (x[1] * t[i]).exp();
j[(i, 0)] = e;
j[(i, 1)] = x[0] * t[i] * e;
}
j
},
&DynVector::from_slice(&[1.0, 0.1]),
&LmSettings::default(),
).unwrap();
assert!(result.cost < 0.1);
minimize_bfgs_dyn, least_squares_gn_dyn, finite_difference_gradient_dyn,
and finite_difference_jacobian_dyn follow the same pattern. Settings structs
(BfgsSettings, GaussNewtonSettings, LmSettings) and OptimError are
shared with the fixed-size routines. Results come back as
MinimizeResultDyn<T> / LeastSquaresResultDyn<T>.
Parallel finite-difference Jacobians
With the rayon feature, the separate functions
finite_difference_jacobian_dyn_par and finite_difference_gradient_dyn_par
evaluate their columns in parallel — each column is an independent call to
your function, so an expensive f (e.g. an ODE integration) scales across
cores. They require Fn + Sync + Send and return the same result as the
sequential routines. The sequential finite_difference_jacobian_dyn /
_gradient_dyn keep their FnMut signatures unchanged, so enabling rayon
never affects them. See Parallelism.
Error Handling¶
use numeris::optim::OptimError;
match least_squares_lm(r, j, &x0, &settings) {
Ok(result) => { /* converged */ }
Err(OptimError::MaxIterExceeded) => { /* increase max_iter or relax tolerances */ }
Err(OptimError::LineSearchFailed) => { /* BFGS line search failure */ }
Err(OptimError::SingularJacobian) => { /* Gauss-Newton needs better initial guess */ }
Err(e) => { /* other */ }
}