3 changes: 2 additions & 1 deletion Project.toml
@@ -25,6 +25,7 @@ Preferences = "21216c6a-2e73-6563-6e65-726566657250"
ReTestItems = "817f1d60-ba6b-4fd5-9520-3cf149f6a823"
Reexport = "189a3867-3050-52da-a836-e630ba90ab69"
SciMLBase = "0bca4576-84f4-4d90-8ffe-ffa030f20462"
SciMLLogging = "a6db7da4-7206-11f0-1eab-35f2a5dbe1d1"
SimpleNonlinearSolve = "727e6d20-b764-4bd8-a329-72de5adea6c7"
StaticArraysCore = "1e83bf80-4336-4d27-bf5d-d5a4f845583c"
SymbolicIndexingInterface = "2efcf032-c050-4f8e-a9bb-153293bab1f5"
@@ -86,7 +87,7 @@ LeastSquaresOptim = "0.8.5"
LineSearch = "0.1.4"
LineSearches = "7.3"
LinearAlgebra = "1.10"
LinearSolve = "3.46"
LinearSolve = "3.48"
MINPACK = "1.2"
MPI = "0.20.22"
NLSolvers = "0.5"
44 changes: 44 additions & 0 deletions docs/src/native/solvers.md
@@ -35,6 +35,10 @@ documentation.
differencing tricks (without ever constructing the Jacobian). However, if the Jacobian
is still needed, for example for a preconditioner, `concrete_jac = true` can be passed
in order to force the construction of the Jacobian.
- `forcing`: Adaptive forcing term strategy for Newton-Krylov methods. When using an
iterative linear solver (Krylov method), this controls how accurately the linear system
is solved at each Newton iteration. Defaults to `nothing` (fixed tolerance). See
[Forcing Term Strategies](@ref forcing_strategies) for available options.

## Nonlinear Solvers

@@ -81,3 +85,43 @@ QuasiNewtonAlgorithm
GeneralizedFirstOrderAlgorithm
GeneralizedDFSane
```

## [Forcing Term Strategies](@id forcing_strategies)

Forcing term strategies control how accurately the linear system is solved at each Newton
iteration when using iterative (Krylov) linear solvers. This is the key idea behind
Newton-Krylov methods: instead of solving ``J \delta u = -f(u)`` exactly, we solve it only
approximately with a tolerance ``\eta_k`` (the forcing term).

The [eisenstat1996choosing](@citet) paper introduced adaptive strategies for choosing
``\eta_k`` that can significantly improve convergence, especially for problems where the
initial guess is far from the solution.
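
With the notation above, ``\eta_k`` is the relative tolerance passed to the Krylov solver,
which amounts to requiring the inexact Newton condition

```math
\|f(u_k) + J(u_k) \, \delta u_k\| \le \eta_k \, \|f(u_k)\|
```

for the computed step ``\delta u_k``.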

```@docs
EisenstatWalkerForcing2
```
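
The parameters can be tuned through the keyword constructor shown above, for example to
start from a tighter initial tolerance and cap the forcing term more aggressively:

```julia
forcing = EisenstatWalkerForcing2(; η₀ = 0.1, ηₘₐₓ = 0.5)
```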

### Example Usage

```julia
using NonlinearSolve, LinearSolve

# Define a large nonlinear problem
function f!(F, u, p)
    for i in 2:(length(u) - 1)
        F[i] = u[i - 1] - 2u[i] + u[i + 1] + sin(u[i])
    end
    F[1] = u[1] - 1.0
    F[end] = u[end]
end

n = 1000
u0 = zeros(n)
prob = NonlinearProblem(f!, u0)

# Use Newton-Raphson with GMRES and Eisenstat-Walker forcing
sol = solve(prob, NewtonRaphson(
    linsolve = KrylovJL_GMRES(),
    forcing = EisenstatWalkerForcing2()
))
```
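
Compared to a fixed inner tolerance, the adaptive forcing term avoids "over-solving" the
linear system during early iterations, when the Newton linearization is still a poor model
of `f`, and tightens the tolerance as the iterates approach the solution. The chosen
``\eta_k`` values can be inspected by enabling the `:forcing` verbosity toggle, which emits
`InfoLevel` messages under the `Standard` preset of `NonlinearVerbosity`.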
11 changes: 11 additions & 0 deletions docs/src/refs.bib
@@ -1,3 +1,14 @@
@article{eisenstat1996choosing,
title = {Choosing the forcing terms in an inexact Newton method},
author = {Eisenstat, Stanley C and Walker, Homer F},
journal = {SIAM Journal on Scientific Computing},
volume = {17},
number = {1},
pages = {16--32},
year = {1996},
publisher = {SIAM}
}

@article{bastin2010retrospective,
title = {A retrospective trust-region method for unconstrained optimization},
author = {Bastin, Fabian and Malmedy, Vincent and Mouffe, M{\'e}lodie and Toint, Philippe L and Tomanos, Dimitri},
3 changes: 2 additions & 1 deletion docs/src/solvers/nonlinear_system_solvers.md
@@ -50,7 +50,8 @@ sparse/structured matrix support, etc. These methods support the largest set of
features, but have a bit of overhead on very small problems.

- [`NewtonRaphson()`](@ref): A Newton-Raphson method with swappable nonlinear solvers and
autodiff methods for high performance on large and sparse systems.
autodiff methods for high performance on large and sparse systems. Supports Newton-Krylov
methods with adaptive forcing via [`EisenstatWalkerForcing2`](@ref).
- [`TrustRegion()`](@ref): A Newton Trust Region dogleg method with swappable nonlinear
solvers and autodiff methods for high performance on large and sparse systems.
- [`LevenbergMarquardt()`](@ref): An advanced Levenberg-Marquardt implementation with the
@@ -77,4 +77,8 @@ function set_lincache_A!(lincache, new_A)
    return
end

function LinearSolve.update_tolerances!(cache::LinearSolveJLCache; kwargs...)
    LinearSolve.update_tolerances!(cache.lincache; kwargs...)
end

end
10 changes: 8 additions & 2 deletions lib/NonlinearSolveBase/src/verbosity.jl
@@ -14,6 +14,7 @@ diagnostic messages, warnings, and errors during nonlinear system solution.

## Numerical Group
- `threshold_state`: Messages about threshold state in low-rank methods
- `forcing`: Messages about forcing parameter in Newton-Krylov methods

## Sensitivity Group
- `sensitivity_vjp_choice`: Messages about VJP choice in sensitivity analysis (used by SciMLSensitivity.jl)
@@ -65,7 +66,7 @@ NonlinearVerbosity

@verbosity_specifier NonlinearVerbosity begin
    toggles = (:linear_verbosity, :non_enclosing_interval, :alias_u0_immutable,
        :linsolve_failed_noncurrent, :termination_condition, :threshold_state,
        :linsolve_failed_noncurrent, :termination_condition, :threshold_state, :forcing,
        :sensitivity_vjp_choice)

    presets = (
@@ -76,6 +77,7 @@ NonlinearVerbosity
            linsolve_failed_noncurrent = Silent(),
            termination_condition = Silent(),
            threshold_state = Silent(),
            forcing = Silent(),
            sensitivity_vjp_choice = Silent()
        ),
        Minimal = (
@@ -85,6 +87,7 @@ NonlinearVerbosity
            linsolve_failed_noncurrent = WarnLevel(),
            termination_condition = Silent(),
            threshold_state = Silent(),
            forcing = Silent(),
            sensitivity_vjp_choice = Silent()
        ),
        Standard = (
@@ -94,6 +97,7 @@ NonlinearVerbosity
            linsolve_failed_noncurrent = WarnLevel(),
            termination_condition = WarnLevel(),
            threshold_state = WarnLevel(),
            forcing = InfoLevel(),
            sensitivity_vjp_choice = WarnLevel()
        ),
        Detailed = (
@@ -103,6 +107,7 @@ NonlinearVerbosity
            linsolve_failed_noncurrent = WarnLevel(),
            termination_condition = WarnLevel(),
            threshold_state = WarnLevel(),
            forcing = InfoLevel(),
            sensitivity_vjp_choice = WarnLevel()
        ),
        All = (
@@ -112,14 +117,15 @@ NonlinearVerbosity
            linsolve_failed_noncurrent = WarnLevel(),
            termination_condition = WarnLevel(),
            threshold_state = InfoLevel(),
            forcing = InfoLevel(),
            sensitivity_vjp_choice = WarnLevel()
        )
    )

    groups = (
        error_control = (:non_enclosing_interval, :alias_u0_immutable,
            :linsolve_failed_noncurrent, :termination_condition),
        numerical = (:threshold_state,),
        numerical = (:threshold_state, :forcing),
        sensitivity = (:sensitivity_vjp_choice,)
    )
end
3 changes: 3 additions & 0 deletions lib/NonlinearSolveFirstOrder/src/NonlinearSolveFirstOrder.jl
@@ -33,6 +33,7 @@ using ForwardDiff: ForwardDiff, Dual # Default Forward Mode AD

include("solve.jl")
include("raphson.jl")
include("eisenstat_walker.jl")
include("gauss_newton.jl")
include("levenberg_marquardt.jl")
include("trust_region.jl")
@@ -101,6 +102,8 @@ end
export NewtonRaphson, PseudoTransient
export GaussNewton, LevenbergMarquardt, TrustRegion

export EisenstatWalkerForcing2

export RadiusUpdateSchemes

export GeneralizedFirstOrderAlgorithm
111 changes: 111 additions & 0 deletions lib/NonlinearSolveFirstOrder/src/eisenstat_walker.jl
@@ -0,0 +1,111 @@
"""
EisenstatWalkerForcing2(; η₀ = 0.5, ηₘₐₓ = 0.9, γ = 0.9, α = 2, safeguard = true, safeguard_threshold = 0.1)

Algorithm 2 from the classical work by Eisenstat and Walker (1996) as described by formula (2.6):
ηₖ = γ * (||rₖ|| / ||rₖ₋₁||)^α

Here the variables denote:
rₖ residual at iteration k
η₀ ∈ [0,1) initial value for η
ηₘₐₓ ∈ [0,1) maximum value for η
γ ∈ [0,1) correction factor
α ∈ [1,2) correction exponent

Furthermore, the proposed safeguard is implemented:
ηₖ = max(ηₖ, γ*ηₖ₋₁^α) if γ*ηₖ₋₁^α > safeguard_threshold
to prevent ηₖ from shrinking too fast.
"""
@concrete struct EisenstatWalkerForcing2
    η₀
    ηₘₐₓ
    γ
    α
    safeguard
    safeguard_threshold
end

function EisenstatWalkerForcing2(; η₀ = 0.5, ηₘₐₓ = 0.9, γ = 0.9, α = 2, safeguard = true, safeguard_threshold = 0.1)
    EisenstatWalkerForcing2(η₀, ηₘₐₓ, γ, α, safeguard, safeguard_threshold)
end


@concrete mutable struct EisenstatWalkerForcing2Cache
    p::EisenstatWalkerForcing2
    η
    rnorm
    rnorm_prev
    internalnorm
    verbosity
end



function pre_step_forcing!(cache::EisenstatWalkerForcing2Cache, descend_cache::NonlinearSolveBase.NewtonDescentCache, J, u, fu, iter)
    @SciMLMessage("Eisenstat-Walker forcing residual norm $(cache.rnorm) with rate estimate $(cache.rnorm / cache.rnorm_prev).", cache.verbosity, :forcing)

    # On the first iteration we initialize η with the default initial value and stop.
    if iter == 0
        cache.η = cache.p.η₀
        @SciMLMessage("Eisenstat-Walker initial iteration to η=$(cache.η).", cache.verbosity, :forcing)
        LinearSolve.update_tolerances!(descend_cache.lincache; reltol = cache.η)
        return nothing
    end

    # Store previous
    ηprev = cache.η

    # Formula (2.6)
    # ||r|| > 0 should be guaranteed by the convergence criterion
    (; rnorm, rnorm_prev) = cache
    (; α, γ) = cache.p
    cache.η = γ * (rnorm / rnorm_prev)^α

    # Safeguard 2 to prevent over-solving
    if cache.p.safeguard
        ηsg = γ * ηprev^α
        if ηsg > cache.p.safeguard_threshold && ηsg > cache.η
            cache.η = ηsg
        end
    end

    # Far away from the root we also need to respect η ∈ [0,1)
    cache.η = clamp(cache.η, 0.0, cache.p.ηₘₐₓ)

    @SciMLMessage("Eisenstat-Walker iter $iter update to η=$(cache.η).", cache.verbosity, :forcing)

    # Communicate new relative tolerance to linear solve
    LinearSolve.update_tolerances!(descend_cache.lincache; reltol = cache.η)

    return nothing
end



function post_step_forcing!(cache::EisenstatWalkerForcing2Cache, J, u, fu, δu, iter)
    # Cache previous residual norm
    cache.rnorm_prev = cache.rnorm
    cache.rnorm = cache.internalnorm(fu)

    # @SciMLMessage("Eisenstat-Walker sanity check: $(cache.internalnorm(fu + J*δu)) ≤ $(cache.η * cache.internalnorm(fu)).", cache.verbosity, :linear_verbosity)
end



function InternalAPI.init(
        prob::AbstractNonlinearProblem, alg::EisenstatWalkerForcing2, f, fu, u, p,
        args...; verbose, internalnorm::F = L2_NORM, kwargs...
) where {F}
    fu_norm = internalnorm(fu)

    return EisenstatWalkerForcing2Cache(
        alg, alg.η₀, fu_norm, fu_norm, internalnorm, verbose
    )
end



function InternalAPI.reinit!(
        cache::EisenstatWalkerForcing2Cache; p = cache.p, kwargs...
)
    cache.p = p
end
25 changes: 23 additions & 2 deletions lib/NonlinearSolveFirstOrder/src/raphson.jl
@@ -1,22 +1,43 @@
"""
    NewtonRaphson(;
        concrete_jac = nothing, linsolve = nothing, linesearch = missing,
        autodiff = nothing, vjp_autodiff = nothing, jvp_autodiff = nothing
        autodiff = nothing, vjp_autodiff = nothing, jvp_autodiff = nothing,
        forcing = nothing,
    )

An advanced NewtonRaphson implementation with support for efficient handling of sparse
matrices via colored automatic differentiation and preconditioned linear solvers. Designed
for large-scale and numerically-difficult nonlinear systems.

### Keyword Arguments

- `autodiff`: determines the backend used for the Jacobian. Defaults to `nothing` which
means that a default is selected according to the problem specification.
- `concrete_jac`: whether to build a concrete Jacobian. If a Krylov-subspace method is
used, then the Jacobian will not be constructed. Defaults to `nothing`.
- `linsolve`: the [LinearSolve.jl](https://github.com/SciML/LinearSolve.jl) solver used
for the linear solves within the Newton method. Defaults to `nothing`, which means it
uses the LinearSolve.jl default algorithm choice.
- `linesearch`: the line search algorithm to use. Defaults to `missing` (no line search).
- `vjp_autodiff`: backend for computing Vector-Jacobian products.
- `jvp_autodiff`: backend for computing Jacobian-Vector products.
- `forcing`: Adaptive forcing term strategy for Newton-Krylov methods. When using an
iterative linear solver (e.g., `KrylovJL_GMRES()`), this controls how accurately the
linear system is solved at each iteration. Use `EisenstatWalkerForcing2()` for the
classical Eisenstat-Walker adaptive forcing strategy. Defaults to `nothing` (fixed
tolerance from the termination condition).
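
### Example

For Newton-Krylov with adaptive forcing (requires an iterative solver such as
`KrylovJL_GMRES()` from LinearSolve.jl):

```julia
alg = NewtonRaphson(; linsolve = KrylovJL_GMRES(), forcing = EisenstatWalkerForcing2())
```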
"""
function NewtonRaphson(;
        concrete_jac = nothing, linsolve = nothing, linesearch = missing,
        autodiff = nothing, vjp_autodiff = nothing, jvp_autodiff = nothing
        autodiff = nothing, vjp_autodiff = nothing, jvp_autodiff = nothing,
        forcing = nothing,
)
    return GeneralizedFirstOrderAlgorithm(;
        linesearch,
        descent = NewtonDescent(; linsolve),
        autodiff, vjp_autodiff, jvp_autodiff,
        concrete_jac,
        forcing,
        name = :NewtonRaphson
    )
end