Skip to content

Commit 3b7864e

Browse files
ChrisRackauckas-Claudetermi-officialChrisRackauckasclaude
authored
Eisenstat-Walker Newton-Krylov solver (rebased) (#754)
* Eisenstat-Walker Newton-Krylov solver * Missing default * Fix typos * Add missing dispatch * Add verbosity * Add maximum eta to params. * More parameters * Lower initial eta * Derp * Derp again * Debug instability * Verbosity * Export forcing * Bump LinearSolve to reflect that the new API is used * Remove export of helper constructor * Docstring for forcing alg * Fix CI * Fix some corner cases * Move export to the right place * Add documentation for EisenstatWalkerForcing2 - Add Eisenstat & Walker (1996) reference to refs.bib - Add forcing parameter to General Keyword Arguments section - Add new Forcing Term Strategies section with EisenstatWalkerForcing2 docs - Update NewtonRaphson docstring with detailed keyword argument descriptions - Add mention of Newton-Krylov support in solver recommendations 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * Update verbosity.jl --------- Co-authored-by: termi-official <termi-official@users.noreply.github.com> Co-authored-by: termi-official <9196588+termi-official@users.noreply.github.com> Co-authored-by: Christopher Rackauckas <accounts@chrisrackauckas.com> Co-authored-by: ChrisRackauckas <me@chrisrackauckas.com> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 56fd8fe commit 3b7864e

File tree

12 files changed

+286
-12
lines changed

12 files changed

+286
-12
lines changed

Project.toml

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -25,6 +25,7 @@ Preferences = "21216c6a-2e73-6563-6e65-726566657250"
2525
ReTestItems = "817f1d60-ba6b-4fd5-9520-3cf149f6a823"
2626
Reexport = "189a3867-3050-52da-a836-e630ba90ab69"
2727
SciMLBase = "0bca4576-84f4-4d90-8ffe-ffa030f20462"
28+
SciMLLogging = "a6db7da4-7206-11f0-1eab-35f2a5dbe1d1"
2829
SimpleNonlinearSolve = "727e6d20-b764-4bd8-a329-72de5adea6c7"
2930
StaticArraysCore = "1e83bf80-4336-4d27-bf5d-d5a4f845583c"
3031
SymbolicIndexingInterface = "2efcf032-c050-4f8e-a9bb-153293bab1f5"
@@ -86,7 +87,7 @@ LeastSquaresOptim = "0.8.5"
8687
LineSearch = "0.1.4"
8788
LineSearches = "7.3"
8889
LinearAlgebra = "1.10"
89-
LinearSolve = "3.46"
90+
LinearSolve = "3.48"
9091
MINPACK = "1.2"
9192
MPI = "0.20.22"
9293
NLSolvers = "0.5"

docs/src/native/solvers.md

Lines changed: 44 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -35,6 +35,10 @@ documentation.
3535
differencing tricks (without ever constructing the Jacobian). However, if the Jacobian
3636
is still needed, for example for a preconditioner, `concrete_jac = true` can be passed
3737
in order to force the construction of the Jacobian.
38+
- `forcing`: Adaptive forcing term strategy for Newton-Krylov methods. When using an
39+
iterative linear solver (Krylov method), this controls how accurately the linear system
40+
is solved at each Newton iteration. Defaults to `nothing` (fixed tolerance). See
41+
[Forcing Term Strategies](@ref forcing_strategies) for available options.
3842

3943
## Nonlinear Solvers
4044

@@ -81,3 +85,43 @@ QuasiNewtonAlgorithm
8185
GeneralizedFirstOrderAlgorithm
8286
GeneralizedDFSane
8387
```
88+
89+
## [Forcing Term Strategies](@id forcing_strategies)
90+
91+
Forcing term strategies control how accurately the linear system is solved at each Newton
92+
iteration when using iterative (Krylov) linear solvers. This is the key idea behind
93+
Newton-Krylov methods: instead of solving ``J \delta u = -f(u)`` exactly, we solve it only
94+
approximately with a tolerance ``\eta_k`` (the forcing term).
95+
96+
The [eisenstat1996choosing](@citet) paper introduced adaptive strategies for choosing
97+
``\eta_k`` that can significantly improve convergence, especially for problems where the
98+
initial guess is far from the solution.
99+
100+
```@docs
101+
EisenstatWalkerForcing2
102+
```
103+
104+
### Example Usage
105+
106+
```julia
107+
using NonlinearSolve, LinearSolve
108+
109+
# Define a large nonlinear problem
110+
function f!(F, u, p)
111+
for i in 2:(length(u) - 1)
112+
F[i] = u[i - 1] - 2u[i] + u[i + 1] + sin(u[i])
113+
end
114+
F[1] = u[1] - 1.0
115+
F[end] = u[end]
116+
end
117+
118+
n = 1000
119+
u0 = zeros(n)
120+
prob = NonlinearProblem(f!, u0)
121+
122+
# Use Newton-Raphson with GMRES and Eisenstat-Walker forcing
123+
sol = solve(prob, NewtonRaphson(
124+
linsolve = KrylovJL_GMRES(),
125+
forcing = EisenstatWalkerForcing2()
126+
))
127+
```

docs/src/refs.bib

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,14 @@
1+
@article{eisenstat1996choosing,
2+
title = {Choosing the forcing terms in an inexact Newton method},
3+
author = {Eisenstat, Stanley C and Walker, Homer F},
4+
journal = {SIAM Journal on Scientific Computing},
5+
volume = {17},
6+
number = {1},
7+
pages = {16--32},
8+
year = {1996},
9+
publisher = {SIAM}
10+
}
11+
112
@article{bastin2010retrospective,
213
title = {A retrospective trust-region method for unconstrained optimization},
314
author = {Bastin, Fabian and Malmedy, Vincent and Mouffe, M{\'e}lodie and Toint, Philippe L and Tomanos, Dimitri},

docs/src/solvers/nonlinear_system_solvers.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -50,7 +50,8 @@ sparse/structured matrix support, etc. These methods support the largest set of
5050
features, but have a bit of overhead on very small problems.
5151

5252
- [`NewtonRaphson()`](@ref): A Newton-Raphson method with swappable nonlinear solvers and
53-
autodiff methods for high performance on large and sparse systems.
53+
autodiff methods for high performance on large and sparse systems. Supports Newton-Krylov
54+
methods with adaptive forcing via [`EisenstatWalkerForcing2`](@ref).
5455
- [`TrustRegion()`](@ref): A Newton Trust Region dogleg method with swappable nonlinear
5556
solvers and autodiff methods for high performance on large and sparse systems.
5657
- [`LevenbergMarquardt()`](@ref): An advanced Levenberg-Marquardt implementation with the

lib/NonlinearSolveBase/ext/NonlinearSolveBaseLinearSolveExt.jl

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -77,4 +77,8 @@ function set_lincache_A!(lincache, new_A)
7777
return
7878
end
7979

80+
function LinearSolve.update_tolerances!(cache::LinearSolveJLCache; kwargs...)
81+
LinearSolve.update_tolerances!(cache.lincache; kwargs...)
82+
end
83+
8084
end

lib/NonlinearSolveBase/src/verbosity.jl

Lines changed: 8 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,7 @@ diagnostic messages, warnings, and errors during nonlinear system solution.
1414
1515
## Numerical Group
1616
- `threshold_state`: Messages about threshold state in low-rank methods
17+
- `forcing`: Messages about forcing parameter in Newton-Krylov methods
1718
1819
## Sensitivity Group
1920
- `sensitivity_vjp_choice`: Messages about VJP choice in sensitivity analysis (used by SciMLSensitivity.jl)
@@ -65,7 +66,7 @@ NonlinearVerbosity
6566

6667
@verbosity_specifier NonlinearVerbosity begin
6768
toggles = (:linear_verbosity, :non_enclosing_interval, :alias_u0_immutable,
68-
:linsolve_failed_noncurrent, :termination_condition, :threshold_state,
69+
:linsolve_failed_noncurrent, :termination_condition, :threshold_state, :forcing,
6970
:sensitivity_vjp_choice)
7071

7172
presets = (
@@ -76,6 +77,7 @@ NonlinearVerbosity
7677
linsolve_failed_noncurrent = Silent(),
7778
termination_condition = Silent(),
7879
threshold_state = Silent(),
80+
forcing = Silent(),
7981
sensitivity_vjp_choice = Silent()
8082
),
8183
Minimal = (
@@ -85,6 +87,7 @@ NonlinearVerbosity
8587
linsolve_failed_noncurrent = WarnLevel(),
8688
termination_condition = Silent(),
8789
threshold_state = Silent(),
90+
forcing = Silent(),
8891
sensitivity_vjp_choice = Silent()
8992
),
9093
Standard = (
@@ -94,6 +97,7 @@ NonlinearVerbosity
9497
linsolve_failed_noncurrent = WarnLevel(),
9598
termination_condition = WarnLevel(),
9699
threshold_state = WarnLevel(),
100+
forcing = InfoLevel(),
97101
sensitivity_vjp_choice = WarnLevel()
98102
),
99103
Detailed = (
@@ -103,6 +107,7 @@ NonlinearVerbosity
103107
linsolve_failed_noncurrent = WarnLevel(),
104108
termination_condition = WarnLevel(),
105109
threshold_state = WarnLevel(),
110+
forcing = InfoLevel(),
106111
sensitivity_vjp_choice = WarnLevel()
107112
),
108113
All = (
@@ -112,14 +117,15 @@ NonlinearVerbosity
112117
linsolve_failed_noncurrent = WarnLevel(),
113118
termination_condition = WarnLevel(),
114119
threshold_state = InfoLevel(),
120+
forcing = InfoLevel(),
115121
sensitivity_vjp_choice = WarnLevel()
116122
)
117123
)
118124

119125
groups = (
120126
error_control = (:non_enclosing_interval, :alias_u0_immutable,
121127
:linsolve_failed_noncurrent, :termination_condition),
122-
numerical = (:threshold_state,),
128+
numerical = (:threshold_state, :forcing),
123129
sensitivity = (:sensitivity_vjp_choice,)
124130
)
125131
end

lib/NonlinearSolveFirstOrder/src/NonlinearSolveFirstOrder.jl

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,7 @@ using ForwardDiff: ForwardDiff, Dual # Default Forward Mode AD
3333

3434
include("solve.jl")
3535
include("raphson.jl")
36+
include("eisenstat_walker.jl")
3637
include("gauss_newton.jl")
3738
include("levenberg_marquardt.jl")
3839
include("trust_region.jl")
@@ -101,6 +102,8 @@ end
101102
export NewtonRaphson, PseudoTransient
102103
export GaussNewton, LevenbergMarquardt, TrustRegion
103104

105+
export EisenstatWalkerForcing2
106+
104107
export RadiusUpdateSchemes
105108

106109
export GeneralizedFirstOrderAlgorithm
Lines changed: 111 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,111 @@
1+
"""
2+
EisenstatWalkerForcing2(; η₀ = 0.5, ηₘₐₓ = 0.9, γ = 0.9, α = 2, safeguard = true, safeguard_threshold = 0.1)
3+
4+
Algorithm 2 from the classical work by Eisenstat and Walker (1996) as described by formula (2.6):
5+
ηₖ = γ * (||rₖ|| / ||rₖ₋₁||)^α
6+
7+
Here the variables denote:
8+
rₖ residual at iteration k
9+
η₀ ∈ [0,1) initial value for η
10+
ηₘₐₓ ∈ [0,1) maximum value for η
11+
γ ∈ [0,1) correction factor
12+
α ∈ [1,2) correction exponent
13+
14+
Furthermore, the proposed safeguard is implemented:
15+
ηₖ = max(ηₖ, γ*ηₖ₋₁^α) if γ*ηₖ₋₁^α > safeguard_threshold
16+
to prevent ηₖ from shrinking too fast.
17+
"""
18+
@concrete struct EisenstatWalkerForcing2
19+
η₀
20+
ηₘₐₓ
21+
γ
22+
α
23+
safeguard
24+
safeguard_threshold
25+
end
26+
27+
function EisenstatWalkerForcing2(; η₀ = 0.5, ηₘₐₓ = 0.9, γ = 0.9, α = 2, safeguard = true, safeguard_threshold = 0.1)
28+
EisenstatWalkerForcing2(η₀, ηₘₐₓ, γ, α, safeguard, safeguard_threshold)
29+
end
30+
31+
32+
@concrete mutable struct EisenstatWalkerForcing2Cache
33+
p::EisenstatWalkerForcing2
34+
η
35+
rnorm
36+
rnorm_prev
37+
internalnorm
38+
verbosity
39+
end
40+
41+
42+
43+
function pre_step_forcing!(cache::EisenstatWalkerForcing2Cache, descend_cache::NonlinearSolveBase.NewtonDescentCache, J, u, fu, iter)
44+
@SciMLMessage("Eisenstat-Walker forcing residual norm $(cache.rnorm) with rate estimate $(cache.rnorm / cache.rnorm_prev).", cache.verbosity, :forcing)
45+
46+
# On the first iteration we initialize η with the default initial value and stop.
47+
if iter == 0
48+
cache.η = cache.p.η₀
49+
@SciMLMessage("Eisenstat-Walker initial iteration to η=$(cache.η).", cache.verbosity, :forcing)
50+
LinearSolve.update_tolerances!(descend_cache.lincache; reltol=cache.η)
51+
return nothing
52+
end
53+
54+
# Store previous
55+
ηprev = cache.η
56+
57+
# Formula (2.6)
58+
# ||r|| > 0 should be guaranteed by the convergence criterion
59+
(; rnorm, rnorm_prev) = cache
60+
(; α, γ) = cache.p
61+
cache.η = γ * (rnorm / rnorm_prev)^α
62+
63+
# Safeguard 2 to prevent over-solving
64+
if cache.p.safeguard
65+
ηsg = γ*ηprev^α
66+
if ηsg > cache.p.safeguard_threshold && ηsg > cache.η
67+
cache.η = ηsg
68+
end
69+
end
70+
71+
# Far away from the root we also need to respect η ∈ [0,1)
72+
cache.η = clamp(cache.η, 0.0, cache.p.ηₘₐₓ)
73+
74+
@SciMLMessage("Eisenstat-Walker iter $iter update to η=$(cache.η).", cache.verbosity, :forcing)
75+
76+
# Communicate new relative tolerance to linear solve
77+
LinearSolve.update_tolerances!(descend_cache.lincache; reltol=cache.η)
78+
79+
return nothing
80+
end
81+
82+
83+
84+
function post_step_forcing!(cache::EisenstatWalkerForcing2Cache, J, u, fu, δu, iter)
85+
# Cache previous residual norm
86+
cache.rnorm_prev = cache.rnorm
87+
cache.rnorm = cache.internalnorm(fu)
88+
89+
# @SciMLMessage("Eisenstat-Walker sanity check: $(cache.internalnorm(fu + J*δu)) ≤ $(cache.η * cache.internalnorm(fu)).", cache.verbosity, :linear_verbosity)
90+
end
91+
92+
93+
94+
function InternalAPI.init(
95+
prob::AbstractNonlinearProblem, alg::EisenstatWalkerForcing2, f, fu, u, p,
96+
args...; verbose, internalnorm::F = L2_NORM, kwargs...
97+
) where {F}
98+
fu_norm = internalnorm(fu)
99+
100+
return EisenstatWalkerForcing2Cache(
101+
alg, alg.η₀, fu_norm, fu_norm, internalnorm, verbose
102+
)
103+
end
104+
105+
106+
107+
function InternalAPI.reinit!(
108+
cache::EisenstatWalkerForcing2Cache; p = cache.p, kwargs...
109+
)
110+
cache.p = p
111+
end
Lines changed: 23 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,22 +1,43 @@
11
"""
22
NewtonRaphson(;
33
concrete_jac = nothing, linsolve = nothing, linesearch = missing,
4-
autodiff = nothing, vjp_autodiff = nothing, jvp_autodiff = nothing
4+
autodiff = nothing, vjp_autodiff = nothing, jvp_autodiff = nothing,
5+
forcing = nothing,
56
)
67
78
An advanced NewtonRaphson implementation with support for efficient handling of sparse
89
matrices via colored automatic differentiation and preconditioned linear solvers. Designed
910
for large-scale and numerically-difficult nonlinear systems.
11+
12+
### Keyword Arguments
13+
14+
- `autodiff`: determines the backend used for the Jacobian. Defaults to `nothing` which
15+
means that a default is selected according to the problem specification.
16+
- `concrete_jac`: whether to build a concrete Jacobian. If a Krylov-subspace method is
17+
used, then the Jacobian will not be constructed. Defaults to `nothing`.
18+
- `linsolve`: the [LinearSolve.jl](https://github.com/SciML/LinearSolve.jl) solver used
19+
for the linear solves within the Newton method. Defaults to `nothing`, which means it
20+
uses the LinearSolve.jl default algorithm choice.
21+
- `linesearch`: the line search algorithm to use. Defaults to `missing` (no line search).
22+
- `vjp_autodiff`: backend for computing Vector-Jacobian products.
23+
- `jvp_autodiff`: backend for computing Jacobian-Vector products.
24+
- `forcing`: Adaptive forcing term strategy for Newton-Krylov methods. When using an
25+
iterative linear solver (e.g., `KrylovJL_GMRES()`), this controls how accurately the
26+
linear system is solved at each iteration. Use `EisenstatWalkerForcing2()` for the
27+
classical Eisenstat-Walker adaptive forcing strategy. Defaults to `nothing` (fixed
28+
tolerance from the termination condition).
1029
"""
1130
function NewtonRaphson(;
1231
concrete_jac = nothing, linsolve = nothing, linesearch = missing,
13-
autodiff = nothing, vjp_autodiff = nothing, jvp_autodiff = nothing
32+
autodiff = nothing, vjp_autodiff = nothing, jvp_autodiff = nothing,
33+
forcing = nothing,
1434
)
1535
return GeneralizedFirstOrderAlgorithm(;
1636
linesearch,
1737
descent = NewtonDescent(; linsolve),
1838
autodiff, vjp_autodiff, jvp_autodiff,
1939
concrete_jac,
40+
forcing,
2041
name = :NewtonRaphson
2142
)
2243
end

0 commit comments

Comments
 (0)