Approximating solutions of equations
using Newton’s method with a
modified Newton’s method iterate as a starting point
Abstract.
In this study we are concerned with the problem of approximating a locally unique solution of an equation in a Banach space setting using Newton's and the modified Newton's method. We provide weaker convergence conditions for both methods than before [6]–[8]. Then, we combine Newton's method with the modified Newton's method to approximate locally unique solutions of operator equations in a Banach space setting. Finer error estimates, a larger convergence domain, and more precise information on the location of the solution are obtained under the same or weaker hypotheses than before [6]–[8]. Numerical examples are also provided.
Key words and phrases:
Banach space, Newton–Kantorovich method, radius of convergence, Fréchet derivative, Banach lemma on invertible operators.
1991 Mathematics Subject Classification: 65H10, 65G99, 47H10, 49M15.
1. Introduction
In this study we are concerned with the problem of approximating a locally unique solution $x^\star$ of the nonlinear equation
$$F(x) = 0, \qquad (1)$$
where $F$ is a Fréchet-differentiable operator defined on an open convex subset $D$ of a Banach space $X$ with values in a Banach space $Y$.
A large number of problems in applied mathematics and also in engineering are solved by finding the solutions of certain equations. For example, dynamic systems are mathematically modeled by difference or differential equations, and their solutions usually represent the states of the systems. For the sake of simplicity, assume that a time-invariant system is driven by the equation $\dot{x} = T(x)$ for some suitable operator $T$, where $x$ is the state. Then the equilibrium states are determined by solving equation (1). Similar equations are used in the case of discrete systems. The unknowns of engineering equations can be functions (difference, differential, and integral equations), vectors (systems of linear or nonlinear algebraic equations), or real or complex numbers (single algebraic equations with single unknowns). Except in special cases, the most commonly used solution methods are iterative: starting from one or several initial approximations, a sequence is constructed that converges to a solution of the equation. Iteration methods are also applied for solving optimization problems. In such cases, the iteration sequences converge to an optimal solution of the problem at hand. Since all of these methods have the same recursive structure, they can be introduced and discussed in a general framework.
The most popular methods for generating such approximations are undoubtedly Newton's method
$$x_{n+1} = x_n - F'(x_n)^{-1} F(x_n) \quad (n \ge 0), \qquad x_0 \in D, \qquad (2)$$
and the modified Newton's method
$$y_{n+1} = y_n - F'(y_0)^{-1} F(y_n) \quad (n \ge 0), \qquad y_0 = x_0 \in D. \qquad (3)$$
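For a finite-dimensional system $F \colon \mathbb{R}^m \to \mathbb{R}^m$ the two iterations read as follows. This is a minimal sketch in Python; `F` returns the residual vector and `dF` the Jacobian matrix, both supplied by the user, and the stopping rule is a simple step-size test rather than anything prescribed by the theory below.

```python
import numpy as np

def newton(F, dF, x0, tol=1e-10, max_iter=50):
    """Newton's method (2): x_{n+1} = x_n - F'(x_n)^{-1} F(x_n)."""
    x = np.array(x0, dtype=float)
    for _ in range(max_iter):
        step = np.linalg.solve(dF(x), F(x))   # solve F'(x_n) s = F(x_n)
        x = x - step
        if np.linalg.norm(step) < tol:
            break
    return x

def modified_newton(F, dF, x0, tol=1e-10, max_iter=500):
    """Modified Newton's method (3): y_{n+1} = y_n - F'(y_0)^{-1} F(y_n)."""
    y = np.array(x0, dtype=float)
    J0 = dF(y)                                # derivative evaluated only once, at y_0
    for _ in range(max_iter):
        step = np.linalg.solve(J0, F(y))
        y = y - step
        if np.linalg.norm(step) < tol:
            break
    return y
```

For a scalar equation the linear solves reduce to divisions by $F'$; the modified method trades the repeated evaluation and factorization of the derivative for a slower (linear) convergence rate.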
There is an extensive literature on the semilocal as well as the local convergence results for both methods under various hypotheses. Such results can be found in [4], [6], [7], and the references there.
The most popular hypotheses are of Newton–Kantorovich type [4], [6], [7]. Indeed, let $x_0 \in D$. Assume there exist constants $\eta \ge 0$ and $\ell > 0$ such that
$$F'(x_0)^{-1} \in L(Y, X), \qquad (4)$$
$$\|F'(x_0)^{-1} F(x_0)\| \le \eta, \qquad (5)$$
$$\|F'(x_0)^{-1}\,(F'(x) - F'(y))\| \le \ell\,\|x - y\| \quad \text{for all } x, y \in D, \qquad (6)$$
$$h = \ell\,\eta \le \tfrac{1}{2}, \qquad (7)$$
and
$$\bar U(x_0, t^\star) \subseteq D, \qquad (8)$$
where
$$t^\star = \frac{1 - \sqrt{1 - 2h}}{\ell}. \qquad (9)$$
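For orientation, $t^\star$ in (9) is simply the smaller zero of the standard scalar Kantorovich majorant; this is a classical fact recalled here for the reader's convenience, not a result of this paper:

```latex
p(t)=\tfrac{\ell}{2}\,t^{2}-t+\eta ,\qquad
p\big(t^{\star}\big)=0,\qquad
t^{\star}=\frac{1-\sqrt{1-2h}}{\ell}\;\le\;
t^{\star\star}=\frac{1+\sqrt{1-2h}}{\ell},
```

and the two zeros are real precisely when $h = \ell\,\eta \le \tfrac12$, i.e. when (7) holds.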
Estimate (7) is the crucial, non-optimal sufficient condition for the semilocal convergence of both methods [4], [6], [7] (see also Theorem 1).
Under condition (7), method (2) converges quadratically to a solution $x^\star$ of equation (1) (provided (7) holds as a strict inequality), whereas method (3) converges only linearly to $x^\star$. There are examples in the literature where both methods converge to $x^\star$ although condition (7) is violated. Therefore one would expect that there may be conditions weaker than (7). This is the motivation for our study. Note that, in view of the Lipschitz condition (6), there exists $\ell_0 > 0$ such that the center-Lipschitz condition
$$\|F'(x_0)^{-1}\,(F'(x) - F'(x_0))\| \le \ell_0\,\|x - x_0\| \quad \text{for all } x \in D \qquad (10)$$
holds.
In general,
$$\ell_0 \le \ell \qquad (11)$$
holds true, and $\ell/\ell_0$ can be arbitrarily large [2]–[4].
Recently, in [3, p. 387, Case 3], we showed that condition (7) can always be replaced by the weaker condition
$$h_1 = \tfrac{1}{4}\Big(\ell + 4\ell_0 + \sqrt{\ell^2 + 8\,\ell_0\,\ell}\,\Big)\eta \le 1 \qquad (12)$$
in the case of Newton's method (2) (see also Examples 9–11).
Here, we show that in the case of the modified method (3), condition (7) can be replaced, for convergence purposes, by
$$h_0 = \ell_0\,\eta \le \tfrac{1}{2}, \qquad (13)$$
and (6) by the weaker condition (10). Finer error estimates on the distances involved, a larger convergence domain, and more precise information on the location of the solution than in earlier results [6] are also obtained this way (see Theorem 3 for method (2) and Theorem 5 for method (3)).
Using (13) (whenever (7) or (12) does not hold), we can employ method (3) for a finite number of steps, say $N$, until condition (7) (or (12)) is satisfied with the iterate $y_N$ in the role of the starting point. Then the faster method (2) takes over from method (3).
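As an illustration of this switching strategy, here is a minimal one-dimensional sketch in Python. The test used for "condition (7) holds at the current iterate" is the rough scalar check $\ell\,|F(x)/F'(x)| \le 1/2$ with a user-supplied bound $\ell$ kept fixed along the iteration; this is a simplification of the actual hypotheses, and the value $p = 0.49$ below is an illustrative choice of ours for the cubic of Example 9.

```python
def kantorovich_h(F, dF, x, ell):
    """Rough scalar version of the test (7): h = ell * |F'(x)^{-1} F(x)|."""
    return ell * abs(F(x) / dF(x))

def hybrid_newton(F, dF, x0, ell, tol=1e-12, max_modified=1000, max_newton=100):
    """Run the modified Newton method (3) (derivative frozen at the starting
    point) until the Kantorovich-type test holds at the current iterate,
    then finish with the full Newton method (2)."""
    x = x0
    d0 = dF(x0)                                   # frozen derivative for (3)
    for _ in range(max_modified):
        if kantorovich_h(F, dF, x, ell) <= 0.5 or abs(F(x)) <= tol:
            break
        x = x - F(x) / d0                         # modified Newton step (3)
    for _ in range(max_newton):                   # Newton steps (2)
        step = F(x) / dF(x)
        x = x - step
        if abs(step) < tol:
            break
    return x

# Cubic of Example 9 below: F(x) = x**3 - p, x0 = 1, with the sample choice p = 0.49.
p = 0.49
root = hybrid_newton(lambda x: x**3 - p, lambda x: 3 * x**2, 1.0, ell=2 * (2 - p))
print(root, p ** (1 / 3))
```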
The above advantages extend to the local convergence of methods (2) and (3) (see Theorems 12 and 13). Numerical examples are also provided. The technique introduced here can be extended to other Newton-type iterative methods [1], [3], [4], [5].
2. Semilocal Convergence Analysis
The following semilocal convergence result for methods (2) and (3) can be found in [7]:
Theorem 1.
Let $F \colon D \subseteq X \to Y$ be a Fréchet-differentiable operator.
Assume there exist $x_0 \in D$ and constants $\eta \ge 0$, $\ell > 0$ such that conditions (4)–(7) hold,
and
$$\bar U(x_0, t^\star) \subseteq D,$$
where $t^\star$ is given by (9).
Then the sequences $\{x_n\}$, $\{y_n\}$ $(n \ge 0)$ are well defined, remain in $\bar U(x_0, t^\star)$ for all $n \ge 0$, and converge to a unique solution $x^\star$ of equation (1) in $\bar U(x_0, t^\star)$. Moreover, the following estimates hold for all $n \ge 0$:
$$\|x_n - x^\star\| \le \frac{\eta}{h}\,\frac{(2h)^{2^n}}{2^n}$$
and
$$\|y_n - x^\star\| \le \frac{\eta}{h}\,q_0^{\,n+1},$$
where
$$q_0 = 1 - \sqrt{1 - 2h}. \qquad (14)$$
Remark 2.
There is a plethora of estimates on the distances $\|x_{n+1} - x_n\|$, $\|x_n - x^\star\|$, $\|y_{n+1} - y_n\|$, $\|y_n - x^\star\|$ [4], [6], [7]. However, we decided to list only the estimates related to what we need in this study. In the case of Newton's method (2), we showed in [2], [3] the following improvement of Theorem 1.∎
Theorem 3.
[2], [3]. Let $F \colon D \subseteq X \to Y$ be a Fréchet-differentiable operator.
Assume there exist $x_0 \in D$ and constants $\eta \ge 0$, $\ell_0 > 0$, $\ell > 0$ such that conditions (4)–(6), (10), and (12) hold, together with a domain inclusion of the form (8) for the corresponding finer radius given in [2], [3].
Then the sequence $\{x_n\}$ $(n \ge 0)$ generated by Newton's method (2) is well defined, remains in the corresponding ball for all $n \ge 0$, and converges to a unique solution $x^\star$ of equation (1) there.
Moreover, the following estimates hold for all $n \ge 0$:
(15)
(16)
and
(17)
Remark 4.
Note also that (15) and (16) hold as strict inequalities if $\ell_0 < \ell$ [2]–[4]. Moreover, we have
$$h \le \tfrac{1}{2} \;\Longrightarrow\; h_1 \le 1, \qquad (18)$$
but not vice versa, unless $\ell_0 = \ell$. That is, under the same computational cost we managed to weaken (7), since in practice the computation of $\ell$ also requires the computation of $\ell_0$. Furthermore, in Example 9 we show that (12) holds but condition (7) is violated.∎
Concerning the semilocal convergence of the modified Newton’s method we show that (13) replaces condition (7).
Theorem 5.
Let $F \colon D \subseteq X \to Y$ be a Fréchet-differentiable operator.
Assume there exist $x_0 \in D$ and constants $\eta \ge 0$, $\ell_0 > 0$ such that conditions (4), (5), and (10) hold, $\bar U(x_0, R_0) \subseteq D$,
and
$$h_0 = \ell_0\,\eta < \tfrac{1}{2}, \qquad (19)$$
where
$$R_0 = \frac{1 - \sqrt{1 - 2h_0}}{\ell_0}.$$
Then the sequence $\{y_n\}$ $(n \ge 0)$ generated by the modified Newton's method (3) is well defined, remains in $\bar U(x_0, R_0)$ for all $n \ge 0$, and converges to a unique solution $x^\star$ of equation (1) in $\bar U(x_0, R_0)$.
Moreover, the following estimates hold for all $n \ge 0$:
$$\|y_{n+1} - y_n\| \le q^{\,n}\,\eta$$
and
$$\|y_n - x^\star\| \le \frac{q^{\,n}}{1 - q}\,\eta,$$
where
$$q = \ell_0\,R_0 = 1 - \sqrt{1 - 2h_0}.$$
Proof.
We shall show that the assumptions of the contraction mapping principle are satisfied for the operator
$$P(x) = x - F'(x_0)^{-1}\,F(x). \qquad (20)$$
Let $x, y \in \bar U(x_0, R_0)$. Then we can obtain the identity
$$P(x) - P(y) = \int_0^1 F'(x_0)^{-1}\big[F'(x_0) - F'\big(y + t\,(x - y)\big)\big](x - y)\,dt.$$
This identity together with (10) implies the estimate
$$\|P(x) - P(y)\| \le \ell_0\,R_0\,\|x - y\| = q\,\|x - y\|. \qquad (21)$$
Consequently, $P$ is a contraction operator on the ball $\bar U(x_0, R_0)$, since $q = \ell_0 R_0 < 1$. To complete the proof, it remains to show that $P$ maps $\bar U(x_0, R_0)$ into itself.
Let $x \in \bar U(x_0, R_0)$. Then by (20) we can obtain in turn
$$\|P(x) - x_0\| \le \|P(x) - P(x_0)\| + \|P(x_0) - x_0\| \le \tfrac{\ell_0}{2}\,\|x - x_0\|^2 + \eta \le \tfrac{\ell_0}{2}\,R_0^2 + \eta = R_0,$$
by the choice of $R_0$. That completes the proof of Theorem 5. ∎
Remark 6.
Note that by (21) the operator $P$ satisfies a Lipschitz condition with constant $q = \ell_0 R_0$ in the ball $\bar U(x_0, R_0)$. The modified Newton's method (3) thus converges at the rate of a geometric progression with quotient $q$.∎
The above analysis of method (3) relates to the simplest case. More subtle arguments (see, e.g., Kantorovich and Akilov [6]) show that Theorem 5 remains valid if the strict inequality sign in (19) is replaced by "$\le$". Therefore, from now on we can replace (19) by (13) in Theorem 5.
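For completeness, the choice of $R_0$ made in the proof of Theorem 5 can be checked directly from the definitions above: $R_0$ is the smaller root of the self-mapping inequality, and the contraction quotient stays below one exactly under (19). This is an elementary verification, not an additional hypothesis:

```latex
\tfrac{\ell_0}{2}\,R_0^{2}-R_0+\eta=0,
\qquad
q=\ell_0 R_0=1-\sqrt{1-2h_0}<1
\quad\text{whenever}\quad h_0=\ell_0\,\eta<\tfrac12 .
```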
Remark 7.
If $\ell_0 = \ell$, Theorems 3 and 5 reduce to Theorem 1. Otherwise these theorems constitute improvements of it; indeed, see (15)–(18) and compare the corresponding error bounds and convergence radii. Notice also that (7) (or (12)) implies (13), and if (7) holds as a strict inequality the quadratic convergence of method (2) is guaranteed. Moreover, $R_0$, which is given in closed form, can then in practice replace the Kantorovich radius $t^\star$. Furthermore, if the inequality in (13) is strict, then there exists a radius $R > R_0$ for which the conclusions of Theorem 5 still hold, and again such an $R$ can replace $R_0$.∎
Next we show that we can start with method (3) and, after a finite number of steps, continue with the faster method (2):
Proposition 8.
Under hypotheses (4)–(6), (10), (13), and (19), define
for a fixed integer
and
where $[\,\cdot\,]$ denotes the integer part of a real number.
Set
Moreover, assume:
(22)
Then the following hold:
(23)
(24)
(25)
(26)
Newton's method (2), starting at $y_N$, converges to a unique solution $x^{\star\star}$ of equation (1) in the corresponding ball,
and
where
and
Moreover, if the inclusion
(27)
holds, then $x^{\star\star} = x^\star$.
Note that parameter is independent of if , and the inclusion (27) holds if and only if
If (27) does not already hold for this choice, then by the definition of $N$ there exists an integer $\bar N > N$ such that (27) holds with $\bar N$ in place of $N$. In this case $\bar N$ should replace $N$ in the Proposition. Hypothesis (22) can now be dropped, since it follows from (19) and (27).
Proof.
Using Theorem 5, and the estimates
we obtain
and
by the choice of $N$.
It follows by Theorems 1 and 3, with $y_N$ (and the corresponding constants computed at $y_N$) replacing $x_0$ (and $\eta$, $\ell_0$, $\ell$), that there exists a unique
solution $x^{\star\star}$ of equation (1) in the corresponding ball.
Moreover, if inclusion (27) holds, then by the uniqueness of the solution in
that ball we deduce $x^{\star\star} = x^\star$.
That completes the proof of Proposition 8. ∎
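A natural closed-form candidate for a switching index of this type, consistent with the geometric error bound of Theorem 5 (offered here purely as an illustration, and not necessarily the exact choice made in Proposition 8), is obtained by asking for $\|y_N - x^\star\| \le \varepsilon$ before handing $y_N$ over to Newton's method; by the estimate of Theorem 5 it then suffices to take

```latex
N \;=\; 1+\left[\frac{\ln\!\big((1-q)\,\varepsilon/\eta\big)}{\ln q}\right],
\qquad q=\ell_0\,R_0,\qquad 0<\varepsilon<\frac{\eta}{1-q},
```

since then $q^{N}\eta/(1-q)\le\varepsilon$, where $[\,\cdot\,]$ again denotes the integer part.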
Let us provide an example.
Example 9.
Let $X = Y = \mathbb{R}$, $p \in \big(0, \tfrac12\big)$, $D = [p, 2 - p]$, and define the scalar function $F$ on $D$ by
$$F(x) = x^3 - p. \qquad (28)$$
Choose $x_0 = 1$. Using (5), (6), (10) and (28), we obtain
$$\eta = \tfrac{1}{3}\,(1 - p), \qquad \ell = 2\,(2 - p), \qquad \ell_0 = 3 - p. \qquad (29)$$
The Newton–Kantorovich hypothesis (7) becomes
$$h = \tfrac{2}{3}\,(2 - p)(1 - p) > \tfrac{1}{2} \qquad (30)$$
for all $p \in \big(0, \tfrac12\big)$. That is, according to Theorem 1 there is no guarantee that either method (2) or method (3), starting at $x_0 = 1$, converges to $x^\star = \sqrt[3]{p}$.
However, according to Theorem 3, condition (12) becomes
$$h_1 = \tfrac{1}{12}\,\Big(2(2-p) + 4(3-p) + \sqrt{4(2-p)^2 + 16(3-p)(2-p)}\,\Big)(1 - p) \le 1, \qquad (31)$$
provided that
$$p \in [\,0.450339002,\ 0.5\,). \qquad (32)$$
Using condition (13) we can do even better, since
$$h_0 = \tfrac{1}{3}\,(3 - p)(1 - p) \le \tfrac{1}{2} \qquad (33)$$
provided that
$$p \in \big[\,2 - \sqrt{2.5},\ 0.5\,\big) \approx [\,0.418861170,\ 0.5\,), \qquad (34)$$
which improves the choice of $p$ given by (32). However, only linear and not quadratic convergence is guaranteed.
Let us now choose any $p$ in the interval given by (34) but not in the interval given by (32). In particular, (7) does not hold, since for such $p$,
$$h = \tfrac{2}{3}\,(2 - p)(1 - p) > \tfrac{1}{2}.$$
However, (13) holds, since for such $p$:
$$h_0 = \tfrac{1}{3}\,(3 - p)(1 - p) \le \tfrac{1}{2}.$$
We get
$$R_0 = \frac{1 - \sqrt{1 - 2h_0}}{\ell_0}. \qquad (35)$$
Moreover, we obtain the quotient $q = \ell_0\,R_0 = 1 - \sqrt{1 - 2h_0} < 1$.
Finally, note that $x^\star = \sqrt[3]{p} \in \bar U(x_0, R_0)$.
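The comparison above is easy to reproduce numerically. The following snippet (Python) evaluates $h$, $h_1$ and $h_0$ for the values (29), with (12) and (13) taken in the forms stated earlier (thresholds $1$, $\tfrac12$ and $\tfrac12$, respectively); the sample values of $p$ are our own illustrative choices.

```python
import numpy as np

def conditions(p):
    """Constants (29) for F(x) = x**3 - p, x0 = 1, D = [p, 2 - p]."""
    eta  = (1 - p) / 3            # bound (5)
    ell  = 2 * (2 - p)            # Lipschitz constant in (6)
    ell0 = 3 - p                  # center-Lipschitz constant in (10)
    h  = ell * eta                                                       # (7):  h  <= 1/2
    h1 = (ell + 4 * ell0 + np.sqrt(ell**2 + 8 * ell0 * ell)) * eta / 4   # (12): h1 <= 1
    h0 = ell0 * eta                                                      # (13): h0 <= 1/2
    return h, h1, h0

for p in (0.49, 0.45, 0.42):
    h, h1, h0 = conditions(p)
    print(f"p={p}: (7) h={h:.4f}   (12) h1={h1:.4f}   (13) h0={h0:.4f}")
```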
Our motivation for introducing condition (10) instead of (6) for the convergence of the modified Newton's method (3) can also be seen in the following examples.
Example 10.
Let $X = Y = \mathbb{R}$, $D = [0, 1]$, and define the function $F$ on $D$ by
(36)
where the coefficients involved are real parameters and the exponent appearing in (36) is an integer. Then $F'$ is not Lipschitz on $D$. However, the center-Lipschitz condition (10) holds for a suitable constant $\ell_0$.
Indeed, we have
Example 11.
We consider the integral equation
$$x(s) = f(s) + \lambda \int_0^1 G(s, t)\, x(t)^{1 + 1/n}\, dt, \qquad n \in \mathbb{N}. \qquad (37)$$
Here, $f$ is a given continuous function satisfying $f(s) > 0$, $s \in [0, 1]$, $\lambda$ is a real number, and the kernel $G$ is continuous and positive in $[0, 1] \times [0, 1]$.
For example, when $G(s, t)$ is the Green kernel, the corresponding integral equation is equivalent to a two-point boundary value problem for the differential equation $x'' + \lambda\, x^{1 + 1/n} = 0$.
These types of problems have been considered in [2]–[7].
Equations of the form (37) generalize equations of the type
$$x(s) = \int_0^1 G(s, t)\, x(t)^{1 + 1/n}\, dt \qquad (38)$$
studied in [4], [6], [7].
Instead of (37) we can try to solve the equation $F(x) = 0$, where
$$F \colon \Omega \subseteq C[0, 1] \to C[0, 1], \qquad \Omega = \{x \in C[0, 1] \colon x(s) \ge 0,\ s \in [0, 1]\},$$
and
$$F(x)(s) = x(s) - f(s) - \lambda \int_0^1 G(s, t)\, x(t)^{1 + 1/n}\, dt.$$
The norm we consider is the max-norm.
The derivative $F'$ is given by
$$F'(x)\,y\,(s) = y(s) - \lambda\,\Big(1 + \tfrac1n\Big) \int_0^1 G(s, t)\, x(t)^{1/n}\, y(t)\, dt, \qquad y \in \Omega.$$
First of all, we notice that $F'$ does not satisfy a Lipschitz-type condition in $\Omega$. Let us consider, for instance, $G(s, t) = 1$ and $y(t) = 0$. Then $F'(y)\,z\,(s) = z(s)$ and
$$\|F'(x) - F'(y)\| = |\lambda|\,\Big(1 + \tfrac1n\Big) \int_0^1 x(t)^{1/n}\, dt.$$
If $F'$ were a Lipschitz function, then $\|F'(x) - F'(y)\| \le L_1\,\|x - y\|$,
or, equivalently, the inequality
$$\int_0^1 x(t)^{1/n}\, dt \le L_2\, \max_{s \in [0, 1]} x(s) \qquad (39)$$
would hold for all $x \in \Omega$ and for a constant $L_2$. But this is not true. Consider, for example, the functions
$$x_j(t) = \frac{t}{j}, \qquad j \ge 1, \quad t \in [0, 1].$$
If these are substituted into (39), we get
$$\frac{1}{j^{1/n}\,(1 + 1/n)} \le \frac{L_2}{j}
\;\Longleftrightarrow\;
j^{\,1 - 1/n} \le L_2\,\Big(1 + \tfrac1n\Big), \qquad j \ge 1.$$
This inequality is not true when $j \to \infty$.
Therefore, condition (6) fails in this case. However, condition (10) holds. To show this, let $x_0(t) = f(t)$ and $\gamma = \min_{s \in [0, 1]} f(s) > 0$. Then, for $v \in \Omega$,
$$\|F'(v) - F'(x_0)\|
\le |\lambda|\,\Big(1 + \tfrac1n\Big) \max_{s \in [0, 1]} \int_0^1 G(s, t)\,\big|v(t)^{1/n} - f(t)^{1/n}\big|\, dt
\le \frac{|\lambda|\,(1 + 1/n)}{\gamma^{(n-1)/n}} \max_{s \in [0, 1]} \int_0^1 G(s, t)\,|v(t) - f(t)|\, dt.$$
Hence,
$$\|F'(v) - F'(x_0)\| \le \ell_0\,\|v - x_0\|,$$
where $\ell_0 = \dfrac{|\lambda|\,(1 + 1/n)}{\gamma^{(n-1)/n}}\, N$ and $N = \max_{s \in [0, 1]} \int_0^1 G(s, t)\, dt$. Finally, note that condition (13) is satisfied for sufficiently small $|\lambda|$.
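For a numerical illustration, equation (37) can be discretized by the trapezoid rule and solved by the modified Newton method (3) with the derivative frozen at the starting point $x_0 = f$. The sketch below (Python) uses the illustrative choices $f \equiv 1$, $G \equiv 1$, $n = 2$ and a small $\lambda$; these particular values are ours, chosen so that the frozen-derivative iteration is a contraction.

```python
import numpy as np

m, n, lam = 50, 2, 0.25
t = np.linspace(0.0, 1.0, m)
w = np.full(m, 1.0 / (m - 1)); w[0] = w[-1] = 0.5 / (m - 1)   # trapezoid weights
f = np.ones(m)
G = np.ones((m, m))

def F(x):
    """Discretized operator: F(x)_i = x_i - f_i - lam * sum_j G_ij w_j x_j^(1+1/n)."""
    return x - f - lam * G @ (w * x**(1 + 1 / n))

def dF(x):
    """Discretized Fréchet derivative: I - lam*(1+1/n) * G_ij * w_j * x_j^(1/n)."""
    return np.eye(m) - lam * (1 + 1 / n) * G * (w * x**(1 / n))

x0 = f.copy()
J0 = dF(x0)                    # frozen derivative for the modified method (3)
x = x0.copy()
for _ in range(50):
    x = x - np.linalg.solve(J0, F(x))
print("residual:", np.max(np.abs(F(x))))
```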
3. Local Convergence Analysis
In order to cover the local convergence of methods (2) and (3), we first state the following theorem [7]:
Theorem 12.
Let $F \colon D \subseteq X \to Y$ be a Fréchet-differentiable operator. Assume there exist $x^\star \in D$ and a constant $\ell > 0$ such that:
$$F(x^\star) = 0, \qquad (40)$$
$$F'(x^\star)^{-1} \in L(Y, X), \qquad (41)$$
and
$$\|F'(x^\star)^{-1}\,(F'(x) - F'(y))\| \le \ell\,\|x - y\| \quad \text{for all } x, y \in D, \qquad (42)$$
where
(43)
Then
-
(a)
the sequence $\{x_n\}$ $(n \ge 0)$ generated by Newton's method (2) is well defined, remains in $U(x^\star, r^\star)$ for all $n \ge 0$ (with $r^\star$ as in (43)), and converges to $x^\star$, provided that $x_0 \in U(x^\star, r^\star)$ and
(44)
If the latter condition is replaced by
(45)
where
(46)
then
-
(b)
the sequence $\{y_n\}$ $(n \ge 0)$ generated by the modified Newton's method (3) is well defined, remains in $U(x^\star, r_1)$ for all $n \ge 0$ (with $r_1$ as in (46)), and converges to $x^\star$, provided that $y_0 = x_0 \in U(x^\star, r_1)$, and
(47)
Proof.
The proof of (a) can be found in [8], whereas the proof of (b) is a special case of part (b) of Theorem 13 below.
It follows from condition (42) that there exists $\ell_0 > 0$ such that:
$$\|F'(x^\star)^{-1}\,(F'(x) - F'(x^\star))\| \le \ell_0\,\|x - x^\star\| \quad \text{for all } x \in D. \qquad (48)$$
Then, using a combination of conditions (42) and (48) for method (2), and only condition (48) for method (3), we can show: ∎
Theorem 13.
Let $F \colon D \subseteq X \to Y$ be a Fréchet-differentiable operator.
Assume there exist $x^\star \in D$ and constants $\ell > 0$, $\ell_0 > 0$ such that conditions (40)–(42) and (48) hold,
and
(49)
where
(50)
-
(a)
Then the sequence $\{x_n\}$ $(n \ge 0)$ generated by Newton's method (2) is well defined, remains in $U(x^\star, r)$ for all $n \ge 0$ (with $r$ as in (50)), and converges to $x^\star$, provided that $x_0 \in U(x^\star, r)$ and
(51)
Using only the center-Lipschitz condition, if (49) is replaced by
(52)
and (48), where
(53)
then
-
(b)
the sequence $\{y_n\}$ $(n \ge 0)$ generated by the modified Newton's method (3) is well defined, remains in $U(x^\star, r_0)$ for all $n \ge 0$ (with $r_0$ as in (53)), and converges to $x^\star$, provided that $y_0 = x_0 \in U(x^\star, r_0)$,
and
(54)
Proof.
(a) The proof can be found in [2].
(b) Let $y_0 = x_0 \in U(x^\star, r_0)$. Then using (48) we get
$$\|F'(x^\star)^{-1}\,(F'(x_0) - F'(x^\star))\| \le \ell_0\,\|x_0 - x^\star\| < 1, \qquad (55)$$
by the choice of $r_0$. It follows from (55) and the Banach lemma on invertible operators [4], [6] that $F'(x_0)^{-1}$ exists, and
$$\|F'(x_0)^{-1}\,F'(x^\star)\| \le \frac{1}{1 - \ell_0\,\|x_0 - x^\star\|}. \qquad (56)$$
In particular, $y_1$ is well defined, since $x_0 \in U(x^\star, r_0)$ by hypothesis.
Let us assume $y_k \in U(x^\star, r_0)$ for $k = 0, 1, \ldots, n$. Then using (3), (48) and (53) we obtain in turn
$$\|y_{n+1} - x^\star\| = \Big\|F'(x_0)^{-1} \int_0^1 \big[F'(x_0) - F'\big(x^\star + t\,(y_n - x^\star)\big)\big](y_n - x^\star)\,dt\Big\| \qquad (57)$$
and
$$\|y_{n+1} - x^\star\| \le \frac{\ell_0\,\big(\|x_0 - x^\star\| + \tfrac12\,\|y_n - x^\star\|\big)}{1 - \ell_0\,\|x_0 - x^\star\|}\,\|y_n - x^\star\| < \|y_n - x^\star\|, \qquad (58)$$
which shows (54), and $y_{n+1} \in U(x^\star, r_0)$.
That completes the proof of Theorem 13. ∎
Remark 14.
In general,
$$\ell_0 \le \ell \qquad (59)$$
holds, and $\ell/\ell_0$ can be arbitrarily large. If $\ell_0 = \ell$, Theorem 13 reduces to Theorem 12. Otherwise Theorem 13 improves Theorem 12 under the same hypotheses for method (2), and under the same or less computational cost for method (3); finer estimates on the distances involved are obtained and the radius of convergence is enlarged. In particular, we have
(60)
(61)
Moreover, since
(62)
iterates from method (3) cannot be used to find the initial guess for the faster method (2).
Examples where estimate (59) holds as a strict inequality can be found in [2]–[4]. However, using Theorem 13 we can achieve this as follows: let $\delta \in (0, 1)$ be the desired rate of convergence for method (3). Then by (54) we have for all $n \ge 0$ that:
Choose:
(63)
then it can easily be seen that
and consequently, according to (a) of Theorem 13, we can set
In the case $\ell_0 = \ell$, according to (a) of Theorem 12, we can set
where
(simply set $\ell_0 = \ell$ in (63)). Note that
Finally we observe
if
which can hold, since $\ell/\ell_0$ can be arbitrarily large [2]. ∎
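To get a feel for the enlargement of the local convergence radius, one can compare, for the function of Example 9, the classical radius $2/(3\ell)$ (cf. [8]) with the enlarged radius $2/(2\ell_0 + \ell)$ from [2]–[4]; these two closed-form values are quoted from the cited works and are used here purely for illustration (Python), with the sample choice $p = 0.49$:

```python
p = 0.49
xstar = p ** (1 / 3)                       # solution of x**3 = p
# Scaled Lipschitz constants at x* for F(x) = x**3 - p on D = [p, 2 - p]:
#   |F'(x*)^{-1}(F'(x) - F'(y))|  <= ell  * |x - y|
#   |F'(x*)^{-1}(F'(x) - F'(x*))| <= ell0 * |x - x*|
ell  = 2 * (2 - p) / xstar**2
ell0 = (2 - p + xstar) / xstar**2
print("classical radius 2/(3*ell):       ", 2 / (3 * ell))
print("enlarged radius 2/(2*ell0 + ell): ", 2 / (2 * ell0 + ell))
```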
The ideas presented here can be extended to other Newton-type iterative methods [1], [3], [4], [5] along the same lines.
4. Conclusion
The Newton–Kantorovich hypothesis (7), famous for its simplicity and clarity, is the crucial sufficient semilocal convergence condition for both the quadratically convergent Newton's method (2) and the linearly convergent modified Newton's method (3) [6]. There exist simple numerical examples showing that both methods may converge even when condition (7) is violated [4], [6]. In fact, it is common practice, even when condition (7) is violated for a certain initial guess, to still use Newton's method (2) for a few iterates until condition (7) is satisfied. However, this approach is a shot in the dark, and it does not work in general [4], [7]. Here we have introduced an approach that works in case condition (7) is violated. First, we showed that condition (13) is a weaker sufficient convergence hypothesis for the modified Newton's method (3) than (7); that is, we extended the convergence region of method (3). Then, if (7) is violated but (13) holds, we start with the slower method (3) until we reach, after a finite number of steps, a certain iterate for which condition (7) also holds. We then continue with the faster method (2) using this iterate as the starting point.
References
- [1] Amat, S., Busquier, S. and Candela, V., A class of quasi-Newton generalized Steffensen methods on Banach spaces, J. Comput. Appl. Math., 149, pp. 397–408, 2002.
- [2] Argyros, I.K., On the Newton–Kantorovich hypothesis for solving equations, J. Comput. Appl. Math., 169, pp. 315–332, 2004.
- [3] Argyros, I.K., A unifying local-semilocal convergence analysis and applications for two-point Newton-like methods in Banach space, J. Math. Anal. Applic., 298, 2, pp. 374–397, 2004.
- [4] Argyros, I.K., Computational theory of iterative methods, Studies in Computational Mathematics, 15 (Editors: C.K. Chui and L. Wuytack), Elsevier Publ. Co., New York, 2007.
- [5] Gutierrez, J.M., Hernandez, M.A. and Salanova, M.A., Accessibility of solutions by Newton’s method, Intern. J. Comput. Math., 57, pp. 239–247, 1995.
- [6] Kantorovich, L.V. and Akilov, G.P., Functional Analysis in Normed Spaces, Pergamon Press, Oxford, 1982.
- [7] Krasnosel′skii, M.A., Vainikko, G.M., Zabreiko, P.P., Rutitskii, Ya.B. and Stetsenko, V.Ya., Approximate Solution of Operator Equations, Wolters-Noordhoff Publ., Groningen, 1972.
- [8] Rheinboldt, W.C., An adaptive continuation process for solving systems of nonlinear equations, Polish Academy of Science, Banach Ctr. Publ., 3, pp. 129–142, 1977.