<!DOCTYPE html>
<html lang="en">
<head>
<script>
  MathJax = {
    tex: {
      inlineMath: [['\\(','\\)']]
    }
  };
</script>
<script type="text/javascript" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js">
</script>
<meta name="generator" content="plasTeX" />
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>Global Convergence of the Armijo epsilon steepest descent Algorithm</title>
<link rel="stylesheet" href="/var/www/clients/client1/web1/web/files/jnaat-files/journals/1/articles/styles/theme-white.css" />
</head>

<body>

<div class="wrapper">

<div class="content">
<div class="content-wrapper">


<div class="main-text">


<div class="titlepage">
<h1>Global Convergence of the Armijo epsilon steepest descent Algorithm</h1>
<p class="authors">
<span class="author">Nour Eddine Rahali\(^{\ast }\), Nacera Djeghaba\(^{\S }\), Rachid Benzine\(^{\bullet }\)</span>
</p>
<p class="date">January 23, 2012.</p>
</div>
<p>\(^{\ast }\)Department of Mathematics, Souk Ahras University, Algeria, e-mail: <span class="ttfamily">rahali.noureddine@yahoo.fr</span>. </p>
<p>\(^{\S }\)Department of Mathematics, Badji Mokhtar University, B.P. 12, Annaba, Algeria,<br />e-mail: <span class="ttfamily">djeghaba.nacera@yahoo.fr</span>. </p>
<p>\(^{\bullet }\)Department of Mathematics, Badji Mokhtar University, B.P. 12, Annaba, Algeria,<br />e-mail: <span class="ttfamily">rachid.benzine@univ-annaba.org</span>. </p>

<div class="abstract"><p> In this article, we study the unconstrained minimization problem </p>
<div class="displaymath" id="a0000000002">
  \[  (P)\, \, \, \min \left\{  f(x):x\in \mathbb {R}^{n}\right\}  .  \]
</div>
<p> where \(f:\mathbb {R}^{n}\rightarrow \mathbb {R}\) is a continuously differentiable function. We introduce a new algorithm which accelerates the convergence of the steepest descent method. We further establish the global convergence of this algorithm in the case of Armijo inexact line search. </p>
<p><b class="bf">MSC.</b> 90C30; 65K05; 49M37 </p>
<p><b class="bf">Keywords.</b> Unconstrained optimization, global convergence, steepest descent algorithm, \(\varepsilon \)-algorithm, Armijo inexact line search. </p>
</div>
<h1 id="a0000000003">1 INTRODUCTION</h1>
<p>Consider the following unconstrained minimization problem: </p>
<div class="equation" id="a0000000004">
<p>
  <div class="equation_content">
    \begin{equation}  \min \left\{  f(x):x\in \mathbb {R}^{n}\right\}  \tag {1}\end{equation}
  </div>
  <span class="equation_label">1</span>
</p>
</div>
<p> where \(f:\mathbb {R}^{n}\rightarrow \mathbb {R}\) is a continuously differentiable function. Numerical methods for problem (1) are iterative. An initial point \(x_{1}\) should be given, and at the \(k\)-th iteration a new iterate point \(x_{k+1}\) is to be computed by using the information at the current iterate point \(x_{k}\) and those at the previous points. It is hoped that the sequence \(\left\{  x_{k}\right\}  _{k\in \mathbb {N}}\) generated will converge to the solution of (1). </p>
<p>Most numerical methods for unconstrained optimization can be classified into two groups, namely line search algorithms and trust region algorithms. A line search algorithm chooses or computes a search direction \(d_{k}\) at the \(k\)-th iteration, and it sets the next iterate point by </p>
<div class="displaymath" id="a0000000005">
  \[  x_{k+1}=x_{k}+\alpha _{k}d_{k} \]
</div>
<p> where \(d_{k}\) is a descent direction of \(f(x)\) at \(x_{k}\) and \(\alpha _{k}\) is a step size. The search direction \(d_{k}\) is generally required to satisfy </p>
<div class="displaymath" id="a0000000006">
  \[  \nabla f\left( x_{k}\right) ^{t}d_{k}{\lt}0,  \]
</div>
<p> which guarantees that \(d_{k}\) is a descent direction of \(f(x)\) at \(x_{k}\  \)([5], [28]). In line search methods, if the search direction \(d_{k}\) is given at the \(k\)-th iteration, then the next task is to find a step size \(\alpha _{k}\) along the search direction. The ideal line search rule is the exact one that satisfies</p>
<div class="displaymath" id="a0000000007">
  \[  f(x_{k}+\alpha _{k}d_{k})=\underset {\alpha {\gt}0}{\min }f(x_{k}+\alpha d_{k}).  \]
</div>
<p> In fact, the exact step size is difficult or even impossible to obtain in practical computation. Thus many researchers constructed inexact line search rules, such as the Armijo rule, the Goldstein rule and the Wolfe rule ([4], [26], [34]). In this article, we use Armijo’s line search ([4]), which may be summarized as follows: </p>
<p><i class="itshape">Armijo’s line search </i>([4]) </p>
<p>Armijo’s Rule is driven by two parameters, \(0{\lt}c{\lt}1\) and \(\beta {\gt}1,\) which respectively prevent the accepted step length from being too large or too small. (Typical values are \(c=0.2,\  \beta =2\).) Suppose that \(\nabla f\left( x_{k}\right) ^{t}d_{k}{\lt}0.\) Define the functions \(\varphi (\alpha )\) and \(\widehat{\varphi }(\alpha )\) as follows:</p>
<div class="displaymath" id="a0000000008">
  \[  \varphi (\alpha )=f(x_{k}+\alpha d_{k}),\  \  \alpha \geq 0  \]
</div>
<div class="displaymath" id="a0000000009">
  \[  \widehat{\varphi }(\alpha )=\varphi (0)+\alpha c\varphi ^{\prime }(0)=f(x_{k})+\alpha c\nabla f\left( x_{k}\right) ^{t}d_{k},\  \  \  \alpha \geq 0,\  0{\lt}c{\lt}1.  \]
</div>
<p>A step length \(\overline{\alpha }\) is considered to be acceptable, provided that </p>
<div class="displaymath" id="a0000000010">
  \[  \varphi \left( \overline{\alpha }\right) \leq \widehat{\varphi }(\overline{\alpha }).  \]
</div>
<p> However, in order to prevent \(\overline{\alpha }\) from being too small, Armijo Rule also requires that the following inequality holds </p>
<div class="displaymath" id="a0000000011">
  \[  \varphi \left( \beta \overline{\alpha }\right) {\gt}\widehat{\varphi }(\beta \overline{\alpha }),  \]
</div>
<p> which yields an acceptable range for \(\overline{\alpha }.\) </p>
<p>Frequently, Armijo Rule is adopted in the following manner. A fixed step length parameter \(\overline{\alpha }\) is chosen. If \(\varphi \left( \overline{\alpha }\right) \leq \widehat{\varphi }(\overline{\alpha }),\) then either \(\overline{\alpha }\) is itself selected as the step size, or \(\overline{\alpha }\) is sequentially doubled (assuming \(\beta =2\)) to find the largest integer \(t\geq 0\) for which </p>
<div class="displaymath" id="a0000000012">
  \[  \varphi \left( 2^{t}\overline{\alpha }\right) \leq \widehat{\varphi }(2^{t}\overline{\alpha }).  \]
</div>
<p> On the other hand, if \(\varphi \left( \overline{\alpha }\right) {\gt}\widehat{\varphi }(\overline{\alpha }),\) then \(\overline{\alpha }\) is sequentially halved to find the smallest integer \(t\geq 1\) for which </p>
<div class="displaymath" id="a0000000013">
  \[  \varphi \left( \tfrac {\overline{\alpha }}{2^{t}}\right) \leq \widehat{\varphi }(\tfrac {\overline{\alpha }}{2^{t}}).  \]
</div>
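<p>As an illustration, the doubling/halving procedure just described can be sketched in a few lines of Python. This is our own sketch, not code from the paper; the function and parameter names are ours, and the loops assume \(d_{k}\) is a genuine descent direction so that an acceptable step exists.</p>

```python
import numpy as np

def armijo_step(f, grad_fk, xk, dk, alpha_bar=1.0, c=0.2, beta=2.0):
    """Armijo rule with doubling/halving (illustrative sketch).

    f        -- objective function
    grad_fk  -- gradient of f at xk; dk must satisfy grad_fk @ dk < 0
    alpha_bar, c, beta -- fixed trial step and parameters 0 < c < 1, beta > 1
    """
    phi = lambda a: f(xk + a * dk)                      # phi(alpha)
    phi_hat = lambda a: f(xk) + a * c * (grad_fk @ dk)  # linear upper model
    alpha = alpha_bar
    if phi(alpha) <= phi_hat(alpha):
        # acceptable: keep doubling while the doubled step stays acceptable
        while phi(beta * alpha) <= phi_hat(beta * alpha):
            alpha *= beta
    else:
        # too large: halve until the Armijo inequality holds
        while phi(alpha) > phi_hat(alpha):
            alpha /= beta
    return alpha
```

For the quadratic \(f(x)=\Vert x\Vert ^{2}\) with \(x_{k}=1\) and \(d_{k}=-\nabla f(x_{k})\), both branches terminate at the same step \(\alpha =0.5\), whether the initial trial \(\overline{\alpha }\) is too large or too small.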
<p>The steepest descent method is one of the simplest and the most fundamental minimization methods for unconstrained optimization. Since it uses the negative gradient as its descent direction, it is also called the gradient method. </p>
<p>For many problems, the steepest descent method is very slow. Although the method usually works well in the early steps, as a stationary point is approached, it descends very slowly with zigzagging phenomena. There are some ways to overcome these difficulties of zigzagging by deflecting the gradient. Rather than moving along \(d=-\nabla f(x)\), we can move along \(d=-D\nabla f(x)\  \)([8], [9], [10], [12], [13], [14],&#160;[15], [17], [21], [22], [23], [24], [25], [31], [32], [33]) or along \(d=-\nabla f(x)+g\  \)([18], [19], [20],  [21], [27],  [29], [30]), where \(D\) is an appropriate matrix and \(g\) is an appropriate vector. </p>
<p>In [16] Benzine and Djeghaba provided another solution to this problem by accelerating the convergence of the gradient method. </p>
<p>They achieved this goal by designing a new algorithm, named the epsilon steepest descent algorithm, in which the Wynn epsilon algorithm ([1], [6], [35], [36]) and exact line searches played a prominent role. </p>
<p>In this work we accelerate the convergence of the gradient method by using the Florent Cordellier epsilon algorithm ([11]). </p>
<p>We study the global convergence of the new algorithm, named the Armijo epsilon steepest descent algorithm, by using Armijo inexact line searches ([4]). </p>
<h1 id="a0000000014">2 THE EPSILON ALGORITHM</h1>
<p>The Epsilon Algorithm is due to P. Wynn ([35], [36]). <br />Given a sequence \(\left\{  x_{k}\right\}  _{k\in \mathbb {N}}\) with \(x_{k}\in \mathbb {R}^{n}\), the coordinates of \(x_{k}\) will be denoted as follows: </p>
<div class="displaymath" id="a0000000015">
  \[  x_{k}=(x_{k}^{1},x_{k}^{2},...,x_{k}^{i},...,x_{k}^{n})\in \mathbb {R}^{n} \]
</div>
<p> For \(i\in \left\{  1,2,...,n\right\}  ,\) the Epsilon Algorithm calculates quantities with two indices, \(\varepsilon _{j}^{k,i}\  (j,k=0,1,...),\) as follows:</p>
<div class="displaymath" id="a0000000016">
  \begin{align} & \varepsilon _{-1}^{k,i} =0\, \, \, \, \, \, \, \, \, \, \, \, \, \, \, \, \, \, \, \, \, \, \varepsilon _{0}^{k,i}\, =x_{k}^{i}\quad \quad k=0,1,...\tag {2}\\ & \varepsilon _{j+1}^{k,i} =\varepsilon _{j-1}^{k+1,i}+\tfrac {1}{\varepsilon _{j}^{k+1,i}-\varepsilon _{j}^{k,i}}\quad \quad j,k=0,1,...\nonumber \end{align}
</div>
<p> For \(i\in \left\{  1,2,...,n\right\}  ,\  \)these numbers can be placed in an array as follows: </p>
<div class="table"  id="a0000000017">
  <table class="tabular">
  <tr>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p>\(\varepsilon _{-1}^{0,i}=0\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
  </tr>
  <tr>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{0}^{0,i}=x_{0}^{i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
  </tr>
  <tr>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p>\(\varepsilon _{-1}^{1,i}=0\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{1}^{0,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
  </tr>
  <tr>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{0}^{1,i}=x_{1}^{i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{2}^{0,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
  </tr>
  <tr>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p>\(\varepsilon _{-1}^{2,i}=0\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{1}^{1,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{3}^{0,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
  </tr>
  <tr>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{0}^{2,i}=x_{2}^{i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{2}^{1,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{4}^{0,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
  </tr>
  <tr>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p>\(\varepsilon _{-1}^{3,i}=0\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{1}^{2,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{3}^{1,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{5}^{0,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
  </tr>
  <tr>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{0}^{3,i}=x_{3}^{i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{2}^{2,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{4}^{1,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{6}^{0,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
  </tr>
  <tr>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p>\(\varepsilon _{-1}^{4,i}=0\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{1}^{3,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{3}^{2,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{5}^{1,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{7}^{0,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
  </tr>
  <tr>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{0}^{4,i}=x_{4}^{i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{2}^{3,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{4}^{2,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{6}^{1,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(-\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
  </tr>
  <tr>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p>\(\varepsilon _{-1}^{5,i}=0\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{1}^{4,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{3}^{3,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{5}^{2,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{7}^{1,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
  </tr>
  <tr>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{0}^{5,i}=x_{5}^{i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{2}^{4,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{4}^{3,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{6}^{2,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(-\)</p>

    </td>
  </tr>
  <tr>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p>\(\varepsilon _{-1}^{6,i}=0\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{1}^{5,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{3}^{4,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{5}^{3,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{7}^{2,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
  </tr>
  <tr>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{0}^{6,i}=x_{6}^{i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{2}^{5,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{4}^{4,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{6}^{3,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(-\)</p>

    </td>
  </tr>
  <tr>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p>\(\varepsilon _{-1}^{7,i}=0\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{1}^{6,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{3}^{5,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{5}^{4,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">
      <p> \(\varepsilon _{7}^{3,i}\) </p>

    </td>
    <td  style="text-align:left" 
        rowspan=""
        colspan="">&nbsp;</td>
  </tr>
</table> <figcaption>
  <span class="caption_title">Table</span> 
  <span class="caption_ref">1</span> 
  <span class="caption_text">Epsilon Algorithm</span> 
</figcaption>
</div>
<p>This array is called the \(\varepsilon \)-array. In this array the lower index denotes a column while the upper index denotes a diagonal. </p>
<p>For \(i\in \left\{  1,2,...,n\right\}  ,\  \)the Epsilon algorithm relates the numbers located at the four vertices of a rhombus: </p>
<div class="displaymath" id="a0000000018">
  \[ \begin{array}[c]{ccc}&  \varepsilon _{j}^{k,i} & \\ \varepsilon _{j-1}^{k+1,i} & &  \varepsilon _{j+1}^{k,i}\\ &  \varepsilon _{j}^{k+1,i} & \end{array}  \]
</div>
<p>To calculate the quantities \(\varepsilon _{j+1}^{k,i}\),  we need to know the numbers \(\varepsilon _{j-1}^{k+1,i},\)<br />\(\  \varepsilon _{j}^{k+1,i}\  \)and  \(\varepsilon _{j}^{k,i}.\) </p>
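<p>As a sketch, recursion (2) for a single coordinate sequence can be coded column by column. This is our own illustrative Python (the names are not from the paper), and it carries no safeguard against vanishing differences:</p>

```python
def epsilon_array(x, max_col):
    """Wynn's recursion (2) applied to a scalar sequence x = [x_0, x_1, ...].

    Returns a dict eps with eps[j][k] = eps_j^k; each new column is one
    entry shorter than the previous one.
    Minimal sketch: no protection against division by zero.
    """
    eps = {-1: [0.0] * (len(x) + 1), 0: list(x)}
    for j in range(max_col):
        prev, cur = eps[j - 1], eps[j]
        # eps_{j+1}^k = eps_{j-1}^{k+1} + 1 / (eps_j^{k+1} - eps_j^k)
        eps[j + 1] = [prev[k + 1] + 1.0 / (cur[k + 1] - cur[k])
                      for k in range(len(cur) - 1)]
    return eps
```

For partial sums of a geometric series, the column \(\varepsilon _{2}\) already recovers the limit exactly; this acceleration effect is what the algorithm of the next section exploits.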
<h1 id="a0000000019">3 THE ARMIJO EPSILON STEEPEST DESCENT ALGORITHM</h1>
<p>To construct our algorithm, we use the column \(\varepsilon _{2}^{k,i}\  (i=1,2,...,n).\) Given a sequence \(\left\{  x_{k}^{i}\right\}  _{k\in \mathbb {N}}\  (i=1,2,...,n),\) F. Cordellier ([11]) proposed another formula to calculate the epsilon algorithm of order 2. The quantities \(\varepsilon _{2}^{k,i}\) can be calculated as follows:</p>
<div class="equation" id="a0000000020">
<p>
  <div class="equation_content">
    \begin{equation}  \varepsilon _{2}^{k,i}=x_{k+1}^{i}+\left[ \tfrac {1}{x_{k+2}^{i}-x_{k+1}^{i}}-\tfrac {1}{x_{k+1}^{i}-x_{k}^{i}}\right] ^{-1},\  \  (i=1,2,...,n)\tag {3}\end{equation}
  </div>
  <span class="equation_label">3</span>
</p>
</div>
<p> To calculate \(\varepsilon _{2}^{k,i},\) we use the elements \(x_{k}^{i},\) \(x_{k+1}^{i}\) and \(x_{k+2}^{i}\  (i=1,2,...,n).\) Numerical calculations ([11]) showed that the epsilon algorithm of order 2 with the Cordellier formula (3) is more stable than the Wynn epsilon algorithm (2). </p>
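<p>Formula (3) translates directly into componentwise vector code. The following is an illustrative sketch in Python (the function name is ours); it assumes all the differences and the bracketed term are nonzero:</p>

```python
import numpy as np

def eps2_cordellier(r, s, t):
    """Cordellier's formula (3): componentwise epsilon_2 built from three
    successive iterates r = x_k, s = x_{k+1}, t = x_{k+2} (numpy arrays).
    Assumes t - s, s - r and the bracketed difference are all nonzero."""
    return s + 1.0 / (1.0 / (t - s) - 1.0 / (s - r))
```

On a componentwise geometric sequence such as \(1,\  1.5,\  1.75,\  ...\) (limit 2), the formula returns the limit exactly, illustrating the acceleration it provides.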
<p>We are now in a position to introduce our new algorithm: the Armijo epsilon steepest descent algorithm. </p>
<p><b class="bfseries">The Armijo epsilon steepest descent algorithm</b> <br /><b class="bfseries">Initialization step: </b>Choose an initial point \(x_{0}\in \mathbb {R}^{n}\). The coordinates of \(x_{0}\) will be denoted as follows: </p>
<div class="displaymath" id="a0000000021">
  \[  x_{0}=(x_{0}^{1},x_{0}^{2},...,x_{0}^{i},...,x_{0}^{n})\in \mathbb {R}^{n} \]
</div>
<p> Let \(k=0\) and go to the Main Step.</p>
<p><br /><b class="bfseries">Main Step:</b> Starting with the vector \(x_{k},\)</p>
<div class="displaymath" id="a0000000022">
  \[  x_{k}=(x_{k}^{1},x_{k}^{2},...,x_{k}^{i},...,x_{k}^{n}).  \]
</div>
<p> If \(\left\Vert \nabla f(x_{k})\right\Vert =0,\)  stop. Otherwise, set \(r_{k}=x_{k}\) and compute the vectors \(s_{k}\) and \(t_{k}\) by applying the steepest descent algorithm twice, with Armijo inexact line search: </p>
<div class="displaymath" id="a0000000023">
  \[  s_{k}=r_{k}-\lambda _{k}\nabla f(r_{k}),  \]
</div>
<p> and</p>
<div class="displaymath" id="a0000000024">
  \[  t_{k}=s_{k}-\beta _{k}\nabla f(s_{k}),  \]
</div>
<p> where \(\lambda _{k}\) and \(\beta _{k}\) are positive scalars obtained by the Armijo inexact line search.<br />If </p>
<div class="displaymath" id="a0000000025">
  \[  s_{k}^{i}-r_{k}^{i}\neq 0,\  \  t_{k}^{i}-s_{k}^{i}\neq 0\  \  \mathrm{and}\  \  \tfrac {1}{t_{k}^{i}-s_{k}^{i}}-\tfrac {1}{s_{k}^{i}-r_{k}^{i}}\neq 0,\  \  i=1,...,n,  \]
</div>
<p> then let </p>
<div class="displaymath" id="a0000000026">
  \[  \varepsilon _{2}^{k,i}=s_{k}^{i}+\left[ \tfrac {1}{t_{k}^{i}-s_{k}^{i}}-\tfrac {1}{s_{k}^{i}-r_{k}^{i}}\right] ^{-1},\  \  \  \  \  \  i=1,...,n,  \]
</div>
<p> and </p>
<div class="displaymath" id="a0000000027">
  \[  \varepsilon _{2}^{k}=\left( \varepsilon _{2}^{k,1},...,\varepsilon _{2}^{k,i},...,\varepsilon _{2}^{k,n}\right) .  \]
</div>
<p> If  \(f\left( \varepsilon _{2}^{k}\right) {\lt}f(t_{k}),\) let \(x_{k+1}=\varepsilon _{2}^{k}.\) Replace \(k\) by \(k+1\) and go to the Main Step.<br />If  \(f\left( \varepsilon _{2}^{k}\right) \geq f(t_{k})\) or if </p>
<div class="displaymath" id="a0000000028">
  \[  s_{k}^{i_{0}}-r_{k}^{i_{0}}=0\  \text{\  or }\  t_{k}^{i_{0}}-s_{k}^{i_{0}}=0\  \  \text{\  or }\  \tfrac {1}{t_{k}^{i_{0}}-s_{k}^{i_{0}}}-\tfrac {1}{s_{k}^{i_{0}}-r_{k}^{i_{0}}}=0\  \text{ for some }i_{0}\in \left\{  1,...,n\right\}  ,  \]
</div>
<p> let \(x_{k+1}=t_{k}.\) Replace \(k\) by \(k+1\) and go to the Main Step. </p>
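<p>A minimal sketch of one pass of the Main Step is given below, under our own naming: <span class="ttfamily">armijo</span> stands for any routine returning an Armijo step size, and the degenerate branches fall back to \(t_{k}\) as in the description above.</p>

```python
import numpy as np

def esd_step(f, grad, x, armijo):
    """One Main Step of the Armijo epsilon steepest descent algorithm
    (illustrative sketch). armijo(f, g, x, d) must return an Armijo
    step size for the descent direction d at x."""
    r = x
    g_r = grad(r)
    s = r - armijo(f, g_r, r, -g_r) * g_r          # first steepest descent step
    g_s = grad(s)
    t = s - armijo(f, g_s, s, -g_s) * g_s          # second steepest descent step
    d1, d2 = s - r, t - s
    # Cordellier formula (3), componentwise, only when every term is defined
    if np.all(d1 != 0) and np.all(d2 != 0):
        denom = 1.0 / d2 - 1.0 / d1
        if np.all(denom != 0):
            eps2 = s + 1.0 / denom
            if f(eps2) < f(t):
                return eps2            # accelerated iterate x_{k+1}
    return t                           # fallback iterate x_{k+1}
```

On a one-dimensional quadratic with a fixed step size standing in for the line search, the two gradient steps form a geometric sequence, so the \(\varepsilon _{2}\) correction lands on the minimizer in a single pass.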
<p><div class="remark_thmwrapper " id="a0000000029">
  <div class="remark_thmheading">
    <span class="remark_thmcaption">
    Remark
    </span>
    <span class="remark_thmlabel">1</span>
  </div>
  <div class="remark_thmcontent">
  <p>According to the Algorithm, the vectors \(s_{k}\) and \(t_{k}\) are obtained by applying the steepest descent method twice, with Armijo inexact line search. Then we have</p>
<div class="displaymath" id="a0000000030">
  \[  f(s_{k}){\lt}f(r_{k})=f(x_{k})  \]
</div>
<p> and</p>
<div class="displaymath" id="a0000000031">
  \[  f(t_{k}){\lt}f(s_{k})  \]
</div>
<p> Now, by considering the Algorithm, if the calculation of \(\varepsilon _{2}^{k}\) is possible, two cases arise:<br /><b class="bfseries">a) </b> \(f\left( \varepsilon _{2}^{k}\right) {\lt}f\left( t_{k}\right) .\) Then we have </p>
<div class="displaymath" id="a0000000032">
  \[  f\left( x_{k+1}\right) =f\left( \varepsilon _{2}^{k}\right) {\lt}f\left( x_{k}\right)  \]
</div>
<p> <b class="bfseries">b) </b>\(f\left( \varepsilon _{2}^{k}\right) \geq f\left( t_{k}\right) ,\) or the calculation of \(\varepsilon _{2}^{k}\) is not possible. In this case, according to the algorithm, we have</p>
<div class="displaymath" id="a0000000033">
  \[  f\left( x_{k+1}\right) =f(t_{k}){\lt}f\left( x_{k}\right)  \]
</div>
<p> In conclusion, the Armijo epsilon steepest descent Algorithm guarantees </p>
<div class="equation" id="a0000000034">
<p>
  <div class="equation_content">
    \begin{equation}  f\left( x_{k+1}\right) <f\left( x_{k}\right) ,\  \  k=0,1,2,...\tag {4}\end{equation}
  </div>
  <span class="equation_label">4</span>
</p>
</div>
<p> <span class="qed">□</span></p>

  </div>
</div> </p>
<h1 id="a0000000035">4 Global Convergence <br />of the Armijo epsilon steepest descent algorithm</h1>
<p>The foregoing preparatory results enable us to establish the following theorem. </p>
<p><div class="theorem_thmwrapper " id="a0000000036">
  <div class="theorem_thmheading">
    <span class="theorem_thmcaption">
    Theorem
    </span>
    <span class="theorem_thmlabel">2</span>
  </div>
  <div class="theorem_thmcontent">
  <p>For the unconstrained minimization problem <span class="rm">(1)</span>, let \(x_{0}\) be a starting point of the Armijo epsilon steepest descent Algorithm, and assume that the following assumptions hold: </p>
<ul class="itemize">
  <li><p>The function \(f\) is continuously differentiable in a neighborhood \(\mathcal{L}\) of the level set \(\delta (x_{0})=\left\{  x\in \mathbb {R} ^{n}:\  f(x)\leq f(x_{0})\right\}  .\) </p>
</li>
  <li><p>The gradient of \(f\) is Lipschitzian in \(\mathcal{L}\), i.e., there exists \(K{\gt}0\) such that </p>
<div class="displaymath" id="a0000000037">
  \[  \left\Vert \nabla f(x)-\nabla f(y)\right\Vert \leq K\left\Vert x-y\right\Vert ,\  \  \  \  \forall (x,y)\in \mathcal{L}\times \mathcal{L} \]
</div>
</li>
</ul>
<p>Then the sequence \(\left\{  x_{k}\right\}  _{k\in \mathbb {N}}\) generated by the Armijo epsilon steepest descent algorithm satisfies one of the following properties: either \(\nabla f(x_{k_{0}})=0\) for some \(k_{0}\in \mathbb {N}\), or \(\left\Vert \nabla f(x_{k})\right\Vert \underset {k\rightarrow \infty }{\longrightarrow }0.\) </p>

  </div>
</div> </p>
<p><div class="proof_wrapper" id="a0000000038">
  <div class="proof_heading">
    <span class="proof_caption">
    Proof
    </span>
    <span class="expand-proof">▼</span>
  </div>
  <div class="proof_content">
  
  </div>
</div>Suppose that an infinite sequence \(\left\{  x_{k}\right\}  _{k\in \mathbb {N}}\) is generated by the Armijo epsilon steepest descent Algorithm. In the main step of the Algorithm, the vectors \(s_{k}\) and \(t_{k}\) are obtained by applying the steepest descent method twice, with Armijo inexact line search. The vectors \(s_{k}\) and \(t_{k}\) are the successors of \(x_{k}\), and are used to calculate \(x_{k+1}\) (see the main step of the algorithm). Note that </p>
<div class="displaymath" id="a0000000039">
  \[  s_{k}=x_{k}-\lambda _{k}\nabla f(x_{k}),  \]
</div>
<p> where \(\lambda _{k}=\tfrac {\overline{\alpha }}{2^{t}}\) (\(\overline{\alpha }{\gt}0\) is a constant defined in the Armijo inexact line search) satisfies the following Armijo criterion</p>
<div class="equation" id="a0000000040">
<p>
  <div class="equation_content">
    \begin{equation}  \varphi \left( \tfrac {\overline{\alpha }}{2^{t}}\right) \leq \widehat{\varphi }\left( \tfrac {\overline{\alpha }}{2^{t}}\right) \tag {5}\end{equation}
  </div>
  <span class="equation_label">5</span>
</p>
</div>
<p> with</p>
<div class="equation" id="a0000000041">
<p>
  <div class="equation_content">
    \begin{equation}  \varphi (\tfrac {\overline{\alpha }}{2^{t}})=f\left( x_{k}-\tfrac {\overline{\alpha }}{2^{t}}\nabla f\left( x_{k}\right) \right) ,\tag {6}\end{equation}
  </div>
  <span class="equation_label">6</span>
</p>
</div>
<p> and </p>
<div class="equation" id="a0000000042">
<p>
  <div class="equation_content">
    \begin{equation}  \widehat{\varphi }\left( \tfrac {\overline{\alpha }}{2^{t}}\right) =f(x_{k})-\tfrac {\overline{\alpha }}{2^{t}}c\nabla f\left( x_{k}\right) ^{T}\nabla f\left( x_{k}\right) ,\tag {7}\end{equation}
  </div>
  <span class="equation_label">7</span>
</p>
</div>
<p> where \(c\in \left] 0,1\right[\), \(\overline{\alpha }{\gt}0\), and \(t\in \mathbb {N}\) is the smallest integer such that the inequality (5) is satisfied. Taking into account the relations (5), (6) and (7), we obtain </p>
<div class="displaymath" id="a0000000043">
  \begin{align}  f\left( s_{k}\right) &  =f\left[ x_{k}-\tfrac {\overline{\alpha }}{2^{t}}\nabla f\left( x_{k}\right) \right] \tag {8}\\ &  \leq f(x_{k})-c\tfrac {\overline{\alpha }}{2^{t}}\left\Vert \nabla f\left( x_{k}\right) \right\Vert ^{2}.\nonumber \end{align}
</div>
<p> (8) implies </p>
<div class="equation" id="a0000000044">
<p>
  <div class="equation_content">
    \begin{equation}  f\left( s_{k}\right) -f(x_{k})\leq -c\tfrac {\overline{\alpha }}{2^{t}}\left\Vert \nabla f\left( x_{k}\right) \right\Vert ^{2}.\tag {9}\end{equation}
  </div>
  <span class="equation_label">9</span>
</p>
</div>
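<p>In practice, the smallest exponent \(t\) in (5) is found by successive halving of the trial step. A minimal Python sketch, with illustrative names (\(\varphi \) and \(\widehat{\varphi }\) are evaluated inline):</p>

```python
def armijo_exponent(f, x, g, alpha_bar=1.0, c=0.5, t_max=60):
    """Smallest integer t >= 0 such that
    f(x - (alpha_bar/2^t) g) <= f(x) - c (alpha_bar/2^t) ||g||^2, cf. (5)-(7)."""
    f_x = f(x)
    gg = sum(gi * gi for gi in g)             # ||grad f(x)||^2
    for t in range(t_max + 1):
        step = alpha_bar / 2 ** t             # trial step alpha_bar / 2^t
        trial = [xi - step * gi for xi, gi in zip(x, g)]
        if f(trial) <= f_x - c * step * gg:   # Armijo criterion (5)
            return t
    raise RuntimeError("no Armijo exponent found up to t_max")
```

<p>For example, for \(f(x)=\Vert x\Vert ^{2}\) at \(x=(1,1)\) with \(\overline{\alpha }=1\) and \(c=\tfrac {1}{2}\), the exponent \(t=0\) is rejected and \(t=1\) is accepted, i.e. \(\lambda _{k}=\tfrac {1}{2}\).</p>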
<p> On the other hand, by using the mean value theorem, we have </p>
<div class="equation" id="a0000000045">
<p>
  <div class="equation_content">
    \begin{equation}  f\left( s_{k}\right) -f(x_{k})=\left( s_{k}-x_{k}\right) ^{T}\nabla f\left( \widetilde{x}\right) ;\  \  \  \widetilde{x}=\lambda s_{k}+(1-\lambda )x_{k},\  \  \  \lambda \in ]0,1[.\tag {10}\end{equation}
  </div>
  <span class="equation_label">10</span>
</p>
</div>
<p> Inasmuch as</p>
<div class="equation" id="a0000000046">
<p>
  <div class="equation_content">
    \begin{equation}  s_{k}=x_{k}-\alpha _{k}\nabla f\left( x_{k}\right) ,\  \  \alpha _{k}=\tfrac {\overline{\alpha }}{2^{t}},\tag {11}\end{equation}
  </div>
  <span class="equation_label">11</span>
</p>
</div>
<p> then</p>
<div class="displaymath" id="a0000000047">
  \begin{align}  f\left( s_{k}\right) -f(x_{k}) &  =-\alpha _{k}\nabla f\left( x_{k}\right) ^{T}\nabla f\left( \widetilde{x}\right) \tag {12}\\ &  =-\alpha _{k}\nabla f\left( x_{k}\right) ^{T}\left[ \nabla f\left( x_{k}\right) -\nabla f\left( x_{k}\right) +\nabla f\left( \widetilde{x}\right) \right] \nonumber \\ &  =-\alpha _{k}\left\Vert \nabla f\left( x_{k}\right) \right\Vert ^{2}+\alpha _{k}\nabla f\left( x_{k}\right) ^{T}\left[ \nabla f\left( x_{k}\right) -\nabla f\left( \widetilde{x}\right) \right] .\nonumber \end{align}
</div>
<p> where \(\widetilde{x}=\lambda s_{k}+(1-\lambda )x_{k},\  \lambda \in ]0,1[\). Since the sequence \(\left\{  f(x_{k})\right\}  _{k\in \mathbb {N}}\) is decreasing, we have </p>
<div class="displaymath" id="a0000000048">
  \[  x_{k}\in \delta (x_{0}),\  \  k=0,1,...  \]
</div>
<p>Now, by using the Cauchy–Schwarz inequality and the fact that the gradient of \(f\) is Lipschitzian with constant \(K\), we have</p>
<div class="displaymath" id="a0000000049">
  \begin{align}  f\left( s_{k}\right) -f(x_{k}) &  \leq -\alpha _{k}\left\Vert \nabla f\left( x_{k}\right) \right\Vert ^{2}+\alpha _{k}\left\Vert \nabla f\left( x_{k}\right) \right\Vert \left\Vert \nabla f\left( x_{k}\right) -\nabla f\left( \widetilde{x}\right) \right\Vert \tag {13}\\ &  \leq -\alpha _{k}\left\Vert \nabla f\left( x_{k}\right) \right\Vert ^{2}+\alpha _{k}\left\Vert \nabla f\left( x_{k}\right) \right\Vert K\left\Vert x_{k}-\widetilde{x}\right\Vert \nonumber \end{align}
</div>
<p> Since \(\widetilde{x}=\lambda s_{k}+(1-\lambda )x_{k},\  \  \  \lambda \in ]0,1[,\) we have </p>
<div class="equation" id="a0000000050">
<p>
  <div class="equation_content">
    \begin{equation}  \left\Vert x_{k}-\widetilde{x}\right\Vert =\left\Vert \lambda (x_{k}-s_{k})\right\Vert =\left\vert \lambda \right\vert \left\Vert x_{k}-s_{k}\right\Vert <\left\Vert x_{k}-s_{k}\right\Vert .\tag {14}\end{equation}
  </div>
  <span class="equation_label">14</span>
</p>
</div>
<p> (13) and (14) imply</p>
<div class="displaymath" id="a0000000051">
  \[  f\left( s_{k}\right) -f(x_{k})\leq -\alpha _{k}\left\Vert \nabla f\left( x_{k}\right) \right\Vert ^{2}+\alpha _{k}\left\Vert \nabla f\left( x_{k}\right) \right\Vert K\left\Vert s_{k}-x_{k}\right\Vert  \]
</div>
<p> Note that \(s_{k}-x_{k}=x_{k}-\alpha _{k}\nabla f\left( x_{k}\right) -x_{k}= \) \(-\alpha _{k}\nabla f\left( x_{k}\right) ,\) \(\alpha _{k}=\tfrac {\overline{\alpha }}{2^{t}},\) so </p>
<div class="equation" id="a0000000052">
<p>
  <div class="equation_content">
    \begin{equation}  f\left( s_{k}\right) -f(x_{k})\leq -\alpha _{k}\left\Vert \nabla f\left( x_{k}\right) \right\Vert ^{2}+\alpha _{k}\left\Vert \nabla f\left( x_{k}\right) \right\Vert K\alpha _{k}\left\Vert \nabla f\left( x_{k}\right) \right\Vert .\tag {15}\end{equation}
  </div>
  <span class="equation_label">15</span>
</p>
</div>
<p> Hence, we obtain </p>
<div class="equation" id="a0000000053">
<p>
  <div class="equation_content">
    \begin{equation}  f\left( s_{k}\right) -f(x_{k})\leq -\tfrac {\overline{\alpha }}{2^{t}}\left\Vert \nabla f\left( x_{k}\right) \right\Vert ^{2}\left[ 1-K\tfrac {\overline{\alpha }}{2^{t}}\right] .\tag {16}\end{equation}
  </div>
  <span class="equation_label">16</span>
</p>
</div>
<p> Now choose \(t{\gt}0\) to be the smallest integer such that the following relation holds</p>
<div class="equation" id="a0000000054">
<p>
  <div class="equation_content">
    \begin{equation}  2^{t}\geq \tfrac {K\overline{\alpha }}{1-c}\tag {17}\end{equation}
  </div>
  <span class="equation_label">17</span>
</p>
</div>
<p> (17) implies</p>
<div class="equation" id="a0000000055">
<p>
  <div class="equation_content">
    \begin{equation}  1-K\tfrac {\overline{\alpha }}{2^{t}}\geq c.\tag {18}\end{equation}
  </div>
  <span class="equation_label">18</span>
</p>
</div>
<p> Since \(t\) is the smallest such integer, we also have</p>
<div class="equation" id="a0000000056">
<p>
  <div class="equation_content">
    \begin{equation}  1-K\tfrac {\overline{\alpha }}{2^{t-1}} < c. \tag {19}\end{equation}
  </div>
  <span class="equation_label">19</span>
</p>
</div>
<p> (18) implies</p>
<div class="equation" id="a0000000057">
<p>
  <div class="equation_content">
    \begin{equation}  -\tfrac {\overline{\alpha }}{2^{t}}\left\Vert \nabla f\left( x_{k}\right) \right\Vert ^{2}\left[ 1-K\tfrac {\overline{\alpha }}{2^{t}}\right] \leq -\tfrac {\overline{\alpha }}{2^{t}}c\left\Vert \nabla f\left( x_{k}\right) \right\Vert ^{2},\  \  \  \  c\in ]0,1[,\tag {20}\end{equation}
  </div>
  <span class="equation_label">20</span>
</p>
</div>
<p> and (19) gives</p>
<div class="displaymath" id="a0000000058">
  \[  -K\tfrac {\overline{\alpha }}{2^{t-1}}{\lt}c-1\Rightarrow K\tfrac {\overline{\alpha }}{2^{t-1}}{\gt}1-c.  \]
</div>
<p> Hence</p>
<div class="equation" id="a0000000059">
<p>
  <div class="equation_content">
    \begin{equation}  -\tfrac {\overline{\alpha }}{2^{t}}c<-\tfrac {c(1-c)}{2K}.\tag {21}\end{equation}
  </div>
  <span class="equation_label">21</span>
</p>
</div>
<p> Note that the choice of \(t{\gt}0\) satisfying (17) implies that inequality (9) holds. Therefore, taking into account the relations (16), (18), (19), (20) and (21), we obtain </p>
<div class="equation" id="a0000000060">
<p>
  <div class="equation_content">
    \begin{equation}  f\left( s_{k}\right) -f(x_{k})\leq -\tfrac {c(1-c)}{2K}\left\Vert \nabla f\left( x_{k}\right) \right\Vert ^{2}.\tag {22}\end{equation}
  </div>
  <span class="equation_label">22</span>
</p>
</div>
<p> Denote \(G=-\tfrac {c(1-c)}{2K}.\) Clearly \(G{\lt}0\), and (22) gives </p>
<div class="equation" id="a0000000061">
<p>
  <div class="equation_content">
    \begin{equation}  f\left( s_{k}\right) -f(x_{k})\leq G\left\Vert \nabla f\left( x_{k}\right) \right\Vert ^{2}.\tag {23}\end{equation}
  </div>
  <span class="equation_label">23</span>
</p>
</div>
<p> Consider now \(t_{k}\), the successor of \(s_{k}\), \(t_{k}=s_{k}-\beta _{k}\nabla f(s_{k})\). Proceeding in the same way with \(t_{k}\), we obtain</p>
<div class="equation" id="a0000000062">
<p>
  <div class="equation_content">
    \begin{equation}  f\left( t_{k}\right) -f(s_{k})\leq G\left\Vert \nabla f\left( s_{k}\right) \right\Vert ^{2}.\tag {24}\end{equation}
  </div>
  <span class="equation_label">24</span>
</p>
</div>
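<p>The guaranteed decrease (22) can be checked numerically. A minimal Python sketch, assuming a quadratic test function \(f(x)=\tfrac {1}{2}x^{T}Ax\), whose gradient \(Ax\) is Lipschitzian with constant \(K=\lambda _{\max }(A)\); here \(\overline{\alpha }=1\) is large enough that the accepted Armijo step is at least \((1-c)/(2K)\), as required by the derivation of (22):</p>

```python
import numpy as np

def armijo_sd_step(f, grad, x, alpha_bar=1.0, c=0.25):
    """One steepest descent step with Armijo backtracking; returns the new point."""
    g = grad(x)
    alpha = alpha_bar
    while f(x - alpha * g) > f(x) - c * alpha * g.dot(g):
        alpha *= 0.5
    return x - alpha * g

# Quadratic f(x) = (1/2) x^T A x; its gradient A x is Lipschitz with K = lambda_max(A).
A = np.diag([1.0, 4.0, 9.0])
K = 9.0
f = lambda x: 0.5 * x.dot(A.dot(x))
grad = lambda x: A.dot(x)

c = 0.25
x = np.array([1.0, 1.0, 1.0])
s = armijo_sd_step(f, grad, x, c=c)
# decrease at least as strong as the bound (22), with G = -c(1-c)/(2K)
G = -c * (1 - c) / (2 * K)
assert f(s) - f(x) <= G * grad(x).dot(grad(x)) + 1e-12
```

<p>The same check applied at \(s_{k}\) verifies (24).</p>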
<p> We now prove that</p>
<div class="displaymath" id="a0000000063">
  \[  \underset {k\rightarrow \infty }{\lim }\left\Vert \nabla f\left( x_{k}\right) \right\Vert =0.  \]
</div>
<p> To this end, consider</p>
<div class="displaymath" id="a0000000064">
  \[  f\left( x_{k+1}\right) -f(x_{k})=f\left( \varepsilon _{2}^{k}\right) -f\left( t_{k}\right) +f\left( t_{k}\right) -f(s_{k})+f(s_{k})-f(x_{k})  \]
</div>
<p> Note that in case a) (see (4))</p>
<div class="displaymath" id="a0000000065">
  \[  f\left( \varepsilon _{2}^{k}\right) -f\left( t_{k}\right) {\lt}0,  \]
</div>
<p> while in case b) \(x_{k+1}=t_{k}\) and this difference vanishes; in both cases</p>
<div class="displaymath" id="a0000000066">
  \[  f\left( x_{k+1}\right) -f(x_{k})\leq f\left( t_{k}\right) -f(s_{k})+f(s_{k})-f(x_{k})  \]
</div>
<p> The relations (23) and (24) imply</p>
<div class="equation" id="a0000000067">
<p>
  <div class="equation_content">
    \begin{equation}  f\left( x_{k+1}\right) -f(x_{k}) \leq G\left( \left\Vert \nabla f\left( x_{k}\right) \right\Vert ^{2}+\left\Vert \nabla f\left( s_{k}\right) \right\Vert ^{2}\right) .\tag {25}\end{equation}
  </div>
  <span class="equation_label">25</span>
</p>
</div>
<p> Note that \(\left\{  f(x_{k})\right\}  _{k\in \mathbb {N}}\) is a monotone decreasing sequence, and so it has a limit (otherwise \(\inf f(x)=-\infty \)). Hence, the relation (25) implies</p>
<div class="equation" id="a0000000068">
<p>
  <div class="equation_content">
    \begin{equation}  \left\Vert \nabla f\left( x_{k}\right) \right\Vert ^{2}+\left\Vert \nabla f\left( s_{k}\right) \right\Vert ^{2}\leq \tfrac {1}{G}\left[ f\left( x_{k+1}\right) -f(x_{k})\right] .\tag {26}\end{equation}
  </div>
  <span class="equation_label">26</span>
</p>
</div>
<p> Taking \(\overline{\lim }\) as \(k\rightarrow \infty ,\) we get</p>
<div class="equation" id="a0000000069">
<p>
  <div class="equation_content">
    \begin{equation}  \underset {k\rightarrow \infty }{\overline{\lim }}\left( \left\Vert \nabla f\left( x_{k}\right) \right\Vert ^{2}+\left\Vert \nabla f\left( s_{k}\right) \right\Vert ^{2}\right) \leq 0\tag {27}\end{equation}
  </div>
  <span class="equation_label">27</span>
</p>
</div>
<p> On the other hand we have</p>
<div class="equation" id="a0000000070">
<p>
  <div class="equation_content">
    \begin{equation}  \underset {k\rightarrow \infty }{\underline{\lim }}\left( \left\Vert \nabla f\left( x_{k}\right) \right\Vert ^{2}+\left\Vert \nabla f\left( s_{k}\right) \right\Vert ^{2}\right) \leq \underset {k\rightarrow \infty }{\overline{\lim }}\left( \left\Vert \nabla f\left( x_{k}\right) \right\Vert ^{2}+\left\Vert \nabla f\left( s_{k}\right) \right\Vert ^{2}\right) \tag {28}\end{equation}
  </div>
  <span class="equation_label">28</span>
</p>
</div>
<p> and </p>
<div class="equation" id="a0000000071">
<p>
  <div class="equation_content">
    \begin{equation}  0\leq \underset {k\rightarrow \infty }{\underline{\lim }}\left( \left\Vert \nabla f\left( x_{k}\right) \right\Vert ^{2}+\left\Vert \nabla f\left( s_{k}\right) \right\Vert ^{2}\right) \tag {29}\end{equation}
  </div>
  <span class="equation_label">29</span>
</p>
</div>
<p> The inequalities (27), (28) and (29) imply</p>
<div class="displaymath" id="a0000000072">
  \begin{align}  0 &  \leq \underset {k\rightarrow \infty }{\underline{\lim }}\left( \left\Vert \nabla f\left( x_{k}\right) \right\Vert ^{2}+\left\Vert \nabla f\left( s_{k}\right) \right\Vert ^{2}\right) \tag {30}\\ &  \leq \underset {k\rightarrow \infty }{\overline{\lim }}\left( \left\Vert \nabla f\left( x_{k}\right) \right\Vert ^{2}+\left\Vert \nabla f\left( s_{k}\right) \right\Vert ^{2}\right) \leq 0\nonumber \end{align}
</div>
<p> which implies</p>
<div class="displaymath" id="a0000000073">
  \begin{align} &  \underset {k\rightarrow \infty }{\underline{\lim }}\left( \left\Vert \nabla f\left( x_{k}\right) \right\Vert ^{2}\! +\! \left\Vert \nabla f\left( s_{k}\right) \right\Vert ^{2}\right) =\underset {k\rightarrow \infty }{\overline{\lim }}\left( \left\Vert \nabla f\left( x_{k}\right) \right\Vert ^{2}\! +\! \left\Vert \nabla f\left( s_{k}\right) \right\Vert ^{2}\right) \nonumber \\ &  =\underset {k\rightarrow \infty }{\lim }\left( \left\Vert \nabla f\left( x_{k}\right) \right\Vert ^{2}+\left\Vert \nabla f\left( s_{k}\right) \right\Vert ^{2}\right) =0\nonumber \end{align}
</div>
<p> Notice that we have</p>
<div class="equation" id="a0000000074">
<p>
  <div class="equation_content">
    \begin{equation}  0\leq \left\Vert \nabla f\left( x_{k}\right) \right\Vert ^{2}\leq \left\Vert \nabla f\left( x_{k}\right) \right\Vert ^{2}+\left\Vert \nabla f\left( s_{k}\right) \right\Vert ^{2}\tag {32}\end{equation}
  </div>
  <span class="equation_label">32</span>
</p>
</div>
<p> and </p>
<div class="equation" id="a0000000075">
<p>
  <div class="equation_content">
    \begin{equation}  0\leq \left\Vert \nabla f\left( s_{k}\right) \right\Vert ^{2}\leq \left\Vert \nabla f\left( x_{k}\right) \right\Vert ^{2}+\left\Vert \nabla f\left( s_{k}\right) \right\Vert ^{2}\tag {33}\end{equation}
  </div>
  <span class="equation_label">33</span>
</p>
</div>
<p> Finally, the inequalities (31), (32) and (33) imply</p>
<div class="displaymath" id="a0000000076">
  \[  \underset {k\rightarrow \infty }{\lim }\left\Vert \nabla f\left( x_{k}\right) \right\Vert =\underset {k\rightarrow \infty }{\lim }\left\Vert \nabla f\left( s_{k}\right) \right\Vert =0.  \]
</div>

<h1 id="a0000000077">5 Numerical results and comparisons</h1>
<p>In this section we report some numerical results obtained with an implementation of the Armijo Epsilon Steepest Descent algorithm. For our numerical tests, we used test functions and Fortran programs from ([2],[7]). Considering the same criteria as in ([3]), the code is written in Fortran and compiled with f90 on an Intel Pentium 4 workstation at 2 GHz. We selected 52 unconstrained optimization test functions in generalized or extended form [2] (some from the CUTE library [7]). For each test function we performed twenty (20) numerical experiments, with the number of variables increasing as \(n=2,10,30,50,70,100,\) \(300,500,700,900,1000,2000,3000,4000,5000,\) \(6000,7000,8000,9000,10000.\) The algorithm implements the Armijo line search conditions ([4]), with the stopping criterion \(\left\Vert \nabla f\left( x_{k}\right) \right\Vert {\lt}10^{-6}.\) For all the algorithms considered in this numerical study, the maximum number of iterations is limited to 100,000. </p>
<p>The comparison of the algorithms is carried out in the following context. Let \(f_{i}^{ALG1}\) and \(f_{i}^{ALG2}\) be the optimal values found by ALG1 and ALG2, respectively, for problem \(i=1,...,962.\) We say that, on the particular problem \(i,\) the performance of ALG1 was better than the performance of ALG2 if: </p>
<div class="displaymath" id="a0000000078">
  \[  \left\vert f_{i}^{ALG1}-f_{i}^{ALG2}\right\vert {\lt}10^{-3} \]
</div>
<p>and the number of iterations, the number of function-gradient evaluations, or the CPU time of ALG1 was less than the corresponding quantity for ALG2. </p>
<p>In this set of numerical experiments we compare the Armijo Epsilon Steepest Descent algorithm with the Steepest Descent algorithm. Figure 1 shows the Dolan and Moré CPU performance profile of the Armijo Epsilon Steepest Descent algorithm versus the Steepest Descent algorithm. </p>
<p>In a performance profile plot, the top curve corresponds to the method that solved the most problems in a time within a factor \(\tau \) of the best time. The percentage of the test problems for which a method is the fastest is given on the left axis of the plot. The right side of the plot gives the percentage of the test problems that were successfully solved by each algorithm; it is mainly a measure of the robustness of an algorithm. Comparing the Armijo Epsilon Steepest Descent algorithm with the Steepest Descent algorithm subject to the CPU time metric, we see that the Armijo Epsilon Steepest Descent algorithm is the top performer: it is more successful than the Steepest Descent algorithm. </p>
<figure id="a0000000079">
  <div class="centered"><img src="img-0001.png" alt="\includegraphics[ height=3.98in, width=4.47in]{figura.png}" style="height:3.98in; width:4.47in" />
<figcaption>
  <span class="caption_title">Figure</span> 
  <span class="caption_ref">1</span> 
  <span class="caption_text">The Dolan and Moré CPU performance profile of Armijo Epsilon Steepest Descent algorithm <i class="it">versus</i> Steepest descent algorithm.</span> 
</figcaption> </div>

</figure>
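<p>For readers implementing such a comparison, the Dolan and Moré profile curves can be computed directly from the recorded per-problem metrics. A minimal Python sketch; the timing data below are hypothetical illustrations, not the paper's measurements:</p>

```python
import numpy as np

def performance_profile(times, taus):
    """Dolan-More performance profile: times[i, s] is the CPU time of solver s
    on problem i (np.inf for a failure); returns rho_s(tau) for each tau."""
    best = times.min(axis=1)               # best time per problem
    ratios = times / best[:, None]         # performance ratios r_{i,s}
    # rho_s(tau) = fraction of problems solver s solves within a factor tau of the best
    return np.array([[np.mean(ratios[:, s] <= tau) for s in range(times.shape[1])]
                     for tau in taus])

# Hypothetical timings: column 0 = Armijo Epsilon Steepest Descent, column 1 = Steepest Descent.
times = np.array([[1.0, 2.5],
                  [0.8, 0.9],
                  [2.0, np.inf],   # the second solver failed on this problem
                  [1.5, 1.2]])
rho = performance_profile(times, taus=[1.0, 2.0, 4.0])
```

<p>The curve height at \(\tau =1\) gives the percentage of problems on which a method is fastest (the left side of the plot), while the height for large \(\tau \) measures robustness (the right side).</p>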
<p><div class="acknowledgement_thmwrapper " id="a0000000080">
  <div class="acknowledgement_thmheading">
    <span class="acknowledgement_thmcaption">
    Acknowledgement
    </span>
  </div>
  <div class="acknowledgement_thmcontent">
  <p>The authors are very grateful to Professor Neculai Andrei for his help and his valuable comments and suggestions on the early version of the paper. </p>

  </div>
</div> </p>
<div class="bibliography">
<h1>Bibliography</h1>
<dl class="bibliography">
  <dt><a name="1">1</a></dt>
  <dd><p><span class="scshape">A.C. Aitken</span>, <i class="itshape">On Bernoulli’s numerical solution of algebraic equations</i>, Proc. Roy. Soc. Edinburgh, <b class="bfseries">46</b> (1926), pp.&#160;289–305. </p>
</dd>
  <dt><a name="2">2</a></dt>
  <dd><p><span class="scshape">N. Andrei</span>, <i class="itshape">An unconstrained optimization test functions collection</i>, Advanced Modeling and Optimization, <b class="bfseries">10</b> (2008), pp.&#160;147–161. </p>
</dd>
  <dt><a name="3">3</a></dt>
  <dd><p><span class="scshape">N. Andrei</span>, <i class="itshape">Another conjugate gradient algorithm for unconstrained optimization</i>, Annals of Academy of Romanian Scientists, Series on Science and Technology of Information, <b class="bfseries">1</b> (2008) no. 1, pp.&#160;7–20. </p>
</dd>
  <dt><a name="4">4</a></dt>
  <dd><p><span class="scshape">L. Armijo</span>, <i class="itshape">Minimization of functions having Lipschitz continuous first partial derivatives</i>, Pacific J. Mathematics, <b class="bfseries">16</b> (1966) 1, pp.&#160;1–3. </p>
</dd>
  <dt><a name="5">5</a></dt>
  <dd><p><span class="scshape">M.S. Bazaraa, H.D. Sherali</span> and <span class="scshape">C.M. Shetty</span>, <i class="itshape">Nonlinear Programming</i>, John Wiley &amp; Sons, New York, 1993. </p>
</dd>
  <dt><a name="6">6</a></dt>
  <dd><p><span class="scshape">C. Brezinski</span>, <i class="itshape">Accélération de la convergence en analyse numérique</i>, Lecture Notes in Mathematics, <b class="bfseries">584</b> (1977), Springer Verlag. </p>
</dd>
  <dt><a name="7">7</a></dt>
  <dd><p><span class="scshape">I. Bongartz, A.R. Conn, N.I.M. Gould</span> and <span class="scshape">P.L. Toint</span>, <i class="itshape">CUTE: constrained and unconstrained testing environments</i>, ACM Trans. Math. Software, <b class="bfseries">21</b> (1995), pp.&#160;123–160. </p>
</dd>
  <dt><a name="8">8</a></dt>
  <dd><p><span class="scshape">C.G. Broyden</span>, <i class="itshape">Quasi-Newton Methods and their application to Function Minimization</i>, Mathematics of Computation, <b class="bfseries">21</b> (1967), pp.&#160;368–381. </p>
</dd>
  <dt><a name="9">9</a></dt>
  <dd><p><span class="scshape">C.G. Broyden</span>, <i class="itshape">The convergence of a class of double rank minimization algorithms 2. The new algorithm</i>, J. Institute of Mathematics and its Applications, <b class="bfseries">6</b> (1970), pp.&#160;222–231. </p>
</dd>
  <dt><a name="10">10</a></dt>
  <dd><p><span class="scshape">C.G. Broyden, J.E. Jr. Dennis</span> and <span class="scshape">J.J. Moré</span>, <i class="itshape">On the local and superlinear convergence of quasi-Newton methods</i>, J. Inst. Math. Appl., <b class="bfseries">12</b> (1973), pp.&#160;223–246. </p>
</dd>
  <dt><a name="11">11</a></dt>
  <dd><p><span class="scshape">F. Cordellier</span>, <i class="itshape">Transformations de suites scalaires et vectorielles</i>, Thèse de doctorat d’état soutenue à l’université de Lille I, 1981. </p>
</dd>
  <dt><a name="12">12</a></dt>
  <dd><p><span class="scshape">W.C. Davidon</span>, <i class="itshape">Variable Metric Method for Minimization</i>, AEC research Development, Report ANL-5990, 1959. </p>
</dd>
  <dt><a name="13">13</a></dt>
  <dd><p><span class="scshape">J.E.Jr. Dennis</span> and <span class="scshape">J.J. Moré</span>, <i class="itshape">A characterization of superlinear convergence and its application to quasi-Newton methods</i>, Math. Comp., <b class="bfseries">28</b> (1974), pp.&#160;549–560. </p>
</dd>
  <dt><a name="14">14</a></dt>
  <dd><p><span class="scshape">J.E. Dennis</span> and <span class="scshape">J.J. Moré</span>, <i class="itshape">Quasi-Newton methods, motivation and theory</i>, SIAM. Rev., <b class="bfseries">19</b> (1977), pp.&#160;46–89. </p>
</dd>
  <dt><a name="15">15</a></dt>
  <dd><p><span class="scshape">L.C.W. Dixon</span>, <i class="itshape">Variable metric algorithms: necessary and sufficient conditions for identical behavior on nonquadratic functions</i>, J. Opt. Theory Appl., <b class="bfseries">10</b> (1972), pp.&#160;34–40. </p>
</dd>
  <dt><a name="16">16</a></dt>
  <dd><p><span class="scshape">N. Djeghaba</span> and <span class="scshape">R. Benzine</span>, <i class="itshape">Accélération de la convergence de la méthode de la plus forte pente</i>, Demonstratio Mathematica., <b class="bfseries">39</b> (2006) No. 1, pp.&#160;169–181. </p>
</dd>
  <dt><a name="17">17</a></dt>
  <dd><p><span class="scshape">R. Fletcher</span>, <i class="itshape">A new approach to Variable Metric Algorithms</i>, Computer Journal, <b class="bfseries">13</b> (1970), pp.&#160;317–322. </p>
</dd>
  <dt><a name="18">18</a></dt>
  <dd><p><span class="scshape">R. Fletcher</span>, <i class="itshape">Practical methods of Optimization</i>, Second Edition, John Wiley &amp; Sons, Chichester, 1987. </p>
</dd>
  <dt><a name="19">19</a></dt>
  <dd><p><span class="scshape">R. Fletcher</span>, <i class="itshape">An overview of unconstrained optimization</i>, Algorithms for Continuous Optimization: the State of Art, E. Spedicato, ed., Kluwer Academic Publishers, 1994. </p>
</dd>
  <dt><a name="20">20</a></dt>
  <dd><p><span class="scshape">R. Fletcher</span> and <span class="scshape">C.M. Reeves</span>, <i class="itshape">Function minimization by conjugate gradients</i>, Computer J., <b class="bfseries">7</b> (1964), pp.&#160;149–154. </p>
</dd>
  <dt><a name="21">21</a></dt>
  <dd><p><span class="scshape">R. Fletcher</span> and <span class="scshape">M.J.D. Powell</span>, <i class="itshape">A Rapidly Convergent Descent Method for Minimization</i>, Computer Journal, <b class="bfseries">6</b> (1963), pp.&#160;163–168. </p>
</dd>
  <dt><a name="22">22</a></dt>
  <dd><p><span class="scshape">G.E. Forsythe</span>, <i class="itshape">On the asymptotic directions of the s-dimensional optimum gradient method</i>, Numerische Mathematik, <b class="bfseries">11</b> (1968), pp.&#160;57–76. </p>
</dd>
  <dt><a name="23">23</a></dt>
  <dd><p><span class="scshape">P.E. Gill</span> and <span class="scshape">W. Murray</span>, <i class="itshape">Quasi-Newton Methods for unconstrained optimization</i>, J. Inst. Math. Appl., <b class="bfseries">9</b> (1972), pp.&#160;91–108. </p>
</dd>
  <dt><a name="24">24</a></dt>
  <dd><p><span class="scshape">A. Griewank</span>, <i class="itshape">The global convergence of partitioned BFGS on problems with convex decompositions and Lipschitz gradients</i>, Math. Prog., <b class="bfseries">50</b> (1991), pp.&#160;141–175. </p>
</dd>
  <dt><a name="25">25</a></dt>
  <dd><p><span class="scshape">D. Goldfarb</span>, <i class="itshape">A Family of Variable Metric Methods Derived by Variational Means</i>, Mathematics of Computation, <b class="bfseries">24</b> (1970), pp.&#160;23–26. </p>
</dd>
  <dt><a name="26">26</a></dt>
  <dd><p><span class="scshape">A.A. Goldstein</span> and <span class="scshape">J.F. Price</span>, <i class="itshape">An effective Algorithm for Minimization</i>, Numerische Mathematik, <b class="bfseries">10</b> (1967), pp.&#160;184–189. </p>
</dd>
  <dt><a name="27">27</a></dt>
  <dd><p><span class="scshape">M.R. Hestenes</span> and <span class="scshape">E. Stiefel</span>, <i class="itshape">Methods of conjugate gradients for solving linear systems</i>, J. Res. Nat. Bur. Standards, <b class="bfseries">49</b> (1952) No. 6, pp.&#160;409–436. </p>
</dd>
  <dt><a name="28">28</a></dt>
  <dd><p><span class="scshape">J. Nocedal</span> and <span class="scshape">S.J. Wright</span>, <i class="itshape">Numerical Optimization</i>, Springer, Second edition, 2006. </p>
</dd>
  <dt><a name="29">29</a></dt>
  <dd><p><span class="scshape">E. Polak</span> and <span class="scshape">G. Ribiere</span>, <i class="itshape">Note sur la convergence de méthodes de directions conjuguées</i>, Revue Française Informatique et Recherche opérationnelle, <b class="bfseries">16</b> (1969), pp.&#160;35–43. </p>
</dd>
  <dt><a name="30">30</a></dt>
  <dd><p><span class="scshape">B.T. Polyak</span>, <i class="itshape">The Method of Conjugate Gradient in Extremum Problems</i>, USSR Computational Mathematics and Mathematical Physics, (English Translation), <b class="bfseries">9</b> (1969), pp.&#160;94–112. </p>
</dd>
  <dt><a name="31">31</a></dt>
  <dd><p><span class="scshape">M.J.D. Powell</span>, <i class="itshape">On the convergence of the variable metric algorithms</i>, J. Inst. Math. Appl., <b class="bfseries">7</b> (1971), pp.&#160;21–36. </p>
</dd>
  <dt><a name="32">32</a></dt>
  <dd><p><span class="scshape">M.J.D. Powell</span>, <i class="itshape">Some global convergence properties of variable metric algorithms for minimization without exact line searches</i>, Nonlinear Programming, SIAM-AMS Proceedings, <b class="bfseries">IX</b> (1976), R.W. Cottle and C.E. Lemke eds., SIAM. </p>
</dd>
  <dt><a name="33">33</a></dt>
  <dd><p><span class="scshape">D.F. Shanno</span>, <i class="itshape">Conditioning of quasi-Newton Methods for function minimization</i>, Mathematics of Computation, <b class="bfseries">24</b> (1970), pp.&#160;641–656. </p>
</dd>
  <dt><a name="34">34</a></dt>
  <dd><p><span class="scshape">P. Wolfe</span>, <i class="itshape">Convergence conditions for ascent methods</i>, SIAM Review, <b class="bfseries">11</b> (1969), pp.&#160;226–235. </p>
</dd>
  <dt><a name="35">35</a></dt>
  <dd><p><span class="scshape">P. Wynn</span>, <i class="itshape">On a device for computing the \(e_{m}(S_{n})\) transformation</i>, M.T.A.C., <b class="bfseries">10</b> (1956), pp.&#160;91–96. </p>
</dd>
  <dt><a name="36">36</a></dt>
  <dd><p><span class="scshape">P. Wynn</span>, <i class="itshape">Upon systems of recursions which obtain among quotients of the Padé table</i>, Numer. Math., <b class="bfseries">8</b> (1966), pp.&#160;264–269. </p>
</dd>
</dl>


</div>
</div> <!--main-text -->
</div> <!-- content-wrapper -->
</div> <!-- content -->
</div> <!-- wrapper -->

<nav class="prev_up_next">
</nav>

<script type="text/javascript" src="/var/www/clients/client1/web1/web/files/jnaat-files/journals/1/articles/js/jquery.min.js"></script>
<script type="text/javascript" src="/var/www/clients/client1/web1/web/files/jnaat-files/journals/1/articles/js/plastex.js"></script>
<script type="text/javascript" src="/var/www/clients/client1/web1/web/files/jnaat-files/journals/1/articles/js/svgxuse.js"></script>
</body>
</html>