Critical cones for sufficient second order conditions in PDE constrained optimization

In this paper, we analyze optimal control problems governed by semilinear parabolic equations. Box constraints for the controls are imposed and the cost functional involves the state and possibly a sparsity-promoting term, but not a Tikhonov regularization term. Unlike finite dimensional optimization or control problems involving Tikhonov regularization, second order sufficient optimality conditions for the control problems we deal with must be imposed in a cone larger than the one used to obtain necessary conditions. Different extensions of this cone have been proposed in the literature for different kinds of minima: strong or weak minimizers for optimal control problems. After a discussion on these extensions, we propose a new extended cone smaller than those considered until now. We prove that a second order condition based on this new cone is sufficient for a strong local minimum.

Above, $y_u$ denotes the state associated with the control $u$, related by the following semilinear parabolic state equation: $y_u = 0$ on $\Sigma$, $y_u(0) = y_0$ in $\Omega$. (1.1) Assumptions on the data $A$, $f$, $y_0$, $L$ and $L_\Omega$ are specified in Section 2.
It is well known that if $\bar u$ is a local minimum, then the first order necessary optimality conditions can be written as $J'(\bar u; u - \bar u) \ge 0$ for all $u \in U_{ad}$, while the second order necessary optimality conditions read $F''(\bar u)v^2 \ge 0$ for all $v \in C_{\bar u}$, where $C_{\bar u}$ is the cone $C_{\bar u} = \{v \in L^2(Q) \text{ satisfying the sign condition (1.2) and } J'(\bar u; v) = 0\}$, with $v(x,t) \ge 0$ if $\bar u(x,t) = \alpha$ and $v(x,t) \le 0$ if $\bar u(x,t) = \beta$. (1.2) The reader is referred to [11, Theorem 3.7] for the elliptic case or [12, Theorem 3.1, Case I] for the parabolic case. It is well known that in finite dimensional optimization the cone used to establish second order necessary optimality conditions is the same as the one used for sufficient second order conditions. However, this is not the case in general for optimization problems in infinite dimension; see the example by Dunn [24]. Despite this, if the Tikhonov term $\frac{\gamma}{2}\|u\|^2_{L^2(Q)}$ with $\gamma > 0$ is present in the cost functional of the control problem, we can take the same cone for both necessary and sufficient conditions; see, e.g., [4], [19] or [20] for the case $\mu = 0$, or [11], [12] or [17] for $\mu > 0$. Other works that consider second order sufficient conditions for problems with no Tikhonov regularization are [16], [21], [22], and [23]. The results in these works cannot be applied to our problem because we deal with a semilinear parabolic equation, our controls depend on both space and time, and we make no assumption on the structure of the adjoint state.
In this paper, the Tikhonov term is not present. Then, an approach to deal with second order sufficient conditions, as suggested by Dunn [24] or Maurer and Zowe [27] among others, consists of extending the cone of critical directions $C_{\bar u}$. As far as we know, two ways to enlarge the cone have been proposed in the literature. In the context of abstract optimization problems, following Maurer and Zowe [27], one could replace the condition $J'(\bar u; v) = 0$ by $J'(\bar u; v) \le \tau\|v\|_{L^2(Q)}$ for some small $\tau > 0$. In optimal control problems we can take advantage of the structure of the problem to define a slightly smaller cone by taking where $z_v$ is the derivative of the control-to-state mapping in the direction $v$; see (2.1) below. A second alternative to extend $C_{\bar u}$ is based on the observation that for functions $v \in L^2(Q)$ satisfying the sign condition (1.2) we have where $\bar\varphi$ is the adjoint state associated with $\bar u$, defined in (2.10) below; see [6], [17], [20], [21], [22]. Then a natural extension can be obtained by specifying a smaller set of points where the functions $v$ should vanish: given $\tau > 0$ we define the extended cone The following question immediately arises: is one of these two extensions better than the other? The answer seems to be difficult because they are not easy to compare. However, we solve this issue by choosing $D^\tau_{\bar u} \cap E^\tau_{\bar u}$. The main goal of this paper is to prove that a second order optimality condition based on this cone, along with the first order optimality conditions, implies the strong local optimality of $\bar u$.
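The inclusions behind these extensions can be made explicit. The following sketch uses only the definitions quoted above: the first line checks that every critical direction satisfies the relaxed Maurer and Zowe condition, and the second records the elementary fact that the intersection of the two extended cones sits inside each of them. Nothing beyond set inclusion is assumed.

```latex
% Every critical direction satisfies the relaxed condition:
\[
  v \in C_{\bar u}
  \;\Longrightarrow\;
  J'(\bar u; v) = 0 \le \tau \, \|v\|_{L^2(Q)}
  \quad \text{for every } \tau > 0 .
\]
% The intersection is contained in each of the two extended cones:
\[
  D^\tau_{\bar u} \cap E^\tau_{\bar u} \subseteq D^\tau_{\bar u},
  \qquad
  D^\tau_{\bar u} \cap E^\tau_{\bar u} \subseteq E^\tau_{\bar u} .
\]
```

Consequently, a second order condition required only on the intersection is implied by the same condition imposed on either of the two cones, so no comparison between them is needed.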
The plan of the paper is as follows. In Section 2 we establish the assumptions on the functions defining (P), recall some regularity results on the state equation and the linearized state equation and establish the differentiability properties of the control-to-state mapping. We also state necessary optimality conditions. In Section 3 we prove our main result, namely Theorem 3.1. In Section 4 we comment about extensions and limitations of our main result.
Before ending this introduction, let us mention that the methods used in this paper cannot be applied to the case of control problems governed by the Navier-Stokes system. This is due to the fact that our approach requires $L^\infty(Q)$ bounds for the states; see Theorem 2.1. For quasilinear parabolic equations, it seems possible to obtain similar bounds using the results in [9]. It also seems reasonable that estimates analogous to (2.4) or (2.9) hold, but the extension is not immediate and is beyond the scope of this paper. We refer the reader interested in optimal control problems governed by these types of equations to [7], [8], [9], [10], [15], [18], [28] for the case where the Tikhonov term is present in the cost functional.

Assumptions and preliminary results
On the partial differential equation (1.1), we make the following assumptions.
(A1) $A$ denotes the elliptic operator where $b_j \in L^\infty(Q)$, $a_{i,j} \in L^\infty(\Omega)$, and the uniform ellipticity condition $a_{i,j}(x)\xi_i\xi_j$ for all $\xi \in \mathbb{R}^n$ and a.a. $x \in \Omega$ holds.
(A2) We assume that $f : Q \times \mathbb{R} \to \mathbb{R}$ is a Carathéodory function of class $C^2$ with respect to the last variable satisfying the following properties: for almost all $(x,t) \in Q$.
Examples of functions $f$ satisfying the above assumptions are the polynomials of odd degree with positive leading coefficients or the exponential function $f(x,t,y) = g(x,t)\exp(y)$ with $g \in L^\infty(Q)$, $g(x,t) \ge 0$ for almost all $(x,t) \in Q$.
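For the exponential example, the relevant derivatives can be computed directly. The sketch below assumes, as is usual for this class of problems, that the properties required in (A2) are a monotonicity condition on $f$ and bounds on $f$ and its derivatives for bounded $y$; the bound $M$ is a symbol introduced here for illustration only.

```latex
% Assumptions for illustration: g \in L^\infty(Q), g \ge 0 a.e. in Q, and |y| \le M.
\[
  f(x,t,y) = g(x,t)\, e^{y},
  \qquad
  \frac{\partial f}{\partial y}(x,t,y)
  = \frac{\partial^2 f}{\partial y^2}(x,t,y)
  = g(x,t)\, e^{y} \ge 0 ,
\]
\[
  \left| \frac{\partial^j f}{\partial y^j}(x,t,y) \right|
  \le \|g\|_{L^\infty(Q)}\, e^{M}
  \quad \text{for } j = 0, 1, 2 \text{ and } |y| \le M .
\]
```

Thus $f$ is nondecreasing in $y$, and $f$ together with its first two derivatives is bounded on bounded sets of $y$, uniformly in $(x,t)$.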
On the functions $L$ and $L_\Omega$ defining the differentiable part $F$ of the cost functional $J$, we assume: (A4) $L : Q \times \mathbb{R} \to \mathbb{R}$ is a Carathéodory function of class $C^2$ with respect to the last variable satisfying the following properties: $L(\cdot,\cdot,0) \in L^1(Q)$ and $\forall M > 0$ $\exists \Psi_M \in L^p(0,T;L^q(\Omega))$ and $C_{Q,M}$ such that $\left|\frac{\partial L}{\partial y}(x,t,y)\right| \le \Psi_M(x,t)$ and for almost all $(x,t) \in Q$.
Let us comment that the classical tracking-type cost functional satisfies the above assumptions if $y_d \in L^p(0,T;L^q(\Omega))$ and $y_\Omega \in L^\infty(\Omega)$. Hereafter, these hypotheses will be assumed without further notice throughout the rest of the work.
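The claim about the tracking-type functional can be checked on its distributed part. The sketch below assumes the usual quadratic form of the tracking term and that the bounds in (A4) are required for $|y| \le M$; both are assumptions made here for illustration, since the exact statement is not reproduced above.

```latex
% Hypothetical distributed tracking term; y_d is the target function from the text.
\[
  L(x,t,y) = \tfrac12 \bigl( y - y_d(x,t) \bigr)^2,
  \qquad
  \frac{\partial L}{\partial y}(x,t,y) = y - y_d(x,t),
  \qquad
  \frac{\partial^2 L}{\partial y^2}(x,t,y) = 1 .
\]
% For |y| \le M, one admissible choice of the bounding function is
\[
  \left| \frac{\partial L}{\partial y}(x,t,y) \right|
  \le M + |y_d(x,t)| =: \Psi_M(x,t) .
\]
```

Since $Q$ has finite measure, $\Psi_M$ belongs to $L^p(0,T;L^q(\Omega))$ whenever $y_d$ does, and the second derivative is trivially bounded.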

Analysis of the state equation
In this section we analyze the existence, uniqueness and some regularity properties for the solution of (1.1) as well as its dependence with respect to the control u. We also prove some technical results to be used in the proof of our main result, Theorem 3.1.
To deduce the second estimate and the convergence properties, we introduce $w_k = y_{u_k} - y_u$. Subtracting the equations satisfied by $y_{u_k}$ and $y_u$ and using the mean value theorem we get the existence of measurable functions $\hat y$ in $\Omega$.
From [26, Theorem III-10.1], we deduce the existence of $C_{p,q} > 0$ and $\gamma \in (0,1)$ such that This proves the second estimate. Finally, since $C^{\gamma,\gamma/2}(\bar Q)$ is compactly embedded in $C(\bar Q)$, it is immediate to see that $\|w_k\|_{C(\bar Q)} \to 0$. In particular, $\|w_k(\cdot,T)\|_{L^\infty(\Omega)} \to 0$ holds. Using this fact, multiplying the above equation by $w_k$ and integrating by parts, we infer the convergence $w_k \to 0$ in $L^2(0,T;H^1(\Omega))$.
Moreover, $z_v$ and $z_{v_1,v_2}$ are continuous functions in $\bar Q$.
For the proof the reader is referred, for instance, to [19, Theorem 5.1]. From the classical theory for linear parabolic partial differential equations, we know that for every $v \in L^2(Q)$ there exists a unique solution $z_v$ of (2.1) in the space $C([0,T], L^2(\Omega)) \cap L^2(0,T;H_0^1(\Omega))$. Therefore the linear mapping $G'(u)$ can be extended to a continuous linear mapping $G'(u) : L^2(Q) \to C([0,T], L^2(\Omega)) \cap L^2(0,T;H_0^1(\Omega))$. The following estimates for $z_v$ will be used in the next sections.

Lemma 2.3. Let $u \in U_{ad}$ and $v \in L^2(Q)$ be arbitrary, and let $z_v = G'(u)v$ be the solution of (2.1). Then, there exist constants $C_{Q,2}$ and $C_{Q,1}$ independent of $u$ and $v$ such that If, further, $v \in L^p(0,T;L^q(\Omega))$, then there exists a constant $C_{Q,\infty}$ independent of $u$ and $v$ such that

Proof. First let us note that from Theorem 2.1 and assumption (A2) on $f$ we have that The estimate (2.3) for $\|z_v\|_{L^1(Q)}$ follows from [13]; see also [3, 5].
To prove the estimate for $\|z_v(\cdot,T)\|_{L^1(\Omega)}$ we proceed as follows. Consider the function where $A^*$ is the adjoint of $A$ given by Multiplying the equation satisfied by $z_v$ by $\psi$ and integrating over $Q$, we obtain Integrating by parts in the first integral, we have Now using (2.7), we have that Finally, it is enough to realize that for some constant $C$ we have and the proof is complete.
The following technical result will be used in the proof of Theorem 3.1.
Lemma 2.4. Consider $u, \bar u \in U_{ad}$ with associated states $y_u$ and $\bar y$, respectively. Set $z_{u-\bar u} = G'(\bar u)(u - \bar u)$ and consider the constants $C_{f,M_\infty}$ satisfying (2.5) and $C_{Q,\infty}$ introduced in Lemma 2.3. Then the following estimates hold:

Proof. Define $\eta = y_u - (\bar y + z_{u-\bar u})$. The function $\eta$ satisfies the equation Using a second order Taylor expansion, we have that there exists a measurable function $0 < \theta(x,t) < 1$ such that, if we set $\hat y = \bar y + \theta(y_u - \bar y)$, we have that Let us prove the first estimate. With the help of assumption (A2), we deduce from (2.4) and (2.5) that Using this and (2.8), we infer For the second inequality, notice that using the uniform boundedness of the admissible states, assumption (A2) and (2.2), we have that Finally, using (2.9), we have that and the second inequality follows.

First and second order optimality conditions for (P)
We recall the definition of the cost functional $J(u) = F(u) + \mu j(u)$. Before establishing the optimality conditions satisfied by a local solution, we address the differentiability of the functional $F$. The next theorem follows from the chain rule, Theorem 2.2 and assumptions (A2) and (A3).
Theorem 2.5. The functional $F : L^p(0,T;L^q(\Omega)) \to \mathbb{R}$ is of class $C^2$ and for every $u, v, v_1, v_2 \in L^p(0,T;L^q(\Omega))$ where $z_{v_i} = G'(u)v_i$, $i = 1, 2$, and $\varphi_u \in Y$ is the adjoint state associated with $u$, i.e., it is the solution of (2.10), and $A^*$ denotes the adjoint operator of $A$ introduced in (2.6).
Assumptions (A1), (A4) and (A5) together with Theorem 2.1 imply, see [26, Chapter III], that for every $u \in U_{ad}$, $\varphi_u \in L^2(0,T;H_0^1(\Omega)) \cap L^\infty(Q)$ and there exists a constant $K_\infty > 0$ independent of $u$ such that

Remark 2.6. From the expressions for $F'(u)$ and $F''(u)$ established in the previous theorems it is immediate that they can be extended through the same formulas to continuous linear and bilinear forms, respectively, in $L^2(Q)$. Moreover, assumptions (A2) and (A3), Theorem 2.1 and inequality (2.11) imply the existence of some $M_2 > 0$ such that Finally, we notice that the directional derivative of $j$ at $u$ in the direction $v$ can be computed as In what follows, we will write $J'(u;v) = F'(u)v + \mu j'(u;v)$. We will also denote by $\partial j(u)$ the subdifferential of $j$ at $u$ in the sense of convex analysis. Existence of a global solution of (P) follows in a standard way using Theorem 2.1; see e.g. [14]. Since (P) is not a convex problem, we consider local solutions as well. Let us state precisely the different concepts of local solution.
Definition 2.7. We say that $\bar u \in U_{ad}$ is an $L^r(Q)$-weak local minimum of (P), with $r \in [1,+\infty]$, if there exists some $\varepsilon > 0$ such that An element $\bar u \in U_{ad}$ is said to be a strong local minimum of (P) if there exists some $\varepsilon > 0$ such that We say that $\bar u \in U_{ad}$ is a strict (weak or strong) local minimum if the above inequalities are strict for $u \neq \bar u$.
As far as we know, the notion of strong local solutions in the framework of control theory was introduced in [1] for the first time; see also [2].
Lemma 2.8. The following properties hold:
1. $\bar u$ is an $L^1(Q)$-weak local minimum of (P) if and only if it is an $L^r(Q)$-weak local minimum of (P) for every $r \in (1,+\infty)$.
2. If $\bar u$ is an $L^r(Q)$-weak local minimum of (P) for some $r < +\infty$, then it is an $L^\infty(Q)$-weak local minimum of (P).
3. If $\bar u$ is a strong local minimum of (P), then it is an $L^r(Q)$-weak local minimum of (P) for all $r \in [1,\infty]$.
Proof. Statement 1 is a consequence of the equivalence of all the $L^r(Q)$ topologies ($1 \le r < +\infty$) in $U_{ad}$. Since $\|u\|_{L^r(Q)} \le T^{1/r}|\Omega|^{1/r}\|u\|_{L^\infty(Q)}$, statement 2 follows. To prove statement 3 we use the second estimate in Theorem 2.1: for all $r \ge \max\{p, q\}$. Then statement 3 follows from statement 1 and the above inequality.
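The norm inequality used for statement 2 follows from a one-line computation. The sketch below assumes only that $Q = \Omega \times (0,T)$, so that its measure is $T|\Omega|$.

```latex
\[
  \|u\|_{L^r(Q)}^r
  = \int_Q |u(x,t)|^r \, dx\, dt
  \le |Q| \, \|u\|_{L^\infty(Q)}^r
  = T\,|\Omega|\, \|u\|_{L^\infty(Q)}^r ,
\]
\[
  \text{hence}\quad
  \|u\|_{L^r(Q)} \le T^{1/r} |\Omega|^{1/r} \|u\|_{L^\infty(Q)} ,
  \qquad 1 \le r < +\infty .
\]
```

In particular, an $L^\infty(Q)$-ball around $\bar u$ is contained in an $L^r(Q)$-ball of proportional radius, which is what statement 2 needs.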
Next we state first order optimality conditions.

Theorem 2.9. Suppose $\bar u$ is a local solution of (P) in any of the senses given in Definition 2.7. Then $J'(\bar u; u - \bar u) \ge 0$ $\forall u \in U_{ad}$ (2.14) holds. Moreover, there exist $\bar y$ and $\bar\varphi$ in $Y$ and $\bar\lambda \in \partial j(\bar u)$ such that

Proof. To prove (2.14) it is enough to use the local optimality of $\bar u$ and the convexity of $U_{ad}$ as follows: From the expression of $F'$ established in Theorem 2.5 and the convexity of $j$ we infer Hence, $\bar u$ solves the problem $\min_{u \in L^\infty(Q)}$ where $I_{U_{ad}}$ is the indicator function of the convex set $U_{ad}$. Therefore, using the subdifferential calculus, see e.g. [25, Chapter I, Proposition 5.6], we obtain $0 \in \partial I(\bar u) = \bar\varphi + \mu\,\partial j(\bar u) + \partial I_{U_{ad}}(\bar u)$, which implies (2.15c) for some $\bar\lambda \in \partial j(\bar u)$.
If $\mu = 0$, we deduce from Corollary 2.10 that $\bar\varphi(x,t)v(x,t) = |\bar\varphi(x,t)v(x,t)|$ for every $v \in L^2(Q)$ satisfying the sign condition (2.16). Consequently the following identity holds. $C_{\bar u} = \{v \in L^2(Q) \text{ satisfying (2.16) and } v(x,t) = 0 \text{ if } |\bar\varphi(x,t)| > 0\}$. (2.18) For $\mu > 0$, from Corollary 2.10 we also infer that see [17] for a proof. The second order necessary conditions are established in [11, Theorem 3.7]. Although that result is stated for elliptic problems and a Tikhonov regularization term, the proof can be translated to our setting with straightforward changes.
Theorem 2.12. Suppose $\bar u$ is a local solution of (P) in any of the senses given in Definition 2.7. Then $F''(\bar u)v^2 \ge 0$ for all $v \in C_{\bar u}$ holds.

Second order sufficient conditions
In this section, we establish the sufficient second order optimality conditions. In what follows, $\bar u$ will denote a control of $U_{ad}$ satisfying (2.14). We denote by $\bar y$ and $\bar\varphi$ the associated state and adjoint state.
As mentioned in the introduction, we have to extend the cone $C_{\bar u}$ to formulate the second order sufficient conditions for optimality.
Looking at $J'(\bar u; v)$, for every $\tau > 0$ we consider the extended cone The extended cone $E^\tau_{\bar u}$ introduced in (1.3) has been used in the literature to formulate the second order sufficient optimality conditions; see [17]. The cone $G^\tau_{\bar u}$ introduced above is a smaller extension of $C_{\bar u}$ than $E^\tau_{\bar u}$. Indeed, given $E^\tau_{\bar u}$, for every On the other hand, using the characterizations of the cone $C_{\bar u}$ given by (2.18) and (2.19), the following extensions appear in a natural way as well.
If $\mu > 0$, $D^\tau_{\bar u} = \{v \in L^2(Q)$ satisfying (2.16) and $v(x,t)$

For the use of the cones $E^\tau_{\bar u}$ and $D^\tau_{\bar u}$ to formulate the second order sufficient optimality conditions and for a discussion of their application to the stability analysis of the control problem, the reader is referred to [17]. In that paper it is proved that a sufficient second order condition based on the cone $D^\tau_{\bar u}$ leads to an $L^2(Q)$-weak local minimum, while the same condition based on the cone $E^\tau_{\bar u}$ implies that $\bar u$ is a strong local minimum. Hereafter we will prove that the condition based on the cone yields a strong local minimum $\bar u$. Our main result is as follows:

Theorem 3.1. Let $\bar u \in U_{ad}$ satisfy the first order optimality condition (2.14). Suppose in addition that there exist $\delta > 0$ and $\tau > 0$ such that Then, there exist $\varepsilon > 0$ and $\kappa > 0$ such that for all $u \in U_{ad}$ such that $\|y_u - \bar y\|_{L^\infty(Q)} < \varepsilon$.
Note that if $\tau < \tau'$, then $C^\tau_{\bar u} \subseteq C^{\tau'}_{\bar u}$, and hence without loss of generality we can suppose that, for $\mu > 0$, $\tau < \mu$. Throughout the proof of Theorem 3.1 we will use the following lemma. A proof of an analogous result can be found in [16, 20], so we omit it.

Lemma 3.2. For all $\rho > 0$ there exists $\varepsilon_\rho > 0$ such that for every $u \in U_{ad}$ satisfying $\|y_u - \bar y\|_{L^\infty(Q)} < \varepsilon_\rho$, there holds

Proof of Theorem 3.1. Consider $u \in U_{ad}$ such that $\|y_u - \bar y\|_{L^\infty(Q)} < \varepsilon$, where $\varepsilon$ will be fixed later independently of $u$; see (3.17) below.
A second order Taylor expansion yields the existence of $\theta \in (0,1)$ such that where $u_\theta = \bar u + \theta(u - \bar u)$. Using this and the convexity of $j(\cdot)$, we have In a first step, we will prove the existence of $\varepsilon_0$ such that for all $u \in U_{ad}$ such that $\|y_u - \bar y\|_{L^\infty(Q)} < \varepsilon_0$. We will split the proof of this first step into three cases.
Case 1: $u - \bar u \in C^\tau_{\bar u}$. Applying Lemma 3.2 with $\rho = \delta/2$ we deduce the existence of $\varepsilon_1 > 0$ such that (3.3) holds for every $u \in U_{ad}$ such that $\|y_u - \bar y\|_{L^\infty(Q)} < \varepsilon_1$. Inserting this inequality in (3.5) and using the variational inequality (2.14) and the second order condition (3.1), we obtain

Case 2: $u - \bar u \notin G^\tau_{\bar u}$. In this case, we consider where $\varepsilon_1$ is taken as in the previous case, and $C_{f,M_\infty}$, $C_{Q,\infty}$ and $M_2$ are introduced in (2.5), Lemma 2.3 and (2.12), respectively. Then, from Lemma 2.4, if $\|y_u - \bar y\|_{L^\infty(Q)} < \varepsilon_2$, we can estimate $\|z_{u-\bar u}\|_{C(\bar Q)} < 2\varepsilon_2$. Therefore we have Let us estimate the terms of (3.5). Since $u - \bar u$ satisfies the sign condition (2.16) and $u - \bar u \notin G^\tau_{\bar u}$, then with (3.7) we get For the remaining terms, according to the choice we made for $\varepsilon_1$ in Case 1 and using (2.12), we infer From (3.5), (3.8) and (3.9) we deduce for

Case 3: $u - \bar u \notin D^\tau_{\bar u}$ and $u - \bar u \in G^\tau_{\bar u}$. Now we cannot use the second order condition (3.1), nor is the first derivative big enough to ensure optimality. Hence, our method of proof is different from the previous two cases. First we define $\tau^* = \tau/\max\{1, C_{Q,1}\} \le \tau$, where $C_{Q,1}$ is introduced in (2.3). If $u - \bar u \notin G^{\tau^*}_{\bar u}$ holds, then we can argue as in the proof of Case 2 to deduce that (3.6) holds for $\|y_u - \bar y\|_{L^\infty(Q)} < \varepsilon_3$ with We define the set $W$ as follows: if $\mu = 0$, $W = \{(x,t) \in Q : |\bar\varphi(x,t)| > \tau \text{ and } u(x,t) - \bar u(x,t) \neq 0\}$; if $\mu > 0$, $W = \{(x,t) \in Q : \bar\varphi(x,t) = -\mu \text{ and } \bar u(x,t) = 0 \text{ and } u(x,t) < 0, \text{ or } \bar\varphi(x,t) = +\mu \text{ and } \bar u(x,t) = 0 \text{ and } u(x,t) > 0, \text{ or } |\bar\varphi(x,t)| - \mu > \tau \text{ and } u(x,t) \neq \bar u(x,t)\}$, if $(x,t) \in V$ and $w = (u - \bar u) - v$. We first notice three properties of $w$. In [17, Proposition 3.6] it is proved that (3.10) Using this and the fact that the supports of $w$ and $v$ are disjoint, and noticing that $v$ satisfies the sign condition (2.16), which allows us to use (2.17), we obtain Finally, using (2.3), we have Regarding $v$, it is clear that $v \in D^\tau_{\bar u}$.
From (3.11) and (3.12) we get Since $u - \bar u \in G^{\tau^*}_{\bar u}$, we obtain Altogether, we conclude Therefore $v \in G^{\tau^*}_{\bar u} \subset G^\tau_{\bar u}$ and hence $v \in C^\tau_{\bar u}$ holds. Now we combine the techniques of Cases 1 and 2. On the one hand, $v$ belongs to $C^\tau_{\bar u}$, so that we can use the second order condition (3.1). On the other hand, the $L^1(Q)$-norm of $w$ bounds the directional derivative $J'(\bar u; u - \bar u)$ from below. Let us see in detail how to do this. We start at the inequality (3.5). Applying Lemma 3.2 we deduce the existence of $\varepsilon_4 > 0$ such that for all $u \in U_{ad}$ such that $\|y_u - \bar y\|_{L^\infty(Q)} < \varepsilon_4$. Now, we take
From now on, we will assume that $\|y_u - \bar y\|_{L^\infty(Q)} < \varepsilon_0$. Using that $u - \bar u = v + w$ and applying the inequalities (2.12), (3.1), (3.10) and (3.13), we deduce from (3.5) (3.14) Using the inequality $ab \le \frac12 a^2 + \frac12 b^2$ for appropriate real numbers $a, b$, we infer Inserting this estimate in (3.14) and using (3.12) and the definition of $\tau^*$, we obtain Combining this with (3.15), we obtain Next we define the constants where $C_{Q,\infty}$ is given in Lemma 2.3, and assume $\|y_u - \bar y\|_{L^\infty(Q)} < \varepsilon_5$. From (3.11), the fact that $u - \bar u \in G^\tau_{\bar u}$, Lemma 2.4, and using that $\varepsilon_5 \le \varepsilon_2$, we deduce that Since $\|w\|_{L^\infty(Q)} \le \beta - \alpha$, using the above inequality and $\varepsilon_5^{1/3}$ And using Lemma 2.3, we obtain the estimate: Using this, we have where the last inequality follows from the definition of $\varepsilon_0$. This combined with (3.16) yields (3.6).
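The elementary inequality invoked in the last step of the proof has a one-line justification, recorded here together with the weighted variant that is typically what "appropriate" choices of $a$ and $b$ amount to. The weight $s > 0$ is a symbol introduced for this illustration and does not appear in the proof above.

```latex
\[
  0 \le (a - b)^2 = a^2 - 2ab + b^2
  \;\Longrightarrow\;
  ab \le \tfrac12 a^2 + \tfrac12 b^2 .
\]
% Weighted variant: apply the inequality to \sqrt{s}\,a and b/\sqrt{s}, with s > 0.
\[
  ab = \bigl(\sqrt{s}\, a\bigr)\bigl(b/\sqrt{s}\bigr)
  \le \frac{s}{2}\, a^2 + \frac{1}{2s}\, b^2 .
\]
```

Choosing $s$ small shifts the weight onto the $b^2$ term, which is how a mixed product can be absorbed into a coercive quadratic term on one side and a remainder on the other.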

Further extensions and limitations
The method developed in the previous sections can be extended with the obvious modifications to the case of a control problem governed by an elliptic equation as well as to Neumann control problems for both elliptic and parabolic equations. However, let us mention two situations where the second order sufficient condition (3.1) is unlikely to hold.
First, consider the situation where $L \equiv 0$ and $\nu_\Omega = 1$. In this case we have Looking at this expression, it is easy to notice that the fulfillment of (3.1) would depend on a lucky combination of the signs of the adjoint state and the second derivative of the nonlinearity $f$. Consequently, Theorem 3.1 does not seem to be applicable to this problem. A similar situation may occur if a nonlinearity is introduced on the boundary without a boundary observation. Consider, for instance, the problem governed by the elliptic equation where $y_d \in L^2(\Omega)$ is given; $U_{ad} = \{u \in L^\infty(\Omega) : \alpha \le u(x) \le \beta \text{ for a.e. } x \in \Omega\}$, with $-\infty < \alpha < \beta < \infty$; and $-\Delta y_u = u$ in $\Omega$, $\partial_n y_u + g(x, y_u(x)) = 0$ on $\Gamma$.
With the straightforward adaptations to this problem of the notation used throughout the paper, the second derivative of $F$ reads as In order to apply our theorem, the second order condition should be Once again, this condition is unlikely to be fulfilled. The situation would be different if we had a boundary observation $y_\Gamma \in L^\infty(\Gamma)$, so that the functional $F$ is given by $F(u) = \frac12 \int_\Gamma (y_u(x) - y_\Gamma(x))^2 \, d\sigma(x)$.
Then we would get and the second order sufficient condition would have a chance to be fulfilled. For instance, if $\|\bar y - y_\Gamma\|_{L^2(\Gamma)}$ is small enough, then $\|\bar\varphi\|_{L^\infty(\Gamma)}$ is small as well and, consequently, we can deduce the existence of some $\delta > 0$ such that $1 - \bar\varphi\,\frac{\partial^2 g}{\partial y^2}(x, \bar y) \ge \delta$, which implies the above second order condition. From the previous two cases we conclude that a nonlinearity in the whole domain requires a distributed observation and a boundary nonlinearity needs a boundary observation for fulfillment of the second order sufficient condition.
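The passage from a small adjoint state to the lower bound by $\delta$ can be sketched as follows. The bound $C_g$ on the second derivative of $g$ along $\bar y$ and the explicit smallness threshold are assumptions introduced here for illustration; the text above only asserts that such a $\delta$ exists.

```latex
% Assumptions for illustration: \delta \in (0,1),
% |\partial^2 g / \partial y^2 (x, \bar y(x))| \le C_g for a.a. x \in \Gamma
% (C_g > 0 is a hypothetical constant), and
% \|\bar\varphi\|_{L^\infty(\Gamma)} \le (1 - \delta)/C_g .
\[
  1 - \bar\varphi(x)\, \frac{\partial^2 g}{\partial y^2}\bigl(x, \bar y(x)\bigr)
  \ge 1 - \|\bar\varphi\|_{L^\infty(\Gamma)}\, C_g
  \ge 1 - (1 - \delta)
  = \delta
  \quad \text{for a.a. } x \in \Gamma .
\]
```

Under these assumptions the sign of $\bar\varphi\,\partial^2 g/\partial y^2$ no longer matters, which is the mechanism by which a boundary observation makes the condition attainable.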