ANALYSIS OF SPATIO-TEMPORALLY SPARSE OPTIMAL CONTROL PROBLEMS OF SEMILINEAR PARABOLIC EQUATIONS

Optimal control problems with semilinear parabolic state equations are considered. The objective features one out of three different terms promoting various spatio-temporal sparsity patterns of the control variable. For each problem, first-order necessary optimality conditions as well as second-order necessary and sufficient optimality conditions are proved. The analysis includes the case in which the objective does not contain the squared norm of the control.

Mathematics Subject Classification. 49K20, 49J52, 35K58, 65K10.

Received March 20, 2015. Revised June 24, 2015. Accepted September 10, 2015.


Introduction
In this paper, we analyze optimal control problems governed by semilinear parabolic equations whose cost functional involves a functional j acting on the control that promotes sparsity of the optimal control. We present three different choices for the functional j. Each of these choices induces a different spatio-temporal sparsity pattern of the optimal control, and each of them is of interest in its own right. The control problems are formulated as

\[
(P_\nu)\qquad \min_{u \in K} J_\nu(u) = \int_{\Omega_T} L(x,t,y_u(x,t))\,\mathrm{d}x\,\mathrm{d}t + \frac{\nu}{2}\int_{\Omega_T} u^2(x,t)\,\mathrm{d}x\,\mathrm{d}t + \mu\, j(u),
\]

where K = {u ∈ L^2(Ω_T) : α ≤ u(x,t) ≤ β for a.a. (x,t) ∈ Ω_T}, j : L^2(Ω_T) → R is a Lipschitz continuous and convex, but not Fréchet differentiable, function, ν ≥ 0 and μ > 0. The state y_u is the solution of the semilinear parabolic equation

\[
\begin{cases}
\partial_t y_u + A y_u + a(x,t,y_u) = u & \text{in } \Omega_T,\\
y_u = 0 & \text{on } \Sigma_T = \Gamma \times (0,T),\\
y_u(0) = y_0 & \text{in } \Omega.
\end{cases}
\tag{1.1}
\]

Here, A is a linear elliptic operator. We mention that it is possible to replace the Dirichlet boundary condition y_u = 0 by a Neumann boundary condition ∂_{n_A} y_u = g with g ∈ L^p(Σ), provided that p is sufficiently large so that L^∞ estimates for the solution of the boundary value problem are available. The goal of this paper is to carry out the first- and second-order analysis of (P_ν). This analysis will be done for each of the three following choices of the functional j. When we take j = j_1, the corresponding problem (P_ν) will be denoted by (P^1_ν). Analogously, we define the control problems (P^2_ν) and (P^3_ν) corresponding to the other two functionals j_2 and j_3. Problems with the functional j_1 and linear elliptic equations were first analyzed in [15]. Later on, a second-order analysis in the presence of semilinear elliptic state equations was provided in [5] and adapted to measure-valued controls in [4]. Note that the functional j_1 does not provide control over the structure of the spatio-temporal sparsity pattern of the optimal control.
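For orientation, the displays defining the three functionals are not reproduced here; the standard choices consistent with the function spaces used below (L^1(Ω_T) for j_1 in Section 3.1, L^1(Ω; L^2(0,T)) for j_3 in Section 3.3, and the dual norm L^2(0,T; L^∞(Ω)) appearing in the characterization of ∂j_2) can be sketched as:

```latex
\begin{align*}
j_1(u) &= \int_{\Omega_T} |u(x,t)| \,\mathrm{d}x\,\mathrm{d}t,\\
j_2(u) &= \left( \int_0^T \Big( \int_\Omega |u(x,t)| \,\mathrm{d}x \Big)^{2} \mathrm{d}t \right)^{1/2},\\
j_3(u) &= \int_\Omega \left( \int_0^T u(x,t)^2 \,\mathrm{d}t \right)^{1/2} \mathrm{d}x.
\end{align*}
```

That is, j_1 is the L^1(Ω_T)-norm, j_2 the L^2(0,T; L^1(Ω))-norm and j_3 the L^1(Ω; L^2(0,T))-norm.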
Problems involving the functional j_2 have been studied in [6], again with measure-valued controls in place of L^1. The term j_2 promotes optimal controls which are spatially sparse for almost all points in time. The spatial sparsity pattern may change over time.
Finally, the functional j_3 promotes sparsity patterns which are spatially sparse and constant throughout the time interval. Corresponding optimal control problems with linear elliptic and parabolic equations have been studied in [11], and the term directionally sparse controls was coined there. An extension of this work to measure-valued controls can be found in [13].
The motivation for considering measure-valued controls in some of the above references is that problem (P_ν) is not well posed in L^2(Ω_T) in case ν = 0, provided that control constraints are also absent (i.e., α = −∞, β = ∞). In this situation, minimizing sequences will converge in the weak-* topology of an appropriate measure space. Due to the presence of the control bounds in (P_ν), we can obtain solutions in L^2(Ω_T) even when ν = 0; see Theorem 2.4 below.
Unless stated otherwise, the references above pertain to problems with linear state equations and convex objectives, hence no second-order analysis is necessary.
Besides the first-order necessary conditions, we derive in this paper second-order necessary and sufficient conditions for the non-convex problems (P^1_ν)-(P^3_ν), which, in case ν > 0, both use the same cone of critical directions and thus exhibit a minimal gap between second-order necessary and sufficient conditions. Note that the second-order directional derivatives v ↦ j''(u; v^2) of the above functionals do not exist in all directions. It is therefore necessary to define suitable substitutes; see (4.5)-(4.7). It was already shown in [5] that j''_1 = 0 can be used in the case of the first functional. This is, however, not true for j_2 and j_3.
The paper is organized as follows. We summarize our assumptions and some preliminary results in the following section. Section 3 is devoted to the derivation of first-order optimality conditions. As a corollary, we analyze the sparsity structure of the solution in all three cases; see Remark 3.11 and Figure 1. In Section 4 we address second-order necessary optimality conditions, and in Section 5 the second-order sufficient conditions.
We point out that the case ν = 0 is explicitly included in the analysis. The only problem that remains open is the second-order sufficient condition for problem (P^3_0). We comment on this case at the end of Section 5.

Assumptions and preliminary results
Throughout the paper, Ω denotes an open, bounded subset of R^n, 1 ≤ n ≤ 3, with a Lipschitz boundary Γ; see [14], Section 1.3. The final time T > 0 is given and fixed. We make the following assumptions on the functions and parameters involved in the control problem (P_ν).
The proof of the existence and uniqueness of a solution of (1.1) in W(0,T) ∩ L^∞(Ω_T) is standard. The reader is referred, for instance, to [2], where the arguments used for a Robin boundary condition can easily be adapted to the Dirichlet case. For the proof of the differentiability we proceed as follows. We introduce the space Y, endowed with the graph norm; Y is a Banach space. Using that y ∈ L^∞(Ω_T) together with (2.3), we deduce that a(·, ·, y) ∈ L^p(0,T; L^q(Ω)). Hence, F is well defined, and we can apply the implicit function theorem to deduce that G is of class C^2 and to show that (2.7) and (2.8) represent its first and second derivatives, respectively.
Remark 2.2. In Assumptions 2 and 3, the condition p, q ≥ 2 is not necessary for Theorem 2.1. Indeed, it is enough to impose p, q ∈ [1, +∞]. However, the assumption p, q ∈ [2, +∞] is useful to get some extra regularity for y_u, and it will simplify our presentation by avoiding some technicalities. Now, we have the following differentiability result.

Theorem 2.3. Under the above assumptions, the following statements hold.
Moreover, for all u, v, v_1 and v_2 in L^p(0,T; L^q(Ω)) we have (2.9)-(2.11), where A^* denotes the adjoint operator of A.
The fact that F_ν is of class C^2 is an immediate consequence of Theorem 2.1 and the chain rule. On the other hand, since y_u ∈ L^∞(Ω_T), we deduce from (2.5) that ∂L/∂y(·, ·, y_u) ∈ L^p(0,T; L^q(Ω)), which implies that ϕ_u is well defined and enjoys the indicated regularity. The formulas (2.9) and (2.10) follow from standard computations.
Analogously to Y, we define the corresponding space endowed with the graph norm. As established for Y, we also have the associated embedding. We conclude this section by stating the following theorem, whose proof follows from classical arguments by taking a minimizing sequence.

Theorem 2.4. Problem (P_ν) has at least one solution ū_ν.

First-order optimality conditions
Since (P_ν) is not a convex problem, we will deal with local solutions. We say that ū_ν is a local solution of (P_ν) if there exists ε > 0 such that J_ν(ū_ν) ≤ J_ν(u) for all u ∈ B_ε(ū_ν), where B_ε(ū_ν) denotes the open ball in L^2(Ω_T) centered at ū_ν with radius ε. Moreover, ū_ν is called a strict local minimum if the previous inequality is strict for every u ∈ B_ε(ū_ν) different from ū_ν.
The next theorem states the first-order optimality conditions satisfied by a local minimum of (P_ν). To this end, we recall the tangent cone T_K(ū_ν) of K at ū_ν with respect to the L^2(Ω_T) topology. In the theorem, ∂j(ū_ν) denotes the subdifferential, in the sense of convex analysis, of j at the point ū_ν.
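For the box K given by the control bounds α ≤ u ≤ β, the tangent cone admits the usual pointwise description; as a sketch (assuming K is this box of constraints):

```latex
T_K(\bar u_\nu) = \Big\{ v \in L^2(\Omega_T) \,:\,
v(x,t) \ge 0 \text{ if } \bar u_\nu(x,t) = \alpha,\ \
v(x,t) \le 0 \text{ if } \bar u_\nu(x,t) = \beta,\
\text{for a.a.\ } (x,t) \in \Omega_T \Big\}.
```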

Problem (P^1_ν)
Recall the functional j_1 : L^1(Ω_T) → R. Let us state some properties of j_1. First, a simple computation shows that λ ∈ ∂j_1(u) if and only if λ ∈ L^∞(Ω_T) and the corresponding pointwise relation holds a.e. in Ω_T. Moreover, the directional derivatives of j_1 are given in terms of Ω^+_{T,u}, Ω^-_{T,u} and Ω^0_{T,u}, which denote the sets of points of Ω_T where u is positive, negative or zero, respectively. Now, taking j = j_1 in Theorem 3.1, we deduce from the variational inequality (3.4) the following properties.

Corollary 3.2. Let ū_ν, φ̄_ν and λ̄_ν be as in Theorem 3.1. Then the relations (3.7)-(3.9) hold for almost all (x,t) ∈ Ω_T. Here, Proj_{[a,b]}(c) refers to the projection of c ∈ R onto the interval [a,b] ⊂ R. Moreover, λ̄_ν ∈ L^2(0,T; H^1_0(Ω)) holds and it is unique. Finally, if ν > 0, we also have that ū_ν ∈ L^2(0,T; H^1_0(Ω)).
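The characterization of ∂j_1 and the directional derivative referred to above are the standard ones for the L^1-norm; as a sketch:

```latex
\lambda \in \partial j_1(u) \iff \lambda \in L^\infty(\Omega_T) \text{ with }\
\lambda(x,t) \begin{cases} = +1 & \text{if } u(x,t) > 0,\\ = -1 & \text{if } u(x,t) < 0,\\ \in [-1,1] & \text{if } u(x,t) = 0, \end{cases}
```

and

```latex
j_1'(u;v) = \int_{\Omega^+_{T,u}} v \,\mathrm{d}x\,\mathrm{d}t
          - \int_{\Omega^-_{T,u}} v \,\mathrm{d}x\,\mathrm{d}t
          + \int_{\Omega^0_{T,u}} |v| \,\mathrm{d}x\,\mathrm{d}t.
```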
The proofs of (3.7) and (3.9) were given in [5], Corollary 3.2, and (3.8) can be found in [3], Theorem 3.1. In both cases the control problems are elliptic, but the proofs carry over to the parabolic case without change (up to replacing the argument x by (x,t)), since the above corollary is just a consequence of (3.4). In the case ν > 0, the first equivalence of (3.7) shows the sparsity of ū_ν, and the regularity follows from the second relation. For ν = 0, if the set of points (x,t) ∈ Ω_T where |φ̄_ν(x,t)| = μ has zero Lebesgue measure (which is expected in many cases), then ū_ν(x,t) ∈ {α, 0, β} for almost all (x,t) ∈ Ω_T, which means that the optimal control has a bang-bang-bang structure.
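To make the projection structure behind Corollary 3.2 concrete, the following sketch evaluates, for ν > 0, pointwise formulas of the type proved in [5] for the elliptic case; the exact displays (3.7)-(3.9) are not reproduced above, so the expressions for λ̄_ν and ū_ν used here are assumptions modeled on that reference.

```python
import numpy as np

def proj(c, a, b):
    """Projection of the real number(s) c onto the interval [a, b]."""
    return np.clip(c, a, b)

def pointwise_control(phi, nu, mu, alpha, beta):
    """Pointwise evaluation of (u_bar, lambda_bar) from the adjoint state phi,
    assuming the projection formulas familiar from the elliptic case [5]:
        lambda_bar = Proj_[-1,1](-phi / mu),
        u_bar      = Proj_[alpha,beta](-(phi + mu * lambda_bar) / nu).
    Wherever |phi| <= mu, the term phi + mu*lambda_bar vanishes, so u_bar = 0,
    mirroring the sparsity relation in (3.7)."""
    lam = proj(-phi / mu, -1.0, 1.0)
    u = proj(-(phi + mu * lam) / nu, alpha, beta)
    return u, lam

# Example: three adjoint values; the one inside the band |phi| <= mu
# yields a zero control, the others saturate at the bounds.
phi = np.array([0.5, -2.0, 2.0])
u, lam = pointwise_control(phi, nu=1.0, mu=1.0, alpha=-1.0, beta=1.0)
```

For |φ̄_ν| ≤ μ the computed control vanishes, and outside that band it saturates at the bounds α or β, which is the bang-bang-bang structure discussed above.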
Using this characterization, it is easy to check that (3.10) implies λ ∈ ∂j_2(u). Now, we prove that (3.10) holds for λ ∈ ∂j_2(u) and u ≠ 0. From (3.13), we obtain that there exists a constant c > 0 such that ‖λ(t)‖_{L^∞(Ω)} ≤ c for almost all t ∈ (0,T). We infer from (3.15) that ‖λ‖_{L^2(L^∞)} = 1 holds, and thus we obtain (3.10). The characterization of ∂j_2(0) follows directly from (3.13). Finally, (3.12) follows from (3.10).

Now, we compute the directional derivatives j'_2(u; v). First, we define the auxiliary functional j_Ω. Analogously to (3.6), we obtain the directional derivative of j_Ω.

Proof. Since j_2 is positively homogeneous, (3.19) is obvious, and we need to consider only the case u ≠ 0. Let us take 0 < ρ < 1; it is then enough to take the limit as ρ ↘ 0 to deduce (3.18). Now, we deduce from Theorem 3.1 the following corollary in the case j = j_2.
The regularity properties of λ̄_ν and ū_ν are immediate consequences of (3.22) and (3.20), respectively.
Remark 3.7. Let us observe that ū_ν = 0 if μ is bigger than a certain value μ_0. Indeed, as pointed out in Remark 3.3, there exists a constant M > 0 such that ‖φ̄_ν‖_{L^∞(Ω_T)} ≤ M, with M depending on α and β but independent of μ. If we take μ_0 = M√T, then ū_ν = 0 for every μ > μ_0. Let us prove this by contradiction. If ū_ν ≠ 0, then the identity (3.22) holds, and consequently the resulting inequality together with (3.10) implies that ū_ν = 0, a contradiction. Therefore, we may influence the size of an optimal control's support by adjusting μ in the interval [0, M√T].
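The threshold μ_0 = M√T arises from a direct estimate of the L^2(0,T; L^∞(Ω)) norm of the adjoint state:

```latex
\|\bar\varphi_\nu\|_{L^2(0,T;L^\infty(\Omega))}
= \left( \int_0^T \|\bar\varphi_\nu(\cdot,t)\|_{L^\infty(\Omega)}^2 \,\mathrm{d}t \right)^{1/2}
\le \left( \int_0^T M^2 \,\mathrm{d}t \right)^{1/2}
= M\sqrt{T},
```

which is the estimate entering the contradiction argument of Remark 3.7.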

Problem (P^3_ν)
Now, we consider the functional j_3 : L^1(Ω; L^2(0,T)) → R. Let us study the properties of this functional. To this end, we introduce a new functional Ψ that will be used later in this paper. Given an element u ∈ L^1(Ω; L^2(0,T)), we denote by Ω_u the set of points x ∈ Ω where ‖u(x, ·)‖_{L^2(0,T)} > 0, and Ω^0_u = Ω \ Ω_u. Now we characterize ∂j_3(u) and compute j'_3(u; v).

Proposition 3.8. The following statements hold.
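For the norm of L^1(Ω; L^2(0,T)), the subdifferential has a pointwise-in-space characterization analogous to the L^1 case; a sketch of the expected form of the statements of Proposition 3.8 is:

```latex
\lambda \in \partial j_3(u) \iff
\begin{cases}
\lambda(x,\cdot) = \dfrac{u(x,\cdot)}{\|u(x,\cdot)\|_{L^2(0,T)}} & \text{for a.a. } x \in \Omega_u,\\[2mm]
\|\lambda(x,\cdot)\|_{L^2(0,T)} \le 1 & \text{for a.a. } x \in \Omega^0_u.
\end{cases}
```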
We have to prove that |E| = 0. Let us choose a suitable v in (3.4); note that v ∈ T_K(ū_ν) holds in view of α < 0 and β > 0. Using the second relation of (3.27), we deduce from the resulting inequality an estimate which, since ν > 0, is possible only if |E| = 0.

Let us consider the case ν = 0. The second implication of (3.31) is proved like the corresponding implication in the case ν > 0. The first implication is also proved by arguing as above; the only difference is that the identity |E| = 0 does not follow from ν > 0, but from the strict inequality ‖φ̄_ν(x)‖_{L^2(0,T)} < μ.
Finally, (3.32) is an immediate consequence of (3.4) and (3.27). The uniqueness of λ̄_ν follows from the representation (3.32).

Remark 3.10. As in Remarks 3.3 and 3.7, we can obtain the existence of a constant M > 0 independent of μ such that ‖φ̄_ν‖_{L^∞(L^2)} ≤ M. Therefore, (3.30) and (3.31) imply that ū_ν = 0 if μ > M. Hence, we can influence the size of an optimal control's support by adjusting the parameter μ ∈ [0, M].

Remark 3.11. It is interesting to compare the sparsity properties of the local solutions ū_ν corresponding to the studied problems. From (3.7) and (3.8) we obtain that the local solutions ū_ν of (P^1_ν) are sparse in space and time. However, the solutions of (P^3_ν) are only sparse in space, as proved by (3.30) and (3.31), the sparsity region remaining constant throughout time. When we look at (3.20) and (3.21), we observe that the sparsity region of the solutions of (P^2_ν) can change with time. Thus we confirm the sparsity patterns anticipated in the introduction. Each of the three formulations can be of interest, with different possible applications.
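The three sparsity patterns compared in Remark 3.11 can be probed numerically. The sketch below evaluates discrete approximations of the three functionals on a gridded control; the definitions of j_1, j_2 and j_3 assumed here are the norms of L^1(Ω_T), L^2(0,T; L^1(Ω)) and L^1(Ω; L^2(0,T)) suggested by the function spaces named in this section.

```python
import numpy as np

# Discrete control u[i, k] ~ u(x_i, t_k) on a uniform grid with mesh widths
# dx (space) and dt (time).  The three functions below are discrete
# approximations of the (assumed) norms of L^1(Omega_T),
# L^2(0,T; L^1(Omega)) and L^1(Omega; L^2(0,T)), respectively.

def j1(u, dx, dt):
    # ||u||_{L^1(Omega_T)}: absolute value integrated over space and time
    return np.sum(np.abs(u)) * dx * dt

def j2(u, dx, dt):
    # L^1(Omega) norm of each time slice, then L^2(0,T) norm in time
    spatial_l1 = np.sum(np.abs(u), axis=0) * dx
    return np.sqrt(np.sum(spatial_l1 ** 2) * dt)

def j3(u, dx, dt):
    # L^2(0,T) norm at each spatial point, then L^1(Omega) norm in space
    temporal_l2 = np.sqrt(np.sum(u ** 2, axis=1) * dt)
    return np.sum(temporal_l2) * dx
```

Note that j3 charges a spatial point only once, no matter when the control acts there, which is one way to see why it favors supports that are constant in time.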
In Figure 1, we show the optimal controls in the linear case using T = 1 and Ω = (0, 1); the desired state is displayed there as well. The state equation is the one-dimensional linear parabolic equation with homogeneous initial and Dirichlet boundary conditions.

Second-order necessary optimality conditions
In this section, ū_ν denotes an element of K, with associated elements (ȳ_ν, φ̄_ν, λ̄_ν) ∈ Y × Φ × ∂j(ū_ν), such that the optimality system (3.2)-(3.4) holds. In order to address the necessary second-order optimality conditions, we introduce the cone of critical directions as follows.
Before proving this proposition we have to establish the following lemma.
Proof of Proposition 4.1. One inclusion is obvious. The converse inequality is a consequence of Lemma 4.2, hence v ∈ C_{ū_ν}.

Now, we are going to define replacements for the second directional derivatives of the functional j, denoted by j''; they are obtained by formal calculations. Note that the symbol j'' does not mean that the respective terms are second-order directional derivatives; indeed, those derivatives do not exist for all directions v. Given u, v ∈ L^2(Ω_T), we set J''_ν(u; v^2) = F''_ν(u)v^2 + μ j''(u; v^2), with F''_ν(u) defined by (2.10) and j''(u; v^2) defined for the three different functionals under investigation by

j''_1(u; v^2) = 0,   (4.5)

and by (4.6) and (4.7) for j_2 and j_3, respectively. In (4.6) and (4.7), j_Ω is given by (3.17) and Ω_u is defined in Section 3.3. With this notation, we have the following second-order necessary optimality conditions, valid for the three functionals j_i.

Theorem 4.3. Let ν ≥ 0 and let ū_ν be a local minimum of (P_ν). Then J''_ν(ū_ν; v^2) ≥ 0 for every v ∈ C_{ū_ν}.
The rest of the section will be devoted to the proof of this theorem.We distinguish three cases.

Problem (P^1_ν)
For this problem we recall that, by definition, J''_ν(ū_ν; v^2) = F''_ν(ū_ν)v^2; see (4.5). Therefore we can formulate the second-order sufficient condition in terms of F''_ν. We will distinguish the cases ν > 0 and ν = 0. For ν > 0 there are different equivalent ways of formulating the second-order sufficient optimality conditions.

Theorem 5.1. Let us assume that ν > 0. Then the following statements are equivalent, where z_v = G'(ū_ν)v is the solution of the linearized parabolic equation (2.7) corresponding to y_u = ȳ_ν.
Before establishing the theorem on the second-order sufficient conditions, we need to prove a technical lemma.
Proof. First, from (1.1) and the boundedness of K we obtain (5.8). Now, for every u ∈ K, subtracting the equations satisfied by y_u and ȳ_ν and using the mean value theorem, we obtain an estimate from which (5.9) follows. We proceed with the adjoint states ϕ_u and φ̄_ν in a similar way. From (5.8) and (5.9), along with the assumptions (2.3) and (2.5), we obtain (5.10). Subtracting the equations satisfied by z_{u,v} and z_v, we deduce from the mean value theorem and (2.3) the estimate (5.11); hence, we also have (5.12). From the expression (2.10) we infer a decomposition into four terms I_1, ..., I_4. Let us estimate these four terms. First, (2.3), (2.5), (5.8), (5.11) and (5.12) imply an estimate for I_1; using (5.9) and taking ε_1 > 0 sufficiently small, we deduce from it the desired bound for ‖u − ū_ν‖_{L^2(Ω_T)} ≤ ε_1. Using again (5.9), along with Assumption (2.6), we infer the existence of ε_2 > 0 such that I_2 satisfies the above inequality for ‖u − ū_ν‖_{L^2(Ω_T)} ≤ ε_2. Analogously, using (5.9), (2.4) and the fact that φ̄_ν ∈ L^∞(Ω_T), we deduce the same estimate for I_3 for some ε_3 > 0. Finally, from (5.8), (5.10) and (2.3) we obtain the corresponding estimate for I_4. Adding the estimates for the I_i, we conclude (5.7) for ε = min_{1≤i≤4} ε_i.

Theorem 5.3. Let ν > 0 and assume that ū_ν satisfies (5.3). Then there exists ε > 0 such that (5.13) holds, where B_ε(ū_ν) denotes the ball of L^2(Ω_T) centered at ū_ν with radius ε.
Case I: u − ū_ν ∈ C^τ_{ū_ν}. From the definition (5.1) we obtain an inequality; hence, making a Taylor expansion of F_ν(u) around ū_ν and using (4.4), (5.14), and this inequality, we arrive at an estimate. From the definitions of ε and ε_1, we deduce (5.13).
Case II: u − ū_ν ∉ C^τ_{ū_ν}. Now, from (4.2), (5.3), and (5.15) we infer the desired estimate.

Next we consider the case ν = 0. Under this assumption the relations (5.2)-(5.5) are not equivalent. It is known that (5.2) is not a sufficient second-order condition for local optimality in general; see the example by Dunn [7]. On the other hand, (5.3) is never fulfilled for ν = 0. Indeed, the reader is referred to [3] for the proof of this statement in the case of an elliptic state equation, which can be reproduced in the parabolic case by simply replacing x by (x,t). Finally, we prove that the condition (5.4) is sufficient for the local optimality of ū_ν. However, the conclusion (5.13) does not hold; a weaker consequence is deduced from (5.4) for ν = 0.

Theorem 5.4. Let ν = 0 and assume that ū_ν satisfies (5.4). Then there exists ε > 0 such that the corresponding estimate holds, where B_ε(ū_ν) denotes the ball of L^2(Ω_T) centered at ū_ν with radius ε.
Proof. The proof follows the same steps as the preceding one, with minor modifications. Let us point out the changes. First, we adjust the choice of ε_1, where C_0 was introduced in (5.14). Second, using again Lemma 5.2, we obtain (5.17). Finally, we set ε = min{ε_1, ε_2}. Now, if u − ū_ν ∈ C^τ_{ū_ν}, then we argue as in the proof of Theorem 5.3, using this time the first inequality of (5.14) and later (5.5). If u − ū_ν ∉ C^τ_{ū_ν}, we proceed exactly as in the proof of Theorem 5.3, using (5.17) and (5.4) instead of (5.15) and (5.3).

Problem (P^2_ν)
For the problem (P^2_ν) we have that J''_ν(ū_ν; v^2) = F''_ν(ū_ν)v^2 + μ j''_2(ū_ν; v^2), where j''_2(ū_ν; v^2) is defined by (4.6). Though the term j''_2(ū_ν; v^2) can contribute to the coercivity of the second derivative J''_ν(ū_ν; v^2), it makes the analysis of the second-order conditions technically more complicated, as we will see in the next theorem.

Theorem 5.5. Let us assume that ν > 0. Then the following statements are equivalent, where z_v = G'(ū_ν)v is the solution of (2.7) corresponding to y_u = ȳ_ν.
Proof. For 0 < t < T we define the quantities I^+_k, I^-_k and I^0_k. Then, by (3.17), we have the corresponding identity for every w ∈ L^2(Ω_T), and hence the associated limit for every f ∈ L^2(0,T). Analogously, we have that I^-_k → I^-. It remains to prove the convergence I^0_k → I^0 to conclude the proof. To this end, we observe the required convergence; using again the weak convergence v_k ⇀ v, we can pass to the limit in the remaining integrals of v(x,t) dx dt.
The last three limits imply the desired convergence. Then, we can take a subsequence, denoted in the same way, with the stated pointwise properties. According to (3.13), we distinguish three cases for λ̄_ν(x,t).
Then, from (5.24) we obtain the desired conclusion.

The next lemma shows that j_2 satisfies a second-order Taylor (directional) expansion and is a preparation for Theorem 5.8.

Lemma 5.7. Let u ∈ L^2(Ω_T) be arbitrary. For any δ > 0, there exists ε > 0 such that the expansion holds for all ‖v‖_{L^2(Ω_T)} ≤ ε.
Proof. We take ε_1 as in the proof of Theorem 5.3. Let us take an arbitrary element u ∈ K ∩ B_ε(ū_ν). If u − ū_ν ∈ C^τ_{ū_ν}, then we can repeat the proof of Theorem 5.3 to obtain (5.29). Let us consider the case u − ū_ν ∉ C^τ_{ū_ν}. Making a Taylor expansion of F_ν(u) around ū_ν and using Lemma 5.7 and (5.30), we obtain the desired estimate.

Finally, we analyze the case ν = 0. To this end, we need a Taylor expansion of j_2 similar to Lemma 5.7, but we now have to estimate the remainder in terms of both j'_2 and j''_2, since the second-order condition (5.20) only provides a growth with respect to ‖z_v‖_{L^2(Ω_T)}.

Lemma 5.9. Let u ∈ L^2(Ω_T) be arbitrary. There exist ε > 0 and C > 0 such that the corresponding estimate holds for all ‖v‖_{L^2(Ω_T)} ≤ ε.
Proof. In case u = 0, the assertion follows from j'_2(0; v) = j_2(v) and j''_2(0; v^2) = 0. Now, let u ≠ 0. We proceed as in the proof of Lemma 5.7 and define f and g in the same way; thus, (5.27) and (5.28) hold. Moreover, using the lower bound for ‖f_θ‖_{L^2(0,T)} proved there, we deduce a refined version of (5.28). It remains to compare the last terms with the first and second directional derivatives of j_2. To this end we need to compute Ψ''(f)g^2. The last inequality follows from the convexity of z ↦ z^{3/2} for z ≥ 0 and j''_2(u; v^2) ≥ 0.
Proof. Let us define τ and the corresponding cone E^τ_{ū_ν}. Due to (5.5), (5.20) holds for all v ∈ E^τ_{ū_ν}. Then, from (5.14) and (5.20) we get, for every v ∈ E^τ_{ū_ν} and all μ_0 ∈ (0, μ), the inequality (5.34). Given ε > 0 to be fixed later, we take an arbitrary element u ∈ B_ε(ū_ν) ∩ K and distinguish two cases. First, we assume that u − ū_ν ∈ E^τ_{ū_ν}. Then we argue similarly to the proof of Theorem 5.4 and use (5.5) and (5.14), assuming that 0 < ε ≤ ε_1 with ε_1 chosen as in the proof of Theorem 5.4. Now, we suppose that u − ū_ν ∉ E^τ_{ū_ν}. By using Lemma 5.9, we obtain the Taylor expansion (5.35). The first line on the right-hand side is non-negative, since ū_ν satisfies the first-order condition (3.4). For the second line, we use (5.34). The third line is bounded by using (5.7), which holds with ρ = δ/4 for ε small enough.
In the last two lines, we use an estimate which again holds for ε small enough. Now, the above inequality simplifies to (5.36). From (2.9) we obtain the corresponding representation; it is then enough to use (2.5) with M = ‖ȳ_ν‖_{L^∞(Ω_T)} to deduce the required bound. Using this estimate in (5.35) and (5.36), we find the desired inequality for ε small enough.

Problem (P^3_ν)
For the problem (P^3_ν) we have that J''_ν(ū_ν; v^2) = F''_ν(ū_ν)v^2 + μ j''_3(ū_ν; v^2), where j''_3(ū_ν; v^2) is given by (4.7). Analogously to Theorems 5.1 and 5.5, we have the next result.

Theorem 5.11. Let us assume that ν > 0. Then the statements (5.37)-(5.39) are equivalent.

Proof. It is enough to prove that (5.37) implies (5.38). The proof follows the same steps as that of Theorem 5.5. The only difference is the way of obtaining the relevant inequality; to prove it, it is enough to observe that the mapping L^2(Ω_T) ∋ v ↦ j''_3(ū_ν; v^2) ∈ R is convex and lower semicontinuous. Now, we prove the theorem analogous to Theorems 5.3 and 5.8.

Theorem 5.12. Let ν > 0 and assume that ū_ν satisfies (5.37). Then there exist ε > 0 and δ > 0 such that (5.40) holds.

Proof. We assume that ū_ν ≢ 0, the case ū_ν ≡ 0 being immediate. We argue by contradiction. If (5.40) does not hold for any ε > 0 and δ > 0, then for every integer k ≥ 1 there exists an element u_k ∈ K such that (5.41) holds. Let us define v_k by (5.42). We take a subsequence, if necessary, such that v_k ⇀ v in L^2(Ω_T). The proof is split into three steps.
Step I: v ∈ C_{ū_ν}. First we observe that v_k ∈ T_K(ū_ν) for every k. Since T_K(ū_ν) is convex and closed in L^2(Ω_T), we have that v ∈ T_K(ū_ν) as well. On the other hand, since j_3 is a Lipschitz and convex function, we have the corresponding lower semicontinuity inequality; the last equality is an immediate consequence of the definition of v_k in (5.42). Using this inequality and (5.41), we get an estimate which, together with (4.2), implies that F'_ν(ū_ν)v + μ j'_3(ū_ν; v) = 0, hence v ∈ C_{ū_ν}.
In the above proof, the fact that ν > 0 was crucial to obtain the contradiction. The proof of the sufficient second-order conditions for the case ν = 0 is an open problem for us. An important difference between the cases j_2 and j_3 is that there is no singularity in j''_2(u; v^2) if u = 0; however, we can have singularities in the integral defining j''_3(u; v^2) for u ≠ 0 when ‖u(x)‖_{L^2(0,T)} → 0. The integrals in (4.7) can be +∞. This renders the handling of the remainder terms in the Taylor expansions of j_3(u) around ū_ν rather complicated. To be more precise, we were not able to show a remainder-term estimate parallel to Lemma 5.9 for j_3. This estimate, however, was crucial in the proof of Theorem 5.10, since in the case ν = 0, (5.39) only provides a growth in terms of ‖z_v‖_{L^2(Ω_T)}.

Figure 1. Different sparsity structures of optimal controls using j_1 (top right), j_2 (bottom left) and j_3 (bottom right). The desired state is shown top left. The problem parameters are given in Remark 3.11.