Analysis of optimal control problems of semilinear elliptic equations by BV-functions

Optimal control problems for semilinear elliptic equations with control costs in the space of functions of bounded variation are analysed. BV-based optimal controls favor piecewise constant, and hence 'simple', controls with few jumps. Existence of optimal controls and necessary and sufficient optimality conditions of first and second order are analysed. Special attention is paid to the effect of the choice of the vector norm in the definition of the BV-seminorm on the optimal primal and adjoint variables.

1. Introduction. This paper is dedicated to the study of the optimal control problem
\[ \text{(P)} \qquad \min_{u \in BV(\omega)} J(u) = \frac{1}{2}\int_\Omega |y - y_d|^2\,dx + \alpha\int_\omega |\nabla u| + \frac{\beta}{2}\Big(\int_\omega u\,dx\Big)^2 + \frac{\gamma}{2}\int_\omega u^2\,dx, \]
where $y$ is the unique solution to the Dirichlet problem
\[ -\Delta y + f(x,y) = u\chi_\omega \ \text{ in } \Omega, \qquad y = 0 \ \text{ on } \Gamma. \tag{1.1} \]
The control domain $\omega$ is an open subset of $\Omega$. We assume that $\alpha > 0$, $\beta \ge 0$, $\gamma \ge 0$, $y_d \in L^2(\Omega)$, and $\Omega$ is a bounded domain in $\mathbb{R}^n$, $n = 2$ or $3$, with Lipschitz boundary $\Gamma$. Additionally we make the following hypothesis:
\[ \text{if } n = 3, \text{ then } \gamma > 0 \text{ is assumed.} \tag{1.2} \]
Here, $BV(\omega)$ denotes the space of functions of bounded variation in $\omega$ and $\int_\omega |\nabla u|$ stands for the total variation of $u$. The assumptions on the nonlinear term $f(x,y)$ in the state equation will be formulated later. The penalty term involving the mean of $u$, which is active when $\beta > 0$, accounts for the fact that constant functions constitute the kernel of the BV-seminorm. If $\gamma = 0$, then, depending on the growth order of the nonlinearity $f$, it can be necessary to choose $\beta > 0$ to guarantee that (P) admits a solution.
The use of the BV-seminorm in (P) promotes optimal controls that are piecewise constant in space. Thus the cost functional in (P) models the objective of simultaneously determining a control of simple structure and a resulting state $y = y(u)$ which is as close to $y_d$ as possible. By comparison, the common formulations with $L^2(\omega)$ or $L^p(\omega)$ control-cost functionals, with $p > 2$ to match the nonlinearity $f$, produce smooth optimal controls which may be more intricate to realize in practice than the controls resulting from the BV-formulation. Piecewise constant behavior of the optimal controls can also be obtained by introducing bilateral bounds $a \le u(x) \le b$ together with only the tracking term in (P). In this case we can expect optimal controls which exhibit bang-bang structure. If an $L^1(\omega)$ control cost term is added, then the optimal control will be of the form bang-zero-bang. But this type of behavior is distinctly different from that which is allowed in (P), since in (P) the values of the piecewise constant plateaus are not prescribed, whereas in the bilaterally constrained case the optimal control typically assumes one of the extreme values $a$ or $b$. This in turn can lead to unnecessarily high control costs.
Possibly one of the first papers where this was pointed out, but not systematically investigated, is [15]. In [9] semilinear parabolic equations with temporally dependent BV-functions as controls were investigated. There we focused on controls which switch optimally in time. The analysis for that case is simpler and exploits specific properties of BV-functions in dimension one. Numerically, the simple structure of the controls obtained for BV-constrained control problems was already demonstrated in [5,9] and in a recent master's thesis [19]. BV-seminorm control costs are also employed in [8], where the control appears as a coefficient in the p-Laplace equation. Beyond these papers, the choice of control costs related to BV-norms or BV-seminorms has not yet received much attention in the optimal control literature.
In mathematical image analysis, by contrast, the BV-seminorm has received a tremendous amount of attention. The beginning of this activity is frequently dated to [22]. Let us also mention the recent paper [2], which gives interesting insight into the structure of the subdifferential of the BV-seminorm. Fine properties of BV-functions in the context of image reconstruction problems, in particular the staircasing effect, were analyzed for the one-dimensional case in [21], and in higher dimensions in [20,14], for example. In [13] the authors provided a convergence analysis for BV-regularized mathematical imaging problems by finite elements, paying special attention to the choice of the vector norm in the definition of the BV-seminorm.
Let us also compare the use of the BV-term in (P) with the efforts that have been made to study optimal control problems with sparsity constraints. These formulations involve either measure norms of the control or $L^1$-functionals combined with pointwise constraints on the control. We cite [5,7] from among the many results which are now available. From this point of view, the BV-seminorm can also be understood as a sparsity constraint for the first derivative.
Let us briefly describe the structure of the paper. Section 2 contains an analysis of the state equation and the smooth part of the cost functional. The nonsmooth part of the cost functional is investigated in Section 3. Special attention is given to the consequences which arise from the specific choice made for the vector norm in the variational definition of the BV-seminorm. In particular, we consider the Euclidean and the infinity norms. Existence of optimal solutions and first order optimality conditions are obtained in Section 4. Second order sufficient optimality conditions are provided in Section 5. Finally, in Section 6 we consider (P) with an additional $H^1(\omega)$ regularisation term and investigate the asymptotic behavior as the weight of the $H^1(\omega)$ regularisation tends to 0.
2. Analysis of the state equation and the cost functional. We recall that a function $u \in L^1(\omega)$ is a function of bounded variation if its distributional derivatives $\partial_{x_i} u$, $1 \le i \le n$, belong to the Banach space $M(\omega)$ of real and regular Borel measures.
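Given a measure $\mu \in M(\omega)$, its norm is given by
\[ \|\mu\|_{M(\omega)} = \sup\Big\{ \int_\omega z\,d\mu : z \in C_0(\omega),\ \|z\|_{C_0(\omega)} \le 1 \Big\} = |\mu|(\omega), \]
where $C_0(\omega)$ denotes the Banach space of continuous functions $z : \bar\omega \to \mathbb{R}$ such that $z = 0$ on $\partial\omega$, and $|\mu|$ is the total variation measure associated with $\mu$. On the product space $M(\omega)^n$ we define the norm
\[ \|\mu\|_{M(\omega)^n} = \sup\Big\{ \int_\omega z\,d\mu : z \in C_0(\omega)^n,\ |z(x)| \le 1 \ \ \forall x \in \omega \Big\}, \qquad \int_\omega z\,d\mu = \sum_{j=1}^n \int_\omega z_j\,d\mu_j, \]
where $|\cdot|$ is a norm in $\mathbb{R}^n$.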
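As a small illustration of how the vector norm enters the product norm, consider the following example of our own. It assumes that the norm on $M(\omega)^n$ is the supremum of $\int_\omega z\,d\mu$ over all $z \in C_0(\omega)^n$ with $|z(x)| \le 1$, and it takes $n = 2$, a point $\xi \in \omega$, and $\mu = (\delta_\xi, \delta_\xi)$. Any value $z(\xi)$ in the unit ball of the chosen norm can be attained by a suitably scaled bump function.

```latex
\begin{align*}
\int_\omega z\,d\mu &= z_1(\xi) + z_2(\xi), \\
|\cdot| = |\cdot|_\infty: \qquad \|\mu\|_{M(\omega)^2}
  &= \sup_{|z(\xi)|_\infty \le 1} \big( z_1(\xi) + z_2(\xi) \big) = 2, \\
|\cdot| = |\cdot|_2: \qquad \|\mu\|_{M(\omega)^2}
  &= \sup_{|z(\xi)|_2 \le 1} \big( z_1(\xi) + z_2(\xi) \big) = \sqrt{2}.
\end{align*}
```

The same measure thus has different norms depending on which vector norm bounds the test functions.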
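On $BV(\omega)$ we consider the usual norm
\[ \|u\|_{BV(\omega)} = \|u\|_{L^1(\omega)} + \|\nabla u\|_{M(\omega)^n}, \]
that makes $BV(\omega)$ a Banach space; see [1, Chapter 3] or [18, Chapter 1] for details. We recall that the total variation of $u$ is given by
\[ \|\nabla u\|_{M(\omega)^n} = \sup\Big\{ \int_\omega u\,\operatorname{div} z\,dx : z \in C_c^1(\omega)^n,\ |z(x)| \le 1 \ \ \forall x \in \omega \Big\}. \]
We also use the notation
\[ \int_\omega |\nabla u| = \|\nabla u\|_{M(\omega)^n}, \]
as already employed in (P). For these topologies $\nabla : BV(\omega) \to M(\omega)^n$ is a linear continuous mapping. In the sequel we will denote $a_u = \frac{1}{|\omega|}\int_\omega u(x)\,dx$ and $\hat u = u - a_u$ for every $u \in BV(\omega)$.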
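A discrete analogue may make the role of the vector norm in the total variation more concrete. The following is a minimal sketch, not the discretization used by the authors: it assumes a uniform grid with unit mesh size and forward differences, and the names `discrete_total_variation`, `indicator_of_square`, and `test_norm` are hypothetical. It uses the elementary fact that the supremum of $z \cdot v$ over $|z|_2 \le 1$ is $|v|_2$, while over $|z|_\infty \le 1$ it is $|v_1| + |v_2|$.

```python
import math


def discrete_total_variation(u, test_norm="euclidean"):
    """Forward-difference total variation of a grid function u (unit mesh size).

    test_norm names the vector norm bounding the test fields in the
    variational definition: "euclidean" measures each discrete gradient in
    the Euclidean norm, "infinity" sums the absolute values of its components.
    """
    rows, cols = len(u), len(u[0])
    total = 0.0
    for i in range(rows):
        for j in range(cols):
            # Forward differences, set to zero past the last row or column.
            dx = u[i + 1][j] - u[i][j] if i + 1 < rows else 0.0
            dy = u[i][j + 1] - u[i][j] if j + 1 < cols else 0.0
            if test_norm == "euclidean":
                total += math.hypot(dx, dy)
            elif test_norm == "infinity":
                total += abs(dx) + abs(dy)
            else:
                raise ValueError(f"unknown test_norm: {test_norm!r}")
    return total


def indicator_of_square(grid_size, start, side):
    """Piecewise constant grid function: 1 on a side-by-side block, 0 elsewhere."""
    stop = start + side
    return [
        [1.0 if start <= i < stop and start <= j < stop else 0.0
         for j in range(grid_size)]
        for i in range(grid_size)
    ]


# A control with a single plateau: one 4-by-4 block inside a 10-by-10 grid.
u = indicator_of_square(grid_size=10, start=3, side=4)
tv_euclidean = discrete_total_variation(u, "euclidean")
tv_infinity = discrete_total_variation(u, "infinity")
print(tv_euclidean, tv_infinity)
```

For this block the infinity variant returns 16, the perimeter of the square, while the Euclidean variant returns $14 + \sqrt{2}$, because exactly one grid point carries a jump in both coordinate directions.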
By using [1, Theorem 3.44] it is easy to deduce that there exists a constant $C_\omega$ such that
In addition, we mention that $BV(\omega)$ is the dual space of a separable Banach space. Therefore every bounded sequence $\{u_k\}_{k=1}^\infty$ in $BV(\omega)$ has a subsequence converging weakly$^*$ to some $u \in BV(\omega)$. The weak$^*$ convergence $u_k \stackrel{*}{\rightharpoonup} u$ implies that $u_k \to u$ strongly in $L^1(\omega)$ and $\nabla u_k \stackrel{*}{\rightharpoonup} \nabla u$ in $M(\omega)^n$; see [1, pages 124-125]. We will also use that $BV(\omega)$ is continuously embedded in $L^p(\omega)$ for $1 \le p \le \frac{n}{n-1}$, and compactly embedded in $L^p(\omega)$ for every $p < \frac{n}{n-1}$; see [1, Corollary 3.49]. From this property we deduce that the convergence $u_k \stackrel{*}{\rightharpoonup} u$ in $BV(\omega)$ implies that $u_k \to u$ strongly in $L^p(\omega)$ for all $p < \frac{n}{n-1}$. We make the following assumption on the nonlinear term of the state equation.
By using these assumptions, the following proposition can be proved in a standard way; see, for instance, [26, §4.2.4]. For the Hölder continuity result, the reader is referred to [17, Theorem 8.29].
Proposition 2.1. For every $u \in L^p(\omega)$ the state equation (1.1) has a unique solution $y_u \in C^\sigma(\bar\Omega) \cap H_0^1(\Omega)$ for some $\sigma \in (0,1)$. In addition, for every $M > 0$ there exists a constant $K_M$ such that
In the sequel we will denote $Y = C(\bar\Omega) \cap H_0^1(\Omega)$ and $S : L^p(\omega) \to Y$ the mapping associating to each control $u$ the corresponding state $S(u) = y_u$. We have the following differentiability property of $S$.
and
respectively. The proof is a consequence of the implicit function theorem. Let us give a sketch. We define the space $V = \{y \in Y : \Delta y \in L^p(\Omega)\}$ endowed with the norm $\|y\|_V = \|y\|_{C(\bar\Omega)} + \|y\|_{H_0^1(\Omega)} + \|\Delta y\|_{L^p(\Omega)}$.
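Thus, $V$ is a Banach space. Now we introduce the mapping
\[ F : V \times L^p(\omega) \to L^p(\Omega), \qquad F(y,u) = -\Delta y + f(\cdot,y) - u\chi_\omega. \]
From (2.4) we deduce that $F$ is of class $C^2$ and
\[ \frac{\partial F}{\partial y}(y,u)z = -\Delta z + \frac{\partial f}{\partial y}(x,y)z. \]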
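Next, we separate the smooth and the nonsmooth parts in $J$:
\[ J(u) = F(u) + \alpha G(u), \qquad G = g \circ \nabla, \]
where $F$ collects the smooth terms of $J$ and $g : M(\omega)^n \to \mathbb{R}$ is given by $g(\mu) = \|\mu\|_{M(\omega)^n}$. In the rest of this section we study the differentiability of $F$. From Proposition 2.2 and the chain rule the following proposition can be obtained.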
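The derivatives of $F$ are given by (2.9) and (2.10), where $\varphi_u$ denotes the adjoint state associated with $u$; see (2.11). The $C(\bar\Omega)$ regularity of $\varphi_u$ follows from the assumption $y_d \in L^2(\Omega)$ and the fact that $y_u \in L^\infty(\Omega)$.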
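The following sketch indicates how a first derivative formula of this kind arises. It rests on assumptions that cannot be read off from the text above: that the smooth part $F$ has the form written in the first line, that $z_v = S'(u)v$ solves the linearized state equation, and that the adjoint state $\varphi_u$ solves the equation in the third line. The labels on the right are ours.

```latex
\begin{align*}
F(u) &= \frac{1}{2}\int_\Omega (y_u - y_d)^2\,dx
      + \frac{\beta}{2}\Big(\int_\omega u\,dx\Big)^2
      + \frac{\gamma}{2}\int_\omega u^2\,dx
      && \text{(assumed form)} \\
-\Delta z_v + \frac{\partial f}{\partial y}(x,y_u)\,z_v &= v\chi_\omega
      \ \text{ in } \Omega, \quad z_v = 0 \ \text{ on } \Gamma
      && \text{(linearized state)} \\
-\Delta \varphi_u + \frac{\partial f}{\partial y}(x,y_u)\,\varphi_u &= y_u - y_d
      \ \text{ in } \Omega, \quad \varphi_u = 0 \ \text{ on } \Gamma
      && \text{(adjoint state)} \\
F'(u)v &= \int_\Omega (y_u - y_d)\,z_v\,dx
      + \beta\Big(\int_\omega u\,dx\Big)\Big(\int_\omega v\,dx\Big)
      + \gamma\int_\omega u\,v\,dx \\
     &= \int_\omega \Big(\varphi_u + \beta\int_\omega u\,dx + \gamma u\Big)\,v\,dx .
\end{align*}
```

The last equality uses integration by parts twice: testing the adjoint equation with $z_v$ and the linearized equation with $\varphi_u$ gives $\int_\Omega (y_u - y_d)\,z_v\,dx = \int_\omega \varphi_u\,v\,dx$.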
Remark 2.4. If $n = 2$, since $BV(\omega)$ is embedded in $L^2(\omega)$, the functional $F : BV(\omega) \to \mathbb{R}$ is well defined and of class $C^2$ with derivatives given by (2.9) and (2.10). However, if $n = 3$, then $BV(\omega)$ is only embedded in $L^{3/2}(\omega)$. Hence, for elements $u \in BV(\omega)$ Proposition 2.1 is not applicable and, therefore, the functional $F$ is not defined in $BV(\omega)$. To deal with the case $n = 3$ we have introduced the assumption (1.2), i.e. $\gamma > 0$. Then the functional $F : BV(\omega) \cap L^2(\omega) \to \mathbb{R}$ is well defined and of class $C^2$.
The assumption (1.2) can be avoided if we suppose that the nonlinearity $f(x,y)$ has only polynomial growth of arbitrary order in $y$. In this case, Propositions 2.1 and 2.2 hold if we change $Y$ to $Y_q = L^q(\Omega) \cap H_0^1(\Omega)$ with $q < \infty$ arbitrarily large. We recall that for a right hand side of the state equation belonging to $L^{3/2}(\Omega)$ the solution of the state equation does not belong to $L^\infty(\Omega)$, in general, even for linear equations. However, since $L^{3/2}(\Omega) \subset W^{-1,3}(\Omega)$, we can use [25, Theorem 4.2] to deduce that $y_u \in L^q(\Omega)$ for all $q < \infty$. To analyze the semilinear case one can follow the classical approach of truncation of the nonlinear term, Schauder's fixed point theorem, and $L^q$-estimates from the linear case combined with the monotonicity of the nonlinear term. Finally, since $\gamma = 0$, we have that the functional $F$ :
Remark 2.5. In the state equation, the Laplace operator $-\Delta$ can be replaced by a more general linear elliptic operator with bounded coefficients. All the results proved in this paper hold for these general operators.
3. Analysis of the functional G. Now, we analyze the functional $G$. We already expressed $G$ as the composition $G = g \circ \nabla$. Concerning the functional $g$, we note that it is Lipschitz continuous and convex. Hence, it has a subdifferential and a directional derivative, which are denoted by $\partial g(\mu)$ and $g'(\mu;\nu)$, respectively. Before giving an expression for $g'(\mu;\nu)$, we have to specify the norm that we use in $\mathbb{R}^n$. Indeed, in the definition of the norm $\|\mu\|_{M(\omega)^n}$ we have considered a generic norm $|\cdot|$ in $\mathbb{R}^n$. The choice of the specific norm strongly influences the structure of the optimal controls. In this paper, we focus on the Euclidean and the $|\cdot|_\infty$ norms, which lead to different properties for $g$ that we consider separately in the following two subsections. To illustrate one aspect, let us observe that the use of the $|\cdot|_\infty$ norm on $\mathbb{R}^n$ in the definition of $\|\cdot\|_{M(\omega)^n}$ implies that
\[ \|\mu\|_{M(\omega)^n} = \sum_{j=1}^n \|\mu_j\|_{M(\omega)}. \tag{3.1} \]
In particular, it holds that
\[ \|\nabla u\|_{M(\omega)^n} = \sum_{j=1}^n \|\partial_{x_j} u\|_{M(\omega)}. \]
However, for the Euclidean norm we have, in general, that
Indeed, the identity (3.1) is an immediate consequence of the definitions of the norms $\|\cdot\|_{M(\omega)}$ and $\|\cdot\|_{M(\omega)^n}$. To verify (3.2) we give an example. Let us fix $n$ different points $\xi_1, \dots, \xi_n$ in $\omega$ and take $\varepsilon > 0$ small enough such that the balls $B_\varepsilon(\xi_i)$ are disjoint. Now, applying Urysohn's lemma, cf. [23, Lemma 2.12], we get functions
On the other hand, we get
3.1. The use of the Euclidean norm $|\cdot|_2$. In order to give an expression for $g'(\mu;\nu)$, let us introduce some notation. We recall that if $\mu \in M(\omega)^n$, its associated total variation measure is defined as a positive scalar measure as follows
\[ |\mu|(E) = \sup\Big\{ \sum_{k=1}^\infty |\mu(E_k)|_2 : \{E_k\}_k \subset \mathcal{B} \text{ pairwise disjoint},\ E = \bigcup_{k=1}^\infty E_k \Big\}, \qquad E \in \mathcal{B}, \]
where $\mathcal{B}$ is the $\sigma$-algebra of Borel sets in $\omega$, and $|\mu(E_k)|_2$ denotes the Euclidean norm in $\mathbb{R}^n$ of the vector $\mu(E_k)$. Let us denote by $h_\mu$ the Radon-Nikodym derivative of $\mu$ with respect to $|\mu|$.
Thus we have
Given a second vector measure $\nu \in M(\omega)^n$, the following Lebesgue decomposition holds: $\nu = \nu_a + \nu_s$, $d\nu_a = h_\nu\,d|\mu|$, where $\nu_a$ and $\nu_s$ are the absolutely continuous and singular parts of $\nu$ with respect to $|\mu|$, and $h_\nu$ is the Radon-Nikodym derivative of $\nu$ with respect to $|\mu|$. Then, the following identity is fulfilled
The reader is referred to [1, Chapter 1].
Now, we analyze the subdifferential $\partial g(\mu)$. It is well known that an element
This is equivalent to the next two relations
Observe that $\lambda$ belongs to the dual of $M(\omega)^n$, which is not a distributional space. In the special case where $\lambda \in C_0(\omega)^n$, we can establish some precise relations between $\lambda$ and $\mu$. Before proving these relations, let us mention that here we have
Then, using that $|\lambda(x)|_2 \le 1$ $\forall x \in \omega$ and $|h_\mu(x)|_2 = 1$ $|\mu|$-a.e. in $\omega$ we deduce from the identity
in $\omega$. Using again that $|h_\mu(x)|_2 = 1$ $|\mu|$-a.e., we conclude that $\lambda(x) = h_\mu(x)$ $|\mu|$-a.e. Therefore, we have that $|\mu|(\{x \in \omega : |\lambda(x)|_2 < 1\}) = 0$, which implies 2.
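For orientation, the standard characterization of the subdifferential of a norm can be sketched as follows. Here $\langle\cdot,\cdot\rangle$ denotes the duality pairing between the dual of $M(\omega)^n$ and $M(\omega)^n$; the layout and the intermediate step are ours and are not taken from the text above.

```latex
\begin{align*}
\lambda \in \partial g(\mu)
  &\iff \langle \lambda, \nu - \mu \rangle + g(\mu) \le g(\nu)
  \quad \forall \nu \in M(\omega)^n .
\intertext{Taking $\nu = 0$ gives $g(\mu) \le \langle\lambda,\mu\rangle$, and taking
$\nu = 2\mu$ gives $\langle\lambda,\mu\rangle \le g(\mu)$. Inserting the resulting
equality back into the inequality yields}
\lambda \in \partial g(\mu)
  &\iff \langle \lambda, \mu \rangle = \|\mu\|_{M(\omega)^n}
  \ \text{ and } \
  \langle \lambda, \nu \rangle \le \|\nu\|_{M(\omega)^n}
  \quad \forall \nu \in M(\omega)^n .
\end{align*}
```

The second relation says that $\lambda$ has dual norm at most one, and the first that this bound is attained at $\mu$.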
Next we study the directional derivatives of g.
where $\nu = \nu_a + \nu_s = h_\nu\,d|\mu| + \nu_s$ is the Lebesgue decomposition of $\nu$ with respect to $|\mu|$. Proof. As above, let us write $d\mu = h_\mu\,d|\mu|$. Then we have
Since the quotients are dominated by $|h_\nu|_2$, we applied Lebesgue's dominated convergence theorem above. Moreover, we used that $|h_\mu(x)|_2 = 1$ $|\mu|$-a.e. in $\omega$ in the last equality and also to justify the differentiability of the norm $|\cdot|_2$ at every $h_\mu(x)$ with $x$ in the support of $|\mu|$.
Now, we come back to the mapping $G$. To this end, let us recall that the adjoint operator $\nabla^*$ is defined by
\[ \nabla^* : [M(\omega)^n]^* \to BV(\omega)^*, \qquad \langle \nabla^*\lambda, u \rangle = \langle \lambda, \nabla u \rangle \quad \forall u \in BV(\omega). \]
Proposition 3.3. The following identities hold for all $u \in BV(\omega)$:
Proof. Since $\nabla : BV(\omega) \to M(\omega)^n$ is a linear and continuous mapping and $g : M(\omega)^n \to \mathbb{R}$ is convex and continuous, we can apply the chain rule [16, Chapter I, Proposition 5.7] to deduce that $\partial(g \circ \nabla)(u) = \nabla^*\partial g(\nabla u)$, which immediately leads to (3.7).
To verify (3.8) it is enough to observe that $G'(u;v) = g'(\nabla u;\nabla v)$, which holds because $\nabla$ is linear, and to apply (3.6).
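A one-dimensional example may help to see the two contributions to the directional derivative of $g$. The sketch assumes that $g'(\mu;\nu)$ consists of the integral of $h_\mu \cdot h_\nu$ with respect to $|\mu|$ plus the norm of the singular part $\nu_s$, which is the standard form for the Euclidean norm; the measures below are our own choice, with $n = 1$, two distinct points $\xi_0, \xi_1 \in \omega$, and real numbers $a, b$.

```latex
\begin{align*}
\mu &= \delta_{\xi_0}, \qquad |\mu| = \delta_{\xi_0}, \qquad h_\mu(\xi_0) = 1, \\
\nu &= a\,\delta_{\xi_0} + b\,\delta_{\xi_1}, \qquad
      \nu_a = a\,\delta_{\xi_0}, \quad h_\nu(\xi_0) = a, \quad
      \nu_s = b\,\delta_{\xi_1}, \\
g(\mu + \rho\nu) &= |1 + \rho a| + \rho\,|b| = 1 + \rho a + \rho\,|b|
      \quad \text{for } 0 < \rho < 1/(1+|a|), \\
g'(\mu;\nu) &= \lim_{\rho \searrow 0} \frac{g(\mu + \rho\nu) - g(\mu)}{\rho}
      = a + |b|
      = \int_\omega h_\mu\,h_\nu\,d|\mu| + \|\nu_s\|_{M(\omega)} .
\end{align*}
```

The part of $\nu$ living on the support of $\mu$ enters linearly, while the singular part enters through its norm, which is where the nonsmoothness of $g$ shows up.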
Proof. It is enough to use (3.1) to obtain
Then, we proceed as in the proof of [10, Proposition 3.3].
With the same proof we infer that Proposition 3.3 is also true for the $|\cdot|_\infty$ norm with (3.8) being interpreted as follows
where
4. Existence of an optimal control and first order optimality conditions. The proof of the existence of an optimal control follows the lines of [9, Theorem 3.1] with the obvious modifications.
Then, problem (P) has at least one solution. Moreover, if $f$ is affine with respect to $y$, the solution is unique. Now, we prove the first order optimality conditions satisfied by any local minimum of (P).
where $\bar\varphi \in H_0^1(\Omega) \cap C(\bar\Omega)$ is the adjoint state corresponding to $\bar u$. Proof. Let us denote by $\bar\varphi \in C(\bar\Omega) \cap H_0^1(\Omega)$ the adjoint state corresponding to the local solution $\bar u$. Given $v \in BV(\omega) \cap L^2(\omega)$, from the local optimality of $\bar u$ and the convexity of $G$ we deduce for every $0 < \rho < 1$ small enough
Passing to the limit as $\rho \to 0$ in the above inequality and using (2.9) we obtain for
Replacing $v$ by $u - \bar u$, this inequality can be written
This along with (3.7) implies
which implies (4.1).
Since $\bar\lambda \in [M(\omega)^n]^*$ and $[M(\omega)^n]^*$ is not a distribution space, it can sometimes be more convenient to handle a different optimality system involving distributional spaces, mainly with a view to the numerical analysis. To this end, we present the following equivalent optimality conditions. Theorem 4.3. Let us assume that $n = 2$. Given $\bar u \in BV(\omega)$, let $\bar y$ and $\bar\varphi$ be the associated state and adjoint state. Then, there exists $\bar\lambda \in \partial g(\nabla\bar u)$ satisfying (4.1) if and only if there exists $\bar\Phi \in C_0(\omega)^n$ such that
Proof. Assume that $\bar\lambda \in \partial g(\nabla\bar u)$ satisfies (4.1). We define a linear form $T_0$ in $M(\omega)^n$ as follows
We prove that $T_0$ is weakly$^*$ continuous on its domain. Let $\{\mu_k\}_k \subset D(T_0)$ and $\mu \in D(T_0)$ be such that $\mu_k \stackrel{*}{\rightharpoonup} \mu$ in $M(\omega)^n$. By definition of $D(T_0)$ there exist elements $\{v_k\}_k \subset BV(\omega)$ and $v \in BV(\omega)$ such that $\mu_k = \nabla v_k$ and $\mu = \nabla v$. Without loss of generality we assume that the integrals of each $v_k$ and $v$ in $\omega$ are zero. Using (2.1), we know that $\{v_k\}_k$ is bounded in $BV(\omega)$. From the continuity of the embedding $BV(\omega) \subset L^2(\omega)$, which holds because $n = 2$, and the convergence $\nabla v_k \stackrel{*}{\rightharpoonup} \nabla v$ in $M(\omega)^n$, we obtain that $v_k \rightharpoonup v$ in $L^2(\omega)$. Therefore, we get with (4.1)
which implies the weak$^*$ continuity of $T_0$. Hence, there exists a weakly$^*$ continuous linear form $T : M(\omega)^n \to \mathbb{R}$ extending $T_0$; see [24, Theorem 3.6]. In this case, we know that $T$ can be identified with an element $\bar\Phi \in C_0(\omega)^n$, i.e.
Hence, we have $\bar\lambda \in \partial g(\nabla\bar u)$; see (3.3)-(3.5). Finally, (4.1) follows from (4.2) and the definition of $T_0$. This concludes the proof.
Remark 4.4. Theorem 4.3 is still valid in dimension $n = 3$ if we take $\gamma = 0$ and we assume that the nonlinearity $f(x,y)$ has polynomial growth of arbitrary order with respect to the variable $y$; see Remark 2.4. Indeed, let us observe that the limit (??) is still valid because $v_k \rightharpoonup v$ in $L^{3/2}(\Omega)$ and $\bar\varphi + \beta\int_\omega \bar u\,ds$ is a continuous function in $\bar\Omega$.
2. For the $|\cdot|_\infty$ norm, for any $1 \le j \le n$ such that $\partial_{x_j}\bar u \ne 0$ we have $\|\bar\Phi_j\|_{C_0(\omega)} = 1$, and
5. Second order optimality conditions. The goal of this section is to prove necessary and sufficient second order optimality conditions for problem (P). In the whole section, $\bar u$ will denote a fixed element of $BV(\omega) \cap L^2(\omega)$ satisfying the optimality conditions given in Theorem 4.2. As in Section 3, we will distinguish the cases where the norms $|\cdot|_2$ and $|\cdot|_\infty$ in $\mathbb{R}^n$ are used in the definition of the norm $\|\nabla u\|_{M(\omega)^n}$.
In the sequel, we will denote $h_v = (h_{v,1}, \dots, h_{v,n})$ and $\bar h = (\bar h_1, \dots, \bar h_n)$. First, we state the second order necessary optimality conditions. To this end we define the cone of critical directions $C_{\bar u}$ as the closure in $L^2(\omega)$ of the cone
Then, we have the following result. Theorem 5.1. If $\bar u$ is a local minimum of (P), then $F''(\bar u)v^2 \ge 0$ $\forall v \in C_{\bar u}$.
Proof. We will prove the result for every $v \in E_{\bar u}$. Then, the theorem follows from the continuity of the quadratic form $v \in L^2(\omega) \mapsto F''(\bar u)v^2 \in \mathbb{R}$. Given $v \in E_{\bar u}$ and $\rho > 0$ we set
By the Schwarz inequality we have
Taking into account (5.2) we get for $1 \le j \le n$
Using this identity and (5.1) we get
Making a Taylor expansion of $F$ around $\bar u$, using the convexity of $G$ and (5.6), (5.7) and (5.9), we get for some $0 \le \theta \le 1$
Case II: $u - \bar u \in C^\tau_{\bar u}$. This implies that
Moreover, from (5.10) and the definition of $\varepsilon$ we infer
and therefore
δ + C 2 2τ z u−ū L 2 (Ω) ≤ 1. (5.14)
Using again the convexity of $G$, (5.11), (5.13) and (5.14) we infer
Finally, choosing $\varepsilon$ still smaller, if necessary, we have that (see [4, page 2364])
The last two inequalities imply (5.8) with $\kappa = \frac{\delta}{8}$. We observe that (5.7) is a sufficient second order condition for strict local optimality of $\bar u$ in the $L^2(\omega)$ sense. Moreover, by using (5.8) we can prove stability of the optimal states with respect to perturbations in the data of the control problem. However, (5.8) does not provide information on the optimal controls. If $\gamma > 0$ we can replace (5.7) by a stronger assumption leading to a quadratic growth in terms of the controls instead of the states; i.e. $\|y_u - \bar y\|^2_{L^2(\Omega)}$ can be replaced by $\|u - \bar u\|^2_{L^2(\omega)}$ in (5.8). However, if $\gamma = 0$, then this is not possible; see [4].
Remark 5.5. The reader can easily check that Theorems 5.2 and 5.3 also hold when the $|\cdot|_2$ norm is used. However, to reduce the gap between the necessary and sufficient conditions for optimality, we should prove that the conditions imply (5.8) and (5.17), respectively. This, however, remains a challenge.

7. Conclusions. An analysis of BV-regularised optimal control problems associated with semilinear elliptic equations was provided. Existence, first order necessary and second order sufficient optimality conditions were investigated. Special attention was given to the different cases which arise due to the choice of a particular vector norm in the definition of the BV-seminorm. If (P) is additionally regularised by an $H^1(\omega)$-seminorm, then the set where the gradient of the optimal solution vanishes can be characterised conveniently by an adjoint variable; see (6.4) and (6.6). For the original problem (P) without $H^1(\omega)$-seminorm regularisation, such a transparent description of the set where the measure $|\nabla\bar u|$ vanishes is not available; rather, it was replaced by the properties specified in Theorem 4.3.