Approximation of Elliptic Control Problems in Measure Spaces with Sparse Solutions

Optimal control problems in measure spaces governed by elliptic equations are considered for distributed and Neumann boundary control, which are known to promote sparse solutions. Optimality conditions are derived and some of the structural properties of their solutions, in particular sparsity, are discussed. A framework for their approximation is proposed which is efficient for numerical computations and for which we prove convergence and provide error estimates.


Introduction.
This paper is dedicated to the approximation of the optimal control problem (1.1) (P) min where y is the unique solution to the Dirichlet problem (1.2) −Δy + c 0 y = u in Ω, y = 0 on Γ with c 0 ∈ L ∞ (Ω) and c 0 ≥ 0. We assume that α > 0, y d ∈ L 2 (Ω), and Ω is a bounded domain in R n , n = 2 or 3, which is supposed to either be convex or have a u, z = sup which is equivalent to the total variation of u. It has been observed that the use of measures leads to optimal controls which are sparse. This is relevant for many applications in distributed parameter control; see [6]. Moreover, the support of the optimal control provides information on the optimal placements of control actuators. Formally, the same features can be achieved by using L 1 (Ω) control cost. In this case, however, the optimal control problem is not wellposed in the sense of a possible lack of existence of a minimizer because L 1 (Ω) does not allow an appropriate topology for compactness arguments. Other techniques have been used to overcome this difficulty, including the use of regularization techniques or the introduction of control constraints; see, for instance, [4], [15], [16].
The focus of this paper is to give an approximation framework which, in spite of the difficulties due to the presence of measures, leads to implementable schemes for which a priori error estimates can be provided. We show that the optimal control measure can be approximated efficiently by a linear combination of Dirac measures. This is important for practical applications because it provides a way of controlling a distributed system by finitely many point actuators, giving information on where they have to be placed. A similar framework in the context of inverse problems was considered in [1].
The plan of the paper is as follows. In the next section we provide optimality conditions for (1.1) and derive some properties of the solution, in particular sparsity and actuator location. In section 3, we introduce the approximation framework and prove convergence of the discretized problems to the continuous one. Rate of convergence results are provided in section 4. In section 5 we show that analogous results can also be obtained for Neumann control problems. Finally, the last section is devoted to numerical test problems.
2. Optimality conditions. Before establishing the optimality conditions for problem (1.1) and deducing some consequences from them, let us observe some important facts. First, given a measure u ∈ M(Ω), we say that y is a solution to (1.2) if where A = −Δ + c 0 I. It is well known (see, for instance, [3]) that there exists a unique solution to (1.2) in the sense of (2.1). Moreover, y ∈ W 1,p 0 (Ω) for every 1 ≤ p < n n−1 and (2.2) y W 1,p 0 (Ω) ≤ C p u M(Ω) . Since W 1,p 0 (Ω) ⊂ L 2 (Ω) for every 2n n+2 ≤ p < n n−1 , the cost functional is well defined on M(Ω). Furthermore, the control-to-state mapping is injective, and therefore the cost functional J is strictly convex. Then, it can be obtained by the standard approach that (1.1) has a unique solution; see [6] for details. Hereafter, this optimal solution will be denoted byū with an associated stateȳ. By using subdifferential calculus of convex functions and introducing the adjoint state we get the following results (see also [6], [7]). Proof. By standard arguments from Lagrange multiplier theory and the Sobolev embedding theorem, we deduce the existence of a λ ∈ C 0 (Ω) with (2.5) λ ∈ ∂ · M(Ω) (ū) and αλ = −φ.
By the definition of the convex subdifferential, the first inclusion is equivalent to for all u ∈ M(Ω). Taking u = 2ū and u = 0, respectively, we obtain the two inequalities and hence (2.3) by the second relation of (2.5). Inserting (2.3) and λ = − 1 αφ into (2.6) yields which implies (2.4).
Proof. Let us denote by J α the cost functional associated to the parameter α. Similarly, let (u α , y α , ϕ α ) denote the solution to the corresponding optimality system. For each α > 0 the following inequalities hold: Consequently, From the adjoint state equation and the embedding of H 2 (Ω) ∩ H 1 0 (Ω) → C 0 (Ω), we deduce the existence of a constant C > 0 such that , we obtain from the above inequality and (2.4) that u α = 0 for every α >ᾱ.
In the case where we consider the observation of the state only in a subset ω y ⊂ Ω, we have the following property of the support of the optimal control. Proposition 2.3. Let ω y be an open subset of Ω such that Ω \ ω y is connected and consider the functional Then the associated optimal controlū satisfies supp(ū) ⊂ω y .
where χ ωy is the indicator function of ω y . Applying the maximum principle to the problem we deduce thatφ is identically zero in Ω \ω y or In both cases the equality (2.4) can only be achieved inω y ; therefore (2.7) implies the claim of the proposition. Let us close this section by pointing out that the results of our paper can also be adapted to the situation where the control domain is a priori restricted to a strict subdomain ω u of Ω, and the controls are restricted to be nonnegative (cf. [7]).

Approximation of (1.1).
In this section Ω will be assumed to be convex. We consider a nodal basis finite element approximation of (1.1). Associated with a parameter h we consider a family of triangulations {T h } h>0 ofΩ. To every element T ∈ T h we assign two parameters ρ(T ) and σ(T ), where ρ(T ) denotes the diameter of T and σ(T ) is the diameter of the biggest ball contained in T . The size of the grid is given by h = max T ∈T h ρ(T ). The following usual regularity assumptions on the triangulation are assumed: (i) There exist two positive constants ρ and σ such that where | · | denotes the Lebesgue measure. Associated to these triangulations we define the space where P 1 is the space formed by the polynomials of degree less than or equal to one. For every u ∈ M(Ω), we denote by y h the unique element of Y h satisfying The approximation of the optimal control problem (1.1) is defined as where y h is the solution to (3.2).
Since we have not discretized the control space, this approach is related to the variational discretization method introduced in [8]. Below we will show that among all the solutions to (3.3) there is a unique one which is a finite linear combination of Dirac measures concentrated in the interior vertices of the triangulation, leading to a simple numerical implementation.
Before any discussion of the solutions to problem (3.3), let us introduce some additional notation. Hereafter we will denote by {x j } N (h) j=1 the interior nodes of the triangulation T h . Associated to these nodes we consider the nodal basis of Y h given by the functions We also consider the space Above δ xj denotes the Dirac measure centered at the node x j . It is obvious that D h can be identified with the dual of Y h through the duality relation The operator Π h is the nodal interpolation operator for Y h , and we have the following result concerning the operator Λ h . Theorem 3.1. The following properties hold. 1. For every u ∈ M(Ω) and every z ∈ C 0 (Ω) and z h ∈ Y h we have

For every u ∈ M(Ω) we have
3. There exist a constant C > 0 such that for every u ∈ M(Ω) where p is the conjugate of p. 4. Given u ∈ M(Ω), and let y h andỹ h be the solutions to (3.2) associated to the controls u and Λ h u, respectively. Then the equality y h =ỹ h holds.
which proves (3.4). For (3.5) we proceed as follows: To verify (3.6) we introduce the function s h ∈ Y h by Since any subsequence converges to u, the whole sequence converges Downloaded 03/27/14 to 143.50.47.57. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php to u weakly * in M(Ω). From this convergence and (3.6) we obtain and consequently (3.7) holds.
To prove (3.8) we take an arbitrary element z ∈ W 1,p 0 (Ω) with 1 ≤ p < n n−1 . Using (3.5) and the well known interpolation error estimates in Sobolev spaces (see, for instance, [5, Chapter 3]) we obtain (Ω) for 1 < p < n n−1 , (3.8) follows from the above inequalities. For p = 1, we have p = ∞ and the above inequality can be expressed as (Ω), from this inequality we only get (3.9).
The last statement of the theorem is an immediate consequence of (3.4). Now, we turn to the study of (3.3). First, we observe that analogously to J, the functional J h is convex. However, it is not strictly convex. This is a consequence of the noninjectivity of the control-to-discrete-state mapping and the nonstrict convexity of the norm of M(Ω). Although the existence of a solution can be proved in the same way as for the problem (1.1), we cannot claim its uniqueness. Nevertheless, ifũ h is a solution to (3.3) and we takeū h = Λ hũh , then statement 4 of Theorem 3.1 and the inequality (3.6) imply that J h (ū h ) ≤ J h (ũ h ), and henceū h is also a solution to (3.3). Since for u h ∈ D h , the mapping u h → y h (u h ), the solution to (3.2) for u = u h , is linear, injective, and dim D h = dim Y h , this mapping is bijective. Therefore, the cost functional J h is strictly convex on D h , and hence (3.3) has a unique solution in D h , which will be denoted byū h hereafter. We summarize this discussion in the following theorem. where the x j should be taken as the nodes associated with the degrees of freedom (which no longer need to correspond to vertices of the triangulation, e.g., vertices and edge midpoints for quadratic elements).
We finish this section by proving the convergence of the solutions in D h to problems (3.3) (3.13) whereȳ andȳ h are the continuous and discrete states associated toū andū h , respectively.
Proof. First, let us verify that where y h (u h ) and y u are the discrete and continuous states associated to the controls u h and u, respectively. From the compact embedding M(Ω) → W −1,p (Ω) for every 1 ≤ p < n n−1 , we deduce the strong converge u h → u in W −1,p (Ω). Let us denote by y u h the continuous state associated to u h . From [9] we obtain the strong convergence y u h → y u in W 1,p (Ω), where we have used that the boundary Γ is Lipschitz continuous as a consequence of the convexity of Ω. Moreover, from [2] we have that y h (u h ) − y u h L 2 (Ω) → 0. Finally, by the triangular inequality we obtain the desired convergence.

Error estimates.
This section is devoted to the proof of error estimates for the optimal costs as well as for the optimal states. We still require Ω to be convex and in addition we assume As in the previous sections, we denote byȳ andȳ h the continuous and discrete states associated to the optimal controlsū andū h , respectively. Theorem 4.1. There exists a constant C > 0 independent of h such that where κ = 1 if n = 2 and κ = 1/2 if n = 3. Downloaded 03/27/14 to 143.50.47.57. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php Proof. We establish some preliminary estimates. Given u ∈ M(Ω), with associated continuous and discrete states y and y h , we know from [2] that with κ defined as in the statement of the theorem. Taking r as in (4.1) and using Hölder's inequality and (3.1), we deduce that for all φ ∈ L r (Ω), holds. As a consequence of (4.3) and (4.4), with φ = y − y d , we get
In the following theorem we establish a rate of convergence for the states.
By the optimality ofū we have for all u ∈ M(Ω) that where (·, ·) denotes the scalar product in L 2 (Ω). In particular, taking u =ū h , we get Analogously, the optimality ofū h implies that We point out that by definition of Y h , we have S h u = 0 in Ω \ Ω h . Then, the scalar product above in L 2 (Ω) coincides with that in L 2 (Ω h ). Now, we rearrange terms in (4.10) as follows: Now, adding (4.9) and (4.11) we obtain (4.12) Let us estimate the right-hand terms. For the first one we apply the Cauchy-Schwarz inequality, exploit the fact S hū − S hūh = 0 in Ω \ Ω h , and use (4.8) to deduce where we have used that {ū h } h>0 , {S hū } h>0 and {S hūh } h>0 are bounded due to (3.11), (3.12), and (4.3), respectively. For the second term we use (4.4) and once again (4.8) as well as the fact that S h u = 0 in Ω \ Ω h to obtain (4.14) where we have also used that y d ∈ L r (Ω) and (2.2). Finally, (4.12), (4.13), and (4.14) prove (4.7). Remark 4.3. Let us observe that (4.2) and (4.7) imply that for some constant C > 0 independent of h. Remark 4.4. All the previous results remain correct for a general elliptic operator provided the coefficients a ij are Lipschitz continuous functions inΩ and a 0 ≥ 0 is in L ∞ (Ω).

A Neumann control problem.
In this section, we assume that the system is controlled on the boundary. The control problem is formulated as for c 0 ∈ L ∞ (Ω), c 0 ≥ 0, and c 0 ≡ 0 and given f ∈ L 1 (Ω). Here we will assume Ω ⊂ R n , n = 2 or 3, to be convex and polyhedral. Again by the Riesz representation theorem M(Γ) is identified with the dual space of C(Γ); see, for instance, [14,Chapter 6].
Concerning the state equation (5.1), analogously to the Dirichlet problem (1.2), we say that an element y ∈ W 1,p (Ω), p < n n−1 , is a solution to (5.1) if We have the following theorem. Theorem 5.1. The problem (5.1) has a unique solution belonging to W 1,p (Ω) for every 1 ≤ p < n n−1 , and there exists a constant C p > 0 such that As a consequence of this theorem, we have that the functional J Γ : M(Γ) → R is well defined. Moreover, it is continuous and strictly convex. Therefore, it has a unique minimizer that hereafter will be denoted byū with associated optimal stateȳ. Analogously to Theorem 2.1, if we denote the adjoint state associated toū byφ, then the following identities hold: Then, (5.2) and (5.3) imply a sparsity structure ofū analogous to (2.7).
To carry out the numerical analysis of problem (1.1), we consider the same triangulation as in section 3. On this triangulation we define the space of discrete states by Y h = y h ∈ C(Ω) : y h |T ∈ P 1 for every T ∈ T h and the discrete state equation The approximation of the Neumann control problem results in where y h is the solution to (5.4). Before analyzing this problem, let us prove the following error estimates concerning the discretization of the state equation. Downloaded 03/27/14 to 143.50.47.57. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php Theorem 5.2. Given u ∈ M(Γ), let y and y h be the solutions to (5.1) and (5.4). Then, there exists a constant C > 0 independent of h, f , and u such that with κ as in Theorem 4.1.
Proof. Here we follow the lines of the proof [2, Theorem 3]. For any function g ∈ L 2 (Ω), let z ∈ H 2 (Ω) be the solution to Using Green's formula, we obtain where we have used the classical finite element error estimate; see, for instance, [5,Chapter 3]. Since g ∈ L 2 (Ω) is arbitrary, this gives the desired estimate. Analogously to section 3, we will denote by {x j } M(h) j=1 the boundary nodes of the triangulation T h . Associated to these nodes we consider the space We also consider the space Above, δ xj denotes the Dirac measure centered at the node x j . It is obvious that D Γ is the (scaled) norm in M(Ω). The subdifferential of the indicator function is then given by the normal cone, which can be characterized by the variational inequality We now pass to the discrete setting by replacing the continuous controlū with its discretizationū h and introducing the discrete adjoint stateφ h = N (h) j=1 ϕ j e j ∈ Y h . The above variational inequality can then be reformulated using a complementarity function asū h + max(0, −ū h +φ h − α) + min(0, −ū h +φ h + α) = 0, which should be understood componentwise in terms of the vector of expansion coefficients (λ 1 , . . . , λ N (h) ) and (ϕ 1 , . . . , ϕ N (h) ). This is a locally Lipschitz mapping from R N (h) × R N (h) → R N (h) and thus the reformulated discrete optimality system can be solved by a locally superlinearly convergent semismooth Newton method [10], [12]. The corresponding algorithm was implemented in MATLAB (R2011a).
We first illustrate the structural properties of the optimal controls. Figure 6.1 shows the norm of the optimal control u α as a function of the penalty parameter α. As verified in Proposition 2.2, there exists anᾱ (≈ 0.187), such that u α ≡ 0 for α >ᾱ.
The statement of Proposition 2.3 is illustrated in Figure 6.2, where the optimal controls for the target y d = 10 exp(−50 x 2 ) and different observation domains ω y are compared. As a reference, Figure 6.2(a) shows the control for ω y = Ω (in the form of its expansion coefficients λ j at each grid point, with linear interpolation for better visibility). In contrast, the control for ω y = χ {|x1|<1/2} χ {|x2|<1/4} Ω vanishes outside of ω y ; see Figure 6.2(b).
We now investigate the convergence behavior as h → 0. In the absence of a known exact solution, we take as a reference solution the computed optimal discrete control and optimal discrete state on the finest grid with N * = 2 10 , corresponding to h * = 2·10 −3 . We first consider distributed control, with the target y d,1 given in Figure 6.3(a). rate obtained in Theorem 4.1. The corresponding L 2 error y h − y h * L 2 of the discrete states also decays with a linear rate, which is faster than predicted by Theorem 4.2.
For the case of Neumann control, we set α = 5 · 10 −2 and c 0 = 10 −2 and consider the target y d,2 shown in Figure 6.3(b). Again, both the error in the functional value ( Figure 6.5(a)) and in the state ( Figure 6.5(b)) follow an approximately linear convergence rate. To illustrate the sparsity properties of Neumann boundary controls, Figure 6.6 shows the optimal control u h,α (again, in the form of its linearly interpolated coefficients λ j ) for α = 10 −3 , 10 −2 and 10 −1 , plotted along boundary sections as indicated. Downloaded 03/27/14 to 143.50.47.57. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php (a) target y d, 1 (b) target y d,2   Downloaded 03/27/14 to 143.50.47.57. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php 7. Conclusion. By considering optimal control problems in spaces of measures, controls with strong sparsity properties can be obtained. Although the nonreflexive Banach space setting complicates the analysis, a straightforward numerical approximation that retains the structural properties of the measure norm is possible. In a sense, the results of this paper justify the "intuitive" discretization of regular Borel measures by Dirac measures on a set of nodes.