A review on sparse solutions in optimal control of partial differential equations

In this paper we review results on sparse controls for partial differential equations. There are two different approaches to the study of sparsity in control problems. The first takes functions as controls and adds to the cost functional a suitable term that promotes the sparsity of the optimal control. The second deals with controls that are Borel measures, and the norm of the measure enters the cost functional. The use of measures as controls makes it possible to obtain optimal controls supported on a set of zero Lebesgue measure, which is very interesting for practical implementation. If the state equation is linear, then a complete analysis of the control problem with measures can be carried out. However, if the equation is nonlinear, the use of measures to control the system is still an open problem in general, and the use of functions as controls seems to be more appropriate.


Introduction
In the control of distributed parameter systems, those governed by partial differential equations, we usually cannot place control devices at every point of the domain. In practice, we are only allowed to put the controllers in small regions, and the big issue is which regions are the most convenient to localize them. Of course, we have to determine the power of the controllers as well. These controls are called sparse because they are nonzero only in a small region of the domain. In the last few years, some researchers have focused their investigation in this direction. First, it was observed that the use of the L^1 norm of the control in the cost functional leads to sparsity of the solution. Of course, this introduces some mathematical difficulties in the problem due to the lack of differentiability of this functional. Nevertheless, a lot of progress has been made despite this difficulty, and the numerical computations show the interest and applicability of this approach.

This work was partially supported by the Spanish Ministerio de Economía y Competitividad under project MTM2014-57531-P. Eduardo Casas, eduardo.casas@unican.es, Dpto. de Matemática Aplicada y Ciencias de la Computación, E.T.S.I. Industriales y de Telecomunicación, Universidad de Cantabria, Av. de los Castros s/n, 39005 Santander, Spain.
Taking a further step in this direction, we find that it is often desirable to put the controllers only at finitely many points of the domain, or along a line (in two dimensions), or on a surface (in three dimensions). In these cases we need controllers that are localized in a set of zero Lebesgue measure. Such controllers cannot be identified with functions; they are measures. This is the starting point of a new type of control problem where the controls are Borel measures. Adding the norm of the measure to the cost functional, we obtain optimal controls having the desired sparsity property.
In this paper we present the results obtained in the analysis of sparse control problems, taking either functions or measures as controls. Both elliptic and parabolic control problems are considered. The paper is organized as follows. In Sect. 2 the sparse control of semilinear elliptic equations is studied. In Sect. 3 we present the parabolic case. In Sects. 4 and 5 the elliptic and parabolic cases corresponding to measure controls are analyzed. Though some results are indicated and references are provided, we have not considered the numerical approximation of the control problems, because this would lead to a very long paper.

Sparse control of semilinear elliptic equations
As far as the author knows, the first paper devoted to the study of sparse controls of elliptic systems is due to Stadler [29]. In that paper the author considers a distributed control problem associated with a linear elliptic equation. The cost functional is the usual quadratic cost including the Tikhonov term, and the L^1 norm of the control is added to the functional. Control constraints are included in the formulation of the problem. The resulting problem is strictly convex, hence it has a unique solution, and the first order optimality conditions are necessary and sufficient for global optimality. Stadler derives these conditions and deduces from them the sparsity of the optimal control. Finally, he uses a semismooth Newton method to solve a discrete version of the control problem. For some additional analysis of the same problem and the proof of error estimates for the numerical discretization, the reader is referred to [31,32]. Later, the control problem associated with a semilinear elliptic equation and a more general cost functional was investigated in [9,10]. The material presented in this section is based on the theoretical part of [10].
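To fix ideas, the effect of the L^1 term in Stadler's setting can be illustrated on a finite dimensional analogue. The sketch below is purely illustrative and not taken from [29]: S plays the role of a discrete solution operator (here an invented smoothing matrix), and the nonsmooth problem is solved by a proximal gradient iteration rather than the semismooth Newton method used in [29]; the soft-thresholding step is what produces exact zeros in the control.

```python
import numpy as np

# Toy analogue of  min_u 0.5*||S u - y_d||^2 + 0.5*nu*||u||^2 + gamma*||u||_1
# subject to alpha <= u <= beta.  S is a hypothetical discrete solution
# operator; all data below are invented for illustration only.
m = 50
lap = 2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)   # 1D Laplacian stencil
S = np.linalg.inv(np.eye(m) + 0.1 * lap)                 # smoothing operator
y_d = np.sin(np.linspace(0.0, np.pi, m))                 # desired state
nu, gamma, alpha, beta = 1e-3, 0.1, -10.0, 10.0

def prox(v, t):
    """Soft-thresholding by t*gamma followed by projection onto [alpha, beta]."""
    w = np.sign(v) * np.maximum(np.abs(v) - t * gamma, 0.0)
    return np.clip(w, alpha, beta)

u = np.zeros(m)
step = 1.0 / (np.linalg.norm(S, 2) ** 2 + nu)            # 1/Lipschitz constant
for _ in range(2000):
    grad = S.T @ (S @ u - y_d) + nu * u                  # gradient of smooth part
    u = prox(u - step * grad, step)

sparsity = float(np.mean(u == 0.0))                      # fraction of exact zeros
```

Components where the data give little incentive to act (here, near the ends of the interval, where y_d is small) are driven to exactly zero, which is the discrete counterpart of the sparsity induced by the L^1 term.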

Setting of the control problem and preliminary results.
In this section Ω will denote an open bounded subset of R^n, n = 2 or 3, with a Lipschitz boundary Γ.
We consider the control problem
(P)  min_{u ∈ K} J(u) := F(u) + γ j(u),  with  F(u) = ∫_Ω L(x, y_u(x)) dx + (ν/2) ‖u‖²_{L²(Ω)}  and  j(u) = ‖u‖_{L¹(Ω)},
where K = {u ∈ L^∞(Ω) : α ≤ u(x) ≤ β for a.a. x ∈ Ω} and y_u denotes the state associated with u, i.e. the solution of
Ay + a(x, y) = u in Ω,  y = 0 on Γ,  (2.1)
A being the linear operator
Ay = −Σ_{i,j=1}^{n} ∂_{x_j}(a_{ij}(x) ∂_{x_i} y) + a_0(x) y.
We make the following assumptions on the functions and parameters involved in the control problem (P).

Assumption 1
The coefficients of A satisfy a_0, a_{ij} ∈ L^∞(Ω), a_0 ≥ 0, and there exists Λ > 0 such that
Σ_{i,j=1}^{n} a_{ij}(x) ξ_i ξ_j ≥ Λ|ξ|²  for a.a. x ∈ Ω and all ξ ∈ R^n.

Remark 1
In Assumption 3 we made the hypothesis α < 0 < β. Since we are looking for sparsity of the optimal control, it does not make sense to consider 0 < α or β < 0. However, the cases α = 0 and β = 0 are frequent in practice. In these situations, the sparsity of the optimal control is also induced by the presence of the term γ‖u‖_{L¹(Ω)}; see Remark 2. Nevertheless, in this case the L¹ norm is linear on K, hence the cost functional J is differentiable and the control problem (P) falls into the framework of well-studied optimal control problems.
The first step in the analysis of (P) is the study of the state equation and of the control-to-state mapping. This is established in the next theorem.

Theorem 1
The following statements hold.

The existence and uniqueness of a solution of (2.1) in Y is obtained by classical arguments; see, for instance, [2] and [23, Chapter 8]. The differentiability of G can be obtained from the implicit function theorem as follows. We consider the space Y endowed with the graph norm and define the nonlinear operator F associated with the state equation. It is immediate to check that F is of class C² and that F(G(u), u) = 0 for every u ∈ L^p(Ω). Using the results of [23, Chapter 8] again, we deduce that ∂F/∂y(G(u), u) is an isomorphism. Thus, the assumptions of the implicit function theorem are fulfilled, and some simple calculations prove (2.6) and (2.7). By using the chain rule and the previous theorem we infer the differentiability of F.

Theorem 2
Functional F : L²(Ω) → R is of class C², and its first and second derivatives are given by

A* being the adjoint operator of A.
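The implicit function theorem argument can be summarized as follows, under the standard assumptions of this section; the formulas below are a reconstruction of the usual setting (the precise functional framework is as in [10]), not a quotation.

```latex
% Sketch of the implicit function theorem argument for G (reconstruction).
\mathcal{F}\colon Y \times L^p(\Omega) \to L^p(\Omega),\qquad
\mathcal{F}(y,u) = Ay + a(\cdot,y) - u,\qquad
\mathcal{F}(G(u),u) = 0 .
% The partial derivative with respect to y,
\frac{\partial \mathcal{F}}{\partial y}(y,u)\,z
  = Az + \frac{\partial a}{\partial y}(x,y)\,z,
% is an isomorphism because a is monotone nondecreasing in y.  Hence G is of
% class C^2 and z_v = G'(u)v is the unique solution of the linearized equation
\begin{cases}
  Az_v + \dfrac{\partial a}{\partial y}(x,y_u)\,z_v = v & \text{in } \Omega,\\
  z_v = 0 & \text{on } \Gamma .
\end{cases}
```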
The existence of a solution of (P) can be proved by standard arguments. However, due to the nonconvexity of (P), we have to distinguish between local and global solutions.

Definition 1
We will say that ū is a local minimum of (P) in the L^p(Ω) sense, 1 ≤ p ≤ +∞, if there exists ε > 0 such that J(ū) ≤ J(u) for every u ∈ K with ‖u − ū‖_{L^p(Ω)} ≤ ε. Since K is a bounded subset of L^∞(Ω), if ū is a (strict) local minimum of (P) in the L^p(Ω) sense for some 1 ≤ p < +∞, then ū is a (strict) local minimum of (P) in the L^q(Ω) sense for every q ∈ [1, +∞]. However, if ū is a local minimum in the L^∞(Ω) sense, it is not necessarily a local minimum in the L^p(Ω) sense for any p ∈ [1, +∞). In the sequel, unless otherwise specified, when we say that ū is a local minimum of (P), it should be understood in the L^p(Ω) sense for some p ∈ [1, +∞].

First and second order optimality conditions.
To derive the first order optimality conditions we need to say something about the nondifferentiable term j of the cost functional. Since j is convex and Lipschitz, the subdifferential in the sense of convex analysis and the generalized gradient introduced by Clarke coincide. Moreover, a simple computation shows that λ ∈ ∂j(u) if and only if
λ(x) = +1 if u(x) > 0,  λ(x) = −1 if u(x) < 0,  λ(x) ∈ [−1, +1] if u(x) = 0.  (2.11)
As any Lipschitz and convex functional, j has directional derivatives, which can be easily computed for u, v ∈ L¹(Ω):
j′(u; v) = ∫_{Ω_u^+} v dx − ∫_{Ω_u^−} v dx + ∫_{Ω_u^0} |v| dx,  (2.12)
where Ω_u^+, Ω_u^− and Ω_u^0 represent the sets of points where u is positive, negative or zero, respectively. Now, we have the following result.

Theorem 3 If ū is a local minimum of (P), then there exist ȳ, φ̄ ∈ Y and λ̄ ∈ ∂j(ū) such that the optimality system (2.13) is satisfied.

Sketch of Proof Since K is convex, for any u ∈ K and 0 ≤ ρ ≤ 1 we have that ū + ρ(u − ū) ∈ K. Hence, the local optimality of ū implies that J(ū) ≤ J(ū + ρ(u − ū)) for every ρ > 0 small enough. Now, from the convexity of j we deduce, for every ρ > 0 sufficiently small,
0 ≤ F(ū + ρ(u − ū)) − F(ū) + γρ [j(u) − j(ū)].
Dividing by ρ, taking the limit as ρ → 0 and using the differentiability of F and (2.8), we get
∫_Ω (φ̄ + ν ū)(u − ū) dx + γ [j(u) − j(ū)] ≥ 0 for every u ∈ K,
where φ̄ satisfies (2.13b). Therefore ū is the solution of the convex problem
min_{u ∈ L^∞(Ω)} ∫_Ω (φ̄ + ν ū) u dx + γ j(u) + I_K(u),
where I_K denotes the indicator function of K, taking the value 0 if u ∈ K and +∞ if u ∉ K. Finally, since I_K is convex, we can use the subdifferential calculus to obtain
0 ∈ φ̄ + ν ū + γ ∂j(ū) + ∂I_K(ū).
Hence, there exists an element λ̄ ∈ ∂j(ū) such that −(φ̄ + ν ū + γ λ̄) ∈ ∂I_K(ū), which is equivalent to (2.13c).
From (2.13c) we deduce the following corollary.

Corollary 1
Let ū be a local minimum of (P). Then the following properties hold: if ν > 0, ū satisfies (2.14); if ν = 0, ū satisfies (2.15). Moreover, for any ν ≥ 0 the function λ̄ belongs to Y and it is unique. The proof of the result for ν > 0 can be found in [10]; the case ν = 0 is analyzed in [4]. Let us point out that the relations (2.14b) and (2.15a) imply the sparsity of local optimal controls. Since K is a bounded subset of L^∞(Ω), it is easy to prove that there exists a constant M, independent of γ and ν, such that ‖φ̄‖_{L^∞(Ω)} ≤ M. Therefore, if γ > M, then (2.14b) and (2.15a) prove that ū ≡ 0. On the other hand, for γ = 0, (2.14a), (2.15b) and (2.15c) prove that ū is not zero wherever φ̄ is not zero; typically, in this case, ū(x) ≠ 0 almost everywhere in Ω. Hence, we can tune γ in the interval (0, M] to get an optimal control with a small support.
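The sparsity relations just discussed can be made concrete for ν > 0. In the form commonly derived in the literature (see [10]), the pointwise relations behind (2.14) reduce to ū(x) = Proj_[α,β](−(1/ν) soft_γ(φ̄(x))), where soft_γ is the soft-thresholding map, so that ū vanishes exactly where |φ̄(x)| ≤ γ. The snippet below illustrates this; the adjoint state φ̄ and all parameter values are invented for illustration.

```python
import numpy as np

# Pointwise sparsity relation for nu > 0 (common form in the literature):
#   u(x) = Proj_[alpha,beta]( -(1/nu) * soft_gamma(phi(x)) ),
# hence u(x) = 0 exactly where |phi(x)| <= gamma.
nu, gamma, alpha, beta = 0.01, 0.5, -12.0, 12.0
x = np.linspace(0.0, 1.0, 200)
phi = np.sin(2.0 * np.pi * x)            # hypothetical adjoint state

soft = np.sign(phi) * np.maximum(np.abs(phi) - gamma, 0.0)
u = np.clip(-soft / nu, alpha, beta)     # candidate optimal control
```

Enlarging γ widens the region {|φ̄| ≤ γ} and hence the zero set of ū, which is precisely the mechanism for tuning the support through γ.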
The following example shows how the sparsity of the optimal control can be monitored through the parameter γ. The domain Ω = B_1(0) ⊂ R² is the unit disc, and the state equation (2.1) is given by
−Δy + y³ = u in Ω,  y = 0 on Γ.
The part L of the objective is the standard tracking functional, i.e. L(x, y) = ½ (y − y_d(x))², with
y_d(x₁, x₂) = 4 sin(2π x₁) sin(π x₂) e^{x₁}.
We have taken the parameters ν = 0.002, α = −12 and β = +12. The pictures in Fig. 1 show the solution of (P) for the values γ = 0 and γ = 2^i · 10⁻³ for i = 0, …, 7. The pictures are taken from [8].
To formulate the sufficient second order optimality conditions we need to introduce the cone of critical directions. Let ū ∈ K satisfy the first order optimality conditions (2.13). The corresponding cone Cū is a straightforward extension of the critical cone for a finite dimensional optimization problem.
The following result was proved in [10].
Theorem 4 Let us assume that ū ∈ K satisfies the first order optimality conditions (2.13). Then Cū is a closed convex cone of L²(Ω). Additionally, if ū is a local solution of (P), then J″(ū)v² ≥ 0 for every v ∈ Cū.

Let us observe that there is no second order contribution of the functional j; it only appears in the definition of the cone Cū. Next we give a sufficient condition for local optimality. First we recall that in finite dimension a sufficient condition for a local minimum ū of a functional J is the following: J′(ū) = 0 and J″(ū)v² > 0 for all v ≠ 0. In finite dimension this second condition is equivalent to the existence of δ > 0 such that J″(ū)v² ≥ δ‖v‖² for all v. However, the two second order conditions are not equivalent, in general, in infinite dimensional optimization problems. Therefore, the issue is whether the first condition is sufficient for a local minimum in infinite dimensional problems, whether we need to assume the second one, or whether neither of them is sufficient. The following example shows that it is not enough to assume, in general, that J″(ū)v² > 0 for all v ≠ 0.
The function ū(t) ≡ 0 satisfies the first-order necessary condition J′(ū) = 0 and J″(ū)v² > 0 for every v ≠ 0. However, ū is not a local minimum of J; this can be seen by evaluating J along a suitable sequence of perturbations of ū. It is a classical result that the condition J″(ū)v² ≥ δ‖v‖² is sufficient. Surprisingly, in control problems, when the Tikhonov term appears in the cost functional (ν > 0), the condition J″(ū)v² > 0 for v ≠ 0 is a sufficient condition as well. This result is also valid in the case of control constraints if we restrict v to the cone Cū.
Actually, the situation when ν > 0 is similar to the finite dimensional case. The following theorem is an immediate consequence of [10, Theorem 3.8].

Theorem 5 Let ν > 0 and let ū ∈ K be such that the first order optimality conditions (2.13) hold. Then the following statements are equivalent.
The assumption ν > 0 is essential in Theorems 5 and 6. When ν = 0, the situation is completely different. Since we have pointwise control constraints, the critical cone Cū is too small to formulate the sufficient second order conditions. This is shown by the following example due to Dunn [21], in which a(x) = 1 − 2x and the set of admissible functions u is denoted by K. Let us set ū(x) = max{0, −a(x)}; then ū(x) = 0 holds on [0, 1/2] and 0 < ū(x) < 2 on (1/2, 1]. Since u − ū is nonnegative for all u ∈ K, ū satisfies the first order necessary optimality conditions.
Since d̄(x) > 0 on [0, 1/2), the critical cone Cū for this example consists of directions vanishing on [0, 1/2). For all v ∈ Cū the second order condition J″(ū)v² > 0 can be checked; nevertheless, ū is not a local minimum. This example shows that, in general, it is necessary to extend Cū to a bigger cone to formulate the second order condition. To this end, the cone C^τ_ū was introduced in [4] to deal with the case ν = 0. There it was proved that a second order condition can be formulated on this extended cone, but it is not the condition that the reader may be expecting; it was also proved in [4] that the natural extension of the finite dimensional condition fails. A different condition was given: assuming that ū satisfies (2.13) and that there exist δ > 0 and τ > 0 such that (2.19) holds, there exist ε > 0 and κ > 0 such that a quadratic growth condition is satisfied.
Let us compare (2.18) and (2.19). The inequality (2.18) is crucial to analyze the stability of the solution ū of the control problem with respect to perturbations in the data of the problem; see, for instance, [16]. In that paper the authors analyze the stability for a solution of a control problem associated with a semilinear parabolic equation, but the results and the methods of proof are identical for the elliptic case. However, (2.19) only allows, in general, to prove the stability of the optimal states with respect to perturbations of the data of (P). In particular, using (2.19) one can prove that the difference between the optimal states of (P) for ν > 0 and the optimal ones for ν = 0 is of order o(√ν); see [16]. When ν > 0, the sufficient second order condition is a crucial tool for the proof of error estimates between continuous and discrete optimal controls; see [9,10]. However, for ν = 0, we can only get error estimates for the corresponding optimal states by using (2.19). Under some additional assumptions on the optimal adjoint state, some estimates can be deduced for the optimal controls as well in some cases. This has been done for linear-quadratic control problems in [31].
Finally, let us mention that the analysis for state-constrained control problems of semilinear elliptic equations with sparse controls can be found in [15].

Sparse control of semilinear parabolic equations
In this section, we analyze some optimal control problems governed by semilinear parabolic equations where the cost functional involves a functional j acting on the control which promotes the sparsity of the optimal control. The reader is referred to [11] for the proofs of the results of this section. Related references are [5,14]. These papers are devoted to the control of the Navier-Stokes and FitzHugh-Nagumo systems respectively by sparse controls.
The control problem is formulated as in Sect. 2: minimize J(u) = F(u) + γ j(u) over the set of feasible controls, where j is a Lipschitz continuous and convex, but not Fréchet differentiable, functional acting on the control, ν ≥ 0 and γ > 0. The state y_u is the solution of the semilinear parabolic equation (3.1). Here, A is the same linear elliptic operator considered in Sect. 2. We mention that it is possible to replace the Dirichlet boundary condition y_u = 0 by a Neumann boundary condition ∂_{n_A} y_u = g with g ∈ L^p(Σ); provided that p > n − 1, L^∞(Ω_T) estimates for the solution of the boundary value problem are obtained.

The analysis of (P) will be done for each of three choices of the functional j, denoted j₁, j₂ and j₃. When we take j = j₁, the corresponding problem (P) will be denoted by (P₁). Analogously, we define the control problems (P₂) and (P₃) corresponding to the other two functionals j₂ and j₃. Each of these choices for j induces a different spatio-temporal sparsity pattern for the optimal control, all of them being of interest. The functional j₃ with linear elliptic and parabolic equations was studied in [24], where the term directionally sparse controls was coined. Due to the linearity of the equation and the convexity of the cost functional, no second-order analysis was necessary in [24].
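A common way to write the three sparsity functionals, following the directional sparsity literature (e.g. [24] and the setting of [11]), is the following; the notation is a reconstruction and the exact form used in [11] may differ:

```latex
% Standard choices of the sparsity functionals (reconstruction).
j_1(u) = \|u\|_{L^1(\Omega_T)} = \int_0^T\!\!\int_\Omega |u(x,t)|\,dx\,dt,
\qquad
j_2(u) = \|u\|_{L^2(0,T;L^1(\Omega))}
       = \Bigl(\int_0^T \Bigl(\int_\Omega |u(x,t)|\,dx\Bigr)^{2} dt\Bigr)^{1/2},
\qquad
j_3(u) = \|u\|_{L^1(\Omega;L^2(0,T))}
       = \int_\Omega \Bigl(\int_0^T |u(x,t)|^{2}\,dt\Bigr)^{1/2} dx .
```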
Besides the first-order necessary conditions, we derive in this section second-order necessary and sufficient conditions for the non-convex problems (P 1 )-(P 3 ), which, in case ν > 0, both use the same cone of critical directions and thus provide the minimal gap between secondorder necessary and sufficient conditions. Note that the second-order directional derivatives v → j (u; v 2 ) of the above functionals do not exist in all directions.

Assumptions and preliminary results.
In this section Ω denotes an open, bounded subset of R n , 1 ≤ n ≤ 3, with a Lipschitz boundary Γ . The final time T > 0 is given. We make the following assumptions on the functions and parameters involved in the control problem (P).
Assumption 1 The assumptions on A are the same as in Sect. 2.
Assumption 3 We also assume −∞ < α < 0 < β < +∞, γ > 0, ν ≥ 0, and that L : Ω_T × R → R is a Carathéodory function of class C² with respect to the last variable such that L(·, ·, 0) ∈ L¹(Ω_T). Furthermore, for every M > 0 there exists a function ψ_M dominating the derivatives of L for a.a. (x, t) ∈ Ω_T. In the sequel, we denote the set of feasible controls by K. As usual, we denote by W(0, T) the space of functions y ∈ L²(0, T; H¹₀(Ω)) such that ∂_t y ∈ L²(0, T; H⁻¹(Ω)). It is well known that every function y ∈ W(0, T) belongs, after a modification on a set of zero Lebesgue measure, to C([0, T], L²(Ω)). Now, we analyze the existence, uniqueness and regularity of a solution of (3.1), as well as the differentiability of the control-to-state mapping.
The proof of the existence and uniqueness of a solution of (3.1) in W(0, T) ∩ L^∞(Ω_T) is standard. The reader is referred, for instance, to [3], where the arguments used for a Robin boundary condition can be easily adapted to the Dirichlet case. For the proof of the differentiability we can proceed as follows. We introduce the analogue of the operator F of Sect. 2. Using that y ∈ L^∞(Ω_T) together with (3.3), we deduce that a(·, ·, y) ∈ L^p̂(0, T; L^q̂(Ω)). Hence, F is well defined and we can apply the implicit function theorem to deduce that G is of class C² and to show that (3.7) and (3.8) represent its first and second derivatives, respectively. Now, we have the following differentiability result for F.

A* being the adjoint operator of A.
The fact that F is of class C² is an immediate consequence of Theorem 9 and the chain rule. On the other hand, since y_u ∈ L^∞(Ω_T), we deduce from (3.5) that ∂L/∂y(·, ·, y_u) ∈ L^p̂(0, T; L^q̂(Ω)), which implies that ϕ_u is well defined and enjoys the indicated regularity. The formulas (3.9) and (3.10) follow from standard computations.
Analogously to Y , we define the space endowed with the graph norm. As established for Y , we also have the embedding Φ ⊂ C([0, T ], L 2 (Ω)).
We conclude this section with the following theorem, whose proof follows from classical arguments by taking a minimizing sequence.
Theorem 10 Problem (P) has at least one solution ū.

First order optimality conditions.
The next theorem states the first-order optimality conditions satisfied by a local minimum of (P); see Definition 1 and the subsequent comments. The proof follows the same steps as that of Theorem 3, using (3.9) instead of (2.8).

Theorem 11 If ū is a local minimum of (P), then there exist ȳ, φ̄ and λ̄ ∈ ∂j(ū) such that the optimality system (3.12) is satisfied.
Now, we use the optimality system (3.12) to deduce the sparse structure of ū for the three choices of j.
Problem (P₁) The subdifferential and the directional derivatives of the functional j₁ : L¹(Ω_T) → R were given in (2.11) and (2.12), where it is enough to replace Ω by Ω_T.
In the case ν = 0, (3.14) implies that if the set of points (x, t) ∈ Ω_T where |φ̄(x, t)| = γ has zero Lebesgue measure (which is expected in many cases), then ū(x, t) ∈ {α, 0, β} for almost all (x, t) ∈ Ω_T, which means that the optimal control has a bang-bang-bang structure.
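The bang-bang-bang structure can be visualized with a toy adjoint state. The formula below is the standard consequence of the relations (3.13)-(3.14) for ν = 0; the function φ̄ used here is artificial, chosen only so that all three regimes occur.

```python
import numpy as np

# For nu = 0, outside the (typically null) set {|phi| = gamma}:
#   u = alpha where phi > gamma,  u = beta where phi < -gamma,
#   u = 0 where |phi| < gamma  (bang-bang-bang structure).
alpha, beta, gamma = -12.0, 12.0, 0.4
t = np.linspace(0.0, 1.0, 300)
phi = np.cos(3.0 * np.pi * t)            # hypothetical adjoint state

u = np.where(phi > gamma, alpha, np.where(phi < -gamma, beta, 0.0))
```

Every value of u lies in {α, 0, β}, and the zero region is exactly the set where the adjoint state stays below the threshold γ in absolute value.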
From the state equation (3.12a) we can get an L^∞(Ω_T) estimate for ȳ depending on α and β, but independent of ū. Using (3.12b), we then deduce an estimate ‖φ̄‖_{L^∞(Ω_T)} ≤ M with M independent of ū. Hence, from (3.13) and (3.14) we conclude that ū ≡ 0 if γ > M. Therefore, we may influence the size of an optimal control's support by adjusting γ in the interval (0, M].
Problem (P₂) The characterization of the subdifferential of j₂ and of its directional derivatives is given in the following propositions; see (3.16) and (3.17).

Proposition 2 For every u, v ∈ L²(0, T; L¹(Ω)), the directional derivative j₂′(u; v) exists; its expression distinguishes the instants t where u(t) vanishes from those where it does not.

From the previous propositions and Theorem 11 we deduce the following corollary.
Analogously to the problem (P₁), we can prove that there exists M > 0, independent of γ, such that ū ≡ 0 if γ > M. Hence, we can tune the parameter γ in the interval (0, M] to get a suitable support for ū.

Problem (P 3 )
The expressions for the subdifferential and the directional derivatives of j₃ are given in the following proposition.

Proposition 3
The following statements hold.
for a.a. x ∈ Ω_u and t ∈ (0, T), (3.22) where Ω_u is the set of points x at which u(x, ·) does not vanish. As a consequence of the above proposition and Theorem 11 we get the following corollary.

Corollary 4
Let ū, φ̄ and λ̄ be as in Theorem 11. Then the relations (3.24)-(3.26) hold almost everywhere.

Moreover, λ̄ is unique.
Once again, we can prove the existence of some M > 0, independent of γ, such that ū ≡ 0 if γ > M.

Remark 3 It is interesting to compare the sparsity properties of the local solutions ū of the problems studied. From (3.13) and (3.14) we obtain that the local solutions ū of (P₁) are sparse in space and time. However, the solutions of (P₃) are only sparse in space, as proved by (3.24) and (3.25), the sparsity region remaining constant throughout time. When we look at (3.19) and (3.20), we observe that the sparsity region of the solutions of (P₂) can change with time. Each of the three formulations can be of interest, depending on the application.
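Remark 3 can be illustrated on a discretized toy example. The array below mimics u(x_i, t_k); applying the proximal maps of (discrete versions of) j₁ and j₃ shows the two extreme patterns: elementwise shrinkage kills individual space-time entries, while the group shrinkage associated with the L¹(Ω; L²(0,T)) norm kills entire spatial rows at once. All data and the threshold are invented; j₂, whose proximal map couples the time slices through a time-dependent threshold, sits in between and is omitted here.

```python
import numpy as np

# Rows = spatial points x_i, columns = time instants t_k (toy data).
U = np.array([
    [0.2, 3.0, 0.2, 3.0, 0.2],   # active only at some times
    [0.2, 0.2, 0.2, 0.2, 0.2],   # weak everywhere
    [3.0, 3.0, 3.0, 3.0, 3.0],   # strong everywhere
    [0.2, 0.2, 3.0, 0.2, 0.2],   # one strong instant
])
tau = 1.0

# prox of tau*||.||_1 (discrete j1): elementwise soft-thresholding
P1 = np.sign(U) * np.maximum(np.abs(U) - tau, 0.0)

# prox of tau*||.||_{l1(space; l2(time))} (discrete j3): shrink rows as groups
row_norms = np.linalg.norm(U, axis=1, keepdims=True)
P3 = np.maximum(1.0 - tau / np.maximum(row_norms, 1e-12), 0.0) * U
```

P1 has a zero pattern that varies with both indices (sparse in space and time), whereas P3 removes the weak row entirely while every surviving row remains active at all times, reproducing the directional sparsity of (P₃).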

Second order optimality conditions.
To derive the second order necessary optimality conditions we introduce the cone of critical directions Cū analogously to Sect. 2.2; see (3.27). The set Cū is a closed, convex cone in L²(Ω_T). On this cone we formulate the second order necessary conditions. Now we introduce the notation (3.28)-(3.30) for the three different functionals under investigation. In (3.29), j_Ω : L¹(Ω) → R is defined by j_Ω(u) = ‖u‖_{L¹(Ω)}, and j_Ω′(u; v) is given by (2.12).
In (3.30), Ω_u is defined as in Proposition 3. The above expressions do not mean that the second derivatives exist; they are simply notation. When the second directional derivative exists for some u and some v, then it is given by the above expressions. Concerning j₃″(u; v²), the reader should observe that a simple application of the Schwarz inequality implies that the integrand on Ω_u is nonnegative. Hence, j₃″(u; v²) is always well defined, but it can take the value +∞ in some cases. In the sequel we set J″(u; v²) = F″(u)v² + γ j″(u; v²), where j is one of the three functionals defined above.
For the sufficient second order conditions we introduce an extended cone, analogously as we did in Sect. 2.2.
The second order optimality conditions can be formulated indistinctly on Cū or C^τ_ū if ν > 0. However, for ν = 0, the cone C^τ_ū is the correct one. Analogously to Theorem 6 we have the following result.
Theorem 13 Let us assume that ν > 0. Then the following statements are equivalent, where z_v = G′(ū)v is the solution of the linearized parabolic equation (3.7) corresponding to ȳ.
For the case ν = 0 we have the following result.
In the case ν = 0 and j = j₃, the second order sufficient conditions for local optimality remain an open question for the moment.

Elliptic control problems in measure spaces with sparse solutions
This section is dedicated to the analysis of the optimal control problem
(P)  min_{u ∈ M(Ω)} J(u) := ½ ‖y_u − y_d‖²_{L²(Ω)} + γ ‖u‖_{M(Ω)},
where y_u is the unique solution to the Dirichlet problem
Ay = u in Ω,  y = 0 on Γ,  (4.1)

A being the operator introduced in Sect. 2 and satisfying Assumption 1. We assume that γ > 0, y_d ∈ L²(Ω), and Ω is a bounded domain in R^n, n = 2 or 3, with a Lipschitz boundary Γ. The controls are taken in the space M(Ω) of real regular Borel measures. As usual, M(Ω) is identified, by the Riesz theorem, with the dual space of C₀(Ω), the space of continuous functions on Ω̄ vanishing on Γ, and it is endowed with the dual norm
‖u‖_{M(Ω)} = sup { ∫_Ω z du : z ∈ C₀(Ω), ‖z‖_{C₀(Ω)} ≤ 1 },
which coincides with the total variation of u.
We will see that this formulation leads to optimal controls which are sparse. This is relevant for many applications in distributed parameter control; see [19]. Moreover, the support of the optimal control provides information on the optimal placement of control actuators. The main advantage of the use of measures as controls is that the support of the optimal controls can be much smaller than the corresponding support in the formulation with L¹(Ω) functions, as in Sect. 2. Actually, these supports can have zero Lebesgue measure. Most of the results presented in this section are proved in [6].
As usual, given a measure u ∈ M(Ω), we say that y is a solution of (4.1), in the transposition sense, if
∫_Ω y A*z dx = ∫_Ω z du for every test function z,  (4.2)
where A* is the adjoint operator of A. It is well known, see for instance [30], that there exists a unique solution to (4.1) in the sense of (4.2). Moreover, y ∈ W^{1,p}₀(Ω) for every 1 ≤ p < n/(n−1), and the corresponding a priori estimate in terms of ‖u‖_{M(Ω)} holds. Since W^{1,p}₀(Ω) ⊂ L²(Ω) for every 2n/(n+2) ≤ p < n/(n−1), the cost functional is well defined on M(Ω). Furthermore, the control-to-state mapping is injective, and therefore the cost functional J is strictly convex. Then, by the standard approach, one obtains that (P) has a unique solution; see [19] for details. Hereafter, this optimal solution will be denoted by ū, with associated state ȳ. By using the subdifferential calculus of convex functions and introducing the adjoint state, we get the following result (see also [19,20]).

Theorem 16 There exists a unique adjoint state φ̄ such that the optimality conditions (4.4)-(4.6) hold.
Since the control problem is convex, the conditions (4.4)-(4.6) are necessary and sufficient for the optimality of ū. Relations (4.5) and (4.6) imply the sparsity of ū. Indeed, this is an immediate consequence of the following lemma, whose proof can be found in [7].

Lemma 1 Let μ ∈ M(Ω) and z ∈ C₀(Ω), both of them not zero, be such that
∫_Ω z dμ = ‖z‖_{C₀(Ω)} ‖μ‖_{M(Ω)},
and let μ = μ⁺ − μ⁻ be the Jordan decomposition of μ. Then we have
supp(μ⁺) ⊂ {x ∈ Ω : z(x) = +‖z‖_{C₀(Ω)}} and supp(μ⁻) ⊂ {x ∈ Ω : z(x) = −‖z‖_{C₀(Ω)}}.

Combining Lemma 1 with (4.5) and (4.6) we deduce (4.7). As the numerical results show, the set {x ∈ Ω : |φ̄(x)| = γ} is small, which yields the sparsity of ū. Moreover, it can be proved that there exists a real number M > 0, independent of γ, such that ū ≡ 0 if γ > M.

Remark 4 There is a very interesting paper by Pieper and Vexler [28] where the following regularity result is proved: under the assumption y_d ∈ L^∞(Ω), we have ȳ ∈ H¹₀(Ω) ∩ L^∞(Ω) and, consequently, ū ∈ M(Ω) ∩ H⁻¹(Ω). Hence, one can imagine that the optimal control is supported on curves in the two-dimensional case, or on surfaces in the three-dimensional case. Dirac measures are excluded, and measures concentrated on curves are also excluded for n = 3.
In [6], the numerical analysis of (P) is carried out by using a convenient finite element approximation of M(Ω). In particular, convergence of the optimal controls and error estimates for the difference between the continuous and discrete optimal states are established; see [28] for some improved error estimates. Since M(Ω) is not a separable Banach space, we cannot get strong convergence of the discretizations, but we can prove weak* convergence of the discrete optimal controls to ū. The discrete problem can be solved by a semismooth Newton method. Let us illustrate the theoretical results presented above with an example taken from [6]. In the computations we have taken Ω = (−1, +1)², the operator A = −Δ, γ = 10⁻², and y_d(x) = 10 exp(−50|x|²). The optimal control is shown in Fig. 2. Before finishing this section, let us address a natural question that the reader is probably wondering about: what happens if the state equation is semilinear? Let us consider the equation (2.1) with u ∈ M(Ω). The associated control problem was studied in [12]. The main difficulty is the solvability of the state equation. Indeed, it is well known that (2.1) has no solution if we take A = −Δ, a(x, y) = y³ and u = δ_{x₀} with x₀ ∈ Ω; see [1]. This difficulty is overcome in [12] in two different ways. First, we assume that the growth of a(x, y) with respect to y is polynomial of arbitrary degree if n = 2, and of degree < 3 if n = 3. Under this assumption we have existence and uniqueness of a solution of the state equation. A second approach consists of taking the controls in a subspace M∞(Ω) of M(Ω) for which (2.1) has a unique solution for arbitrary nonlinearities of a(x, y) with respect to y. Of course, as in Sect. 2, we require the function a to be monotone nondecreasing with respect to y; one can then conclude that M∞(Ω) is the natural space for the controls.
However, since M∞(Ω) is not a closed subspace of M(Ω), the proof of the existence of a solution for the associated control problem is not obvious; it was nevertheless established in [12]. In both approaches we get the optimality system satisfied by the locally optimal controls, and we prove the sparsity given by (4.7).
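The practical meaning of controls supported on sets of zero Lebesgue measure can be illustrated in one dimension. The sketch below is an invented toy, not the finite element scheme of [6]: the measure control on (0, 1) is replaced by Dirac masses at interior grid nodes, u = Σ_j c_j δ_{x_j}, so that ‖u‖_{M(Ω)} = Σ_j |c_j|; K is the Green's matrix of −d²/dx² with homogeneous Dirichlet data, standing in for the solution operator of (4.1); and the resulting finite dimensional problem is solved by a simple proximal gradient loop instead of a semismooth Newton method.

```python
import numpy as np

# Dirac-mass discretization of a measure control on (0,1).
n = 60
x = np.linspace(0.0, 1.0, n + 2)[1:-1]                    # interior grid nodes
# Green's function of -y'' with y(0)=y(1)=0: G(s,t) = min(s,t)*(1 - max(s,t))
K = np.minimum.outer(x, x) * (1.0 - np.maximum.outer(x, x))

c_true = np.zeros(n)
c_true[15], c_true[40] = 1.0, -0.7                        # two Dirac sources
y_d = K @ c_true                                          # target state
gamma = 0.01 * np.max(np.abs(K.T @ y_d))                  # penalty on ||u||_M

c = np.zeros(n)
step = 1.0 / np.linalg.norm(K, 2) ** 2
for _ in range(20000):
    v = c - step * K.T @ (K @ c - y_d)                    # gradient step on tracking term
    c = np.sign(v) * np.maximum(np.abs(v) - step * gamma, 0.0)  # shrink sum_j |c_j|

n_active = int(np.count_nonzero(c))                       # active Dirac masses
```

The computed control concentrates its mass on a small number of nodes, the discrete counterpart of an optimal measure supported on a set of zero Lebesgue measure.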

Parabolic control problems in measure spaces with sparse solutions
In this section we consider control problems for parabolic equations where the controls are measures. As we mentioned in Sect. 3, the possibilities of formulating a control problem leading to sparse optimal controls are richer in the parabolic case than in the elliptic one. Here we will show three different formulations that are the natural extensions of the three problems studied in Sect. 3. We restrict the formulation to linear equations because of the difficulty of dealing with measures for nonlinear parabolic equations. Additionally, the cost functional will involve a tracking type term for the state plus the norm of the corresponding control. In all these formulations the state equation will be
∂_t y + Ay = u in Ω_T,  y = 0 on Σ,  y(0) = y₀ in Ω,  (5.1)
where y₀ ∈ L²(Ω) and A is defined as in Sect. 2, under the same assumptions. We also assume that Ω ⊂ R^n with 1 ≤ n ≤ 3 and that Γ is Lipschitz. We denote the solution of (5.1) by y_u. The three problems (P₁), (P₂) and (P₃) are formulated by combining the tracking term with the corresponding measure norms of the control. We analyze each of these problems in the next subsections.

Analysis of (P 1 )
The problem (P₁) has been studied in [13]. In this problem the controls are measures on Ω_T. If u is a function, then its norm in the measure space coincides with the L¹(Ω_T)-norm. Hence, this can be considered as an extension of the functional j₁ introduced in Sect. 3 to the space of measures. The term involving the state is different from the corresponding term for the problems (P₂) and (P₃). The reason for this is the lack of regularity of the solution of (5.1) when u ∈ M(Ω_T). We say that a function y ∈ L¹(Ω_T) is a solution of (5.1) if a suitable identity, obtained by transposition, holds. Let us observe that the adjoint problem has a unique solution φ ∈ L²(0, T; H¹₀(Ω)) ∩ C([0, T]; L²(Ω)) for every right-hand side f ∈ L^∞(Ω_T). Moreover, the regularity φ ∈ C(Ω̄_T) holds; see [26, Chapter 3].
Then we have the following result of existence, uniqueness and regularity; see, for instance, [13].

Theorem 17
There exists a unique solution y of (5.1). Moreover, y ∈ L^r(0, T; W^{1,p}₀(Ω)) for all p, r ∈ [1, 2) with (2/r) + (n/p) > n + 1, and the corresponding a priori estimate holds. From this theorem we deduce that y_u ∈ L^q(Ω_T) if 1 ≤ q < min{2, (n+2)/n}. This is the reason to replace the L²(Ω_T)-norm in the cost functional by the L^q(Ω_T)-norm. Next we will assume that 1 < q < min{2, (n+2)/n}. The solution of (P₁) is characterized by the first order optimality conditions expressed in the following theorem.

Theorem 18
Let ū denote a solution to (P₁) with associated state ȳ. Then, there exists an element φ̄ ∈ L²(0, T; H¹₀(Ω)) ∩ C(Ω̄_T) such that the optimality system (5.6) is satisfied. The continuity of φ̄ follows from the fact that g ∈ L^{q′}(Ω_T) with q′ > 5/2. From Lemma 1 and (5.6) we deduce the following corollary, which shows the sparse character of the optimal control.

Corollary 5 Under the assumptions and notations of Theorem 18 we have that
where ū = ū⁺ − ū⁻ is the Jordan decomposition of ū.

Analysis of (P 2 )
This section is based on the paper [7]. Hereafter we denote I = (0, T). As a consequence of the corresponding existence theorem, it is immediate that (P₂) has at least one solution; moreover, due to its strict convexity, we get the uniqueness of this solution. Hereafter ū will denote the solution to (P₂) and ȳ the associated state. Now, we give the first order optimality conditions, which are necessary and sufficient due to the convexity of (P₂). From (5.11) and (5.12) we deduce the sparsity of the optimal control ū. Let us consider the Jordan decomposition ū(t) = ū⁺(t) − ū⁻(t) for almost every t ∈ I. Then we have the following theorem.
If we denote by φ̄ the adjoint state associated with ū, which is the solution of (5.10), then the relations (5.13)-(5.15) hold. If we compare the sparse structure of ū defined through (5.13), (5.14) and (5.15), we can observe the same differences that we already found in the comparison of (3.19), (3.20) with (3.24), (3.25); see Remark 3. Related to the material presented in this section is the controllability of parabolic equations by sparse measures; see [17,18].