OPTIMALITY CONDITIONS AND ERROR ANALYSIS OF SEMILINEAR ELLIPTIC CONTROL PROBLEMS WITH L^1 COST FUNCTIONAL

Abstract. Semilinear elliptic optimal control problems involving the L^1 norm of the control in the objective are considered. Necessary and sufficient second-order optimality conditions are derived. A priori finite element error estimates are shown for piecewise constant discretizations of the control and piecewise linear discretizations of the state. Error estimates for the variational discretization of the problem in the sense of [13] are also obtained. Numerical experiments confirm the convergence rates.


Introduction.
In this paper we consider an optimal control problem subject to a semilinear elliptic state equation. The objective functional contains the L^1 norm of the control and is therefore nondifferentiable. Problems of this type are of interest for two reasons. First, the L^1 norm of the control is often a natural measure of the control cost. Second, this term leads to sparsely supported optimal controls, which are desirable, for instance, in actuator placement problems [17]. In optimal control of distributed parameter systems, it may be impossible or undesirable to put controllers at every point of the domain. Instead, we can decide to control the system by localizing the controls in small regions. The key issue is to determine the most effective location of the controls. An answer to this question is given by solving the control problem with an L^1 norm of the control in the objective.
However, the nondifferentiability of the objective leads to some difficulties. While first-order necessary optimality conditions can be derived in a standard way via Clarke's calculus of generalized derivatives, second-order conditions require additional effort. From the first-order optimality conditions, we deduce a representation formula (see (3.5c)) for the subgradient λ̄ of the nondifferentiable term at the optimal control ū, i.e., λ̄ ∈ ∂‖ū‖_{L^1(Ω)}. This formula is new and has some important consequences. First, it proves the uniqueness of λ̄, which is not usually obtained for a nondifferentiable optimization problem. Second, it proves that λ̄ is not only an L^∞(Ω) function but a Lipschitz function on Ω̄, which implies, together with formula (3.5a) for the optimal control, that ū is also Lipschitz on Ω̄. This extra regularity of the optimal control is essential in deriving the error estimates; no error estimates can be expected without it.

We make the following assumptions on the functions and parameters involved in the control problem (P); here A denotes the second-order elliptic differential operator of the state equation (2.1).

Assumption 1. The coefficients of A have the following regularity properties: a_0 ∈ L^∞(Ω), a_{ij} ∈ C^{0,1}(Ω̄), and

(2.2)  a_0(x) ≥ 0 and ∑_{i,j=1}^n a_{ij}(x) ξ_i ξ_j ≥ Λ|ξ|² for a.a. x ∈ Ω and all ξ ∈ ℝ^n.

Assumption 2. a : Ω × ℝ → ℝ is a Carathéodory function of class C² with respect to the second variable, with a(·,0) ∈ L^p̄(Ω) for some p̄ > n, and satisfying the monotonicity and growth conditions (2.3).

Assumption 3. We also assume −∞ < α < 0 < β < +∞, μ > 0, ν > 0, and that L : Ω × ℝ → ℝ is a Carathéodory function of class C² with respect to the second variable such that L(·,0) ∈ L^1(Ω) and, for every M > 0, there exists a function ψ_M ∈ L^p̄(Ω), with n < p̄ < +∞, satisfying

(2.4)  |∂^j L/∂y^j (x, y)| ≤ ψ_M(x) for all |y| ≤ M and a.a. x ∈ Ω, with j = 1, 2.
In what follows, we will denote the set of feasible controls by K = {u ∈ L^∞(Ω) : α ≤ u(x) ≤ β for a.a. x ∈ Ω}. Let us notice that the usual function L(x, y) = ½(y − y_d(x))² satisfies Assumption 3 if y_d ∈ L^p̄(Ω).
Remark 2.1. In Assumption 3 we made the hypothesis α < 0 < β. In the case where 0 ≤ α ≤ β or α ≤ β ≤ 0, the L^1 norm is linear on K; hence the cost functional J is differentiable, and the control problem (P) falls into the framework of well studied optimal control problems. Here we are interested in analyzing the nondifferentiable case. Downloaded 08/20/13 to 193.144.185.39. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php Moreover, since we are looking for sparsity of the optimal control, it does not make sense to consider 0 < α or β < 0. However, the cases α = 0 or β = 0 are frequent in practice. In these situations, the sparsity of the optimal control is also induced by the presence of the term μ‖u‖_{L^1(Ω)}; see Remark 3.3.
The next theorem states that the control-to-state map is well posed and differentiable.
Theorem 2.2. The following statements hold:
1. For any u ∈ L^p(Ω), with n/2 < p ≤ p̄, there exists a unique solution y_u ∈ H_0^1(Ω) ∩ W^{2,p}(Ω) of (2.1).
2. The control-to-state map G : L^p(Ω) → W^{2,p}(Ω), G(u) = y_u, is of class C², and its derivatives are characterized by (2.5) and (2.6).

Proof. The existence and uniqueness of a solution of (2.1) in H_0^1(Ω) ∩ L^∞(Ω) is obtained by classical arguments; see, for instance, [4]. The W^{2,p}(Ω) regularity follows from the C^{1,1} regularity of Γ, Assumptions 1 and 2, and the result of Grisvard [11, Theorem 2.4.2.5]. The differentiability of G can be obtained from the implicit function theorem as follows. We define the nonlinear operator F(y, u) = Ay + a(·, y) − u. Then it is immediate to check that F is of class C² and F(G(u), u) = 0 for every u ∈ L^p(Ω). Using [11, Theorem 2.4.2.5] again, we deduce that (∂F/∂y)(G(u), u) is an isomorphism. Thus, the assumptions of the implicit function theorem are fulfilled, and some simple calculations prove (2.5) and (2.6).
As an immediate consequence of the previous theorem, the smooth part F of the objective functional enjoys the following differentiability result.

Theorem 2.3. The functional F : L²(Ω) → ℝ is of class C², and its first and second derivatives are given by (2.7) and (2.8), A^* being the adjoint operator of A. Finally, it is obvious that problem (P) has at least one global solution, which belongs to L^∞(Ω) because of the control constraints. The reader is referred to the book by Tröltzsch [18, Chapter 4.4] for the proof of these results.
Note that under some extra assumptions on L, the existence of a solution of (P) in L²(Ω) can still be proved for α = −∞ or β = +∞. For instance, if L is bounded from below, i.e., L(x, y) ≥ C_L with C_L ∈ ℝ, then the cost functional J is coercive, and consequently (P) again has a global solution in L²(Ω). Indeed, from the first-order optimality conditions we can deduce that this solution is not only in L²(Ω) but also belongs to L^∞(Ω).
Remark 2.4. Theorem 2.2 is also valid for convex polygonal domains Ω ⊂ ℝ². The only difference is that p is not only bounded above by p̄; it also depends on the angles of the polygon Ω. Indeed, let ω be the largest angle of Ω. Using the results of Grisvard [11, Chapter 4], if ω ≤ π/2, then p can be chosen as in Theorem 2.2, bounded only by p̄. However, if ω > π/2, then n/2 < p < min{p̄, 2/(2 − π/ω)} is the correct interval. With this modification, Theorem 2.3 remains valid, as does the rest of the results in this paper.
3. First- and second-order optimality conditions. In this section we derive the necessary first- and second-order optimality conditions, and we also provide a sufficient second-order condition with a minimal gap with respect to the necessary ones. Since (P) is not a convex problem, we deal with local solutions. As usual, ū is said to be a local solution of (P) in the L^q(Ω) sense, 1 ≤ q ≤ +∞, if there exists ε > 0 such that ū is a solution of the problem

(P_ε)  min{J(u) : u ∈ K ∩ B̄_ε(ū)},

where B̄_ε(ū) denotes the closed ball of L^q(Ω) with center ū and radius ε. The solution is called strict if ū is the unique global solution of (P_ε) for some ε > 0. It is immediate to check that if ū is a local solution in the L^q(Ω) sense for any 1 ≤ q < ∞, then it is also a local solution in the L^∞(Ω) sense. On the other hand, since K is bounded in L^∞(Ω), if ū is a local solution in the L^q(Ω) sense for some 1 ≤ q < ∞, then ū is also a local solution in the L^p(Ω) sense for any 1 ≤ p < ∞. Therefore, we can distinguish two different notions of local minima: the L²(Ω) sense and the L^∞(Ω) sense. The results proved in this paper hold for either of these two notions, so we will not distinguish between them and will simply speak of local minima.

In the study of the optimality conditions there is a difficulty coming from the nondifferentiability of the function j(u) = ‖u‖_{L^1(Ω)} involved in the objective function of (P). Since j is convex and Lipschitz, we can apply some classical results to deduce the first-order conditions. However, the second-order necessary and sufficient optimality conditions, as presented here, are new to the best of our knowledge. The sufficient second-order conditions will be used in the next section to derive the error estimates of finite element approximations, which shows their utility.
Before stating these optimality conditions we recall some properties of the function j. Since j is convex and Lipschitz, the subdifferential in the sense of convex analysis and the generalized gradient introduced by Clarke coincide. Moreover, a simple computation shows that λ ∈ ∂j(u) if and only if

(3.1)  λ(x) = +1 if u(x) > 0, λ(x) = −1 if u(x) < 0, and λ(x) ∈ [−1, +1] if u(x) = 0

holds a.e. in Ω. Also, j has directional derivatives given by

j'(u; v) = ∫_{Ω_u^+} v dx − ∫_{Ω_u^−} v dx + ∫_{Ω_u^0} |v| dx,

where Ω_u^+, Ω_u^−, and Ω_u^0 represent the sets of points where u is positive, negative, or zero, respectively. Finally, the following relation holds:

(3.3)  j'(u; v) = max_{λ ∈ ∂j(u)} ∫_Ω λv dx.

We refer to Clarke's monograph for these facts.

Theorem 3.1. If ū is a local minimum of (P), then there exist ȳ, φ̄ ∈ W^{2,p}(Ω) and λ̄ ∈ ∂j(ū) such that the optimality system (3.4) is satisfied.

Corollary 3.2. Let ū, φ̄, and λ̄ be as in the previous theorem; then the following relations hold:

(3.5a)  ū(x) = Proj_{[α,β]}( −(1/ν)(φ̄(x) + μλ̄(x)) ),
(3.5b)  ū(x) = 0 ⇔ |φ̄(x)| ≤ μ,
(3.5c)  λ̄(x) = Proj_{[−1,+1]}( −(1/μ)φ̄(x) ).

Moreover, from the first and last representation formulas it follows that ū, λ̄ ∈ C^{0,1}(Ω̄) and that λ̄ is unique for any fixed local minimum ū.

Proof. The derivation of formula (3.5a) is standard in control theory. Now, from (3.1), (3.5a), and the fact that α < 0 < β, we get that ū(x) > 0 implies φ̄(x) < −μ, and analogously we deduce that if ū(x) < 0, then φ̄(x) > μ. These properties are equivalent to (3.5b). Let us prove (3.5c). Taking into account (3.1), (3.5b), and (3.5a), we obtain λ̄(x) = −φ̄(x)/μ wherever |φ̄(x)| ≤ μ. For the case φ̄(x) < −μ we can proceed as for the case φ̄(x) > μ, which completes the proof of (3.5c). The Lipschitz property of λ̄ follows from (3.5c) and from the fact that φ̄ ∈ W^{2,p}(Ω) ↪ C¹(Ω̄). Finally, (3.5a) leads to the Lipschitz regularity of ū.
Remark 3.3. Let us point out that relation (3.5b) implies the sparsity of local optimal controls. This property was observed in [17], and it continues to hold in the cases α = 0 or β = 0. Indeed, if α = 0, it is easy to deduce from (3.5a) that ū(x) = 0 if and only if φ̄(x) ≥ −μ, which also implies sparsity. For β = 0, we have that ū(x) = 0 if and only if φ̄(x) ≤ +μ.
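The relations (3.5a)-(3.5c) can be evaluated pointwise. The following sketch (hypothetical sample values, NumPy for the pointwise projections) illustrates how the control and the subgradient are recovered from the adjoint state and how the sparsity relation (3.5b) emerges.

```python
import numpy as np

def control_from_adjoint(phi, mu, nu, alpha, beta):
    """Evaluate the representation formulas (3.5a)/(3.5c) pointwise:
    lam = Proj_[-1,1](-phi/mu),  u = Proj_[alpha,beta](-(phi + mu*lam)/nu)."""
    lam = np.clip(-phi / mu, -1.0, 1.0)
    u = np.clip(-(phi + mu * lam) / nu, alpha, beta)
    return u, lam

# Hypothetical adjoint-state samples; by (3.5b) the control vanishes
# exactly where |phi| <= mu.
phi = np.array([-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0])
u, lam = control_from_adjoint(phi, mu=1.0, nu=1.0, alpha=-1.5, beta=1.5)
```

With these sample values the control is nonzero only at the two points where |φ| > μ, in agreement with Remark 3.3.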
In order to address the second-order optimality conditions we need to introduce the critical cone. Given a control ū ∈ K for which there exists λ̄ ∈ ∂j(ū) satisfying (3.4), we define the critical cone C_ū.

Proposition 3.4. The set C_ū is a closed, convex cone in L²(Ω). Before proving this proposition we have to establish the following lemma.
Proof. The first inequality of (3.8) is an immediate consequence of (3.3). Let us prove the second inequality. For every k ∈ ℕ we define a truncation v_k of v, setting v_k(x) = 0 on an exceptional set and v_k(x) = Proj_{[−k,+k]}(v(x)) otherwise. Finally, dividing the resulting expression by ρ and passing to the limit as k → ∞, we obtain the second inequality of (3.8).
Identities (3.9) are an obvious consequence of (3.8) and the equality satisfied by the elements of C_ū.
Remark 3.6. Let us observe that for any v ∈ L²(Ω) satisfying the sign conditions (3.7), a corresponding identity holds. In particular, (3.9) implies that this identity holds for all the elements of C_ū.
Proof of Proposition 3.4. It is obvious that C_ū is a closed cone in L²(Ω). Let us prove that it is convex. Let v_1, v_2 ∈ C_ū, t ∈ [0,1], and v = tv_1 + (1−t)v_2; using the convexity of j we get j'(ū; v) ≤ t j'(ū; v_1) + (1−t) j'(ū; v_2). The converse inequality is a consequence of Lemma 3.5; hence v ∈ C_ū.

Let us introduce some notation. We define the function d̄ by (3.10), the Lipschitz regularity of d̄ being a consequence of the regularity properties established in Corollary 3.2. From (3.5a) we deduce, as usual, the relations (3.11). On the other hand, from (3.9) we have an identity which, along with (3.7) and (3.11), leads to (3.12). Now, we can formulate the second-order necessary optimality conditions as follows.

Theorem 3.7. Let us assume that ū is a local minimum of (P); then F''(ū)v² ≥ 0 for every v ∈ C_ū.

Proof. Given v ∈ C_ū, we take truncations v_k as above and ρ_k = 1/k². Then, as in the proof of Lemma 3.5, we have that ū + ρv_k ∈ K for every 0 < ρ ≤ ρ_k. Let λ̄ be the unique element of ∂j(ū) associated with ū; see Corollary 3.2. Then, by (3.9) and (3.12), and owing to the sign condition for v_k, we obtain (3.13). A case analysis yields (3.14). Thus, from this identity, (3.13), and (3.14), we conclude (3.15). On the other hand, the second identity of (3.13) can be written as (3.16). Now, using the fact that ū is a local minimum and taking into account (3.15) and (3.16), we infer for all ρ > 0 sufficiently small that J(ū + ρv_k) ≥ J(ū). Dividing this expression by ρ²/2 and letting ρ ↘ 0, we obtain F''(ū)v_k² ≥ 0. Finally, passing to the limit as k → ∞, we conclude that F''(ū)v² ≥ 0.

We finish the section by proving the sufficient condition in Theorem 3.9 with a minimal gap with respect to the necessary one proved in Theorem 3.7. Before we do so, we recall that a natural assumption would be the positivity of the second derivative F''(ū) on the critical cone C_ū. Due to the L² regularization term, this already implies that F''(ū) is uniformly positive even on a larger cone. This is established in the next theorem.
Moreover, this second equivalent condition will be used for the numerical analysis in section 4.
Theorem 3.8. Let ū ∈ K and λ̄ ∈ ∂j(ū) be such that (3.4) holds. Then the following statements are equivalent:
1. F''(ū)v² > 0 for every v ∈ C_ū \ {0};
2. there exist δ > 0 and τ > 0 such that F''(ū)v² ≥ δ‖v‖²_{L²(Ω)} for every v ∈ C_ū^τ.

Proof. It is obvious that the second condition implies the first one. Let us prove the other implication by contradiction. We assume that the first condition holds, but not the second; hence, for every k there exists an element v_k ∈ C_ū^{1/k} violating the uniform positivity with δ = τ = 1/k. Since C_ū^{1/k} is a cone, we can divide v_k by its L²(Ω) norm and, by taking a subsequence if necessary, we can assume that v_k ⇀ v in L²(Ω). Since v_k ∈ C_ū^{1/k}, each v_k satisfies the sign conditions (3.7), and therefore v does as well. Then, (3.8) implies (3.18). On the other hand, using again that v_k ∈ C_ū^{1/k}, we get (3.19). Inequalities (3.18) and (3.19), along with the sign condition (3.7) satisfied by v, imply that v ∈ C_ū. Now, we observe that Theorem 2.2 and the compactness of the embedding H¹(Ω) ↪ L²(Ω) imply, along with the continuity and convexity of v ∈ L²(Ω) ↦ ‖v‖²_{L²(Ω)} and the expression (2.8), that F''(ū) : L²(Ω) → ℝ is a weakly lower semicontinuous quadratic functional. Then, from (3.17) we infer F''(ū)v² ≤ 0, which is only possible if v = 0 because of condition 1 of Theorem 3.8; therefore the quadratic form evaluated along {v_k} converges to ν > 0, which is a contradiction.
Finally, we prove the sufficient second-order optimality condition.
Proof. Let ε > 0 and u ∈ K ∩ B̄_ε(ū) be given, and set ρ = ‖u − ū‖_{L²(Ω)} and v = (u − ū)/ρ. A second-order Taylor expansion of F yields F(u) = F(ū) + ρF'(ū)v + (ρ²/2)F''(ū)v² + o(ρ²); it follows from the continuity of F'' that the last term is of order o(ρ²). Using the convexity of j and the first-order conditions, this gives a lower bound for J(u) − J(ū) in terms of F''(ū)v². In the case v ∈ C_ū^τ, Theorem 3.8 and Lemma 3.5 imply the quadratic growth estimate. This shows (3.20) for ε sufficiently small.

4. Finite element approximation of (P).
The goal of this section is to study the approximation of problem (P) by finite elements. Both the state and the controls are discretized. We prove the convergence of the discretization and derive some associated error estimates. To this end, we consider a family of triangulations {T_h}_{h>0} of Ω̄, defined in the standard way, e.g., as in [3, Chapter 3.3]. Since Ω has a smooth boundary, the triangulation covers a polygonal approximation Ω_h of Ω. With each element T ∈ T_h we associate two parameters ρ(T) and σ(T), where ρ(T) denotes the diameter of the set T and σ(T) is the diameter of the largest ball contained in T. The size of the mesh is defined by h = max_{T∈T_h} ρ(T). To simplify the presentation of the results, in what follows we suppose that Ω is convex. We also assume that the following regularity assumptions on the triangulation are satisfied, which are standard in the context of L^∞ error estimates.
(i) There exist two positive constants ρ and σ such that ρ(T)/σ(T) ≤ σ and h/ρ(T) ≤ ρ hold for every T ∈ T_h and every h > 0.

We will use piecewise linear approximations for the states; thus we set Y_h = {y_h ∈ C(Ω̄_h) : y_h|_T ∈ P_1 for every T ∈ T_h and y_h = 0 on ∂Ω_h}, where P_1 is the space of polynomials of degree less than or equal to 1.
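For a concrete triangle, the two mesh parameters ρ(T) and σ(T) can be computed directly. The helper below is a small illustration (not part of the paper's analysis), using the classical identity inradius = 2·Area/perimeter:

```python
import numpy as np

def triangle_parameters(p0, p1, p2):
    """Return (rho, sigma) for the triangle with vertices p0, p1, p2:
    rho(T) is the diameter (longest edge), sigma(T) = 2r is the diameter
    of the largest inscribed ball, with inradius r = 2*Area/perimeter."""
    P = [np.asarray(p, float) for p in (p0, p1, p2)]
    a = np.linalg.norm(P[1] - P[2])
    b = np.linalg.norm(P[0] - P[2])
    c = np.linalg.norm(P[0] - P[1])
    rho = max(a, b, c)
    # Twice the signed area via the 2D cross product, computed by hand.
    e1, e2 = P[1] - P[0], P[2] - P[0]
    area = 0.5 * abs(e1[0] * e2[1] - e1[1] * e2[0])
    sigma = 4.0 * area / (a + b + c)  # 2 * inradius
    return rho, sigma
```

The shape-regularity quotient ρ(T)/σ(T) of condition (i) is then a one-line computation per element.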
The discrete version of (2.1) is defined in (4.1). Thanks to the monotonicity of the nonlinear term of (4.1) and using Brouwer's fixed point theorem, it is easy to prove the existence and uniqueness of a solution y_h(u) of (4.1) for any u ∈ L²(Ω_h). Now, we define the space U_h of discrete controls that are constant on each triangle. Every element u_h ∈ U_h can be written in the form u_h = ∑_{T∈T_h} u_T χ_T, where χ_T is the characteristic function of T. The set of discrete feasible controls is given by K_h = U_h ∩ K. Finally, the discrete control problem (P_h) is formulated as the minimization of the discrete objective J_h over K_h. It is immediate that (P_h) has at least one solution, and we have first-order optimality conditions analogous to those of problem (P); see Theorem 3.1.
It is an easy exercise to check that λ̄_h can be written in the form

Inequality (4.2c) can be written in the form
which leads to the representation formula (4.4a). Using (4.3) and (4.4a) and arguing as in the proof of Corollary 3.2, we can prove the discrete counterparts of the relations in Corollary 3.2. The rest of the section is divided into two parts. In the first part, we prove that the family of problems (P_h) realizes a convergent approximation of problem (P) in a twofold sense: global solutions of (P_h) converge to global solutions of (P), and strict local solutions of (P) can be approximated by local solutions of (P_h). In the second part of the section, we prove some error estimates for these approximations.

Convergence of the discretizations.
Before proving the convergence of the solutions of (P_h) to solutions of (P), we need to establish some convergence properties of the finite element approximations of the state and adjoint state equations. The next result is well known; see [1], [5], and [16] and the references therein.
, and let y_u, y_h(v), φ_u, and φ_h(v) be the solutions of (2.1), (4.1) (with u replaced by v), (2.9), and (4.2b) (with ȳ_h replaced by y_h(v)), respectively. Then the a priori estimates (4.7) hold.

The above definition is equivalent to the weak (or weak*) convergence of any extension when h → 0; this is, in particular, the case if we extend u_h by an L^q(Ω) function independent of h. We also say that {u_h}_{h>0} is bounded in L^q(Ω) if there exists a bounded extension {ũ_h}_{h>0} ⊂ L^q(Ω), which is equivalent to the boundedness ‖u_h‖_{L^q(Ω_h)} ≤ C for all h > 0 and some C > 0. Now we have the first convergence theorem.

Theorem 4.4. For every h > 0 let ū_h be a global solution of problem (P_h). Then the sequence {ū_h}_{h>0} is bounded in L^∞(Ω), and there exist subsequences, denoted in the same way, converging to a point ū in the weak* L^∞(Ω) topology. Any of these limit points is a solution of problem (P). Moreover, we have

(4.8)  lim_{h→0} ‖ū − ū_h‖_{L^∞(Ω_h)} = 0 and lim_{h→0} ‖λ̄ − λ̄_h‖_{L^∞(Ω_h)} = 0,

where λ̄ ∈ ∂j(ū) is given by (3.5c) and λ̄_h ∈ ∂j_h(ū_h) is given by (4.4c).
Proof. The sequence {ū_h}_{h>0} is clearly bounded in L^∞(Ω). Let us assume that, for a subsequence denoted in the same way, ū_h ⇀ ū weakly* in L^∞(Ω) as h → 0. Let ũ be a solution of (P) and take ũ_h ∈ K_h defined on each T ∈ T_h as the mean value of ũ over T. Since ũ ∈ C^{0,1}(Ω̄) (see Corollary 3.2), we know that ‖ũ − ũ_h‖_{L^∞(Ω_h)} → 0. Then, using that ū ∈ K, ũ_h ∈ K_h, ū_h is a solution of (P_h), and ũ is a solution of (P), we get with the help of Lemma 4.2 that

J(ū) ≤ lim inf_{h→0} J_h(ū_h) ≤ lim sup_{h→0} J_h(ū_h) ≤ lim_{h→0} J_h(ũ_h) = J(ũ) ≤ J(ū).

These inequalities imply that ū is a solution of (P) and J_h(ū_h) → J(ū). On the other hand, from Lemma 4.2 we infer the convergence of the discrete states, where ȳ_h and ȳ are the states associated with ū_h and ū, respectively. Therefore the remaining terms of the objective functional converge as well. From this convergence and the weak convergence ū_h ⇀ ū (for an arbitrary extension of ū_h to Ω) in L²(Ω), we deduce that ū_h → ū strongly in L²(Ω). Now, Lemma 4.2 implies that ȳ_h → ȳ and φ̄_h → φ̄ in H¹(Ω) ∩ C(Ω̄). From formula (4.4c), for every T ∈ T_h we deduce the existence of x_T ∈ T such that λ̄_T = Proj_{[−1,+1]}(−φ̄_h(x_T)/μ), and therefore, for x ∈ T,

|λ̄(x) − λ̄_h(x)| ≤ |λ̄(x) − λ̄(x_T)| + |λ̄(x_T) − λ̄_T|,

and thus

‖λ̄ − λ̄_h‖_{L^∞(Ω_h)} ≤ (L_φ̄ h + ‖φ̄ − φ̄_h‖_{L^∞(Ω_h)})/μ → 0,

where L_φ̄ is the Lipschitz constant of φ̄. Finally, using (4.4a) and (3.5a), we can argue in a similar way to conclude that ‖ū − ū_h‖_{L^∞(Ω_h)} → 0.

The next theorem is a kind of converse to the previous one for local solutions. It is important from a practical point of view because it states that every strict local minimum of problem (P) can be approximated by local minima of problems (P_h).

Theorem 4.5. Let ū be a strict local minimum of (P); then there exists a sequence {ū_h}_{h>0} of local minima of problems (P_h) such that (4.8) holds.
Proof. Let ū be a strict local minimum of (P); then there exists ε > 0 such that ū is the unique solution of

(4.10)  min{J(u) : u ∈ K ∩ B̄_ε(ū)},

where B̄_ε(ū) is a ball in L^q(Ω) and all the elements of U_h are extended to Ω by taking u_h(x) = ū(x) for any x ∈ Ω \ Ω_h. We will distinguish two cases, q = 2 and q = ∞; recall the comments made at the beginning of section 3. Let us consider the discrete problems (P_{εh}) obtained by minimizing J_h over K_h ∩ B̄_ε(ū). For every h sufficiently small, the problem (P_{εh}) has at least one solution. Indeed, the only delicate point is to check that K_h ∩ B̄_ε(ū) is not empty. To this end, we define û_h ∈ K_h as the elementwise mean value of ū. Then, thanks to the Lipschitz regularity of ū, we have ‖ū − û_h‖_{L^∞(Ω)} → 0; therefore, û_h ∈ K_h ∩ B̄_ε(ū) for any h ≤ h_0 and some h_0 > 0 sufficiently small. Let ū_h be a solution of (P_{εh}) for h ≤ h_0. Then we can argue as in the proof of Theorem 4.4 to deduce that any subsequence of {ū_h}_{h≤h_0} converges strongly in L²(Ω) to a solution of (P_ε). Since this problem has a unique solution, we have ‖ū − ū_h‖_{L²(Ω)} → 0 for the whole sequence as h → 0. If q = 2, this implies that the constraint ū_h ∈ B̄_ε(ū) is not active for h small, and hence ū_h is a local solution of (P_h) and (4.2) is fulfilled. Therefore, we proceed as in the proof of Theorem 4.4 to deduce (4.8).

Then (4.11) is equivalent to
which leads to the representation formula analogous to (4.4a). Also, we have a bound in terms of L_ū h, where L_ū is the Lipschitz constant of ū; an analogous inequality is valid for β_ε(x) − β_{εT}. Finally, we conclude that ū_h is a local minimum of (P_h).

Error estimates.
In this section, {ū_h}_{h>0} denotes a sequence of local minima of problems (P_h) such that ‖ū − ū_h‖_{L^∞(Ω_h)} → 0 as h → 0, ū being a local minimum of (P); see Theorems 4.4 and 4.5. The goal of this section is to obtain estimates of ū − ū_h in the L² and L^∞ norms. As in the proof of Theorem 4.5, we extend every u_h ∈ U_h to Ω by taking u_h(x) = ū(x) for x ∈ Ω \ Ω_h. Analogously, we extend λ̄_h to Ω by setting λ̄_h(x) = λ̄(x) for x ∈ Ω \ Ω_h. Now, we recall that Corollary 3.2 implies that ū, λ̄ ∈ C^{0,1}(Ω̄), where λ̄ ∈ ∂j(ū) and (ū, λ̄) satisfies (3.4) together with the state ȳ and the adjoint state φ̄ associated with ū.
To derive the error estimates, we begin by invoking the first-order optimality conditions (3.4c) and (4.2c). Taking u = ū_h in (3.4c), we get (4.12). Now, for any u_h ∈ K_h, we deduce (4.13) from (4.2c). Adding (4.12) and (4.13), we deduce the inequality (4.14), valid for any u_h ∈ K_h. This inequality is crucial in the proof of the error estimates. To deal with the left-hand side of (4.14), we need ū to satisfy the sufficient second-order condition F''(ū)v² > 0 for every v ∈ C_ū \ {0} or, equivalently (see Theorem 3.8),

(4.15)  ∃δ > 0 and ∃τ > 0 such that F''(ū)v² ≥ δ‖v‖²_{L²(Ω)} for every v ∈ C_ū^τ.

Lemma 4.6. Let us assume that (4.15) holds. Then there exists h_δ > 0 such that (4.16) holds for every h ≤ h_δ.

Proof. Using the mean value theorem, we express the left-hand side of (4.14) through F'' evaluated at an intermediate point. On the other hand, since F is of class C² in L²(Ω), there exists ε > 0 such that F''(u) is close to F''(ū) whenever ‖u − ū‖_{L²(Ω)} < ε. From the convergence ‖ū − ū_h‖_{L^∞(Ω_h)} → 0, we deduce the existence of h_ε > 0 such that ‖ū − ū_h‖_{L²(Ω)} < ε for h ≤ h_ε. Then the last two relations lead to a lower bound of the left-hand side of (4.14). If we prove that ū_h − ū ∈ C_ū^τ for every h small enough, then (4.16) follows from (4.15) and the previous inequality. Therefore, the rest of the proof is devoted to showing that ū_h − ū ∈ C_ū^τ for every h sufficiently small. Let us define v_h = (ū_h − ū)/‖ū_h − ū‖_{L²(Ω)}; then there exist an element v ∈ L²(Ω) and a sequence h_k → 0 such that v_{h_k} ⇀ v in L²(Ω). It is obvious that each v_h satisfies (3.7), and thus v does as well. On the other hand, (4.8) and Lemma 4.2 imply that ‖d̄ − d̄_h‖_{L^∞(Ω_h)} → 0, where d̄ and d̄_h are defined by (3.10) and (4.5), respectively. From (3.11), we know that ū(x_0) = α whenever d̄(x_0) > 0. Moreover, there exist ρ > 0 and h_ρ > 0 such that d̄_h(x) > 0 for almost all x ∈ Ω satisfying |x − x_0| < ρ and h ≤ h_ρ < ρ. Then, (4.6) implies that ū_h(x_0) = α too; hence v_h(x_0) = 0 for h ≤ h_ρ and almost all x_0 satisfying d̄(x_0) > 0. Analogously, we can prove that v_h(x_0) = 0 for h small enough if d̄(x_0) < 0.
We have that v_{h_k}(x) → 0 pointwise for almost every x ∈ Ω⁰; consequently, v = 0 in Ω⁰ (see [12, p. 207]), and therefore (4.17) holds. Now, we study the limit of j'(ū; v_{h_k}). First we observe that the integrals over Ω_ū^+ and Ω_ū^− pass to the limit by the weak convergence. The limit of the integral over Ω_ū^0 is more complicated. First, we observe that (3.5b) implies that |φ̄(x)| ≤ μ on Ω_ū^0. If |φ̄(x_0)| < μ, then, arguing as above, we have that v_{h_k}(x_0) = 0 for h small. Moreover, v_{h_k}(x) ≤ 0 whenever φ̄(x) = +μ and h is small, and hence v(x) ≤ 0 in the same set. Analogously, we obtain that v_{h_k}(x) ≥ 0 whenever φ̄(x) = −μ and h is small, and consequently v(x) ≥ 0. These results lead to (4.19). From (4.18), (4.19), and the fact that λ̄(x) = +1 (respectively, −1) for x ∈ Ω_μ^− (respectively, Ω_μ^+), we deduce (4.20). From identities (4.17) and (4.20) we obtain an equality that holds for any weakly convergent subsequence of {v_h}_{h>0}; therefore it holds for the whole family. Consequently, given τ > 0 satisfying (4.15), there exists h_τ > 0 such that ū_h − ū ∈ C_ū^τ for every h ≤ h_τ, which concludes the proof. Thus, inequality (4.16) holds for any h small enough.

Combining (4.14) and (4.16), and assuming that ū satisfies the second-order sufficient condition (4.15), we get (4.21) for any u_h ∈ K_h. The rest of the section is devoted to estimating the right-hand side of the above inequality. To deal with the first two terms we give the following lemma.
Proof. If we denote by ϕ_h(v) and ϕ_u the discrete and continuous adjoint states associated with v and u, respectively, we obtain an intermediate inequality. Now, it is enough to use (4.7b) to deduce (4.22) from that inequality.
Using (4.22) and Young's inequality, we can estimate the first two terms in (4.21). From this estimate and (4.21) we infer (4.23) for any u_h ∈ K_h. Let us introduce a convenient element ũ_h ∈ K_h which approximates ū; we define ũ_h elementwise from ū.

Lemma 4.8. The following approximation properties hold for ũ_h. The proof of this lemma follows the steps of [6, Lemma 4.8]. Indeed, λ̄ does not play any role in the proof; the only thing to take into account is that ū and d̄ are Lipschitz functions. Inserting the control ũ_h into (4.23), we get (4.24). Let us estimate λ̄ − λ̄_h. By the estimates (4.9) and (4.7c) we infer ‖λ̄ − λ̄_h‖_{L^∞(Ω_h)} ≤ C(h + ‖ū − ū_h‖_{L²(Ω_h)}). Inserting this estimate into (4.24) and using Young's inequality once again, we get (4.25). Finally, we estimate the last term.

Lemma 4.9. The inequality (4.26) holds.

Proof. By (3.1), (4.3), and the property ū_h = ū in Ω \ Ω_h, we get the assertion. With (4.25) and (4.26) we have proved the following theorem.

Theorem 4.10. Let ū be a local minimum of problem (P) and let {ū_h}_{h>0} be a sequence of local minima of problems (P_h) such that ‖ū − ū_h‖_{L^∞(Ω_h)} → 0. Let us assume that (4.15) holds. Then there exists a constant C > 0, independent of h, such that ‖ū − ū_h‖_{L²(Ω_h)} ≤ Ch.

Finally, combining this theorem with (4.7c) and (4.9) and the representation formulas for ū and ū_h, we get the following result.
Corollary 4.11. Under the assumptions of Theorem 4.10, we have the analogous estimates for the states, adjoint states, subgradients, and controls, with a constant C > 0 independent of h.

5. A variational discretization of (P). In this section we consider a partial discretization of (P). As in section 4, we consider a triangulation of Ω under the same hypotheses; Y_h is defined in the same way, leading to the same discrete state equation (4.1). However, we do not discretize the controls, and we set U_h = L^∞(Ω). Rather, the controls are implicitly discretized by the representation formula; see (5.1). This idea was introduced by Hinze [13] and is called variational discretization of the control problem. The discretization is numerically implementable (although the discretized problem remains an infinite dimensional optimization problem) because the optimal control ū_h is a projection of the adjoint state, which is piecewise linear, and this structure is inherited by ū_h. This incomplete discretization leads to an error estimate for ū − ū_h of order h² in the L²(Ω_h) norm, as we prove in this section.
The problem (Q_h) is defined by minimizing J_h over K. The proof of the existence of a solution ū_h of (Q_h) is the same as for problem (P). The optimality conditions satisfied by a local minimum of (Q_h) are given by Theorem 4.1 with K_h replaced by K. This change leads to the same representation formulas as proved in Corollary 3.2, i.e.,

(5.1a)  ū_h(x) = Proj_{[α,β]}( −(1/ν)(φ̄_h(x) + μλ̄_h(x)) ),
(5.1b)  ū_h(x) = 0 ⇔ |φ̄_h(x)| ≤ μ,
(5.1c)  λ̄_h(x) = Proj_{[−1,+1]}( −(1/μ)φ̄_h(x) ).

These expressions are valid for every x ∈ Ω_h. Theorem 4.4 is valid for the problem (Q_h). Let us mention the only two changes in the proof. First, given a solution ũ of (P), we do not need to introduce ũ_h as we did in the proof; we just take ũ_h = ũ because now K_h = K. On the other hand, using (3.5c) and (5.1c) together with the Lipschitz continuity of the projection, we have that ‖λ̄ − λ̄_h‖_{L^∞(Ω_h)} ≤ (1/μ)‖φ̄ − φ̄_h‖_{L^∞(Ω_h)}. With these changes the proof follows the same steps. Theorem 4.5 is also valid. In fact, its proof is easier for the new problem (Q_h) thanks to the properties (5.1). For instance, in the definition of (P_ε) we have to replace K_h by K; then it is obvious that ū is a feasible control of (P_ε). We consider the functions α_ε and β_ε, and from (4.11) we obtain a projection formula for ū_h. Moreover, from the definition of α_ε and β_ε we also obtain a bound on ū − ū_h. From the last two inequalities we deduce the required feasibility. With these observations the proof of Theorem 4.5 is immediate. Finally, we have the following error estimates.

Theorem 5.1. Let ū be a local minimum of problem (P) and let {ū_h}_{h>0} be a sequence of local minima of problems (Q_h) such that ‖ū − ū_h‖_{L^∞(Ω_h)} → 0. Let us assume that (4.15) holds. Then there exists a constant C > 0, independent of h, such that the estimates (5.2) hold.

Proof. Arguing as in the previous section, we see that inequality (4.14) is valid for any u_h ∈ K. Then we select u_h = ū, and (4.14) simplifies accordingly. Obviously, Lemmas 4.6, 4.7, and 4.9 are still valid; applied to the previous inequality, they yield the estimate ‖ū − ū_h‖_{L²(Ω_h)} ≤ Ch². This estimate combined with (4.7a) and (4.7b) proves (5.2c).
Now (5.2d) is an immediate consequence of (4.7c) and the estimate obtained for the controls. The L^∞(Ω_h) estimate for the controls follows from (5.2d) and the representation formulas (3.5a) and (5.1a). Finally, the estimates for λ̄ − λ̄_h are consequences of the estimates (5.2c) and (5.2d) and the representation formulas (3.5c) and (5.1c).

Table 6.1. L² and L^∞ errors in the control on the unit circle and the unit square in the case of full discretization. The error was computed against the solution on the finest grid, using h^* = 2^{−8} on the unit circle and h^* = 2^{−9} on the unit square.
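The convergence rates reported in Tables 6.1 and 6.2 can be checked from pairs (h, error) by computing experimental orders of convergence. A small helper (not from the paper) for this standard computation:

```python
import numpy as np

def eoc(hs, errs):
    """Experimental orders of convergence between consecutive levels:
    log(e_i / e_{i+1}) / log(h_i / h_{i+1})."""
    hs, errs = np.asarray(hs, float), np.asarray(errs, float)
    return np.log(errs[:-1] / errs[1:]) / np.log(hs[:-1] / hs[1:])
```

For an error behaving like C h², the computed orders cluster around 2, as observed for the variational discretization.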

The optimality conditions for (Q_h) are as follows: (6.2), where we use the relationship between u_h and ϕ_h due to (5.1). The nonlinear system (6.2) is solved via a semismooth Newton method on a sequence of different meshes. The error in ϕ with respect to the solution on the finest grid (h^* = 2^{−8}) for the unit circle is shown in Table 6.2. It confirms the quadratic rate of convergence with respect to h. By the Lipschitz continuity of the projection (6.3) with Lipschitz constant 1/ν, the same convergence order holds for the control ū_h. Since the computed controls ū_h have kinks inside the triangles of the mesh, and since the meshes are not nested, computing the actual error in u_h would be rather complicated. Table 6.2 also shows the results for the unit square with a finest grid size h^* = 2^{−9}.

Table 6.2. L² and L^∞ errors in the adjoint state on the unit circle and the unit square in the case of variational discretization.
