Second Order Optimality Conditions and Their Role in PDE Control

If f : R^n → R is twice continuously differentiable, f′(u) = 0 and f″(u) is positive definite, then u is a local minimizer of f. This paper surveys the extension of this well known second order sufficient optimality condition to the case f : U → R, where U is an infinite-dimensional linear normed space. The reader will be guided from the case of finite dimensions via a brief discussion of the calculus of variations and the optimal control of ordinary differential equations to the control of nonlinear partial differential equations, where U is a function space. In particular, the following questions will be addressed: Is the extension to infinite dimensions straightforward or will unexpected difficulties occur? How do second order sufficient optimality conditions have to be modified, if simple inequality constraints are imposed on u? Why do we need second order conditions and how can they be applied? If they are important, are we able to check if they are fulfilled? It turns out that infinite dimensions cause new difficulties that do not occur in finite dimensions. We will be faced with the surprising fact that the space where f″(u) exists can be useless for ensuring positive definiteness of the quadratic form v ↦ f″(u)v². In this context, the famous two-norm discrepancy, its consequences, and techniques for overcoming this difficulty are explained.
To keep the presentation simple, the theory is developed for problems in function spaces with simple box constraints of the form α ≤ u ≤ β. The theory of second order conditions in the control of partial differential equations is presented exemplarily for the nonlinear heat equation. Different types of critical cones are introduced, where the positivity of f″(u) must be required. Their form depends on whether a so-called Tikhonov regularization term is part of the functional f or not. In this context, the paper also contains new results that lead to quadratic growth conditions in the strong sense. As a first application of second order sufficient conditions, the stability of optimal solutions with respect to perturbations of the data of the control problem is discussed. Second, their use in analyzing the discretization of control problems by finite elements is studied. A survey on further related topics, open questions, and relevant literature concludes the paper.


Introduction
Any reader certainly knows the following standard facts about extremal problems posed in the vector space R^n. If a differentiable function f : R^n → R attains a local minimum at a vector ū, then the first order necessary optimality condition f′(ū) = 0 must be fulfilled. If f is twice continuously differentiable in a neighborhood of ū, then the Hessian matrix f″(ū) has to be positive semidefinite.
Conversely, if ū ∈ R^n satisfies the condition f′(ū) = 0 and the matrix f″(ū) is positive definite, then f attains a local minimum at ū. This is a second order sufficient optimality condition. In exercises, a standard way of solving extremal problems consists of two steps: First, a stationary solution ū is determined from f′(u) = 0, and then the second order conditions are used to decide whether this is a local minimum, a local maximum, or neither.
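For a concrete function, these two steps can also be carried out numerically. The following sketch (our own illustration; the function name `classify` is ours) classifies a stationary point by the eigenvalues of the Hessian:

```python
import numpy as np

def classify(hessian):
    """Classify a stationary point by the eigenvalues of the Hessian matrix."""
    eig = np.linalg.eigvalsh(hessian)
    if np.all(eig > 0):
        return "local minimum"
    if np.all(eig < 0):
        return "local maximum"
    return "neither (indefinite or semidefinite)"

# f(u) = u1^2 + 2 u2^2 has the stationary point u = 0 with f''(0) = diag(2, 4):
print(classify(np.diag([2.0, 4.0])))    # → local minimum
# g(u) = u1^2 - u2^2 has a saddle at u = 0 with g''(0) = diag(2, -2):
print(classify(np.diag([2.0, -2.0])))   # → neither (indefinite or semidefinite)
```

Of course, this presumes that the stationary point and the Hessian are available; as discussed later in the paper, exactly this is the problem in infinite dimensions.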
However, second order conditions are more important than that; they are not only useful for verifying local minima. In some cases, they cannot even be verified. We will highlight their importance in this survey. Our aim is to survey some main principles and applications of second order optimality conditions for certain optimization problems in infinite-dimensional spaces, with special emphasis on the optimal control of partial differential equations (PDEs). The second order analysis of control problems for PDEs was developed in the past 25 years, and we will shed some light on parts of this rapid development.
It will turn out that the theory of second order conditions for PDE optimal control problems is richer than that for extremal problems in finite-dimensional spaces.

Second Order Conditions in Calculus of Variations and Optimal Control
Calculus of Variations It is an interesting fact that the history of extremal problems started with problems in function spaces, hence in infinite-dimensional spaces. It was the famous Brachistochrone problem, posed in 1696 by J. Bernoulli, that initiated the theory of extremal problems. The simplest problem with fixed end points can be written as

min J(x) = ∫_a^b L(t, x(t), x′(t)) dt,  x(a) = x_a, x(b) = x_b,  (2.1)

where L : [a, b] × R² → R and x_a, x_b ∈ R are given. Let us assume for simplicity that the function (t, x, u) ↦ L(t, x, u) is of class C² and that the unknown function x is continuously differentiable, so that a discussion of corners can be avoided and the integral above exists. This assumption is not met by the Brachistochrone problem, but we adopt this simplification for convenience.
A function x̄ ∈ C¹[a, b] is said to be a weak local solution of the variational problem (2.1), if J(x̄) ≤ J(x) holds for all x ∈ C¹[a, b] out of a C¹[a, b]-ball around x̄ that also satisfy the boundary conditions in (2.1). The Euler differential equation is known to be the first order necessary condition for a weak local solution x̄. There must hold

d/dt ∂L/∂u (t, x̄(t), x̄′(t)) = ∂L/∂x (t, x̄(t), x̄′(t)),  t ∈ [a, b].

Any solution to this equation is called an extremal. To make sure that an extremal is a weak local solution of the variational problem, a second order sufficient condition is used. The second order Fréchet derivative of J exists for all x̄ ∈ C¹[a, b] and is given by

J″(x̄)v² = ∫_a^b [ ∂²L/∂x² (t, x̄(t), x̄′(t)) v(t)² + 2 ∂²L/∂x∂u (t, x̄(t), x̄′(t)) v(t)v′(t) + ∂²L/∂u² (t, x̄(t), x̄′(t)) v′(t)² ] dt.
The existence of δ > 0 such that

J″(x̄)v² ≥ δ ‖v‖²_{H¹(a,b)}  for all v ∈ C¹[a, b] with v(a) = v(b) = 0  (2.2)

is sufficient for an extremal x̄ to be a weak local solution of (2.1) (cf. [54], Sect. 7.4, proof of Theorem 2; notice that by continuity (2.2) holds also for all v ∈ H¹₀(a, b)). Comparing this condition with (4.14) below, the reader will confirm that this coercivity condition appears in an adapted form also for control problems with PDEs.
Remark 2.1 Although (2.2) is a second order sufficient optimality condition for an extremal, a question remains. How can one verify analytically that a given quadratic form J″(x̄) satisfies (2.2)? The strong Legendre condition

∂²L/∂u² (t, x̄(t), x̄′(t)) > 0  ∀t ∈ [a, b],

along with the strong Jacobi condition is sufficient for (2.2), hence also sufficient for weak local optimality of an extremal x̄, cf. [54], Sect. 6, Theorem 6 and Sect. 7.4, Theorem 2. The strong Jacobi condition requires that the solution of the so-called Jacobi differential equation does not have a zero in the interval (a, b]. For the definition of the Jacobi equation, we refer again to [54], Sect. 6.
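For the reader's orientation, the classical form of the Jacobi (accessory) equation can be sketched as follows; this is the standard textbook definition, in the notation of (2.1), and the abbreviations P, Q, R are ours:

```latex
% Abbreviate the second derivatives of L along the extremal \bar x
% (u plays the role of x'):
%   P(t) = \partial^2 L/\partial u^2  (t,\bar x(t),\bar x'(t)),
%   Q(t) = \partial^2 L/\partial x \partial u (t,\bar x(t),\bar x'(t)),
%   R(t) = \partial^2 L/\partial x^2  (t,\bar x(t),\bar x'(t)).
% The Jacobi equation is the Euler equation of the quadratic form J''(\bar x)v^2:
\frac{d}{dt}\bigl(P(t)\,h'(t) + Q(t)\,h(t)\bigr)
  - \bigl(Q(t)\,h'(t) + R(t)\,h(t)\bigr) = 0,
\qquad h(a) = 0, \; h'(a) = 1.
```

The strong Jacobi condition mentioned above then requires that this normalized solution h has no zero in (a, b].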
These remarks on the calculus of variations reveal that the theory of second order sufficient conditions in infinite-dimensional spaces is more challenging and rich than in finite dimensions.
Optimal Control of Ordinary Differential Equations Considering the derivative x′ as a control function u, the problem of the calculus of variations (2.1) can be re-written as a simple optimal control problem with unconstrained control function u, where x and u are coupled by the initial value problem x′(t) = u(t), t ∈ (a, b), x(a) = x_a, and the terminal condition x(b) = x_b is given. Let us now introduce the more general nonlinear ordinary differential equation (ODE) x′ = g(t, x, u) and skip for simplicity the terminal condition. Moreover, we add bound constraints α ≤ u(t) ≤ β on the control function u. Then the following simple optimal control problem for an ODE is obtained in the fixed interval of time [a, b]:

min J(x, u) = ∫_a^b L(t, x(t), u(t)) dt subject to x′(t) = g(t, x(t), u(t)), x(a) = x_a, α ≤ u(t) ≤ β.  (2.3)

Here, u is taken from L∞(a, b) and x is obtained in the Sobolev space W^{1,∞}(a, b). The Pontryagin maximum principle [74] is the associated fundamental first order necessary optimality condition. In general, it is not sufficient for local optimality, and second order conditions can be invoked to ensure local optimality. Let us briefly sketch this concept for the case of unconstrained controls, i.e. we assume for simplicity α = −∞, β = ∞. We introduce the Lagrangian function

𝓛(x, u, ϕ) = ∫_a^b L(t, x(t), u(t)) dt + ∫_a^b ϕ(t) (g(t, x(t), u(t)) − x′(t)) dt,

where ϕ ∈ W^{1,∞}(a, b) is the Lagrange multiplier for the ODE in (2.3). It is called the adjoint state associated with (x, u) and is defined as the solution of the adjoint equation

−ϕ′(t) = ∂L/∂x (t, x(t), u(t)) + ϕ(t) ∂g/∂x (t, x(t), u(t)),  ϕ(b) = 0.

Let ū be a control that, along with the state x̄ and the associated adjoint state φ̄, satisfies the Pontryagin principle. For the second order sufficient condition, we assume the existence of some δ > 0 such that

∂²𝓛/∂(x, u)² (x̄, ū, φ̄)(x, u)² ≥ δ ‖u‖²_{L²(a,b)}  (2.4)

for all pairs (x, u) ∈ W^{1,2}(a, b) × L²(a, b) that satisfy the linearized equation

x′(t) = ∂g/∂x (t, x̄(t), ū(t)) x(t) + ∂g/∂u (t, x̄(t), ū(t)) u(t),  x(a) = 0.  (2.5)

Then ū is a locally optimal control, where 'local' is defined with respect to the norm of L∞(a, b). This is a second order sufficient optimality condition for the case of unconstrained controls. Again, a question remains: How can one verify the condition (2.4), (2.5)?
The so-called strong Legendre-Clebsch condition, along with the existence of a bounded solution of a certain Riccati equation, is sufficient for (2.4), (2.5) to hold, cf. Maurer [65]. The solvability of the Riccati equation is the analogue of the Jacobi condition of the calculus of variations. We presented the second order condition for the case of unconstrained controls. If the box constraints α ≤ u ≤ β are given, then the condition is more complicated. This extension is discussed in [36].
We do not further detail the theory of second order sufficient optimality conditions for the optimal control of ODEs here and refer the reader to the textbook [54] and to the recent book [72] along with the extensive list of references therein. In [72], special emphasis is laid on problems, where optimal controls are of bang-bang type.
However, we explicitly mention the fundamental paper [53], where the so-called two-norm discrepancy, which plays an essential role in our theory for PDEs too, was first discussed. The reader will have observed that the coercivity condition (2.4) is formulated for the L²-norm, but local optimality is only ensured in the L∞-norm. This is characteristic of the two-norm discrepancy.
Moreover, we quote [66] on first and second order sufficient optimality conditions for optimization problems in infinite-dimensional spaces. Readers who are interested in the abstract theory of second order sufficient optimality conditions in infinite-dimensional spaces are also referred to the detailed discussion in [6].
Optimal Control of PDEs The optimal control theory of PDEs was very much stimulated by the celebrated monograph by J.L. Lions [60]. In his book, optimal control problems for linear PDEs of elliptic, parabolic, and hyperbolic type with convex quadratic objective functional are discussed. Thanks to the linearity of the PDE, these problems are convex. Therefore, the first order necessary optimality conditions are sufficient for optimality and second order sufficient conditions are not needed. This explains why the theory of second order conditions came up two decades later, when optimal control problems were discussed extensively for semilinear PDEs. A first list of characteristic papers marking exemplarily the development of this topic is [16, 27, 31–33]. All the difficulties that are known from the optimal control of ODEs occur also here, often in a stronger form. In particular, the two-norm discrepancy is an important obstacle.
In our survey, we will give a short course on second order conditions that finally concentrates on the control of PDEs. We will also sketch nonlinear optimization problems in finite- and infinite-dimensional spaces to have a comparison of the various difficulties.

On the General Role of Second Order Optimality Conditions
The reader will have learned in calculus that, looking for local extrema of f, one has to start with finding a stationary solution ū that satisfies the first order necessary optimality condition f′(ū) = 0. Next, in the case of minimizing f, it has to be checked whether f″(ū) is positive definite, i.e. whether a second order sufficient optimality condition is fulfilled. In exercises, these two steps had to be done analytically.
Can we use the same approach for problems in infinite dimensional spaces, say in calculus of variations or optimal control? The simple answer is yes, provided that two conditions are satisfied: The stationary solution ū must be given analytically, i.e. exactly, and also the definiteness of f″(ū) must be verified analytically. There are many nice examples in calculus of variations or optimal control of ODEs, where these conditions are fulfilled. Also in the control of PDEs, a few academically constructed examples are known that obey a second order sufficient condition, cf. e.g. [55,70,83].
However, for many interesting examples of control theory, in particular for applied problems, the solutions cannot be determined analytically. They have to be found by numerical methods and hence are only given approximately. Then it must be verified that a stationary solution satisfying the second order sufficient condition exists in the neighborhood of the approximate one. This is extremely difficult, but can be done in exceptional cases; we refer to [80].
Even if a stationary solution ū is given analytically, the second order sufficient condition must again be checked analytically. Only in simple cases of infinite dimension can one decide about the definiteness of a quadratic form v ↦ f″(ū)v² by numerical methods, cf. [80].
For the same reason, the verification of the coercivity of quadratic forms by analytical tools like in the calculus of variations and in the optimal control of ODEs (Jacobi condition, solvability of Riccati equations) is not realistic. It is certainly difficult to decide about the solvability of a Riccati equation by a numerical method. Now the question arises: why do we need second order sufficient optimality conditions, if they can be verified only for simple, more or less academic problems?
The answer is that second order conditions develop their power mainly as theoretical assumption. They ensure the stability of optimal solutions with respect to perturbations of the problems such as finite element discretizations. Also a priori error estimates for the numerical approximation of optimal control problems for nonlinear differential equations are developed by assuming second order sufficient optimality conditions. If they are satisfied, the local minimizer is the only one in a neighborhood, i.e. it cannot be the accumulation point of local minima. Second order conditions are also the standard assumption to guarantee the convergence of numerical optimization methods.
In some sense, second order conditions play a similar role as regularity qualifications that are needed to prove the existence of Lagrange multipliers: In general, they cannot be verified in advance, because the optimal solution, for which they should be assumed, is unknown. However, if they were not fulfilled, the numerical solution of the problem might cause trouble. Moreover, if there is a class of differentiable optimal control problems, where one is unable to work out a theory of second order sufficient optimality conditions, then this class might be ill-posed. In view of this, second order conditions are of paramount importance for the optimal control theory.

The Case of Finite Dimensions
Let us return to the minimization of a function f : R n → R that we briefly sketched in the introduction. The situation becomes more complicated, if f is minimized subject to finitely many constraints, say finitely many equations or inequalities that are imposed on the vector u.
If only equality constraints are given, a local optimizer must obey the well-known Lagrange multiplier rule of calculus as first order necessary optimality condition, provided that a certain regularity condition is satisfied. If also inequality constraints are imposed on u, then the Karush-Kuhn-Tucker or Fritz-John type theorems of differentiable nonlinear optimization are known as first order necessary conditions.
We will not address the associated second order conditions of the Karush-Kuhn-Tucker theory and refer the reader to the textbooks [40,61,71]. However, we will sketch some aspects of second order conditions for finite-dimensional optimization problems with simple box constraints, i.e. with upper and lower bounds on the vector u. We embed this analysis in an associated one for infinite-dimensional problems.

Is the Extension of SSC to Infinite Dimensions Straightforward?
Let (U, ‖·‖_U) be a real Banach space and J : U → R be a real-valued function. We start our tour through second order conditions with the extremal problem

min_{u∈U} J(u).  (3.1)

A point ū ∈ U is called a local solution of (3.1), if there exists ε > 0 such that J(ū) ≤ J(u) holds for all u ∈ U with ‖u − ū‖_U < ε. If even

J(ū) < J(u)  ∀u ≠ ū with ‖u − ū‖_U < ε  (3.2)

is satisfied, then ū is said to be a strict local solution.
The following basic result is well-known: If ū ∈ U is a local solution of (3.1) and J is Gâteaux differentiable at ū, then the first order necessary optimality condition

J′(ū) = 0  (3.3)

must be satisfied. This result holds no matter whether the dimension of U is finite or infinite. If J is twice continuously Fréchet differentiable in a neighborhood of ū (i.e. J is of class C² around ū), then the second order necessary optimality condition

J″(ū)v² ≥ 0  ∀v ∈ U  (3.4)

must hold in addition to (3.3). Here, we denote by J″(ū)v² the quadratic form J″(ū)(v, v). The proofs of (3.3) and (3.4) are identical with those that are known for the case of a finite-dimensional space U, say U = R^n. This little discussion shows that the first and second order necessary optimality conditions of unconstrained optimization can be extended without change to the case of infinite dimensions, provided that J has the needed differentiability properties. As we shall show next, the situation changes with second order sufficient optimality conditions (SSC). We first discuss the situation in finite dimensions and continue with the case of infinite dimensions.
Finite-Dimensional Space U Here, the well-known second order sufficient optimality condition is as follows: If ū ∈ U satisfies the first order necessary condition (3.3) and

J″(ū)v² > 0  ∀v ∈ U \ {0},  (3.5)

then ū is a strict local solution of (3.1). It is important that the condition (3.5) is equivalent to the existence of some δ > 0 such that

J″(ū)v² ≥ δ‖v‖²  ∀v ∈ U.  (3.6)

The constant δ is the minimum eigenvalue of the Hessian matrix J″(ū).
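The equivalence of positivity and coercivity in R^n can be illustrated numerically. This sketch (our own; the matrix is an arbitrary illustrative choice) computes δ as the smallest eigenvalue of a symmetric positive definite "Hessian" and checks the coercivity estimate on random directions:

```python
import numpy as np

rng = np.random.default_rng(0)

# A symmetric positive definite "Hessian" (illustrative choice):
A = rng.standard_normal((4, 4))
H = A @ A.T + 4.0 * np.eye(4)

delta = np.linalg.eigvalsh(H).min()   # the optimal coercivity constant
assert delta > 0                      # (3.5): positive definiteness

# (3.6): v^T H v >= delta * |v|^2, verified here on random samples
for _ in range(1000):
    v = rng.standard_normal(4)
    assert v @ H @ v >= delta * (v @ v) - 1e-9

print("coercivity constant delta =", float(delta))
```

In infinite dimensions the spectrum of a positive quadratic form may accumulate at zero, which is exactly why this equivalence breaks down, as the next example shows.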
Infinite-Dimensional Space U Now we allow U to be infinite-dimensional. Then (3.5) and (3.6) are not necessarily equivalent for continuous quadratic forms. To see this, we consider the following example.
Example 3.2 We fix the space U := L∞(0, 1) and define the quadratic form Q : L∞(0, 1) → R by

Q(v) = ∫₀¹ x v(x)² dx.

Obviously, Q satisfies the condition (3.5), but it is easy to confirm that there cannot exist δ > 0 such that

Q(v) ≥ δ ‖v‖²_{L∞(0,1)}  ∀v ∈ L∞(0, 1).

To see this, assume that such δ > 0 exists. To construct a contradiction, we select v = χ_[0,ε] and pass to the limit ε ↓ 0. Then the left-hand side of the inequality above tends to zero while the right-hand side is equal to δ; a contradiction.
Comparing this result with the finite-dimensional case, the following question naturally appears: Is the positivity condition (3.5) sufficient for local optimality in infinite dimensions? In other words, does (3.5), together with the first order necessary condition (3.3) imply local optimality ofū in any infinite-dimensional space? Do we need instead the condition (3.6) for this purpose or do we have to impose another condition for local optimality? Our next example, taken from [30], shows that (3.5) is not sufficient for optimality.
However, the following theorem has been known for a long time.

Theorem 3.4 Assume that J : U → R is of class C² in a neighborhood of ū, that the first order condition (3.3) holds, and that there exists δ > 0 satisfying the coercivity condition (3.6). Then ū is a strict local solution of (3.1).

The proof of this theorem, given by Cartan [10], is a straightforward extension of the standard one known for finite-dimensional spaces U. Theorem 3.4 might create the impression that the theory of second order conditions is fairly analogous to that in finite dimensions. This expectation is wrong, because the celebrated two-norm discrepancy occurs in many problems of interest. It names the difficulty that the coercivity condition (3.6) does not hold in the spaces where the functional J is twice differentiable. The following example shows that this obstacle already appears in simple unconstrained extremal problems of the form (3.1).
Example 3.5 (Two-norm discrepancy for an unconstrained extremal problem) Consider the extremal problem

min_{u∈L²(0,1)} J(u) := ∫₀¹ sin u(t) dt,  (3.8)

where ū(t) ≡ −π/2 is a global solution. Easy but formal computations lead to

J″(ū)v² = −∫₀¹ sin(ū(t)) v(t)² dt = ‖v‖²_{L²(0,1)}.

Comparing this situation with (3.6), the reader might expect that ū is a strict local minimum of (3.8). Unfortunately, this is not true. For every 0 < ε < 1, the functions

u_ε(t) = 3π/2 if t ∈ (0, ε),  u_ε(t) = −π/2 if t ∈ [ε, 1),

are also global solutions of (3.8), with J(ū) = J(u_ε) and ‖ū − u_ε‖_{L²(0,1)} = 2π√ε. We observe the surprising fact that infinitely many different global solutions of (3.8) are contained in any L²-neighborhood of ū, hence ū cannot be a strict solution.
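The mechanism can be verified by elementary computation. For a functional of this type, e.g. J(u) = ∫₀¹ sin u(t) dt with ū ≡ −π/2 (our reading of the example), the perturbed controls are piecewise constant, so the integral is exact:

```python
import math

def J(value_on_E, eps):
    """J(u) = ∫_0^1 sin u(t) dt for a control equal to value_on_E on a set of
    measure eps and equal to -pi/2 elsewhere (piecewise constant, exact integral)."""
    return eps * math.sin(value_on_E) + (1 - eps) * math.sin(-math.pi / 2)

J_bar = math.sin(-math.pi / 2)               # J(ū) = -1
for eps in [0.5, 0.1, 0.01]:
    J_eps = J(3 * math.pi / 2, eps)          # u_eps jumps to 3π/2 on measure eps
    dist_L2 = 2 * math.pi * math.sqrt(eps)   # ||ū - u_eps||_{L²(0,1)}
    assert abs(J_eps - J_bar) < 1e-12        # same global objective value
    print(f"eps={eps}: ||ū - u_eps||_L2 = {dist_L2:.3f}")
# u_eps → ū in L² as eps → 0, yet J(u_eps) = J(ū):
# ū is not a strict local minimum in the L² sense.
```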
What is wrong with this example and similar ones given in [2] or [83]? Below, we follow our explanations in [30].
The reason is that J is not of class C² in L²(0, 1); the computation of the second derivative was too formal. Theorem 3.4 cannot be applied in L²(0, 1). However, J is of class C² in L∞(0, 1), and the formally computed expression for J″(ū) is correct in that space. This phenomenon is called the two-norm discrepancy: the functional J is twice differentiable with respect to one norm, but the inequality J″(ū)v² ≥ δ‖v‖² holds in a weaker norm, in which J is not twice differentiable; see, for instance, [53]. This situation arises frequently in infinite-dimensional problems, but it does not happen in finite dimensions, because there all norms are equivalent.
The following theorem on second order optimality conditions deals with the two-norm discrepancy.

Theorem 3.6 (SSC in the case of the two-norm discrepancy) Let U be a vector space endowed with two norms ‖·‖_∞ and ‖·‖₂, such that J : (U, ‖·‖_∞) → R is of class C² in a neighborhood of ū and the following properties hold:

J′(ū) = 0 and there exists δ > 0 such that J″(ū)v² ≥ δ‖v‖²₂ ∀v ∈ U,  (3.9)

and there exists some ε > 0 such that

|[J″(u) − J″(ū)]v²| ≤ (δ/2)‖v‖²₂  ∀v ∈ U, if ‖u − ū‖_∞ ≤ ε.  (3.10)

Then there holds the quadratic growth condition

J(u) ≥ J(ū) + (δ/4)‖u − ū‖²₂  if ‖u − ū‖_∞ ≤ ε,  (3.11)

so that ū is a strict local solution of (3.1) with respect to the norm ‖·‖_∞.
Proof We select u ∈ U with ‖u − ū‖_∞ ≤ ε and perform a Taylor expansion at ū. Invoking J′(ū) = 0, we get, with some intermediate point u_θ between u and ū,

J(u) = J(ū) + ½ J″(u_θ)(u − ū)² = J(ū) + ½ J″(ū)(u − ū)² + ½ [J″(u_θ) − J″(ū)](u − ū)² ≥ J(ū) + (δ/2)‖u − ū‖²₂ − (δ/4)‖u − ū‖²₂ = J(ū) + (δ/4)‖u − ū‖²₂

in view of (3.9) and (3.10), provided that ‖u − ū‖_∞ ≤ ε. □
To our knowledge, Ioffe [53] was the first to prove a result of this type by using two norms in the context of optimal control of ordinary differential equations. Theorem 3.6 was stated in this abstract setting in [30]. In the context of PDE constrained optimization, the proof of Theorem 3.6 can also be found e.g. in [83, Theorem 4.23].
Theorem 3.6 can be applied to Example 3.5 to deduce thatū is a strict local minimum in the sense of L ∞ (0, 1).
If the two-norm discrepancy occurs in an optimal control problem, we consider two norms, for instance the L∞-norm for differentiation and the L²-norm for expressing the coercivity of J″. Then local optimality can only be expected to hold in the stronger L∞-sense.

Short Account on Optimality Conditions with Box Constraints
In many applications, constraints must be imposed on the unknown u that express the limitation of available resources. Moreover, often such constraints are needed for the existence of an optimal solution to (3.1). We do not aim at a discussion of Karush-Kuhn-Tucker conditions for general constraints of the form u ∈ C, where C is a non-empty closed (in general nonconvex) subset of U that may be expressed by nonlinear equality and inequality constraints.
Let a convex, closed, and nonempty set U_ad ⊂ U be given. We consider the problem with constraints

min_{u∈U_ad} J(u).  (3.12)

Theorem 3.7 (First order necessary condition with constraints) If ū is a local solution to (3.12), then the variational inequality

J′(ū)(u − ū) ≥ 0  ∀u ∈ U_ad  (3.13)

is satisfied.

It is obvious that the condition (3.3) cannot be expected under constraints. To see this, consider the simple example

min_{−1≤u≤1} −u².  (3.14)

We have the two global (and hence also local) solutions ū₁ = −1 and ū₂ = 1. In both solutions, (3.3) fails to hold, while (3.13) is fulfilled.
To survey second order conditions in finite dimensions, let us now assume that U = R^n and that real constants α < β are given. We define

U_ad = {u ∈ R^n : α ≤ u_i ≤ β, i = 1, …, n}.

In this case, the restriction u ∈ U_ad is given by so-called box constraints. They are very useful to express second order optimality conditions in an elementary way. We introduce the critical cone at ū ∈ U_ad,

Cū = {v ∈ R^n : v_i ≥ 0 if ū_i = α, v_i ≤ 0 if ū_i = β, J′(ū)v = 0}.

This set Cū contains exactly those v that satisfy the conditions

v_i ≥ 0 if ū_i = α,  v_i ≤ 0 if ū_i = β,  v_i = 0 if ∂J/∂u_i (ū) ≠ 0.

Theorem 3.8 (Second order optimality conditions for J : R^n ⊃ U_ad → R) Let ū ∈ U_ad and assume that J is of class C² in a neighbourhood of ū. If ū is a local solution to the constrained problem (3.12), then the second order necessary condition

J″(ū)v² ≥ 0  ∀v ∈ Cū  (3.16)

must be satisfied. If ū ∈ U_ad satisfies the variational inequality (3.13) and the second order sufficient condition

J″(ū)v² > 0  ∀v ∈ Cū \ {0},  (3.17)

then ū is a strict local solution to (3.12).
We do not prove this well-known result that follows from the standard Karush-Kuhn-Tucker theory of mathematical optimization in finite-dimensional spaces and refer e.g. to [61] or [71].

Remark 3.9
The second order sufficient condition (3.17) does not necessarily imply local convexity. We refer again to the very simple extremal problem (3.14) with local solutions ū₁ = −1 and ū₂ = 1. The function u ↦ −u² is concave! However, it holds Cū_i = {0}, i = 1, 2, hence the second order sufficient condition does not impose any requirement. Here, the first order conditions are already sufficient for local optimality. Notice that, in both points ū_i, the function strictly increases in directions that point towards the interior of [−1, 1].
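The first order condition (3.13) and the sign restrictions of the critical cone for box constraints can be checked mechanically. The following sketch (our own illustration; the helper names are ours) tests stationarity via the projection onto [α, β]^n and reproduces, for a two-dimensional variant of this remark's example, that the critical cone reduces to {0}:

```python
import numpy as np

alpha, beta = -1.0, 1.0

def is_stationary(u, grad, tol=1e-10):
    """First order condition (3.13) for box constraints: u is stationary iff
    u coincides with the projection of u - grad J(u) onto [alpha, beta]^n."""
    return np.allclose(u, np.clip(u - grad, alpha, beta), atol=tol)

def critical_cone_signs(u, grad, tol=1e-10):
    """Componentwise sign restrictions defining the critical cone C_u:
    '=0' where the gradient component is nonzero, '>=0' where u = alpha,
    '<=0' where u = beta, 'free' otherwise."""
    signs = []
    for ui, gi in zip(u, grad):
        if abs(gi) > tol:
            signs.append("=0")
        elif abs(ui - alpha) < tol:
            signs.append(">=0")
        elif abs(ui - beta) < tol:
            signs.append("<=0")
        else:
            signs.append("free")
    return signs

# Example: J(u) = -u1^2 - u2^2 on [-1, 1]^2, stationary point u = (1, -1)
u = np.array([1.0, -1.0])
grad = np.array([-2.0 * u[0], -2.0 * u[1]])   # grad J(u) = (-2 u1, -2 u2)
print(is_stationary(u, grad))                  # → True
print(critical_cone_signs(u, grad))            # → ['=0', '=0'], i.e. C_u = {0}
```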

Intermediate Conclusions
Our first short course on basic facts about second order conditions revealed certain differences between extremal problems in finite and infinite dimensions. We concentrated on problems with simple box constraints. Now we proceed with the optimal control theory for PDEs. Sometimes, we will find exactly the same situation as in finite dimensions. In other cases, the situation is different. We start with an optimal control problem for the heat equation, because its background in physics is easy to explain. In control problems of the heat equation, the two-norm discrepancy occurs already with spatial domains of dimension two.

The Problem
We consider a bounded Lipschitz domain Ω ⊂ R^N, N ≥ 1, with boundary Γ. The domain Ω stands for a spatial domain that is to be heated in the fixed time interval [0, T]. By y(x, t) we denote the temperature at the point x ∈ Ω at time t ∈ [0, T], and y₀(x) is the temperature at the initial time t = 0. We assume that, for given u, the temperature y is obtained as the solution to the semilinear heat equation

∂y/∂t − Δy + a(y) = u in Q := Ω × (0, T),  y = 0 on Σ := Γ × (0, T),  y(·, 0) = y₀ in Ω.  (4.1)
The nonlinearity a : R → R is assumed to be monotone non-decreasing. We do not explain the meaning of the nonlinearity a that models sources or sinks of heat depending on the temperature y. Since we will need second order derivatives, we assume that a ∈ C 2 (R) with locally Lipschitz second order derivatives. The Dirichlet boundary condition says that the temperature y at the boundary Γ is zero at any time. Our theory also works for Neumann or Robin conditions. The function u is our control, while y is called the associated state; the partial differential equation (4.1) is said to be the state equation. It is a semilinear parabolic equation. In this formulation, we have tacitly assumed that to each control u there exists a unique state y. In the correct spaces, this is indeed the case, as the following theorem shows.
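The control-to-state mapping G : u ↦ y_u can be made tangible numerically. The following is not the setting analyzed in the paper (which works in N space dimensions with function space machinery); it is a minimal 1-D finite-difference sketch of our own, with the illustrative monotone nonlinearity a(y) = y³ and a constant-in-time source:

```python
import numpy as np

def solve_state(u, a=lambda y: y**3, T=0.1, nx=50, nt=2000):
    """Illustrative 1-D finite-difference solver for
        y_t - y_xx + a(y) = u  on (0,1) x (0,T),  y(0)=y(1)=0,  y(.,0)=0.
    Explicit Euler in time, central differences in space; 'u' is a
    constant-in-time source given on the nx interior grid points."""
    dx, dt = 1.0 / (nx + 1), T / nt
    assert dt <= 0.5 * dx**2, "explicit scheme stability condition violated"
    y = np.zeros(nx)
    for _ in range(nt):
        lap = np.roll(y, -1) - 2 * y + np.roll(y, 1)
        lap[0] = y[1] - 2 * y[0]        # homogeneous Dirichlet boundary
        lap[-1] = y[-2] - 2 * y[-1]
        y = y + dt * (lap / dx**2 - a(y) + u)
    return y

u = np.ones(50)            # a constant heat source as control
y = solve_state(u)         # the associated discrete state y_u
print("max temperature:", float(y.max()))
```

The map u ↦ y is nonlinear because of a(y), which is exactly why the reduced objective discussed below is nonconvex.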

Remark 4.2
The Sobolev space W(0, T) is not needed for the understanding of the further results. For convenience, we mention that W(0, T) is defined by

W(0, T) = {y ∈ L²(0, T; H¹₀(Ω)) : ∂y/∂t ∈ L²(0, T; H⁻¹(Ω))}.

To understand all that follows, it suffices to consider G as a mapping in L∞(Q).
Why do we not consider the control function u in the space L 2 (Q)? The reason is simple. Only for N = 1, the mapping G : u → y u is well posed and differentiable from L 2 (Q) to L ∞ (Q). Therefore, the space L ∞ (Q) is selected to make the Nemytskij operator (superposition operator) y(·) → a(y(·)) well defined and Fréchet differentiable. For this purpose, also the larger space L p (Q) with p > N/2 + 1 can be used.
After the discussion of the state equation, we define our optimal control problem by

(P)  min_{u∈U_ad} J(u) := ½ ∫_Q (y_u − y_d)² dx dt + (ν/2) ∫_Q u² dx dt,

where the set of admissible controls U_ad is defined by

U_ad = {u ∈ L∞(Q) : α ≤ u(x, t) ≤ β for a.a. (x, t) ∈ Q}.  (4.2)

The integral functional above is convex with respect to y, and the set U_ad is convex, too. Nevertheless, the functional J is in general nonconvex, because Eq. (4.1) is nonlinear (unless a is constant with respect to y). Therefore, the discussion of second order conditions is reasonable. We should mention a theory of optimality conditions for convex problems with nonlinear equations, cf. [54]. However, to be convex, such problems have to obey certain assumptions that can hardly be verified for nonlinear differential equations.
The objective of the optimization is that the temperature y in Ω follows as closely as possible the given temperature field y_d that stands for a desired cooling or heating evolution in space and time. We assume that y_d belongs to L^p(Q) with some p > N/2 + 1. In a somewhat academic fashion, we suppose that we are able to directly control the heat source u in Ω. Normally, also u must be generated (for instance by eddy currents or microwaves), but we do not aim at modeling a real physical situation here. This task would require another differential equation coupled with (4.1).
The constraints (4.2) also have an important mathematical reason: Unless the desired function y d belongs to the range of the mapping G : u → y u , the unconstrained problem is unsolvable for ν = 0. If ν > 0, then the problem may also fail to have an optimal solution for space dimension N > 1. The parameter ν might express the costs of the control u but it is also very useful from a mathematical point of view. As a Tikhonov regularization parameter, it increases the numerical stability of the optimal solution u. Moreover, for ν = 0, the second order sufficient optimality condition (4.13) below will hold only in exceptional cases.
By standard methods, the existence of at least one (global) solution to (P) can be proved. Here, the weak compactness of U ad in L p (Q) with p > N/2 + 1 is used together with the weak lower semicontinuity of J . Let us concentrate on the optimality conditions. We should remark that the first and second order differentiability of J is guaranteed, if J is considered as a mapping from L ∞ (Q) to R. If we define J as mapping from L 2 (Q) to R, then this differentiability only holds for dimension N = 1.

First Order Necessary Conditions
Let ū be a local solution to (P). Then ū must satisfy the variational inequality (3.13),

J′(ū)(u − ū) ≥ 0  ∀u ∈ U_ad.

In this inequality, the control appears only implicitly via the term G′(ū)(u − ū). This term can be transformed to an explicit appearance of the control. For this purpose, L.S. Pontryagin introduced a famous tool, the adjoint equation.
is said to be the adjoint equation for (P). Its solution ϕ is called the adjoint state associated with u and is denoted by ϕ u to indicate the correspondence to u.
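For orientation, if the state equation (4.1) is the semilinear heat equation ∂ t y − Δy + a(y) = u with homogeneous Dirichlet boundary data (an assumption consistent with the linearized equations used in this section, not a verbatim quote), the adjoint equation has the standard form

```latex
\begin{equation*}
\left\{\;
\begin{aligned}
-\partial_t \varphi - \Delta \varphi + a'(y_u)\,\varphi &= y_u - y_d
  && \text{in } Q = \Omega \times (0,T),\\
\varphi &= 0 && \text{on } \Sigma = \Gamma \times (0,T),\\
\varphi(\cdot\,,T) &= 0 && \text{in } \Omega .
\end{aligned}
\right.
\end{equation*}
```

Note that the adjoint equation runs backward in time, with terminal condition at t = T.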
Using ϕ u , it is easy to prove that For u ∈ L 2 (Q), there holds ϕ u + ν u ∈ L 2 (Q); hence the mapping v → J ′(u)v can be continuously extended to L 2 (Q). Therefore, J ′(u) belongs to the dual space L 2 (Q)′. By the Riesz theorem, we can identify J ′(u) with an element of L 2 (Q). We call ϕ u + ν u the gradient of J at u. The necessary optimality condition finally takes the form We have obtained the following result on necessary optimality conditions: If ν > 0, then the second implication in (4.6) can be resolved for u. This somehow explains the following important projection formula: where P [α,β] : R → [α, β] is the projection operator defined by This formula follows from (4.5), because it implies that ū(x, t) solves the quadratic optimization problem for almost all (x, t) ∈ Q. The projection formula permits us to deduce higher regularity of any locally optimal control if ν > 0. Notice that ū was only assumed to be a function of L 2 (Q).
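For orientation, the projection formula and the projection operator customarily read, in the notation of this section (a reconstruction from the gradient ϕ ū + ν ū derived above, not a verbatim quote):

```latex
\begin{equation*}
\bar u(x,t) = P_{[\alpha,\beta]}\!\left(-\tfrac{1}{\nu}\,\varphi_{\bar u}(x,t)\right),
\qquad
P_{[\alpha,\beta]}(s) := \max\{\alpha,\,\min\{\beta,\,s\}\} .
\end{equation*}
```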

Corollary 4.5
If ν > 0, then any locally optimal control ū belongs to the space Proof The mapping u(·) → |u(·)| is continuous in H 1 (Ω), [56]. Since the function P [α,β] can be expressed in terms of the function s → |s|, the first conclusion of the corollary is an immediate consequence. Because y d and y u belong to L p (Q) with p > N/2 + 1, we have ϕū ∈ C(Q). The continuity of ū follows again from (4.7).
If ν = 0, then we cannot apply the projection formula (4.7) but only (4.6). Then the optimal control admits the values α or β wherever ϕū(x, t) ≠ 0. The control can switch between α and β and can hence be of bang-bang type.
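A minimal numerical sketch of the two regimes (the helper names `project` and `control_from_adjoint` and all numbers are illustrative, not from the paper): for ν > 0 the control is the pointwise projection of −ϕ/ν onto [α, β]; for ν = 0 the sign of the adjoint state dictates bang-bang values.

```python
# Sketch of pointwise control recovery from a given adjoint state phi
# (hypothetical sample values; alpha, beta, nu are illustrative only).

def project(s, alpha, beta):
    """Projection P_[alpha,beta](s) = max(alpha, min(beta, s))."""
    return max(alpha, min(beta, s))

def control_from_adjoint(phi, alpha, beta, nu):
    """Pointwise optimal control: projection formula for nu > 0,
    bang-bang selection by the sign of phi for nu = 0."""
    if nu > 0:
        return [project(-p / nu, alpha, beta) for p in phi]
    # nu = 0: u = alpha where phi > 0, u = beta where phi < 0;
    # where phi vanishes, the first order conditions do not fix the value.
    return [alpha if p > 0 else beta if p < 0 else None for p in phi]

phi = [0.3, -1.2, 0.0, 2.5]                       # sample adjoint values
print(control_from_adjoint(phi, -1.0, 1.0, 0.5))  # projected values
print(control_from_adjoint(phi, -1.0, 1.0, 0.0))  # bang-bang values
```

For ν = 0 the sketch returns `None` where ϕ vanishes, reflecting that the control there is not determined by the first order conditions alone.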

Second Order Conditions
The second order Fréchet differentiability of the objective functional J follows from that of the control-to-state mapping G : u → y u , which is stated in the next result: Lemma 4.6 (First and second order derivatives of G) Assume that the function a is of class C 2 . Then the functions z v i := G ′(u)v i are the solutions to the linearized differential equation The existence of the first and second order derivatives G ′ and G ″ is proved by an application of the implicit function theorem; we refer to [30] or [83]. Then it is easy to obtain the equations (4.8) and (4.9) for their computation: we just insert y = G(u) in the state equation and differentiate twice in the directions v 1 and v 2 . In view of this result, the existence of the second derivative J ″(u) is an immediate consequence of the chain rule. The expression for J ″(u) is easily obtained by differentiating the mapping first in the direction v 1 and then in another direction v 2 . We find To arrive at this formula, after having computed the second order derivative G ″(u), we consider the linear equation (4.9) as one with the auxiliary control v := −a ″(y u )z v 1 z v 2 on the right-hand side and invoke the adjoint state in a standard way.
As an immediate consequence, we obtain for v 1 = v 2 = v A first inspection of this formula shows that the quadratic form v → J ″(u)v 2 can be continuously extended from L ∞ (Q) to L 2 (Q). This follows immediately from the facts that the mapping v → z v is continuous in L 2 (Q) and that the function 1 − ϕ u a ″(y u ) is bounded and measurable. Moreover, the mappings u → ϕ u and u → y u are locally Lipschitz.
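For orientation, with the tracking functional J(u) = ½‖y u − y d ‖² L 2 (Q) + (ν/2)‖u‖² L 2 (Q) assumed in this section, the quadratic form for v 1 = v 2 = v takes the standard shape (a sketch consistent with the factor 1 − ϕ u a″(y u ) appearing above, not a verbatim quote):

```latex
\begin{equation*}
J''(u)v^2 = \int_Q \bigl(1 - \varphi_u\, a''(y_u)\bigr)\, z_v^2 \,dx\,dt
 \;+\; \nu \int_Q v^2 \,dx\,dt, \qquad z_v := G'(u)v .
\end{equation*}
```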
We now have all prerequisites needed for second order optimality conditions. Since we do not optimize over unconstrained controls but have to consider the bounded set U ad , requiring coercivity or non-negativity of the quadratic form J ″(ū)v 2 on the whole space L 2 (Q) would be too strong.
Therefore, we introduce the cone of critical directions. To motivate this cone, we mention the relationships u i ∼ u(x, t) and J ′(u) i ∼ (ϕ u + ν u)(x, t) and recall the conditions (3.16). All associated statements hold almost everywhere. The following proposition states an important property of Cū.
We continue our second order analysis by formulating the second order necessary optimality conditions. Theorem 4.10 (Second order necessary condition) If ū is a local solution to (P), then there holds We do not prove this theorem and refer the interested reader to [27] and [30]. Comparing this result with the condition (3.17) for the finite-dimensional case, we see that the second order necessary conditions for our optimal control problem are completely analogous to those for finite dimensions. At least, this is true for such simple problems. The situation changes for state-constrained problems, where the theory of second order necessary conditions is open.
In view of our introductory remarks on the two-norm discrepancy, the reader might expect that the second order sufficient conditions differ from the finite-dimensional case. Surprisingly, for ν > 0 there is no difference! This was recently proved. Therefore, we deal with the cases ν > 0 and ν = 0 separately.
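For the comparison that follows, the two conditions can be stated, in the notation of this section, as (a reconstruction consistent with the surrounding discussion, with Cū the critical cone introduced above):

```latex
\begin{align*}
&\text{(4.13)}\qquad J''(\bar u)v^2 > 0 \quad \text{for all } v \in C_{\bar u}\setminus\{0\},\\[2pt]
&\text{(4.14)}\qquad \exists\, \delta > 0:\quad J''(\bar u)v^2 \ge \delta\,\|v\|_{L^2(Q)}^2 \quad \text{for all } v \in C_{\bar u}.
\end{align*}
```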
Case ν > 0 In this case, the positivity condition (4.13) is equivalent to the coercivity condition (4.14). Proof The implication (4.14) ⇒ (4.13) is obviously true; hence it remains to prove the converse direction. To this aim, we assume (4.13) and define It is clear that this infimum exists as a nonnegative number. Let v k ∈ Cū be an associated minimizing sequence. Then we can select a weakly convergent subsequence and assume w.l.o.g. that Now, we distinguish between two cases.
(i) v = 0: Here, we consider the expression (4.11) for J ″(ū). The mapping v → z v is compact in L 2 (Q), because W (0, T ) is compactly embedded in L 2 (Q) by the Aubin Lemma and the mapping v → z v is linear and continuous from L 2 (Q) to W (0, T ). Therefore, we have Inserting this in the expression for J ″(ū)v k 2 above, we obtain (4.15); hence δ = ν > 0 by the definition of δ.
(ii) v ≠ 0: Here we find as the function v → J ″(ū)v 2 is weakly lower semicontinuous. This follows from the special form (4.11) of J ″(ū): in the first integral, we use the compactness of the mapping v → z v in L 2 (Q), while the second term is convex and continuous.
In both cases, we proved δ > 0, and it is clear that (4.14) holds true with this δ.

Remark 4.12
In the proof, the combination of convexity and compactness arguments played a decisive role. In finite dimensions, the situation is much simpler: there, weak and strong convergence are equivalent and the unit ball is compact, hence the first case v = 0 cannot happen. Theorem 4.13 Suppose that ū ∈ U ad satisfies the first order necessary optimality conditions (4.5) along with the second order sufficient condition (4.16). Then there exist ε > 0 and δ > 0 such that the quadratic growth condition is fulfilled, where B ε (ū) denotes the ball of L 2 (Q) with radius ε centered at ū.
Proof We argue by contradiction and assume that for any positive integer k there exists u k ∈ U ad such that Setting ρ k = ‖u k − ū‖ L 2 (Q) and v k = (u k − ū)/ρ k , we can assume that v k ⇀ v weakly in L 2 (Q); if necessary, we select a subsequence.
First, we prove that v ∈ Cū. From the necessary condition (4.5) and the expression (4.3) for J ′(ū)(u − ū), we find We also derive the converse inequality. Due to the definition of v k , we have where 0 < θ k < 1 and u θ k := ū + θ k (u k − ū). In the last limit, we have used several convergence properties. First, the strong convergence u k → ū in L 2 (Q) and the uniform boundedness of u k in U ad ⊂ L ∞ (Q) imply the strong convergence u k → ū in L p (Q) with N/2 + 1 < p < ∞. This yields y u θ k → yū in L ∞ (Q) and hence ϕ u θ k → ϕū in L ∞ (Q). Hence, (4.18) leads to Thus, it holds J ′(ū)v = 0. Since all v k obviously satisfy the sign conditions (4.12), and the set of elements of L 2 (Q) satisfying the sign conditions is convex and closed, v also obeys these sign conditions. Thus, we obtain v ∈ Cū.
Invoking again (4.18) and (4.5), we get by a Taylor expansion Therefore, it holds Again we have strong convergence of ū + θ k ρ k v k = ū + θ k (u k − ū) in L p (Q) with p > N/2 + 1 and conclude from this that y ū+θ k ρ k v k → yū and ϕ ū+θ k ρ k v k → ϕū in L ∞ (Q) as k → ∞. Therefore, the concrete expression (4.11) for J ″(u) yields By v ∈ Cū, this property, and the inequality (4.19), we get so that J ″(ū)v 2 = 0.
From (4.16), it follows that v = 0. Finally, we argue as in formula (4.15) and obtain a contradiction by where the last equation is concluded from (4.20).
The structure of the proof is the same as for finite-dimensional optimization problems. However, it differs in some essential details, because we have to work with weak convergences and to invoke compactness properties of the solution mappings to linearized parabolic equations.
Comparing the result of this theorem with the finite-dimensional case, we see that the gap between necessary and sufficient conditions is minimal. Moreover, we mention the important fact that we are able to prove local optimality in the sense of L 2 (Q), although we had to deal with the two-norm discrepancy. This surprising result was first proved in [30] for a more general class of infinite-dimensional problems.
Finally, as a consequence of Theorem 4.13, we prove that the condition (4.16) implies thatū is a strong local solution.
Corollary 4.14 Let ū satisfy the assumptions of Theorem 4.13. Then there exist δ > 0 and ε > 0 such that the quadratic growth condition in the strong sense is fulfilled.
Proof Let us assume that (4.21) does not hold for any δ and ε. Then, for any integer k ≥ 1 we can find a control u k ∈ U ad with ‖y u k − ȳ‖ L ∞ (Q) < 1/k such that We can take a subsequence, denoted in the same way, such that (u k ) k≥1 is weakly convergent in L 2 (Q), hence also in any L p (Q) with p < ∞ by the boundedness of U ad in L ∞ (Q). Since y u k → ȳ in L ∞ (Q), we deduce that u k ⇀ ū in L 2 (Q) and, by (4.22), Passing to the limit, we get This implies ‖u k ‖ L 2 (Q) → ‖ū‖ L 2 (Q) , and hence we infer that u k → ū strongly in L 2 (Q). Therefore, given ε > 0 such that (4.17) holds, we have that ‖u k − ū‖ L 2 (Q) < ε for all k sufficiently large. Then (4.22) contradicts (4.17).

Remark 4.15 In [4], a similar strong growth condition was proved for an elliptic optimal control problem.
Case ν = 0 (D1) For ν = 0, the second order conditions (4.13) and (4.14) are not equivalent. Therefore, we cannot exploit a coercivity condition such as (4.14). (D2) Moreover, even the (stronger) coercivity condition (4.14) is not sufficient for local optimality, as a counterexample below will show. The presence of infinitely many inequality constraints is the obstacle here.
Example 4.16 (Counterexample to (4.14)) The following example, due to Dunn [39], demonstrates that (4.14) is in general not sufficient for local optimality, not even in the sense of L ∞ . We define J : L 2 (0, 1) → R by where a(x) = 1 − 2x. The set of admissible functions u is selected by Since u − ū is nonnegative for all u ∈ U ad , ū satisfies the first order necessary optimality conditions.
Then we have It turns out that we need an extension of the cone Cū to gain some more flexibility in selecting critical directions at points where ū is close to the bounds α and β. We continue the discussion for our parabolic control problem (P).
: v satisfies the sign conditions (4.12) and Notice that we consider the case ν = 0 here, but this definition will later be used also for ν > 0.

Remark 4.18
It is obvious that Cū ⊂ C τ u for all τ > 0. In the case of finite dimensions, both cones coincide if τ is taken sufficiently small. Indeed, define then C τ u = Cū for U = R n . In U = L 2 (Q), the two cones are different, because |J ′(ū)(x, t)| can admit positive values that are arbitrarily close to zero.
(D3) One can think of the condition J ″(ū)v 2 ≥ δ ‖v‖ 2 L 2 (Q) ∀v ∈ C τ u as the correct sufficient second order condition. However, it is known that this condition does not hold, except in very simple cases; see [14] or [24]. Intuitively, this is somehow clear, because the term ν ‖v‖ 2 L 2 (Q) is missing. The next theorem provides the correct second order conditions. Theorem 4.19 (SSC for ν = 0) Assume that ū ∈ U ad satisfies the first order necessary optimality conditions (4.5) along with the second order sufficient condition Then there exists ε > 0 such that the quadratic growth condition is fulfilled.
The proof is technical and beyond the scope of this survey. It uses a variety of estimates for solutions of parabolic equations. The reader may find a proof for a more general class of parabolic equations in [24]. A preliminary version of the theorem was proved in [14].
Let us briefly motivate this form of the second order sufficient conditions, which differs from what the reader might have expected. The condition (4.23) appears to be natural, since for ν = 0 the second order derivative (4.11) of J admits the form If (1 − ϕū a ″(yū))(x, t) ≥ δ holds for almost all (x, t) ∈ Q, then the condition (4.23) is obviously true. Therefore, (4.23) is a natural extension of (4.14) to the situation where ν = 0. As a new theorem, we prove that for ν > 0 this condition with the extended cone is equivalent to the positivity of the quadratic form for v ∈ Cū \ {0}.

Theorem 4.20
Givenū ∈ U ad , the following conditions are equivalent if ν > 0.
Proof First, we recall that there exists a constant C > 0 independent of v such that ‖z v ‖ L 2 (Q) ≤ C ‖v‖ L 2 (Q) for every v ∈ L 2 (Q). Hence, if (4.26) holds for some δ and τ , then (4.27) is fulfilled with the same τ and δC 2 . The implication (4.27) ⇒ (4.25) is obvious. Finally, we prove that (4.25) implies (4.26). To this end, we proceed by contradiction and assume that for every integer k ≥ 1 there exists an element v k ∈ . Setting ρ k = ‖v k ‖ L 2 (Q) , renaming v k /ρ k by v k , and selecting a subsequence if necessary, we have Since v k satisfies the sign conditions (4.12) and the set of elements of L 2 (Q) satisfying these conditions is convex and closed, we conclude that v also satisfies (4.12). On the other hand, v(x, t) = 0 wherever J ′(ū)(x, t) ≠ 0, and consequently v ∈ Cū. Now, (4.28) yields By assumption (4.25), this is possible only if v = 0. But, using once again (4.28) along with (4.11), we have ‖v k ‖ L 2 (Q) = 1, and therefore we get the contradiction Before finishing this section, we mention that we are able to show a result analogous to Corollary 4.14 for the case ν = 0. As far as we know, an inequality analogous to (4.21) has not yet been proved under the second order sufficient condition (4.23). To obtain (4.21), we suggest considering a different cone, : v satisfies the sign conditions (4.12) and With this extended cone, the following result can be proved.
Then there exists ε > 0 such that the quadratic growth condition in the strong sense is fulfilled.
The reader is referred to [24] for a proof of this result.

Two Applications to Stability Analysis
In this part, we explain that second order sufficient optimality conditions imply certain stability properties of optimal solutions to the control problem (P) with respect to perturbations of given data. Exemplarily, we discuss the stability with respect to perturbations of the desired state y d and to changes of the regularization parameter ν.

Perturbation of y d
One of the possible interpretations of our optimal control problem (P) is that of an inverse problem: Given measurements y d of the temperature in Q, we want to determine a heat source u that generated this measured y d . Since measurements are overlaid by noise, perturbed data y ε d are given. Then the question arises whether the optimal source ū depends continuously on the data. Under a second order sufficient condition, the answer is yes, if the regularization parameter ν is positive. In the case ν = 0, we can analogously prove that the states depend continuously on the data. We will specify this stability at the end of this section. Now, we detail the analysis for ν > 0.
Assume that a family of perturbed desired states y ε d , ε > 0, is given such that is satisfied. We consider the associated family of perturbed optimal control problems min u∈U ad We show that the family of problems {(P ε )} ε>0 provides a good approximation of (P) in the sense that any accumulation point of any sequence of solutions (ū ε ) ε>0 of problems (P ε ) is a solution of (P). Conversely, any strict local minimum of (P) can be approximated by local minima of the problems (P ε ). Moreover, we will estimate the order of this convergence.

Theorem 4.22
If (ū ε ) is any sequence of optimal controls of (P ε ) that converges weakly in L 2 (Q) to some ū, then ū is optimal for (P) and holds for all p ∈ [1, ∞). Conversely, if ū is a strict locally optimal control of (P), then there exists a sequence (ū ε ) of locally optimal controls of (P ε ) converging to ū. This sequence is constructed from global solutions of (4.33) below, and any such sequence obeys (4.32).
Proof We only mention the main idea of this standard result. There are two statements. First, we assume that {ū ε } is a sequence of global solutions. Then it is easy to prove that any weak limit is a global solution of (P). Next, using the Tikhonov term, we can prove the strong convergence in L 2 (Q), hence in every L p (Q) with p < ∞. Conversely, we assume that ū is a strict local minimum of (P); then the controls ū ε are defined as (global) solutions to the auxiliary problem min J ε (u), u ∈ U ad ∩ B̄ ρ (ū), (4.33) where ρ > 0 is taken such that J achieves its strict minimum value at ū in U ad ∩ B̄ ρ (ū). Here, B̄ ρ (ū) denotes the closure of B ρ (ū). The existence of at least one such control ū ε follows by standard arguments. Arguing as before and using that ū is the unique minimum of J in U ad ∩ B̄ ρ (ū), we can prove that ū ε → ū strongly in L p (Q) for every p < ∞. Therefore, ū ε does not touch the boundary of B ρ (ū) if ε is sufficiently small. Consequently, ū ε is a locally optimal control of (P ε ) (where the constraint u ∈ B̄ ρ (ū) is not required).
This is just a convergence result. Next, we estimate the order of this convergence. For convenience, we define Theorem 4.23 (Lipschitz stability) Let ū be a locally optimal control of (P) that satisfies the second order sufficient optimality condition (4.16), and let (ū ε ) be a sequence of locally optimal controls of (P ε ), defined by (global) solutions to (4.33), that converges to ū in L 2 (Q) as ε → 0. Then there are constants C L > 0 and ε 0 > 0 such that Proof Thanks to the optimality of ū ε in (4.33) and the quadratic growth condition (4.17), we find for all sufficiently small ε > 0 Let us write for convenience ȳ := yū and ȳ ε := yū ε . Simplifying, we find Inserting the definitions of F and F ε , expanding the associated squared norms, and applying the Cauchy-Schwarz inequality yields The inequality (4.34) is a direct consequence.
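The mechanism behind the Lipschitz estimate (4.34) can be seen in a scalar caricature: if the control-to-state map were the identity (a drastic simplification, not the PDE setting; `optimal_control` and all data are hypothetical), the optimal control is an explicit projection, and perturbing y d by ε moves it by at most ε/(1 + ν).

```python
# Scalar caricature of Lipschitz stability: take G = identity, so
# J(u) = 0.5*(u - yd)**2 + 0.5*nu*u**2 on [alpha, beta] and the
# optimal control is an explicit projection.

def project(s, alpha, beta):
    return max(alpha, min(beta, s))

def optimal_control(yd, alpha, beta, nu):
    return project(yd / (1.0 + nu), alpha, beta)

alpha, beta, nu = -1.0, 1.0, 0.5
yd = 0.6
for eps in (0.1, 0.01, 0.001):
    u = optimal_control(yd, alpha, beta, nu)
    u_eps = optimal_control(yd + eps, alpha, beta, nu)
    # The projection is nonexpansive, so |u_eps - u| <= eps / (1 + nu):
    assert abs(u_eps - u) <= eps / (1.0 + nu) + 1e-12
    print(eps, abs(u_eps - u))
```

The Lipschitz constant 1/(1 + ν) of this toy map plays the role of C L in (4.34); in the PDE problem the constant additionally depends on the second order condition.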

Remark 4.24
If ν = 0, then we can prove a result analogous to Theorem 4.22. The differences are the following: instead of (4.32), we have that ū ε ⇀ ū weakly in L p (Q) for every p < ∞; the associated states satisfy ȳ ε → ȳ strongly in L ∞ (Q). In addition, if we assume that (4.29) holds, then the inequality ‖ȳ ε − ȳ‖ L 2 (Q) ≤ Cε is satisfied.

Stability with Respect to ν → 0
Let us consider a slightly changed situation: As reference control, we select a locally optimal controlū for the problem (P) with parameter ν = 0. We want to approximate this control by locally optimal controlsū ν of (P) associated with Tikhonov parameter ν > 0. Again, we are interested in an estimate for the order of approximation.
To fix the notation, we write from now on and consider the family of problems as perturbations of the problem (P) (= (P 0 )). Now, we proceed similarly as before. We denote by (ū ν ) ν>0 a sequence of global solutions to (P ν ) and denote the associated states by ȳ ν := yū ν . Since U ad ⊂ L ∞ (Q) is bounded in L p (Q) with p > N/2 + 1, we can assume w.l.o.g. that ū ν converges weakly in L p (Q) to some ū ∈ U ad , i.e. ū ν ⇀ ū as ν → 0.
Now we return to the approximability of a strict reference solution ū of (P).
This result is proved in the same way as Theorem 4.22; therefore, we omit the proof.
To estimate the order of convergence of ū ν to ū, we need a quadratic growth condition. However, we cannot assume the second order condition (4.16), because ν = 0. Therefore, we now assume the second order condition (4.29), which implies a quadratic growth condition with respect to y u .

Theorem 4.27
Let ū and (ū ν ) 0<ν≤ν̄ be as in Theorem 4.26 and assume that the second order condition (4.29) is satisfied. Then the following identity holds Proof The second order condition (4.29) implies the quadratic growth condition (4.30). From this condition and the fact that J ν (ū ν ) ≤ J ν (ū), we get A first consequence of the above inequality is that ‖ū ν ‖ L 2 (Q) ≤ ‖ū‖ L 2 (Q) . In view of this, we conclude further From this, we infer
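The norm monotonicity ‖ū ν ‖ ≤ ‖ū‖ and the convergence ū ν → ū can be watched in a scalar caricature with identity control-to-state map (hypothetical data; `u_nu` is an illustrative helper, not from the paper):

```python
# Scalar caricature of the Tikhonov limit nu -> 0: with G = identity,
# the solution of (P_nu) is u_nu = P_[alpha,beta](yd / (1 + nu)).

def project(s, alpha, beta):
    return max(alpha, min(beta, s))

def u_nu(yd, alpha, beta, nu):
    return project(yd / (1.0 + nu), alpha, beta)

alpha, beta, yd = -1.0, 1.0, 0.8
u_bar = u_nu(yd, alpha, beta, 0.0)        # reference solution for nu = 0
for nu in (1.0, 0.1, 0.01, 0.001):
    u = u_nu(yd, alpha, beta, nu)
    assert abs(u) <= abs(u_bar) + 1e-15   # analogue of ||u_nu|| <= ||u_bar||
    print(nu, abs(u - u_bar))             # error shrinks like O(nu) here
```

In this toy the error decays linearly in ν; for the PDE problem the attainable order depends on the second order condition (4.29), as the theorem above quantifies.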

A Priori Error Estimates and Problems with Pointwise State Constraints
In the preceding section, we explained how second order conditions can be formulated for an optimal control problem for a semilinear heat equation. We discussed two characteristic but simple applications. Now, we survey some other aspects of second order sufficient optimality conditions. For convenience, we restrict ourselves to elliptic equations with a simple monotone nonlinearity and consider again a standard quadratic objective functional. The theory for elliptic problems is less technical than that for parabolic ones.

SSC for a Semilinear Elliptic Equation
We consider the following elliptic optimal control problem min u∈U ad where y u ∈ H 1 0 (Ω) is the solution to the semilinear elliptic PDE and In this problem, Ω, Γ , and the monotone function a : R → R are given as in Sect. 4. Moreover, y d ∈ L 2 (Ω) is a given desired state.
We consider the state y u associated with a given control u in the space H 1 0 (Ω) ∩ L ∞ (Ω). It is known that the mapping u → y u is twice continuously Fréchet differentiable from L p (Ω) to C(Ω), if p > N − 1, cf. [30] or [83]. The analysis of optimality conditions is fairly analogous to the parabolic case; we sketch only the main points: The adjoint state ϕ u associated with a given control u is defined as the unique solution to the (linear) adjoint equation (5.2). The first and second order optimality conditions can now be easily transferred from the parabolic case to the elliptic one: just substitute Ω for Q. We obtain for the first and second order derivatives of J in the direction v ∈ L ∞ (Ω) where z v is the unique solution to The first order necessary optimality condition is again (3.13) and, for ν > 0, the second order sufficient optimality condition is (4.13), with the meaning that the quantities have in our elliptic case. For convenience, let us repeat it here:

A Priori Error Estimates
To solve (PE), one has to discretize the problem, thereby reducing it to an optimization problem in a finite-dimensional Euclidean space. One of the most widely used options is the discretization of the solutions y of the partial differential equation (5.1) by linear finite elements. The control function u can be discretized in different ways.
Then an important question arises: Does there exist a locally optimal solution to the discretized control problem that is close to a selected locally optimal solution to the original continuous version? Can we estimate the distance? For this goal, second order sufficient conditions are indispensable! Occasionally, a quadratic growth condition is assumed for this purpose; however, this condition is equivalent to a second order sufficient condition.
Let us roughly sketch the setting and the estimate for a convex bounded domain Ω ⊂ R 2 . We consider a family of regular triangulations (T h ) h>0 of Ω with mesh size h. The triangulations consist of a union of triangles T ∈ T h . For the notion of regularity and the concrete details of the triangulation, we refer to [3]. Let us only mention that the union of all triangles of T h generates the closure of a polygonal domain Ω h . The corners of Ω h are located on the boundary Γ of Ω. We consider the following sets of discretized control and state functions: In other words, we consider piecewise constant controls and piecewise linear and continuous state functions. As the discretized state equation, we consider the variational problem For each u h ∈ U h , we denote by y h (u h ) the unique element of V h that satisfies The existence of the solution y h (u h ) follows by a simple application of Brouwer's fixed point theorem, while the uniqueness is obtained from the monotonicity of the nonlinearity a. The finite-dimensional discretized optimal control problem (PE h ) is defined by The existence of at least one global solution ū h of (PE h ) follows from the continuity of J h and the compactness of U h,ad . Assuming the second order sufficient optimality conditions for ū, the following main result was shown in [3] for a more general setting: Theorem 5.1 Let ū be a locally optimal control of (PE) that satisfies the second order sufficient optimality condition (5.6). Then there exists a sequence of local solutions ū h of (PE h ) such that ū h → ū in L 2 (Ω) as h ↓ 0. For any such sequence (ū h ), there exist C > 0 and h 0 > 0 such that We formulated a result for piecewise constant control approximation. After the paper [3], many contributions to other forms of discretizations and equations were published. We mention exemplarily [13] and [21] for piecewise linear control approximation and [47] for the so-called variational discretization.
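The order h in Theorem 5.1 matches the approximation power of piecewise constant functions. A small self-contained check (a 1D toy with a hypothetical smooth control, not the PDE problem itself): the L² error of midpoint piecewise constant interpolation of a smooth function decays linearly in h.

```python
import math

# Why an O(h) rate for piecewise constant controls is plausible:
# the L2 error of the piecewise constant (midpoint) interpolant of a
# Lipschitz function on a uniform mesh of width h is O(h).
# Hypothetical 1D "control" u(x) = sin(pi*x) on (0, 1).

def pc_error(u, n, samples=20):
    """L2 distance between u and its piecewise constant interpolant
    (cell midpoint values) on a uniform mesh of n cells over (0, 1)."""
    h = 1.0 / n
    err2 = 0.0
    for i in range(n):
        mid = (i + 0.5) * h
        for j in range(samples):
            x = i * h + (j + 0.5) * h / samples   # subsample the cell
            err2 += (u(x) - u(mid)) ** 2 * (h / samples)
    return math.sqrt(err2)

u = lambda x: math.sin(math.pi * x)
errs = [pc_error(u, n) for n in (8, 16, 32)]
rates = [math.log(errs[k] / errs[k + 1], 2) for k in range(2)]
print(errs)   # decreasing errors
print(rates)  # observed convergence orders, close to 1
```

This only illustrates the approximation-theoretic part of the estimate; the actual proof in [3] must in addition control the nonlinearity and the coupling through the discretized state equation.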
Since the number of associated contributions is very large, we refer to the surveys [51,52]. Some other related publications are quoted in Sect. 7.9.

Pointwise State Constraints
In the optimal control problems discussed above, only the pointwise control constraints α ≤ u ≤ β were allowed. For many interesting applied problems, this is not enough. For instance, interpreting the parabolic control problem (P) as a heating problem, the temperature y should not exceed a certain bound γ > 0.
Let us discuss the consequences for the elliptic problem (PE) with the additional constraint y ≤ γ in Ω, i.e. we investigate the problem where J is the functional defined in (PE) and y u is the solution to the semilinear equation (5.1). Problems with pointwise state constraints are difficult with respect to their mathematical analysis and numerics. In particular, this refers to the analysis of second order conditions, where important questions are still open. Let us briefly discuss the reason for this.

First Order Necessary Conditions
To obtain first order necessary optimality conditions, the pointwise state constraints are included in a Lagrangian function by associated Lagrange multipliers, where μ ∈ M(Ω) is a regular Borel measure. Such multipliers exist under a so-called regularity condition, which is here taken as a linearized Slater condition. We have Theorem 6.1 Let ū be a locally optimal control for (PE S ) that satisfies the following linearized Slater condition: There exists u 0 ∈ U ad such that Then there exists an associated Lagrange multiplier μ ∈ M(Ω) such that ∂L/∂u (ū, μ)(u − ū) ≥ 0 ∀u ∈ U ad , (6.1) Introducing an adjoint state ϕū, (6.1) can be expressed again in the form but the adjoint state ϕū is now the solution ϕ of the following elliptic equation with a measure on the right-hand side, cf. [5,11], where the definition, existence, and regularity of ϕū are discussed. In particular, there holds ϕū ∈ W 1,s 0 (Ω) for all 1 ≤ s < n/(n − 1), where n is the dimension of Ω. For some recent results proving better regularity properties for distributed elliptic control problems, the reader is referred to [31].

Second Order Sufficient Conditions
It seems that the theory of associated second order conditions should now be straightforward. For all v in a suitable critical cone Cū ,μ of functions in L 2 (Ω), still to be defined, we might have Here, we are faced with the first difficulty: What is the correct critical cone? For the necessary conditions (6.5), this is not known, and the theory of necessary optimality conditions for pointwise state constraints is widely open.
In [33], which was, to the best of our knowledge, the first paper on second order sufficient conditions for problems with pointwise state constraints, the construction of the critical cone was still quite complicated. Several improvements were made, culminating so far in [16] for state-constrained problems with semilinear parabolic and elliptic equations. For (PE S ), the following critical cone was introduced in the context of this problem, and v satisfies the conditions (6.7)-(6.9).
The further conditions defining Cū ,μ are the sign conditions and This cone is the direct extension of the one known from finite-dimensional optimization problems with inequality constraints. The following theorem on second order sufficiency follows from [16, Theorem 4.3], which was proved for a much more general version of elliptic state-constrained control problems.
Theorem 6.2 [16] Suppose that n = dim Ω ≤ 3 and that ū, together with yū, satisfies all constraints of the state-constrained problem (PE S ) along with the first order necessary optimality conditions of Theorem 6.1. If the second order sufficient optimality condition (6.6) is fulfilled, then there exist ρ > 0 and δ > 0 such that for all admissible controls there holds the quadratic growth condition One assumption of this theorem might be surprising: Why do we need n = dim Ω ≤ 3? The reason is that the proof needs the continuity of the mapping v → z v from L 2 (Ω) to C(Ω). The regularity of solutions of PDEs is known to depend on the dimension n, and this causes restrictions on n.
Up to now, second order sufficient optimality conditions that are based on the smallest critical cone Cū ,μ are only known in the following cases.
• Elliptic problems: n ≤ 3 for distributed control and n ≤ 2 for Neumann boundary control, • Parabolic equations: n = 1 for distributed control but no result for Neumann boundary control.
Can one weaken the assumptions to obtain sufficient conditions for higher dimensions of Ω, say by extending the critical cone? This is possible in some cases; we refer, for instance, to [59] for parabolic equations with n ≤ 3. Nevertheless, there is always a limit on the dimension: In our problem, the proofs of second order sufficiency theorems need the extension of the quadratic form v → ∂ 2 L/∂u 2 (ū, μ)v 2 to L 2 (Ω). For this purpose, the integral must be finite for all v ∈ L 2 (Ω). This is another obstacle restricting the possible space dimensions. A short inspection of the expression (5.4) for ∂ 2 L/∂u 2 (ū, μ)v 2 = J ″(ū)v 2 shows the following: The adjoint state ϕū is contained in W 1,s (Ω) for all 1 ≤ s < n/(n − 1). By Sobolev embedding, we have W 1,s (Ω) ⊂ L q (Ω) for all q ≤ ns/(n − s). Inserting the limit s = n/(n − 1), we get q < n/(n − 2) as the limit for the integrability order of ϕū. Assume a convex domain Ω. Then, for v ∈ L 2 (Ω), maximal elliptic regularity ensures z v ∈ W 2,2 (Ω) ⊂ L 2n/(n−4) (Ω), hence z v 2 ∈ L n/(n−4) (Ω). For n = 5, the integrability order of ϕū is q < 5/3, while z v 2 ∈ L 5 (Ω). Here, the integral (6.10) is still finite. In the same way, we find for n = 6 that q < 3/2 and z v 2 ∈ L 3 (Ω). Here, the integral (6.10) can be infinite. This shows that n = 5 is currently the limit for establishing the second order sufficient optimality conditions (6.10) in the case of elliptic distributed control. For boundary control or parabolic equations, these limits on the dimension are smaller. Therefore, the question of second order sufficient optimality conditions for state-constrained elliptic and parabolic control problems is answered only for sufficiently small dimensions.
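The exponent bookkeeping above can be double-checked mechanically with exact rational arithmetic (a sketch; `integral_finite` is an illustrative helper encoding the Hölder condition, assuming ϕū ∈ L q for all q < n/(n − 2) and z v 2 ∈ L n/(n−4) as derived in the text):

```python
from fractions import Fraction as F

# Exponent check for the integrability of  phi * z_v^2  (n >= 5):
# phi in L^q(Omega) for all q < n/(n-2),  z_v^2 in L^{n/(n-4)}(Omega).
# Hoelder needs the conjugate exponent n/4 of n/(n-4) to be admissible,
# i.e. n/4 < n/(n-2) (the supremum n/(n-2) itself is not attained).

def integral_finite(n):
    q_available = F(n, n - 2)   # supremum of exponents for phi
    q_needed = F(n, 4)          # conjugate exponent of n/(n-4)
    return q_needed < q_available

for n in (5, 6):
    print(n, F(n, n - 2), F(n, 4), integral_finite(n))
# n = 5: needs 5/4 < 5/3  -> integral finite
# n = 6: needs 3/2 < 3/2  -> fails, integral can be infinite
```

The strict inequality n/4 < n/(n − 2) reduces to n < 6, matching the statement that n = 5 is the current limit for elliptic distributed control.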

Miscellaneous Results on Second Order Sufficient Conditions
Second order sufficient optimality conditions (SSC) were discussed extensively for optimal control problems with nonlinear PDEs. Let us survey some related papers. The associated collection of papers is by far not complete, but it shows how diverse the applications of second order conditions can be.

Other Nonlinear PDEs
First of all, SSC were established for a variety of nonlinear equations. We refer to the theory of SSC for quasilinear elliptic state equations in [17,28], their application to the control of Navier-Stokes equations in [15,22,34,82,84], and to the control of the FitzHugh-Nagumo system in [24,25]. Elliptic optimal control problems with nonlocal radiation interface conditions were discussed with SSC in [38,67,69]. We also mention the work [76], where SSC were derived for a nonstandard parabolic control problem with boundary condition ∂ n y = u(y)[c − y] (the state y is inserted in the control u) and to the discussion of Hölder regularity of optimal solutions in [45].

SSC for Problems with State Constraints
The theory of SSC was investigated for different problems with pointwise state constraints for parabolic or elliptic PDEs in [16,75,78,79]. SSC for control problems with finitely many state constraints (pointwise or of integral type) are used in [9,12,20,23,27]. For parabolic control problems with pointwise state constraints and controls that depend only on time, the restrictions on the dimension of Ω can be overcome, cf. [35]. SSC are used in [46] for the convergence analysis of a semismooth Newton method in the case of state constraints. Second order conditions for parabolic control problems with time-dependent state constraints were investigated in [8,41]. SSC for nonlinear weakly singular integral equations with application to parabolic boundary control were discussed in [77]. For control problems with regularized mixed pointwise control-state constraints, SSC were applied in the case of semilinear elliptic equations in [57,58].

SSC for Optimal Bang-Bang Controls
While SSC for bang-bang type controls were discussed extensively in the optimal control of ODEs, cf. [72], the discussion of this issue in the control of PDEs was started very recently. We refer to [14]; see also [24] and [31].

SSC for Optimal Sparse Controls
In the fast-developing field of optimal sparse controls, the objective functional of the problems is not differentiable, because a multiple of the L¹-norm of the control is added to a quadratic differentiable functional. In the context of our elliptic problem (PE), the sparsity term κ‖u‖_{L¹(Ω)} is added to the objective functional, where κ ≥ 0 accounts for sparsity. The larger κ is, the smaller is the support of the optimal control ū. Although J is not differentiable, SSC can be established by applying second order derivatives only to the differentiable parts of the control problem. The conditions can be applied as an assumption ensuring the stability of optimal solutions with respect to perturbations. We mention [24,25,31]. An application to a priori error estimates with sparse controls is presented in [18,19].
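The shrinking-support effect of κ can be seen already in a scalar caricature of the sparse-control functional: minimizing ½(u − g)² + (ν/2)u² + κ|u| pointwise yields a soft-thresholding formula, and the support of the minimizer shrinks as κ grows. This sketch is purely illustrative; `sparse_minimizer` and the surrogate data g are our own assumptions, not objects from (PE).

```python
import numpy as np

def sparse_minimizer(g, kappa, nu=1.0):
    """Pointwise minimizer of (1/2)(u - g)^2 + (nu/2) u^2 + kappa*|u|:
    soft thresholding at level kappa, followed by shrinkage by 1/(1+nu).
    A scalar caricature of the sparse-control problem."""
    return np.sign(g) * np.maximum(np.abs(g) - kappa, 0.0) / (1.0 + nu)

g = np.linspace(-1.0, 1.0, 201)   # surrogate for a "desired" control
for kappa in (0.0, 0.3, 0.6):
    support = np.count_nonzero(sparse_minimizer(g, kappa))
    print(kappa, support)         # support shrinks monotonically in kappa
```

The thresholding structure (the minimizer vanishes wherever |g| ≤ κ) is exactly why larger κ produces sparser controls.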

Second Order Necessary Optimality Conditions
In contrast to the theory of second order sufficient conditions, there are fewer papers on second order necessary optimality conditions. This does not mean that these conditions are unimportant. Quite the contrary: they are decisive for estimating how far the associated sufficient conditions are from the necessary ones. The gap between them should be as small as possible. The theory is well developed for problems with control constraints and finitely many state constraints, cf. [7-9, 12, 20, 26, 27]. However, it is widely open in the presence of pointwise state constraints.

Extension of the Neighborhood for Local Optimality
We pointed out in our survey that the two-norm discrepancy is a characteristic difficulty in the theory of second order sufficient conditions, mainly in the control of PDEs. In early papers on the subject, the neighborhood of local optimality obtained from the SSC was usually an L^∞-neighborhood. Recently, it was proved under mild assumptions in [25,29,30] that the local optimality even holds in L²-neighborhoods. In this context, we also mention [7].

Verification of SSC and Test Examples
In the calculus of variations and optimal control of ODEs, there is a variety of problems with practical background and exactly known optimal solutions. For these, it can be checked analytically whether SSC are fulfilled. In the theory of PDE-constrained optimal control, the situation is somewhat different. Meaningful real-world problems cannot be solved exactly, because the solution of the PDE must be determined numerically. Therefore, the published problems with exactly known optimal solution are mathematical constructions; they are important for testing numerical algorithms.
In particular, this refers to problems where the optimal solution satisfies a second order condition. Such examples were constructed for nonlinear PDEs, e.g., in [35,55,70,83], to name only a few of them.
In this context, an important question arises: can SSC be verified numerically, say by computing the smallest eigenvalue of a reduced Hessian matrix for the discretized optimal control problem? The general answer is no, even for the control of ODEs. An impressive counterexample was constructed in [80]: the optimal solution of the finite element discretization of the example satisfies a second order sufficient condition for every mesh size h, yet the limit as h ↓ 0 is a saddle point, not a local minimum. In some very special cases, numerical computations along with analytical estimates were used to verify SSC, cf. [81,88].
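The caveat can be illustrated with a toy family of matrices; this mimics only the spirit of the counterexample in [80], not its construction. Each "reduced Hessian" H_h below is positive definite, so the numerical SSC test succeeds for every fixed h, yet the smallest eigenvalue degenerates as h ↓ 0, and positivity at a fixed mesh proves nothing about the limit problem.

```python
import numpy as np

def smallest_eig(H):
    """Smallest eigenvalue of a symmetric matrix -- the usual numerical
    SSC test applied to a reduced Hessian."""
    return float(np.linalg.eigvalsh(H).min())

# Toy family of "reduced Hessians": positive definite for every mesh
# size h, but the definiteness degenerates as h -> 0.
for h in (1.0, 0.1, 0.01, 0.001):
    H_h = np.diag([1.0, h])
    print(h, smallest_eig(H_h))   # positive for each h, tending to 0
```

A positive smallest eigenvalue at one mesh size is therefore no certificate for the continuous problem; this is precisely why [80] is so instructive.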

SSC in Stability Analysis with Respect to Perturbations
Assuming SSC, the Lipschitz stability of optimal controls with respect to various perturbations of the data of control problems can be proved. In the framework of nonlinear PDE control, we refer to [1,31,50,63,64,82]. In [50], the stability with respect to a discretization of the control problem is also investigated. We also mention the paper [37], which was written for the control of ODEs but inspired many investigations of stability analysis and a priori error estimates in the control of PDEs. In this context, we also quote [62] on Lipschitz stability.
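A scalar model problem shows why SSC yield Lipschitz stability: if the reduced functional is coercive with constant ν (the SSC in miniature), the solution of min (ν/2)u² − gu over a box is the projection of g/ν onto the box, and since projection is nonexpansive, the solution map g ↦ ū(g) is Lipschitz with constant 1/ν. The values of NU, g, and the box below are illustrative assumptions of ours.

```python
import numpy as np

NU = 0.5   # coercivity constant of the quadratic form (the "SSC" here)

def optimal_control(g, alpha=-1.0, beta=1.0):
    """Minimizer of (NU/2) u^2 - g*u over the box [alpha, beta]:
    the projection of g / NU onto the admissible interval."""
    return float(np.clip(g / NU, alpha, beta))

# Perturbing the data g by delta moves the optimal control by at most
# delta / NU, since the projection onto the box is nonexpansive.
g = 0.2
for delta in (0.1, 0.01, 0.001):
    du = abs(optimal_control(g + delta) - optimal_control(g))
    print(delta, du, du <= delta / NU + 1e-12)
```

In the PDE setting, the coercivity of the second order form on the critical cone plays exactly the role of ν here; this is the mechanism behind the stability results in the references above.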

SSC in a Priori Error Estimates
We mentioned that SSC are indispensable for estimating the distance of the optimal solution of a discretized optimal control problem from the unknown exact one if the PDEs are nonlinear. Let us quote some additional references. SSC were used in the contributions [18,50,73] to derive error estimates. We also mention [68], where the approximation order is improved by a postprocessing step. For a priori error estimates in problems with regularized state constraints, SSC were assumed in [58].

SSC in the Convergence Analysis of SQP Methods
The sequential quadratic programming (SQP) technique is a Newton-type method that is locally quadratically convergent in a neighborhood of a selected locally optimal reference solution of the control problem, provided a second order sufficient optimality condition is satisfied at this solution. There is an extensive list of references on the convergence analysis of SQP methods; we refer exemplarily to [42-44, 48, 49, 83, 86, 87]. SSC are also applied in papers where the semismooth Newton method is analyzed for the control of nonlinear PDEs. We refer to the monograph [85] and the references therein.
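A scalar caricature makes the role of the SSC visible: for an unconstrained problem min f(u), the SQP iteration reduces to Newton's method on the stationarity condition f′(u) = 0, and the quadratic decay of the error hinges on f″(ū) > 0. The concrete example f(u) = eᵘ − 2u with minimizer ū = ln 2 and f″(ū) = 2 > 0 is our own illustration, not taken from the cited references.

```python
import math

U_STAR = math.log(2.0)   # exact minimizer of f(u) = exp(u) - 2u

def newton_min(fp, fpp, u, steps=6):
    """Newton's method on f'(u) = 0 -- the scalar caricature of an SQP
    step.  Quadratic convergence requires f''(u*) > 0, i.e. the second
    order sufficient condition at the reference solution."""
    errors = []
    for _ in range(steps):
        u -= fp(u) / fpp(u)
        errors.append(abs(u - U_STAR))
    return u, errors

# minimize f(u) = exp(u) - 2u:  f'(u) = exp(u) - 2,  f''(u) = exp(u) > 0
u, errors = newton_min(lambda v: math.exp(v) - 2.0, math.exp, u=1.5)
print(errors)   # errors roughly square from step to step
```

If f″ degenerates at the minimizer (say f(u) = u⁴), the same iteration converges only linearly, which is why the convergence proofs for SQP methods assume an SSC at the reference solution.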