ON OPTIMAL CONTROL PROBLEMS WITH CONTROLS APPEARING NONLINEARLY IN AN ELLIPTIC STATE EQUATION

An optimal control problem for a semilinear elliptic equation is discussed, where the control appears nonlinearly in the state equation but is not included in the objective functional. The existence of optimal controls is proved by a measurable selection technique. First-order necessary optimality conditions are derived, and two types of second-order sufficient optimality conditions are established. The first theorem invokes a well-known assumption on the set of zeros of the switching function. The second relies on coercivity of the second derivative of the reduced objective functional. The results are applied to the convergence of optimal state functions for a finite element discretization of the control problem.

1. Introduction. In this paper, we consider the following optimal control problem:

(P)  min_{u \in \scrU_{ad}} J(u) := \int_\Omega L(x, y_u(x)) dx,

where y_u denotes the solution of the state equation

(1.1)  A y = f(x, y, u) in \Omega,  \partial_{n_A} y = 0 on \Gamma,

and \scrU_{ad} = \{u \in L^\infty(\Omega) : \alpha \leq u(x) \leq \beta for a.a. x \in \Omega\}. Precise assumptions on the data of the control problem (P) will be given in section 2. However, it is important to mention here that f is assumed to be monotone nonincreasing with respect to the variable y, so that (1.1) has a unique solution.
This paper has two main features: the control appears nonlinearly in the state equation, and the objective functional depends only on the state. In mainstream papers, the control appears linearly in the state equation and often in a quadratic Tikhonov regularization term in the objective functional.
Our setting has several consequences:
\bullet The existence proof of optimal controls must differ from the standard one, which is based on minimizing sequences of controls and their weak convergence. We will prove existence by a measurable selection theorem. In the framework of optimal control theory of ODEs, similar techniques have been used; we refer, e.g., to the textbooks [19] and [13]. In PDE control, this issue has been addressed in [14], [15], or [21].
Using the mentioned measurable selection theorem, we are able to prove the convergence of a numerical approximation of the control problem. Indeed, in section 7 we consider a discretization of (P) by finite elements. We prove the strong convergence in H^1(\Omega) of optimal discrete states to associated ones of the original continuous problem. To our knowledge, this application of measurable selection theorems is new for the numerical approximation of problems like (P).
\bullet For optimality conditions, the superposition operator u \mapsto f(\cdot, y, u) should be Fréchet differentiable. In view of this, the only useful space that does not need strong growth conditions on f is L^\infty(\Omega). Then, however, the well-known two-norm discrepancy is unavoidable in the discussion of second-order sufficient optimality conditions.
\bullet Moreover, and this is another issue, second-order optimality conditions are delicate when the control is not explicitly included in the cost functional. Actually, the classical assumption J''(\bar u)v^2 \geq \delta \|v\|^2 for all v in the critical cone is not satisfied when u appears linearly in the state equation; see, e.g., [24, Lemma 5.1]. Nevertheless, since f is nonlinear with respect to the control, the coercivity of J''(\bar u) can be fulfilled, as we show in one example. This fact is crucial for some of our results on second-order conditions. We will prove sufficient optimality conditions in two ways. The first relies on the fact that bang-bang controls can be expected to be optimal, due to the missing control in the objective. Here, using a structural assumption on the optimal adjoint state, we show that a control satisfying the first-order optimality conditions is locally optimal in the sense of L^\infty(\Omega). Additionally, a quadratic growth condition for J can be deduced; see Theorem 5.3.
The second way is based on coercivity of J'' and a Legendre-Clebsch condition on the Hamiltonian. Similar assumptions are known from ODE control; we refer to [20] and the references therein. Under these hypotheses, we show local optimality of stationary controls in the sense of L^2(\Omega), although the differentiability relies on the space L^\infty(\Omega).
Summarizing, our paper contains the following main novelties. We show the existence of optimal controls by a measurable selection theorem. We prove different types of results on second-order sufficiency: one for bang-bang controls and one based on some hidden coercivity of J'', all for the case of controls appearing nonlinearly in the state equation. Finally, we discuss basic convergence properties for numerical discretizations of our problem.

2. Assumptions and preliminary results.
In this paper we make the following assumptions.
It is easy to check with (2.2) that x \mapsto f(x, y(x), u(x)) is a function belonging to L^{\bar p}(\Omega) for every (y, u) \in V \times L^\infty(\Omega). Moreover, \scrF is of class C^2. In addition, if y_u = G(u), then \scrF(y_u, u) = 0 holds. We also have that the linearized equation A z - \frac{\partial f}{\partial y}(x, y_u, u) z = v in \Omega, \partial_{n_A} z = 0 on \Gamma, has a unique solution z \in V depending continuously on v \in L^{\bar p}(\Omega). Now, the theorem follows easily by applying the implicit function theorem.
Thanks to the chain rule, we deduce the differentiability of J from the previous theorem.
Theorem 2.3. Under Assumptions 1 and 2, the mapping J : L^\infty(\Omega) \to \BbbR is of class C^2, and the representation (2.16) holds.

We finish this section with the following observation: from (2.16) and (2.5) we deduce the existence of a constant M_* such that the corresponding uniform bound holds for every u \in \scrU_{ad}.

3. Existence of optimal controls. If the control appears linearly in the state equation, the standard method for proving existence of an optimal control is as follows. An infimal sequence of controls is considered that is bounded. Then a weakly converging subsequence is selected that eventually converges to an optimal control. This method cannot, in general, be applied to controls appearing nonlinearly.
We will use a technique based on measurable selection theorems, in particular on the well-known Filippov theorem. This method has often been applied to control problems governed by ordinary differential equations but has rarely been used for partial differential equations.
Let us start with some examples that illustrate specific difficulties caused by controls appearing nonlinearly.
Example 3.1. Our first example illustrates an important effect related to the nonlinear appearance of controls. We consider the following linear state equation with control u \in L^2(\Omega):

- \Delta y + y = u^2 in \Omega,  \partial_n y = 0 on \Gamma.

Select a sequence of bang-bang control functions u_k \in L^\infty(\Omega) such that u_k(x) \in \{-1, 1\} a.e. in \Omega and u_k \rightharpoonup 0 (weakly) in L^2(\Omega). It is easy to construct such a sequence. Obviously, we have u_k(x)^2 = 1 for all k; hence the associated states y_k = y_{u_k} satisfy y_k(x) = 1 a.e. in \Omega. This stationary sequence converges uniformly to y = 1, but this function y is not the state associated with the weak limit control \tilde u = 0.
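The bang-bang sequence in Example 3.1 is easy to realize numerically. The following sketch is only an illustration, not part of the paper: the concrete choice u_k(x) = sign(sin(2\pi k x)) on \Omega = (0,1) and the test function are ours. It checks that the L^2 pairing of u_k with a fixed smooth function tends to zero while u_k^2 \equiv 1.

```python
import numpy as np

def u_k(x, k):
    # Bang-bang control: +1/-1 on alternating intervals of width 1/(2k).
    return np.where(np.sin(2 * np.pi * k * x) >= 0, 1.0, -1.0)

# Uniform grid on Omega = (0, 1) and a smooth test function v.
x = np.linspace(0.0, 1.0, 200001)
v = np.exp(x)

def pairing(k):
    # Riemann-sum approximation of the L^2 pairing (u_k, v).
    return np.mean(u_k(x, k) * v)

# The pairing shrinks like O(1/k), consistent with u_k -> 0 weakly,
# while u_k(x)^2 = 1 everywhere, so the states y_k of Example 3.1
# equal 1 for every k.
for k in (1, 10, 100):
    print(k, pairing(k))
print(np.all(u_k(x, 5) ** 2 == 1.0))
```

The same effect persists for any fixed test function v: weak convergence of u_k carries no information about the strong limit of u_k^2.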
The reader might object that the nonlinearity u^2 is too simple, since u_k^2 \equiv 1 does not depend on k. Therefore, we also discuss a less naive example.
Example 3.2. Here, we consider the nonlinear equation

- \Delta y + y = u^2 e^{-y} in \Omega,  \partial_n y = 0 on \Gamma,

where we insert the sequence u_k defined in Example 3.1. Since u_k^2 \equiv 1, the associated states y_k solve the equation \{- \Delta y(x) + y(x) = e^{-y(x)} in \Omega, \partial_n y = 0 on \Gamma\}. The states y_k converge strongly in H^1(\Omega) \cap L^\infty(\Omega) to the solution y of this equation, but not to the state y_{\tilde u} = 0 that is associated with the weak limit control \tilde u = 0.

Theorem 3.3. Under Assumptions 1--3, the optimal control problem (P) has at least one solution.
Proof. Let \{u_k\}_{k=1}^\infty \subset \scrU_{ad} be a minimizing sequence of (P): J(u_k) \searrow \inf(P) as k \to \infty. From (2.8) we deduce that \{y_{u_k}\}_{k=1}^\infty is bounded in H^1(\Omega) \cap C^{0,\mu}(\bar\Omega). Therefore, there exist a subsequence, denoted in the same way, and a function \bar y \in H^1(\Omega) \cap C(\bar\Omega) such that y_{u_k} \rightharpoonup \bar y in H^1(\Omega) and y_{u_k} \to \bar y in C(\bar\Omega). Moreover, \bar y satisfies \int_\Omega L(x, \bar y(x)) dx = \lim_{k\to\infty} J(u_k) = \inf(P). The first identity follows from Lebesgue's dominated convergence theorem. Indeed, taking M = \max_{k \geq 1} \|y_{u_k}\|_{C(\bar\Omega)}, we deduce from (2.5), the mean value theorem, and (2.4) that \{L(\cdot, y_{u_k})\}_{k=1}^\infty is dominated by an L^1(\Omega)-function. To conclude the proof, it is enough to show that \bar y is the solution of (1.1) associated to some control \bar u \in \scrU_{ad}; then \bar u is a solution of (P). To this end, we introduce the multifunction F(x) = \{f(x, \bar y(x), t) : t \in [\alpha, \beta]\}. Since f is continuous with respect to the last variable, F(x) is a closed and bounded interval of \BbbR for almost all x \in \Omega. Now, we define S = \{g \in L^{\bar p}(\Omega) : g(x) \in F(x) for a.a. x \in \Omega\}.
It is obvious that S is a convex and closed subset of L^{\bar p}(\Omega). Finally, setting g_k(x) = f(x, \bar y(x), u_k(x)), we have that \{g_k\}_{k=1}^\infty is a sequence contained in S. Indeed, it is obvious that g_k(x) \in F(x) for almost all x \in \Omega. Let us prove that every function g_k belongs to L^{\bar p}(\Omega). From (2.8) we have that \|y_{u_k}\|_{C(\bar\Omega)} is uniformly bounded. Then, from (2.2) and using the mean value theorem, we infer that \{g_k\}_{k=1}^\infty is a bounded sequence in L^{\bar p}(\Omega). Therefore, we can extract a subsequence, denoted in the same form, such that g_k \rightharpoonup \bar g in L^{\bar p}(\Omega). Since S is weakly closed in L^{\bar p}(\Omega), we have that \bar g \in S. Now, from the classical Filippov theorem (see [16] or [19]), we deduce the existence of a measurable function \bar u : \Omega \to [\alpha, \beta], i.e., \bar u \in \scrU_{ad}, such that \bar g(x) = f(x, \bar y(x), \bar u(x)) for almost all x \in \Omega. We conclude the proof by showing that \bar y = y_{\bar u}. Using again (2.2) and passing to the limit in the state equation, we obtain that A\bar y = f(\cdot, \bar y, \bar u) in \Omega, as desired.
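The Filippov step of the proof, recovering a control \bar u with f(x, \bar y(x), \bar u(x)) = \bar g(x) from the mere inclusion \bar g(x) \in F(x), can be imitated numerically by a pointwise minimal selection. The sketch below uses hypothetical data of our own choosing (f(x, y, u) = e^{-y}u^2 echoing Example 3.2, bounds [\alpha, \beta] = [-1, 1], a fixed state \bar y), not data from the paper.

```python
import numpy as np

# Hypothetical data (our choice): f(x, y, u) = exp(-y) * u**2,
# control bounds [-1, 1], a fixed continuous state ybar on Omega = (0, 1).
alpha, beta = -1.0, 1.0
f = lambda x, y, u: np.exp(-y) * u ** 2

x = np.linspace(0.0, 1.0, 1001)
ybar = 0.5 * x
u_orig = np.cos(3 * x)          # an admissible control generating g
g = f(x, ybar, u_orig)          # hence g(x) lies in F(x) by construction

# Filippov-style minimal selection on a t-grid: for each x, take the
# smallest t in [alpha, beta] at which f(x, ybar(x), t) best matches g(x).
t = np.linspace(alpha, beta, 4001)
vals = f(x[:, None], ybar[:, None], t[None, :])         # (1001, 4001) table
ubar = t[np.argmin(np.abs(vals - g[:, None]), axis=1)]  # first hit = smallest t

# ubar attains g pointwise up to the grid resolution, although it differs
# from u_orig wherever u_orig(x) > 0 (the negative root is selected).
print(np.max(np.abs(f(x, ybar, ubar) - g)))
```

The selected ubar is a different control than u_orig yet generates the same right-hand side, which is exactly the non-uniqueness that the measurable selection argument exploits.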
4. First- and second-order necessary optimality conditions. The goal of this section is to derive the first- and second-order necessary optimality conditions to be fulfilled by any local solution of (P). Given p \in [1, \infty], we say that \bar u is a local minimum of (P) in the L^p(\Omega)-sense if there exists \varepsilon > 0 such that J(\bar u) \leq J(u) for all u \in \scrU_{ad} \cap B_\varepsilon(\bar u), where B_\varepsilon(\bar u) = \{u \in L^p(\Omega) : \|u - \bar u\|_{L^p(\Omega)} \leq \varepsilon\}. We say that \bar u is a strict local minimum if the above inequality is strict whenever u \neq \bar u. The Hamiltonian of (P) is given by H : \Omega \times \BbbR^3 \to \BbbR, H(x, y, \varphi, u) = L(x, y) + \varphi f(x, y, u).

Now we formulate the first-order necessary optimality conditions for (P). We assume that p < \infty and prove (4.2). Let u \in \scrU_{ad} be a fixed control. We define h(x) = f(x, \bar y(x), u(x)) - f(x, \bar y(x), \bar u(x)). From (2.1) and (2.2) we deduce that h \in L^{\bar p}(\Omega). Let \{v_j\}_{j=1}^\infty be a dense sequence in L^1(\Omega). For every k \geq 1 we define the function g_k \in L^1(\Omega)^{k+1} by g_k = (1, v_1, \ldots, v_k). Given \rho \in (0, 1) arbitrarily, we deduce from Lyapunov's convexity theorem the existence of measurable sets E^k_\rho \subset \Omega such that \int_{E^k_\rho} g_k dx = \rho \int_\Omega g_k dx. Looking at the first component of this vector identity, we infer that |E^k_\rho| = \rho|\Omega|. Now, considering the remaining components, we observe that \int_\Omega (1 - \frac{1}{\rho}\chi_{E^k_\rho}) v_j dx = 0 for 1 \leq j \leq k. From the density of \{v_j\}_{j=1}^\infty in L^1(\Omega), this implies the corresponding limit for every v. Moreover, there exists some 1 < q < \frac{n}{n-1} such that L^{\bar p}(\Omega) is compactly embedded in W^{1,q}(\Omega)^*. Consequently, we have the strong convergence (1 - \frac{1}{\rho}\chi_{E^k_\rho})h \to 0 in W^{1,q}(\Omega)^*. Thus, we can select E_\rho = E^k_\rho with some sufficiently large k. Now, we define the diffuse perturbation u_\rho = \bar u + \chi_{E_\rho}(u - \bar u). The reader is referred to [4, 5, 8, 12, 22] for some previous papers using this type of diffuse perturbation. Obviously, u_\rho belongs to \scrU_{ad}. Let us denote by y_\rho and \bar y the states associated with u_\rho and \bar u, respectively. We also set z_\rho = \frac{1}{\rho}(y_\rho - \bar y). Subtracting the equations satisfied by y_\rho and \bar y, dividing by \rho, and using the mean value theorem and definition (4.6), we obtain the equation satisfied by z_\rho. Since u_\rho \to \bar u in L^{\bar p}(\Omega) when \rho \to 0, we know by Theorem 2.1 that y_\rho \to \bar y in H^1(\Omega) \cap C^{0,\mu}(\bar\Omega). Then, using again [18, section 3.14], we deduce from (4.5) that z_\rho \to z in H^1(\Omega) \cap C^{0,\mu}(\bar\Omega), where z is the solution of the linearized equation. Now we take into account that, due to (4.4), \|u_\rho - \bar u\|_{L^p(\Omega)} \to 0 as \rho \to 0. This implies that u_\rho \in B_\varepsilon(\bar u) holds for every \rho sufficiently small. Therefore, from the local optimality of \bar u we obtain 0 \leq \lim_{\rho \to 0} \frac{1}{\rho}(J(u_\rho) - J(\bar u)).

Next we formulate the second-order necessary conditions for local optimality. To this end, we first observe that by the representation (2.14) the functional J'(\bar u) : L^\infty(\Omega) \to \BbbR can be extended to a linear continuous form J'(\bar u) : L^2(\Omega) \to \BbbR. Now, we define the cone of critical directions

C_{\bar u} = \{v \in L^2(\Omega) : J'(\bar u)v = 0 and v satisfies the sign conditions (4.9) below\}.

We have the following well-known result; see, for instance, [9] for a proof.
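The Lyapunov/diffuse-perturbation construction above can be visualized with a simple "striped" set: a union of fine stripes of total measure \rho|\Omega| nearly satisfies \int_{E_\rho} v = \rho \int_\Omega v for many test functions v at once. A minimal numerical sketch on \Omega = (0,1), with parameters of our own choosing:

```python
import numpy as np

# Striped set E_rho: the first rho-fraction of each of k fine cells of (0, 1).
rho, k = 0.3, 200
x = np.linspace(0.0, 1.0, 400001)
chi = ((x * k) % 1.0) < rho      # indicator of E_rho on the grid

# |E_rho| = rho * |Omega|, and int_{E_rho} v ~ rho * int_Omega v holds
# simultaneously for several test functions v (error O(1/k)).
print(np.mean(chi))              # measure of E_rho, close to 0.3
for v in (np.ones_like(x), np.sin(np.pi * x), np.exp(x)):
    print(np.mean(np.where(chi, v, 0.0)), rho * np.mean(v))
```

Refining the stripes (k \to \infty) makes the identities exact in the limit; Lyapunov's theorem guarantees exact sets E^k_\rho for any finite family v_1, \ldots, v_k.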
Finally, we prove that \bar u satisfies (5.). Therefore, \gamma = +\infty can be taken, and Theorem 5.2 yields that \bar u is a strict local minimum of (P) in the L^\infty(\Omega)-sense.
Proof. From (2.8) we deduce the existence of a constant M such that

(6.11)  \|y_u\|_{C(\bar\Omega)} + \|u\|_{L^\infty(\Omega)} \leq M  \forall u \in \scrU_{ad}.

Subtracting the equations satisfied by y_u and \bar y, we obtain with the mean value theorem the equation for y_u - \bar y. Using the well-known estimates for linear systems, assumption (2.2), and the boundedness of \scrU_{ad} in L^\infty(\Omega), we infer an estimate which implies (6.7).
To prove (6.8) we use again the equation satisfied by y_u - \bar y, the classical L^\infty(\Omega)-estimates for linear systems (see [23]), and again (2.2) to obtain an estimate which proves (6.8).
Proof of Theorem 6.1.The proof is split into two parts.
Proof under assumption (6.4). Here we proceed by contradiction and assume the existence of a sequence \{u_k\}_{k=1}^\infty \subset \scrU_{ad} violating the quadratic growth condition. For k \geq 1 we define \rho_k = \|u_k - \bar u\|_{L^2(\Omega)} and v_k = \frac{1}{\rho_k}(u_k - \bar u). Since \|v_k\|_{L^2(\Omega)} = 1, we can take a subsequence, denoted in the same way, such that v_k \rightharpoonup v in L^2(\Omega). Let us prove that v \in C_{\bar u}. Since the set of functions of L^2(\Omega) satisfying (4.9) is convex and closed, and since every v_k obviously satisfies (4.9), v also satisfies it. It remains to prove that J'(\bar u)v = 0. From the optimality conditions (4.1) we deduce J'(\bar u)v \geq 0. To derive the converse inequality, we use (6.14). Hence, we can pass to the limit in (6.15) and deduce the desired converse inequality J'(\bar u)v \leq 0; thus J'(\bar u)v = 0 and hence v \in C_{\bar u}. Using again (6.14) and (4.1), and performing a second-order expansion, we obtain an inequality which, divided by \frac{\rho_k^2}{2}, yields (6.16). Our next goal is to pass to the limit in (6.16) and to deduce that J''(\bar u)v^2 \leq 0. Let us recall that, according to (2.15) and the definition of the Hamiltonian, we have the corresponding integral representation of J''; we denote by y_k and \varphi_k the associated state and adjoint state, respectively.
Next we present an example of a control problem where Theorem 6.1 is applicable. Let us fix \bar u(x) \equiv -1 and \bar y \equiv 2. Inserting \bar y in the state equation, e_\Omega can be fixed accordingly: e_\Omega = 2(1 - e^{-2}).
We confirm now that \bar u = -1 is locally optimal in the sense of L^2(\Omega). For the second derivative of the reduced objective functional J we obtain \frac{\partial^2 H}{\partial u^2}(x, \bar y, \bar\varphi, \bar u) = 6e^{-2} > 0. Then, thanks to Theorem 6.1, the control \bar u is locally optimal in the L^2(\Omega)-sense.

7. Convergence of numerical approximations. In this section, we assume that \Omega is a convex domain in \BbbR^n with n = 2 or 3. We also suppose that \Omega is polygonal if n = 2 or polyhedral if n = 3. In addition to Assumptions 1--3, we also assume that a_{ij} \in C^{0,1}(\bar\Omega) for 1 \leq i, j \leq n and f(\cdot, 0, 0) \in L^2(\Omega). Under these hypotheses it is well known that the solution of (1.1) belongs to H^2(\Omega) and that, with (2.8) and Assumption 2, there exists M_{\alpha,\beta} > 0 such that \|y_u\|_{H^2(\Omega)} \leq M_{\alpha,\beta} for all u \in \scrU_{ad}. We consider finite element spaces Y_h and \scrU_h built from piecewise polynomials, where P_i(T) denotes the space of polynomials on T of degree at most i, and we set \scrU_{ad,h} = \scrU_h \cap \scrU_{ad}. Now, we introduce the discrete version of (1.1) as follows: for u \in L^\infty(\Omega), y_h(u) is the solution of the nonlinear system

(7.2)  Find y_h \in Y_h such that a(y_h, z_h) = \int_\Omega f(x, y_h(x), u(x)) z_h(x) dx  \forall z_h \in Y_h.

The existence of a solution of (7.2) is proved by an easy application of Brouwer's fixed point theorem. The uniqueness follows from the monotonicity of f and the coercivity of the bilinear form a.
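To make the discrete system (7.2) concrete, here is a minimal 1D analogue of our own construction (the paper works in 2D/3D): \Omega = (0,1), P1 elements, a(y, z) = \int y'z' + yz with homogeneous Neumann boundary conditions, and the hypothetical nonlinearity f(x, y, u) = e^{-y}u^2 borrowed from the examples. The monotone decrease of f in y makes a plain Picard iteration converge.

```python
import numpy as np

# 1D sketch of (7.2): P1 elements on Omega = (0, 1), homogeneous Neumann BC,
# hypothetical nonlinearity f(x, y, u) = exp(-y) * u**2.
n = 200                        # number of elements
h = 1.0 / n
x = np.linspace(0.0, 1.0, n + 1)
u = np.cos(3 * x)              # a fixed admissible control at the nodes

# Assemble P1 stiffness (K) and mass (M) matrices (dense for brevity).
K = np.zeros((n + 1, n + 1))
M = np.zeros((n + 1, n + 1))
for e in range(n):
    idx = [e, e + 1]
    K[np.ix_(idx, idx)] += np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
    M[np.ix_(idx, idx)] += h * np.array([[2.0, 1.0], [1.0, 2.0]]) / 6.0

A = K + M                      # matrix of the bilinear form a(., .)

# Picard iteration: since f is nonincreasing in y, the map
# y -> A^{-1} M f(y) is contractive here and converges to y_h(u).
y = np.zeros(n + 1)
for _ in range(200):
    y_new = np.linalg.solve(A, M @ (np.exp(-y) * u ** 2))
    if np.max(np.abs(y_new - y)) < 1e-12:
        y = y_new
        break
    y = y_new

# Residual of the discrete equation at the computed fixed point.
print(np.max(np.abs(A @ y - M @ (np.exp(-y) * u ** 2))))
```

Uniqueness, as in the paper, rests on the monotonicity of f together with the coercivity of a; in this sketch that pair is what makes the Picard map a contraction.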
We approximate problem (P) by the discretized optimal control problem

(\scrP_h)  \min_{u_h \in \scrU_{ad,h}} J_h(u_h) := \int_\Omega L(x, y_h(u_h)(x)) dx,

where y_h(u_h) is the solution of (7.2) for u = u_h. For (\scrP_h), we obtain the following convergence theorem.