Constrained subspace estimation via convex optimization

Given a collection of M experimentally measured subspaces and a model-based subspace, this paper addresses the problem of finding a subspace that approximates the collection, under the constraint that it intersects the model-based subspace in a predetermined number of dimensions. This constrained subspace estimation (CSE) problem arises in applications such as beamforming, where the model-based subspace encodes prior information about the direction-of-arrival of some sources impinging on the array. In this paper, we formulate the CSE problem and present an approximation based on a semidefinite relaxation (SDR) of this non-convex problem. The performance of the proposed CSE algorithm is demonstrated via numerical simulation, and its application to beamforming is also discussed.


I. INTRODUCTION
Input data in many signal processing applications admit a subspace representation, i.e., the intrinsic dimension of the objects of interest is much smaller than the dimension of the ambient space in which they live. Examples include image and video recognition, where a single object under different pose or illumination conditions can be modeled as a subspace [1], or pattern recognition applications in which features obtained after a dimensionality reduction stage, such as principal component analysis (PCA), are commonly used [2]. In other signal analysis problems, one has to cope with the effects of the multiplicity of appearances of objects under different warpings or elastic deformations, where an invariant representation of the signal is shown to have the form of a subspace [3].
A central problem in these applications is to estimate an average subspace from a collection of experimentally measured subspaces. In our previous work [4], [5], we presented a solution for this subspace averaging problem that minimizes an extrinsic mean-squared error, defined by the squared Frobenius norm between projection matrices. In addition, the solution in [4] automatically returns the dimension of the optimal average subspace. Variations on this problem have been considered in [6] and [7]. An application to image classification that first estimates the mean representation of the object manifold by means of the proposed subspace averaging technique has recently been described in [8].
In this paper, we extend our previous work and consider the problem of finding a subspace that best approximates the collection of subspaces according to an extrinsic distance measure and, at the same time, intersects a model-based subspace in a given number of dimensions. The model-based subspace can be determined from the physical process that generates the measurements, or from prior information about the problem. For instance, in a beamforming problem [9] the model-based subspace might encode prior information about the angle of arrival of a source impinging on a linear array of antennas. When the direction-of-arrival (DoA) of the impinging source is not perfectly known because of mismatch errors due to imperfect array calibration or local scattering, this prior information can be encoded in the form of a low-dimensional subspace in which the incident signal is expected to lie.
The rest of the paper is organized as follows: in Section II, we provide a brief review of the subspace averaging technique. Section III formulates the non-convex constrained subspace estimation (CSE) problem and presents an approximate solution based on semidefinite relaxation (SDR). In Section IV we illustrate the performance of the CSE algorithm by means of numerical simulations, and present an application to beamforming as a motivating example. The main conclusions are summarized in Section V.
Notation: The notation G(d, n) will be used to represent the complex Grassmann manifold whose points parameterize all subspaces of dimension d in an n-dimensional ambient space. We use A ∈ G(d, n) to denote a point (subspace) on the Grassmann manifold, whereas A is used to denote a matrix whose columns form a unitary basis for that subspace. The superscripts (·)^T and (·)^H denote transpose and Hermitian, respectively. The trace, rank, and Frobenius norm of a matrix B will be denoted, respectively, as tr(B), rank(B), and ‖B‖_F. Finally, I_n denotes the n × n identity matrix.

II. SUBSPACE AVERAGING
Given a collection of experimentally measured subspaces V_m ∈ G(q_m, n), m = 1, . . ., M, a classical problem in statistical signal processing is that of finding an average subspace of dimension d_s, V_s ∈ G(d_s, n), that "best approximates" the collection according to an extrinsic distance measure.
The extrinsic distance between two subspaces, V_s and V_m, is defined as the Frobenius norm of the difference between the respective projection matrices [10], [5]:

d²(V_s, V_m) = ‖P_s − P_m‖²_F, (1)

where P_s = V_s V_s^H and P_m = V_m V_m^H. Notice that the two subspaces can have different dimensions. Actually, writing the squared extrinsic distance in terms of the cosines of the principal angles θ_i between the two subspaces, it is easy to show that [11]

d²(V_s, V_m) = d_s + q_m − 2 Σ_i cos² θ_i. (2)

In our previous work [4], [5], we showed that, given the collection {V_m}, m = 1, . . ., M, the average or central subspace of dimension d_s maximizes the cost function

maximize_{V_s ∈ S(n, d_s)} tr(V_s^H P V_s), (3)

where S(n, d_s) denotes the complex Stiefel manifold of orthonormal d_s-frames in C^n, and P is the average of projection matrices, defined as

P = (1/M) Σ_{m=1}^M P_m. (4)

This average is not, itself, a projection matrix. It is known that the solution of (3) is given by any unitary matrix whose column space is the same as the subspace spanned by the d_s principal eigenvectors of P [12], [13]. Moreover, in [4] it was shown that the optimal choice of d_s is determined by the index of the last eigenvalue of P that exceeds 1/2, a result that is identical to a finding in [14] for the unrelated problem of constructing low-dimensional time-frequency projections.
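The unconstrained average of this section reduces to an eigendecomposition of the mean projection matrix. The following NumPy sketch (function and variable names are ours, not the paper's) returns both the average subspace and the order selected by the eigenvalue-1/2 rule of [4]:

```python
import numpy as np

def subspace_average(bases):
    """Average subspace from a list of unitary bases V_m (each n x q_m).
    Returns the d_s principal eigenvectors of the mean projection matrix P,
    with d_s chosen as the number of eigenvalues of P exceeding 1/2."""
    P = sum(V @ V.conj().T for V in bases) / len(bases)
    eigval, eigvec = np.linalg.eigh(P)          # ascending order
    order = np.argsort(eigval)[::-1]            # sort descending
    eigval, eigvec = eigval[order], eigvec[:, order]
    d_s = max(int(np.sum(eigval > 0.5)), 1)     # order-selection rule from [4]
    return eigvec[:, :d_s], d_s
```

When all measured subspaces coincide, the average reproduces that subspace and the rule recovers its dimension.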

III. CONSTRAINED SUBSPACE ESTIMATION
In some applications, in addition to the collection of measured subspaces, we might have a model-based subspace W of dimension d_w that is used to constrain the sought solution by requiring it to intersect W in d_i dimensions. This model-based subspace can be determined from the physical process that generates the measurements, or from prior information about the problem. As a motivating example, consider a beamforming problem in which one of the impinging signals is known to lie in a given subspace of dimension d_w (as we will see in Section IV, this model-based subspace can be determined from the estimated DoA of the source and the presumed mismatch error).
Problem Statement. Our problem can be stated as follows: given a collection of experimentally measured subspaces V_m ∈ G(q_m, n), for m = 1, . . ., M, and a model-based subspace W ∈ G(d_w, n), find an average subspace V_s of dimension d_s which maximizes the cost function of (3), under the constraint that it intersects the model-based subspace in d_i dimensions. To avoid trivial cases where the unconstrained solution is already optimal, we assume 0 < d_i ≤ min(d_s, d_w). From the unconstrained subspace average described in Section II, it is clear that the information from the collection of subspaces can be summarized through the average projection matrix P. Therefore, the optimization problem we want to solve can be written as

maximize_{V_s ∈ S(n, d_s)} tr(V_s^H P V_s) subject to dim(V_s ∩ W) = d_i. (5)

The solution can be parameterized as follows

V_s = [W A   [W W⊥] B], (6)

where W ∈ C^{n×d_w} is a matrix whose columns form a unitary basis for the model-based subspace and W⊥ ∈ C^{n×(n−d_w)} is a unitary basis for its orthogonal complement. The matrices A ∈ C^{d_w×d_i} and B ∈ C^{n×(d_s−d_i)} are unitary slices to be determined during the optimization procedure. It is clear that any matrix that can be expressed as (6) determines a subspace of dimension d_s that intersects W in at least d_i dimensions. Define the rotated average projection matrix

X = [W W⊥]^H P [W W⊥], (7)

where X_11 denotes its upper-left d_w × d_w block. With these definitions, problem (5) can be re-phrased as

maximize_{A, B} tr(A^H X_11 A) + tr(B^H X B) subject to A^H A = I_{d_i}, B^H B = I_{d_s−d_i}, [A^T 0]^H B = 0.

Now, taking into account that tr(A^H X_11 A) = tr(X_11 A A^H) = tr(X_11 P_A), where P_A = A A^H is a rank-d_i projection matrix, and analogously tr(B^H X B) = tr(X P_B) with P_B = B B^H, the problem becomes

maximize_{P_A, P_B} tr(X_11 P_A) + tr(X P_B) subject to P_A² = P_A, P_B² = P_B, rank(P_A) = d_i, rank(P_B) = d_s − d_i, tr([P_A 0; 0 0] P_B) = 0, (8)

where the last constraint enforces the orthogonality between the subspaces spanned by the bases [A^T 0]^T and B.
Except for the rank constraints, problem (8) is convex. Therefore, we can use the idea of semidefinite relaxation (SDR) [15] and solve the following convex relaxation (eliminating also some redundant constraints):

maximize_{P_A, P_B} tr(X_11 P_A) + tr(X P_B) subject to P_A ⪰ 0, P_B ⪰ 0, tr(P_A) = d_i, tr(P_B) = d_s − d_i, [P_A 0; 0 0] + P_B ⪯ I_n. (9)

To solve problem (9) we used CVX, a package for specifying and solving convex programs [16]. When the relaxation is tight, the solution of (9) returns projection matrices P_A and P_B of ranks d_i and d_s − d_i, and the global optimum satisfies all the constraints of (8); otherwise, we need some approximation. Several methods have been proposed in the literature to extract solutions with the desired ranks when the relaxation is not tight, such as Gaussian randomization or rank-reduction procedures (also called purification methods) [17]. These methods, however, incur some computational cost. Therefore, in the following subsection we propose an approximate solution to be applied when the relaxation is not tight.
Approximate linear algebraic solution to the SDR. In the convex relaxation, the coefficient matrices [A^T 0]^T and B are estimated simultaneously. In fact, problem (9) can be viewed as two max-trace problems which are coupled through the constraints. In the proposed approximate solution, by contrast, the two subspaces are obtained sequentially by solving two separate problems: first, we obtain the part of the solution that intersects W in d_i dimensions, and second, we complete the subspace with the remaining d_s − d_i dimensions, which are extracted from the orthogonal complement of the d_i-dimensional subspace obtained in the first step.
1) Obtain a d_i-dimensional subspace belonging to the model-based subspace W. To this end, solve

maximize_{A: A^H A = I_{d_i}} tr(A^H P_W A), (10)

where P_W = W^H P W. The solution, A*, is given by any unitary matrix whose column space is the same as the subspace spanned by the d_i principal eigenvectors of P_W. Then, the basis for this portion of the subspace is F = W A*.
2) Now, let P⊥_F = I_n − F F^H be the projection onto the orthogonal complement of F. Similarly to the previous step, a basis for the remaining portion of the subspace is given by the d_s − d_i main eigenvectors of (P⊥_F)^H P P⊥_F. We denote this n × (d_s − d_i) matrix as G.
3) A unitary basis for the constrained subspace is finally given by V_s = [F G].
The proposed Constrained Subspace Estimation (CSE) algorithm is summarized in Algorithm 1.
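The two-step approximation above is a pair of eigendecompositions and can be sketched directly in NumPy (names are ours; the paper gives no code):

```python
import numpy as np

def cse_approx(P, W, d_i, d_s):
    """Approximate linear-algebraic CSE solution (steps 1-3 above).
    P is the n x n average projection matrix, W an n x d_w unitary basis."""
    n = W.shape[0]
    # Step 1: d_i principal eigenvectors of P_W = W^H P W give A*, F = W A*
    PW = W.conj().T @ P @ W
    _, vecs = np.linalg.eigh(PW)                # eigenvalues ascending
    A_star = vecs[:, ::-1][:, :d_i]             # d_i principal eigenvectors
    F = W @ A_star
    # Step 2: d_s - d_i principal eigenvectors of P_F_perp^H P P_F_perp
    P_F_perp = np.eye(n) - F @ F.conj().T
    C = P_F_perp.conj().T @ P @ P_F_perp
    _, vecs2 = np.linalg.eigh(C)
    G = vecs2[:, ::-1][:, :d_s - d_i]
    # Step 3: concatenate the two pieces into a unitary basis
    return np.hstack([F, G])
```

By construction the first d_i columns of the result lie in the span of W, so the intersection constraint is satisfied exactly.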

IV. SIMULATION RESULTS
Example 1. We generate M perturbed versions of a central subspace V_0 ∈ G(d_s, n) as follows:
1) Generate a unitary basis for the model-based subspace W ∈ G(d_w, n).
2) Generate a unitary basis for the central subspace, V_0 ∈ G(d_s, n), taking at random d_i dimensions from W and d_s − d_i dimensions from W⊥.
3) Generate a collection of M matrices of dimensions n × n as

G_m = [V_0  0] + ε Z_m,

where V_0 ∈ C^{n×d_s} is a unitary basis for V_0, and Z_m ∈ C^{n×n} is a matrix whose entries are independent and identically distributed complex Gaussian random variables with zero mean and variance 1/n. The value of ε determines the signal-to-noise ratio, which is defined as SNR = 10 log10 (d_s/(n ε²)). An orthogonal basis for the m-th subspace, V_m, is then constructed from the first d_s orthonormal vectors of the QR decomposition of G_m. In words, the basis of each measured subspace V_m is an additively perturbed version of the basis of the signal subspace V_0, which intersects W ∈ G(d_w, n) in d_i dimensions. The SNR measures the spreading of the subspaces: a low SNR means that the subspaces are very far from each other, whereas a high SNR means that the subspaces are tightly clustered around V_0.
For the first example, we have considered a collection of M = 10 subspaces in G(8, 20). The solution and the model-based subspaces both have dimension 8, and they are constrained to intersect in d_i = 4 dimensions. As a figure of merit we compute the extrinsic distance, defined in (1), between the solution V_s and the true V_0. For comparison, we also compute the results of the unconstrained average subspace (cf. Section II), which is denoted here as UNC. Fig. 1 shows the average extrinsic distance in logarithmic scale, estimated from 100 independent Monte Carlo simulations, for the UNC method (dashed line) and the proposed CSE algorithm (solid line) for two different values of the ambient space dimension. As we can observe, when the subspaces are spread out (i.e., for low SNR values) the CSE algorithm, which enforces the intersection constraint with the model-based subspace in the required number of dimensions, provides a more accurate estimate. Keeping the rest of the parameters fixed, the estimates of both methods improve when the ambient space dimension grows.
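The data-generation procedure of Example 1 can be sketched as follows. This is our reading of the extraction-damaged definitions (we take G_m = [V_0 0] + ε Z_m and SNR = 10 log10(d_s/(n ε²))), and for simplicity the randomly chosen intersecting dimensions are fixed to the first columns:

```python
import numpy as np

def generate_collection(n, d_s, d_w, d_i, M, snr_db, seed=0):
    """Synthetic collection of subspaces as in Example 1 (a sketch)."""
    rng = np.random.default_rng(seed)
    # 1) model-based subspace W and its orthogonal complement
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    Q = np.linalg.qr(A)[0]
    W, W_perp = Q[:, :d_w], Q[:, d_w:]
    # 2) central subspace: d_i dimensions from W, d_s - d_i from W_perp
    V0 = np.hstack([W[:, :d_i], W_perp[:, :d_s - d_i]])
    # 3) additively perturbed bases; complex Gaussian noise, variance 1/n
    eps = np.sqrt(d_s / (n * 10.0 ** (snr_db / 10.0)))
    bases = []
    for _ in range(M):
        Z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2 * n)
        G = np.hstack([V0, np.zeros((n, n - d_s))]) + eps * Z
        bases.append(np.linalg.qr(G)[0][:, :d_s])  # first d_s orthonormal vectors
    return W, V0, bases
```

By construction, V_0 intersects W in exactly d_i dimensions, and ε shrinks as the SNR grows, clustering the measured subspaces around V_0.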
Example 2. In the second example, we analyze the tightness of the semidefinite relaxation. In general, the tightness depends on a variety of parameters such as the ambient dimension, the number of subspaces, or the SNR. As an example, Fig. 2 shows the probability that the relaxation is tight, estimated from 100 independent Monte Carlo simulations when M = 100, n = 20, d_w = 6 and d_s = 6. We can see that for high SNRs (all subspaces are well clustered) the problem is tight with very high probability.
Application to beamforming. Let us consider an array of n sensors and K uncorrelated sources impinging on the array. Assuming the standard additive white Gaussian noise model, the theoretical covariance matrix of the array output vector has the following form

R = Σ_{k=1}^K σ_k² a(θ_k) a(θ_k)^H + σ² I_n,

where σ_k² are the powers of the incident signals, σ² is the noise power, and a(θ_k) are the steering vectors.
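This covariance model is straightforward to build numerically. The sketch below assumes a uniform linear array with half-wavelength spacing (an illustrative choice; the paper does not fix the array geometry):

```python
import numpy as np

def ula_steering(theta_deg, n, spacing=0.5):
    """Steering vector of a uniform linear array (spacing in wavelengths)."""
    k = np.arange(n)
    return np.exp(-2j * np.pi * spacing * k * np.sin(np.deg2rad(theta_deg)))

def theoretical_covariance(thetas, powers, noise_var, n):
    """R = sum_k sigma_k^2 a(theta_k) a(theta_k)^H + sigma^2 I_n."""
    R = noise_var * np.eye(n, dtype=complex)
    for th, p in zip(thetas, powers):
        a = ula_steering(th, n)[:, None]        # column vector
        R += p * (a @ a.conj().T)
    return R
```

Since each steering vector has squared norm n, the trace of R is n(σ² + Σ_k σ_k²), a quick sanity check on any implementation.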
In classical beamforming problems the goal is to estimate the angle or direction-of-arrival (DoA) of every impinging source. In our problem, however, we do not look for accurate DoA estimates, but only for a good approximation (in terms of captured signal power) of the span of the corresponding steering vectors. Suppose now that for a given source (without loss of generality we take this to be the first source) we have a previous estimate θ̂_1 of the DoA, with a possible angle mismatch error of up to δ degrees. Therefore, the true DoA belongs to the interval θ_1 ∈ Ω = [θ̂_1 − δ, θ̂_1 + δ]. This prior information can be approximated in the form of a model-based subspace, W ∈ G(d_w, n), which contains most of the energy of all steering vectors with DoAs in Ω. The model-based subspace is given by the span of the first d_w discrete prolate spheroidal wave functions [18], that is, the corresponding d_w-dimensional Slepian subspace [19], [20]. With this prior information, the problem amounts to estimating a subspace that intersects W in d_i dimensions. As the average of projection matrices we use P = R̂/tr(R̂), where R̂ is the sample covariance matrix estimated from M snapshots.
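A Slepian model-based subspace can be prototyped with SciPy's discrete prolate spheroidal sequences. This is a sketch: the mapping from the angular sector Ω to the DPSS time-bandwidth parameter, and the modulation to the band center, are our assumptions rather than the paper's exact construction:

```python
import numpy as np
from scipy.signal.windows import dpss

def slepian_subspace(theta_hat, delta, n, d_w, spacing=0.5):
    """Model-based subspace for DoAs in [theta_hat - delta, theta_hat + delta]
    (degrees), for a ULA with the given spacing in wavelengths.  The sector is
    mapped to a spatial-frequency band; the first d_w DPSS vectors, modulated
    to the band center, are used as the basis (our illustrative mapping)."""
    lo = spacing * np.sin(np.deg2rad(theta_hat - delta))
    hi = spacing * np.sin(np.deg2rad(theta_hat + delta))
    fc, half_bw = (lo + hi) / 2.0, (hi - lo) / 2.0
    tapers = dpss(n, max(n * half_bw, 0.5), Kmax=d_w)   # (d_w, n) array
    carrier = np.exp(-2j * np.pi * fc * np.arange(n))
    W = (tapers * carrier).T                            # n x d_w
    return np.linalg.qr(W)[0]                           # re-orthonormalize
```

For a steering vector with DoA inside the sector, most of its energy is captured by this low-dimensional subspace, which is exactly the property the CSE constraint exploits.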
To evaluate the performance of the subspace estimation algorithms, we use the relative efficiency, defined as

η = 1 − ‖P⊥_V V_0‖²_F / K,

where V_0 is a unitary basis for the subspace spanned by the K true steering vectors, and P⊥_V is the projection onto the orthogonal complement of the estimated subspace. Note that when η = 1 the estimated subspace is perfectly aligned with the array manifold vectors. We consider a scenario with n = 20 antenna elements, K = 2 sources with equal power, and M = 50 snapshots. We assume that an estimate of θ_1 with a mismatch error of up to ±δ degrees is available, and from this prior information we obtain a model-based subspace of dimension d_w = 3. The CSE algorithm estimates a subspace of dimension d_s = 2 that intersects W in d_i = 1 dimension. For comparison, we extract the 2-dimensional signal subspace of the sample covariance matrix (labeled as UNC in the figure). Fig. 3 shows the relative efficiency versus the SNR for two different values of the mismatch error. For low SNR values, the additional information provided by the model-based subspace yields a significant boost in performance even for very large values of the mismatch error. For high SNRs, however, the constrained solution is only better for small values of the mismatch error, since otherwise the prior information provided by the model-based subspace is unreliable and forcing an intersection can even be detrimental.

V. CONCLUSIONS
Finding an average subspace that intersects another subspace in a given number of dimensions is an interesting optimization problem that arises in a number of applications. In this paper, we have used semidefinite relaxation techniques to form a convex relaxation of this problem, and have proposed the Constrained Subspace Estimation (CSE) algorithm. Preliminary results in a beamforming problem indicate that, when the model-based subspace provides reliable prior information about the solution, the CSE algorithm outperforms the unconstrained estimate. Future work will include the analysis of the tightness of the relaxation and a more detailed performance evaluation of the CSE algorithm in several application domains.

Fig. 1. Extrinsic distance between the true subspace and the estimate provided by the UNC and CSE algorithms when the ambient space dimension varies (n = 20 and n = 60) while the other parameters are kept constant (d_s = d_w = 8, d_i = 4 and M = 10 subspaces).
Algorithm 1: Constrained Subspace Estimation (CSE)
Input: average of projections P, unitary basis for the model-based subspace W, dimensions d_s, d_w, d_i
Output: unitary basis for the average subspace V_s such that dim(V_s ∩ W) = d_i
Initialize: generate T = [W W⊥] and X = T^H P T; solve the SDR (9) and output P_A and P_B
if the relaxation is tight then
1.- Find a d_w × d_i unitary basis for the range of P_A: F_1
2.- Find an n × (d_s − d_i) unitary basis for the range of P_B: F_2
3.- Find the solution as V_s = [W F_1, T F_2]
else
1.- Solve problem (10) and find F = W A*
2.- Find a unitary basis G formed by the d_s − d_i principal eigenvectors of (P⊥_F)^H P P⊥_F
3.- Find the solution as V_s = [F G]
end if