Sparse Subspace Averaging for Order Estimation

This paper addresses the problem of source enumeration for arbitrary geometry arrays in the presence of spatially correlated noise. The method combines a sparse reconstruction (SR) step with a sub-space averaging (SA) approach, and hence it is named sparse sub-space averaging (SSA). In the first step, each received snapshot is approximated by a sparse linear combination of the rest of snapshots. The SR problem is regularized by the logarithm-based surrogate of the `0-norm and solved using a majorization-minimization approach. Based on the SR solution, a sampling mechanism is proposed in the second step to generate a collection of subspaces, all of which ap-proximately span the same signal subspace. Finally, the dimension of the average of this collection of subspaces provides a robust estimate for the number of sources. Our simulation results show that SSA provides robust order estimates under a variety of noise models.


INTRODUCTION
Detecting the number of signals received by an array of sensors is a crucial problem in many applications, such as wireless communications, radar, and biomedical and geophysical signal processing [1,2]. In array processing, this order estimation problem is known as source enumeration and boils down to the estimation of the array manifold dimension. Most common order estimation approaches are based on information-theoretic criteria [3][4][5][6], which minimize functions of the eigenvalues of the sample covariance matrix (SCM) penalized by a term that quantifies the model complexity. Many variants of these methods, based on random matrix theory, have been introduced to address the problem in the small sample regime where the number of observations is on the order of or smaller than the number of sensors [7][8][9]. These methods, however, are typically designed for spatially white noise and do not perform well under non-white or correlated noise. Source enumeration techniques when the noise samples have an unknown covariance matrix have recently been proposed in [10,11], but these methods require that the noise be sufficiently weaker than the signal.
Recently, [12,13] proposed a source enumeration technique for white noise based on subspace averaging (SA). The method was de- veloped for uniform linear arrays (ULAs), as it exploits the shift invariance property (SIP) to generate multiple subspaces. In particular, the SIP allows us to extract submatrices from the SCM that span the same signal subspace in noiseless conditions. Then, averaging these subspaces is a mechanism that enhances the signal subspace, while cancelling out the noise subspace. In the end, the dimension of the average subspace yields an estimate the number of sources. In [14], we extended the technique to account for independent but not identically distributed (non-iid) noise models. However, this method still exploits the SIP, and it is therefore only applicable to ULAs. Moreover it entails a large computational cost. This paper extends the SA enumeration technique to arrays with arbitrary geometry, referred to here as non-uniform linear arrays (NULAs), and to noise with arbitrary covariance matrix. The method first approximates each snapshot by a sparse linear combination of the rest of the received snapshots. In a noiseless situation, the number of nonzero coefficients of the sparse expansion would directly reveal the signal subspace dimension. In a noisy situation, the sparse expansions can be used to generate a collection of subspaces that, when averaged, will provide the order estimate using the SA method in [13]. This method, which combines sparse reconstruction with subspace averaging, is termed sparse subspace averaging (SSA).
The sparse reconstruction (SR) problem is solved by generalizing to complex-valued signals the well-known logarithm-based surrogate of the 0-norm [15]. Using the majorization-minimization (MM) framework [16], we show that the SR problem amounts to solving a reweighted regularized least squares (LS) problem. Based on the SR solution, a sampling mechanism is proposed to generate a collection of random subspaces, all of which approximately span the same signal subspace. Finally, an average or central subspace is estimated as in [13], and the dimension of this average provides a robust estimate for the number of sources impinging on the array. Sparse reconstruction approaches have been used before in array signal processing, but mainly for direction-of-arrival (DOA) estimation problems [17][18][19]. Differently from these works, in this paper we use sparse reconstruction methods as a means to generate multiple approximations to the signal subspace that, when averaged, provide accurate estimates of the underlying signal subspace rank.
Our simulation results show that SSA provides robust order estimates under a variety of noise models, ranging from uncorrelated noise with different variances to noise with arbitrary covariance matrices. Moreover, unlike the method in [13], SSA can be applied to any array geometry, not necessarily a ULA.
where H is the M × K unknown multiple input multiple output (MIMO) channel. The signals are assumed to be uncorrelated and are modelled as where Rn is the noise covariance matrix. In this paper, we do not assume any particular structure (scale identity, diagonal) for the noise covariance matrix Rn.
From (1), the covariance matrix of the received signal is where Rs = HΨH H is the signal covariance matrix. Note that since the MIMO channel H is unknown, we can assume without loss of generality that Ψ = I.
and the source enumeration or order estimation problem consists in estimating K from X orR.

SPARSE SUBSPACE AVERAGING
The key idea of subspace averaging for order estimation is to average subspaces that contain correlated (or ideally identical) versions of the signal subspace, but uncorrelated portions of the noise subspace. For ULAs, these subspaces can be obtained from consecutive subarrays by exploiting the shift invariance property [12,13,20]. When the MIMO channel H has no particular structure, however, a different approach to generate these subspaces is needed. The SSA method generates the subspaces from the sparse reconstructions of each snapshot, as we describe in the following sections.

Sparse Representation
In a noiseless situation, a basis for the K-dimensional signal subspace can be constructed from K randomly chosen columns of the data matrix X. This means that each snapshot can be represented as linear combination of other K snapshots that lie in the same subspace. Note that the K snapshots will be linearly independent with probability one. In a noisy situation, such an expansion will not exist in general, but still we can find a sparse representation of each snapshot in terms of the rest of snapshots. The snapshots selected by the sparse reconstruction algorithm will allow us to build (or generate) approximate basis for the signal subspace. Note also that the number of sources is typically much smaller than the number of snapshots K N . This is the idea behind the sparse reconstruction stage in SSA. Similar ideas have successfully been applied to sparse subspace clustering problems in [21].
The sparse expansion for each snapshot is obtained by solving where Xn is the data matrix X after removing the nth column/snapshot and αn = [αn,1, . . . , αn,N−1] T are the coefficients of the sparsest linear combination obtained for the nth snapshot. However, the optimization problem in (2) is NP-hard. One well-known solution to this problem is to replace the 0-norm with the 1-norm [22], which can be solved by the least absolute shrinkage and selection operator (LASSO) optimization algorithm [23]. Nevertheless, the 1-norm surrogate function can sometimes perform poorly and other proxies for the 0-norm are preferred, such as the logarithm of the absolute value [15]. Moreover, the equality constraint in (2) is not adequate for low signal-to-noise-ratio (SNR) scenarios. Hence, in this paper the SR will be obtained as the solution to which is the Lagrangian form of (2) where we have substituted the 0-norm by the log-surrogate and the equality by an inequality. We propose to solve (3) by using an MM-based approach [16]. The majorizer of the cost function in (3) is based on a first-order Taylor series of the logarithm, that is, n,i is the ith component of the solution at the tth iteration. Hence, the solution at the (t + 1)th iteration is computed from where a small value δ has been added to the denominator to avoid divisions by zero. Now, we can find a closed-form solution to (4) by taking the derivative and setting it to zero, which yields The sparse reconstruction algorithm is shown in Algorithm 1.

Random Generation of Subspaces
In this section we develop a method to randomly generate subspaces based on the SR obtained in Section 3.1, which follows the lines  (7) Compute kmax as the number of |αi,n| greater than µ maxi(|αi,n|) Computeγn in [13]. In noisy conditions, it is important for the SA method that the extracted subspaces contain a large common portion of the signal subspace and independent portions of the noise subspace. This translates into an averaging procedure that enhances signal coordinates while averaging out noise coordinates. Concretely, the subspace associated to the nth snapshot is constructed by randomly selecting columns from Xn, with probabilities proportional to the sparse coefficients αn as This process can be interpreted as sampling from a discrete distribution that we denote as D(Xn, γn), where γn = [γn,1, . . . , γn,N−1] T in (7) are the concentration parameters of the distribution. To shed some light on this distribution, let us explain the experiment that determines D. Let us take for example the snapshot n = N and let XN = [x [1], . . . , x[N − 1]] be the matrix with snapshots n = 1, . . . , N − 1. Solving the SR problem, we get the coefficients αN of the sparse expansion of x[N ] in terms of the rest of snapshots. The sparse coefficients, when properly normalized, give us the probabilities (γN,1, . . . , γN,N−1). The sampling experiment to generate a random subspace is as follows. Draw 1 includes x [1] with probability γN,1, and excludes it with probability (1 − γN,1); draw 2 includes x [2] with probability γN,2, and excludes it with probability (1 − γN,2); and so on. With the choice for D(Xn, γn), the probability of picking the ith snapshot from Xn is proportional to |αn,i|. Therefore, the snapshots associated to the higher values of |αn,i| should be chosen more often. For more details, on this interpretation we refer the reader to [13].
Finally, the subspaces are the column spaces of the matrices iteratively constructed as shown in Algorithm 2. This algorithm considers two thresholds: 1) kmax, which is the maximum dimension of the signal subspace (in practice an overestimate of K), and is selected as the number of |αi,n| greater than µ maxi(|αi,n|), with 0 < µ < 1; and 2)γn, which is a minimum value for the probabilities in γn, and is chosen asγn

Algorithm 3: Sparse Subspace Averaging
Input: X, dmax Output: Order estimateKSSA for n = 1, . . . , N do Find Xn by removing the nth column from X Computeαn using Algorithm 1 Generate T random projection matrices using Algorithm 2 Compute P using (8) ObtainKSSA using (9)

Subspace Averaging (SA) Criterion
The random generation procedure can be repeated T times to generate T subspaces for each value of αn, n = 1, . . . , N . Therefore, we get a total of N T orthogonal projection matrices to be used in the SA procedure. The average projection matrix is Once we get P, its eigenvalues 1 ≥ k1 ≥ k2 ≥ · · · ≥ k dmax are used to determine number of sources aŝ Basically, the criterion in (9) detects the gap between the signal and noise eigenvalues. It is noted here that to avoid numerical issues and to save computational cost, we are considering only the largest dmax eigenvalues of P, where dmax K is an overestimation of K. A summary of the proposed method is shown in Algorithm 3.

SIMULATION RESULTS
In this section, the performance of the proposed method is compared with some representative methods for different noise models. As a figure of merit we use the probability of correct detection (PD), which is the probability thatKSSA = K. For all simulations we assume that K uncorrelated narrowband equal-power signals are impinging on a NULA with M antennas. The signal-to-noise-ratio is defined as SNR = 10 log tr(Rs) tr(Rn) , and 2000 Monte Carlo simulations are averaged for all the results. We found that λ = M 2N , T = 20, µ = 0.1, and dmax = M 5 provide in general good performance over a wide range of scenarios. Since δ is a small constant that is used to avoid numerical issues, we select a δ = 10 −15 .
In the first experiment, we assume that the noise is uncorrelated across antennas with different variances at each sensor. Therefore, the noise covariance matrix is Rn = diag σ 2 1 , σ 2 2 , . . . , σ 2 M , where σ 2 m is the noise variance at the mth sensor. The noise variances are modeled as uniformly distributed independent random variables: where σ 2 is a common noise variance and 0 ≤ ≤ 1 allows us to control the spatial non-whiteness of the noise. Note that noise would be white for = 0. As methods for comparison, we select LSMDL [8] and BIC [9], which are designed for white noise, the MDL criterion adapted to uncorrelated noises [14,[24][25][26] (labeled in the figures as MDL-unc), and the linear-regression method designed for colored noise in [10]. = −2 dB. In this example SSA provides robust and accurate detection results over the entire range of variation of . Since LSMDL and BIC are originally designed for white noise, their performances degrade for higher values of . The performance of the linear regression method in [10] also degrades for higher values of . Finally, MDL-unc provides good results for higher values of , but its performance degrades when the noise is nearly spatially white.
In the second experiment, we consider an exponentially correlated noise model whose noise covariance matrix is defined as [Rn]i,j = ρ |i−j| , where 0 ≤ ρ ≤ 1 is the correlation coefficient. Fig. 2 shows PD vs. ρ for an array with M = 20 antennas, N = 80 snapshots, K = 3 sources, and SNR= 10 dB. For this noise model, we are comparing the results of MDL-unc, the method in [10] and the method in [27], which applies a rank-reduction preprocessing step followed by an order estimation criterion based on canonical correlation analysis (CCA). As Fig. 2 suggests, the proposed SSA method again provides a robust solution for the whole range of ρ. All other methods fail when ρ increases.
In the last experiment, we consider Gaussian noise with arbitrary unknown covariance matrix, that is, the noise covariance matrix is drawn from a complex Wishart distribution with M degrees of freedom and scale matrix proportional to I. The scenario has M = 100 antennas, K = 4 sources, and the number of snapshots N varies upto 100, so that we are in the small sample regime. The SNR for this experiment is chosen as SNR = 15 dB, because when the noise covariance matrix is arbitrary, the SNR must be sufficiently high for any method to work. Since CCA and method in [10] do not usually provide good results with few snapshots, we are considering N ≥ 15 for these methods, however, lower values of N are considered for SSA. As it is shown in Fig. 3, CCA does not perform well for this noise model in the small sample regime, while the method in [10] provides good results for sufficiently large N . Finally, SSA performs satisfactorily even when N/M is around 0.1.

CONCLUSIONS
In this work we have presented an order estimation technique for arbitrary geometry arrays and noise with unknown spatial correlation. The method borrows from subspace averaging techniques for source enumeration, originally proposed for uniform linear arrays and white noise. The key idea to generate a collection of subspaces to be averaged is obtaining a sparse representation of each snapshot as a linear combination of the others. To perform this sparse recovery (SR) problem, we proposed a generalization of the log-surrogate of the 0-norm, which is solved using the majorization-minimization approach. Then, based on the sparse coefficients of the reconstruction, a sampling mechanism is presented to obtain projection matrices that share a large common portion of the signal subspace. Finally, the eigenvalues of an average projection matrix are then used to estimate the number of sources. It is illustrated by some simulation examples that the proposed sparse subspace averaging (SSA) method performs robustly for a wide range of noise models in the small sample regime.