The 2017 Regional Election in Catalonia: an attempt to understand the pro-independence vote

This paper tries to unveil the main factors behind the triumph of the pro-independence vote in the 2017 Regional Election in Catalonia. The empirical analysis, which is carried out at the county level using a spatial econometric model, reveals that geographical location matters. The estimation results also suggest that the pro-independence vote is mainly linked to the birthplace of individuals. More specifically, they show that pro-independence sentiment is weaker the higher the share of citizens born outside Catalonia. On the other hand, young and highly educated people are more prone to independence. Additionally, it is shown that people working in the public sector are more likely to vote for a political party in favor of Catalonia remaining in Spain, while the opposite happens for those voters working in construction. Finally, the results seem to dispel some myths associated with the role played by the county's size and level of income on the pro-independence vote.


Introduction
December 21st, 2017. This date is going to be remembered for a long time in Catalonia, Spain and, maybe, in the rest of Europe. The Regional Election that took place on this date will most likely become part of the history books, since the main issue at stake was, actually, the hypothetical independence of Catalonia from Spain. It will be borne in mind not only for the outcome but also for the many unprecedented events that took place before it. Following the 'yes' result of the unofficial and illegal referendum on October 1st, 2017 in favor of independence, the Spanish Senate (Upper Chamber), fearing a secessionist movement, approved the use, for the first time in its history, of article 155 of the 1978 Spanish Constitution, which allowed the Central Government to assume control of Catalonian institutions. Immediately afterwards, the Catalonian Parliament was dissolved and new elections were called for December. In this scenario, we believe that this call was an indirect but (contrary to the referendum) legal way of figuring out whether people living in Catalonia were for or against remaining in Spain. This is because, if anything characterizes the recent political context in Catalonia, it is the existence of two completely opposed blocs regarding potential independence: one made up of the three Catalonian secessionist parties (Junts per Catalunya, Esquerra Republicana de Catalunya-Catalunya Sí, and Candidatura d'Unitat Popular), and the other grouping what could be called constitutionalist forces. 1 The outcome of the election, with the three pro-independence parties winning an absolute majority in the Parliament (although not in the number of votes), has not removed the uncertainty around the independence issue in Catalonia but embarks Catalonia and Spain upon a complex and uncertain process of necessary mutual understanding.
While relatively recent, the literature on Catalonian independence is so vast that it is impossible to do justice to it in a few words. Many papers (including Oskam 2014; Boylan 2015; Olivieri 2015; Orriols and Rodon 2016; Lepič 2017; Carbonell 2018, to name only a selected few) have addressed the Catalonian issue from different perspectives: social, economic, geopolitical, cultural, historical, legal, and so on. This is not our goal here, however. 2 Our aim is, paying attention to the pro-independence vote of the December 2017 election, to try to shed some light on voting patterns (why people voted as they did), and draw some relevant conclusions for the future. To the best of our knowledge, there are no papers addressing this issue yet.
Although there are many reasons motivating this paper, there is one that, in our view, is very important: the existence of what, according to our estimation results, can be considered, at least partially, as wrong interpretations made by some popular press and media reports. Just to mention some of them, it has become common in the last few months to read claims such as that the larger a county is (measured, for example, in terms of population), the lower the percentage of votes in favor of independence, or even contradictory statements regarding income, so that one can read that the richer the census section, the lower the pro-independence share, or just the opposite. 3 To put it plainly, the results obtained in this paper cast doubt on these interpretations, if they do not show them to be simply wrong. The problem is that these popular views are based on bivariate correlations that may be spurious and should not be taken at face value. This suggests that, to properly assess the influence of potential explanatory factors on the pro-independence vote, one first needs to specify a multivariate model and then evaluate their impact jointly.
Clearly related to the fact that, to obtain solid, well-founded conclusions, we need to model the issue at stake, there is an important point usually mentioned in the popular press that, if overlooked, could seriously affect the results obtained: namely, the geographical disjointness of the so-called "two Catalonias". Territorially, the 2017 election results present, as we will show later on, an image of two Catalonias with quite different views on independence: one is for it, the other against it. From an operational point of view, this fact can give rise to what is termed "spatial dependence" in the model and, this being so, calls for a methodological approach to deal with it. Otherwise, the results obtained would be unreliable and almost certainly lead to spurious conclusions (see e.g. Anselin 1988). This is the main contribution of the paper, since, as far as we know, it is the first one, regardless of the case study, using a spatial econometric approach to address voting behavior. Specifically, we formally test for the presence of spatial dependence and, then, use spatial econometric techniques to model voters' behavior. 4

Bearing all these considerations in mind, here, as in many other cases, the first crucial question has to do with data. 5 Ideally, we should use individual data but, unfortunately, these data either do not exist or are quite inaccurate (as the discrepancy between the expected outcome according to pre-election polls and the real outcome of the election has shown in many cases). Therefore, we use aggregate data, for which there are three options: provincial (4 units), county (41) 6 or municipal (947) data. The first option is too aggregate for our purposes while, between the other two, there is a trade-off related to data availability. Because of this, we finally decided to run the analysis at the county level, as we think it is informative enough and, needless to say, opens up the possibility for richer data with respect to explanatory factors. In any case, we have to admit that one needs to be cautious when interpreting the results because of the well-known problems related to the use of aggregate data, such as the ecological fallacy and the Modifiable Areal Unit Problem (MAUP).
The remainder of this article is organized as follows. The next section presents, quite briefly, the pro-independence outcome of the election, paying special attention to the distribution of votes at the county level. Due to the geographical allocation of votes, the presence of spatial dependence is then tested and confirmed. Subsequently, we propose a model trying to explain the percentage of the pro-independence vote in each county. The necessity of incorporating spatial effects in the model specification is evaluated and, according to the results, a spatial lag (or spatial autoregressive) model is estimated. The last section summarizes the main findings and concludes the article.

The 2017 Regional Election in Catalonia: the triumph of the pro-independence vote
With regard to the most outstanding outcome of the 2017 Catalonian Regional Election, it is important to know that the three pro-independence parties won 47.5% of the votes which, due to the application of the D'Hondt method and the allocation of seats among the four Catalonian provinces, translated into 70 out of 135 seats (51.9%) in the Catalonian Parliament. In other words, the pro-independence parties won an absolute majority in the Regional Parliament. Nonetheless, the distribution across counties, displayed in Fig. 1, was quite heterogeneous. In nine counties, the percentage of votes for pro-independence parties was over 75%; namely, more than three out of four voters in them supported independence for Catalonia. On the opposite side, there were also nine counties where this share was lower than 50%; that is, secessionist parties fell short of a majority of votes there. As for the geographical distribution of votes, there seems to be a clear pattern. As a rule of thumb, people residing in the coastal areas (more populated than the interior) were not in favor of independence (this area has started to be called "Tabarnia"), whereas those living in most counties located in the interior of Catalonia were. Although the analysis is carried out at county level, it is interesting to know that the distribution of votes across municipalities, displayed in Appendix 1 (Fig. 3), conveys basically the same message. In any case, there were some exceptions that only the analysis at municipal level allows us to detect. By way of illustration, there are some municipalities, such as Carme, Orpí and Castellolí (in the county of Anoia), Mura (Bages), Òrrius (Maresme) and Subirats (Alt Penedès), reporting clear support for independence but surrounded by other municipalities clearly against it.
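To illustrate how a vote share below 50% can translate into a seat share above it, the following is a minimal sketch of the D'Hondt highest-averages rule for a single constituency. The party labels, vote counts and number of seats are hypothetical and are not the 2017 results; the sketch also ignores any legal vote threshold and the split of seats across provinces.

```python
def dhondt(votes, seats):
    """Allocate `seats` among parties with the D'Hondt highest-averages rule."""
    allocation = {party: 0 for party in votes}
    for _ in range(seats):
        # A party's current quotient is its votes divided by (seats won so far + 1).
        winner = max(votes, key=lambda p: votes[p] / (allocation[p] + 1))
        allocation[winner] += 1
    return allocation


# Hypothetical vote counts for one constituency (illustrative only).
toy_votes = {"A": 100_000, "B": 80_000, "C": 30_000, "D": 20_000}
toy_seats = dhondt(toy_votes, 8)
print(toy_seats)
```

In this toy case party A holds about 43.5% of the votes but obtains half of the eight seats, which is the kind of gap between vote share and seat share described above.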
Although maps are enlightening, a bit of caution is recommended when using them to show the existence of geographical patterns or, in more technical terms, the presence of spatial dependence. This is so because the conclusions to be drawn are highly sensitive to the number and width of the intervals used. Hence, in order to check whether the initial impression from Fig. 1 was correct, we computed the most widely used and best-known test for spatial dependence, namely Moran's I statistic, which is given by

$$I = \frac{N}{\sum_{i=1}^{N}\sum_{j=1}^{N} w_{ij}} \cdot \frac{\sum_{i=1}^{N}\sum_{j=1}^{N} w_{ij}\,(y_i - \bar{y})(y_j - \bar{y})}{\sum_{i=1}^{N} (y_i - \bar{y})^2}, \quad \text{for } i \neq j. \; 7$$

In this case, $y_i$ ($y_j$) is the share of votes for pro-independence parties in county $i$ ($j$), $\bar{y}$ refers to the mean, $w_{ij}$ is an element of the distance matrix $W$ between each pair of counties, $\sum_{i=1}^{N}\sum_{j=1}^{N} w_{ij}$ is a standardization factor that corresponds to the sum of all the weights, and $N$ is the total number of counties (41). The results, 8 using the inverse of the square of the Euclidean distance between the corresponding county centroids as the (row-standardized) distance matrix, 9 revealed a strong spatial dependence in the distribution of votes (Moran's I statistic equal to 0.275, with an associated p value of 0.000). As an add-on, the analysis at municipal level showed, as expected, an even higher degree of spatial dependence (Moran's I statistic equal to 0.474, with an associated p value of 0.000).

Having proved the existence of spatial dependence, it was also interesting to know where it was higher. To answer this question and then identify spatial clusters, we used Local Indicators of Spatial Association (LISA) (Anselin 1995). 10 When applied to our case, these indicators allowed us to detect differences in the spatial clustering among counties (or municipalities in Appendix 1). The results (Fig. 2) confirmed the existence of a cluster of counties presenting a relatively low share of the pro-independence vote that basically matches "Tabarnia", since it is made up of Maresme, Vallès Oriental, Vallès Occidental, Barcelonès, Baix Llobregat, Garraf, Baix Penedès, and Tarragonès. On the contrary, county clustering in favor of independence was not so clear, as only Ripollès, Garrotxa and Berguedà 11 form a small spatial cluster. That is to say, the pro-independence vote seems to be more dispersed, from a geographical point of view, than the anti-independence vote. It is also worth pointing out the case of Val d'Aran, as it was the only county showing negative spatial dependence. The reason becomes obvious from a glance at Fig. 1: Val d'Aran, the most anti-independence county, is surrounded by counties presenting high shares of votes backing Catalonia leaving Spain. Finally, Fig. 4 in the Appendix [...]

[...] model, and the rational choice theory), empirical studies tend to integrate all of them to some extent in looking for the key determinants of voting. This was the approach followed in this paper, paying particular attention to the determinants emphasized by the rational choice theory, also referred to as the model of economic voting, but also to many other factors affecting voters' behavior.
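The spatial dependence statistics used in the previous section can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it builds row-standardized inverse-squared-distance weights from centroid coordinates, computes the global Moran's I, and computes a local Moran statistic in its common textbook form (assumed here to be the LISA in question). The coordinates and vote shares are invented toy values, and the permutation-based p value is one common inference choice that the paper does not necessarily use.

```python
import numpy as np


def inverse_sq_distance_weights(coords):
    """Row-standardized weights with w_ij proportional to 1 / d_ij^2 and w_ii = 0."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    d2 = (diff ** 2).sum(axis=2)              # squared Euclidean distances
    off_diag = ~np.eye(len(coords), dtype=bool)
    w = np.zeros_like(d2)
    w[off_diag] = 1.0 / d2[off_diag]          # assumes no two centroids coincide
    return w / w.sum(axis=1, keepdims=True)


def morans_i(y, w):
    """Global Moran's I: (N / sum of weights) * z'Wz / z'z, with z = y - mean(y)."""
    z = np.asarray(y, dtype=float) - np.mean(y)
    return (len(z) / w.sum()) * (z @ w @ z) / (z @ z)


def local_morans_i(y, w):
    """Local Moran statistic I_i = (z_i / m2) * sum_j w_ij z_j, with m2 = z'z / N."""
    z = np.asarray(y, dtype=float) - np.mean(y)
    m2 = (z @ z) / len(z)
    return (z / m2) * (w @ z)


def permutation_p_value(y, w, n_perm=999, seed=0):
    """One-sided pseudo p value: share of random relabelings with I >= observed I."""
    rng = np.random.default_rng(seed)
    observed = morans_i(y, w)
    sims = np.array([morans_i(rng.permutation(y), w) for _ in range(n_perm)])
    return (np.sum(sims >= observed) + 1) / (n_perm + 1)


# Toy data: two groups of three "counties" with low and high vote shares.
toy_coords = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
toy_share = [30.0, 32.0, 35.0, 70.0, 72.0, 75.0]

W = inverse_sq_distance_weights(toy_coords)
global_i = morans_i(toy_share, W)
local_i = local_morans_i(toy_share, W)
p_value = permutation_p_value(toy_share, W)
print(global_i, local_i, p_value)
```

With row-standardized weights the sum of all weights equals N, so the global statistic is simply the average of the local ones; this is why the local values can be read as each county's contribution to the overall pattern.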
Firstly, according to the aim of this paper, the dependent variable of our model was defined as the percentage of votes for parties in favor of independence in each county. As for the explanatory variables included in the model, we adapted what the theoretical models previously mentioned and most empirical papers agree to consider as key voting determinants. In any case, it is important to note that Catalonian independence, as Dow et al. (2018) state regarding Scottish independence, is not only (not even primarily) an economic issue. That is to say, we fully agree with Akhter and Sheikh's statement that "The behavior of a voter is influenced by several factors such as religion, …, community, language, money, policy or ideology, purpose of the polls, … and the like" (Akhter and Sheikh 2014, p. 105). Therefore, along with some economic variables (namely per capita income, unemployment rate, poverty level, and industry-mix), we also included other variables such as birthplace, youth, education, population and population density. In addition, and taking into consideration the presence of spatial dependence (which, as we will see below, is confirmed in the model), we also included the spatial lag of the dependent variable.
As for the specification of the model, we followed a forward variable selection process. In other words, variables were added sequentially, starting with the one explaining the highest share of the variation in the dependent variable. This process went on until none of the remaining variables was significant. The final model to be estimated, which includes, apart from the spatial lag of the dependent variable, five groups of variables with their corresponding groups of coefficients, is summarized in Eq. (1), where i denotes the county. Data were collected from the Official Statistics Website of Catalonia (IDESCAT), and the definition of each variable included in the model, along with its descriptive statistics, is detailed in Table 1. Because of the procedure we employed to get the model, we show the estimation results in stages by including, step by step, the five sets of variables considered: (1) only the variable accounting for the percentage of people born outside Catalonia (BOC_i); (2) the group of variables referring to the age of people residing in each county (YP_i, OP_i); (3) variables referring to the level of education (PE_i, PT_i, HE_i); (4) sectoral variables (AGR_i, CON_i, COM_i, TR_i, HO_i, FA_i, PA_i), leaving out the industry sector, so that every coefficient in this group should be interpreted relative to this sector; and (5) a last miscellaneous group of variables (GDPpc_i, POP_i, POPD_i, POV_i, UR_i) sharing a common feature: none of them turned out to be statistically significant.
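Since the displayed form of Eq. (1) is not reproduced here, the following is only a sketch of the general shape that a spatial lag specification with these five groups of regressors would take. The coefficient symbols ($\alpha$, $\rho$, $\beta$, $\gamma$, $\delta$, $\theta$, $\phi$), the label $PIV_i$ for the dependent variable, and the error term $\varepsilon_i$ are notational assumptions made for illustration and need not match the original notation.

```latex
\begin{aligned}
PIV_i ={}& \alpha + \rho \sum_{j=1}^{N} w_{ij}\, PIV_j + \beta\, BOC_i
          + \gamma_1 YP_i + \gamma_2 OP_i
          + \delta_1 PE_i + \delta_2 PT_i + \delta_3 HE_i \\
        & + \theta_1 AGR_i + \theta_2 CON_i + \theta_3 COM_i + \theta_4 TR_i
          + \theta_5 HO_i + \theta_6 FA_i + \theta_7 PA_i \\
        & + \phi_1 GDPpc_i + \phi_2 POP_i + \phi_3 POPD_i + \phi_4 POV_i + \phi_5 UR_i
          + \varepsilon_i
\end{aligned}
```

Here $\sum_{j} w_{ij}\, PIV_j$ is the spatial lag of the dependent variable, built with the same row-standardized weights $w_{ij}$ used for Moran's I, so that $\rho$ measures how closely a county's pro-independence share follows that of its neighbors.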
Before presenting the results, however, it is convenient to pause and demonstrate the necessity of including a spatial lag of the dependent variable in the five versions of the model. As said before, in the previous section we confirmed the presence of spatial correlation in the percentage of votes for parties in favor of independence. In any case, on the one hand we have to formally test whether there is spatial dependence in the model as well and, on the other, we have to figure out the type of spatial dependence, if any, that exists, as the final specification of the model depends crucially on it. To tackle these issues, a series of Lagrange multiplier (LM) tests were computed on the residuals of the ordinary least squares (OLS) estimation of the corresponding alternative 'non-spatial' versions of Eq. (1) (that is, without including the spatial lag of the dependent variable): the robust LM-EL test, whose null hypothesis is the absence of residual spatial autocorrelation, and the robust LM-LE test, whose null hypothesis is the absence of substantive dependence. The results, displayed in Table 2, revealed that the null hypothesis was rejected at the standard levels in the five alternative specifications of the model only for the LM-LE test. Thus, our results supported the existence of substantive spatial dependence, so that the specification of Eq. (1) is appropriate.

The estimation results, obtained by maximum likelihood because OLS estimates are biased and inconsistent when the model includes a spatial lag of the dependent variable (Anselin 1988), are offered in Table 3. First, we want to stress that the various indexes of goodness-of-fit (last rows of Table 3) showed a good fit of the model to the data. These values support the proposed model and suggest that the results obtained are reasonably reliable.
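The maximum-likelihood estimation of a spatial lag model can be sketched with the usual concentrated log-likelihood, maximized here by a simple grid search over the spatial parameter. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the grid, and the simulated data (with a deliberately small noise level so that the toy check is stable) are all hypothetical.

```python
import numpy as np


def row_standardized_weights(coords):
    """Inverse-squared-distance weights, zero diagonal, rows summing to one."""
    diff = coords[:, None, :] - coords[None, :, :]
    d2 = (diff ** 2).sum(axis=2)
    off_diag = ~np.eye(len(coords), dtype=bool)
    w = np.zeros_like(d2)
    w[off_diag] = 1.0 / d2[off_diag]
    return w / w.sum(axis=1, keepdims=True)


def fit_spatial_lag(y, X, W, grid=np.linspace(-0.95, 0.95, 381)):
    """Grid-search ML for y = rho * W y + X beta + eps (concentrated likelihood)."""
    n = len(y)
    Wy = W @ y
    # Regress y and W y on X; the residuals are all the likelihood needs.
    b0, *_ = np.linalg.lstsq(X, y, rcond=None)
    bL, *_ = np.linalg.lstsq(X, Wy, rcond=None)
    e0, eL = y - X @ b0, Wy - X @ bL
    eye = np.eye(n)

    def loglik(rho):
        resid = e0 - rho * eL
        _, logdet = np.linalg.slogdet(eye - rho * W)   # Jacobian term ln|I - rho W|
        return logdet - 0.5 * n * np.log(resid @ resid / n)

    rho_hat = max(grid, key=loglik)
    beta_hat = b0 - rho_hat * bL    # equals OLS of (y - rho_hat * W y) on X
    return float(rho_hat), beta_hat


# Simulated example (hypothetical data, not the Catalan counties).
rng = np.random.default_rng(42)
n = 100
coords = rng.uniform(0, 10, size=(n, 2))
W = row_standardized_weights(coords)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true, rho_true = np.array([1.0, 2.0]), 0.5
eps = 0.05 * rng.normal(size=n)
y = np.linalg.solve(np.eye(n) - rho_true * W, X @ beta_true + eps)

rho_hat, beta_hat = fit_spatial_lag(y, X, W)
print(rho_hat, beta_hat)
```

The log-determinant term is what distinguishes this from OLS on a model that simply adds W y as a regressor; dropping it is what makes the OLS estimate of the spatial parameter unreliable. For real work a dedicated spatial econometrics library would normally be preferred over a grid search.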
As for the independent variables, we found that the share of the pro-independence vote is closely (and negatively) associated with the percentage of the population that was born outside the region [column (1)]. 12 Despite the fact that, for diverse reasons, a quite remarkable portion of people who were born in Catalonia feel (culturally, linguistically, politically) distinct from the rest of Spain, our results convey the idea that, as a rule, the pro-independence sentiment seems to be weaker among Catalans living together with people coming from the rest of Spain and abroad. In addition, it can be stated, in line with the results, that the sense of separateness is not such a strongly rooted feeling among people who are living, but were not born, in Catalonia.
This negative relationship between the share of inhabitants born outside Catalonia and the pro-independence vote became even stronger when the age of voters was accounted for [column (2)]. In particular, the younger the population, the higher the share of voters supporting pro-independence parties. The variable capturing the elderly population was not, however, significant. According to Salih (2014), support for independence is especially high among youth for two main reasons: the widespread use of the Catalan language at school and the dire consequences of the economic crisis. We cannot forget either that young people in Catalonia have experienced the boom of secessionist sentiment over the new century. Although comparable only to a certain extent, our result is in stark contrast to the one found by Crescenzi et al. (2018) regarding the role of the youth in Brexit. As Van Rompuy (2017) says, in the Brexit referendum, unlike in the Catalonian one, older people were "wall" people, while younger people were "web" people. The opposite results in Catalonia are likely due to the fact that the Brexit and Catalonian cases are to a large extent antithetical, the former clearly against the EU and the latter assuming implicitly, but wrongly, that independence would not mean leaving the EU.

Table 3 Pro-independence vote: some explanatory factors. Dependent variable: % of votes for parties in favor of independence. p values in parentheses: ***significant at 1%; **significant at 5%; *significant at 10%.
The addition of different levels of human capital to the equation [column (3)] offered further insights. On the one hand, it seems that individuals with professional training are not in favor of independence. On the other, higher education plays an important role as a factor explaining the pro-independence vote. 13 As for the sectors [column (4)], the conclusion was that if we took two individuals, identical in terms of the other explanatory variables, the one working in construction would still be more likely than the one working in public services to vote in favor of independence. Coincidence or not, these are likely the two sectors most and least hit by the economic downturn, respectively. 14

Subsequently, we have to mention the remaining variables that were considered in the model specification but all turned out to be statistically non-significant. 15 These are variables not belonging to any of the previous groups and as diverse as per capita income, population, population density, poverty level, and the unemployment rate. These variables were included since, initially, they seemed to be, in many cases and according to, for instance, what is written by the popular press, potential explanatory factors for voting behavior. Yet, due to their limited explanatory power in the model, we show them in the final stage [column (5)]. Hence, there is an important (and straightforward) conclusion that can be derived from this analysis: as we mentioned in the Introduction, taking bivariate correlations at face value is a risky option. As an example, the correlations between the percentage of the pro-independence vote and both population (− 0.62) and population density (− 0.66) seemed to convey the message that these two variables significantly affect the pro-independence outcome. However, our analysis has revealed that this is not true, as they are just typical cases of spurious correlation. In other words, a multivariate analysis (such as the one carried out in the present paper) is necessary to get a better understanding of the election outcome.

13 A likely explanation of this result, in line with Azmanova (2011), is that independence increases uncertainty. As educated people tend to believe they can manage this uncertainty better than less educated people can, the former are more prone to independence than the latter.

14 Although the pro-independence movement has to do with historical reasons, there are some papers (see, for example, Dowling 2014) linking its revival with the outbreak of the crisis in 2008. Indeed, according to data from the Centre d'Estudis d'Opinió regular polls on public sentiments of Catalonian citizens, the share of the population defending Catalonia as an independent state was around 25% in 2010 and is now higher than 40%.

15 Some of the variables included in the first four specifications were also non-significant, but at least others belonging to the same group were significant.
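The warning about bivariate correlations can be made concrete with simulated data. In the following minimal sketch all numbers are invented and the variable names are hypothetical: the vote share depends only on the share born outside the region, yet population density shows a strong negative bivariate correlation with the vote simply because it is itself correlated with birthplace. Once both regressors enter a multivariate regression, the density coefficient is close to zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Simulated, standardized variables (illustrative only, not the paper's data).
born_outside = rng.normal(size=n)
pop_density = 0.8 * born_outside + 0.6 * rng.normal(size=n)   # correlated with birthplace
vote_share = -1.0 * born_outside + 0.3 * rng.normal(size=n)   # density plays no direct role

# Bivariate view: density looks like a strong "determinant" of the vote.
bivariate_corr = np.corrcoef(vote_share, pop_density)[0, 1]

# Multivariate view: OLS with an intercept, birthplace and density together.
X = np.column_stack([np.ones(n), born_outside, pop_density])
coef, *_ = np.linalg.lstsq(X, vote_share, rcond=None)

print(f"bivariate correlation with density: {bivariate_corr:.2f}")
print(f"multivariate coefficients (const, born_outside, pop_density): {coef}")
```

The design choice here is deliberate: density is given no causal role at all, so any association it shows with the vote must come through its correlation with birthplace.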
Finally, regarding the spatial lag of the dependent variable, it is important to note that its coefficient turns out to be positive and statistically significant in all cases. This reinforces the previous conclusion that there is a geographical pattern in the pro-independence outcome, which holds even after accounting for the role played by the different explanatory factors. In other words, and as a rule, the share of pro-independence voters in each county is closely related to that of its neighbors.

Conclusions
At a time of political turmoil in Spain, due to the serious consequences that the Catalonian pro-independence movement could bring about in the country but also in Europe, 16 this paper has tried to uncover key factors helping us understand the share of the pro-independence vote in the December 21st, 2017 Regional Election in Catalonia. Unlike some popular press and media reports that, in the light of our results, have too quickly drawn some, at least partially, misleading conclusions, here a multivariate model was specified and estimated. Interestingly enough, the findings point to the place of birth as the most influential factor in the pro-independence outcome. Additionally, they also suggest that young people are more pro-independence, as are highly qualified people and, somewhat surprisingly, those working in the construction sector. On the contrary, our findings reveal that people in the public service sector do not tend to support the pro-independence movement. Another important conclusion is that, contrary to some of the conclusions reached by the popular press, this paper dismisses the commonly mentioned role played by county size and level of income in shaping voting behavior. As a whole, our results do not strongly support the rational choice theory (the model of economic voting), since its explanatory power seems to be limited. On the contrary, they reveal the role played by non-economic factors, mainly demographic and cultural ones, in explaining the pro-independence vote.
What are the implications of these results and what can we reasonably expect for the future? We must admit that, apart from the fact that spatial dependence is likely to continue playing an important role in shaping voters' preferences, it is impossible to give a conclusive answer about the prospects of the pro-independence vote; in other words, there is not enough basis to render a confident judgment about whether the pro-independence movement is going to grow in strength or not. There are many forces at play and everything will depend on the way politicians (at both the regional (Catalonia) and national levels) address the issue in the near future. 17 Nevertheless, and being aware that the recently started conversations between the new Catalonian and Spanish governments are a crucial issue, as is the somewhat ambiguous stance of the CatComú-Podem party, 18 our findings allow us to make some very preliminary and tentative assessments.

16 We are referring to the challenge to the internal cohesion of territories in established countries that nationalism poses. The topic this paper addresses is, no doubt, timely, not only from the standpoint of Catalonia but also because of the potential consequences for other European regions that could take the Catalonian case as an example for the near future (Nagel and Rixen 2015).

17 Indeed, the process (el Procés) has been generated top-down and not the opposite.
First, despite its limited explanatory power, we could refer to the rational choice theory and, more specifically, to the fundamentals of economic voting and its basic idea, referred to as the reward-punishment hypothesis. From this perspective, it is reasonable to believe that the longer the duration of the economic downturn, the higher the pro-independence feeling. Moreover, if, as some respected voices argue, a new recession period is brewing, the pro-independence vote could be strengthened even further. In fact, there are some researchers (e.g. Dowling 2014) linking the rise in the pro-independence vote with the outbreak of the 2008 economic crisis.
Second, and given the importance of birthplace, it seems clear that the role played by migratory movements when it comes to foreseeing the future of the pro-independence faction might become crucial. These movements are influenced by numerous factors, one of which is the path taken by the independence process in itself. Another one is clearly related to making the knowledge of Catalan compulsory (or near compulsory) at both school and work (Clots-Figueras and Masella 2013); if this happens, and there is a big "if" here, it will undoubtedly work as a barrier for potential migrants going to Catalonia.
Third, with regard to the influence of the age of people on the pro-independence vote, we think that there are several possibilities to be discussed, without being certain which one will have a higher impact. If the recent secessionist sentiment persists over time and across generations and is effectively taking root in young people, it is likely that not only the new voters but also the not so new will be pro-independence. However, the opposite could also be the case if, as sometimes happens, young people change their minds as they get older and the share of young people in the total population decreases.
Finally, and with regard to the educational level, it is likely that most people will become better educated in the future. According to our results, this suggests that the pro-independence vote is likely to increase in the coming years.
Putting all these possibilities together, we have to acknowledge that it is really difficult, if not impossible, to foresee what might happen with the pro-independence vote in the future. What seems to be evident is that if the independence movement in Catalonia turns out to be strengthened in the coming years, it will strike a severe blow to the Spanish government and, even if only because of the possible contagion effect, will be quite risky for Europe. 19 Only time will tell.