The impact of undesirable externalities on residential property values: spatial regressive models and an empirical study

Pollutant emissions, noise and other externalities generated by heavy infrastructure may have a negative impact on real estate values. To test this effect, this paper presents the results of an analysis based on Hedonic Linear Regression, Spatial Hedonic Linear Regression and Hedonic Geographically Weighted Regression models, carried out for the case study of the province of Taranto (Italy). The biggest steel factory in Europe is located here, and some population movements have been observed in relation to the high levels of pollution in the areas close to the factory. The variables used to measure the impact of externalities are of two types: objective indicators, such as the distance from the industrial area and the levels of NO2 and PM10, and subjective indicators, such as the levels of pollution and noise perceived by the population. Results show that the distance from the factory was a positive factor in real estate prices, although not always clearly significant, and that among the pollution indicators only high levels of NO2 had a negative effect. Accessibility to employment did not prove to be a significant variable in real estate prices, which indicates that factors related to environmental quality have a greater weight in residential location. Moreover, models including subjective indicators do not show better estimates than models considering only objective indicators. Finally, spatial regression models were useful to analyse the spatial dependence and spatial heterogeneity observed in the data.


INTRODUCTION
Many urban areas are exposed to high levels of negative externalities such as air pollution, poor water quality or the presence of toxic components. This is an important problem being addressed by the European Union due to its impact on human health and the environment (EU, 2013). Quantifying the impact of pollution on real estate values is therefore of great interest to policy makers, not only as a way to quantify the risk to public health, but also because real estate prices both derive from and influence residential location, which in turn can generate changes in transport demand and trip patterns.
Hedonic pricing modelling was formalized by Rosen (1974), although earlier empirical studies such as the work by Court (1939) already existed. This well-known technique has been useful in evaluating the weight of different factors on the prices of heterogeneous goods such as property (Malpezzi, 2008). By regressing the price on all the attributes of a heterogeneous good, the contribution of each factor can be estimated. In the literature, hedonic pricing modelling has also been proposed to quantify user willingness-to-pay for reduced levels of pollution, noise and other undesirable externalities (Boyle and Kiel, 2001; Jim and Chen, 2006).
In this paper, Hedonic Linear Regression models (HLR) have been estimated to verify the hypothesis that residential dwellings exposed to bad environmental conditions will have lower market values. In addition, Spatial Hedonic Linear Regression models (SHLR) and Hedonic Geographically Weighted Regression models (HGWR) have also been estimated to test the presence of spatial relationships in the data. Anselin (1988) distinguishes between two types of spatial relationships: spatial dependence and spatial heterogeneity. The former is defined as the existence of a functional relationship between what occurs at a point in space and what occurs at neighbouring points, whereas the latter is defined as the lack of spatial structural stability in the parameters of the model. Both effects could be present in the context of real estate data due to factors such as the existence of different housing markets, the propagation effects of market prices in nearby areas or the omission from the hedonic function of relevant variables with spatial characteristics. Including the spatial dependencies among observations in the models, and exploring the existence of spatial heterogeneity in the estimated parameters (i.e. non-stationarity), will allow verification of the presence of spatial effects in the data.
The case study considered in this research is the province of Taranto (Southern Italy), which is one of the most polluted areas in Western Europe (Lucifora et al., 2015) due to the emissions from the ILVA steel factory, the industrial seaport and an oil refinery located nearby.
The effect of such undesirable externalities on housing prices has been measured using two classes of indicators:
- objective indicators, i.e. measures of physical variables such as the concentration of pollutants in the air, obtained from environmental monitoring stations;
- subjective indicators, i.e. residents' perception of air quality, estimated by surveys.
In the surroundings of this industrial area, population movements have been detected towards areas further away with a higher environmental quality. These movements could have an impact on real estate prices while reducing the importance of accessibility to employment as a factor of residential location. The relevance and significance of these variables will be evaluated in order to check their influence on real estate prices.
The paper is organized as follows. Following the introduction, section 2 addresses the state of the art of hedonic pricing models, focusing on those proposed to estimate the impacts of undesirable externalities on residential location. The HLR, SHLR and HGWR model specifications are described in section 3. The case study and the data collected, as well as the results of the estimation of the above models, are presented in section 4. Finally, the conclusions and future research issues are discussed in section 5. The results achieved could be included in a land use and transport interaction (LUTI) model used by governmental and other institutional bodies to assess public policies aimed at effectively managing the effect of heavy infrastructure on residential location and trip generation in the study area.

STATE OF THE ART
The impacts of urban environmental elements on property values have been evaluated by many authors. Boyle and Kiel (2001) classify this research into three main categories of studies based on the type of environmental externality being considered: air quality, water quality and externalities of heavy infrastructures. Here, we focus on the last of these, given the characteristics of the study area considered.
Considering the negative externalities generated by heavy infrastructures, Dale et al. (1999) used hedonic regression to study the effect of closing a lead smelter on house prices in Dallas (USA). The authors found that, consistent with the previous literature and in accordance with expectations, property values around the smelter were lower before the closure. After the closure, prices rose in all neighbourhood types, although more slowly in the areas nearer to the lead smelter. Flower and Ragas (1994) also studied the effects of negative externalities, in this case two refineries, on real estate values during the period 1979-1991. They tested two types of indicators to capture the effect of the refineries: dummy variables and the minimum distance from every dwelling to the closest refinery. The negative proximity effect was not significant throughout most of the period under study, except during 1982-1983, when a tank explosion resulted in bad publicity and had negative effects on property prices in the areas closest to the refinery.
Other authors have studied the impact of Superfund sites (i.e. identified uncontrolled or abandoned places where hazardous waste is located) on property values. Kiel and Zabel (2001) specified hedonic models to estimate the individual willingness to pay to clean up a Superfund site. This technique was applied to two Superfund sites in Massachusetts (USA) and led to a cost-benefit analysis of the Superfund clean-up.
The authors found that the benefits of cleaning up the sites were greater than the cost. In a similar way, Kiel and Williams (2007) examined several Superfund sites in the USA and found that whereas some sites had the expected negative impact on housing prices, others had either no impact or even a positive impact. The authors used a hedonic model and meta-analysis to categorize these studies. The discussion showed that larger sites with fewer blue-collar workers were more likely to have a decline in housing prices.
Another line of research has applied hedonic modelling to evaluate environmental variables while also considering the presence of spatial correlation in the data. As mentioned, these relationships could include spatial heterogeneity and spatial dependence. Both of these effects can reduce the efficiency of the estimation if this is done using ordinary least squares (OLS), and the parameters can also be biased when spatial autocorrelation is caused by one or more omitted variables. Early works using spatial autocorrelation techniques were carried out by Dubin (1992), Can (1992) and Basu and Thibodeau (1998), applying different techniques such as kriged generalized least squares. Conway et al. (2010) analysed data from the housing market near Los Angeles (USA) using standard and spatially autocorrelated hedonic models to examine the effects of urban greenspace on residential values. The results showed that neighbourhood greenspace had a positive impact on housing prices even after controlling for spatial autocorrelation.
In the field of spatial heterogeneity, Long et al. (2007) compared a selection of spatial techniques to account for it in the city of Toronto (Canada). The authors used three methods: moving windows regression, geographically weighted regression and moving windows kriging. The results indicated that traditional hedonic models, even in the presence of neighbourhood and accessibility variables, did not adequately address spatial issues. For future research the authors suggested comparing the methods used with spatial autoregressive models (SAR) to test which of them is better in terms of goodness of fit and in addressing spatial relationships. Le Gallo and Chasco (2015) applied hedonic housing price models and quantile conditionally parametric models to estimate the willingness to pay for less pollution and noise in the city of Madrid (Spain). The authors recommended using pollution and noise variables based on the perception of the residents instead of variables gathered by monitoring stations, given that housing prices were better explained by subjective evaluations. In addition, the results showed that the hedonic prices differed substantially between housing markets.

MODELS
Most of the research on the marginal willingness to pay for the reduction of negative environmental externalities is based on hedonic regression. This method is grounded on the estimation of a linear model with the following specification:

$$y = X\beta + \varepsilon \qquad (1)$$

where y is the price or asking price of a dwelling, usually specified in log form, X is a matrix with information about the independent variables such as the structural characteristics of the dwellings, variables related to transport and environmental quality indicators, β is a vector of parameters to be estimated and ε is a vector of independent and identically distributed (IID) errors. In this study, the variables of interest contained in matrix X are the objective or subjective measures of environmental quality, whereas the others are control variables.
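As a rough illustration of specification (1), the sketch below estimates β by ordinary least squares through the normal equations. The three columns of X (intercept, floor area, distance to the plant), the coefficient values and the noise-free prices are all hypothetical and chosen only so that the result can be checked by eye; the function names are not taken from the paper.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_hedonic_ols(X, log_price):
    """Return the beta that minimises ||log_price - X beta||^2."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(X, log_price)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical dwellings: [intercept, floor area in m2, distance to plant in km].
X = [[1, 60, 2.0], [1, 80, 5.0], [1, 100, 3.0], [1, 120, 10.0], [1, 90, 8.0]]
true_beta = [10.0, 0.01, 0.004]  # illustrative values only
log_price = [sum(b * x for b, x in zip(true_beta, row)) for row in X]  # no noise
beta_hat = fit_hedonic_ols(X, log_price)
```

With real data the prices contain noise, so beta_hat would only approximate the underlying parameters, and standard errors would be needed before reading any significance into them.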
Spatial effects are often present in the context of real estate values due to different factors such as the existence of different housing markets (spatial heterogeneity), the diffusion effects of market prices for housing in nearby areas (spatial dependence) or the omission from the hedonic function of relevant variables with a spatial character. To take these effects into account it is necessary to use spatial econometric models. One of the most comprehensive introductions to spatial econometric models is provided by LeSage and Pace (2009). The most common spatial model is the SAR, which assumes the existence of a diffusion process in the dependent variable. The model is specified as:

$$y = \rho W y + X\beta + \varepsilon \qquad (2)$$

where ρ is the parameter of spatial autocorrelation and W is an N x N weights matrix, N being the number of observations. The other variables have the same meaning as in (1).
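To make the role of W in (2) concrete, the following sketch builds a row-standardised weights matrix and the spatial lag Wy. It assumes a k-nearest-neighbour rule, which is only one possible neighbourhood definition and is not stated to be the one used in this study; the coordinates and log prices are made up.

```python
import math


def knn_weights(coords, k):
    """Row-standardised N x N matrix: each point's k nearest neighbours get 1/k."""
    n = len(coords)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(coords[i], coords[j]))
        for j in others[:k]:
            W[i][j] = 1.0 / k
    return W


def spatial_lag(W, y):
    """Return Wy, i.e. the weighted average of y over each point's neighbours."""
    return [sum(w * v for w, v in zip(row, y)) for row in W]


# Hypothetical dwelling coordinates (km) and log asking prices.
coords = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5)]
log_price = [11.0, 11.2, 11.1, 12.0, 12.3]
W = knn_weights(coords, k=2)
Wy = spatial_lag(W, log_price)
```

Because each row of W sums to one, every element of Wy is simply the mean log price of that dwelling's neighbours, which is the term multiplied by ρ in the SAR model.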
If the only requirement is to specify the presence of spatial dependence in the error term, then a spatial autoregressive model in the error term (SEM) can be used, with a specification as follows:

$$y = X\beta + \mu, \qquad \mu = \lambda W \mu + \varepsilon \qquad (3)$$

where λ is a parameter of autocorrelation of the errors µ, and ε is a vector of IID errors. In this way, the real estate prices are not only a function of the independent variables but also of the µ errors of the neighbouring locations.
Finally, a model with both effects, autocorrelation in the dependent variable and in the error term, can be used. This type of model is known as the SAC model or the Kelejian-Prucha model and takes the form (Elhorst, 2010):

$$y = \rho W y + X\beta + \mu, \qquad \mu = \lambda W \mu + \varepsilon$$

which is a combination of expressions (2) and (3). These models have to be estimated using maximum likelihood (ML), given that the spatial relationships between observations violate the independence assumption of OLS.
The matrix W can be defined in different ways depending on whether zonal or point data are available. The four most common types of neighbourhood are: queen, rook, a predetermined number of closest neighbours and the specification of a maximum neighbourhood distance. Queen contiguity considers as neighbours all the adjacent locations sharing a border or a vertex with the given location, while rook contiguity considers as neighbours only those locations that share a border with the reference location (Anselin, 1988). Because the matrix W has to be specified by the analyst, a lot of research has been dedicated to its correct specification.
The estimated parameters can be directly interpreted, as in ordinary regression, in SEM type models, but the same does not hold for the SAR and SAC models, which consider lags in the dependent variable. In these models feedback effects exist because a change in the dependent variable of a local observation simultaneously causes changes in the neighbouring observations, which in turn have consequences on the first local observation. Therefore, in the SAR and SAC models, the estimated parameters should be seen as the representation of a state of equilibrium in the modelling process which includes the effects of spatial diffusion (Ward and Gleditsch, 2008). In this situation, the effects of each variable take the form of a matrix. LeSage and Pace (2010) recommended using a series of scalar indicators to correctly interpret the effects of every variable in the SAR and SAC models:
a. Average direct effect: calculated as the mean of the elements of the main diagonal of the parameter matrix. It can be interpreted as the effects caused by the group of observations of an independent variable on the dependent variable.
b. Average indirect effect: calculated as the mean of the elements outside the main diagonal of the parameter matrix. It can be interpreted as the diffusion effect between observations caused by changes in an independent variable.
c. Average total effect: calculated as the mean of the elements of the parameter matrix. It can be interpreted as the total effect, direct and indirect, received by the dependent variable.
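The three summary effects can be sketched for a single variable of a SAR model as follows. The effects matrix is taken to be (I - ρW)^{-1} β, approximated here by a truncated power series, and the averages are computed over row sums (so the indirect effect is the average row sum of the off-diagonal elements), which is the scaling usually attributed to LeSage and Pace; treat that scaling, the 4 x 4 matrix W and the values of ρ and β as assumptions made for illustration.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def sar_multiplier(W, rho, terms=200):
    """Approximate (I - rho W)^-1 by the series I + rho W + rho^2 W^2 + ..."""
    n = len(W)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in identity]
    power = [row[:] for row in identity]
    for _ in range(terms):
        power = [[rho * v for v in row] for row in matmul(power, W)]
        total = [[t + p for t, p in zip(tr, pr)] for tr, pr in zip(total, power)]
    return total


def sar_impacts(W, rho, beta):
    """Return (direct, indirect, total) average effects of one regressor."""
    n = len(W)
    S = [[beta * v for v in row] for row in sar_multiplier(W, rho)]
    direct = sum(S[i][i] for i in range(n)) / n
    total = sum(sum(row) for row in S) / n
    return direct, total - direct, total


# Hypothetical row-standardised weights for four dwellings on a ring.
W = [[0.0, 0.5, 0.5, 0.0],
     [0.5, 0.0, 0.0, 0.5],
     [0.5, 0.0, 0.0, 0.5],
     [0.0, 0.5, 0.5, 0.0]]
direct, indirect, total = sar_impacts(W, rho=0.3, beta=0.02)
```

With a row-standardised W the total effect reduces to β / (1 - ρ), which gives a quick sanity check, and the direct effect exceeds β because of the feedback loop through the neighbours.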
Finally, Geographically Weighted Regression (GWR) allows dealing with the presence of spatial heterogeneity in the data. This model takes the following general form:

$$y_i = \beta_0(u_i, v_i) + \sum_j \beta_j(u_i, v_i)\, x_{ij} + \varepsilon_i$$

where (u_i, v_i) indicate that the parameters are for a specific spatial location i. This type of model is estimated in a similar way to linear regression, using weighted least squares with the peculiarity that the weightings are established as a function of the distance between the local regression point and the neighbouring data points. The weightings can be established as a function of either fixed or adaptive kernels. Among the former can be found the more commonly used Gaussian type:

$$w_{ik} = \exp\left[-\frac{1}{2}\left(\frac{d_{ik}}{b}\right)^2\right] \qquad (8)$$

Many practical applications have also used the bi-square function from among the adaptive kernels:

$$w_{ik} = \begin{cases} \left[1 - \left(d_{ik}/b\right)^2\right]^2 & \text{if } d_{ik} < b \\ 0 & \text{otherwise} \end{cases} \qquad (9)$$

where w_{ik} is the weight given to observation k in the local regression at point i, d_{ik} is the distance between them and b is the bandwidth, so the value of the weightings may drop to zero at the point where d_{ik} = b. The value of b may be established through theoretical considerations or by different automatic methods such as minimizing the cross validation score (Bowman, 1984) or the Akaike Information Criterion (AIC) (Brunsdon et al., 1998).
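The two kernel functions can be written in a few lines. This is a minimal sketch of the standard Gaussian and bi-square forms, with d standing for the distance between the regression point and a data point and b for the bandwidth; the function names are illustrative.

```python
import math


def gaussian_weight(d, b):
    """Gaussian kernel: decays smoothly with distance and never reaches zero."""
    return math.exp(-0.5 * (d / b) ** 2)


def bisquare_weight(d, b):
    """Bi-square kernel: decays to exactly zero at the bandwidth b."""
    return (1 - (d / b) ** 2) ** 2 if d < b else 0.0


# Weights for data points 0, 0.5 and 1 bandwidth away from the regression point.
gaussian_examples = [gaussian_weight(d, 1.0) for d in (0.0, 0.5, 1.0)]
bisquare_examples = [bisquare_weight(d, 1.0) for d in (0.0, 0.5, 1.0)]
```

The practical difference is that the bi-square kernel excludes every observation beyond the bandwidth, whereas the Gaussian kernel keeps all observations with progressively smaller weights.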

Study area and data
The study area being considered is the province of Taranto, which consists of 29 municipalities, including the city of Taranto, with an area of 2,428 km². The population of the whole province is 584,649 inhabitants. The municipality of Taranto is the capital of the province, although its population has steadily declined from 244,000 inhabitants in 1991 to about 200,000 in 2011 (see Figure 1b). Mobility within the region is very dependent on the private motor vehicle, with almost 60% of the total modal share compared with only 12% for public transport (Italian National Institute of Statistics, 2011). The unemployment rate in the region, 13% in 2012, is clearly higher than the national average (10.9%). Due to pollutant emissions by the industrial plants located in this area, Taranto is one of the most polluted provinces in Italy and Western Europe; 7% of the pollution is inhabitant related and 93% is factory related. A price moderation process has been detected in the housing market due to all these factors (higher unemployment rate, lower population), together with some movement of the population from zones near the industrial space to more remote areas with higher environmental quality. In addition, these areas, and especially those located in the east and northeast of the region, make greater use of the private vehicle, with percentages higher than 70% of the total demand. Two types of data were collected for the case study: objective indicators, i.e. direct measures such as the level of concentration of pollutants in the air obtained from environmental monitoring stations, and subjective indicators, i.e. residents' perception of air quality, estimated by surveys.
The data collection process provided a total of 473 observations (see Figure 1c for their spatial distribution). All the indicators included in the database and their measurement units are reported in Table 1. These variables have been arranged in three different groups: structural characteristics of the dwellings, accessibility/social environment variables and environmental quality of the area.
The dependent variable LnP is the asking price of the dwelling, obtained from a real estate website and collected in October 2012. In general, there is a high correlation between asking prices and selling prices. In Spain, for example, other authors (Le Gallo and Chasco, 2015) have estimated that the sale price is about 8 percent lower than the asking price. In Italy, the average difference is about 15% (Banca d'Italia, 2015). Other authors have calculated similar values in other study areas (Hometrack, 2005). The spatial distribution of the average zonal prices is shown in Figure 1d.
Among the accessibility indicators to mobility opportunities, FTV is an interaction between the number of lines connecting to a bus stop and a dummy variable with a value equal to 1 if the dwelling is less than 400 metres away from a bus stop. The accessibility variable is a Hansen type indicator (Geurs and van Wee, 2004): (10) where Ej is a measure of the attraction of zone j and Cij is a measure of the journey cost between zones i and j. In this application, Ej is the number of jobs in zone j whereas α1 and α2 are estimated parameters equal to 0.85 and 1.25 according to previous research (Coppola and Nuzzolo, 2011). The variable PRESTIGE is a dummy variable equal to one if the area where the dwelling is located has a certain special prestige. It is a qualitative indicator that has to be specified from the informal knowledge of the analyst about each area and can be used as a constant that measures positive environmental factors present in the area that are difficult to measure with quantitative indicators.
There were five variables related to environmental quality, which are central to the goals of this study. The variable KMILVA represents the Euclidean distance in kilometres between the industrial area and the dwelling. The variables NO2 and PM10 are average measures from the monitoring station nearest to the dwelling (see Figure 1b). These data have been measured by the Regional Agency for the Prevention and Protection of the Environment of the Apulia Region (ARPA). The variable NO2 represents the concentration of nitrogen dioxide in µg/m³ whereas the variable PM10 is the concentration of particulate matter up to 10 micrometres in size, also in µg/m³. Both types of pollutants can be a serious health risk for humans (Heinrich et al., 2013) and the EU Directive on Air Quality provides limit values for concentrations of both NO2 and PM10 as well as other pollutants (EU, 2008). The variables AIRQ and NOISEQ are the perceptions of the citizens about the quality of air and the level of noise in the surroundings of their houses. Both indicators measure citizen perception on an ordinal scale from 1 (very poor quality of air/very high level of noise) to 10 (very good quality of air/very low level of noise). These two variables were collected in 2012 through a survey of a random sample of 380 households. The variables AIRQ and NOISEQ represent the average values of the answers at the zonal level.
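The two accessibility variables can be sketched as follows. FTV follows the verbal definition directly. For the Hansen type indicator the functional form is not recoverable from this text, so the code assumes the common form in which attraction is raised to α1 and journey cost enters through a negative exponential with α2; that form, the units of cost and the function names are assumptions, not the specification used in the study.

```python
import math


def ftv(bus_lines_at_stop, distance_to_stop_m, threshold_m=400.0):
    """Number of bus lines, counted only if the stop is closer than the threshold."""
    near = 1 if distance_to_stop_m < threshold_m else 0
    return bus_lines_at_stop * near


def hansen_accessibility(jobs, costs, alpha1=0.85, alpha2=1.25):
    """Hansen type accessibility of one origin zone.

    ASSUMPTION: sum over zones of jobs**alpha1 * exp(-alpha2 * cost).
    The impedance function and the cost units are not given in the text.
    """
    return sum(e ** alpha1 * math.exp(-alpha2 * c) for e, c in zip(jobs, costs))


# Hypothetical dwelling: three bus lines at a stop 250 m away.
ftv_near = ftv(3, 250.0)
ftv_far = ftv(3, 900.0)

# Hypothetical zones: jobs in each destination and journey cost from the origin.
acc_close = hansen_accessibility(jobs=[1000, 500], costs=[0.2, 0.5])
acc_remote = hansen_accessibility(jobs=[1000, 500], costs=[1.5, 2.0])
```

Under this assumed form, the same set of jobs contributes less accessibility as journey costs rise, which is the behaviour any Hansen type indicator is meant to capture.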

Model estimates
This section presents the estimation of two HLR models followed by the results of the four spatial regression models (SAR, SEM, SAC and GWR). The estimated parameters are reported in Tables 2 to 5, followed by the p-value of the t test in brackets.
The HLR1 model was estimated using the objective measures of environmental quality (KMILVA, NO2 and PM10) whereas HLR2 was estimated with the subjective measures (AIRQ and NOISEQ), using OLS in both cases. The specifications excluded variables correlated with others with an r coefficient greater than 0.5, given that the model could otherwise present problems of collinearity. This was the case of the variables ROOM and BATH, which were highly correlated with SQM, and of POP, which was positively correlated with TRAIN.

Table 2. Estimated parameters of the Hedonic Linear Regression models

The fit of the models was around 68% using the adjusted R² indicator. The variables representing the structural characteristics of the dwellings presented the expected signs in all cases and were significant at a confidence level of at least 95%, except in the case of IMP, which had no clear significance, especially in HLR1. The most influential variables on the real estate values were GA and GAR, i.e. the presence of a garden and a garage in the dwelling, with a mean impact, ceteris paribus, close to 20% on property prices.
Among the variables corresponding to accessibility and the social environment, the PRESTIGE variable was clearly significant and had the expected positive sign as an important factor causing an increase of almost 75% in the price of the dwellings. The variable TRAIN, in contrast, was clearly non-significant, whereas the supply of bus transport lines (FTV) had a positive sign with an increase of between 2.7% and 2.9% per additional line. The accessibility indicator, ACC, was more important in the HLR1 model than in HLR2, where it was not significant, although in both cases it presented the expected positive sign.
The variables relating to environmental quality were not clearly significant in all cases in the HLR1 model. The variable of distance from the ILVA steel plant presented the expected positive sign, i.e. there is a gradient with prices increasing with distance from the plant, and was significant at a confidence level of 90%. The parameter implies an increase of 0.4% in the real estate values for every kilometre away from the industrial area (see Figure 2). The level of the pollutant NO2 was also significant and had the expected negative sign, whereas the parameter of PM10 presented a counterintuitive positive sign. This could be due to the fact that high levels of PM10 are also related to the presence of other urban activities (Pollice and Jona Lasinio, 2010) and their proximity might prove attractive for certain segments of the population (i.e. the accessibility variables do not capture all the accessibility possibilities considered by the urban agents). However, the positive sign was not significantly different from 0 at the 95% confidence level, so it is not possible to say that PM10 has a positive or negative influence on the real estate values. The HLR2 model parameters were very similar to those of HLR1. The subjective environmental indicators AIRQ and NOISEQ had the expected positive signs in both cases (a higher perceived quality of noise and air implies higher prices) but NOISEQ was clearly significant whereas AIRQ was not.
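Because the dependent variable is the logarithm of the price, a coefficient is converted into a percentage effect by exponentiating it. The sketch below shows that conversion. The coefficient 0.004 is a hypothetical value chosen to be consistent with the 0.4% per kilometre quoted above, and log(1.75) is the coefficient that would correspond exactly to a 75% increase for a dummy variable; the estimated values themselves are in Table 2, which is not reproduced here, and whether the paper used this exact conversion or the simpler 100 × β approximation is not stated.

```python
import math


def percent_effect(beta, delta=1.0):
    """Percentage change in price when a regressor rises by delta in a log-price model."""
    return 100.0 * (math.exp(beta * delta) - 1.0)


# Hypothetical distance coefficient (per km) and its effect over 1 km and 10 km.
effect_1km = percent_effect(0.004)
effect_10km = percent_effect(0.004, delta=10.0)

# A dummy variable with coefficient log(1.75) raises the price by 75%.
effect_dummy = percent_effect(math.log(1.75))
```

For small coefficients the exact and approximate conversions almost coincide, but for large ones such as a prestige dummy the difference matters: a coefficient of about 0.56 corresponds to 75%, not 56%.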
In both HLR models the Moran I index of global spatial correlation was calculated on the residuals of the regression. The index was clearly significant, which supports the hypothesis of spatial autocorrelation. Anselin (1988) recommends using the Lagrange multiplier (LM) test to detect specification errors due to not considering spatial dependence in the HLR models. This test can detect specification errors caused by not including the autoregressive parameter in the dependent variable (LM-Lag), in the error term (LM-Error) or in both (LM-SARMA). In the HLR1 and HLR2 models, the LM test was clearly significant in all cases.
[...] decrease. This result was also obtained by Flower and Ragas (1994) in the case of two oil refineries, although in that case it concerned short-term effects on public health following adverse publicity. The meta-analysis performed by Kiel and Williams (2007) considering Superfund sites found that larger sites and fewer blue-collar workers are associated with more negative impacts on housing prices. In the present case study, the port and the ILVA steel factory cover a huge industrial area of more than 20 km² (more than twice the size of the city of Taranto), so it is consistent with these results that its impact can be important. If the subjective indicators are considered, the results obtained were very similar to those of the HLR1 model in terms of goodness of fit, in contrast with those of Boyle and Kiel (2001) and Le Gallo and Chasco (2015), where the subjective indicators explained housing prices better than the objective measurements.
Taking into account the spatial regression models (see Table 3), the parameters were very similar in the cases of the structural and accessibility/social environment variables. The environmental quality variables also presented the same parameter sign and significance, including the counterintuitive positive and non-significant sign of PM10, probably due to the positive impact of accessibility to other urban activities. The parameter of the AIRQ variable was clearly not significant in the spatial models, whereas NOISEQ was significant at a 90% confidence level. It is also noteworthy that the accessibility to jobs did not turn out to be a significant variable in any of the models. This indicates that prices are derived mainly from variables related to the structural and environmental characteristics of the dwellings rather than from their proximity to places of employment concentration, such as the industrial area and the ILVA steel factory. In contrast, the public transport supply was clearly significant in the average price of the dwellings, with an increase per additional available line similar to the results obtained by the HLR models.
The spatial autoregressive parameter ρ was significant at a confidence level of at least 94%, whereas the λ parameter for spatial error was not significant in any of the cases. In addition, the spatial autoregressive parameter of the models was the only one clearly significant, further evidence of the existence of spatial dependence.
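The global Moran I index applied to the regression residuals can be sketched as below. The code computes only the statistic itself, not its significance test; the four-point chain of neighbours and the residual values are hypothetical and chosen so that the result can be verified by hand.

```python
def morans_i(values, W):
    """Global Moran's I of a list of values for a spatial weights matrix W."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in W)  # sum of all weights
    cross = sum(W[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(v * v for v in z)


# Hypothetical binary contiguity for four dwellings along a street.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]

clustered = morans_i([1.0, 1.0, -1.0, -1.0], W)    # similar residuals are adjacent
alternating = morans_i([1.0, -1.0, 1.0, -1.0], W)  # neighbours always differ
```

A positive value, as for the clustered residuals, indicates that similar residuals lie close together, which is the pattern that motivates moving from the HLR models to the spatial specifications.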
The fit of the SAR, SEM and SAC models was very similar but slightly better in the case of the SAR models considering AIC. The goodness of fit of the spatial models can be compared with the fit of the HLR models using the likelihood ratio (LR) test, whose statistic follows a χ² distribution with r degrees of freedom, where r is the number of linear restrictions (Ortúzar and Willumsen, 2011). The LR test was only clearly significant in the cases of the SAR models (with values of 4.9 and 5.5, see Table 3), which involve only one linear restriction.
The total impacts of the models considering spatial dependence in the dependent variable (SAR and SAC models) were also calculated (see Table 4). Comparing the total impacts with the directly estimated parameters (see Table 3 and Table 4), it can be seen that the total impacts of variables like SQM, PRESTIGE or FTV are clearly higher. This fact provides empirical evidence in favour of the existence of spillover effects associated with these variables, e.g. a greater supply of public transport near a dwelling increases the prices of neighbouring dwellings (indirect effect from an observation) and this effect simultaneously increases the price of the first dwelling (indirect effect to an observation). The variables related to environmental quality, KMILVA, PM10 and NOISEQ, also had a slightly higher total effect, whereas AIRQ and ACC were again clearly not significant.
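For one linear restriction the likelihood ratio test reduces to a χ² test with one degree of freedom, whose p-value has a closed form. The sketch below applies it to the two LR values quoted above; the log-likelihoods passed to the helper are hypothetical numbers, since the fitted log-likelihoods are not given in this text.

```python
import math


def lr_statistic(loglik_restricted, loglik_unrestricted):
    """LR = 2 * (log-likelihood of the larger model - log-likelihood of the nested one)."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)


def chi2_one_df_p_value(statistic):
    """Upper-tail probability of a chi-square variable with one degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# LR values reported for the two SAR models.
p_first = chi2_one_df_p_value(4.9)
p_second = chi2_one_df_p_value(5.5)

# Hypothetical log-likelihoods, only to show how the statistic is formed.
example_lr = lr_statistic(loglik_restricted=-120.0, loglik_unrestricted=-117.5)
```

Both reported values exceed the 5% critical value of about 3.84, so under these assumptions the spatial lag term would be judged a significant improvement over the non-spatial model.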
When compared with the results of other studies considering the effects of spatial relationships, these results agree with those by Conway et al. (2010) in the sense that the significant effects on housing prices detected in the HLR models remained after checking for spatial effects.
Finally, the spatial heterogeneity of the parameters has been examined using GWR and a Monte Carlo significance test (Fotheringham et al., 2002). The HGWR was performed using a Gaussian model with an adaptive kernel type (bi-square) and a kernel bandwidth determined by AIC minimization. The adaptive kernel type was preferred to the fixed type to ensure an equal number of data points in every local regression. Considering the spatial distribution of the parameters, KMILVA had the highest values in the North-East of the industrial area (i.e. the greater the distance from the industrial area, the more positive the impact on the prices of the dwellings), whereas to the West the parameters were clearly not significant. In the NO2, NOISEQ and AIRQ cases, the greater impacts of the parameters were also to the East and North-East of the industrial area, although in the case of the AIRQ variable the impacts were higher in the dwellings located nearer to the undesired land use.
These results suggest either the existence of non-stationarity in the parameters or the lack of a variable in the specification of the models whose effect is captured by the environmental quality variables. A possible explanation for the non-stationarity could be that the households in the East and North-East areas (Martina Franca and Crispiano municipalities), characterised by higher priced dwellings, have a higher preference for better environmental quality, i.e. there is more than one housing market in the study area. In addition, the East and North-East areas have received population from the city of Taranto and the nearby industrial zone given their higher environmental quality. However, the existence of more than one housing market is difficult to prove because the sample of dwellings obtained to the West of the Taranto province was not big enough.
These results allow the estimation of the SAR models to be compared with that of the GWR models, as proposed by Long et al. (2007). The HGWR models had a higher goodness of fit considering the AIC and made it possible to capture the already mentioned spatial issues of non-stationarity or spatial variable bias in the specification of the model. In addition, these results were also similar to those obtained by Le Gallo and Chasco (2015), who found heterogeneity between different housing markets in their study area.
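An adaptive bi-square kernel can be sketched by letting the bandwidth at each regression point equal the distance to its k-th nearest data point, so that every local regression draws on roughly the same number of observations. This is one common way to implement an adaptive kernel and is assumed here; the number of neighbours k, the coordinates and the function name are illustrative, and in practice k would be chosen by an automatic criterion such as AIC minimization.

```python
import math


def adaptive_bisquare_weights(regression_point, data_points, k):
    """Bi-square weights with bandwidth set to the distance of the k-th nearest point.

    If the regression point is itself a data point, it counts as the nearest one.
    """
    distances = [math.dist(regression_point, p) for p in data_points]
    bandwidth = sorted(distances)[k - 1]
    weights = [(1 - (d / bandwidth) ** 2) ** 2 if d < bandwidth else 0.0
               for d in distances]
    return bandwidth, weights


# Hypothetical dwellings along a line (coordinates in km).
points = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0)]
bandwidth, weights = adaptive_bisquare_weights((0, 0), points, k=3)
```

In sparsely sampled areas the bandwidth widens automatically, which avoids local regressions based on very few observations, at the cost of smoothing over a larger territory there.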

CONCLUSIONS
In this paper, five types of hedonic models were estimated to assess the influence of undesired externalities on dwelling prices: hedonic multiple linear regression (HLR), spatial autoregressive in the dependent variable (SAR), spatial autoregressive in the error term (SEM), spatial autoregressive in both the dependent variable and the error term (SAC), and geographically weighted regression (GWR). The models, estimated using data collected in the province of Taranto, were compared to control for the presence of spatial relationships between observations and to test whether the presence of the industrial area and the ILVA steel factory was a significant factor explaining real estate values.
The HLR models showed that the distance from the industrial area had a positive effect on real estate prices, whereas the measured levels of NO2 had a negative one. By contrast, no such effect was observed for the levels of PM10. These results lead to the conclusion that there is some empirical evidence of a moderate impact of the negative externalities of the industrial area and the ILVA steel factory on real estate values. This result is consistent with the fact that accessibility to jobs was clearly not significant in all models. It indicates that, between these opposing driving forces shaping the utility of a residential location, environmental quality seems to carry a greater weight than proximity to employment in the study area.
Considering the subjective indicators, the perceived air quality was clearly not significant in all the models, especially in those controlling for spatial effects. The perceived noise quality, by contrast, was significant in all the specifications at a confidence level of at least 90%. A comparison of the model estimates using objective indicators with those using subjective ones shows that the former did not fit better than the latter.
In the spatial models, these results did not change, although the statistical significance of the parameters was lower. However, the spatial models helped to capture spatial effects present in the data. The estimation of the SAR and SAC models revealed spillover effects, whereas the GWR technique showed that in the East and North-East of the study area the effects of the environmental quality variables were stronger and statistically significant, because of either spatial heterogeneity (different housing markets) or spatial variable bias in the model specification. These results show the usefulness of spatial techniques to explore and capture these effects, avoiding the problems associated with the MLR models, which can cause bias in the estimated parameters.
The estimated model could be incorporated into a future LUTI model of the study area in order to simulate the impacts of different policies on land uses and transport patterns. It must be taken into account that the displacement of population from areas affected by negative externalities to less accessible areas with better environmental quality could require a redesign of public transport services to adapt them to the new pattern of trip demand.
Further research could improve the estimated results by adding more data on real estate transaction prices in the study area. A larger sample would reduce the standard errors of the estimated parameters, thereby decreasing the uncertainty about their significance and about the true population values. Furthermore, it could be useful to measure the effects of the industrial area in different time periods using panel data. This would allow the effects of different events on the evolution of real estate prices to be estimated. Finally, additional techniques such as quantile conditionally parametric modelling (McMillen, 2012) could be applied in order to explore more thoroughly the spatial heterogeneity found in the data using GWR.
Stakhovych and Bijmolt (2009) concluded that high connectivity of the weights matrices had a negative impact on the detection of the true model specification and that a selection of the weight matrix based on goodness of fit criteria (log-likelihood or information criteria) usually indicates its correct specification. LeSage and Pace (2014) proposed different measurements of the correlation between neighbourhood matrices and showed how the influence of specifying W on the estimations of the if they are correctly interpreted from the true partial derivatives (direct impacts + indirect impacts, see below) and if the model is well specified.
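The decomposition into direct and indirect impacts mentioned above can be sketched for a SAR model, where the partial derivatives of the dependent variable with respect to one explanatory variable form the matrix (I − ρW)⁻¹β. The sketch below is only an illustration under assumed values: the ring-shaped weights matrix, `rho` and `beta` are hypothetical and are not the estimates reported in this paper.

```python
import numpy as np

def sar_impacts(W, rho, beta):
    """Average direct, indirect and total impacts of one explanatory
    variable in a SAR model, from the matrix S = (I - rho*W)^(-1) * beta."""
    n = W.shape[0]
    S = np.linalg.inv(np.eye(n) - rho * W) * beta
    direct = np.trace(S) / n       # mean own-observation effect
    total = S.sum() / n            # mean row sum of S
    indirect = total - direct      # mean spillover to and from neighbours
    return direct, indirect, total

# Hypothetical example: 5 units on a ring, each with two neighbours,
# and a row-standardised weights matrix (each row sums to 1).
n = 5
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5

rho, beta = 0.4, 2.0               # assumed values for illustration
direct, indirect, total = sar_impacts(W, rho, beta)
```

With a row-standardised W, the total impact equals β / (1 − ρ), so a positive ρ makes the total impact larger than the coefficient itself; the difference between the total and the direct impact is the spillover effect.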
monitoring stations and subjective indicators, i.e. residents' perception of air quality, estimated by surveys. The chosen subjective indicators were based on a survey in which a random sample of households was asked about the perceived quality of air and noise levels. Previous studies comparing the performance of objective and subjective indicators have found different results, generally showing that the perception of the undesirable externalities explains real estate values better than the objective measurements obtained from monitoring stations.

Figure 1. Spatial distribution of elevations (a), population and monitoring stations (b), sample of households and infrastructures (c) and average asking price aggregated by zone (d) in the study area

Figure 2. Partial effects of the KMILVA variable

Figure 3. Spatial variation of the KMILVA parameter

Table 1. Description and descriptive statistics of the variables contained in the database (N=473)

Table 3. Spatial regression models estimation results

Table 4. Total impacts of the SAR and SAC models