Chapter

8. Decomposing an RPPI into Land and Structures Components

Author(s):
Statistical Office of the European Communities;International Labour Office;International Monetary Fund;Organization for Economic Co-operation and Development;United Nations;World Bank
Published Date:
May 2013

Introduction

8.1 In Chapter 3 it was mentioned that for national accounts and CPI purposes, it will be useful or necessary to have a decomposition of the residential property price index (RPPI) into two components: a quality adjusted price index for structures and a price index for the land on which the house is built. The present chapter outlines how hedonic regression can be utilized to derive such a decomposition. Hedonic regression methods were discussed in Chapter 5.

8.2 Some economic reasoning will be helpful to derive an appropriate hedonic regression model. Think of a property developer who is planning to build a structure on a particular property. He or she will likely determine the selling price of the property after the structure is completed by first calculating the total expected cost. This cost will be equal to the floor space area of the structure, say S square meters, times the building cost per square meter, γ say, plus the cost of the land, which will be equal to the cost per square meter, β say, times the area of the land site, L. We follow a cost of production approach to modeling the property price. That is, the functional form for the hedonic price function is assumed to be determined by the supply side of the market, i.e., by independent contractors.(1)

8.3 Now consider a sample of properties of the same general type, which have structure areas Snt and land areas Lnt in period t for n = 1,…,N(t); the prices pnt are equal to costs of the above types plus error terms ɛnt which are assumed to have means 0. This gives rise to the following hedonic regression model for period t where βt and γt are the parameters to be estimated: (2)

$$p_n^t = \beta^t L_n^t + \gamma^t S_n^t + \varepsilon_n^t; \quad n = 1,\dots,N(t). \tag{8.1}$$

The quantity of land Lnt and the quantity of structures Snt associated with the sale of property n in period t are the only two property characteristics included in this very simple model; the corresponding prices in period t are the price of a square meter of land βt and the price of a square meter of structure floor space γt. Separate linear regressions of the form (8.1) can be performed for each time period t in the sample.
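
As an illustration of the period by period linear regression, the following minimal sketch fits the builder's model for a single period by ordinary least squares without an intercept. The function name `fit_builders_model` and the sales figures are hypothetical; the data are generated without noise, so the fit simply recovers the assumed prices of 300 per square meter of land and 1000 per square meter of floor space.

```python
def fit_builders_model(land, structure, price):
    """OLS without intercept for p = beta*L + gamma*S in one period."""
    # Entries of the 2x2 normal equations X'X b = X'p.
    s_ll = sum(l * l for l in land)
    s_ss = sum(s * s for s in structure)
    s_ls = sum(l * s for l, s in zip(land, structure))
    s_lp = sum(l * p for l, p in zip(land, price))
    s_sp = sum(s * p for s, p in zip(structure, price))
    det = s_ll * s_ss - s_ls ** 2
    if abs(det) < 1e-9 * s_ll * s_ss:
        raise ValueError("land and structure areas are (nearly) collinear")
    beta = (s_ss * s_lp - s_ls * s_sp) / det    # price of land per m2
    gamma = (s_ll * s_sp - s_ls * s_lp) / det   # price of structures per m2
    return beta, gamma


# Hypothetical, noise-free sales for one period (assumed values).
land = [120.0, 200.0, 350.0, 150.0]
structure = [100.0, 110.0, 160.0, 140.0]
price = [300.0 * l + 1000.0 * s for l, s in zip(land, structure)]

beta, gamma = fit_builders_model(land, structure, price)
```

With real sales data the error terms are not zero, and a statistical package would normally be used to obtain standard errors as well.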

8.4 The “builder’s model” (8.1) essentially relates to newly-built dwellings. To make it applicable to existing or resold houses we should account for the fact that older structures will be worth less than newer structures due to depreciation of the structures. Information on the age of the structure will therefore be needed. The next section shows how depreciation can be incorporated into the model, similar to what was done in the examples for the town of “A” presented in Chapter 5. It will also be shown how additional land and structures characteristics can be included as explanatory variables.

Accounting for Depreciation and Additional Characteristics

Depreciation

8.5 Suppose that in addition to information on the selling price of property n at time period t, pnt, the land area of the property, Lnt, and the structure area, Snt, information on the age of the structure at time t, say Ant, is available. If straight line depreciation is assumed, the following model is a straightforward extension of (8.1) to include “existing” houses:

$$p_n^t = \beta^t L_n^t + \gamma^t (1 - \delta A_n^t) S_n^t + \varepsilon_n^t; \quad n = 1,\dots,N(t). \tag{8.2}$$

where the parameter δ reflects the (straight line) depreciation rate as the structure ages one additional period. If structure age is measured in years, δ will probably be between 0.5% and 2%. This will be an underestimate of “true” depreciation because it will not account for major renovations or additions to the structure. The estimated straight line depreciation rate in (8.2) should therefore be interpreted as a net depreciation rate; i.e., a gross depreciation rate less the rate of renovations and additions to the structure. Model (8.2) will not work for very old structures since, if they are still in use, they will likely have been extensively renovated. (3)

8.6 Notice that (8.2) is a nonlinear regression model whereas (8.1) is a linear regression model. (4) Because the depreciation parameter δ is regarded as fixed over time, (8.2) would have to be estimated as one nonlinear regression over all time periods in the sample, whereas model (8.1) can be run as a period by period linear regression. The period t price of land in model (8.2) will be the estimate for the parameter βt and the price of a unit of a newly built structure for period t will be the estimate for γt. The period t quantity of land for property n is Lnt and the period t quantity of structures for property n, expressed in equivalent units of a new structure, is (1-δAnt)Snt, where Snt is the floor space area of property n in period t.
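
One way to see why the model must be pooled over all periods is that, for any fixed value of δ, it is linear in βt and γt. The sketch below exploits this with a simple grid search over δ, running a linear regression for each period and keeping the δ with the smallest pooled sum of squared residuals. This concentrated least squares approach is an assumption made for illustration, not necessarily the estimation routine used for the results in this chapter; the function names and the noise-free data are hypothetical.

```python
def ols2(x1, x2, y):
    """Least squares for y = b1*x1 + b2*x2 (no intercept); returns b1, b2, SSR."""
    a11 = sum(u * u for u in x1)
    a22 = sum(v * v for v in x2)
    a12 = sum(u * v for u, v in zip(x1, x2))
    c1 = sum(u * w for u, w in zip(x1, y))
    c2 = sum(v * w for v, w in zip(x2, y))
    det = a11 * a22 - a12 * a12
    b1 = (a22 * c1 - a12 * c2) / det
    b2 = (a11 * c2 - a12 * c1) / det
    ssr = sum((w - b1 * u - b2 * v) ** 2 for u, v, w in zip(x1, x2, y))
    return b1, b2, ssr


def fit_with_depreciation(periods, deltas):
    """periods: one list of sales per period, each sale a tuple (L, S, A, p).

    Returns the delta from the grid with the smallest pooled SSR and the
    per-period (beta, gamma) estimates obtained at that delta.
    """
    best = None
    for delta in deltas:
        fits, total_ssr = [], 0.0
        for sales in periods:
            land = [L for L, S, A, p in sales]
            # Structures measured in equivalent units of a new structure.
            eff_s = [(1.0 - delta * A) * S for L, S, A, p in sales]
            price = [p for L, S, A, p in sales]
            beta, gamma, ssr = ols2(land, eff_s, price)
            fits.append((beta, gamma))
            total_ssr += ssr
        if best is None or total_ssr < best[0]:
            best = (total_ssr, delta, fits)
    return best[1], best[2]


# Hypothetical data: age in decades, common depreciation rate of 0.1 per decade.
true_delta = 0.1
true_params = [(300.0, 1000.0), (330.0, 1020.0)]
base_sales = [(120.0, 100.0, 0.5), (200.0, 110.0, 2.0), (350.0, 160.0, 4.0),
              (150.0, 140.0, 1.0), (260.0, 125.0, 3.0)]
periods = [[(L, S, A, b * L + g * (1.0 - true_delta * A) * S)
            for L, S, A in base_sales] for b, g in true_params]

delta_hat, fits = fit_with_depreciation(periods, [i / 100 for i in range(21)])
```

A grid search is crude but transparent; a nonlinear least squares routine would refine δ and also deliver standard errors.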

8.7 Expensive properties probably have relatively large absolute errors compared to inexpensive properties, so it might be better to assume multiplicative rather than additive errors. However, we prefer an additive model specification as the purpose is to decompose the aggregate value of housing into the sum of structures and land components; the use of additive errors facilitates this decomposition. When there is evidence of heteroskedasticity, weighted regressions can be considered. Several researchers suggested hedonic regression models that lead to additive decompositions of a property price into land and structures components. (5)

8.8 There is a potential problem with the above builder’s model, namely multicollinearity. Large structures are generally built on large plots of land, so that Snt and Lnt could be highly collinear (i.e., the land-structure ratios Lnt / Snt could be centered around a constant). This could give rise to unstable estimates of the quality adjusted prices βt and γt for land and structures. As will be seen in the example using data for the Dutch town of “A”, the problems of multicollinearity and instability do indeed occur. In general, multicollinearity is not a major problem if the goal is to produce an overall house price index, but it is problematic if the goal is to produce separate price indices for land and structures components. Some possible methods for overcoming the multicollinearity problem will be suggested later on.
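
The identification problem can be made concrete with a small numerical check. In the sketch below, `identification_margin` is a hypothetical helper that scales the determinant of the normal equations of the two-regressor model to lie between 0 and 1; a value of 0 means that land and structure areas are exactly proportional, so separate land and structures prices cannot be estimated. The data are invented for illustration.

```python
def identification_margin(land, structure):
    """Scaled determinant of X'X for regressors (L, S); 0 means exact collinearity."""
    s_ll = sum(l * l for l in land)
    s_ss = sum(s * s for s in structure)
    s_ls = sum(l * s for l, s in zip(land, structure))
    return (s_ll * s_ss - s_ls ** 2) / (s_ll * s_ss)


structure = [100.0, 120.0, 140.0, 160.0]
proportional_land = [1.5 * s for s in structure]   # constant land-structure ratio
varied_land = [120.0, 300.0, 150.0, 400.0]         # ratios differ across sales

margin_collinear = identification_margin(proportional_land, structure)
margin_varied = identification_margin(varied_land, structure)
```

The closer the margin is to zero, the more a small change in the data can shift value between the land and structures coefficients while leaving the fitted property prices almost unchanged.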

8.9 The hedonic regression model (8.2) has the implication that the parameters would have to be re-estimated whenever the data for a new period became available. To overcome this problem, a “rolling window” approach could be applied. A suitable window length T would be chosen, (6) the model defined by (8.2) or (8.3) would be estimated using the data for the last T periods, and the existing series for price of land and for price of structures would be updated using the chain link factors βT / βT-1 and γT / γT-1. This approach will be illustrated below.
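
The updating step of the rolling window approach can be sketched as follows. The function name and all numbers are hypothetical: `window_estimates` stands for the land price parameters estimated on the most recent window of T periods, and only the ratio of the last two estimates is used to extend the published series.

```python
def extend_by_chain_link(index_series, window_estimates):
    """Append one period to an index using the last two window estimates."""
    link = window_estimates[-1] / window_estimates[-2]   # e.g. beta_T / beta_(T-1)
    return index_series + [index_series[-1] * link]


land_index = [1.0, 1.02, 1.05]                 # existing (published) series
window_betas = [295.0, 301.0, 310.0, 316.2]    # estimates from the latest window
updated = extend_by_chain_link(land_index, window_betas)
```

Because earlier index values are left untouched, the published series is never revised when a new period is added; the same function applies to the structures series with the γ estimates.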

Adding More Characteristics

8.10 The above basic nonlinear hedonic regression framework can be generalized to encompass the traditional array of characteristics used in real estate hedonic regressions. Suppose that we can associate with each property n transacted in period t a list of K characteristics $X_{n1}^t, X_{n2}^t, \dots, X_{nK}^t$ that are price determining characteristics for the land on which the structure was built and a similar list of M characteristics $Y_{n1}^t, Y_{n2}^t, \dots, Y_{nM}^t$ that are price determining characteristics for the type of structure. The following equations generalize (8.2) to the present setup: (7)

$$p_n^t = \beta^t \Big(1 + \sum_{k=1}^{K} X_{nk}^t \eta_k\Big) L_n^t + \gamma^t (1 - \delta A_n^t) \Big(1 + \sum_{m=1}^{M} Y_{nm}^t \lambda_m\Big) S_n^t + \varepsilon_n^t; \quad t = 1,\dots,T; \; n = 1,\dots,N(t) \tag{8.3}$$

where the parameters to be estimated are now the K quality of land parameters, η1,…,ηK, the M quality of structures parameters, λ1,…,λM, the period t quality adjusted price for land βt and the period t quality adjusted price for structures γt. The quality adjusted amount of land, Lnt*, and the corresponding quality adjusted amount of structures, Snt*, for property n in period t are defined as follows:

$$L_n^{t*} \equiv \Big(1 + \sum_{k=1}^{K} X_{nk}^t \eta_k\Big) L_n^t; \qquad S_n^{t*} \equiv (1 - \delta A_n^t) \Big(1 + \sum_{m=1}^{M} Y_{nm}^t \lambda_m\Big) S_n^t. \tag{8.4}$$

8.11 To illustrate how X and Y variables can be formed, consider the list of explanatory variables in the hedonic housing regression model reported by Li, Prud’homme and Yu (2006; 23). The following variables in their list of explanatory variables can be viewed as variables that affect structures quality; i.e., they are Y type variables: number of bedrooms, number of bathrooms, number of garages, number of fireplaces, age of the unit, age squared of the unit, exterior finish is brick or not, dummy variable for new units, unit has hardwood floors or not, heating fuel is natural gas or not, unit has a patio or not, unit has a central built in vacuum cleaning system or not, unit has an indoor or outdoor swimming pool or not, unit has a hot tub unit or not, unit has a sauna or not, and unit has air conditioning or not. The following variables can be assumed to affect the quality of the land; i.e., they are X type location variables: unit is at the intersection of two streets or not (corner lot or not), unit is at a cul-de-sac or not, shopping center is nearby or not, and various suburb location dummy variables.

8.12 Equations (8.3) and (8.4) show how the quality adjusted amounts of land and structures would be calculated if the goal is to construct price indices for the sales of properties of the type that are included in the hedonic regression model. If the goal is to construct price indices for the stock of properties of the type included in the regression, then the construction of appropriate weights becomes more complex. These weighting problems will be discussed in the next section.

Aggregation and Weighting Issues: Indices for Sales versus Stocks of Housing

8.13 As was explained in Chapter 5, the construction of an RPPI for the sales of property using standard hedonic regression techniques is fairly straightforward. Typically, a separate hedonic regression of the type defined by (8.3) will be run for each locality or region in a country. (8) Recall that once a particular regression has been run, period t quality adjusted prices for land, $P_L^t$, and for structures, $P_S^t$, for the region under consideration can be defined in terms of the estimated parameters for the model as follows:

$$P_L^t \equiv \hat{\beta}^t; \tag{8.5}$$
$$P_S^t \equiv \hat{\gamma}^t. \tag{8.6}$$

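The corresponding quality adjusted quantities of land and structures for the region, say $Q_L^t$ and $Q_S^t$, can also be defined in terms of the estimated parameters using definitions (8.4) above as follows:

$$Q_L^t \equiv \sum_{n=1}^{N(t)} L_n^{t*}; \tag{8.7}$$
$$Q_S^t \equiv \sum_{n=1}^{N(t)} S_n^{t*}. \tag{8.8}$$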

8.14 If hedonic regressions of the type defined by (8.3) have been run for, say, R regions using the T periods of data, then the algebra associated with (8.5)-(8.8) can be repeated for each region r. Denote the resulting prices and quantities for region r that are the counterparts to (8.5)-(8.8) by $P_{Lr}^t$, $P_{Sr}^t$, $Q_{Lr}^t$ and $Q_{Sr}^t$ for r = 1,…,R and t = 1,…,T. Now Fisher (sales) RPPIs for land can be constructed using the regional price and quantity data for land, $P_L^t \equiv [P_{L1}^t,\dots,P_{LR}^t]$ and $Q_L^t \equiv [Q_{L1}^t,\dots,Q_{LR}^t]$, for each time period t (t = 1,…,T). Similarly, Fisher (sales) RPPIs for structures can be constructed using the price and quantity data for structures in each period t, $P_S^t \equiv [P_{S1}^t,\dots,P_{SR}^t]$ and $Q_S^t \equiv [Q_{S1}^t,\dots,Q_{SR}^t]$ for t = 1,…,T.(9)
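
A Fisher index over regional cells can be sketched as follows. The Fisher formula is the geometric mean of the Laspeyres and Paasche indices; the choice to chain the period to period links, as well as the function names and the two-region data, are assumptions made for illustration.

```python
import math


def fisher_link(p0, q0, p1, q1):
    """Fisher price index between two periods over a set of cells (regions)."""
    laspeyres = (sum(a * b for a, b in zip(p1, q0)) /
                 sum(a * b for a, b in zip(p0, q0)))
    paasche = (sum(a * b for a, b in zip(p1, q1)) /
               sum(a * b for a, b in zip(p0, q1)))
    return math.sqrt(laspeyres * paasche)


def chained_fisher(prices, quantities):
    """prices[t][r], quantities[t][r]: cell price and quantity in period t."""
    index = [1.0]
    for t in range(1, len(prices)):
        link = fisher_link(prices[t - 1], quantities[t - 1],
                           prices[t], quantities[t])
        index.append(index[-1] * link)
    return index


# Hypothetical land prices and quantities for two regions over three periods.
prices = [[100.0, 200.0], [200.0, 400.0], [200.0, 400.0]]
quantities = [[5000.0, 3000.0], [4000.0, 3500.0], [6000.0, 2500.0]]
index = chained_fisher(prices, quantities)
```

In this example all regional prices double between the first two periods and then stay constant, so the index is 1, 2, 2 regardless of the quantity weights.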

8.15 As was the case with stratification methods, it is now necessary to consider how to construct an RPPI for the stock of residential properties when hedonic regression methods are used. The period t hedonic cell prices PLrt and PSrt defined by the region r counterparts to (8.5) and (8.6) can still be used as cell prices to construct stock price indices for land and structures, but the counterpart quantities QLrt and QSrt defined by (8.7) and (8.8) are no longer appropriate; these quantities need to be replaced by estimates that apply to the total stock of dwelling units in the region (or some other reference population) for regression r at time t, say QLrt* and QSrt*, for r = 1,…,R. Thus, the counterpart summations in (8.7) and (8.8) are now taken over the entire stock of dwellings in region r in period t instead of just the dwelling units that were sold in period t. Period t information on the quantity of land Lnrt for every unit n in the region that is in scope for the hedonic regression model m is now required, along with the accompanying characteristics information Xnrkt for every land characteristic k, as well as data on the quantity of the structures Snrt, along with the accompanying characteristics information Ynrmt for every structures characteristic m. With these new population quantity weights, the rest of the details of the index construction are the same as was the case for the sales RPPI.

8.16 In order to construct appropriate period t population stock weights, it will be necessary for the country to have census information on the housing stock with enough details on each dwelling unit in the stock so that the required information on the quantity of land and structures and the accompanying characteristics can be calculated. If information on new house construction (plus the required characteristics data) and on demolitions is available in a timely manner, the census information can be updated and period t estimates for the constant quality amounts of land and structures, the QLrt* and QSrt* can be approximated in a timely manner. Hence, stock RPPIs for land and structures can be constructed using Fisher indices, as was the case for the sales RPPI. If timely data on new construction and demolitions is unavailable, it may only be possible to construct fixed base Laspeyres type price indices using the quantity weights from the last available housing census.
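
When only the quantities from the last housing census are available, the fixed base Laspeyres type index mentioned above can be sketched as follows. The function name and all figures are hypothetical; the quantities stand for census based stock estimates of constant quality land in each region.

```python
def laspeyres_stock_index(base_prices, current_prices, census_quantities):
    """Fixed base Laspeyres price index with census stock quantities as weights."""
    numerator = sum(p * q for p, q in zip(current_prices, census_quantities))
    denominator = sum(p * q for p, q in zip(base_prices, census_quantities))
    return numerator / denominator


base_land_prices = [300.0, 250.0]       # regional land prices, base period
current_land_prices = [330.0, 250.0]    # regional land prices, current period
census_land_stock = [1000.0, 3000.0]    # constant quality land, last census

stock_index = laspeyres_stock_index(base_land_prices, current_land_prices,
                                    census_land_stock)
```

Here a 10% price rise in the first region moves the stock index by less than 3% because most of the land stock is in the second region.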

8.17 If census information is not available at all (or if data on the characteristics of the dwelling units is missing), it still may be possible to approximate RPPIs for land and structures using hedonic regression techniques. If characteristics data on the residential properties that are sold in each period is stored over a large period of time, an approximate distribution of dwelling units by type can be constructed. This information may then be used to approximate a stock based RPPI in the manner explained above.

Main Advantages and Disadvantages

8.18 This section summarizes the main advantages and disadvantages of using hedonic regression methods to construct an RPPI for land and structure components. The main advantages are:

  • If the list of available property characteristics is sufficiently detailed, the method adjusts for both sample mix changes and quality changes of the individual houses.

  • Price indices can be constructed for different types of dwellings and locations through a proper stratification of the sample. Stratification has a number of additional advantages.

  • The method is probably the most efficient method for making use of the available data.

  • The method is virtually the only method that can be used to decompose the overall price index into land and structures components.

8.19 The main disadvantages of the hedonic regression approach are:

  • The method is data intensive since it requires data on all relevant property characteristics (in particular, the age, the type and the location of the properties in the sample as well as information on the structure and lot size) so it is relatively expensive to implement.

  • The method may not lead to reasonable results due to multicollinearity problems.

  • While the method is essentially reproducible, different choices can be made regarding the set of characteristics entered into the regression, the functional form for the model, the stochastic specification, possible transformations of the dependent variable, etc., which could lead to varying estimates of overall price change.

  • The general idea of the hedonic method is easily understood but some of the technicalities may not be easy to explain to users.

Application on Data for the Town of “A”: Preliminary Approaches

8.20 The general techniques explained in this chapter will now be illustrated using the data set for the Dutch town of “A”, which was described at the end of Chapter 4. We have data on sales of detached dwellings for 14 quarters, starting in the first quarter of 2005. Recall the notation used above and in Chapters 4 and 5: there were N(t) sales of detached houses in quarter t, where pnt is the selling price of house n. There is information available on three characteristics: area of the plot in square meters, Lnt; floor space area of the structure in square meters, Snt; and age in decades of house n in period t, Ant.

The Simple Case

8.21 The simple hedonic regression model defined by (8.2) will be estimated on this data set and is repeated here for convenience:

$$p_n^t = \beta^t L_n^t + \gamma^t (1 - \delta A_n^t) S_n^t + \varepsilon_n^t; \quad t = 1,\dots,14; \; n = 1,\dots,N(t). \tag{8.9}$$

The parameters to be estimated are βt (i.e., the price of land in quarter t), γt (the price of constant quality structures in quarter t) and δ (the common depreciation rate for all quarters). Model (8.9) has 14 unknown βt parameters, 14 unknown γt parameters and one unknown δ or 29 unknown parameters in all. (10)

8.22 The R2 for this model was equal to .8847, which is the highest yet for regressions using the data set for the town of “A”. The log likelihood was -10642.0, which is considerably higher than the log likelihoods for the two time dummy regressions that used prices as the dependent variable; recall the regression results associated with the construction of indices PH4 and PH5 defined in Chapter 5 where the log likelihoods were -10790.4 and -10697.8. The estimated decade straight line net depreciation rate was 0.1068 (0.00284).

8.23 The estimated land price series $\hat{\beta}^1,\dots,\hat{\beta}^{14}$ (rescaled to equal 1 in quarter 1), labeled PL1, and the quality adjusted price series for structures $\hat{\gamma}^1,\dots,\hat{\gamma}^{14}$ (also rescaled), labeled PS1, are plotted in Figure 8.1 and listed in Table 8.1. Using these price series and the corresponding quantity data for each quarter t, i.e., the amount of land transacted, $L^t \equiv \sum_{n=1}^{N(t)} L_n^t$, and the quantity of constant quality structures, $S^{t*} \equiv \sum_{n=1}^{N(t)} (1-\hat{\delta}A_n^t)S_n^t$, an overall property price index has been constructed using the Fisher formula. This overall index, labeled P1, is also plotted in Figure 8.1 and listed in Table 8.1. For comparison purposes, the Fisher hedonic imputation index from Chapter 5, PHIF, is also presented.

Figure 8.1. The Price of Land (PL1), the Price of Quality Adjusted Structures (PS1), the Overall Cost of Production House Price Index (P1) and the Fisher Hedonic Imputation House Price Index

Source: Authors’ calculations based on data from the Dutch Land Registry

Table 8.1. The Price of Land (PL1), the Price of Quality Adjusted Structures (PS1), the Overall Cost of Production House Price Index (P1) and the Fisher Hedonic Imputation House Price Index

Quarter   PL1       PS1       P1        PHIF
1         1.00000   1.00000   1.00000   1.00000
2         1.29547   0.91603   1.04571   1.04356
3         1.42030   0.89444   1.07482   1.06746
4         1.12290   0.99342   1.03483   1.03834
5         1.25820   0.94461   1.05147   1.04794
6         1.09346   1.08879   1.08670   1.07553
7         1.26514   1.01597   1.09941   1.09460
8         1.13276   1.03966   1.06787   1.06158
9         1.31816   0.98347   1.09713   1.10174
10        1.08366   1.13591   1.11006   1.10411
11        1.32624   1.00699   1.11782   1.11430
12        1.30994   1.00502   1.11077   1.10888
13        0.94311   1.17530   1.09373   1.09824
14        1.50445   0.90320   1.11147   1.11630

Source: Authors’ calculations based on data from the Dutch Land Registry

8.24 It can be seen that the new overall hedonic price index based on a cost of production approach to the hedonic functional form, P1, is very close to the Fisher hedonic imputation index PHIF. However, the price series for land, PL1, and the price series for quality adjusted structures, PS1, are not credible at all: there are large random fluctuations in both series. Notice that when the price of land spikes upwards, there is a corresponding dip in the price of structures. This is a clear sign of multicollinearity between the land and quality adjusted structures variables, which leads to highly unstable estimates for the prices of land and structures.

The Use of Linear Splines

8.25 There is a tendency for the price of land per square meter to decrease for large lots. In order to account for this, a linear spline model for the price of land will be used.(11) For lots that are less than 160 m2, it is assumed that the cost of land per square meter is $\beta_S^t$ in quarter t. For properties that have lot sizes between 160 m2 and 300 m2, it is assumed that the cost of land changes to a price of $\beta_M^t$ per additional square meter in quarter t. Finally, for plots above 300 m2, the marginal price of an additional unit of land is set equal to $\beta_L^t$ per square meter in quarter t. Let the sets of sales of small, medium and large plots be denoted by $S_S(t)$, $S_M(t)$ and $S_L(t)$, respectively, for t = 1,…,14. For sales n of properties that fall into the small land size group during quarter t, the hedonic regression model is given by (8.10); for the medium group by (8.11) and for the large land size group by (8.12):

$$p_n^t = \beta_S^t L_n^t + \gamma^t (1 - \delta A_n^t) S_n^t + \varepsilon_n^t; \quad n \in S_S(t); \tag{8.10}$$
$$p_n^t = \beta_S^t (160) + \beta_M^t (L_n^t - 160) + \gamma^t (1 - \delta A_n^t) S_n^t + \varepsilon_n^t; \quad n \in S_M(t); \tag{8.11}$$
$$p_n^t = \beta_S^t (160) + \beta_M^t (140) + \beta_L^t (L_n^t - 300) + \gamma^t (1 - \delta A_n^t) S_n^t + \varepsilon_n^t; \quad n \in S_L(t). \tag{8.12}$$

8.26 Estimating the model defined by (8.10)-(8.12) on the data for the town of “A”, the estimated decade depreciation rate was $\hat{\delta}$ = 0.1041 (0.00419). The R2 for this model was .8875, which is an increase over the previous no-splines model where the R2 was .8847. The log likelihood was -10614.2 (an increase of 28 from the previous model’s log likelihood). The first period parameter values for the three marginal prices for land were $\hat{\beta}_S^1$ = 281.4 (55.9), $\hat{\beta}_M^1$ = 380.4 (48.5) and $\hat{\beta}_L^1$ = 188.9 (27.5). In other words, in quarter 1, the marginal cost of land for small lots is estimated to be 281.4 Euros/m2, for medium sized lots the estimated marginal cost is 380.4 Euros/m2, and for large lots the estimated marginal cost is 188.9 Euros/m2. The first period parameter value for quality adjusted structures is $\hat{\gamma}^1$ = 978.1 Euros/m2 with a standard error of 82.3. The lowest t statistic for all of the 57 parameters was 3.3, so all of the estimated coefficients in this model are significantly different from zero.
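
To see what the spline implies for an individual property, the sketch below evaluates the predicted land value at the quarter 1 marginal prices reported above (281.4, 380.4 and 188.9 Euros/m2) with break points at 160 m2 and 300 m2. The function name `spline_land_value` and the three lot sizes are hypothetical.

```python
def spline_land_value(lot_area, beta_s, beta_m, beta_l, knot1=160.0, knot2=300.0):
    """Predicted land value under a linear spline with two break points."""
    small = min(lot_area, knot1)                              # first 160 m2
    medium = min(max(lot_area - knot1, 0.0), knot2 - knot1)   # next 140 m2
    large = max(lot_area - knot2, 0.0)                        # area above 300 m2
    return beta_s * small + beta_m * medium + beta_l * large


# Quarter 1 estimates from paragraph 8.26; lot sizes are illustrative.
v_small = spline_land_value(100.0, 281.4, 380.4, 188.9)    # 100 m2 lot
v_medium = spline_land_value(200.0, 281.4, 380.4, 188.9)   # 200 m2 lot
v_large = spline_land_value(400.0, 281.4, 380.4, 188.9)    # 400 m2 lot
```

The predicted value is continuous in lot size: only the price of an additional square meter changes at each break point.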

8.27 Once the parameters for the model have been estimated, then in each quarter t, the predicted value of land for small, medium and large lot sales, VLSt, VLMt and VLLt, respectively, can be calculated along with the associated quantities of land, LLSt, LLMt, and LLLt, as follows:

$$V_{LS}^t \equiv \sum_{n \in S_S(t)} \hat{\beta}_S^t L_n^t; \tag{8.13}$$
$$V_{LM}^t \equiv \sum_{n \in S_M(t)} \big[\hat{\beta}_S^t (160) + \hat{\beta}_M^t (L_n^t - 160)\big]; \tag{8.14}$$
$$V_{LL}^t \equiv \sum_{n \in S_L(t)} \big[\hat{\beta}_S^t (160) + \hat{\beta}_M^t (140) + \hat{\beta}_L^t (L_n^t - 300)\big]; \tag{8.15}$$
$$L_{LS}^t \equiv \sum_{n \in S_S(t)} L_n^t; \tag{8.16}$$
$$L_{LM}^t \equiv \sum_{n \in S_M(t)} L_n^t; \tag{8.17}$$
$$L_{LL}^t \equiv \sum_{n \in S_L(t)} L_n^t. \tag{8.18}$$

The corresponding average quarterly prices, PLSt, PLMt and PLLt, for the three types of lot are defined as the above values divided by the above quantities:

$$P_{LS}^t \equiv V_{LS}^t / L_{LS}^t; \qquad P_{LM}^t \equiv V_{LM}^t / L_{LM}^t; \qquad P_{LL}^t \equiv V_{LL}^t / L_{LL}^t. \tag{8.19}$$

8.28 The average land prices for small, medium and large lots defined by equation (8.19) and the corresponding quantities of land defined by (8.16)-(8.18) can be used to construct a chained Fisher land price index, which is denoted by PL2. This index is plotted in Figure 8.2 and listed in Table 8.2. As before, the estimated quarter t price per square meter of quality adjusted structures is $\hat{\gamma}^t$ and the quantity of constant quality structures is given by $S^{t*} \equiv \sum_{n=1}^{N(t)} (1-\hat{\delta}A_n^t)S_n^t$. The structures price and quantity series $\hat{\gamma}^t$ and $S^{t*}$ were combined with the three land price and quantity series to form a chained overall Fisher house price index P2, which is also graphed in Figure 8.2 and listed in Table 8.2. The constant quality structures price index PS2 (which is a normalization of the series $\hat{\gamma}^1,\dots,\hat{\gamma}^{14}$) is presented as well.
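
The average land prices and land quantities of paragraph 8.27 that feed the Fisher index can be sketched as follows, taking the predicted land value of each sale as given. The function name, the treatment of lots of exactly 160 m2 or 300 m2 as medium sized, and the sales data are assumptions made for illustration.

```python
def average_land_prices(sales, knot1=160.0, knot2=300.0):
    """sales: list of (lot_area, predicted_land_value) pairs for one quarter.

    Returns {group: (average price per m2, total land area)}.
    """
    totals = {"small": [0.0, 0.0], "medium": [0.0, 0.0], "large": [0.0, 0.0]}
    for area, value in sales:
        if area < knot1:
            group = "small"
        elif area <= knot2:     # boundary lots assigned to "medium" (assumption)
            group = "medium"
        else:
            group = "large"
        totals[group][0] += value    # predicted value of land
        totals[group][1] += area     # quantity of land
    return {g: (v / q if q > 0 else None, q) for g, (v, q) in totals.items()}


# Hypothetical sales in one quarter: (lot area in m2, predicted land value).
sales = [(100.0, 28000.0), (150.0, 42000.0), (200.0, 60000.0), (400.0, 117000.0)]
cells = average_land_prices(sales)
```

Each group then supplies one price and one quantity per quarter, which are the cell level inputs required by the Fisher formula.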

8.29 The overall house price index resulting from the spline model, P2, is fairly close to the Fisher hedonic imputation index PHIF. However, the spline model does not generate sensible series for the price of land, PL2, and the price of structures, PS2: both series are extremely volatile but in opposite directions. As was the case with the previous cost of production model, the present model suffers from a multicollinearity problem.

8.30 Comparing Figures 8.1 and 8.2, it can be seen that in Figure 8.1 the price index for land is above the overall price index for the most part and the price index for structures is below the overall index while in Figure 8.2, this pattern reverses. This instability is again an indication of multicollinearity. In the following section an attempt to cure this problem will be made by imposing monotonicity restrictions on the prices of the constant quality structures.

Figure 8.2. The Price of Land (PL2), the Price of Structures (PS2), the Overall Price Index Using Splines on Land (P2) and the Fisher Hedonic Imputation Price Index

Source: Authors’ calculations based on data from the Dutch Land Registry

Table 8.2. The Price of Land (PL2), the Price of Structures (PS2), the Overall Price Index Using Splines on Land (P2) and the Fisher Hedonic Imputation Price Index

Quarter   PL2       PS2       P2        PHIF
1         1.00000   1.00000   1.00000   1.00000
2         1.10534   0.99589   1.04137   1.04356
3         1.02008   1.09803   1.06465   1.06746
4         1.05082   1.02542   1.03608   1.03834
5         0.99379   1.08078   1.04294   1.04794
6         0.74826   1.31122   1.06982   1.07553
7         0.93484   1.20719   1.08912   1.09460
8         0.77202   1.26718   1.05345   1.06158
9         1.19966   1.01724   1.09425   1.10174
10        0.77139   1.34813   1.09472   1.10411
11        0.92119   1.24884   1.10596   1.11430
12        0.97695   1.19188   1.09731   1.10888
13        0.84055   1.27531   1.08811   1.09824
14        1.29261   0.97875   1.10613   1.11630

Source: Authors’ calculations based on data from the Dutch Land Registry

An Approach Based on Monotonicity Restrictions

8.31 It is likely that Dutch construction costs did not fall significantly during the sample period. (12) If this is indeed the case, monotonicity restrictions on the quarterly prices of quality adjusted structures, $\gamma^1,\gamma^2,\dots,\gamma^{14}$, can be imposed on the hedonic regression model (8.10)-(8.12) by replacing the constant quality quarter t structures price parameters by the following sequence of parameters: $\gamma^1$, $\gamma^1+(\phi_2)^2$, $\gamma^1+(\phi_2)^2+(\phi_3)^2$, …, $\gamma^1+(\phi_2)^2+(\phi_3)^2+\dots+(\phi_{14})^2$, where $\phi_2,\phi_3,\dots,\phi_{14}$ are scalar parameters. (13) For each quarter t starting at quarter 2, the price of a square meter of constant quality structures $\gamma^t$ is thus equal to the previous period’s price $\gamma^{t-1}$ plus the square of a parameter $\phi_t$, i.e., $(\phi_t)^2$, so the price can never fall. Now substitute this reparameterization for the structures price parameters $\gamma^t$ in (8.10)-(8.12) in order to obtain a linear spline model for the price of land with monotonicity restrictions on the price of constant quality structures.
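
The reparameterization can be sketched in a few lines. Because each increment is a square, the implied structures prices are non-decreasing whatever values the unrestricted parameters take, including negative ones. The function name and the parameter values are hypothetical.

```python
def structures_prices(gamma_1, phis):
    """Build gamma^1, gamma^2, ... from gamma^1 and the parameters phi_2, phi_3, ..."""
    prices = [gamma_1]
    for phi in phis:
        prices.append(prices[-1] + phi ** 2)   # a squared increment cannot be negative
    return prices


# Hypothetical values: a zero phi leaves the price unchanged between quarters.
gammas = structures_prices(100.0, [0.0, 3.0, -2.0])
```

An estimate of exactly zero for a parameter means that the restriction binds in that quarter, i.e., the unrestricted model would have produced a fall in the structures price.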

8.32 Implementing this new model using the data for the Dutch town of “A”, the estimated decade depreciation rate was $\hat{\delta}$ = 0.1031 (0.00386). The R2 for this model was .8859, a drop from the previous unrestricted spline model where the R2 was .8875. The log likelihood was -10630.5, a decrease of 16.3 from the previous unrestricted model. Eight of the 13 new parameters $\phi_t$ are zero in this monotonicity restricted hedonic regression. The first period parameter values for the three marginal land prices are $\hat{\beta}_S^1$ = 278.6 (37.2), $\hat{\beta}_M^1$ = 380.3 (41.0) and $\hat{\beta}_L^1$ = 188.0; these values are almost identical to the corresponding estimates in the previous unrestricted model. The first period parameter estimate for quality adjusted structures is $\hat{\gamma}^1$ = 980.5 (49.9) Euros/m2, which is little changed from the previous unrestricted estimate of 978.1 Euros/m2.

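8.33 Once the parameters for the model have been estimated, convert the estimated $\phi_t$ parameters into estimated $\gamma^t$ parameters using the following recursive equations:

$$\hat{\gamma}^t \equiv \hat{\gamma}^{t-1} + (\hat{\phi}_t)^2; \quad t = 2,\dots,14.$$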

Now use equations (8.13)-(8.19) in the previous section in order to construct a chained Fisher index of land prices, which is denoted by PL3. This index is plotted in Figure 8.3 and listed in Table 8.3. As in the previous two models, the estimated period t price for a square meter of quality adjusted structures is $\hat{\gamma}^t$ and the corresponding quantity of constant quality structures is $S^{t*} \equiv \sum_{n=1}^{N(t)} (1-\hat{\delta}A_n^t)S_n^t$. The price and quantity series $\hat{\gamma}^t$ and $S^{t*}$ were combined with the three land price and quantity series to construct a chained overall Fisher house price index P3, which is also graphed in Figure 8.3 and listed in Table 8.3. The constant quality structures price index PS3 (a normalization of the series $\hat{\gamma}^1,\dots,\hat{\gamma}^{14}$) may be found in Figure 8.3 and Table 8.3 as well.

Figure 8.3. The Price of Land (PL3), the Price of Quality Adjusted Structures (PS3), the Overall House Price Index with Monotonicity Restrictions on Structures (P3) and the Overall House Price Index Using Splines on Land (P2)

Source: Authors’ calculations based on data from the Dutch Land Registry

Table 8.3. The Price of Land (PL3), the Price of Quality Adjusted Structures (PS3), the Overall House Price Index with Monotonicity Restrictions on Structures (P3) and the Overall House Price Index Using Splines on Land (P2)

Quarter   PL3       PS3       P3        P2
1         1.00000   1.00000   1.00000   1.00000
2         1.10047   1.00000   1.04148   1.04137
3         1.07431   1.05849   1.06457   1.06465
4         1.00752   1.05849   1.03627   1.03608
5         0.99388   1.08078   1.04316   1.04294
6         0.89560   1.20300   1.07168   1.06982
7         0.93814   1.20300   1.08961   1.08912
8         0.85490   1.20300   1.05408   1.05345
9         0.95097   1.20300   1.09503   1.09425
10        0.94424   1.21031   1.09625   1.09472
11        0.96514   1.21031   1.10552   1.10596
12        0.94596   1.21031   1.09734   1.09731
13        0.92252   1.21031   1.08752   1.08811
14        0.96262   1.21031   1.10427   1.10613

Source: Authors’ calculations based on data from the Dutch Land Registry

8.34 The new overall house price index P3 that imposed monotonicity on the quality adjusted price of structures in Figure 8.3 can hardly be distinguished from the previous overall house price index P2, which was based on a similar hedonic regression model except that the movements in the price of structures were not restricted. The fluctuations in the price of land and quality adjusted structures are no longer violent.

8.35 While the above results seem “reasonable”, the early rapid rise in the price of structures and the slow growth in structures prices from quarter 6 to 14 are not very likely. In the following section, one more method for extracting separate structures and land components out of real estate sales data will therefore be tried.

An Approach Based on Exogenous Information on the Price of Structures

8.36 Many countries have new construction price indices available on a quarterly basis. This is the case for the Netherlands. (14) If one is willing to make the assumption that construction costs for houses have the same rate of growth over the study period across all cities in the Netherlands, the information on construction costs can be used to eliminate the multicollinearity problem encountered in the previous sections.

8.37 Recall equations (8.10)-(8.12) above. These are the estimating equations for the unrestricted hedonic regression model based on costs of production. In the present section, the constant quality price parameters for the structures, the γt for t = 2,…,14 in (8.10)-(8.12), are replaced by the following numbers, which involve only the single unknown parameter γ1: (15)

$$\gamma^t = \gamma^1 \mu^t; \quad t = 2,\dots,14 \tag{8.20}$$

where μt is the statistical agency’s construction cost price index for the location and the type of house under consideration, normalized to equal 1 in quarter 1. The new hedonic regression model is again defined by equations (8.10)-(8.12) except that the 14 unknown γt parameters are now defined by (8.20), so that only γ1 needs to be estimated. The number of parameters to be estimated in this new restricted model is 44 whereas the old number was 57.
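
The effect of the restriction on the structures price series and on the size of the model can be sketched as follows. The function names are hypothetical, as are the index values in `mu`; the parameter count assumes three land spline segments per quarter plus one common depreciation rate, which reproduces the 57 and 44 parameters mentioned above.

```python
def structures_price_series(gamma_1, mu):
    """Structures prices implied by gamma^t = gamma^1 * mu^t, with mu^1 = 1."""
    return [gamma_1 * m for m in mu]


def parameter_count(n_quarters, n_land_segments=3, exogenous_structures=False):
    """Number of unknown parameters in the spline model."""
    land = n_land_segments * n_quarters                  # beta_S, beta_M, beta_L per quarter
    structures = 1 if exogenous_structures else n_quarters
    return land + structures + 1                         # plus the depreciation rate


mu = [1.0, 0.99, 1.02]                        # hypothetical construction cost index
gammas = structures_price_series(1000.0, mu)  # hypothetical first period price

unrestricted = parameter_count(14)
restricted = parameter_count(14, exogenous_structures=True)
```

With the structures price path fixed up to its level, the regression has to attribute all remaining quarter to quarter price movement to land, which is what removes the trade-off between the two sets of coefficients.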

8.38 Using the data for the town of “A”, the estimated decade depreciation rate was $\hat{\delta}$ = 0.1028 (0.00433). The R2 for this model was .8849, a small drop from the previous restricted spline model, where the R2 was .8859, and a larger drop from the unrestricted spline model of paragraph 8.26, where the R2 was .8875. The log likelihood was -10640.1, a decrease of about 10 from the monotonicity restricted model. The first period parameter estimates for the three marginal prices for land are now $\hat{\beta}_S^1$ = 215.4 (30.0), $\hat{\beta}_M^1$ = 362.6 (46.7) and $\hat{\beta}_L^1$ = 176.4 (28.4). They differ slightly from the previous figures. The first period parameter estimate for the quality adjusted structures is $\hat{\gamma}^1$ = 1085.9 (22.9) Euros/m2, which is significantly higher than the previous (monotonicity restricted) estimate of 980.5 Euros/m2. So the imposition of a (nationwide) growth rate on the change in the price of quality adjusted structures has had some effect on the estimates for the levels of land and structures prices.

8.39 As usual, equations (8.13)-(8.19) were used in order to construct a chained Fisher index of land prices, which is denoted by PL4. This index is plotted in Figure 8.4 and listed in Table 8.4. As for the previous three models, the estimated price in quarter t for a square meter of quality adjusted structures is $\hat{\gamma}^t$ (which now equals $\hat{\gamma}^1\mu^t$) and the corresponding quantity is $S^{t*} \equiv \sum_{n=1}^{N(t)} (1-\hat{\delta}A_n^t)S_n^t$. These structures price and quantity series were again combined with the three land price and quantity series to form a chained overall Fisher house price index P4, which is graphed in Figure 8.4 and listed in Table 8.4. The constant quality structures price index PS4 (a normalization of the series $\hat{\gamma}^1,\dots,\hat{\gamma}^{14}$) is also presented.

8.40 A comparison of Figures 8.3 and 8.4 shows that the imposition of the national growth rates for new dwelling construction costs has changed the nature of the land and structures price indices: in Figure 8.3, the price series for land lies below the overall house price series for most of the sample period while in Figure 8.4, the pattern is reversed: the price series for land lies above the overall house price series for most of the sample period (and vice versa for the price of structures). But which model is best? Although the previous model can be preferred on statistical grounds because the log likelihood is somewhat higher, we would nevertheless prefer the present model that uses exogenous information on structures prices because it yields a more plausible pattern of price changes for land and structures.

Figure 8.4. The Price of Land (PL4), the Price of Quality Adjusted Structures (PS4) and the Overall House Price Index using Exogenous Information on the Price of Structures (P4)

Source: Authors’ calculations based on data from the Dutch Land Registry

Table 8.4. The Price of Land (PL4), the Price of Quality Adjusted Structures (PS4) and the Overall House Price Index using Exogenous Information on the Price of Structures (P4)
Quarter  PL4      PS4      P4
1        1.00000  1.00000  1.00000
2        1.13864  0.99291  1.04373
3        1.16526  1.01518  1.06752
4        1.04214  1.03947  1.03889
5        1.11893  1.00709  1.04628
6        1.18183  1.01721  1.07541
7        1.23501  1.01215  1.09121
8        1.13257  1.01518  1.05601
9        1.21204  1.03441  1.09701
10       1.19545  1.04453  1.09727
11       1.17747  1.06883  1.10564
12       1.11588  1.09211  1.09815
13       1.05070  1.11336  1.08863
14       1.09648  1.11336  1.10486
Source: Authors’ calculations based on data from the Dutch Land Registry

Choosing the “Best” Overall Index

8.41 This section is concluded by listing and charting our four “best” overall indices: the chained stratified sample Fisher index PFCH constructed in Chapter 4, the chained hedonic imputation Fisher index PHIF studied in Chapter 5, the index P3 that resulted from the cost based hedonic regression model with monotonicity restrictions constructed earlier, and the index P4 that resulted from the cost based hedonic regression model using exogenous information on the price of structures studied in the present section. As can be seen from Figure 8.5, all four indices paint much the same picture. Note that P3 and P4 are virtually identical.

Figure 8.5. House Price Indices Using Exogenous Information (P4) and Using Monotonicity Restrictions (P3), the Chained Fisher Hedonic Imputation Index and the Chained Fisher Stratified Sample Index

Source: Authors’ calculations based on data from the Dutch Land Registry

8.42 All things considered, the hedonic imputation index PHIF is our preferred index since it has fewer restrictions than the other indices and seems closest to a matched model index in spirit, followed by the two cost of production hedonic indices P4 and P3, followed by the stratified sample index PFCH. The latter likely suffers from some unit value bias. Hedonic indices can be biased too (if important explanatory variables are omitted or if an “incorrect” functional form is chosen), but in general we would prefer hedonic regression methods over stratification methods. If separate land and structures indices are required, we are in favour of the cost based hedonic regression model that uses exogenous information on the price of structures.

Table 8.5. House Price Indices Using Exogenous Information (P4) and Using Monotonicity Restrictions (P3), the Chained Fisher Hedonic Imputation Index and the Chained Fisher Stratified Sample Index
Quarter  P4       P3       PHIF     PFCH
1        1.00000  1.00000  1.00000  1.00000
2        1.04373  1.04148  1.04356  1.02396
3        1.06752  1.06457  1.06746  1.07840
4        1.03889  1.03627  1.03834  1.04081
5        1.04628  1.04316  1.04794  1.04083
6        1.07541  1.07168  1.07553  1.05754
7        1.09121  1.08961  1.09460  1.07340
8        1.05601  1.05408  1.06158  1.06706
9        1.09701  1.09503  1.10174  1.08950
10       1.09727  1.09625  1.10411  1.11476
11       1.10564  1.10552  1.11400  1.12471
12       1.09815  1.09734  1.10888  1.10483
13       1.08863  1.08752  1.09824  1.10450
14       1.10486  1.10427  1.11630  1.11189
Source: Authors’ calculations based on data from the Dutch Land Registry

Rolling Window Hedonic Regressions

8.43 A problem with the hedonic regression model discussed in the previous section (and all other hedonic models discussed in this Handbook except hedonic imputation models) was mentioned in Chapter 5: when more data are added, the indices generated by the model change. This feature of these regression based methods makes these models unsatisfactory for statistical agency use, where users expect the official numbers to remain unchanged as time passes. Users may tolerate a few revisions to recent data, but typically they would not like all the numbers to be revised back into the indefinite past as new data become available. A simple solution to this problem is available, however: the so-called rolling window approach. This approach will be outlined in more detail and applied to the cost based hedonic regression model that uses exogenous information on the price of structures.

8.44 First, one chooses a “suitable” number of time periods (equal to or greater than two) where it is thought that the hedonic model yields “reasonable” results; this will be the window length (say M periods) for the sequence of regression models which will be estimated. Secondly, an initial regression model is estimated and the appropriate indices are calculated using data pertaining to the first M periods in the data set. Next, a second regression model is estimated where the data consist of the initial data less the data for period 1 but adding the data for period M+1. Appropriate price indices are calculated for this new regression model but only the rate of increase of the index going from period M to M+1 is used to update the previous sequence of M index values. This procedure is continued with each successive regression dropping the data of the previous earliest period and adding the data for the next period, with one new update factor being added with each regression. If the window length is a year, then this procedure is called a rolling year hedonic regression model; for a general window length, it is called a rolling window hedonic regression model. (16)
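
The updating rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure as implemented for this chapter: `estimate` is a hypothetical callback that stands in for fitting the hedonic model on one window and returning an index value for each period in that window, and the toy data are made up.

```python
from typing import Callable, List


def rolling_window_index(estimate: Callable[[int, int], List[float]],
                         n_periods: int, window: int) -> List[float]:
    """Splice window-specific index series into one series that is never revised.

    `estimate(start, stop)` is a hypothetical callback: it fits the model on
    periods start..stop-1 (0-based) and returns one index value per period.
    """
    if not 2 <= window <= n_periods:
        raise ValueError("window must lie between 2 and n_periods")
    first = estimate(0, window)
    index = [value / first[0] for value in first]  # first M periods, base = 1
    for start in range(1, n_periods - window + 1):
        w = estimate(start, start + window)
        # Only the movement between the last two periods of the window is used.
        index.append(index[-1] * w[-1] / w[-2])
    return index


# Toy stand-in for a regression: the index is the mean price relative to the
# first period of the window. These mean prices are invented for illustration.
mean_price = [100.0, 104.0, 103.0, 110.0, 115.0]
calls = []


def toy_estimate(start: int, stop: int) -> List[float]:
    calls.append((start, stop))
    return [p / mean_price[start] for p in mean_price[start:stop]]


spliced = rolling_window_index(toy_estimate, n_periods=5, window=3)
print([round(v, 4) for v in spliced])
```

With 14 periods and a window of 9, the loop runs five times after the initial fit, so six regressions are estimated in total. Index values already published for earlier periods are never recomputed; each new regression contributes exactly one update factor.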

8.45 Using the data for the town of “A”, the rolling window procedure was applied with a window length of 9 quarters. The hedonic regression model defined by equations (8.10)-(8.12) and (8.20) was initially estimated for the first 9 quarters. The resulting price indices for land and for constant quality structures and the overall index are denoted by PRWL, PRWS and PRW and are listed in the first 9 rows of Table 8.6. (17) Next, a regression covering quarters 2-10 was run and the resulting land, structures and overall price indices were used to update the initial indices; i.e., the price of land in quarter 10 of Table 8.6 is equal to the price of land in quarter 9 times the price relative for land (quarter 10 land index divided by the quarter 9 land index) obtained from the regression covering quarters 2-10, etc. Similar updating was done for the next 4 quarters using regressions covering quarters 3-11, 4-12, 5-13 and 6-14.

8.46 The rolling window indices can be compared to the corresponding indices based on the data pertaining to all 14 quarters constructed in the previous section by looking at Table 8.6. Recall that the estimated depreciation rate and the estimated quarter 1 price of quality adjusted structures for the last model were δ^ = 0.1028 and γ^1 = 1085.9, respectively. If by chance the 6 rolling window hedonic regressions generated the exact same estimates for δ and γ, then the indices resulting from the rolling window regressions would coincide with the indices PL4, PS4 and P4. The estimates for δ generated by the 6 rolling window regressions are 0.10124, 0.10805, 0.11601, 0.11103, 0.10857 and 0.10592. The estimates for γ1 generated by the 6 rolling window regressions are 1089.6, 1103.9, 1088.1, 1101.0, 1123.5 and 1100.9. While these estimates are not identical to the corresponding estimates of 0.1028 and 1085.9 for P4, they are fairly close. So we can expect the rolling window indices to be close to their counterparts for the last model in the previous section. The R2 values for the 6 rolling window regressions were .8803, .8813, .8825, .8852, .8811 and .8892.

8.47 The rolling window series for the price of quality adjusted structures, PRWS, is not listed in Table 8.6 since it is identical to the series PS4.(18) The rolling window price series for land, PRWL, is extremely close to its counterpart PL4, and the overall rolling window price series for detached dwellings in the town of “A”, PRW, is also close to its counterpart P4. The corresponding series in Table 8.6 are so close to each other that we decided not to provide a chart.

Table 8.6. The Price of Land (PL4), the Price of Quality Adjusted Structures (PS4), the Overall House Price Index using Exogenous Information on the Price of Structures (P4) and their Rolling Window Counterparts (PRWL and PRW)
Quarter  PRWL     PL4      PRW      P4       PS4
1        1.00000  1.00000  1.00000  1.00000  1.00000
2        1.14073  1.13864  1.04381  1.04373  0.99291
3        1.16756  1.16526  1.06766  1.06752  1.01518
4        1.04280  1.04214  1.03909  1.03889  1.03947
5        1.12055  1.11893  1.04635  1.04628  1.00709
6        1.18392  1.18183  1.07542  1.07541  1.01721
7        1.23783  1.23501  1.09123  1.09121  1.01215
8        1.13408  1.13257  1.05602  1.05601  1.01518
9        1.21417  1.21204  1.09698  1.09701  1.03441
10       1.19772  1.19545  1.09738  1.09727  1.04453
11       1.18523  1.17747  1.10718  1.10564  1.06882
12       1.11889  1.11588  1.09779  1.09815  1.09201
13       1.05191  1.05070  1.08893  1.08863  1.11335
14       1.09605  1.09648  1.10436  1.10486  1.11335
Source: Authors’ calculations based on data from the Dutch Land Registry

8.48 Using the data for the town of “A”, rolling window hedonic regressions gave much the same results as a hedonic regression that covers the whole sample period. This supports our view that the rolling window approach can be used by statistical agencies to compile an RPPI based on hedonic regressions, including a decomposition into land and structures components.
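
How close the rolling window indices are to their full sample counterparts can be quantified directly from Table 8.6. The short Python sketch below copies the PRWL, PL4, PRW and P4 columns of that table and reports the quarter with the largest absolute gap for each pair; the helper name `largest_gap` is ours.

```python
# Columns of Table 8.6, quarters 1 to 14.
prwl = [1.00000, 1.14073, 1.16756, 1.04280, 1.12055, 1.18392, 1.23783,
        1.13408, 1.21417, 1.19772, 1.18523, 1.11889, 1.05191, 1.09605]
pl4 = [1.00000, 1.13864, 1.16526, 1.04214, 1.11893, 1.18183, 1.23501,
       1.13257, 1.21204, 1.19545, 1.17747, 1.11588, 1.05070, 1.09648]
prw = [1.00000, 1.04381, 1.06766, 1.03909, 1.04635, 1.07542, 1.09123,
       1.05602, 1.09698, 1.09738, 1.10718, 1.09779, 1.08893, 1.10436]
p4 = [1.00000, 1.04373, 1.06752, 1.03889, 1.04628, 1.07541, 1.09121,
      1.05601, 1.09701, 1.09727, 1.10564, 1.09815, 1.08863, 1.10486]


def largest_gap(a, b):
    """Return (quarter, absolute difference) where the two series differ most."""
    gaps = [abs(x - y) for x, y in zip(a, b)]
    worst = max(range(len(gaps)), key=gaps.__getitem__)
    return worst + 1, round(gaps[worst], 5)


print("land:   ", largest_gap(prwl, pl4))
print("overall:", largest_gap(prw, p4))
```

On these figures the largest gap occurs in quarter 11 for both pairs: about 0.0078 index points for land and about 0.0015 for the overall index.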

The Construction of Price Indices for the Stock of Dwelling Units

8.49 This section shows how hedonic regression models can be used to form an approximate RPPI for the stock of dwelling units. We will first look at the hedonic imputation model discussed in Chapter 5 and compare the resulting index with an approximate stock based index using the stratification approach.

The Hedonic Imputation Model

8.50 Recall that the hedonic imputation model was defined by equations (5.25), where Lnt, Snt and Ant denoted, respectively, the land area, structure area, and age (in decades) of property n sold in period t. To form a price index for the stock of detached houses in the town of “A”, it would in principle be necessary to know L, S and A for all detached houses in “A” during some base period. This information is not available to us, but we can treat the total number of detached houses sold over the sample period as an approximation to the stock of this type.(19) In our data set there were N(1) + N(2) +…+ N(14) = 2289 such transactions. (20)

8.51 The estimated parameters for land size, structure size and depreciation in quarter t are denoted by β^t, γ^t and δ^t ; α^t denotes the constant term. Our approximation to the total value of the housing stock for quarter t, Vt, is defined as

That is, Vt is (approximated by) the imputed value of all houses traded during the 14 quarters in our sample, where the regression coefficients from the quarter t hedonic imputation model given by (5.25) serve as weights for the characteristics of each house. Dividing the Vt series by its value for quarter 1, V1, gives our first estimated stock price index, PStock1, for the town of “A”.(21) This is a form of a Lowe index; see the CPI Manual (2004) for the properties of Lowe indices. In Table 8.7 and Figure 8.6 this price index for the stock of houses is compared with the corresponding sales based Fisher hedonic imputation price index, PHIF.
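
The construction of this stock index can be sketched in a few lines of Python. The sketch rests on an assumption that should be checked against Chapter 5: it takes the quarter t imputed price of a house to be α^t + β^t L + γ^t (1 − δ^t A) S, which is our reading of the model (5.25) from the parameter list above. The class names, the two-house stock and all coefficient values are hypothetical.

```python
from typing import List, NamedTuple


class House(NamedTuple):
    land: float       # L, lot area in square meters
    structure: float  # S, floor space in square meters
    age: float        # A, age in decades


class Coefficients(NamedTuple):
    alpha: float  # constant term
    beta: float   # land price per square meter
    gamma: float  # price per square meter of a new structure
    delta: float  # decade depreciation rate


def stock_value(c: Coefficients, stock: List[House]) -> float:
    """Imputed value of every house in the fixed stock at one quarter's coefficients.

    ASSUMPTION: price = alpha + beta*L + gamma*(1 - delta*A)*S (our reading of 5.25).
    """
    return sum(c.alpha + c.beta * h.land
               + c.gamma * (1 - c.delta * h.age) * h.structure
               for h in stock)


def stock_index(coefs: List[Coefficients], stock: List[House]) -> List[float]:
    """Lowe type index: value of the same stock in each quarter relative to quarter 1."""
    values = [stock_value(c, stock) for c in coefs]
    return [v / values[0] for v in values]


# Hypothetical two-house stock and two quarters of hypothetical coefficients.
stock = [House(100, 120, 1.0), House(200, 150, 0.0)]
coefs = [Coefficients(10000, 200, 1000, 0.1),
         Coefficients(10000, 220, 1000, 0.1)]
print(stock_index(coefs, stock))
```

Because the same list of houses is revalued in every quarter, the quantity side of the index is held fixed and only the estimated coefficients move the index.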

Figure 8.6. Approximate Stock Price Indices Based on Hedonic Imputation (PStock1) and Stratification (PStock2) and the Fisher Hedonic Imputation Sales Price Index

Source: Authors’ calculations based on data from the Dutch Land Registry

Table 8.7. Approximate Stock Price Indices Based on Hedonic Imputation (PStock1) and Stratification (PStock2) and the Fisher Hedonic Imputation Sales Price Index
Quarter  PStock1  PStock2  PHIF
1        1.00000  1.00000  1.00000
2        1.04791  1.02712  1.04356
3        1.07255  1.07986  1.06746
4        1.04131  1.03257  1.03834
5        1.05040  1.05290  1.04794
6        1.07549  1.05934  1.07553
7        1.09594  1.07712  1.09460
8        1.06316  1.07172  1.06158
9        1.10137  1.08359  1.10174
10       1.10708  1.11482  1.10411
11       1.11289  1.12616  1.11430
12       1.10462  1.11291  1.10888
13       1.09278  1.10764  1.09824
14       1.11370  1.10686  1.11630
Source: Authors’ calculations based on data from the Dutch Land Registry

8.52 An additional approximate stock price index based on stratification, PStock2, is also graphed in Figure 8.6 and listed in Table 8.7. This index uses the unit value prices for the nonempty cells in the stratification scheme in each quarter, as explained in Chapter 4, and uses the imputed prices based on the hedonic imputation regressions from Chapter 5 for the empty cells in each quarter. The quantity vector used for PStock2 is the (sample) total quantity vector by cell, which makes PStock2 an alternative Lowe price index. It can be seen that while PStock2 has the same general trend as PStock1 and PHIF, it differs substantially from these hedonic imputation indices during several quarters. These differences are due to the existence of some unit value bias in the stratification indices. Thus, although stratification indices can be constructed for the stock of dwelling units of a certain type and location (with the help of hedonic imputation for empty cells), it appears that the resulting stock indices will not be as accurate as indices that are entirely based on the use of hedonic regressions. (22)

The Use of Exogenous Information on the Price of Structures

8.53 The same kind of construction of an approximate stock price index can be applied to the other hedonic regression models discussed in this chapter. Here we will show how this works for the cost based model that used exogenous information on the price of structures. This model was defined by equations (8.10)-(8.12) and (8.20). Recall that the sets of period t sales of small, medium and large lot houses were denoted by SS(t), SM(t) and SL(t), respectively; the total number of sales in period t was denoted by N(t) for t = 1,…,14. The estimated model parameters are δ^, γ^1 and β^St, β^Mt and β^Lt for t = 1,…,14. The estimated period t values of all small, medium and large lot houses traded over the 14 quarters, VLSt, VLMt and VLLt, respectively, are defined by (8.22)-(8.24):

The estimated period t value of quality adjusted structures, VSt, is defined by

where all structures traded during the 14 quarters are included.

8.54 The quantities that correspond to the above period t valuations of the three land stocks and the stock of structures are defined as follows: (23)

8.55 Approximate stock prices, PLSt, PLMt, PLLt, and PSt, that correspond to the values and quantities defined by (8.22)-(8.29), can be computed in the usual way:

Using the above prices and quantities, an approximate stock index of land prices, PLStock, is formed by aggregating the three types of land, and an approximate constant quality stock price index for structures, PSStock, is simply formed by normalizing the series PSt. The approximate overall stock index, PStock, is obtained by aggregating the three types of land with the constant quality structures (or, equivalently, by aggregating PLStock and PSStock). Since the quantities are constant over all 14 quarters, the Laspeyres, Paasche and Fisher price indices are all equal. (24) The stock price indices PLStock, PSStock and PStock are charted in Figure 8.7 and listed in Table 8.8. For comparison purposes, the corresponding price indices based on sales of properties for the model presented previously, PL4, PS4 and P4, are also listed in Table 8.8. As can be seen from Table 8.8, the approximate stock price index for structures PSStock coincides with the sales based price index for constant quality structures PS4, so PS4 is not charted in Figure 8.7.
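
The equality of the Laspeyres, Paasche and Fisher indices under constant quantities is easy to verify numerically. The Python sketch below uses the four stock quantities reported in the notes to this chapter (small lots, medium size lots, large lots and structures); the two price vectors are hypothetical and chosen only to illustrate the identity.

```python
import math


def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))


def laspeyres(p0, pt, q0):
    return dot(pt, q0) / dot(p0, q0)


def paasche(p0, pt, qt):
    return dot(pt, qt) / dot(p0, qt)


def fisher(p0, pt, q0, qt):
    return math.sqrt(laspeyres(p0, pt, q0) * paasche(p0, pt, qt))


# Stock quantities from the chapter notes: small, medium and large lot land,
# and quality adjusted structures. They are the same in every quarter.
q = [77455, 258550, 253590, 238476]

# HYPOTHETICAL prices for a base quarter and a comparison quarter.
p_base = [200.0, 350.0, 180.0, 1000.0]
p_now = [230.0, 400.0, 190.0, 990.0]

print(laspeyres(p_base, p_now, q),
      paasche(p_base, p_now, q),
      fisher(p_base, p_now, q, q))
```

When the base and comparison quantity vectors coincide, the Laspeyres and Paasche formulas reduce to the same ratio of two values of a fixed basket, so their geometric mean, the Fisher index, equals that ratio as well.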

Figure 8.7. Approximate Price Indices for the Stock of Houses (PStock), the Stock of Land (PLStock), the Stock of Structures (PSStock) and the Corresponding Sales Indices (PL4 and P4)

Source: Authors’ calculations based on data from the Dutch Land Registry

Table 8.8. Approximate Price Indices for the Stock of Houses (PStock), the Stock of Land (PLStock), the Stock of Structures (PSStock) and the Corresponding Sales Indices (PL4 and P4)
Quarter  PStock   P4       PLStock  PL4      PSStock  PS4
1        1.00000  1.00000  1.00000  1.00000  1.00000  1.00000
2        1.04331  1.04373  1.13279  1.13864  0.99291  0.99291
3        1.06798  1.06752  1.16171  1.16526  1.01518  1.01518
4        1.04042  1.03889  1.04209  1.04214  1.03947  1.03947
5        1.04767  1.04628  1.11973  1.11893  1.00709  1.00709
6        1.07540  1.07541  1.17873  1.18183  1.01721  1.01721
7        1.09192  1.09121  1.23357  1.23501  1.01215  1.01215
8        1.05763  1.05601  1.13299  1.13257  1.01518  1.01518
9        1.09829  1.09701  1.21171  1.21204  1.03441  1.03441
10       1.10065  1.09727  1.20029  1.19545  1.04453  1.04453
11       1.10592  1.10564  1.17178  1.17747  1.06883  1.06883
12       1.10038  1.09815  1.11507  1.11588  1.09211  1.09211
13       1.08934  1.08863  1.04668  1.05070  1.11336  1.11336
14       1.10777  1.10486  1.09784  1.09648  1.11336  1.11336
Source: Authors’ calculations based on data from the Dutch Land Registry

8.56 The overall approximate price index for the total stock of detached houses in the town of “A” (PStock) can hardly be distinguished from the corresponding overall sales price index (P4) in Figure 8.7. Similarly, the approximate price index for the stock of land in “A” (PLStock) can barely be distinguished in Figure 8.7 from the corresponding sales price index for land (PL4). Nevertheless, there are small differences between the stock and sales indices, as Table 8.8 shows.

8.57 Our conclusion is that the hedonic regression models for the sales of houses can readily be adapted to compute Lowe type price indices for the stock of houses. There do not appear to be major differences between the two index types when using our data set, but this result may not hold for other data sets.

McMillen (2003) discusses a Cobb Douglas demand side model. On identification issues in hedonic regression models, see Rosen (1974).

Following Muth (1971), Thorsnes (1997; 101) has a related cost of production model. He assumed that the value of the property under consideration in period t, pt, is equal to the price of housing output in period t, ρt, times the quantity of housing output H(L,K), where the production function H is a CES function. Thus Thorsnes assumed that $p_t = \rho_t H(L,K) = \rho_t[\alpha L^{\sigma} + \beta K^{\sigma}]^{1/\sigma}$, where ρt, σ, α and β are parameters, L is the lot size of the property and K is the amount of structures capital (in constant quality units). Our problem with this model is that there is only one independent time parameter ρt whereas our model has two, βt and γt for each t, which allow the price of land and structures to vary freely between periods.

See for example Meese and Wallace (1991; 320) who found that the age variable in their hedonic regression model had the wrong sign.

The model defined by (8.2) can be converted into a linear regression model.

The model becomes a modified adjacent period hedonic regression model for T = 2.

This generalization was suggested by Diewert (2007).

Separate hedonic regressions may also be run for different types of property as well as for different locations. However, cost considerations may mean that a comprehensive system of regressions covering all properties in the country cannot be implemented so that there will only be a sample of representative hedonic regressions. The aggregation issues in the sampling case are too complex to be considered here; the exact details for constructing a national index would depend on the nature of the sampling design.

As was the case for stratification methods, fixed base or chained indices could be constructed. Rolling window hedonic regressions could also be run. The rolling window approach will be explained later.

Model (8.9) is similar in structure to the hedonic imputation model described earlier except that the present model is more parsimonious; there is only one depreciation rate, as opposed to 14 depreciation rates in the imputation model defined by equations (5.25), and there is no constant term. The important factor in both models is that the prices of land and quality adjusted structures are allowed to vary independently across time periods.

This approach follows that of Diewert, de Haan and Hendriks (2010) (2011). The use of linear splines to model nonlinearities in the price of land as a function of lot size is due to Francke (2008).

Some direct evidence on this assertion will be presented in the following section.

This method for imposing monotonicity restrictions was used by Diewert, de Haan and Hendriks (2010) with the difference that they imposed monotonicity on both structures and land prices, whereas here, monotonicity restrictions are imposed on structures prices only.

From the Statistics Netherlands (2010) online source, Statline, the following series was downloaded for the New Dwellings Output Price Index for the 14 quarters in our sample of house sales: 98.8, 98.1, 100.3, 102.7, 99.5, 100.5, 100.0, 100.3, 102.2, 103.2, 105.6, 107.9, 110.0, 110.0. This series was normalized to 1 in the first quarter by dividing each entry by 98.8. The resulting series is denoted by μ1 (=1), μ2,…,μ14

The technique suggested here for decomposing property prices into land and structures components can be viewed as a variant of a technique used by Davis and Heathcote (2007) and Davis and Palumbo (2008).

This procedure was recently used by Shimizu, Nishimura and Watanabe (2010) and Shimizu, Takatsuji, Ono and Nishimura (2010) in their hedonic regression models for Tokyo house prices. An analogous procedure has also been recently applied by Ivancic, Diewert and Fox (2011) and de Haan and van der Grient (2011) in their adaptation of the GEKS method for making international comparisons to the scanner data context.

We imposed the restrictions (8.20) on the rolling window regressions and so the rolling window constant quality price index for structures, PRWS, is equal to the constant quality price index for structures listed in Table 8.4, PS4.

By construction, PS4 and PRWS are both equal to the official Statistics Netherlands construction price index for new dwellings, μt/μ1 for t = 1,…,14.

This approximation would probably be an adequate one if the sample period were a decade or so. Obviously, our sample period of 14 quarters is too short to be accurate and there are also sample selectivity problems, i.e., newer houses will be over represented. However, the method we are suggesting here can be illustrated using this rough approximation.

We did not delete the observations for houses that were transacted multiple times over the 14 quarters since a particular house transacted during two or more of the quarters is not actually the same house due to depreciation and renovations.

Since Vt is a value, it does not appear to be a price series at first glance. But in each quarter, the quantity vector which underlies this value is a vector of ones of dimension 2289, which is constant over the 14 quarters. Hence Vt can also be interpreted as a price series, which is normalized to equal one in quarter 1.

If the imputed prices are used for every one of the 45 cell prices for each period (instead of just for the zero transaction cells as was the case for the construction of PStock2) and the same total sample quantity vector is used as the approximate stock quantity vector, then the resulting Lowe index turns out to be exactly equal to PStock1. Thus these two different ways for constructing a stock index turn out to be equivalent. The fact that PStock1 is not equal to PStock2 is clear evidence that there is unit value bias in the cells of the stratification scheme: the cells are simply not defined narrowly enough.

The quantities defined by (8.26)-(8.29), which are constant over the 14 quarters, are equal to 77455, 258550, 253590 and 238476 for small lots, medium size lots, large lots and structures, respectively.

Fixed base and chained Laspeyres, Paasche and Fisher indices are also equal under these circumstances.
