The Difference Between Hedonic Imputation Indexes and Time Dummy Hedonic Indexes
International Monetary Fund

Contributor Notes

Author(s) E-Mail Address: msilver@imf.org

Abstract

Statistical offices try to match item models when measuring inflation between two periods. For product areas with a high turnover of differentiated models, however, the use of hedonic indexes is more appropriate since they include the prices and quantities of unmatched new and old models. The two main approaches to hedonic indexes are hedonic imputation (HI) indexes and dummy time hedonic (DTH) indexes. This study provides a formal analysis of the difference between the two approaches for alternative implementations of the Törnqvist "superlative" index. It shows why the results may differ and discusses the issue of choice between these approaches.

I. Introduction

This paper outlines and compares the two main and quite distinct approaches to the measurement of hedonic price indexes: dummy time hedonic indexes and hedonic imputation indexes (also referred to as “characteristic price index numbers,” Triplett, 2004). Both approaches not only correct price changes for changes in the quality of items purchased, but also allow the indexes to incorporate matched and unmatched models. They provide a means by which price changes can be measured in product markets where there is a rapid turnover of differentiated models. However, they can yield quite different results. This paper provides a formal exposition of the factors underlying such differences and the implications for choice of method. This is undertaken for the Törnqvist index, a superlative formula. As will be explained below, superlative index number formulas, which include the Fisher index, have desirable properties and provide results similar to each other.

The standard way price changes are measured by national statistical offices is through the use of the matched models method. In this method the details and prices of a representative selection of items are collected in a base reference period and their matched prices collected in successive periods so that the prices of “like” are compared with “like.” If, however, there is a rapid turnover of available models, then the sample of product prices used to measure price changes becomes unrepresentative of the category as a whole. This results from both new unmatched models being introduced (but not included in the sample) and older unmatched models being retired (and thus dropping out of the sample). Hedonic indexes use matched and unmatched models and in doing so put an end to the matched models sample selection bias (see Cole, et al., 1986; Silver and Heravi, 2003 and 2005; and Pakes, 2003). The need for hedonic indexes can be seen in the context of the need to reduce bias in the measurement of the U.S. consumer price index (CPI), which has been the subject of three major reports: the Stigler Committee (1961), the Boskin Commission (1996), and the Committee on National Statistics (2002), called the Schultze panel. Each found the inability to properly remove the effect on price changes of changes in quality to be a major source of bias. Hedonic regressions were considered to be the most promising approach to control for such quality changes, although the Schultze panel cautioned that further research on methodology was needed:

Hedonic techniques currently offer the most promising approach for explicitly adjusting observed prices to account for changing product quality. But our analysis suggests that there are still substantial unresolved econometric, data, and other measurement issues that need further attention. (Committee on National Statistics, 2002, p. 6).

At first sight, the two approaches to hedonic indexes appear quite similar. Both rely on hedonic regression equations to remove the effects on price of quality changes. They can also incorporate a range of weighting systems, can be formulated as a geometric, harmonic, or arithmetic aggregator function, and as chained or direct, fixed-base comparisons. Yet they can give quite different results, even when using comparable weights, functional forms, and the same periodic comparison. This is because they work on different principles. The dummy variable method constrains hedonic regression parameters to be the same over time. A hedonic imputation index paradoxically relies on parameter change as the essence of the measure.

There has been some valuable research on the two approaches (see Berndt and Rappaport, 2001; Diewert, 2002; Silver and Heravi, 2003; Pakes, 2003; Haan, 2004; and Triplett, 2004), although no formal analysis, to the author’s knowledge, of the factors governing the differences between the approaches. Berndt and Rappaport (2001) and Pakes (2003) have highlighted the fact that the two approaches can give different results, and both advise the use of hedonic imputation indexes when parameters are unstable, a proposal considered in Section IV.

This paper first examines the alternative formulations of the two main methods, in Section II, and then, in Section III, develops an expression for their differences. Section IV discusses the practical issue of choice between the approaches in light of the findings, and Section V concludes.

II. Hedonic Indexes

A hedonic regression equation of the prices $p_i$ of $i = 1, \dots, N$ models of a product on their quality characteristics $z_{ki}$, where $k = 1, \dots, K$ indexes the price-determining characteristics, is given in a log-linear form by:

$$\ln p_i = \gamma_0 + \sum_{k=1}^{K} \beta_k z_{ki} + \varepsilon_i. \tag{1}$$

The βk are estimates of the marginal valuations the data ascribes to each characteristic (Rosen, 1974; Griliches, 1988; and Triplett, 1987; see also Diewert, 2003; and Pakes, 2003). Statistical offices use hedonic regressions for CPI measurement when a model is no longer sold and a price adjustment for the quality difference is needed. This adjustment is in order that the price of the original model can be compared with that of a non-comparable replacement model. Silver and Heravi (2001) refer to this as “patching.” However, it is only when a model is missing that a new replacement is found, and this is on a one-to-one basis. In dynamic markets, such as personal computers (PCs), old models regularly leave the market and new ones are regularly introduced, not necessarily on a one-to-one basis. There is a need to incorporate the prices of all unmatched models of differing quality and hedonic indexes provide the required measures.

A. Hedonic Imputation (HI) Indexes

Hedonic imputation (hereafter—HI) indexes take a number of forms: (i) as either equally-weighted or weighted indexes; (ii) depending on the functional form of the aggregator, say a geometric aggregator as against an arithmetic one; (iii) with regard to which period’s characteristic set is held constant; and (iv) as direct binary comparisons between periods 0 and t, or as chained indexes. For chained indexes the individual links are calculated between periods 0 and 1, 1 and 2,…, t – 1 and t, and the results combined by successive multiplication.

We consider in this section, as equations (2) and (3) respectively, hedonic Laspeyres and Paasche indexes—weighted, arithmetic, constant base (Laspeyres), and current period (Paasche), aggregators for binary comparisons—and then focus on a generalized hedonic Törnqvist index, given by equation (4). The Törnqvist index is a weighted, geometric aggregator which makes symmetric use of base and current information in binary comparisons. It is a superlative index and, thus, has highly desirable properties. An index number is defined as exact when it equals the true cost of living index for a consumer whose preferences are represented by a particular functional form. A superlative index is defined as an index that is exact for a flexible functional form that can provide a second-order approximation to other twice-differentiable functions around the same point. Superlative indexes are generally symmetrical with respect to their use of information from the two time periods (see Diewert, 2004). Fisher and Walsh index formulas are also superlative indexes and closely approximate the Törnqvist index. The Fisher index is the preferred target index in the international CPI Manual (Diewert, 2004, chapters 15–18).

We start by outlining the hedonic formulations of the well-known Laspeyres and Paasche indexes. Consider the hedonic function $\hat{p}_i^0 = h^0(z_i^0)$ from the semi-logarithmic form of (1), estimated in period 0 with a vector of $K$ quality characteristics $z_i^0 = (z_{i1}^0, \dots, z_{iK}^0)$ and $N^0$ observations, and similarly for period 1. Let quantities sold in periods 0 and 1 be $q_i^0$ and $q_i^1$ respectively.

A hedonic Laspeyres index for matched and unmatched period 0 models is given by:

$$P_{HLas} = \frac{\sum_{i=1}^{N^0} h^1(z_i^0)\, q_i^0}{\sum_{i=1}^{N^0} h^0(z_i^0)\, q_i^0} \tag{2}$$

and a hedonic Paasche index for matched and unmatched period 1 models by:

$$P_{HPas} = \frac{\sum_{i=1}^{N^1} h^1(z_i^1)\, q_i^1}{\sum_{i=1}^{N^1} h^0(z_i^1)\, q_i^1}. \tag{3}$$

It is apparent from equations (2) and (3) that a hedonic Laspeyres index holds characteristics constant in the base period and a hedonic Paasche index holds the characteristics constant in the current period. Thus the differences between the hedonic valuations in Laspeyres and Paasche are dictated by the extent to which the characteristics change over time; that is, $(z_i^1 - z_i^0)$. The farther the $z_i$ values differ over time, say due to greater technological change, the less justifiable is the use of an individual estimate and the less faith there is in a compromise geometric mean of the two indexes, that is, a Fisher index. Note that new (unmatched) models available in period 1, but not in period 0, are excluded from equation (2) and old (unmatched) models available in period 0, but not in period 1, are excluded from equation (3). Laspeyres and Paasche HI indexes thus suffer from a sample selectivity bias.
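As a rough illustration of equations (2) and (3), the sketch below fits a separate semi-log hedonic function in each period and then revalues a fixed set of characteristics with both functions. The data, the single characteristic, and the helper names (`fit_semilog`, `h`, `hedonic_laspeyres`, `hedonic_paasche`) are all hypothetical, and predicted prices are taken as a plain exponential of the fitted log price with no retransformation adjustment.

```python
import numpy as np

def fit_semilog(z, p):
    """OLS fit of ln p = g0 + b*z for one period; returns array (g0, b)."""
    X = np.column_stack([np.ones(len(z)), z])
    coef, *_ = np.linalg.lstsq(X, np.log(p), rcond=None)
    return coef

def h(coef, z):
    """Predicted price from a fitted semi-log hedonic function."""
    return np.exp(coef[0] + coef[1] * z)

def hedonic_laspeyres(c0, c1, z0, q0):
    """Equation (2): period 0 characteristics and quantities held constant."""
    return np.sum(h(c1, z0) * q0) / np.sum(h(c0, z0) * q0)

def hedonic_paasche(c0, c1, z1, q1):
    """Equation (3): period 1 characteristics and quantities held constant."""
    return np.sum(h(c1, z1) * q1) / np.sum(h(c0, z1) * q1)

# Hypothetical data: one characteristic z, prices p, quantities q.
z0 = np.array([1.0, 2.0, 3.0, 4.0])
p0 = np.array([100.0, 140.0, 190.0, 260.0])
q0 = np.array([10.0, 8.0, 5.0, 2.0])
z1 = np.array([2.0, 3.0, 4.0, 5.0])   # model mix has shifted towards higher z
p1 = np.array([120.0, 150.0, 200.0, 250.0])
q1 = np.array([6.0, 9.0, 7.0, 4.0])

c0, c1 = fit_semilog(z0, p0), fit_semilog(z1, p1)
laspeyres = hedonic_laspeyres(c0, c1, z0, q0)
paasche = hedonic_paasche(c0, c1, z1, q1)
print(laspeyres, paasche)
```

The two numbers differ only because the slopes of the two fitted functions differ and the characteristics at which they are compared differ; with identical slopes both would equal the exponential of the intercept shift.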

Let $i \in S^t$ ($t = 0, 1$) be the set of models available in period $t$. Let $i \in S^M \equiv S^0 \cap S^1$ be the set of matched models with common characteristics $z_i^m = z_i^0 = z_i^1$ in both periods 0 and 1.

Unmatched new models present in period 1, but not in period 0, are given by $i \in S^1(1\neg 0)$; and unmatched old models present in period 0, but not in period 1, by $i \in S^0(0\neg 1)$. Let the number of models in these respective sets be denoted by $N^M$, $N^0(0\neg 1)$ and $N^1(1\neg 0)$. A hedonic Törnqvist index, for matched models only, is given by the first term on the right-hand-side of equation (4). The hedonic Törnqvist index is generalized to include disappearing and new models by the respective inclusion of the second and third terms on the right-hand-side of equation (4). An alternative and equivalent formulation to equation (4) would be to include only these last two terms, but with the products taken over $i \in S^0$ and $i \in S^1$ respectively. However, we use equation (4) as it provides a more detailed, and analytically useful, decomposition of the price changes of the different sets of models.

A generalized hedonic Törnqvist index is given by:

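$$P_{H\text{Törnqvist}} = \prod_{i \in S^M} \left[\frac{h^1(z_i^m)}{h^0(z_i^m)}\right]^{\tilde{s}_i^m} \times \prod_{i \in S^0(0\neg 1)} \left[\frac{h^1(z_i^0)}{h^0(z_i^0)}\right]^{s_i^0/2} \times \prod_{i \in S^1(1\neg 0)} \left[\frac{h^1(z_i^1)}{h^0(z_i^1)}\right]^{s_i^1/2} \tag{4}$$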

where relative expenditure shares for model $i$ in period $t$ are given by $s_i^t = p_i^t q_i^t / \sum_j p_j^t q_j^t$ for $t = 0, 1$, and expenditure shares for matched models $m$ are an average of those in periods 0 and 1, that is, $\tilde{s}_i^m = (s_i^0 + s_i^1)/2$ for $i \in S^M$. Note that $\tilde{s}_i^m$ (for $i \in S^M$) plus $s_i^0/2$ (for $i \in S^0(0\neg 1)$) plus $s_i^1/2$ (for $i \in S^1(1\neg 0)$) sum to unity. For illustration, if the expenditure shares of matched models were 0.6 and 0.7 in periods 0 and 1, respectively; of unmatched old models in period 0, 0.4; and unmatched new models in period 1, 0.3; then $\sum_i \tilde{s}_i^m = 0.65$, $\sum_i s_i^0/2 = 0.2$ and $\sum_i s_i^1/2 = 0.15$.
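The weighting in equation (4) can be checked with a small sketch that reuses the shares from the illustration above (0.6 and 0.7 for matched models, 0.4 for old models, 0.3 for new models). The hedonic functions `h0` and `h1`, the characteristic values, and the function name `tornqvist_hi` are hypothetical choices made only for this example.

```python
import numpy as np

def tornqvist_hi(h0, h1, matched, old, new):
    """Generalized hedonic Tornqvist index in the form of equation (4).

    matched = (z, s0, s1), old = (z, s0), new = (z, s1); h0 and h1 are
    callables returning predicted prices for an array of characteristics.
    Returns the index and the three groups of exponents (weights).
    """
    zm, s0m, s1m = matched
    zo, s0o = old
    zn, s1n = new
    w_m, w_o, w_n = (s0m + s1m) / 2.0, s0o / 2.0, s1n / 2.0
    log_index = (np.sum(w_m * np.log(h1(zm) / h0(zm)))
                 + np.sum(w_o * np.log(h1(zo) / h0(zo)))
                 + np.sum(w_n * np.log(h1(zn) / h0(zn))))
    return np.exp(log_index), (w_m, w_o, w_n)

# Hypothetical period 0 and period 1 hedonic functions (semi-log form).
h0 = lambda z: np.exp(4.00 + 0.30 * z)
h1 = lambda z: np.exp(3.95 + 0.28 * z)

# Two matched models, one disappearing model, one new model.
matched = (np.array([2.0, 3.0]), np.array([0.35, 0.25]), np.array([0.30, 0.40]))
old = (np.array([1.0]), np.array([0.40]))
new = (np.array([4.0]), np.array([0.30]))

index, (w_m, w_o, w_n) = tornqvist_hi(h0, h1, matched, old, new)
print(w_m.sum(), w_o.sum(), w_n.sum())  # group weights
print(index)
```

The three group weights add to one, which is what lets the expression be read as a weighted geometric mean of predicted price relatives.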

Of note is that estimated prices are used for matched models; a good case can be made for using actual prices for matched models when available (Haan, 2004, p. 2). Equation (4) is a (superlative) Törnqvist HI index generalized to include new and disappearing models. In Section II. B. a dummy time hedonic index will be identified as an alternative approach to estimating a generalized Törnqvist hedonic index. The issue addressed by the paper is to identify an expression for the differences between the hedonic imputation and dummy time hedonic approaches. As will be seen, an econometric device is useful in this respect which requires we work with predicted, rather than actual, prices for matched models, although this paper is not alone in this (Pakes, 2003).

B. Dummy Time Hedonic (DTH) Indexes

Dummy time hedonic (hereafter, DTH) indexes are a second approach to estimating price changes that use hedonic regressions to control for the different quality mix of new and disappearing models. As with HI indexes, DTH indexes do not require a matched sample. In this section we show how the generalized Törnqvist hedonic index in (4) can be estimated as a DTH index. The DTH formulation is similar to equation (1) except that a single regression is estimated on the data in the two time periods compared, 0 and 1, $i \in S^t$ for $t = 0, 1$. The prices, $p_i^t$, for each model $i$, are regressed on a dummy variable $D_{0i}^t$, which is equal to 1 in period 1 and zero otherwise, and on $z_{ki}^t$, $k = 1, \dots, K$, price-determining characteristics in a regression with well-behaved residuals $\varepsilon_i^t$:

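$$\ln p_i^t = \delta_0 + \delta_1 D_{0i}^t + \sum_{k=1}^{K} \beta_k z_{ki}^t + \varepsilon_i^t \quad \text{for } i \in S^t \text{ and } t = 0, 1. \tag{5}$$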

The exponent of the estimated coefficient $\delta_1^*$ is an estimate of the quality-adjusted price change between period 0 and period 1 regardless of the reference quality vector. Consider for simplicity the case of only matched models where there is no need for the quality characteristics $z_{ki}^t$ in (5). Then $\delta_1^* = \frac{1}{n}\sum_{i=1}^{n} (\ln p_i^1 - \ln p_i^0)$ and $\exp(\delta_1^*)$ is the geometric mean of $p_i^1/p_i^0$ with an adjustment, as detailed in van Garderen and Shah (2002). Also note that for unmatched models, since $\beta_k^0 = \beta_k^1 = \beta_k$ in (5), the value of $\delta_1^*$ is invariant to the level of $z_{ki}^t$: the lines of the functions of $\ln p_i^t$ on $z_{ki}^t$ for periods $t = 0, 1$ are parallel and the shift intercept constant.
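The matched-models special case can be verified numerically. The sketch below, on hypothetical prices, regresses log prices on an intercept and the time dummy only and compares the result with the geometric mean of price relatives; it leaves out the adjustment of van Garderen and Shah (2002).

```python
import numpy as np

# Hypothetical prices of the same four models in periods 0 and 1.
p0 = np.array([100.0, 150.0, 220.0, 310.0])
p1 = np.array([95.0, 140.0, 215.0, 280.0])
n = len(p0)

# Pooled regression of ln p on an intercept and the time dummy D0.
y = np.log(np.concatenate([p0, p1]))
D0 = np.concatenate([np.zeros(n), np.ones(n)])
X = np.column_stack([np.ones(2 * n), D0])
delta0, delta1 = np.linalg.lstsq(X, y, rcond=None)[0]

# Geometric mean of the matched price relatives.
geo_mean = np.exp(np.mean(np.log(p1 / p0)))
print(np.exp(delta1), geo_mean)
```

The two printed values coincide because, with only an intercept and a dummy, the OLS dummy coefficient is the difference between the two period means of log price.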

It may at first be thought that weighted indexes such as the target Törnqvist index cannot be compared with DTH indexes, as in (5), since the latter are unweighted (equally weighted). However, Diewert (2002 and 2005) shows that if a weighted least squares (WLS) estimator is applied to (5), the resulting estimate of price change will correspond to a weighted index number formula. More particularly, the formulation of the weights for the WLS estimator dictates which index number formula the DTH estimate corresponds to. A WLS estimator is equivalent to an OLS estimator applied to data which have been repeated in line with their weight, akin to repeated sampling. A DTH price change estimate based on a WLS estimator, with weights $\tilde{s}_i^m = (s_i^0 + s_i^1)/2$ for matched models and $s_i^0/2$ or $s_i^1/2$ for the unmatched old and new models respectively, corresponds to a generalized Törnqvist index (Diewert, 2005). In Section III.A use is made of this weighting structure to derive and compare generalized DTH and HI Törnqvist index estimates.
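The equivalence between WLS and OLS on replicated data can be seen in a few lines. This is a minimal sketch on hypothetical data in which integer `counts` stand in for the weights; the helper `wls` implements weighted least squares by scaling each row by the square root of its weight.

```python
import numpy as np

def wls(X, y, w):
    """Weighted least squares via OLS on rows scaled by sqrt(w)."""
    sw = np.sqrt(w)
    return np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]

# Hypothetical pooled data for regression (5) with one characteristic.
lnp = np.array([4.6, 5.0, 5.4, 4.7, 5.0, 5.5])
D0 = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
z = np.array([1.0, 2.0, 3.0, 2.0, 3.0, 5.0])
counts = np.array([3, 1, 2, 2, 4, 1])   # integer weights, for illustration

X = np.column_stack([np.ones(len(z)), D0, z])
b_wls = wls(X, lnp, counts)

# OLS on data in which each observation is repeated `counts` times.
X_rep = np.repeat(X, counts, axis=0)
y_rep = np.repeat(lnp, counts)
b_rep = np.linalg.lstsq(X_rep, y_rep, rcond=None)[0]

print(b_wls)
print(b_rep)
```

Rescaling the weights by a constant, for example to expenditure shares that sum to one, leaves the coefficient estimates unchanged, so only relative weights matter for the index.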

The regression equation (5) constrains each of the $\beta_k$ coefficients to be the same across the two periods compared. In restricting the slopes to be the same, the (log of the) price change between periods 0 and 1 can be measured at any value of $z$, as illustrated by the difference between the dashed lines in Figure 1. For convenience it is first evaluated at the origin as $\delta_1^*$. Bear in mind that the HI indexes outlined above estimate the differences between price surfaces with different slopes. As such, the estimates have to be conditioned on particular values of $z$, which gives rise to the two estimates (whose arithmetic equivalents are) considered in (2) and (3): the base HI using $z^0$ and the current period HI using $z^1$, as shown in Figure 1. The very core of the DTH method is to constrain the slope coefficients to be the same, so there is no need to condition on particular values of $z$. The DTH estimates implicitly and usefully make symmetric use of base and current period data. As with hedonic imputation indexes, DTH indexes can take fixed and chained base forms. They can also take a fully constrained form whereby a single constrained regression is estimated for, say, January to December with dummy variables for each month, although this is impractical in real time since it requires data on future observations.

III. Why Hedonic Imputation and Dummy Time Hedonic Indexes Differ

A. Algebraic Differences: A Reformulation of the Hedonic Indexes

There has been little analytical work undertaken on the factors governing differences between the two approaches. To compare the HI approach to the DTH approach we first need to reformulate the HI indexes. We note that the HI approach relies on two estimated hedonic equations, $h^1(z_i^1)$ and $h^0(z_i^0)$, for periods 1 and 0 respectively:

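$$\ln p_i^1 = \gamma_0^1 + \sum_{k=1}^{K} \beta_k^1 z_{ki}^1 + \varepsilon_i^1 \tag{6}$$
$$\ln p_i^0 = \gamma_0^0 + \sum_{k=1}^{K} \beta_k^0 z_{ki}^0 + \varepsilon_i^0 \tag{7}$$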

We assume that the errors in each equation are similarly distributed, then phrase the two equations as a single hedonic regression equation with dummy time intercept and slope variables:

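$$\ln p_i^t = \gamma_0^0 + \gamma_1 D_{0i}^t + \sum_{k=1}^{K} \beta_k^0 z_{ki}^t + \sum_{k=1}^{K} \beta_k D_{ki}^t + \varepsilon_i^t \quad \text{for } i \in S^t \text{ and } t = 0, 1 \tag{8}$$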

where $D_{0i}^t = 1$ if observations are in period 1 and 0 otherwise, $\gamma_1 = (\gamma_0^1 - \gamma_0^0)$, $D_{ki}^t = z_{ki}^1$ if observations are in period 1 and 0 otherwise, and $\beta_k = (\beta_k^1 - \beta_k^0)$. The estimated $\gamma_1^*$ is an estimate of the change in the intercepts of the two hedonic price equations and is thus an HI index evaluated at a particular value of $z_{ki}^t$, $i \in S^t$ and $t = 0, 1$; let this value be denoted by $\tilde{z}_k^t$, which is equal to zero at the intercept. An HI index evaluated at $\tilde{z}_k^t = 0$ has no economic meaning.
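That the single regression (8) simply repackages the two period regressions (6) and (7) can be confirmed on simulated data. The sketch below uses one characteristic, unweighted OLS, and arbitrary hypothetical parameter values; the helper `ols` is defined here only for the example.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Hypothetical data with different intercepts and slopes in each period.
rng = np.random.default_rng(0)
z0 = rng.uniform(1.0, 5.0, 30)
z1 = rng.uniform(2.0, 6.0, 40)
lnp0 = 4.0 + 0.30 * z0 + rng.normal(0.0, 0.05, 30)
lnp1 = 3.8 + 0.25 * z1 + rng.normal(0.0, 0.05, 40)

# Separate period regressions, as in (7) and (6).
g00, b0 = ols(np.column_stack([np.ones(30), z0]), lnp0)
g01, b1 = ols(np.column_stack([np.ones(40), z1]), lnp1)

# Pooled regression (8) with intercept and slope dummies.
z = np.concatenate([z0, z1])
D0 = np.concatenate([np.zeros(30), np.ones(40)])
D1 = D0 * z                      # slope dummy: z in period 1, 0 otherwise
X = np.column_stack([np.ones(70), D0, z, D1])
gamma00, gamma1, beta0, beta_diff = ols(X, np.concatenate([lnp0, lnp1]))

print(gamma1, g01 - g00)         # intercept shift
print(beta_diff, b1 - b0)        # slope change
```

The pooled intercept-dummy coefficient reproduces the difference in period intercepts, and the slope-dummy coefficient reproduces the change in slopes.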

For our phrasing of a HI index in (10) to correspond to the generalized hedonic Törnqvist index in (4) two things are required. First, a weighted least squares (WLS) estimator should be used to estimate $\gamma_1$ from equation (8) with weights $\tilde{s}_i^m = (s_i^0 + s_i^1)/2$ for matched models and $s_i^0/2$ or $s_i^1/2$ for unmatched old and new models respectively (Diewert, 2002 and 2005). Second, the estimate of $\gamma_1$ in (8) is at the intercept, while the generalized HI Törnqvist index (4) requires it be evaluated at the mean value of $z_{ki}^t$ implicit in the generalized Törnqvist HI index of (4):

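$$\tilde{z}_k^t = \bar{z}_{k\text{Törn}}^t = \prod_{i \in S^M} \left(z_{ki}^m\right)^{\tilde{s}_{ki}} \times \prod_{i \in S^0(0\neg 1)} \left(z_{ki}^0\right)^{s_{ki}^0/2} \times \prod_{i \in S^1(1\neg 0)} \left(z_{ki}^1\right)^{s_{ki}^1/2}. \tag{9}$$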

For a Törnqvist HI index the $\gamma_1^*$ estimate is evaluated at $\tilde{z}_k^t = \bar{z}_{k\text{Törn}}^t$. This requires an adjustment to the $\gamma_1^*$ estimate.

The required generalized HI Törnqvist index is given by the exponent of:

$$\gamma_1^* + \sum_{k=1}^{K} \beta_k^* \bar{z}_{k\text{Törn}}^t \quad \text{for } i \in S^t \text{ and } t = 0, 1 \tag{10}$$

where $\beta_k^*$ is a WLS estimate of $(\beta_k^1 - \beta_k^0)$. Annex 1 demonstrates, by way of Figure 2, that, for a single ($k = 1$) variable $z$, $\sum_{k=1}^{K} \beta_k^* \bar{z}_{k\text{Törn}}^t$ is the required adjustment.

Consider now the DTH index in (5) which constrains $\beta_k = (\beta_k^1 - \beta_k^0) = 0$ in (8) and thus $\sum_{k=1}^{K} \beta_k D_{ki}^t$ to be zero. The DTH index in (5) corresponds to a generalized DTH Törnqvist index if estimated using WLS where the weights are those outlined after equation (5) above. A natural question is how does the estimated DTH index $\delta_1^*$ in (5), which is invariant to values of $z_{ki}^t$, differ from the HI index evaluated at the means $\bar{z}_{k\text{Törn}}^t$ in (10)?

B. How Does a Törnqvist HI Index Differ from a Törnqvist DTH Index?

This difference is first considered by comparing $\gamma_1^*$ from (8), the HI index, and $\delta_1^*$ from (5), the DTH index. We are interested in the difference between these two estimated intercept shifts, that is, between the estimated constant-quality shift parameters from the constrained (DTH) regression equation (5) and the unconstrained (HI) regression equation (8). Being intercept shifts, the difference between these two indexes will be determined at the origin, where $\tilde{z}_k^t = 0$. This is useful as a first stage in the derivation. However, we then extend the analysis to examine how the expressions differ at, more usefully, the mean $\bar{z}_{k\text{Törn}}^t$ from (9). We now turn to a consideration of the difference $(\delta_1^* - \gamma_1^*)$ between these two dummy variable parameter estimates at the origin, as “omitted variable bias” due to the omission of $\sum_{k=1}^{K} \beta_k D_{ki}^t$ from (8).

Expressions for the bias in estimated regression parameters due to the omission of relevant variables are well established (see Davidson and MacKinnon, 1993). For example, in the regression equation $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$, the bias in the estimate of $\beta_1$ from a regression that excludes $x_2$ is equal to the coefficient on the excluded variable, $\beta_2$, multiplied by the coefficient on the included variable, $\alpha_1$, from an auxiliary regression of the excluded on the included variable, i.e. $x_2 = \alpha_0 + \alpha_1 x_1 + \omega$. Consider a simplified case of (8) of a single ($k = 1$) characteristic and two time periods, the principles being readily extended. The auxiliary regression is the slope dummy variable, $D_{1i}^t$ ($= z_{1i}^1$ if period 1 and 0 otherwise, in (8)), regressed on the remaining right-hand-side variables in (8), the intercept dummy $D_{0i}^t$ and the $z_{1i}^t$ characteristic, with an error term $\omega_i^t$:

$$D_{1i}^t = \lambda_0^0 + \lambda_1 D_{0i}^t + \lambda_2^0 z_{1i}^t + \omega_i^t. \tag{11}$$

Omitted variable bias is the product of the estimated coefficient on the omitted variable, $\beta_1^*$ (for $k = 1$ in (8)), and the estimated coefficient $\lambda_1^*$ from the above regression (11) (Davidson and MacKinnon, 1993). Thus the difference (before taking exponents), DTH minus HI, at the intercept is $(\delta_1^* - \gamma_1^*) = \beta_1^* \times \lambda_1^*$.
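For OLS estimates the omitted variable relationship is an exact in-sample identity, which the following sketch checks on hypothetical data with a single characteristic. It uses unweighted OLS rather than the WLS Törnqvist weights, and the helper `ols` and all data values are assumptions made for illustration.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Hypothetical unmatched samples for periods 0 and 1.
z0 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
z1 = np.array([2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
lnp0 = 4.0 + 0.30 * z0 + np.array([0.02, -0.01, 0.00, 0.03, -0.02])
lnp1 = 3.8 + 0.22 * z1 + np.array([-0.01, 0.02, 0.01, -0.03, 0.00, 0.02])

z = np.concatenate([z0, z1])
lnp = np.concatenate([lnp0, lnp1])
D0 = np.concatenate([np.zeros(len(z0)), np.ones(len(z1))])
D1 = D0 * z                                   # slope dummy of (8)

X_short = np.column_stack([np.ones(len(z)), D0, z])   # regression (5)
X_long = np.column_stack([X_short, D1])               # regression (8)

delta1 = ols(X_short, lnp)[1]                 # DTH intercept shift
long_coef = ols(X_long, lnp)
gamma1, beta1 = long_coef[1], long_coef[3]    # HI shift at z = 0, slope change
lambda1 = ols(X_short, D1)[1]                 # auxiliary regression (11)

print(delta1 - gamma1, beta1 * lambda1)
```

The two printed values agree to rounding error, so the gap between the DTH and HI intercept shifts is fully accounted for by the slope change times the auxiliary coefficient.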

Our next concern is to derive this difference at $\bar{z}_{1\text{Törn}}^t$ rather than at $\tilde{z}_k^t = 0$. Since the DTH method holds the parameter estimates constant through any value of $z_k^t$, the (log of the) DTH index is thus given by $\delta_1^* = \beta_1^* \times \lambda_1^* + \gamma_1^*$.

However, the (log of the) Törnqvist HI index at $\bar{z}_{1\text{Törn}}^t$ from (8) for one variable is estimated as:

$$\gamma_1^* + \beta_1^* \bar{z}_{1\text{Törn}}^t. \tag{12}$$

Thus the ratio of the DTH and HI indexes at the intercept is $\exp(\delta_1^*)/\exp(\gamma_1^*) = \exp(\beta_1^* \times \lambda_1^*)$ and the DTH index is thus given by $\exp(\delta_1^*) = \exp(\beta_1^* \times \lambda_1^*) \times \exp(\gamma_1^*)$. The Törnqvist HI index at $\bar{z}_{1\text{Törn}}^t$ from (8) for one variable is estimated as $\exp(\gamma_1^* + \beta_1^* \bar{z}_{1\text{Törn}}^t)$. Thus the ratio of the Törnqvist DTH index to the Törnqvist HI index at $\bar{z}_{1\text{Törn}}^t$ is:

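$$\left(\frac{\text{Törnqvist DTH index}}{\text{Törnqvist HI index}}\right)_{\bar{z}_{1\text{Törn}}^t} = \exp\left[\left(\beta_1^{*1} - \beta_1^{*0}\right)\left(\lambda_1^* - \bar{z}_{1\text{Törn}}^t\right)\right] \tag{13}$$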

where $\beta_1^{*1}$ and $\beta_1^{*0}$ are WLS estimates.

If either of the two terms making up the product on the right-hand-side is close to zero then there will be little difference between the indexes. Neither parameter instability nor a change in the mean characteristic is sufficient in itself to lead to a difference between the formulas. The $\beta_1^* = (\beta_1^{*1} - \beta_1^{*0})$ from (8) is the estimated change in the marginal valuation of the characteristic between periods 0 and 1, which can be positive or negative, but may be more generally thought to be negative to represent diminishing marginal utility/cost of the characteristic.

C. Interpretation of $(\lambda_1^* - \bar{z}_{1\text{Törn}}^t)$

Equation (13) shows us that the change in the coefficients is one factor determining the difference between the two methods. The second expression, $(\lambda_1^* - \bar{z}_{1\text{Törn}}^t)$, is more difficult to interpret and we consider it here. Bear in mind that the left-hand-side of the regression in equation (11), $D_{1i}^t$, is 0 in period 0 and $z_{1i}^1$ in period 1, and that $D_{0i}^t$ on the right-hand-side is 1 in period 1 and zero in period 0. If we assume quality characteristics are positive, $\lambda_1$ will always be positive as the change from 0 in period 0 to their values in period 1. Consider the weighted (Törnqvist) mean $\bar{z}_{1\text{Törn}}^0$ for $i \in S^M$ and $i \in S^0(0\neg 1)$ (matched period 0 and unmatched old period 0) and $\bar{z}_{1\text{Törn}}^1$ for $i \in S^M$ and $i \in S^1(1\neg 0)$ (matched period 1 and unmatched new models in period 1).

If we assume for simplicity that $\bar{z}_{1\text{Törn}}^0 = \bar{z}_{1\text{Törn}}^1$, then $\lambda_1^* = \bar{z}_{1\text{Törn}}^1$, since $\lambda_1^*$ is an estimate of the change in $D_{1i}^t$ in (11) arising from changing from period 0 to period 1, where it is $\bar{z}_{1\text{Törn}}^1$ and has an expectation of $\bar{z}_{1\text{Törn}}^1$. The $\lambda_1^*$ estimate is conditioned in (11) on $\bar{z}_{1\text{Törn}}^t$, the change from $\bar{z}_{1\text{Törn}}^0$ to $\bar{z}_{1\text{Törn}}^1$, but since we assume these two have not changed, our estimate of $\lambda_1^* = \bar{z}_{1\text{Törn}}^1$ holds true. Thus the second part of the difference expression in (13), $(\lambda_1^* - \bar{z}_{1\text{Törn}}^t)$, is simply $(\bar{z}_{1\text{Törn}}^1 - \bar{z}_{1\text{Törn}}^t)$ which, given our assumption of $\bar{z}_{1\text{Törn}}^0 = \bar{z}_{1\text{Törn}}^1$, is equal to 0. It follows from the right-hand side of (13) that, for samples with negligible change in the mean values of the characteristics, the DTH and HI will be similar irrespective of any parameter instability. Diewert (2002) and Aizcorbe (2003) have shown that the DTH and HI indexes will be the same for matched models and this analysis gives support to their finding. However, we find first that it is not matching per se that dictates the relationship; for unmatched models all that is required is that $\bar{z}_{1\text{Törn}}^1 = \bar{z}_{1\text{Törn}}^t$, which may occur without matching: it simply requires the means of the characteristics not to change. Second, even when the means change the two approaches will be equal if $\beta_k^{*1} - \beta_k^{*0} = 0$, i.e. there is parameter stability. Finally, it follows that if both of the right-hand side expressions in (13) are large, the differences between the indexes will be compounded.
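The equal-means case can be illustrated with unmatched samples whose characteristic means happen to coincide. The sketch below uses unweighted OLS and simple arithmetic means as stand-ins for the WLS Törnqvist versions, which is an assumption of the example, as are the data and the helper `ols`.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Unmatched models, but both periods have a mean characteristic of 3.
z0 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
z1 = np.array([0.5, 2.0, 2.5, 3.0, 3.5, 6.5])
# Hypothetical log prices with clearly different slopes in the two periods.
lnp0 = 4.0 + 0.30 * z0 + np.array([0.01, -0.02, 0.015, -0.01, 0.005])
lnp1 = 3.8 + 0.20 * z1 + np.array([-0.01, 0.02, 0.0, -0.015, 0.01, -0.005])

z = np.concatenate([z0, z1])
lnp = np.concatenate([lnp0, lnp1])
D0 = np.concatenate([np.zeros(len(z0)), np.ones(len(z1))])
D1 = D0 * z

X_short = np.column_stack([np.ones(len(z)), D0, z])
X_long = np.column_stack([X_short, D1])

lambda1 = ols(X_short, D1)[1]            # auxiliary regression (11)
delta1 = ols(X_short, lnp)[1]            # log DTH index
long_coef = ols(X_long, lnp)
gamma1, beta1 = long_coef[1], long_coef[3]
log_hi_at_mean = gamma1 + beta1 * z1.mean()   # log HI index at the mean, as in (12)

print(lambda1, z1.mean())
print(delta1, log_hi_at_mean, beta1)
```

Here the slope change `beta1` is far from zero, yet the log DTH index and the log HI index evaluated at the common mean coincide, because the auxiliary coefficient equals that mean.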

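But what if $\bar{z}_{1\text{Törn}}^0 \neq \bar{z}_{1\text{Törn}}^1$? Since the estimated coefficient on $x_1$ of a regression of $y$ on $x_1$ and $x_2$ is given by

$$\frac{\sum y x_1 \sum x_2^2 - \sum y x_2 \sum x_1 x_2}{\sum x_1^2 \sum x_2^2 - \left(\sum x_1 x_2\right)^2},$$

the estimated coefficient $\lambda_1^*$ from (11) is given by:

$$\lambda_1^* = \frac{\bar{z}_1^1 (N - N^1)\sigma_z^2 - N \operatorname{cov}(D_{1i}^t, z_{1i}^t)(\bar{z}_1^1 - \bar{z}_1^t)}{(N - N^1)\sigma_z^2 - (\bar{z}_1^1 - \bar{z}_1^t)^2 N^1} \tag{14}$$

for an unweighted regression where $N^1$ and $N$ are the respective number of observations in period 1 and both periods 0 and 1, $\sigma_z^2$ is the variance of $z$, and $\operatorname{cov}(D_{1i}^t, z_{1i}^t) = \sum (z^t)^2 / N^1 - \bar{z}^1 \bar{z}$ is the covariance of $D_{1i}^t$ and $z_{1i}^t$ from (11). Readers are reminded that from (13):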
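The closed form for the auxiliary coefficient can be compared with a direct OLS fit of (11). In this sketch the variance and covariance are taken as ordinary population moments (dividing by $N$) of the pooled sample, which is an assumption about the intended definitions; the text's own covariance expression may be normalized differently. Data and names are hypothetical.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Hypothetical characteristic values; period 1 models have higher z on average.
rng = np.random.default_rng(1)
z0 = rng.uniform(1.0, 5.0, 25)
z1 = rng.uniform(2.0, 7.0, 35)

z = np.concatenate([z0, z1])
D0 = np.concatenate([np.zeros(len(z0)), np.ones(len(z1))])
D1 = D0 * z                                   # slope dummy

N, N1 = len(z), len(z1)
zbar1, zbar = z1.mean(), z.mean()             # period 1 mean and pooled mean
var_z = z.var()                               # population variance of pooled z
cov_Dz = np.mean(D1 * z) - D1.mean() * zbar   # population covariance (assumed)

# Closed form in the shape of equation (14).
lam_formula = ((zbar1 * (N - N1) * var_z - N * cov_Dz * (zbar1 - zbar))
               / ((N - N1) * var_z - (zbar1 - zbar) ** 2 * N1))

# Direct estimate from the auxiliary regression (11).
lam_ols = ols(np.column_stack([np.ones(N), D0, z]), D1)[1]

print(lam_formula, lam_ols)
```

Under these definitions the closed form and the regression coefficient agree, which gives some reassurance about the roles that the period sizes, the dispersion of $z$, and the shift in means play in determining $\lambda_1^*$.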

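$$\left(\frac{\text{Törnqvist DTH index}}{\text{Törnqvist HI index}}\right)_{\bar{z}_{1\text{Törn}}^t} = \exp\left[\left(\beta_1^{*1} - \beta_1^{*0}\right)\left(\lambda_1^* - \bar{z}_{1\text{Törn}}^t\right)\right]. \tag{15}$$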

First, as noted, if there is either negligible parameter instability or a negligible change in the mean of the characteristic, then there will be little difference between the formulas. However, as parameter instability increases and the change in the mean characteristic increases, the multiplicative effect on the difference between the indexes is compounded. The likely direction and magnitude of any difference is not immediately obvious. Assume diminishing marginal valuations of characteristics, so that $(\beta_1^{*1} - \beta_1^{*0}) < 0$. Second, even assuming a positive technological advance, $(\bar{z}_1^1 - \bar{z}_1^t) > 0$, and given that $(N - N^1)$, $\operatorname{cov}(D_{1i}^t, z_{1i}^t)$, $\bar{z}_1^1$ and $\sigma_z^2$ are positive, it remains difficult to establish from (14) the effect on $\lambda_1^*$ of changes in its constituent parts. However, third, as $N^1$ becomes an increasing share of $N$, and at the limit if $N^1$ takes up all of $N$ (i.e. $(N - N^1) \to 0$), then $\operatorname{cov}(D_{1i}^t, z_{1i}^t) \to \sigma_z^2$ and, importantly, $\bar{z}_1^1 \to \bar{z}_1^t$, and the difference between the formulas tends to zero. Note that (14) is based on an OLS estimator and for a WLS Törnqvist estimator similar principles apply, although the determining factor for a DTH index to exceed a HI index is for the weights of new models to be increasing, a much more reasonable scenario. Thus the nature and extent of any differences between the two indexes will, aside from the parameter change, also depend on (i) changes in the mean quality of models, $(\bar{z}_1^1 - \bar{z}_1^t)$; (ii) the relative number of models in each period, $(N - N^1)$; (iii) the dispersion in $z$, $\sigma_z^2$; (iv) the mean of the characteristics in period 1, $\bar{z}_1^1$; and (v) the $\operatorname{cov}(D_{1i}^t, z_{1i}^t)$, for which, as $(N - N^1) \to 0$, i.e. $N^1$ takes up all of $N$, $\operatorname{cov}(D_{1i}^t, z_{1i}^t) \to \sigma_z^2$, $\bar{z}_1^1 = \bar{z}_1$ and $\lambda_1^* = 0$.

D. Treatment of Unmatched Observations

Diewert (2002) and Aizcorbe (2003) show that while the DTH and HI indexes will be the same for matched models, they differ in their treatment of unmatched data. Consider hedonic functions $h^1(z_i^1)$ and $h^0(z_i^0)$ for periods 1 and 0 respectively, as in (2) and (3), and a (constrained) time dummy regression equation (5). Consider further an unmatched observation only available in period 1. A base period HI index such as (2) would exclude it, while a current period HI index, such as (3), would include it, and a geometric mean of the two would give it half the weight of a matched observation. A Törnqvist hedonic index, (4), would also give an unmatched model half the weight of a matched one. For a DTH index, such as (5), an unmatched period 1 model would appear only once, in period 1, in the estimation of constrained parameters, as opposed to twice for matched data. We would therefore expect superlative HI indexes, such as (4), to be closer to DTH indexes than their constituent elements, (2) and (3), because they make symmetric use of the data.

E. Observations With Undue Influence

HI indexes, such as (2) and (3), explicitly incorporate weights. Silver (2002) has shown that weights are implicitly incorporated in DTH indexes by means of the OLS or WLS estimator used. Silver (2002) has further shown, for DTH indexes, that the manner in which the estimator incorporates the weights may not fully represent the weights, due to adverse influence and leverage effects generated by observations with unusual characteristics and above average residuals.

F. Chaining

Chained base HI indexes are preferred to fixed base ones, especially when matched samples degrade rapidly. In such a case, their use reduces the spread between Laspeyres and Paasche indexes. However, caution is advised in the use of chained monthly series when prices may oscillate around a trend (i.e. ‘bounce’) and as a result, chained indexes can ‘drift’ (Forsyth and Fowler, 1981 and Szulc, 1983).
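The drift that bouncing prices can induce in a chained index is easy to reproduce. The sketch below uses an ordinary (non-hedonic) Laspeyres formula on two hypothetical goods, a simplification chosen only to isolate the chaining effect; prices and quantities return to their starting values in the final period.

```python
import numpy as np

def laspeyres(p_base, p_cur, q_base):
    """Ordinary Laspeyres price index between a base and a current period."""
    return np.sum(p_cur * q_base) / np.sum(p_base * q_base)

# Hypothetical two-good example: good 1 goes on sale in period 1 and
# demand for it jumps, then everything returns to the period 0 values.
prices = [np.array([1.0, 1.0]), np.array([0.5, 1.0]), np.array([1.0, 1.0])]
quantities = [np.array([10.0, 10.0]), np.array([30.0, 10.0]), np.array([10.0, 10.0])]

# Chained index: product of the period-to-period links.
links = [laspeyres(prices[t - 1], prices[t], quantities[t - 1])
         for t in range(1, len(prices))]
chained = np.prod(links)

# Direct (fixed base) comparison of period 2 with period 0.
direct = laspeyres(prices[0], prices[-1], quantities[0])

print(links, chained, direct)
```

The links are 0.75 and 1.6, so the chained index ends at 1.2 even though every price is back at its period 0 level and the direct index is 1.0.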

IV. Choice Between Hedonic Indexes and Dummy Time Hedonic Indexes

The main concern with the DTH index approach as given by equation (5) is that by construction, it constrains the parameters on the characteristic variables to be the same. The HI indexes have no such constraint. Berndt and Rappaport (2001) found, for example, from 1987 to 1999 for desktop PCs, the null hypothesis of adjacent-year equality to be rejected in all but one case. For mobile PCs the null hypothesis of parameter stability was rejected in eight of the 12 adjacent-year comparisons. Berndt and Rappaport (2001) preferred the use of HI indexes if there was evidence of parameter instability. Pakes (2003), using quarterly data for hedonic regressions for desktop PCs over the period 1995 to 1999, rejected just about any hypothesis on the constancy of the coefficients. He also advocated HI indexes on the grounds that “…. since hedonic coefficients vary across periods it [the DTH index approach] has no theoretical justification.” Pakes (2003: 1593).

The concern over parameter instability for DTH methods is warranted. If the estimated coefficients in a DTH index were constrained to either $\beta_1^{*1}$ or $\beta_1^{*0}$, the index would be likely to give quite different results, and this difference is a form of “spread.” A DTH index constrains the parameters to be the same, an average of the two. There is a sense in which we have more confidence in an index based on constraining similar parameters than in one based on constraining two disparate parameters. Equation (15) showed that $(\beta_1^{*1} - \beta_1^{*0})$ was a determining factor in the nature and magnitude of any difference between the HI and DTH indexes.

However, equation (15) also showed how the ratio of DTH and HI indexes was not solely dependent on parameter instability. It depended on the exponent of the product of two components: the change over time in the (WLS estimated) hedonic coefficients and the difference in (statistics that relate to) the (weighted) mean values of the characteristic. Even if parameters were unstable, the difference between the indexes may be compounded or mitigated by the change in the other component.

Note that base and current period HI indexes, (2) and (3), can differ as a result of using a constant $z_i^1$ as against a constant $z_i^0$. Diewert (2002) has argued that HI indexes have the disadvantage that two distinct estimates will be generated and it is somewhat arbitrary how these two estimates are to be averaged to form a single estimate of price change. Of course Diewert (2003) also resolves this very problem by considering superlative hedonic indexes, that is, by not using just the base or just the current period characteristic configurations, but a symmetric average such as (4).

Yet there is a sense that in constraining the coefficients to be the same, the DTH index performs a similar averaging function, but with the parameter estimates. Rather than using a base or current period coefficient set, it constrains them to form an average. There is then the question of which form of constraint is preferred: averaging the characteristic set (HI) or the coefficients (DTH)?

There is much in the theory of superlative index numbers that argues for taking a symmetric mean of the characteristic quantities or value shares. The result is that HI indexes fall more neatly into existing index number theory. At least in this sense they are to be preferred, although (15) provides useful insights into their differences.

V. Conclusions

It is recognized that extensive product differentiation with a high model turnover is an increasing feature of product markets (Triplett, 1999). The motivation of this paper lay in the failure of the matched models method to deal adequately with price measurement in this context and the need for hedonic indexes as the most promising alternative (Schultze and Mackie, 2002). First, the paper developed in Section II a generalized Törnqvist hedonic index, that is, a Törnqvist index number formula generalized to deal with matched and unmatched models and using hedonic regressions to control for quality changes. Second, the paper considered HI and DTH indexes as the two main approaches to estimating a Törnqvist hedonic index. That the two approaches can yield quite different results is of concern. In Section III the paper provided a formal exposition of the factors underlying the difference between the two approaches. It was shown that differences between the two approaches may arise from both parameter instability and changes in the characteristics, and that such differences are compounded when both occur. It further showed that the two approaches give similar results if there is little difference in either component.

Consideration of the issue of choice between the two approaches was based in Section IV on minimizing parameter instability as a concept of spread. The analysis led to the advice that (i) either the DTH or HI index approaches are acceptable if either the parameters are relatively stable or the values of the characteristic set do not change much over time; otherwise, (ii) HI indexes are preferred when there is evidence of parameter instability. Superlative formulations, such as the Törnqvist HI index, are well grounded in index number theory and more intuitively acceptable than a DTH index, which constrains the parameters to be the same, for which there is less obvious justification.

APPENDIX

Hedonic Imputation Index Estimate at a Törnqvist Mean

The required estimate is depicted below in Figure 2 as the vertical difference between the two hedonic functions, $h^1(z_i^1)$ and $h^0(z_i^0)$, at a common $\bar{z}_{k\,\text{Törn}}^{t}$. This is given by A + B − C in Figure 2. A is $\beta^1 \bar{z}_{k\,\text{Törn}}^{t}$, where $\beta^1$ is the slope of $h^1(z_i^1)$; B is estimated by $\gamma_1^*$; and C is estimated by $\beta^0 \bar{z}_{k\,\text{Törn}}^{t}$, where $\beta^0$ is the slope of $h^0(z_i^0)$. A + B − C is estimated by $\gamma_1^* + \beta_k^* \bar{z}_{k\,\text{Törn}}^{t}$, where $\beta_k^*$ is a WLS estimate of $(\beta_k^t - \beta_k^0)$. Equation (10) generalizes this for more than one $k$.

References

  • Aizcorbe, A., 2003, “The Stability of Dummy Variable Price Measures Obtained from Hedonic Regressions” (Unpublished; Washington: Federal Reserve Board).

  • Berndt, E.R. and N.J. Rappaport, 2001, “Price and Quality of Desktop and Mobile Personal Computers: A Quarter-Century Historical Overview,” American Economic Review, Vol. 91, No. 2, pp. 268–73.

  • Boskin, M.S. (Chair), Advisory Commission to Study the Consumer Price Index, 1996, Towards a More Accurate Measure of the Cost of Living, Interim Report to the Senate Finance Committee, Washington D.C.

  • Cole, R., Y.C. Chen, J.A. Barquin-Stolleman, E. Dulberger, N. Helvacian, and J.H. Hodge, 1986, “Quality-Adjusted Price Indexes for Computer Processors and Selected Peripheral Equipment,” Survey of Current Business, Vol. 66, No. 1, January, pp. 41–50.

  • Committee on National Statistics, 2002, At What Price? Conceptualising and Measuring Cost-of-Living and Price Indexes, Panel on Conceptual, Measurement and Other Statistical Issues in Developing Cost-of-Living Indexes, ed. by Charles Schultze and Chris Mackie (Washington: Committee on National Statistics, National Academy Press).

  • Davidson, J.R. and J.G. Mackinnon, 1993, Estimation and Inference in Econometrics, (Oxford: Oxford University Press).

  • Diewert W.E., 2002, “Hedonic Regressions: A Review of Some Unresolved Issues” (Unpublished; Vancouver: Department of Economics, University of British Columbia).

  • Diewert W.E., 2003, “Hedonic Regressions: A Consumer Theory Approach,” in Scanner Data and Price Indexes, ed. by Mathew Shapiro and Rob Feenstra, National Bureau of Economic Research, Studies in Income and Wealth, Vol. 61 (Chicago: University of Chicago Press) pp. 317–48.

  • Diewert W.E., 2004, Chapters 15-20 in Consumer Price Index Manual: Theory and Practice, (Geneva: International Labour Office). Available via the Internet at www.ilo.org/public/english/bureau/stat/guides/cpi/index.htm.

  • Diewert W.E., 2005, “Weighted Country Product Dummy Variable Regressions and Index Number Formulas,” Review of Income and Wealth, Vol. 51, No. 4, December, pp. 561–70.

  • Forsyth, F.G., and R.F. Fowler, 1981, “The Theory and Practice of Chain Price Index Numbers,” Journal of the Royal Statistical Society, Series A, Vol. 144, No. 2, pp. 224–47.

  • Griliches, Z., 1988, “Postscript on Hedonics,” in Technology, Education, and Productivity, ed. by Z. Griliches (New York: Basil Blackwell, Inc.) pp. 119–122.

  • Haan, J. de, 2003, “Time Dummy Approaches to Hedonic Price Measurement,” Paper presented at the Seventh Meeting of the International Working Group on Price Indices, May, 27–29 (Paris: INSEE). Available via the Internet at http://www.ottawagroup.org/meet.shtml.

  • Haan, J. de, 2004, “Hedonic Regressions: The Time Dummy Index as a Special Case of the Törnqvist Index, Time Dummy Approaches to Hedonic Price Measurement,” Paper presented at the Eighth Meeting of the International Working Group on Price Indices, August, 23–25 (Helsinki: Statistics Finland). Available via the Internet at http://www.ottawagroup.org/meet.shtml.

  • Pakes A., 2003, “A Reconsideration of Hedonic Price Indexes with an Application to PCs,” The American Economic Review, Vol. 93, No. 5, December, pp. 1576–93.

  • Rosen, S., 1974, “Hedonic Prices and Implicit Markets: Product Differentiation and Pure Competition,” Journal of Political Economy, Vol. 82, pp. 34–49.

  • Schultze, C. and C. Mackie, 2002, (editors), see Committee on National Statistics, 2002, op. cit.

  • Silver, M., 2002, “The Use of Weights in Hedonic Regressions: the Measurement of Quality-Adjusted Price Changes” (Unpublished; Cardiff: Cardiff Business School, Cardiff University).

  • Silver, M. and S. Heravi, 2001, “Scanner Data and the Measurement of Inflation,” The Economic Journal, Vol. 11, June, pp. 384–405.

  • Silver, M. and S. Heravi, 2003, “The Measurement of Quality-Adjusted Price Changes,” in Scanner Data and Price Indexes, ed. by M. Shapiro and R. Feenstra, National Bureau of Economic Research, Studies in Income and Wealth, Vol. 61 (Chicago: University of Chicago Press) pp. 277–317.

  • Silver, M. and S. Heravi, 2005, “Why the CPI Matched Models Method May Fail Us: Results from an Hedonic and Matched Experiment Using Scanner Data,” Journal of Business and Economic Statistics, Vol. 23, No. 3, pp. 269-81.

  • Stigler, G., 1961, “The Price Statistics of the Federal Government,” Report to the Office of Statistical Standards, Bureau of the Budget (New York: NBER).

  • Szulc, B. J., 1983, “Linking Price Index Numbers,” in Price Level Measurement, pp. 537–66 (Ottawa: Statistics Canada).

  • Triplett, J.E., 1987, “Hedonic Functions and Hedonic Indexes,” in The New Palgrave: A Dictionary of Economics, ed. by J. Eatwell, M. Milgate, and P. Newman (New York: Stockton Press) pp. 630–634.

  • Triplett, J.E., 1999, “The Solow Productivity Paradox: What do Computers do to Productivity?,” Canadian Journal of Economics, Vol. 32, No. 2, April, pp. 309–34.

  • Triplett, J.E., 2004, Handbook on Hedonic Indexes and Quality Adjustments in Price Indexes, Directorate for Science, Technology and Industry (Paris: Organisation for Economic Cooperation and Development).

  • van Garderen K.R. and C. Shah, 2002, “Exact Interpretation of Dummy Variables in Semi-Logarithmic Equations,” Econometric Journal, Vol. 5, No. 1, pp. 149–59.

1 Cardiff University. We acknowledge useful comments from Paul Armknecht (IMF), Ernst Berndt (MIT), Erwin Diewert (University of British Columbia), Kevin Fox (UNSW), Jan de Haan (Statistics Netherlands), and two anonymous referees for the Journal of Business and Economic Statistics.