Abstract
Statistical offices try to match item models when measuring inflation between two periods. For product areas with a high turnover of differentiated models, however, the use of hedonic indexes is more appropriate since they include the prices and quantities of unmatched new and old models. The two main approaches to hedonic indexes are hedonic imputation (HI) indexes and dummy time hedonic (DTH) indexes. This study provides a formal analysis of the difference between the two approaches for alternative implementations of the Törnqvist "superlative" index. It shows why the results may differ and discusses the issue of choice between these approaches.
I. Introduction
This paper outlines and compares the two main and quite distinct approaches to the measurement of hedonic price indexes: dummy time hedonic indexes and hedonic imputation indexes (also referred to as “characteristic price index numbers,” Triplett, 2004). Both approaches not only correct price changes for changes in the quality of items purchased, but also allow the indexes to incorporate matched and unmatched models. They provide a means by which price changes can be measured in product markets where there is a rapid turnover of differentiated models. However, they can yield quite different results. This paper provides a formal exposition of the factors underlying such differences and the implications for choice of method. This is undertaken for the Törnqvist index, a superlative formula. As will be explained below, superlative index number formulas, which include the Fisher index, have desirable properties and provide results similar to each other.
Hedonic techniques currently offer the most promising approach for explicitly adjusting observed prices to account for changing product quality. But our analysis suggests that there are still substantial unresolved econometric, data, and other measurement issues that need further attention. (Committee on National Statistics, 2002, p. 6).
At first sight, the two approaches to hedonic indexes appear quite similar. Both rely on hedonic regression equations to remove the effects on price of quality changes. They can also incorporate a range of weighting systems, can be formulated as a geometric, harmonic, or arithmetic aggregator function, and as chained or direct, fixed-base comparisons. Yet they can give quite different results, even when using comparable weights, functional forms, and the same periodic comparison. This is because they work on different principles. The dummy variable method constrains hedonic regression parameters to be the same over time. A hedonic imputation index paradoxically relies on parameter change as the essence of the measure.
There has been some valuable research on the two approaches (see Berndt and Rappaport, 2001; Diewert, 2002; Silver and Heravi, 2003; Pakes, 2003; Haan, 2004; and Triplett, 2004), although no formal analysis, to the author’s knowledge, of the factors governing the differences between the approaches. Berndt and Rappaport (2001) and Pakes (2003) have highlighted the fact that the two approaches can give different results, and both advise the use of hedonic imputation indexes when parameters are unstable, a proposal considered in Section IV.
This paper first examines the alternative formulations of the two main methods, in Section II, and then, in Section III, develops an expression for their differences. Section IV discusses the practical issue of choice between the approaches in light of the findings, and Section V concludes.
II. Hedonic Indexes
A hedonic regression equation of the prices of i = 1, …, N models of a product, pi, on their quality characteristics zki, where k = 1, …, K indexes the price-determining characteristics, is given in a log-linear form by:
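One standard way to write such a log-linear (semi-logarithmic) hedonic function is sketched here. The intercept symbol $\beta_0$ and the error term $\varepsilon_i$ are notational assumptions and may differ from the symbols used elsewhere in the paper:

```latex
\ln p_i = \beta_0 + \sum_{k=1}^{K} \beta_k z_{ki} + \varepsilon_i ,
\qquad i = 1, \ldots, N .
```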
The βk are estimates of the marginal valuations the data ascribes to each characteristic (Rosen, 1974; Griliches, 1988; and Triplett, 1987; see also Diewert, 2003; and Pakes, 2003). Statistical offices use hedonic regressions for CPI measurement when a model is no longer sold and a price adjustment for the quality difference is needed. This adjustment is in order that the price of the original model can be compared with that of a non-comparable replacement model. Silver and Heravi (2001) refer to this as “patching.” However, it is only when a model is missing that a new replacement is found, and this is on a one-to-one basis. In dynamic markets, such as personal computers (PCs), old models regularly leave the market and new ones are regularly introduced, not necessarily on a one-to-one basis. There is a need to incorporate the prices of all unmatched models of differing quality and hedonic indexes provide the required measures.
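As a minimal sketch of how such a log-linear hedonic regression might be estimated and then used to predict the price of a replacement model, the following uses ordinary least squares. The function names, the two characteristics (speed and memory), and all numbers are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_log_linear_hedonic(prices, characteristics):
    """OLS of ln(price) on an intercept and the K characteristics.

    prices: length-N sequence; characteristics: N x K array-like.
    Returns [intercept, beta_1, ..., beta_K].
    """
    y = np.log(np.asarray(prices, dtype=float))
    z = np.asarray(characteristics, dtype=float).reshape(len(y), -1)
    X = np.column_stack([np.ones(len(y)), z])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def predicted_price(coef, characteristics):
    """Predicted price of one model with the given K characteristics."""
    z = np.asarray(characteristics, dtype=float)
    return float(np.exp(coef[0] + z @ coef[1:]))


# Hypothetical PC data: columns are speed (GHz) and memory (GB).
Z = np.array([[1.0, 4], [1.5, 8], [2.0, 8], [2.5, 16], [3.0, 16]])
true_coef = np.array([5.0, 0.30, 0.02])  # made-up values for illustration
prices = np.exp(true_coef[0] + Z @ true_coef[1:])  # noise-free by design

coef = fit_log_linear_hedonic(prices, Z)
# Quality-adjusted price prediction for a hypothetical replacement model.
replacement_price = predicted_price(coef, [2.2, 8])
```

Because the illustrative prices are generated without noise, the fitted coefficients reproduce the made-up values; with real data the estimates would of course carry sampling error.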
A. Hedonic Imputation (HI) Indexes
Hedonic imputation (hereafter—HI) indexes take a number of forms: (i) as either equally-weighted or weighted indexes; (ii) depending on the functional form of the aggregator, say a geometric aggregator as against an arithmetic one; (iii) with regard to which period’s characteristic set is held constant; and (iv) as direct binary comparisons between periods 0 and t, or as chained indexes. For chained indexes the individual links are calculated between periods 0 and 1, 1 and 2,…, t – 1 and t, and the results combined by successive multiplication.
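The chaining step described above, combining individual links by successive multiplication, can be sketched as follows. The function name and the link values are hypothetical.

```python
def chain_index(links):
    """Cumulative chained index levels from period-to-period links.

    links[j] is the index for period j + 1 relative to period j.
    The returned list starts at 1.0 for period 0.
    """
    levels = [1.0]
    for link in links:
        levels.append(levels[-1] * link)
    return levels


# Hypothetical links for periods 0 to 1, 1 to 2, and 2 to 3.
links = [1.02, 0.97, 1.01]
levels = chain_index(links)
```

The last element of `levels` is the chained comparison between period 0 and period t; a direct index would instead compare period t with period 0 in a single step.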
We consider in this section, as equations (2) and (3) respectively, hedonic Laspeyres and Paasche indexes—weighted, arithmetic, constant base (Laspeyres), and current period (Paasche), aggregators for binary comparisons—and then focus on a generalized hedonic Törnqvist index, given by equation (4). The Törnqvist index is a weighted, geometric aggregator which makes symmetric use of base and current information in binary comparisons. It is a superlative index and, thus, has highly desirable properties. An index number is defined as exact when it equals the true cost of living index for a consumer whose preferences are represented by a particular functional form. A superlative index is defined as an index that is exact for a flexible functional form that can provide a second-order approximation to other twice-differentiable functions around the same point. Superlative indexes are generally symmetrical with respect to their use of information from the two time periods (see Diewert, 2004). Fisher and Walsh index formulas are also superlative indexes and closely approximate the Törnqvist index. The Fisher index is the preferred target index in the international CPI Manual (Diewert, 2004, chapters 15–18).
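To illustrate how closely superlative formulas can approximate each other, the sketch below computes conventional (non-hedonic) Laspeyres, Paasche, Fisher, and Törnqvist indexes for a matched set of three items. The prices and quantities are hypothetical, and the example is only a numerical illustration, not a general proof.

```python
from math import exp, log, sqrt


def laspeyres(p0, p1, q0):
    """Arithmetic index with base-period quantities as weights."""
    return sum(b * q for b, q in zip(p1, q0)) / sum(a * q for a, q in zip(p0, q0))


def paasche(p0, p1, q1):
    """Arithmetic index with current-period quantities as weights."""
    return sum(b * q for b, q in zip(p1, q1)) / sum(a * q for a, q in zip(p0, q1))


def fisher(p0, p1, q0, q1):
    """Geometric mean of Laspeyres and Paasche."""
    return sqrt(laspeyres(p0, p1, q0) * paasche(p0, p1, q1))


def tornqvist(p0, p1, q0, q1):
    """Geometric index weighted by average expenditure shares."""
    v0 = [a * q for a, q in zip(p0, q0)]
    v1 = [b * q for b, q in zip(p1, q1)]
    s0 = [v / sum(v0) for v in v0]
    s1 = [v / sum(v1) for v in v1]
    return exp(sum(0.5 * (a + b) * log(pb / pa)
                   for a, b, pa, pb in zip(s0, s1, p0, p1)))


# Hypothetical matched data for three items.
p0, p1 = [10.0, 20.0, 30.0], [11.0, 19.0, 33.0]
q0, q1 = [5.0, 4.0, 3.0], [4.0, 5.0, 3.0]

L = laspeyres(p0, p1, q0)
P = paasche(p0, p1, q1)
F = fisher(p0, p1, q0, q1)
T = tornqvist(p0, p1, q0, q1)
```

For these illustrative data the Fisher and Törnqvist results agree to roughly four decimal places, while Laspeyres and Paasche differ by about one percentage point.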
We start by outlining the hedonic formulations of the well-known Laspeyres and Paasche indexes. Consider the hedonic function
A hedonic Laspeyres index for matched and unmatched period 0 models is given by:
and a hedonic Paasche index for matched and unmatched period 1 models by:
It is apparent from equations (2) and (3) that a hedonic Laspeyres index holds characteristics constant in the base period and a hedonic Paasche index holds the characteristics constant in the current period. Thus the differences between the hedonic valuations in Laspeyres and Paasche are dictated by the extent to which the characteristics change over time; that is,
Let i ϵ St (t = 0,1) be the set of models available in period t. Let i ϵ SM ≡ S0 ∩ S1 be the set of matched models with common characteristics
Unmatched new models present in period 1, but not in period 0 are given by i ϵ S1 (1¬0); and unmatched old models present in period 0, but not in period 1, by i ϵ S0(0¬1). Let the number of models in these respective sets be denoted by NM, N0 (0¬1) and N1 (1¬0). A hedonic Törnqvist index, for matched models only, is given by the first term on the right-hand-side of equation (4). The hedonic Törnqvist index is generalized to include disappearing and new models by the respective inclusion of the second and third terms on the right-hand-side of equation (4). An alternative and equivalent formulation to equation (4) would be to include only these last two terms, but with the products taken over iϵ S0 and i ϵ S1 respectively. However, we use equation (4) as it provides a more detailed, and analytically useful, decomposition of the price changes of the different sets of models.
A generalized hedonic Törnqvist index is given by:
where relative expenditure shares for model i in period t are given by
Of note is that estimated prices are used for matched models; a good case can be made for using actual prices for matched models when available (Haan, 2004, p. 2). Equation (4) is a (superlative) Törnqvist HI index generalized to include new and disappearing models. In Section II. B. a dummy time hedonic index will be identified as an alternative approach to estimating a generalized Törnqvist hedonic index. The issue addressed by the paper is to identify an expression for the differences between the hedonic imputation and dummy time hedonic approaches. As will be seen, an econometric device is useful in this respect which requires we work with predicted, rather than actual, prices for matched models, although this paper is not alone in this (Pakes, 2003).
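The three-term structure described for equation (4) can be sketched in code as follows. This is a sketch under stated assumptions rather than a transcription of the equation: it assumes that matched models are weighted by the average of their two period shares, that unmatched old and new models are weighted by half their share in the period in which they are sold, and that predicted prices from both hedonic functions are evaluated at each model's own characteristics. The function name, dictionary keys, and data are hypothetical.

```python
from math import exp, log


def generalized_tornqvist_hi(models):
    """Generalized hedonic imputation Törnqvist index (illustrative sketch).

    Each model is a dict with
      pred0, pred1: predicted prices from the period 0 and period 1 hedonic
                    functions, both evaluated at the model's own characteristics;
      s0, s1:       expenditure shares in periods 0 and 1 (0.0 if not sold).
    Returns the index and the log contribution of each set of models.
    """
    parts = {"matched": 0.0, "old": 0.0, "new": 0.0}
    for m in models:
        if m["s0"] > 0 and m["s1"] > 0:
            key = "matched"
        elif m["s0"] > 0:
            key = "old"   # sold in period 0 only
        else:
            key = "new"   # sold in period 1 only
        # Average share; reduces to half the single-period share if unmatched.
        weight = 0.5 * (m["s0"] + m["s1"])
        parts[key] += weight * log(m["pred1"] / m["pred0"])
    return exp(sum(parts.values())), parts


# Hypothetical models: one matched, one disappearing, one new.
models = [
    {"pred0": 100.0, "pred1": 110.0, "s0": 0.6, "s1": 0.5},
    {"pred0": 50.0, "pred1": 55.0, "s0": 0.4, "s1": 0.0},
    {"pred0": 200.0, "pred1": 220.0, "s0": 0.0, "s1": 0.5},
]
index, parts = generalized_tornqvist_hi(models)
```

Returning the three log contributions separately mirrors the decomposition by set of models that the text describes as analytically useful. In the illustrative data every predicted price rises by 10 percent, so the index is 1.1.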
B. Dummy Time Hedonic (DTH) Indexes
Dummy time hedonic (hereafter—DTH) indexes are a second approach to estimating price changes that use hedonic regressions to control for the different quality mix of new and disappearing models. As with HI indexes, DTH indexes do not require a matched sample. In this section we show how the generalized Törnqvist hedonic index in (4) can be estimated as a DTH index. The DTH formulation is similar to equation (1) except that a single regression is estimated on the data in the two time periods, 0 and 1 compared, i ∈ St for t = 0,1. The prices,
The exponent of the estimated coefficient
It may at first be thought that weighted indexes such as the target Törnqvist index cannot be compared with DTH indexes, as in (5), since the latter are unweighted (equally weighted). However, Diewert (2002 and 2005) shows that if a weighted least squares (WLS) estimator is applied to (5), the resulting estimate of price change will correspond to a weighted index number formula. More particularly, the formulation of the weights for the WLS estimator dictates which index number formula the DTH estimate corresponds to. A WLS estimator is equivalent to an OLS estimator applied to data which have been repeated in line with their weight, akin to repeated sampling. A DTH price change estimate based on a WLS estimator, with weights
The regression equation (5) constrains each of the βk coefficients to be the same across the two periods compared. In restricting the slopes to be the same, the (log of the) price change between periods 0 and 1 can be measured at any value of z, as illustrated by the difference between the dashed lines in Figure 1. For convenience it is first evaluated at the origin as
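A minimal sketch of a dummy time hedonic estimate with a WLS estimator is given below for a single characteristic. It also checks numerically the equivalence, noted above, between WLS and OLS applied to observations repeated in line with their weights. The data and the integer weights are hypothetical; they are not the weights that make the estimate correspond to a Törnqvist index.

```python
import numpy as np


def wls(X, y, w):
    """Weighted least squares, computed as OLS on sqrt(w)-scaled rows."""
    sw = np.sqrt(np.asarray(w, dtype=float))
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef


def design(z, period):
    """Columns: intercept, period-1 dummy, characteristic z (common slope)."""
    return np.column_stack([np.ones(len(z)), period, z])


# Hypothetical pooled data for periods 0 and 1 (one characteristic).
z = np.array([1.0, 2.0, 3.0, 2.0, 3.0, 4.0])
period = np.array([0, 0, 0, 1, 1, 1])
noise = np.array([0.02, -0.01, 0.03, -0.02, 0.01, 0.00])
log_p = 1.0 + 0.5 * z + 0.1 * period + noise
weights = np.array([1, 2, 1, 3, 1, 2])  # illustrative integer weights

X = design(z, period)
coef_wls = wls(X, log_p, weights)
dth_index = float(np.exp(coef_wls[1]))  # exponent of the time dummy coefficient

# Same estimate from OLS on rows repeated in line with their weights.
X_rep = np.repeat(X, weights, axis=0)
y_rep = np.repeat(log_p, weights)
coef_rep, *_ = np.linalg.lstsq(X_rep, y_rep, rcond=None)
```

The two coefficient vectors coincide because both estimators minimize the same weighted sum of squared residuals.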
III. Why Hedonic Imputation and Dummy Time Hedonic Indexes Differ
A. Algebraic Differences: A Reformulation of the Hedonic Indexes
There has been little analytical work undertaken on the factors governing differences between the two approaches. To compare the HI approach to the DTH approach we first need to reformulate the HI indexes. We note that the HI approach relies on two estimated hedonic equations,
We assume that the errors in each equation are similarly distributed, then phrase the two equations as a single hedonic regression equation with dummy time intercept and slope variables:
where
For our phrasing of a HI index in (10) to correspond to the generalized hedonic Törnqvist index in (4) two things are required. First, a weighted least squares (WLS) estimator should be used to estimate γ1 from equation (8) with weights
For a Törnqvist HI index the
The required generalized HI Törnqvist index is given by the exponent of:
where
Consider now the DTH index in (5) which constrains
B. How Does a Törnqvist HI Index Differ from a Törnqvist DTH Index?
This difference is first considered by comparing
Expressions for the bias in estimated regression parameters due to the omission of relevant variables are well established (see Davidson and MacKinnon, 1993). For example, consider the regression equation y = β0 + β1 x1 + β2 x2 + u. If x2 is excluded, the bias in the estimate of β1 equals the coefficient on the excluded variable, β2, multiplied by the coefficient α1 on the included variable from an auxiliary regression of the excluded variable on the included variable, i.e., x2 = α0 + α1x1 + ω. Consider a simplified case of (8) with a single (k = 1) characteristic and two time periods, the principles being readily extended. The auxiliary regression is the slope dummy variable,
Omitted variable bias is the product of the estimated coefficient on the omitted variable,
Our next concern is to derive this difference at
However, the (log of the) Törnqvist HI index at
Thus the ratio of the DTH and HI indexes at the intercept is
where
If either of the two terms making up the product on the right-hand-side is close to zero then there will be little difference between the indexes. Neither parameter instability nor a change in the mean characteristic is sufficient in itself to lead to a difference between the formulas. The
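The omitted variable result used in this subsection can be checked numerically. The sketch below uses a generic regression with simulated data (all values hypothetical) rather than the slope dummy setup of equation (8): it confirms that, in sample, the change in the coefficient on x1 when x2 is dropped equals the estimated coefficient on x2 multiplied by the slope from the auxiliary regression of x2 on x1.

```python
import numpy as np


def ols(X, y):
    """OLS coefficients via least squares."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Simulated data; the coefficients below are made up for illustration.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(scale=0.1, size=n)

ones = np.ones(n)
b_long = ols(np.column_stack([ones, x1, x2]), y)  # b0, b1, b2
b_short = ols(np.column_stack([ones, x1]), y)     # x2 omitted
aux = ols(np.column_stack([ones, x1]), x2)        # auxiliary: x2 on x1

# Shift in the x1 coefficient caused by omitting x2.
shift = b_short[1] - b_long[1]
# Product of the omitted variable's coefficient and the auxiliary slope.
product = b_long[2] * aux[1]
```

In the DTH context the omitted variable is the slope dummy, so the same product structure explains why the difference between the approaches depends jointly on the change in coefficients and on the auxiliary-regression term.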
C. Interpretation of
Equation (13) shows us that the change in the coefficients is one factor determining difference between the two methods. The second expression,
If we assume for simplicity that
But what if
by:
for an unweighted regression where N1 and N are the respective number of observations in period 1 and both periods 0 and 1,
First, as noted, if there is either negligible parameter instability or a negligible change in the mean of the characteristic, then there will be little difference between the formulas. However, as parameter instability increases and the change in the mean characteristic increases, the multiplicative effect on the difference between the indexes is compounded. The likely direction and magnitude of any difference is not immediately obvious. Assume diminishing marginal valuations of characteristics, so that
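The first point can be illustrated with a stylized, unweighted, single-characteristic example. The sketch below is not the paper's weighted Törnqvist derivation: it assumes OLS estimation, and it assumes the HI index is evaluated at the midpoint of the two period means of the characteristic. All data are hypothetical.

```python
import numpy as np


def ols(X, y):
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def log_dth(z0, y0, z1, y1):
    """Time dummy coefficient from the pooled, common-slope regression."""
    z = np.concatenate([z0, z1])
    y = np.concatenate([y0, y1])
    d = np.concatenate([np.zeros(len(z0)), np.ones(len(z1))])
    return ols(np.column_stack([np.ones(len(z)), d, z]), y)[1]


def log_hi(z0, y0, z1, y1):
    """Gap between separate period regressions at the midpoint of mean z."""
    a0, b0 = ols(np.column_stack([np.ones(len(z0)), z0]), y0)
    a1, b1 = ols(np.column_stack([np.ones(len(z1)), z1]), y1)
    z_mid = 0.5 * (np.mean(z0) + np.mean(z1))
    return (a1 - a0) + (b1 - b0) * z_mid


def gap(z0, y0, z1, y1):
    return log_dth(z0, y0, z1, y1) - log_hi(z0, y0, z1, y1)


z0 = np.array([1.0, 2.0, 3.0, 4.0])
e0 = np.array([0.01, -0.02, 0.02, -0.01])
e1 = np.array([-0.01, 0.01, 0.02, -0.02])

# (a) Slopes differ, but the mean characteristic is unchanged.
gap_a = gap(z0, 1.0 + 0.5 * z0 + e0, z0, 1.2 + 0.3 * z0 + e1)

# (b) Mean characteristic changes, but slopes are identical (noise-free).
z1_b = z0 + 1.0
gap_b = gap(z0, 1.0 + 0.5 * z0, z1_b, 1.2 + 0.5 * z1_b)

# (c) Both slopes and the mean (and dispersion) of z change.
z1_c = np.array([2.0, 3.0, 5.0, 6.0])
gap_c = gap(z0, 1.0 + 0.5 * z0, z1_c, 1.2 + 0.3 * z1_c)
```

In cases (a) and (b) the gap is zero up to rounding, while in case (c) it is about 0.05 in logs. In this simplified setup the gap also vanishes when the two periods have the same sum of squared deviations of z, which is why case (c) deliberately uses a different dispersion in period 1.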
D. Treatment of Unmatched Observations
Diewert (2002) and Aizcorbe (2003) show that while the DTH and HI indexes will be the same for matched models, they differ in their treatment of unmatched data. Consider hedonic functions
E. Observations With Undue Influence
HI indexes, such as (2) and (3), explicitly incorporate weights. Silver (2002) has shown that weights are implicitly incorporated in DTH indexes by means of the OLS or WLS estimator used. Silver (2002) has further shown, for DTH indexes, that the manner in which the estimator incorporates the weights may not fully represent the weights, due to adverse influence and leverage effects generated by observations with unusual characteristics and above average residuals.
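Leverage of the kind referred to here can be inspected with standard regression diagnostics. The sketch below computes the diagonal of the OLS hat matrix for a hypothetical single-characteristic design; it is a generic diagnostic, not the specific procedure of Silver (2002), and it ignores weights for simplicity.

```python
import numpy as np


def leverage(X):
    """Diagonal of the OLS hat matrix X (X'X)^{-1} X', computed via QR."""
    Q, _ = np.linalg.qr(np.asarray(X, dtype=float))
    return np.sum(Q ** 2, axis=1)


# Hypothetical characteristic values; the last model is unusual.
z = np.array([1.0, 2.0, 3.0, 4.0, 20.0])
X = np.column_stack([np.ones(len(z)), z])
h = leverage(X)
```

The hat values sum to the number of estimated parameters (two here), and the model with the unusual characteristic value accounts for almost all of the leverage attached to the slope. An observation of this kind that also has a large residual can dominate the estimated coefficients out of proportion to its weight.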
F. Chaining
Chained base HI indexes are preferred to fixed base ones, especially when matched samples degrade rapidly. In such a case, their use reduces the spread between Laspeyres and Paasche indexes. However, caution is advised in the use of chained monthly series when prices may oscillate around a trend (i.e. ‘bounce’) and as a result, chained indexes can ‘drift’ (Forsyth and Fowler, 1981 and Szulc, 1983).
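Chain drift under bouncing prices can be seen in a small numerical example. The sketch uses a conventional Laspeyres index on actual prices rather than an HI index, and the prices and quantities are hypothetical: one item goes on sale in period 1, attracts extra purchases, and then returns to its original price.

```python
def laspeyres(p_base, p_cur, q_base):
    """Laspeyres index of current over base prices, base-period quantities."""
    num = sum(p * q for p, q in zip(p_cur, q_base))
    den = sum(p * q for p, q in zip(p_base, q_base))
    return num / den


# Hypothetical data for two items over three periods.
prices = [[1.0, 1.0], [0.5, 1.0], [1.0, 1.0]]
quantities = [[10, 10], [30, 5], [10, 10]]

# Chained index: multiply the successive period-to-period links.
chained = 1.0
for t in range(1, len(prices)):
    chained *= laspeyres(prices[t - 1], prices[t], quantities[t - 1])

# Direct index between the first and last periods.
direct = laspeyres(prices[0], prices[-1], quantities[0])
```

Prices in the final period are identical to those in period 0, so the direct index is 1.0, yet the chained index is 0.75 × 1.75 = 1.3125.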
IV. Choice Between Hedonic Indexes and Dummy Time Hedonic Indexes
The main concern with the DTH index approach as given by equation (5) is that, by construction, it constrains the parameters on the characteristic variables to be the same in both periods. The HI indexes have no such constraint. Berndt and Rappaport (2001) found, for example, that for desktop PCs from 1987 to 1999 the null hypothesis of adjacent-year parameter equality was rejected in all but one case. For mobile PCs the null hypothesis of parameter stability was rejected in eight of the 12 adjacent-year comparisons. Berndt and Rappaport (2001) preferred the use of HI indexes if there was evidence of parameter instability. Pakes (2003), using quarterly data for hedonic regressions for desktop PCs over the period 1995 to 1999, rejected just about any hypothesis on the constancy of the coefficients. He also advocated HI indexes on the grounds that “… since hedonic coefficients vary across periods it [the DTH index approach] has no theoretical justification” (Pakes, 2003, p. 1593).
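One conventional way to test parameter stability between two periods is a Chow-type F test of equal slopes, sketched below. This is a generic illustration and not necessarily the test used by the authors cited above; it assumes homoskedastic errors and unweighted OLS, and the simulated data are hypothetical.

```python
import numpy as np


def rss(X, y):
    """Residual sum of squares from an OLS fit."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)


def slope_stability_f(z0, y0, z1, y1):
    """F statistic for H0: equal slopes in both periods (intercepts free)."""
    y0, y1 = np.asarray(y0, dtype=float), np.asarray(y1, dtype=float)
    z0 = np.asarray(z0, dtype=float).reshape(len(y0), -1)
    z1 = np.asarray(z1, dtype=float).reshape(len(y1), -1)
    k = z0.shape[1]  # number of slope restrictions
    # Unrestricted: separate regressions for each period.
    X0 = np.column_stack([np.ones(len(y0)), z0])
    X1 = np.column_stack([np.ones(len(y1)), z1])
    rss_u = rss(X0, y0) + rss(X1, y1)
    # Restricted: pooled regression with a time dummy and common slopes.
    d = np.concatenate([np.zeros(len(y0)), np.ones(len(y1))])
    X = np.column_stack([np.ones(len(d)), d, np.vstack([z0, z1])])
    rss_r = rss(X, np.concatenate([y0, y1]))
    df = len(d) - 2 * (k + 1)
    return ((rss_r - rss_u) / k) / (rss_u / df)


# Simulated log prices; slopes are equal in one case and differ in the other.
rng = np.random.default_rng(1)
z0, z1 = rng.uniform(1, 4, 30), rng.uniform(1, 4, 30)
e0 = rng.normal(scale=0.05, size=30)
e1 = rng.normal(scale=0.05, size=30)
f_stable = slope_stability_f(z0, 1.0 + 0.5 * z0 + e0, z1, 0.9 + 0.5 * z1 + e1)
f_unstable = slope_stability_f(z0, 1.0 + 0.5 * z0 + e0, z1, 0.9 + 0.2 * z1 + e1)
```

The statistic would be compared with an F distribution with k and N − 2(K + 1) degrees of freedom; a large value, as in the second case, is evidence against the common-slope restriction imposed by the DTH regression.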
The concern over parameter instability for DTH methods is warranted. Consider constraining the estimated coefficients in a DTH index to either
However, equation (15) also showed how the ratio of DTH and HI indexes was not solely dependent on parameter instability. It depended on the exponent of the product of two components: the change over time in the (WLS estimated) hedonic coefficients and the difference in (statistics that relate to) the (weighted) mean values of the characteristic. Even if parameters were unstable, the difference between the indexes may be compounded or mitigated by the change in the other component.
Note that base and current period HI indexes, (2) and (3), can differ as a result of using a constant
Yet there is a sense that in constraining the coefficients to be the same, the DTH index performs a similar averaging function, but with the parameter estimates. Rather than using a base or current period coefficient set, it constrains them to form an average. There is then the question of which form of constraint is preferred: averaging the characteristic set (HI) or the coefficients (DTH)?
There is much in the theory of superlative index numbers that argues for taking a symmetric mean of the characteristic quantities or value shares. The result is that HI indexes fall more neatly into existing index number theory. At least in this sense they are to be preferred, although (15) provides useful insights into their differences.
V. Conclusions
It is recognized that extensive product differentiation with a high model turnover is an increasing feature of product markets (Triplett, 1999). The motivation for this paper lay in the failure of the matched models method to deal adequately with price measurement in this context and the need for hedonic indexes as the most promising alternative (Schultze and Mackie, 2002). First, Section II developed a generalized hedonic Törnqvist index, that is, a Törnqvist index number formula generalized to deal with matched and unmatched models and using hedonic regressions to control for quality changes. Second, the paper considered HI and DTH indexes as the two main approaches to estimating a Törnqvist hedonic index. That the two approaches can yield quite different results is of concern. In Section III the paper provided a formal exposition of the factors underlying the difference between the two approaches. It was shown that differences between the two approaches may arise from both parameter instability and changes in the characteristics, and that such differences are compounded when both occur. It further showed that the two approaches give similar results if there is little change in either component.
Consideration of the issue of choice between the two approaches was based in Section IV on minimizing parameter instability as a concept of spread. The analysis led to the advice that (i) either the DTH or HI index approaches are acceptable if either the parameters are relatively stable or the values of the characteristic set do not change much over time; otherwise, (ii) HI indexes are preferred when there is evidence of parameter instability. Superlative formulations, such as the Törnqvist HI index, are well grounded in index number theory and more intuitively acceptable than a DTH index, which constrains the parameters to be the same, for which there is less obvious justification.
APPENDIX
Hedonic Imputation Index Estimate at a Törnqvist Mean
The required estimate is depicted below in Figure 2 as the vertical difference between the two hedonic functions,
References
Aizcorbe, A., 2003, “The Stability of Dummy Variable Price Measures Obtained from Hedonic Regressions” (Unpublished; Washington: Federal Reserve Board).
Berndt, E.R. and N.J. Rappaport, 2001, “Price and Quality of Desktop and Mobile Personal Computers: A Quarter-Century Historical Overview,” American Economic Review, Vol. 91, No. 2, pp. 268–73.
Boskin, M.J. (Chair), Advisory Commission to Study the Consumer Price Index, 1996, Toward a More Accurate Measure of the Cost of Living, Interim Report to the Senate Finance Committee, Washington D.C.
Cole, R., Y.C. Chen, J.A. Barquin-Stolleman, E. Dulberger, N. Helvacian, and J.H. Hodge, 1986, “Quality-Adjusted Price Indexes for Computer Processors and Selected Peripheral Equipment,” Survey of Current Business, Vol. 66, No. 1, January, pp. 41–50.
Committee on National Statistics, 2002, At What Price? Conceptualizing and Measuring Cost-of-Living and Price Indexes, Panel on Conceptual, Measurement and Other Statistical Issues in Developing Cost-of-Living Indexes, ed. by Charles Schultze and Chris Mackie (Washington: Committee on National Statistics, National Academy Press).
Davidson, R. and J.G. MacKinnon, 1993, Estimation and Inference in Econometrics, (Oxford: Oxford University Press).
Diewert W.E., 2002, “Hedonic Regressions: A Review of Some Unresolved Issues” (Unpublished; Vancouver: Department of Economics, University of British Columbia).
Diewert W.E., 2003, “Hedonic Regressions: A Consumer Theory Approach,” in Scanner Data and Price Indexes, ed. by Mathew Shapiro and Rob Feenstra, National Bureau of Economic Research, Studies in Income and Wealth, Vol. 61 (Chicago: University of Chicago Press) pp. 317–48.
Diewert W.E., 2004, Chapters 15-20 in Consumer Price Index Manual: Theory and Practice, (Geneva: International Labour Office). Available via the Internet at www.ilo.org/public/english/bureau/stat/guides/cpi/index.htm.
Diewert W.E., 2005, “Weighted Country Product Dummy Variable Regressions and Index Number Formulas,” Review of Income and Wealth, Vol. 51, No. 4, December, pp. 561–70.
Forsyth, F.G., and R.F. Fowler, 1981, “The Theory and Practice of Chain Price Index Numbers,” Journal of the Royal Statistical Society, Series A, Vol. 144, No. 2, pp. 224–47.
Griliches, Z., 1988, “Postscript on Hedonics,” in Technology, Education, and Productivity, ed. by Z. Griliches, New York: Basil Blackwell, Inc., 119–122.
Haan, J. de, 2003, “Time Dummy Approaches to Hedonic Price Measurement,” Paper presented at the Seventh Meeting of the International Working Group on Price Indices, May, 27–29 (Paris: INSEE). Available via the Internet at http://www.ottawagroup.org/meet.shtml.
Haan, J. de, 2004, “Hedonic Regressions: The Time Dummy Index as a Special Case of the Törnqvist Index, Time Dummy Approaches to Hedonic Price Measurement,” Paper presented at the Eighth Meeting of the International Working Group on Price Indices, August, 23–25 (Helsinki: Statistics Finland). Available via the Internet at http://www.ottawagroup.org/meet.shtml.
Pakes A., 2003, “A Reconsideration of Hedonic Price Indexes with an Application to PCs,” The American Economic Review, Vol. 93, No. 5, December, pp. 1576–93.
Rosen, S, 1974, “Hedonic Prices and Implicit Markets: Product Differentiation and Pure Competition,” Journal of Political Economy, Vol. 82, pp. 34–49.
Schultze, C. and C. Mackie, 2002, (editors), see Committee on National Statistics, 2002, op. cit.
Silver, M., 2002, “The Use of Weights in Hedonic Regressions: the Measurement of Quality-Adjusted Price Changes” (Unpublished; Cardiff: Cardiff Business School, Cardiff University).
Silver, M. and S. Heravi, 2001, “Scanner Data and the Measurement of Inflation,” The Economic Journal, Vol. 111, June, pp. 384–405.
Silver, M. and S. Heravi, 2003, “The Measurement of Quality-Adjusted Price Changes,” in Scanner Data and Price Indexes, ed. by M. Shapiro and R. Feenstra, National Bureau of Economic Research, Studies in Income and Wealth, Vol. 61 (Chicago: University of Chicago Press) pp. 277–317.
Silver, M. and S. Heravi, 2005, “Why the CPI Matched Models Method May Fail Us: Results from an Hedonic and Matched Experiment Using Scanner Data,” Journal of Business and Economic Statistics, Vol. 23, No. 3, pp. 269-81.
Stigler, G., 1961, “The Price Statistics of the Federal Government,” Report to the Office of Statistical Standards, Bureau of the Budget (New York: NBER).
Szulc, B. J., 1983, “Linking Price Index Numbers,” in Price Level Measurement, pp. 537–66, (Ottawa: Statistics Canada).
Triplett, J.E., 1987, “Hedonic Functions and Hedonic Indexes,” in The New Palgrave: A Dictionary of Economics, ed. by J. Eatwell, M. Milgate, and P. Newman (New York: Stockton Press) pp. 630–634.
Triplett, J.E., 1999, “The Solow Productivity Paradox: What do Computers do to Productivity?,” Canadian Journal of Economics, Vol. 32, No. 2, April, pp. 309–34.
Triplett, J.E., 2004, Handbook on Hedonic Indexes and Quality Adjustments in Price Indexes, Directorate for Science, Technology and Industry (Paris: Organisation for Economic Cooperation and Development).
van Garderen K.R. and C. Shah, 2002, “Exact Interpretation of Dummy Variables in Semi-Logarithmic Equations,” Econometric Journal, Vol. 5, No. 1, pp. 149–59.
Cardiff University. We acknowledge useful comments from Paul Armknecht (IMF), Ernst Berndt (MIT), Erwin Diewert (University of British Columbia), Kevin Fox (UNSW), Jan de Haan (Statistics Netherlands), and two anonymous referees for the Journal of Business and Economic Statistics.

