An Index Number Formula Problem
The Aggregation of Broadly Comparable Items
Author: Mick Silver

## Abstract

Index number theory informs us that if data on matched prices and quantities are available, a superlative index number formula is best to aggregate heterogeneous items, and a unit value index to aggregate homogeneous ones. The formulas can give very different results. Neglected is the practical case of broadly comparable items. This paper provides a formal analysis as to why such formulas differ and proposes a solution to this index number problem.

## I. Introduction

1. There is a consensus as to which price index number formula is best when price and quantity/value information are available for the aggregation of heterogeneous items. The economic theoretic approach to index number formulas supports superlative index numbers, primarily the Fisher, Törnqvist, and Walsh indexes, all of which give similar answers. The axiomatic approach supports the Fisher index. Such findings are part of the internationally accepted manuals on consumer, producer, and (forthcoming) trade price indexes: ILO et al. (2004a and 2004b). What is less well known is that for the aggregation of homogeneous items the unit value index is the best formula and superlative price index numbers are biased, whereas for the aggregation of heterogeneous items superlative price index numbers are best and the unit value index is biased. This paper provides a formal decomposition of the difference between a unit value index and the Fisher, and other, price indexes, and addresses the question of which formula is appropriate for differentiated, broadly comparable products that lie on the continuum between homogeneous and heterogeneous products.

2. The bias in superlative index numbers for homogeneous goods (or services) is a neglected and important index number issue. Say, for example, the price of good A was 10 in both the reference and current period and the price of good B was 12 in both periods, but there was a shift in quantities from, say, 6 for both A and B in the reference period to 8 for A and 4 for B in the current period. A superlative, or any other index number formula for heterogeneous goods, would give an answer of unity: no overall price change. However, the correct answer for homogeneous goods would be a unit value fall of 3 per cent (the average price paid falls from 11 to about 10.67), appropriately reflecting the shift in the quantity basket in the current period from the higher price level of 12 for B to the lower price level of 10 for A. The good, and the cost of living with regard to this good, is, on average, now cheaper. The CPI Manual (ILO et al. 2004a, Chapter 20) and 2008 System of National Accounts (SNA) advocate the use of unit value indexes for homogeneous goods and services:

“When there is price variation for the same quality of good or service, the price relatives used for index number calculation should be defined as the ratio of the weighted average price of that good or service in the two periods, the weights being the relative quantities sold at each price. Suppose, for example, that a certain quantity of a particular good or service is sold at a lower price to a particular category of purchaser without any difference whatsoever in the nature of the good or service offered, location, timing or conditions of sale, or other factors. A subsequent decrease in the proportion sold at the lower price raises the average price paid by purchasers for quantities of a good or service whose quality is the same and remains unchanged, by assumption. It also raises the average price received by the seller without any change in quality. This must be recorded as a price and not a volume increase.” Commission of the European Communities and others, (2008), 2008 SNA, paragraph 15.69.
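The two-good example in paragraph 2 can be checked numerically. The following is a minimal sketch in Python, not part of the original paper; the function names `fisher` and `unit_value_index` are illustrative choices, and the data are the prices (10 for A, 12 for B) and quantities (6 and 6 moving to 8 and 4) given in the text.

```python
def fisher(p0, pt, q0, qt):
    """Fisher price index: geometric mean of Laspeyres and Paasche."""
    laspeyres = sum(a * b for a, b in zip(pt, q0)) / sum(a * b for a, b in zip(p0, q0))
    paasche = sum(a * b for a, b in zip(pt, qt)) / sum(a * b for a, b in zip(p0, qt))
    return (laspeyres * paasche) ** 0.5


def unit_value_index(p0, pt, q0, qt):
    """Ratio of quantity-weighted average prices in period t and period 0."""
    unit_value_t = sum(p * q for p, q in zip(pt, qt)) / sum(qt)
    unit_value_0 = sum(p * q for p, q in zip(p0, q0)) / sum(q0)
    return unit_value_t / unit_value_0


# Goods A and B: prices unchanged, quantities shift towards the cheaper good A.
p0 = [10.0, 12.0]
pt = [10.0, 12.0]
q0 = [6.0, 6.0]
qt = [8.0, 4.0]

fisher_index = fisher(p0, pt, q0, qt)
uv_index = unit_value_index(p0, pt, q0, qt)
print(f"Fisher: {fisher_index:.4f}, unit value: {uv_index:.4f}")
```

With unchanged prices the Fisher index is unity, while the unit value index equals 128/132, a fall of about 3 per cent, because the average price paid moves from 11 to roughly 10.67.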

3. Index number theory recognizes that the appropriateness of each formula depends on whether the items aggregated are homogeneous or otherwise: see Diewert (1995) and Balk (1998 and 2005). As matters stand, the advice is simply to determine whether or not items are homogeneous and apply the appropriate formula. But what if the goods and services are broadly comparable, that is, they are of different qualities such that some of the price dispersion is due to product differentiation and some is due to, say, search costs or price discrimination? A superlative index would ignore any shift to higher or lower average (quality-adjusted) price levels, but a unit value index would wrongly treat changes in the compositional mix of items of different quality as price changes, the familiar unit value bias. Given that these formulas will generally give quite different answers, it is important to determine why they differ, the conditions under which each is suitable, and what to do when, as is likely to be the case, neither is.

4. The rationale for unit value and superlative indexes is outlined in Section II. Section III provides a formal analysis of how unit value and Fisher price indexes differ. In Section IV a solution is proposed to the problem of aggregating broadly comparable goods. An application using scanner data is provided in Section V, with conclusions in Section VI.

5. The application of the results of this paper is in the determination of price and volume measures at the national and micro level for economic aggregates. It applies to consumer, commodity, producer (input and output), import, and export price indexes, as well as price indexes of capital goods, such as house price indexes. Since price indexes are used as deflators there is a concomitant application to volume indexes. The concern is with aggregation where price and quantity/value information is available for broadly comparable items, for example, for measuring the aggregate price and volume change of different qualities of automobiles, but not over automobiles and beef.

## II. Superlative and Unit Value Indexes

### A. Superlative Index Numbers

6. The Fisher, $P_F$, and Törnqvist, $P_T$, index number formulas are both commonly used superlative indexes. The Fisher price index is a geometric mean of the Laspeyres, $P_L$, and Paasche, $P_P$, price indexes and is defined for a price comparison between the current period t and a reference period 0, over $m=1,\dots ,M$ matched items whose respective prices and quantities are given by ${p}_{m}^{t}$ and ${q}_{m}^{t}$ for period t, and ${p}_{m}^{0}$ and ${q}_{m}^{0}$ for period 0, by:

$\begin{array}{ccc}{P}_{F}\equiv \sqrt{\frac{\sum _{m=1}^{M}{p}_{m}^{t}{q}_{m}^{0}}{\sum _{m=1}^{M}{p}_{m}^{0}{q}_{m}^{0}}×\frac{\sum _{m=1}^{M}{p}_{m}^{t}{q}_{m}^{t}}{\sum _{m=1}^{M}{p}_{m}^{0}{q}_{m}^{t}}}=\sqrt{{P}_{L}×{P}_{P}}& \phantom{\rule{7.0em}{0ex}}& \left(1\right)\end{array}$

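7. The Törnqvist price index is defined as:

$\begin{array}{ccc}{P}_{T}\equiv \prod _{m=1}^{M}{\left(\frac{{p}_{m}^{t}}{{p}_{m}^{0}}\right)}^{\left({s}_{m}^{0}+{s}_{m}^{t}\right)/2}& \phantom{\rule{7.0em}{0ex}}& \left(2\right)\end{array}$

where ${s}_{m}^{\tau }={p}_{m}^{\tau }{q}_{m}^{\tau }/\sum _{m=1}^{M}{p}_{m}^{\tau }{q}_{m}^{\tau }$ are the value shares in period τ = 0, t.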

8. Both $P_F$ and $P_T$ make symmetric use of each period’s price and quantity information. Diewert (1976 and 1978), from an approach based on economic theory, demonstrated that both Fisher and Törnqvist indexes belong to a class of superlative indexes that have the desirable property of incorporating substitution effects, that is, the effect of consumers substituting their basket of goods towards those with relatively low price increases, thus lowering the cost of living. Laspeyres and Paasche indexes are fixed (quantity) basket price indexes and allow for no such substitution.
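To make the symmetric treatment of the two periods concrete, the sketch below computes Laspeyres, Paasche, Fisher, and Törnqvist indexes for a small data set. It is an illustration only: the three-item prices and quantities are hypothetical, the function names are arbitrary, and the Törnqvist formula used is the standard share-weighted geometric mean of price relatives.

```python
import math


def value_shares(p, q):
    """Value shares p_m * q_m / sum(p * q) for one period."""
    total = sum(pi * qi for pi, qi in zip(p, q))
    return [pi * qi / total for pi, qi in zip(p, q)]


def laspeyres(p0, pt, q0):
    """Fixed base-period basket."""
    return sum(a * b for a, b in zip(pt, q0)) / sum(a * b for a, b in zip(p0, q0))


def paasche(p0, pt, qt):
    """Fixed current-period basket."""
    return sum(a * b for a, b in zip(pt, qt)) / sum(a * b for a, b in zip(p0, qt))


def fisher(p0, pt, q0, qt):
    """Geometric mean of Laspeyres and Paasche."""
    return math.sqrt(laspeyres(p0, pt, q0) * paasche(p0, pt, qt))


def tornqvist(p0, pt, q0, qt):
    """Geometric mean of price relatives weighted by average value shares."""
    s0 = value_shares(p0, q0)
    st = value_shares(pt, qt)
    log_index = sum(
        0.5 * (a + b) * math.log(new / old)
        for a, b, old, new in zip(s0, st, p0, pt)
    )
    return math.exp(log_index)


# Hypothetical data: quantities move away from the item whose price rose.
p0 = [10.0, 12.0, 15.0]
pt = [11.0, 12.0, 14.0]
q0 = [6.0, 6.0, 3.0]
qt = [5.0, 6.0, 4.0]

p_l = laspeyres(p0, pt, q0)
p_p = paasche(p0, pt, qt)
p_f = fisher(p0, pt, q0, qt)
p_t = tornqvist(p0, pt, q0, qt)
print(f"Laspeyres {p_l:.5f}  Paasche {p_p:.5f}  Fisher {p_f:.5f}  Tornqvist {p_t:.5f}")
```

In this example the two superlative indexes lie between the Laspeyres and Paasche values and are very close to each other, in line with the statement that superlative formulas give similar answers.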

9. In the test or axiomatic approach, desirable properties for an index number are chosen and different formulas are evaluated against them. Fisher described his index as “ideal” because it satisfied the tests proposed, including the “time reversal” and “factor reversal” tests. In practice, Laspeyres-type indexes are often calculated because data on current period information are unavailable in real time. The arguments presented in this paper apply as much to the use of unit value indexes against Laspeyres-type price index formulas as against superlative index number formulas.

### B. Unit Value Indexes

10. A unit value index, $P_U$, is given by:

$\begin{array}{ccc}{P}_{U}\equiv \left(\frac{\sum _{m=1}^{M}{p}_{m}^{t}{q}_{m}^{t}}{\sum _{m=1}^{M}{q}_{m}^{t}}\right)/\left(\frac{\sum _{m=1}^{M}{p}_{m}^{0}{q}_{m}^{0}}{\sum _{m=1}^{M}{q}_{m}^{0}}\right).& \phantom{\rule{7.0em}{0ex}}& \left(3\right)\end{array}$

11. If the items whose prices are being aggregated are identical—that is, perfectly homogeneous—a unit value index has desirable properties. Balk (2005) identified it as the target index for homogeneous goods.

12. Consider the case where the exact same item is sold at different prices during the same period, say lower sales and higher prices in the first week of the month and higher sales and lower prices in the last week of the month. The unit value for the monthly index solves the time aggregation problem and appropriately gives more weight to the lower prices than the higher ones in the aggregate. Further, if the elementary unit value index in equation (3) is used as a price index to deflate a corresponding change in the value, the result is a change in total quantity which is intuitively appropriate, i.e.

$\begin{array}{ccc}\frac{\sum _{m=1}^{M}{p}_{m}^{t}{q}_{m}^{t}}{\sum _{m=1}^{M}{p}_{m}^{0}{q}_{m}^{0}}/\left[\left(\frac{\sum _{m=1}^{M}{p}_{m}^{t}{q}_{m}^{t}}{\sum _{m=1}^{M}{q}_{m}^{t}}\right)/\left(\frac{\sum _{m=1}^{M}{p}_{m}^{0}{q}_{m}^{0}}{\sum _{m=1}^{M}{q}_{m}^{0}}\right)\right]=\frac{\sum _{m=1}^{M}{q}_{m}^{t}}{\sum _{m=1}^{M}{q}_{m}^{0}}& \phantom{\rule{7.0em}{0ex}}& \left(4\right)\end{array}$

13. Note that the summation of quantities in the top and bottom of the right-hand side of equation (4) must be of the exact same type of item for the expression to make sense.
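The deflation property in equation (4) is an algebraic identity and can be confirmed with a few lines of Python. This is a sketch with hypothetical data for a single homogeneous item sold at two different prices in each period; the variable names are illustrative.

```python
# Hypothetical data: the same item sold at two prices in each period.
p0 = [10.0, 12.0]
pt = [11.0, 12.5]
q0 = [6.0, 6.0]
qt = [9.0, 5.0]

value_0 = sum(p * q for p, q in zip(p0, q0))
value_t = sum(p * q for p, q in zip(pt, qt))

# Change in total value between the two periods.
value_ratio = value_t / value_0

# Unit value index: ratio of quantity-weighted average prices.
unit_value_index = (value_t / sum(qt)) / (value_0 / sum(q0))

# Deflating the value change by the unit value index ...
deflated_volume = value_ratio / unit_value_index

# ... gives the change in the simple sum of quantities.
quantity_ratio = sum(qt) / sum(q0)

print(deflated_volume, quantity_ratio)
```

The two printed figures coincide up to floating-point rounding, which is only meaningful when the quantities being summed refer to the same type of item.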

14. Balk (1998) showed that the unit value index does not satisfy: (i) the Proportionality Test, $P\left(p,\lambda p,{q}^{0},{q}^{t}\right)=\lambda$ for $\lambda >0$; that is, if all prices are multiplied by the positive number λ, then the new price index is λ. The unit value index only satisfies the proportionality test in the unlikely event that relative quantities do not change; (ii) the Identity or Constant Prices Test, $P\left(p,p,{q}^{0},{q}^{t}\right)=1$; that is, if the price of every good is identical during the two periods, then the price index should equal unity, no matter what the quantity vectors are. The unit value index only satisfies the identity test if relative quantities, that is, the composition of the products compared, do not change; and (iii) the Invariance to Changes in the Units of Measurement (commensurability) Test: $P\left({\alpha }_{1}{p}_{1}^{0},\dots ,{\alpha }_{n}{p}_{n}^{0};{\alpha }_{1}{p}_{1}^{t},\dots ,{\alpha }_{n}{p}_{n}^{t};{\alpha }_{1}^{-1}{q}_{1}^{0},\dots ,{\alpha }_{n}^{-1}{q}_{n}^{0};{\alpha }_{1}^{-1}{q}_{1}^{t},\dots ,{\alpha }_{n}^{-1}{q}_{n}^{t}\right)=P\left({p}_{1}^{0},\dots ,{p}_{n}^{0};{p}_{1}^{t},\dots ,{p}_{n}^{t};{q}_{1}^{0},\dots ,{q}_{n}^{0};{q}_{1}^{t},\dots ,{q}_{n}^{t}\right)$ for all ${\alpha }_{1}>0,\dots ,{\alpha }_{n}>0$; that is, the price index does not change if the units of measurement for each product are changed. Changes in units de facto arise when the quality of items changes. However, the commensurability test is satisfied in the homogeneous case, when items are identical. Moreover, these tests were devised for the aggregation of heterogeneous items and are not meaningful for homogeneous items. For example, in the introduction we outlined the case where prices do not change, but a shift in quantities switches the average price to a lower level, leading to a fall in the overall price level: there is a meaningful failure of the identity test.
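The proportionality and commensurability tests can be illustrated numerically. The sketch below is not from the paper: the data are hypothetical, the scaling factor and unit conversion are arbitrary choices, and the Fisher index is included only as a benchmark that passes both tests.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def unit_value_index(p0, pt, q0, qt):
    return (dot(pt, qt) / sum(qt)) / (dot(p0, q0) / sum(q0))


def fisher(p0, pt, q0, qt):
    laspeyres = dot(pt, q0) / dot(p0, q0)
    paasche = dot(pt, qt) / dot(p0, qt)
    return (laspeyres * paasche) ** 0.5


q0 = [6.0, 6.0]
qt = [8.0, 4.0]  # relative quantities change between periods

# Proportionality test: every price is multiplied by lam.
p = [10.0, 12.0]
lam = 2.0
scaled = [lam * x for x in p]
uv_proportional = unit_value_index(p, scaled, q0, qt)
fisher_proportional = fisher(p, scaled, q0, qt)

# Commensurability test: item 1 is re-measured in half-units
# (price per unit halves, number of units doubles).
p0 = [10.0, 12.0]
pt = [11.0, 12.0]
alpha = [0.5, 1.0]
p0_rescaled = [a * x for a, x in zip(alpha, p0)]
pt_rescaled = [a * x for a, x in zip(alpha, pt)]
q0_rescaled = [x / a for a, x in zip(alpha, q0)]
qt_rescaled = [x / a for a, x in zip(alpha, qt)]

uv_original = unit_value_index(p0, pt, q0, qt)
uv_rescaled = unit_value_index(p0_rescaled, pt_rescaled, q0_rescaled, qt_rescaled)
fisher_original = fisher(p0, pt, q0, qt)
fisher_rescaled = fisher(p0_rescaled, pt_rescaled, q0_rescaled, qt_rescaled)

print("proportionality:", uv_proportional, fisher_proportional)
print("commensurability:", uv_original, uv_rescaled, fisher_original, fisher_rescaled)
```

The Fisher index returns λ under proportional price changes and is unaffected by the change of units, whereas the unit value index departs from both when the quantity mix changes, which is the behaviour described in the text for heterogeneous items.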

15. Bradley (2005) takes a cost-of-living index defined in economic theory and compares the bias that results from using unit values as “plug-ins” for prices. He finds that if there is no price dispersion in either the current or reference period compared (the homogeneous case), the unit value (plug-in) index will not be biased against the theoretical index.

16. There is a literature on bias in import and export price indexes that use unit value indexes for goods in a commodity group used by customs documents as proxies for price changes. Such groups can be too widely defined to ensure homogeneity, and the findings are that such unit value indexes substantially misrepresent price changes due to compositional changes in quantities and quality mix of what is exported and imported in the category concerned: see Angermann (1980), Alterman (1991), Ruffles and Williamson (1997), and Silver (2008). It is necessary to consider the factors determining such differences.

## III. The Difference Between a Unit Value and a Fisher Index

17. Párniczky (1974) and Balk (1998) respectively compare unit value indexes to the Paasche and Fisher price indexes. These seminal decompositions, while useful, undertook the decomposition in terms of quantity-weighted covariances of changes. However, quantity weighting implicitly assumes homogeneity, thus negating the analysis for this purpose and, further, the decompositions do not distinguish switches to levels. Both are issues of concern here. We provide a new decomposition.

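18. We first define Laspeyres and Paasche price indexes, $P_L$ and $P_P$, and a Laspeyres quantity index, $Q_L$, respectively as:

$\begin{array}{ccc}{P}_{L}=\sum _{m=1}^{M}{s}_{m}^{0}{x}_{m};\phantom{\rule{1em}{0ex}}{P}_{P}=\frac{\sum _{m=1}^{M}{s}_{m}^{0}{x}_{m}{y}_{m}}{\sum _{m=1}^{M}{s}_{m}^{0}{y}_{m}};\phantom{\rule{1em}{0ex}}{Q}_{L}=\sum _{m=1}^{M}{s}_{m}^{0}{y}_{m}& \phantom{\rule{7.0em}{0ex}}& \left(5\right)\end{array}$

where ${s}_{m}^{0}$ were defined in equation (2) above as period 0 value shares, ${x}_{m}$ is the mth price relative, and ${y}_{m}$ the mth quantity relative, defined as ${x}_{m}={p}_{m}^{t}/{p}_{m}^{0}$ and ${y}_{m}={q}_{m}^{t}/{q}_{m}^{0}$ for the $m=1,\dots ,M$ matched items. It follows from equations (1), (3), and (5) that the ratio of a unit value index to a Fisher price index is given by: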

$\begin{array}{ccc}\frac{{P}_{UV}}{{P}_{F}}=\frac{{P}_{UV}}{{P}_{L}}×\frac{{P}_{L}}{{P}_{F}}=\frac{{P}_{UV}}{{P}_{L}}×{\left[\frac{{P}_{L}}{{P}_{P}}\right]}^{\frac{1}{2}}& \phantom{\rule{7.0em}{0ex}}& \left(6\right)\end{array}$

$\begin{array}{ccc}\frac{{P}_{UV}}{{P}_{L}}=\left[\frac{\sum _{m=1}^{M}{p}_{m}^{t}{q}_{m}^{t}}{\sum _{m=1}^{M}{p}_{m}^{t}{q}_{m}^{0}}×\frac{\sum _{m=1}^{M}{q}_{m}^{0}}{\sum _{m=1}^{M}{q}_{m}^{t}}\right]=\frac{\sum _{m=1}^{M}{s}_{m}^{0}{x}_{m}{y}_{m}}{\sum _{m=1}^{M}{s}_{m}^{0}{x}_{m}}/\frac{\sum _{m=1}^{M}{q}_{m}^{t}}{\sum _{m=1}^{M}{q}_{m}^{0}}=\frac{\left({\rho }_{x,y}^{{s}_{0}}c{\nu }^{{s}_{0}}\left(x\right)c{\nu }^{{s}_{0}}\left(y\right)+1\right){Q}_{L}}{\sum _{m=1}^{M}{q}_{m}^{t}/\sum _{m=1}^{M}{q}_{m}^{0}}.& \phantom{\rule{7.0em}{0ex}}& \left(7\right)\end{array}$

where ${\rho }_{x,y}^{{s}_{0}}$ is the ${s}_{m}^{0}$-weighted correlation coefficient between the price relatives and quantity relatives, ${x}_{m}$ and ${y}_{m}$, and $c{\nu }^{{s}_{0}}\left(x\right)$ and $c{\nu }^{{s}_{0}}\left(y\right)$ are their respective ${s}_{m}^{0}$-weighted coefficients of variation, i.e. ${\sigma }_{x}/\overline{x}$ and ${\sigma }_{y}/\overline{y}$. It follows that

$\begin{array}{ccc}\frac{{Q}_{L}}{\sum _{m=1}^{M}{q}_{m}^{t}/\sum _{m=1}^{M}{q}_{m}^{0}}=\frac{\sum _{m=1}^{M}{p}_{m}^{0}{q}_{m}^{t}/\sum _{m=1}^{M}{q}_{m}^{t}}{\sum _{m=1}^{M}{p}_{m}^{0}{q}_{m}^{0}/\sum _{m=1}^{M}{q}_{m}^{0}}=\frac{\left({\rho }_{{p}^{0},{q}^{t}}\mathit{c\nu }\left({p}^{0}\right)\mathit{c\nu }\left({q}^{t}\right)+1\right)}{\left({\rho }_{{p}^{0},{q}^{0}}\mathit{c\nu }\left({p}^{0}\right)\mathit{c\nu }\left({q}^{0}\right)+1\right)}=\frac{\frac{\mathrm{cov}\left({p}^{0},{q}^{t}\right)}{{\overline{p}}^{0}{\overline{q}}^{t}}+1}{\frac{\mathrm{cov}\left({p}^{0},{q}^{0}\right)}{{\overline{p}}^{0}{\overline{q}}^{0}}+1}=\frac{\frac{{\stackrel{^}{\beta }}^{t}c\nu \left({p}^{0}\right)}{{\overline{q}}^{t}}+1}{\frac{{\stackrel{^}{\beta }}^{0}c\nu \left({p}^{0}\right)}{{\overline{q}}^{0}}+1}=\frac{{\overline{q}}^{0}}{{\overline{q}}^{t}}\frac{{\stackrel{^}{\beta }}^{t}}{{\stackrel{^}{\beta }}^{0}}& \phantom{\rule{7.0em}{0ex}}& \left(8\right)\end{array}$

where ${\rho }_{{p}^{0},{q}^{t}}$, and ${\rho }_{{p}^{0},{q}^{0}}$ are the correlation coefficients between ${p}_{m}^{0}$ and ${q}_{m}^{t}$ and ${p}_{m}^{0}$ and ${q}_{m}^{0}$ respectively, cov(.) the covariances, and ${\stackrel{^}{\beta }}^{t}$ and ${\stackrel{^}{\beta }}^{0}$ the estimated OLS slope coefficients from the regressions ${q}_{m}^{t}=\alpha +{\beta }^{t}{p}_{m}^{0}+{\epsilon }_{m}$ and ${q}_{m}^{0}=\alpha \prime +{\beta }^{0}{p}_{m}^{0}+{\epsilon }_{m}^{\prime }$ respectively. Substituting (8) into (7):

$\begin{array}{ccc}\frac{{P}_{UV}}{{P}_{L}}=\left({\rho }_{x,y}^{{s}_{0}}\mathit{c\nu }\left(x\right)\mathit{c\nu }\left(y\right)+1\right)\left(\frac{{\overline{q}}^{0}}{{\overline{q}}^{t}}\frac{{\stackrel{^}{\beta }}^{t}}{{\stackrel{^}{\beta }}^{0}}\right).& \phantom{\rule{7.0em}{0ex}}& \left(9\right)\end{array}$

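The substitution effect between Fisher and Laspeyres, and Paasche and Fisher, is:

$\begin{array}{ccc}\frac{{P}_{F}}{{P}_{L}}=\frac{{P}_{P}}{{P}_{F}}={\left({\rho }_{x,y}^{{s}_{0}}c{\nu }^{{s}_{0}}\left(x\right)c{\nu }^{{s}_{0}}\left(y\right)+1\right)}^{\frac{1}{2}}& \phantom{\rule{7.0em}{0ex}}& \left(11\right)\end{array}$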

Substituting (10) and (11) into (6) yields a unit value to Fisher index ratio:

$\begin{array}{l}\frac{{P}_{UV}}{{P}_{F}}=\frac{{P}_{UV}}{{P}_{L}}×{\left[\frac{{P}_{L}}{{P}_{P}}\right]}^{\frac{1}{2}}=\left({\rho }_{x,y}^{{s}_{0}}c{v}^{{s}_{0}}\left(x\right)c{v}^{{s}_{0}}\left(y\right)+1\right)\left(\frac{{\overline{q}}^{0}}{{\overline{q}}^{t}}\frac{{\stackrel{^}{\beta }}^{t}}{{\stackrel{^}{\beta }}^{0}}\right)×{\left[\frac{1}{\left({\rho }_{x,y}^{{s}_{0}}c{v}^{{s}_{0}}\left(x\right)c{v}^{{s}_{0}}\left(y\right)+1\right)}\right]}^{\frac{1}{2}}\\ \begin{array}{ccc}={\left({\rho }_{x,y}^{{s}_{0}}c{v}^{{s}_{0}}\left(x\right)c{v}^{{s}_{0}}\left(y\right)+1\right)}^{\frac{1}{2}}\left(\frac{{\overline{q}}^{0}}{{\overline{q}}^{t}}\frac{{\stackrel{^}{\beta }}^{t}}{{\stackrel{^}{\beta }}^{0}}\right).& \phantom{\rule{7.0em}{0ex}}& \left(12\right)\end{array}& & & \end{array}$

19. First, the difference between a unit value index and a Fisher price index can be seen to depend on the two factors in (12). The first is the substitution effect between Fisher and Laspeyres, as given by (11), ${\left({\rho }_{x,y}^{{s}_{0}}c{\nu }^{{s}_{0}}\left(x\right)c{\nu }^{{s}_{0}}\left(y\right)+1\right)}^{\frac{1}{2}}$, also equal to the substitution bias between Paasche and Fisher indexes, since $\frac{{P}_{F}}{{P}_{L}}=\frac{{P}_{P}}{{P}_{F}}$.

20. The second factor is the levels effect, that is, the effect, for negatively sloping demand, of quantities shifting towards lower-priced transactions. The measure is based on the ratio of the slope coefficients from the regressions of ${q}_{m}^{t}$ on ${p}_{m}^{0}$ and of ${q}_{m}^{0}$ also on ${p}_{m}^{0}$. The two lines have a common ${\overline{p}}^{0}$ by construction. Assume for reasons of exposition that they also have a common ${\overline{q}}^{t}={\overline{q}}^{0}$. Thus, as depicted in Figure 1, as the slope changes for the period t regression, the line for ${q}_{m}^{t}$ rotates about the means ${\overline{q}}^{t}={\overline{q}}^{0}$ and ${\overline{p}}^{0}$. As the slope of the period t line, say, increases, above-average prices have lower quantities and below-average prices have higher quantities; the larger the increase, the greater the shift. The change in slopes captures a shift in levels.

21. Note this requires the two lines to intersect at the means, which they may not since it is likely that ${\overline{q}}^{{t}^{*}}\ne {\overline{q}}^{0}$, the two lines having different intercepts. However, the adjustment required to bring ${\overline{q}}^{{t}^{*}}$ (in this depiction down) to ${\overline{q}}^{0}$ at ${\overline{p}}^{0}$ is $\frac{{\overline{q}}^{0}}{{\overline{q}}^{t}}$. The measure in (12) includes a ratio of ${\stackrel{^}{\beta }}^{t}$ to ${\stackrel{^}{\beta }}^{0}$ with such an adjustment. Equation (12) successfully decomposes the difference between the formulas into these two effects, as we will illustrate in the empirical section.

22. Second, it is also apparent from (12) that for the unit value index to equal the Fisher price index it is necessary that: all price changes OR quantity changes are equal to each other, OR there is no (weighted) correlation between the base-period-weighted price and quantity changes AND mean quantities remain unchanged with no change to the demand slopes. These are extreme conditions. Having no dispersion in price or quantities or their changes is a negation of the index number problem, and while we do not expect the laws of economics to work perfectly, there is expected to be some relationship between price and quantity changes. There is further required in (12) an assumption of a static economy with parameter stability for demand functions and no change in average quantities purchased.
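The two-factor structure of the decomposition can be verified numerically. The sketch below uses hypothetical data and illustrative variable names. It computes the substitution term from the value-share-weighted correlation and coefficients of variation of the price and quantity relatives, and, as an assumption made for exactness, it measures the levels effect by the ratio of quantity-weighted mean base-period prices (the second expression in equation (8)) rather than by the regression-slope form.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Hypothetical matched-item data.
p0 = [10.0, 12.0, 15.0, 20.0]
pt = [11.0, 12.0, 14.0, 21.0]
q0 = [6.0, 6.0, 3.0, 2.0]
qt = [5.0, 7.0, 4.0, 1.0]

laspeyres = dot(pt, q0) / dot(p0, q0)
paasche = dot(pt, qt) / dot(p0, qt)
fisher = math.sqrt(laspeyres * paasche)
unit_value = (dot(pt, qt) / sum(qt)) / (dot(p0, q0) / sum(q0))

# Period 0 value shares, price relatives x and quantity relatives y.
s0 = [p * q / dot(p0, q0) for p, q in zip(p0, q0)]
x = [new / old for new, old in zip(pt, p0)]
y = [new / old for new, old in zip(qt, q0)]

# Share-weighted moments of the relatives.
x_bar = dot(s0, x)  # equals the Laspeyres price index
y_bar = dot(s0, y)  # equals the Laspeyres quantity index
sd_x = math.sqrt(sum(s * (xi - x_bar) ** 2 for s, xi in zip(s0, x)))
sd_y = math.sqrt(sum(s * (yi - y_bar) ** 2 for s, yi in zip(s0, y)))
cov_xy = sum(s * (xi - x_bar) * (yi - y_bar) for s, xi, yi in zip(s0, x, y))
rho = cov_xy / (sd_x * sd_y)
cv_x = sd_x / x_bar
cv_y = sd_y / y_bar

# Substitution term: equals the Paasche-to-Laspeyres ratio.
substitution_term = rho * cv_x * cv_y + 1.0

# Levels effect: base-period prices averaged with period t versus period 0 quantities.
levels_effect = (dot(p0, qt) / sum(qt)) / (dot(p0, q0) / sum(q0))

ratio_direct = unit_value / fisher
ratio_decomposed = math.sqrt(substitution_term) * levels_effect
print(ratio_direct, ratio_decomposed)
```

Under these assumptions the directly computed ratio of the unit value index to the Fisher index matches the product of the square root of the substitution term and the levels effect, and the ratio of the unit value index to the Paasche index reduces to the levels effect alone.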

## IV. What to do for Broadly Comparable Items

24. Quality adjustment factors can be applied to prices to render the comparison of prices of differentiated items akin to one of homogeneous items. We make use of (hedonic) quality-adjusted unit value indexes that remove the effects on prices of product heterogeneity, a proposal that goes back to Dalén (2001) and is formalized and empirically examined in De Haan (2004) and reiterated in De Haan (2007). Since a unit value index is appropriate for homogeneous items, a quality-adjusted unit value index must be appropriate for broadly comparable items. We consider first such a measure.

25. Consider a hedonic regression (see Triplett, 2004), using data on $m=1,\dots ,M$ matched models for periods τ = 0, t, of the price, ${p}_{m}^{\tau }$, on $k=1,\dots ,K$ quality characteristics, ${z}_{km}^{\tau }$:

$\begin{array}{ccc}{p}_{m}^{\tau }={\beta }_{0}^{\tau }+\sum _{k=1}^{K}{\beta }_{k}^{\tau }{z}_{km}^{\tau }+{u}_{m}^{\tau }& \phantom{\rule{7.0em}{0ex}}& \left(13\right)\end{array}$

where ${u}_{m}^{\tau }$ are assumed to be normally distributed with mean and variance ${\delta }_{\tau }$ and ${\xi }_{\tau }^{2}$ respectively. The heterogeneity-adjusted prices in each period relative to a reference numeraire item with mean characteristics ${\overline{z}}_{km}^{\tau }$ in each period are given by:

$\begin{array}{ccc}{\stackrel{^}{p}}_{m}^{\tau }={p}_{m}^{\tau }-\sum _{k=1}^{K}{\beta }_{k}^{\tau }\left({z}_{km}^{\tau }-{\overline{z}}_{km}^{\tau }\right)& \phantom{\rule{7.0em}{0ex}}& \left(14\right)\end{array}$

26. Bear in mind the models in each period are matched so that ${z}_{km}^{\tau }={z}_{km}^{0}={z}_{km}^{t}$. Note also that ${\beta }_{k}^{0}$ may or may not equal ${\beta }_{k}^{t}$ and (13) can be estimated on pooled data with a dummy variable for time and with the constraint that ${\beta }_{k}^{t}={\beta }_{k}^{0}={\beta }_{k}^{\tau }$, though it is preferable to estimate (13) separately for each time period without the constraint. The heterogeneity-adjusted unit value index is:

$\begin{array}{ccc}{P}_{U}^{*}=\left(\frac{\sum _{m=1}^{M}{\stackrel{^}{p}}_{m}^{t}{q}_{m}^{t}}{\sum _{m=1}^{M}{q}_{m}^{t}}\right)/\left(\frac{\sum _{m=1}^{M}{\stackrel{^}{p}}_{m}^{0}{q}_{m}^{0}}{\sum _{m=1}^{M}{q}_{m}^{0}}\right).& \phantom{\rule{7.0em}{0ex}}& \left(15\right)\end{array}$
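Equations (13) to (15) can be sketched for the simplest case of a single quality characteristic. Everything in the example is an assumption made for illustration: the data are hypothetical, the characteristic is a screen size, the regression is estimated separately in each period by unweighted OLS, and the reference item is taken to have the unweighted mean characteristic.

```python
def ols_slope(z, p):
    """OLS slope of price on a single characteristic (with an intercept)."""
    z_bar = sum(z) / len(z)
    p_bar = sum(p) / len(p)
    sxy = sum((zi - z_bar) * (pi - p_bar) for zi, pi in zip(z, p))
    sxx = sum((zi - z_bar) ** 2 for zi in z)
    return sxy / sxx


def quality_adjusted_prices(p, z):
    """Strip out the estimated price effect of deviations from the mean characteristic."""
    beta = ols_slope(z, p)
    z_bar = sum(z) / len(z)
    return [pi - beta * (zi - z_bar) for pi, zi in zip(p, z)]


def unit_value(p, q):
    return sum(pi * qi for pi, qi in zip(p, q)) / sum(q)


# Hypothetical matched models: characteristic z is fixed across periods.
z = [10.0, 14.0, 14.0, 10.0]
p0 = [100.0, 150.0, 160.0, 110.0]
pt = [98.0, 150.0, 158.0, 108.0]
q0 = [10.0, 5.0, 5.0, 10.0]
qt = [5.0, 10.0, 10.0, 5.0]  # purchases shift towards the larger screens

raw_index = unit_value(pt, qt) / unit_value(p0, q0)

adj_p0 = quality_adjusted_prices(p0, z)
adj_pt = quality_adjusted_prices(pt, z)
adjusted_index = unit_value(adj_pt, qt) / unit_value(adj_p0, q0)

print(f"raw unit value index: {raw_index:.4f}")
print(f"quality-adjusted unit value index: {adjusted_index:.4f}")
```

In this constructed example the raw unit value index rises because the mix shifts towards larger, more expensive models, while the quality-adjusted index falls slightly, reflecting the small price declines on individual models.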

27. For goods and services with slight product differentiation we would advise a measure based on (15). Of course, the quality adjustments need not use hedonic regressions. They may be much simpler, arising from, say, the addition of a single feature or option for which cost or market estimates of its value are available. Bear in mind that the items are matched in each period, so there is no quality change over time. The above formula abstracts, in each period's measure, from cross-sectional variation in price due to quality differences.

28. What of items that are comparable, say, models of television sets, washing machines, laptop computers, or automobiles, whose price dispersion due to product differentiation is significant, as is the price dispersion due to factors that cannot be accounted for by the characteristics of the item? There is an element in the price measure for which the quality-adjusted unit value reduces the problem to one of homogeneous items, but there is also an element for which a Fisher price index given by (1) is appropriate.

29. There is a need for a weighted average of (15) and (1), but a problem as to what the weights should be. One approach is to consider what we mean by “comparable.” Consider the case for, say, models of automobiles where a hedonic regression explains just about all of the price variation. The models are very different in this sense, compared to a data set, say, of the same model of automobile sold by different dealers for which a hedonic regression may explain a much smaller proportion of the price variation. A Fisher price index would be appropriate in the former case and a unit value index in the latter, since quantities cannot be meaningfully added together for the former but can for the latter. Thus the weight for the heterogeneity-adjusted unit value index (15) might be the ratio of the sum of squared errors from the hedonic regression (SSE) to the total sum of squares (SST), and the weight for the Fisher price index given by (1) the ratio of the (explained) regression sum of squares (SSR) to SST. The weighted average is given by:

$\begin{array}{ccc}{P}_{U}^{*}{\overline{w}}_{U}+{P}_{F}\left(1-{\overline{w}}_{U}\right)=\frac{\sum _{m=1}^{M}{\stackrel{^}{p}}_{m}^{t}{q}_{m}^{t}/\sum _{m=1}^{M}{q}_{m}^{t}}{\sum _{m=1}^{M}{\stackrel{^}{p}}_{m}^{0}{q}_{m}^{0}/\sum _{m=1}^{M}{q}_{m}^{0}}×{\overline{w}}_{U}+\sqrt{\frac{\sum _{m=1}^{M}{p}_{m}^{t}{q}_{m}^{t}}{\sum _{m=1}^{M}{p}_{m}^{0}{q}_{m}^{t}}×\frac{\sum _{m=1}^{M}{p}_{m}^{t}{q}_{m}^{0}}{\sum _{m=1}^{M}{p}_{m}^{0}{q}_{m}^{0}}}×\left(1-{\overline{w}}_{U}\right)& \phantom{\rule{7.0em}{0ex}}& \left(16\right)\end{array}$

where ${\overline{w}}_{U}=\frac{SSE}{SST}$ and $\left(1-{\overline{w}}_{U}\right)=\frac{SSR}{SST}={R}^{2}$. Note that the weights in (16) have a bar over them: they are an arithmetic mean of the weights from period 0 and period t for the hedonic regressions in equation (13) for binary comparisons between τ = 0, t.

30. The index given by (16) has the property that if all price variation is explained by the hedonic regression, the index is a Fisher index; if none of the price variation is explained by the hedonic regression, the index is a unit value index; as the percentage of price variation explained by the hedonic regression increases, so too will the weight given to the Fisher component.
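The weighting scheme in (16) and its limiting properties can be sketched as a small function. The function name `blended_index`, its argument names, and the input values below are hypothetical; the only elements taken from the text are the weights, namely one minus the mean $R^2$ of the two hedonic regressions for the quality-adjusted unit value index and the mean $R^2$ for the Fisher index.

```python
def blended_index(adjusted_unit_value, fisher, r2_period_0, r2_period_t):
    """Weighted average of a quality-adjusted unit value index and a Fisher index.

    The Fisher weight is the arithmetic mean of the hedonic R-squared values
    for the two periods; the unit value weight is its complement (SSE/SST).
    """
    for r2 in (r2_period_0, r2_period_t):
        if not 0.0 <= r2 <= 1.0:
            raise ValueError("R-squared must lie between 0 and 1")
    fisher_weight = 0.5 * (r2_period_0 + r2_period_t)
    unit_value_weight = 1.0 - fisher_weight
    return unit_value_weight * adjusted_unit_value + fisher_weight * fisher


# Hypothetical index values for one binary comparison.
p_u_star = 0.95
p_f = 1.02

all_explained = blended_index(p_u_star, p_f, 1.0, 1.0)   # reduces to Fisher
none_explained = blended_index(p_u_star, p_f, 0.0, 0.0)  # reduces to unit value
partial = blended_index(p_u_star, p_f, 0.60, 0.66)       # mean R-squared of 0.63

print(all_explained, none_explained, partial)
```

The two limiting calls reproduce the Fisher index and the quality-adjusted unit value index respectively, and the intermediate case lies between them, closer to the Fisher index because the mean $R^2$ exceeds one half.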

31. The use of such weights is but one proposal. Consider the case of television sets. A hedonic regression could be estimated over all screen sizes with dummies for these screen sizes and variables for the other quality characteristics. But while the regression would attribute a premium of, say, 30% to a 32 inch screen over a 14 inch one, while controlling for other variables, it is unlikely that consumers would consider the two models as substitutes even when an allowance has been made for the screen size and other variables. Thus the regression should be undertaken for similar goods in the sense that there is some substitutability.

32. It might be argued that substitutability should be the main concept behind the weighting system. However, this is problematic. Substitutability can exist for goods and services for which quality-adjustments for unit values are not feasible and the concept of an average price not meaningful, for example, beef and chicken. However, as indicated above, the first step should be to identify a cluster of goods which are comparable and substitutable or exchangeable, for example, television sets of a similar screen size, and then use quality adjustments.

## V. An Empirical Example Using Scanner Data

33. The empirical work utilizes monthly scanner data for television sets (TVs) from the bar-code readers of UK retail outlets from January 2001 to December 2001. Each observation is a model of a TV in a given month sold in one of four different outlet types: multiple chains, mass merchandisers (department stores), independents, and catalogue stores. The sample was devised to include only models of TVs that were sold in all 12 months of the data. This has the advantage of replicating the matched model methodology employed by statistical offices for consumer and producer price indexes, as well as clarifying that the effects we identify are not due to new and old models of differing quality entering and leaving the market (Silver and Heravi, 2005). We further limit the sample to a narrow range of screen sizes, that is, 10 and 14 inch TV screens, since it may be argued that TVs with larger screen sizes serve a different consumer need. The data set included series for 94 such models in each month, accounting for sales of 0.37 million TVs worth £49 million.

34. Hedonic regressions were estimated for each month to remove the effect on price of cross-sectional product heterogeneity. The variable set for the regressions included: (i) 17 brand dummies; (ii) size of screen, 14 and 10 inch; (iii) Nicam stereo sound; (iv) on-screen text retrieval news and information panels from broadcasting companies, in order of sophistication: teletext and fastext; (v) three types of reception systems; (vi) continental monitor style; (vii) flat & square and super-planar tubes; (viii) s-vhs socket; (ix) DVD playback or DVD recording; and (x) the outlet types: multiple chains, mass merchandisers (department stores), independents, and catalogue stores.

35. Table 1 and Figure 2 show Laspeyres, Paasche, Fisher price and unit value indexes. Laspeyres and Paasche generally act as upper and lower bounds on Fisher, as expected from the negative value of the correlation between relative price and quantity changes, ${\rho }_{x,y}^{{s}_{0}}$, given in Table 1. The magnitude of the difference between the Fisher and Laspeyres indexes, and between the Paasche and Fisher indexes, is given by the substitution effect in equation (11). The magnitude of the substitution effect, as dictated by its components in equation (11), can be seen from the calculated values in Table 1 to be generally small. A notable exception is July, during which the negative correlation increases in magnitude, as more price-conscious consumers hit the summer sales, leading to an increase in the Laspeyres-Paasche gap. The unit value index for these differentiated products has quite dissimilar changes to the price indexes.

Table 1. Understanding the Differences Between Laspeyres, Paasche, and Fisher

36. Equation (10) shows the difference between the unit value and Paasche index, rounding aside, to be solely governed by the expression for the levels effect, as demonstrated in Table 2. The levels effect is a significant, quite volatile, and generally positive factor, with the unit value index exceeding the Paasche price index by over 8 percent in November but being very similar in May. Table 2 shows the substitution effect generally to have a countervailing effect to that of the levels effect, bringing the unit value index closer to the price indexes. For example, in June the two effects are almost offsetting, with the unit value index very close to the Fisher price index. But this is unusual; the levels effect generally dominates, driven by the increasing dispersion in quantities over time.

Table 2. Understanding the Differences Between Unit Value Indexes and Laspeyres, Paasche, and Fisher Price Indexes

37. The unit value index is compiled for heterogeneous TVs comprising 17 brands and several characteristics as detailed above. They are broadly comparable items. We estimated hedonic regressions for each month of price on the quality characteristics and quality-adjusted the prices as outlined in equations (14) and (15). The quality-adjusted unit value index is given in Figure 3. Between January and February the unit value index increased by about 6 percent, reflecting an increasing quantity of purchases directed to more expensive sets and better brands. Yet when we take out the effect of such quality differences, that is, the change in the mix of the characteristics purchased, there is a fall in prices of about 5 percent. Consumers are paying more on average for better sets, but given their valuation of what the improved mix of characteristics is worth, the result is an overall fall in average (unit) prices. Similarly, from November to December the unit value index fell by about 4 percent, as the bundle of TVs purchased included a higher proportion of cheaper models, but the quality-adjusted unit value index actually increased, reflecting the fact that the fall in average prices does not compensate for the fall in quality that gave rise to it.

38. Figure 3 also shows the Fisher index, equation (1), and the weighted average of the quality-adjusted unit value index and the Fisher index, equation (16). The average value of the $R^2$ for the hedonic regressions was 0.63, so the quality-adjusted unit value index received the lower weight, 0.37 on average. As a result, the weighted average tracks the Fisher price index more closely than it tracks the quality-adjusted unit value index.
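A minimal sketch of the weighting follows. Equation (16) is not reproduced here, so the arithmetic form of the average is an assumption; the weights follow the text, with the Fisher index weighted by the hedonic $R^2$ and the quality-adjusted unit value index by one minus $R^2$. The function name and the two index values are hypothetical and are not read from Figure 3.

```python
def blended_index(uv_adjusted, fisher, r_squared):
    """Arithmetic weighted average: R^2 on Fisher, 1 - R^2 on the adjusted unit value index."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("r_squared must lie in [0, 1]")
    return (1.0 - r_squared) * uv_adjusted + r_squared * fisher


# Illustrative values only; 0.63 is the average R^2 reported in the text.
blended = blended_index(uv_adjusted=0.95, fisher=1.01, r_squared=0.63)
print(f"blended index: {blended:.4f}")
```

With an $R^2$ of 0.63, the blended figure sits nearer the Fisher value than the quality-adjusted unit value, which is the behavior described for Figure 3.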

## VI. Conclusions

39. For the aggregation of homogeneous items, the unit value index is the best index and superlative index numbers are biased; for the aggregation of heterogeneous items, superlative index numbers are best and unit value indexes are biased.

40. The factors determining the difference between unit value indexes and Laspeyres, Paasche, and Fisher price indexes were established in Section III. They comprise a substitution bias (for unit value to Laspeyres and a countervailing one from Fisher to Laspeyres) and a levels effect. The conditions under which the unit value index equals the price indexes were shown to be implausible.

41. For items that are very similar, a unit value index remains appropriate, because it is necessary to capture the effect of a change in price levels and price indexes do not properly do this. Quality adjustments to the prices, to mitigate price dispersion due to the slight product heterogeneity, would be appropriate.

42. The determination of whether or not an item is homogeneous is critical to the choice of index number formula, but in practice many items are broadly comparable, and neither a unit value nor a Fisher price index is appropriate. The more similar the items aggregated, the stronger the case for a heterogeneity-adjusted unit value index. It follows that an appropriate formula may be based on an average of a heterogeneity-adjusted unit value index and a Fisher price index. The weighting ascribed to each should be an indicator of the similarity of the items. A possible indicator explored in this paper is the extent to which the price variation can be explained by price-determining characteristics: the (explained) sum of squares from a hedonic regression. While the discussion has been phrased in terms of hedonic regression analysis, the principles apply to simpler quality adjustments.

## References

• Aizcorbe, Ana and Nicole Nestoriak, 2008, The Importance of Pricing the Bundle of Treatments, US Bureau of Economic Analysis, Paper presented at the 2008 World Congress on National Accounts and Economic Measurement (Washington D.C.), May.

• Alterman, William, 1991, Price Trends in U.S. Trade: New Data, New Insights, in: Peter Hooper and J. David Richardson (Eds.), International Economic Transactions, (Chicago: University of Chicago Press), pp. 109–143.

• Angermann, Oswald, 1980, “External Terms of Trade of the Federal Republic of Germany Using Different Methods of Deflation,” in Review of Income and Wealth 26, 4, (December), pp. 367–85.

• Balk, Bert M., 1998, “On the Use of Unit Value Indexes as Consumer Price Subindexes,” in: Proceedings of the Fourth Meeting of the International Working Group on Price Indexes (Washington D.C.: U.S. Bureau of Labor Statistics), pp. 112–120. Available via the Internet: http://www.ottawagroup.org.

• Balk, Bert M., 2005, “Price Indexes for Elementary Aggregates: The Sampling Approach,” Journal of Official Statistics, 21, 4, pp. 675–699.

• Ball, L. and Mankiw, N.G., 1994, “Asymmetric Price Adjustment and Economic Fluctuations,” Economic Journal, 104, pp. 247–262.

• Bortkiewicz, L.v., 1923, “Zweck und Struktur einer Preisindexzahl,” Nordisk Statistisk Tidsskrift 2, pp. 369–408.

• Bradley, Ralph, 2005, “Pitfalls of Using Unit Values as a Price Measure or Price Index,” Journal of Economic and Social Measurement, 30, pp. 39–61.

• Commission of the European Communities, International Monetary Fund, Organisation for Economic Co-operation and Development, United Nations, World Bank, 2008, System of National Accounts 2008 UN: pre-edit version of Volume 1, Approved by the Bureau of the UN Statistical Commission (August), available via the Internet: http://unstats.un.org/unsd/sna1993/draftingphase/Volume1.asp.

• Dalén, Jorgen, 2001, Statistical Targets for Price Indexes in Dynamic Universes, Paper presented at the Sixth Meeting of the International Working Group on Price Indexes, Canberra, April 2–6.

• de Haan, Jan, 2004, Estimating Quality-Adjusted Unit Value Indexes: Evidence from Scanner Data, Paper presented at the Seventh EMG Workshop, Sydney, Australia (December 12–14). SSHRC International Conference on Index Number Theory and the Measurement of Prices and Productivity, Vancouver (June 30–July 3).

• de Haan, Jan, 2007, Hedonic Price Indexes: A Comparison of Imputation, Time Dummy and Other Approaches, Paper presented at the Seventh EMG Workshop, Sydney, Australia (December 12–14).

• Diewert, W.E., 1976, “Exact and Superlative Index Numbers,” Journal of Econometrics 4, pp. 114–145.

• Diewert, W.E., 1978, “Superlative Index Numbers and Consistency in Aggregation,” Econometrica 46, pp. 883–900.

• Diewert, W.E., 1995, “Axiomatic and Economic Approaches to Elementary Price Indexes,” Department of Economics, University of British Columbia Discussion Paper 95–01.

• Engel, Charles and Rogers, John H., 2001, “Deviations from Purchasing Power Parity: Causes and Welfare Costs,” Journal of International Economics, 55, pp. 29–57.

• Feenstra, R. and Kendall, J., 1997, “Pass-Through of Exchange Rates and Purchasing Power Parity,” Journal of International Economics, 43, pp. 237–261.

• Friedman, M., 1977, Nobel Lecture: “Inflation and Unemployment,” Journal of Political Economy, 85, pp. 451–72.

• Hausman, Jerry and Ephraim Leibtag, 2008, “CPI Bias from Supercenters: Does the BLS Know that Wal-Mart Exists?” in: Erwin Diewert, John Greenlees and Charles Hulten, (eds), Price Index Concepts and Measurement, National Bureau of Economic Research.

• Hong, P., McAfee, R.P. and Nayyar, A., 2002, “Equilibrium Price Dispersion with Consumer Inventories,” Journal of Economic Theory, 105, pp. 503–517.

• International Labour Office (ILO), IMF, OECD, Eurostat, United Nations, World Bank, 2004a, Consumer Price Index Manual: Theory and Practice (Geneva: ILO). Available via the Internet: http://www.ilo.org/public/english/bureau/stat/guides/cpi/index.htm.

• International Labour Office (ILO), IMF, OECD, UN ECE, World Bank, 2004b, Producer Price Index Manual: Theory and Practice (Washington: International Monetary Fund). Available via the Internet: http://www.imf.org/external/np/sta/tegppi/index.htm.

• Lach, S., 2002, “Existence and Persistence of Price Dispersion,” Review of Economics and Statistics, 84, 3 (August), pp. 433–444.

• Párniczky, G., 1974, “Some Problems of Price Measurement in External Trade Statistics,” Acta Oeconomica, 12, 2, pp. 229–240.

• Ruffles, David and Kevin Williamson, 1997, “Deflation of Trade in Goods Statistics: Derivation of Price and Volume Measures from Current Price Values,” Economic Trends, 521 (April). Reproduced in Office for National Statistics (ONS), Economic Trends, Digest of Articles No. 1 (London: ONS), 1998.

• Silver, Mick, 2008, “Do Unit Value Export, Import, and Terms of Trade Indexes Represent or Misrepresent Price Indexes?” IMF Staff Papers, 55, (September). Available via the Internet: http://www.palgrave-journals.com/imfsp/journal/vaop/ncurrent/index.html#23092008

• Silver, Mick and Saeed Heravi, 2002, “Why the CPI Matched Models Method May Fail Us: Results From an Hedonic and Matched Experiment Using Scanner Data,” European Central Bank Working Paper Series, 144.

• Silver, Mick and Saeed Heravi, 2007, “Why elementary price index number formulas differ: price dispersion and product heterogeneity,” Journal of Econometrics, 140, 2, pp. 874–83.

• Silver, Mick and Christos Ioannidis, 2001, “Inter-Country Differences in the Relationship Between Relative Price Variability and Average Prices,” Journal of Political Economy, 109, 2 (April), pp. 355–374.

• Sorenson, A.T., 2000, “Equilibrium Price Dispersion in Retail Markets for Prescription Drugs,” Journal of Political Economy, 108, 4, pp. 833–850.

• Stigler, G.J., 1961, “The Economics of Information,” Journal of Political Economy, 69 (June), pp. 213–225.

• Triplett, J.E., 2004, “Handbook on Hedonic Indexes and Quality Adjustments in Price Indexes,” Directorate for Science, Technology and Industry, OECD (Paris).

• Vining, D.R. Jr. and Elwertowski, T.C., 1976, “The Relationship Between Relative Price and the General Price Level,” American Economic Review, 66, pp. 699–708.

• Yoskowitz, D.W., 2002, “Price Dispersion and Price Discrimination: Empirical Evidence from a Spot Market for Water,” Review of Industrial Organization, 20, pp. 283–289.

• Zieschang, Kimberly, 1988, The Characteristics Approach to the Problem of New and Disappearing Goods in Price Indexes, U.S. Bureau of Labor Statistics, Office of Prices and Living Conditions, Working Paper No. 183 (May).


Acknowledgements for helpful advice are due to Ana Aizcorbe (U.S. Bureau of Economic Analysis), Erwin Diewert (University of British Columbia), Jan de Haan (Statistics Netherlands), Marshall Reinsdorf (U.S. Bureau of Economic Analysis), and Kimberly Zieschang (IMF). An earlier draft was presented at The 2008 World Congress on National Accounts and Economic Performance Measures for Nations, May 12–17, 2008, Washington D.C., available at: http://www.indexmeasures.com/dc2008/finalprogram.htm.

The Walsh price index is a less commonly used superlative index that is similar to a Laspeyres or Paasche price index, but uses a geometric mean of period 0 and t quantities as the fixed basket quantities (ILO et al., 2004a, Chapter 15, paragraphs 15.24–32).
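As an illustration of this definition, the following is a minimal sketch of the Walsh price index with the geometric mean of period 0 and period t quantities as the fixed basket. The function name and the two-item data are hypothetical.

```python
from math import sqrt


def walsh_index(p0, q0, pt, qt):
    """Walsh price index: basket quantities are sqrt(q0 * qt) for each item."""
    basket = [sqrt(a * b) for a, b in zip(q0, qt)]
    cost_t = sum(p * q for p, q in zip(pt, basket))
    cost_0 = sum(p * q for p, q in zip(p0, basket))
    return cost_t / cost_0


# Made-up data: the first item's price doubles and purchases shift away from it.
p0, pt = [1.0, 2.0], [2.0, 2.0]
q0, qt = [4.0, 1.0], [1.0, 4.0]

walsh = walsh_index(p0, q0, pt, qt)
print(f"Walsh index: {walsh:.4f}")
```

For these data the Laspeyres index is 10/6 and the Paasche index is 10/9, so the Walsh index of 4/3 lies between them.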

Aggregator functions underlie the definition of indexes in economic theory, for example, a utility function to define a constant utility cost of living index. Different index number formulas can be shown to correspond with different functional forms of the aggregator function. Laspeyres, for example, corresponds to a highly restrictive Leontief form. The underlying functional forms for superlative indexes, including Fisher and Törnqvist, are flexible: they are second-order approximations to other (twice-differentiable) forms around the same point. It is the generality of the functional forms that superlative indexes represent that allows them to accommodate substitution behavior and makes them desirable indexes.

The time reversal test requires that the index for period t compared with period 0 should be the reciprocal of that for period 0 compared with period t. The factor reversal test requires that the product of the price index and the volume index should be equal to the proportionate change in the current values.
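Both tests can be checked numerically for the Fisher index. The sketch below uses a single hypothetical helper, `fisher`, which returns the Fisher index of one set of variables weighted by another, so the same function gives the price index and, with the roles of prices and quantities swapped, the volume index. The data are made up.

```python
from math import sqrt


def fisher(a0, b0, at, bt):
    """Fisher index of variable a, using variable b as weights."""
    laspeyres = sum(x * w for x, w in zip(at, b0)) / sum(x * w for x, w in zip(a0, b0))
    paasche = sum(x * w for x, w in zip(at, bt)) / sum(x * w for x, w in zip(a0, bt))
    return sqrt(laspeyres * paasche)


# Made-up prices and quantities for two items in periods 0 and t.
p0, pt = [1.0, 2.0], [2.0, 2.0]
q0, qt = [4.0, 1.0], [1.0, 4.0]

price_index = fisher(p0, q0, pt, qt)      # period t relative to period 0
reverse_index = fisher(pt, qt, p0, q0)    # period 0 relative to period t
volume_index = fisher(q0, p0, qt, pt)     # prices and quantities swap roles
value_ratio = (sum(p * q for p, q in zip(pt, qt))
               / sum(p * q for p, q in zip(p0, q0)))

print(f"time reversal:   {price_index * reverse_index:.6f}  (should be 1)")
print(f"factor reversal: {price_index * volume_index:.6f} vs value ratio {value_ratio:.6f}")
```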

In practice, especially for CPIs where timeliness is of the essence, the price reference period 0 differs from the earlier weight reference period, say b, since it takes time to compile the results from the survey of households, establishments and other sources for the weights to use in the index. The Laspeyres index given by the second component in the right-hand-side expression in equation (1) may have quantities in period b instead of 0. This index is a Lowe index—see ILO et al. (2004: Chapter 15).
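A minimal sketch of the distinction follows: the Lowe index prices a basket from the weight reference period b in periods 0 and t, and collapses to the Laspeyres index when b coincides with period 0. The function name and data are hypothetical.

```python
def lowe_index(p0, pt, qb):
    """Lowe price index: cost of the period-b basket at period-t versus period-0 prices."""
    return sum(p * q for p, q in zip(pt, qb)) / sum(p * q for p, q in zip(p0, qb))


# Made-up data: qb comes from an earlier survey period, q0 from the price reference period.
p0, pt = [1.0, 2.0], [2.0, 2.0]
qb, q0 = [3.0, 2.0], [4.0, 1.0]

lowe = lowe_index(p0, pt, qb)
laspeyres = lowe_index(p0, pt, q0)   # Laspeyres is the special case b = 0
print(f"Lowe: {lowe:.4f}, Laspeyres: {laspeyres:.4f}")
```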

See Bortkiewicz (1923, pp. 374–375) for the first application of this correlation coefficient decomposition technique: we define a correlation coefficient between u and v as ${\rho }_{u,v}=\left(\sum uv-m\overline{u}\,\overline{v}\right)/\left(m{\sigma }_{u}{\sigma }_{v}\right)$. Then $\sum uv/\sum u={\sigma }_{u}{\sigma }_{v}{\rho }_{u,v}/\overline{u}+\overline{v}=\mathrm{cov}\left(u,v\right)/\overline{u}+\overline{v}$, and $\sum suv/\sum su$ yields s-weighted terms for the decomposition.
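The unweighted form of this decomposition can be verified numerically. The sketch below reads the second term of the numerator of the correlation coefficient as m times the product of the two means and uses population (divide by m) standard deviations; the function name and data are hypothetical.

```python
from math import sqrt


def bortkiewicz_sides(u, v):
    """Return (sum uv / sum u, sigma_u * sigma_v * rho / u_bar + v_bar)."""
    m = len(u)
    u_bar, v_bar = sum(u) / m, sum(v) / m
    sigma_u = sqrt(sum((a - u_bar) ** 2 for a in u) / m)
    sigma_v = sqrt(sum((b - v_bar) ** 2 for b in v) / m)
    sum_uv = sum(a * b for a, b in zip(u, v))
    rho = (sum_uv - m * u_bar * v_bar) / (m * sigma_u * sigma_v)
    lhs = sum_uv / sum(u)
    rhs = sigma_u * sigma_v * rho / u_bar + v_bar
    return lhs, rhs


lhs, rhs = bortkiewicz_sides([1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 4.0, 3.0])
print(f"sum(uv)/sum(u) = {lhs:.6f}, decomposition = {rhs:.6f}")
```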

This follows from the Bortkiewicz decomposition in the preceding footnote, since $\beta =\mathrm{cov}\left(u,v\right)/{\sigma }_{v}^{2}$ in a regression of u on v, and $(x+1)/(y+1)=1+(x-y)/(y+1)$.

It follows that if ${\rho }_{x,y}^{{s}_{0}}<0$, Laspeyres>Fisher>Paasche.
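This ordering can be illustrated with a small example. The sketch assumes that x and y denote the price and quantity relatives and that the weights $s_0$ are period 0 expenditure shares; these definitions come from earlier sections and are not restated here, so the reading is an assumption. It uses the standard Bortkiewicz result that Paasche equals Laspeyres plus the share-weighted covariance of the relatives divided by the Laspeyres quantity index. Names and data are hypothetical.

```python
from math import sqrt

# Made-up data: the item whose price doubles sees its quantity fall sharply.
p0, pt = [1.0, 2.0], [2.0, 2.0]
q0, qt = [4.0, 1.0], [1.0, 4.0]

value0 = sum(p * q for p, q in zip(p0, q0))
shares = [p * q / value0 for p, q in zip(p0, q0)]   # period-0 expenditure shares
x = [b / a for a, b in zip(p0, pt)]                 # price relatives
y = [b / a for a, b in zip(q0, qt)]                 # quantity relatives

laspeyres = sum(s * r for s, r in zip(shares, x))   # share-weighted mean of x
q_laspeyres = sum(s * r for s, r in zip(shares, y)) # share-weighted mean of y
weighted_cov = sum(s * (a - laspeyres) * (b - q_laspeyres)
                   for s, a, b in zip(shares, x, y))

paasche = sum(p * q for p, q in zip(pt, qt)) / sum(p * q for p, q in zip(p0, qt))
fisher = sqrt(laspeyres * paasche)

print(f"weighted covariance: {weighted_cov:.4f}")
print(f"Laspeyres {laspeyres:.4f} > Fisher {fisher:.4f} > Paasche {paasche:.4f}")
```

The sign of the weighted covariance is the sign of the weighted correlation, so a negative value here corresponds to the condition stated above.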

Silver and Heravi (2002) used hedonic regressions to control for heterogeneity in a Dutot index; see also Silver and Heravi (2007).

There is much empirical evidence that the law of one price does not hold in many markets. Reasons for this include price discrimination (Yoskowitz, 2002); menu costs (Ball and Mankiw, 1994); search costs (Stigler, 1961, and Sorenson, 2000); signal extraction (Friedman, 1977, and Silver and Ioannidis, 2001); incomplete pass-through rates of exchange rate fluctuations (Feenstra and Kendall, 1997, and Engel and Rogers, 2001); inventory holding (Hong et al., 2002); and strategic pricing (Lach, 2002).

Zieschang (1988) in a different context employs a concept of quasi-exchangeability when characteristics completely describe the associated varieties.

For example, Aizcorbe and Nestoriak (2008) used unit values for the price change measurement of alternative treatments of specific illnesses for a sample of 700 million US health claim records.
