The Hedonic Country Product Dummy Method and Quality Adjustments for Purchasing Power Parity Calculations
Author: Mick Silver

The 2005 International Comparison Program's (ICP) estimates of economy-wide purchasing power parity (PPP) are based on parity estimates for 155 basic expenditure headings, mainly estimated using country product dummy (CPD) regressions. The estimates are potentially inefficient and open to omitted variable bias for two reasons. First, they use average prices across outlets as the left-hand-side variable. Second, quality-adjusted prices of non-comparable replacements, required when products in outlets do not match the required specifications, cannot be effectively included. This paper provides an analytical framework based on panel data and hedonic CPD regressions for ameliorating these sources of bias and inefficiency.



I. Introduction

The International Comparison Program (ICP), claimed to be the world’s largest statistical initiative, produces estimates of purchasing power parity (PPP).1 The 2005 ICP PPP estimates have as their building blocks cross-country parity estimates for 155 basic (expenditure) headings, such as “poultry.” These parity estimates were, for the most part, estimated as parameters on country dummies in country product dummy (CPD) regressions. There are two potentially serious methodological weaknesses. The first is the general use of grouped (country-product) average prices across outlets as the left-hand-side variable of the regression, rather than the observed price in each outlet. The resulting loss of between-outlet price variation for each product renders the parity estimates inefficient and potentially biased. The second is that the methodology cannot make use of the prices of non-comparable items when a match to the specification of the product to be priced is not available in an outlet. It is well recognized that a major problem with purchasing power parity measurement is missing data, where comparable prices for representative products in one country cannot be matched in another (World Bank, ICP Handbook, 2007, Chapter 5). A hedonic CPD formulation, advocated in this paper, will enable the parity estimates to be conditioned on quality variations in non-comparable replacements.

This paper provides a measurement framework for dealing with both problems that may be usefully employed in the 2011 ICP. A panel data hedonic CPD approach is advocated that makes use of the outlet-level observations from which the aforementioned average prices were constructed. Section II briefly outlines the 2005 ICP basic heading (BH) aggregation methodology and the use, for inter-country matched price comparisons, of detailed product specifications based on structured product descriptions or “checklists.” Section III is based on the current use of grouped outlet data. The effect of grouping is considered, as is a variety of hedonic CPD formulations using grouped data that include quality variation. Section IV relaxes the grouping assumption and considers again a hedonic CPD formulation, with an emphasis on the panel structure of the data. Section V concludes with recommendations for the 2011 ICP BH parity estimates, both for grouped data and for the preferred individual (outlet-level) data.

II. The 2005 ICP Methodology

A. Aggregation at the Basic Heading Level2

The 2005 ICP was based on prices collected for about 1,000 product specifications (PSs) grouped into 155 basic (expenditure) headings (BHs) in 146 countries divided into six regions. Separate product lists were used for each region to facilitate comparability, with an additional product list priced in 18 “ring” countries to enable inter-regional comparisons. The CPD method was used to estimate BH parities in four regions (Africa, Asia Pacific, West Asia, and South America3), and the EKS* method (see Diewert, 1999, and World Bank, 2007, Chapter 11) was used for the OECD/Eurostat and the Commonwealth of Independent States (CIS) regions.4 The CPD method was used for (C =) 48 African, 23 Asia-Pacific, 11 West Asian, and 10 South American countries.

In each country the 1,000 PSs were grouped into 155 BHs; for example, (N =) 8 detailed PSs or types of poultry comprised the BH “poultry.” A PS for poultry may be a half-pound packet of non-branded, frozen, free-range, de-boned chicken breasts with skin sold in supermarkets. Hereafter the terms “product” and PS will be used interchangeably. For each BH for each region, say poultry for Africa, the average prices for each of the 8 products in each of the 48 African countries were used as the left-hand-side variable in a CPD regression, with 48 - 1 = 47 country dummies and 8 - 1 = 7 product dummies on the right-hand side. The coefficients on the country dummies provided parity estimates for each of the 155 BHs for the 48 countries in Africa, and similarly for the other regions employing the CPD method. These BH parity estimates were then aggregated to PPPs using methods outlined in the World Bank’s ICP Handbook (2007) and Diewert (2008).
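To make the regression set-up concrete, the following Python sketch builds the design matrix just described for a single basic heading and recovers the parities by OLS. It is a minimal illustration only: the simulated prices, the random seed, the noise level, and all variable names are assumptions made for the example, not ICP data or ICP code.

```python
import numpy as np

rng = np.random.default_rng(0)
N, C = 8, 48                        # products and countries (poultry in Africa)

# Hypothetical "true" log product effects and log parities; product 1 and
# country 1 serve as the reference categories.
beta = rng.normal(0.0, 0.5, N)
beta[0] = 0.0
gamma = rng.normal(0.0, 1.0, C)
gamma[0] = 0.0

# One (log) average price per country-product group: G = N * C = 384.
n_idx, c_idx = np.meshgrid(np.arange(N), np.arange(C), indexing="ij")
n_idx, c_idx = n_idx.ravel(), c_idx.ravel()
log_p = 1.0 + beta[n_idx] + gamma[c_idx] + rng.normal(0.0, 0.1, N * C)

# Design matrix: intercept, N-1 = 7 product dummies, C-1 = 47 country dummies.
D_n = (n_idx[:, None] == np.arange(1, N)).astype(float)
D_c = (c_idx[:, None] == np.arange(1, C)).astype(float)
X = np.column_stack([np.ones(N * C), D_n, D_c])

coef, *_ = np.linalg.lstsq(X, log_p, rcond=None)
gamma_hat = coef[N:]                # estimated log parities, countries 2..C
parities = np.exp(gamma_hat)        # parities relative to the numeraire country
```

The antilogarithms in `parities` are the BH parities relative to country 1; with real data, `log_p` would hold the logarithms of the reported average prices, and missing country-product cells would simply be dropped from the rows.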

Two features of the 2005 ICP are of particular concern for the CPD aggregation. (i) Average prices5 across outlets for each of the N products in the C countries were used, thus ignoring between-outlet price variation in the regression, with an attendant loss of efficiency and potential omitted (outlet) variable bias. (ii) The use of average prices across outlets removed the opportunity to include quality adjustments effectively in the aggregation procedure. The result is bias arising either from a loss of representativity, where non-comparable products are omitted, or from the inclusion of non-comparable replacement item prices as if they were comparable, an issue to which we now turn.

B. Checklists, Missing Observations, Non-Comparable Replacements, and Quality Adjustments

The 2005 ICP used highly detailed “tight” PSs or checklists based on detailed structured product descriptions (SPDs) to describe the price-determining characteristics of the products to be priced.6 This in turn enabled consistent cross-country matched price comparisons of like with like. But tight PSs make it less likely that comparable products will be found in different outlets/countries for matched price comparisons. The problem of missing price observations is accentuated by the very tightness of the PS. If a comparable match to the specification could not be found in an outlet in a country, the observation was either ignored, or the prices of non-comparable replacements were used on the assumption, rightly or wrongly, that the difference in price due to the quality differentials was insignificant.7 Deaton and Heston (2008) note that lower quality items in poor countries may end up matched to higher quality items in rich countries, leading to an understatement of price levels in poor countries. For many goods the outlets sampled in poor countries may be closer to “dollar-stores” than to the typical outlet in the US. There is thus a trade-off between having tight specifications and poor coverage of items sampled, and loose specifications with price differences tainted by quality differences (Silver and Heravi, 2007a).

A proposal of this paper is for price collectors to select non-comparable replacements if a comparable item is not found, and to note on their checklists the nature and extent of the difference from the original PS, say, a skinless chicken breast instead of one with skin. The CPD regression, with prices on the left-hand side and country and product dummies on the right, can be extended to include the quality characteristics on the right to partial out the effect of the quality differences. This would serve to increase the representativity and efficiency of the resulting parameter estimates on the country dummies. The resurgence of interest in the CPD method in the 2005 ICP round, and that round's innovation of using detailed specifications, provide a basis for dealing with this problem: a hedonic CPD method. The ICP (2007) Handbook advocates the merits of a hedonic CPD, especially with regard to the use of loose specifications for certain product areas:

“As already noted, the advantage of using loose specifications is that the number of price observations collected may be greatly increased. With very large sets of price observations it may be feasible to adjust for quite substantial differences in quality by employing hedonics.

One promising development is the combination of hedonics with the country product dummy method, or CPD, of estimating the parities for the basic heading, as both use the same type of multiple regression analysis.... …The use of loose specifications is therefore a real possibility that needs to be further researched for ICP purposes, but as yet there is little experience or evidence to demonstrate how well it works in practice and how robust the estimates are.”

Exclusive reliance on tight specifications may result in the basic heading PPPs being based on a quite restricted set of prices. The sample of products generated by the matched product approach based on tight specifications is far from random. There is a risk that unknown biases may be introduced.”8 (World Bank, Chapter 5, page 8).

The use of a hedonic country dummy framework for international price comparisons is not new—Heravi, Heston and Silver (2003)—though a hedonic country product dummy regression has, to the author’s knowledge, only been applied for inter-area PPP estimates within the United States, as outlined later. Given the importance of incorporating quality variations into PPP estimates there is, however, no formal evaluation as to how this can best be done. This paper addresses this issue.

III. The Hedonic CPD Method and Use of Grouped Data

The concern of this paper is with regression-based parity estimators that incorporate quality adjustments for non-comparable products, primarily through the use of a hedonic framework. An alternative hedonic-based approach is to use characteristic price indexes. Since such parity estimates are not regression-based, we consider such indexes in Annex 1. For each BH consider the average prices of $g = 1,\ldots,G$ groups of $n = 1,\ldots,N$ products in the domestic currency of $c = 1,\ldots,C$ countries, where $G = N \times C$, that is, $8 \times 48 = 384$ groups for poultry in African countries. There are $k = 1,\ldots,K_g$ outlets from which prices are sampled in each group, and the means of observed outlet prices, $\bar{p}_g = \sum_{k \in g} p_k / K_g$, are given in a CPD form by:

$$\bar{p}_g = \alpha_1 \prod_{n=2}^{N} \beta_n^{D_n} \prod_{c=2}^{C} \gamma_c^{D_c}\, u_g \qquad (1)$$

where $\alpha_1 = 1$ for a numeraire ($c = 1$) country, $D_n$ is a dummy variable equal to 1 for product $n$ and zero otherwise, $D_c$ is a dummy variable equal to 1 for country $c$ and zero otherwise, and $u_g$ is a random disturbance. Equation (1) can be formulated as a regression equation in which $p_g^*$ and $u_g^*$ are the logarithms of the average prices9 and the random disturbances respectively, and $\alpha_1^*$, $\beta_n^*$ and $\gamma_c^*$ are the logarithms of the parameters:

$$p_g^* = \alpha_1^* + \sum_{n=2}^{N} \beta_n^* D_n + \sum_{c=2}^{C} \gamma_c^* D_c + u_g^* \qquad (2)$$

The (antilogarithms of the) $\hat{\gamma}_c^*$ are ordinary-least-squares (OLS) estimates10 of the country-specific price parities with respect to a benchmark country $c = 1$, controlling for between-product price variation (see Diewert (1999 and 2005) and Rao (2004) for details). A major advantage of the method is that standard errors are obtained for the parity estimates. The above formulation is unweighted. Diewert (2005) and Rao (2002 and 2004) have shown that specific weights used in weighted-least-squares (WLS) estimators for CPD parity estimates correspond to specific index number formulas.11 This provides a rationale for the CPD method beyond that originally proposed by Summers (1977) for “filling holes” in incomplete data tableaux. Rao (2004) has drawn attention to the effects of spatial autocorrelation, considered previously by Aten (1996), and the necessary adjustments to the estimates; see also Druska and Horrace (2004) for a panel data context. Finally, as noted by Hill (2009), equation (2) is estimated separately for each BH. If cross-sectional residuals are correlated there may be efficiency gains from estimating the BH regression equations as a system, akin to Zellner’s (1962) seemingly unrelated regression.

The CPD method as formulated in equation (2) suffers both from the use of grouped data and from the inability to include quality variations.

A. A CPD Regression Using Averages Across Outlets

Does the use of product averages as in equation (2) affect the CPD estimates?

Kmenta (1986) demonstrates that OLS parameter estimates based on group means are unbiased, though the disturbances are likely to be heteroskedastic and estimates inefficient, as a result of varying within group sample sizes. Such heteroskedasticity is avoided if the number of observations (outlets sampled for each product group in each country) is the same, but this is not so for the ICP. However, a WLS estimator with weights equal to the square root of the number of outlets sampled within each group, a readily available metric for ICP, should minimize this loss of efficiency and is thus advisable.
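The weighting scheme can be illustrated with a small Python sketch. Under the assumptions that the only regressors are product and country dummies (so they are constant within each group) and that outlet-level disturbances are independent, scaling each group mean by the square root of its number of outlets and applying OLS reproduces the coefficients that OLS would give on the ungrouped outlet data. The grid size, outlet counts, price-generating process, and names below are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N, C = 3, 4                                    # small hypothetical grid
groups = [(n, c) for n in range(N) for c in range(C)]
K = rng.integers(2, 15, size=len(groups))      # unequal outlet counts K_g

def dummies(n, c):
    """CPD design row: intercept, N-1 product dummies, C-1 country dummies."""
    row = np.zeros(1 + (N - 1) + (C - 1))
    row[0] = 1.0
    if n > 0:
        row[n] = 1.0
    if c > 0:
        row[(N - 1) + c] = 1.0
    return row

# Simulate outlet-level log prices and form the group means.
X_ind, y_ind, X_grp, y_grp = [], [], [], []
for (n, c), k in zip(groups, K.tolist()):
    y = 0.3 * n + 0.5 * c + rng.normal(0.0, 0.2, k)
    X_ind += [dummies(n, c)] * k
    y_ind += list(y)
    X_grp.append(dummies(n, c))
    y_grp.append(y.mean())
X_ind, y_ind = np.array(X_ind), np.array(y_ind)
X_grp, y_grp = np.array(X_grp), np.array(y_grp)

# OLS on the ungrouped outlet observations.
b_ind, *_ = np.linalg.lstsq(X_ind, y_ind, rcond=None)

# WLS on group means: scale each row by sqrt(K_g), then apply OLS.
w = np.sqrt(K)
b_wls, *_ = np.linalg.lstsq(X_grp * w[:, None], y_grp * w, rcond=None)

# Unweighted OLS on group means, for comparison.
b_ols, *_ = np.linalg.lstsq(X_grp, y_grp, rcond=None)
```

Here `b_wls` coincides with `b_ind`, whereas `b_ols` generally does not. The equivalence does not carry over to the case where regressors vary within groups or where outlet disturbances within a group are correlated.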

Dickens (1990) argued that the above analysis requires an assumption that regression errors for the individual observations within groups are independent and identically distributed. To assume disturbances are independent is to make the unlikely assumption that individual outlet prices within a product group and country share no common unobserved characteristics. Dickens (1990) further argues that where grouping is by a common characteristic, weighting by group size can lead to more, rather than less, heteroskedasticity, producing inefficient parameter estimates and biased estimates of the standard errors. Of course simple tests for heteroskedasticity can be run to see if this is the case.

Kmenta (1986) also compared the variances of the parameter estimates using OLS on ungrouped data and WLS on grouped data, finding that there will always be some further loss of efficiency even if the disturbances are homoskedastic. Little efficiency will be lost if the within-group variation is small relative to the between-group variation in the (unobserved) Xs, and there will be no efficiency loss if all Xs are the same within a group. There will also be spurious increases in the $\bar{R}^2$ due to the grouping.

The effect of efficiency losses on the estimated parameters may not be trivial. Machado and Santos Silva (2001) found, for a hedonic regression of the rental price of digital computers on their characteristics, that the coefficients on the dummy variable for 1963 compared with 1960, equivalent to a country parity, were (in logarithms) -0.594 (OLS) and -0.211 (WLS), a 45 percent compared with a 19 percent fall. The difference between parameter estimates using OLS and WLS (weights based on number of observations) estimators was substantial even in this case where disturbances were homoskedastic.
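The percentage falls quoted above follow from converting the log-scale coefficients back to proportionate changes; the step is spelled out here for convenience:

$$100\,\bigl(1 - e^{-0.594}\bigr) \approx 44.8 \text{ percent}, \qquad 100\,\bigl(1 - e^{-0.211}\bigr) \approx 19.0 \text{ percent}.$$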

There is a further issue with a CPD formulation. If there is a latent unobserved variable that, say, takes higher values in some countries, the parity estimate will be biased and inconsistent as a result of multicollinearity between the omitted variable and the country dummies (with coefficients $\gamma_c^*$). Further, the omitted variable may instead be collinear only with the product dummies (with coefficients $\beta_n^*$) and not with the country dummies; say chicken breasts of a certain brand are sold in better-quality stores with stricter adherence to freshness than other poultry products. In that case the nature of omitted variable bias is that the parity estimate will be unbiased, but $\alpha_1^*$, upon which the parity estimate is benchmarked, will be biased and inconsistent, and the variance of the parity estimate will be upward biased (Kmenta, 1986).

Thus grouping can result in efficiency losses due to heteroskedasticity that may be mitigated or aggravated by the use of a WLS estimator, and that may also arise even if disturbances are homoskedastic. Further, spurious $\bar{R}^2$ values detract from the normal array of diagnostics for detecting heteroskedasticity. The reliance on only country and product dummies also serves to preclude the introduction of quality variations when an item is missing. We thus turn to consider the inclusion of quality characteristics in a hedonic CPD based on grouped data. Such characteristics will be part of the PS of a product, and price collectors can simply record any deviations from these specifications so that average changes in the quality of the product can be measured for a group.

B. A Hedonic CPD Regression Using Averages Across Outlets

Can average values of the quality characteristics be entered as explanatory variables in a CPD regression using averages?

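Consider (for each BH) a regression akin to equation (2) based on $g = 1,\ldots,G$ country-product groups, where $\bar{X}_{gj} = \sum_{k \in g} X_{kj} / K_g$ are the means of each explanatory variable $j$ within each country-product group:

$$p_g^* = \alpha_1^* + \sum_{n=2}^{N} \beta_n^* D_n + \sum_{c=2}^{C} \gamma_c^* D_c + \sum_{j} \lambda_j^* \bar{X}_{gj} + u_g^* \qquad (3)$$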

In this formulation we include quality characteristics in a CPD regression based on grouped averages. If, for example, chicken breasts in country 1 were on average of better quality in some measurable way than in country 2, then the parity estimates $\gamma_c^*$ would be conditioned on the higher values of $\bar{X}_j$. Again, tests for heteroskedasticity would need to be employed, since sample sizes may vary between countries and product groups, and WLS estimators used where appropriate. And again, even for homoskedastic disturbances, the resulting estimators may be inefficient compared with regressions on ungrouped data. A further issue also arises.

Machado and Santos Silva (2006) consider the case where the selection of groups is endogenous, in the sense that selection depends on unobserved characteristics affecting the dependent variable, something likely in the context of this paper; product and country grouping should affect price. Machado and Santos Silva (2006) and Dhrymes and Lleras-Muney (2006) demonstrate that if group selection is endogenous, consistent and efficient estimates can only be obtained if covariate characteristics are fixed within groups and WLS is used; otherwise the estimates will not be consistent. Such fixing of the covariates is at odds with the needs of this paper, which are to utilize methods that condition parity estimates on quality variation. However, the dependence underlying the endogeneity is conditioned on the $\bar{X}_{gj}$. Endogeneity bias will be minimal if most price-determining variables are included, that is, if the correlation between the endogenous grouping variable and the disturbance is small (Dhrymes and Lleras-Muney, 2006).12 Issues of endogeneity are considered further in the context of a panel structure to the data in the next section. A salient compromise is to utilize further stratification, that is, to include factors other than product and country (say, location and outlet type) in the regression as dummy variables. The prices would then be averaged over these more finely grouped data.

C. A Hedonic CPD Regression Based Only on Selected Stratifying Factors, not Covariates

Can we simply include the categorical hedonic quality characteristics within Xj in a CPD regression, say for location and outlet-type, effectively a country product outlet-type location dummy (CPOtLD) method?

Assume that $\bar{X}_j = [\bar{X}_{j1}, \bar{X}_{j2}]$, where $\bar{X}_{j1}$ is a selection of categorical variables such as outlet type (say supermarket, open-market store, department store, specialized store, discount store, other) and location (say capital/major city, other cities, towns, rural areas), and $\bar{X}_{j2}$ comprises the remaining categorical variables and all covariates. The inclusion of only $\bar{X}_{j1}$ in the regression is equivalent to stratifying by these variables, that is, calculating average prices for these groups and entering these averages on the left-hand side of a CPD regression, with appropriate dummies for outlet type and location on the right-hand side along with the country and product dummies.13 The parameter estimates on the country dummies are inefficient in not taking into account variation within these groupings, and potentially suffer from omitted covariate bias in excluding the $\bar{X}_{j2}$.

Dhrymes and Lleras-Muney (2006) consider whether it makes any difference, in terms of asymptotic efficiency, to use a finer or coarser grouping: if, say, “outlet-type” is included, is there any benefit from expanding the set of groups or loss from consolidating them? They find that finer groups always yield more efficient estimates of the structural parameters of interest.

Hill (2009) undertook some empirical work on improving the 2005 ICP parity estimates for the Asia-Pacific region by using location (urban/rural) and store-type averages with corresponding dummies in an extended-CPD framework. The results, while preliminary, found the coefficients on such variables over the 85 BHs studied to be, more often than not, statistically significant and to have an impact on the parity estimates. However, the signs on these included variables were often unexpected and raised concern about the reliability of the country coding used for these variables. An alternative approach is for the price collector to identify how the non-comparable replacement differs from the specification. The desk officer then makes an explicit adjustment to its price for these quality deviations using the principles outlined below.

D. Explicit Quality Estimates

Why should we use hedonic estimates to value the quality difference: why not use outlet specific estimates from comparable products on the outlet shelf or based on option costs?

First, such an approach was used in the 2005 ICP for the case where the package size differed. If the PS was for, say, a 1 kg bag of a specified quality of rice, and only a 0.5 kg bag was on the shelf, then the country desk official would make a quality adjustment by multiplying the price by two. Thus, in this limited case, quality adjustments were incorporated into the ICP. Second, the approach can be easily developed using the checklists, from which any change in the required specification is immediately apparent to the price collector. The difference in price arising from a unit change in quality may be apparent from observing the prices of other varieties of differing quality in that outlet. Since the replacement is to be compared with the missing specification for the specific outlet, there is a case for using the outlet estimate of the change in price of a unit quality characteristic, since outlet-specific factors are kept constant. However, there may not be a sufficient variety of models of the product on a store shelf. For PPP comparisons, use can be made of a wider sample of price-quality observations and the coefficients from hedonic regressions.
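The package-size adjustment in the rice example amounts to a proportional rescaling, which can be sketched in Python as follows. The function name and arguments are hypothetical, and the sketch assumes price is strictly proportional to quantity, which need not hold where larger packs carry a discount.

```python
def size_adjusted_price(observed_price, observed_size_kg, specified_size_kg):
    """Rescale the price of the pack found to the pack size in the specification."""
    if observed_size_kg <= 0 or specified_size_kg <= 0:
        raise ValueError("pack sizes must be positive")
    return observed_price * specified_size_kg / observed_size_kg

# A 0.5 kg bag priced at 3.0 in local currency, against a 1 kg specification:
adjusted = size_adjusted_price(3.0, 0.5, 1.0)   # the price is doubled
```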

Product experts, as opposed to price collectors, may be used to judge the value of quality differentials. Sticking to tight PSs proved to be highly problematic for “machinery and equipment goods” in the 2005 ICP. Participating countries were asked to price as many products and product types as possible, including the “preferred” make and model and also one or two “alternative” models where available. “Unspecified” models were also priced if neither preferred nor alternative models were available and, in fact, these eventually made up about 40 percent of the total observations in Asia with substantial overlap. Of the submitted data, experts determined whether prices could be used in spite of minor technical variations, and also whether and how adjustments could be made to prices for more significant quality variation. For example, the price of a tractor without roll-over protection (ROP) could be adjusted by including the cost of the ROP, if the latter was part of the PS (Burdette, 2007). Also: “As stressed by the Asian core country experts, the next generation of SPDs should include more technical characteristics to support hedonic type analysis.” (Burdette, 2007, p. 8).

E. Explicit Hedonic Quality Adjustments

Why not undertake country-specific hedonic adjustments using the results from hedonic regressions estimated for individual countries?

In (3) the coefficients on the quality characteristics for each country are constrained to be the same. This is an essential characteristic of a dummy variable hedonic regression-based method. By constraining slope parameters to be the same, the difference between the country intercepts, as a measure of the price parity, is invariant to the value of the $X_{jc}$ characteristics (Triplett, 2004). Yet, say a product available in country 1 with a PS of $X_{j1} = 10$ was unavailable in some outlet(s) in country 2 and replacement(s) found with $X_{j2} = 12$. We could simply run a hedonic regression just using country 2 data and make an explicit quality adjustment to the (log of the) price in country 2 of $\hat{\rho}_{j2}^*\,(12 - 10)$, where $\hat{\rho}_{j2}^*$ is an estimated coefficient from a hedonic regression of $\bar{p}_{j2}^*$ on $\bar{X}_{j2}$ using country 2 outlet data, that is:


A quality-adjusted price for country 2 can be directly compared to the actual country 1 price using a desirable index number formula without any need for a CPD regression.

Alternatively, the explicit quality adjustment $\hat{\rho}_{j1}^*\,(10 - 12)$ is made to the price of country 1, where $\hat{\rho}_{j1}^*$ is an estimated coefficient on $X_{j1}$ from a hedonic regression estimated using country 1 data. However, the two methods will produce different results unless $\hat{\rho}_{j1}^* = \hat{\rho}_{j2}^*$. Given no a priori preference for either approach, a symmetric average of the two approaches is deemed appropriate (Feenstra, 1995) and indeed can be formulated so as to correspond to a superlative index number formula (Diewert, 2005).
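A worked illustration may help fix ideas. It rests on assumptions that are not taken from the sources cited: the coefficient values are invented, each adjustment is taken to be subtracted from the log price of the country being adjusted, and the symmetric average is taken to be the geometric mean of the two parities (the arithmetic mean of their logarithms). Suppose $\hat{\rho}_{j1}^* = 0.04$ and $\hat{\rho}_{j2}^* = 0.06$, and let $p_1^*$ and $p_2^*$ be the observed log prices. Adjusting the country 2 price to the country 1 specification, and alternatively the country 1 price to the country 2 replacement, gives two log parities for country 2 relative to country 1:

$$\bigl[p_2^* - 0.06\,(12 - 10)\bigr] - p_1^* = p_2^* - p_1^* - 0.12, \qquad p_2^* - \bigl[p_1^* - 0.04\,(10 - 12)\bigr] = p_2^* - p_1^* - 0.08,$$

$$\tfrac{1}{2}\bigl[(p_2^* - p_1^* - 0.12) + (p_2^* - p_1^* - 0.08)\bigr] = p_2^* - p_1^* - 0.10 .$$

On these assumptions the averaged quality adjustment lowers the unadjusted parity by a factor of $e^{-0.10} \approx 0.905$, that is, by about 9.5 percent.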

The estimates of $\hat{\rho}_{jc}^*$ suffer from all the defects of using grouped data outlined above and, further, may have insufficient or limited degrees of freedom. One possibility is to estimate the $\hat{\rho}_{jc}^*$ from individual data and then incorporate the estimates into the hedonic CPD using a two-stage least squares estimator. Dhrymes and Lleras-Muney (2006) demonstrate how efficiency gains can arise from the use of a mixed-2SLS (M2SLS) estimator where the dependent variable, our prices, would only be available for groups, whereas endogenous regressors are available at the individual level. Instruments would be required for the first stage estimation, but the idea of working with some variables at the outlet level and others at the grouped level begs the question as to why work at the grouped level in the first place. We now scratch this itch and advocate the use of ungrouped data and introduce some further estimation issues at the ungrouped level. Most importantly, we take formal account of the panel structure of the data.

IV. The Hedonic CPD Method and Use of Ungrouped Data

In this section we consider how to practically estimate the parities from hedonic CPD models using ungrouped data. We assume that we have available for the estimation data on prices for each product from each outlet in each country. As a normal part of ICP methods each of these product prices will have attached the detailed product/outlet-type specifications that define it. Variation in the specification will be recorded if a non-comparable replacement is sought and found. Incorporating quality variation into the measurement of CPD price parities will improve the efficiency of the estimates, remove potential bias, and enable the inclusion of non-comparable replacements.

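The outlet variation may be explicitly modeled as a country-invariant, randomly and independently distributed interaction term over outlet $k$ with product $n$ and country $c$ that for simplicity we denote by $\delta_{knc}^*$. The price in outlet $k$ of product $n$ in country $c$ is given by:

$$p_{knc}^* = \alpha_1^* + \sum_{n=2}^{N} \beta_n^* D_n + \sum_{c=2}^{C} \gamma_c^* D_c + \delta_{knc}^* + u_{knc}^* \qquad (5)$$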

As will be outlined below, the outlet interaction term $\delta_{knc}^*$ will be ignored in practical regression work because of the degrees of freedom it would consume. The outlet interaction term will instead be treated as a latent variable. There may be variation in the price specification actually used; for example, only the price of frozen skinless chicken breasts may be available in an outlet when the PS is for frozen chicken breasts with skin. Since this variation is specific to an outlet it would be captured by $\delta_{knc}^*$. However, an alternative, parsimonious representation is to include a set of $j = 1,\ldots,J$ quality characteristics taking the values $X_j$, specific to each BH and inclusive of the characteristics of all $n$ PSs in a BH, of which, for example, $X_1$ may be that the poultry is “with skin.” These can be included in equation (5) in place of the latent variable, that is:

$$p_{knc}^* = \alpha_1^* + \sum_{n=2}^{N} \beta_n^* D_n + \sum_{c=2}^{C} \gamma_c^* D_c + \sum_{j=1}^{J} \lambda_j^* X_{kjc} + u_{knc}^* \qquad (6)$$

The main concern of this section is the inclusion of the $X_j$ and the implications of doing so for the estimator.

A. A CPD Regression with Outlet Interaction Terms

Can we simply include the interaction term $\delta_{knc}^*$ in the CPD as in equation (5), effectively a country product outlet dummy (CPOD) method?

Omitted variable bias would arise if, say, higher-quality products sold in higher-quality outlets were more prevalent in some countries than in others, that is, if there were multicollinearity between the country and product/outlet quality effects. The inclusion of $\delta_{knc}^*$ would remove such bias, but would be very demanding in terms of degrees of freedom, especially since the same outlets would not necessarily be in the same countries, though more than one PS, say type of poultry, may be available in the same outlet. Further, we would have the incidental parameter problem: as $KN$ increases for fixed $C$, the coefficients on the dummy variables would not be consistent, since the number of these parameters increases as $KN$ increases (Baltagi, 2005).

B. A Hedonic CPD Regression

Can we simply include the hedonic quality characteristics in a CPD regression instead of the interaction term, as in equation (6)?

Equation (6) can be seen to be a fixed effects (FE) panel estimator where the variation is over $c = 1,\ldots,C$ countries as opposed to, more usually, over time $t = 1,\ldots,T$, though the principles remain the same. Note that we still include in equation (6) the $N - 1 = 8 - 1 = 7$ product dummies for the different types of poultry, but each of these products will have a detailed PS, including size, brand, free range or otherwise, skinless, frozen, sold unpackaged, and so forth, and outlet variables, such as market trader, independent butcher, supermarket, hypermarket, capital city, major city, rural area and so forth. The $X_j$ are included as parsimonious proxies for the outlet/PS interaction terms $\delta_{knc}^*$. We can improve on the specification of (6) by also including interaction terms between the quality characteristics and the products:

$$p_{knc}^* = \alpha_1^* + \sum_{n=2}^{N} \beta_n^* D_n + \sum_{c=2}^{C} \gamma_c^* D_c + \sum_{j=1}^{J} \lambda_j^* X_{kjc} + \sum_{n=2}^{N} \sum_{j=1}^{J} \tau_{nj}^* D_n X_{kjc} + u_{knc}^* \qquad (7)$$

Thus, for example, a certain brand of poultry may command a price premium, but the premium may be larger for some products, say prepared ready-to-cook chicken dinners, than for others, say frozen chicken breasts. The number of $\tau_{nj}^*$ estimates can be reduced by appropriate tests.

Of course, if there are no product replacements and the characteristics remain the same across countries (they are country-invariant), then there is no between-country variation. The $\lambda_j^*$ cannot be identified by the FE estimator and, indeed, there is no need for them to be identified since there is no need for quality adjustments. However, where there are non-comparable product replacements, the CPD hedonic formulations in (6) and (7) can be used to provide estimates of country price parities which incorporate quality-adjusted non-comparable replacements.
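The role of the quality characteristics can be illustrated with a simulated example in Python. This is a sketch under stated assumptions, not an estimate from ICP data: there is a single binary characteristic (say 1 = “with skin”, as specified), replacements that deviate from the specification are assumed to be more frequent in some countries than in others, and every parameter value and name is hypothetical. The plain CPD regression omits the characteristic; the hedonic CPD includes it alongside the product and country dummies.

```python
import numpy as np

rng = np.random.default_rng(2)
N, C, K = 4, 5, 30                            # products, countries, outlets per cell
beta = np.array([0.0, 0.2, 0.4, 0.6])         # hypothetical log product effects
gamma = np.array([0.0, 0.3, 0.6, 0.9, 1.2])   # hypothetical log parities
lam = 0.5                                     # hypothetical log premium for the characteristic

n_idx = np.repeat(np.arange(N), C * K)
c_idx = np.tile(np.repeat(np.arange(C), K), N)

# x = 1 if the item matches the specification, 0 for a lower-quality replacement;
# the share of replacements is assumed to rise with the country index.
x = (rng.random(N * C * K) > 0.15 * c_idx).astype(float)
log_p = (1.0 + beta[n_idx] + gamma[c_idx] + lam * x
         + rng.normal(0.0, 0.1, N * C * K))

D_n = (n_idx[:, None] == np.arange(1, N)).astype(float)
D_c = (c_idx[:, None] == np.arange(1, C)).astype(float)
ones = np.ones(N * C * K)

def country_coefs(X):
    """OLS coefficients on the C-1 country dummies (the log parities)."""
    coef, *_ = np.linalg.lstsq(X, log_p, rcond=None)
    return coef[N:N + C - 1]

cpd = country_coefs(np.column_stack([ones, D_n, D_c]))         # plain CPD
hedonic = country_coefs(np.column_stack([ones, D_n, D_c, x]))  # hedonic CPD

bias_cpd = np.abs(cpd - gamma[1:]).max()
bias_hedonic = np.abs(hedonic - gamma[1:]).max()
```

In this set-up the plain CPD understates the parities of the countries with more lower-quality replacements, because their lower prices are attributed to the country rather than to quality, while the hedonic CPD recovers the parities closely. The size of the gap depends entirely on the assumed premium and replacement rates.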

Empirical work of this nature is sparse. Aten (2003 and 2006) undertook inter-area price comparisons using U.S. data on nearly 200 item strata for 38 U.S. geographical (metropolitan) areas using a hedonic CPD—her “long” method. In Aten (2006) the hedonic CPD estimates are rerun for about 50 item strata, but the CPD for the remaining strata, so there are no published direct comparisons between the hedonic CPD and CPD methods. Aten subsequently compared14 the hedonic CPD with the CPD estimates on 50 item strata and found the (unweighted geometric mean) difference in the 38 inter-area price level comparisons to range from -11 percent to 16 percent. Of the 38 inter-area comparisons, nine showed a substantial difference between the hedonic CPD and CPD estimates of more than +7 percent or less than -7 percent, though a further nine area comparisons showed smaller differences of between +2 and -2 percent.

C. A Pooled Cross-Country Hedonic Regression

Why not exclude the product fixed effects and use OLS on pooled data?

Early studies of U.S. inter-area price comparison just used the quality characteristics of the product in an outlet to control for product variation—Kokoski, Cardiff, and Moulton (1994) and Kokoski, Moulton and Zieschang (1999), that is, they pooled the data and excluded product fixed effects.

Such a model is given by:


which excludes the fixed effect dummies from equation (6) and also the interaction terms in (7), and thus has potential omitted variable bias. The extent of any bias arising from such omissions will vary between BHs and is an empirical issue.

The OLS assumption that νnc* ~ iid(0, σ²) for all n and c ignores the panel structure of the data. In (8), νnc* = βn* + ηnc*, where the ηnc* are uncorrelated with Xjc and the βn* are the product-specific effects. The βn* are distinguished as a component of the error term νnc* since observations on the same product are likely to be more similar than observations from different products. If cov(βn*,ηjc*)=0 then (5) can be estimated as a RE panel estimator using GLS.15
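A minimal sketch of the GLS transformation behind a one-way RE estimator may help fix ideas. It uses the standard textbook quasi-demeaning weight for a balanced panel, which is an assumption here rather than a formula given in the text; the function names are hypothetical.

```python
import math


def re_theta(var_product_effect, var_idiosyncratic, n_countries):
    """Quasi-demeaning weight for a balanced one-way RE model.

    theta = 1 - sqrt(var_e / (var_e + C * var_beta)), where C is the number
    of countries in which each product is observed.
    """
    return 1.0 - math.sqrt(
        var_idiosyncratic / (var_idiosyncratic + n_countries * var_product_effect)
    )


def quasi_demean(values, theta):
    """Subtract theta times the product-specific mean from each observation."""
    m = sum(values) / len(values)
    return [v - theta * m for v in values]
```

With no product-effect variance the weight is zero and the transformed data are the pooled data; as the product-effect variance grows the weight approaches one and the transformation approaches the within (FE) transformation.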

OLS estimates of (8) would still be asymptotically unbiased; however, the standard errors of the estimates would be understated and the OLS estimates would not be as efficient as GLS ones. Moulton (1986) compared OLS estimates and GLS estimates for three panel applications: a hedonic housing model, a housing demand model, and an earnings function, finding substantial, and for the housing demand study, dramatic, downward bias in the OLS standard errors and thus misleading inferences. The precision of the estimates suffered particularly when average group size and intra-group error correlation were large. The latter would be particularly true when there are variables with repeated values within a group, as is likely for the quality variables Xkjc.
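The size of the understatement can be gauged with the variance inflation factor usually associated with Moulton's analysis. The sketch below assumes equal group sizes, in which case the ratio of the true to the conventional OLS variance is 1 + (m − 1)ρeρx, with m the group size, ρe the intra-group error correlation, and ρx the intra-group correlation of the regressor. The function names are hypothetical and the equal-group-size simplification is an assumption.

```python
import math


def moulton_factor(group_size, rho_error, rho_regressor=1.0):
    """Variance inflation from ignoring intra-group correlation (equal group sizes).

    rho_regressor defaults to 1.0, the case of a regressor that takes the same
    value for every observation in a group.
    """
    return 1.0 + (group_size - 1) * rho_error * rho_regressor


def corrected_se(ols_se, group_size, rho_error, rho_regressor=1.0):
    """Scale a conventional OLS standard error by the square root of the factor."""
    return ols_se * math.sqrt(moulton_factor(group_size, rho_error, rho_regressor))
```

For example, with ten observations per product and an intra-group error correlation of 0.1, a characteristic that is constant within the product has a true sampling variance nearly twice the conventional OLS figure.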

Evans et al. (1993) estimated the coefficient on ‘concentration’ in a concentration-price regression using panel data on 1,000 city pairs of airline carriers over specific routes over 21 quarters. They rejected the null hypothesis that concentration (Xjc) is unrelated to the route-specific effect (βn*), i.e. they rejected the null of cov(βn*,Xjc)=0. As a result the FE model was preferred to pooled OLS, whose estimates are biased when the route-specific effects are correlated with the regressor. Such decisions as to which estimator to use can have a major impact on the results. The pooled OLS estimated coefficient was 0.166 compared with 0.230 for the fixed effects estimated coefficient, the difference increasing (0.214 for OLS and 0.577 for FE) when instrumental variable estimators were used (since concentration was also correlated with the error term).

Heravi, Heston and Silver (2003) provided a number of estimates of price parities for television sets including a formulation such as (6). However, they used scanner data for three countries and found limited matching of models across countries, thereby justifying a pooled OLS estimator. More particularly, Silver and Heravi (2007a) use the formulation in equation (6) to investigate the trade-off between tight specifications, which give poor coverage of the items compared, and loosely defined specifications, which allow greater coverage but result in inappropriate quality comparisons. They provide an analytical framework and empirical results, based on a simulation of loosening the specifications, identifying conditions under which the bias from poor coverage from tight specifications may be offsetting or compounding, and thus severe.

D. The Choice of Estimator for a Hedonic CPD Regression

Can we simply use a fixed effects OLS estimator for (6) and (7)?

Fixed effects estimator

It is well established from the panel data literature that an OLS FE estimator provides an unbiased, consistent estimate of δj* if cov(βn*,Xj) ≠ 0 and cov(γc*,Xj) ≠ 0, that is, there is an assumption of endogeneity of all the regressors and both product and country effects.16 We expect, a priori, that cov(βn*,Xj) > 0, since higher-priced products tend to have better (quality) specifications, and that cov(γc*,Xj) > 0, since products in higher-priced countries tend to be of better quality.17 The FE estimator also has an intuition with regard to incorporating quality changes, as outlined in Annex 2. In practice a FE estimator does not use dummy variables, but equivalently regresses the differences in pknc on the differences in Xj. Annex 2 demonstrates this equivalence and how the price parity estimated by a hedonic CPD model is, after taking expectations, the cross-country difference in the average price levels of comparable products plus any cross-country difference in the quality characteristics multiplied by a hedonic estimate of its valuation.
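The within (demeaning) form of the FE estimator can be sketched on a small noise-free example. The data below are hypothetical and are constructed so that products with higher effects also have better specifications, the case in which pooled OLS is biased. A balanced panel and a single characteristic are assumed for simplicity.

```python
def two_way_within(table):
    """Demean a balanced products-by-countries table by row, column and grand mean."""
    n_rows, n_cols = len(table), len(table[0])
    row_means = [sum(row) / n_cols for row in table]
    col_means = [sum(table[i][j] for i in range(n_rows)) / n_rows for j in range(n_cols)]
    grand = sum(row_means) / n_rows
    return [
        [table[i][j] - row_means[i] - col_means[j] + grand for j in range(n_cols)]
        for i in range(n_rows)
    ]


def flat(table):
    """Flatten a table into a single list of observations."""
    return [v for row in table for v in row]


def ols_slope(x, y):
    """Slope from a simple regression of y on x with an intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Hypothetical parameters: product effects, country effects, quality valuation.
beta = [1.0, 2.0, 3.0]
gamma = [0.0, 0.3, 0.6]
delta = 0.2

# Rows are products, columns are countries; higher-effect products have higher x.
x = [[1.0, 1.5, 2.5], [2.0, 3.0, 3.5], [3.0, 3.5, 5.0]]
p = [[beta[n] + gamma[c] + delta * x[n][c] for c in range(3)] for n in range(3)]

pooled = ols_slope(flat(x), flat(p))
fe = ols_slope(flat(two_way_within(x)), flat(two_way_within(p)))
```

The within estimate recovers the valuation of 0.2 used to generate the data, while the pooled slope is several times larger because it attributes the product and country effects to the characteristic.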

Random effects estimator

A RE panel model treats product and country-specific variation as random draws from respective zero mean distributions, i.e. cov(βn*,Xj)=0 and cov(γc*,Xj)=0. The RE model assumes exogeneity of all the regressors and both product and country effects, unlike the FE model that allows for endogeneity of all the regressors. A RE model is estimated using GLS or ML estimators. A panel FE estimator as in (4) and (5) provides consistent estimates of δj* even if the RE model is valid.18 However, since the RE model uses both within and between product variation across countries it does not require the product specific dummy variables. The RE model is thus more efficient in such circumstances.19 Baltagi (1981), using Monte Carlo experiments, considered the merits of alternative estimators for a specification akin to (4), in which the component (fixed) effects are not correlated with Xj. Using the mean squared error (MSE) criterion, he found that there was always a gain, and a substantial one, from performing two-stage generalized least squares (GLS) compared with an ordinary least squares (OLS) or a FE estimator. All the GLS estimators considered outperformed OLS ones, though all the GLS estimators had fairly close MSEs. The recommendation was to use more than one GLS procedure and, if the results differ widely, test the specifications of the model. Thus while, a priori, there is reason to use a CPD FE estimator, a Hausman (1978) test as to whether to reject a RE specification would be appropriate (Baltagi, 2005). However, Hausman and Taylor (1981) propose an alternative approach.
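For a single coefficient of interest, the Hausman statistic takes a simple form that can be sketched as follows. The scalar version, the function name, and the numbers in the usage lines are illustrative assumptions; they are not estimates from any study cited here.

```python
# 5 percent critical value of a chi-squared distribution with 1 degree of freedom.
CHI2_1DF_5PCT = 3.841


def hausman_statistic(b_fe, var_fe, b_re, var_re):
    """Scalar Hausman statistic: (b_FE - b_RE)^2 / (Var_FE - Var_RE).

    Under the null that the RE assumptions hold, RE is efficient, so the
    variance difference should be positive; otherwise the statistic is undefined.
    """
    var_diff = var_fe - var_re
    if var_diff <= 0:
        raise ValueError("Var(FE) must exceed Var(RE) for the statistic to be defined")
    return (b_fe - b_re) ** 2 / var_diff


# Hypothetical estimates of one quality coefficient and their sampling variances.
h = hausman_statistic(b_fe=0.30, var_fe=0.004, b_re=0.20, var_re=0.002)
reject_re = h > CHI2_1DF_5PCT
```

With several characteristics the same idea applies to the vector of coefficient differences and the difference of the two covariance matrices, with degrees of freedom equal to the number of coefficients compared.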

Hausman and Taylor estimator

Hausman and Taylor (1981) have as their focus a potential correlation between the individual effects, the latent outlet-item interaction variable, δj* in equation (3), and the explanatory variables Xjc. We stress that the individual outlet-item interaction effect is a latent unmeasured omitted variable that is excluded in the estimation for reasons given above. If this latent interaction term is correlated with measured variables Xjc, OLS and GLS estimates of δj* are biased and inconsistent. Hausman and Taylor (1981) divide the Xjc=[Xjc1,Xjc2] into Xjc1 endogenous variables that are correlated with the latent interaction term and Xjc2 exogenous variables that are uncorrelated with the latent interaction term. These in turn are divided into, in this context, country-invariant or country-varying variables, Xjc1=[Xjc1ci,Xjc1cv] and Xjc2=[Xjc2ci,Xjc2cv]. The FE estimator wipes out the country-invariant Xjc1ci and Xjc2ci variables. The GLS RE estimator ignores the endogeneity due to the interaction term and thus yields biased and inconsistent estimates of the parameters on Xjc1. However, Hausman and Taylor (1981) propose an instrumental variable estimator that uses the Xjc2 exogenous variables as instruments. The country-varying exogenous variables Xjc2cv, which are uncorrelated with the latent interaction term, serve two functions: (i) as deviations from individual means they produce unbiased estimates of δj*, and (ii) using the individual means they provide valid instruments for estimating the parameters of the country-invariant variables. The method is an improvement on the FE estimator in that it is more efficient and produces unbiased estimates of the parameters of the country-invariant variables. Hausman and Taylor (1981) find from an empirical application that when correlations between the variables of interest and the latent variable are taken into account the traditional estimates are revised markedly; see also Baltagi and Levin (1986).

The HT estimator requires at least as many exogenous country-varying variables as there are endogenous country-invariant variables for it to be more efficient than an FE estimator; otherwise the HT estimator is identical to the FE one since the parameters on the country-invariant variables Xjc1ci and Xjc2ci cannot be estimated.

Tests for choosing among estimators

There are of course tests to help choose between the above specifications and estimators. Of initial interest is whether the product dummies in the “P” term of any CPD serve any purpose; that is, the null hypothesis that βn*=β* can be tested. Baltagi (1981) demonstrates that the Chow test performs poorly due to its invalid assumption that the variance of the cross-sectional effects is homoskedastic, and proposes alternative tests that include the Roy statistic. Importantly, Hausman and Taylor (1981) provide a specification test for the non-correlation assumptions required for an RE model. More generally, Baltagi, Bresson, and Pirotte (2003) provide tests to choose between FE, RE, or HT estimators. They advocate that first, a standard Hausman test based on the FE versus RE estimators is undertaken and the RE estimator used if it is not rejected. If rejected, a second Hausman test is undertaken based on the difference between the FE and HT estimators, and an HT estimator used if some of the variables, but not all, are correlated with individual effects. Otherwise an FE estimator is used.
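The two-step pretest procedure just described can be sketched as a simple decision rule. The function name, the use of p-values as inputs, and the 5 percent default significance level are assumptions made for illustration.

```python
def choose_estimator(p_fe_vs_re, p_fe_vs_ht, alpha=0.05):
    """Pick RE, HT, or FE from the p-values of two Hausman tests.

    p_fe_vs_re: p-value of the Hausman test based on FE versus RE.
    p_fe_vs_ht: p-value of the Hausman test based on FE versus HT.
    """
    if p_fe_vs_re >= alpha:
        # RE assumptions not rejected: use the more efficient RE estimator.
        return "RE"
    if p_fe_vs_ht >= alpha:
        # RE rejected but HT not rejected: some, not all, regressors are endogenous.
        return "HT"
    # Both rejected: fall back on the FE estimator.
    return "FE"
```

For example, `choose_estimator(0.01, 0.40)` returns `"HT"`: the RE specification is rejected, but the HT set of exogeneity assumptions is not.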

V. Summary

The 2005 ICP used the regression-based CPD approach to derive BH parity estimates. A number of econometric issues arise, not least because of the panel structure of the data. This paper seeks to address such issues. Of particular concern was the use of grouped data for the 2005 ICP, that is, average prices across outlets for each of the N products in the C countries, thus ignoring between-outlet price variation in the regression with an attendant loss of efficiency and potential omitted variable bias.

The 2005 ICP included as an innovation detailed product specifications on each item priced. This enabled a more precise matching of the prices of products with like characteristics across countries. It also enabled, albeit for the large part only in principle, the collection of the prices of non-comparable replacements for missing items along with details of the quality differences. A serious problem with PPP estimates is bias from either a loss of representativity, due to the omission of such non-comparable products, or the inclusion of non-comparable replacement item prices as if they were comparable. This paper argues for the inclusion in the 2011 ICP round of non-comparable replacement items in a hedonic CPD, bringing together the innovations of detailed product descriptions and a regression-based approach to parity estimates. It outlines some of the econometric issues.

The use of grouped data in CPD regressions results in efficiency losses due to heteroskedasticity that may be mitigated, or aggravated, by the use of a WLS estimator, but may also arise even if the disturbances are homoskedastic. Further, spurious $\bar{R}^2$ and related statistics from grouped data detract from the normal array of diagnostics for detecting heteroskedasticity. If grouped data are to continue to be used, which we advise against, the collection and inclusion of non-comparable replacements for missing items is advocated with such data incorporated into a hedonic CPD that makes use of the group averages of quality characteristics.

Most price determining characteristics should be included for the parity estimates from a hedonic CPD to be consistent and to militate against omitted variable bias. A strategy of further stratification of a CPD regression by factors such as outlet-type and location is a limited case of the inclusion of most price-determining variables. If further stratification is undertaken, a finer coding is preferable on efficiency grounds to a coarser one, and experience has found that great care should be exercised in the country coding used in measuring these stratifying factors.

A less preferred alternative, on econometric grounds, for quality adjustments to the prices of non-comparable replacements, is to devise explicit adjustments using information from price collectors on different varieties, say similar brands on the outlet shelf, or from experts based on option costs. Explicit estimates can also be generated from hedonic regressions, but if use is to be made of hedonic methods, the preference is to incorporate the quality characteristics into the (hedonic) CPD regression. The arguments for using explicit quality estimates apply as much to grouped data as to ungrouped, which we now turn to.

The shortcomings of using grouped data can of course be remedied by using in the CPD regression the ungrouped data from which the grouped data were derived. Again a hedonic CPD is argued to be the preferred approach to enable the incorporation of non-comparable replacements and condition the parity estimates on such quality variation. Quality characteristics in a CPD are deemed to be proxy variables to capture latent outlet-product-country variables whose inclusion is precluded by degrees of freedom issues. A hedonic CPD formulation, possibly with country-product interaction terms, is argued to be preferable to both separate CPD and hedonic ones. However, and more importantly, consideration is given to the panel structure of the data and the choice between, and test for, fixed effects, random effects, and Hausman-Taylor estimators, an issue that also applies to grouped data.

Annex 1. Characteristic Price Index Numbers20

We consider characteristic price (hedonic) index numbers as an alternative methodology to hedonic CPD indexes. Characteristic price indices do not use a regression model for the parity estimates, but do for the quality adjustment. They have been proposed when there is a large turnover in new models and thus little panel structure to the data, such as for personal computers (see Berndt and Rappaport (2001), Pakes (2003) and Silver and Heravi (2005)). They take an index number form, with the quantity of characteristics held constant for each product in each country but valued by country 2 hedonic parameters (shadow prices) in the numerator and country 1 hedonic parameters in the denominator. A family of such index numbers suggests itself depending on which country’s quantities (or which average of country quantities) are held constant. For example, holding country 1 or country 2 quantities constant yields:

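$$\frac{\hat{\delta}_{j2}^*(X_{j2})}{\hat{\delta}_{j1}^*(X_{j2})} \quad \text{and} \quad \frac{\hat{\delta}_{j2}^*(X_{j1})}{\hat{\delta}_{j1}^*(X_{j1})} \tag{A1.1}$$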

Superlative index number formulas can be defined as specific forms of symmetric means of the two formulas or of symmetric mean values of Xj1 and Xj2 (Silver and Heravi, 2007b).
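A minimal numerical sketch of the two characteristic price indexes in (A1.1), and of one symmetric mean of them, is given below. It assumes a semi-logarithmic hedonic function estimated separately for each country and evaluated at each country's mean characteristics; the functional form, the function names, and all coefficient values are hypothetical.

```python
import math


def predicted_price(intercept, slopes, x):
    """Predicted price from a semi-log hedonic function ln p = a + sum_j b_j x_j."""
    return math.exp(intercept + sum(b * v for b, v in zip(slopes, x)))


def characteristic_price_indexes(h1, h2, x1, x2):
    """Country 2 relative to country 1, holding characteristics fixed.

    h1, h2: (intercept, slopes) hedonic estimates for countries 1 and 2.
    x1, x2: characteristic quantities (for example means) for countries 1 and 2.
    """
    at_x1 = predicted_price(*h2, x1) / predicted_price(*h1, x1)
    at_x2 = predicted_price(*h2, x2) / predicted_price(*h1, x2)
    return {
        "country1_characteristics": at_x1,
        "country2_characteristics": at_x2,
        "geometric_mean": math.sqrt(at_x1 * at_x2),
    }


# Hypothetical hedonic estimates: characteristics are memory (GB) and a brand dummy.
h1 = (5.0, [0.10, 0.30])
h2 = (5.2, [0.08, 0.35])
indexes = characteristic_price_indexes(h1, h2, x1=[3.0, 1.0], x2=[4.0, 1.0])
```

The two fixed-characteristic indexes differ because the hedonic parameters differ between countries; when the slopes are constrained to be equal, as in a hedonic CPD, both collapse to the exponent of the difference in intercepts.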

Note that this is very different from the constrained $\hat{\delta}_j^*$ in (4) and (5) since the very essence of (8) is that $\hat{\delta}_{j1}^*$ and $\hat{\delta}_{j2}^*$ differ.

Equations (5) and (8) are based on quite different premises. The regression-based (5) holds the parameters constant so that the difference in intercepts is invariant to any value of Xj, while the essence of (8) is to allow the parameters to change while holding the Xj constant. Such differences in approaches were considered empirically in Silver and Heravi (2007b) and more formally in Diewert, Heravi, and Silver (2008), hereafter DHS (2008).21

The numerical difference between the two methods was shown in DHS (2008) to depend on the exponent of the product of the (expenditure-share weighted country) differences in the coefficients, in the mean values of Xjc, and in the relative characteristics variance-covariance matrix.22

Annex 2. An Intuition for a Hedonic CPD

Some insight into the quality adjustment arising from the hedonic CPD in (5) based on a FE panel estimator can be seen by its evaluation in terms of differences. Consider for simplicity a bilateral comparison of countries c=1,2 and a product defined by country-invariant quality characteristics, Xn1=Xn2=Xnc, that change over products but do not change over countries, and by, for simplicity, a single country-varying characteristic Znc that may change over products and countries. For example, the specification to be priced is a 3 GB (memory) personal computer (PC) whose price is observed in country 1 which, other things equal across countries, has a non-comparable replacement of a 4 GB PC in country 2, Zn1 = 3 and Zn2 = 4.

Consider for countries c = 1, 2:

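$$p_{nc}^* = \alpha_1^* + \gamma_2^* D_2 + \delta^* X_{nc} + \lambda^* Z_{nc} + \omega_{nc}^*, \quad \text{where } \omega_{nc}^* = \beta_n^* + e_{nc}^* \tag{A.2.1}$$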

and where $\hat{\gamma}_2^* = (\hat{\alpha}_2^* - \hat{\alpha}_1^*)$ is the estimated parity and $\hat{\delta}^*$ and $\hat{\lambda}^*$ are constrained to be the same across the two countries. Equation (A.2.1) can be seen to be made up of separate equations for countries 1 and 2 given respectively by:


though equation (A.2.1) makes the additional assumption that ωnc*=ωn1*=ωn2*.

The difference Δpn* = (pn2* − pn1*) is, by construction of Xnc and Znc, given by:


Note that we have dropped the country-invariant characteristics Xnc and the product-specific effects βn* in (A.2.1) since:


It is well established that an OLS regression of the differences in (A.4) is equivalent to a FE dummy variable model regression given by (5) and, when there are many products, is computationally much easier (Davidson and MacKinnon, 1993). What (A.4) tells us is that for a given product n, the price parity measured by a hedonic CPD estimator is, after taking expectations, the cross-country difference in the average price levels of comparable products (the constant change) plus any cross-country difference in quality characteristics between the original PS and the non-comparable replacement’s PS each multiplied by constrained estimates of their hedonic valuation. The use of the FE panel estimator can be seen to have an intuition.
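As a worked sketch of the differencing argument, and only under the assumptions stated in this annex (a bilateral comparison, country-invariant Xnc, a common product effect βn*, and a dummy D2 equal to 1 for country 2 and 0 for country 1), the steps can be written out as follows. The labels (i) to (iii) are ours and are not the annex's own equation numbers.

```latex
\begin{align*}
p_{n1}^* &= \alpha_1^* + \delta^* X_{nc} + \lambda^* Z_{n1} + \beta_n^* + e_{n1}^* \tag{i}\\
p_{n2}^* &= \alpha_1^* + \gamma_2^* + \delta^* X_{nc} + \lambda^* Z_{n2} + \beta_n^* + e_{n2}^* \tag{ii}\\
\Delta p_n^* = p_{n2}^* - p_{n1}^* &= \gamma_2^* + \lambda^* (Z_{n2} - Z_{n1}) + (e_{n2}^* - e_{n1}^*) \tag{iii}
\end{align*}
```

Subtracting (i) from (ii) removes δ*Xnc and βn* because both are identical in the two countries. In the PC example, Zn2 − Zn1 = 4 − 3 = 1, so (iii) reduces to the parity γ2* plus the hedonic valuation λ* of the additional gigabyte, plus the difference in the disturbances.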


References

  • Aizcorbe, A., Corrado, C. and Doms, M. (2000). Constructing Price and Quantity Indexes for High Technology Goods, Industrial Output Section, Division of Research and Statistics, Board of Governors of the Federal Reserve System, July.

  • Aten, Bettina H. (1996). Evidence of Spatial Autocorrelation in International Prices, Review of Income and Wealth, June, 42, 2, 149–63.

  • Aten, Bettina (2003). “Report on Interarea Price Levels, 2003,” Working Paper No. 2005-11, Bureau of Economic Analysis, May.

  • Aten, Bettina H. (2006). Interarea Price Levels: An Experimental Methodology, Monthly Labor Review, September, 129, 9, 47–61.

  • Balk, Bert M. (2005). “Price Indexes for Elementary Aggregates: The Sampling Approach,” Journal of Official Statistics, Vol. 21, No. 4, pp. 675–99.

  • Baltagi, Badi H. (1981). A Two-Way Error Component Model, Journal of Econometrics, 17, 21–49.

  • Baltagi, Badi H. (2005). Econometric Analysis of Panel Data, Chichester: John Wiley & Sons.

  • Baltagi, Badi H., Bresson, Georges and Alain Pirotte (2003). Fixed Effects, Random Effects or Hausman-Taylor? A Pretest Estimator, Economics Letters, 79, 361–369.

  • Baltagi, Badi H. and Dan Levin (1986). Estimating Dynamic Demand for Cigarettes Using Panel Data: The Effects of Bootlegging, Taxation and Advertising Reconsidered, The Review of Economics and Statistics, 68, 1, February, 148–155.

  • Berndt, E.R. and Rappaport, N.J. (2001). Price and Quality of Desktop and Mobile Personal Computers: A Quarter-Century Historical Overview, American Economic Review, 91, 2, 268–273.

  • Burdette, Steve (2007). Comparing the Prices of Machinery and Equipment Across the Globe, ICP Bulletin, 4, 2, August, 5–8.

  • Cuthbert, J. and M. Cuthbert (1988). On Aggregation Methods of Purchasing Power Parities, Working Paper No. 56, November, Paris: OECD.

  • Davidson, R. and J.G. MacKinnon (1993). Estimation and Inference in Econometrics, Oxford: Oxford University Press.

  • Deaton, Angus and Alan Heston (2008). Understanding PPPs and PPP-based national accounts, paper presented at CRIW-NBER Summer Workshop, July 14–15, 2008.

  • Dhrymes, Phoebus J. and Adriana Lleras-Muney (2006). Estimation of Models with Grouped and Ungrouped Data by Means of “2SLS”, Journal of Econometrics, 133, 1–29.

  • Dickens, William T. (1990). Error Components in Grouped Data: Is It Ever Worth Weighting? Review of Economics and Statistics, 72, 2, May, 328–333.

  • Diewert, W. Erwin (1999). Axiomatic and Economic Approaches to International Comparisons. In International and Interarea Comparisons of Income, Output and Prices, Alan Heston and Robert E. Lipsey (eds.) Studies in Income and Wealth, 61, 13–87, Chicago: University of Chicago Press.

  • Diewert, W. Erwin (2002). Hedonic Regressions: A Review of Some Unresolved Issues, Mimeo, Department of Economics, University of British Columbia.

  • Diewert, W. Erwin (2004). “Elementary Indices,” in Consumer Price Index Manual: Theory and Practice, (Geneva: International Labour Office) Chapter 20, pages 355–70. Available via the Internet at

  • Diewert, W. Erwin (2005). Weighted Country Product Dummy Variable Regressions and Index Number Formulae, Review of Income and Wealth, 51, 4, December, 561–70.

  • Diewert, W. Erwin (2008). New Methodology for Linking Regional PPPs, ICP Bulletin, 5, 2, August, 1, 10–21.

  • Diewert, W. Erwin, Heravi, Saeed, and Silver, Mick (2008). Hedonic Imputation Indexes Versus Time Dummy Hedonic Indexes, NBER Working Paper 14018. Forthcoming in W. Erwin Diewert, John Greenlees, and Charles R. Hulten (eds.) Price Index Concepts and Measurement, NBER, Chicago: University of Chicago Press, 278–337, 2010.

  • Druska, V. and Horrace, W.C. (2004). Generalised Moments Estimation for Spatial Panel Data: Indonesia Rice Farming, American Journal of Agricultural Economics, 86, 1, February, 185–198.

  • Evans, W.N., Froeb, L.M. and Werden, G.J. (1993). Endogeneity in the concentration-price relationship: causes, consequences and cures, Journal of Industrial Economics, XLI, 4, December, 431–438.

  • Hausman, J.A. and Taylor, W.E. (1981). Panel Data and Unobservable Individual Effects, Econometrica, 49, 1377–1398.

  • Hausman, J.A. (1978). Specification Tests in Econometrics, Econometrica, 46, 1251–1271.

  • Heravi, Saeed, Heston, Alan, and Silver, Mick (2003). Using Scanner Data to Estimate Country Price Parities: An Exploratory Study, Review of Income and Wealth, 49, 1, 1–22, March.

  • Ioannides, Christos and Silver, Mick (1999). Estimating Exact Hedonic Indexes: An Application to U.K. Television Sets, Journal of Economics, Zeitschrift für Nationalökonomie, 69, 1.

  • Johnston, J. and Dinardo, J. (1997). Econometric Methods, Fourth Edition, New York: McGraw-Hill.

  • Kmenta, Jan (1986). Elements of Econometrics, 2nd Edition. New York: Maxwell Macmillan International Edition.

  • Kokoski, Mary, Cardiff, Patrick, and Brent Moulton (1994). Interarea Price Indices for Consumer Goods and Services: An Hedonic Approach Using CPI Data, Working Paper No. 256, available from the Office of Prices and Living Conditions, July.

  • Kokoski, Mary F., Moulton, Brent R. and Zieschang, Kimberly D. (1999). Interarea Price Comparisons for Heterogeneous Goods and Several Levels of Commodity Aggregation. In A. Heston and R. Lipsey (eds.), International and Interarea Comparisons of Prices, Output and Productivity, Committee on Research on Income and Wealth, NBER, Chicago: University of Chicago Press, pp. 123–66.

  • Machado, José A.F. and João M.C. Santos Silva (2001). Identification with Averaged Data and Implications for Hedonic Regression Studies, Economic Research Department Working Paper 10-01, November.

  • Moulton, Brent (1986). Random Group Effects and the Precision of Regression Estimates, Journal of Econometrics, 32, 385–397.

  • Pakes, A. (2003). A Reconsideration of Hedonic Price Indexes with an Application to PCs, The American Economic Review, 93, 5, December, 1576–93.

  • Rao, D.S. Prasada (2002). On the Equivalence of Weighted Country-Product-Dummy (CPD) Method and the Rao System for Multilateral Price Comparisons, School of Economics, University of New England, Armidale, Australia, March.

  • Rao, D.S. Prasada (2004). The Country-Product-Dummy Method: A Stochastic Approach to the Computation of Purchasing Power Parities in the ICP. Paper presented at the SSHRC International Conference on Index Number Theory and the Measurement of Prices and Productivity, Vancouver, Canada, June 30–July 3.

  • Sickles, R.C. (2005). Panel Estimators and the Identification of Firm-Specific Efficiency Levels in Parametric and Non-Parametric Settings, Journal of Econometrics, June, 126, 2, 305–34.

  • Silver, Mick (2002). The Use of Weights in Hedonic Regressions: The Measurement of Quality-Adjusted Price Changes, Mimeo, Cardiff Business School, Cardiff University.

  • Silver, Mick (2007). Why Elementary Price Index Number Formulas Differ: Price Dispersion and Product Heterogeneity (with S. Heravi), Journal of Econometrics, 140, 2, 874–83, October.

  • Silver, Mick and Heravi, Saeed (2005). A Failure in the Measurement of Inflation: Results from a Hedonic and Matched Experiment Using Scanner Data, Journal of Business and Economic Statistics, 23, 3, 269–281, July.

  • Silver, Mick and Heravi, Saeed (2007a). Purchasing Power Parity Measurement and Bias from Loose Item Specifications in Matched Samples: An Analytical Model and Empirical Study, Journal of Official Statistics, 21, 3, 463–487.

  • Silver, Mick and Heravi, Saeed (2007b). Hedonic Indexes: A Study of Alternative Methods. In E.R. Berndt and C. Hulten (eds.) Hard-to-Measure Goods and Services: Essays in Honour of Zvi Griliches, pp. 235–268, NBER/CRIW, Chicago: University of Chicago Press, 2008.

  • Silver, Mick and Saeed Heravi (2007c). Hedonic Imputation Indexes and Time Dummy Hedonic Indexes, Journal of Business and Economic Statistics, 25, 2, 239–246.

  • Summers, R. (1973). International Comparisons with Incomplete Data, The Review of Income and Wealth, March.

  • Teekens, R. and Koerts, J. (1972). Some Statistical Implications of the Log Transformations of Multiplicative Models, Econometrica, 40, 5, 793–819.

  • Triplett, Jack E. and McDonald, R.J. (1977). Assessing the Quality Error in Output Measures: The Case of Refrigerators, The Review of Income and Wealth, 23, 2, 137–156.

  • Triplett, Jack E. (2004). Handbook on Quality Adjustment of Price Indexes for Information and Communication Technology Products, OECD Directorate for Science, Technology and Industry, Draft, OECD, Paris.

  • World Bank (2007). International Comparisons Program Handbook, 2003–2006, Washington D.C.: World Bank. Available by clicking on:

  • Zellner, A. (1962). An Efficient Method of Estimating Seemingly Unrelated Regressions and Tests of Aggregation Bias, Journal of the American Statistical Association, 57, 348–368.

See the World Bank’s ICP site at:


This account is based on Diewert (1980).


An extended CPD method was used for South America in which two additional variables were included that distinguished whether the item was representative and unrepresentative. Representative varieties are expected to have lower prices compared to unrepresentative ones (Cuthbert and Cuthbert, 1988). This is something that should be included in the regression specifications outlined in the paper, but is omitted for simplicity of exposition.


Diewert (2008) notes that the majority of the members of the Technical Advisory Group who provided advice to ICP 2005 favored the extended CPD method over EKS*, though the Eurostat were locked into the EKS* method by legislation. He continued to note that the empirical results for the inclusion of the representativity variables were disappointing and thus, at this state of knowledge, he favored the “plain vanilla” CPD method.


Arithmetic averages were used, which not only have less desirable axiomatic properties than geometric ones (Diewert, 2004; Balk, 2005; and Silver, 2007) but also are not consistent with the geometric (logarithmic) CPD specification used (see Rao, 2004).


As World Bank (2007) explains: “For example, an SPD identifies the fabric from which clothing is made as a price determining characteristic and lists several possible fabrics—cotton, wool, polyester, etc. A PS derived from the SPD will stipulate, for example, that the fabric must be at least 80 percent cotton. There could be any number of PSs derived from a single SPD by taking different combinations of specific characteristics.” (Chapter 5, page 28).


Deaton and Heston (2008) note that a paradoxical result of tight specifications is that prices of items in poorer countries not available from outlets normally sampled for their consumer price indices were often collected from higher-end outlets, which had the effect of raising price levels of poorer countries.


The World Bank’s ICP Handbook (2007) cites Silver and Heravi (2002), now published as (2005), in this regard.


That should in fact be averages of logarithms of prices, Rao (2004).


An adjustment is necessary since for a log specification $\hat{\gamma}_c^*$ is a biased estimate of γc* (Teekens and Koerts, 1972).


Triplett and McDonald (1977) and Aizcorbe et al. (2000) have shown how unweighted OLS estimates of price changes from logarithmic functions correspond to geometric mean indexes. Rao (2002) had demonstrated a correspondence between a Rao-system of multilateral price comparisons and estimates of price parities from a WLS CPD regression for which the weights are expenditure shares. Diewert (2002) and (2005) has demonstrated that if a harmonic mean of expenditure shares in the two periods is used as weights in a logarithmic CPD regression, the resulting parity estimates will have a close correspondence to a Törnqvist index. Heravi, Heston and Silver (2003) have provided an example of the application of a hedonic WLS estimator that corresponds to a superlative price parity index—see also Kokoski et al. (1999). WLS hedonic regressions have been used to estimate superlative indexes and have been compared with OLS estimates in Ioannides and Silver (1999) and Silver and Heravi (2007b). However, the correspondence to index number formulae only holds for well-behaved regression estimates. Silver (2002) has shown how leverage effects may upset such correspondences.


R̄² cannot be used to determine the effectiveness of the explanatory variables since it is biased upwards due to the omission of within-group variation.


Rao (2004) points to further considerations. First, the average of the logarithms of prices should be used, as opposed to the logarithm of average prices. Second, a weighted least squares (WLS) estimator is appropriate because the disturbances are heteroskedastic, with variances inversely proportional to the sample size underlying each average.
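
A small simulation illustrates both points. It assumes hypothetical outlet-level quotes for three products in two countries, with regressors (product and country dummies) that are constant within each product-country cell. Because the variance of an average falls with the number of quotes behind it, the sketch weights each cell average by its number of quotes; under these assumptions WLS on the averages of log prices reproduces the OLS point estimates from the individual quotes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design: 3 products x 2 countries, unequal quotes per cell.
cells = [(prod, ctry) for prod in range(3) for ctry in range(2)]
n_quotes = [2, 5, 3, 4, 6, 2]

def design_row(prod, ctry):
    """Product dummies (no intercept) followed by a dummy for country 1."""
    row = np.zeros(4)
    row[prod] = 1.0
    row[3] = float(ctry)
    return row

X_ind, y_ind, X_avg, y_avg, logs_of_avg = [], [], [], [], []
for (prod, ctry), n in zip(cells, n_quotes):
    # Simulated log prices for the n outlets in this cell.
    log_p = 1.0 + 0.5 * prod + 0.3 * ctry + rng.normal(0.0, 0.2, n)
    X_ind += [design_row(prod, ctry)] * n
    y_ind += list(log_p)
    X_avg.append(design_row(prod, ctry))
    y_avg.append(log_p.mean())                        # average of the logs
    logs_of_avg.append(np.log(np.exp(log_p).mean()))  # log of the average

X_ind, y_ind = np.array(X_ind), np.array(y_ind)
X_avg, y_avg = np.array(X_avg), np.array(y_avg)

# OLS on the individual outlet quotes.
b_ols, *_ = np.linalg.lstsq(X_ind, y_ind, rcond=None)

# WLS on the cell averages: scaling rows by sqrt(n) weights each cell by n.
w = np.sqrt(np.array(n_quotes, dtype=float))
b_wls, *_ = np.linalg.lstsq(X_avg * w[:, None], y_avg * w, rcond=None)
```

The list `logs_of_avg` is never below `y_avg` cell by cell, which is why substituting the logarithm of the average price for the average of the logarithms shifts the left-hand-side variable upwards by an amount that depends on within-cell price dispersion.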


Estimates from private correspondence between Bettina Aten and the author.


An RE model is estimated in two stages. First, var(βn*) and var(vn*) are estimated, and then the data are transformed by subtracting product-specific means multiplied by a weighting factor that is a function of the two estimated variances and the number of countries. The estimate on the transformed data is a weighted average of the within-product and between-product estimates.
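
The transformation can be sketched as follows, using the textbook form of the weighting factor for a balanced panel, which is assumed here rather than taken from the paper. The function names, the variance values and the data are hypothetical, and a single explanatory variable stands in for the full set of regressors.

```python
import numpy as np

def re_theta(var_effect, var_idio, n_countries):
    """Weighting factor for the RE transformation (balanced panel).

    var_effect : estimated variance of the product-specific effect
    var_idio   : estimated variance of the idiosyncratic disturbance
    """
    return 1.0 - np.sqrt(var_idio / (var_idio + n_countries * var_effect))

def quasi_demean(values, product_ids, theta):
    """Subtract theta times the product-specific mean from each observation."""
    values = np.asarray(values, dtype=float)
    product_ids = np.asarray(product_ids)
    out = values.copy()
    for pid in np.unique(product_ids):
        mask = product_ids == pid
        out[mask] -= theta * values[mask].mean(axis=0)
    return out

def re_slope(y, x, product_ids, theta):
    """OLS slope on the transformed data (the constant is transformed too)."""
    X = np.column_stack([np.ones(len(x)), x])
    Xt = quasi_demean(X, product_ids, theta)
    yt = quasi_demean(y, product_ids, theta)
    coef, *_ = np.linalg.lstsq(Xt, yt, rcond=None)
    return coef[1]

# Hypothetical balanced panel: 3 products observed in 4 countries.
product_ids = np.repeat([0, 1, 2], 4)
x = np.array([1.0, 2.0, 3.0, 4.0, 2.0, 3.0, 5.0, 6.0, 1.0, 4.0, 4.0, 7.0])
y = np.array([2.1, 2.9, 4.2, 5.0, 4.8, 6.1, 8.0, 9.2, 0.9, 4.1, 3.8, 7.1])

theta = re_theta(var_effect=0.5, var_idio=0.25, n_countries=4)
slope_re = re_slope(y, x, product_ids, theta)
```

Setting the weighting factor to 0 gives pooled OLS and setting it to 1 gives the within (FE) estimator, so intermediate values move the estimate between the two.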


Rao (2004) demonstrates an adjustment for spatial autocorrelation of disturbances if necessary.


Not all the characteristics that define the product, Xj, need to be correlated with the fixed effects, as considered by Sickles (2005).


It is also assumed that cov(εnc*, Xj) = 0; otherwise instrumental variable (IV) estimators are required (Hausman, 1978).


In practice an FE model does not use dummy variables but, equivalently, regresses the differences in pknc on the differences in Xj. An FE estimator can be seen to use only within-product variation. A potential disadvantage of the FE estimator is that, with measurement error, the FE coefficients are biased downwards, since the within estimation may be based on changes that are small relative to the measurement error.
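
The equivalence can be verified in a minimal sketch. It takes the differencing to mean deviations from product-specific means (the usual within transformation, which is an assumption about the intended procedure) and uses hypothetical data with one characteristic in place of the full vector of characteristics.

```python
import numpy as np

# Hypothetical data: 3 products, each priced in 3 countries, one characteristic x.
product_ids = np.repeat([0, 1, 2], 3)
x = np.array([1.0, 2.0, 4.0, 2.0, 3.0, 3.5, 5.0, 6.0, 8.0])
log_p = np.array([1.2, 1.5, 2.3, 2.0, 2.4, 2.5, 3.1, 3.3, 4.2])

# (a) Least squares with a dummy variable for every product.
dummies = (product_ids[:, None] == np.unique(product_ids)[None, :]).astype(float)
coef, *_ = np.linalg.lstsq(np.column_stack([dummies, x]), log_p, rcond=None)
slope_lsdv = coef[-1]

# (b) Within estimator: deviations from product-specific means, no dummies.
def demean(v, ids):
    """Subtract each observation's product mean."""
    means = np.array([v[ids == i].mean() for i in ids])
    return v - means

x_w, p_w = demean(x, product_ids), demean(log_p, product_ids)
slope_within = (x_w @ p_w) / (x_w @ x_w)
```

Both routes give the same slope; the second avoids estimating one coefficient per product, which matters when the number of products is large.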


Also referred to by Silver and Heravi (2007b) and DHS (2008) as hedonic imputation indexes, though we use the above terminology here to be consistent with Triplett (2004).


An earlier attempt at decomposing the difference between the two approaches is Silver and Heravi (2007c), but this excluded a covariance term.


Increasing as, say, the product market diversifies into distinct bundles of characteristics with distinct, say, high-end and bottom-of-the-market offerings.