A. Introduction

10.1 This chapter provides a general description with examples of the ways in which export and import price indices (XMPIs) are calculated in practice. The methods used in different countries are not exactly the same, but they have much in common.

10.2 As a result of the greater insights into the properties and behavior of price indices that have been achieved in recent years, it is now recognized that some traditional methods may not necessarily be optimal from a conceptual and theoretical viewpoint. Concerns have also been voiced in a number of countries about possible biases that may be affecting XMPIs. These issues and concerns need to be addressed in this Manual. Of course, the methods used to compile XMPIs are inevitably constrained by the resources available, mainly for collecting and processing prices. In some countries, the methods used may be severely constrained by a lack of resources.

10.3 The calculation of XMPIs usually proceeds in two stages. First, price indices are estimated for the unweighted elementary aggregates, and then these elementary price indices are averaged to obtain higher-level indices using the relative traded values for the elementary aggregates as weights. Section B starts by explaining how the elementary aggregates are constructed and which economic and statistical criteria need to be taken into consideration in defining the aggregates. The index number formulas most commonly used to calculate the elementary indices are then presented and their properties and behavior illustrated using numerical examples. The pros and cons of the various formulas are considered together with some alternative formulas that might be used. Much more detail on the properties of elementary aggregate index number formulas is provided in Chapter 21. The problems created by disappearing and new commodities are also explained as are the different ways of imputing for missing prices, considered in further detail respectively in Chapters 8 and 9.

10.4 The data for the measurement of price changes at the elementary aggregate level may be compiled using unit value indices from customs documentation as opposed to price relatives of detailed descriptions of commodities from establishment surveys. Unit value indices—ratios of the value of exports/imports divided by the quantity in one period compared with that in a reference period—may be used to represent the price changes of a commodity classification at the elementary level, prior to aggregation by weights, only if the items within the classification are homogeneous. This is unlikely for unit values from customs data. See Chapter 2 for issues concerning such a use of unit value indices as “plug ins” or proxies for price relatives at the elementary level. If more than one unit value index from customs data or price relative from establishment surveys is to be aggregated as unweighted averages at the elementary level, then the discussion in Section B of this chapter applies. Section B considers and illustrates the formulas that may be used to aggregate the price relatives when no information on weights is available, that is, for elementary price index number formulas. The first order solution to the problem is of course to attempt to obtain weighting information on the relative shares of imports/exports the price relatives represent. Section B and Chapter 21 address the index number problem when no such information is available.
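The sensitivity of a unit value index to changes in the mix of items can be sketched with a small numerical example. The two items, their prices, and their quantities are hypothetical and chosen only for illustration; the function name `unit_value` is likewise an assumption, not a term from this Manual.

```python
def unit_value(prices, quantities):
    """Total value divided by total quantity for one period."""
    value = sum(p * q for p, q in zip(prices, quantities))
    return value / sum(quantities)

# Hypothetical customs classification containing two heterogeneous items.
prices_0 = [10.0, 50.0]   # reference-period prices
prices_t = [10.0, 50.0]   # current-period prices: unchanged
qty_0 = [90.0, 10.0]      # reference-period quantities
qty_t = [50.0, 50.0]      # the mix shifts toward the more expensive item

unit_value_index = unit_value(prices_t, qty_t) / unit_value(prices_0, qty_0)
price_relatives = [pt / p0 for p0, pt in zip(prices_0, prices_t)]

print(f"Unit value index: {unit_value_index:.3f}")
print(f"Price relatives:  {price_relatives}")
```

In this sketch every price relative equals 1, yet the unit value index more than doubles because the composition of the classification changed. This is the reason homogeneity within the classification is required.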

10.5 Section C of the chapter is concerned with the calculation of higher-level indices, when information on weights is available. If elementary aggregate indices, be they based on unit value indices from customs data or price relatives from establishment surveys, are to be aggregated using weights—that is, be aggregated at the higher level—then the concern is with the choice of weighted formulas as discussed in Section C of this chapter. The focus is on the ongoing production of a monthly price index in which the elementary price indices are averaged, or aggregated, to obtain higher-level indices. Price updating of weights, chain linking, and reweighting are discussed, with examples provided. The problems associated with introduction of new elementary price indices and new higher-level indices into the XMPI are also covered. The section explains how it is possible to decompose the change in the overall index into its component parts. Finally, the possibility of using some alternative and rather more complex index formulas is considered.

10.6 Section D concludes with data editing procedures, because these are an integral part of the process of compiling XMPIs. It is essential to ensure that the right data are entered into the various formulas. There may be errors resulting from the inclusion of incorrect data or from entering correct data inappropriately, and errors resulting from the exclusion of correct data that are mistakenly believed to be wrong. The section examines data editing procedures that try to minimize both types of errors.

B. Calculation of Price Indices for Elementary Aggregates

10.7 XMPIs typically are calculated in two steps. In the first step, the elementary price indices for the elementary aggregates are calculated. In the second step, higher-level indices are calculated by averaging the elementary price indices. The elementary aggregates and their price indices are the basic building blocks of the XMPI.
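The two steps can be sketched as follows. The aggregate names, prices, and weights are hypothetical, and the use of a geometric mean in the first step and a weighted arithmetic mean in the second is an assumption made for illustration only; the formulas actually available are discussed in Sections B.2 and C.

```python
import math

def jevons(p0, pt):
    """Unweighted geometric mean of price relatives."""
    return math.prod(b / a for a, b in zip(p0, pt)) ** (1 / len(p0))

# Hypothetical elementary aggregates:
# (reference prices, current prices, traded value share used as weight).
aggregates = {
    "aggregate_1": ([6.0, 7.0, 2.0, 5.0], [6.6, 7.7, 2.2, 5.5], 0.6),
    "aggregate_2": ([10.0, 20.0], [10.0, 20.0], 0.4),
}

# Step 1: unweighted elementary price indices.
elementary = {k: jevons(p0, pt) for k, (p0, pt, _) in aggregates.items()}

# Step 2: higher-level index as a weighted average of elementary indices.
higher_level = sum(aggregates[k][2] * elementary[k] for k in aggregates)

for name, value in elementary.items():
    print(f"{name}: {100 * value:.1f}")
print(f"Higher-level index: {100 * higher_level:.1f}")
```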

B.1 Construction of elementary aggregates

10.8 Elementary aggregates are constructed by grouping individual commodities and individual services into groups of relatively homogeneous commodities or services. They may be formed for groups of commodities or services irrespective of the country of destination (export) or the country of origin (import), but it is also possible to form elementary aggregates according to country of destination or origin, or for different types of establishments, or even individual establishments. The actual formation of elementary aggregates thus depends on the circumstances and the availability of information, and they may therefore be defined differently in different countries. However, some key points should be observed:

  • Elementary aggregates should consist of groups of commodities or services that are as similar as possible, and preferably fairly homogeneous.

  • They should also consist of commodities that may be expected to have similar price movements. The objective should be to try to minimize the dispersion of price movements within the aggregate.

  • The elementary aggregates should be appropriate to serve as strata for sampling purposes in light of the sampling regime planned for the data collection.

10.9 Each elementary aggregate, whether relating to the whole export or import, the country of destination or origin, or a group of establishments, will typically contain a very large number of individual commodities or services. Unit value indices from customs data benefit from covering the vast majority of transactions for merchandise goods. However, for establishment surveys, in practice only a small number can be selected for pricing. When selecting the commodities, one must take into account the following considerations:

  • The transactions selected should be ones with price movements that are believed to be representative of all the commodities within the elementary aggregate.

  • The number of transactions within each elementary aggregate for which prices are collected should be large enough for the estimated price index to be statistically reliable. The minimum number required will vary between elementary aggregates, depending on the nature of the commodities and their price behavior.

  • The object is to try to track the price of the same product over time for as long as the product continues to be representative. The commodities selected should therefore be ones that are expected to remain on the market for some time so that like items can be compared with like and problems associated with disappearing commodities and selection of replacements can be reduced. Prices of commodities of matched quality need to be monitored because the aim of the measure is to be one of pure price changes unaffected by changes in the quality composition over time, as may be the case with unit values.

10.10 The individual commodities should be grouped into elementary aggregates by use of a product (commodity) or activity (industry) classification, such as the Harmonized Commodity Description and Coding System (HS) or the International Standard Industrial Classification of Economic Activities (ISIC). It is useful to assign a detailed product or activity code to each sampled commodity in order to facilitate the grouping of individual observations into elementary aggregates and the calculation of elementary indices. Similarly, the elementary aggregates should be appropriately coded to allow further aggregation into higher-level indices. This is dealt with below in Section C.1.1. The classifications were presented in more detail in Chapter 4.

B.2 Calculation of elementary price indices

10.11 An elementary price index is the price index for an elementary aggregate. Various methods and formulas may be used to calculate elementary price indices. This section provides a summary of pros and cons that statistical offices must evaluate when choosing a formula at the elementary level; Chapter 21 provides a more detailed discussion.

10.12 Often it is not possible to obtain information about the relative importance of the individual commodities that enter into the elementary aggregates. Alternatively, obtaining and maintaining individual weights may be considered too time consuming and resource demanding relative to the improvement that such weights would bring to the index. If such information has to be collected from the respondents, it will also add to the establishment’s response burden. In many countries much aggregation is thus done without the use of weighting data. This section, therefore, focuses on the calculation of unweighted elementary price indices. The calculation of weighted elementary indices is dealt with in Section C.

10.13 The methods statistical offices most commonly use are illustrated by means of a numerical example in Table 10.1. It is assumed that prices are collected for four representative commodities within an elementary aggregate. The quality of each commodity remains unchanged over time so that the month-to-month changes compare like items with like. No information on weights is available. Assume initially that prices are collected for all four commodities in every month covered so that there is a complete set of prices. There are no disappearing commodities, no missing prices, and no replacement commodities. These are quite strong assumptions because many of the problems encountered in practice are attributable to breaks in the continuity of the price series for the individual transactions for one reason or another. The treatment of disappearing and replacement commodities is taken up in Section B.5.

Table 10.1. Calculation of Price Indices for an Elementary Aggregate

[Table image not available.]

Note: All price indices have been calculated using unrounded figures.

10.14 Three widely used formulas that have been, or still are, in use by statistical offices to calculate elementary price indices are illustrated in Table 10.1. It should be noted, however, that these are not the only possibilities and some alternative formulas are considered later.

  • The first is the Carli index for i = 1, ..., n commodities. It is defined as the simple, or unweighted, arithmetic mean of the price ratios, or price relatives, for the two periods, 0 and t, to be compared:

    $$P_C^{0:t} = \frac{1}{n} \sum_i \left( \frac{p_i^t}{p_i^0} \right) \tag{10.1}$$

  • The second is the Dutot index, which is defined as the ratio of the unweighted arithmetic mean prices:

    $$P_D^{0:t} = \frac{\frac{1}{n} \sum_i p_i^t}{\frac{1}{n} \sum_i p_i^0} \tag{10.2}$$

  • The third is the Jevons index, which is defined as the unweighted geometric mean of the price ratios, which is identical with the ratio of the unweighted geometric mean prices:

    $$P_J^{0:t} = \prod_i \left( \frac{p_i^t}{p_i^0} \right)^{1/n} = \frac{\prod_i \left( p_i^t \right)^{1/n}}{\prod_i \left( p_i^0 \right)^{1/n}} \tag{10.3}$$
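The three formulas can be sketched in a few lines of code. The function names and the prices for four commodities are assumptions made for illustration; the prices are not quoted from Table 10.1.

```python
import math

def carli(p0, pt):
    """Unweighted arithmetic mean of price relatives (equation 10.1)."""
    return sum(b / a for a, b in zip(p0, pt)) / len(p0)

def dutot(p0, pt):
    """Ratio of unweighted arithmetic mean prices (equation 10.2)."""
    return (sum(pt) / len(pt)) / (sum(p0) / len(p0))

def jevons(p0, pt):
    """Unweighted geometric mean of price relatives (equation 10.3)."""
    return math.exp(sum(math.log(b / a) for a, b in zip(p0, pt)) / len(p0))

# Assumed illustrative prices for four commodities in two periods.
p_0 = [6.00, 7.00, 2.00, 5.00]
p_t = [6.00, 7.00, 5.00, 4.00]

results = {f.__name__: f(p_0, p_t) for f in (carli, dutot, jevons)}
for name, value in results.items():
    print(f"{name}: {100 * value:.2f}")
```

With these assumed prices the three formulas give clearly different answers for the same data, with the Carli index highest and the Dutot index lowest.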

10.15 Each month-to-month index shows the change in the index from one month to the next. The chained month-to-month index links together these month-to-month changes by successive multiplication. The direct index compares the prices in each successive month directly with those of the reference month, January. By simple inspection of the various indices in Table 10.1, it is clear that the choice of formula and method can make a substantial difference in the results obtained. Some results are striking—in particular, the large difference between the chained Carli index for July and each of the direct indices for July, including the direct Carli.
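The difference between a direct index and a chained month-to-month index can be sketched for the Carli formula. The monthly prices below are assumed for illustration because the table itself is not reproduced here; they were chosen to be consistent with the Carli figures quoted in the surrounding paragraphs.

```python
def carli(p0, pt):
    """Unweighted arithmetic mean of price relatives."""
    return sum(b / a for a, b in zip(p0, pt)) / len(p0)

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul"]
# Assumed illustrative prices: one row per month, one column per commodity.
prices = [
    [6.00, 7.00, 2.00, 5.00],
    [6.00, 7.00, 3.00, 5.00],
    [7.00, 6.00, 4.00, 5.00],
    [6.00, 7.00, 5.00, 4.00],
    [6.00, 7.00, 2.00, 5.00],
    [6.00, 7.20, 3.00, 5.00],
    [6.60, 7.70, 2.20, 5.50],
]

# Direct index: each month compared with the reference month, January.
direct = [carli(prices[0], p) for p in prices]

# Chained index: successive multiplication of month-to-month indices.
chained = [1.0]
for prev, cur in zip(prices, prices[1:]):
    chained.append(chained[-1] * carli(prev, cur))

for m, d, c in zip(months, direct, chained):
    print(f"{m}: direct {100 * d:7.2f}   chained {100 * c:7.2f}")
```

Under these assumed prices the direct Carli returns to 100 in May and stands at 110 in July, whereas the chained Carli is roughly 114 and 129 in the same months.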

10.16 The properties and behavior of the different indices are summarized in the following paragraphs and explained in more detail in Chapter 21. First, the differences between the results obtained by using the different formulas tend to increase as the variance of the price relatives, or ratios, increases. The greater the dispersion of the price movements, the more critical the choice of index formula and method becomes. If the elementary aggregates are defined so that the dispersion of price movements within the aggregate is minimized, the results obtained become less sensitive to the choice of formula and method.

10.17 Certain features displayed by the data in Table 10.1 are systematic and predictable and follow from the mathematical properties of the indices. For example, it is well known that an arithmetic mean is always greater than, or equal to, the corresponding geometric mean—the equality holding only in the trivial case in which the numbers being averaged are all the same. The direct Carli indices are therefore all greater than the Jevons indices, except in May and July when the four price relatives based on January are all equal. In general, the Dutot index may be greater or less than the Jevons index, but tends to be less than the Carli index.

10.18 One general property of geometric means should be noted when using the Jevons formula. If any one observation out of a set of observations is zero, the geometric mean of the set is zero, whatever the values of the other observations. The Jevons index is sensitive to extreme changes in prices, and it may be necessary to impose upper and lower bounds on the individual price relatives of, say, 10 and 0.1, respectively, when using the Jevons index. Of course, extreme observations are often the result of errors of one kind or another, and so extreme price movements should be carefully checked in any case. More details on data editing can be found in Section D.
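One possible way of imposing such bounds is to clip each price relative before taking the geometric mean. The sketch below is an assumption about how this might be implemented, not a prescribed procedure; the function names and the price relatives are hypothetical.

```python
import math

def jevons(relatives):
    """Unweighted geometric mean of a list of price relatives."""
    return math.prod(relatives) ** (1 / len(relatives))

def bound(relatives, lower=0.1, upper=10.0):
    """Clip each price relative to the interval [lower, upper]."""
    return [min(max(r, lower), upper) for r in relatives]

# Hypothetical relatives; the last one looks like a reporting error.
relatives = [1.0, 1.0, 1.0, 0.0]

unbounded = jevons(relatives)        # a single zero drives the mean to zero
bounded = jevons(bound(relatives))   # the zero is replaced by the lower bound

print(f"Unbounded Jevons: {unbounded:.4f}")
print(f"Bounded Jevons:   {bounded:.4f}")
```

Bounding limits the influence of the extreme observation but does not replace the need to check it, since the bounded value is still far from the other relatives.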

10.19 Another important property of the indices illustrated in Table 10.1 is that the Dutot and the Jevons indices are transitive, whereas the Carli index is not. Transitivity means that the chained monthly indices are identical with the corresponding direct indices. This property is important in practice, because many elementary price indices are in fact calculated as chain indices that link together the month-to-month indices. The intransitivity of the Carli index is illustrated dramatically in Table 10.1, in which each of the four individual prices in May returns to the same level as it was in January, but the chained Carli index registers an increase of almost 14 percent over January. Similarly, in July, although each individual price is exactly 10 percent higher than in January, the chained Carli index registers an increase of 29 percent. These results would be regarded as perverse and unacceptable in the case of a direct index, but even in the case of the chained index, the results seem so intuitively unreasonable as to undermine the credibility of the chained Carli index. The price changes between March and April illustrate the effects of “price bouncing,” in which the same four prices are observed in both periods, but they are switched between the different commodities. The monthly Carli index from March to April increases, whereas both the Dutot and the Jevons indices are unchanged.
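Price bouncing can be sketched with two periods in which the same four prices are observed but attached to different commodities. The prices are assumed for illustration, and the function names are not taken from this Manual.

```python
import math

def carli(p0, pt):
    return sum(b / a for a, b in zip(p0, pt)) / len(p0)

def dutot(p0, pt):
    return sum(pt) / sum(p0)  # the 1/n factors cancel

def jevons(p0, pt):
    return math.prod(b / a for a, b in zip(p0, pt)) ** (1 / len(p0))

# Assumed prices: the same four values appear in both months,
# but they are switched between commodities.
p_mar = [7.00, 6.00, 4.00, 5.00]
p_apr = [6.00, 7.00, 5.00, 4.00]

bounce = {f.__name__: f(p_mar, p_apr) for f in (carli, dutot, jevons)}
for name, value in bounce.items():
    print(f"{name}: {100 * value:.2f}")
```

The Dutot and Jevons indices stay at 100 because the sum and the product of the prices are unchanged, while the Carli index rises by a little under 2 percent.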

10.20 The message emerging from this brief illustration of the behavior of just three possible formulas is that different index numbers and methods can deliver very different results. Index compilers have to familiarize themselves with the interrelationships between the various formulas at their disposal for the calculation of the elementary price indices so that they are aware of the implications of choosing one formula rather than another. However, knowledge of these interrelationships is not sufficient to determine which formula should be used, even though it makes it possible to make a more informed and reasoned choice. It is necessary to appeal to additional criteria to settle the choice of formula. Two main approaches may be used, the axiomatic and the economic approaches.

B.2.1 Sampling properties of elementary price indices

10.21 The interpretation of the elementary aggregate indices is related to the way in which the sample of commodities is drawn. Hence, if the commodities in the sample are selected with probabilities proportional to the population value shares in the price reference period,

  • The sample (unweighted) Carli index provides an unbiased estimate of the population Laspeyres price index, and

  • The sample (unweighted) Jevons index provides an unbiased estimate of the population geometric Laspeyres price index (see equation (10.5)).
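The two results above can be checked numerically for a small hypothetical population by enumerating every possible sample. The sketch assumes samples of size two drawn with replacement, with selection probabilities equal to the reference period value shares; the population values and price relatives are invented for illustration, and the check for the Jevons index is made in logarithms.

```python
import itertools
import math

# Hypothetical population: reference-period traded values and price relatives.
values_0 = [50.0, 30.0, 20.0]
relatives = [1.10, 0.95, 1.30]
shares = [v / sum(values_0) for v in values_0]

# Population targets.
laspeyres = sum(s * r for s, r in zip(shares, relatives))
log_geo_laspeyres = sum(s * math.log(r) for s, r in zip(shares, relatives))

# Expected value of the sample indices over all samples of size n.
n = 2
expected_carli = 0.0
expected_log_jevons = 0.0
for sample in itertools.product(range(len(shares)), repeat=n):
    prob = math.prod(shares[i] for i in sample)
    expected_carli += prob * sum(relatives[i] for i in sample) / n
    expected_log_jevons += prob * sum(math.log(relatives[i]) for i in sample) / n

print(f"Population Laspeyres:      {laspeyres:.4f}")
print(f"Expected sample Carli:     {expected_carli:.4f}")
print(f"Log geometric Laspeyres:   {log_geo_laspeyres:.4f}")
print(f"Expected log Jevons:       {expected_log_jevons:.4f}")
```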

10.22 If the commodities are sampled with probabilities proportional to population quantity shares in the price reference period, the sample (unweighted) Dutot index would provide an estimate of the population Laspeyres price index. However, if the basket for the Laspeyres index contains different kinds of products whose quantities are not additive, the quantity shares, and hence the probabilities, are undefined.

B.2.2 Axiomatic approach to elementary price indices

10.23 As explained in Chapters 17 and 21, one way to decide on an appropriate index formula is to require it to satisfy certain specified axioms or tests. The tests throw light on the properties possessed by different kinds of indices, some of which may not be intuitively obvious. Four basic tests illustrate the axiomatic approach.

Proportionality test. If all prices are λ times the prices in the price reference period (January in the example), the index should equal λ. The data for July, when every price is 10 percent higher than in January, show that all three direct indices satisfy this test. A special case of this test is the identity test, which requires that if the price of every commodity is the same as in the reference period, the index should be equal to unity (as in May in the example).

Changes in the units of measurement test (or commensurability test). The price index should not change if the quantity units in which the commodities are measured are changed, for example, if the prices are expressed per liter rather than per pint. The Dutot index fails this test, as explained below, but the Carli and Jevons indices satisfy the test.

Time reversal test. If all the data for the two periods are interchanged, then the resulting price index should equal the reciprocal of the original price index. The Carli index fails this test, but the Dutot and the Jevons both satisfy the test. The failure of the Carli index to satisfy the test is not immediately obvious from the example but can easily be verified by interchanging the prices in January and April, for example, in which case the backward Carli for January based on April is equal to 91.3, whereas the reciprocal of the forward Carli index of 132.5 is 1/1.325, or 75.5 in index form.
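The time reversal check can be sketched as follows. The January and April prices are assumed for illustration because the table is not reproduced here; they were chosen so that the Carli results agree with the figures quoted in the text.

```python
import math

def carli(p0, pt):
    return sum(b / a for a, b in zip(p0, pt)) / len(p0)

def jevons(p0, pt):
    return math.prod(b / a for a, b in zip(p0, pt)) ** (1 / len(p0))

# Assumed illustrative prices for four commodities.
p_jan = [6.00, 7.00, 2.00, 5.00]
p_apr = [6.00, 7.00, 5.00, 4.00]

forward_carli = carli(p_jan, p_apr)    # April based on January
backward_carli = carli(p_apr, p_jan)   # January based on April

print(f"Forward Carli:            {100 * forward_carli:.2f}")
print(f"Backward Carli:           {100 * backward_carli:.2f}")
print(f"Reciprocal of forward:    {100 / forward_carli:.2f}")
print(f"Backward Jevons:          {100 * jevons(p_apr, p_jan):.2f}")
print(f"Reciprocal fwd. Jevons:   {100 / jevons(p_jan, p_apr):.2f}")
```

The backward Carli comes to 91.25 (quoted as 91.3 in the text) while the reciprocal of the forward Carli is about 75.47 (quoted as 75.5), so the test fails. The backward Jevons and the reciprocal of the forward Jevons coincide.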

Transitivity test. The chained index between two periods should equal the direct index between the same two periods. The example shows that the Jevons and the Dutot indices both satisfy this test, whereas the Carli index does not. For example, although the prices in May have returned to the same levels as in January, the chained Carli index registers 113.9. This illustrates the fact that the Carli index may have a significant built-in upward bias.

10.24 Many other axioms or tests can be devised, as presented in Chapters 17 and 21, but the above (summarized in Table 10.2) are sufficient to illustrate the approach and also to throw light on some important features of the elementary indices under consideration here.

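Table 10.2. Properties of Main Elementary Aggregate Index Formulas

Test                               Carli    Dutot    Jevons
Proportionality                    Yes      Yes      Yes
Changes in units of measurement    Yes      No       Yes
Time reversal                      No       Yes      Yes
Transitivity                       No       Yes      Yes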

10.25 The sets of commodities covered by elementary aggregates are meant to be as homogeneous as possible. If they are not fairly homogeneous, the failure of the Dutot index to satisfy the units of measurement, or commensurability, test can be a serious disadvantage. Although defined as the ratio of the unweighted arithmetic average prices, the Dutot index may also be interpreted as a weighted arithmetic average of the price ratios in which each ratio is weighted by its price in the base period. However, if the commodities are not homogeneous, the relative prices of the different commodities may depend quite arbitrarily on the quantity units in which they are measured.
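The weighted-average interpretation of the Dutot index follows in one step from its definition. In the sketch below the index j is introduced only to label the sum in the denominator; all other symbols are those of the Dutot formula.

$$P_D^{0:t} = \frac{\frac{1}{n} \sum_i p_i^t}{\frac{1}{n} \sum_i p_i^0} = \frac{\sum_i p_i^0 \left( \frac{p_i^t}{p_i^0} \right)}{\sum_j p_j^0} = \sum_i \left( \frac{p_i^0}{\sum_j p_j^0} \right) \frac{p_i^t}{p_i^0}$$

Each price ratio therefore carries a weight equal to the share of its base period price in the sum of base period prices, and these weights change whenever the units in which a commodity is priced are changed.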

10.26 Consider, for example, salt and pepper, which are found within the same Central Product Classification subclass. Suppose the unit of measurement for pepper is changed from grams to ounces while the units in which salt is measured (say, kilos) are left unchanged. Because an ounce of pepper is equal to 28.35 grams, the “price” of pepper increases by more than 28 times, which effectively increases the weight given to pepper in the Dutot index by more than 28 times. The price of pepper relative to salt is inherently arbitrary, depending entirely on the choice of units in which to measure the two goods. In general, when there are different kinds of commodities within the elementary aggregate, the Dutot index is not acceptable.
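The salt and pepper case can be sketched numerically. The prices below are hypothetical; only the conversion factor of 28.35 grams per ounce comes from the text.

```python
import math

GRAMS_PER_OUNCE = 28.35

def carli(p0, pt):
    return sum(b / a for a, b in zip(p0, pt)) / len(p0)

def dutot(p0, pt):
    return sum(pt) / sum(p0)

def jevons(p0, pt):
    return math.prod(b / a for a, b in zip(p0, pt)) ** (1 / len(p0))

# Hypothetical prices: salt per kilo, pepper per gram.
p0 = [1.00, 0.05]
pt = [1.10, 0.06]

# Same transactions, with pepper re-expressed per ounce.
p0_oz = [p0[0], p0[1] * GRAMS_PER_OUNCE]
pt_oz = [pt[0], pt[1] * GRAMS_PER_OUNCE]

for f in (carli, dutot, jevons):
    before, after = f(p0, pt), f(p0_oz, pt_oz)
    print(f"{f.__name__}: per gram {100 * before:.2f}, per ounce {100 * after:.2f}")
```

Only the Dutot index changes when the unit for pepper is altered, because the rescaled pepper price receives a larger implicit weight.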

10.27 The Dutot index is acceptable only when the set of commodities covered is homogeneous, or at least nearly homogeneous. For example, the Dutot index may be acceptable for a set of apple prices, even though the apples may be of different varieties, but not for the prices of different kinds of fruits, such as apples, pineapples, and bananas, some of which may be much more expensive per item or per kilo than others. Even when the commodities are fairly homogeneous and measured in the same units, the Dutot index’s implicit weights may still not be satisfactory. More weight is given to the price changes for the more expensive commodities, but more expensive items may not account for the highest traded value share.

10.28 It may be concluded that from an axiomatic viewpoint, both the Carli and the Dutot indices, although they have been and still are widely used by statistical offices, have serious disadvantages. The Carli index fails the time reversal and transitivity tests. In principle, it should not matter whether we choose to measure price changes forward or backward in time. We would expect the same answer, but this is not the case for the Carli index. Chained Carli indices may be subject to a significant upward bias and should not be applied. The Dutot index is meaningful for a set of homogeneous commodities but becomes increasingly arbitrary as the set of commodities becomes more diverse. On the other hand, the Jevons index satisfies all the tests listed above and also emerges as the preferred index when the set of tests is enlarged, as shown in Chapter 21. From an axiomatic point of view, the Jevons index is clearly the index with the best properties, even though it may not have been used much until recently.

B.2.3 Economic approach to elementary price indices

10.29 The objective of the economic approach is to estimate an “ideal” or “true” economic index for the elementary aggregates. Consider a price index for the exports produced by resident establishments. As explained in Chapter 18, if, for example, it is assumed that the establishments behave as revenue maximizers, it follows that they would switch export production to commodities with higher relative price changes. This behavioral assumption about the firm allows something to be said about what a “true” index number formula should be and the suitability of different index number formulas as approximations to it. For example, the Laspeyres price index uses a fixed reference period for its export revenue shares to weight the price relatives and ignores the substitution of production toward products with higher relative price changes. The Laspeyres price index will thus understate aggregate price changes—that is, be biased downward against its true index. The Paasche price index uses fixed current period weights and will thus overstate aggregate price changes—that is, be biased upward against its true index.
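The direction of these biases can be sketched with two export commodities. The prices and quantities are hypothetical and were chosen so that production shifts toward the commodity whose relative price rises, as revenue maximization would imply; the Fisher index is shown only as a reference point between the two.

```python
import math

# Hypothetical export prices and quantities in periods 0 and t.
p0 = [10.0, 10.0]
pt = [12.0, 10.0]     # the relative price of the first commodity rises
q0 = [100.0, 100.0]
qt = [120.0, 85.0]    # production shifts toward the first commodity

def value(prices, quantities):
    return sum(p * q for p, q in zip(prices, quantities))

laspeyres = value(pt, q0) / value(p0, q0)   # reference-period quantities
paasche = value(pt, qt) / value(p0, qt)     # current-period quantities
fisher = math.sqrt(laspeyres * paasche)

print(f"Laspeyres: {100 * laspeyres:.2f}")
print(f"Fisher:    {100 * fisher:.2f}")
print(f"Paasche:   {100 * paasche:.2f}")
```

With this substitution pattern the Laspeyres index lies below the Paasche index, consistent with the direction of bias described for an output price index.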

10.30 The advantage of the economic approach is that it takes account of the interdependence between prices and quantities. The economic approach thus requires information on quantities or value shares. Index number compilation distinguishes between two stages of aggregation: the elementary level without weights and a higher level with weights. The concern here is with the elementary level. Because information on quantities or value shares is not available at the elementary level, an economic approach cannot be used at this level. However, if the items being priced are sampled with probabilities proportionate to quantity or value shares, then quantity or value information becomes attached to the sampled prices and the unweighted elementary index number formulas are implicitly weighted, as outlined in Chapter 21, Section G. The sampling should be with probabilities proportionate to population quantity or value shares and it is most likely in index number compilation that the available quantity or value share data will relate to the reference, and not current, period. Thus a prerequisite for the application of the economic approach to elementary index number formulas is that prices are sampled with probabilities proportionate to reference period population quantity or value shares. It is unusual that such formal sample designs are used for the sampling of establishments and items in survey-based XMPIs, and in their absence choice of formulas is best guided by the axiomatic approach.

10.31 Thus two things are needed to apply the economic approach: first, a specific sample design involving quantity or value shares to translate a sample unweighted elementary index into an estimator of a weighted population index; second, establishment of whether this estimated weighted index is an appropriate target, that is, one based on the behavioral assumptions of the enterprises or households responsible for exporting or importing the commodities.

10.32 Consider, for example, the Carli index, an unweighted average of a sample of price relatives. It differs from the Laspeyres index because there are no reference period weights to attach to the price relatives. Suppose sampling is with probabilities proportionate to population reference period value shares. Then the sample Carli index, under this sample design, is an estimator of the population Laspeyres index. Suppose also that exporters maintain fixed reference period revenue shares and ignore the substitution of exports toward items with higher relative price changes. Then Laspeyres is the appropriate target index, and information on current-period revenue shares is not relevant. Suppose instead that there is some substitution behavior toward higher priced items. Then, as Chapter 18 shows, the target index will be one that takes account of substitution effects, such as the Fisher or Törnqvist index, and the sample estimator of Laspeyres will be biased: it will understate price changes. The implications of this approach for export and import price indices are considered below in turn.

B.2.3.1 Output export price indices, XPIs

10.33 For the export price index (XPI) the commodities for which respondents provide prices are treated as a basket of goods and services sold by establishments to provide revenue. The establishments may substitute between the commodities supplied in response to changes in their relative prices. However, in the absence of information about quantities or trade volumes within an elementary aggregate, an economic index can be estimated only under assumptions about the establishments’ reaction to price changes and a sample design that involves relative quantities or values.

10.34 There are two special cases of some interest. The first case is that used for illustration above, when producers continue to produce the same relative quantities irrespective of any changes in relative export prices on the market, that is, when all cross-elasticities of supply are zero. In this case, a population Laspeyres index would be an exact measure of the economic export (output) price index. The sampling approach, outlined in Chapter 21, Section G, demonstrates that a sample Carli index would provide an estimate of the economic index, provided that commodities are selected with probabilities proportional to the population value shares in the reference period. If the products were selected with probabilities proportional to the population quantity shares in the reference period, the sample Dutot would provide an estimate of the population Laspeyres.

10.35 However, even if sampling with probabilities proportionate to reference period quantity shares is used, such sampling is meaningful only for homogeneous (identical) items measured in the same units, for the units are implicitly added up. Many markets comprise differentiated branded items and it is not meaningful to add up the quantities of such heterogeneous items. The applicability of Dutot is thus relatively limited. Yet, for the case of homogeneous items, say tons of rolled steel of the same dimensions and quality sold under the same terms and conditions, the appropriate target index is not a Laspeyres index but a unit value index (Chapter 21). Thus the use of a Dutot index and sampling with probabilities proportionate to reference period quantity shares for homogeneous items would still be inappropriate.

10.36 The second case of interest is that the sample Jevons index provides an unbiased estimate of the population geometric Laspeyres index provided that the commodities are selected with probabilities proportional to the population revenue shares in the reference period (Chapter 21, Section F). If producers are assumed to vary the quantities they produce in inverse proportion to the changes in relative output prices such that the cross-elasticities of supply are all unity, so the revenue shares remain the same in both periods, then the sample Jevons index is an estimator of the population Törnqvist index. This again requires that commodities are selected with probabilities proportional to the population revenue shares in the reference period, which in this case is also equal to the current period. It may be argued that unitary cross-elasticities are unlikely as they imply that establishments should produce more of those goods whose prices are falling and less of those whose prices are increasing. However, it might be the case that producers also take into account other factors than prices, such as their market share or expected demand. In such cases, producers may focus on growing markets even if the relative prices are falling; for example, many high-technology product markets are characterized by rapid growth and falling prices, especially if quality adjusted.
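The link between the geometric Laspeyres and Törnqvist indices when revenue shares are constant can be written out in one line. The notation is an assumption: s_i^0 and s_i^t denote the population revenue shares of commodity i in periods 0 and t, and the two indices are written in their standard logarithmic forms rather than quoted from this Manual.

$$\ln P_T^{0:t} = \sum_i \tfrac{1}{2} \left( s_i^0 + s_i^t \right) \ln \left( \frac{p_i^t}{p_i^0} \right), \qquad \ln P_{GL}^{0:t} = \sum_i s_i^0 \ln \left( \frac{p_i^t}{p_i^0} \right)$$

If s_i^t = s_i^0 for every commodity, then (s_i^0 + s_i^t)/2 = s_i^0 and the two expressions coincide, so a sample estimator of the geometric Laspeyres index is in that case also an estimator of the Törnqvist index.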

10.37 In competitive, demand-led industries, producers will tend to produce more of those commodities whose relative price has increased. Under such conditions none of these indices provides a close estimate of the economic index. However, a Carli index is more likely to provide a closer approximation to the economic index than the Jevons index, which may be viewed as downward biased, provided sampling is with probabilities proportional to reference period revenue shares.

10.38 In the economic approach, the choice of index formula rests on which is likely to approximate more closely the underlying economic index—in other words, whether the (unknown) cross-elasticities are likely to be closer to unity or zero, on average. In practice, the cross-elasticities could take on any value. In some industries supply is relatively unresponsive to changes in demand, and producers tend to produce the same relative quantities irrespective of relative price changes. In this case, a Carli index is likely to give a closer approximation to the economic index than the Jevons index, provided prices are sampled with probability proportionate to value shares in the reference period.

10.39 While these results may at first sight add credence to the use of these formulas, they do so only if two conditions are met. First, that the appropriate sample design is used and second, that it is meaningful for the product group under consideration. As noted above, a justification for Carli requires sampling with probabilities proportionate to reference period revenue shares and zero-valued cross-elasticities.

B.2.3.2 Input import price indices, import price indices

10.40 The discussion is phrased in terms of a cost-minimizing resident purchaser for an import price index (or nonresident purchaser for an XPI). Again the choice of index formula rests on the sample design and which is likely to approximate more closely the underlying ideal economic index, that is, whether the (unknown) cross-elasticities are likely to be closer to unity or zero, on average. In some industries demand for inputs is relatively unresponsive to changes in prices and establishments tend to import the same relative quantities irrespective of changes in their prices. In such cases the Carli index is likely to provide a closer approximation to the ideal economic index than the Jevons index, which may be viewed as having a downward bias. If establishments tend to substitute toward cheaper inputs as a response to change in relative prices, the Jevons index may provide the better estimate of the economic index, and the Carli may be viewed as having an upward bias. However, in both instances a prerequisite for making any such calls is that the appropriate sample design has been utilized and, as noted earlier, for survey-based XMPIs, this is unlikely.

10.41 It should be noted that the Jevons index does not imply, or assume, that the trade value shares remain constant. Obviously, the Jevons index can be calculated whether or not changes occur in the value shares in practice. What the economic approach shows is that if sampling is with probabilities proportionate to population value shares, and if the value shares remain constant, or roughly constant, then the Jevons index can be expected to provide a good estimate of the underlying ideal economic index. Similarly, if the same sample design holds and if the relative quantities remain constant, then the Carli index can be expected to provide a good estimate. But neither of these formulas actually implies that sampling is with probability proportionate to value or quantities and that value shares or relative quantities remain fixed over time. Quite limiting assumptions are required for the use of the economic approach at the elementary level, and if these assumptions do not hold, the axiomatic approach provides sound guidance to adopt the Jevons index. Reference should be made to Chapter 18 and, in the context of elementary index numbers, Section F of Chapter 21, for a more rigorous statement of the economic approach.

B.3 Chained versus direct indices for elementary aggregates

10.42 In a direct elementary index, the prices of the current period are compared directly with those of the price reference period. In a chained elementary index, prices in each period are compared with those in the previous period, and the resulting short-term indices are then multiplied, or chained, to obtain the long-term index, as illustrated in Table 10.1.

10.43 Provided that prices are recorded for the same set of commodities in every period, as in Table 10.1, any index formula defined as the ratio of the average prices will be transitive—that is, the same result is obtained whether the index is calculated as a direct index or as a chained elementary index. In a chained elementary index, successive numerators and denominators will cancel out, leaving only the average price in the last period divided by the average price in the price reference period, which is the same as the direct index. Both the Dutot and the Jevons indices are therefore transitive. As already noted, however, a chain Carli index is not transitive and should not be used because of its upward bias. The direct Carli fails, as noted above, the time reversal test and is not generally advised. Nevertheless, the direct Carli remains an option.
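The transitivity property described above can be checked numerically. The following is a minimal sketch with hypothetical prices (they are not taken from Table 10.1); the function names are illustrative only. One commodity's price "bounces" and returns to its starting level, which is the situation in which the chained Carli tends to drift upward.

```python
import math

def dutot(p0, pt):
    """Ratio of arithmetic mean prices."""
    return (sum(pt) / len(pt)) / (sum(p0) / len(p0))

def jevons(p0, pt):
    """Geometric mean of price ratios."""
    return math.exp(sum(math.log(b / a) for a, b in zip(p0, pt)) / len(p0))

def carli(p0, pt):
    """Arithmetic mean of price ratios."""
    return sum(b / a for a, b in zip(p0, pt)) / len(p0)

def chained(formula, periods):
    """Multiply the successive period-to-period links."""
    index = 1.0
    for prev, cur in zip(periods, periods[1:]):
        index *= formula(prev, cur)
    return index

# Hypothetical prices for three commodities in periods 0, 1, and 2.
prices = [
    [10.0, 20.0, 30.0],  # period 0 (price reference period)
    [15.0, 20.0, 30.0],  # period 1: first price jumps
    [10.0, 22.0, 30.0],  # period 2: first price returns to its initial level
]

for name, formula in [("Dutot", dutot), ("Jevons", jevons), ("Carli", carli)]:
    direct = formula(prices[0], prices[-1])
    chain = chained(formula, prices)
    print(f"{name:6s} direct={direct:.4f} chained={chain:.4f}")
```

With these assumed data the direct and chained versions of the Dutot and Jevons indices coincide, while the chained Carli lies above the direct Carli.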

10.44 Although the chained and direct versions of the Dutot and Jevons indices are identical when there are no breaks in the series for the individual commodities, they offer different ways of dealing with new and disappearing commodities, missing prices, and quality adjustments. In practice, commodities continually have to be dropped from the index and new ones included, in which case the direct and the chain indices may differ if the imputations for missing prices are made differently.

10.45 When a replacement commodity has to be included in a direct index, it often will be necessary to estimate the price of the new commodity in the price reference period, which may be some time in the past. The same happens if, as a result of an update of the sample, new commodities have to be linked into the index. If no information exists on the price of the replacement commodity in the price reference period, it will be necessary to estimate it using price relatives calculated for the commodities that remain in the elementary aggregate, a subset of these commodities, or some other indicator. However, the direct approach should be used only for a limited period. Otherwise, most of the reference prices would end up being imputed, which would be an undesirable outcome. This effectively rules out the use of the Carli index over a long period, because the Carli index can be used only in its direct form anyway, and even then is subject to bias owing to its failure of the time reversal test, being unacceptable when chained.

10.46 In a chained elementary index, if a commodity becomes permanently missing, a replacement commodity can be linked into the index as part of the ongoing index calculation by including the commodity in the monthly index as soon as prices for two successive months are obtained. Similarly, if the sample is updated and new commodities have to be linked into the index, this will require successive old and new prices for the present and the preceding month. However, for a chain elementary index, the missing observation will affect the index for two months, because the missing observation is part of two links in the chain. This is not the case for a direct index where a single, nonestimated missing observation will affect only the index in the current period. For example, when comparing periods 0 and 3, a missing price of a commodity in period 2 means that the chained index excludes the commodity for the last link of the index in periods 2 and 3, while the direct index includes it in period 3 (because a direct index will be based on commodities with prices available in periods 0 and 3). However, in general, the use of a chained index can make the estimation of missing prices and the introduction of replacements easier from a computational point of view, whereas it may be inferred that a direct index will limit the usefulness of overlap methods for dealing with missing observations. Missing price observations are discussed further in Section B.5.

10.47 The direct and the chained elementary approaches also produce different by-products that may be used for monitoring price data. For each elementary aggregate, a chained index approach gives the latest monthly price change, which can be useful for both editing data and imputing missing prices. By the same token, however, a direct index derives average price levels for each elementary aggregate in each period, and this information may be a useful by-product. However, the availability of cheap computing power and spreadsheets allows such by-products to be calculated whether a direct or a chained approach is applied, so the choice of formula should not be dictated by considerations regarding by-products.

B.4 Consistency in aggregation

10.48 Consistency in aggregation means that if an index is calculated stepwise by aggregating lower-level indices to obtain indices at progressively higher levels of aggregation, the same overall result should be obtained as if the calculation had been made in one step. If the elementary indices are calculated using one formula, and then averaged to obtain the higher-level indices using another formula, the resulting XMPI is not consistent in aggregation. However, it may be argued that consistency in aggregation is not necessarily an important or even appropriate criterion. There may be different elasticities of substitution within elementary aggregates compared to the elasticities between elementary aggregates. This may be an argument for using a different index formula at a different level of aggregation. Also, it may be unachievable, particularly when the amount of information available on quantities and trade values is not the same at the different levels of aggregation.

10.49 The Carli index is consistent in aggregation with a higher-level Laspeyres index if the commodities are selected with probabilities proportional to trade values in the price reference period. The Dutot and the Jevons indices are not consistent in aggregation with a higher-level Laspeyres index. However, as explained below, the XMPIs actually calculated by statistical offices are usually not true Laspeyres indices anyway, even though they may be based on fixed baskets of goods and services. If the higher-level index were to be defined as a geometric Laspeyres index, consistency in aggregation could be achieved by using the Jevons index for the elementary indices at the lower level, provided that the individual commodities are sampled with probabilities proportional to trade values. Although unfamiliar, a geometric Laspeyres index has desirable properties from an economic point of view and is considered again in Section B.6.
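The idea of consistency in aggregation can be illustrated with a small sketch. It assumes, for simplicity, that explicit reference-period trade values are available for every commodity (rather than being reflected through the sample design), and all figures and names are hypothetical. Using the same type of mean at both stages reproduces the one-step result; mixing a geometric mean at the lower level with an arithmetic mean at the higher level does not.

```python
import math

def weighted_arithmetic(relatives, weights):
    """Value-weighted arithmetic mean of price relatives."""
    return sum(w * r for w, r in zip(weights, relatives)) / sum(weights)

def weighted_geometric(relatives, weights):
    """Value-weighted geometric mean of price relatives."""
    total = sum(weights)
    return math.exp(sum(w * math.log(r) for w, r in zip(weights, relatives)) / total)

# Hypothetical elementary aggregates: price relatives and reference-period values.
groups = {
    "EA1": {"relatives": [1.10, 0.95], "values": [30.0, 20.0]},
    "EA2": {"relatives": [1.30, 1.00, 1.05], "values": [10.0, 25.0, 15.0]},
}

def one_step(mean):
    """Aggregate all commodities directly in a single calculation."""
    relatives = [r for g in groups.values() for r in g["relatives"]]
    values = [v for g in groups.values() for v in g["values"]]
    return mean(relatives, values)

def two_step(lower_mean, upper_mean):
    """Compute elementary indices first, then average them with group values."""
    ea_indices = [lower_mean(g["relatives"], g["values"]) for g in groups.values()]
    ea_values = [sum(g["values"]) for g in groups.values()]
    return upper_mean(ea_indices, ea_values)

print("arithmetic/arithmetic:", two_step(weighted_arithmetic, weighted_arithmetic))
print("one-step arithmetic:  ", one_step(weighted_arithmetic))
print("geometric/arithmetic: ", two_step(weighted_geometric, weighted_arithmetic))
```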

B.5 Missing price observations

10.50 The price of a commodity may not be collected in a particular period either because the commodity is missing temporarily or because it has permanently disappeared. The two classes of missing prices require different treatments. Temporary unavailability may occur for seasonal commodities (particularly for fruit, vegetables, and clothing) because of supply shortages or possibly because of some collection difficulty (e.g., an establishment was closed or a respondent was on vacation). The treatment of seasonal commodities raises a number of particular problems. These are dealt with in Chapter 23 and are not discussed here.

B.5.1 Treatment of temporarily missing prices

10.51 In the case of temporarily missing observations for commodities, one of four actions may be taken:

  • Omit the commodity for which the price is missing so that a matched sample is maintained (like is compared with like), even though the sample is depleted.

  • Carry forward the last observed price.

  • Impute the missing price by the average price change of the prices that are available in the elementary aggregate.

  • Impute the missing price by the price change of a comparable commodity from a similar establishment.

10.52 The price development for a given commodity may be different according to the country to which it is exported or from which it is imported. This may be due to different price trends in the countries or exchange rate changes. An elementary index thus may contain commodities from several countries, and the price development of a missing commodity for a specific country may be unusual compared to the average price development of the remaining ones. When imputing a price by the price development of another commodity or a group of commodities, one should therefore give consideration to the country of origin (imports) or destination (exports).

10.53 Omitting an observation from the calculation of an elementary index is equivalent to assuming that the price would have moved in the same way as the average of the prices of the commodities that remain included in the index. Omitting an observation changes the implicit weights attached to the other prices in the elementary aggregate.

10.54 Carrying forward the last observed price should be avoided wherever possible and is acceptable for only a very limited number of periods. Special care needs to be taken in periods of high inflation or when markets are changing rapidly as a result of a high rate of innovation and commodity turnover. Although simple to apply, carrying forward the last observed price biases the resulting index toward zero change. In addition, there is likely to be a compensating step-change in the index when the price of the missing commodity is recorded again. The adverse effect on the index will be increasingly severe if the commodity remains unpriced for some length of time. In general, carry forward is not an acceptable procedure or solution to the problem unless one is certain the price has not changed.

10.55 Imputation of the missing price by the average change of the available prices may be applied for elementary aggregates when the prices can be expected to move in the same direction. The imputation can be made using all the remaining prices in the elementary aggregate. As already noted, this is numerically equivalent to omitting the commodity for the immediate period, but it is useful to make the imputation so that if the price becomes available again in a later period, the sample size is not reduced in that period. In some cases, depending on the homogeneity of the elementary aggregate, it may be preferable to use only a subset of commodities from the elementary aggregate to estimate the missing price. In some instances, this may even be a single comparable commodity from a similar type of establishment whose price change can be expected to be similar to the missing one.
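The numerical equivalence between omission and imputation by the average change, and the bias of carry forward toward zero change, can be sketched as follows. The prices are hypothetical and a month-to-month Jevons link is assumed for the illustration.

```python
import math

def jevons_link(prev, cur):
    """Month-to-month Jevons index over commodities priced in both months."""
    matched = [k for k in prev
               if prev.get(k) is not None and cur.get(k) is not None]
    return math.exp(sum(math.log(cur[k] / prev[k]) for k in matched) / len(matched))

# Hypothetical prices; commodity A is temporarily missing in March.
feb = {"A": 5.0, "B": 7.0, "C": 2.0}
mar = {"A": None, "B": 8.0, "C": 3.0}

# Option 1: omit A, so the link is based on B and C only.
link_omit = jevons_link(feb, mar)

# Option 2: impute A with the average change of the available prices.
mar_imputed = dict(mar, A=feb["A"] * link_omit)
link_imputed = jevons_link(feb, mar_imputed)

# Option 3: carry forward the last observed price of A.
mar_carried = dict(mar, A=feb["A"])
link_carried = jevons_link(feb, mar_carried)

print(f"omit={link_omit:.4f} impute={link_imputed:.4f} carry={link_carried:.4f}")
```

Omission and imputation give the same link for the immediate period, whereas carry forward pulls the link toward 1, that is, toward no change.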

10.56 Table 10.3 illustrates the calculation of the price index for an elementary aggregate consisting of three commodities, where one of the prices is missing in March. The upper part of Table 10.3 shows the indices where the missing price has been omitted from the calculation. The direct indices are therefore calculated on the basis of A, B, and C for all months except March, where they are calculated on the basis of B and C only. The chained indices are calculated on the basis of all three prices from January to February and from April to May. From February to March and from March to April, the monthly indices are calculated on the basis of B and C only.

Table 10.3. Imputation of Temporarily Missing Prices

10.57 For both the Dutot and the Jevons, the direct and chain indices now differ from March onward. The first link in the chained index (January to February) is the same as the direct index, so that the two indices are identical numerically. The direct index for March ignores the price decrease of commodity A between January and February, whereas this is taken into account in the chained index. As a result, the direct index is higher than the chained index for March. On the other hand, in April and May, where all prices again are available, the direct index catches the price development, whereas the chained index fails to track the development in the prices.

10.58 In the lower half of Table 10.3, the missing price for commodity A in March is imputed by the average price change of the remaining commodities from February to March. Although the index may be calculated as a direct index comparing the prices of the present period with the reference period prices, the imputation of missing prices should be made on the basis of the average price change from the preceding to the present period, as shown in the table. Imputation on the basis of the average price change from the price reference period to the present period should not be used because it ignores the information about the price change of the missing commodity that has already been included in the index. The treatment of imputations is discussed in more detail in Chapter 8.
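The effect of omission versus short-term imputation on direct and chained indices can be sketched as follows. The prices are hypothetical and do not reproduce Table 10.3; the Jevons formula is assumed, and the helper names are illustrative. The sketch assumes no prices are missing in the price reference period.

```python
import math

# Hypothetical prices for January to May; None marks the missing March price of A.
prices = {
    "A": [6.0, 5.0, None, 7.0, 6.6],
    "B": [7.0, 8.0, 9.0, 8.0, 9.0],
    "C": [2.0, 3.0, 4.0, 3.0, 2.0],
}

def jevons(base, cur):
    """Jevons index over the commodities priced in both periods."""
    logs = [math.log(c / b) for b, c in zip(base, cur)
            if b is not None and c is not None]
    return math.exp(sum(logs) / len(logs))

def column(table, t):
    """Prices of all commodities in period t."""
    return [series[t] for series in table.values()]

def direct_index(table, t):
    """Compare period t directly with the price reference period 0."""
    return jevons(column(table, 0), column(table, t))

def chained_index(table, t):
    """Multiply the month-to-month links up to period t."""
    index = 1.0
    for s in range(1, t + 1):
        index *= jevons(column(table, s - 1), column(table, s))
    return index

def impute_short_term(table):
    """Fill gaps with the previous price times the average short-term change."""
    filled = {name: list(series) for name, series in table.items()}
    n_periods = len(next(iter(filled.values())))
    for t in range(1, n_periods):
        link = jevons(column(filled, t - 1), column(filled, t))
        for series in filled.values():
            if series[t] is None:
                series[t] = series[t - 1] * link
    return filled

filled = impute_short_term(prices)
for t, month in enumerate(["Jan", "Feb", "Mar", "Apr", "May"]):
    print(month,
          f"omit: direct={direct_index(prices, t):.4f} chained={chained_index(prices, t):.4f}",
          f"| imputed: direct={direct_index(filled, t):.4f} chained={chained_index(filled, t):.4f}")
```

With omission the two versions diverge from March onward; once the March price is imputed from the February-to-March change of the remaining commodities, the direct and chained Jevons indices agree in every month.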

10.59 A special case of “missing prices” occurs when prices are recorded with different frequency, such as if some prices are recorded monthly while others only quarterly or every half-year. If the index is compiled on a monthly basis there will then be a need to temporarily update the prices recorded with a lower frequency. The options for updating the lower frequency prices will be the same as those described above.

B.5.2 Treatment of commodities that have permanently disappeared and their replacements

10.60 Commodities may disappear permanently for various reasons. The commodity may disappear from the market because new commodities have been introduced or an establishment from which prices have been collected leaves the market. When commodities disappear permanently, a replacement commodity has to be sampled and included in the index. The replacement commodity should ideally be one that accounts for a significant proportion of sales, is likely to continue to be sold for some time, and is likely to be representative of the price changes of the market that the old commodity covered.

10.61 The timing of the introduction of replacement commodities is important. Many new commodities are initially sold at high prices that then gradually drop over time, especially as the value of sales increases. Alternatively, some commodities may be introduced at artificially low prices to stimulate demand. In such cases, delaying the introduction of a new or replacement commodity until a large volume of sales is achieved may miss some systematic price changes that ought to be captured by XMPIs. It may be desirable to try to avoid forced replacements caused when commodities disappear completely from the market and to try to introduce replacements when sales of the commodities they replace are decreasing and before they cease altogether.

10.62 Table 10.4 shows an example where commodity A disappears after March and commodity D is included as a replacement from April onward. Commodities A and D are not available on the market at the same time, and their price series do not overlap. To include the new commodity in the index from April onward, an imputed price needs to be calculated either for the base period (January) if a direct index is being calculated, or for the preceding period (March) if a chained index is calculated. In both cases, the imputation method ensures that the inclusion of the new commodity does not, in itself, affect the index.

Table 10.4. Disappearing Commodities and Their Replacements with No Overlap

10.63 In the case of an unweighted (elementary) chained formulation, imputing the missing price by the average change of the available prices gives the same result as if the commodity is simply omitted from the index calculation until it has been priced in two successive periods. This allows the chained index to be compiled by simply chaining the month-to-month index between periods t - 1 and t, based on the matched set of prices in those two periods, on to the value of the chained index for period t - 1. In the example, no further imputation is required after April, and the subsequent movement of the index is unaffected by the imputed price change between March and April.
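A matched-sample chained index with a replacement and no overlap can be sketched as follows. The prices are hypothetical (they are not those of Table 10.4) and a Jevons link is assumed. The sketch compares simple omission of the replacement until it has two successive prices with an explicit imputation of its March price.

```python
import math

def matched_jevons(prev, cur):
    """Jevons link over the commodities priced in both months."""
    keys = [k for k in prev if k in cur]
    return math.exp(sum(math.log(cur[k] / prev[k]) for k in keys) / len(keys))

def chain(months):
    """Chain the month-to-month links on to an index starting at 100."""
    index = [100.0]
    for prev, cur in zip(months, months[1:]):
        index.append(index[-1] * matched_jevons(prev, cur))
    return index

# Hypothetical prices: A disappears after March, D is priced from April.
months = [
    {"A": 6.0, "B": 3.0, "C": 7.0},   # January
    {"A": 7.0, "B": 2.0, "C": 8.0},   # February
    {"A": 5.0, "B": 4.0, "C": 9.0},   # March
    {"B": 5.0, "C": 10.0, "D": 9.0},  # April
    {"B": 6.0, "C": 9.0, "D": 8.0},   # May
]

matched = chain(months)

# Explicit alternative: impute a March price for D from the average
# March-to-April change of the continuing commodities B and C.
link_mar_apr = matched_jevons(months[2], months[3])
imputed = [dict(m) for m in months]
imputed[2]["D"] = months[3]["D"] / link_mar_apr
with_imputation = chain(imputed)

print([round(x, 2) for x in matched])
print([round(x, 2) for x in with_imputation])
```

Both series are the same under these assumptions, and D enters the April-to-May link with its observed prices only.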

10.64 In the case of a direct index, however, an imputed price is always required for the reference period to include a new commodity. In the example, the price of the new commodity in each month after April still has to be compared with the imputed price for January. As already noted, to prevent a situation in which most of the reference period prices end up being imputed, the direct approach should be used for only a limited period of time.

10.65 The situation is somewhat simpler when there is an overlap month in which prices are collected for both the disappearing and the replacement commodity. In this case, it is possible to link the price series for the new commodity to the price series for the old commodity that it replaces. Linking with overlapping prices involves making an implicit adjustment for the difference in quality between the two commodities, because it assumes that the relative prices of the new and old commodity reflect their relative qualities. For perfect or nearly perfect markets, this may be a valid assumption, but for certain markets and commodities it may not be so reasonable. The question of when to use overlapping prices is dealt with in detail in Chapter 8. The overlap method is illustrated in Table 10.5.

Table 10.5. Disappearing and Replacement Commodities with Overlapping Prices

10.66 In the example, overlapping prices are obtained for commodities A and D in March. Their relative prices suggest that one unit of commodity A is worth two units of commodity D. If the index is calculated as a direct Carli index, the January base period price for commodity D can be imputed by dividing the price of commodity A in January by the price ratio of A and D in March.

10.67 A monthly chain index of arithmetic mean prices will be based on the prices of commodities A, B, and C until March, and on those of B, C, and D from April onward. The replacement commodity is not included until prices for two successive periods are obtained. Thus, the monthly chained index has the advantage that it is not necessary to carry out any explicit imputation of a reference price for the new commodity.

10.68 If a direct index is calculated as the ratio of the arithmetic mean prices, the price of the new commodity needs to be adjusted by the price ratio of A and D in March in every subsequent month, which complicates computation. Alternatively, a reference period price of commodity D for January may be imputed. However, this results in a different index because the price relatives are implicitly weighted by the relative reference period prices in the Dutot index, which is not the case for the Carli or the Jevons index. For the Jevons index, all three methods give the same result, which is an additional advantage of this approach.
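The three ways of handling an overlap (monthly chaining, adjusting the new commodity's price by the March price ratio, and imputing a January reference price) can be compared in a short sketch. The prices are hypothetical and are not those of Table 10.5; the function names are illustrative.

```python
import math

def jevons(base, cur):
    """Geometric mean of price ratios over commodities priced in both periods."""
    keys = [k for k in base if k in cur]
    return math.exp(sum(math.log(cur[k] / base[k]) for k in keys) / len(keys))

def dutot(base, cur):
    """Ratio of arithmetic mean prices over commodities priced in both periods."""
    keys = [k for k in base if k in cur]
    return sum(cur[k] for k in keys) / sum(base[k] for k in keys)

# Hypothetical prices; March is the overlap month in which A and D are both priced.
jan = {"A": 6.0, "B": 3.0, "C": 7.0}
mar = {"A": 5.0, "B": 4.0, "C": 9.0, "D": 10.0}
apr = {"B": 5.0, "C": 10.0, "D": 9.0}

def three_methods(formula):
    """Return the April index under the three treatments of the overlap."""
    ratio_a_d = mar["A"] / mar["D"]
    # 1. Chain January-March (A, B, C) with March-April (B, C, D).
    chained = formula(jan, mar) * formula(mar, apr)
    # 2. Direct index, with D's April price rescaled to the level of A.
    apr_adjusted = {"A": apr["D"] * ratio_a_d, "B": apr["B"], "C": apr["C"]}
    direct_adjusted = formula(jan, apr_adjusted)
    # 3. Direct index, with a January reference price imputed for D.
    jan_imputed = {"B": jan["B"], "C": jan["C"], "D": jan["A"] / ratio_a_d}
    direct_imputed = formula(jan_imputed, apr)
    return chained, direct_adjusted, direct_imputed

print("Jevons:", [round(x, 4) for x in three_methods(jevons)])
print("Dutot: ", [round(x, 4) for x in three_methods(dutot)])
```

Under these assumed prices the three Jevons results are identical, while the Dutot results differ from one another because the implicit weights depend on the price levels used.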

10.69 Problems with missing prices may be particularly acute in smaller countries or, even in larger countries, for commodity groups where the number of reporting establishments is very limited. From time to time establishments will leave the market and no replacement can be found. If there are still prices recorded, the elementary index can be continued on the basis of the remaining prices. However, in some instances all prices for an elementary aggregate may disappear. In this case, it will be necessary to assign the weight to another elementary aggregate or to impute or carry forward the elementary index until the next revision of the index.

10.70 The statistical office may try to reduce the problems associated with missing prices by defining the elementary aggregates not too narrowly. More broadly defined elementary aggregates will reduce problems with missing prices and help facilitate a smooth, regular compilation of the index. However, markets change over time and the index should reflect this. This issue is dealt with in more detail in Sections C.6.4 and C.6.5 on the introduction of new elementary and higher-level indices in the overall price index.

B.6 Calculation of elementary price indices using weights

10.71 Whenever possible, weights that reflect the relative importance of the sampled commodities may be introduced in the calculation of the elementary indices. For certain elementary aggregates, information about the value of export or import of particular commodities may be obtained from existing trade and industry sources, or the statistical office can work with establishment respondents to obtain weighting data, as discussed in Chapter 5. In addition, the growing use of electronic recording of transactions in many countries, in which records on both prices and quantities are maintained, means that valuable new sources of information may become increasingly available to statistical offices.

10.72 For example, assume that the number of importers of a certain commodity such as gasoline is limited. The market shares of the importers may be known from business survey statistics and can be used as weights in the calculation of an elementary aggregate price index for gasoline.

10.73 A special situation occurs in the case of tariff prices. A tariff is a list of prices for the provision of a particular kind of good or service under different terms and conditions. One example is electricity for which one price is charged during the day and a lower price is charged at night. Another example may be airline passenger fares sold at one price to some passengers and at lower prices to others. In such cases, it is appropriate to assign weights to the different tariffs or prices to calculate the price index for the elementary aggregate.

10.74 Weights within elementary aggregates may be updated independently and possibly more often than the elementary aggregate weights themselves.

Table 10.6. Calculation of a Weighted Elementary Index

10.75 If weighting data are available for all the individual commodities within an elementary aggregate, the elementary price index can be calculated as a Laspeyres price index, or as a geometric Laspeyres index; both are discussed further in Chapter 21. The Laspeyres price index is defined as

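$$P_L^{0:t} = \frac{\sum_i p_i^t q_i^0}{\sum_i p_i^0 q_i^0} = \sum_i w_i^0 \left(\frac{p_i^t}{p_i^0}\right), \qquad w_i^0 = \frac{p_i^0 q_i^0}{\sum_i p_i^0 q_i^0}. \tag{10.4}$$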

10.76 As the quantities are often unknown, the index usually will have to be calculated by weighting together the individual price ratios by their trade value shares in the price reference period, $w_i^0$. The available weighting data may refer to an earlier period than the price reference period, but may still provide a good estimate. A more general version of equation (10.4) would be that of a Lowe or a Young index, where the weights are not necessarily those of the price reference period. These two indices are discussed in more detail in Section C.3. Note that if all the weights were equal, equation (10.4) would reduce to the Carli index. If the weights were proportional to the prices in the reference period, it would reduce to the Dutot index.
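The two forms of equation (10.4) and its special cases can be verified with a short sketch. The prices and quantities are hypothetical, and the function names are illustrative only.

```python
def laspeyres_from_quantities(p0, pt, q0):
    """Ratio form: cost of the reference-period basket at current and reference prices."""
    return sum(p * q for p, q in zip(pt, q0)) / sum(p * q for p, q in zip(p0, q0))

def laspeyres_from_shares(p0, pt, w0):
    """Share form: value-share-weighted arithmetic mean of the price ratios."""
    return sum(w * (b / a) for w, a, b in zip(w0, p0, pt))

def value_shares(p0, q0):
    """Reference-period trade value shares."""
    total = sum(p * q for p, q in zip(p0, q0))
    return [p * q / total for p, q in zip(p0, q0)]

# Hypothetical reference prices, current prices, and reference quantities.
p0 = [4.0, 10.0, 25.0]
pt = [5.0, 11.0, 24.0]
q0 = [100.0, 20.0, 4.0]

w0 = value_shares(p0, q0)
print("ratio form:", laspeyres_from_quantities(p0, pt, q0))
print("share form:", laspeyres_from_shares(p0, pt, w0))

# Equal weights give the Carli index.
equal = [1.0 / len(p0)] * len(p0)
carli = sum(b / a for a, b in zip(p0, pt)) / len(p0)
print("equal weights:", laspeyres_from_shares(p0, pt, equal), "Carli:", carli)

# Weights proportional to reference prices give the Dutot index.
price_weights = [p / sum(p0) for p in p0]
dutot = sum(pt) / sum(p0)
print("price weights:", laspeyres_from_shares(p0, pt, price_weights), "Dutot:", dutot)
```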

10.77 The geometric Laspeyres index is defined as

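$$P_{GL}^{0:t} = \prod_i \left(\frac{p_i^t}{p_i^0}\right)^{w_i^0} = \frac{\prod_i \left(p_i^t\right)^{w_i^0}}{\prod_i \left(p_i^0\right)^{w_i^0}}, \qquad \sum_i w_i^0 = 1, \tag{10.5}$$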

where the weights, $w_i^0$, are again the trade value shares in the reference period. When the weights are all equal, equation (10.5) reduces to the Jevons index. If the trade value shares do not change much between the weight reference period and the current period, then the geometric Laspeyres index approximates a Törnqvist index. A more general version of equation (10.5) would be that of a geometric Young index, where the weights are not necessarily those of the price reference period.
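A corresponding sketch for equation (10.5) is given here, again with hypothetical prices and shares. It checks that equal weights reproduce the Jevons index and that the geometric Laspeyres index does not exceed the arithmetic version computed with the same weights.

```python
import math

def geometric_laspeyres(p0, pt, w0):
    """Share-weighted geometric mean of price ratios; shares must sum to 1."""
    return math.exp(sum(w * math.log(b / a) for w, a, b in zip(w0, p0, pt)))

def arithmetic_laspeyres(p0, pt, w0):
    """Share-weighted arithmetic mean of price ratios."""
    return sum(w * (b / a) for w, a, b in zip(w0, p0, pt))

# Hypothetical prices and reference-period trade value shares.
p0 = [4.0, 10.0, 25.0]
pt = [5.0, 11.0, 24.0]
w0 = [0.5, 0.3, 0.2]

geo = geometric_laspeyres(p0, pt, w0)
ari = arithmetic_laspeyres(p0, pt, w0)
print(f"geometric Laspeyres={geo:.4f} arithmetic Laspeyres={ari:.4f}")

# With equal weights the geometric Laspeyres index is the Jevons index.
n = len(p0)
jevons = math.prod(b / a for a, b in zip(p0, pt)) ** (1.0 / n)
print(f"equal weights={geometric_laspeyres(p0, pt, [1.0 / n] * n):.4f} Jevons={jevons:.4f}")
```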

10.78 The weights may be attached to the individual price observations or to groups of price observations. For example, two establishments may both report, say, five prices that enter into the calculation of an elementary aggregate price index. However, the only weighting information may refer to the overall relative market share of the two establishments rather than to the individual commodities. Thus, if the relative market shares are 40/60, the two groups of prices may be weighted according to the 40/60 shares of the establishments.

10.79 Table 10.6 provides an example of the calculation of an elementary index using weights. The elementary aggregate consists of three commodities for which prices are collected monthly. The trade value shares are estimated at 0.80, 0.17, and 0.03.

10.80 One option is to calculate the index as the weighted arithmetic mean of the price ratios, which gives an index of 112.64. The individual price changes are weighted according to their explicit weights, irrespective of the price levels. This corresponds to the calculation of a Laspeyres price index, where the price ratios and the weights refer to the same reference month. The index may also be calculated as the weighted geometric mean of the price ratios, the so-called geometric Laspeyres index, which gives an index of 105.95.

10.81 A third option could be to calculate the index as the ratio of the weighted arithmetic mean prices. This approach has two drawbacks. First, as already noted, an elementary index should be based on arithmetic mean prices only if it includes homogeneous products measured in the same unit; otherwise it is not meaningful to calculate an average price. In practice, this will also mean that the price level of the products should be more or less the same. Second, this approach weights the price changes according to the relative price level in the reference period. Hence, the increase of 28.6 percent for commodity A, which accounts for 80 percent of the market, is weighted down because of its relatively low price, resulting in an index of 94.11. This calculation is therefore misleading.

10.82 The difference between the two arithmetic methods can be illustrated by an example. Assume an elementary aggregate with two commodities, X and Y, of equal weights (50/50). The price of X is constant at 90, and the price of Y increases from 10 to 12. The weighted arithmetic mean of the price ratios gives 0.5 × (90/90) + 0.5 × (12/10) = 1.10. The ratio of weighted arithmetic mean prices gives (0.5 × 90 + 0.5 × 12)/(0.5 × 90 + 0.5 × 10) = (90 + 12)/(90 + 10) = 1.02. In the first approach, the price increases of the two commodities are equally weighted, which gives an increase of 10 percent. The problem in the second approach is that it weights the 0 percent price increase on X by 90/100, and the 20 percent increase of Y by only 10/100, which gives an overall increase of 2 percent. This can be justified only if the weights are proportional to the relative price level in the reference period, that is, if the weight of X is 90 and that of Y is 10, which, however, contradicts the assumption of 50/50 weights. Because of the calculation method, the weights are twisted according to the relative price levels, resulting in a misleading index.
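The arithmetic of this example can be reproduced in a few lines. The figures are those of the example (prices 90 and 10, rising to 90 and 12, with 50/50 weights); the variable names are illustrative.

```python
weights = {"X": 0.5, "Y": 0.5}
p0 = {"X": 90.0, "Y": 10.0}
pt = {"X": 90.0, "Y": 12.0}

# Weighted arithmetic mean of the price ratios.
mean_of_ratios = sum(weights[k] * pt[k] / p0[k] for k in weights)

# Ratio of weighted arithmetic mean prices.
ratio_of_means = (sum(weights[k] * pt[k] for k in weights)
                  / sum(weights[k] * p0[k] for k in weights))

# Implicit weights that the ratio of means attaches to each price ratio.
base_total = sum(weights[k] * p0[k] for k in weights)
implicit = {k: weights[k] * p0[k] / base_total for k in weights}

print(f"mean of ratios = {mean_of_ratios:.2f}")
print(f"ratio of means = {ratio_of_means:.2f}")
print("implicit weights:", implicit)
```

The implicit weights of 0.9 for X and 0.1 for Y show how the ratio of mean prices departs from the intended 50/50 weighting.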

10.83 Weighting information at the very detailed level demands resources to obtain and update. This has to be balanced against the possible gains in terms of a more accurate price index.

B.7 Some alternative index formulas

10.84 Another type of average is the harmonic mean. In the present context, there are two possible versions: either the harmonic mean of price ratios or the ratio of harmonic mean of prices. The harmonic mean of price ratios is defined as

$$P_{HR}^{0:t} = \frac{1}{\dfrac{1}{n}\sum_i \dfrac{p_i^0}{p_i^t}}. \tag{10.6}$$

The ratio of harmonic mean prices is defined as

$$P_{RH}^{0:t} = \frac{\sum_i n / p_i^0}{\sum_i n / p_i^t}. \tag{10.7}$$

Neither formula appears to be used much in practice, perhaps because the harmonic mean is not a familiar concept and would not be easy to explain to users. However, at an aggregate level, the widely used Paasche index is a weighted harmonic average.

10.85 The ranking of the three common types of mean is always

$$\text{arithmetic mean} \geq \text{geometric mean} \geq \text{harmonic mean}.$$

It is shown in Chapter 21 that, in practice, the Carli index, the arithmetic mean of the price ratios, is likely to exceed the Jevons index, the geometric mean, by roughly the same amount that the Jevons exceeds the harmonic mean, as shown in equation (10.6). The harmonic mean of the price relatives has the same kinds of axiomatic properties as the Carli but with opposite tendencies and biases. It fails the transitivity and time reversal tests discussed earlier. In addition it is very sensitive to “price bouncing,” as is the Carli index. As it can be viewed conceptually as the complement, or rough mirror image, of the Carli index, it has been argued that a suitable elementary index would be provided by a geometric mean of the two, in the same way that, at an aggregate level, a geometric mean is taken of the Laspeyres and Paasche indices to obtain the Fisher index. Such an index has been proposed by Carruthers, Sellwood, and Ward (1980) and Dalén (1992)—namely,

$$I_{CSWD}^{0:t} = \sqrt{I_C^{0:t}\, I_{HR}^{0:t}}. \tag{10.8}$$

$I_{CSWD}$ is shown in Chapter 20 to have very good axiomatic properties, though not quite as good as those of the Jevons index, which is transitive, whereas $I_{CSWD}$ is not. However, it can be shown to be approximately transitive and, empirically, it has been observed to be very close to the Jevons index.
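The relationships among these elementary formulas can be explored with a minimal sketch. The three prices per period are hypothetical, and the function names are illustrative only; the formulas follow equations (10.6) to (10.8) and the definitions of the Carli and Jevons indices given earlier in the chapter.

```python
import math

def carli(p0, pt):
    """Arithmetic mean of the price ratios."""
    return sum(b / a for a, b in zip(p0, pt)) / len(p0)

def jevons(p0, pt):
    """Geometric mean of the price ratios."""
    return math.prod(b / a for a, b in zip(p0, pt)) ** (1 / len(p0))

def harmonic_of_ratios(p0, pt):
    """Harmonic mean of the price ratios, equation (10.6)."""
    return 1 / (sum(a / b for a, b in zip(p0, pt)) / len(p0))

def ratio_of_harmonic_means(p0, pt):
    """Ratio of harmonic mean prices, equation (10.7)."""
    n = len(p0)
    return sum(n / a for a in p0) / sum(n / b for b in pt)

def cswd(p0, pt):
    """Geometric mean of the Carli and harmonic indices, equation (10.8)."""
    return math.sqrt(carli(p0, pt) * harmonic_of_ratios(p0, pt))

# Hypothetical prices for three sampled commodities.
p0 = [10.0, 20.0, 50.0]
pt = [12.0, 19.0, 55.0]

for name, f in [("Carli", carli), ("Jevons", jevons),
                ("Harmonic (ratios)", harmonic_of_ratios),
                ("Ratio of harmonic means", ratio_of_harmonic_means),
                ("CSWD", cswd)]:
    print(f"{name:24s} {f(p0, pt):.5f}")

# Time reversal: forward index times backward index should equal 1.
print("Carli  forward x backward:", carli(p0, pt) * carli(pt, p0))
print("Jevons forward x backward:", jevons(p0, pt) * jevons(pt, p0))
```

With these assumed prices the Carli exceeds the Jevons, which exceeds the harmonic mean of ratios, and the CSWD index lies very close to the Jevons. The Carli fails the time reversal check while the Jevons passes it.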

B.8 Unit value indices

10.86 As noted in Chapter 2, unit value indices based on customs data have a history as the predominant method of compiling XMPIs. However, it was also noted in Chapter 2 that unit value indices are subject to a particular form of formula bias and that price indices based on price surveys are now considered, for the most part, to be preferable. The potential bias in unit value indices is addressed in Chapter 2. The unit value index for period t relative to a reference period 0 is defined as

$$P_{UV}^{0:t} = \left(\frac{\sum_i p_i^t q_i^t}{\sum_i q_i^t}\right) \Big/ \left(\frac{\sum_i p_i^0 q_i^0}{\sum_i q_i^0}\right). \tag{10.9}$$

10.87 The unit value index is simple in form. The unit value in each period is calculated by dividing total trade value on some commodity by the related total quantity. It is clear that the quantities must be strictly additive in an economic sense, which implies that they should relate to a single homogeneous commodity. The unit value index is then defined as the ratio of unit values in the current period to those in the reference period. An example of a unit value calculation can be found in Table 10.7. The trade quantity data is assumed to be available at the level of documentation required for the international shipment. The example in Table 10.7 shows the commodity category for the Harmonized Code 6402991815: Tennis shoes, basketball shoes, gym shoes, training shoes, and the like.

Table 10.7. Calculation of Unit Value Index for Sample Commodity Category (6402991815: Tennis shoes, basketball shoes, gym shoes, training shoes, and the like)

[Table image not reproduced.]

Note: All unit value indices have been calculated using unrounded figures.

10.88 It is important to recognize that the unit value index is not a price index as normally understood, because it essentially is a measure of the change in the average price of a single commodity when that commodity is sold at different prices to different purchasers, perhaps at different times within the same period. In Table 10.7, both the trade value and quantity increased from January to February and the resulting unit value index shows an increase. However, because the commodity category is so broad, it cannot be determined whether in fact there was a price increase or a change in the type and/or quality of footwear traded. It is concluded that unit value indices should not be calculated for sets of heterogeneous commodities. Because international trade in homogeneous commodities like raw materials makes up less and less of total world trade, the use of unit value indices is not recommended as a substitute for XMPIs.
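The compositional effect described above can be illustrated with a minimal sketch of equation (10.9). The figures are hypothetical and are not those of Table 10.7: two varieties within one broad category are assumed, each with an unchanged price, while the mix of quantities shifts toward the more expensive variety.

```python
# Hypothetical shipments in one broad commodity category: variety -> (price, quantity).
# The price of each variety is the same in both months; only the mix changes.
january = {"basic": (20.0, 800), "premium": (60.0, 200)}
february = {"basic": (20.0, 600), "premium": (60.0, 500)}

def unit_value(shipments):
    """Total trade value divided by total quantity."""
    value = sum(p * q for p, q in shipments.values())
    quantity = sum(q for _, q in shipments.values())
    return value / quantity

# Unit value index, equation (10.9), with January as the reference period.
uv_index = 100 * unit_value(february) / unit_value(january)

# Matched price relatives for each variety (all equal to 1 here).
price_relatives = {k: february[k][0] / january[k][0] for k in january}

print(f"January unit value:  {unit_value(january):.2f}")
print(f"February unit value: {unit_value(february):.2f}")
print(f"Unit value index:    {uv_index:.2f}")
print("Matched price relatives:", price_relatives)
```

Under these assumptions both trade value and quantity rise and the unit value index shows a large increase, although no individual price has changed.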

10.89 However, in cases with strictly homogeneous commodities where the commodity specifications remain constant, although unlikely with customs data, a unit value may be used to estimate a price index. Often the unit value indices will be available from foreign trade statistics, or can be calculated on the basis of data from foreign trade statistics. The unit values should be based on data on both the value of the total trade and the total quantities sold covering the whole period, for example, one month, in order to derive a unit value index. This is particularly important if the commodity is sold at a discount price for part of the period and at the “regular” price for the rest of the period. Under these conditions, neither the discount price nor the regular price is likely to be representative of the average price at which the commodity has been sold or the price change between periods. The unit value over the whole month should be used. With the possibility of collecting more and more data from electronic records, such procedures may be increasingly used. However, it should be stressed that the commodity specifications must remain constant through time. Changes in the commodity specifications could lead to unit value changes that reflect changes in quantity or quality, which should not be counted as price changes.

10.90 It is possible to combine unit values or unit value indices with prices collected in the XMPI survey in the form of a hybrid index as discussed in Chapter 2. For example, one elementary aggregate may consist of three commodities where a unit value is used for the first commodity, while sampled prices are used for the other two. It may also be the case that a unit value index constitutes an elementary index on its own. It can then be aggregated into higher-level indices with other elementary indices, whether these are based on sampled prices or unit value indices.

B.9 Formulas applicable to electronic data

10.91 Respondents may well have computerized management accounting systems that include highly detailed data on sales in terms of both prices and quantities. Their primary advantages are that the number of price observations can be significantly larger and that both price and quantity information are available in real time. Much work has been undertaken on the use of scanner data as an emerging data source for consumer price index (CPI) compilation and there are parallels for XMPIs, whenever establishments have detailed electronic files on individual transactions. There are a large number of practical considerations, which are discussed and referenced in the CPI Manual (ILO and others, 2004a) and also in Chapter 7, Section D, of this Manual, but it is relevant to discuss briefly here a possible index number formula that may be applicable if electronic data are collected and used in XMPI compilation.

10.92 The existence of quantity and trade value information at the detailed transaction level increases the ability to estimate price changes accurately. It means that traditional index number approaches such as Laspeyres and Paasche can be used, and that superlative formulas such as the Fisher and Törnqvist-Theil indices can also be derived in real time. The main observation made here is that because price and quantity information are available for each period, it may be tempting to produce monthly or quarterly chained indices using one of the ideal formulas mentioned above. However, the compilation of subannual chained indices has been found in some studies to be problematic because it often results in an upward bias referred to as “chain drift.”
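A minimal sketch may help to show what chain drift means. The data are hypothetical: one price is temporarily halved and then restored, and quantities are assumed to respond with a one-period lag, so that prices and quantities in the last period equal those in the first. The direction and size of the drift in this sketch follow from that assumed quantity response and should not be read as a general result.

```python
import math

def laspeyres(p0, q0, p1):
    """Laspeyres price index from period 0 to 1 (period 0 quantities)."""
    return sum(p * q for p, q in zip(p1, q0)) / sum(p * q for p, q in zip(p0, q0))

def paasche(p0, p1, q1):
    """Paasche price index from period 0 to 1 (period 1 quantities)."""
    return sum(p * q for p, q in zip(p1, q1)) / sum(p * q for p, q in zip(p0, q1))

def fisher(p0, q0, p1, q1):
    """Fisher index: geometric mean of Laspeyres and Paasche."""
    return math.sqrt(laspeyres(p0, q0, p1) * paasche(p0, p1, q1))

# Hypothetical prices and quantities for two commodities over four periods.
prices = [(1.0, 1.0), (0.5, 1.0), (1.0, 1.0), (1.0, 1.0)]
quantities = [(10, 10), (10, 10), (30, 5), (10, 10)]

# Chained index: product of the period-to-period Fisher links.
chained = 1.0
for s in range(1, len(prices)):
    chained *= fisher(prices[s - 1], quantities[s - 1], prices[s], quantities[s])

# Direct index comparing the last period with the first.
direct = fisher(prices[0], quantities[0], prices[-1], quantities[-1])

print(f"direct Fisher 0:3  = {direct:.4f}")
print(f"chained Fisher 0:3 = {chained:.4f}")
```

Although prices and quantities return to their starting values, so that the direct index equals 1, the chained index does not.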

C. Calculation of Higher-Level Indices

C.1 Target indices

10.93 A statistical office must have some target index at which to aim. Statistical offices have to consider what kind of index they would choose to calculate in the ideal hypothetical situation in which they have complete information about prices and quantities in both time periods compared. If the XMPI is meant to be an economic index, then a superlative index such as a Fisher, Walsh, or Törnqvist-Theil would have to serve as the theoretical target, because a superlative index may be expected to approximate the underlying economic index.

10.94 Many countries do not aim to calculate an economic index and prefer the concept of a basket index. A basket index measures the change in the total value of a given basket of goods and services between two time periods. This general category of index is described here as a Lowe index after the early 19th-century index number pioneer who first proposed this kind of index (see Chapter 16, Section D). The meaning of a Lowe index is clear and can be easily explained to users. It should be noted that, in general, there is no necessity for the basket to be the actual basket in one or other of the two periods compared. If the theoretical target index is to be a basket or Lowe index, the preferred basket might be one that attaches equal importance to the baskets in both periods—for example, the Walsh index.4 Thus, the same kind of index may emerge as the theoretical target on both the basket and the economic index approaches. In practice, however, a statistical office may prefer to designate the basket index that uses the actual basket in the earlier of the two periods as its target index on grounds of simplicity and practicality. In other words, the Laspeyres index may be a target index.

10.95 The theoretical target index is a matter of choice. In practice, it is likely to be either a Laspeyres or some superlative index. However, even when the target index is the Laspeyres, there may be a considerable gap between what is actually calculated and what the statistical office considers to be its target. It is now necessary to consider what statistical offices tend to do in practice.

C.1.1 Aggregation by product or activity classification

10.96 The elementary indices are aggregated into higher-level indices by use of a product or activity classification. When aggregating according to a product classification, such as the HS, each elementary aggregate is assigned a detailed product code. This enables the statistical office to aggregate the elementary indices into indices at successively higher levels of aggregation for product classes, groups, divisions, and so on. In the same way, the elementary indices can be aggregated into higher-level indices by type of activity, using, for example, the ISIC or the General Industrial Classification of Economic Activities within the European Communities.

10.97 It is up to the statistical office to decide which aggregation structure to follow. However, it is recommended that a standard international classification be applied. When national classifications or national variants of the international standards are used, they should allow for international comparisons, at least down to a fairly detailed level of aggregation. The aggregation structure should be consistent so that the weights at each level of aggregation are equal to the sum of their components.

10.98 In some instances the statistical office may wish to compile higher-level indices according to both a product and activity classification. Users may have interest in both kinds of aggregates or, for example, the national accounts may need deflators according to product or activity aggregates depending on the structure in the production of national accounts statistics.

10.99 One way to deal with this would be to assign both a product and activity code to each elementary aggregate, after which aggregation would provide higher-level indices according to both classifications. This may not always be possible in practice, however. The problem is that in general it is not possible to uniquely identify the originating industry of a product because the detailed product code may identify products originating from establishments in different industries. Thus, in order to aggregate elementary aggregates defined by a product classification (HS, for example) into higher-level indices according to some activity classification (ISIC, say), it will be necessary to have a key between the elementary aggregate product codes and the higher-level activity codes.

C.2 XMPIs as weighted averages of elementary indices

10.100 Section B discussed alternative formulas for combining individual price observations to calculate the first level of indices, the elementary aggregate indices. The next step in compiling the XMPI involves taking the elementary indices and combining them, using weights, to calculate successively higher levels of indices.

10.101 A higher-level index is an index for some trade aggregate above the level of elementary aggregates, including the overall XMPIs themselves. The second stage of compiling the XMPI does not involve individual prices or quantities. Instead, the higher-level indices are calculated by averaging the elementary indices using a set of predetermined weights. The inputs into the calculation of the higher-level indices are the elementary price indices and the weights of the elementary aggregates derived primarily from trade values data.

10.102 The weights typically remain fixed for a sequence of at least 12 months. Some countries revise their weights at the beginning of each year to approximate as closely as possible to current trade patterns. However, many countries continue to use the same weights for several years. The weights may be changed only every five years or so. Owing to the volatility of international trade, it is expected that the weights for XMPIs would be updated more often than those for the CPI or producer price index. The use of fixed weights has the considerable practical advantage that the index can make repeated use of the same weights, which saves both time and money. Revising the weights can be both time consuming and costly, but is necessary to ensure the weights remain relevant. Resources permitting, updating the weights annually is preferable.

10.103 The higher-level indices are calculated as the trade share–weighted arithmetic average of the elementary price indices. The formula can be written as follows:

$$P^{0:t} = \sum_j w_j^b P_j^{0:t}, \qquad \sum_j w_j^b = 1, \tag{10.10}$$

where $P^{0:t}$ denotes the overall XMPI, or any higher-level index, from period 0 to t; $w_j^b$ is the weight attached to each of the elementary price indices; and $P_j^{0:t}$ is the corresponding elementary price index identified by the subscript j. As already noted, a higher-level index is any index, including the overall XMPIs, above the elementary aggregate level. The weights are derived from trade values in period b, which in practice precedes period 0, the price reference period.

10.104 Equation (10.10) applies at each level of aggregation above the elementary aggregate level. The index is additive. This means that any higher-level index can be calculated in one step as the weighted arithmetic average of the elementary indices of which it consists, or by weighting together the indices at the intermediate level, with the same result. For example, a higher-level index at the two-digit level may be calculated by weighting together the elementary indices or the three-digit-level indices of which it consists.

10.105 Provided the elementary aggregate indices are calculated using a transitive formula such as the Jevons or Dutot, but not the Carli, and provided that there are no new or disappearing commodities from period 0 to t, equation (10.10) is equivalent to

$$P^{0:t} = \sum_j w_j^b P_j^{0:t-1} P_j^{t-1:t}, \qquad \sum_j w_j^b = 1. \tag{10.11}$$

The advantage of this version of the index is that it allows the sampled commodities within the elementary price index from t – 1 to t to differ from the sampled commodities in the periods from 0 to t – 1. Hence, it allows replacement commodities and new commodities to be linked into the index from period t – 1 without the need to estimate a price for period 0, as explained in Section B.5. For example, if one of the sampled commodities in periods 0 and t – 1 is no longer available in period t, and the price of a replacement commodity is available for both t – 1 and t, the new replacement commodity can be included in the index using the overlap method.

10.106 Note also from equation (10.11) that an elementary or lower-level index from t – 1 to t enters into the higher-level index not by its weight, but by the weight multiplied by the price development up to period t – 1. In order to calculate the rate of change of the higher-level index from t – 1 to t it is necessary to update the weights to reflect the price changes that have taken place from period 0 to period t – 1.

10.107 It is useful to recall that three kinds of reference periods may be distinguished:

  • Weight Reference Period: The period covered by the trade value statistics used to calculate the weights. Usually, the weight reference period is a year.

  • Price Reference Period: The period whose prices are used as denominators in the index calculation.

  • Index Reference Period: The period for which the index is set to 100.

10.108 The three periods are generally different. For example, an XMPI might have 1998 as the weight reference year, December 2002 as the price reference month, and the year 2000 as the index reference period. The weights typically refer to a whole year, or even two or three years, whereas the periods whose prices are compared are typically months or quarters. The weights are usually compiled from trade value statistics already collected some time before the price reference period. For these reasons, the weight and the price reference periods are invariably separate periods in practice.

10.109 The index reference period is often a year, but it could be a month or some other period. An index series may also be re-referenced to another period by simply dividing the series by the value of the index in that period, without changing the rate of change of the index. The expression “base period” can mean any of the three reference periods and can sometimes be quite ambiguous. “Base period” should be used only when it is absolutely clear in context exactly which period is referred to.

10.110 Table 10.8 illustrates the calculation of higher-level indices. The index consists of five elementary aggregate indices (A–E), which are calculated using one of the formulas presented in Section B, and two intermediate higher-level indices, G and H. The overall index (Total) and the higher-level indices (G and H) are all calculated using equation (10.10). For example, the overall index for April can be calculated from the two intermediate higher-level indices of April as

$$P^{Jan:Apr} = 0.6 \times 103.92 + 0.4 \times 101.79 = 103.06,$$

or directly from the five elementary indices

$$P^{Jan:Apr} = 0.2 \times 108.75 + 0.25 \times 100 + 0.15 \times 104 + 0.1 \times 107.14 + 0.3 \times 100 = 103.06.$$

Table 10.8. The Aggregation of the Elementary Price Indices

[Table image not reproduced.]
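The additivity of equation (10.10) can be verified with a short sketch using the April figures quoted above. The assignment of elementary aggregates A, B, and C to group G and of D and E to group H is an inference from the quoted weights (0.6 and 0.4) and index values, not something stated in the text; the function name is illustrative.

```python
# Weights and April elementary indices as quoted in the text (January = 100).
weights = {"A": 0.20, "B": 0.25, "C": 0.15, "D": 0.10, "E": 0.30}
april = {"A": 108.75, "B": 100.0, "C": 104.0, "D": 107.14, "E": 100.0}

# Assumed grouping, inferred from the quoted group weights and indices.
groups = {"G": ["A", "B", "C"], "H": ["D", "E"]}

def aggregate(indices, weights, members):
    """Weighted arithmetic average of the member indices, equation (10.10)."""
    total_weight = sum(weights[m] for m in members)
    return sum(weights[m] * indices[m] for m in members) / total_weight

# Intermediate indices and their weights.
group_index = {g: aggregate(april, weights, m) for g, m in groups.items()}
group_weight = {g: sum(weights[m] for m in ms) for g, ms in groups.items()}

# Overall index in one step and in two steps.
total_direct = aggregate(april, weights, list(weights))
total_two_step = aggregate(group_index, group_weight, list(groups))

print({g: round(v, 4) for g, v in group_index.items()})
print(f"direct:   {total_direct:.4f}")
print(f"two-step: {total_two_step:.4f}")
```

Both routes give the same overall index, which is the sense in which the index is additive.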

C.3 Price updating of trade value weights

10.111 As already noted most, if not all, statistical offices calculate the higher-level indices by use of equation (10.10) or the equivalent equation (10.11). However, for the practical calculation of XMPIs the situation is complicated by the fact that the weight reference period usually precedes the price reference period and the duration of the weight reference period is typically much longer than the period to which the prices refer. The weights usually refer to the trade values over a year, or longer, whereas the price reference period is usually a month or a quarter in some later year. For example, a monthly index may be compiled from January 2003 onward with December 2002 as the price reference month, but the latest available weights during the year 2003 may refer to the year 2000, or even some earlier year.

10.112 This means that the statistical office has to decide if the weights should be re-referenced, or price-updated, from the weight reference period to the price reference period, or be applied as they stand without any price updating.

10.113 By price updating, the weights are aligned to the same reference period as the prices. If the statistical office decides to price-update the weights, the resulting index will be a Lowe index. The Lowe index is a fixed-basket index, which from period to period measures the value of the same (annual) basket of goods and services. It is defined as follows:

$$P_{Lo}^{0:t} = \frac{\sum_i p_i^t q_i^b}{\sum_i p_i^0 q_i^b}. \tag{10.12}$$

The individual quantities ($q_i^b$) in the weight reference period b make up the basket. The index measures the value of the period b basket in period t in relation to the value of the same basket in period 0. However, to be used in practice it is necessary to express the index as a function of value shares rather than individual quantities:

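$$P_{Lo}^{0:t} = \frac{\sum_i p_i^t q_i^b}{\sum_i p_i^0 q_i^b} = \frac{\sum_i p_i^0 q_i^b \,(p_i^t/p_i^0)}{\sum_i p_i^0 q_i^b} = \sum_i w_i^{b(0)} \frac{p_i^t}{p_i^0}, \tag{10.13}$$

$$w_i^{b(0)} = \frac{w_i^b\,(p_i^0/p_i^b)}{\sum_i w_i^b\,(p_i^0/p_i^b)}, \qquad w_i^b = \frac{p_i^b q_i^b}{\sum_i p_i^b q_i^b}.$$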

The individual price ratios are weighted together with their hybrid value shares, $w_i^{b(0)}$, that is, the period b quantities valued at period 0 prices. The hybrid shares are calculated by price updating the period b value shares from period b to 0. By price updating, the quantities are implicitly kept constant and the value shares are allowed to change according to the development in the relative prices. The practical counterpart of (10.13) is

$$P_{Lo}^{0:t} = \sum_i w_i^{b(0)} p_i^{0:t}, \qquad w_i^{b(0)} = \frac{w_i^b\, p_i^{b:0}}{\sum_i w_i^b\, p_i^{b:0}}. \tag{10.14}$$

Equation (10.14) shows that in practice the higher-level indices are calculated by weighting together the elementary aggregate indices by their price-updated weights. The price-updated weights are calculated by multiplying the original period b value shares by their elementary indices from period b to period 0 and rescaling to sum to unity.

10.114 The Lowe index is not a Laspeyres index, as the weight and price reference periods do not coincide. However, it reduces to the Laspeyres index when b = 0 and to the Paasche index when b = t. Further, the Lowe index can be expressed as the ratio of two Laspeyres indices, one from b to 0 and one from b to t:

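$$P_{Lo}^{0:t} = \frac{\sum_i p_i^t q_i^b}{\sum_i p_i^0 q_i^b} = \frac{\sum_i p_i^t q_i^b}{\sum_i p_i^b q_i^b} \Big/ \frac{\sum_i p_i^0 q_i^b}{\sum_i p_i^b q_i^b}. \tag{10.15}$$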

From equation (10.15) it follows that the Lowe index from period 0 to t will show the same rate of change as a Laspeyres price index with period b as weight and price reference period. In other words, price updating the weights from b to 0 means that the index will show the same rate of change as if the weights had been applied from period b.
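The equivalence of the basket form (10.12), the price-updated share form (10.13) and (10.14), and the ratio of two Laspeyres indices (10.15) can be checked numerically. The following is a minimal sketch with hypothetical prices and quantities for three commodities; all variable names are illustrative.

```python
# Hypothetical data: weight reference period b, price reference period 0, period t.
pb = [10.0, 20.0, 5.0]     # prices in period b
qb = [100.0, 50.0, 200.0]  # quantities in period b (the fixed basket)
p0 = [12.0, 19.0, 5.0]     # prices in period 0
pt = [15.0, 19.0, 6.0]     # prices in period t

def basket_value(prices, quantities):
    return sum(p * q for p, q in zip(prices, quantities))

# Equation (10.12): value of the period b basket at period t and period 0 prices.
lowe_basket = basket_value(pt, qb) / basket_value(p0, qb)

# Period b value shares.
wb = [p * q / basket_value(pb, qb) for p, q in zip(pb, qb)]

# Price-update the shares from b to 0 and rescale to sum to unity.
updated = [w * (a / b) for w, a, b in zip(wb, p0, pb)]
wb0 = [u / sum(updated) for u in updated]

# Equation (10.13): price ratios weighted by the hybrid shares.
lowe_shares = sum(w * (c / a) for w, a, c in zip(wb0, p0, pt))

# Equation (10.15): ratio of two Laspeyres indices with base period b.
laspeyres_b_t = basket_value(pt, qb) / basket_value(pb, qb)
laspeyres_b_0 = basket_value(p0, qb) / basket_value(pb, qb)
lowe_ratio = laspeyres_b_t / laspeyres_b_0

print(f"basket form:       {lowe_basket:.6f}")
print(f"hybrid-share form: {lowe_shares:.6f}")
print(f"Laspeyres ratio:   {lowe_ratio:.6f}")
```

All three calculations return the same value, which is why the share form can be used in practice when individual quantities are not available.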

10.115 Because it uses the fixed basket of an earlier period, the Lowe index is sometimes loosely described as a “Laspeyres-type” index, but this description is unwarranted. A true Laspeyres index would require the basket to be that purchased in the price reference month, whereas in most XMPIs the basket refers to a period different from the price reference month. When the weights are annual and the prices are monthly, it is not possible, even retrospectively, to calculate a monthly Laspeyres price index.

10.116 The statistical office may decide instead to calculate the higher-level indices without price updating the weights. This corresponds to the calculation of a share-weighted arithmetic mean of the price ratios:

$$P_{Yo}^{0:t} = \sum_i w_i^b \left(\frac{p_i^t}{p_i^0}\right), \qquad w_i^b = \frac{p_i^b q_i^b}{\sum_i p_i^b q_i^b}. \tag{10.16}$$

10.117 This general type of index is described here as a Young index. The index is general in the sense that the shares are not restricted to refer to any particular period, but may refer to any period or an average of different periods, for example. The index is named after another 19th-century index number pioneer who advocated this type of index (see Chapter 16, Section D). In practice it is calculated simply by weighting together the elementary indices from 0 to t by their value shares as they stand without price updating:

$$P_{Yo}^{0:t} = \sum_i w_i^b\, p_i^{0:t}. \tag{10.17}$$

10.118 The Young index is a fixed-weight index in which the focus is that the weights should be as representative as possible for the average value shares for the period in which the weights are used in the calculation of the index. A fixed-weight index is not necessarily a fixed-basket index, that is, it does not necessarily measure the change in the value of a fixed basket such as the Lowe index. The Young index measures the development in the trade value from period 0 to t with the period b trade value shares. This does not correspond to the changing value of any actual basket, unless the trade shares have remained unchanged from b to 0. In the special case where b equals 0, the Young index reduces to the Laspeyres index.
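A minimal sketch of equation (10.16) follows, using hypothetical prices and quantities. It also checks the special case noted above: when the shares are those of period 0, the Young index coincides with the Laspeyres index. The function names are illustrative.

```python
def value_shares(prices, quantities):
    """Value shares p*q / sum(p*q) for one period."""
    values = [p * q for p, q in zip(prices, quantities)]
    total = sum(values)
    return [v / total for v in values]

def young(weights, p0, pt):
    """Share-weighted arithmetic mean of price ratios, equation (10.16)."""
    return sum(w * (c / a) for w, a, c in zip(weights, p0, pt))

def laspeyres(p0, q0, pt):
    """Laspeyres price index with period 0 as weight and price reference period."""
    return sum(c * q for c, q in zip(pt, q0)) / sum(a * q for a, q in zip(p0, q0))

# Hypothetical data for three commodities.
pb, qb = [10.0, 20.0, 5.0], [100.0, 50.0, 200.0]  # weight reference period b
p0, q0 = [12.0, 19.0, 5.0], [90.0, 55.0, 210.0]   # price reference period 0
pt = [15.0, 19.0, 6.0]                            # current period t

young_b = young(value_shares(pb, qb), p0, pt)  # shares from period b, as they stand
young_0 = young(value_shares(p0, q0), p0, pt)  # special case b = 0
laspeyres_0 = laspeyres(p0, q0, pt)

print(f"Young with period b shares: {young_b:.6f}")
print(f"Young with period 0 shares: {young_0:.6f}")
print(f"Laspeyres (period 0):       {laspeyres_0:.6f}")
```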

10.119 Note that even if it is decided to price-update the weights and calculate a fixed-basket or Lowe index, the index is calculated in the form of a Young index, namely as a share-weighted arithmetic average of the elementary indices. Thus, a Lowe index is equal to a Young index in which the weights are hybrid value shares obtained by revaluing the period b quantities at the prices of the price reference month.

10.120 The issues involved are explained with the help of a numerical example in Table 10.9. The base period b is assumed to be the year 2000 so that the weights are the trade value shares in 2000. In the upper half of the table, 2000 is also used as the price reference period. However, in practice, weights based on 2000 cannot be introduced until after 2000 because of the time needed to collect and process the trade value data. In the lower half of the table, it is assumed that the 2000 weights are introduced in December 2002, and that this is also chosen as the new price reference base.

Table 10.9. Price Updating of Weights Between Weight and Price Reference Periods

[Table image not reproduced.]

10.121 If the statistical office decides to use the quantities, the trade weights of 2000 have to be adjusted for the relative price changes from 2000 to December 2002 in order to preserve the quantities. This means, in practice, that the value weights of 2000 have to be multiplied by the development in their elementary aggregate indices from 2000 to December 2002, and rescaled to sum to unity. This is illustrated in the lower half of Table 10.9, where the price-updated weights are labeled w00(Dec02).

10.122 The resulting index with price-updated weights in the lower part of Table 10.9 is a basket, or Lowe index, in which the quantities are those of 2000. The index can be expressed as ratios of the indices in the upper part of the table. For example, the overall Lowe index for March 2003 with December 2002 as its price reference base, namely 101.97, is the ratio of the index for March 2003 based on 2000 shown in the upper part of the table, namely 106.05, divided by the index for December 2002 based on 2000, namely 104.00. Thus, the price updating preserves the movements of the indices in the upper part of the table while shifting the price reference period to December 2002.

Table 10.10. Calculation of a Chained Index

[Table image not reproduced.]

10.123 On the other hand, it could be decided to calculate a series of Young indices using the trade value weights from 2000 as they stand without price updating. If the trade value shares were actually to remain constant, the quantities would have had to move inversely with the prices between 2000 and December 2002. As the quantities are kept constant in the price-updated Lowe index, the movements of the two indices usually will be different. In the special case where the relative prices remain unchanged from the weight to the price reference period, the price-updated weights will be unchanged and the Young and the Lowe indices will give the same result.

10.124 It is up to the statistical offices to decide for themselves whether to price-update the trade value shares or not. If the primary aim is to compile an XMPI measuring the price development of an actual past fixed basket of goods and services, the weights should be price-updated. The resulting fixed-basket, or Lowe, index will provide a good estimate of the price development if quantities tend to remain constant.

10.125 If the statistical office considers that the value shares of the weight reference period are the better estimates of the relative importance of commodities imported/exported, this may be an argument for applying the value shares as they stand and omitting price updating. The case for using value shares is stronger if they remain relatively constant over time or the weights are frequently updated.

10.126 The likely bias of the Young and Lowe indices depends on the target of the XMPI and the development in relative prices and trade shares, that is, whether the target of the statistical office is a Laspeyres index or an economic, superlative index, such as the Fisher, Törnqvist, or Walsh index. The factors determining the differences between the Lowe index and the Laspeyres index, and the Young index and the Laspeyres index, are outlined in Chapter 16, as are those determining the difference between Laspeyres and Paasche, and therefore Laspeyres and Fisher. Because both quantities and trade shares change through time, and by progressively larger amounts, the indices are likely to drift further apart the longer the same weights remain in use. Thus, whether the weights are price-updated or not, they should be reviewed and updated frequently to reduce potential bias.

10.127 Price updating the weights does not imply that the resulting trade value weights are necessarily more up to date. When there is a strong inverse relation between movements of price and quantities, price updating on its own could produce perverse results. For example, the price of computers has been declining rapidly in recent years. If the quantities are held fixed while the prices are updated, the resulting trade value on computers would also decline rapidly. In practice, however, the share of trade value on computers might actually be rising because of a very rapid increase in quantities of computers purchased.
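The computer example can be made concrete with a small sketch. All figures are hypothetical: computers are assumed to account for 20 percent of trade value in the weight reference period, their price is assumed to halve, and the quantity traded is assumed to triple, while other commodities show a modest price rise and unchanged quantities.

```python
# Hypothetical trade value shares in the weight reference period b.
share_b = {"computers": 0.20, "other": 0.80}

# Assumed price and quantity developments from period b to period 0.
price_index = {"computers": 0.50, "other": 1.10}
quantity_index = {"computers": 3.00, "other": 1.00}

def rescale(values):
    """Rescale a set of values so that they sum to unity."""
    total = sum(values.values())
    return {k: v / total for k, v in values.items()}

# Price updating holds quantities fixed and revalues them at period 0 prices.
price_updated = rescale({k: share_b[k] * price_index[k] for k in share_b})

# Actual period 0 shares reflect both the price and the quantity changes.
actual = rescale({k: share_b[k] * price_index[k] * quantity_index[k]
                  for k in share_b})

for k in share_b:
    print(f"{k:10s} original {share_b[k]:.3f}  "
          f"price-updated {price_updated[k]:.3f}  actual {actual[k]:.3f}")
```

Under these assumptions the price-updated share of computers falls to about 10 percent while the actual share rises to about 25 percent, so price updating moves the weight in the wrong direction.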

10.128 When rapid changes take place in relative quantities as well as relative prices, statistical offices are effectively obliged to change their trade value weights more frequently. Price updating on its own cannot cope with this situation. The trade value weights have to be updated with respect to their quantities as well as their prices, which, in effect, implies collecting new trade value data.

C.4 Factoring the Young index

10.129 It is possible to calculate the change in a higher-level Young index between two consecutive periods, such as t – 1 and t, as a weighted average of the individual price indices between t – 1 and t, provided that the weights are updated to take into account the price changes between the price reference period 0 and the previous period, t – 1. This makes it possible to factor equation (10.10) into the product of two component indices in the following way:

$$P^{0:t} = \sum_j w_j^b P_j^{0:t} = P^{0:t-1}\, \frac{\sum_j w_j^b P_j^{0:t-1} P_j^{t-1:t}}{\sum_j w_j^b P_j^{0:t-1}} = P^{0:t-1} \sum_j w_j^{b(t-1)} P_j^{t-1:t}, \tag{10.18}$$

where

$$w_j^{b(t-1)} = \frac{w_j^b P_j^{0:t-1}}{\sum_j w_j^b P_j^{0:t-1}},$$

and $P^{0:t-1}$ is the Young index for period t – 1. The weight $w_j^{b(t-1)}$ is the original weight for elementary aggregate j, price-updated by multiplying it by the elementary price index for j between 0 and t – 1, and rescaled to sum to unity. The price-updated weights are hybrid weights because they implicitly revalue the quantities of period b at the prices of t – 1 instead of at the average prices of b. Such hybrid weights do not measure the actual trade value shares of any period.

10.130 The Young index for period t can thus be calculated by multiplying the already calculated Young index for t – 1 by a separate Young index between t – 1 and t with hybrid price-updated weights. In effect, the higher-level index is calculated as a chained index. This method gives more flexibility to introduce replacement commodities and makes it easier to monitor the movements of the recorded prices for errors, because month-to-month movements are smaller and less variable than the cumulative changes since the price reference period.
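The factoring in equation (10.18) can be verified with a minimal sketch. The weights and elementary indices below are hypothetical, and the elementary indices are assumed to be transitive so that the index from 0 to t equals the index from 0 to t – 1 multiplied by the index from t – 1 to t.

```python
# Hypothetical weights and elementary indices for three elementary aggregates.
wb = {"A": 0.5, "B": 0.3, "C": 0.2}
idx_0_tm1 = {"A": 1.10, "B": 0.95, "C": 1.20}  # elementary indices, 0 to t-1
idx_tm1_t = {"A": 1.02, "B": 1.00, "C": 0.97}  # elementary indices, t-1 to t

# Young index for t-1, and for t calculated directly from period 0.
young_0_tm1 = sum(wb[j] * idx_0_tm1[j] for j in wb)
young_direct = sum(wb[j] * idx_0_tm1[j] * idx_tm1_t[j] for j in wb)

# Price-updated (hybrid) weights for period t-1, rescaled to sum to unity.
w_updated = {j: wb[j] * idx_0_tm1[j] / young_0_tm1 for j in wb}

# Short-term link from t-1 to t, chained onto the index for t-1.
link_tm1_t = sum(w_updated[j] * idx_tm1_t[j] for j in wb)
young_chained = young_0_tm1 * link_tm1_t

print(f"direct:  {young_direct:.6f}")
print(f"chained: {young_chained:.6f}")
```

The direct and chained calculations agree, which is what allows the index to be compiled month by month from the previous period's level.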

C.5 Change of index reference period

10.131 It is possible to re-reference an index series to another index reference period by dividing the series by the value of the index in the new reference period. Re-referencing changes only the period in which the index is equal to 100; it does not influence the rate of change of the index. Change of the index reference period may be useful for presentational purposes or, for example, if the index is supposed to be compared with other statistics that refer to a different period in time.

10.132 The issue of re-referencing can be illustrated by an example. Assume that an index ($P^{03:t}$) has been calculated with 2003 as reference year, but for some reason the statistical office wishes to re-reference the index to 2005. The re-referenced index with 2005 equal to 100 ($P^{05:t}$) can then be calculated by dividing the original index by its value in 2005:

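$$P^{05:t} = \frac{P^{03:t}}{P^{03:05}} = \frac{\sum_j w_j^{03} P_j^{03:t}}{\sum_j w_j^{03} P_j^{03:05}} = \frac{\sum_j w_j^{03} P_j^{03:05} P_j^{05:t}}{\sum_j w_j^{03} P_j^{03:05}} = \sum_j \frac{w_j^{03} P_j^{03:05}}{\sum_j w_j^{03} P_j^{03:05}}\, P_j^{05:t} = \sum_j w_j^{03(05)} P_j^{05:t}. \tag{10.19}$$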

The only difference between the two index series is a constant, namely the value of the original index in 2005. This means that the rates of change of the two index series are identical. Equation (10.19) also shows that a re-referenced index can be calculated by price updating the weights and re-referencing the elementary, or lower-level, indices to the same period in time. Thus, simultaneous price updating of the weights and re-referencing of the index series does not change the rate of change of the index, as also illustrated in the example in Table 10.9.
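Re-referencing can be sketched in a few lines. The annual index values below are hypothetical; only the reference years 2003 and 2005 come from the example in the text.

```python
# Hypothetical annual index series with 2003 = 100.
series_2003 = {2003: 100.0, 2004: 103.0, 2005: 108.0, 2006: 110.7, 2007: 115.0}

def rereference(series, new_reference):
    """Divide the series by its value in the new reference period (new period = 100)."""
    base = series[new_reference]
    return {t: 100.0 * value / base for t, value in series.items()}

def rates_of_change(series):
    """Period-to-period rates of change, in chronological order."""
    periods = sorted(series)
    return [series[b] / series[a] - 1.0 for a, b in zip(periods, periods[1:])]

series_2005 = rereference(series_2003, 2005)

for year in sorted(series_2003):
    print(year, f"{series_2003[year]:7.2f}", f"{series_2005[year]:7.2f}")
```

The two series differ only by a constant factor, so their rates of change are identical.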

C.6 Introduction of new weights and chain linking

10.133 From time to time, the weights for the elementary aggregates have to be revised to ensure that they reflect current trade value shares and business activity. When new weights are introduced, the price reference period for the new index can be the last period of the old index, the old and the new indices being linked together at this point. The old and the new indices make a chained index.

10.134 The introduction of new weights is often a complex operation because it provides the opportunity to introduce new commodities, new samples, new data sources, new compilation practices, new elementary aggregates, new higher-level indices, or new classifications. These tasks are often undertaken simultaneously at the time of reweighting to minimize overall disruption to the time series and any resulting inconvenience to users of the indices.

10.135 In many countries reweighting and chaining are carried out about every five years, but some countries introduce new weights each year. However, chained indices do not have to be linked annually, and the linking may be done less frequently. The real issue is not whether to chain but how frequently to chain. Reweighting is inevitable sooner or later, as the same weights cannot continue to be used forever. Whatever the time frame, statistical offices have to address the issue of chain linking at some point. It is an inevitable and major task for index compilers.

C.6.1 Frequency of reweighting

10.136 It is reasonable to continue to use the same set of elementary aggregate weights as long as trade patterns at the elementary aggregate level remain fairly stable. However, over time new commodities are continually being introduced on the market while others drop out. Over the longer term, trade patterns are also influenced by several other factors. These include new export markets and new source countries for imported materials and semi-finished, finished, and capital inputs; rising incomes and standards of living; demographic changes in the structure of the population; changes in technology; and changes in tastes and preferences.

10.137 There is wide consensus that regular updating of weights—at least every five years, and more often if there is evidence of rapid changes in international trade patterns—is a sensible and necessary practice. In the European Union, for example, the regulation on short-term statistics requires member countries to update the weights at least every five years. Frequent (e.g., annual) updating of the weights and chain-linking can be costly to implement and maintain. On the other hand, annual chaining has the advantage that changes such as the inclusion of new goods can be introduced on a regular basis, although every index needs some ongoing maintenance, whether annually chained or not. XMPIs are advantageously placed with regard to frequently updating weights because the primary source of weighting information is from administrative customs sources. Because these data are reliable and timely, the recommendation is to update as frequently as resources allow, ideally annually, unless the pattern of trade is, and is expected to continue to be, relatively constant over time.

10.138 Both exporters and importers of certain types of commodities are strongly influenced by short-term fluctuations in the economy. For example, trade of cars, major durables, expensive luxuries, and so on may change drastically from year to year. In such cases, it may be preferable to base the weight on an average of two or more years’ trade value.

C.6.2 The calculation of a chained index

10.139 Assume that a series of fixed-weight Young indices have been calculated with period 0 as the price reference period, and that in a subsequent period, k, a new set of weights has to be introduced in the index. The new weights are likely to be based on data surveyed in a period prior to k, the weight reference period, and may, or may not, have been price-updated from the new weight reference period to period k. For ease of exposition these weights are denoted as $w_j^k$. A chained index is then calculated as

$$
P^{0:t} = P^{0:k} \sum_j w_j^k P_j^{k:t} = P^{0:k} P^{k:t}. \tag{10.20}
$$

10.140 There are several important features of a chained index:

  • The chained index formula allows weights to be updated and facilitates the introduction of new commodities and subindices and removal of obsolete ones.

  • To link the old and the new series, an overlapping period (k) is needed in which the index has to be calculated using both the old and the new set of weights.

  • A chained index may have two or more links. Between each link period, the index may be calculated as a fixed-weight index using equation (10.10) or any other index formula. The link period may be a month or a year, provided the weights and indices refer to the same period.

  • Chaining is intended to ensure that the individual indices on all levels show the correct development through time.

  • Chaining leads to nonadditivity. When the new series is chained onto the old as in equation (10.20), the higher-level indices cannot be obtained as the weighted arithmetic averages of individual indices using the new weights.5 Such results need to be carefully explained and presented.
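Equation (10.20) and the nonadditivity noted in the last bullet can be sketched as follows. All figures are hypothetical and are expressed as ratios rather than on a base of 100; `young_index` is an illustrative helper name.

```python
def young_index(weights, indices):
    """Weighted arithmetic mean of subindices; weights are shares summing to 1."""
    return sum(w * p for w, p in zip(weights, indices))


# Hypothetical figures, for illustration only.
old_weights = [0.6, 0.4]      # weights in use from period 0 to link period k
old_sub_0k = [1.30, 1.15]     # subindices from 0 to k on the old weights
new_weights = [0.5, 0.5]      # new weights introduced in period k
new_sub_kt = [1.02, 1.04]     # subindices from k to t

p_0k = young_index(old_weights, old_sub_0k)   # old index in the link period
p_kt = young_index(new_weights, new_sub_kt)   # new index with k as reference
p_0t = p_0k * p_kt                            # chained index, equation (10.20)

# Each subindex is chained in the same way...
chained_sub_0t = [a * b for a, b in zip(old_sub_0k, new_sub_kt)]
# ...but averaging the chained subindices with the new weights does not
# reproduce the chained higher-level index.
naive_average = young_index(new_weights, chained_sub_0t)

print(round(p_0t, 4), round(naive_average, 4))
```

The gap between the two printed values is the nonadditivity that has to be explained to users when chained series are published.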

10.141 An example of the calculation of a chain index is presented in Table 10.10. From 1998 to December 2002, the index is calculated with the year 1998 as weight and price reference period. From December 2002 onward, a new set of weights is introduced. The weights may refer to the year 2000, for example, and may or may not have been price-updated to December 2002. A new fixed-weight index series is then calculated with December 2002 as the price reference month. Finally, the new index series is linked onto the old index with 1998 = 100 by multiplication to get a continuous index from 1998 to March 2003.

Table 10.11. Calculation of a Chained Index Using Linking Coefficients

10.142 The chained higher-level indices in Table 10.10 are calculated as

$$
P^{98:t} = P^{98:Dec02} \sum_j w_j^{00(Dec02)} P_j^{Dec02:t}. \tag{10.21}
$$

Because of the lack of additivity, the overall chained index for March 2003 (129.07), for example, cannot be calculated as the weighted arithmetic mean of the chained higher-level indices G and H using the weights from December 2002.

C.6.3 Chaining indices using linking coefficients

10.143 Table 10.11 presents an example of chaining indices with new weights to the old reference period (1998 = 100). The linking can be done in several ways. As described above, one can take the current index on the new weights and multiply it by the old index level in the overlap month (December 2002). Alternatively, a linking coefficient can be calculated between the old and new series during the overlap period and this coefficient applied to the new index series to bring the index up to the level of the old series. The linking coefficient for keeping the old index reference period is the ratio of the old index in the overlap period to the new index for the same period. For example, the coefficient for the Total index is (124.90/100.00) = 1.2490. This coefficient is then applied to the Total index each month to convert it from a December 2002 reference period to the 1998 reference period. Note that a linking coefficient is needed for each index series that is being chained.

10.144 Another option is to change the index reference period at the time the new weights are introduced. In the current example, the statistical office can shift to a December 2002 reference period and link the old index to the new reference period. This is done by calculating the linking coefficient for each index as the ratio of the new index in the overlap period to the old index. For example, the coefficient for the Total index is (100.00/124.90) = 0.80064. This coefficient is applied to the old Total index series to bring it down to the level of the new index. Table 10.11 presents the linking coefficients and the resulting re-referenced price indices using the two alternative index reference periods—1998 or December 2002.
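The two linking coefficients for the Total index can be reproduced directly from the overlap values quoted in the text (124.90 on the old series, 100.00 on the new series in December 2002). In the sketch below, only those two overlap values come from the text; the later-month and earlier-month index values are hypothetical.

```python
old_overlap = 124.90   # old Total index in December 2002 (1998 = 100), from the text
new_overlap = 100.00   # new Total index in December 2002 (December 2002 = 100)

# Option 1: keep 1998 = 100 and lift the new series to the old level.
coef_keep_old_ref = old_overlap / new_overlap
# Option 2: move to December 2002 = 100 and bring the old series down.
coef_new_ref = new_overlap / old_overlap

# Hypothetical values, for illustration only.
new_series_later_month = 102.00    # on December 2002 = 100
old_series_earlier_month = 120.00  # on 1998 = 100

on_1998_reference = new_series_later_month * coef_keep_old_ref
on_dec02_reference = old_series_earlier_month * coef_new_ref

print(round(coef_keep_old_ref, 4), round(coef_new_ref, 5))
print(round(on_1998_reference, 2), round(on_dec02_reference, 2))
```

The two coefficients are reciprocals of each other, so either choice of reference period yields the same rates of change.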

C.6.4 Introduction of new elementary aggregates

10.145 First, consider the situation in which new weights are introduced and the index is chain linked in December 2002. The overall coverage of the XMPI is assumed to remain the same, but certain commodities have increased sufficiently in importance to merit recognition as new elementary aggregates. Possible examples are the introduction of new elementary aggregates for exports of mobile telephones or a new multinational company setting up a car factory for exports.

10.146 Consider the calculation of the new index from December 2002 onward, the new price reference period. The calculation of the new index presents no special problems and can be carried out using equation (10.20). However, if the weights are price-updated from, say, 2000 to December 2002, difficulties may arise because the elementary aggregate for mobile telephones did not exist before December 2002, so there is no price index with which to price-update the weight for mobile telephones. Prices for mobile telephones may have been recorded before December 2002, possibly within another elementary aggregate (communications equipment) so that it may be possible to construct a price series that can be used for price updating. Otherwise, price information from other sources such as business surveys, trade statistics, or industry sources may have to be used. If no information is available, then movements in the price indices for similar elementary aggregates may be used as proxies for price updating.

10.147 The inclusion of a new elementary aggregate means that the next and successive higher-level indices contain a different number of elementary aggregates before and after the linking. Therefore, the rate of change of the higher-level index whose composition has changed may be difficult to interpret. However, failing to introduce new goods or services for this reason would result in an index that does not reflect the actual dynamic changes taking place in the economy. If it is customary to revise the XMPI backward, then the prices of the new commodity and their weights might be introduced retrospectively. If the XMPI is not revised backward, however, little can be done to improve the quality of the chained index. In many cases, the addition of a single elementary aggregate is unlikely to have a significant effect on the next higher-level index into which it enters. If the addition of an elementary aggregate is believed to have a significant impact on the time series of the higher-level index, it may be necessary to discontinue the old series and commence a new higher-level index. These decisions can be made only on a case-by-case basis. In all cases information should be provided in the release to explain the changes taking place, and provisions for such updates should be explained in the XMPIs’ metadata.

C.6.5 Introduction of new, higher-level indices

10.148 It may be necessary to introduce a new, higher-level index in the overall XMPI. This situation may occur if the coverage of the XMPI is enlarged or the grouping of elementary aggregates is changed. It then needs to be decided what the initial value of the new higher-level index should be when it is included in the calculation of the overall XMPI. Take as an example the situation in Table 10.10 and assume that a new higher-level index from January 2003 has to be included in the index. The question is what should be the December 2002 value to which the new higher-level index is linked. There are two options.

  • Estimate the value in December 2002 that the new higher-level index would have had with 1998 as the price reference period, and link the new series from January 2003 onward to this value. This procedure will prevent any break in the index series.

  • Use 100 in December 2002 as the starting point for the new higher-level index. This simplifies the problem from a calculation perspective, although there remains the problem of explaining the index break to users.

In any case, major changes such as those just described should, so far as possible, be made in connection with the regular reweighting and chaining to minimize disruptions to the index series.

10.149 A final case to consider concerns classification change. For example, a country may decide to change from a national classification to an international one, such as ISIC. The changes in the composition of the aggregates within the XMPI may then be so large that it is not meaningful to link them. In such cases, it is recommended that the XMPI with the new classification should be calculated backward for at least one year so that consistent annual rates of change can be calculated.

C.6.6 Partial reweighting and introducing new goods

10.150 The weights for the elementary aggregates may be obtained from a variety of sources over a number of periods. Consequently, it may not be possible to introduce all the new weighting information at the same time. In some cases, it may be preferable to introduce new weights for some elementary aggregates as soon as possible after the information is received. This would be the case for introducing new goods (e.g., revolutionary goods, discussed in Chapter 9) into the index when these goods fall within the existing commodity structure of the index. The introduction of new weights for a subset of the overall index is known as partial reweighting.

10.151 As an example, assume there is a four-digit industry with three major commodities (A, B, and C) that were selected for the sample in 2000. From the trade value data for 2000, A had 50 percent of trade values, B had 35 percent, and C had 15 percent. From a review of trade values conducted for 2002, the statistical office discovers that C now has 60 percent of the trade value and A and B each have 20 percent. When the new weights are introduced into the index, the procedures discussed in Section C.6.2 for chaining the new index onto the old index can be used. For example, the new commodity weights for 2002 are used to calculate the index in an overlap month such as April 2003 with a base price reference period of December 2002. For May 2003, the index using the new commodity weights is again calculated and the change in the new index is then applied (linked) to the old industry-level index for April 2003 (with 2000 = 100) to derive the industry index for May 2003 (2000 = 100). The formula for this calculation is the following:

$$
P^{00:May03} = P^{00:Apr03} \times \left[ \frac{\sum_{j=1}^{n} w_j^{02} P_j^{Dec02:May03}}{\sum_{j=1}^{n} w_j^{02} P_j^{Dec02:Apr03}} \right]. \tag{10.22}
$$

10.152 Continuing with this example, assume the review of trade values finds that a new commodity (D) has a significant share of trade (perhaps 15 or 20 percent), and it is expected to continue growing in relative importance. The statistical office would use the same procedure for introducing the new commodity. In this case, the calculations for the new industry index in April and May would use all four commodities instead of the original three. The price change in the new sample is linked to the old index as in equation (10.22). The only difference will be that the summations are over m (four) commodities instead of n (three).

10.153 One could also make the same calculations using the linking coefficient approach discussed in Section C.6.3. The linking coefficient is derived by taking the ratio of the old industry index (2000 = 100) to the new industry index (December 2002 = 100) in the overlap period (April 2003):

$$
\text{Linking coefficient} = \frac{P^{00:Apr03}}{\sum_{j=1}^{m} w_j^{02} P_j^{Dec02:Apr03}}. \tag{10.23}
$$

The linking coefficient, computed for the overlap period only, is then applied each month to the new index to adjust it to the level of the old index with an index reference period of 2000.
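The two routes described for partial reweighting, linking the month-to-month change as in equation (10.22) or applying a linking coefficient as in equation (10.23), give the same result. The sketch below uses the 2002 trade value shares from the example (A and B 20 percent each, C 60 percent); the commodity indices and the old industry index level are hypothetical.

```python
def aggregate(weights, indices):
    """Weighted arithmetic mean over commodities keyed by name."""
    return sum(weights[name] * indices[name] for name in weights)


w02 = {"A": 0.20, "B": 0.20, "C": 0.60}       # 2002 shares, from the example

# Hypothetical commodity indices with December 2002 = 100.
new_apr03 = {"A": 101.0, "B": 102.0, "C": 104.0}
new_may03 = {"A": 101.5, "B": 102.0, "C": 105.0}

old_index_apr03 = 112.0                        # hypothetical old index, 2000 = 100

new_index_apr03 = aggregate(w02, new_apr03)
new_index_may03 = aggregate(w02, new_may03)

# Route 1: apply the change in the new index to the old level, equation (10.22).
linked_may03 = old_index_apr03 * new_index_may03 / new_index_apr03

# Route 2: compute the linking coefficient once in the overlap month,
# equation (10.23), and apply it to the new index each month.
linking_coefficient = old_index_apr03 / new_index_apr03
via_coefficient_may03 = linking_coefficient * new_index_may03

print(round(linked_may03, 4), round(via_coefficient_may03, 4))
```

Adding a fourth commodity D would only extend the dictionaries; the linking step itself is unchanged.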

10.154 Another issue is the weights to use for compiling the index for the commodity groups represented by A, B, C, and D. For example, if indices for commodities A and B are combined with indices for commodities X and Y to calculate a commodity group index, the new weights for A and B present a problem because they represent trade values in a more current period than do the weights for X and Y. Also, the indices have different price reference periods. If we had weights for commodities X and Y for the same period as for A and B, then we could use the same approach as just described for compiling the industry index. Lacking new commodity weights for X and Y means the statistical office will have to take additional steps. One approach to resolve this problem is to price-update the weights for commodities X and Y from 2000 to 2002 using the change in the respective price indices. Thus, the original weight for commodity X is multiplied by the change in prices between 2000 and 2002 (i.e., the ratio of the average price index of X in 2002 to the average price index of X in 2000). Then one can use the same base price reference period as for A and B so that the indices for commodities X and Y are each re-referenced to December 2002. The commodity group index can then be compiled for April 2003 using the new weights for all four commodities and their indices with December 2002 = 100. Once the April 2003 index is compiled on the December 2002 price reference period, then the linking coefficient using equation (10.23) can be calculated to adjust the new index level to that of the old index. Alternatively, the price change in the new commodity group index (December 2002 = 100) can be applied to the old index level each month as shown in equation (10.22).

10.155 As this example demonstrates, partial reweighting has particular implications for the practice of price updating the weights. Weighting information may not be available for some elementary aggregates at the time of reweighting. Thus, it may be necessary to consider price updating the old weights for those elementary aggregates for which no new weights are available. The weights for the latter may have to be price-updated over a long period, which, for reasons given earlier, may give rise to some index bias if relative quantities have changed inversely with the relative price changes. Data on both quantity and price changes for the old index weights should be sought before undertaking price updating alone. The disadvantage of partial reweighting is that the implicit quantities belong to different periods than do other components of the index, so that the composition of the basket is obscure and not well defined.

10.156 One may conclude that the introduction of new weights and the linking of a new series to the old series are not difficult in principle. The difficulties arise in practice when trying to align weight and price reference periods and when deciding whether higher-level indices comprising different elementary aggregates should be chained over time. It is not possible for this Manual to provide specific guidance on decisions such as these, but compilers should consider carefully the economic logic and statistical reliability of the resulting chained series and also the needs of users. To facilitate the decision process, one should give careful thought to these issues in advance during the planning of a reweighting exercise, paying particular attention to which indices are to be published.

C.6.7 Long- and short-term links

10.157 Consider a long-term chained index in which the weights are changed annually. In any given year, the current monthly indices when they are first calculated have to use the latest set of available weights, which cannot be those of the current year. However, when the weights for the year in question become available subsequently, the monthly indices can then be recalculated on the basis of the weights for the same year. The resulting series can then be used in the long-term chained index rather than the original indices first published. Thus, the movements of the long-term chained index from, say, any one December to the following December, are based on weights of that same year, the weights being changed each December.6

10.158 Assume that each link runs from December to December. The long-term index for month m of year Y with December of year 0 as index reference period is then calculated by the formula7

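$$
P^{Dec0:mY} = \left( \prod_{y=1}^{Y-1} P^{Dec\,y-1:Dec\,y} \right) P^{Dec\,Y-1:mY}
= P^{Dec0:Dec1} \times P^{Dec1:Dec2} \times \cdots \times P^{Dec\,Y-2:Dec\,Y-1} \times P^{Dec\,Y-1:mY}. \tag{10.24}
$$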

The long-term movement of the index depends on the long-term links only as the short-term links are successively replaced by their long-term counterparts. For example, let the short-term indices for January to December 2001 be calculated as

$$
P^{Dec00:m01} = \sum_j w_j^{00(Dec00)} P_j^{Dec00:m01}, \tag{10.25}
$$

where $w_j^{00(Dec00)}$ are the weights from 2000 price-updated to December 2000. When the weights for 2001 become available, this is replaced by the long-term link

$$
P^{Dec00:Dec01} = \sum_j w_j^{01(Dec00)} P_j^{Dec00:Dec01}, \tag{10.26}
$$

where $w_j^{01(Dec00)}$ are the weights from 2001 price-backdated to December 2000. The same set of weights from 2001, price-updated to December 2001, is used in the new short-term link for 2002,

$$
P^{Dec01:m02} = \sum_j w_j^{01(Dec01)} P_j^{Dec01:m02}. \tag{10.27}
$$
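The structure of equation (10.24), a product of December-to-December long-term links followed by one short-term link for the current year, can be sketched as follows. The link values are hypothetical and `long_term_index` is an illustrative helper name; the sketch also shows how a provisional December link is replaced once current-year weights become available.

```python
from math import prod


def long_term_index(december_links, short_term_link, reference=100.0):
    """Equation (10.24): product of December-to-December links times the
    short-term link from the latest December to the current month."""
    return reference * prod(december_links) * short_term_link


# Hypothetical long-term links (ratios): Dec 0 -> Dec 1 and Dec 1 -> Dec 2.
final_links = [1.030, 1.020]

# Short-term link from Dec 2 to month m of year 3, on the latest available weights.
index_month_m = long_term_index(final_links, 1.010)

# The December-to-December movement for year 3 is first published on the old
# weights and later recalculated with weights for year 3 itself.
provisional_dec_link = 1.025   # hypothetical, on previous weights
revised_dec_link = 1.022       # hypothetical, on current-year weights
provisional_dec3 = long_term_index(final_links, provisional_dec_link)
revised_dec3 = long_term_index(final_links, revised_dec_link)

print(round(index_month_m, 4), round(provisional_dec3, 4), round(revised_dec3, 4))
```

Only the revised December link is carried forward into the long-term chain, which is the source of the revisions to first-published figures.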

10.159 With this method, the movement of the long-term index is determined by contemporaneous weights that refer to the same period. The method is conceptually attractive because the weights that are most relevant for most users are those based on trade patterns at the time the price changes actually take place. The method takes the process of chaining to its logical conclusion, at least assuming the indices are not chained more frequently than once a year. Because the method uses weights that are continually revised to ensure that they are representative of current trade patterns, the resulting index also largely avoids the substitution bias that occurs when the weights are based on the trade patterns of some period in the past. The method may therefore appeal to statistical offices whose objective is to estimate an economic index.

10.160 Finally, it may be noted that the method involves some revision of the index first published. In some countries, there is opposition to revising an XMPI once it has been first published, but it is standard practice for other economic statistics, including the national accounts, to be revised as more up-to-date information becomes available. This point is considered below.

C.7 Decomposition of index changes

10.161 Users of the index are often interested in how much of the change in the overall index is attributable to the change in the price of some particular commodity or group of commodities, such as petroleum or agricultural goods. Alternatively, there may be interest in what the index would be if agricultural goods or petroleum were left out. Questions of this kind can be answered by decomposing the change in the overall index into its constituent parts.

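10.162 Assume that the index is calculated as in equation (10.10) or (10.11). The relative change of the index from t – m to t can then be written as

$$
\frac{P^{0:t}}{P^{0:t-m}} - 1 = \frac{\sum_j w_j^b P_j^{0:t-m} P_j^{t-m:t}}{\sum_j w_j^b P_j^{0:t-m}} - 1. \tag{10.28}
$$

Hence, a subindex from t – m to t enters the higher-level index with a weight of

$$
\frac{w_j^b P_j^{0:t-m}}{\sum_j w_j^b P_j^{0:t-m}} = \frac{w_j^b P_j^{0:t-m}}{P^{0:t-m}}. \tag{10.29}
$$

The effect on the higher-level index of a change in a subindex can then be calculated as

$$
\text{Effect} = \frac{w_j^b P_j^{0:t-m}}{P^{0:t-m}} \left( \frac{P_j^{0:t}}{P_j^{0:t-m}} - 1 \right) = \frac{w_j^b}{P^{0:t-m}} \left( P_j^{0:t} - P_j^{0:t-m} \right). \tag{10.30}
$$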

With m = 1, equation (10.30) gives the effect of a monthly change; with m = 12, it gives the effect of the change over the past 12 months.

10.163 If the index is calculated as a chained index, as in equation (10.20), then a subindex from t – m to t enters the higher-level index with a weight of

$$
\frac{w_j^k P_j^{k:t-m}}{P^{k:t-m}} = \frac{w_j^k \left( P_j^{0:t-m} / P_j^{0:k} \right)}{P^{0:t-m} / P^{0:k}}. \tag{10.31}
$$

The effect on the higher-level index of a change in a subindex can then be calculated as

$$
\text{Effect} = \frac{w_j^k}{P^{k:t-m}} \left( P_j^{k:t} - P_j^{k:t-m} \right) = \frac{w_j^k}{P^{0:t-m} / P^{0:k}} \left( \frac{P_j^{0:t} - P_j^{0:t-m}}{P_j^{0:k}} \right). \tag{10.32}
$$

It is assumed that t – m lies in the same link (i.e., t – m refers to a period at or later than k). If the effect of a subindex on a higher-level index is to be calculated across a chain, the calculation needs to be carried out in two steps, one with the old series up to the link period and one from the link period to period t.

10.164 Table 10.12 illustrates an analysis using both the percentage index point effect and contribution of each component index to the overall 12-month change. The next-to-last column in Table 10.12 is calculated using equation (10.30) to derive the effect each component index contributes to the total percentage change. For example, for agriculture the index weight ($w_j^b$) is 38.73, which is divided by the previous period index ($P^{0:t-m}$), or 120.2, and then multiplied by the index point change ($P_j^{0:t} - P_j^{0:t-m}$) between January 2002 and January 2003, 10.5. The result shows that agriculture’s effect on the 9.1 percent overall change was 3.4 percent. The change in agriculture contributed 37.3 percent (3.4/9.1 × 100) to the total 12-month change.

Table 10.12. Decomposition of Index Change from January 2002 to January 2003
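The agriculture figure quoted in the text can be reproduced from equation (10.30), and a small made-up example shows that the effects of all components sum to the total percentage change. In the sketch below, the values 38.73, 120.2, and 10.5 come from the text; the three-component data set and the helper name `effect` are hypothetical.

```python
def effect(weight, prev_overall_index, index_point_change):
    """Equation (10.30): weight / overall index at t - m * change in the subindex.
    With the weight in percent, the result is in percentage points."""
    return weight / prev_overall_index * index_point_change


# Agriculture, using the values quoted in the text.
agriculture_effect = effect(38.73, 120.2, 10.5)
print(round(agriculture_effect, 1))

# Hypothetical three-component index to check that the effects add up.
weights_pct = [50.0, 30.0, 20.0]
sub_prev = [120.0, 110.0, 100.0]   # subindices at t - m
sub_curr = [132.0, 112.0, 101.0]   # subindices at t

overall_prev = sum(w / 100.0 * p for w, p in zip(weights_pct, sub_prev))
overall_curr = sum(w / 100.0 * p for w, p in zip(weights_pct, sub_curr))
total_change_pct = 100.0 * (overall_curr / overall_prev - 1.0)

effects = [effect(w, overall_prev, c - p)
           for w, p, c in zip(weights_pct, sub_prev, sub_curr)]
print(round(total_change_pct, 4), round(sum(effects), 4))
```

Dividing each effect by the total change and multiplying by 100 gives the percentage contributions shown in the last column of such a table.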

C.8 Some alternatives to fixed-weight indices

10.165 Monthly XMPIs are typically arithmetic weighted averages of the price indices for the elementary aggregates in which the weights are kept fixed over a number of periods, which may range from 12 months to many years. The repeated use of the same weights relating to some past period b simplifies calculation procedures and reduces data collection requirements and costs. Moreover, when the weights are known in advance of the price collection, the index can be calculated immediately after the prices have been collected and processed.

10.166 However, the longer the same weights are used, the less representative of current trade patterns they become, especially in periods of rapid technical change when new kinds of goods and services are continually appearing on the market and old ones are disappearing. This may undermine the credibility of an index that purports to measure the rate of change in the value of goods and services exported and imported. Similarly, if the objective is to compile an economic index, the continuing use of the same fixed basket is likely to become increasingly unsatisfactory the longer the same basket is used. The longer the same basket is used, the greater the bias in the index is likely to become.

10.167 There are several possible ways of minimizing, or avoiding, the potential biases from the use of fixed-weight indices. These are outlined below.

10.168 Annual chaining. One way to minimize the potential biases from the use of fixed-weight indices is to keep the weights and the base period as up to date as possible by frequent weight updates and chaining. A number of countries have adopted this strategy and revise their weights annually. In any case, as noted earlier, it would be impossible to deal with the changing universe of commodities without some chaining of the price series within the elementary aggregates, even if the weights attached to the elementary aggregates remain fixed. Annual chaining eliminates the need to choose a base period, because the weight reference period is always the previous year, or possibly the year before that.

10.169 Annual chaining with current weights. When the weights are changed annually, it is possible to replace the original weights based on the previous year, or years, by those of the current year if the index is revised retrospectively as soon as information on current-year trade value becomes available. The long-term movements in the XMPI are then based on the revised series. This is the method adopted by the Swedish Statistical Office for its CPI as explained in Section C.6.7. This method could provide unbiased results.

10.170 Other index formulas. When the weights are revised less frequently, say every five years, another possibility would be to use a different index formula for the higher-level indices instead of an arithmetic average of the elementary price indices. One possibility would be a weighted geometric average. This is not subject to the same potential upward bias as the arithmetic average. More generally, a weighted version of the Lloyd-Moulton formula, given in Section B.6, might be considered. This formula takes account of the substitutions that purchasers make in response to changes in relative prices and should be less subject to bias for this reason. It reduces to the geometric average when the elasticity of substitution is unity, on average. It is unlikely that such a formula could replace the arithmetic average in the foreseeable future and gain general acceptance, if only because it cannot be interpreted as measuring changes in the value of a fixed basket. However, it could be compiled on an experimental basis and might well provide a useful supplement to the main index. It could at least flag the extent to which the main index is liable to be biased and throw light on its properties.
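The relationship between the arithmetic average, the geometric average, and the Lloyd-Moulton form can be illustrated numerically. The sketch below assumes the usual Lloyd-Moulton expression, in which weighted price relatives are raised to the power 1 − σ, with σ the elasticity of substitution; the weights and price relatives are hypothetical, and the function names are illustrative.

```python
import math


def arithmetic_index(weights, relatives):
    """Weighted arithmetic mean of price relatives (Young-type average)."""
    return sum(w * r for w, r in zip(weights, relatives))


def geometric_index(weights, relatives):
    """Weighted geometric mean of price relatives."""
    return math.exp(sum(w * math.log(r) for w, r in zip(weights, relatives)))


def lloyd_moulton_index(weights, relatives, sigma):
    """Lloyd-Moulton form with elasticity of substitution sigma.
    sigma = 0 gives the arithmetic mean; sigma = 1 is the geometric limit."""
    if abs(sigma - 1.0) < 1e-12:
        return geometric_index(weights, relatives)
    exponent = 1.0 - sigma
    total = sum(w * r ** exponent for w, r in zip(weights, relatives))
    return total ** (1.0 / exponent)


# Hypothetical weights (summing to 1) and price relatives.
weights = [0.40, 0.35, 0.25]
relatives = [1.10, 0.95, 1.30]

for sigma in (0.0, 0.5, 1.0):
    print(sigma, round(lloyd_moulton_index(weights, relatives, sigma), 4))
```

As σ rises from 0 to 1 the index falls from the arithmetic to the geometric average, which gives a rough indication of how sensitive the main index is to the assumed degree of substitution.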

10.171 Retrospective superlative indices. Finally, it is possible to calculate a superlative index retrospectively. Superlative indices such as Fisher and Törnqvist-Theil treat both periods compared symmetrically and require trade value data for both periods. Although the XMPI may have to be based on weighting data of a past period when it is first published, it may be possible to estimate a superlative index later when much more information becomes available about producers’ trade values period by period. At least one office, the U.S. Bureau of Labor Statistics, is publishing such an index for its CPI. The publication of revised or supplementary indices raises matters of statistical policy, but users readily accept revisions in other fields of economic statistics.
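A retrospective superlative index needs value data for both periods compared. The sketch below computes Fisher and Törnqvist indices from hypothetical prices and quantities for two commodities; the data and function names are assumptions for illustration, and the formulas are the standard textbook definitions.

```python
import math


def laspeyres(p0, p1, q0):
    """Base-period quantities valued at current and base prices."""
    return sum(a * q for a, q in zip(p1, q0)) / sum(b * q for b, q in zip(p0, q0))


def paasche(p0, p1, q1):
    """Current-period quantities valued at current and base prices."""
    return sum(a * q for a, q in zip(p1, q1)) / sum(b * q for b, q in zip(p0, q1))


def fisher(p0, p1, q0, q1):
    """Geometric mean of the Laspeyres and Paasche indices."""
    return math.sqrt(laspeyres(p0, p1, q0) * paasche(p0, p1, q1))


def tornqvist(p0, p1, q0, q1):
    """Geometric mean of price relatives weighted by average value shares."""
    v0 = [p * q for p, q in zip(p0, q0)]
    v1 = [p * q for p, q in zip(p1, q1)]
    s0 = [v / sum(v0) for v in v0]
    s1 = [v / sum(v1) for v in v1]
    return math.exp(sum(0.5 * (a + b) * math.log(c / d)
                        for a, b, c, d in zip(s0, s1, p1, p0)))


# Hypothetical prices and quantities for two commodities.
p0, q0 = [10.0, 20.0], [5.0, 3.0]
p1, q1 = [12.0, 19.0], [4.0, 4.0]

print(round(fisher(p0, p1, q0, q1), 4), round(tornqvist(p0, p1, q0, q1), 4))
```

Both formulas treat the two periods symmetrically, and on data like these they typically lie close to each other and between the Laspeyres and Paasche results.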

D. Data Editing

10.172 This chapter has been concerned with the methods used by statistical offices to calculate their XMPIs. This concluding section considers the data editing carried out by statistical offices, a process closely linked to the calculation of the price indices for the elementary aggregates. Data collection, recording, and coding—the data capture processes—are dealt with in Chapter 6. The next step in the production of price indices is the data editing. Data editing is here meant to comprise two steps:

  • Detecting possible errors and outliers, and

  • Verifying and correcting data.

10.173 Logically, the purpose of detecting errors and outliers is to exclude errors or the effects of outliers from the index calculation. Errors may be falsely reported prices, or they may be caused by recording or coding mistakes. Also, missing prices because of nonresponse may be dealt with as errors. Possible errors and outliers are usually identified as observations that fall outside some prespecified acceptance interval or are judged to be unrealistic by the analyst on some other ground. It may also be the case, however, that even if an observation is not identified as a potential error, it may actually turn out to be false. Such observations are sometimes referred to as inliers. On the other hand, the sampling may have captured an exceptional price change that falls outside the acceptance interval but has been verified as correct. In some discussions of survey data, any extreme value is described as an outlier. The term is reserved here for extreme values that have been verified as being correct.

10.174 When a possible error has been identified, whether it is in fact an error or not needs to be verified. This can usually be accomplished by asking the respondent to verify the price, or by comparing it with the price change of similar commodities. If it is an error, it needs to be corrected. This can be done easily if the respondent can provide the correct price or, where this is not possible, by imputation or by omitting the price from the index calculation. If the price proves to be correct, it should be included in the index. If it proves to be an outlier, it can be accepted or corrected according to a predefined practice, for example, omission or imputation.

10.175 Data editing involves two steps: the detection of possible errors and outliers, and the verification and correction of the data. Effective monitoring and quality control are needed to ensure the reliability of the basic price data fed into the calculation of the elementary price indices on which the quality of the overall index depends. However, extreme values arise for traded goods and services because price changes are undertaken infrequently. There is much theory and evidence on this. For example, cost-driven or exchange rate changes may not be immediately passed through to prices but stored up and delivered as a large price increase rather than a series of smaller ones. Harsh data editing may interpret such an increase as noise, rather than the signal of an actual price change. It is advised that automatic outlier detection routines be used in conjunction with a system that allows, at least for commodities with substantial trade, an external validation, say by phone contact with the establishment responsible. This is facilitated when the source of the price data is the establishment. However, unit value indices from customs data are by their nature volatile, and automatic outlier routines may well distort the results unless there is a follow-up procedure.
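A simple detection step of the kind described, in which observations outside a prespecified acceptance interval are flagged for verification rather than removed automatically, might look like the following sketch. The interval bounds, the data, and the function name `flag_for_review` are hypothetical; a statistical office would set its own limits, possibly by commodity group.

```python
def flag_for_review(prev_prices, curr_prices, lower=0.80, upper=1.25):
    """Return (item, reason) pairs for prices that need verification.

    An item is flagged when its current price is missing or when its price
    relative falls outside the acceptance interval [lower, upper].
    Flagged prices are not deleted; they are passed on for follow-up.
    """
    flagged = []
    for item, prev in prev_prices.items():
        curr = curr_prices.get(item)
        if curr is None:
            flagged.append((item, "missing price"))
            continue
        relative = curr / prev
        if not lower <= relative <= upper:
            flagged.append((item, "outside acceptance interval"))
    return flagged


# Hypothetical price observations for four items.
previous = {"a": 100.0, "b": 50.0, "c": 20.0, "d": 10.0}
current = {"a": 102.0, "b": 75.0, "c": 20.0}

review_list = flag_for_review(previous, current)
print(review_list)
```

Keeping detection separate from correction leaves room for the external validation step, so that a genuine large price change is not discarded as noise.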

10.176 Although the power of computers provides obvious benefits, not all of these activities have to be computerized. However, there should be a complete set of procedures and records that controls the processing of data, even though some or all of it may be undertaken without the use of computers. It is not always necessary for all of one step to be completed before the next is started. If the process uses spreadsheets, for example, with default imputations predefined for any missing data, the index can be estimated and reestimated whenever a new observation is added or modified. The ability to examine the impact of individual price observations on elementary aggregate indices and the impact of elementary indices on various higher-level aggregates is useful in all aspects of the computation and analytical processes.

10.177 It is neither necessary nor desirable to apply the same degree of scrutiny to all reported prices. The price changes recorded by some respondents carry more weight than do others, and statistical analysts should be aware of this. For example, one elementary aggregate with a weight of 2 percent, say, may contain 10 prices, whereas another elementary aggregate of equal weight may contain 100 prices. Obviously, an error in a reported price will have a much smaller effect in the latter, where it may be negligible, whereas in the former it may cause a significant error in the elementary aggregate index and even influence higher-level indices.
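The arithmetic behind this comparison can be sketched as follows. The sketch assumes, purely for illustration, that the elementary index is an unweighted geometric mean of price relatives and that one relative is misreported by a factor of 1.5. The function names and the error factor are hypothetical and are not taken from the Manual.

```python
def elementary_error_pct(n_prices, error_factor=1.5):
    """Percent error in a geometric-mean elementary index when one of
    n_prices price relatives is misreported by error_factor."""
    return 100.0 * (error_factor ** (1.0 / n_prices) - 1.0)


def higher_level_error_pct(n_prices, weight_share=0.02, error_factor=1.5):
    """Approximate percent effect on a higher-level index when the
    elementary aggregate carries weight_share of the total weight."""
    return weight_share * elementary_error_pct(n_prices, error_factor)


for n in (10, 100):
    print(n, round(elementary_error_pct(n), 2), round(higher_level_error_pct(n), 3))
```

Under these assumptions the same reporting error moves the 10-price aggregate by roughly 4 percent and the 100-price aggregate by roughly 0.4 percent, which is why scrutiny is better concentrated on thinly sampled, highly weighted aggregates.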

10.178 However, there may be an interest in the individual elementary indices as well as in the aggregates built from them. Because the sample sizes used at the elementary level may often be small, any price collected, and any error in it, may have a significant impact on the results for individual commodities or industries. The verification of reported data usually has to be done on an index-by-index basis, using statistical analysts’ experience. Analysts will also need the cooperation of the survey respondents to help explain unusual price movements.

10.179 Obviously, the design of the survey and questionnaires influences the occurrence of errors. Hence, price reports and questionnaires should be as clear and unambiguous as possible to prevent misunderstandings and errors. Whatever the design of the survey, it is important to verify that the data collected are those that were requested initially. The survey questionnaire should prompt the respondent to indicate if the requested data could not be provided. If, for example, a commodity is not produced anymore and thus is not priced in the current month, a possible replacement would be requested along with details of the extent of its comparability with the old one. If the respondent cannot supply a replacement, there are a number of procedures for dealing with missing data (see Chapter 8).

D.1 Identifying possible errors and outliers

10.180 One of the ways price surveys are different from other economic surveys is that, although prices are recorded, the measurement concern is with price changes. Because the index calculations consist of comparing the prices of matching observations from one period to another, editing checks should focus on the price changes calculated from pairs of observations, rather than on the reported prices themselves.

10.181 Identification of unusual price changes can be accomplished by

  • Nonstatistical checking of input data,

  • Statistical checking of input data, and

  • Output checking.

These will be described in turn.

D.1.1 Nonstatistical checking of input data

10.182 Nonstatistical checking can be undertaken by manually checking the input data, by inspecting the data presented in comparable tables, or by setting filters.

10.183 When the price reports or questionnaires are received in the statistical office, the reported prices can be checked manually by comparing these with the previously reported prices of the same commodities or by comparing them with prices of similar commodities from other establishments. Although this procedure may detect obvious unusual price changes, it is far from ensuring that all possible errors are detected. It is also extremely time consuming, and it does not identify coding errors.

10.184 After the price data have been coded, the statistical system can be programmed to present the data in a comparable form in tables. For example, a table showing the percentage change for all reported prices from the previous to the current month may be produced and used for detection of possible errors. Such tables may also include the percentage changes of previous periods for comparison and 12-month changes. Most computer programs and spreadsheets can easily sort the observations according to, say, the size of the latest monthly rate of change so that extreme values can easily be identified. It is also possible to group the observations by elementary aggregates.
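A table of this kind can be sketched in a few lines. The example below assumes prices are held in two dictionaries keyed by a hypothetical item code, one for the previous month and one for the current month. It is an illustration of the sorting idea only, not a description of any particular statistical system.

```python
def monthly_changes(previous, current):
    """Return (item, percent change) pairs sorted by absolute change,
    largest first. Items without a usable pair of prices are skipped."""
    changes = []
    for item, old_price in previous.items():
        new_price = current.get(item)
        if new_price is None or old_price <= 0:
            continue  # missing prices are treated separately
        changes.append((item, 100.0 * (new_price / old_price - 1.0)))
    return sorted(changes, key=lambda pair: abs(pair[1]), reverse=True)


# Hypothetical price reports for three items.
previous = {"A": 100.0, "B": 50.0, "C": 20.0}
current = {"A": 102.0, "B": 75.0, "C": 19.0}

for item, pct in monthly_changes(previous, current):
    print(f"{item}: {pct:+.1f}%")
```

Sorting by the absolute size of the change puts the most extreme movements, in either direction, at the top of the table for the analyst to inspect first.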

10.185 The advantage of grouping observations is that it highlights potential errors so that the analyst does not have to look through all observations. A hierarchical strategy whereby all extreme price changes are first identified and then examined in context may save time, although the price changes in elementary aggregate indices that have relatively high weights should also be examined in context.

10.186 Filtering is a method by which possible errors or outliers are identified according to whether the price changes fall outside some predefined limits, such as ±20 percent or even ±50 percent. This test should capture any serious data coding errors, as well as some of the cases in which a respondent has erroneously reported on a different commodity. It is usually possible to identify these errors without reference to any other observations in the survey, so this check can be carried out at the data-capture stage. The advantage of filtering is that the analyst need not look through numerous individual observations.

10.187 These upper and lower limits may be set for the latest monthly change, or change over some other period. Note that the set limits should take account of the context of the price change. They may be specified differently at various levels in the hierarchy of the indices, for example, at the commodity level, at the elementary aggregate level, or at higher levels. Larger changes for commodities whose prices are known to be volatile might be accepted without question. For example, for monthly changes, limits of ±10 percent might be set for petroleum prices, while for professional services the limits might be 0 percent to +5 percent (because any price that falls is suspect), and for computers it might be –5 percent to zero, because any price that rises is suspect. One can also change the limits over time. If it is known that petroleum prices are rising, the limits could be 10 percent to 20 percent, while if they are falling, they might be –10 percent to –20 percent. The count of failures should be monitored regularly to examine the limits. If too many observations are being identified for review, the limits will need to be adjusted or the customization refined.
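A filter with limits customized by commodity group might be sketched as follows. The limits are the illustrative figures given above; the group labels, the default limits, and the function name are assumptions made for the example.

```python
# Illustrative acceptance limits (lower, upper), in percent, for monthly
# changes. The group labels are hypothetical.
LIMITS = {
    "petroleum": (-10.0, 10.0),
    "professional_services": (0.0, 5.0),  # any fall is suspect
    "computers": (-5.0, 0.0),             # any rise is suspect
}
DEFAULT_LIMITS = (-20.0, 20.0)  # assumed fallback for other groups


def fails_filter(group, old_price, new_price, limits=LIMITS):
    """Return True if the price change falls outside the acceptance
    interval for its group and should be flagged for review."""
    lower, upper = limits.get(group, DEFAULT_LIMITS)
    pct_change = 100.0 * (new_price / old_price - 1.0)
    return pct_change < lower or pct_change > upper


print(fails_filter("petroleum", 100.0, 108.0))
print(fails_filter("computers", 100.0, 101.0))
```

A flagged observation is a candidate for review, not for deletion. Counting the failures per period for each group gives the monitoring statistic needed to decide whether the limits should be adjusted.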

10.188 The use of automatic deletion systems is not advised, however. It is a well-recorded phenomenon in pricing that price changes for many commodities, especially durables, are not undertaken smoothly over time but saved up to avoid what are termed “menu costs” associated with making a price change. These relatively substantial increases may take place at different times for different models of commodities and have the appearance of extreme, incorrect values. To delete a price change for each model of the commodity as being “extreme” at the time it occurs is to ignore all price changes for the industry.

D.1.2 Statistical checking of input data

10.189 Statistical checking of input data compares, for some time period, each price change with the change in prices in the same or a similar sample. Two examples of such filtering are given here, the first based on nonparametric summary measures (with an alternative for log-normally distributed price changes) and the second on the Tukey algorithm.

10.190 The first method involves tests based on the median and quartiles of price changes, so they are unaffected by the impact of any single extreme observation. First, the median, first quartile, and third quartile price relatives are defined as RM, RQ1, and RQ3, respectively. Then, any observation with a price ratio more than a certain multiple C of the distance between the median and the quartile is identified as a potential error. The basic approach assumes price changes are normally distributed. Under this assumption, it is possible to estimate the proportion of price changes that are likely to fall outside given bounds expressed as multiples C of that distance. Under a normal distribution, RQ1 and RQ3 are equidistant from RM; thus, if the distance is measured as (RQ3 – RQ1)/2, 50 percent of observations would be expected to lie within that distance of the median. From the tables of the standardized normal distribution, this distance is equivalent to about 0.7 times the standard deviation (σ). If, for example, C were set to 6, the distance implied is about 4σ of the sample, so only about 0.005 percent of observations would be identified this way. With C = 4, the corresponding figures are 2.7σ, or about 0.7 percent of observations. If C = 3, the distance is 2.02σ, so about 4 percent of observations would be identified.

10.191 In practice, most prices may not change each month, and the share of observations identified as possible errors as a percentage of all changes would be unduly high. Some experimentation with alternative values of C for different industries or sectors may be appropriate. If this test is to be used to identify possible errors for further investigation, a relatively low value of C should be used.

10.192 To use this approach in practice, three modifications should be made. First, to make the calculation of the distance from the center the same for extreme changes on the low side as well as on the high side, a transformation of the relatives should be made. The transformed distance for the ratio of one price observation i, Si, should be

$$
S_i =
\begin{cases}
1 - R_M / R_i & \text{if } 0 < R_i < R_M, \\
R_i / R_M - 1 & \text{if } R_i \ge R_M.
\end{cases}
$$

Second, if the price changes are grouped closely together, the distances between the median and quartiles may be very small, so that many observations would be identified that had quite small price changes. To avoid this, some minimum distance, say 5 percent for monthly changes, should also be set. Third, with small samples, the impact of one observation on the distances between the median and quartiles may be too great. Because sample sizes for some elementary indices are small, samples for similar elementary indices may need to be grouped together.8
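The modified test can be sketched as below. This is a minimal illustration, assuming that the minimum distance is applied to each quartile distance before it is multiplied by C, and that quartiles are computed with the "inclusive" interpolation of Python's `statistics.quantiles`. The default values of `c` and `min_distance` and the function names are illustrative choices, not prescriptions.

```python
import statistics


def transformed_distance(relative, median_relative):
    """S_i = 1 - RM/R_i below the median, R_i/RM - 1 at or above it."""
    if relative < median_relative:
        return 1.0 - median_relative / relative
    return relative / median_relative - 1.0


def quartile_filter(relatives, c=4.0, min_distance=0.05):
    """Return the positions of price relatives flagged for review.

    All relatives must be positive. Quartile distances smaller than
    min_distance are replaced by min_distance, so that tightly clustered
    samples do not flag very small price changes.
    """
    rm = statistics.median(relatives)
    s = [transformed_distance(r, rm) for r in relatives]
    sq1, sm, sq3 = statistics.quantiles(s, n=4, method="inclusive")
    lower = sm - c * max(sm - sq1, min_distance)
    upper = sm + c * max(sq3 - sm, min_distance)
    return [i for i, value in enumerate(s) if value < lower or value > upper]


# Hypothetical monthly price relatives for a group of similar items.
relatives = [1.00, 1.01, 0.99, 1.02, 1.00, 0.98, 1.03, 1.60, 1.01, 0.50]
print(quartile_filter(relatives))
```

In this sample the quartile distances are well under 5 percent, so the minimum distance determines the bounds, and only the relatives 1.60 and 0.50 are flagged. The transformation treats a halving and a doubling of price as roughly equally distant from the median.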

10.193 An alternative method can be used if it is thought that the price changes may be distributed log-normally. In this method, the standard deviation of the log of all price changes in the sample (excluding unchanged observations) is calculated and a goodness of fit test (χ²) is undertaken to identify whether the distribution is log-normal. If the distribution satisfies the test, all price changes outside two times the exponential of the standard deviation are highlighted for further checking. If the test rejects the log-normal hypothesis, all the price changes outside three times the exponential of the standard deviation are highlighted. The same caveats mentioned before about clustered changes and small samples apply.

10.194 The second example is based on the Tukey algorithm. The set of price relatives is sorted and the highest and lowest 5 percent flagged for further attention. Once the top and bottom 5 percent have been excluded, the price relatives that are equal to 1 (no change) are also excluded. The arithmetic (trimmed) mean (AM) of the remaining price relatives is calculated. This mean is used to separate the price relatives into two sets, an upper and a lower one. The lower and upper “mid-means,” that is, the means of each of these sets (AML, AMU), are then calculated. Lower and upper Tukey limits (TL, TU) are then established as the mean ±2.5 times the difference between the mean and the mid-means:

$$
\begin{aligned}
T_U &= AM + 2.5\,(AM_U - AM), \\
T_L &= AM - 2.5\,(AM - AM_L).
\end{aligned}
$$

Then, all those observations that fall above TU or below TL are flagged for attention.
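The steps of the algorithm can be sketched as follows. The sketch assumes that the number of observations trimmed from each tail is 5 percent of the sample rounded down, and that a relative exactly equal to the trimmed mean is assigned to the upper set. Both are choices made for the example, as are the function name and the sample data. The sample must be large enough for both sets to be nonempty.

```python
import math
import statistics


def tukey_limits(relatives, trim_share=0.05, k=2.5):
    """Return the lower and upper Tukey limits (TL, TU)."""
    ordered = sorted(relatives)
    n_trim = math.floor(trim_share * len(ordered))
    # Drop the lowest and highest 5 percent, then the unchanged prices.
    core = ordered[n_trim:len(ordered) - n_trim]
    core = [r for r in core if r != 1.0]
    am = statistics.fmean(core)                              # trimmed mean
    am_lower = statistics.fmean([r for r in core if r < am])   # lower mid-mean
    am_upper = statistics.fmean([r for r in core if r >= am])  # upper mid-mean
    return am - k * (am - am_lower), am + k * (am_upper - am)


# Hypothetical sample of 20 price relatives, six of them unchanged.
relatives = [1.00] * 6 + [0.97, 0.98, 0.99, 1.01, 1.02, 1.03, 1.04,
                          0.96, 1.05, 0.95, 1.02, 0.98, 0.60, 1.50]
tl, tu = tukey_limits(relatives)
flagged = sorted(r for r in relatives if r < tl or r > tu)
print(round(tl, 3), round(tu, 3), flagged)
```

In this sample the two trimmed tail observations, 0.60 and 1.50, are also the only ones outside the limits. In practice the observations in the trimmed tails are flagged in any case, and the limits may identify further ones.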

10.195 This method is similar to but simpler than that based on the normal distribution. Because it excludes all cases of no change from the calculation of the mean, it is unlikely to produce limits that are very close to the mean, so there is no need to set a minimum difference. However, its success will also depend on there being a large number of observations on the set of changes being analyzed. Again, it will often be necessary to group observations from similar elementary indices. For any of these algorithms, the comparisons can be made for any time periods, including the latest month’s changes, but also longer periods, in particular, 12-month changes.

10.196 The advantage of these two models of filtering compared with the simple method of filtering is that for each period the upper and lower limits are determined by the data itself and hence are allowed to vary over the year, given that the analyst has decided the value of the parameters entering the models. A disadvantage is that, unless one is prepared to use approximations from earlier experience, all the data have to be collected before the filtering can be undertaken. Filters should be set tightly enough so that the percentage of potential errors that turn out to be real errors is high. As with all automatic methods, the flagging of an unusual observation is for further investigation, as opposed to automatic deletion.

D.1.3 Checking by impact, or data output checking

10.197 Filtering by impact, or output editing, is based on calculating the impact an individual price change has on an index to which it contributes. This index can be an elementary aggregate index, the total index, or some other aggregate index. The impact a price change has on an index is its percentage change times its effective weight. In the absence of sample changes, the calculation is straightforward: It is the nominal (reference period) weight, multiplied by the price relative, and divided by the level of the index to which it is contributing. So the impact on the index P of the change of the price of commodity i from time t to t + 1 is $\pm w_i\,(p_i^{t+1}/p_i^{t})/P^{t}$, where $w_i$ is the nominal weight in the price reference period. A minimum value for this impact can be set, so that all price changes that cause an impact greater than this value can be flagged for review. If index P is an elementary index, then all elementary indices may be reviewed, but if P is a higher-level index, prices that change by a given percentage will be flagged or not, depending on how important the elementary index to which they contribute is in the aggregate.

10.198 However, at the lowest level, births and deaths of commodities in the sample cause the effective weight of an individual price to change quite substantially. The effective weight is also affected if a price observation is used as an imputation for other missing observations. The evaluation of effective weights in each period is possible, though complicated. However, as an aid to highlighting potential errors, the nominal weights, as a percentage of their sum, will usually provide a reasonable approximation. If the impact of 12-month changes is required to highlight potential errors, approximations are the only feasible filters to use, because the effective weights will vary over the period.
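The approximation based on nominal weights can be sketched as follows. The example takes the impact to be the percentage price change multiplied by the nominal weight expressed as a share of the sum of weights. The threshold of 0.5, the item labels, and the function names are hypothetical.

```python
def impacts(weights, relatives):
    """Approximate impact of each price change on the index, in percent:
    nominal weight share times percentage price change."""
    total = sum(weights.values())
    return {item: 100.0 * (weights[item] / total) * (relatives[item] - 1.0)
            for item in weights}


def flag_by_impact(weights, relatives, threshold=0.5):
    """Return the items whose absolute impact exceeds the threshold."""
    return sorted(item for item, value in impacts(weights, relatives).items()
                  if abs(value) > threshold)


# Hypothetical nominal weights and price relatives.
weights = {"A": 60.0, "B": 30.0, "C": 10.0}
relatives = {"A": 1.01, "B": 1.00, "C": 1.04}
print(flag_by_impact(weights, relatives))
```

Here the 1 percent change in the heavily weighted item A is flagged, while the larger 4 percent change in the lightly weighted item C is not, which is the sense in which this form of filtering focuses on results.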

10.199 One advantage of identifying potential errors this way is that it focuses on the results. Another advantage is that this form of filtering also helps the analyst to describe the contributions to change in the price indices. In fact, much of this kind of analysis is done after the indices have been calculated, because the analyst often wishes to highlight those indices that have contributed the most to overall index changes. Sometimes the analysis results in a finding that particular industries have a relatively high contribution to the overall price change, and that finding is considered unrealistic. The change is traced back to an error, but it may be late in the production cycle and jeopardize the scheduled release date. There is thus a case for identifying such unusual contributions as part of the data editing procedures. The disadvantage of this method is that an elementary index’s change may be rejected at that stage. It may be necessary to override the calculated index, though this should be a stopgap measure only until the index sample is redesigned.

D.2 Verifying and correcting data

10.200 Some errors, such as data coding errors, can be identified and corrected easily. Ideally, these errors are caught at the first stage of checking, before they need to be viewed in the context of other price changes. Dealing with other potential errors is more difficult. There may be errors that are not identified in the data checking procedure, and observations that have been identified as potential errors may prove to be correct, especially if the data checking limits are rather narrow. Some edit failures may be resolved only by checking the data with the respondent.

10.201 If a satisfactory explanation can be obtained from the respondent, the data can be verified or corrected. If not, procedures may differ. Rules may be established that if a satisfactory explanation is not obtained, then the reported price is omitted from the index calculation. On the other hand, it may be left to the analyst to make the best judgment as to the price change. However, if an analyst makes a correction to some reported data, without verifying it with the respondent, it may subsequently cause problems with the respondent. If the respondent is not told of the correction, the same error may persist in the future. The correct action depends on a combination of the confidence in the analysts, the revision policy in the survey, and the degree of communication with respondents. Most statistical offices do not want to unduly burden respondents.

10.202 In many organizations, a disproportionate share of activity is devoted to identifying and following up on potential errors. If the practice leads to little change in the results, because most reports end up being accepted, then the bounds on what are considered to be extreme values should be relaxed. More errors are likely introduced by respondents failing to report changes that occur than from wrongly reporting changes, and the goodwill of respondents should not be unduly undermined.

10.203 Generally, the effort spent on identifying potential errors should not be excessive. Obvious mistakes should be caught at the data capture stage. The time spent identifying observations to query, unless they are highly weighted and excessive, is often better spent treating those cases in the production cycle where things have changed—quality changes or unavailable prices—and reorganizing activities toward maintaining the relevance of the sample and checking for errors of omission.

10.204 If the price observations are collected in a way that prompts the respondent with the previously reported price, the respondent may report the same price as a matter of convenience. This can happen even though the price may have changed, or even when the particular commodity being surveyed is no longer available. Because prices for many commodities do not change frequently, this kind of error is unlikely to be spotted through normal checks. Often the situation comes to light when the contact at the responding outlet changes and the new contact has difficulty finding something that corresponds to the price previously reported. It is advisable, therefore, to keep a record of the last time a particular respondent reported a price change. When that time has become suspiciously long, the analyst should verify with the respondent that the price observation is still valid. What constitutes too long will vary from commodity to commodity and the level of overall price inflation but, in general, any price that has remained constant for more than a year is suspect.
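Such a record can be checked mechanically. The sketch below assumes a mapping from a hypothetical series identifier to the date of the last reported price change, and uses the 12-month rule of thumb mentioned above as the default threshold.

```python
from datetime import date


def months_between(earlier, later):
    """Whole calendar months from earlier to later."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def stale_series(last_change, current_period, max_months=12):
    """Return the series whose price has not changed for more than
    max_months and should be verified with the respondent."""
    return sorted(series for series, changed in last_change.items()
                  if months_between(changed, current_period) > max_months)


# Hypothetical record of the last reported price change per series.
last_change = {"X": date(2023, 1, 1), "Y": date(2024, 3, 1)}
print(stale_series(last_change, date(2024, 6, 1)))
```

The threshold would normally be set by commodity and adjusted for the general rate of price inflation, so `max_months` is best treated as a parameter rather than a constant.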

D.2.1 Treatment of outliers

10.205 Detection and treatment of outliers (extreme values that have been verified as being correct) is an insurance policy. It is based on the fear that a particular data point collected is exceptional by chance, and that if there were a larger survey, or even a different one, the results would be less extreme. The treatment, therefore, is to reduce the impact of the exceptional observation, though not to ignore it, because, after all, it did occur. The methods to test for outliers are the same as those used to identify potential errors by statistical filtering, as described above. For example, upper and lower bounds of distances from the median price change are determined. In this case, however, when observations are found outside those bounds, they may be changed to be at the bounds or imputed by the rate of change of a comparable set of prices. This outlier adjustment is sometimes made automatically, on the grounds that the analyst by definition has no additional information on which to base a better estimate. Although such automatic adjustment methods are employed, the Manual proposes caution in their use. If an elementary aggregate is relatively highly weighted and has a relatively small sample, an adjustment may be made. The general prescription should be to include verified prices as reported; dampening them should be the exception.
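Moving an outlier to the bounds can be illustrated as follows. The sketch assumes the bounds have already been obtained from one of the statistical filters and, for the comparison only, that the elementary index is a geometric mean of price relatives. The bounds, the data, and the function names are hypothetical.

```python
import math


def dampen(relative, lower, upper):
    """Move a verified but extreme price relative to the nearest bound."""
    return min(max(relative, lower), upper)


def geometric_mean_index(relatives):
    """Unweighted geometric mean of price relatives."""
    return math.exp(sum(math.log(r) for r in relatives) / len(relatives))


relatives = [1.01, 0.99, 1.02, 1.60]  # the last one is a verified outlier
bounds = (0.80, 1.20)                 # hypothetical filter bounds
dampened = [dampen(r, *bounds) for r in relatives]

print(round(geometric_mean_index(relatives), 4))
print(round(geometric_mean_index(dampened), 4))
```

The dampened index still reflects the large increase, but less strongly. Because the observation has been verified as correct, this kind of adjustment is best reserved for the exceptional cases described above.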

D.2.2 Treatment of missing price observations

10.206 It is likely that not all the requested data will have been received by the time the index needs to be calculated. It is generally the case that missing data turn out to be delayed. In other cases, the respondent may report that a price cannot be reported because neither the commodity, nor any similar substitute, is being made anymore. Sometimes, of course, what started as an apparent late report becomes a permanent loss to the sample. Different actions need to be taken depending on whether the situation is temporary or permanent.

10.207 For temporarily missing prices, the most appropriate strategy is to minimize the occurrence of missing observations. Survey reports are likely to come in over a period of time before the indices need to be calculated. In many cases, they follow a steady routine: some respondents will tend to file quickly, and others typically will file later in the processing cycle. An analyst should become familiar with these patterns. If there is a good computerized data capture system, it can flag reports that appear to be later than usual, well before the processing deadline. Also, some data are more important than others. Depending on the weighting system, some respondents may be particularly important, and their reports should be flagged as requiring particular scrutiny.

10.208 For those reports for which no estimate can be made, two basic alternatives are considered here (see Chapter 8 for a full range of approaches): imputation, preferably targeted, in which the missing price change is assumed to be the same as some other set of price changes, or an assumption of no change, in which the preceding period’s price is used (the carry-forward method discussed in Chapter 8). However, this latter procedure ignores the fact that some prices will prove to have changed, and if prices are generally moving in one direction, this will mean that the change in the indices will be understated. It is not advised. However, if the index is periodically revised, the carry-forward method will lead to fewer subsequent revisions than will making an imputation, because for most commodities, prices do not generally change in any given period. The standard approach to imputation is to base the estimate of the missing price observation on the change of some similar group of observations.

10.209 There will be situations where the price is permanently missing because the commodity no longer exists. Because there is no replacement for the missing price, an imputation will have to be made each period until either the sample is redesigned or a replacement can be found. Imputing prices for permanently missing sample observations is, therefore, more important than in the case of temporarily missing reports and requires closer attention.

10.210 The missing price can be imputed by the change of the remaining price observations in the elementary aggregate, which has the same effect as removing the missing observation from the sample, or by the change of a subset of other price observations for comparable commodities. The series should be flagged as being based on imputed values.
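The equivalence between imputing by the remaining observations and removing the missing observation can be checked numerically. The sketch assumes a geometric-mean elementary index calculated from period-to-period price relatives, with `None` marking a missing report. A carry-forward variant is included for comparison. All names and figures are illustrative.

```python
import math


def geometric_mean_index(relatives):
    """Unweighted geometric mean of price relatives."""
    return math.exp(sum(math.log(r) for r in relatives) / len(relatives))


def index_with_imputation(relatives):
    """Impute each missing relative by the geometric mean change of the
    reported relatives in the same elementary aggregate."""
    reported = [r for r in relatives if r is not None]
    imputed = geometric_mean_index(reported)
    return geometric_mean_index([imputed if r is None else r for r in relatives])


def index_with_carry_forward(relatives):
    """Treat each missing price as unchanged (relative of 1)."""
    return geometric_mean_index([1.0 if r is None else r for r in relatives])


relatives = [1.02, 1.04, None, 1.03]  # third report is missing
print(round(index_with_imputation(relatives), 4))
print(round(index_with_carry_forward(relatives), 4))
```

With prices generally rising, the carry-forward result is lower than the imputed one, which illustrates the understatement that makes carrying prices forward inadvisable.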

10.211 Samples are designed on the basis that the commodities chosen to observe are representative of a wider range of commodities. Imputations for permanently missing prices are indications of weakness in the sample, and their accumulation is a signal that the sample should be redesigned. For indices where there is known to be a large number of deaths in the sample, the need for replacements should be anticipated.

1

In practice the coverage is less, because large proportions of trade may be deleted if no quantity information is available, if the unit value changes are considered to be outliers, or if customs documentation does not cover the commodities concerned.

2

This can be seen by rewriting equation (10.2) above as

$$
P_D^{0:t} = \frac{\sum \frac{1}{n}\, p_i^{0}\,\left(p_i^{t}/p_i^{0}\right)}{\sum \frac{1}{n}\, p_i^{0}}.
$$

3

The Carli index failed the time reversal and transitivity tests and should not be used. The Dutot failed the commensurability test and is applicable only for homogeneous items.

4

The quantities that make up the basket in the Walsh index are the geometric means of the quantities in the two periods.

5

If, on the other hand, the index reference period is changed and the index series before the link period are rescaled to the new index reference period, these series cannot be aggregated to higher-level indices by use of the new weights.

6

This method has been developed by the Central Statistical Office of Sweden, where it is applied in the calculation of the CPI. See Statistics Sweden (2001).

7

In the actual Swedish practice, a factor scaling the index from December year 0 to the average of year 0 is multiplied onto the righthand side of equation (10.24) to have a full year as reference period.

8

For a detailed presentation of this method the reader is referred to Hidiroglou and Berthelot (1986). The method can be expanded to also take into account the level of the prices, so that, for example, a price increase from 100 to 110 is attributed a different weight than is a price increase from 10 to 11.