15. Peer Group Analysis and Descriptive Statistics

International Monetary Fund
Published Date:
April 2006


15.1 Both users and compilers of FSIs have recognized the need for peer group analysis and dispersion analysis. This chapter sets out options and ideas in these areas for use by compilers and analysts.

15.2 Sector balance sheets and income and expense data can disguise important information. For example, the sector-wide capital-to-asset ratio for deposit takers is essentially the average capital-to-asset ratio for the system (derived by summing all institutions’ capital and dividing by all institutions’ assets) and, if the ratios are normally distributed, would also convey information about the median capital-to-asset ratio and the most frequently observed capital-to-asset ratio (the mode). However, the ratio does not indicate whether individual institutions’ capital ratios are clustered in a narrow range around the average value or are spread over a wide range. Moreover, if the data for one highly capitalized deposit taker offset the data for several undercapitalized deposit takers, the aggregate ratio may appear robust, masking significant vulnerabilities from weak deposit takers whose failure could lead to contagion throughout the system.

15.3 A wide variety of meaningful peer groups can be created for comparison purposes, and descriptive statistics can be compiled to examine the dispersion and concentration of the institutions within the peer group or sector. This chapter describes some types of peer groups and discusses measures of concentration and of dispersion. Issues in developing these data are set out, such as weighting the contribution of the individual institutions, and some guidance in analyzing the data is provided.

Peer Group Analysis

15.4 A peer group is a set of individual institutions that are grouped on the basis of analytically relevant criteria. Peer groups can be used to compare the FSI ratios of (1) individual deposit takers with those of similar institutions, (2) peer groups with those of other domestic peer groups, and (3) peer groups across countries. Peer group analysis can be undertaken using either cross-border or domestic consolidated data.

Types of Peer Groups

15.5 Depending on analytical needs and data availability, different types of peer groups may be constructed. Some might be constructed on an ad hoc basis. For example, ad hoc peer groups might cover recent entrants into the market, deposit takers with low capital ratios or low return on equity, deposit takers with high levels of nonperforming loans, and deposit takers that concentrate on lending to particular types of borrowers. Other peer groups might be created to facilitate ongoing analysis, such as groups of similarly sized deposit takers (based on their total assets).

15.6 By way of example, peer group data could be constructed for groupings of deposit takers based on the following major characteristics:

  • Size of assets or revenues. The size of institutions might affect market competitiveness or market power. Moreover, the condition of the peer group composed of the largest deposit takers—such as the three to five largest deposit takers, based on total assets—is often important for understanding overall stability, because these deposit takers are the most likely to be systemically important and may exercise the greatest market power. Such a group has a small enough number of institutions that it can be constructed for most economies and can facilitate international comparison.
  • Line of business. For example, regular retail banks might be distinguished from mortgage banks.
  • Type of ownership. For example, publicly controlled deposit takers might be distinguished from privately controlled deposit takers.
  • Offshore or onshore. Deposit takers that are offshore can have transactions with only nonresidents and thus might be an important group to identify.
  • Region of the country.

15.7 From the above list, the Guide encourages, at a minimum, the compilation of core FSIs for peer groups based on the relative size of assets. The Guide discourages the dissemination of peer group data that might reveal information on specific institutions, unless the country normally requires deposit takers to publicly disclose such information.

Compilation of Peer Group Data

15.8 A key consideration in constructing peer group data is determining how such data are to be compiled. Regardless of the approach taken, constructing peer group data depends critically on the cost of compiling these data and on the ease with which they can be reorganized to serve various analytical needs. To allow construction of peer group data, the Guide encourages compilers to maintain individual institution data in a database that allows quick, flexible, and low-cost data aggregation. Under such an approach, peer group data can potentially be compiled using the same principles as sector-level data. For example, intragroup income and expense items and, depending on data availability, intragroup equity holdings could be eliminated in constructing peer group data.

15.9 In constructing the data, a decision needs to be made on whether the peer group should be treated as a subgroup of the total population (that is, the data are the peer group’s contribution to the total for the population) or as a stand-alone grouping (that is, the group is self-contained, with all institutions outside the group treated as external to the group). There are advantages in adopting either approach, but data compilation considerations may be decisive, particularly if ad hoc groups are created.

15.10 The stand-alone approach is likely to require less additional data than the subgroup approach. For instance, under either approach, intra-peer-group interest income and expense will be eliminated in the net interest income line. However, under the subgroup approach, the elimination of interest income and expense vis-à-vis institutions within the sector but outside the peer group requires the collection of additional data.

15.11 However, even the stand-alone approach will require additional data if the peer group data are to be compiled in line with the sector-level approach. Some of this information might be obtainable from the data reported in Tables 11.2 and 11.4, depending on the consolidation approach adopted. For instance, intra-peer-group holdings of equity could be eliminated to the extent that individual deposit takers identify their holdings of equity issued by other deposit takers. As a practical matter, peer group data might be compiled on an approximate best practice basis; this would still allow the identification of trends but—depending on the degree of approximation and the scope of analysis—could potentially mask relevant interrelationships. In such circumstances, it is encouraged that any relevant potential limitations of the data be identified for the user, such as capital and reserves’ not being fully adjusted for intra-peer-group holdings.

Descriptive Statistics

15.12 In many ways, concentration and dispersion analysis uses specific techniques depending on the nature of the issue under review, the types of data available and the ease of using them, and any limitations on revealing information on specific institutions. Flexibility in selecting techniques should be maintained. This section provides a menu of techniques that are useful in a variety of situations. However, in disseminating information to the public, some types of descriptive statistics may prove particularly useful, because they can describe concentration and dispersion without revealing information on individual institutions.

Measures of Concentration

15.13 The Herfindahl Index, H, is the sum of squares of the market shares of all firms in a sector, that is,

$$H = \sum_{i=1}^{N} s_i^2,$$

where $s_i$ denotes the market share of firm i (in percent) and N the number of firms. By using market shares, this index stresses the importance of the larger firms in the population. Higher values indicate greater concentration. In a situation with no concentration, where each of the 100 firms has an identical 1 percent share of the market, the value of H = 100. In contrast, with perfect concentration, where one firm has a 100 percent market share, H = 10,000; that is, the contribution of the monopoly firm is 100 × 100 = 10,000. A rule of thumb sometimes used is that H below 1,000 indicates relatively limited concentration, and H above 1,800 points to significant concentration. Table 15.1 illustrates how to compute H for a country consisting of 11 deposit takers.

Table 15.1. Example of Computing the Herfindahl Index
Deposit Taker | Assets | Percentage Share | Share²
Total | 1,000 | 100 | 1,692.0
Herfindahl Index = 1,692.0 (Top 5 = 1,614)

15.14 As noted in Chapter 12, the Guide encourages dissemination of the Herfindahl Index. For ease of compilation, it is also possible to compile partial Herfindahl indices, such as the one based on the shares of the total sector assets of the largest five deposit takers.
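
As a rough illustration of the full and partial indices described above, the following Python sketch computes H on the 0 to 10,000 scale from a list of asset values. The function name `herfindahl`, the `top_n` parameter, and the asset figures are hypothetical and are not the data behind Table 15.1.

```python
def herfindahl(assets, top_n=None):
    """Herfindahl index (0-10,000 scale) from a list of asset values.

    Shares are always measured against total sector assets. If top_n is
    given, only the top_n largest shares are squared and summed, which
    yields a partial index.
    """
    total = sum(assets)
    shares = sorted((100.0 * a / total for a in assets), reverse=True)
    if top_n is not None:
        shares = shares[:top_n]
    return sum(s * s for s in shares)


# Hypothetical sector of 11 deposit takers (illustrative values only).
assets = [300, 200, 150, 100, 80, 60, 40, 30, 20, 10, 10]
h_full = herfindahl(assets)
h_top5 = herfindahl(assets, top_n=5)
print(round(h_full, 1), round(h_top5, 1))
```

Because the partial index omits only the smallest squared shares, it can never exceed the full index, and the gap between the two indicates how much the smaller institutions contribute to measured concentration.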

15.15 The Gini Index estimates the degree of inequality, indicating how equally a variable is distributed among participants. It summarizes the information shown in a Lorenz curve, namely the difference between the actual distribution of a variable and the hypothetical state in which the variable is uniformly distributed. In the hypothetical state every unit has the same endowment (of income, market share, volume of market trading, and so on), which generates a Gini index of zero. If only one unit is endowed with all income, assets, and so on, and no other unit has any, there is perfect concentration and the Gini index is one. Gini indices are especially useful for tracking changes in inequality over time. Table 15.2 illustrates the computation of the Gini index.

Table 15.2. Example of Computing the Gini Index
Deposit Taker | Assets | Percentage Share | Cumulative Actual Share (Yi) | Cumulative Equal Share (Xi) | Difference (Xi − Yi) | Difference × 2: (Xi − Yi) × 2 | (Difference × 2) × .091 (note 1): ((Xi − Yi) × 2) × (Xi − Xi–1)
Gini Index (note 2)

Notes:
1. The “equal share” percentage of the total.
2. This index is scaled by a factor of 100.

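15.16 For example, for N deposit takers, arrayed by the size of assets, from smallest to largest:

$$\text{Gini Index} = \sum_{i=1}^{N} 2\,(X_i - Y_i)\,\Delta X_i$$

where
$X_i = \frac{i}{N} \times 100$
$Y_i$ = cumulative percentage share
$\Delta X_i = X_i - X_{i-1}$.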
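
The calculation in paragraph 15.16 can be sketched in Python as follows. This is a minimal illustration, not the Guide's own implementation. It assumes that the step ΔXi is applied as a fraction (1/N, which is about .091 for 11 deposit takers, as in the last column heading of Table 15.2), so that the result lies on a 0 to 100 scale. The function name `gini_index` is hypothetical.

```python
def gini_index(assets):
    """Discrete Gini index on a 0-100 scale.

    X_i is the cumulative equal share and Y_i the cumulative actual share,
    both in percent, with institutions ordered from smallest to largest.
    Assumption: the step Delta X_i enters as the fraction 1/N.
    """
    n = len(assets)
    total = sum(assets)
    gini = 0.0
    cumulative = 0.0
    for i, a in enumerate(sorted(assets), start=1):
        cumulative += a
        x_i = 100.0 * i / n                # cumulative equal share
        y_i = 100.0 * cumulative / total   # cumulative actual share
        gini += 2.0 * (x_i - y_i) * (1.0 / n)
    return gini


print(gini_index([10, 10, 10, 10]))  # identical institutions
print(gini_index([0, 0, 0, 100]))    # one institution holds everything
```

With this discrete formula, perfect concentration among N institutions gives 100 × (N − 1)/N rather than exactly 100, so the upper bound is approached only as N grows.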

Measures of Dispersion

15.17 The four main categories of these statistics are measures of (1) central tendency, (2) variability, (3) skewness, and (4) kurtosis. They can be useful for data analysis, for comparing multiple data sets and for reporting final results of a survey.[1] In disseminating information, graphical presentations, such as simple scatter diagrams, can also be useful in providing users with information on the dispersion of data.

15.18 Measures of central tendency include:

  • Mean (first moment of the distribution), or $\bar{X}$. This is the arithmetic average of the data. Generalizing,

$$\bar{X} = \frac{1}{N}\sum_{i} n_i x_i$$

where
$x_i$ = value of observation i
$n_i$ = number of observations with value $x_i$
$N$ = total number of observations
$\bar{X}$ = population mean.

15.19 As the mean can be affected by extreme observations, other measures of central tendency might also be calculated:

  • Median is the middle observation in a data set. It is often used when a data set is not symmetrical, or when there are outlying observations.
  • Mode is the value around which the greatest number of observations are concentrated, or the value of the most common observation.

15.20 Measures of variability describe the dispersion (or spread) of the data set:

  • Range is the difference between the largest and the smallest observations in the data set. It has limitations because it depends on only two observations in the data set.
  • Variance (the second moment of the distribution), or $\sigma^2$, measures the dispersion of the data around the mean, taking into account all data points. Generalizing,

$$\sigma^2 = \frac{1}{N}\sum_{i} n_i \left(x_i - \bar{X}\right)^2$$

  • Standard Deviation (or $\sigma = \sqrt{\sigma^2}$) is the positive square root of the variance and is the most common measure of variability. Standard deviation indicates how close observations are to the mean.

15.21 Skewness (the third moment of the distribution, or μ3) indicates the extent to which data are asymmetrically distributed about the mean. Positive skewness indicates a longer right-hand side (tail) of the distribution; negative skewness indicates a longer left tail. One measure of skewness is based on the difference between the mean and the median, standardized by dividing by the standard deviation:

$$\text{Skewness} = \frac{\bar{X} - \text{Median}}{\sigma}$$

15.22 Kurtosis (the fourth moment of the distribution, or μ4) indicates whether the data are more or less concentrated toward the center; that is, it indicates the degree of flatness of the distribution near its center. As the kurtosis of a normal distribution equals 3, it is common to subtract 3 from the measure of kurtosis to estimate “excess kurtosis.” Positive excess kurtosis indicates that the distribution is more peaked than the normal distribution; negative excess kurtosis indicates a relatively flat distribution.
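
The measures in paragraphs 15.18 to 15.22 can be sketched with Python's standard library. The function name `describe` and the sample ratios are hypothetical. The sketch assumes population (divide-by-N) moments, the mean-minus-median skewness measure described in paragraph 15.21, and excess kurtosis computed as the fourth central moment divided by the squared variance, minus 3. It also assumes the data are not all identical, so that the standard deviation is nonzero.

```python
import math
import statistics


def describe(values):
    """Equal-weight descriptive statistics using population moments."""
    n = len(values)
    mean = sum(values) / n
    median = statistics.median(values)
    variance = sum((x - mean) ** 2 for x in values) / n
    std = math.sqrt(variance)
    fourth_moment = sum((x - mean) ** 4 for x in values) / n
    return {
        "mean": mean,
        "median": median,
        "modes": statistics.multimode(values),  # all most-common values
        "range": max(values) - min(values),
        "variance": variance,
        "std": std,
        "skewness": (mean - median) / std,
        "excess_kurtosis": fourth_moment / variance ** 2 - 3.0,
    }


# Hypothetical capital asset ratios, in percent.
stats = describe([2, 4, 4, 4, 5, 5, 7, 9])
for name, value in stats.items():
    print(name, value)
```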

Weighting options

15.23 In compiling dispersion data, an issue to address is whether data should be compiled so that each observation has the same weight (equal-weight approach) or is weighted by its relative contribution to the numerator and denominator (weighted-by-contribution approach). As noted above, the Guide, at the sector level, uses the weighted-by-contribution approach.

15.24 The equal-weight approach facilitates identification of whether weaknesses are concentrated in one or two deposit takers or spread across a larger number of institutions and helps identify emerging weaknesses regardless of the size of the institution.

15.25 Variance, skewness, and kurtosis can be calculated using the weight of the contribution from each observation. For variance, the distance of each observation to the mean should be scaled by its weight in the overall average; for skewness and kurtosis, the weight measures the contribution of each observation to the mean, relative to a normal distribution. Compilation (and dissemination) of descriptive statistics on a weighted-by-contribution basis might reveal whether outliers are small or large relative to the sector.

15.26 Because of their analytical usefulness, dispersion statistics could be compiled using both weighting approaches, depending on data availability. However, if the equal-weight approach is adopted, users should be made aware that the mean calculated under this approach might well be different from the FSI itself.
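
The difference between the two weighting approaches can be illustrated with a short Python sketch. It assumes that, under the weighted-by-contribution approach, each institution's ratio is weighted by its share of the denominator (here, assets). That is one reading of the text, and under it the weighted mean equals the sector-level ratio. The function names and figures are hypothetical.

```python
def equal_weight_mean(capital, assets):
    """Simple average of individual capital asset ratios, in percent."""
    ratios = [100.0 * c / a for c, a in zip(capital, assets)]
    return sum(ratios) / len(ratios)


def weighted_mean(capital, assets):
    """Asset-weighted mean, identical to the sector-level ratio."""
    return 100.0 * sum(capital) / sum(assets)


def weighted_variance(capital, assets):
    """Variance with each squared deviation scaled by the asset share."""
    total = sum(assets)
    mu = weighted_mean(capital, assets)
    return sum(
        (a / total) * (100.0 * c / a - mu) ** 2
        for c, a in zip(capital, assets)
    )


# Hypothetical data: one large, thinly capitalized institution, two small ones.
capital = [40, 6, 8]
assets = [1000, 50, 50]
print(equal_weight_mean(capital, assets))  # driven by the two small institutions
print(weighted_mean(capital, assets))      # driven by the large institution
print(weighted_variance(capital, assets))
```

In this example the equal-weight mean is well above the sector-level ratio, which is the kind of divergence paragraph 15.26 asks compilers to flag to users.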

Interpretation of descriptive statistics

15.27 Figure 15.1 sets out an example of an economy that has 100 deposit takers with capital asset ratios distributed as shown in the figure. Table 15.3 provides dispersion statistics on an equal-weight basis, and Table 15.4 provides the equivalent statistics calculated on a weighted-by-contribution basis.

Figure 15.1. Distribution of Observations

Table 15.3. Dispersion Statistics of Capital Asset Ratios (Equal-Weight Approach)
Mean | Median | Mode | Variance | Standard Deviation | Skewness | Kurtosis
 |  |  |  |  | −0.5 | −0.5

Table 15.4. Dispersion Statistics of Capital Asset Ratios (Weighted-by-Contribution Approach)
Weighted Mean | Standard Deviation | Median | Mode | Skewness | Kurtosis

15.28 The statistics in Table 15.3 could be interpreted as follows: because the value of the mean is smaller than both the median and mode, the distribution is asymmetric with a leftward skew (that is, a longer tail toward smaller values). This is confirmed by the negative value for the measure of skewness. In addition, the standard deviation indicates some significant dispersion around the mean. The flat distribution (relative to a normal distribution) is confirmed by the negative kurtosis.[2]

15.29 The weighted-by-contribution approach produces different results from those of the equal-weight approach. As seen in Table 15.4, the mean is lower and the standard deviation higher than those shown in Table 15.3, owing to the large weights of the observations at the ends of the tails. The large negative kurtosis also reflects low peakedness (that is, “fat” tails).

15.30 Figures 15.2 and 15.3 add to this analysis. The height of the columns in Figure 15.2 shows the distribution of individual institutions’ ratios by weight, that is, the contribution of those deposit takers to the sector-level FSI. The weights are presented in percentage terms. Figure 15.3 indicates both the weight (through the size of the bubble) and the number (through the bubble’s height) of institutions at each ratio. These figures show that the outlying observations in the equally weighted distribution take on increased significance in the weighted-by-contribution distribution. In this example, of the 100 deposit takers in the system, there are only 5 deposit takers with ratios of 2 percent and 10 deposit takers at 14 percent, but together they account for half the weight; in other words, the outliers are relatively important.

Figure 15.2. Distribution of Ratios by Weight

Figure 15.3. Distribution of Capital Asset Ratios by Number of Institutions and by Weight

15.31 Another approach is to compare individual deposit takers’ (or peer groups’) contribution to specific FSIs with their relative contribution to sector assets. For example, a deposit taker generating large income flows through transactions in the financial market could make a significantly bigger contribution to the sector’s income-based FSIs than its asset size would suggest. Such divergence over a period of time might indicate that the deposit taker is taking large risks to generate large income flows. Such comparisons might also be used to check the reliability of data submitted.

15.32 Divergence between the relative balance sheet size of a deposit taker and its contribution by weight to specific FSIs can be identified by constructing the following comparison ratio:

and i is the ith FSI, j is the jth reporting institution, and N is the number of reporting institutions.

15.33 A comparison ratio for a given deposit taker and a given FSI larger (smaller) than unity indicates that, compared with the rest of the deposit-taking sector, that deposit taker makes a larger contribution to the specific FSI than its balance sheet size suggests. A summary matrix of comparison ratios (for deposit takers and FSIs) can be constructed.
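
A summary matrix of this kind could be built along the following lines. The exact form of the comparison ratio is an assumption here: it is taken to be institution j's share of the sector total underlying FSI i, divided by institution j's share of sector assets, which matches the interpretation in paragraph 15.33 (values above unity indicate a larger contribution than balance sheet size suggests). The function name and data are hypothetical.

```python
def comparison_ratios(contributions, assets):
    """Matrix of comparison ratios, one row per FSI, one column per institution.

    contributions[i][j] is institution j's amount of the item underlying
    FSI i (for example, net income). Assumed form of the ratio:
    (share of the FSI item) / (share of sector assets).
    """
    total_assets = sum(assets)
    asset_shares = [a / total_assets for a in assets]
    matrix = []
    for row in contributions:
        row_total = sum(row)
        matrix.append(
            [(value / row_total) / share
             for value, share in zip(row, asset_shares)]
        )
    return matrix


# Hypothetical sector of three deposit takers and one income-based FSI.
net_income = [50, 30, 20]
assets = [500, 400, 100]
ratios = comparison_ratios([net_income], assets)
print(ratios)
```

In this example the third institution holds 10 percent of sector assets but generates 20 percent of sector income, so its comparison ratio is about 2.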

Extensions of Dispersion Measures

15.34 Although the above set of descriptive statistics provides a useful overview of the distribution of data, it does not adequately illuminate weak (strong) conditions, that is, those in the left (right) tail of the distribution.[3] Specifically, no information is provided on how many deposit takers populate the left tail and how they are distributed therein. In this context, some possible extensions to the descriptive statistics in the Guide are explored below.

Option 1: Right- and left-tail attributes

15.35 The measures of central tendency and variance set out in the Guide can be applied to the left and right tails of the distribution, as shown in Table 15.5. This provides some additional insight into the size of the skewness, especially if the standard deviations for the left and right tails are compared relative to their respective means; the relatively large standard deviation for the left tail reveals that there are a number of institutions with ratios significantly below 5.8. Nevertheless, further disaggregation of the data is needed to arrive at how many institutions are involved and how far to the left the distribution is skewed.

Table 15.5. Extensions of Dispersion Statistics of Capital Asset Ratios (Equal-Weight Approach)
 | Mean | Median | Mode | Variance | Standard Deviation | Skewness | Kurtosis
Left tail | 5.
Right tail | 11.311.

Option 2: Ranges

15.36 One way of conveying additional information about the distribution is to show the number of institutions falling within specified ranges or intervals (Table 15.6). This can be supplemented with mean and variance information for each interval. While providing additional insight into the shape of the distribution, the usefulness of this approach is dependent on the size of the intervals. Moreover, cross-country and cross-FSI comparisons may not be useful because the appropriate intervals will likely differ across countries and FSIs.

Table 15.6. Statistics of Capital Asset Ratios, by Range
Standard deviation | 1.

15.37 Nevertheless, this approach might be well suited to indicators that have an accepted norm or benchmark, such as the Basel Capital Adequacy Ratio, for which the analysis could focus on the distribution of ratios to the left of the benchmark. This approach may become more widely applicable as countries gain experience with FSIs and the calibration of benchmarks to local circumstances.

Option 3: Percentiles

15.38 The percentile distribution of individual deposit takers’ ratios goes some way toward addressing concerns about cross-country comparison of ranges. Percentile analysis involves arranging observations in ascending order and dividing the data into groups with equal numbers of observations. The values that serve as the dividing lines among groups are called percentiles. For example, Table 15.7 shows that the 10th percentile corresponds to an observation of 4, and that the 20th percentile corresponds to an observation of 6.[4]

Table 15.7. Statistics of Capital Asset Ratios, by Percentile
FSI ratio ≤
Mean for percentile range | 3.
Standard deviation for percentile range | 1.

15.39 Combined with the mean and standard deviation for each percentile range (for example, 0–10 percent, 10–20 percent, and 20–30 percent), these statistics can reveal areas of financial weakness.[5] For instance, from Table 15.7, the large standard deviation relative to the mean for the bottom percentile indicates that the tail extends below 4 percent for a number of institutions. By contrast, the standard deviation of zero for other percentile ranges indicates that within each range all observations are equal to the mean for the range.

15.40 Quartile analysis involves arranging the observations in ascending order and dividing them into four groups of equal size. The values that serve as dividing lines among the groups are called quartiles, and the interquartile range is the difference between the third and first quartiles.

15.41 As with any system that involves decomposition of aggregated data, the choice of approach can be constrained by confidentiality issues. For example, it is a common statistical practice not to disclose data from cells containing fewer than three institutions. Moreover, the usefulness of this approach depends on the number of percentiles used.
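
The percentile approach can be sketched as follows. The code sorts the ratios, splits them into groups with equal numbers of observations, and reports the dividing value, mean, and standard deviation for each group. The function name `percentile_table`, the dictionary keys, and the data are hypothetical. The sketch assumes there are at least as many observations as groups, and it reports the number of institutions per group so that a minimum-cell-size rule (such as the three-institution practice mentioned in paragraph 15.41) can be checked before dissemination.

```python
import statistics


def percentile_table(ratios, groups=10):
    """Summary statistics for equal-sized groups of sorted observations."""
    ordered = sorted(ratios)
    n = len(ordered)
    table = []
    for g in range(groups):
        chunk = ordered[g * n // groups:(g + 1) * n // groups]
        table.append({
            "range": (100 * g // groups, 100 * (g + 1) // groups),
            "upper_bound": chunk[-1],          # dividing value for the group
            "mean": statistics.fmean(chunk),
            "std": statistics.pstdev(chunk),   # population standard deviation
            "count": len(chunk),               # for confidentiality checks
        })
    return table


# Hypothetical capital asset ratios for 20 deposit takers, in percent.
ratios = [1, 2, 3, 4, 4] + [6] * 5 + [8] * 5 + [10] * 5
table = percentile_table(ratios, groups=4)
for row in table:
    print(row)
```

In this example only the bottom group has a nonzero standard deviation, which signals that its members are spread out below the dividing value rather than clustered at it.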

Further Extensions of Dispersion Measures

15.42 To extend the data analysis, it is often useful to observe the variation in the distribution of FSI ratios and the persistence of individual deposit takers’ FSI values over time.

Variation in the distribution[6]

15.43 At different percentiles, the variation in the distribution of deposit takers’ rates of return over time can facilitate an understanding of trends within sector-level data.

15.44 Figure 15.4 provides an example using data on profitability. An interpretation of the figure might be as follows: until period 4, the rates of return at all percentiles tended to move in the same direction, but thereafter there was a noticeable variation in the distribution. While the path of profitability of the median deposit taker (that is, the return on equity at the 50th percentile) was broadly unchanged, deposit takers in the top percentile recorded an increasing rate of return (notably, from 31 percent in period 10 to 47 percent in period 12), while those in the bottom percentile recorded falling profitability (notably from –3.0 percent in period 10 to –24.9 percent in period 12).

Figure 15.4. Percentiles of Distribution of Return on Equity


15.45 Inspection of particular percentiles is not informative about the “persistence” of an individual deposit taker’s performance from one year to the next. One way of capturing this information is by constructing a transition matrix (Table 15.8) that shows the movement of deposit takers among percentile groups over a period of time.

Table 15.8. Transition Matrix for One-Year Transitions Among Percentiles of the Distribution of Return on Capital
Percentile | Percentile 1, t=2 | Percentile 2, t=2 | Percentile 3, t=2 | Percentile 4, t=2 | Percentile 5, t=2
Percentile 1, t=
Percentile 2, t=1 | 20.0 | 50.5 | 22.6 | 5.4 | 1.5
Percentile 3, t=1 | 7.9 | 21.6 | 46.9 | 20.7 | 2.8
Percentile 4, t=1 | 4.1 | 7.4 | 21.7 | 52.3 | 14.5
Percentile 5, t=

15.46 The principal diagonal (shaded, top left to bottom right) in a transition matrix gives the proportion of deposit takers that persist in the same percentile over time. For example, Table 15.8 shows that 65.2 percent of the deposit takers that populated the top percentile in period 1 also populated that percentile in period 2. The remaining 34.8 percent of deposit takers that populated the first percentile in period 1 populate lower percentiles in period 2.

15.47 An interpretation of the example provided in Table 15.8 might be as follows: there is a relatively high degree of persistence, with typically about half to two-thirds of the deposit takers in a particular percentile remaining in that percentile the following period. Moreover, persistence among the very profitable deposit takers (in the top percentile) and very unprofitable deposit takers (in the bottom percentile) is greater than that for the deposit takers in the three middle percentiles. In addition, mobility from one percentile to the neighboring percentiles is greater than mobility to the more distant percentiles.
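
A transition matrix of this kind could be assembled as sketched below. The code assigns each deposit taker to a percentile group in each period (group 0 being the most profitable, matching the table's convention that the first percentile is the top one) and then tabulates row percentages. The function names and data are hypothetical, and ties are broken by input order, which is a simplification.

```python
def percentile_groups(values, groups=5):
    """Group index per institution; 0 is the group with the highest values."""
    n = len(values)
    order = sorted(range(n), key=lambda k: values[k], reverse=True)
    assignment = [0] * n
    for rank, k in enumerate(order):
        assignment[k] = rank * groups // n
    return assignment


def transition_matrix(groups_t1, groups_t2, groups=5):
    """Row percentages: share of each period-1 group found in each period-2 group."""
    counts = [[0] * groups for _ in range(groups)]
    for g1, g2 in zip(groups_t1, groups_t2):
        counts[g1][g2] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([100.0 * c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical returns for six deposit takers in two periods, three groups.
returns_t1 = [20, 15, 10, 5, 0, -5]
returns_t2 = [18, 8, 12, 6, -2, 1]
g1 = percentile_groups(returns_t1, groups=3)
g2 = percentile_groups(returns_t2, groups=3)
matrix = transition_matrix(g1, g2, groups=3)
for row in matrix:
    print(row)
```

The principal diagonal of the result gives the persistence rates, and each row sums to 100 whenever the corresponding period-1 group is non-empty.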

Explaining the distribution of financial performance

15.48 Whereas describing the patterns observed in measures of financial health is relatively straightforward, explaining the patterns can be more difficult. Nevertheless, some insights can be provided by examining the characteristics of those entities in the tails of the distributions of these indicators, in effect, by combining peer group and percentile analysis.

15.49 For example, Table 15.9 considers the composition by industry of those nonfinancial companies that in the current period have the lowest level of profitability and the highest levels of capital gearing (debt-to-equity ratio). For illustrative purposes, low profitability refers to levels below those in the 10th percentile, while high capital gearing refers to levels above those in the 90th percentile. The table, based on the number of firms in each industry group expressed as a percentage of the total number of firms, compares the industrial distribution at the tails (rows 2 and 3) with that of the whole sector (row 1). An interpretation of the data in Table 15.9 might be as follows: while firms with the lowest profitability are to be found within each of the industry groups, the extraction and transport and communications industries have more such firms relative to their presence in the sector as a whole. Among the companies with high capital gearing, again the transport and communications industries are overrepresented.

Table 15.9. Analysis of Tails of the Distribution by Industry Classification (In percent)
Percentage Represented in Each Industry Classification (SIC)
Industry Group | SIC 1 | SIC 2 | SIC 3 | SIC 4 | SIC 5 | SIC 6 | SIC 7 | SIC 8 | Total Percentage of Firms
1. All firms in sample | 5 | 6 | 15 | 12 | 10 | 18 | 20 | 14 | 100
2. Firms with low profitability (ROE) | 2 | 16 | 10 | 10 | 4 | 9 | 37 | 13 | 100
3. Firms with high capital gearing | 3 | 6 | 8 | 16 | 7 | 11 | 34 | 15 | 100
Note: Industry groups are one-digit nonfinancial, Standard Industrial Classification (SIC-1980) groups. 1. Energy and water supplies; 2. Extraction of minerals and ores other than fuels; manufacture of metals, mineral products, and chemicals; 3. Metal goods, engineering, and vehicles industry; 4. Other manufacturing; 5. Construction; 6. Distribution, hotels, and catering; 7. Transportation and communication; and 8. Other services.

Interactions among indicators of financial health

15.50 From a financial soundness perspective, it may matter whether, for example, the companies with high debt levels are also suffering losses and/or have low liquidity. The overlaps among indicators can therefore be important to the analysis, not least because the interactions among indicators can amplify vulnerability to shocks. One approach to monitoring interactions among FSIs is through regression analysis, while another is presented in Figure 15.5.

Figure 15.5. Coincidence of Financial Soundness Indicators

15.51 Figure 15.5 provides a stylized example of the overlaps among indicators for companies. One-third of the companies (that is, 32 percent) with the highest gearing also had the lowest profitability. In addition, nearly one-third of companies (that is, 29 percent) with the highest gearing had the lowest liquidity. A small group comprising 9 percent of the sector had all three of these characteristics.
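
Overlaps of this kind can be computed by identifying the companies in the relevant tail of each indicator and intersecting the resulting sets. The sketch below is illustrative only: the function names, the 30 percent tail cut-off, and the data are hypothetical and are not the figures behind Figure 15.5.

```python
def tail_members(values, fraction=0.1, highest=True):
    """Indices of the companies in the top (or bottom) tail of an indicator."""
    n_tail = max(1, int(len(values) * fraction))
    order = sorted(range(len(values)), key=lambda k: values[k], reverse=highest)
    return set(order[:n_tail])


def overlap_share(base, other):
    """Percentage of the companies in `base` that are also in `other`."""
    return 100.0 * len(base & other) / len(base)


# Hypothetical indicators for ten companies.
gearing = [5, 1, 4, 2, 3, 0.5, 0.4, 0.3, 0.2, 0.1]
profitability = [-5, 10, 8, 9, -1, 7, 6, 5, 4, -3]

high_gearing = tail_members(gearing, fraction=0.3, highest=True)
low_profit = tail_members(profitability, fraction=0.3, highest=False)
print(overlap_share(high_gearing, low_profit))
```

A third indicator, such as liquidity, could be added by intersecting a third set in the same way.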


[1] An issue arises as to whether dispersion analysis should be undertaken on a stand-alone basis or on a subgroup basis. As noted elsewhere in this chapter, there are advantages to both approaches. To help in the understanding of any data disseminated, it is important to know the approach taken, as (for example) the mean and variance for FSI ratios for peer groups can vary depending on which basis the data are compiled.

[2] The standard deviation for the population can be used to estimate the percentage of the population members that lies within a specified distance of the mean. Chebyshev’s rule is commonly used for forming such estimates.

[3] The terms “weak” and “strong” are relative concepts in this context. That is, they are used to convey weakness or strength relative to the mean, which itself may be weak or strong vis-à-vis a predetermined norm or benchmark (such as 8 percent for the capital adequacy ratio).

[4] It is important to note this does not imply that all deposit takers with ratios of 4 percent are in the bottom percentile; some deposit takers with ratios of 4 percent may also populate the next percentile.

[5] The mean and standard deviation can also be calculated for each percentile range on a cumulative basis (for example, 0–10 percent, 0–20 percent, and 0–30 percent).
