Sixth Review of the Fund's Data Standards Initiatives - Metadata Standardization in the Data Quality Program
Author:
International Monetary Fund

Abstract

This Supplement describes how the staff proposes to achieve further synergies by mapping the DQAF into the metadata structure of the DQP’s other key component: (3) the data transparency initiatives comprising the Special Data Dissemination Standard (SDDS) and General Data Dissemination System (GDDS).

I. Introduction

1. At the Fifth Review of the Fund’s Data Standards Initiatives in July 2003, Executive Directors welcomed the Data Quality Program’s (DQP’s) integration of these initiatives, sharpening their focus on data quality assessment as well as identifying and promoting good statistical practices.1 The DQP’s Data Quality Assessment Framework (DQAF) provides the common metadata structure integrating the DQP’s three principal components.

2. The DQAF already is the underlying structure for two key components of the DQP: (1) the Data Module of the Report on the Observance of Standards and Codes (data ROSC) and (2) the Fund Statistics Department’s (STA’s) Project Management System for technical assistance (TA). This Supplement describes how the staff proposes to achieve further synergies by mapping the DQAF into the metadata structure of the DQP’s other key component: (3) the data transparency initiatives comprising the Special Data Dissemination Standard (SDDS) and General Data Dissemination System (GDDS).

3. This last integration will not change the four-dimensional metadata structure of the SDDS Annex and the GDDS Document,2 comprising data, quality, integrity, and access. It also will not change the six-aspect summary methodology of the SDDS or the comprehensive framework of the GDDS. Instead, it replaces the existing “prompt points,” written by the staff to aid countries in drafting metadata for the above dimensions and aspects, with the detailed components of the DQAF.3 The staff proposes to parse countries’ existing metadata into these DQAF “prompt points,” forward the results to the countries for approval, and, on receipt of their approval, use these deconstructed metadata as the basis for the metadata posted on the Dissemination Standards Bulletin Board.

4. Integrating the DQAF into the metadata model of the data transparency initiatives will enable direct linkages between the metadata maintained through the SDDS and GDDS and comparable metadata from the detailed assessment (section III) of the Data ROSC. It also will sharpen the role of the GDDS in planning national statistical development and coordinating the supporting statistical TA.

5. Within the Fund, all of these integration initiatives will materially support the prospective Data Warehouse project. This project is investigating the potential benefits of incorporating the Fund’s databases into an integrated information technology environment comprising a repository for the databases and facilities for linking and documenting them. STA is the focal point of this project, which is a joint undertaking of STA, Fund area departments, and the Fund Research Department.

6. In an ongoing complementary effort, the Fund has been participating in the Statistical Data and Metadata eXchange (SDMX) initiative, an interagency undertaking to streamline electronic data interchange between institutions. The SDMX initiative will make the electronic exchange of data and metadata across organizations more fully automated and more accurate by eliminating transcription errors. This includes data transfers among international organizations, among government agencies, among units within these organizations (such as Fund Departments), and among all of these groups.

7. Section II describes the key role of the DQAF for guiding STA’s activities with countries, notably in the data ROSC and STA’s TA; Section III outlines the advantages of using the DQAF structure for SDDS and GDDS metadata; and Section IV describes the plan for migrating existing metadata to the DQAF structure. Section V describes the work taking place under the SDMX initiative and the DQAF’s role in setting standards for metadata content.

II. The Data Quality Assessment Framework

8. The DQAF, the heart of the DQP, provides a methodology that covers every aspect of the data compilation and dissemination cycle. It captures key aspects of this cycle by focusing on the quality-related features of the governance of statistical systems, their core statistical processes, and their statistical products. Rooted in the United Nations Fundamental Principles of Official Statistics, it is the product of an intensive consultation with national and international statistical authorities and data users inside and outside the Fund. The DQAF has a cascading topical classification structure. Its top level comprises six one-digit dimensions: (0) prerequisites of quality, (1) assurances of integrity, (2) methodological soundness, (3) accuracy and reliability, (4) serviceability, and (5) accessibility (see Appendix I). Within each dimension, the DQAF contains one or more two-digit elements, and within each element, one or more three-digit indicators. This structure is common to all datasets.4 Within each indicator, the DQAF’s detailed structure becomes specialized to the subject matter of each dataset.
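
The cascading structure described above can be illustrated with a minimal sketch. The dimension names are those listed in the text; the dotted notation for codes (such as "3.1.2") and the names `DqafCode` and `parse_indicator` are assumptions made for illustration, not part of the DQAF itself.

```python
from typing import NamedTuple

# One-digit dimensions as listed in the text above.
DIMENSIONS = {
    0: "prerequisites of quality",
    1: "assurances of integrity",
    2: "methodological soundness",
    3: "accuracy and reliability",
    4: "serviceability",
    5: "accessibility",
}


class DqafCode(NamedTuple):
    """A three-digit indicator split into its three levels (hypothetical type)."""
    dimension: int
    element: int
    indicator: int

    @property
    def dimension_name(self) -> str:
        return DIMENSIONS[self.dimension]

    @property
    def element_code(self) -> str:
        # The two-digit element that contains this indicator.
        return f"{self.dimension}.{self.element}"


def parse_indicator(code: str) -> DqafCode:
    """Parse a dotted code such as '3.1.2' (the dotted notation is an assumption)."""
    parts = code.split(".")
    if len(parts) != 3 or not all(part.isdigit() for part in parts):
        raise ValueError(f"expected a three-digit code such as '3.1.2', got {code!r}")
    dimension, element, indicator = (int(part) for part in parts)
    if dimension not in DIMENSIONS:
        raise ValueError(f"unknown DQAF dimension: {dimension}")
    return DqafCode(dimension, element, indicator)
```

Under these assumptions, `parse_indicator("3.1.2")` yields an indicator in element "3.1" of the dimension "accuracy and reliability". The dataset-specific detail below the indicator level is deliberately left out of the sketch.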

9. By providing an organizing model of internationally accepted good practices, including internationally accepted methodologies, the DQAF facilitates the comparison of national practices against best practices. Hence, it guides staff in assessing national practices and provides a systematic, yet flexible, structure for the Data ROSC.

10. In addition to its use in data ROSCs, the DQAF has been applied in the Fund’s statistical TA program as a guide to identify areas for improvement, make recommendations, and evaluate the outcomes of TA projects. More recently, STA’s implementation of the Fund-wide Technical Assistance Information Management System (TAIMS) has taken advantage of the DQAF methodology to structure the various aspects of TA missions’ tasks. The Partnership in Statistics for Development in the 21st Century (PARIS21) also has incorporated the DQAF into its statistical capacity building indicators.

III. The Revised Metadata Structure of the SDDS and GDDS

11. The SDDS and GDDS metadata components are alternative aggregations of the DQAF’s indicators. The SDDS/GDDS metadata model has somewhat narrower scope than the DQAF because the former excludes some DQAF indicators, notably in the DQAF’s dimension 0, comprising topics under prerequisites of quality. These deal with legal and institutional arrangements not envisaged under the dissemination practices orientation of the SDDS and GDDS.

12. Mapping the SDDS and GDDS metadata elements to the three-digit indicators of the DQAF should provide substantial opportunity to make STA’s technical assistance and data ROSC work more efficient and effective, and should materially support the Data Warehouse project. Prospective work integrating the GDDS plans for improvement with the three-digit DQAF structure should strengthen the GDDS as a capacity-building framework for the Fund and other TA providers. The Fund already structures its statistics TA programs according to the three-digit DQAF. For the Fund, consolidating the GDDS plans for improvement with the TAIMS, both of which follow the structure of the DQAF, will help to eliminate duplication across STA TA activities. Furthermore, merging the DQAF structure into the data dissemination standards makes the data ROSC reports and updates a direct source of metadata for the GDDS and SDDS, eliminating unnecessary redrafting. At the same time, supplying data ROSC missions with metadata from the DSBB that are already in the DQAF format could significantly increase the efficiency of mission preparation and of mission activities in country.

IV. Migrating SDDS/GDDS Metadata to the DQAF Structure

13. The SDDS and GDDS metadata reside in a database, content management, and web-dissemination system collectively known as the Dissemination Standards Bulletin Board (DSBB). The staff intends to take a phased approach to migrating the existing metadata of the DSBB to the DQAF structure. The migration process will not burden member countries subscribing to the SDDS or participating in the GDDS. STA staff will parse existing information into the relevant three-digit DQAF indicators. Newly parsed metadata will be forwarded to countries for review, update, and approval in the context of the required quarterly SDDS metadata certification and annual GDDS metadata updates. To support conversion and maintenance of the metadata in the new format, the staff will implement updated versions of the Microsoft Word templates now used for reporting annual GDDS metadata and metadata updates, modified to incorporate DQAF three-digit topics. The new templates will be used for the SDDS quarterly metadata certification and update process as well as the annual GDDS metadata update process.
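
The phased migration described above (staff parsing, country review and approval, then posting) can be pictured as a simple sequence of states. The following is only a sketch under stated assumptions: the names `Status`, `MetadataEntry`, and `advance`, and the strictly linear ordering of steps, are hypothetical and do not describe any actual DSBB system.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PARSED = "parsed by staff into a DQAF indicator"
    UNDER_REVIEW = "forwarded to the country for review"
    APPROVED = "approved by the country"
    POSTED = "posted on the DSBB"


# Assumed linear workflow: each status has at most one successor.
_NEXT = {
    Status.PARSED: Status.UNDER_REVIEW,
    Status.UNDER_REVIEW: Status.APPROVED,
    Status.APPROVED: Status.POSTED,
}


@dataclass
class MetadataEntry:
    country: str
    indicator: str  # three-digit DQAF code, e.g. "3.1.2" (notation assumed)
    text: str
    status: Status = Status.PARSED
    history: list = field(default_factory=list)

    def advance(self) -> Status:
        """Move the entry to the next step, recording the step it leaves."""
        if self.status not in _NEXT:
            raise ValueError("entry is already posted")
        self.history.append(self.status)
        self.status = _NEXT[self.status]
        return self.status
```

An entry created for a country starts as `Status.PARSED` and reaches `Status.POSTED` after three calls to `advance()`. A fuller model would also allow a country to return an entry for revision, which this sketch omits.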

14. A future refinement of the process for reporting metadata would give SDDS subscribers and GDDS participants the option of using web forms rather than Word templates to report and update metadata. Still another option for future metadata capture would take advantage of the prospective adoption, by countries and international organizations, of the Statistical Data and Metadata eXchange (SDMX) standards, now under development, together with eXtensible Markup Language (XML) encoding and decoding protocols for displaying web pages and exchanging statistical information. The term for the combination of these protocols is SDMX-ML. In this scenario, the Fund could directly scan public websites that disseminate metadata organized according to the DQAF and encoded with SDMX-ML protocols, without needing a formal “reporting” mechanism. Section V elaborates on the SDMX initiative, which is being developed by an interagency consortium that includes the IMF.

15. On the DSBB, the SDDS and GDDS metadata will continue to be available in the current presentation formats familiar to DSBB users. As the underlying metadata will be structured according to the DQAF, another view of the metadata, based on the cascading structure of the DQAF, also will be provided on the DSBB. Appendix II contains preliminary mappings of the SDDS and GDDS metadata elements to the three-digit indicators of the DQAF.
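
The idea of serving two presentations from one underlying store can be sketched as follows. This is an illustration only: the indicator-to-dimension assignments in `DSBB_DIMENSION` are hypothetical placeholders (the actual mappings are those in Appendix II), and the function names and dotted code notation are assumptions.

```python
from collections import defaultdict

# Hypothetical assignments for illustration only; see Appendix II for the
# actual mapping of DQAF indicators to SDDS/GDDS dimensions.
DSBB_DIMENSION = {
    "1.2.1": "integrity",
    "3.1.1": "quality",
    "5.1.1": "access",
}


def dsbb_view(store):
    """Group indicator-keyed metadata by SDDS/GDDS dimension."""
    view = defaultdict(dict)
    for code, text in store.items():
        dimension = DSBB_DIMENSION.get(code)
        if dimension is not None:  # unmapped indicators fall outside SDDS/GDDS scope
            view[dimension][code] = text
    return dict(view)


def dqaf_view(store):
    """Nest the same metadata as dimension -> element -> indicator."""
    view = {}
    for code, text in sorted(store.items()):
        dim, elem, _ = code.split(".")
        view.setdefault(dim, {}).setdefault(f"{dim}.{elem}", {})[code] = text
    return view
```

Because both views read from the same indicator-keyed store, an update to one indicator's text appears in both presentations. Indicators with no SDDS/GDDS counterpart, such as some under dimension 0, appear only in the DQAF view.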

V. The SDMX Initiative

16. The SDMX initiative brings together several international organizations to foster greater efficiency in data and metadata exchange. The goal is to establish standards and foster best practices for exchanging statistical information, making data management and exchange more efficient for SDMX partners. The SDMX website (www.SDMX.org) provides the most current information on the various projects taking place under the auspices of the SDMX initiative.

17. SDMX standards comprise two distinct but complementary sets of standards. The technical standards provide, inter alia, the specifications for the formats for the exchange of SDMX-structured data and metadata. These SDMX Version 1.0 Standards are an approved technical specification of the International Organization for Standardization (ISO/TS 17369:2005 SDMX). The other component of SDMX standards, the content standards, is required to standardize and harmonize the use of specific concepts and terminologies when exchanging statistical information, a necessary step to encourage interoperability of exchange flows.

18. The cascading structure of the DQAF is at the core of the SDMX proposal for reference metadata5 content standards. This approach is consistent with the future work on the DQAF identified at the time of the Fifth Review,6 i.e., collaboration with international organizations with the aim of reconciling the existing quality frameworks. The seven international organizations7 involved in the SDMX are collaborating with national statistical agencies and central banks to develop a structure for metadata content and exchange that is derived from the three-digit indicators of the DQAF.

APPENDIX I Data Quality Assessment Framework—Generic Framework

(July 2003 Framework)

[Appendix I table images not reproduced.]

APPENDIX II Mapping of SDDS and GDDS Metadata to the Three-Digit DQAF

Table A1. SDDS and GDDS Data, Quality, Integrity, and Access Integrated with the DQAF

Three-digit DQAF topics covered by SDDS and GDDS Dimensions

[Table A1 images not reproduced.]

Table A2. SDDS Summary Methodology and GDDS Comprehensive Framework Integrated with the DQAF

Three-digit DQAF topics covered by SDDS Summary Methodology/ GDDS Comprehensive Framework Aspects

[Table A2 images not reproduced.]

1

The Acting Chair’s Summing Up—Fifth Review of the Fund’s Data Standards Initiatives—Executive Board Meeting 03/66—July 9, 2003, Public Information Notice No. 03/86, July 23, 2003 (PIN No. 03/86)

3

For the latest set of existing prompt points, see the GDDS Guide, Updated October 2004, Appendix V, at http://dsbb.imf.org/vgn/images/pdfs/gddsguide.pdf.

4

Currently, the DQAF applies to macroeconomic datasets for the real, fiscal, financial, and external sectors. Much of its structure, however, would apply with minor adaptation to other types of statistical datasets, such as those contained in the socio-demographic sector of the GDDS, that are usefully linked with the macroeconomic datasets in assessing and tracking economic and social development. To explore this possibility, Fund and World Bank staff could jointly examine development of a DQAF variant for this data sector.

5

The term “reference metadata” refers to metadata that provides information on every aspect of the data production cycle, such as data access, statistical concepts, compilation practices and methodologies, as well as agencies assuming responsibility for the production of data. SDDS and GDDS metadata are reference metadata.

7

The Bank for International Settlements, the European Central Bank, the IMF, the Organisation for Economic Cooperation and Development, the Statistical Office of the European Communities (Eurostat), the United Nations Statistics Division, and the World Bank.
