GoldenSource 101: Data Quality

Firms across the financial services industry are identifying measurable data traits that can be used to evaluate and quantify data quality.

What is data quality?

Data quality is a measure of the extent to which data is fit for purpose, meaning the data meets the needs of those who use it for the business processes they run, such as budgeting. Good data quality means high quality across several aspects of data: data content, data processes, data sets and analysis. The metrics used to measure, monitor and report on the quality of data typically include completeness, accuracy, uniqueness, consistency, validity, timeliness, age/temporality, granularity and structure/format. Companies across many industries are paying attention to the quality of their data because it affects the effectiveness and efficiency of their business activities.

The term can have different meanings depending on the industry. For financial services, investment and securities trading, data quality requires identifying the critical data elements that affect a firm's ability to do business professionally and in compliance with regulations. This most often means that data relating to clients, transactions, performance, financial instruments or products, and corporate actions must meet a defined standard. Any errors, gaps or obsolescence in such data have a direct impact on trades, relationships, reporting, risk management, accounting and compliance. The processes and controls put in place to manage the quality of data in financial services form part of the discipline of enterprise data management (EDM).

What are the key characteristics of data quality?

The characteristics of data quality listed below are all measurable:

  • Accuracy
  • Completeness
  • Precision
  • Validity
  • Consistency
  • Timeliness
  • Reasonableness
  • Temporality
  • Conformity
  • Integrity
  • Uniqueness

Several of these metrics appeared in the previous answer defining data quality. Although quantitative measurement of the quality of data might seem challenging, when seeking to improve quality it is important to agree on which characteristics matter and how they will be measured. A shared definition of the relevant characteristics is also needed, so that the meaning of each measurement is understood and can be acted upon. Setting up performance metrics, tools and controls reinforces the standards that a firm has set.

Importantly, recording and reporting quality over time enables firms to assess the return they’re getting from the investments they’ve made in improving and maintaining that quality. Performance metrics can also be enhanced through automation. Artificial intelligence, in particular, can be applied to quality testing to carry it out more reliably and efficiently. Using such “intelligent automation” can reduce risk and improve the credibility and integrity of the data.

Let’s further define some of the key metrics mentioned above, namely accuracy, consistency, timeliness, validity and uniqueness. Accurate data gives a true picture of the real-world facts it describes. Consistent data means that data stored in multiple locations is exactly the same wherever it’s found. Timely data is ready when it is needed or expected – for example, a quarterly earnings report that arrives late is lower quality. Valid data is correctly formatted for the information it represents, such as a date stored in the agreed format. Unique data is not accidentally or incorrectly duplicated.
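
As a concrete illustration, here is a minimal sketch of scoring a small instrument data set against several of these metrics, assuming a tabular feed with ISIN, price and price-date columns. The column names, the agreed date format and the reference price copy are assumptions made for the example, not GoldenSource functionality or a prescribed rule set.

```python
# Minimal sketch of scoring a data set against common quality metrics.
# Column names, formats and the reference copy are illustrative assumptions.
import pandas as pd

instruments = pd.DataFrame({
    "isin":       ["US0378331005", "US0378331005", "GB0002634946", None],
    "price":      [189.30, 189.30, 102.10, 55.00],
    "price_date": ["2024-03-28", "2024-03-28", "28/03/2024", "2024-03-27"],
})

# Completeness: share of identifier values that are populated.
completeness = 1 - instruments["isin"].isna().mean()

# Uniqueness: share of rows that are not exact duplicates of another row.
uniqueness = 1 - instruments.duplicated().mean()

# Validity: share of dates that parse in the agreed YYYY-MM-DD format.
parsed = pd.to_datetime(instruments["price_date"], format="%Y-%m-%d", errors="coerce")
validity = parsed.notna().mean()

# Timeliness: share of records stamped with the expected business date.
timeliness = (parsed == pd.Timestamp("2024-03-28")).mean()

# Consistency: prices should exactly match the copy held in a second store.
reference_prices = pd.Series([189.30, 189.30, 102.15, 55.00])
consistency = (instruments["price"] == reference_prices).mean()

print(f"completeness={completeness:.2f}  uniqueness={uniqueness:.2f}  "
      f"validity={validity:.2f}  timeliness={timeliness:.2f}  consistency={consistency:.2f}")
```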

What does it mean to have good data?

Having quality data means that your data is consistently fit for purpose: it satisfies service level agreements (SLAs), contractual obligations, policies and procedures, software system requirements, regulatory, legal and compliance requirements, plus common industry and technology standards.

Although we say that quality data is ‘fit for purpose’, there are many types of fitness for purpose, in part because there are many stakeholders, steps and systems involved in any financial services end-to-end process. Purposes include the data being fit for use in models, quoting a price, launching a fund, announcing a stock split, rebalancing a portfolio, reconciling collateral calls, settling a trade, client reporting, building a prospectus, calculating capital reserves and many other daily operations. This has resulted in data operations, enterprise data management and data science becoming career paths in financial services firms.

Being highly regulated and needing to transact smoothly with one another, financial services firms have been open to adopting common data standards and common approaches for their data. Examples include the SWIFT standards for payments and corporate actions messaging and the GLEIF-maintained legal entity identifier (LEI).
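
As a small illustration of how a shared identifier standard makes quality checks mechanical, the sketch below validates the 20-character LEI layout and its two ISO 7064 MOD 97-10 check digits. This is a generic example, not GLEIF or GoldenSource tooling, and the sample 18-character base value is hypothetical.

```python
# Minimal sketch of validating a legal entity identifier (LEI, ISO 17442).
# The sample base value below is made up, not a real entity.
import re

def lei_check_digits(base18: str) -> str:
    """Compute the two ISO 7064 MOD 97-10 check digits for an 18-character base."""
    numeric = "".join(str(int(ch, 36)) for ch in base18.upper() + "00")
    return f"{98 - int(numeric) % 97:02d}"

def lei_is_valid(lei: str) -> bool:
    """Check the 20-character format and that the whole value is 1 modulo 97."""
    lei = lei.strip().upper()
    if not re.fullmatch(r"[A-Z0-9]{18}[0-9]{2}", lei):
        return False
    numeric = "".join(str(int(ch, 36)) for ch in lei)  # A=10 .. Z=35
    return int(numeric) % 97 == 1

base = "529900EXAMPLE0000A"                    # hypothetical 18-character base
lei = base + lei_check_digits(base)
print(lei, lei_is_valid(lei))                  # valid by construction
print(base + "00", lei_is_valid(base + "00"))  # wrong check digits are rejected
```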

Efforts are still underway to improve data quality and data availability across the industry. Regulators have set standards for compliance with mixed success. The BCBS 239 principles were established because, during the 2008 financial crisis, many banks were unable to understand their exposure to failing institutions and organizations. However, as of the end of 2018 no banks had yet complied with BCBS 239, and there has been no update on compliance since then.

Anti-money laundering, Solvency II and Fundamental Review of the Trading Book (FRTB) regulations set out certain data provisions, including the appropriateness of data. FRTB, however, has been postponed twice and is now not effective until January 2023. The EDM Council, set up to encourage improvement in data standards throughout the financial services industry, has developed its Data Management Capability Assessment Model (DCAM) methodology to assess data management. The council’s most recent update of DCAM, version 2.2, was issued in 2021.

How do I improve the quality of my data?

Along with measurement and testing efforts, the operational sourcing, validation and processing of data actively improves its quality. At the level of any given process step, monitoring identifies whether data is acceptable to move to the next step. At the broader data operations level, the effectiveness of your overall data sourcing, validation and processing can be assessed and managed by tracking quality over time.
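
A minimal sketch of such a process-step gate follows, assuming quality scores have already been computed for a batch: records only move to the next step when the scores clear agreed thresholds. The metric names and threshold values are illustrative assumptions, not recommended limits.

```python
# Minimal sketch of a per-step quality gate: data only moves on when agreed
# thresholds are met. Metric names and thresholds are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class QualityResult:
    step: str
    scores: dict          # metric name -> score between 0 and 1
    passed: bool
    failures: list

THRESHOLDS = {"completeness": 0.99, "validity": 0.98, "uniqueness": 1.0}

def gate(step: str, scores: dict, thresholds: dict = THRESHOLDS) -> QualityResult:
    """Compare measured scores to thresholds and decide whether data may move on."""
    failures = [m for m, minimum in thresholds.items() if scores.get(m, 0.0) < minimum]
    return QualityResult(step, scores, passed=not failures, failures=failures)

result = gate("vendor_feed_ingest", {"completeness": 0.997, "validity": 0.95, "uniqueness": 1.0})
if result.passed:
    print(f"{result.step}: release to next step")
else:
    print(f"{result.step}: hold for remediation, failed {result.failures}")
```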

Sourcing presents quality concerns as data sources proliferate, including data generated by mobile apps and devices. Consider just a few sources: customer information, financial transactions and market feeds.

The greater number of sources, and the volumes they generate, creates an enterprise data management challenge. This is not only a matter of physically ingesting the data into your firm: it can never be assumed that data bought directly from a data vendor satisfies your quality needs.

Even if a data vendor is selected for its suitability in terms of the asset classes or economic indicators it covers, the data sets or feeds received will always need to be cleaned.

The next step, after getting sourcing right or to confirm that sourcing is being done correctly, is validation. Validation requires firms to review their data sets and feeds, find and mark errors or exceptions, and make corrections.
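
The sketch below illustrates validation in that spirit: it marks exceptions on a small vendor feed rather than silently dropping or fixing them, so corrections can be made and tracked. The field names, the ISIN pattern check and the price rule are assumptions chosen for the example.

```python
# Minimal sketch of validation as "find and mark" rather than silently discard.
# Field names and rules are illustrative assumptions.
import pandas as pd

feed = pd.DataFrame({
    "isin":  ["US0378331005", "XX123", None],
    "price": [189.30, -5.00, 102.10],
})

# Mark, rather than drop, records that break basic rules.
feed["exception"] = ""
bad_isin = ~feed["isin"].fillna("").str.fullmatch(r"[A-Z]{2}[A-Z0-9]{9}[0-9]")
feed.loc[bad_isin, "exception"] += "invalid_isin;"
feed.loc[feed["price"] <= 0, "exception"] += "non_positive_price;"

clean     = feed[feed["exception"] == ""]   # moves on to the next step
to_review = feed[feed["exception"] != ""]   # routed to data operations for correction
print(to_review[["isin", "price", "exception"]])
```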

However, validating and cleansing data just before distribution may catch isolated issues but does little to find and react to recurring quality problems. To prevent such recurring issues, firms should track data quality over time. Combining a daily focus with a big-picture view also makes it easier to see how mistakes are being caused.
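
A minimal sketch of that kind of tracking, assuming daily scores per feed are already being captured: it keeps a short history and flags metrics that breach their threshold repeatedly, which is the signature of a recurring upstream problem rather than an isolated slip. The window, threshold and repeat limit are arbitrary values chosen for the example.

```python
# Minimal sketch of tracking quality over time to spot recurring issues.
# Scores, thresholds and the repeat limit are illustrative assumptions.
from collections import defaultdict, deque

class QualityHistory:
    def __init__(self, window: int = 5, threshold: float = 0.98, repeat_limit: int = 3):
        self.threshold = threshold
        self.repeat_limit = repeat_limit
        self.history = defaultdict(lambda: deque(maxlen=window))  # (feed, metric) -> scores

    def record(self, feed: str, metric: str, score: float) -> None:
        self.history[(feed, metric)].append(score)

    def recurring_issues(self):
        """Return (feed, metric, breach count) for metrics that fail repeatedly."""
        issues = []
        for (feed, metric), scores in self.history.items():
            breaches = sum(1 for s in scores if s < self.threshold)
            if breaches >= self.repeat_limit:
                issues.append((feed, metric, breaches))
        return issues

tracker = QualityHistory()
for day_scores in ([0.97, 0.99], [0.96, 0.99], [0.95, 0.97]):   # three daily runs
    tracker.record("vendor_feed", "completeness", day_scores[0])
    tracker.record("vendor_feed", "validity", day_scores[1])

print(tracker.recurring_issues())   # completeness breached on all three days
```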

To truly improve data quality, it’s a good idea to perform advanced checks that go beyond these measures: checking that data reflects reality and is accurate, fixing errors before passing data on, confirming data against original documents, and making sure data comes from firsthand sources.
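
One way to read “confirming data against original documents” in practice is a cross-source comparison. The hedged sketch below checks vendor-supplied prices against a primary source and escalates differences above a tolerance; the sample records, source names and 0.1% tolerance are assumptions for illustration only.

```python
# Minimal sketch of confirming vendor data against a primary source.
# Records and the 0.1% tolerance are illustrative assumptions.
vendor_prices  = {"US0378331005": 189.30, "GB0002634946": 102.10}
primary_prices = {"US0378331005": 189.30, "GB0002634946": 101.40}

TOLERANCE = 0.001   # relative difference allowed before a record is escalated

for isin, vendor_px in vendor_prices.items():
    primary_px = primary_prices.get(isin)
    if primary_px is None:
        print(f"{isin}: no primary-source record to confirm against")
        continue
    diff = abs(vendor_px - primary_px) / primary_px
    status = "ok" if diff <= TOLERANCE else "escalate"
    print(f"{isin}: vendor={vendor_px} primary={primary_px} diff={diff:.3%} -> {status}")
```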

Taking the next steps

Leveraging automation for oversight of and insight into the quality of data, and having real-time views of data characteristics, is the direction in which most financial firms are moving. With the mass adoption of outsourcing and cloud services, quality needs to be managed remotely as well as on internal databases and operational data stores. Dashboards are now essential tools in any data operations environment, providing oversight and control of the data and of every process, from bringing it into the firm to making it available to data consumers, be they people or systems. All along the data’s value chain – in analysis, modeling, calculation, reporting or monetizing – your initial and ongoing efforts will help ensure the integrity and efficiency of the business.
