Envision a company that has bad data but perfect process mapping. Such a company could lose revenue because inaccurate data leads to challenges such as higher resource consumption, higher maintenance costs, negative publicity on social media, and lower productivity. The company would need to erase its old data and collect new data, which would likely require spending more time and money. This is a red flag.
On the other hand, a company with good data but bad process mapping may still need to spend time rectifying those processes. It may need to reallocate resources to maintain its data quality, yet this scenario will not be as expensive as fixing downright bad data.
According to Gartner research, “organizations believe poor data quality to be responsible for an average of $15 million per year in losses.” Larger organizations that work with more customer, employee, supplier, and product data are at a higher risk of encountering poor data quality.
Seven Key Data Concepts
As a master data practitioner, I have some insight into how to work with data that I'd like to share through these seven key data concepts.
Fat Records: Businesses collect many attributes based on their requirements. Depending on the business, the material master, article master, business partner, and other data objects each need different attributes. For example, material master data typically uses more than 250 attributes, which may include details such as plant data, sales data, purchasing data, accounting data, warehouse data, etc.
A fat record holds many attributes. When business-critical attributes become too numerous and unstructured to manage effectively, neither the business nor the data custodian may know which attributes the business actually depends on. It is extremely important to keep the attributes of any business object concise.
Federation of Elements and Records: An organized view of data elements, federated by function, is a key concern for data custodians and the business as a whole. Grouping, classifying, and arranging a business entity's elements in a hierarchy makes assessment, auditing, and overhauling simple. The central data elements of an object should sit at the top of the schema, with the remaining attributes categorized by functional grouping, as in the sketch below.
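To make this concrete, here is a minimal sketch of a material master record with its core elements at the top and the remaining attributes federated into functional groups. The object, group names, and sample values are illustrative assumptions, not a standard layout:

```python
from dataclasses import dataclass, field

@dataclass
class MaterialCore:
    # Central data elements sit at the top of the schema.
    material_number: str
    description: str
    base_unit: str

@dataclass
class MaterialMaster:
    core: MaterialCore
    # Remaining attributes are federated into functional groups,
    # so each custodian can assess and audit one slice at a time.
    plant_data: dict = field(default_factory=dict)
    sales_data: dict = field(default_factory=dict)
    purchasing_data: dict = field(default_factory=dict)
    accounting_data: dict = field(default_factory=dict)
    warehouse_data: dict = field(default_factory=dict)

record = MaterialMaster(
    core=MaterialCore("MAT-1001", "Hex bolt M8", "EA"),
    plant_data={"plant": "1000", "mrp_type": "PD"},
    sales_data={"sales_org": "1000", "delivering_plant": "1000"},
)
```

Grouping attributes this way also tames fat records: each functional slice stays small enough to understand and own.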
Half-Finished Data: Information should be complete. Missing data can lead to misleading analysis and results. Businesses must spend time and resources as they attempt to recreate and recover lost data. Some businesses may not be willing to do so and, instead, may leave these records unused. This leads to setbacks in productivity timelines, loss of trust from customers, and data repair/replacement costs.
Data is a digital asset, and every part of it is valuable and indispensable. Suppose we have incomplete equipment master data, where core functional attributes such as type, function, capacity, and age are left undefined. This creates ambiguity about how the equipment can be used and could harm the business.
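A simple completeness check makes this tangible. The sketch below flags equipment records whose core functional attributes are undefined; the required-attribute list mirrors the example above and is an illustrative assumption:

```python
# Core functional attributes every equipment record must define
# (an illustrative list, not a standard).
REQUIRED_ATTRIBUTES = ("type", "function", "capacity", "age")

def missing_attributes(record: dict) -> list[str]:
    """Return the required attributes that are absent or empty."""
    return [a for a in REQUIRED_ATTRIBUTES if record.get(a) in (None, "")]

equipment = {"type": "pump", "function": "", "capacity": 450}
gaps = missing_attributes(equipment)
if gaps:
    print(f"Half-finished record, missing: {gaps}")
    # -> Half-finished record, missing: ['function', 'age']
```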
Ownership and Accountability: Unclear ownership and accountability lead to half-finished and unused records. Businesses with customized, effective governance models will have higher-quality data because management responsibilities, roles, accountabilities, data flow, and other guidelines are strictly defined and put into action. Governance operating models improve data coordination, leaving little room for mistakes.
To improve data quality, establish structured schemas, a hierarchy of responsibilities, process-flow documentation, and a clear set of dos and don'ts.
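One lightweight way to put ownership into action is a machine-readable registry that maps each data domain to an accountable owner and a responsible steward. This is a minimal sketch; the domain names and roles are illustrative assumptions, not a prescribed governance model:

```python
# Every data domain gets a named owner (accountable) and steward
# (responsible), so no record exists without a responsible party.
GOVERNANCE_REGISTRY = {
    "material_master":  {"owner": "supply_chain_lead", "steward": "mdm_team"},
    "business_partner": {"owner": "finance_lead",      "steward": "mdm_team"},
    "equipment_master": {"owner": "maintenance_lead",  "steward": "plant_data_team"},
}

def accountable_for(domain: str) -> str:
    """Fail loudly when a domain has no registered owner."""
    try:
        return GOVERNANCE_REGISTRY[domain]["owner"]
    except KeyError:
        raise LookupError(f"No governance entry for domain '{domain}'") from None
```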
Track and Profile: To avoid inconsistent, inappropriate, or half-finished data, we need data profiling and change tracking. These capture the who, what, when, and why needed to maintain quality data.
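Change tracking can start as simply as recording those four facts alongside every update. Here is a minimal sketch, assuming an in-memory audit log; the field layout is an illustrative assumption:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class ChangeEvent:
    who: str        # user or system that made the change
    what: str       # object key and attribute affected
    when: datetime  # timestamp of the change
    why: str        # business reason or ticket reference

audit_log: list[ChangeEvent] = []

def track(who: str, what: str, why: str) -> None:
    """Append a who/what/when/why entry to the audit trail."""
    audit_log.append(ChangeEvent(who, what, datetime.now(timezone.utc), why))

track("jdoe", "MAT-1001.base_unit: EA -> PC", "Ticket MDM-204: unit correction")
```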
Metadata, Schema, and Model: The foremost objective of a schema data model is to maintain an accurate, comprehensive representation of the objects in the application. A poor data model leads to poor data. Every data object has its own schema attributes and set of keys. Objects need to be keyed accurately, whether the keys are primary or foreign. We then need to identify and define the relationships and associations between objects across the entire database, creating organized schemas.
In an enterprise, objects can be a multitude of things, so it is crucial that detailed information regarding each entity is stored in the database and characterized into well-defined fields, called attributes.
Metadata is data about data, such as whether a field holds characters, text, or numerals. Once again, to avoid inconsistency, the model should have reference values, key mapping, hierarchies, classifications, groupings, etc.
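Here is a minimal sketch of accurate keying and explicit relationships, using SQLite from Python's standard library. The table and column names are illustrative assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce relationships

# Each object has its own schema: primary keys identify records,
# foreign keys define the association between objects.
conn.executescript("""
CREATE TABLE supplier (
    supplier_id TEXT PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE material (
    material_id TEXT PRIMARY KEY,
    description TEXT NOT NULL,
    supplier_id TEXT NOT NULL REFERENCES supplier(supplier_id)
);
""")

conn.execute("INSERT INTO supplier VALUES ('S-01', 'Acme Fasteners')")
conn.execute("INSERT INTO material VALUES ('MAT-1001', 'Hex bolt M8', 'S-01')")

# Inserting a material that references an unknown supplier now
# raises an IntegrityError, keeping the relationship consistent.
```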
Duplicate Records: Records can be duplicated in a system through many scenarios, and both the duplicates themselves and the effort of validating them waste time. Duplicates may result from unpredictable source data and inadequate reference and hierarchy data, as heterogeneous systems vary in how they represent special characters, punctuation, noise words, abbreviations, and identifiers. Transforming and substituting such variants may reduce the risk of creating duplicates.
Seven Key Sources of Duplicate Occurrences During Runtime
- Lack of ownership and accountability
- Lack of skills (ownership comes from skills)
- On-flight urgency
- Inconsistent change tracking and monitoring
- Absence of data profiling
- Bad configuration/setup
- Not having real-time data enrichment through third-party systems
Defining a desired level of matching across records and then identifying duplicates against that threshold can help correct and avoid duplicate entries, as in the sketch below.
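The following sketch combines both ideas using only Python's standard library: it normalizes punctuation, abbreviations, and case, then matches records against a similarity threshold. The substitution map and the 0.9 threshold are illustrative assumptions:

```python
import re
from difflib import SequenceMatcher

# Illustrative substitutions for common abbreviations and noise.
SUBSTITUTIONS = {"corp.": "corporation", "inc.": "incorporated", "&": "and"}

def normalize(name: str) -> str:
    """Lowercase, expand abbreviations, strip punctuation and extra spaces."""
    name = name.lower()
    for short, full in SUBSTITUTIONS.items():
        name = name.replace(short, full)
    name = re.sub(r"[^\w\s]", " ", name)  # drop remaining punctuation
    return " ".join(name.split())

def is_duplicate(a: str, b: str, threshold: float = 0.9) -> bool:
    """Flag two records as duplicates when similarity meets the threshold."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold

print(is_duplicate("ACME Corp. & Sons", "Acme Corporation and Sons"))  # True
```

Tightening or loosening the threshold is the practical lever here: a high value catches only near-exact matches, while a lower one surfaces more candidates for human review.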
Staying Focused on High-Quality Data
In the digital age, companies seem to be cold-shouldering relevant, skilled people and proven techniques, looking instead to new tools and technologies to acquire data. Yet the quality of an organization's information is key to its success. Avoiding these causes of poor data and bad processes can help us all focus on improving our data quality.
Register for the ASUG Experience for Enterprise Information Management (EIM) Oct. 28–30 in Minneapolis to learn from peers how to manage your data to meet business goals. Additionally, we welcome all ASUG members to submit their ideas for blog posts they want to write.