The Data Quality feature in the NexusOne platform enables you to set rules for your data so it adheres to data quality characteristics such as accuracy and consistency.
NexusOne does this by either providing recommended rules or allowing you to create custom rules.
Key features
The Data Quality feature supports the data quality characteristics described below, plus the following:
- Automatic SQL query generation: You can use natural language to generate SQL queries for quality checks, which helps both technical and non-technical users.
- Data quality rules: Create rules from the generated SQL queries to ensure data quality.
- Predefined queries: NexusOne provides suggested queries if you’re unsure of what to ask.
Characteristics of data quality
There are several characteristics of data quality. However, these are the most common ones:
- Accuracy: Data correctly represents its source.
- Completeness: Data contains all required fields; empty required fields automatically make it incomplete.
- Consistency: Data represents information uniformly within a table.
- Uniqueness: Data has no duplicate entries within a table.
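As a rough illustration of these characteristics, the following Python sketch checks a handful of rows for completeness, uniqueness, and consistency. The sample rows, field names, and helper functions are hypothetical and aren't part of NexusOne. Accuracy is left out because it can only be checked by comparing against the original source.

```python
from collections import Counter

# Hypothetical sample rows; the field names are illustrative only.
rows = [
    {"id": 1, "name": "Ada", "state": "GA"},
    {"id": 2, "name": "", "state": "CA"},     # empty required field
    {"id": 3, "name": "Ada", "state": "ny"},  # repeated name, lowercase state
]


def incomplete_rows(rows, required):
    """Completeness: rows where a required field is missing or empty."""
    return [r for r in rows if any(r.get(f) in (None, "") for f in required)]


def duplicate_values(rows, field):
    """Uniqueness: values that appear more than once in a column."""
    counts = Counter(r[field] for r in rows)
    return sorted(v for v, n in counts.items() if n > 1)


def inconsistent_rows(rows, field, is_uniform):
    """Consistency: rows whose value breaks the column's expected form."""
    return [r for r in rows if not is_uniform(r[field])]


def is_two_upper(value):
    """Assumed uniform form for the state column: two uppercase letters."""
    return len(value) == 2 and value.isupper()


incomplete = incomplete_rows(rows, ["name", "state"])
duplicates = duplicate_values(rows, "name")
inconsistent = inconsistent_rows(rows, "state", is_two_upper)
```

Each helper returns the offending rows or values rather than a pass/fail flag, which makes it easier to see why a table falls short of a given characteristic.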
Data quality rules
Data quality rules are conditions that ingested data must meet to be considered high quality.
These rules represent the previously described characteristics of data quality.
When you describe the end goal of a data quality rule using natural language, NexusOne provides several suggested SQL commands. You can then choose to accept or ignore them.
Here are a few examples:
- Verify that all values in the email column follow the valid email address format [email protected].
- Check that all values in the state column use two-letter state abbreviations only, such as GA, CA, or NY.
- Column title must be non-null and contain at least one character.
- Column type must equal the string TV Show.
- Ensure that the column name has no duplicates.
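To give a sense of what SQL for rules like these might look like, here is a minimal sketch that runs hand-written queries against an in-memory SQLite database using Python's built-in sqlite3 module. The queries aren't the ones NexusOne generates. The table name records, the sample data, and the SQLite dialect are assumptions made for illustration. Each query counts the rows (or, for the duplicate check, the values) that violate a rule. The email and state patterns are deliberately rough: they check the general shape of a value, not whether it's a real address or a real state.

```python
import sqlite3

# Each rule maps to a query that counts violations (0 means the rule passes).
RULES = {
    "email_format": (
        "SELECT COUNT(*) FROM records "
        "WHERE email IS NULL OR email NOT LIKE '%_@_%._%'"
    ),
    "state_two_letters": (
        "SELECT COUNT(*) FROM records "
        "WHERE state IS NULL OR state NOT GLOB '[A-Z][A-Z]'"
    ),
    "title_not_empty": (
        "SELECT COUNT(*) FROM records "
        "WHERE title IS NULL OR length(title) = 0"
    ),
    "type_is_tv_show": (
        "SELECT COUNT(*) FROM records "
        "WHERE type IS NULL OR type <> 'TV Show'"
    ),
    "name_unique": (
        "SELECT COUNT(*) FROM "
        "(SELECT name FROM records GROUP BY name HAVING COUNT(*) > 1)"
    ),
}


def run_rules(conn):
    """Run every rule and return a mapping of rule name to violation count."""
    return {name: conn.execute(sql).fetchone()[0] for name, sql in RULES.items()}


# Hypothetical sample data: the second row breaks every rule.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (email TEXT, state TEXT, title TEXT, type TEXT, name TEXT)"
)
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?, ?, ?)",
    [
        ("ada@example.com", "GA", "Show A", "TV Show", "alpha"),
        ("not-an-email", "Georgia", "", "Movie", "alpha"),
        ("bob@example.org", "NY", "Show C", "TV Show", "gamma"),
    ],
)
violations = run_rules(conn)
```

Writing each rule as a query that counts violations keeps the pass condition the same everywhere: a rule passes when its count is zero.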
Use cases
These examples show how different industries can use NexusOne’s data quality capabilities:
- Financial services: Perform post-validation on transformed data before storing it in a data warehouse. Sending inaccurate information can violate data privacy regulations and result in huge fines.
- Health: Perform post-validation on transformed data before storing a patient’s data in a data warehouse. If a medication allergy is missing from the data, then it could harm the patient after a drug prescription and eventually result in heavy fines.
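The post-validation step in both use cases can be pictured as a gate in front of the warehouse load. The Python sketch below is one possible shape for such a gate, not NexusOne's implementation. The function names, the allergies field, and the list standing in for a warehouse are all hypothetical.

```python
def failed_rules(rows, rules):
    """Return the names of rules that at least one row violates."""
    return [name for name, check in rules.items() if not all(check(r) for r in rows)]


def load_if_valid(rows, rules, load):
    """Call load(rows) only when every rule passes; return the failed rule names."""
    failures = failed_rules(rows, rules)
    if not failures:
        load(rows)
    return failures


# Hypothetical rule: every patient record must have an allergies entry
# (an empty list means "no known allergies"; None means the data is missing).
rules = {"allergies_recorded": lambda r: r.get("allergies") is not None}

patients = [
    {"patient_id": 1, "allergies": ["penicillin"]},
    {"patient_id": 2, "allergies": None},
]

warehouse = []  # stand-in for a real warehouse load
failures = load_if_valid(patients, rules, warehouse.extend)
```

Because the second record has no allergy data, the gate reports the failed rule and loads nothing, so the incomplete record never reaches the warehouse.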