DataHub best practices

This page describes recommended ways to use DataHub effectively. To maintain a high-quality, trustworthy data catalog, follow these best practices:

Assign owners to every dataset: Ensure each dataset has a clearly identified owner responsible for quality, access, and documentation.
Keep descriptions up to date: Maintain accurate descriptions at both the table and column levels so users can easily understand the dataset’s purpose and contents.
Use standardized glossary terms: Apply approved business terms consistently across datasets to promote shared understanding and improve searchability.
Tag datasets with relevant classifications: Use tags and classifications to support governance, discovery, and compliance workflows.
Review stale or deprecated datasets: Periodically audit unused or superseded datasets and mark them as deprecated when appropriate.
Monitor and maintain ingestion pipelines: Confirm that metadata ingestion pipelines run reliably and without errors so the catalog stays accurate and current.
Define and maintain data quality tests: Implement table-level and column-level tests for critical datasets to validate schema, freshness, null values, ranges, or business rules.
Automate test execution within pipelines: Run data quality tests automatically as part of ETL/ELT workflows or orchestration jobs to ensure consistent and reliable validation.
Investigate and resolve failures promptly: Use lineage and test failure details to diagnose root causes and coordinate remediation with upstream dataset owners.
Monitor historical data quality trends: Review test history and recurring failures to detect long-term quality issues and prevent downstream impact.
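
The data quality practices above (schema, freshness, null-value, and range tests run automatically in a pipeline) can be illustrated with a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the CheckResult type, the check_* helpers, the sample rows, and the thresholds are hypothetical and are not part of DataHub or of any DataHub API. In practice these checks would more likely be defined in a dedicated data quality tool, and this sketch does not cover how their results reach the catalog.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class CheckResult:
    """Outcome of one hypothetical data quality check."""
    name: str
    passed: bool
    detail: str


def check_schema(rows, expected_columns):
    """Table-level: every row exposes exactly the expected columns."""
    bad = [i for i, row in enumerate(rows) if set(row) != set(expected_columns)]
    return CheckResult("schema", not bad, f"{len(bad)} row(s) with unexpected columns")


def check_freshness(rows, ts_column, max_age, now):
    """Table-level: the newest timestamp is no older than max_age (rows must be non-empty)."""
    newest = max(row[ts_column] for row in rows)
    age = now - newest
    return CheckResult("freshness", age <= max_age, f"newest row is {age} old")


def check_not_null(rows, column):
    """Column-level: the column contains no missing values."""
    nulls = sum(1 for row in rows if row.get(column) is None)
    return CheckResult(f"not_null:{column}", nulls == 0, f"{nulls} null value(s)")


def check_range(rows, column, low, high):
    """Column-level: every non-null value lies within [low, high]."""
    out = sum(
        1 for row in rows
        if row.get(column) is not None and not (low <= row[column] <= high)
    )
    return CheckResult(f"range:{column}", out == 0, f"{out} value(s) outside [{low}, {high}]")


def run_checks(checks):
    """Return only the failed checks; a pipeline step could raise if any remain."""
    return [check for check in checks if not check.passed]


# Hypothetical sample data standing in for a critical dataset.
NOW = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
ROWS = [
    {"order_id": 1, "amount": 25.0, "updated_at": datetime(2024, 1, 2, 9, 0, tzinfo=timezone.utc)},
    {"order_id": 2, "amount": None, "updated_at": datetime(2024, 1, 2, 10, 0, tzinfo=timezone.utc)},
    {"order_id": 3, "amount": -5.0, "updated_at": datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc)},
]

results = [
    check_schema(ROWS, ["order_id", "amount", "updated_at"]),
    check_freshness(ROWS, "updated_at", timedelta(hours=24), NOW),
    check_not_null(ROWS, "amount"),
    check_range(ROWS, "amount", 0, 10_000),
]
failures = run_checks(results)
for failure in failures:
    print(f"FAILED {failure.name}: {failure.detail}")
```

Returning a list of failures rather than raising on the first one keeps every failing check visible in a single run, which makes it easier to investigate failures and coordinate with upstream dataset owners. An orchestration job could fail the run whenever the list is non-empty.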

Additional resources

To get an overview of DataHub, refer to the DataHub in NexusOne page.
For more details about DataHub, refer to the DataHub official documentation.
If you are using the NexusOne portal and want to learn how to launch DataHub, refer to the Govern page.
