Prerequisites
- Appropriate permission: nx1_ingest, nx1_monitor, nx1_s3_admin, airflow_user, superset_user, spark_sql, and trino_admin
- Ensure you are ingesting the files NexusOne currently supports.
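As a rough illustration of the permission prerequisite, the following sketch compares the roles assigned to a user against the roles listed above. It assumes that all seven roles are required and that you already have the user's assigned roles as a list of strings; `missing_roles` is a hypothetical helper, not part of NexusOne.

```python
# Roles listed in the prerequisites above.
# Assumption: every role in this set is required for file ingestion.
REQUIRED_ROLES = {
    "nx1_ingest",
    "nx1_monitor",
    "nx1_s3_admin",
    "airflow_user",
    "superset_user",
    "spark_sql",
    "trino_admin",
}


def missing_roles(assigned):
    """Return the required roles that are absent from `assigned`, sorted by name."""
    return sorted(REQUIRED_ROLES - set(assigned))


# Hypothetical example: a user who only has two of the roles.
print(missing_roles(["nx1_ingest", "airflow_user"]))
```

An empty result means that the user holds every role in the list.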
Add or upload a file
Add or upload a file so that NexusOne can process and analyze it.
Doing so automatically integrates the file into your data pipeline workflow for later use.
- Log in to NexusOne.
- From the NexusOne homepage, navigate to Ingest > File.
- Add a file:
  - If you are uploading a file, click Choose File and select the file that you want to upload.
  - If you are adding a public URL to a file, click Public File URL and enter the URL of the file that you want to ingest. For example, the file might be stored in an S3 bucket and exposed over HTTPS.
- Add ingest details. These fields define how NexusOne ingests the data and makes it accessible through a schema and table.
  - Enter a unique name for this ingest job. This name appears in the Monitor tab and in the Airflow portal for tracking.
  - Choose an existing database schema or enter a new schema name to create one.
  - Choose an existing table or enter a new table name.
  - Optional: Set how often the ingestion job should run. Schedule options include Run Once, Every 3 hours, Daily, Weekly, Monthly, and Quarterly.
  - Select a mode for how to store incoming records at the destination table.
    - Append: Add new records to the existing dataset.
    - Merge: Add new records and update existing records where applicable.
    - Overwrite: Replace all existing records.
  - Optional: Select a DataHub domain, for example, Company, Product, or Sales. This only applies if you previously created a domain on DataHub via the Govern feature on NexusOne.
  - Optional: Select or create one or more tags to label this job.
- After configuring all fields, click Ingest to submit the job.
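To make the difference between the three storage modes concrete, here is a minimal sketch that applies each mode to a small in-memory table. It is only an illustration of the semantics described above, not how NexusOne implements them. In particular, the `apply_mode` function and the use of an `id` column to match records during a merge are assumptions made for this example.

```python
from typing import Dict, List

Record = Dict[str, object]


def apply_mode(existing: List[Record], incoming: List[Record],
               mode: str, key: str = "id") -> List[Record]:
    """Return the table contents after storing `incoming` with the given mode.

    `key` is a hypothetical column used to match records in merge mode.
    """
    if mode == "append":
        # Keep every existing record and add the new ones after them.
        return existing + incoming
    if mode == "overwrite":
        # Discard all existing records and keep only the incoming ones.
        return list(incoming)
    if mode == "merge":
        # Update records whose key already exists, insert the rest.
        merged = {row[key]: row for row in existing}
        for row in incoming:
            merged[row[key]] = row
        return list(merged.values())
    raise ValueError(f"unknown mode: {mode}")


table = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
batch = [{"id": 2, "name": "B"}, {"id": 3, "name": "c"}]

for mode in ("append", "merge", "overwrite"):
    print(mode, apply_mode(table, batch, mode))
```

In this example, append leaves two rows with `id` 2, merge replaces the existing row with `id` 2 and adds the row with `id` 3, and overwrite keeps only the two incoming rows.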
View, trigger, or delete a job
When you schedule an ingested file, it runs as a job in Apache Airflow. You can view, trigger, or delete the job.
Additional resources
- To get instructions on how to ingest specific file types supported on NexusOne or visualize your dataset, see the following:
- For more information about the monitoring feature, refer to Monitor Overview.
- For more information about roles or permissions, refer to Govern Overview.