# Ingest a database

> Ingest a database into NexusOne by copying external database datasets into the platform.

Ingesting a database into the NexusOne platform means copying a dataset from an
existing database into the platform. Once ingested, the dataset becomes available
for your data pipeline workflow.

## Prerequisites

* Appropriate permissions: `nx1_ingest`, `nx1_monitor`, `nx1_s3_admin`, `airflow_user`, `superset_user`,
  `spark_sql`, and `trino_admin`
* A database that NexusOne currently supports

## Add datasets from a database

<Tabs>
  <Tab title="Web portal" icon="browser">
    Add datasets from a database so that NexusOne can process and analyze them.
    This automatically makes the datasets available for your data pipeline workflow.

    1. Log in to NexusOne.

    2. On the top navigation bar, hover your mouse over **Data Pipeline** and then select **Ingest**.

    3. Click **Database**.

    4. Add database details:

       * If you are using a query to select a dataset, then click **From Query** and enter the SQL query.
         When specifying the schema and table in your query, use the following format:

         ```bash theme={null}
         <database_name>.<schema>.<table>
         ```

       * If you are selecting a specific table in a dataset, then click **From Table** and enter the schema
         and table names in the **Source Schema** and **Source Table** fields.

       * Adding filters is optional.

    5. Add connection details:

       * In the **Database URL** field, enter a JDBC URL used to connect to the public database.
         The URL should have the following format:

         ```bash theme={null}
         jdbc:<database_vendor's_jdbc_driver_name>:<database_URL>:<database_port>/<database_name>
         ```

       * In the **Username** field, enter a username.

       * In the **Password** field, enter a password.

    6. Add ingest details:

       <Warning>Ensure that the values added to your Schema and Table adhere to Apache Spark's
       [identifiers](https://spark.apache.org/docs/latest/sql-ref-identifier.html).</Warning>

       * Enter a unique name for this ingest job. This name appears in the Monitor tab and Airflow's portal for tracking.
       * Choose an existing database schema or enter a new schema name to create one.
       * Choose an existing table or enter a new table name.
       * Optional: Set how often the ingestion job should run. Schedule options include `Run Once`,
         `Every 3 hours`, `Daily`, `Weekly`, `Monthly`, and `Quarterly`. The first run of the job (a DAG in Apache Airflow)
         starts automatically. Recurring runs follow your selected schedule option.
       * Select a mode for how to store incoming records at the destination table.
         * **Append**: Add new records to the existing dataset.
         * **Merge**: Add or update existing records where applicable.
         * **Overwrite**: Replace all existing records.
       * Optional: Select a DataHub domain, for example, `Company`, `Product`, or `Sales`. This applies only if you previously
         [created one on DataHub](https://docs.datahub.com/docs/domains) via the Govern feature on NexusOne.
       * Optional: Select or create one or more tags to label this job.

    7. Optional: Add column transformations:

       * Click **Add Transformation**, then select any of the following column transformation types:

         * **Cast**: Converts a column's data type during ingestion. To use this transformation type,
           enter the column name in the **Column** field, and then select a target type in the **Target Type** field.
         * **Drop**: Removes a column from the dataset. To use this transformation type, enter the column
           name in the **Column** field.
         * **Encrypt**: Makes the data unreadable using an encryption key. To use this transformation type,
           enter the column name in the **Column** field, and then optionally enter a key name in the **Key Name**
           field.
         * **Rename**: Changes a column's name. To use this transformation type, enter the column name in
           the **Column** field, and then enter a new column name in the **New Name** field.

       * Repeat until you have added all the transformations necessary for your use case.

    8. After configuring all fields, click **Ingest** to submit the job.
  </Tab>

  <Tab title="REST API" icon="code">
    Use the following API endpoint to:

    * [Submit a new data ingestion request](/api-reference/endpoints/data-ingestion/ingest).

    If you are selecting an existing domain, schema, table, or tags, then you can
    use the following API endpoints to:

    * [Get domains](/api-reference/endpoints/metastore/get-domains).
    * [Get schemas](/api-reference/endpoints/metastore/get-schemas).
    * [Get tables](/api-reference/endpoints/metastore/get-tables).
    * [Get tags](/api-reference/endpoints/metastore/get-tags).
  </Tab>
</Tabs>
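
The value formats described in the Web portal steps can be assembled and checked before you submit a job. The following Python sketch is illustrative only: the helper names (`qualified_table`, `jdbc_url`, `is_regular_identifier`) and all sample values are hypothetical and aren't part of NexusOne. The identifier check assumes Apache Spark's regular (unquoted) identifier rule of letters, digits, and underscores. Spark also accepts backtick-delimited identifiers, which this sketch doesn't cover. The exact JDBC URL syntax varies by vendor, so treat the output as the general shape rather than a URL guaranteed to work with your driver.

```python
import re

# Assumption: a Spark regular (unquoted) identifier contains only
# ASCII letters, digits, and underscores.
_REGULAR_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def is_regular_identifier(name: str) -> bool:
    """Return True if name contains only ASCII letters, digits, and underscores."""
    return _REGULAR_IDENTIFIER.fullmatch(name) is not None


def qualified_table(database: str, schema: str, table: str) -> str:
    """Build the <database_name>.<schema>.<table> form used in a From Query statement."""
    return f"{database}.{schema}.{table}"


def jdbc_url(driver: str, location: str, port: int, database: str) -> str:
    """Build jdbc:<driver>:<database_URL>:<database_port>/<database_name>."""
    return f"jdbc:{driver}:{location}:{port}/{database}"


# Hypothetical sample values for a PostgreSQL source.
source = qualified_table("sales_db", "public", "orders")
query = f"SELECT * FROM {source}"
url = jdbc_url("postgresql", "//db.example.com", 5432, "sales_db")

print(query)                                 # SELECT * FROM sales_db.public.orders
print(url)                                   # jdbc:postgresql://db.example.com:5432/sales_db
print(is_regular_identifier("orders_2024"))  # True
print(is_regular_identifier("orders-2024"))  # False
```

A name such as `orders-2024` fails the check because of the hyphen, which is the kind of value the warning in the ingest details step asks you to avoid.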

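To make the difference between the **Append**, **Merge**, and **Overwrite** modes concrete, here is a minimal Python sketch that models a destination table as a list of records. It doesn't show how NexusOne implements ingestion. The `write` function, the `id` merge key, and the sample records are assumptions made for illustration. In particular, the sketch assumes that Merge matches records on a single key column.

```python
from typing import Any

Record = dict[str, Any]


def write(existing: list[Record], incoming: list[Record], mode: str, key: str = "id") -> list[Record]:
    """Return the destination table contents after writing the incoming records."""
    if mode == "append":
        # Keep every existing record and add the incoming ones after them.
        return existing + incoming
    if mode == "overwrite":
        # Discard all existing records and keep only the incoming ones.
        return list(incoming)
    if mode == "merge":
        # Assumption: records match on `key`. Matches are updated, the rest are added.
        merged = {record[key]: record for record in existing}
        for record in incoming:
            merged[record[key]] = {**merged.get(record[key], {}), **record}
        return list(merged.values())
    raise ValueError(f"Unknown mode: {mode}")


# Hypothetical sample data.
existing = [{"id": 1, "status": "new"}, {"id": 2, "status": "new"}]
incoming = [{"id": 2, "status": "shipped"}, {"id": 3, "status": "new"}]

print(len(write(existing, incoming, "append")))     # 4
print(len(write(existing, incoming, "merge")))      # 3
print(len(write(existing, incoming, "overwrite")))  # 2
```

With these sample records, Append keeps both versions of record `2`, Merge keeps only the updated version, and Overwrite drops record `1` entirely.
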
## Monitor, trigger, or delete a job

When you schedule a database ingestion, it runs as a job in Apache Airflow.
You can monitor, trigger, or delete the job.

<Tabs>
  <Tab title="Web portal" icon="browser">
    1. When you create a job, a success message and a **View Jobs** button appear.
    2. Track the job status by clicking **View Jobs** or navigating to the NexusOne homepage
       and clicking **Monitor**.
    3. Use the three-dot (`...`) menu to trigger or delete a job.
    4. If you clicked **Trigger job**, then click the job's name to open its DAG details in Airflow's portal.
  </Tab>

  <Tab title="CLI" icon="square-terminal">
    Use the following `nx1` commands to:

    * [View all jobs](/cli-reference/nx1/jobs#list).
    * [View a job](/cli-reference/nx1/jobs#get).
    * [Trigger a job](/cli-reference/nx1/jobs#trigger).
    * [Delete a job](/cli-reference/nx1/jobs#delete).
  </Tab>

  <Tab title="REST API" icon="code">
    Use the following API endpoints to:

    * [View all jobs](/api-reference/endpoints/jobs/get-jobs).
    * [View a job](/cli-reference/nx1/jobs#get).
    * [Trigger a job](/api-reference/endpoints/jobs/trigger-job).
    * [Delete a job](/api-reference/endpoints/jobs/delete-job).
  </Tab>
</Tabs>

## Additional resources

* To understand how database ingestion works in NexusOne, refer to
  [How database ingestion works](/documentation/data-pipeline/overview/ingest).
* For more information about the monitoring feature, refer to the
  [Monitor Overview](/documentation/platform/overview/monitor) page.
* For more information about roles or permissions, refer to the
  [Govern Overview](/documentation/govern/overview) page.
