# Ingest a lakehouse

> Ingest a lakehouse into NexusOne by copying data between internal tables while preserving catalog and table format metadata for downstream querying.

In NexusOne, a lakehouse refers to an internal database with a [catalog](/platform-components/apache-hive-metastore)
and [table format](/platform-components/apache-iceberg). Ingesting a lakehouse means copying
previously ingested data from a table in that internal database into another table. This preserves
the table format and catalog metadata so that downstream apps like [Trino](/platform-components/trino)
can query it.

This guide walks you through how to ingest a lakehouse.

## Prerequisites

* The `nx1_ingest` permission.
* A dataset that you have previously [ingested](./ingest-a-file).

## Ingest a lakehouse

<Tabs>
  <Tab title="Web portal" icon="browser">
    Specify a table containing a previously ingested dataset, so NexusOne can copy
    it into another table.

    1. Log in to NexusOne.

    2. On the top navigation bar, hover over **Data Pipeline** and then select **Ingest**.

    3. Click **Lakehouse**.

    4. Add the details of the lakehouse table that contains the previously ingested data:
       * Enter a lakehouse schema name.
       * Enter a table name.

    5. Add the ingest details for the new table:

       <Warning>Ensure that the values you enter in the Schema and Table fields adhere to Apache Spark's
       rules for [identifiers](https://spark.apache.org/docs/latest/sql-ref-identifier.html).</Warning>

       * Enter a unique name for this ingest job. This name appears in the Monitor tab
         and Airflow's portal for tracking.
       * Enter a new schema name.
       * Enter a new table name.
       * Optional: Set how often the ingestion job should run. Schedule options include `Run Once`,
         `Every 3 hours`, `Daily`, `Weekly`, `Monthly`, and `Quarterly`.
       * Select a mode for how to store incoming records at the destination table.
         * **Append**: Add new records to the existing dataset.
         * **Merge**: Add or update existing records where applicable.
         * **Overwrite**: Replace all existing records.
       * Optional: Select a DataHub domain. For example, `Company`, `Product`, or `Sales`.
         This applies only if you previously
         [created one on DataHub](https://docs.datahub.com/docs/domains) via the Govern feature on NexusOne.
       * Optional: Select or create one or more tags to label this job.

    6. Optional: Add column transformations:

       * Click **Add Transformation**, then select any of the following column transformation types:

         * **Cast**: Converts a column's data type during ingestion. To use this transformation type,
           enter the column name in the **Column** field, and then select a target type in the **Target Type** field.
         * **Drop**: Removes a column from the dataset. To use this transformation type, enter the column
           name in the **Column** field.
         * **Encrypt**: Makes the data unreadable using an encryption key. To use this transformation type,
           enter the column name in the **Column** field, and then optionally enter a key name in the **Key Name**
           field.
         * **Rename**: Changes a column's name. To use this transformation type, enter the column name in
           the **Column** field, and then enter a new column name in the **New Name** field.

       * Repeat until you have added all the transformations necessary for your use case.

    7. After configuring all fields, click **Ingest** to submit the job.
  </Tab>

  <Tab title="REST API" icon="code">
    Use the following API endpoints to:

    1. [Get schemas](/api-reference/endpoints/metastore/get-schemas).
    2. [Get tables](/api-reference/endpoints/metastore/get-tables).
    3. [Submit a new data ingestion request](/api-reference/endpoints/data-ingestion/ingest).

    If you are selecting existing domains or tags, you can use the following
    API endpoints to:

    * [Get domains](/api-reference/endpoints/metastore/get-domains).
    * [Get tags](/api-reference/endpoints/metastore/get-tags).
  </Tab>
</Tabs>
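
The storage modes and column transformations described in the steps above can be illustrated with a small, self-contained Python sketch. This isn't NexusOne's implementation. The record layout, the `type`, `column`, `target_type`, and `new_name` keys, the `CASTERS` table, and matching merge records on a single `id` key are all assumptions made for illustration. **Encrypt** is left out because key handling is platform-specific.

```python
from typing import Any

Row = dict[str, Any]

# Hypothetical casters keyed by target type name; the target types
# offered in the portal may differ.
CASTERS = {"int": int, "float": float, "string": str}


def apply_transformations(row: Row, transformations: list[dict[str, str]]) -> Row:
    """Apply cast, drop, and rename steps to one record, in the order given."""
    out = dict(row)  # copy so the source record is left untouched
    for step in transformations:
        kind, column = step["type"], step["column"]
        if kind == "cast":
            out[column] = CASTERS[step["target_type"]](out[column])
        elif kind == "drop":
            out.pop(column, None)
        elif kind == "rename":
            out[step["new_name"]] = out.pop(column)
        else:
            raise ValueError(f"unsupported transformation: {kind}")
    return out


def write(existing: list[Row], incoming: list[Row], mode: str, key: str = "id") -> list[Row]:
    """Return the destination table's records after one run in the given mode."""
    if mode == "append":
        return existing + incoming  # keep everything, add the new records
    if mode == "overwrite":
        return list(incoming)  # replace all existing records
    if mode == "merge":
        # Assumption: records are matched on a single key column.
        merged = {row[key]: row for row in existing}
        for row in incoming:
            merged[row[key]] = row  # update a matching record or add a new one
        return list(merged.values())
    raise ValueError(f"unsupported mode: {mode}")


# Example data: two records already at the destination, two arriving from the source.
destination = [{"id": 1, "total": 10.0}, {"id": 2, "total": 20.0}]
source = [
    {"id": "2", "amount": "25.5", "note": "x"},
    {"id": "3", "amount": "7", "note": "y"},
]
steps = [
    {"type": "cast", "column": "id", "target_type": "int"},
    {"type": "cast", "column": "amount", "target_type": "float"},
    {"type": "rename", "column": "amount", "new_name": "total"},
    {"type": "drop", "column": "note"},
]
incoming = [apply_transformations(row, steps) for row in source]

for mode in ("append", "merge", "overwrite"):
    print(mode, write(destination, incoming, mode))
```

With this example data, **Append** leaves four records at the destination, including two with `id` 2. **Merge** leaves three, with the record for `id` 2 updated. **Overwrite** leaves only the two incoming records.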

## Monitor, trigger, or delete a job

When you schedule a lakehouse ingestion, it runs as a job in Apache Airflow.
You can monitor, trigger, or delete the job.

<Tabs>
  <Tab title="Web portal" icon="browser">
    1. When you create a job, a success message and a **View Jobs** button appear.
    2. Track the job status by clicking **View Jobs** or navigating to the NexusOne homepage
       and clicking **Monitor**.
    3. Use the three-dot (`...`) menu to trigger or delete a job.
    4. If you clicked **Trigger job**, click the job's name to open its DAG details in Airflow's portal.
  </Tab>

  <Tab title="CLI" icon="square-terminal">
    Use the following `nx1` commands to:

    * [View all jobs](/cli-reference/nx1/jobs#list).
    * [View a job](/cli-reference/nx1/jobs#get).
    * [Trigger a job](/cli-reference/nx1/jobs#trigger).
    * [Delete a job](/cli-reference/nx1/jobs#delete).
  </Tab>

  <Tab title="REST API" icon="code">
    Use the following API endpoints to:

    * [View all jobs](/api-reference/endpoints/jobs/get-jobs).
    * [View a job](/cli-reference/nx1/jobs#get).
    * [Trigger a job](/api-reference/endpoints/jobs/trigger-job).
    * [Delete a job](/api-reference/endpoints/jobs/delete-job).
  </Tab>
</Tabs>

## Additional resources

* For more information about the ingest feature, refer to [Ingest Overview](/documentation/data-pipeline/overview/ingest).
* For more information about the monitoring feature, refer to [Monitor Overview](/documentation/platform/overview/monitor).
* For more information about roles or permissions, refer to [Govern Overview](/documentation/govern/overview).
