
# Ingest a file

> Ingest a file into NexusOne by uploading a local file or providing a public URL.

Ingesting a file into the NexusOne platform means uploading a local file or providing a public URL
to a file so the platform can process and analyze it. Once ingested, the file is automatically
integrated into your data pipeline workflow.

This guide walks you through how to ingest a file.

## Prerequisites

* Appropriate permissions: `nx1_ingest`, `nx1_monitor`, `nx1_s3_admin`, `airflow_user`, `superset_user`,
  `spark_sql`, and `trino_admin`
* Ensure you are ingesting the [files NexusOne currently supports](../../overview/ingest#how-file-ingestion-works).
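
Schema and table names entered during ingestion need to follow Apache Spark's identifier rules, so it can
help to check candidate names before you start. The following is a minimal sketch, not part of NexusOne.
It assumes you only use unquoted (regular) identifiers made of ASCII letters, digits, and underscores.
The function name `is_regular_identifier` is hypothetical, and the check does not cover reserved keywords
or backtick-delimited identifiers.

```python
import re

# Conservative pattern for an unquoted Spark SQL identifier (assumption):
# only ASCII letters, digits, and underscores, with at least one character.
_REGULAR_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def is_regular_identifier(name: str) -> bool:
    """Return True if `name` uses only letters, digits, and underscores."""
    return _REGULAR_IDENTIFIER.fullmatch(name) is not None


# Try a few candidate schema or table names.
for candidate in ["sales_2024", "sales-2024", "my table", ""]:
    print(f"{candidate!r}: {is_regular_identifier(candidate)}")
```

Names that fail this check, such as ones containing hyphens or spaces, are worth renaming before you enter
them in the **Schema** or **Table** fields.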

## Add or upload a file

<Tabs>
  <Tab title="Web portal" icon="browser">
    Add or upload a file so NexusOne can process and analyze it.
    The file is then automatically integrated into your data pipeline workflow for later use.

    1. Log in to NexusOne.

    2. On the top navigation bar, hover your mouse over **Data Pipeline** and then select **Ingest**.

    3. Click **File**.

    4. Add a file:

       * If you are uploading a file, then click **Choose File** and select the file that you want to upload.
       * If you are adding a public URL to a file, then click **Public File URL** and enter the public URL of the
         file you want to add. For example, the file might be in an S3 bucket and exposed over HTTPS.

    5. Add ingest details. These fields define how NexusOne ingests and accesses the data through a schema and table.

       <Warning>Ensure that the values added to your Schema and Table adhere to Apache Spark's
       [identifiers](https://spark.apache.org/docs/latest/sql-ref-identifier.html).</Warning>

       * Enter a unique name for this ingest job. This name appears in the Monitor tab and Airflow's portal for tracking.
       * Choose an existing database schema or enter a new schema name to create one.
       * Choose an existing table or enter a new table name.
       * Optional: Set how often the ingestion job should run. Schedule options include `Run Once`,
         `Every 3 hours`, `Daily`, `Weekly`, `Monthly`, and `Quarterly`.
       * Select a mode for how to store incoming records at the destination table.
         * **Append**: Add new records to the existing dataset.
         * **Merge**: Add or update existing records where applicable.
         * **Overwrite**: Replace all existing records.
       * Optional: Select a DataHub domain. For example, `Company`, `Product`, `Sales`.
         This is only applicable if you have one previously
         [created on DataHub](https://docs.datahub.com/docs/domains) via the Govern feature on NexusOne.
       * Optional: Select or create one or more tags to label this job.

    6. Optional: Add column transformations:

       * Click **Add Transformation**, then select any of the following column transformation types:

         * **Cast**: Converts a column's data type during ingestion. To use this transformation type,
           enter the column name in the **Column** field, and then select a target type in the **Target Type** field.
         * **Drop**: Removes a column from the dataset. To use this transformation type, enter the column
           name in the **Column** field.
         * **Encrypt**: Makes the data unreadable using an encryption key. To use this transformation type,
           enter the column name in the **Column** field, and then optionally enter a key name in the **Key Name**
           field.
         * **Rename**: Changes a column's name. To use this transformation type, enter the column name in
           the **Column** field, and then enter a new column name in the **New Name** field.

       * Repeat until you have added all the transformations necessary for your use case.

    7. After configuring all fields, click **Ingest** to submit the job.
  </Tab>

  <Tab title="CLI" icon="square-terminal">
    Use the following `nx1` commands to:

    * Add a [file from a source URL](/cli-reference/nx1/files) and submit a new data ingestion request.
    * Upload a [local file](/cli-reference/nx1/ingest-file) and submit a new data ingestion request.

    If you are selecting an existing domain, schema, table, or tags, then you can
    use the following `nx1` commands to:

    * [Get domains](/cli-reference/nx1/domains).
    * [Get schemas](/cli-reference/nx1/schemas).
    * [Get tables](/cli-reference/nx1/tables).
    * [Get tags](/cli-reference/nx1/tags).

    After getting the specific detail you want, you can use the value when submitting the
    new data ingestion request.
  </Tab>

  <Tab title="REST API" icon="code">
    Use the following API endpoints to:

    1. Add a [file from a source URL](/api-reference/endpoints/files/upload-file-url) or
       upload a [local file](/api-reference/endpoints/files/upload-file).
    2. [Submit a new data ingestion request](/api-reference/endpoints/data-ingestion/ingest).

    If you are selecting existing domains, schemas, tables, or tags, then you can
    use the following API endpoints to:

    * [Get domains](/api-reference/endpoints/metastore/get-domains).
    * [Get schemas](/api-reference/endpoints/metastore/get-schemas).
    * [Get tables](/api-reference/endpoints/metastore/get-tables).
    * [Get tags](/api-reference/endpoints/metastore/get-tags).
  </Tab>
</Tabs>
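
The three store modes described in the web portal steps can be modeled on plain Python lists. This is a
minimal sketch of the intended semantics only, not NexusOne's implementation. The `apply_mode` function and
the `id` key used to match rows in the Merge case are hypothetical, since this guide does not describe how
NexusOne decides which records match.

```python
from typing import Dict, List

Row = Dict[str, object]


def apply_mode(existing: List[Row], incoming: List[Row], mode: str, key: str = "id") -> List[Row]:
    """Model how each store mode combines incoming rows with existing rows."""
    if mode == "append":
        # Append: keep every existing row and add all incoming rows.
        return existing + incoming
    if mode == "overwrite":
        # Overwrite: discard existing rows and keep only incoming rows.
        return list(incoming)
    if mode == "merge":
        # Merge: update rows whose key matches, add rows whose key is new.
        # The matching key is an assumption made for this illustration.
        merged = {row[key]: row for row in existing}
        for row in incoming:
            merged[row[key]] = row
        return list(merged.values())
    raise ValueError(f"Unknown mode: {mode}")


existing = [{"id": 1, "city": "Oslo"}, {"id": 2, "city": "Lima"}]
incoming = [{"id": 2, "city": "Quito"}, {"id": 3, "city": "Cairo"}]

for mode in ("append", "merge", "overwrite"):
    print(mode, apply_mode(existing, incoming, mode))
```

With this sample data, Append yields four rows, including two rows with `id` 2. Merge yields three rows with
`id` 2 updated. Overwrite yields only the two incoming rows.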

## Monitor, trigger, or delete a job

When you schedule a file ingestion, it runs as a job in Apache Airflow.
You can monitor, trigger, or delete the job.

<Tabs>
  <Tab title="Web portal" icon="browser">
    1. When you create a job, a success message and a **View Jobs** button appear.
    2. Track the job status by clicking **View Jobs** or navigating to the NexusOne homepage
       and clicking **Monitor**.
    3. Use the three-dot (`...`) menu to trigger or delete a job.
    4. If you clicked **Trigger job**, then click the job's name to open its DAG details in Airflow's portal.
  </Tab>

  <Tab title="CLI" icon="square-terminal">
    Use the following `nx1` commands to:

    * [View all jobs](/cli-reference/nx1/jobs#list).
    * [View a job](/cli-reference/nx1/jobs#get).
    * [Trigger a job](/cli-reference/nx1/jobs#trigger).
    * [Delete a job](/cli-reference/nx1/jobs#delete).
  </Tab>

  <Tab title="REST API" icon="code">
    Use the following API endpoints to:

    * [View all jobs](/api-reference/endpoints/jobs/get-jobs).
    * [View a job](/cli-reference/nx1/jobs#get).
    * [Trigger a job](/api-reference/endpoints/jobs/trigger-job).
    * [Delete a job](/api-reference/endpoints/jobs/delete-job).
  </Tab>
</Tabs>

## Additional resources

* To get instructions on how to ingest specific file types supported on NexusOne or visualize your dataset, see the following:

  * [Upload a CSV file](/tutorials/ingest/file/upload-csv)
  * [Upload an XLSX file](/tutorials/ingest/file/upload-xlsx)
  * [Add a CSV file from a public URL](/tutorials/ingest/file/add-csv-from-a-public-url)
  * [Add an XLSX file from a public URL](/tutorials/ingest/file/add-xlsx-from-a-public-url)
  * [Add a Parquet file from a public URL](/tutorials/ingest/file/add-parquet-from-a-public-url)

* For more information about the monitoring feature, refer to [Monitor Overview](/documentation/platform/overview/monitor).

* For more information about roles or permissions, refer to [Govern Overview](/documentation/govern/overview).