POST /api/dataingest/submit
Ingest
curl --request POST \
  --url https://api.example.com/api/dataingest/submit \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "name": "<string>",
  "ingesttype": "file",
  "schema_name": "<string>",
  "table": "<string>",
  "mode": "append",
  "schedule": "<string>",
  "merge": "<string>",
  "column_transformations": [
    {
      "column": "<string>",
      "stream": "<string>",
      "new_name": "<string>",
      "encryption_key_name": "<string>"
    }
  ],
  "file_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "file_path": "<string>",
  "file_format": "<string>",
  "file_read_options": {},
  "jdbc_url": "<string>",
  "jdbc_username": "<string>",
  "jdbc_password": "<string>",
  "jdbc_schema": "<string>",
  "jdbc_table": "<string>",
  "jdbc_query": "<string>",
  "lakehouse_sourceschema": "<string>",
  "lakehouse_sourcetable": "<string>",
  "lakehouse_conditions": "<string>",
  "domain": "<string>",
  "tags": [
    "<string>"
  ],
  "owner_id": "<string>"
}
'
{
  "flow_url": "<string>",
  "job_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "error": "<string>"
}

Authorizations

Authorization
string
header
required

The access token received from the authorization server in the OAuth 2.0 flow.
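
As a minimal sketch of how the bearer token and JSON body fit together, the snippet below builds the request with Python's standard library without sending it. The host is the placeholder from the sample request above, and `build_request` is a hypothetical helper name, not part of the API.

```python
import json
import urllib.request

# Placeholder host taken from the sample request; replace with your deployment.
API_URL = "https://api.example.com/api/dataingest/submit"


def build_request(token, payload):
    """Build (but do not send) the POST request with a bearer token."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


req = build_request("my-token", {"name": "demo"})
# Sending is left to the caller, for example urllib.request.urlopen(req).
```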

Body

application/json

Request model for data ingestion jobs.

name
string
required

The name of the ingestion job.

ingesttype
enum<string>
required

Defines the type of data ingestion (Valid values: file, jdbc, lakehouse).

Available options:
file,
jdbc,
lakehouse

schema_name
string
required

The target schema where data will be stored.

table
string
required

The target table name for the data.

mode
enum<string>
required

Defines how the ingested data is written to the target table (Valid values: append, overwrite, merge).

Available options:
append,
overwrite,
merge

schedule
string | null

The cron expression that specifies when or how often the ingestion runs.
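
For illustration, two schedule values in the common five-field cron layout, with a rough shape check. This page does not state which cron dialect the service accepts (five fields, six fields, or presets), so the five-field form and the `looks_like_cron` helper are assumptions.

```python
def looks_like_cron(expr):
    """Rough shape check: five whitespace-separated fields (assumed dialect)."""
    return len(expr.split()) == 5


daily_2am = "0 2 * * *"        # every day at 02:00
every_15_min = "*/15 * * * *"  # every 15 minutes
```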

merge
string | null

Columns used to match and merge data when the mode is merge.
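
A client-side check of the required fields and enum values can catch mistakes before submitting. The sketch below uses only the field names and valid values listed on this page. The rule that merge mode needs a non-empty `merge` value is an assumption drawn from the description above, and the sample values are hypothetical.

```python
REQUIRED_FIELDS = ("name", "ingesttype", "schema_name", "table", "mode")
INGEST_TYPES = {"file", "jdbc", "lakehouse"}
MODES = {"append", "overwrite", "merge"}


def validate_common(payload):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not payload.get(field):
            problems.append(f"missing required field: {field}")
    ingesttype = payload.get("ingesttype")
    if ingesttype and ingesttype not in INGEST_TYPES:
        problems.append(f"invalid ingesttype: {ingesttype}")
    mode = payload.get("mode")
    if mode and mode not in MODES:
        problems.append(f"invalid mode: {mode}")
    # Assumption: merge mode needs at least one match column in "merge".
    if mode == "merge" and not payload.get("merge"):
        problems.append("mode is merge but no merge columns were given")
    return problems


payload = {
    "name": "orders_daily",
    "ingesttype": "file",
    "schema_name": "sales",
    "table": "orders",
    "mode": "merge",
    "merge": "order_id",  # hypothetical match column
}
problems = validate_common(payload)
```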

column_transformations
ColumnTransformation · object[] | null

Optional list of column transformations to apply during ingestion. Supports casting columns to different data types, renaming columns, and encrypting values using the vault_encrypt UDF.
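
A sketch of building `column_transformations` entries from the keys shown in the sample request (`column`, `new_name`, `encryption_key_name`). The `stream` key is left out because this page does not describe it. Omitting unset keys instead of sending nulls is an assumption, and the column and key names are hypothetical.

```python
def transformation(column, new_name=None, encryption_key_name=None):
    """Build one column_transformations entry, omitting unset keys."""
    entry = {"column": column}
    if new_name is not None:
        entry["new_name"] = new_name
    if encryption_key_name is not None:
        entry["encryption_key_name"] = encryption_key_name
    return entry


column_transformations = [
    transformation("cust_nm", new_name="customer_name"),    # rename a column
    transformation("ssn", encryption_key_name="pii-key"),   # hypothetical key name
]
```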

file_id
string<uuid> | null

Unique ID of the uploaded file.

file_path
string | null

S3 URL to the file you want to ingest.

file_format
string | null

The format of the file (Valid values: csv, orc, xml, parquet, xls).

file_read_options
File Read Options · object

Options for reading the file.
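
The file fields might be assembled as follows. The format list comes from this page. Requiring at least one of `file_path` or `file_id` is an assumption, and the bucket path and the `header` read option are hypothetical, since the accepted read options are not listed here.

```python
FILE_FORMATS = {"csv", "orc", "xml", "parquet", "xls"}


def file_source(file_format, file_path=None, file_id=None, read_options=None):
    """Return the file_* fields for a request with ingesttype "file"."""
    if file_format not in FILE_FORMATS:
        raise ValueError(f"unsupported file_format: {file_format}")
    # Assumption: the service needs a path or an uploaded file ID to locate the file.
    if file_path is None and file_id is None:
        raise ValueError("provide file_path or file_id")
    fields = {"file_format": file_format, "file_read_options": read_options or {}}
    if file_path is not None:
        fields["file_path"] = file_path
    if file_id is not None:
        fields["file_id"] = file_id
    return fields


fields = file_source(
    "csv",
    file_path="s3://example-bucket/orders/2024-01.csv",  # hypothetical path
    read_options={"header": "true"},                      # hypothetical option
)
```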

jdbc_url
string | null

Connection URL for the JDBC data source when ingesttype is jdbc.

jdbc_username
string | null

Username for the JDBC connection.

jdbc_password
string | null

Password for the JDBC connection.

jdbc_type
enum<string> | null

Type of JDBC ingestion (Valid values: table, query).

Available options:
table,
query

jdbc_schema
string | null

Source schema in the JDBC database.

jdbc_table
string | null

Source table in the JDBC database.

jdbc_query
string | null

SQL query used to extract data via JDBC.
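
The two `jdbc_type` values suggest two shapes of JDBC request: one that names a source table and one that supplies a query. The sketch below assumes the two are mutually exclusive, which this page does not state outright. `jdbc_source`, the connection URL, and the `JDBC_PASSWORD` environment variable are hypothetical.

```python
import os


def jdbc_source(url, username, password, schema=None, table=None, query=None):
    """Return the jdbc_* fields, choosing jdbc_type from the arguments given."""
    base = {"jdbc_url": url, "jdbc_username": username, "jdbc_password": password}
    if table is not None and query is None:
        fields = {**base, "jdbc_type": "table", "jdbc_table": table}
        if schema is not None:
            fields["jdbc_schema"] = schema
        return fields
    if query is not None and table is None:
        return {**base, "jdbc_type": "query", "jdbc_query": query}
    raise ValueError("give exactly one of table or query")


password = os.environ.get("JDBC_PASSWORD", "")  # avoid hard-coding credentials
url = "jdbc:postgresql://db.example.com:5432/shop"  # hypothetical source database

by_table = jdbc_source(url, "reader", password, schema="public", table="orders")
by_query = jdbc_source(url, "reader", password,
                       query="SELECT id, total FROM orders WHERE total > 100")
```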

lakehouse_sourceschema
string | null

Source schema in the lakehouse when ingesttype is lakehouse.

lakehouse_sourcetable
string | null

Source table in the lakehouse.

lakehouse_conditions
string | null

WHERE conditions for filtering lakehouse data (for example, salary > 2000, store_id = 1).
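
Put together, a lakehouse-to-lakehouse request body could look like the sketch below. All schema, table, and job names are hypothetical. The condition uses a single predicate because this page does not say how multiple conditions should be combined.

```python
import json

payload = {
    "name": "high_earners_copy",
    "ingesttype": "lakehouse",
    "schema_name": "analytics",         # target schema
    "table": "high_earners",            # target table
    "mode": "overwrite",
    "lakehouse_sourceschema": "hr",
    "lakehouse_sourcetable": "employees",
    "lakehouse_conditions": "salary > 2000",
}

# Serialized body, ready to send as the POST data.
body = json.dumps(payload, indent=2)
```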

domain
string | null

Domain to be set for the dataset in DataHub.

tags
string[] | null

Tags to be set for the dataset in DataHub.

owner_id
string | null

The email ID of the logged-in user.

Response

Data ingest submitted successfully.

flow_url
string
required

Airflow URL to access the ingestion flow.

job_id
string<uuid> | null

The unique ID of the created ingestion job.

error
string | null

Error message returned if the ingestion fails.
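
A caller might handle the response along these lines. The sketch treats a non-empty `error` as a failure. Whether the service also signals failure through the HTTP status code is not stated on this page, and the Airflow URL in the sample is hypothetical.

```python
import json


def parse_response(raw):
    """Parse the response body; raise if the service reported an error."""
    data = json.loads(raw)
    if data.get("error"):
        raise RuntimeError(f"ingestion submit failed: {data['error']}")
    return {"flow_url": data["flow_url"], "job_id": data.get("job_id")}


sample = (
    '{"flow_url": "https://airflow.example.com/flow/1", '
    '"job_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a", "error": null}'
)
result = parse_response(sample)
```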