How file ingestion works
The file ingest feature lets you ingest files containing structured data into the platform. NexusOne supports two file ingestion options:
- Upload file: Files stored on your local machine
- Public file URL: A public URL to a file you’d like to upload. You might store this file in an S3 bucket and expose it over HTTPS.
Supported file formats
NexusOne currently supports these file formats:
- CSV
- Parquet
- ORC
- XML
- XLS/XLSX
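Before ingesting a file, it can help to confirm that it matches one of the formats above. The following is a minimal sketch, not part of NexusOne: the `SUPPORTED_EXTENSIONS` mapping and the `detect_format` helper are hypothetical names, and the check relies only on the file extension of a local path or public URL, not on the file contents.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# Hypothetical mapping from file extension to the format names listed above.
SUPPORTED_EXTENSIONS = {
    ".csv": "CSV",
    ".parquet": "Parquet",
    ".orc": "ORC",
    ".xml": "XML",
    ".xls": "XLS/XLSX",
    ".xlsx": "XLS/XLSX",
}


def detect_format(path_or_url: str) -> Optional[str]:
    """Return the supported format name for a local path or URL, or None."""
    # urlparse drops any query string from a URL and keeps the path component.
    path = urlparse(path_or_url).path
    suffix = PurePosixPath(path).suffix.lower()
    return SUPPORTED_EXTENSIONS.get(suffix)
```

For example, `detect_format("grades.xlsx")` returns `"XLS/XLSX"`, while a file such as `notes.txt` returns `None`.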
Use cases
These examples show how different industries can use NexusOne’s file ingestion and query capabilities:
- Financial services: Ingest Parquet-formatted market data feeds into NexusOne to monitor portfolio risk and run analytics on a single, secure platform without having to manage custom pipelines.
- Education: Ingest Excel-formatted grade books into NexusOne to store student records and analyze student performance trends.
How database ingestion works
The database ingest feature lets you ingest datasets from a public database into the NexusOne platform. NexusOne does this by connecting to the database using a JDBC URL, authenticating, and creating an Airflow job that queries a table and copies the results into NexusOne. You can query one table in a schema or several tables.
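The query-and-copy step can be pictured with a small stand-in. The sketch below is an illustration only: it uses in-memory SQLite connections in place of both the source database and NexusOne, and the `copy_table` and `copy_tables` helpers are hypothetical. The real feature connects over JDBC and runs as an Airflow job, neither of which is shown here.

```python
import sqlite3


def copy_table(source, destination, table):
    """Query every row of `table` from `source` and copy the results into `destination`."""
    cursor = source.execute(f'SELECT * FROM "{table}"')
    # Column names come from the query result, so the copy mirrors the source layout.
    columns = [col[0] for col in cursor.description]
    column_list = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    rows = cursor.fetchall()
    destination.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({column_list})')
    destination.executemany(
        f'INSERT INTO "{table}" ({column_list}) VALUES ({placeholders})', rows
    )
    destination.commit()
    return len(rows)


def copy_tables(source, destination, tables):
    """Copy one table or several tables; return the row count copied per table."""
    return {table: copy_table(source, destination, table) for table in tables}
```

Passing a single-element list copies one table, and passing several names copies each table in turn.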
Supported database vendors and their JDBC URLs
When you ingest a database into NexusOne, a JDBC URL is one of the options used to set up the connection between NexusOne and the database. The following table lists the supported database vendors and their JDBC URL formats:
| Database | JDBC URL format |
|---|---|
| Db2 | jdbc:db2://<database_URL_or_IP_address>:50001/<database_name> |
| MariaDB | jdbc:mariadb://<database_URL_or_IP_address>:3306/<database_name> |
| Microsoft SQL Server | jdbc:sqlserver://<database_URL_or_IP_address>:1433;databaseName=<database_name> |
| MySQL | jdbc:mysql://<database_URL_or_IP_address>:3306/<database_name> |
| Oracle | jdbc:oracle:thin:@//<database_URL_or_IP_address>:1521/<database_name> |
| PostgreSQL | jdbc:postgresql://<database_URL_or_IP_address>:5432/<database_name> |
- All port numbers specified here are defaults. If your database listens on a different port, change the port number accordingly.
- In PostgreSQL, a database name is different from a schema name. The default database name is postgres, and it stores the default schemas.
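The URL formats in the table can be assembled programmatically. The following is a minimal sketch under stated assumptions: the `JDBC_TEMPLATES` dictionary, its vendor keys, and the `build_jdbc_url` helper are hypothetical names, and the templates and default ports are copied from the table above.

```python
# URL templates and default ports, copied from the table above.
# The vendor keys are hypothetical identifiers chosen for this sketch.
JDBC_TEMPLATES = {
    "db2": ("jdbc:db2://{host}:{port}/{database}", 50001),
    "mariadb": ("jdbc:mariadb://{host}:{port}/{database}", 3306),
    "sqlserver": ("jdbc:sqlserver://{host}:{port};databaseName={database}", 1433),
    "mysql": ("jdbc:mysql://{host}:{port}/{database}", 3306),
    "oracle": ("jdbc:oracle:thin:@//{host}:{port}/{database}", 1521),
    "postgresql": ("jdbc:postgresql://{host}:{port}/{database}", 5432),
}


def build_jdbc_url(vendor, host, database, port=None):
    """Build a JDBC URL for a supported vendor, using the default port unless one is given."""
    try:
        template, default_port = JDBC_TEMPLATES[vendor.lower()]
    except KeyError:
        raise ValueError(f"Unsupported vendor: {vendor!r}") from None
    chosen_port = default_port if port is None else port
    return template.format(host=host, port=chosen_port, database=database)
```

For example, `build_jdbc_url("postgresql", "db.example.com", "postgres")` returns `jdbc:postgresql://db.example.com:5432/postgres`. Pass `port` explicitly when your deployment does not use the default.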
Use cases
These examples show how different industries can use NexusOne’s database ingestion and query capabilities:
- Financial services: Connect to a PostgreSQL or Oracle database containing market transactions so you can ingest its structured tables into NexusOne for centralized risk monitoring and analytics.
- Education: Connect to a MySQL or Microsoft SQL Server database that stores student records and grades so you can ingest its structured data into NexusOne for centralized student performance analysis.
How lakehouse ingestion works
A lakehouse is a data lake that behaves like a data warehouse. It stores all data as files in object storage, but adds a table format for structure and reliable updates. It also uses a catalog so query engines can quickly find and read the data they need.
In NexusOne, lakehouse refers to a data architecture that stores databases in object storage and uses a metastore and table format for query consistency. Ingesting a lakehouse means copying data from one table into another table. A table format such as Iceberg exposes the ingested table. After NexusOne copies the table, it creates new table format metadata for the copy so that downstream apps such as Trino can query it.
Use cases
These examples show how different industries can use NexusOne’s lakehouse ingestion capabilities:
- Financial services: Use lakehouse ingestion to copy previously ingested Parquet market data tables into new tables within NexusOne. This ensures that analysts can run risk calculations and portfolio analytics in a centralized table.
- Education: Use lakehouse ingestion to copy previously ingested Excel grade book tables into a new table. This ensures that administrators can centralize student records and analyze performance trends.
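The table-to-table copy described in this section resembles a `CREATE TABLE ... AS SELECT` statement in SQL. The sketch below only builds such a statement as a string and is an analogy, not a description of how NexusOne performs the copy. The `build_copy_statement` and `quote_identifier` helpers are hypothetical, as are the catalog, schema, and table names in the usage note.

```python
def quote_identifier(name):
    """Wrap an identifier in double quotes, escaping any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_copy_statement(source_table, target_table):
    """Build a CREATE TABLE AS SELECT statement.

    Both arguments are (catalog, schema, table) tuples.
    """
    source = ".".join(quote_identifier(part) for part in source_table)
    target = ".".join(quote_identifier(part) for part in target_table)
    # The new table receives every row and column of the source table.
    return f"CREATE TABLE {target} AS SELECT * FROM {source}"
```

For instance, calling the helper with `("iceberg", "raw", "market_data")` as the source and `("iceberg", "analytics", "market_data_copy")` as the target yields a statement that creates the copy under the `analytics` schema.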
Additional resources
- For full instructions about how to ingest a database in NexusOne, refer to How to ingest a database.
- For full instructions about how to ingest a file in NexusOne, refer to How to ingest a file.
- For full instructions about how to ingest a lakehouse in NexusOne, refer to How to ingest a lakehouse.