This glossary defines key terms used in the NexusOne documentation.
-
Apache Airflow
An open source platform used internally at NexusOne to author, schedule, and monitor tasks or
jobs represented as DAGs.
See also Engineer, Ingest, or the Apache Airflow repository.
-
Apache APISIX API Gateway
An open source platform used internally at NexusOne to manage traffic from microservices and
large language models.
See also the Apache APISIX API Gateway repository.
-
Apache Gravitino
An open source platform used internally at NexusOne to provide metadata management and governance
across data lakes and warehouses.
See also the Apache Gravitino repository.
-
Apache Iceberg
An open source table format used internally at NexusOne to provide a unified way to access data in a data lake.
See also Ask, Discover, Engineer, Quality, or the Apache Iceberg repository.
-
Apache Kyuubi
An open source platform that acts as a gateway so that you can run SQL queries against Apache Spark engines.
See also the Apache Kyuubi repository.
-
Apache Ranger
An open source platform used internally at NexusOne that uses policies to manage access control to a data platform.
See also the Apache Ranger repository.
-
Apicurio Schema Registry
An open source platform used internally at NexusOne to store, share, and manage the structured definitions
of data used in events or APIs.
-
Apache Spark
An open source platform used internally at NexusOne to provide a data processing layer for large data analytics.
See also the Apache Spark repository.
-
Apache Superset
An open source platform used internally at NexusOne that interactively analyzes and visualizes data.
See also Discover, Metabase, or the Apache Superset repository.
-
Ask
A NexusOne feature that generates SQL commands for interacting with ingested data, so you can gain
meaningful insights.
See also Insight Overview.
-
Bucket
A container that stores data in the Amazon S3 service.
See also Ingest.
-
Build
A NexusOne feature that launches a JupyterHub-based development environment.
See also Build Overview.
-
Catalog
A group of similar items or data. It has several meanings:
- DataHub: Groups related datasets and assets such as Topics, Views, or Dashboards, for easier discovery.
- Iceberg: Provides a consistent way to create, load, and drop tables across different storage systems.
- Trino: Defines a connection to a data source containing schemas and tables.
-
Certificate manager
A digital credential manager used internally at NexusOne to manage TLS certificates for web domain names in a Kubernetes cluster,
ensuring that web communication between clients and services is secure.
-
Cluster
A group of Kubernetes nodes and pods that run open source tools used by NexusOne.
-
Connect
A NexusOne feature that allows you to programmatically access ingested data. It also provides a single interface
to launch several user-facing apps hosted on the NexusOne platform.
See also Connect Overview.
-
CSV - Comma-Separated Values
A file format for storing tabular data by separating each value with a comma.
You can ingest CSV files into NexusOne.
See also Ingest.
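
As a minimal sketch of what comma-separated data looks like when read by a program, the following Python example parses a small CSV string with the standard library. The sample data and column names are hypothetical and aren't taken from NexusOne.

```python
import csv
import io

# Hypothetical sample data: a header line followed by two records.
text = "id,name\n1,Ada\n2,Grace\n"

# DictReader uses the first line as column names and yields one dict per record.
rows = list(csv.DictReader(io.StringIO(text)))
print(rows)
```

Every value is read as a string, so numeric columns need an explicit conversion before use.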
-
DAG - Directed Acyclic Graph
A graph of tasks linked by one-way dependencies with no cycles, so each task runs only after the tasks it depends on finish. At NexusOne, DAGs
execute when you ingest data or schedule SQL commands to run at specific times.
See also Apache Airflow, Engineer, or Ingest.
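
As a minimal sketch of the ordering a DAG imposes, the following Python example uses the standard library's graphlib module rather than Apache Airflow. The task names are hypothetical and aren't taken from a NexusOne DAG.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each key maps to the set of tasks it depends on.
dependencies = {
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}

# static_order() yields each task only after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

If the dependencies formed a cycle, static_order() would raise graphlib.CycleError, which is why the graph must be acyclic.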
-
Dashboard
A visual user interface displaying ingested or processed data as graphs, charts, or other visual elements.
See also Discover.
-
Database
A collection of data stored in tabular format and made available as a software package.
You reference a database as a data source when it’s ingested into NexusOne or when it’s
used as a Trino catalog by the Govern feature.
-
Data format
A specific way to encode data, such as CSV, Parquet, or other formats.
-
DataHub
An open source platform used internally at NexusOne to provide metadata for your ingested
data. This metadata displays the lineage of your data, such as which storage location stores
the data, and what Spark operation processed it.
See also Govern or the DataHub repository.
-
Data ingestion
See also Ingest.
-
Data insight
See also Insight.
-
Data lake
A centralized data store for all types of data, such as structured, semi-structured, and unstructured.
This allows users and applications to access and analyze the data from a single location.
-
Data mirroring
The process of ingesting a database into NexusOne and then creating a copy of it.
See also Ingest.
-
Data pipeline
An automated way to ingest and transform data.
-
Data product
A DataHub feature used to organize and manage tables, views, and other data assets.
-
Data source
A platform that stores data and provides it when requested. This could be different vendor databases,
such as MongoDB and PostgreSQL, or a data warehouse like Snowflake. On NexusOne, Trino provides access
to these data sources.
-
Data warehouse
A large, centralized data store that contains structured and semi-structured data. It’s often used for
analytics and reporting because it includes both current and historical data.
-
Debezium
An open source server used internally at NexusOne to capture table changes from an ingested database and
mirror them into another database.
See also the Debezium repository.
-
Discover
A NexusOne feature that launches the Metabase platform.
See also Discover Overview.
-
Domain
A logical way to categorize related data using DataHub.
See also Ask, Engineer, or Ingest.
-
Engineer
A NexusOne feature that transforms your data. It allows you to select one or more catalogs.
See also Engineer Overview.
-
ETL - Extract, Transform, and Load
A process that extracts data from multiple sources, transforms it, and then loads the result into a target data store.
See also Ingest.
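
As a minimal sketch of the three ETL stages, the following Python example uses only the standard library. The source data, the table name, and the in-memory SQLite target are assumptions made for illustration and don't reflect how NexusOne ingests data.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for an extracted file.
source = "city,temp_f\nOslo,50\nCairo,95\n"

# Extract: read records from the source.
rows = list(csv.DictReader(io.StringIO(source)))

# Transform: convert Fahrenheit to Celsius, rounded to one decimal place.
transformed = [
    (r["city"], round((float(r["temp_f"]) - 32) * 5 / 9, 1)) for r in rows
]

# Load: write the result to a target store (an in-memory SQLite database here).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weather (city TEXT, temp_c REAL)")
conn.executemany("INSERT INTO weather VALUES (?, ?)", transformed)
conn.commit()

loaded = conn.execute("SELECT city, temp_c FROM weather ORDER BY city").fetchall()
print(loaded)
```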
-
External DNS Manager
An open source tool for making Kubernetes resources managed by NexusOne discoverable to public domain name servers.
See also the External DNS Manager repository.
-
Govern
A NexusOne feature that manages user or group access to NexusOne features and data sources.
See also Govern Overview.
-
Group
A collection of users who belong to the same department or share similar responsibilities.
-
Health check
A way to verify if the API is operational.
-
IAM - Identity and Access Management
A sub-feature of the NexusOne Govern feature. It’s used to manage roles assigned to users,
groups, and tags.
See also Govern.
-
IAM group
See also Group.
-
IAM role
See also Role.
-
Ingest
The process of adding data to a system. It’s also a NexusOne feature used
to add a file, database, or lakehouse.
See also Ingest Overview.
-
Insight
Meaningful information obtained by examining data.
It’s also a NexusOne feature that analyzes data and presents insights.
See also Ask.
-
Job
An Apache Spark operation scheduled in Apache Airflow. This is often found when using the
Engineer, Ingest, or Monitor features on NexusOne.
See also Monitor.
-
JupyterHub
An open source platform that NexusOne launches when you use the Build feature. It’s helpful when
you are developing and testing DAGs that orchestrate ETL pipelines.
See also Build or the JupyterHub repository.
-
KEDA
An open source platform used for auto-scaling Kubernetes workloads managed internally at NexusOne.
See also the KEDA repository.
-
Keycloak
An open source platform that provides authentication to apps and users. At NexusOne, it’s used
to manage user access and authentication.
See also the Keycloak repository.
-
Kubernetes
An open source tool used to orchestrate apps deployed at NexusOne. All apps are internally managed
by the NexusOne team.
See also the Kubernetes repository.
-
Lakehouse
A data architecture that combines a data lake and a data warehouse.
See also Ingest.
-
LLM - Large Language Model
An artificial intelligence model trained on a huge amount of text data. In NexusOne, LLMs generate SQL
queries you can use to gather data insights, check data quality, or transform data.
See also Ask, Engineer, or Quality.
-
Metabase
A user-friendly open source platform used internally at NexusOne to interactively analyze and
visualize ingested data.
See also Apache Superset, Ask, Discover, or the Metabase repository.
-
Metadata
Data that describes other data.
See also Apache Gravitino and DataHub.
-
Monitor
A NexusOne feature for viewing all scheduled, running, failed, or completed jobs.
-
NexusOne
The brand name of the Nexus Cognitive software platform.
-
Ollama
An open source platform used internally at NexusOne to provide access to local LLMs from companies
such as OpenAI and Google.
See also Ask, Engineer, Quality, or the Ollama repository.
-
OpenFaaS - Open Function as a Service
An open source platform that allows the NexusOne team to deploy serverless functions on Kubernetes.
See also the OpenFaaS repository.
-
Open source
Software whose source code is publicly available and distributed under a license that permits use, modification, and redistribution.
NexusOne uses several open source platforms, and the team actively contributes to them as well.
-
ORC - Optimized Row Columnar
A file format that stores data as columns. Compared with Parquet files, ORC files are more efficient
for write-heavy workloads. You can ingest ORC files into NexusOne.
See also Ingest.
-
Parquet
A file format that you can ingest into NexusOne. Unlike row-based file formats such as CSV,
Parquet stores data as columns. This makes querying the data faster.
See also Ingest.
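
As a minimal sketch of why a column layout can speed up queries, the following Python example stores the same records row by row and column by column. It illustrates the idea only and doesn't reproduce Parquet's actual on-disk encoding; the records and field names are hypothetical.

```python
# Hypothetical records in a row-oriented layout: one dict per record.
rows = [
    {"id": 1, "city": "Oslo", "amount": 10},
    {"id": 2, "city": "Cairo", "amount": 25},
    {"id": 3, "city": "Lima", "amount": 5},
]

# Column-oriented layout: one list per field.
columns = {key: [row[key] for row in rows] for key in rows[0]}

# Row layout: every record is visited to read a single field.
total_from_rows = sum(row["amount"] for row in rows)

# Column layout: only the "amount" list is read.
total_from_columns = sum(columns["amount"])

print(total_from_rows, total_from_columns)
```

Both totals match, but the column layout reads only the values the query needs.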
-
Permission
See also Permission boundary.
-
Permission boundary
A way to restrict which NexusOne features are available for ingested data associated with a tag. You do this
by assigning specific roles to the tag.
See also Govern.
-
PyTorch
An open source Python library used internally at NexusOne to build and train deep learning models.
See also the PyTorch repository.
-
Quality
A NexusOne feature that allows you to check if your ingested data meets a defined goal.
See also Quality Overview.
-
Query
The process of retrieving or manipulating ingested data in NexusOne using the Trino SQL syntax.
See also Metabase.
-
Ray
An open source Python library used internally at NexusOne to scale machine learning workloads.
See also the Ray repository.
-
Redis
An open source server used internally at NexusOne for caching app data.
See also the Redis repository.
-
Reporting cycle
The process of consistently running a SQL command using the NexusOne Ask feature, so you can
gather data insights.
See also Ask.
-
Role
A set of permissions grouped under a name. Within NexusOne, the default roles represent specific
features such as ingesting or transforming data. Custom roles also exist; these combine
several default roles.
See also Govern.
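
As a minimal sketch of how a custom role can combine default roles, the following Python example models each role as a set of permissions. The role names, permission strings, and the combine_roles helper are hypothetical and don't represent the NexusOne Govern API.

```python
# Hypothetical default roles, each mapped to a set of permission names.
default_roles = {
    "ingest": {"ingest:read", "ingest:write"},
    "engineer": {"engineer:read", "engineer:write"},
}


def combine_roles(*names):
    """Return the union of the permissions granted by the named default roles."""
    permissions = set()
    for name in names:
        permissions |= default_roles[name]
    return permissions


# A custom role is the combined permissions of several default roles.
custom_role = combine_roles("ingest", "engineer")
print(sorted(custom_role))
```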
-
Rule
A SQL command scheduled to execute at specific times of the day using Apache Airflow.
See also Engineer and Quality.
-
S3
An Amazon cloud storage service used for storing large amounts of object data. NexusOne stores
ingested data in S3.
-
Schedule
The process of telling a task or job to run at specific times of the day.
See also Engineer or Ingest.
-
Schema
The structure of a database. When ingesting data into NexusOne, you must specify a schema.
See also Apache Superset or Metabase.
-
SQL - Structured Query Language
A programming language used to interact with ingested data. NexusOne only supports the
Trino SQL syntax.
See also Ask, Engineer, or Quality.
-
SSO - Single Sign-On
The process of logging into several apps within NexusOne using a single set of NexusOne credentials.
-
Table
A collection of data represented in rows and columns. When ingesting data into NexusOne,
you must specify a table.
See also Apache Superset or Metabase.
-
Tag
An optional name given to ingested data within NexusOne. NexusOne can assign a role to a tag
to ensure that the tagged data can only use specific NexusOne features.
See also Ingest or Govern.
-
Task
See Job.
-
Trino
An open source SQL query engine used within NexusOne.
See also Apache Superset or Metabase.
-
User
A logical identity assigned to employees interacting with NexusOne.
See also Govern.
-
Visualize
The process of seeing insights from ingested data on a user interface.
See also Apache Superset or Metabase.
-
XLSX - Excel Open XML Spreadsheet
A file format used to represent data. You can ingest XLSX files into NexusOne.
See also Ingest.
-
XML - eXtensible Markup Language
A file format used to represent data using elements.
You can ingest XML files into NexusOne.
See also Ingest.
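
As a minimal sketch of how elements structure XML data, the following Python example parses a small document with the standard library. The document and its element names are hypothetical and aren't taken from NexusOne.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: a root element containing two nested customer elements.
text = (
    "<customers>"
    "<customer><name>Ada</name></customer>"
    "<customer><name>Grace</name></customer>"
    "</customers>"
)

root = ET.fromstring(text)

# Collect the text of the name element inside each customer element.
names = [customer.findtext("name") for customer in root.findall("customer")]
print(names)
```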