The Build feature is a JupyterHub-based development environment integrated within NexusOne. Use it to develop
and test Apache Airflow DAGs that orchestrate ETL pipelines.
Build is directly connected to Apache Airflow through a shared DAGs directory, which allows you to create, modify,
and manage tasks that Apache Airflow can execute automatically.
Key features
Build offers flexible compute environments and shared directories that simplify developing and managing DAGs.
Environment options
When setting up the Build environment, you can choose between two compute profiles:
- Non-GPU profile: Designed for general-purpose data processing use cases.
- GPU-enabled profile: Designed for high-performance GPU-based use cases such as machine learning.
Each environment is temporary; it times out and is destroyed when you aren’t actively interacting with it.
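The exact resource limits and timeout depend on how your NexusOne administrators configured JupyterHub; the idle timeout is typically enforced by an idle-culling service. As a rough illustration only, the sketch below shows how two such profiles might be defined with KubeSpawner's profile_list option. The image names and limits are hypothetical and are not NexusOne settings.

```python
# jupyterhub_config.py sketch (illustrative only; all names and limits are
# hypothetical, not taken from NexusOne).
c.KubeSpawner.profile_list = [
    {
        "display_name": "Non-GPU profile",
        "description": "General-purpose data processing",
        "kubespawner_override": {
            "image": "example/build-notebook:latest",      # hypothetical image
            "cpu_limit": 2,
            "mem_limit": "8G",
        },
    },
    {
        "display_name": "GPU-enabled profile",
        "description": "High-performance workloads such as machine learning",
        "kubespawner_override": {
            "image": "example/build-notebook-gpu:latest",   # hypothetical image
            "extra_resource_limits": {"nvidia.com/gpu": "1"},  # request one GPU
        },
    },
]
```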
Default directories
When you launch a JupyterHub session, NexusOne launches a Kubernetes pod that spawns a dedicated Notebook server
for you. The environment is pre-configured with two default directories you can interact with: dags and utils.
DAGs
Directed Acyclic Graphs (DAGs) define specific tasks and the order in which they should run.
JupyterHub shares this directory with Apache Airflow. It contains all DAGs run within NexusOne
via the Ask, Engineer, or Ingest features.
When you create or update a DAG in Build, the file is automatically written to the shared DAGs directory.
Apache Airflow’s scheduler continuously monitors this directory. When it
detects a new or modified DAG, it makes the DAG available for execution.
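For illustration, here is a minimal sketch of a DAG file you could save into the shared dags directory. It assumes Airflow 2.4 or later (for the schedule parameter); the file name, DAG id, and task logic are hypothetical placeholders, not part of NexusOne.

```python
# Hypothetical dags/hello_etl.py: a minimal two-task DAG. Once this file is
# written to the shared dags directory, the Airflow scheduler picks it up on
# its next scan and makes it available for execution.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    print("extracting data")       # placeholder extract step


def transform():
    print("transforming data")     # placeholder transform step


with DAG(
    dag_id="hello_etl",            # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@daily",             # run once per day
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)

    extract_task >> transform_task  # extract runs before transform
```

How quickly a new DAG appears in Airflow depends on the scheduler's DAG directory scan interval configured for your deployment.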
Utils
The utilities directory, or utils, contains reusable scripts and tools, such as
Apache Spark, that support the DAGs.
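As an illustration, a helper module kept in utils might look like the sketch below. The module name, function, and paths are hypothetical, and it assumes PySpark is available in the environment.

```python
# Hypothetical utils/spark_helpers.py: a reusable PySpark helper that DAG
# tasks can call to perform a simple transformation.
from pyspark.sql import SparkSession


def run_transformation(input_path: str, output_path: str) -> None:
    """Read a CSV, drop rows with missing values, and write the result as Parquet."""
    spark = SparkSession.builder.appName("utils-example").getOrCreate()
    try:
        df = spark.read.csv(input_path, header=True, inferSchema=True)
        df.dropna().write.mode("overwrite").parquet(output_path)
    finally:
        spark.stop()
```

A DAG could then import the helper, for example with from utils.spark_helpers import run_transformation, assuming the utils directory is on the notebook server's Python path.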
Use cases
These examples show how different industries can use NexusOne’s Build capabilities:
- Financial services: Develop DAGs in Build that ingest customer transaction files and invoke Apache Spark to
perform data transformation actions (see the sketch after this list).
- Healthcare: Develop DAGs in Build that clean and validate patient records by triggering Apache Spark
to perform data transformation actions.
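As a rough sketch of the financial-services example, the DAG below waits for a transaction file and then submits a Spark job to transform it. It assumes the apache-airflow-providers-apache-spark package, a configured spark_default connection, and Airflow 2.4 or later; the file paths, IDs, and Spark script are hypothetical.

```python
# Hypothetical dags/transactions_etl.py: ingest a transaction file, then
# invoke Apache Spark to transform it.
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator
from airflow.sensors.filesystem import FileSensor

with DAG(
    dag_id="transactions_etl",                       # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Wait until the day's transaction file lands (path is hypothetical).
    wait_for_file = FileSensor(
        task_id="wait_for_transactions",
        filepath="/data/incoming/transactions.csv",
        poke_interval=300,
    )

    # Submit a Spark job kept in the utils directory (script name is hypothetical).
    transform = SparkSubmitOperator(
        task_id="transform_transactions",
        application="utils/transform_transactions.py",
        conn_id="spark_default",
    )

    wait_for_file >> transform
```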
Additional resources
For general instructions about how to ingest a file, schedule a data insight report,
or schedule a transformation rule in NexusOne, refer to the following: