Key features
Build offers flexible compute environments and shared directories that can make developing and managing DAGs simple.
Environment options
When setting up the Build environment, you can choose between two compute environments:
- Non-GPU profile: Designed for general-purpose data processing use cases.
- GPU-enabled profile: Designed for high-performance GPU-based use cases such as machine learning.
Default directories
When you launch a JupyterHub session, NexusOne launches a Kubernetes pod that spawns a dedicated Notebook server for you. The environment is pre-configured with two default directories you can interact with: dags and utils.
DAGs
Directed Acyclic Graphs, or dags, define specific tasks and the order in which they should run.
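To make the idea of "tasks plus ordering" concrete, here is a minimal sketch that uses only Python's standard library rather than Apache Airflow itself. The task names (extract, transform, load) and the helper execution_order are hypothetical and chosen purely for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical tasks: each key maps to the set of tasks that must
# finish before that task is allowed to start.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}


def execution_order(deps):
    """Return one valid run order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)  # ['extract', 'transform', 'load']
```

Because the graph must be acyclic, a dependency loop such as "load before extract" is rejected instead of producing an order.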
JupyterHub shares this directory with Apache Airflow. It contains all DAGs executed within NexusOne
via the Ask, Engineer, or Ingest feature.
When you create or update a DAG in Build, the file is automatically written to the shared DAGs directory.
Apache Airflow’s scheduler continuously monitors this DAG directory. When it detects a new or
modified DAG, it automatically makes the DAG available for execution.
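The detection step can be pictured as comparing two snapshots of the directory. The sketch below only illustrates that idea and is not how Apache Airflow implements it; the function names snapshot and changed_files are hypothetical, and it assumes DAG files are .py files stored directly in the directory.

```python
from pathlib import Path


def snapshot(dag_dir):
    """Map each .py file name in dag_dir to its last-modified time (ns)."""
    return {p.name: p.stat().st_mtime_ns for p in Path(dag_dir).glob("*.py")}


def changed_files(before, after):
    """Return the sorted names of files that are new or were modified."""
    return sorted(
        name for name, mtime in after.items() if before.get(name) != mtime
    )
```

A monitor built on this would take a snapshot, wait, take another, and pass both to changed_files to learn which DAG files need to be parsed again.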
Utils
The utilities directory, or utils, contains reusable scripts or tools, such as
Apache Spark, used to support the DAGs.
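As an illustration of the kind of reusable helper that could live in utils, the sketch below builds a spark-submit argument list that several DAG tasks could share. The function name spark_submit_command and the file names in the usage example are hypothetical; whether NexusOne launches Spark through spark-submit is also an assumption. Only the standard spark-submit options --name and --conf are used.

```python
def spark_submit_command(app_path, app_args=(), name=None, conf=None):
    """Build a spark-submit argument list without executing anything."""
    command = ["spark-submit"]
    if name:
        command += ["--name", name]
    # Sort the settings so the generated command is deterministic.
    for key, value in sorted((conf or {}).items()):
        command += ["--conf", f"{key}={value}"]
    command.append(app_path)   # the application file comes after the options
    command.extend(app_args)   # followed by the application's own arguments
    return command


# Hypothetical usage from a DAG task:
cmd = spark_submit_command(
    "clean_records.py",
    ["input.csv", "output.parquet"],
    name="clean-records",
    conf={"spark.executor.memory": "2g"},
)
print(" ".join(cmd))
```

Returning a list rather than a single string keeps each argument separate, so the result can be passed to subprocess.run without shell quoting concerns.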
Command line tools
Within the JupyterHub environment, Build comes pre-installed with several NexusOne command line tools, such as:
- nx1
- s3Cli
- Kyuubi batch-submit
Use cases
These examples show how different industries can use NexusOne’s Build capabilities:
- Financial services: Develop DAGs in Build that ingest customer transaction files and invoke Apache Spark to perform data transformation actions.
- Healthcare: Develop DAGs in Build that clean and validate patient records by triggering Apache Spark to perform data transformation actions.
Additional resources
- For general instructions about how to ingest a file, schedule a data insight report, or schedule a transformation rule in NexusOne, refer to the following:
- To learn about NexusOne’s command line tool, refer to s3Cli.