Skip to main content
PostgreSQL is one of the database vendors supported on NexusOne. However, to ingest datasets from the database to NexusOne, you have to create the database instance in an on-prem or public cloud environment, make it publicly available, and then ingest it into NexusOne. This guide walks you through deploying a PostgreSQL database instance in a public cloud using Terraform, adding datasets into the database, and ingesting the database into NexusOne.

Prerequisites

  • An existing AWS, Azure, or Google Cloud account
  • Appropriate NexusOne permission: nx1_ingest, nx1_monitor, nx1_s3_admin, airflow_user, superset_user, spark_sql, and trino_admin
  • Command line tools from AWS, Azure, or Google Cloud installed
  • psql command line tool installed
  • Terraform installed

Authenticate to your public cloud

Use your public cloud’s command line tool to authenticate to your cloud account. This allows Terraform to interact with your cloud account.
The following AWS command requests that you enter the following AWS credentials:
  • Access key ID
  • Secret access key
  • Default region
# User account login
aws configure
To create an access key ID and secret access key, refer to How IAM users can manage their own access keys in the AWS documentation.

Define a Terraform and cloud provider

After authenticating to your cloud account using the command line tool, create the following Terraform file.
# Terraform provider
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">=6.32.0"
    }
  }
}

# AWS provider
provider "aws" {
  region = "us-west-2"
}
The defined cloud provider automatically uses the credentials you defined when interacting with the cloud provider’s command line tool.

Define a PostgreSQL instance

This step defines Terraform resources required in this tutorial for deploying the PostgreSQL database. Create the following Terraform file.
# Create the database instance
resource "aws_db_instance" "postgres" {
  allocated_storage   = 10
  engine              = "postgres"
  instance_class      = "db.t3.micro"
  username            = "psqladmin"
  password            = "NexusPostgres!"
  publicly_accessible = true
  skip_final_snapshot = true
}

# Output the database address
output "database_address" {
  value = aws_db_instance.postgres.address
}
  • Azure expects a globally unique PostgreSQL Flexible Server name. If you experience a ServerNameAlreadyExists error when creating the resource, then use a unique name.
  • Public access to the database is for demonstration purposes only.

Deploy the PostgreSQL instance

After defining the PostgreSQL Terraform resource specific to your cloud provider, take the following steps to deploy the Terraform resources.
  1. Initialize the Terraform configuration.
    terraform init
    
  2. Create the PostgreSQL instance.
    terraform apply --auto-approve
    
Once the resources are successfully created, Terraform outputs the database address.

Add datasets to the database

To ingest datasets from the database into NexusOne, these datasets need to exist first. Take the following steps to add datasets to the database:
  1. Authenticate to the database using the psql command line tool.
    psql \
    --host=<database_address> \
    --username=psqladmin \
    --dbname=postgres
    
    PostgreSQL creates a default database named postgres, which stores default schemas.
  2. When prompted, enter your password, NexusPostgres!.
  3. Create a new table inside the default public schema.
    CREATE TABLE public.test (
      id SERIAL PRIMARY KEY,
      name TEXT
    );
    
  4. Insert data into the table.
    INSERT INTO public.test (name)
    VALUES ('Nexus'), ('Cognitive');
    
  5. View all rows in the test table.
    SELECT * FROM public.test;
    

Ingest the database into NexusOne

Now that a dataset is available on the PostgreSQL database, you can ingest the database into NexusOne using the following steps:
  1. Log in to NexusOne.
  2. From the NexusOne homepage, navigate to Ingest > Database > From Table
  3. Enter the table and schema name:
    • Source Schema: public
    • Source Table: test
  4. Add connection details:
    • Database URL: jdbc:postgresql://<database_URL_or_public_IP>:5432/postgres
      • Port 5432 is PostgreSQL’s default port number.
      • PostgreSQL creates a default database named postgres, which stores default schemas.
    • Username: psqladmin
    • Password: NexusPostgres!
  5. Add ingest details:
    • Name: ingest_database
    • Schema: postgres_schema
    • Table: postgres_table
    • Schedule: Run Once
    • Mode: append
    • Tags: Don’t add any tags
  6. After configuring all fields, click Ingest to submit the job. Wait for a few minutes until you see a success message appear with an option to click View Jobs.

Monitor job

When you ingest the database, this creates an Airflow job. To monitor the status of the job, use the following steps:
  1. Click View Jobs, or navigate to the NexusOne homepage and then click Monitor.
  2. Find your job name, ingest_database, in the list, and watch its current status.
  3. Wait for a few minutes, and then refresh your browser until the status changes to Completed.

Visualize your dataset

Use the following steps to visualize your dataset:
  1. On the NexusOne homepage, click Discover to launch Superset.
  2. Hover your mouse over SQL, and then select SQL Lab.
  3. Enter the following command in the query box:
SELECT * FROM postgres_schema.postgres_table
visualize-postgres-v4-1-2

Visualize your dataset

Delete the table and Terraform resources

After completing the tutorial, delete the created table and PostgreSQL instance using the following steps:
  1. Delete the table.
    DROP TABLE public.test;
    
  2. Destroy the Terraform resources.
    terraform destroy --auto-approve
    

Additional resources