This page provides hands-on examples of working with schemas, tables, joins, and aggregations in Trino.

Schema and table operations

Trino supports a variety of metadata operations for creating tables, defining views, and modifying existing schema definitions. These operations work directly with the underlying data lake storage and across available catalogs.
Certain operations require appropriate permissions or roles.

Create an Iceberg table

You can use CREATE TABLE to create a new table in a schema. The table can be empty or populated with the results of a query. The following example creates a new table summarizing employee counts and average salaries by job ID, providing a quick overview of compensation patterns.
CREATE TABLE demo.employee_salary_summary AS
SELECT job_id,
       COUNT(*) AS employee_count,
       AVG(salary) AS avg_salary
FROM employees1
GROUP BY job_id;
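
You can then query the new table like any other. For example, assuming the statement above succeeded, the following query lists job IDs from highest to lowest average salary:

SELECT *
FROM demo.employee_salary_summary
ORDER BY avg_salary DESC;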

Create a view

You can use CREATE VIEW to define a virtual table that represents a saved query. Views are useful for simplifying complex queries or providing controlled access to data. The following example defines a view for easy access to employees earning more than 5000, without modifying the underlying table.
CREATE VIEW demo.high_earning_employees AS
SELECT *
FROM employees1
WHERE salary > 5000;
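
A view is queried like a regular table. As a sketch, assuming the view above and the column names used elsewhere on this page, the following query lists the highest earners first:

SELECT first_name, last_name, salary
FROM demo.high_earning_employees
ORDER BY salary DESC;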

Alter a table

You can use ALTER TABLE to modify an existing table's structure, such as adding, dropping, or renaming columns. The following example adds a new bonus column to the employees1 table to store additional compensation data.
ALTER TABLE demo.employees1 ADD COLUMN bonus INTEGER;
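
Newly added columns contain NULL for existing rows. If the catalog supports row-level updates, as the Iceberg connector does, you can backfill a starting value; the value 0 here is only illustrative:

UPDATE demo.employees1
SET bonus = 0
WHERE bonus IS NULL;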

Available catalogs

The NexusOne Trino environment provides two catalogs:
  • Iceberg: Provides access to Iceberg tables stored in S3 or HDFS. Key features include the following:
    • Supports atomic commits, rollback, and schema evolution
    • Uses hidden partitioning and metadata-based pruning for efficient scans
    • Works well with cloud object storage
    • Allows time-travel queries using table snapshots
    • Serves curated, frequently queried analytical datasets
  • Hive: Provides access to tables in Hive-compatible formats, such as Parquet or ORC. A Hive Metastore stores the metadata, while the underlying files reside in S3 or HDFS. Key features include the following:
    • Directory-based partitioning
    • Supports multiple file formats and storage layouts
    • Used for raw, historical, or legacy datasets
    • Provides broad compatibility with existing Hadoop/Hive ecosystems
    • Serves foundational lake data or mixed-format workloads
The Iceberg and Hive catalogs appear directly in the Trino UI and in BI tools such as Superset, letting you browse schemas, tables, and metadata while performing analytical or federated SQL queries.
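
To explore the environment, you can list the catalogs from any SQL client and, with the Iceberg catalog, read a table as it existed at an earlier point in time. The following sketch assumes the employees1 table from the earlier examples; the timestamp is a placeholder you would replace with a time covered by the table's snapshot history.

SHOW CATALOGS;

SELECT *
FROM iceberg.demo.employees1
FOR TIMESTAMP AS OF TIMESTAMP '2025-01-01 00:00:00 UTC';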

Joins and aggregations

You can perform standard relational joins, cross-catalog joins, and apply optimization hints to improve execution when you know the size characteristics of your tables.

Create a simple join

Use simple joins when querying tables within the same catalog or schema. These joins follow standard SQL semantics and combine related datasets stored in a single source. The following example joins the employees1 and departments tables to list employee details with their department names.
SELECT e.employee_id,
       e.first_name,
       e.last_name,
       d.department_name
FROM employees1 e
JOIN departments d
ON e.department_id = d.department_id;

Create a cross-catalog join

Cross-catalog joins allow Trino to combine data from different catalogs, such as joining Iceberg tables with Hive tables or external databases. This enables federated analytics without moving or duplicating datasets. The following example joins the employees1 table in Iceberg with the departments table in Hive to list employee details along with their department names.
SELECT e.employee_id,
       e.first_name,
       e.last_name,
       d.department_name
FROM iceberg.demo.employees1 e
JOIN hive.default.departments d
ON e.department_id = d.department_id;

Use a join optimization hint

Join optimization is helpful when you have prior knowledge about the size or distribution of the tables involved in a join. In certain scenarios, such as joining a very large fact table with a small dimension table, Trino can improve performance by broadcasting the small table: Trino copies the small table to every worker node, and each node then joins its portion of the large table against the local copy, avoiding redistribution of the large dataset across the cluster. Trino controls this behavior through the join_distribution_type session property rather than comment-style hints. The following example sets the property to BROADCAST so that Trino replicates the build side of the join, here the departments table, to every worker.
SET SESSION join_distribution_type = 'BROADCAST';

SELECT e.employee_id,
       e.first_name,
       e.last_name,
       d.department_name
FROM employees1 e
JOIN departments d
ON e.department_id = d.department_id;
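
Joins combine naturally with aggregations. As a sketch using the same tables and columns as the earlier examples, the following query reports the employee count and average salary for each department:

SELECT d.department_name,
       COUNT(*) AS employee_count,
       AVG(e.salary) AS avg_salary
FROM employees1 e
JOIN departments d
ON e.department_id = d.department_id
GROUP BY d.department_name
ORDER BY avg_salary DESC;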

Additional resources

  • To get an overview of Trino, refer to the Trino in NexusOne page.
  • To learn about best practices when using Trino, refer to the Trino best practices page.
  • For more details about Trino, refer to the Trino official documentation.
  • If you are using the NexusOne portal and want to learn how to launch Trino, refer to the Launch a hosted app page.