# Iceberg in NexusOne

> Overview of Apache Iceberg, a table format powering data storage in NexusOne.

Apache Iceberg is an open table format for large-scale analytical datasets
in data lakes. It provides ACID transaction guarantees, schema evolution, and
time travel capabilities.

Traditional table formats, like those used by Hive or Impala, tie metadata and
read/write transactions to a single compute engine, like Spark or Flink. Only that
engine can manage the tables while maintaining consistency. Iceberg instead
separates the metadata from the compute engine, allowing multiple engines to read
and write data concurrently while ensuring a consistent table state.
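
The following is a minimal sketch of the idea behind engine-independent commits, not Iceberg's actual implementation. It assumes a toy catalog that holds a single version pointer per table and accepts a commit only if the pointer still matches what the writer read. `ToyCatalog`, `CommitConflict`, and `commit_with_retry` are hypothetical names introduced for illustration.

```python
import threading
from dataclasses import dataclass, field


class CommitConflict(Exception):
    """Raised when the table changed between a writer's read and its commit."""


@dataclass
class ToyCatalog:
    """Hypothetical stand-in for a catalog: one metadata pointer for one table."""
    current_metadata: int = 0  # version number of the current metadata file
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def swap(self, expected: int, new: int) -> None:
        # Atomic compare-and-swap: succeed only if no other writer committed first.
        with self._lock:
            if self.current_metadata != expected:
                raise CommitConflict(
                    f"expected v{expected}, found v{self.current_metadata}"
                )
            self.current_metadata = new


def commit_with_retry(catalog: ToyCatalog, max_attempts: int = 3) -> int:
    """Commit on top of the latest version, retrying if another writer won."""
    for _ in range(max_attempts):
        base = catalog.current_metadata  # read the current table state
        try:
            catalog.swap(expected=base, new=base + 1)
            return base + 1
        except CommitConflict:
            continue  # refresh the base version and try again
    raise CommitConflict(f"gave up after {max_attempts} attempts")
```

Because the only shared state is the pointer swap, any engine that follows the same read-then-swap protocol can write to the table without coordinating with the others.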

Because NexusOne requires multi-compute engine access, strong consistency, and
long-term schema evolution in its data lakehouse, the NexusOne team adopts Iceberg as its
foundational table format. NexusOne also stores Iceberg metadata in a Hive
Metastore and data files in Amazon S3.

## Key features

* **ACID transactions**: Keep data consistent using serializable or snapshot
  isolation. Iceberg defaults to serializable isolation, so multiple
  jobs can read from and write to a table at the same time, with conflicting
  writes detected at commit and retried.
* **Schema evolution**: Add, drop, or rename columns without rewriting data,
  with schema changes tracked in table metadata, so all readers see the updated
  schema.
* **Partitioning flexibility**: Change partitioning schemes, for example, from
  region to date, without touching existing files. New data uses the new
  partitioning scheme, old data remains accessible, and queries scan only
  relevant partitions.
* **Time travel**: Query historical table states using snapshot IDs or
  timestamps for auditing, debugging, and reproducible analytics.
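
As a rough illustration of the features above, the following sketch builds Spark SQL statements for schema evolution, partition evolution, and time travel. It assumes a Spark session with the Iceberg SQL extensions enabled. The table name `nx1.analytics.events`, the column names, and the `time_travel_query` helper are hypothetical and not taken from the NexusOne environment.

```python
from datetime import datetime
from typing import Optional

# Hypothetical table name; substitute a table from your own catalog.
TABLE = "nx1.analytics.events"

# Schema evolution: metadata-only changes, no data files are rewritten.
SCHEMA_EVOLUTION = [
    f"ALTER TABLE {TABLE} ADD COLUMN device_type string",
    f"ALTER TABLE {TABLE} RENAME COLUMN device_type TO device",
    f"ALTER TABLE {TABLE} DROP COLUMN device",
]

# Partition evolution: new writes are partitioned by day, old files stay as-is.
# The region and event_ts columns are assumptions made for this example.
PARTITION_EVOLUTION = (
    f"ALTER TABLE {TABLE} REPLACE PARTITION FIELD region WITH day(event_ts)"
)


def time_travel_query(
    table: str,
    snapshot_id: Optional[int] = None,
    as_of: Optional[datetime] = None,
) -> str:
    """Build a SELECT that reads a past table state by snapshot ID or timestamp."""
    if (snapshot_id is None) == (as_of is None):
        raise ValueError("pass exactly one of snapshot_id or as_of")
    if snapshot_id is not None:
        return f"SELECT * FROM {table} VERSION AS OF {int(snapshot_id)}"
    return f"SELECT * FROM {table} TIMESTAMP AS OF '{as_of:%Y-%m-%d %H:%M:%S}'"
```

Each string can be passed to `spark.sql(...)` in a PySpark session. Snapshot IDs for a table are listed in its `snapshots` metadata table.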

## Iceberg components

Iceberg's architecture comprises the following components from top to bottom:

1. **Catalog**: Contains a pointer to an Iceberg table's metadata file
2. **Metadata file**: Contains all information about a table. It's a
   `metadata.json` file at the root of the following metadata hierarchy:
   1. **Snapshots**: Record the table's content at specific points in time.
      Each snapshot points to one manifest list
   2. **Manifest lists**: Track all manifest files in a snapshot
   3. **Manifest files**: Track data files
3. **Data files**: Parquet, ORC, or Avro files storing the table's data
4. **Storage layer**: S3, HDFS, or other object stores holding the metadata
   and data files
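
To make the top-to-bottom relationship concrete, here is a simplified in-memory model of the hierarchy. It is a sketch under stated assumptions, not Iceberg's on-disk format: the class names, the `data_files_for` function, and the sample paths are all hypothetical, and the catalog is reduced to a plain dictionary.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ManifestFile:
    data_files: List[str]  # paths of Parquet, ORC, or Avro files


@dataclass
class ManifestList:
    manifests: List[ManifestFile]  # every manifest that belongs to one snapshot


@dataclass
class Snapshot:
    snapshot_id: int
    timestamp_ms: int
    manifest_list: ManifestList


@dataclass
class MetadataFile:
    current_snapshot_id: int
    snapshots: List[Snapshot]


def data_files_for(
    catalog: Dict[str, MetadataFile], table: str, as_of_ms: Optional[int] = None
) -> List[str]:
    """Walk catalog -> metadata file -> snapshot -> manifests -> data files."""
    metadata = catalog[table]
    if as_of_ms is None:
        # Default read: follow the pointer to the current snapshot.
        snapshot = next(
            s for s in metadata.snapshots
            if s.snapshot_id == metadata.current_snapshot_id
        )
    else:
        # Time travel: latest snapshot committed at or before the timestamp.
        candidates = [s for s in metadata.snapshots if s.timestamp_ms <= as_of_ms]
        if not candidates:
            raise LookupError(f"no snapshot at or before {as_of_ms}")
        snapshot = max(candidates, key=lambda s: s.timestamp_ms)
    return [
        path
        for manifest in snapshot.manifest_list.manifests
        for path in manifest.data_files
    ]


# Two snapshots: the second one adds a manifest with one more data file.
first = Snapshot(1, 1_000, ManifestList([ManifestFile(["data/a.parquet"])]))
second = Snapshot(
    2,
    2_000,
    ManifestList([ManifestFile(["data/a.parquet"]), ManifestFile(["data/b.parquet"])]),
)
catalog = {"db.events": MetadataFile(current_snapshot_id=2, snapshots=[first, second])}
```

Calling `data_files_for(catalog, "db.events")` follows the current snapshot, while passing `as_of_ms=1_500` resolves to the first snapshot and returns only the file it tracked.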

## Environment configuration

This section describes how the NexusOne team configured Iceberg.
The NexusOne environment currently supports the following versions and
settings:

* **Iceberg version**: `1.8`
* **Table format version**: `2.0`
* **File format**: Defaults to Parquet, but also supports Avro and ORC
* **Catalog type**: Hive Metastore
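
For orientation, the following sketch shows how settings like these are commonly expressed as Spark catalog options and Iceberg table properties. It is an assumption about a typical setup, not NexusOne's actual configuration: the catalog name `nx1`, the Metastore URI, and the warehouse path are hypothetical placeholders, and the `iceberg_hive_catalog_conf` helper is introduced only for illustration.

```python
from typing import Dict


def iceberg_hive_catalog_conf(
    catalog_name: str, metastore_uri: str, warehouse: str
) -> Dict[str, str]:
    """Build Spark options for an Iceberg catalog backed by a Hive Metastore."""
    prefix = f"spark.sql.catalog.{catalog_name}"
    return {
        # Enables Iceberg-specific SQL such as ALTER TABLE ... ADD PARTITION FIELD.
        "spark.sql.extensions": (
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions"
        ),
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "hive",
        f"{prefix}.uri": metastore_uri,
        f"{prefix}.warehouse": warehouse,
    }


# Table properties matching the defaults listed above.
TABLE_PROPERTIES = {
    "format-version": "2",
    "write.format.default": "parquet",
}

# Hypothetical values; replace them with the ones for your environment.
conf = iceberg_hive_catalog_conf(
    catalog_name="nx1",
    metastore_uri="thrift://metastore.example.internal:9083",
    warehouse="s3a://example-bucket/warehouse",
)
```

Each key and value in `conf` can be applied with `SparkSession.builder.config(key, value)` before the session is created.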

## Additional resources

* To learn about best practices when using Iceberg in the NexusOne environment, refer to the
  [Iceberg best practices](./iceberg-best-practices) page.
* For more details about Apache Iceberg, refer to the
  [Apache Iceberg](https://iceberg.apache.org/docs/latest/spark-writes/) official documentation.
* For more details about how NexusOne integrates Apache Spark, refer to the
  [Apache Spark](/platform-components/apache-spark/spark-in-nx1) page.
