Integrate Tableflow with the AWS Glue Catalog in Confluent Cloud¶
Tableflow can publish the metadata of the Apache Iceberg™ tables it materializes to the AWS Glue Data Catalog, treating Glue as an external catalog. This makes the Iceberg tables accessible to any Iceberg-compatible query or compute engine that uses the AWS Glue Data Catalog.
These tables must be consumed as read-only tables.
Integrating with the AWS Glue Data Catalog keeps the external catalog's metadata up to date and consistent, while the Tableflow catalog remains the single source of truth.
The AWS Glue Data Catalog integrates with Tableflow at the cluster level, so all Tableflow-enabled topics in the cluster are automatically published as tables in Glue. As shown in the following diagram, the Glue database and table names map directly to the cluster ID and topic name, respectively.
Configure AWS Glue Catalog integration¶
The following steps show how to enable AWS Glue Catalog integration at the cluster level.
Important
Topics must be materialized for catalog synchronization to complete. Enable Tableflow on a topic before enabling your external catalog provider. Catalog sync remains in the pending state until Tableflow is enabled on at least one topic.
Open Confluent Cloud Console and navigate to the cluster that has the topics you want to sync with tables in Glue.
In the navigation menu, click Tableflow and scroll to the External Catalog Integrations section.
Click Add Integration.
Select AWS Glue as the catalog, provide a name to identify your catalog integration, and click Continue.
Click the link in Go to the Provider integration page, and in the Create new IAM Role in AWS dialog, click Confirm.
The Configure role in AWS provider integration page opens.
Select New Role and click Continue.
The Create permission policy in AWS page opens.
In the Select Confluent resource dropdown, select Tableflow Glue Catalog sync, copy the IAM Policy template, and click Continue.
The Create role in AWS and map to Confluent page opens.
Click the Create Role in AWS link to open the AWS IAM console.
Click Policies under Access Management and click Create Policy.
Using the IAM Policy template you copied previously, update the region and Glue account ID with your own, and create a new AWS IAM policy. Name the new policy, for example, tableflow-policy-glue-access.
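The authoritative policy template comes from the Confluent Cloud UI in the previous step. As a rough illustration of the shape of such a policy, the following sketch builds a Glue-access policy document; the region, account ID, and exact action list below are placeholder assumptions, not the real template.

```python
import json

# Illustrative only: copy the real policy template from the Confluent UI.
# Region, account ID, and the action list here are placeholder assumptions.
REGION = "us-east-1"              # replace with your region
GLUE_ACCOUNT_ID = "111122223333"  # replace with your Glue account ID

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                # Tableflow publishes table metadata, so it needs both
                # read and create/update access (assumed action set).
                "glue:GetDatabase",
                "glue:CreateDatabase",
                "glue:GetTable",
                "glue:CreateTable",
                "glue:UpdateTable",
            ],
            "Resource": [
                f"arn:aws:glue:{REGION}:{GLUE_ACCOUNT_ID}:catalog",
                f"arn:aws:glue:{REGION}:{GLUE_ACCOUNT_ID}:database/*",
                f"arn:aws:glue:{REGION}:{GLUE_ACCOUNT_ID}:table/*/*",
            ],
        }
    ],
}

print(json.dumps(policy, indent=2))
```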
Navigate to AWS IAM Roles and click Create Role.
Select Custom trust policy as the Trusted entity type.
Attach the permission policy you created previously, for example, tableflow-policy-glue-access, and save your new IAM role. Name the new role, for example, tableflow-catalog-role.
Copy the role ARN, which resembles arn:aws:iam::<xxx>:role/<your-role-name>.
Return to Cloud Console, and in the Map the role in Confluent section, paste the role ARN you copied into the AWS ARN textbox.
Enter a name for the provider integration, for example, tableflow-glue-pi.
Click Continue, and update the trust policy of the AWS IAM role using the trust policy presented in the UI.
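Use the exact trust policy presented in the UI; it contains the Confluent-side principal and an external ID specific to your integration. Purely to illustrate the general shape of such a trust policy, here is a sketch in which the principal ARN and external ID are placeholders:

```python
import json

# Illustrative shape only: the real trust policy, including the actual
# Confluent principal ARN and external ID, is shown in the Cloud Console UI.
CONFLUENT_PRINCIPAL_ARN = "arn:aws:iam::000000000000:role/confluent-side-role"
EXTERNAL_ID = "replace-with-external-id-from-ui"

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Allow the Confluent-side principal to assume this role,
            # scoped by an external ID condition.
            "Effect": "Allow",
            "Principal": {"AWS": CONFLUENT_PRINCIPAL_ARN},
            "Action": "sts:AssumeRole",
            "Condition": {"StringEquals": {"sts:ExternalId": EXTERNAL_ID}},
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
```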
Click Finish.
After you create the provider integration, Confluent Cloud can access your AWS Glue Catalog and publish Iceberg table metadata pointers to it.
In the catalog integration wizard, select the provider integration and proceed to the next step.
Review the configuration and launch the catalog integration.
Configure read-only access¶
The Iceberg tables materialized by Tableflow should be consumed as read-only tables. The following steps show how to locate the published tables and set the required permissions on AWS Glue and your S3 buckets.
Open the AWS Glue Catalog console.
Find the Iceberg table that was published in the previous steps as an AWS Glue table.
- The cluster ID maps to the AWS Glue database name.
- The Kafka topic name maps to the AWS Glue table name.
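The naming rule above can be sketched as a small helper; the cluster ID and topic name used here are hypothetical examples.

```python
def glue_identifiers(cluster_id: str, topic: str) -> tuple[str, str]:
    """Map a Tableflow-enabled topic to its Glue names.

    The cluster ID becomes the Glue database name and the Kafka
    topic name becomes the Glue table name.
    """
    return cluster_id, topic

# Hypothetical cluster ID and topic name:
database, table = glue_identifiers("lkc-abc123", "orders")
print(f"{database}.{table}")  # the qualified name a query engine would use
```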
You can query these tables from any analytics or compute engine that supports AWS Glue Data Catalog. For more information, see Query Data.
You must consume Tableflow Iceberg tables as read-only; ensure that downstream analytics engines have read-only access to them.
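One way to enforce this is to attach a read-only IAM policy to the role that your downstream analytics engines use. The following is a minimal sketch under stated assumptions: the bucket name, region, and account ID are placeholders, and the action list (Glue read actions plus S3 read access to the Iceberg data and metadata files) is illustrative, not an official template.

```python
import json

# Sketch only: bucket name, region, and account ID are placeholders.
REGION = "us-east-1"
ACCOUNT_ID = "111122223333"
BUCKET = "my-tableflow-bucket"

read_only_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Read-only access to Glue Data Catalog metadata
            "Effect": "Allow",
            "Action": [
                "glue:GetDatabase",
                "glue:GetDatabases",
                "glue:GetTable",
                "glue:GetTables",
                "glue:GetPartitions",
            ],
            "Resource": [
                f"arn:aws:glue:{REGION}:{ACCOUNT_ID}:catalog",
                f"arn:aws:glue:{REGION}:{ACCOUNT_ID}:database/*",
                f"arn:aws:glue:{REGION}:{ACCOUNT_ID}:table/*/*",
            ],
        },
        {
            # Read-only access to the Iceberg data and metadata files in S3
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                f"arn:aws:s3:::{BUCKET}",
                f"arn:aws:s3:::{BUCKET}/*",
            ],
        },
    ],
}

print(json.dumps(read_only_policy, indent=2))
```

Note that the policy grants only Get/List actions: no Create, Update, Delete, or Put, so an engine using this role cannot modify the tables.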