Python transform basics

Iceberg tables can be used as inputs and outputs in Python transforms through the transforms.tables API, which is provided by the transforms-tables package.

This page provides code examples for the fundamentals of working with Iceberg table inputs and outputs in Python transforms.

Example: Generate a simple Iceberg table

    from transforms.api import transform, TransformContext
    from transforms.tables import TableOutput, TableTransformOutput


    @transform(
        output=TableOutput("/.../Output")
    )
    def compute(ctx: TransformContext, output: TableTransformOutput):
        df_custom = ctx.spark_session.createDataFrame([["Hello"], ["World"]], schema=["phrase"])
        output.write_dataframe(df_custom)

Example: Iceberg table output, Iceberg table input

    from transforms.api import transform
    from transforms.tables import TableInput, TableOutput, TableTransformInput, TableTransformOutput


    @transform(
        source_table=TableInput("/.../Input"),
        output_table=TableOutput("/.../Output")
    )
    def compute(source_table: TableTransformInput, output_table: TableTransformOutput):
        output_table.write_dataframe(source_table.dataframe())

Example: Iceberg table output, dataset input

    from transforms.api import transform, Input, TransformInput
    from transforms.tables import TableOutput, TableTransformOutput


    @transform(
        source_dataset=Input("/.../Input"),
        output_table=TableOutput("/.../Output")
    )
    def compute(source_dataset: TransformInput, output_table: TableTransformOutput):
        output_table.write_dataframe(source_dataset.dataframe())

Example: Dataset output, Iceberg table input

    from transforms.api import transform, Output, TransformOutput
    from transforms.tables import TableInput, TableTransformInput


    @transform(
        source_table=TableInput("/.../Input"),
        output=Output("/.../Output")
    )
    def compute(source_table: TableTransformInput, output: TransformOutput):
        output.write_dataframe(source_table.dataframe())
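
The input-to-output examples on this page share one shape: the compute function reads a DataFrame from its input with dataframe() and passes it to write_dataframe() on its output. The sketch below illustrates that shape outside Foundry using plain-Python stand-ins. StubInput and StubOutput are hypothetical classes invented for this illustration and are not part of transforms.api or transforms.tables. A list of dicts stands in for a Spark DataFrame, and the @transform decorator is omitted on the assumption that it needs the Foundry runtime.

```python
# Hypothetical stand-ins for illustration only.
# They are NOT part of transforms.api or transforms.tables.
class StubInput:
    """Mimics the one input method used in the examples: dataframe()."""

    def __init__(self, rows):
        self._rows = rows

    def dataframe(self):
        # Return a copy so callers cannot mutate the stored rows.
        return list(self._rows)


class StubOutput:
    """Mimics the one output method used in the examples: write_dataframe()."""

    def __init__(self):
        self.written = None

    def write_dataframe(self, df):
        # Record what was written so it can be inspected afterwards.
        self.written = df


def compute(source_table, output_table):
    # Same body as the table-to-table example, minus the @transform decorator.
    output_table.write_dataframe(source_table.dataframe())


# A list of dicts stands in for a Spark DataFrame here.
source = StubInput([{"phrase": "Hello"}, {"phrase": "World"}])
output = StubOutput()
compute(source, output)
print(output.written)
```

In these examples the compute body calls only those two methods, which is why it reads the same whether the input or output is a dataset or an Iceberg table. In a real repository, the parameter types come from the imports shown in each example.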