Virtual tables overview

Virtual tables allow you to query and write to tables in supported data platforms without storing the data in Foundry.

You can interact with these tables in Python transforms using the transforms-tables library.

Prerequisites

To interact with virtual tables from a Python transform, you must:

  1. Upgrade your Python repository to the latest version.
  2. Install transforms-tables from the Libraries tab.

Transforms that use the use_external_systems decorator are currently not compatible with virtual tables. Either switch to source-based external transforms, or split your transform in two: one transform that takes virtual tables as input, and another that uses the use_external_systems decorator.
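The first half of such a split might look like the sketch below, which lands the virtual table data in a regular Foundry dataset that a separate use_external_systems transform can then consume. The dataset path and RID are placeholders, and this assumes TableTransformInput exposes a dataframe() accessor like a regular transform input.

```python
from transforms.api import transform, Output
from transforms.tables import TableInput, TableTransformInput


# First transform of the split: read the virtual table and materialize it as a
# Foundry dataset. A second, separate transform using @use_external_systems can
# then take this dataset as an ordinary input. Paths and RIDs are placeholders.
@transform(
    source_table=TableInput("ri.tables.main.table.1234"),
    landed=Output("/path/to/landed/dataset"),
)
def land_table(source_table: TableTransformInput, landed):
    landed.write_dataframe(source_table.dataframe())
```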

To query and write to tables in external platforms, the associated source must have both code imports and source exports enabled in the source connection settings in Data Connection.

API overview

The Python virtual tables API provides TableInput and TableOutput types to interact with virtual tables.

from transforms.api import transform
from transforms.tables import TableInput, TableOutput, TableTransformInput, TableTransformOutput


@transform(
    source_table=TableInput("ri.tables.main.table.1234"),
    output_table=TableOutput("ri.tables.main.table.5678"),
)
def compute(source_table: TableTransformInput, output_table: TableTransformOutput):
    ...  # normal transforms API

The tables referred to in a Python transform need not come from the same source, or even the same platform.
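For instance, a single transform could read tables backed by two different platforms and write to a third (a sketch; the table RIDs and parameter names are placeholders):

```python
from transforms.api import transform
from transforms.tables import TableInput, TableOutput, TableTransformInput, TableTransformOutput


# The inputs and output may be registered against different sources and even
# different platforms, e.g. one BigQuery-backed and one Snowflake-backed input.
# All RIDs below are placeholders.
@transform(
    bigquery_table=TableInput("ri.tables.main.table.1111"),
    snowflake_table=TableInput("ri.tables.main.table.2222"),
    output_table=TableOutput("ri.tables.main.table.3333"),
)
def compute(
    bigquery_table: TableTransformInput,
    snowflake_table: TableTransformInput,
    output_table: TableTransformOutput,
):
    ...  # normal transforms API
```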

The above example assumes that the tables specified in the transform already exist within your Foundry environment. If they do not, you can configure the output virtual table to be created during checks, as with dataset outputs. This requires extra configuration specifying the source and the location where the table should be stored.

from transforms.api import transform
from transforms.tables import TableInput, TableOutput, TableTransformInput, TableTransformOutput, SnowflakeTable


@transform(
    source_table=TableInput("ri.tables.main.table.1234"),
    output_table=TableOutput(
        "/path/to/new/table",
        # Must specify the Data Connection source you want to create the table
        # in and the table identifier/location
        "ri.magritte..source.1234",
        SnowflakeTable("database", "schema", "table"),
    ),
)
def compute(source_table: TableTransformInput, output_table: TableTransformOutput):
    ...  # normal transforms API

Once the table has been created, the extra source and table metadata configuration can be removed from the TableOutput for concision. After creation, it is not possible to change the table's source or location; modifying either will cause checks to fail.
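After creation, the same output can therefore be declared more concisely (a sketch reusing the placeholder path and RID from the example above):

```python
from transforms.api import transform
from transforms.tables import TableInput, TableOutput, TableTransformInput, TableTransformOutput


# Once the virtual table exists, the source RID and SnowflakeTable location can
# be dropped from the TableOutput; the path alone identifies the table.
@transform(
    source_table=TableInput("ri.tables.main.table.1234"),
    output_table=TableOutput("/path/to/new/table"),
)
def compute(source_table: TableTransformInput, output_table: TableTransformOutput):
    ...  # normal transforms API
```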

The available Table subclasses are:

  • BigQueryTable(project: str, dataset: str, table: str)
  • DeltaTable(path: str)
  • FilesTable(path: str, format: FileFormat)
  • IcebergTable(table: str, warehouse_path: str)
  • SnowflakeTable(database: str, schema: str, table: str)

You must use the appropriate class based on the type of source you are connecting to.
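For example, creating a table on a BigQuery-backed source would use BigQueryTable rather than SnowflakeTable (a sketch; the source RID, project, dataset, and table names are placeholders):

```python
from transforms.api import transform
from transforms.tables import TableInput, TableOutput, TableTransformInput, TableTransformOutput, BigQueryTable


# The Table subclass must match the platform behind the Data Connection source:
# here a BigQuery source, so the location is given as project/dataset/table.
# The source RID and names below are placeholders.
@transform(
    source_table=TableInput("ri.tables.main.table.1234"),
    output_table=TableOutput(
        "/path/to/new/table",
        "ri.magritte..source.5678",
        BigQueryTable("my-project", "my_dataset", "my_table"),
    ),
)
def compute(source_table: TableTransformInput, output_table: TableTransformOutput):
    ...  # normal transforms API
```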

File template configuration wizard [Beta]


Virtual table outputs in the file template configuration wizard are in the beta phase of development. Functionality may change during active development.

Virtual table inputs and outputs can be configured in the Code Repositories file template configuration wizard using the virtual table template variable type. When creating virtual table outputs, the wizard will walk you through selecting an output source to write to, along with a Foundry location for the virtual table.

Configuring a virtual table output in the "Output table source" dialog.

Compute pushdown

Tables backed by a BigQuery, Databricks, or Snowflake connection can push Foundry-authored transforms down to the underlying platform. Known as "compute pushdown", this allows you to use Foundry's pipeline management, data lineage, and security functionality on top of data warehouse compute. Use virtual table inputs and outputs to push down compute.

View source-specific compute pushdown documentation with code examples here: