Connect Foundry to Snowflake to read and sync data between Snowflake and Foundry.
| Capability | Status |
|---|---|
| Exploration | 🟢 Generally available |
| Bulk import | 🟢 Generally available |
| Incremental | 🟢 Generally available |
| Virtual tables | 🟢 Generally available |
| Compute pushdown | 🟢 Generally available |
| Export tasks | 🟡 Sunset |
| Table exports | 🟡 Beta |
Learn more about setting up a connector in Foundry.
Snowflake account identifiers containing underscores (_) must have the underscores replaced with dashes (-). For example, my_account_prod must become my-account-prod. Failure to do so will cause networking issues.
| Option | Required? | Description |
|---|---|---|
| Account identifier | Yes | The identifier that precedes ".snowflakecomputing.com". See Snowflake's official documentation ↗ for more details. |
| Roles | No | The default role to be used by the connection in case the provided credentials have access to multiple roles. |
| Database | Yes | Specify a default database to use once connected. |
| Schema | No | Option to specify a default schema to use once connected. If not specified, all schemas in scope of the credentials will be available. |
| Warehouse | No* | The virtual warehouse to use once connected. For registered virtual tables, this warehouse is used for any source-side compute. |
| Credentials | Yes | Refer to the authentication section below for more details. |
| Network connectivity | Yes** | Refer to the networking section below for more details. |
* Warehouse details are optional for syncing Foundry datasets, but required for registering virtual tables.
** Network egress policies are required for Foundry worker connections, but not for agent worker connections.
You can authenticate with Snowflake in the following ways:
| Method | Description | Documentation |
|---|---|---|
| Username and password [Legacy] | Authenticate with a user account using a username and password. Basic authentication is legacy and not recommended in production. | Working with passwords ↗ |
| Key-pair authentication | Provide a username and private key. Note that only unencrypted private keys are supported. Foundry will encrypt and store the private key securely (see the sketch below this table). | Key-pair authentication and key-pair rotation ↗ |
| External OAuth (OIDC) [Recommended] | Authenticate as a user using workload identity federation, which allows workloads running in Foundry to access Snowflake without the need for Snowflake secrets. Follow the displayed source system configuration instructions to set up external OAuth. | Workload identity federation ↗. Refer to our OIDC documentation for an overview of how OpenID Connect (OIDC) is supported in Foundry. |
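As a sketch of the Snowflake-side setup for key-pair authentication, you can register an RSA public key on the user that the connection authenticates as (the user name and key value below are hypothetical placeholders):

```sql
-- FOUNDRY_SVC and the key value are hypothetical placeholders.
ALTER USER FOUNDRY_SVC SET RSA_PUBLIC_KEY = 'MIIBIjANBgkqh...';

-- Verify the registered key by checking its fingerprint.
DESCRIBE USER FOUNDRY_SVC;
```

The corresponding unencrypted private key is then pasted into the Foundry connection configuration, as described in the table above.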
For all authentication options, ensure that the provided user and role have usage privileges on the target database(s) and schema(s), as well as select privileges on the target table(s).
When registering virtual tables, the user and their role should also have usage privileges on the warehouse.
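For illustration, a minimal set of grants covering those read privileges might look like the following (role, database, schema, and warehouse names are hypothetical):

```sql
-- Usage on the target database and schema.
GRANT USAGE ON DATABASE ANALYTICS TO ROLE FOUNDRY_ROLE;
GRANT USAGE ON SCHEMA ANALYTICS.PUBLIC TO ROLE FOUNDRY_ROLE;

-- Select on the target tables.
GRANT SELECT ON ALL TABLES IN SCHEMA ANALYTICS.PUBLIC TO ROLE FOUNDRY_ROLE;

-- Required when registering virtual tables.
GRANT USAGE ON WAREHOUSE FOUNDRY_WH TO ROLE FOUNDRY_ROLE;
```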
Snowflake is rolling out changes to require multi-factor authentication (MFA) for human users that use passwords, and to disallow passwords for all service users. As such, Username and password will no longer be a suitable authentication mechanism. Refer to the official Snowflake documentation ↗ for additional information and guidance on migrating.
For connections running on a Foundry worker, the appropriate egress policies must be added when setting up the source in the Data Connection application.
To identify the hostnames and port numbers of your Snowflake account to be allowlisted, you can run the following command in your Snowflake console. Ensure that at least the entries for SNOWFLAKE_DEPLOYMENT and STAGE are added as egress policies in Foundry.
```sql
SELECT t.VALUE:type::VARCHAR as type,
       t.VALUE:host::VARCHAR as host,
       t.VALUE:port as port
FROM TABLE(FLATTEN(input => PARSE_JSON(SYSTEM$ALLOWLIST()))) AS t;
```
See Snowflake's official documentation ↗ for additional information on identifying hostnames and port numbers to allowlist.
Connections from Foundry to Snowflake normally come from the default public gateway IPs for your environment. However, traffic within the same cloud provider (for example, AWS-AWS or Azure-Azure) may use different routing, and require establishing a connection via PrivateLink. See below for the additional setup required per cloud provider, or contact your Palantir representative for additional guidance.
If your Snowflake instance is configured to route internal S3 stage traffic through a VPCE ↗, the Snowflake JDBC driver must be manually configured not to use the custom VPCE domain. Otherwise, the driver will be routed to the custom VPCE domain, which is inaccessible from Foundry's VPC, and connections to URLs of the format <bucketname>.bucket.vpce-<vpceid>.s3.<region>.vpce.amazonaws.com will fail.
You can manually configure this by adding a JDBC connection property in the Connection details of your instance, with a key of S3_STAGE_VPCE_DNS_NAME and an empty value field (the equivalent of setting it to null).
The S3 stage traffic will then be routed through the AWS S3 Gateway Endpoint (<bucketname>.bucket.s3.<region>.vpce.amazonaws.com), which maintains private connectivity so traffic is not routed over the public internet.
Review our PrivateLink egress documentation ↗ for more information.
For egress policies that depend on an S3 bucket in the same region as your Foundry instance, ensure you have completed the additional configuration steps detailed in our Amazon S3 bucket policy documentation for the affected bucket(s).
The Snowflake JDBC driver used for the Foundry Snowflake connector may attempt to connect directly to an underlying "internal stage" storage bucket when fetching data. For Snowflake hosted on Azure, because Azure-hosted Foundry enrollments route traffic over Azure service endpoints, network connectivity from Foundry to the underlying stage buckets must be explicitly allowlisted by following the instructions below.
You will need the following information about your Azure-hosted Snowflake warehouse to establish network connectivity to Foundry:
- Use the SYSTEM$ALLOWLIST command to get the full list of domains that may be required to successfully connect.
- For the Azure storage bucket returned from the SYSTEM$ALLOWLIST command, you will also need to retrieve the storage account identifier. To do so:
  - Set the ENABLE_INTERNAL_STAGES_PRIVATELINK ↗ parameter to TRUE for the account.
  - Call the SYSTEM$GET_PRIVATELINK_CONFIG() ↗ function, which returns a field called privatelink-internal-stage containing the Azure storage account resource identifier.
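In SQL, those two steps might look like this (run with a role that can set account parameters, such as ACCOUNTADMIN):

```sql
-- Allow PrivateLink access to internal stages for the account.
ALTER ACCOUNT SET ENABLE_INTERNAL_STAGES_PRIVATELINK = TRUE;

-- Returns JSON that includes the privatelink-internal-stage field
-- containing the Azure storage account resource identifier.
SELECT SYSTEM$GET_PRIVATELINK_CONFIG();
```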
A full Azure Storage account resource identifier will be in the following format:
`/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Storage/storageAccounts/{storageAccountName}`
More information on how to find an Azure Storage account resource ID directly in the Azure console can be found in the Azure documentation ↗. Restricting cross-account network traffic using VNET rules and the storage account identifier is in line with Microsoft's published best practices ↗, and should be used for all connections to Azure-hosted Snowflake warehouses from within Azure compute instances.
Now that you have gathered the required information about your Snowflake warehouse, you can create the required policies needed to enable Foundry access to your Snowflake data.
1. Create a standard egress policy for the Azure storage internal stage, and attach it to your Snowflake source. Be sure to create egress policies for all of the domains returned by the SYSTEM$ALLOWLIST command, and not just the storage bucket domain.
2. Create an Azure storage policy, pasting in the storage account resource identifier.
3. Navigate back to your Snowflake source in Data Connection and confirm you can explore the source and run syncs.
The Virtual tables section of this documentation provides details on integrating Iceberg tables registered in Snowflake Horizon Catalog. This functionality requires network connectivity to the external volume where the Iceberg table is stored. You must create network egress policies for each external volume, as Foundry reads and writes to the storage location directly rather than querying the table using a Snowflake compute warehouse.
Refer to the official Snowflake documentation ↗ for more information on external volumes and how to determine table storage locations.
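For example, one way to see where an Iceberg table's data and metadata live (the table name below is a placeholder) is:

```sql
-- Returns JSON including the metadata location on the external volume.
SELECT SYSTEM$GET_ICEBERG_TABLE_INFORMATION('MY_DB.MY_SCHEMA.MY_ICEBERG_TABLE');
```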
Configuring egress policies for an external volume enables network traffic to egress from Foundry to that storage location. Network controls may vary between cloud providers, so ensure that any network controls on the storage location permit traffic from Foundry. Learn more about identifying the IP addresses where Foundry traffic originates.
Additionally, refer to the other sections of this Networking documentation to ensure you correctly set up connections to external volumes hosted in the same cloud region as your Foundry instance.
This section provides additional details around using virtual tables with a Snowflake source. This section is not applicable when syncing to Foundry datasets.
The Snowflake connector now offers enhanced functionality when using virtual tables to access Iceberg tables registered in Horizon Catalog. Foundry uses the Iceberg REST APIs exposed in Horizon Catalog to access tables as well as read and write data in the underlying storage locations, which are configured as external volumes in Snowflake. Additionally, Foundry uses Horizon Catalog credential vending to ensure secure access to cloud object storage. This can also improve the performance of reads and writes against these tables.
The connector exposes Iceberg functionality automatically if you:
Foundry uses Iceberg clients to establish connections to read or write tables to the storage location directly without using Snowflake compute. However, Foundry still uses the warehouse configured on the source for certain metadata queries, such as determining the type of table being accessed.
Refer to the official Snowflake documentation ↗ for more information on querying Iceberg tables with an external engine through Snowflake Horizon Catalog. Refer to the Networking section of this documentation for details on enabling network access to external volumes.
Foundry treats Iceberg tables like regular Snowflake tables if any of the above requirements are not met. Connections to Snowflake are made using the same mechanism as for other Snowflake data (such as tables, views, or materialized views) and rely on a Snowflake compute warehouse to read and write from the table.
The table below highlights the virtual table capabilities that are supported for Snowflake.
| Capability | Status |
|---|---|
| Bulk registration | 🟡 Beta |
| Automatic registration | 🟢 Generally available |
| Table inputs | 🟢 Generally available: tables, views, and materialized views in Code Repositories and Pipeline Builder |
| Table outputs | 🟢 Generally available: tables in Code Repositories and Pipeline Builder |
| Incremental pipelines | 🟢 Generally available: APPEND only [1] |
| Compute pushdown | 🟢 Generally available; 🟡 Beta: Pipeline Builder |
Consult the virtual tables documentation for details on the supported Foundry workflows where Snowflake tables can be used as inputs or outputs.
[1] To enable incremental support for pipelines backed by Snowflake virtual tables, ensure that Change Tracking ↗ and Time Travel ↗ are enabled with an appropriate retention period. This functionality relies on the CHANGES ↗ clause. The current and added read modes in Python transforms are supported; these expose the relevant rows of the change feed based on the METADATA$ACTION column. The METADATA$ACTION, METADATA$ISUPDATE, and METADATA$ROW_ID columns are made available in Python transforms.
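As a sketch, the Snowflake-side setup and the change feed that incremental reads rely on look roughly like this (table name and retention period are placeholders):

```sql
-- Enable change tracking on the table (name is a placeholder).
ALTER TABLE MY_DB.MY_SCHEMA.MY_TABLE SET CHANGE_TRACKING = TRUE;

-- Keep enough history for incremental reads (retention period is illustrative).
ALTER TABLE MY_DB.MY_SCHEMA.MY_TABLE SET DATA_RETENTION_TIME_IN_DAYS = 7;

-- Inspect the append-only change feed; the result set includes the
-- METADATA$ACTION, METADATA$ISUPDATE, and METADATA$ROW_ID columns.
SELECT *
FROM MY_DB.MY_SCHEMA.MY_TABLE
  CHANGES (INFORMATION => APPEND_ONLY)
  AT (OFFSET => -60*60*24);  -- changes from the last 24 hours
```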
For full feature support, grant the following privileges to the credentials configured for the source connection. Apply them at the database, schema, or table level, depending on the desired inheritance model.
| Category | Privilege | Notes |
|---|---|---|
| Prerequisite | USAGE | Must be granted on the Snowflake databases and schemas that will be used in Foundry. |
| Read | SELECT | Required to read Snowflake tables when using syncs or virtual table inputs. |
| Edit | DELETE, INSERT, TRUNCATE, UPDATE | Required to modify Snowflake tables when using virtual table outputs. |
| Create | CREATE SCHEMA, CREATE TABLE | Required to create Snowflake tables when using virtual table outputs. |
When using Iceberg tables, USAGE privilege is required on the external volume where the table is stored.
Additionally, the credentials provided must have usage privileges on the warehouse provided in the source configuration.
Refer to the official Snowflake documentation ↗ for more information on access control privileges in Snowflake.
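Building on the table above, a sketch of the additional grants needed for virtual table outputs and Iceberg tables (all names hypothetical) could be:

```sql
-- Edit privileges for virtual table outputs.
GRANT INSERT, UPDATE, DELETE, TRUNCATE ON ALL TABLES IN SCHEMA ANALYTICS.PUBLIC TO ROLE FOUNDRY_ROLE;

-- Create privileges for virtual table outputs.
GRANT CREATE TABLE ON SCHEMA ANALYTICS.PUBLIC TO ROLE FOUNDRY_ROLE;
GRANT CREATE SCHEMA ON DATABASE ANALYTICS TO ROLE FOUNDRY_ROLE;

-- For Iceberg tables, usage on the external volume backing the table.
GRANT USAGE ON EXTERNAL VOLUME MY_EXTERNAL_VOLUME TO ROLE FOUNDRY_ROLE;
```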
When using virtual tables, remember the following source configuration requirement: a warehouse must be specified on the source, as registered virtual tables use it for any source-side compute. See the Connection details section above for more details.
Foundry offers the ability to push down compute to Snowflake when using virtual tables. Virtual table inputs leverage the Snowflake Spark connector ↗, which has built-in support for predicate pushdown.
When using Snowflake virtual tables registered to the same source as inputs and outputs to a pipeline, it is possible to fully federate compute to Snowflake. This feature is currently available in Python transforms. See the Python documentation for details on how to push down compute to Snowflake.
Note that columns of type array ↗, object ↗, and variant ↗ will be parsed by Foundry as type string. This is due to the source's variable typing.
For example, the Snowflake array [ 1, 2, 3 ] would be interpreted by Foundry as the string "[1,2,3]".
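A small sketch of this behavior (table and values are hypothetical):

```sql
-- Hypothetical table with semi-structured columns.
CREATE TABLE DEMO.PUBLIC.EVENTS (
  ID INTEGER,
  TAGS ARRAY,
  PAYLOAD VARIANT
);

-- INSERT ... SELECT is used because VALUES only accepts constants.
INSERT INTO DEMO.PUBLIC.EVENTS
  SELECT 1, ARRAY_CONSTRUCT(1, 2, 3), PARSE_JSON('{"kind": "click"}');

-- In Foundry, TAGS and PAYLOAD are read as strings such as
-- "[1,2,3]" and '{"kind":"click"}'.
```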
See Snowflake's official documentation ↗ for more details.