The Foundry connector enables data sharing from one instance of Foundry to another. This workflow requires access to both Foundry instances, and designates one instance as the "source" and the other as the "destination." Throughout the data connection process, users will perform most functions on the destination instance.
Note that this connector is not currently compatible with views as inputs, nor is it compatible with restricted views. Datasets ingested by the destination instance must first be materialized within the source instance.
For example, if a use case requires the transfer of data from red.palantirfoundry.com to blue.palantirfoundry.com, most of the setup and subsequent interactions will take place in the destination instance blue.palantirfoundry.com, which is where the transferred data will ultimately land. The workflows discussed below read data via ingest, rather than write data via export.
| Capability | Status |
|---|---|
| Bulk import | 🟢 Generally available |
| Streaming ingests | 🟢 Generally available |
| Incremental ingests | 🟢 Generally available |
| Exploration | Coming soon |
| Virtual tables | 🟡 Beta |
| Compute pushdown | 🔴 Not available |
| Table exports | 🔴 Not available |
| Export tasks | 🔴 Not available |
1. Open the Data Connection application and select + New Source in the upper-right corner of the screen.
2. Select Foundry from the available connector types.
3. Choose to use a direct connection over the Internet or to connect through an intermediary agent.
4. Input the hostname of your source Foundry instance. In this case, blue.palantirfoundry.com will pull data from red.palantirfoundry.com, so red.palantirfoundry.com is the source instance.
5. Choose a means of authentication.
6. If you are using a direct connection, create an egress policy for the source instance. To ingest data from red.palantirfoundry.com to blue.palantirfoundry.com, create an egress policy for the URL https://red.palantirfoundry.com on port 443. Unlike traditional data connections, you must also allowlist the destination instance's IP addresses on the source instance. This is done through Control Panel by selecting the option to Configure network ingress.
Follow the instructions below to configure ingress IP allowlisting for the source instance by adding the destination instance's IP addresses to the source instance's Network ingress extension in Control Panel:
If you are creating an agent-based connection, then you must provide the appropriate IP addresses based on your agent's host. Additionally, your agent must use Java 21, at a minimum, as agent-based connections using the Foundry connector are not compatible with prior versions of Java. Learn more about identifying IPs when configuring network egress.
Contact Palantir Support if you are unable to access the Network ingress Control Panel extension.
Learn more about setting up a connector in Foundry.
The Foundry connector supports the following authentication methods:
Client credentials (production): For long-lived connections, only client credentials are allowed. To create a client credential, navigate to the following URL:
```
https://<SOURCE_FOUNDRY_INSTANCE>.palantirfoundry.com/workspace/developer-console/
```
This process will create a Service user for which you can provide or deny access to assets in Foundry. To check if this service user has access to a dataset or a project, you can use the Check access feature for the given asset.
Personal access token (temporary): For security purposes, tokens are not allowed in production use cases. Ingests will fail if a sync is run while relying on a token with a lifespan greater than 36 hours.
Authentication credentials are input in the destination instance. In the source instance, you must create a token that will afford the destination instance the ability to read data. To do so, navigate to the following URL:
```
https://<SOURCE_FOUNDRY_INSTANCE>.palantirfoundry.com/workspace/settings/tokens
```
Then, select + Create token in the upper-right corner. At this step you can name your token and choose its lifespan. Then, copy your token and navigate to the destination Foundry instance.
The provided credentials must have the following privileges:
The Foundry connector requires network access to the source Foundry instance on port 443 (HTTPS). The destination instance needs an egress policy that corresponds to the URL of the source instance.
To enable direct connections from a Foundry instance to another Foundry instance, the appropriate egress policies must be added when setting up the source in the Data Connection application.
Egress policies are not needed for connections using an agent.
To set up a Foundry-to-Foundry sync, select Explore and create syncs in the upper-right of the source Overview screen. Browse the available projects and datasets in the source Foundry instance, then select the datasets you want to sync. When ready, select Create sync for x datasets.
Incremental, or append, syncs maintain state about the most recent sync and only ingest new or changed data from the target dataset. There are two ways of establishing these ingests with the Foundry connector.
The "initial incremental state" can be set to an arbitrarily distant date, like January 1, 1970. On the first run of the ingest, all data will be extracted. From the second run onwards, each ingest will only extract the newest data available.
The "initial incremental state" can be set to a date of your choosing, like January 1, 2024. Similar to the above option, the first run of the ingest will extract all data. Then subsequent ingests will only extract the newest data available. This is a more filtered option for use cases where the author of the ingest knows that they want to exclude data that was written in the external source system prior to a particular date.
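The behavior of both options can be sketched as a timestamp filter. This is a hypothetical illustration only: the connector tracks incremental state internally, and the `committed_at` field and `rows_to_extract` helper below are invented for the example.

```python
from datetime import datetime, timezone

def rows_to_extract(rows, last_state):
    """Return only rows committed after the stored incremental state."""
    return [r for r in rows if r["committed_at"] > last_state]

rows = [
    {"id": 1, "committed_at": datetime(2023, 6, 1, tzinfo=timezone.utc)},
    {"id": 2, "committed_at": datetime(2024, 3, 1, tzinfo=timezone.utc)},
]

# Option 1: initial state at a distant date -- the first run extracts everything.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
first_run = rows_to_extract(rows, epoch)

# Option 2: initial state at a chosen date -- rows written before it are excluded.
chosen_state = datetime(2024, 1, 1, tzinfo=timezone.utc)
filtered_run = rows_to_extract(rows, chosen_state)
```

After each run, the stored state advances to the newest timestamp seen, so subsequent runs extract only newly arrived data.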
You can ingest a stream from one Foundry enrollment to another. The dataset from the source enrollment must be a stream. A sync can be established with a dataset RID and a branch name. After specifying a schema and running for the first time, a new dataset will be created in the destination enrollment.
The sync can be configured to ingest only newly created rows, or to start by ingesting all existing rows of the stream. The main trade-off to consider is that ingesting all historical rows can be expensive from a time and compute perspective if the streaming dataset is sufficiently large.
Datasets are not supported with virtual tables. Only managed Iceberg tables on the "source" Foundry instance can be virtualized.
This section provides additional details around using virtual tables with a Foundry source. This section is not applicable when syncing to Foundry datasets.
The table below highlights the virtual table capabilities that are supported for Foundry.
| Capability | Status |
|---|---|
| Bulk registration | 🔴 Not available |
| Automatic registration | 🔴 Not available |
| Table inputs | 🟢 Generally available: tables in Code Repositories, Pipeline Builder |
| Table outputs | 🟢 Generally available: tables in Code Repositories, Pipeline Builder |
| Incremental pipelines | 🟢 Generally available |
| Compute pushdown | 🔴 Not available |
Review the virtual tables documentation for details on the supported workflows where Foundry tables can be used as inputs or outputs.
Ensure that the "destination" Foundry instance has network access to the "source" Foundry instance as well as the location of the bucket backing the Iceberg table. Verify that this bucket allows ingress from the "destination" Foundry instance.
In addition to the Foundry-to-Foundry sync workflow above, a Python transform can call the Foundry API directly using the OAuth2 client credentials grant. Use this pattern when you need to invoke Foundry endpoints that are not exposed as syncs — for example, to enumerate project contents, trigger builds, or read from the Ontology API from within a transform.
The overall setup (REST API source configuration, storing client_id/client_secret as additional secrets, and the generic token-request and pagination scaffolding) is the same as any other OAuth2 client credentials flow. Review the OAuth Client Credentials grant example on the REST API connector page for the generic pattern.
The sections below cover the Foundry-specific details that differ from a third-party API.
Foundry's OAuth2 token endpoint is:
POST /multipass/api/oauth2/token
The endpoint is hosted on the Foundry instance you are calling. If the transform runs on the same instance it is calling, the hostname on the REST API source is that same instance; if it is a different instance, the source's hostname is that of the target instance and you must configure egress policies and ingress allowlisting as described in Networking.
Learn more about the token endpoint parameters.
Every Foundry API endpoint documents the scope it requires — see the per-endpoint reference in the API documentation. Some common examples:
| Scope | Grants access to |
|---|---|
| api:datasets-read | Read datasets |
| api:datasets-write | Write datasets |
| api:ontologies-read | Read Ontology objects and link types |
Request only the scopes your transform needs. Multiple scopes are separated by spaces, for example api:datasets-read api:datasets-write.
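As a sketch, the token request can be assembled as a form-encoded POST body with space-separated scopes. The `build_token_request` helper below is hypothetical (not part of any Foundry SDK) and only shows how the parameters fit together:

```python
def build_token_request(host: str, client_id: str, client_secret: str, scopes: list):
    """Build the URL and form-encoded body for a client credentials token request."""
    url = f"{host}/multipass/api/oauth2/token"
    body = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # Multiple scopes are joined with single spaces.
        "scope": " ".join(scopes),
    }
    return url, body

url, body = build_token_request(
    "https://red.palantirfoundry.com",
    "<CLIENT_ID>",
    "<CLIENT_SECRET>",
    ["api:datasets-read", "api:datasets-write"],
)
```

The body would be sent with a Content-Type of application/x-www-form-urlencoded, as in the transform example later in this section.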
The client_id and client_secret used in the token request come from a third-party application registered on the target Foundry instance. Follow the steps in Authentication above to create a backend service application and obtain the client_id and client_secret, then store them as additional secrets on your REST API source.
The service user created for the client must be granted permissions on every project, dataset, or Ontology resource the transform needs to access.
The following transform requests an access token against /multipass/api/oauth2/token, then uses the token to list the children of a project via the Foundry /api/v2/filesystem/resources/{rid}/children endpoint. For the generic token-request and pagination scaffolding this example reuses, see the REST API OAuth Client Credentials grant example.
```python
import logging

import pandas as pd
from transforms.api import Output, transform_pandas
from transforms.external.systems import external_systems, Source, ResolvedSource

logger = logging.getLogger(__name__)


@external_systems(
    foundry_api_source=Source("<source_rid>")
)
@transform_pandas(
    Output("<output_dataset_rid>"),
)
def compute(foundry_api_source: ResolvedSource) -> pd.DataFrame:
    base_url = foundry_api_source.get_https_connection().url
    client = foundry_api_source.get_https_connection().get_client()
    client_id = foundry_api_source.get_secret("additionalSecretClientId")
    client_secret = foundry_api_source.get_secret("additionalSecretClientSecret")

    # Request an access token via the OAuth2 client credentials grant.
    token_response = client.post(
        base_url + "/multipass/api/oauth2/token",
        data={
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": "api:datasets-read",
        },
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
    token_response.raise_for_status()
    access_token = token_response.json()["access_token"]
    auth_headers = {"Authorization": f"Bearer {access_token}"}

    project_rid = "ri.compass.main.folder.xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
    resources = []
    page_token = None

    # Page through the project's children until no nextPageToken is returned.
    while True:
        params = {"pageSize": 100}
        if page_token:
            params["pageToken"] = page_token
        response = client.get(
            base_url + f"/api/v2/filesystem/resources/{project_rid}/children",
            headers=auth_headers,
            params=params,
        )
        response.raise_for_status()
        body = response.json()
        for resource in body.get("data", []):
            resources.append({
                "rid": resource.get("rid"),
                "name": resource.get("displayName"),
                "type": resource.get("type"),
            })
        page_token = body.get("nextPageToken")
        if not page_token:
            break
        logger.info(f"Fetched {len(resources)} resources so far, continuing to next page.")

    return pd.DataFrame(resources)
```