This page describes several common issues with syncs, along with steps to debug them:
PKIX exceptions and other SSLHandshakeExceptions occur when the agent does not have the correct certificates and therefore cannot authenticate with the source. To ensure that you have the correct certificates installed, follow the guide in our Data Connection and Certificates documentation.
If your sync fails with the error Response 421 received. Server closed connection, you may be connecting with an unsupported SSL protocol / port combination. One example is implicit FTPS over port 990, an outdated and unsupported standard. Explicit SSL over port 21 is the preferred method in this case.
If your sync is an FTP/S sync, ensure that you are not using an egress proxy load balancer. FTP is a stateful protocol, so using a load balancer can cause the sync to fail if sequential requests don't originate from the same IP.
Note that due to the nature of load balancing, failures will be non-deterministic; syncs and previews may sometimes succeed, even with the load-balancing proxy in place.
If your sync or exploration is failing with the error com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: null; S3 Extended Request ID: null), this means the request is failing as it passes through the egress proxy. If you receive this error, check whether any of the following scenarios apply:
Add the following to the proxyConfiguration block of the S3 source:
```yaml
host: <address of deployment gateway or egress NLB>
port: 8888
protocol: http
nonProxyHosts: <bucket>.s3.<region>.amazonaws.com,s3.<region>.amazonaws.com,s3-<region>.amazonaws.com
```
For example, allowlisting all VPC buckets would involve a config addition of:
```yaml
clientConfiguration:
  proxyConfiguration:
    host: <color>-egress-proxy.palantirfoundry.com
    port: 8888
    protocol: http
    nonProxyHosts: *.s3.<region>.amazonaws.com, s3.<region>.amazonaws.com, s3-<region>.amazonaws.com
```
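The nonProxyHosts patterns determine which endpoints skip the proxy entirely. As a rough illustration (assuming glob-style wildcard matching; the region and hostnames below are placeholders, not values from your deployment), you can check candidate hostnames against the list:

```python
# Minimal sketch: check which endpoints a nonProxyHosts list would exempt from
# the proxy, assuming glob-style wildcard semantics. Hostnames are illustrative.
from fnmatch import fnmatch

non_proxy_hosts = [
    "*.s3.us-east-1.amazonaws.com",
    "s3.us-east-1.amazonaws.com",
    "s3-us-east-1.amazonaws.com",
]

for host in ("mybucket.s3.us-east-1.amazonaws.com", "s3.eu-west-1.amazonaws.com"):
    bypass = any(fnmatch(host, pattern) for pattern in non_proxy_hosts)
    print(f"{host}: {'bypasses proxy' if bypass else 'routed through proxy'}")
```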
To see the exact query that ran against your source system, refer to _data-ingestion.log.
If your sync is an incremental sync, ensure you have provided a monotonically increasing column (e.g. timestamp or id) and an initial value for this column.
Once you've chosen the incremental column, make sure you have added the ? placeholder to the SQL query in the sync configuration page (the ? is replaced with the current incremental value, and only a single ? may be used). For example: SELECT * FROM table WHERE id > ?.
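To illustrate how the placeholder is bound, here is a minimal sketch using sqlite3 as a stand-in for the source database; the table and values are hypothetical, and Data Connection performs the equivalent binding for you:

```python
# Minimal sketch of ? binding against a stand-in database. The stored
# incremental state (here, last_synced_id) replaces the ? on each run.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, value TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (5, "b"), (12, "c")])

last_synced_id = 10  # incremental state saved by the previous sync
rows = conn.execute("SELECT * FROM t WHERE id > ?", (last_synced_id,)).fetchall()
print(rows)  # [(12, 'c')] -- only rows beyond the saved state are fetched
```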
If you believe there are rows missing from your synced dataset or that previously synced rows aren't being properly updated, check the following:
Ensure that the incremental column is truly monotonically increasing; rows whose values fall below the current incremental state will never be synced. For example, if you are using ID as your monotonically increasing column, the last ID value synced was 10, and you then add a row with ID 5, that row with ID 5 won't be synced.

If you believe existing rows are being re-synced, check the following:
Ensure that the incremental column is a LONG or a STRING (in ISO-8601 format).

If a NullPointerException is thrown on your incremental sync, this may indicate that the SQL query is retrieving rows from the database that would cause the incremental column to contain null values.
For example, consider the query SELECT * FROM table WHERE col > ? OR timestamp > 1, where col is the incremental column being used for the sync. The use of OR means the query does not guarantee that col contains only non-null values. If a null value of col is synced for any row, the sync will fail when Data Connection attempts to update the incremental state, since the current state is compared with the synced null value, throwing an error. To fix this, add an explicit null check, as illustrated in the sketch below: SELECT * FROM table WHERE (col > ? OR timestamp > 1) AND col IS NOT NULL.

If you wish to change the incremental column used for your sync, we recommend creating a new sync.
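A concrete illustration of the null pitfall above, again using sqlite3 as a stand-in for the source database (table and values are hypothetical):

```python
# Sketch of the null-value pitfall: the OR clause admits rows where the
# incremental column is NULL; the explicit guard filters them out.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col INTEGER, ts INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(3, 5), (None, 9)])

# The NULL row is returned, which later breaks the incremental-state update:
print(conn.execute("SELECT * FROM t WHERE col > ? OR ts > 1", (0,)).fetchall())

# Adding the guard keeps the incremental column free of NULLs:
print(conn.execute(
    "SELECT * FROM t WHERE (col > ? OR ts > 1) AND col IS NOT NULL", (0,)
).fetchall())
```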
On the agent host, in the <bootvisor-directory>/var/data/processes directory, run ls -lrt to find the most recently created bootstrapper~<uuid> directory.
cd into that directory and navigate to /var/log/.magritte-agent-output.log; a scripted equivalent of these steps is sketched below.

If you see an OutOfMemory exception, it means the agent cannot handle the workload being assigned to it.
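The lookup can also be scripted. A minimal sketch, with <bootvisor-directory> left as a placeholder you must substitute, and assuming the log path is relative to the bootstrapper directory:

```python
# Locate the newest bootstrapper~<uuid> directory and print the log path.
# Substitute <bootvisor-directory> with the actual install location.
from pathlib import Path

processes = Path("<bootvisor-directory>/var/data/processes")
latest = max(processes.glob("bootstrapper~*"), key=lambda p: p.stat().st_mtime)
print(latest / "var/log/.magritte-agent-output.log")
```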
Below are some common causes of hanging syncs and their associated fixes:
All syncs: Hanging during the fetching stage
If your sync is hanging during the fetching stage, check that the source is both available and operational.
JDBC syncs: Hanging during the fetching stage
If your sync is taking longer than expected to complete the fetching stage, it could be because the agent is making a large number of network and database calls. In order to tune the number of network and database calls made during a sync, you can alter the Fetch Size parameter:
The Fetch Size parameter is located within the "advanced options" section of the source configuration and defines the number of rows fetched during each database round trip for a given query. Therefore:
- A lower Fetch Size will result in fewer rows being returned per call to the database, so more calls will be required. However, the agent will use less memory, since fewer rows are held in the agent's heap at any given time.
- A higher Fetch Size will result in more rows being returned per call to the database, so fewer calls will be required. However, the agent will use more memory, since more rows are held in the agent's heap at any given time.

We recommend starting with Fetch Size: 500 and tuning accordingly.
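The trade-off is easy to quantify; a rough sketch with an illustrative row count:

```python
# Rough round-trip arithmetic for Fetch Size, with an illustrative row count.
total_rows = 1_000_000
for fetch_size in (100, 500, 10_000):
    round_trips = -(-total_rows // fetch_size)  # ceiling division
    print(f"fetch_size={fetch_size:>6}: {round_trips:>6} database round trips")
```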
JDBC syncs: Hanging during the upload stage

If your sync is taking a long time to upload files or fails during the upload stage, you could be overloading a network link. In this case, we suggest tuning the Max file size parameter:
The Max file size parameter is located within the "advanced options" section of the source configuration and defines the maximum size (in bytes or rows) of the output files uploaded to Foundry. Therefore:
- A lower Max file size can increase pressure on the network, since smaller files are uploaded more frequently; however, if a file upload fails, the cost of re-uploading is lower.
- A higher Max file size will require less total bandwidth, but larger uploads are more likely to fail.

We recommend starting with Max file size: 120MB.
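Similar arithmetic applies here; the dataset size below is illustrative:

```python
# Rough output-file arithmetic for Max file size, with an illustrative dataset.
dataset_bytes = 500 * 1024**3  # e.g. a 500 GiB extract
for max_file_mb in (60, 120, 1024):
    n_files = -(-dataset_bytes // (max_file_mb * 1024**2))  # ceiling division
    print(f"max_file_size={max_file_mb:>5}MB -> ~{n_files} uploaded files")
```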
FTP / SFTP / Directory syncs: Hanging during the fetching stage

The most common reason file-based syncs hang during the fetching stage is that the agent is crawling a large file system.
Syncs that crawl a filesystem will do two complete crawls of the filesystem (unless configured otherwise). This is to ensure the sync does not upload files which are currently being written to or altered in any way.
REQUEST_ENTITY_TOO_LARGE error

Downloading, processing, and uploading large files is error-prone and slow. REQUEST_ENTITY_TOO_LARGE service exceptions occur if an individual file exceeds the maximum size configured for the agent's upload destination. For the data-proxy upload strategy, this is set to 100GB by default.
Overriding the limit is not recommended; if possible, find a way to access this data as a collection of smaller files. However, if you wish to override this limit as a temporary workaround, use the following steps:
1. Within Data Connection, navigate to your agent and select the Advanced configuration tab.
2. Select the "Agent" tab.
3. Under the destinations block, include the following to increase the limit to 150GB:
```yaml
uploadStrategy:
  type: data-proxy
  maximumUploadedFileSizeBytes: 161061273600
```
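As a sanity check, the byte value above is 150GB expressed in binary (GiB) units:

```python
# 150 GiB in bytes matches the configured value above.
print(150 * 1024**3)  # 161061273600
```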
BadPaddingException error

BadPaddingException exceptions occur when the source credential encryption key stored within the agent does not match what was expected. This commonly happens when an agent manager is manually upgraded but the old /var/data directory is not copied to the new install location.
The easiest way to resolve this is to re-enter the credentials for each of the sources using the affected agent.
When rows are synced from a JDBC source and they contain timestamp columns, those timestamp columns will be cast to long columns in Foundry. This behavior exists for backwards compatibility reasons.
To fix the data type for these columns, we recommend using a Python Transform environment to perform this cleaning. Here is an example code snippet that casts the column "mytimestamp" back into timestamp form:
```python
from pyspark.sql import functions as F

# The synced value is epoch milliseconds, so divide by 1000 before casting.
df = df.withColumn("mytimestamp", (F.col("mytimestamp") / 1000).cast("timestamp"))
```
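For context, here is a minimal sketch of how this cleanup might sit inside a full Python transform; the dataset paths and transform name are placeholders, and the snippet assumes the standard transforms.api decorators:

```python
from pyspark.sql import functions as F
from transforms.api import transform_df, Input, Output


@transform_df(
    Output("/path/to/cleaned_dataset"),       # placeholder output path
    source=Input("/path/to/synced_dataset"),  # placeholder input path
)
def clean_timestamps(source):
    # Convert the epoch-millisecond long back into a proper timestamp column.
    return source.withColumn(
        "mytimestamp", (F.col("mytimestamp") / 1000).cast("timestamp")
    )
```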