The Agent Manager is referred to as a "bootvisor" on the server where it is installed.
This page contains information on how to configure agent logs, describes several common issues with agent configuration, and provides debugging guidance.
The steps described must be taken after SSHing into the host where the agent has been installed.
Before exploring additional troubleshooting topics, we recommend first checking ./var/diagnostic/launch.yml to confirm the agent successfully connected to Foundry. If the connection was unsuccessful, follow the instructions in the enhancedMessage field.
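For example, a quick way to review that file and surface the enhancedMessage field from the host is sketched below (this assumes the file sits at the path above, relative to the agent install directory; grep -A simply prints a few lines of trailing context):

```sh
# Sketch: print the diagnostic file, then show the enhancedMessage field with a
# few lines of surrounding context.
cat ./var/diagnostic/launch.yml
grep -A 3 "enhancedMessage" ./var/diagnostic/launch.yml
```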
Common issues with agent configuration
To verify that the host can reach Foundry, run curl -s https://<your domain name>/magritte-coordinator/api/ping > /dev/null && echo pass || echo fail from the host where the agent is installed.
If the connection is working, this command will print pass as output. If it prints fail instead, you should:
- Check whether the host is configured to use a proxy by running echo $http_proxy on the command line of a Unix-based machine.
- If the curl output includes curl: (6) Could not resolve host: ..., it is likely there is something blocking the connection (e.g. a firewall or a proxy), and you should contact your Palantir representative.

Check the contents of the <agent-manager-install-location>/var/log/startup.log file. If you see the following error: Caused by: java.net.BindException: {} Address already in use, it means there is a process already running on the port to which the Agent Manager is trying to bind.
To identify that process:
1. Find the port the Agent Manager is configured to use by checking the <agent-manager-directory>/var/conf/install.yml file for a port parameter (e.g. port: 1234 - here 1234 is the port). Note that if no port parameter is defined, the Agent Manager will use the default port 7032.
2. Run ps aux | grep $(lsof -i:<PORT> |awk 'NR>1 {print $2}' |sort -n |uniq), where <PORT> is the port to which the Agent Manager is trying to bind.
3. If the process shown is a com.palantir.magritte.bootvisor.BootvisorApplication, it means another Agent Manager is already running.
To fix the BindException error, you will need to find a new port for the Agent Manager that isn't currently in use.
You can check whether a port is in use by running lsof -i :<PORT>, where <PORT> is the chosen port number. Once you have found an available port, add (or update) the port parameter in the configuration stored at <agent-manager-directory>/var/conf/install.yml.
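If you would rather scan a range of candidate ports than check them one at a time, a small loop can help. This is a minimal sketch; the 7032-7040 range is an arbitrary example and it assumes lsof is available on the host:

```sh
# Sketch: print the first port in an example range that no process is listening on.
for port in $(seq 7032 7040); do
  if ! lsof -i :"$port" > /dev/null 2>&1; then
    echo "Port $port appears to be free"
    break
  fi
done
```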
Below is an example Agent Manager configuration snippet with the port set to 7032:
```yaml
...
port: 7032
auto-start-agent: true
```
Once you have saved the above configuration, restart the Agent Manager by running <agent-manager-root>/service/bin/init.sh stop && <agent-manager-root>/service/bin/init.sh start.
Check the contents of the <agent-manager-directory>/var/data/processes/<latest-bootstrapper-directory>/var/log/startup.log file.
If you see the following error: Caused by: java.net.BindException: {} Address already in use, it means there is a process already running on the port to which the Bootstrapper is trying to bind.
To identify that process:
1. Find the port the Bootstrapper is configured to use by looking for a port parameter in its configuration (for example, port: 1234 - here 1234 is the port). Note that the default port for the Bootstrapper is 7002.
2. Run ps aux | grep $(lsof -i:$PORT |awk 'NR>1 {print $2}' |sort -n |uniq), where $PORT is the port to which the Bootstrapper is trying to bind.
3. If the process shown is a com.palantir.magritte.bootstrapper.MagritteBootstrapperApplication, it means another Bootstrapper is already running.
To fix the BindException error, you will need to find a new port for the Bootstrapper that isn't currently in use.
You can check whether a port is in use by running lsof -i :<PORT>, where <PORT> is the chosen port number. Once you have found an available port, set the port parameter in the Bootstrapper's configuration. This can be done by navigating to the agent overview page in the Data Connection application, selecting the advanced configuration button, and opening the "Bootstrapper" tab.
Below is an example Bootstrapper configuration snippet with the port set to 7002:
```yaml
server:
  adminConnectors:
    ...
    port: 7002 # This is the port value
```
Once you have updated the configuration, you will need to save your changes and restart the agent for them to take effect.
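After the restart, you can optionally confirm that the Bootstrapper is listening on the new port. This is a minimal sketch using the example port 7002 from the snippet above:

```sh
# Sketch: check that a process is now listening on the newly configured port.
lsof -i :7002
```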
More often than not, this is caused by another "ghost" instance of the agent running that you need to find and shut down.
To find and terminate old processes, follow the steps below:
1. Stop the Agent Manager by running <agent-manager-install-location>/service/bin/init.sh stop.
2. Check the <agent-manager-install-location>/var/data/processes/index.json file.
3. Run for folder in $(ls -d <agent-manager-root>/var/data/processes/*/); do $folder/service/bin/init.sh stop; done to shut down the old processes.
4. Restart the Agent Manager (<agent-manager-install-location>/service/bin/init.sh start).
Note that manually starting agents on the host where they are installed (as opposed to through Data Connection) can lead to the creation of "ghost" processes.
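To confirm that no ghost processes remain after the cleanup, you can list any agent-related Java processes that are still running. This is a minimal sketch; the class names are the ones referenced in the BindException guidance above:

```sh
# Sketch: list any remaining Agent Manager (Bootvisor) or Bootstrapper processes.
ps aux | grep -E 'BootvisorApplication|MagritteBootstrapperApplication' | grep -v grep
```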
Often when the agent process shows as "unhealthy" it is because it has crashed or been shut down by either the operating system or another piece of software such as an antivirus.
There are multiple reasons why the operating system might have shut down the process, but the most common one is because the operating system does not have enough memory to run it, which is referred to as being OOM (Out Of Memory) killed.
To check if any of the agent or Explorer subprocesses were OOM killed by the operating system, you can run the following command: grep "exited with return code 137" -r <agent-manager-directory> --include=*.log. This will search all the log files within the Agent Manager directory for entries containing 'exited with return code 137' (return code 137 signifies a process was OOM killed).
The following example output, produced by the above command, shows the agent subprocess being OOM killed: ./var/data/processes/bootstrapper~<>/var/log/magritte-bootstrapper.log:ERROR [timestamp] com.palantir.magritte.bootstrapper.ProcessMonitor: magritte-agent exited with return code 137. If you see output similar to this, follow the steps below on tuning heap sizes.
You can also check the operating system logs for OOM kill entries by running the following command: dmesg -T | egrep -i 'killed process'. This command will search the kernel ring buffer for 'killed process' log entries, which indicate a process was OOM killed.
Actual log entries of OOM killed processes will look like the following:
[timestamp] Out of memory: Killed process 9423 (java) total-vm:2928192kB, anon-rss:108604kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:1232kB oom_score_adj:0

Entries for processes other than (java) can be ignored, as they are not related to your agent.

Before you change any heap allocations, you should first:
Determine how much memory is available on the host by running free -h. On a 6 GB system, the output might look something like this:

```
              total        used        free      shared  buff/cache   available
Mem:          5.8Gi       961Mi       2.8Gi       9.0Mi       2.1Gi       4.6Gi
Swap:         1.0Gi          0B       1.0Gi
```
In the output produced by the free command, the available column shows how much memory can be used for starting new applications. To determine how much memory can be allocated to the agent, we recommend that you stop the agent and run free -h while the system is under normal to high load. The available value will tell you the maximum amount of memory you can devote to all agent processes combined. We recommend that you leave a buffer of approximately 2 - 4GB, if possible, to account for other processes on the system needing more memory, as well as off-heap memory usage by the agent processes. Note that not all versions of free show the available column, so you may need to check the documentation for the version on your system to find the equivalent information.
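As a rough worked example, the headroom calculation can be scripted from the free output. This is a minimal sketch that keeps a 4 GB buffer; adjust the buffer to suit your host, and note that it relies on a free version that reports the available column:

```sh
# Sketch: subtract a 4096 MiB buffer from the "available" column of free -m to
# estimate how much memory could be devoted to the agent processes in total.
free -m | awk '/^Mem:/ {print "available:", $7, "MiB; suggested ceiling for agent processes:", $7 - 4096, "MiB"}'
```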
Determine how much memory is assigned to each of the following subprocesses: Agent Manager, Bootstrapper, agent, and Explorer.
In order to find out how much memory is assigned to the agent and Explorer subprocesses, you should navigate to the agent configuration page within Data Connection, choose the advanced configuration button, and select the "Bootstrapper" tab. From there you will see each of the subprocesses have their own configuration block; within each block you should see a jvmHeapSize parameter which defines how much memory is allocated to the associated processes.
By default, the Bootstrapper subprocess is assigned 512MB of memory. This can be confirmed by first navigating to the <agent-manager-directory>/var/data/processes/ directory; from there, run ls -lrt to find the most recently created bootstrapper~<uuid> directory. Once in that directory, inspect the contents of the ./var/conf/launcher-custom.yml file. Here, the Xmx value is the amount of memory assigned to the Bootstrapper.
By default, the Agent Manager subprocess is also assigned 512MB of memory. This can be confirmed by inspecting the contents of the file <agent-manager-directory>/var/conf/launcher-custom.yml. Here, the Xmx value is the amount of memory assigned to the Agent Manager.
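To read both values at once from the host, you can grep for the Xmx settings in the launcher files described above. This is a minimal sketch; replace <agent-manager-directory> with your install path:

```sh
# Sketch: print the configured -Xmx heap sizes for the Agent Manager and for each
# process directory that contains a launcher-custom.yml file.
grep -H "Xmx" <agent-manager-directory>/var/conf/launcher-custom.yml
grep -H "Xmx" <agent-manager-directory>/var/data/processes/*/var/conf/launcher-custom.yml 2>/dev/null
```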
Agents installed on Windows machines do not use the launcher-custom.yml files and thus, by default, Java will allocate both the Agent Manager and Bootstrapper processes 25% of the total memory available to the system. To fix this you will need to set the Agent Manager and Bootstrapper heap sizes manually, which can be done by following the steps below:
1. Set JAVA_HOME by running setx -m JAVA_HOME "{BOOTVISOR_INSTALL_DIR}\jdk\{JDK_VERSION}-win_x64\".
2. Set the Agent Manager heap size by running setx -m MAGRITTE_BOOTVISOR_WIN_OPTS "-Xmx512M -Xms512M".
3. Set the Bootstrapper heap size by running setx -m MAGRITTE_BOOTSTRAPPER_OPTS "-Xmx512M -Xms512M".
4. Restart the Agent Manager by running .\service\bin\magritte-bootvisor-win.
Once you have determined how much memory the host has available and how much memory is assigned to each of the above subprocesses, you should decide whether to decrease the amount of memory allocated to those processes or increase the amount of memory available to the host.
Whether or not you can safely decrease the amount of memory used by the agent processes will depend on your agent settings (for example, the maximum number of concurrent syncs and file upload parallelism), the types of data being synced, and the typical load on the agent. Decreasing the heap size makes it less likely that the OS will kill the process but more likely that the java process will run out of heap space. You may need to test different values to find what works. Contact your Palantir representative if you need assistance tuning this value.
To decrease the amount of memory allocated to one (or multiple) of the subprocesses, do the following:
In the Bootstrapper configuration described above (agent configuration page in Data Connection > advanced configuration > "Bootstrapper" tab), update the jvmHeapSize parameter for each of the individual subprocesses. For example:

```yaml
agent:
  ...
  jvmHeapSize: 3g # This is the JVM heap size value
```
Default heap allocations
By default, an agent requires approximately 3 GB of memory across its subprocesses. Java processes also use some amount of off-heap memory, so we recommend ensuring at least 4 GB of memory is left free for the agent processes.
There are two main causes of failed agent downloads: network connectivity issues and expired download links.
If you can connect to Foundry but are getting an invalid tar.gz file or an error message on the download, you may have an expired or invalidated link.
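One quick way to distinguish an expired-link response from a genuine archive is to inspect the downloaded file before installing it. This is a minimal sketch; the filename is an example:

```sh
# Sketch: a valid agent download should be a gzip archive that tar can list; an
# expired or invalidated link typically returns an error body instead.
file agent-dist.tar.gz
tar -tzf agent-dist.tar.gz > /dev/null && echo "archive looks valid" || echo "archive is corrupt or not a tar.gz"
```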
A user must be an editor of a Project to create an agent in that Project, but must be an owner of the Project to administer the agents within that Project. That means that a user may create an agent and then be unable to generate download links or perform other administrative tasks on the agent. For more on agent permissions, review the guidance in our permissions reference documentation.
TLSv1.0 and TLSv1.1 are not supported by Palantir as they are outdated and insecure protocols. Amazon Corretto builds of the OpenJDK used by Data Connection agents explicitly disable TLSv1.0 and TLSv1.1 by default under the jdk.tls.disabledAlgorithms security property in the java.security file.
Attempts to connect to a data source that exclusively supports TLSv1.0 or TLSv1.1 will fail with various errors, including Error: The server selected protocol version TLS10 is not accepted by client preferences.
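Before changing any agent configuration, you can check which TLS versions the data source actually accepts. This is a minimal sketch with a hypothetical host and port; openssl must be installed on the host:

```sh
# Sketch: attempt a TLSv1.2-only handshake against the data source. A handshake
# failure suggests the server only supports older TLS versions.
openssl s_client -connect datasource.example.com:1433 -tls1_2 < /dev/null
```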
We actively discourage the usage of deprecated versions of TLS. Palantir is not responsible for security risks associated with its usage.
If there is a critical need to temporarily support TLSv1.0 and TLSv1.1, perform the following steps:
1. Navigate to the agent's advanced configuration in Data Connection and select the Bootstrapper tab.
2. Add tlsProtocols entries to both the agent and explorer configuration blocks, followed by the protocols you want to enable. Be sure to also include TLSv1.2 so any sources using it will not break. For example:

```yaml
agent:
  tlsProtocols:
    - TLSv1
    - TLSv1.1
    - TLSv1.2
  ...
explorer:
  tlsProtocols:
    - TLSv1
    - TLSv1.1
    - TLSv1.2
  ...
```

With this configuration, the agent will continue to allow TLSv1.0 and TLSv1.1 across agent upgrades and restarts. Once the datasource has moved to new TLS versions, revert all changes made to the advanced agent configuration.
To adjust the log storage settings for an agent on its host machine, follow the steps below:
Your new configuration should now be in effect.
There are a number of reasons your agent could be unavailable; for instance, the agent may be restarting or the underlying hardware running the agent could be offline or restarting.
There are two ways to determine when the agent first became unavailable:
- Check the Metrics tab.

The files will remain on disk until the Bootvisor cleans up old process folders (a cleanup is triggered after 30 days or once there are 10 old folders). These files are encrypted, and the keys to decrypt them only existed in the memory of the processes that died.
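To see which old process folders are still on disk before the Bootvisor's cleanup removes them, a simple listing is enough. This is a minimal sketch; replace <agent-manager-directory> with your install path:

```sh
# Sketch: list process folders under the Agent Manager data directory, newest first.
ls -lt <agent-manager-directory>/var/data/processes/
```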