This endpoint is in preview and may be modified or removed at any time.
To use this endpoint, add preview=true to the request query parameters.
Creates a new live deployment for a model version with the specified runtime configuration. The deployment will begin provisioning compute resources and deploying the target model version.
Third-party applications using this endpoint via OAuth2 must request the following operation scope: api:models-write.
object (union): The target model source for the live deployment. Determines which model and version selection strategy to use when creating the deployment.
object: The compute resource configuration for the deployment.
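A minimal Python sketch of assembling this request with the standard library. The URL path, the `preview=true` query parameter, and the `runtimeConfiguration` keys are taken from the curl example on this page. The hostname and token values are placeholders, and the helper name `build_create_request` is hypothetical. The body field that carries the target model source is not named on this page, so the sketch accepts it through a caller-supplied `extra_fields` dictionary rather than guessing.

```python
import json
import urllib.parse
import urllib.request


def build_create_request(hostname, token, runtime_configuration, extra_fields=None):
    """Assemble (but do not send) the POST request that creates a live deployment."""
    # The endpoint is in preview, so preview=true must be in the query string.
    query = urllib.parse.urlencode({"preview": "true"})
    url = f"https://{hostname}/api/v2/models/liveDeployments?{query}"

    body = {"runtimeConfiguration": runtime_configuration}
    # Any further body fields (for example the model source) are supplied by the caller.
    body.update(extra_fields or {})

    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Placeholder hostname and token; values mirror the curl example.
request = build_create_request(
    "example.invalid",
    "placeholder-token",
    {"minReplicas": 1, "maxReplicas": 3, "cpu": 1.0, "memory": "256MiB", "threadCount": 32},
)
```

Sending is left to the caller, for instance with `urllib.request.urlopen(request)`, so the sketch can be inspected without network access.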
object: The created LiveDeployment.
string: The Resource Identifier (RID) of a Live Deployment.
object: The currently deployed model version.
string, optional: The model branch this deployment tracks. Present for direct deployments that follow the latest model version on a branch; absent for deployment types that are not branch-scoped.
object: The compute resource configuration for the deployment.
object: The current operational status of the deployment.
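As a rough illustration of reading these fields, the Python sketch below parses a payload shaped like the example response on this page. The key names (`rid`, `branch`, `status.state`, `status.isReady`) come from that example. The helper `summarize_deployment` is hypothetical, and it treats `branch` as possibly missing because the field is marked optional.

```python
import json


def summarize_deployment(payload):
    """Return (rid, branch_or_None, state, is_ready) from a parsed LiveDeployment."""
    status = payload.get("status", {})
    return (
        payload["rid"],
        payload.get("branch"),  # optional: absent for deployments that are not branch-scoped
        status.get("state"),
        bool(status.get("isReady", False)),
    )


# Trimmed copy of the example response on this page.
sample = json.loads("""
{
  "rid": "ri.foundry-ml-live.main.live-deployment.f351c142-0e4c-4b12-adc2-6e1539737ae9",
  "branch": "master",
  "status": {"state": "ACTIVE", "isReady": true}
}
""")
rid, branch, state, is_ready = summarize_deployment(sample)
```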
curl -X POST \
  -H "Content-type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  "https://$HOSTNAME/api/v2/models/liveDeployments?preview=true" \
  -d '{"runtimeConfiguration":{"minReplicas":1,"maxReplicas":3,"cpu":1.0,"memory":"256MiB","threadCount":32}}'
{
  "runtimeConfiguration": {
    "minReplicas": 1,
    "maxReplicas": 3,
    "cpu": 1,
    "memory": "256MiB",
    "threadCount": 32
  },
  "modelVersion": {
    "modelRid": "ri.models.main.model.f351c142-0e4c-4b12-adc2-6e1539737ae9",
    "modelVersionRid": "ri.models.main.model-version.adf94926-c3ac-41ea-beb2-4946699d08ee"
  },
  "rid": "ri.foundry-ml-live.main.live-deployment.f351c142-0e4c-4b12-adc2-6e1539737ae9",
  "branch": "master",
  "status": {
    "state": "ACTIVE",
    "isReady": true
  }
}

| Error Name | Error Code | Status Code | Description | Parameters |
|---|---|---|---|---|
| Thread | INVALID_ARGUMENT | 400 | The specified thread count exceeds the maximum allowed value. | maxThreadCount, providedThreadCount |
| Invalid | INVALID_ARGUMENT | 400 | The GPU count is invalid. The GPU count must be between 1 and the maximum allowed for the requested GPU type. | providedGpuCount, maxGpuCount |
| Gpu | INVALID_ARGUMENT | 400 | The requested GPU type is not available. Use a GPU type that is available in the deployment's resource queue. | requestedGpuType, availableGpuTypes |
| Create | PERMISSION_DENIED | 403 | Could not create the LiveDeployment. | |
| Model | NOT_FOUND | 404 | The given Model could not be found. | modelRid |

See Errors for a general overview of errors in the platform.
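
A hedged Python sketch of branching on the errors listed for this endpoint. It assumes the error response body is JSON with `errorCode` and `parameters` keys; that shape is not shown on this page and should be checked against the general Errors documentation. The function name `describe_error` is hypothetical.

```python
import json


def describe_error(status_code, body_text):
    """Map an error response to a short message, keyed on status and error code."""
    try:
        error = json.loads(body_text)
    except json.JSONDecodeError:
        error = {}
    if not isinstance(error, dict):
        error = {}

    code = error.get("errorCode")            # assumed key name
    params = error.get("parameters") or {}   # assumed key name

    if status_code == 400 and code == "INVALID_ARGUMENT":
        # Covers the thread count, GPU count, and GPU type errors in the table.
        return f"Invalid runtime configuration: {params}"
    if status_code == 403:
        return "Permission denied: could not create the live deployment."
    if status_code == 404:
        return f"Model not found: {params.get('modelRid', 'unknown RID')}"
    return f"Unexpected error (HTTP {status_code})"
```

For the three 400-level errors, the returned message includes the reported parameters (such as `maxThreadCount` or `availableGpuTypes`), which indicate the limit that the request exceeded.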