Create an Amazon SageMaker inference endpoint
Generally available; Added in 9.1.0
Path parameters
- task_type: The type of the inference task that the model will perform. Values are `text_embedding`, `completion`, `chat_completion`, `sparse_embedding`, or `rerank`.
- amazonsagemaker_inference_id: The unique identifier of the inference endpoint.
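The two path parameters combine into the request path. The following is a minimal Python sketch of that step, not part of any client library: the helper name `build_inference_path` and the percent-encoding of the identifier are assumptions made for illustration. It rejects any task type that is not in the list of values above.

```python
from urllib.parse import quote

# Task types taken from the path parameter description above.
VALID_TASK_TYPES = (
    "text_embedding",
    "completion",
    "chat_completion",
    "sparse_embedding",
    "rerank",
)


def build_inference_path(task_type: str, inference_id: str) -> str:
    """Return the request path for creating an inference endpoint.

    Hypothetical helper for illustration only.
    """
    if task_type not in VALID_TASK_TYPES:
        raise ValueError(
            f"unsupported task type {task_type!r}; expected one of {VALID_TASK_TYPES}"
        )
    if not inference_id:
        raise ValueError("inference_id must not be empty")
    # Percent-encode the identifier so it is safe to place in a URL path
    # (an assumption of this sketch, not a documented requirement).
    return f"/_inference/{task_type}/{quote(inference_id, safe='')}"
```

For example, `build_inference_path("text_embedding", "amazon_sagemaker_embeddings")` yields the path used in the console request on this page.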
Query parameters
- Specifies the amount of time to wait for the inference endpoint to be created. Values are `-1` or `0`.
PUT /_inference/{task_type}/{amazonsagemaker_inference_id}
Console
PUT _inference/text_embedding/amazon_sagemaker_embeddings
{
  "service": "amazon_sagemaker",
  "service_settings": {
    "access_key": "AWS-access-key",
    "secret_key": "AWS-secret-key",
    "region": "us-east-1",
    "api": "elastic",
    "endpoint_name": "my-endpoint",
    "dimensions": 384,
    "element_type": "float"
  }
}
Python
resp = client.inference.put(
    task_type="text_embedding",
    inference_id="amazon_sagemaker_embeddings",
    inference_config={
        "service": "amazon_sagemaker",
        "service_settings": {
            "access_key": "AWS-access-key",
            "secret_key": "AWS-secret-key",
            "region": "us-east-1",
            "api": "elastic",
            "endpoint_name": "my-endpoint",
            "dimensions": 384,
            "element_type": "float"
        }
    },
)
JavaScript
const response = await client.inference.put({
  task_type: "text_embedding",
  inference_id: "amazon_sagemaker_embeddings",
  inference_config: {
    service: "amazon_sagemaker",
    service_settings: {
      access_key: "AWS-access-key",
      secret_key: "AWS-secret-key",
      region: "us-east-1",
      api: "elastic",
      endpoint_name: "my-endpoint",
      dimensions: 384,
      element_type: "float",
    },
  },
});
Ruby
response = client.inference.put(
  task_type: "text_embedding",
  inference_id: "amazon_sagemaker_embeddings",
  body: {
    "service": "amazon_sagemaker",
    "service_settings": {
      "access_key": "AWS-access-key",
      "secret_key": "AWS-secret-key",
      "region": "us-east-1",
      "api": "elastic",
      "endpoint_name": "my-endpoint",
      "dimensions": 384,
      "element_type": "float"
    }
  }
)
PHP
$resp = $client->inference()->put([
    "task_type" => "text_embedding",
    "inference_id" => "amazon_sagemaker_embeddings",
    "body" => [
        "service" => "amazon_sagemaker",
        "service_settings" => [
            "access_key" => "AWS-access-key",
            "secret_key" => "AWS-secret-key",
            "region" => "us-east-1",
            "api" => "elastic",
            "endpoint_name" => "my-endpoint",
            "dimensions" => 384,
            "element_type" => "float",
        ],
    ],
]);
curl
curl -X PUT \
  -H "Authorization: ApiKey $ELASTIC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"service":"amazon_sagemaker","service_settings":{"access_key":"AWS-access-key","secret_key":"AWS-secret-key","region":"us-east-1","api":"elastic","endpoint_name":"my-endpoint","dimensions":384,"element_type":"float"}}' \
  "$ELASTICSEARCH_URL/_inference/text_embedding/amazon_sagemaker_embeddings"
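Where no Elasticsearch client library is available, the curl request above can be approximated with the Python standard library. This is a hedged sketch rather than an official client: the function name `build_put_request` is hypothetical, and the base URL and API key are assumed to be supplied by the caller in the same way curl reads `$ELASTICSEARCH_URL` and `$ELASTIC_API_KEY`. The function only constructs the request object and does not contact a cluster.

```python
import json
import urllib.request


def build_put_request(base_url, api_key, task_type, inference_id, body):
    """Build (but do not send) the PUT request shown in the curl example.

    Hypothetical helper for illustration only.
    """
    url = f"{base_url.rstrip('/')}/_inference/{task_type}/{inference_id}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            # Same headers that the curl example passes with -H.
            "Authorization": f"ApiKey {api_key}",
            "Content-Type": "application/json",
        },
    )


# Request body copied from the text embedding example on this page.
EMBEDDING_BODY = {
    "service": "amazon_sagemaker",
    "service_settings": {
        "access_key": "AWS-access-key",
        "secret_key": "AWS-secret-key",
        "region": "us-east-1",
        "api": "elastic",
        "endpoint_name": "my-endpoint",
        "dimensions": 384,
        "element_type": "float",
    },
}
```

To send the request, pass the returned object to `urllib.request.urlopen`. That step needs a reachable cluster and valid credentials, so it is left out of the sketch.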
Request examples
A text embedding task
Run `PUT _inference/text_embedding/amazon_sagemaker_embeddings` to create an inference endpoint that performs a text embedding task.
{
  "service": "amazon_sagemaker",
  "service_settings": {
    "access_key": "AWS-access-key",
    "secret_key": "AWS-secret-key",
    "region": "us-east-1",
    "api": "elastic",
    "endpoint_name": "my-endpoint",
    "dimensions": 384,
    "element_type": "float"
  }
}
Run `PUT _inference/completion/amazon_sagemaker_completion` to create an inference endpoint that performs a completion task.
{
  "service": "amazon_sagemaker",
  "service_settings": {
    "access_key": "AWS-access-key",
    "secret_key": "AWS-secret-key",
    "region": "us-east-1",
    "api": "elastic",
    "endpoint_name": "my-endpoint"
  }
}
Run `PUT _inference/chat_completion/amazon_sagemaker_chat_completion` to create an inference endpoint that performs a chat completion task.
{
  "service": "amazon_sagemaker",
  "service_settings": {
    "access_key": "AWS-access-key",
    "secret_key": "AWS-secret-key",
    "region": "us-east-1",
    "api": "elastic",
    "endpoint_name": "my-endpoint"
  }
}
Run `PUT _inference/sparse_embedding/amazon_sagemaker_sparse_embedding` to create an inference endpoint that performs a sparse embedding task.
{
  "service": "amazon_sagemaker",
  "service_settings": {
    "access_key": "AWS-access-key",
    "secret_key": "AWS-secret-key",
    "region": "us-east-1",
    "api": "elastic",
    "endpoint_name": "my-endpoint"
  }
}
Run `PUT _inference/rerank/amazon_sagemaker_rerank` to create an inference endpoint that performs a rerank task.
{
  "service": "amazon_sagemaker",
  "service_settings": {
    "access_key": "AWS-access-key",
    "secret_key": "AWS-secret-key",
    "region": "us-east-1",
    "api": "elastic",
    "endpoint_name": "my-endpoint"
  }
}
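The five request bodies above share the same core settings. Only the text embedding example adds `dimensions` and `element_type`. The sketch below generates such bodies from one function. The name `build_sagemaker_body` is hypothetical, and this page does not state whether `dimensions` and `element_type` are required or optional, so the sketch simply includes them when the caller passes them.

```python
def build_sagemaker_body(
    endpoint_name,
    region,
    access_key,
    secret_key,
    dimensions=None,
    element_type=None,
):
    """Build a request body shaped like the examples above.

    Hypothetical helper for illustration only.
    """
    settings = {
        "access_key": access_key,
        "secret_key": secret_key,
        "region": region,
        "api": "elastic",  # every example on this page uses this value
        "endpoint_name": endpoint_name,
    }
    # In the examples above, only the text embedding request sets these two.
    if dimensions is not None:
        settings["dimensions"] = dimensions
    if element_type is not None:
        settings["element_type"] = element_type
    return {"service": "amazon_sagemaker", "service_settings": settings}


# Body matching the text embedding example.
embedding_body = build_sagemaker_body(
    "my-endpoint", "us-east-1", "AWS-access-key", "AWS-secret-key",
    dimensions=384, element_type="float",
)

# Body matching the completion, chat completion, sparse embedding,
# and rerank examples.
basic_body = build_sagemaker_body(
    "my-endpoint", "us-east-1", "AWS-access-key", "AWS-secret-key",
)
```

The credential strings are the placeholders used throughout this page. Real keys are better loaded from a secrets store or environment variables than written into source code.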