InferenceProviderTargetConfiguration
The configuration for a provider-based inference target. This configuration explicitly defines the endpoint, model mapping, and operations used to route requests to a large language model (LLM) provider.
Contents
- endpoint
The HTTPS endpoint of the inference provider that the gateway forwards requests to.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 2048.
Pattern: https://[a-zA-Z0-9\-\.]+(:[0-9]{1,5})?(/.*)?
Required: Yes
- modelMapping
The configuration that translates client-facing model IDs to the model IDs expected by the provider.
Type: ModelMapping object
Required: No
- operations
A list of per-operation configurations that map request paths to the models supported for each operation.
Type: Array of InferenceOperationConfiguration objects
Array Members: Minimum number of 1 item. Maximum number of 10 items.
Required: No
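The constraints listed above can be checked on the client before a request is sent. The following Python sketch is illustrative only and is not part of any AWS SDK: the function name validate_target_configuration and the plain dict input are assumptions, as is the choice to match the endpoint pattern against the whole string. It checks only the constraints documented on this page and does not inspect the contents of modelMapping or of the individual operations entries.

```python
import re

# Pattern copied from the endpoint field documented above.
ENDPOINT_PATTERN = re.compile(r"https://[a-zA-Z0-9\-\.]+(:[0-9]{1,5})?(/.*)?")


def validate_target_configuration(config: dict) -> list:
    """Return a list of constraint violations (hypothetical helper).

    An empty list means none of the documented constraints were violated.
    """
    errors = []

    # endpoint is the only required field.
    endpoint = config.get("endpoint")
    if not isinstance(endpoint, str):
        errors.append("endpoint is required and must be a string")
    else:
        if not 1 <= len(endpoint) <= 2048:
            errors.append("endpoint length must be between 1 and 2048")
        # Assumption: the pattern must match the entire value.
        if ENDPOINT_PATTERN.fullmatch(endpoint) is None:
            errors.append("endpoint does not match the documented pattern")

    # operations is optional, but when present it must hold 1 to 10 items.
    operations = config.get("operations")
    if operations is not None:
        if not isinstance(operations, list) or not 1 <= len(operations) <= 10:
            errors.append("operations must be a list of 1 to 10 items")

    return errors
```

For example, calling the helper with {"endpoint": "https://api.example.com/v1"} returns an empty list, while an endpoint that starts with http:// produces one pattern violation.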