Version: 0.8

Scale Feature Servers

Private Preview

This feature is currently in Private Preview.

This feature has the following limitations:

Access to this API is limited based on account type

If you would like to participate in the preview, please file a feature request.

Tecton provides an API to programmatically scale Feature Servers to handle irregular traffic patterns. For example, a customer expecting double the usual traffic during the holiday season can provision twice the normal number of Feature Servers, which lets them serve features through peak traffic without server errors.

When to scale up feature servers

Tecton recommends that customers consider scaling up Feature Servers during capacity planning, especially when traffic is expected to exceed the capacity currently provisioned by Tecton. Another indication for scaling up is receiving 429 errors on 'get feature' requests. Tecton exposes current usage through the overall feature serving dashboard. If the utilization percentage is close to 100%, Tecton responds with a 429 error code to prevent oversaturation.

When to scale down feature servers

Tecton recommends that customers consider downsizing their Feature Servers if peak utilization has remained below 50% of the allocated capacity over the last 10 days and they don't foresee increased traffic to Tecton in the near future. Customers can review current utilization details through the overall feature serving dashboard.
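The scale-up and scale-down guidance above can be sketched as a small shell helper. This is only an illustration: `scaling_hint` is a hypothetical function, its inputs are daily peak utilization percentages that you read off the dashboard yourself, and the 90% cutoff for "close to 100%" is an assumption rather than a value documented by Tecton.

```shell
# Hypothetical helper: suggest a scaling action from daily peak
# utilization percentages (whole numbers, one argument per day).
# Assumption: "close to 100%" is treated as a peak of 90% or more.
scaling_hint() {
  max=0
  for pct in "$@"; do
    if [ "$pct" -gt "$max" ]; then max=$pct; fi
  done
  if [ "$max" -ge 90 ]; then
    echo "scale-up"
  elif [ "$#" -ge 10 ] && [ "$max" -lt 50 ]; then
    # At least 10 days of data and the peak never reached 50%.
    echo "scale-down"
  else
    echo "hold"
  fi
}

# Ten days of peaks, all below 50%.
scaling_hint 30 35 40 42 38 45 41 39 44 47   # prints: scale-down
```

The helper cannot know whether you expect more traffic in the near future, so treat a "scale-down" result as a prompt to check your forecast, not as a decision.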

Using the Scaling API

The scaling API lets users retrieve the current Feature Server configuration and scale the pods up or down. In the following examples, replace these placeholders to match your cluster configuration:

<CLUSTER_URL> to match the cluster URL (e.g. mycluster.tecton.ai)
<API_KEY> to refer to an API key with admin permissions on the cluster
<NUMBER> to refer to the desired count of Feature Server pods

Retrieve Current Feature Server Configuration

curl https://<CLUSTER_URL>/api/v1/metadata-service/get-feature-server-config -H "Authorization: Tecton-key <API_KEY>"

Scale your Feature Server pods up or down

curl https://<CLUSTER_URL>/api/v1/metadata-service/set-feature-server-config \
  -H "Authorization: Tecton-key <API_KEY>" \
  -X POST -d '{ "count" : <NUMBER> }'

Sample Response for Both Queries

This response indicates that your cluster currently has 5 Feature Server pods, 2 of which are available and ready to serve. It also shows the desired number of pods (10 here), which you can update through the set-feature-server-config endpoint.

{"currentCount":5,"availableCount":2,"desiredCount":10}
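To decide whether a scaling operation has finished, a script can compare availableCount with desiredCount in the response. The following is a minimal sketch: `json_int` and `scaling_complete` are hypothetical helper names, and the `sed` extraction assumes the flat, single-line JSON shape shown in the sample response. A real JSON parser is a safer choice for anything beyond a quick check.

```shell
# Hypothetical helper: extract an integer field from the flat JSON
# response. Assumes the single-line, non-nested shape of the sample.
json_int() {
  printf '%s\n' "$1" |
    sed -n "s/.*\"$2\"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p"
}

# Succeeds (exit status 0) once every desired pod is available.
scaling_complete() {
  available=$(json_int "$1" availableCount)
  desired=$(json_int "$1" desiredCount)
  [ -n "$available" ] && [ -n "$desired" ] && [ "$available" -ge "$desired" ]
}

# Check the sample response: 2 of 10 desired pods are available.
resp='{"currentCount":5,"availableCount":2,"desiredCount":10}'
if scaling_complete "$resp"; then
  echo "ready"
else
  echo "still scaling"   # this branch runs for the sample response
fi
```

In practice, `resp` would hold the output of the get-feature-server-config request, and the check would run in a loop with a pause between attempts.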

Errors

The maximum number of feature server pods allowed is X. Request count is Y
- There is a limit to the maximum number of pods you can provision. Please contact Tecton support if you want to raise this limit.
You cannot increase the number of pods by more than X in a single request. Requested increase of pods by Y
- There is a limit to the number of pods you can add in a single request. The default limit is 50 pods. Please wait for the availableCount to reach the desiredCount before attempting to scale further.
serviceAccount <sa> not authorized to perform action scale_feature_server. See ../docs/setting-up-tecton/administration-setup/user-management-and-access-controls#summary-of-roles-and-permissions for details of what roles include the requested access.
- This indicates that your service account doesn't have access to the scaling API. Go to Accounts and Access in the Web UI and give your service account the admin role.
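Because a single request can add only a limited number of pods, a large scale-up has to be split into several requests. The sketch below computes the sequence of "count" values to send. `scale_steps` is a hypothetical helper, the step size defaults to the 50-pod limit mentioned above (your cluster's limit may differ), and it covers increases only.

```shell
# Hypothetical helper: print the "count" values to request, one per
# line, when growing from $1 pods to $2 pods while adding at most $3
# pods per request (defaults to 50, the default limit noted above).
scale_steps() {
  current=$1
  target=$2
  step=${3:-50}
  [ "$step" -gt 0 ] || return 1   # guard against a non-terminating loop
  while [ "$current" -lt "$target" ]; do
    current=$((current + step))
    if [ "$current" -gt "$target" ]; then current=$target; fi
    echo "$current"
  done
}

# Growing from 10 to 130 pods takes three requests.
scale_steps 10 130   # prints 60, 110, 130 on separate lines
```

Each value would be sent as <NUMBER> in a set-feature-server-config request, waiting for availableCount to reach desiredCount before sending the next one.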
