QPS stands for queries per second. It is also referred to sometimes as TPS or transactions per second. Depending on the customer loads, each customer can send requests to Trestle at various queries per second load.
By default, every customer is set at 10 QPS, so if they send a request higher than 10, they will get a 429 HTTP error.
In general, we see 95% of our customers are within this QPS limit and are able to get all of their queries served successfully without any errors. However, we certainly have customers who need a much higher QPS. They generally fall into two categories:
- Enterprise customers serving high loads
- Batch runs that customers run on a daily, weekly, or monthly basis
In both these cases, Trestle can work with the customer to appropriately adjust the QPS allowed. Our service tiers cover up to 256 QPS, but we have a few customers with a custom configuration of up to 750 QPS. No matter what the customer requirement is, Trestle’s scalable architecture should easily be able to support it. However, we highly recommend our customers ensure they are tuned at the right QPS. For example, customers can run ~1MM queries within an hour at 256 QPS. The customer should evaluate if that is really a need, especially for a batch run or if the requests can be appropriately spaced out at a lower QPS.
A couple of other questions we get asked frequently is, “Why charge for a higher QPS when we are already being charged per query,” and “Is the cost for a higher QPS increasing linearly?” While parts of our architecture do scale linearly, there are certain systems that need to be scaled beforehand to avoid cold start issues. Depending on the customer needs, Trestle has to appropriately scale its systems beforehand, which might stay idle for long periods of time costing Trestle in terms of cloud infrastructure costs. However, a lot of these systems do not scale linearly, so the scalability changes in step function. For example, a single database node can support up to 32 QPS, and then an additional node, which increases the costs 2x, will be needed to even serve at 35 QPS without any negative impact on the latencies.