How to load test an API

Load testing is used to test the scalability and performance of sites, apps, and APIs. How you prepare and configure your load test, however, differs depending on the type of system you have and what answers you are looking to get from your testing.

When testing a website or application, you often want to know how many visitors your service can handle, or how it performs with a specific number of visitors. In a load test, a visitor interacting with your service is represented by a virtual user (VU). The number of VUs in your load test simulates concurrent visitors to your website.

When testing an API, however, it is more common to measure the scalability of your system in terms of throughput: requests per second (RPS). Just as the number of visitors, or VUs, is the natural metric when talking about websites, RPS is the natural metric for understanding the throughput of an API.

In the current version of Load Impact, the size of a load test is determined by the number of VUs. Converting the scale of a load test from VUs to RPS depends on a number of factors:

  • Response times - depends on system response times, network latency, etc.
  • Complexity of processing logic of load testing script - e.g. are we hammering a simple API with GET-requests, or looking to exercise a system using some more sophisticated logic involving client-side processing and computation?
  • VU concurrency and infrastructure processing overhead - Load Impact provisions load generator instances based on the size of the test, instantiating up to 500 concurrent VUs on each instance, each running the specified load testing script. By default, Load Impact also allows each VU to keep 4 active concurrent TCP connections.
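Given these factors, the practical conversion from a target throughput to a test size is a simple division by the per-VU rate. A minimal sketch in Python, assuming the 25 RPS/VU rate used by the script in this article (the target values are arbitrary examples):

```python
import math

# Throughput each VU sustains with the pacing script in this article.
RPS_PER_VU = 25

def vus_for_target(target_rps):
    """Return the number of VUs to configure for a desired total throughput."""
    # Round up so actual throughput is at or above the target.
    return math.ceil(target_rps / RPS_PER_VU)

print(vus_for_target(10000))  # 400 VUs for a 10,000 RPS target
print(vus_for_target(510))    # 21 VUs (20.4 rounded up)
```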

The sample script below attempts to generate throughput towards the system under test at 25 RPS/VU:
-- This script takes a URL as input, and attempts to generate a steady flow 
-- of 25 requests per second, RPS. Dividing your target RPS by 25 while using this
-- script as the user scenario(s), you will know the number of required VUs to 
-- model in your test configuration.

-- What URL are we looking to hit in this test
local url = ""

-- For controlled RPS/VU testing where Load Impact runs up to 500 VUs per 
-- load generator instance, we have found 25 RPS/VU to scale linearly
local requestsPerSecond = 25

-- Use this setting to change the number of concurrent TCP connections per VU.
-- For controlled RPS/VU testing, we have found that default settings are OK
--http.set_max_connections(30, 30)

-- NO NEED TO MODIFY SCRIPT BELOW ---------------------

-- in the case of high-throughput RPS-testing, we will not report stats 
-- on each http-request result (only once per batch)
http.set_option("report_results", false)

-- sizing of each batch, 3x seems to work fine
local requestsPerBatch = 3*requestsPerSecond
-- through empirical testing, we have found increasing defined RPS by 5% to 
-- better reflect what's actually being generated by script below 
local batchesPerSecond = (1.05*requestsPerSecond)/requestsPerBatch
-- what's the upper duration limit for each batch of requests?
-- (for RPS/VU=25, the value below equals ~2.86s)
local batchDuration = 1/batchesPerSecond

-- our never-ending loop where requests are generated, batch by batch
while true do
  -- generate the batch request
  local requests = {}

  -- insert the first request, with report_results=true
  table.insert(requests, { "GET", url, headers = { ["Accept-Encoding"]="gzip, sdch, deflate, br" }, report_results=true })
  for i=requestsPerBatch-1, 1, -1 do
    -- insert all other requests, using default report_results (false)
    table.insert(requests, { "GET", url, headers = { ["Accept-Encoding"]="gzip, sdch, deflate, br" } })
  end

  local pageDuration = util.time()
  -- create a page, so we get stats reporting on total batch request
  http.page_start(requestsPerSecond .. "RPS/VU batch")
  -- run our batch of HTTP requests
  local response = http.request_batch(requests)
  http.page_end(requestsPerSecond .. "RPS/VU batch")
  pageDuration = util.time() - pageDuration

  -- We pace the script based on the desired RPS as defined in the header of
  -- the script. If the SUT is unable to service requests at our desired pace,
  -- we will back off and sleep proportionately, by use of the abs-function
  local sleepTime = math.abs(batchDuration - pageDuration)
  result.custom_metric("sleep_time", sleepTime)
  client.sleep(sleepTime)

  -- Allow for a single iteration only, when validating
  if test.is_validation() then
    log.info('Running a validation, returning')
    return
  end
end


  • There are a few possible configurations or customizations to the script that may or may not apply depending on the characteristics of the system under test - e.g. changing concurrent TCP connections per VU.
  • When following this pattern, the VU Load Time is likely a meaningless metric, and it is not reported until the end of the user scenario execution. Check out this article for more information.
  • This script generates a custom metric, sleep_time, that you can use to understand the pacing behavior of your test.
  • Below is an example chart showing how Load Impact would report a 500VU/11m run of the above sample script:
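
To sanity-check the pacing arithmetic used in the script, the same calculation can be reproduced outside Lua. A small Python sketch mirroring the script's defaults (the page_duration value is a hypothetical measurement; nothing here is part of the Load Impact API):

```python
requests_per_second = 25                      # target RPS per VU, as in the script
requests_per_batch = 3 * requests_per_second  # 75 requests per batch ("3x seems to work fine")

# The script inflates the defined RPS by 5% to better match observed throughput.
batches_per_second = (1.05 * requests_per_second) / requests_per_batch
batch_duration = 1 / batches_per_second       # upper time budget per batch

print(round(batch_duration, 2))               # 2.86 seconds per batch of 75 requests

# Pacing: if a batch finishes early, the VU sleeps for the remaining time.
page_duration = 1.9                           # hypothetical measured batch time, seconds
sleep_time = abs(batch_duration - page_duration)
print(round(sleep_time, 2))                   # 0.96 seconds of sleep before the next batch
```

This confirms the ~2.86s figure quoted in the script's comments for RPS/VU=25.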
