A concrete example explains histogram_quantile best.
Assumptions:
- ONLY ONE series for simplicity
- 10 buckets for the metric http_request_duration_seconds: 10ms, 50ms, 100ms, 200ms, 300ms, 500ms, 1s, 2s, 3s, 5s (plus the +Inf bucket)
- Each http_request_duration_seconds_bucket series is a COUNTER
| time | value | delta | rate (quantity of items) |
|---|---|---|---|
| t-10m | 50 | N/A | N/A |
| t-5m | 100 | 50 | 50 / (5*60) |
| t | 200 | 100 | 100 / (5*60) |
| ... | ... | ... | ... |
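- We have at least two scrapes of the series covering 5 minutes, so that rate() can calculate the quantity for each bucket: rate_xxx(t) = (value_xxx[t] - value_xxx[t-5m]) / (5*60) is the per-second rate of items over [t-5m, t]. Since histogram_quantile only depends on the proportions between buckets, the rest of this example treats rate_xxx(t) as a plain item count.
- We are looking at 2 samples (value(t) and value(t-5m)) here.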
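The rate formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the helper name `simple_rate` is made up for this example, and the real rate() additionally handles counter resets and extrapolates to the window boundaries, which this sketch ignores.

```python
def simple_rate(value_now: float, value_before: float, window_seconds: float) -> float:
    """Per-second increase of a counter between two samples.

    Simplified: no counter-reset handling and no extrapolation,
    unlike the real rate() function.
    """
    return (value_now - value_before) / window_seconds


# Samples taken from the table above: value(t-5m) = 100, value(t) = 200.
window_seconds = 5 * 60
delta = 200 - 100
per_second = simple_rate(200, 100, window_seconds)
print(f"delta={delta}, rate={per_second:.4f} items/s")
```

Multiplying the per-second rate by the window length gives back the delta, i.e. the number of items observed in the window.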
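About 10000 HTTP request durations (items) were recorded, that is,
10000 ≈ rate_10ms(t) + rate_50ms(t) + rate_100ms(t) + ... + rate_5s(t) + rate_+Inf(t).
(The values in the table below add up to 9980; the walkthrough uses 10000 to keep the arithmetic simple.)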
| bucket(le) | 10ms | 50ms | 100ms | 200ms | 300ms | 500ms | 1s | 2s | 3s | 5s | +Inf |
|---|---|---|---|---|---|---|---|---|---|---|---|
| range | ~10ms | 10~50ms | 50~100ms | 100~200ms | 200~300ms | 300~500ms | 500ms~1s | 1~2s | 2~3s | 3~5s | 5s~ |
| rate_xxx(t) | 3000 | 3000 | 1500 | 1000 | 800 | 400 | 200 | 40 | 30 | 5 | 5 |
Buckets are the essence of a histogram. We only need the 11 numbers in rate_xxx(t) (10 buckets plus +Inf) to do the quantile calculation.
Let's take a close look at this expression (aggregation such as sum() is omitted for simplicity):
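One detail worth keeping in mind: Prometheus exposes histogram buckets cumulatively, so the series with le="0.5" counts every observation up to 500ms, not only those between 300ms and 500ms. The per-range numbers in the table correspond to the differences between adjacent cumulative buckets. A minimal sketch of the conversion in both directions (the variable names and the way the le values are written as strings are assumptions for illustration):

```python
from itertools import accumulate

# Upper bounds in seconds, as they might appear in the le label.
upper_bounds = ["0.01", "0.05", "0.1", "0.2", "0.3", "0.5", "1", "2", "3", "5", "+Inf"]
# Per-range counts from the table above.
per_range = [3000, 3000, 1500, 1000, 800, 400, 200, 40, 30, 5, 5]

# Cumulative counts: each bucket includes everything in the buckets below it.
cumulative = list(accumulate(per_range))
for le, count in zip(upper_bounds, cumulative):
    print(f'le="{le}": {count}')

# Going back: subtract each cumulative bucket from the next one.
recovered = [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]
```

The +Inf bucket of the cumulative form always equals the total number of observations.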
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))
We are actually looking for the item at the 95th percentile in rate_xxx(t), counting from bucket=10ms to bucket=+Inf. With 10000 items in total, that is the 9500th item (10000 * 0.95).
From the table above, 9300 = 3000+3000+1500+1000+800 items fall below bucket=500ms.
So the 9500th item is the 200th item (9500-9300) in bucket=500ms (range=300~500ms), which holds 400 items.
Prometheus assumes that the items in a bucket are spread evenly, i.e. linearly, across the bucket's range.
The metric value for the 200th item in bucket=500ms is therefore 400ms = 300+(500-300)*(200/400).
That is, the 95th percentile is 400ms.
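The walkthrough above can be sketched as a small Python function. This is a simplified illustration working on per-range counts, not Prometheus's actual implementation; the function name `quantile_from_ranges` is made up here. It assumes a lower bound of 0 for the first bucket and returns the highest finite bound when the quantile falls into +Inf.

```python
import math


def quantile_from_ranges(q, upper_bounds, per_range_counts):
    """Estimate the q-quantile from per-range counts by linear interpolation.

    upper_bounds must be ascending and end with math.inf.
    """
    total = sum(per_range_counts)
    if total == 0:
        return math.nan
    rank = q * total          # position of the item we are looking for
    seen = 0                  # items in all buckets below the current one
    lower = 0.0               # lower bound of the current bucket
    for upper, count in zip(upper_bounds, per_range_counts):
        if count > 0 and seen + count >= rank:
            if math.isinf(upper):
                return lower  # cannot interpolate inside the +Inf bucket
            # Items are assumed to be spread evenly inside the bucket.
            return lower + (upper - lower) * (rank - seen) / count
        seen += count
        lower = upper
    return math.nan


bounds_ms = [10, 50, 100, 200, 300, 500, 1000, 2000, 3000, 5000, math.inf]

# Counts exactly as in the table (they add up to 9980).
table_counts = [3000, 3000, 1500, 1000, 800, 400, 200, 40, 30, 5, 5]
p95_table = quantile_from_ranges(0.95, bounds_ms, table_counts)

# Hypothetical variant: 220 items in the 1s bucket so the total is exactly 10000.
counts_10k = [3000, 3000, 1500, 1000, 800, 400, 220, 40, 30, 5, 5]
p95_10k = quantile_from_ranges(0.95, bounds_ms, counts_10k)

print(p95_table, p95_10k)
```

With the table's counts as written, the total is 9980, so the estimate lands at about 390.5ms. With the hypothetical variant that totals exactly 10000, the rank is 9500 and the result is the 400ms derived above.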
There are a few things to bear in mind:
- The bucket series of a histogram metric are COUNTERs in nature
- Series used for quantile calculation must always carry the le label
- Items (data) in a specific bucket (e.g. 300~500ms) are assumed to be spread evenly in a linear pattern; Prometheus makes this assumption at least
- Quantile calculation requires the buckets to be ordered by their upper bound (e.g. 1ms < 5ms < 10ms < ...)
- The result of histogram_quantile is an approximation
P.S.:
The metric value is not always accurate, because of the assumption that items (data) in a specific bucket are spread evenly in a linear pattern.
Say the maximum duration in reality (e.g. from the nginx access log) in bucket=500ms (range=300~500ms) is 310ms. We will still get 400ms from histogram_quantile with the setup above, which can be quite confusing.
The smaller the bucket distance, the more accurate the approximation.
So set up bucket distances that fit your needs.
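To make the effect of bucket distance concrete, here is a small sketch with made-up data: 9300 fast requests at 5ms, 400 requests spread between 301ms and 310ms, and 300 slow requests at 800ms. The helper names and both bucket layouts are assumptions for illustration; the interpolation is the same simplified one described above.

```python
import bisect
import math


def bucket_counts(observations, upper_bounds):
    """Count observations per range; a value v goes into the first bucket with v <= le."""
    counts = [0] * len(upper_bounds)
    for value in observations:
        counts[bisect.bisect_left(upper_bounds, value)] += 1
    return counts


def estimate_quantile(q, upper_bounds, counts):
    """Linear interpolation inside the bucket that contains the q-quantile."""
    rank = q * sum(counts)
    seen, lower = 0, 0.0
    for upper, count in zip(upper_bounds, counts):
        if count > 0 and seen + count >= rank:
            if math.isinf(upper):
                return lower
            return lower + (upper - lower) * (rank - seen) / count
        seen += count
        lower = upper
    return math.nan


# Hypothetical durations in milliseconds.
middle = [301 + 9 * i / 399 for i in range(400)]   # 301ms .. 310ms
observations = [5] * 9300 + middle + [800] * 300

# Exact 95th percentile taken from the raw data.
exact_p95 = sorted(observations)[math.ceil(0.95 * len(observations)) - 1]

coarse_bounds = [10, 300, 500, 1000, math.inf]
fine_bounds = [10, 300, 310, 320, 500, 1000, math.inf]

coarse_estimate = estimate_quantile(0.95, coarse_bounds, bucket_counts(observations, coarse_bounds))
fine_estimate = estimate_quantile(0.95, fine_bounds, bucket_counts(observations, fine_bounds))

print(f"exact={exact_p95:.1f}ms coarse={coarse_estimate:.1f}ms fine={fine_estimate:.1f}ms")
```

With the coarse 300~500ms bucket the estimate is 400ms even though no request in that range took longer than 310ms; adding a boundary at 310ms brings the estimate to 305ms, close to the exact value of roughly 305.5ms.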