Package com.dieselpoint.norm.latency
Class BackoffLatencyAlerter
java.lang.Object
com.dieselpoint.norm.latency.BackoffLatencyAlerter
- All Implemented Interfaces:
LatencyAlerter
One danger of reporting latency issues to an external service is that a) the reporting itself can take a
significant amount of time and create customer experience issues, and b) you can end up with millions of latency
alerts when a database goes bad. This class implements a basic "exponential backoff with jitter" algorithm. Subclasses
only need to implement
alertLatencyFailureAfterBackoffAndJitter(DbLatencyWarning, long)
to take advantage of the exponential backoff facility.
For example, var cwAlerter = new CloudWatchAlerter( Duration.ofMillis( 500 ), Duration.ofMinutes( 10 ) ); will initially
alert at 500 ms intervals, then 1 second, 2 seconds, 4 seconds, and so on, up to a maximum of 10 minutes.
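The doubling schedule described above can be sketched as follows. This is a minimal illustration, not the class's actual implementation: the class name BackoffSchedule and both helper methods are hypothetical, and the "full jitter" variant (a random wait between zero and the current interval) is an assumption, since this page does not say which jitter strategy the class uses.

```java
import java.time.Duration;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper, not part of Norm: illustrates a doubling interval with a cap.
public class BackoffSchedule {

    /** Doubles the current interval, never exceeding the maximum. */
    public static Duration nextInterval(Duration current, Duration max) {
        Duration doubled = current.multipliedBy(2);
        return doubled.compareTo(max) > 0 ? max : doubled;
    }

    /** Assumed "full jitter": a random wait between 0 and the interval, inclusive. */
    public static Duration withJitter(Duration interval) {
        long millis = ThreadLocalRandom.current().nextLong(interval.toMillis() + 1);
        return Duration.ofMillis(millis);
    }

    public static void main(String[] args) {
        Duration max = Duration.ofMinutes(10);
        Duration interval = Duration.ofMillis(500);
        // Print the un-jittered schedule: the interval doubles until it hits the cap.
        for (int i = 0; i < 13; i++) {
            System.out.println(interval.toMillis() + " ms");
            interval = nextInterval(interval, max);
        }
    }
}
```

Jitter spreads alerts from many callers over time so that they do not all retry at the same instant when a shared database degrades.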
For more information, refer to Exponential Backoff And Jitter by Amazon Web Services
When implementing your alerter, remember to swallow errors. You don't want your platform slowing down or failing
because the monitoring service is failing. See the
alertLatencyFailureAfterBackoffAndJitter(DbLatencyWarning, long) documentation for
further steps to ensure that monitoring doesn't accidentally become a significant overhead.
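The advice to swallow errors might look something like the sketch below. It is self-contained, so it does not extend BackoffLatencyAlerter: the class SwallowingAlerterSketch and the MonitoringClient interface are hypothetical stand-ins, and a plain String takes the place of DbLatencyWarning. Only the method name and the boolean return convention are taken from this page.

```java
// Hypothetical sketch: mirrors the shape of the abstract method without depending on Norm.
public class SwallowingAlerterSketch {

    /** Hypothetical stand-in for a remote monitoring client; not a real API. */
    public interface MonitoringClient {
        void send(String message) throws Exception;
    }

    private final MonitoringClient client;

    public SwallowingAlerterSketch(MonitoringClient client) {
        this.client = client;
    }

    /** String stands in for DbLatencyWarning to keep the example self-contained. */
    public boolean alertLatencyFailureAfterBackoffAndJitter(String warning, long numberOfAlertsSwallowed) {
        try {
            client.send(warning + " (" + numberOfAlertsSwallowed + " alerts swallowed during backoff)");
            return true;
        } catch (Exception e) {
            // Swallow the error: a failing monitoring service must not break the caller.
            // Returning false signals that reporting failed.
            return false;
        }
    }
}
```

In a real subclass, the same try/catch would wrap the call to the monitoring service inside the overridden abstract method.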
Constructor Summary
Constructors
Constructor    Description
BackoffLatencyAlerter(Duration minimumReportingInterval, Duration maximumReportingInterval)
Method Summary
Modifier and Type    Method    Description
void    alertLatencyFailure(DbLatencyWarning warning)
abstract boolean    alertLatencyFailureAfterBackoffAndJitter(DbLatencyWarning warning, long numberOfAlertsSwallowed)
Constructor Details
BackoffLatencyAlerter
Method Details
alertLatencyFailure
- Specified by:
alertLatencyFailure in interface LatencyAlerter
alertLatencyFailureAfterBackoffAndJitter
public abstract boolean alertLatencyFailureAfterBackoffAndJitter(DbLatencyWarning warning, long numberOfAlertsSwallowed)
- Parameters:
warning - the latency warning
numberOfAlertsSwallowed - the number of alerts that were swallowed during the exponential backoff period. This may or may not be worth reporting alongside the current issue, but it gives a sense of how bad things have become.
- Returns:
true if notifying the remote service was successful, false otherwise. If false, calls to the reporting service are automatically backed off in the same way as latency failures, so that a slowdown or failure in monitoring does not affect the actual customer experience.