How has site reliability engineering revealed hidden issues with early alerts?
#1
Site reliability engineering is about preventing outages, but sometimes the most valuable lessons come from the times things break. What's a specific, non-obvious metric or alert you've set up that gave you an early warning of a problem users hadn't even noticed yet?