Active incident investigation
Describe what you’re seeing and when it started. The more context you provide — time, service, identifiers — the faster and more precise the results.
“We’re seeing elevated 500s in auth-service since around 14:30 UTC. What’s going on?”
“Investigate latency spikes in payments-api starting 14:32 UTC. Trace ID: abc-123, error: connection pool exhausted.”
“Deployment a3f9c21 went out at 15:45 UTC. Did it cause the error spike we’re seeing in checkout-service?”
“There’s a Jira ticket INC-8823 open for this — can you investigate?”
Follow-up questions
After an initial investigation, ask follow-up questions to drill deeper, challenge a hypothesis, or expand the scope. Traversal retains the context of the investigation, so you don’t need to repeat yourself.
“Can you focus on the database layer specifically?”
“What happened in the 15 minutes before the spike?”
“What’s the blast radius — which other services are affected?”
Service health
Use Traversal to check in on a service or system without a specific incident in mind. Useful for routine health checks, pre-deploy sanity checks, or investigating a vague concern before it becomes a page.
“What’s the health of auth-service over the last 24 hours?”
“Are there any anomalies in the data pipeline right now?”
“What typically goes wrong with the database cluster?”
Dashboards
Point Traversal at a specific dashboard and ask it to interpret what it sees. Useful when you notice something unusual but aren’t sure how to read it, or want a second opinion on whether a pattern is significant.
“Take a look at the payments dashboard and tell me if anything looks off.”
“Walk me through what you see in the prod-us-east latency dashboard from this morning.”
“Is the spike on the error rate dashboard related to the deployment that went out at 15:45?”
System exploration and onboarding
Traversal can map dependencies, explain service relationships, and surface historical incident patterns — making it a powerful tool for engineers ramping up on an unfamiliar system or preparing for on-call.
“What does payments-api depend on, and has anything changed recently?”
“Walk me through the checkout flow and its dependencies.”
“I’m new to the payments team — what are the most common failure modes in this service?”
“What should I know about this service before going on-call?”
Code and pull requests
When Traversal identifies a root cause tied to a code issue, it can look at the relevant commits, identify the change that introduced the regression, and open a pull request with a fix.
“The root cause is a missing null check in payments-api — can you open a PR to fix it?”
“Create a pull request with a fix for the connection pool exhaustion we identified.”
“Look at recent changes to checkout-service and tell me if any of them could have caused this.”
“Which commit introduced this regression?”
Alert trends and recommendations
Traversal can analyze patterns across your alert channels to surface which alerts are noisy, which are flapping, and which thresholds need adjustment.
“Which alerts are firing most frequently this week?”
“Are there any alerts that keep firing and resolving on their own? I want to identify what’s flapping.”
“Give me recommendations for which alerts we should tune or suppress.”
“This alert has fired 47 times in the last 7 days. Is it signaling anything real, or should we adjust the threshold?”
Post-mortems
Traversal can generate a blameless post-mortem from an incident channel or from context you provide directly. See Post-mortems for the full options, including automatic post-mortem generation in Slack.
“@Traversal write a post-mortem for this incident.”
“Create a blameless post-mortem for today’s auth outage starting at 09:12 UTC.”