Firetiger
Designing investigation interfaces and monitoring dashboards for security analysts to trace, correlate, and resolve incidents.
Auth Latency Spike
Webhook trigger
2 min ago
Auth service p99 latency exceeded 500ms SLO. Investigate root cause across traces and notify the on-call channel.
Reading runbook · Complete
Research on service topology · Complete
Retrieve relevant traces · Complete
Query connection pool metrics · Complete
Correlate deploy timeline · Complete
Agent: Latency Root Cause · Complete
Found 3 related incidents · Complete
Investigation UI
Designed the agent session UI at Firetiger, the primary surface for viewing how autonomous agents investigate production incidents. Each session renders the trigger message followed by a step-by-step execution trace of research actions, queries, and tool calls, ending with a final outcome status. The interface lets engineers follow the agent's reasoning, inspect individual actions, and understand how production data, codebase context, and business logic were correlated to reach a resolution.
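The session layout described above (trigger message, ordered steps, final outcome) can be sketched as a small data model. This is only an illustrative sketch, not Firetiger's actual schema: the class names, the `Running` and `Failed` statuses, and the `render` helper are all assumptions. Only the `Complete` status and the step labels come from the mockup.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    # "Complete" appears in the mockup; the other two are assumed.
    RUNNING = "Running"
    COMPLETE = "Complete"
    FAILED = "Failed"


@dataclass
class Step:
    label: str                      # e.g. "Retrieve relevant traces"
    status: Status = Status.RUNNING


@dataclass
class Session:
    title: str                      # e.g. "Auth Latency Spike"
    trigger: str                    # e.g. "Webhook trigger"
    message: str                    # trigger message shown at the top
    steps: list[Step] = field(default_factory=list)

    def outcome(self) -> Status:
        """Derive the final outcome from the step statuses."""
        if any(s.status is Status.FAILED for s in self.steps):
            return Status.FAILED
        if self.steps and all(s.status is Status.COMPLETE for s in self.steps):
            return Status.COMPLETE
        return Status.RUNNING


def render(session: Session) -> str:
    """Render the session as plain text: header, one line per step, outcome."""
    lines = [session.title, session.trigger, session.message]
    lines += [f"{s.label} · {s.status.value}" for s in session.steps]
    lines.append(f"Outcome: {session.outcome().value}")
    return "\n".join(lines)
```

Deriving the outcome from the steps, rather than storing it separately, keeps the header status and the trace from disagreeing while a session is still running.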
Monitoring Agents · 3 active
Auth Latency Monitor
Every 15m · 142 sessions
Connection Pool Watch
Every 5m · 89 sessions
Deploy Health Check
Post-deploy · 24 sessions
Availability
99.97%
Error Budget
12.3%
Issues
3 open
SLO Monitoring
Built the monitoring and observer views at Firetiger, combining autonomous agent management with SLO dashboards over the data lake. Engineers configure monitoring agents with scheduled, webhook, or post-deploy triggers, then track session outcomes alongside key indicators such as availability, error budget, and open issues across logs, traces, and metrics.
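The three trigger types mentioned above could be modeled roughly as follows. This is a hypothetical sketch for illustration, not Firetiger's configuration format: `MonitoringAgent`, `should_run`, and the event names are assumed, while the agent names and intervals are taken from the mockup.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

TRIGGERS = ("scheduled", "webhook", "post-deploy")


@dataclass(frozen=True)
class MonitoringAgent:
    name: str
    trigger: str                          # one of TRIGGERS
    interval: Optional[timedelta] = None  # set only for scheduled agents

    def __post_init__(self):
        if self.trigger not in TRIGGERS:
            raise ValueError(f"unknown trigger: {self.trigger}")
        if (self.trigger == "scheduled") != (self.interval is not None):
            raise ValueError("interval is required for, and only for, scheduled agents")


def should_run(agent: MonitoringAgent, event: Optional[str],
               now: datetime, last_run: Optional[datetime]) -> bool:
    """Decide whether to start a session.

    `event` is None on a timer tick, or "webhook" / "deploy" (assumed names).
    """
    if agent.trigger == "scheduled":
        return event is None and (last_run is None or now - last_run >= agent.interval)
    if agent.trigger == "webhook":
        return event == "webhook"
    return event == "deploy"


# Agents as they appear in the mockup.
agents = [
    MonitoringAgent("Auth Latency Monitor", "scheduled", timedelta(minutes=15)),
    MonitoringAgent("Connection Pool Watch", "scheduled", timedelta(minutes=5)),
    MonitoringAgent("Deploy Health Check", "post-deploy"),
]
```

Validating the trigger and interval together at construction time means a misconfigured agent fails when it is created, not when it is next scheduled.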