
Incident Response

The Incident Response plugin acts as an AI first-responder for live incidents. It correlates the incident window with recent deployments, surfaces anomalies from logs and metrics, suggests immediate mitigations, and drafts a post-mortem timeline — so your team can focus on resolving the incident, not reconstructing it.

Works with Azure DevOps Work Items, GitHub Issues, or plain text input.


Each run produces five structured outputs, posted directly on the incident item as one comment per output. The original description is never modified.

| # | Output | Description |
|---|--------|-------------|
| 1 | 🚨 Incident Summary | Severity, affected services, blast radius window, status signal |
| 2 | 🚀 Deployment Correlation | Recent deployments within the incident window; each flagged as likely-cause / possible-cause / unrelated |
| 3 | 📊 Logs & Metrics Analysis | Error spikes, latency degradation, anomaly timestamps surfaced from Azure Monitor |
| 4 | 🛠️ Mitigation Suggestions | Immediate actions, rollback candidates, config change hints, blast radius containment |
| 5 | 📝 Post-Mortem Draft | Chronological timeline of events, contributing factors, action items |

A status signal is applied as a tag/label on the incident item:

| Signal | Meaning |
|--------|---------|
| investigating | Correlation found; root cause not yet confirmed |
| resolved | Root cause identified and mitigation confirmed |
| needs-data | Insufficient signal; the output lists what data is missing |

Sections with no real findings are skipped, never filled with “None identified.”


flowchart TD
    A[Receive incident ID / alert] --> B[Fetch incident details from platform]
    B --> C[Identify blast radius time window]
    C --> D[Phase 1: 3 analysts in parallel]
    D --> D1[Deployment Correlator]
    D --> D2[Log Analyzer]
    D --> D3[Metrics Analyzer]
    D1 & D2 & D3 --> E[Mitigation Advisor]
    E --> F[Post-Mortem Drafter]
    F --> G[Post 5 structured outputs to incident]

  1. Fetch incident — pulls the work item from Azure DevOps (REST API), GitHub issue, or pasted text. Extracts severity, start time, affected service(s), and any description the team has already written.
  2. Identify time window — computes a blast radius window ending at incident start time and extending back by INCIDENT_WINDOW_HOURS (default: 2 hours). All subsequent queries use this window.
  3. Phase 1 (parallel) — three analysts run simultaneously:
    • Deployment Correlator — queries ADO Release / Pipeline history or GitHub Actions runs for any deployments that touched the affected services within the blast radius window. Each deployment is tagged as likely-cause, possible-cause, or unrelated based on timing and service overlap.
    • Log Analyzer — runs KQL queries against Azure Monitor Log Analytics (az monitor log-analytics query) targeting error counts, exception traces, and latency anomalies within the window.
    • Metrics Analyzer — reads a metrics snapshot (JSON export or az monitor metrics list output) and identifies deviations from baseline for error rate, latency P95/P99, and saturation.
  4. Mitigation Advisor — synthesises Phase 1 output and suggests concrete next steps: rollback candidates (with deployment IDs), config changes, hotfix hints, and blast radius containment actions.
  5. Post-Mortem Drafter — builds a chronological timeline from all findings, drafts contributing factors, and lists action items as discussion prompts for the team.
  6. Post outputs — five ordered comments on the incident item. For unsupported platforms, output is written to incident-response-report.md.
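
Steps 2 and 3 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the plugin's actual code: the function names and the 30-minute "likely" threshold are hypothetical, since the documentation says only that tagging is based on timing and service overlap.

```python
from datetime import datetime, timedelta, timezone


def blast_radius_window(incident_start, window_hours=2.0):
    """Return (window_start, window_end).

    The window ends at the incident start time and extends back by
    window_hours (the documented default for INCIDENT_WINDOW_HOURS is 2).
    """
    window_end = incident_start
    window_start = incident_start - timedelta(hours=window_hours)
    return window_start, window_end


def tag_deployment(deployed_at, deployed_services, affected_services,
                   window, likely_within_minutes=30):
    """Tag one deployment as likely-cause, possible-cause, or unrelated.

    ASSUMPTION: the 30-minute cut-off is a made-up heuristic used only
    to make the example concrete.
    """
    window_start, window_end = window
    in_window = window_start <= deployed_at <= window_end
    overlaps = bool(set(deployed_services) & set(affected_services))

    # Outside the window, or touching none of the affected services.
    if not (in_window and overlaps):
        return "unrelated"
    # Deployed shortly before the incident started.
    if window_end - deployed_at <= timedelta(minutes=likely_within_minutes):
        return "likely-cause"
    return "possible-cause"
```

For an incident that starts at 12:00 UTC with the default window, a deployment to an affected service at 11:45 would be tagged likely-cause under this heuristic, one at 10:30 possible-cause, and one at 09:00 unrelated.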

| Input | Source | Required | Description |
|-------|--------|----------|-------------|
| Repository URL | Agent rule | Yes | The repository containing deployment history; provided by the Xianix Agent rule, not typed in the prompt |
| Incident ID / URL | Prompt | Yes | ADO work item ID, GitHub issue number, or full item URL (e.g. 4821) |
| Blast radius window | INCIDENT_WINDOW_HOURS env var | No | Hours before incident start to search for deployments (default: 2) |
| Metrics snapshot | METRICS_SOURCE env var | No | Path to a JSON file exported from az monitor metrics list |

The platform (Azure DevOps, GitHub, etc.) is auto-detected from git remote — you don’t need to specify it.
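
To make the auto-detection concrete, here is a minimal sketch of how a platform could be inferred from a git remote URL. How the plugin really does this is not documented here, so the function name, the host list, and the "unknown" fallback are all assumptions. Only the return values "github" and "azuredevops" come from this page.

```python
import re
from urllib.parse import urlparse


def detect_platform(remote_url):
    """Guess the hosting platform from a git remote URL (hypothetical helper)."""
    # scp-style SSH remotes look like git@github.com:org/repo.git
    scp_like = re.match(r"^[\w.-]+@([\w.-]+):", remote_url)
    if scp_like:
        host = scp_like.group(1)
    else:
        host = urlparse(remote_url).hostname or ""
    host = host.lower()

    if host == "github.com":
        return "github"
    # Azure DevOps remotes use dev.azure.com, ssh.dev.azure.com,
    # or the older <org>.visualstudio.com hosts.
    if host in ("dev.azure.com", "ssh.dev.azure.com") or host.endswith(".visualstudio.com"):
        return "azuredevops"
    return "unknown"
```

The input would typically be the output of `git remote get-url origin`.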


/incident-response 4821
/incident-response INC-4821
/incident-response https://dev.azure.com/org/project/_workitems/edit/4821
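
All three invocations above refer to the same incident. The sketch below shows one way such arguments could be normalised to a numeric ID. It is illustrative only: the helper name and the accepted patterns are assumptions, and the plugin may resolve prefixed IDs such as INC-4821 differently.

```python
import re


def parse_incident_ref(ref):
    """Extract a numeric incident ID from an ID, prefixed ID, or item URL."""
    ref = ref.strip()

    # Full item URLs: Azure DevOps work item or GitHub issue.
    url_match = (re.search(r"/_workitems/edit/(\d+)", ref)
                 or re.search(r"/issues/(\d+)", ref))
    if url_match:
        return int(url_match.group(1))

    # Bare or prefixed IDs: 4821, #4821, INC-4821.
    id_match = re.fullmatch(r"(?:[A-Za-z]+-|#)?(\d+)", ref)
    if id_match:
        return int(id_match.group(1))

    raise ValueError(f"Unrecognised incident reference: {ref!r}")
```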

The Xianix Agent reads the environment variables listed below from its secrets store and injects them at runtime via the rule’s with-envs block (see the rule examples below). For local CLI use, export them in your shell.

| Variable | Platform | Required | Purpose |
|----------|----------|----------|---------|
| AZURE-DEVOPS-TOKEN | Azure DevOps | Yes | PAT for work items, build, and release pipeline API |
| GITHUB-TOKEN | GitHub | Yes | Authenticate gh CLI for issues and Actions API |
| ACTIONS-TOKEN | GitHub Actions | Optional | Separate token if Actions API needs elevated scope |
| AZURE-CLIENT-ID | Azure Monitor | Yes (for logs) | Service principal client ID for az CLI authentication |
| AZURE-CLIENT-SECRET | Azure Monitor | Yes (for logs) | Service principal secret |
| AZURE-TENANT-ID | Azure Monitor | Yes (for logs) | Azure AD tenant ID |
| LOG-ANALYTICS-WORKSPACE-ID | Azure Monitor | Yes (for logs) | Log Analytics workspace ID to query |
| INCIDENT-WINDOW-HOURS | All | No | Blast radius window in hours (default: 2) |
| METRICS-SOURCE | All | No | Path to metrics JSON snapshot file |
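
The Azure service principal needs the following role assignments:

| Role | Scope | Purpose |
|------|-------|---------|
| Log Analytics Reader | Log Analytics Workspace | Run az monitor log-analytics query |
| Monitoring Reader | Resource Group / Subscription | Read metrics via az monitor metrics list |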
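
Before a run, it can help to check that the variables marked as required are actually set. The following is a hedged sketch of such a preflight check, not something the plugin ships: the function name and the grouping of variables are assumptions, and the variable names are copied as written in the environment variable table.

```python
import os

# Variables marked "Yes" per platform in the table above.
PLATFORM_VARS = {
    "azuredevops": ["AZURE-DEVOPS-TOKEN"],
    "github": ["GITHUB-TOKEN"],
}

# Variables marked "Yes (for logs)".
LOG_VARS = [
    "AZURE-CLIENT-ID",
    "AZURE-CLIENT-SECRET",
    "AZURE-TENANT-ID",
    "LOG-ANALYTICS-WORKSPACE-ID",
]


def missing_variables(platform, need_logs, env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    names = PLATFORM_VARS.get(platform, []) + (LOG_VARS if need_logs else [])
    return [name for name in names if not env.get(name)]
```

Passing an explicit mapping as `env` keeps the check easy to exercise without touching the real environment.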

# Point Claude Code at the plugin
claude --plugin-dir /path/to/xianix-plugins-official/plugins/incident-response
# Then in the chat
/incident-response 4821

See docs/platform-config.md for full credential setup and docs/incident-sources.md for how to configure log and metrics sources.

Or trigger it automatically via the Xianix Agent by adding a rule — see the examples below and the Rules Configuration guide.


Add one (or both) of the execution blocks below to your rules.json so the Xianix Agent automatically kicks off incident response when a webhook fires.

The Incident Response Agent is tag-driven. It runs when the ai-dlc/incident/respond label (GitHub) or tag (Azure DevOps) is present and one of the following happens (OR logic across match-any entries):

| Scenario | What it covers |
|----------|----------------|
| Tag newly applied | A human (or on-call automation) adds ai-dlc/incident/respond to an existing incident item |
| Item created with tag already present | The incident is opened with the tag included from the start |

| Platform | Scenario | Webhook event | Filter rule |
|----------|----------|---------------|-------------|
| GitHub | Tag newly applied | issues | action==labeled and the just-added label.name=='ai-dlc/incident/respond' |
| GitHub | Issue opened with tag | issues | action==opened and ai-dlc/incident/respond is in issue.labels |
| Azure DevOps | Tag newly applied | workitem.updated | ai-dlc/incident/respond appears in new System.Tags but not in oldValue |
| Azure DevOps | Work item created with tag | workitem.created | ai-dlc/incident/respond is in resource.fields["System.Tags"] |
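
The trigger conditions above can be expressed as plain predicates over the webhook payload. This is a sketch for illustration, not the agent's rule engine. It assumes the usual payload shapes (GitHub's top-level `label` object on `labeled` events, and Azure DevOps `System.Tags` as a semicolon-separated string), and it compares whole tag names, which may be stricter than the `*=` operator used in the rules further down.

```python
TAG = "ai-dlc/incident/respond"


def split_tags(value):
    """Split an Azure DevOps System.Tags string such as 'a; b; c' into a set."""
    return {t.strip() for t in (value or "").split(";") if t.strip()}


def github_should_trigger(payload, tag=TAG):
    """True when the tag was just added, or the issue was opened with it."""
    action = payload.get("action")
    if action == "labeled":
        return payload.get("label", {}).get("name") == tag
    if action == "opened":
        labels = payload.get("issue", {}).get("labels", [])
        return any(label.get("name") == tag for label in labels)
    return False


def azuredevops_should_trigger(payload, tag=TAG):
    """True when the tag is new on an update, or present at creation."""
    resource = payload.get("resource", {})
    event = payload.get("eventType")
    if event == "workitem.updated":
        new_tags = resource.get("revision", {}).get("fields", {}).get("System.Tags")
        change = resource.get("fields", {}).get("System.Tags") or {}
        old_tags = change.get("oldValue")
        return tag in split_tags(new_tags) and tag not in split_tags(old_tags)
    if event == "workitem.created":
        return tag in split_tags(resource.get("fields", {}).get("System.Tags"))
    return False
```

Checking the old value on updates is what stops the agent from firing again every time an already-tagged work item is edited.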

Each execution block in rules.json follows this top-level shape:

| Field | Purpose |
|-------|---------|
| name | Human-readable id for the execution |
| platform | "github" or "azuredevops"; drives which provider the plugin uses |
| repository.url | Webhook path to the repository URL (e.g. repository.clone_url). Omit the entire repository block for Azure DevOps work items, because the work item itself is not bound to a single repo. |
| match-any | Array of trigger filters; the first one to match wins |
| use-inputs | Minimal; usually just the entry-point id (e.g. issue-number, workitem-id). The repository URL is injected automatically from the repository block when present. |
| use-plugins | The plugin to invoke |
| with-envs | Required environment variables, sourced from the agent’s secrets.* store and marked mandatory: true |
| execute-prompt | The prompt sent to the agent. Implicit interpolations: {{repository-name}} from the repository block (when present), plus any name from use-inputs |

{
  "name": "github-issue-incident-response",
  "platform": "github",
  "repository": {
    "url": "repository.clone_url"
  },
  "match-any": [
    {
      "name": "github-issue-tag-applied",
      "rule": "action==labeled&&label.name=='ai-dlc/incident/respond'"
    },
    {
      "name": "github-issue-opened-with-tag",
      "rule": "action==opened&&issue.labels.*.name=='ai-dlc/incident/respond'"
    }
  ],
  "use-inputs": [
    { "name": "issue-number", "value": "issue.number", "mandatory": true }
  ],
  "use-plugins": [
    {
      "plugin-name": "incident-response@xianix-plugins-official",
      "marketplace": "xianix-team/plugins-official"
    }
  ],
  "with-envs": [
    { "name": "GITHUB-TOKEN", "value": "secrets.GITHUB-TOKEN", "mandatory": true }
  ],
  "execute-prompt": "Issue #{{issue-number}} in {{repository-name}} has been tagged with `ai-dlc/incident/respond`.\n\nRun /incident-response {{issue-number}} to begin automated incident investigation."
}

{
  "name": "azuredevops-workitem-incident-response",
  "platform": "azuredevops",
  "match-any": [
    {
      "name": "azuredevops-workitem-tag-applied",
      "rule": "eventType==workitem.updated&&resource.revision.fields.\"System.Tags\"*='ai-dlc/incident/respond'&&resource.fields.\"System.Tags\".oldValue!*='ai-dlc/incident/respond'"
    },
    {
      "name": "azuredevops-workitem-created-with-tag",
      "rule": "eventType==workitem.created&&resource.fields.\"System.Tags\"*='ai-dlc/incident/respond'"
    }
  ],
  "use-inputs": [
    { "name": "workitem-id", "value": "resource.workItemId", "mandatory": true }
  ],
  "use-plugins": [
    {
      "plugin-name": "incident-response@xianix-plugins-official",
      "marketplace": "xianix-team/plugins-official"
    }
  ],
  "with-envs": [
    { "name": "AZURE-DEVOPS-TOKEN", "value": "secrets.AZURE-DEVOPS-TOKEN", "mandatory": true }
  ],
  "execute-prompt": "Work item #{{workitem-id}} has been tagged with `ai-dlc/incident/respond`.\n\nRun /incident-response {{workitem-id}} to begin automated incident investigation."
}
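
To show how the `{{name}}` placeholders in execute-prompt might be filled from use-inputs, here is a minimal sketch. The agent's real interpolation logic is not documented on this page, so the function name and the choice to raise an error on an unknown placeholder are assumptions. The sample values are invented.

```python
import re

PLACEHOLDER = re.compile(r"\{\{\s*([\w-]+)\s*\}\}")


def render_prompt(template, values):
    """Replace each {{name}} in template with values[name].

    Raises KeyError if the template references a name that was not
    supplied (an assumption; the agent may behave differently).
    """
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(name)
        return str(values[name])

    return PLACEHOLDER.sub(substitute, template)


# Example values only: issue-number would come from use-inputs and
# repository-name from the repository block.
prompt = render_prompt(
    "Issue #{{issue-number}} in {{repository-name}} has been tagged.",
    {"issue-number": 4821, "repository-name": "example-repo"},
)
```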