Grafana Assistant Pre-Builds Infrastructure Knowledge to Cut Incident Response Time
NEW YORK — Grafana Labs today unveiled a major update to its AI-powered observability assistant, Grafana Assistant, that eliminates the time-consuming context-sharing step during incident response. The assistant now automatically builds a persistent knowledge base of an organization's infrastructure before any query is made, allowing engineers to dive directly into troubleshooting.
"Assistant doesn't learn about your environment on demand. Instead, it studies your infrastructure ahead of time and builds a persistent knowledge base," said Tom Wilkie, Grafana Labs CTO. "By the time you ask your first question, it already knows what's running, how it's connected, and where to look."
The update addresses a common pain point: when an unexpected alert fires, engineers typically ask an AI assistant for help—only to spend precious minutes explaining data sources, services, labels, and metrics. With the new feature, conversations no longer start from scratch.
Background: The Context-Sharing Crunch
During an incident, speed is critical. Traditional AI assistants require extensive setup or real-time context injection, wasting valuable time. "That discovery process eats into the time you actually need for troubleshooting," noted Jennifer Lee, senior product manager at Grafana Labs.
Grafana Assistant's pre-built knowledge base changes that. It automatically identifies all connected Prometheus, Loki, and Tempo data sources in a Grafana Cloud stack, scans for services and deployments, and correlates logs and traces with metrics.
How It Works: Zero-Configuration Swarm of AI Agents
The system runs in the background with no setup required. A swarm of AI agents performs three key tasks:
- Data source discovery: Identifies all connected Prometheus, Loki, and Tempo data sources.
- Metrics scans: Queries Prometheus in parallel to find services, deployments, and infrastructure components.
- Enrichments via logs and traces: Correlates Loki and Tempo data with corresponding metrics, adding context about log formats, trace structures, and service dependencies.
For each discovered service group, the agents produce structured documentation covering five areas: the service's identity, key metrics and labels, deployment details, dependencies, and critical integration points.
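The pipeline above can be sketched in a few lines. This is an illustrative model only, assuming nothing about Grafana's actual internals: the `ServiceDoc` fields mirror the five documentation areas named in the article, and the stub scanners stand in for real Prometheus, Loki, and Tempo queries.

```python
from dataclasses import dataclass, field

@dataclass
class ServiceDoc:
    # The five documentation areas described above (illustrative schema).
    identity: str
    key_metrics: list = field(default_factory=list)
    deployment: dict = field(default_factory=dict)
    dependencies: list = field(default_factory=list)
    integration_points: list = field(default_factory=list)

def build_knowledge_base(datasources, scan_metrics, enrich):
    """Sketch of the three background tasks: walk the discovered data
    sources, scan each one for services and their metrics, then enrich
    every service with log/trace context — all before any user query."""
    kb = {}
    for ds in datasources:
        for svc, metrics in scan_metrics(ds):          # metrics scan
            doc = kb.setdefault(svc, ServiceDoc(identity=svc))
            doc.key_metrics.extend(metrics)
            doc.dependencies.extend(enrich(ds, svc))   # log/trace correlation
    return kb

# Hypothetical stubs standing in for real data-source queries.
def scan_metrics(ds):
    return [("payments", ["request_latency_seconds"])] if ds == "prometheus-prod" else []

def enrich(ds, svc):
    return ["checkout", "ledger"] if svc == "payments" else []

kb = build_knowledge_base(["prometheus-prod", "loki-prod"], scan_metrics, enrich)
print(kb["payments"].dependencies)  # → ['checkout', 'ledger']
```

The point of the sketch is the ordering: the knowledge base is fully populated up front, so a later question about the hypothetical `payments` service can be answered from `kb` without any live discovery.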
What This Means for Incident Response
With context preloaded, conversations become faster and more accurate. When an engineer asks about a service, the assistant already knows, for example, that the payment system talks to three downstream services and its latency metrics live in a specific Prometheus data source.
This capability is especially powerful for teams where not everyone has full infrastructure knowledge. A developer investigating an issue in their own service can ask about upstream dependencies and get accurate answers, even without prior familiarity. "Having that context preloaded can shave valuable minutes off your response time," said Wilkie, "even if you're experienced with the system."
Grafana Assistant is available now for Grafana Cloud customers. The company plans to extend the feature to support additional data sources and custom infrastructure types in future releases. For more details, visit the Grafana Labs website.