This section covers the observability capabilities deployed, how many tools were used for those capabilities, open-source usage for those capabilities, whether telemetry data is unified or siloed, what kinds of data are being integrated with telemetry data, best practices employed, the annual observability spend at the time of the survey, and how often respondents use observability.
Highlights:
- 67% were spending at least $1 million per year on observability
- 51% were using an open-source solution for one or more observability capabilities
- 45% were using 5+ tools for observability
Observability capabilities deployed
Capabilities, not to be confused with characteristics or tools, are specific components of observability. Survey respondents shared which of 19 different observability capabilities they had deployed. Below are findings by capability, by number of capabilities deployed, and by how many organizations had achieved full-stack observability.
By capability
Survey respondents indicated how widely their organizations had deployed each capability, with adoption ranging from 58% at the high end (security monitoring) to 24% at the low end (artificial intelligence for IT operations, or AIOps, capabilities).
- At least half had deployed each of the core observability capabilities, including security monitoring (58%), network monitoring (57%), database monitoring (55%), alerts (55%), dashboards (54%), infrastructure monitoring (54%), log management (51%), and application performance monitoring (APM; 50%).
- More than a third had deployed key digital experience monitoring (DEM) capabilities—including browser monitoring (44%), error tracking (43%), and mobile monitoring (35%)—as well as artificial intelligence (AI) monitoring (42%) and business observability (40%).
- Less than a third had deployed each of the more advanced capabilities, including AIOps capabilities (24%), synthetic monitoring (26%), distributed tracing (29%), Kubernetes (K8s) monitoring (29%), machine learning (ML) model monitoring (29%), and serverless monitoring (30%).
Organization size insight
Large organizations were the most likely to deploy all capabilities except for AI monitoring, business observability, and serverless monitoring.
Regional insight
Respondents surveyed in Asia Pacific were more likely to deploy AI monitoring, AIOps, and synthetic monitoring, but the least likely to deploy all other capabilities. Those surveyed in Europe were the most likely to deploy most capabilities.
Industry insight
IT respondents were generally more likely than average to deploy most capabilities. Media/entertainment respondents were the most likely to deploy AI-related capabilities and DEM capabilities.
Those who had deployed at least five capabilities experienced less annual downtime, spent less on outages per year, and spent less engineering time addressing disruptions than those who had deployed four or fewer:
- 5+ capabilities: 45% lower median annual downtime, and 24% less engineering time spent addressing disruptions
- 10+ capabilities: 74% lower median annual downtime, 32% lower median hourly outage costs, and 41% less engineering time spent addressing disruptions
- 15+ capabilities: 80% lower median annual downtime, 47% lower median hourly outage costs, and 39% less engineering time spent addressing disruptions
had deployed at least 10 observability capabilities
Organization size insight
Large organizations were the most likely to deploy 10 or more capabilities (40%), followed by midsize (35%) and small (27%).
Regional insight
Respondents surveyed in Europe were the most likely to deploy 10 or more capabilities (46%), followed by the Americas (42%) and Asia Pacific (29%).
Industry insight
IT respondents were the most likely to deploy 10 or more capabilities (53%), followed by healthcare/pharma (48%) and services/consulting (43%).
Full-stack observability prevalence
Based on our definition of full-stack observability, a quarter (25%) of survey respondents’ organizations had achieved it.
Notably, organizations that had achieved full-stack observability experienced 79% less median annual downtime, 48% lower median hourly outage costs, and 44% less time spent addressing disruptions than those that hadn’t achieved it. They also had a 27% lower annual observability spend and were 51% more likely to learn about interruptions with observability. And they were more likely to employ all observability best practices and experience most benefits and business outcomes.
75% had NOT achieved full-stack observability
Organization size insight
Large organizations were the most likely to have achieved full-stack observability (27% compared to 23% for midsize and 20% for small).
Regional insight
Those surveyed in Europe were the most likely to have achieved full-stack observability (32% compared to 29% for Asia Pacific and 28% for the Americas).
Industry insight
IT respondents were the most likely to have achieved full-stack observability (35%), followed by healthcare/pharma (34%) and services/consulting (31%). Education respondents were the least likely to have achieved full-stack observability (11%), followed by telco (14%), energy/utilities (15%), and government (15%).
Number of monitoring tools
When asked how many tools (not to be confused with capabilities or characteristics) they use to monitor the health of their systems, survey respondents overwhelmingly reported using more than one.
- Most (88%) were using multiple tools, including 45% who were using five or more tools (compared to 52% in 2023 and 73% in 2022) and 3% who were using 10 or more tools.
- The average (mean) number of tools used was 4.5, which is 11% fewer than in 2023 (5.1) and 24% fewer than in 2022 (5.9). Similarly, the median number of tools went from six in 2022, to five in 2023, to four in 2024. The most common answer (mode) for 2024 was three tools (18%), followed by five tools (15%).
- Only 6% used just one tool. However, the proportion of respondents using a single tool increased by 37% year-over-year (YoY).
Compared to those using multiple tools for observability, those using a single tool experienced the following benefits:
- 65% lower median annual observability spend ($700,000 compared to $2 million)
- 18% less median annual downtime (249 hours per year compared to 305 hours per year)
- 45% lower median hourly outage costs ($1.1 million per hour compared to $2.0 million per hour)
- 50% less engineering time spent addressing disruptions (about seven hours compared to 13 hours based on a 40-hour work week)
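For readers checking the math, these are relative differences between the single-tool and multi-tool medians quoted above, not percentage-point differences. A minimal sketch of the calculation, using the medians above (rounding of the underlying medians can shift a result by a point or two):

```python
# Relative difference between two medians: 1 - (single-tool median / multi-tool median).
# Inputs are the medians quoted above; rounding can shift the printed percentages
# slightly relative to the report's figures.
def relative_reduction(single_tool: float, multi_tool: float) -> float:
    return 1 - single_tool / multi_tool

print(f"{relative_reduction(0.7, 2.0):.0%} lower annual observability spend")  # ~65%
print(f"{relative_reduction(249, 305):.0%} less annual downtime")              # ~18%
print(f"{relative_reduction(1.1, 2.0):.0%} lower hourly outage costs")         # ~45%
```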
45% were using 5+ tools for observability
Organization size insight
Small organizations were much more likely to use a single tool (17%) than midsize (7%) or large (4%) organizations, while large and midsize organizations were much more likely to use 5+ tools (47% for both compared to just 26% for small).
Regional insight
Respondents surveyed in Europe were the most likely to use a single tool (8% compared to 6% for those in both the Americas and Asia Pacific), while those surveyed in Asia Pacific were the most likely to use 5+ tools (55% compared to 43% for those in Europe and 35% for those in the Americas).
Industry insight
Healthcare/pharma respondents were the most likely to use a single tool (13%), followed by education (10%) and energy/utilities (9%). Media/entertainment respondents were the most likely to use 5+ tools (60%), followed by financial services/insurance (57%) and telco (55%).
Open-source usage
When we asked survey takers whether they were using an open-source solution in addition to a proprietary solution for each of the 19 observability capabilities listed above, we found that:
- More than half (51%) of respondents were using an open-source solution for one or more observability capabilities, but only about 1% were using open-source solutions exclusively.
- Of the three open-source solutions included in the survey, Grafana was the most widely used for one or more observability capabilities (38%), followed by Prometheus (23%) and OpenTelemetry (19%).
- More than a quarter were using an open-source solution for AI monitoring (31%), synthetic monitoring (28%), distributed tracing (28%), K8s monitoring (27%), APM (27%), and AIOps capabilities (26%).
Respondents were least likely to select open-source support (portability) as the most important observability vendor criterion (16%), and only 21% said that adoption of open-source technologies is a strategy or trend driving the need for observability. However, 24% said they were most likely to use open source in the next year to maximize value from their observability investment.
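To ground what using an open-source solution for a single observability capability can look like in practice, the sketch below exposes application metrics with the open-source Prometheus Python client. It is a minimal illustration; the metric names, label, and port are assumptions rather than anything specified in the survey.

```python
# Minimal sketch: exposing application metrics for an open-source Prometheus server
# to scrape. Requires the prometheus_client package; names and the port are
# illustrative assumptions.
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

REQUESTS = Counter("checkout_requests_total", "Total checkout requests", ["status"])
LATENCY = Histogram("checkout_request_seconds", "Checkout request latency")

def handle_request() -> None:
    with LATENCY.time():                       # record request duration
        time.sleep(random.uniform(0.01, 0.1))  # stand-in for real work
    REQUESTS.labels(status="ok").inc()         # count the completed request

if __name__ == "__main__":
    start_http_server(8000)  # metrics served at http://localhost:8000/metrics
    while True:
        handle_request()
```

A Prometheus server would scrape this endpoint on its own schedule, and a tool such as Grafana could then chart the resulting series.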
51% were using an open-source solution for one or more observability capabilities
Organization size insight
Large organizations were much more likely to use an open-source solution for one or more observability capabilities (55%) compared to midsize (46%) and small (39%) organizations.
Regional insight
Respondents surveyed in Asia Pacific were much more likely to use an open-source solution for one or more observability capabilities (61%) compared to those surveyed in the Americas (44%) and Europe (42%).
Industry insight
Government respondents were the most likely to use an open-source solution for one or more observability capabilities (65%), followed closely by telco (also 65%) and financial services/insurance (61%). Services/consulting respondents were the least likely (37%), followed by energy/utilities (43%) and healthcare/pharma (45%).
Unified or siloed telemetry data
When we asked survey respondents about how unified or siloed their organizations’ telemetry data (metrics, events, logs, and traces, or MELT) is, we found:
- Collectively, 38% had more unified telemetry data (increased by 2% from 2023), compared to 37% with more siloed telemetry data (decreased by 6% from 2023)—a roughly even split.
- Only 12% said they had mostly unified telemetry data (they unify telemetry data in one place), and 11% said they had mostly siloed telemetry data (they silo telemetry data in discrete data stores).
- About a quarter (24%) said their telemetry data is roughly equally unified and siloed (increased by 12% from 2023).
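As a rough illustration of what unifying telemetry data in one place versus siloing it in discrete data stores means in practice, the sketch below configures OpenTelemetry so that traces and metrics are exported to a single OTLP endpoint; pointing each signal at a different backend would be the siloed alternative. The endpoint, service name, and instrument names are assumptions for illustration only.

```python
# Minimal sketch: exporting traces and metrics to one OTLP endpoint so both signals
# land in the same backend. Requires the opentelemetry-sdk and OTLP exporter
# packages; the endpoint and names are illustrative assumptions.
from opentelemetry import metrics, trace
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

ENDPOINT = "http://localhost:4317"  # single destination for all telemetry

# Traces and metrics share the same destination, so they can be queried together.
tracer_provider = TracerProvider()
tracer_provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter(endpoint=ENDPOINT)))
trace.set_tracer_provider(tracer_provider)

meter_provider = MeterProvider(
    metric_readers=[PeriodicExportingMetricReader(OTLPMetricExporter(endpoint=ENDPOINT))]
)
metrics.set_meter_provider(meter_provider)

tracer = trace.get_tracer("checkout-service")
orders = metrics.get_meter("checkout-service").create_counter("orders_processed")

with tracer.start_as_current_span("process-order"):
    orders.add(1)  # the metric and the trace end up in the same data store
```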
Those with five or more tools were 13% more likely to say they have siloed telemetry data to some extent (64%) compared to those with one to four tools (57%).
Compared to respondents with more siloed telemetry data, those with more unified telemetry data on average:
- Experienced 78% less annual downtime (107 hours per year compared to 488 hours per year)
- Spent 11% less engineering time addressing disruptions (28% compared to 32%)
- Had a 4% higher median ROI (302% compared to 290%)
said their telemetry data is siloed to some extent
Organization size insight
Large organizations were the most likely to have more siloed telemetry data (38%), followed by midsize (37%) and small (33%) organizations.
Regional insight
Respondents surveyed in Asia Pacific reported more siloed telemetry data (42%) than those in Europe (39%) or the Americas (30%).
Industry insight
The industries with the highest rates of siloed telemetry data were government (53%), media/entertainment (47%), and education (45%). Those with the highest rates of unified telemetry data were telco (53%), retail/consumer (52%), and services/consulting (41%).
Data integration
To practice true business observability, organizations must integrate their business-related data with their telemetry data (MELT). When we reviewed the types of business-related data respondents said they currently integrate, we found that:
- Most (87%) had integrated at least one business-related data type with their telemetry data, including 77% who’d integrated at least two and 35% who’d integrated at least five. Just 4% had integrated all 10.
- Operations data (43%) and customer data (41%) were the most likely to be integrated.
- Product research and human resources data (both 32%) were the least likely to be integrated.
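One common pattern for integrating business-related data with telemetry, offered here as an illustrative assumption rather than the survey's prescribed approach, is to attach business context as attributes on spans or metrics so that incidents can be quantified in business terms. The attribute names and values below are hypothetical.

```python
# Minimal sketch: enriching telemetry with business context as span attributes.
# Attribute names (order.value_usd, customer.tier, sales.channel) are hypothetical;
# a console exporter is used only so the example runs on its own.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("order-service")

with tracer.start_as_current_span("process-order") as span:
    # Business-related data attached alongside the usual operational telemetry,
    # so an outage or latency spike can be tied to business impact.
    span.set_attribute("order.value_usd", 129.99)
    span.set_attribute("customer.tier", "enterprise")
    span.set_attribute("sales.channel", "mobile-app")
```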
Compared to those who had fewer than five business-related data types currently integrated with their telemetry data, those who had integrated five or more:
- Spent 32% less on hourly outage costs ($1.5 million compared to $2.2 million)
- Experienced 63% less annual downtime (139 hours compared to 370 hours)
- Spent 27% less engineering time addressing disruptions (11 hours compared to 15 hours based on a 40-hour work week)
35% had integrated 5+ business-related data types with their telemetry data
Regional insight
Respondents surveyed in Europe were the most likely to integrate 5+ types of business-related data with their telemetry data (39% compared to 34% of those in both the Americas and Asia Pacific).
Industry insight
IT respondents were the most likely to have 5+ types of business-related data integrated with their telemetry data (47%), followed by media/entertainment (41%) and healthcare/pharma (38%). Education respondents were the least likely (19%), followed by energy/utilities (25%) and government (27%).
Best practices employed
We once again asked survey takers which of nine different observability best practices listed in the chart below they employ. We found that:
- Most (83%) had employed at least two best practices, but only 16% had employed five or more.
- Respondents were most likely to say their software deployment uses CI/CD practices (40%) and their infrastructure is provisioned and orchestrated using automation tooling (39%), though both rates were lower than in previous years.
- Compared to last year, 24% more said their telemetry data includes rich metadata and business context to quantify the business impact of events and incidents, 18% more said users broadly have access to telemetry data and visualizations, 13% more said their telemetry is unified in a single pane for consumption across teams, 8% more said their telemetry is captured across the full tech stack, and 2% more said they can query data on the fly.
On average, compared to those employing one to four, those employing five or more observability best practices:
- Experienced 19% less annual downtime (239 hours compared to 294 hours)
- Spent 35% less on hourly outage costs ($1.3 million compared to $2.0 million)
- Spent 38% less engineering time addressing disruptions (21% compared to 34%)
- Were 36% more likely to say MTTD improved to some extent since adopting an observability solution (72% compared to 53%)
- Were 38% more likely to say MTTR improved to some extent since adopting an observability solution (77% compared to 56%)
- Spent 20% less on observability per year ($1.6 million compared to $2.0 million)
Organization size insight
Small organizations were more likely to employ 5+ best practices (24%) than large (17%) and midsize (11%) organizations.
Regional insight
Respondents surveyed in the Americas were the most likely to employ 5+ best practices (21%), followed by those in Asia Pacific (13%) and Europe (12%).
Industry insight
Services/consulting respondents were the most likely to say they employ 5+ best practices (23%), followed by financial services/insurance (21%) and healthcare/pharma (20%).
Annual observability spend
The median annual observability spend was $1.95 million. More than two-thirds (67%) were spending at least $1 million per year on observability, and only 2% were spending $5 million or more. Just 13% were spending less than $500,000 per year.
Seven factors were associated with a lower median annual observability spend:
- Using a single tool for observability: Those using a single tool for observability spent 67% less on observability than those using two or more tools ($700,000 compared to $2.00 million).
- Deploying more observability capabilities: The more capabilities they deployed, the less they spent on observability. For example, those who had deployed five or more observability capabilities spent 13% less on observability per year than those with four or fewer ($1.90 million compared to $2.18 million). Those who had deployed 10 or more observability capabilities spent 30% less on observability per year than those with nine or fewer ($1.50 million compared to $2.15 million).
- Achieving full-stack observability: Those with full-stack observability spent 27% less on observability per year than those without full-stack observability ($1.50 million compared to $2.05 million).
- Learning about interruptions with observability: Those who learn about interruptions with observability spent 23% less on observability per year than those who didn’t ($1.70 million compared to $2.20 million).
- Employing more observability best practices: Those who had employed five or more observability best practices spent 20% less on observability per year than those employing four or fewer ($1.60 million compared to $2.00 million).
- Integrating more types of business-related data with telemetry data: Those who had integrated five or more types of business-related data with their telemetry data spent 10% less on observability per year than those who integrated one to four types ($1.85 million compared to $2.05 million).
- Having more unified telemetry data: Those who had more unified telemetry data spent 5% less on observability per year than those who had more siloed telemetry data ($1.90 million compared to $2.00 million).
Learn about their plans to get the most value out of their observability spend for next year and their median ROI.
67% were spending at least $1 million per year on observability
Organization size insight
As expected, those from large organizations reported higher median spend ($2.20 million) than those from midsize ($1.85 million) and small ($650,000) organizations.
Regional insight
Those surveyed in Asia Pacific reported a higher median spend ($2.50 million) than those in Europe ($1.75 million) or the Americas ($1.30 million).
Industry insight
Media/entertainment respondents reported the highest median annual observability spend ($2.60 million), followed by financial services/insurance ($2.50 million) and telco ($2.35 million). Education respondents reported the lowest spend ($1.00 million), followed by healthcare/pharma ($1.20 million) and services/consulting ($1.40 million).