It’s no secret to anyone working in technology that IT’s operating world is becoming more demanding and complex.
Digital transformation, hybrid working, exponentially increasing data volumes,
greater security risks, and expanding global regulations are all driving up
business demands and expectations for reliable and robust technology
operations. Business leaders expect IT teams to evolve their digital
operational capabilities and support the speed and breadth of technological
capabilities needed to compete.
The speed of change and the breadth of technologies drive IT operations’ challenges. Supporting innovation, multicloud environments, frequent application deployments, microservice architectures, high-performing customer experiences, machine learning model operations (ModelOps), and real-time data processing requirements are all complexities IT operations must address.
In The 2021 State of Digital Operations Management, produced by OpsRamp, survey respondents highlighted the paradoxes of improving operations and rationalizing technology complexities. IT leaders conceded that their biggest barriers to meeting organizational goals include keeping up with the pace of technology innovation (55%) and the pace of business innovation (35%). At the same time, they acknowledged complexities driven by legacy tools (42%), siloed organizations (40%), understaffed IT (30%), and a lack of skills (30%).
What’s the answer for IT to improve operations amid growing complexities?
Saying “no” to the business or aiming to consolidate to homogeneous technology
platforms is not a viable strategy for controlling the demand or managing the
technical challenges.
Here are three steps IT operations should consider:
1. Consolidate Monitoring Tools Used in IT Operations
In the OpsRamp survey, 83 percent of respondents state that they have eleven
or more IT operations tools in use. For understaffed IT departments struggling
to keep up with the skills required to support IT, having too many tools can
be a drag on productivity and performance.
One place to review and audit is where monitoring tools are deployed and
utilized. Over two decades of supporting internet applications, IT added
tools to monitor user experiences, application performance, APIs,
integrations, and databases. These tools capture data and send alerts on
infrastructure issues, application errors, and business service disruptions,
and they are far more effective than having end-users open incident tickets.
But having too many tools, data sources, and uncorrelated alerts is also a
problem.
AIOps and applying machine learning in IT operations pave a path to tool
consolidation, and in the OpsRamp survey, 63 percent plan to use AIOps as part
of their IT tool consolidation strategy.
How does IT tool consolidation happen with AIOps?
An AIOps solution helps consolidate the data, alerts, and tools used during
incident management. Instead of a bridge call of experts investigating issues
with multiple tools, the team starts its triage using one AIOps tool,
correlated data, and fewer independent alerts. Leaders can then streamline
which monitoring tools to standardize and look to sunset redundant tools and
data sources.
2. Leverage Machine Learning to Improve Incident Management
Reducing the number of monitoring tools has a financial ROI and reduces
the required IT skills, but IT leaders also use AIOps solutions to improve
operational performance. In the OpsRamp survey, 70% of respondents looking to
implement AIOps solutions aim to solve critical issues faster.
When there is a major incident, count the number of people that join the
bridge call or participate in the war room. How many alerts need
investigating, and which monitoring tools are most useful for diagnosing root
causes?
AIOps solutions improve incident management
by reducing noisy alerts, correlating events, delivering actionable
inferences, and identifying probabilistic root causes.
In other words, instead of requiring a bunch of people to gather and decipher
all the alerts, a machine learning algorithm has already started the process
of analyzing the data, correlating the alerts, and presenting information in a
consistent way for incident management teams to review. The analysis can help
reduce the amount of time required to resolve major incidents by as much as 95
percent.
IT can achieve these dramatic improvements in resolving incidents by combining the machine learning and automation capabilities in AIOps solutions. For example, when tier-1 support teams review incidents where machine learning has correlated alerts into a high-likelihood root cause, the support team can trigger automated recovery tasks and close incident tickets faster.
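To make the idea of alert correlation concrete, here is a minimal sketch in Python. It is a simple rule-based stand-in, not how any particular AIOps product works: it groups alerts that hit the same service within a short time window, so a team reviews a few candidate incidents instead of every raw alert. The `Alert` fields, the `correlate_alerts` function, and the five-minute window are all assumptions made for illustration; commercial solutions typically apply machine learning models to far richer data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Alert:
    source: str       # hypothetical: the monitoring tool that raised the alert
    service: str      # hypothetical: the affected business or technical service
    timestamp: float  # seconds since some reference time
    message: str


def correlate_alerts(alerts, window_seconds=300):
    """Group alerts for the same service that arrive close together in time.

    An alert joins the latest group for its service when it arrives within
    window_seconds of that group's most recent alert; otherwise it starts
    a new group.
    """
    groups_by_service = defaultdict(list)
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        groups = groups_by_service[alert.service]
        if groups and alert.timestamp - groups[-1][-1].timestamp <= window_seconds:
            groups[-1].append(alert)
        else:
            groups.append([alert])
    # Flatten to a single list of candidate incidents.
    return [group for groups in groups_by_service.values() for group in groups]
```

Each returned group is one candidate incident. A tier-1 team could review the group, confirm the likely cause, and then trigger whatever automated recovery task applies.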
3. Enable IT to Support Multicloud Digital Experiences
Here are some of my recommendations on what IT should target in their operating charters:
- Enhance business capabilities by delivering reliable, high-performing, and secure digital experiences to customers and employees
- Provide technology agility and flexibility, as most organizations operate hybrid clouds and many are targeting multicloud capabilities
- Automate repeatable tasks and orchestrate complex procedures to reduce risk, improve quality, address security, enhance communications, and free up people’s time
- Leverage data in decision-making, servicing customers, reducing risks, and prioritizing initiatives
- Simplify operations by using machine learning, integration, and automation capabilities
Achieving this charter requires standardizing on platforms that help balance
the paradox. On the one hand, AIOps solutions, automation capabilities, and
integrations improve IT productivity and reduce complexity. On the other, IT
expands capabilities by supporting multicloud applications and proactively
addressing issues that impact digital experiences. In the middle, IT operations
focuses on selecting and improving key performance indicators driven by AIOps,
including service level objectives, mean time to resolve (MTTR) incidents,
and time savings driven by automation.
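For teams that have not yet measured MTTR, the calculation is simply the average time between an incident opening and its resolution. The sketch below assumes incidents are available as pairs of opened and resolved timestamps; the function name and input shape are illustrative, and a real ticketing system will expose this data in its own format.

```python
from datetime import timedelta


def mean_time_to_resolve(incidents):
    """Return the average resolution time as a timedelta.

    incidents: iterable of (opened, resolved) datetime pairs for
    incidents that have already been resolved.
    """
    durations = [resolved - opened for opened, resolved in incidents]
    if not durations:
        raise ValueError("no resolved incidents to average")
    return sum(durations, timedelta()) / len(durations)
```

Tracking this number per month, before and after introducing correlation and automation, gives IT leaders a simple way to show whether the strategy is working.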
Proactive IT leaders should communicate the goals (digital experiences, technical agility), the strategy (AIOps and automation), and the service level objectives to align IT and execute an incremental roadmap. AIOps and automation are the key capabilities for IT to deliver against a growing business charter while rationalizing operational complexities.
This post is brought to you by OpsRamp.
The views and opinions expressed herein are those of the author and do not necessarily represent the views and opinions of OpsRamp.